About Greg Moore

Founder and owner of Green Mountain Software, a consulting firm based in the Capital District of New York focusing on SQL Server. Formerly, a consulting DBA ("and other duties as assigned") by day, and sometimes night, and caver by night (and sometimes day). Now, a PA student working to add PA-C after my name so I can work as a Physician Assistant. When I'm not in front of a computer or with my family I'm often out hiking, biking, caving or teaching cave rescue skills.

Choices

“If you choose not to decide You still have made a choice” – Rush Freewill

One of the things that we believe makes us uniquely human is the concept of freewill; that we can rise above our base instincts and make choices based on things other than pure instinct. While there’s some question if that’s unique to humans, let’s stick with it for now.

Overall, we think choice is good. I can choose to eat cake for breakfast, or I can choose to eat a healthy breakfast. I can choose if I want to get up early and exercise, or sleep in.

Sometimes we may think it’s hard to decide between two such things as in the examples above, but the truth is, it’s not that hard.

But what happens when the choices aren’t nearly as simple? What happens when we sit down with a menu with 3 items versus 30 or even 100? We can become paralyzed. With 3 options, our odds of making a “wrong” decision are only about 67%. I say “wrong” because it’s often purely subjective and may not necessarily have much impact. But when we have 100 different things to choose from, the odds of a “wrong” decision go up to 99%. In other words, we’re faced with the concept that no matter what we do, we’re virtually guaranteed to make a “wrong” decision.
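For the curious, here is the arithmetic behind those odds as a minimal Python sketch. It assumes exactly one "right" item on the menu and a purely random pick, which is my simplification and not how anyone actually orders; the function name is made up for illustration.

```python
def odds_of_wrong_choice(options: int) -> float:
    """Chance of missing the single "right" item when picking one of `options` at random.

    Assumption: exactly one item counts as "right" and the pick is uniformly random.
    """
    if options < 1:
        raise ValueError("need at least one option")
    return 1 - 1 / options


for n in (3, 30, 100):
    print(f"{n:>3} options: {odds_of_wrong_choice(n):.0%} chance of a 'wrong' pick")
```

With 3 options this rounds to 67%, and with 100 options it comes out at 99%.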

The Jam Experiment

One example of this effect was seen in what is often called the jam experiment. Simply put, when given the choice of 6 varieties of jam, consumers showed a bit less interest, but sales were higher. When 24 jams were presented, there was more interest, but sales actually dropped significantly. People were apparently paralyzed by having too many choices.

Locally there’s an outdoor hamburger/hot-dog stand I like to frequent called Jack’s Drive In. People will stand in long lines, in all sorts of weather (especially on opening day, like this year when the line was 20 people deep and with the windchill it was probably about 20F!). One can quibble over the quality of the burgers and fries, but there’s no doubt they do a booming business. And part of the reason is that they have few choices and keep the line moving. This makes it far faster for people to order and faster to cook. With only a few choices, patrons don’t spend 5 minutes dithering over a menu.

Hint: If you’re ever in the area, simply tell them you want “Two burgers and a small french”. Second hint: No matter how hungry you are, don’t, as a former co-worker once did, try “6 Burgers and a large french”. You will regret that particular choice.

Choices to Europe

What brought this particular post on was all the choices I’m facing in trying to plan our family vacation. It’s rather simple really, “we want to visit Europe”. But, I also am hoping to speak at SQL Saturday in Manchester, UK. And we want to visit London (where my cousin lives) and Paris. And we can fly out of the NYC area. Or Boston. Or possibly other areas if the price were cheap enough. So suddenly what one would hope is a simple thing becomes very complicated. And of course every airline has its own website design, which complicates things.

Of course the simple choice would be not to fly. The second simplest would be not to care about cost.  Of course neither of those work. So, I’m stuck in deciding between 24 types of jam. Wish me luck!

Getting Unlost

There’s a concept I teach people when I teach outdoor skills. If you’re going to be wrong, be confidently wrong. There are two reasons for this. For one, people are more likely to follow a leader who appears to be confident and knows what they’re doing. This can lead to better group dynamics and a better outcome.

The second applies when you’re lost. If for whatever reason you choose NOT to stay in one place (which, by the way, is often the best choice, especially for children), then making a plan and sticking to it makes you far more likely to get unlost. This isn’t just wishful thinking.

Imagine you’re lost and you decide, “I’m going to hike North!” And you start to hike north, and after 15 minutes you decide, “eh maybe that was the wrong decision. I should hike East!” And you do this for another 15 minutes, and then you decide, “Nah, now that I think about it, South is much better.” 15 minutes later you decide you’re going the wrong way and West was the right way all along. An hour later, you’re back where you started. But, if you had decided to stick with North the entire time, an hour later, depending on your pace, terrain and other factors, you could be 2-4 miles further north.

“So what?” you might ask. Well, take a look at a map of almost any part of the country. In most cases you’re less than 10 miles from some sort of road. If you’ve spent 3 hours hiking in a single direction, you’ve probably hit a road, or a powerline or some other sign of civilization. (Note this is NOT advice to wander in the woods if you get lost, or a promise this will work anyplace. There are definitely places in the US where this is bad advice.) Also obviously, if you hit a gorge or other impassable geologic feature, you may have to change directions. Or you might get another clue (like hearing a chainsaw or engine or something human-caused in a specific direction).
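If you want to check the math on that, here is a minimal Python sketch. It assumes flat terrain, perfectly straight legs and a constant pace of 2 miles per hour (the low end of the range above); the names `HEADINGS` and `distance_from_start` are made up for the illustration.

```python
import math

# Compass headings as (east, north) unit vectors.
HEADINGS = {"N": (0, 1), "E": (1, 0), "S": (0, -1), "W": (-1, 0)}


def distance_from_start(legs, pace_mph):
    """Straight-line miles from the start after hiking each (heading, minutes) leg."""
    east = north = 0.0
    for heading, minutes in legs:
        dx, dy = HEADINGS[heading]
        miles = pace_mph * minutes / 60
        east += dx * miles
        north += dy * miles
    return math.hypot(east, north)


pace = 2.0  # assumed pace in miles per hour
second_guessing = [("N", 15), ("E", 15), ("S", 15), ("W", 15)]
sticking_with_it = [("N", 60)]

print(distance_from_start(second_guessing, pace))   # 0.0, right back where you started
print(distance_from_start(sticking_with_it, pace))  # 2.0 miles north
```

Same hour of effort, very different results. Real woods are messier than a grid, of course, so treat this as an illustration and not a navigation tool.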

Final Thoughts

So, if you’re going to make a choice, make it confidently. And don’t second-guess yourself until new, solid reasons come along.

So, keep your choices simple and stick to them.

And with that, I choose to stop typing now.

 

Fail-safes

Dam it Jim, I’m a Doctor, not a civil engineer

I grew up near a small hydro-electric dam in CT. I was fascinated by it (and still am in many ways). One of the details I found interesting was that on top of this concrete structure they had what I later found are often called flashboards. These were 2x8s (perhaps a bit wider) running the length of the top of the dam, held in place by wooden supports. The general idea was they increased the pooling depth by 8″ or so, but in the event of a very heavy water flow or flood, they could be easily removed (in many cases removed simply by the force of the water itself). They safely provided more water, but were designed in fact to fail (i.e. give way) in a safe and predictable manner.

This is an important detail that some designers of systems often don’t think about: how to fail. They spend so much time trying to PREVENT a failure, they don’t think about how the system will react in the EVENT of a failure. Properly designed systems assume that at some point failure is not only an option, it’s inevitable.

When I was first taught rigging for cave rescue, we were always taught “Have a mainline and a belay”. The assumption is that the system may fail. So, we spent a lot of time learning how to design a good belay system. The thinking has changed a bit these days; often we’re as likely to have TWO “mainlines” and switch between them. But the general concept is still the same: in the event of a failure EITHER line should be able to catch the load safely and be able to recover (i.e. simply catching the fall but not being able to resume operations is insufficient).

So, your systems. Do you think about failures and recovery?

Let me tell you about the one that prompted this post. Years ago, I built a log-shipping backup system for a client. It uses SSH and other tools to get the files from their office to the corporate datacenter. Because of the network setup, I can’t use the built-in SQL Server log-shipping copy commands.

But that’s not the real point. The real point is… “stuff happens”. Sometimes the network connection dies. Sometimes the copy hangs, or they reboot the server in the office in the middle of a copy, etc. Basically “things break”.

And, there’s another problem I have NOT been able to fix, that only started about 2 years ago (so for about 5 years it was not a problem.) Basically the SQL Server in the datacenter starts to have a memory leak and applying the log-files fails and I start to get errors.

Now, I HATE error emails. When this system fails, I can easily get like 60 an hour (every database, 4 times an hour plus a few other error emails). That’s annoying.

AND it was costing the customer every time I had to go in and fix things.

So, on the receiving side I set up a job to restart SQL Server and Agent every 12 hours. (If we ever go into production we’ll have to solve the memory leak, but at this time we’ve decided it’s such a low priority as to not bother. And since it’s related to the log-shipping, and if we failed over we’d be turning off log-shipping, it’s considered even less of an issue.) This job comes in handy a bit later in the story.

Now, on the SENDING side, as I’ve said, sometimes the network would fail, they’d reboot in the middle of a copy or something random would make the copy job get “stuck”. This meant rather than simply failing, it would keep running, but not doing anything.

So, I eventually enabled a “deadman’s switch” in this job. If it runs for more than 12 hours, it will kill itself so that it can run normally again at the next scheduled time.
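To make the idea concrete, here is a minimal Python sketch of a deadman's switch. This is NOT the actual job from my client's system; it's a generic illustration in which a hypothetical wrapper, `run_with_deadman_switch`, runs a command and kills it if it's still going after a time limit. The "stuck job" is just a child Python process that sleeps, and the limit is 1 second where the real job uses 12 hours.

```python
import subprocess
import sys


def run_with_deadman_switch(command, max_seconds):
    """Run `command`; kill it if it is still running after `max_seconds`.

    Returns the exit code, or None if the job had to be killed.
    """
    try:
        completed = subprocess.run(command, timeout=max_seconds)
        return completed.returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception,
        # so the next scheduled run starts with a clean slate.
        return None


# A stand-in "stuck" job: a child process that sleeps far longer than allowed.
stuck_job = [sys.executable, "-c", "import time; time.sleep(30)"]
result = run_with_deadman_switch(stuck_job, max_seconds=1)
print("killed by deadman's switch" if result is None else f"exited with {result}")
```

The important design choice is that the time limit lives outside the work being done. A hung copy can't be trusted to notice that it's hung.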

Now, here’s what often happens. The job will get stuck. I’ll start to get email alerts from the datacenter that it has been too long since logfiles have been applied. I’ll go in to the office server, kill the job and then manually run it. Then I’ll go into the datacenter, and make sure the jobs there are running.  It works and doesn’t take long. But, it takes time and I have to charge the customer.

So, this weekend…

the job on the office server got stuck. So I decided to test my failsafes/deadman switches.

I turned off SQL Agent in the datacenter, knowing that later that night my “cycle” job would turn it back on. This was simply so I wouldn’t get flooded with emails.

And, I left the stuck job in the office as is. I wanted to confirm the deadman’s switch would kick in and kill it and then restart it.

Sure enough later that day, the log files started flowing to the datacenter as expected.

Then a few hours later the SQL Agent in the datacenter started up again and log-shipping picked up where it left off.

So, basically I had an end-to-end test that when something breaks, on either end, the system can recover without human intervention. That’s pretty reassuring. I like knowing it’s that robust.

Failures Happen

And in this case… I’ve tested the system and it can handle them. That lets me sleep better at night.

Can your systems handle failure robustly?

 

 

Things Left Unsaid

Pop Quiz

You show up at an accident scene and see two patients. One is screaming in pain about a broken arm. The other is propped up against the wall seemingly fine, not saying a word. Which one do you check out first?

Many will answer, “the one screaming in pain about the broken arm, the other person is fine.” The experienced responder will most likely check out the 2nd person. Why? Because they’re NOT saying a word.

Here’s the thing. You know the 1st person has a pulse and an airway. They’re breathing just fine. Perhaps a bit too fine. A broken arm, by itself isn’t going to kill them.  But what about that 2nd person? Are they breathing? You don’t know. Perhaps they’re not saying a word because they’ve stopped breathing.  If you take the time to splint the broken arm and then get to the 2nd person, they may have died. So, check out the 2nd patient first, then determine your course of action.

We’re Safe! Really, we are. Trust us, because we keep repeating it!

I say this because in problem solving, I often find what’s NOT said is far more important than what is said. Several years ago my son received a letter saying he had been nominated for a program that took children to other countries on basically extended field trips. It actually sounded really interesting. We went to the presentation. I sat through it thinking, “this is really cool.” But, two things struck me. First, they kept emphasizing how safe it was. At first pass, and the first time they mentioned it, I wasn’t bothered. I mean as a parent, you want to know your kid is going to be safe if you put them in the hands of strangers for an extended period of time. But, they kept emphasizing it. It got to the point that all three of us (my wife, my son and I) started to wonder, “why the hell are they dwelling on this point?”

The other thing that was bothersome was once we got out of the lecture hall and tried to speak to some of the individuals, we asked them “How did our son get nominated?” “Oh it must have been a teacher at his school.” Which sounded great until we thought about it and thought it strange that no teacher had mentioned this to us or our son.

So, when we got home, we did some digging and found out there had been several incidents of accidents happening to students while overseas with this group. On one hand, nothing struck me as too statistically terrible, but the reports of the handling and the fact that we were only reading about the ones reported made me even more paranoid about how unsafe the program really was. I mean why emphasize safety unless you really feel like you have to?

The other detail we uncovered was that most parents had the same experience of “your child has been nominated” without any word of by whom. The most troubling were at least one or two parents who chimed in to say that their child had been killed in an accident or had otherwise died after their name appeared in the newspaper for being on the honor roll, i.e. a fact that a teacher who might be in a position to nominate said child would be well aware of. As far as we and other parents could determine, the “nomination” process was solely a matter of the group scanning the newspapers for honor roll students and the like.

So, relating this back to IT

As a person who loves troubleshooting, one of the things I’ve learned is NOT to trust what the user initially reports to me. “I haven’t changed a thing and this stopped working!”  That generally means, they changed something. 🙂

I once had a client that had a problem that took at least two winters to diagnose. Why so long you might ask? Because the problem only happened in the winter. The first year it was complaints of “ever since you networked our computers, they reboot without warning.” Now, I had networked them several months previously and they only started to report the problem come the late fall/early winter. I tried several things, but nothing really fixed the problem. I had an idea of what it was, but they wouldn’t listen. So, among other things, I ended up rewiring their entire network. (Sounds like a lot of work, but it was a total of 4-5 computers and I moved from thinwire Ethernet to 10BASE-T. I did say this was a long time ago, right?)

Eventually I sort of gave up. Until the next winter rolled around and they started to call again. Again, I told them what I thought the problem was. Again, they dismissed it. I’m not sure what finally convinced them, but they finally took me up on my suggestion and put in a humidifier and had their office carpet treated with anti-static spray. Yes, despite all their insistence that “I was just sitting there typing and it rebooted”, what was really happening, and what they weren’t saying, was, “I just walked from one office to the other, across the carpet, in the drier than normal air and as soon as I touched my computer it rebooted.” It was the static build-up all along.

So this week’s moral of the story: Look beyond what’s being said and pay attention to what’s NOT being said. It might shock you.

 

SQL Data Partners Podcast

I’ve been keeping mum about this for a few weeks, but I’ve been excited about it. A couple of months ago, Carlos L Chacon from SQL Data Partners reached out to me about the possibility of being interviewed for their podcast. I immediately said yes. I mean, hey, it’s free marketing, right?  More seriously, I said yes because when a member of my #SQLFamily asks for help or to help, my immediate response is to say yes.  And of course it sounded like fun.  And boy was I right!

What had apparently caught Carlos’s attention was my book: IT Disaster Response: Lessons Learned in the Field.  (quick go order a copy now.. that’s what Amazon Prime is for, right?  I’ll wait).

Ok, back? Great. Anyway, the book is sort of a mash-up (to use the common lingo these days) of my interests in IT and cave rescue and plane crashes. I try to combine the skills, lessons learned, and tools from one area and apply them to other areas. I’ve been told it’s a good read. I like to think so, but I’ll let you judge for yourself. Anyway, back to the podcast.

So we recorded the podcast back in January. Carlos and his partner Steve Stedman were on their end and I on mine. And I can tell you, it was a LOT of fun. You can (and should) listen to it here. I just re-listened to it myself to remind myself of what we covered. What I found remarkable was the fact that as much as I was really trying to tie it back to databases, Carlos and Steve seemed as much interested, if not more, in cave rescue itself. I was ok with that. I personally think we covered a lot of ground in the 30 or so minutes we talked. And it was great because this is exactly the sort of presentation, combined with my airplane crash one and others, that I’m looking to build into a full-day onsite consult.

One detail I had forgotten about in the podcast was the #SQLFamily questions at the end. I still think I’d love to fly because it’s cool, but teleportation would be useful too.

So, Carlos and Steve, a huge thank you for asking me to participate and for letting me ramble on about one of my interests. As I understand it, Ray Kim has a similar podcast with them coming up in the near future also.

So the thought for the day is: think about how skills you learn elsewhere can be applied to your current responsibilities. It might surprise you and you might do a better job.

 

 

 

What a Lucky Man He Was….

Being a child of the 60s my musical tastes run the gamut from The Doors through Rachel Platten.  In this case, the title of course comes from the ELP song.

Anyway, today’s post is a bit more reflective than some. Since yesterday I’ve been fighting what should be simple code. Years back I wrote a simple website to handle student information for the National Cave Rescue Commission (NCRC).  The previous database manager had started with a database designed back in the 80s. It was certainly NOT web friendly. So after some work I decided it was time to make it a bit more accessible to other folks.  Fortunately ASP.NET made much of the work fairly easy.  It did what I wanted to do. But now, I’m struggling to figure out how to get and save profile information along with membership info.  Long story short, due to a design decision years back, this isn’t as automatic and easy as I’d like.  So, I’ve been banging my head against the keyboard quite a bit over the last 24 hours. It’s quite frustrating actually.

So, why do I consider myself lucky? Because I can take the time to work on this. Through years of hard work, education and honestly a bit of luck, I’m at the point where my wife and I can provide for our family to live a comfortable life and I can get away with working less than a full 40 hours a week. This is important to me as I get older. Quality of life becomes important.

I’ve talked about my involvement in cave rescue in the past and part of that is wearing multiple hats, some of which take more work than others.

I am for example:

  • Co-captain of the Albany-Schoharie Cave Rescue Team – This is VERY sporadic and really sort of unofficial and some years we will have no rescues at all locally.
  • I’m an Instructor with the NCRC – This means generally a week plus a few days every year I take time out to travel, at my own expense to a different part of the country and teach students the skills required to be effective in a cave rescue. For this, I get satisfaction. I don’t get paid and like I say I travel at my own expense.  Locally I generally take a weekend or two a year to teach a weekend course.
  • I’m a Regional Coordinator with the NCRC – Among other things this means again I travel at my own expense once a year, generally to Georgia, to meet with my fellow coordinators so we can conduct the business of the NCRC. This may include approving curriculum created by others, reviewing budgets and other business.
  • Finally, I’m the Database Coordinator. It’s really a bit more of IT Coordinator but the title is what it is. This means not only do I develop the database and the front end, I’m responsible for inputting data and running reports.

As you can see, this time adds up, quickly.  I’d say easily, in terms of total time, I dedicate a minimum of two weeks a year to the NCRC.  But it’s worth it. I can literally point at rescues and say, “those people are alive because of the work I and others do”. Sometimes it’s direct like when I’m actually on a rescue, sometimes it’s indirect when I know it’s a result of the training I and others have provided.  But it’s worth it.  I honestly can claim I work with some of the best people in the world. There are people here that I would literally put my life on the line for; in part because I know they’d do the same.

So, I’m lucky. I’m lucky that I can invest so much of my time in something I enjoy and love so much.  I’ll figure out this code and I’ll keep contributing, because it’s worth it, and because I’m lucky enough that I can.

How are you lucky?

 

 

Hours for the week

Like I say, I don’t generally post SQL specific stuff because, well there’s so many blogs out there that do. But what the heck.

Had a problem the other day. I needed to return the hours worked per timerange for a specific employee. And if they worked no hours, return 0.  So basically had to deal with gaps.

There are lots of solutions out there; this is mine:

Alter procedure GetEmployeeHoursByDate @startdate date, @enddate date, @userID varchar(25)
as

-- Usage: exec GetEmployeeHoursByDate '2018-01-07', '2018-01-13', 'gmoore'

-- Author: Greg D. Moore
-- Date: 2018-02-12
-- Version: 1.0

-- Get the totals for the days in question

set NOCOUNT on

-- First let's create a simple table that just has the range of dates we want.
-- Note: a recursive CTE stops at 100 levels by default, so a range longer than
-- 101 days would need an OPTION (MAXRECURSION n) hint on the final select.

; WITH daterange AS (
    SELECT @startdate AS WorkDate
    UNION ALL
    SELECT DATEADD(dd, 1, WorkDate)
    FROM daterange s
    WHERE DATEADD(dd, 1, WorkDate) <= @enddate)

select dr.WorkDate as WorkDate, coalesce(a.DailyHours, 0) as DailyHours from
(
    -- Here we get the hours worked and sum them up for that person.
    select ph.WorkDate, sum(ph.Hours) as DailyHours from ProjectHours ph
    where ph.UserID = @userID
    and ph.WorkDate >= @startdate and ph.WorkDate <= @enddate
    group by ph.WorkDate
) as a
-- Now join our table of dates to our hours and put in 0 for dates we don't have hours for.
right outer join daterange dr on dr.WorkDate = a.WorkDate
order by dr.WorkDate

GO

There are probably better ways, but this worked for me. What’s your solution?
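If you don't think in SQL, the same gap-filling idea can be sketched in a few lines of Python: build every date in the range first, then look up the hours for each one and fall back to 0. This is only an illustration of the technique, not a replacement for the stored procedure; the function name `hours_by_day` and the sample entries are made up.

```python
from datetime import date, timedelta


def hours_by_day(start, end, entries):
    """Return [(work_date, total_hours)] for every day from start to end inclusive.

    `entries` is an iterable of (work_date, hours) rows; days with no rows get 0.
    """
    totals = {}
    for work_date, hours in entries:
        if start <= work_date <= end:
            totals[work_date] = totals.get(work_date, 0) + hours

    result = []
    for offset in range((end - start).days + 1):
        day = start + timedelta(days=offset)
        result.append((day, totals.get(day, 0)))
    return result


# Made-up sample data: two entries on the 8th, one on the 10th.
entries = [(date(2018, 1, 8), 4), (date(2018, 1, 8), 3.5), (date(2018, 1, 10), 8)]
week = hours_by_day(date(2018, 1, 7), date(2018, 1, 13), entries)
for work_date, hours in week:
    print(work_date, hours)
```

The list of dates plays the role of the daterange CTE, and the dictionary lookup with a default of 0 stands in for the outer join plus coalesce.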

The Basics

Last night at our local SQL Server User Group meeting we had the pleasure of Deborah Melkin speaking.  I first met Deborah at our Albany SQL Saturday Event last year. She gave: Back to the Basics: T-SQL 101. Because of the title I couldn’t help but attend. It wasn’t the 101 part by itself that caught my eye. It was the “Back to the Basics”. While geared to beginners, I thought the idea of going back to the basics of something I take for granted was a great idea. She was also a first time speaker, so I’ll admit, I was curious how she would do.

It was well worth my time. While I’d say most of it was review, I was reminded of a thing or two I had forgotten and taught a thing or two.  But also very importantly, she had a great ability to break down the subject into a clearly understandable talk. This is actually harder than many people realize. I’ve heard some brilliant speakers, who simply can’t convey their message, especially on basic items of knowledge, in a way that beginners can understand it.

So, after the talk last summer, I cornered her at the Speaker’s Dinner and insisted she come up with a follow up, a 201 talk if you will. Last night she obliged, with “Beyond the Select”. What again struck me about it was that, other than a great tip in SSMS 17.4 (highlighting a table alias will show you what the base table is), again nothing was really new to me. She talked about UDFs; I’ve attended entire sessions on UDFs. She talked about CTEs; I’ve read extensively about them. She discussed windowing functions; we’ve had one of our presenters present on them locally. Similarly with some of the other items she had brought up.

Now, this is NOT a slight at all, but really a compliment. Both as an attendee and as the guy in charge of selecting speakers, it was great to have a broad-reaching topic. Rather than a deep dive, this was a bit of everything that gave the audience a chance to learn a bit of everything if they hadn’t seen it before (and based on the reactions and feedback I know many learned new stuff) and to compare different methods of doing things. For example, what’s the advantage of a CTE vs. a derived table vs. a temp table? Well the answer is of course the DBA’s favorite answer, “it depends”.

As a DBA with decades of experience and as an organizer, it’s tempting to have a Bob Ward type talk every month. I enjoyed his talk last month. But, honestly, sometimes we need to go back and review the basics. We’ll probably learn something new or relearn something we had forgotten. And with talks like Deborah’s, we get to see the big picture, which is also very valuable.

So my final thought this week is that in any subject, not only should we be doing the deep dives that extend our knowledge, but we should review our basics. As DBAs, we do a select every day. We take it for granted, but how many people can really tell you clearly the order of operations? Review the basics once in a while. You may learn something.

And that’s why I selected this topic for this week’s blog.

Debugging – Powerhell (sic)

Generally I don’t plan to talk too much about specific programming problems. There are too many blogs out there about how to solve particular problems, and most of them are generally good and even better, accurate.

I’m going to bore most of you, so if you want to jump to the summary, that’s fine. I won’t be insulted (heck I won’t even know!)

That said, last night I spent a lot of time debugging a particular PowerShell script I had written for a client.

I want to talk a bit about what was going on and how I approached the problem.

So, first the setup: It’s really rather simple

  1. Log into an SFTP site
  2. Download a zip file
  3. Expand the zip file
  4. Run an SSIS package to import the files in the zip package to a SQL Server

I had written the script several months ago and it’s been in production since then.  It’s worked great except for one detail. When it was put into production, we didn’t yet have the service account to run it under, so I set it up to run as a scheduled task using my account information. Not ideal, but workable for now.

Since then, we received the service account. So, I had made several attempts to move to the service account, but it wasn’t working. Yesterday was the first time in a while I had time to work on it so I started looking at it in more detail.

Permissions Perhaps?

So, the first problem I had was, I’d run the scheduled task and nothing would happen. Ok, that’s not entirely accurate. The script would start, but never run to completion. It just sort of sat there. And it certainly was NOT downloading the zip file.

My first thought was it might be a file permissions or other security issue. So the first thing I did was write a really simple script.

get-date | out-file "D:\foo\timetest.txt"

Then I set up a scheduled task and ran it manually. And sure enough, timetest.txt showed up exactly where it was supposed to.

So much for security in the OS.

out-file to the Rescue!

So, my next step is probably one of the oldest debug techniques ever: I put in statements in the code to write to a file as I hit various steps in the code.  Not quite as cool as a debugger, but hey, it’s simple and it works.

It was simple stuff like:

"Path $SftpPath Set" | out-file ftp_log.txt -append

BTW, if you’re not familiar with PowerShell, one of the nice things is that in the above statement, it’ll automatically insert the value of the variable $SftpPath into the double-quoted string it’s printing out.

This started to work like a charm. And then… jackpot!

"Setting Session using $sftpURL and Password: $password and Credentials: $Credential" | out-file ftp_log.txt -append

When I ran it, I’d get a line in the log like:

Setting Session using secureftp.example.com and Password: System.Security.SecureString and Credentials: System.Management.Automation.PSCredential

Not the most informative, but useful. And it’s nice to know that one can’t easily just print out the password!

But, when I ran it as the service I was getting something VERY different.

Setting Session using secureftp.example.com and Password:  and Credentials:

HUGE difference. What the heck was happening to the password and credentials?

That’s when the first lightbulb hit.

Secure credentials in PowerShell

Let’s take a slight side trip here.  We all know (right, of course you do) that we should never store passwords in cleartext in our scripts.  So I did some digging and realized that PowerShell actually has a nice way to handle this. You can pass a plaintext string to a cmdlet and write that out to a file.  Then when you need credentials, you read from the file and it gets parsed and handed to whatever needs it.

$password = get-content $LocalFilePath\cred.txt | ConvertTo-SecureString

Of course first you have to get the password INTO that cred.txt file.

That’s easy!

if (-not (test-path $LocalFilePath\cred.txt))
{
    read-host -AsSecureString | ConvertFrom-SecureString | Out-File $LocalFilePath\cred.txt
}

So, the first time the program is run, if the cred.txt file isn’t found, the user is prompted for it, they enter the password, it gets put into a secure string and written out. From then on, the cred.txt file is used and the password is nice and secure.

And I confirmed that the service account could actually see the cred.txt file.

So what was happening?

The Key!

It took me a few minutes, and when I hit a problem like this I do what I often do: step away from the keyboard. I had to stop and think for a bit. Then it dawned on me. “How does PowerShell know how to encrypt and decrypt the password in a secure fashion? I mean at SOME point it needs to have a clear-text version to pass to the various cmdlets. It wouldn’t be very secure if just anyone could decrypt it.” That’s when it struck me! The encryption key (and decryption key) has to be based on the USER running the cmdlet!

Let me show some more code

$password = get-content $LocalFilePath\cred.txt | ConvertTo-SecureString
$Credential = New-Object System.Management.Automation.PSCredential ('ftp_example', $password)

$sftpURL = 'secureftp.example.com'

$session = New-SFTPSession -ComputerName $sftpURL -Credential $Credential

So first I’d get the password out of my cred.txt file and then create a Credential Object with a username (ftp_example) and the aforementioned password.

Then I’d try to open a session on the SFTP server using that credential. And that’s exactly where it was hanging when I ran this under the service account. Now I was on to something. Obviously $password and $Credential weren’t getting set, as we saw from my debug statements. It wasn’t able to decrypt cred.txt!

Great, now I just need to have the service create its own cred.txt.  But, I can’t use the same technique where the first time the service runs it prompts the user for the password.  This bothers me. I still don’t have a perfect solution.

For now I fell back on:

if (-not (Test-Path $LocalFilePath\cred_$env:UserName.txt))
{
    # after running once, remove password here!
    "password_here" | ConvertTo-SecureString -AsPlainText -Force | ConvertFrom-SecureString | Out-File $LocalFilePath\cred_$env:UserName.txt
}

Note I changed the name of the cred.txt file to include the username, so I could have one when I ran it and another when the service account ran it. This makes testing far easier AND solves conflicts when debugging (i.e. “which credentials are currently being used?”).

So for now the documentation will be “after running the first time, remove the password”. I think more likely I’ll write a one-time script that runs to set up a few things and then deletes itself. We’ll see.

Anyway, this worked much better. Now my debug lines were getting the password and all. BUT… things were STILL hanging.  Talk about frustrating. I finally tracked it down to the line

$session = New-SFTPSession -ComputerName $sftpURL -Credential $Credential

Caving Blind

As you know, I love caving. And for caving, a headlamp is a critical piece of equipment. Without it you have no idea where you’re going.  That’s how I felt here. Things were simply hanging and I was getting NO feedback.

I tried

$session = New-SFTPSession -ComputerName $sftpURL -Credential $Credential | out-file ftp_log.txt -append

But I still wasn’t getting anything. I tried a few other things, including setting parameters on how New-SFTPSession would handle errors and warnings. But still nothing. It was like caving blind. The cmdlet was being called, but I couldn’t see ANY errors.  Of course when I ran it as myself, it ran fine. But when I ran it as a service, it simply hung.

I was getting frustrated that I couldn’t get any feedback. I figured once I had feedback, the rest would be simple. I was mostly right.

I needed my headlamp! I needed to see what was going on!

Finally it dawned on me, “wrap it in a try/catch and look at the exception”. So now the code became:

try
{
    $session = New-SFTPSession -ComputerName $sftpURL -Credential $Credential
}
catch
{
    $exception = $_.Exception.Message
    "Error in New-SFTPSession - $exception - " | Out-File ftp_log.txt -Append
}

Now THIS got me somewhere. I was getting a timeout message!  Ok, that confused me. I mean I knew I had the right username and password, and it never timed out when I ran it. Why was it timing out when the service ran it?

So again, I did what I do in cases like this. I got up, went to the fridge, got some semi-dark chocolate chips, munched on them, and ran the question through my head: “why was the secure FTP session timing out in one case, but not the other?” At times like this I feel like Jack Ryan in The Hunt for Red October, “How do you get a crew to want to get off a nuclear sub…”

Eureka!

DOH, it’s SECURE FTP. The first time an account connects, it has to accept the server’s host key! I need to accept the key! At first I panicked though. How was I supposed to get the service to accept the key? Fortunately it turned out to be trivial. There’s actually a parameter, -AcceptKey. I added that… and everything ran perfectly.
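Put together, the call might look something like the sketch below. It assumes New-SFTPSession comes from the Posh-SSH module, uses a placeholder password rather than the one read from the cred file, and collects the parameters in a hashtable (splatting) purely for readability. Keep in mind that accepting a host key automatically means trusting whatever server answers, so it is worth confirming the key is the right one.

```powershell
# Placeholder credential; the real script reads the password from the cred file.
$securePassword = ConvertTo-SecureString 'password_here' -AsPlainText -Force
$Credential = New-Object System.Management.Automation.PSCredential ('ftp_example', $securePassword)

# Parameters for New-SFTPSession, collected in a hashtable for splatting.
$sessionParams = @{
    ComputerName = 'secureftp.example.com'
    Credential   = $Credential
    AcceptKey    = $true     # accept the server's host key without prompting
    ErrorAction  = 'Stop'    # make failures catchable
}

# Only attempt the connection if the cmdlet is actually available.
if (Get-Command New-SFTPSession -ErrorAction SilentlyContinue)
{
    try
    {
        $session = New-SFTPSession @sessionParams
    }
    catch
    {
        "Error in New-SFTPSession - $($_.Exception.Message)" | Out-File ftp_log.txt -Append
    }
}
```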

Except the SSIS package

That’s a separate issue and again probably related to security and that’s my problem to debug today. But, I had made progress!

Summary

Now, quite honestly, the above is probably more detail than most care to read. But let me highlight a couple of things that I think are important when debugging or troubleshooting in general.

First, simplify the problem and eliminate the obvious. This is what I did with my first script. I wanted to make sure it wasn’t something simple and obvious. Once I knew I could write to a file using the service account that ruled out a whole line of questions. So I could move on to the next step.

Often I use an example from Sesame Street of “one of these things is not like the others.” In this case, I had to keep figuring out what was different between when I ran it and when the service account ran things. I knew up front that anything requiring keyboard input would be a problem. But, I thought I had that covered. Of course starting to compare and contrast the results of decrypting the cred.txt file showed that there was a problem there. And the issue with the initial acceptance of the SFTP key was another problem.

So, gather information and compare what works to what doesn’t work. Change one thing at a time until you narrow down the problem.

The other issue is being able to get accurate and useful information. Using debugging statements is often old school, and I know some developers look down on them, but often the quick and dirty works. So I had no problem using them. BUT, there are still cases where they won’t work. If for example a cmdlet hangs or doesn’t give output to standard output, they won’t work. Catching the exception definitely solved this.
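One detail worth keeping in mind: try/catch in PowerShell only catches terminating errors. Many cmdlets report problems as non-terminating errors, which slip past a catch block unless -ErrorAction Stop is added. Here is a minimal sketch of the pattern, with Get-Item on a made-up path standing in for the cmdlet that fails, and a hypothetical log file in the temp directory:

```powershell
# Hypothetical paths, for illustration only.
$tempDir = [System.IO.Path]::GetTempPath()
$logFile = Join-Path $tempDir 'ftp_log_example.txt'
$missing = Join-Path $tempDir 'no_such_file_for_this_demo.txt'

try
{
    # -ErrorAction Stop turns the non-terminating "not found" error into one that catch can see.
    $item = Get-Item -Path $missing -ErrorAction Stop
}
catch
{
    $exception = $_.Exception.Message
    "Error in Get-Item - $exception" | Out-File $logFile -Append
}
```

Without -ErrorAction Stop, the error above would go to the error stream and the catch block would never run, which is not much help when nobody is watching the console.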

The biggest problem I really ran into here is that I’m still a beginner with PowerShell. I’m loving it, but I often still have to look things up. But the basic troubleshooting skills are ones I’ve developed and honed over the years.

And quite seriously, sometimes don’t be afraid to walk away and do something else for a bit. It almost always brings a fresh perspective. I’ve been known to take a shower to think about a problem. Working from home that is easier for me than it might be for someone in an office. But even then, go take a walk. Or do something.  Approach the problem with a fresh mind. All too often we start down the wrong path and just keep blindly going without turning around and reevaluating our position.

When we teach crack and crevice rescue in cave rescue we tell our students, “If you’re not making progress after 15 minutes, let someone else try. The fresh approach will generally help.”

Looking back on this, the problems seem pretty obvious. But it took a bit of work to get there. And honestly, I love the process of troubleshooting and/or debugging. It’s like being a detective, compiling one clue after another until you get the full picture.

So now on to comment some of the code and then figure out why the SSIS package isn’t executing now (I have ideas, we’ll see where I get to!)

The Streisand Effect

I had originally planned on a slightly different topic for this week’s blog, but an email I received from my alma mater last night changed my mind. First, a little background. I’m a 1990 graduate of RPI in Troy, NY, a fact I’m quite proud of. Second, lately there has been a growing controversy over the shape and direction of the school administration, led by Dr. Shirley Jackson. Let me say that I find Dr. Jackson’s credentials impressive and many of her initiatives have led RPI in the right direction for the 21st Century.

But (and you knew that was coming), all is not rosy in Troy (especially today as I write, it’s a dreary, cloudy day).

So let’s back up a bit and discuss the Streisand Effect. The term mainly refers to drawing unwanted attention to something by trying to suppress information about it in the first place. A similar reaction can be had by telling someone, “Don’t think about a pink elephant.” Ok, how many of you were thinking of a pink elephant before I told you not to? Now how many are thinking about one? But I’m serious, please stop thinking about a pink elephant. There’s no such thing as a pink elephant. Ok, now I’m just being cruel about the whole pink elephant thing. I’ll stop.

So, back to RPI. As I mentioned, not everything is as pink and rosy as it might be. Generally in cases like this, you have one of three choices: hire a PR firm that advises you on a course of action, admit the issues and work to solve them, or shut up and hope things blow over and people stop talking about it.

In this case the Alumni office at RPI decided to take the 4th option. They decided to send a letter to all alumni, including many who probably had no inkling of the ongoing controversies or, if they did, didn’t care that it was going on. The letter was written by an RPI professor in response to a set of well-written and researched articles and a website set up by a bunch of upset alumni. Like good RPI graduates, the alumni backed up their criticisms with research and data. (For example the website notes how RPI’s credit rating has tanked over the years. An easily verifiable fact.)

The letter unfortunately did not address any of the data (except for one point) and instead included highlights such as:

Could it be that the residual racism and sexism (no to mention heightism) that sits in the backs of the minds of the white male majority of our alumni makes it just a bit easier to see Dr Jackson as outside of her league, … out of her place?

Yes, somehow pointing out ongoing critical financial issues results in the Alumni office calling all alumni racists and sexists. Based on the reaction on several social media forums I’m on, after this letter, several alumni who were giving or thinking about giving have changed their minds.  I obviously know only a small subset of the thousands of alumni/ae who must have received this email.  But not a single one I know was convinced by this email to START donating to RPI. And at least one person who wasn’t aware of the majority of the issues said they were made aware as a result of this email and would stop donating.

So effectively, the RPI Alumni office has not only seriously insulted its donor base, it has brought attention to issues that many of the donor base apparently were not even aware of. Streisand Effect is now in full force!

I want to toss in one aside here to be clear: I am not ignorant of the fact, nor do I deny the fact, that Dr. Jackson certainly has faced some pushback because of her identity. I’ve seen comments about her skin color and gender and have pushed back against such comments. They’re not relevant to the issues at hand. She has faced and overcome a great deal of discrimination and dislike simply because of who she is. But, that does not make her or the Board of Trustees immune to criticism based on their actual actions, criticism that is backed up by data. The drop in RPI’s credit rating is not due to who she is but rather the actions she and the Board of Trustees have taken over the past 18 years.

In closing, as I step off my soapbox here, I realize this blog post is a bit off-topic from my usual fare, but it’s not really. It comes down to how we approach problems. Trying to ignore them doesn’t necessarily make them go away, but shaming a wider audience doesn’t help either; it only brings more attention to the issue. If in 2003 Barbra Streisand had decided to simply drop the issue of the photographs of her home, the issue would have faded into the woodwork and most people wouldn’t have cared.

In 2018, if the RPI alumni office hadn’t blasted an insulting and condescending email, devoid of facts, to its entire alumni base, fewer alumni would know about the issues. But I can guarantee now, many alumni who weren’t aware, or didn’t care, now do.

Think about this when trying to do damage control at your company.

Crane Operators

Talking online with friends the other day, someone brought up that crane operators in NYC can make $400-$500K a year. Yes, a year. I figured I’d confirm that before writing this post and it appears to be accurate.

At first glance one may think this is outrageous, or perhaps that they chose the wrong field. I mean I enjoy being a DBA and a disaster geek, but I can’t say I’ve ever made $400K in one year! And for what? I mean you lift things up and put them down. Right?

Let me come back to that.

So, last night, I got paid quite a tidy bundle (but not nearly that much) for literally logging into a client computer, opening up VisualCron and clicking on a task and saying, “disable task”. On one hand, it seemed ridiculous;  not just because of what they were paying me, but because this process was the result of several meetings, more than one email and a review process.  All to say, “stop copying this file.”

But, this file was part of a key backup process for a core part of the client’s business. I had initially set up an entire process to ensure that a backup was being copied from an AIX server in one datacenter to a local NAS and then to the remote datacenter. It is a bit more complex than it sounds. But it worked. And the loss of a timely backup would impact their ability to recover by hours if not days. This could potentially cost them hundreds of thousands of dollars, if not millions.

So the meetings and phone calls and emails weren’t just “which button should Greg click” but covered questions like, “do we have the backups we think we have?” “Are they getting to the right place(s)?” “Are they getting there in a timely fashion?” And even, “when we uncheck this, we need to make sure the process for the day is complete and we don’t break it.”

So, me unchecking that button after hours, as much as it cost the company, was really the end of a complex chain of events designed to make sure that they didn’t risk losing a LOT of money if things went wrong. Call it an insurance payment if you will.

Those crane operators in NYC? They’re not simply lifting up a beam here and there and randomly placing it someplace. They’re maneuvering complex systems in tight spaces with heavy loads where sudden gusts can set things swaying or spinning and a single mistake can do $1000s in damage or even kill people.

It’s not so much what they’re being paid to do as how much they are being paid to avoid the cost of a mistake. I wasn’t paid just to unclick a button. I was paid (as were the others in the meetings) to make sure it was the right button and at the right time and that it wouldn’t cost even more.

Sometimes we’re not paid for what we do, as much as we’re paid for what we’re not doing.