About Greg Moore

Founder and owner of Green Mountain Software, a consulting firm based in the Capital District of New York focusing on SQL Server. Formerly, a consulting DBA ("and other duties as assigned") by day, and sometimes night, and caver by night (and sometimes day). Now, a PA student working to add PA-C after my name so I can work as a Physician Assistant. When I'm not in front of a computer or with my family I'm often out hiking, biking, caving or teaching cave rescue skills.

Swiss Cheese

This blog post will try to tie together several of my favorite things: Cheese, caving, and accidents.

I was making lunch the other day and I was looking at the stack of sliced Swiss cheese I had. I should note, I love Swiss cheese, especially with a good roast beef sandwich.

But first, an existential question.  “What is a cave?”

Oh, that’s easy, it’s a passage through rock in the ground. In other words it’s the area where there’s no rock. Great. Let’s start simple. I think we can agree if it’s dark and I can walk through it, it’s a cave. What if I have to crawl? Yeah, that’s still a cave. What if I have to shimmy through and can barely fit? Yeah, that’s still a cave. What if I can’t fit, but one of my much smaller friends can fit through? Yeah, that’s a cave. But what if the entire thing is too small for anyone to crawl through but small animals can? What if two rooms that are large enough for humans to be in are connected by a passage too tight for a human, but say you can shine a light through, or can make a “voice connection” and hear people at the other end? Is that still part of the cave? As an aside, humans have mapped over 190 miles of Jewel Cave (and more all the time, big shout out to my friends who are mapping it!). But airflow studies estimate that we’ve only mapped about 3-5% of it. Let that sink in. But what if the other 95% is too small for a human to fit in? I don’t think anyone would not call that part of the cave.

But here’s the real question. So we’ve mapped the cave. We know where the passages (i.e. lack of rock) are. We find a plug of mud and remove that. We’ve made more cave! Yeah! But what if we remove ALL the rock around the existing passage? When does the cave disappear? I mean now we just have a lot more “absence of rock”. But I think we’d agree at some point we no longer have a cave!

So back to Swiss cheese. One of the distinguishing details of such cheese is the holes, or, more properly named, the eyes. Did you know there are actual Federal guidelines on what can be called Swiss cheese? Ayup, you can’t simply have a cheese with eyes in it. So I guess Swiss cheese is sort of like a cave. We actually have to think about it to give it some definition we can agree on. Take away all the cheese, eyes and all, and you have no more cheese and I’m quite sad.

But what about accidents? Well, there’s a model of risk analysis called the Swiss cheese model. Basically, very few accidents occur out of the blue or entirely without a relation to other factors. The idea is you have multiple slices of Swiss cheese and all the holes have to line up for the accident to occur. For example, in my own personal experience, years ago I came close to all the “pieces” of the cheese lining up; while driving through New Jersey, I came fairly close to hydroplaning off an exit ramp into the woods.  Let’s look at some of the slices of cheese that came into play.

  • I was tired. Had I been more awake I’d have been paying a bit more attention.
  • It was dark. I might have noticed exactly how wet the exit ramp was during daylight.
  • I was travelling too fast.
  • I had nearly missed the ramp; I might have been travelling slower (see above) had I noticed the ramp sooner.

The instant I hit the ramp, I knew I was in trouble. I think the ONE slice that didn’t line up was experience. Had I been 20 years younger with less experience driving, I suspect I’d have ended up off the road. I was at the very edge of being able to brake and maneuver and I called upon all my years of experience to stay on the correct side of that edge. One thin slice of “cheese” saved me that night.

When one looks through accident reports, of almost any industry or activity, one can start to look for where the slices lined up and how any one could be changed. One reason I read the American Cave Accidents report when I receive it is to learn where the slices could have been moved so I can make sure I don’t line up my slices of cheese.

So, the question for you is where do your slices of cheese line up?

And the other question is, what sort of cheese do you put on YOUR roast beef sandwich? And do you make sure your Swiss cheese eyes don’t line up so every bite is assured a bit of cheese?

 

 

 

 

Copying a Large File

It was a pretty simple request actually. “Can you copy over the Panama database from FOO\WAS_21 to server BAR\LAX_45?”

“Sure, no problem.”

Of course it was a problem. Here’s the issue. This is at one of my clients. They have a couple of datacenters and have hundreds of servers in each. In addition, they have servers in different AD domains. This helps them partition functionality and security requirements. Normally copying files between servers within a datacenter isn’t an issue. Even copying files between the different domains in the same datacenter isn’t normally too bad. To be clear, it’s not great. Between servers in the same domain, it appears they have 1GB connections; between the domains, the firewall seems to throttle traffic down to 100MB.

The problem is when copying between different domains in different datacenters. This can be abysmally slow. That was my problem this week.  WAS_21 and LAX_45 were in different datacenters, and in different domains.

Now, for small files, I can use the cut and paste functionality built into RDP and simply cut and paste. This doesn’t work for large files. The file in this case was 19GB.  So this was out.

Fortunately, through the Citrix VDI they provide, I have a temp folder I can use. So, easily enough, I could copy the 19GB file from FOO\WAS_21 to that. That took just a few minutes.  Then I tried to copy it from there to BAR\LAX_45. This was slow, but looked like it would work.  It was going to take 4-5 hours, but they didn’t need the file for a week.

After about 4.5 hours, my RDP session locked up. I logged out and back in and saw the copy had failed. I tried again. This time at just under 4.5 hours I noticed an out of memory error. And then my session locked up.

So, apparently this wasn’t going to work. The obvious solution was to split the file (it was already compressed) into multiple files; except I’m not allowed to install most software on the servers. So that wasn’t a great option. I probably could have installed something like 7zip and then uninstalled it, but I didn’t want to deal with that and the paperwork that would result.

So I fell back to an old friend: Robocopy.  This appeared to be working great. Up until about 4.5 hours.  And guess what… another out of memory error.

But I LIKE challenges like this.

So I looked more closely. Robocopy has a lot of options. There are two that stuck out: /Z – restartable mode. That looked good. I figured worst case, I’d start my copy, let it fail at about 85% done and then resume it.

But then the holy grail: /J :: copy using unbuffered I/O (recommended for large files). 

Wow… unbuffered… that looks good. Might use less memory.

So I gambled and tried both. And lo and behold, 4:19 later… the file was copied!
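As a rough sketch of what that looked like, here is a small Python wrapper that builds the Robocopy command with both switches. The folder paths and the file name are hypothetical placeholders, not the ones from the client site, and the helper names are made up for illustration. The only Robocopy details relied on are the /Z and /J switches quoted above and the convention that exit codes below 8 mean the copy completed without failures.

```python
import subprocess


def build_robocopy_command(source_dir, dest_dir, filename):
    """Build a Robocopy command line for copying one large file."""
    return [
        "robocopy",
        source_dir,  # folder that holds the file
        dest_dir,    # folder to copy it into
        filename,    # copy only this file
        "/Z",        # restartable mode: resume a partial copy after a failure
        "/J",        # unbuffered I/O (recommended for large files)
    ]


def copy_large_file(source_dir, dest_dir, filename):
    """Run Robocopy (Windows only) and report whether the copy succeeded."""
    result = subprocess.run(build_robocopy_command(source_dir, dest_dir, filename))
    # Robocopy exit codes below 8 indicate the copy finished without failures.
    return result.returncode < 8


if __name__ == "__main__":
    # Hypothetical paths and file name, for illustration only.
    command = build_robocopy_command(r"C:\Temp", r"\\LAX_45\Backups", "Panama.bak")
    print(" ".join(command))
```

Only the command is printed here; calling copy_large_file would actually start the copy, so that is left for a Windows machine with real paths.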

So, it was an annoying problem but… I had solved it.  I like that!

So the take-away: Don’t give up. There’s always a way if you’re creative enough!

Math is Hard, Let’s Go Shopping

If I were to ask my readers to take a math test right now, approximately 1/2 would perform worse than if I had used a more neutral title such as “Math Quiz Below”. I’ll let you as the reader guess which 1/2.

This is a subtle form of priming. Multiple studies have shown that by priming people before taking tests or making decisions, we can influence their outcome. It isn’t quite subliminal advertising, but it can be close.

I’m currently reading Delusions of Gender by Cordelia Fine and it’s quite the read. I recommend it to my audience here. She goes into the studies showing how priming can impact outcomes and references them in more detail.

Overall, we know that women are less represented in STEM fields, but this lack of representation doesn’t start out this way. Studies show in grade school the interest in STEM by gender is about equal. But over time, there’s less representation of women in most STEM fields and often when they are represented, their positions either carry less weight (not as much advancement) or are perceived to carry less weight (ignored, spoken over, etc.). And before anyone comments, “but I know a woman who is a CTO at my company” or similar, keep in mind that those are noteworthy because they are the exceptions, not the norm.

Now, no single solution will solve the problem of women’s representation in STEM. But there are things we can do. First, we need to recognize that the human brain is probably built to be primed for certain responses. But don’t confuse this with saying that we can’t change what we’re primed for or how we respond. And, we can also avoid priming.

One study that is cited by Fine appears to suggest that collecting gender-biased demographic data AFTER a test or survey doesn’t cause a gender based result in the test. In other words, if you simply give a math test and then at the end ask questions like gender, or even to put one’s name on it (which can often have an influence on self-perception) it appears to remove the bias towards poorer performance by women. Similarly if you don’t ask at all.

But, most of us aren’t giving math tests are we?

But we are doing things like looking at resumes, deciding what conferences or seminars to attend, what blogs to read or respond to and how we interact with our coworkers and bosses.

One technique to consider is blind recruitment. Here much if not all demographic data is removed from a resume. This sort of work goes back to the Toronto Symphony Orchestra in the 1970s. But note, there is some evidence that it’s not the panacea some make it out to be. So proceed with caution.

When attending a conference or seminar, you can do one of a few things. For one, try to read the session descriptions without seeing the name of who is presenting. This can be a bit hard to do and may not quite get the results you want. Or, and I’m going to go out on a limb here because some people find this concept a bit sexist and I don’t have a great deal of data to support it, but…. go based on the names, and select sessions where women are presenting. Yes, I’m suggesting making a conscious, some would say sexist, choice.

So far I’ve been quite pleased by doing so. Over a year ago at SQL Saturday Chicago 2017 I decided to attend a session by Rie Irish called Let Her Finish: Supporting Women’s Voices from meetings to the board room. I’d like to say I was surprised to find that I was only 1 of 2 men in the room, but I wasn’t. I was a bit disappointed however, since really it was men who needed to hear the talk more than women. Oh and the other gentleman was a friend of Rie’s whom she had invited to attend. And a related tip, when attending such topics, generally, KEEP YOUR MOUTH SHUT. But that’s a different blog post for a different time.

Other great talks I’ve heard were Mindy Curnutt’s talk at SQL Summit 2017 on Imposter Syndrome. Or Deborah Melkin’s Back to the Basics: T-SQL 101 at SQL Saturday Albany 2017. Despite her being a first time speaker and it being a 101 class, it was great and I learned some stuff and ended up inviting her to speak at our local user group in February of this year.

Besides making your fellow DBAs, SQL professionals, IT folks etc feel valuable and appreciated, you’re also showing the event coordinators that their selections were well made. If more people attend more sessions given by women, eventually there will be more women presenting simply because more will be asked to present.

But what if you can’t go?  Encourage others. Rie and her partner Kathi Kellenberger (whom I’m indebted to for encouraging me to write my first book) are the leaders of the PASS WIT (Women in Technology) Virtual Chapter of PASS. Generally before a SQL Saturday they will retweet announcements of the various women speaking. It doesn’t hurt for you to do the same, especially for women that you know and have heard speak.

But what about when there are no women, or they’re poorly represented? Call folks out on it. Within the past year we’ve seen a “Women in Math” poster, which featured no women. There was a conference in Europe recently (I’m trying to find links) where women were extremely underrepresented. When women AND men finally started to speak out and threaten not to attend or speak, the conference seemingly suddenly found more women qualified to speak.

I’ve heard sometimes that “it’s hard to find women speakers” or “women don’t apply to speak”. The first is a sign of laziness. I can tell you right now, at least in the SQL world, it’s not hard. You just have to look around. In the second case, there may be some truth to that. Sometimes you have to be more proactive in making sure that women are willing to apply and speak. For my SQL/PASS folks out there, I would suggest reaching out to Rie and Kathi and finding out what you can do to help attract speakers to your conference or user group. Also, for example, if you don’t already have women speaking or in visible public positions within your organization, this can discourage women from applying because, rightly or wrongly, you’re giving off a signal that women may not be welcome.

Math may be hard, but it should not be because of gender bias, and we shouldn’t let gender bias, primed or not, allow under-representation to occur.

PS – bonus points if anyone can recognize the mountains in the photo at the top.

PPS – Some of the links below may end up outdated but:

And that’s just a small sampling of who is out there!

 

Alarming

So a recent trip to the ER (no, nothing serious, wasn’t me, thanks for asking) reminded me of a topic near and dear to my heart: Alarms and Alerts. What prompted this thought was the number of beeps, boops, and chirps I heard while there that no one responded to.  This leads to the question: Why have them, if no one responds to them?

I have a simple rule for alarms: “Don’t put an alert on something unless you have a response pre-planned for it.”

This is actually more complex than it sounds. And it can sometimes lead to seemingly illogical conclusions if you follow it in a reductio ad absurdum fashion.

Let’s start with an example of one alert I heard while sitting waiting. It was a constant beep, about 90 times a minute. I soon tracked it down to a portable monitor attached to a patient that was soon to be moved upstairs.  It was the person’s pulse.  Besides a possible HIPAA violation (I was now in theory privy to private medical information) it really served no purpose other than to annoy the patient and those around them. “But Greg, perhaps they were afraid the patient would suddenly go into cardiac arrest or something else would happen.”  And I agree, but then let’s alert on the sudden change in conditions, not in what was, at the time, a stable pulse for the patient. This beeping went on for over 10 minutes. And no one was monitoring it, other than the patient and us annoyed strangers.
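To make “alert on the sudden change, not the steady state” concrete, here is a minimal Python sketch. It stays silent while readings track their recent average and only flags a reading that jumps away from it. The window size and the 25% threshold are assumptions picked for illustration (they are not clinical values), and the function name is made up.

```python
from collections import deque


def change_alerts(readings, window=5, max_change=0.25):
    """Yield (index, value) for readings that deviate sharply from the recent average.

    window and max_change are illustrative assumptions, not clinical values.
    """
    recent = deque(maxlen=window)
    for index, value in enumerate(readings):
        # Only judge a reading once a full window of history exists.
        if len(recent) == window:
            baseline = sum(recent) / window
            if abs(value - baseline) > max_change * baseline:
                yield index, value
        recent.append(value)


steady_pulse = [90, 91, 89, 90, 92, 90, 91]
sudden_change = [90, 91, 89, 90, 92, 140, 91]

print(list(change_alerts(steady_pulse)))   # a stable pulse produces no alerts
print(list(change_alerts(sudden_change)))  # only the jump to 140 is flagged
```

A stable pulse of about 90 never makes a sound; the monitor only speaks up when there is something to respond to.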

So, there was an alert, that apparently needed no response.

But let’s go to the other extreme. What about when an alert isn’t needed? Let’s say you’re driving your car and it throws a rod. (Yes, this happened to me once, well I wasn’t driving, my father was. It was his sister’s Volkswagen campervan). I can tell you there is NO alert when such an event happens. But, there’s no need for it. The vehicle stops. It won’t go. So an alert in that case is pretty superfluous.

But let’s tie this to IT. I’m going to give you an absurd example of when not to have an alert: When you run out of disk space.  Again, you might disagree. You’d think this would be the perfect time to have an alert. But go back to my rule. What if you have no plan for this? You’ve never gamed out the possibility.  Now, you’re out of disk space. You don’t have a plan. Does it really matter if you had an alert or not? If you can’t respond, the alert really hasn’t added anything.

The main lesson to take away from that example is, if you’re setting up an alert, make sure you do have a plan. (The other lesson of course is perhaps to have an alert BEFORE you run out of disk space!) The plan may be as simple as, “delete as many files as I can”. But of course that only works if you have files to delete. Or it might be “add another filegroup to the database for now and then figure out the long-term solution during our next planned outage.”  Or, in the worst case it might be, “update my resume.”  But the point is, if you have an alert, have SOME plan for it.
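One way to hold yourself to that rule is to make it impossible to define an alert without a plan. The Python sketch below is only an illustration under stated assumptions: the Alert class, the low_disk_alert helper, the 15% free-space threshold, and the wording of the plan are all hypothetical. It also fires BEFORE the disk is full, while there is still room to act.

```python
import shutil
from dataclasses import dataclass


@dataclass
class Alert:
    name: str
    response_plan: str

    def __post_init__(self):
        # Enforce the rule: no alert without a pre-planned response.
        if not self.response_plan.strip():
            raise ValueError(f"Alert {self.name!r} has no response plan")


def low_disk_alert(free_bytes, total_bytes, min_free_fraction=0.15):
    """Return an Alert while there is still room to act, otherwise None.

    min_free_fraction is an assumed threshold chosen for illustration.
    """
    if free_bytes / total_bytes < min_free_fraction:
        return Alert(
            name="low disk space",
            response_plan="Delete old files; if that is not enough, add another filegroup.",
        )
    return None


if __name__ == "__main__":
    usage = shutil.disk_usage(".")  # named tuple: total, used, free
    print(low_disk_alert(usage.free, usage.total))
```

Trying to create Alert("disk full", "") raises an error, which is the point: if you can’t write down the response, you aren’t ready for the alert.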

On the flip side, how many times do you have an alert that you look at and say, “oh yeah, we can ignore that, that always happens.”  Sure, that’s a plan, but honestly, ask yourself, do you need an alert in that case? Probably not. I hate getting woken up at 2:00 AM for an alert I don’t need to respond to.  So in this case if there is no plan because you don’t need a plan, eliminate the alert.

I could go on (and perhaps this will be a good topic for my next book) but I’ll add one last real-world case where people all too often ignore alerts: smoke and CO detectors; especially CO detectors. If you have a CO detector and it alerts, do NOT assume it’s faulty and unplug it. Respond. Somehow. Don’t automatically assume it’s a faulty battery, especially if it’s the winter. If you have any doubt, please call the fire department. Trust me, they’d much rather respond to a call where you’re all alive and it’s a false CO alarm than to show up and find the alarm going off, but everyone is now dead.

So the take away is, alerts are only useful if they generate a useful response.

Oh and because the inner child in me can’t resist: be a lert because the world needs more lerts! 🙂

 

 

Wet Paint

There’s an old saw that if you tell someone there are 1 billion people in China, they’ll believe you, but if you put up a wet paint sign, they’ll have to touch the paint to be sure. I find it interesting that we believe some things easily and others not so easily. Or even sometimes, we may think we believe something until we actually experience it. Then we somehow believe it even more.

Two incidents of this nature occurred to me within the last year.  Last year I drove to my uncle’s house in South Carolina in order to observe the total eclipse. Prior to totality, one of the phenomena that I knew would happen was seeing crescent shaped “shadows”. (Really it’s sort of the reverse since you’re seeing crescent shaped light.) I mean I had read about it, I had seen pictures, and I basically understood why they would occur.  And yet, at one point I was walking back into the house to do something when I looked down and lo and behold… I saw crescent shaped “shadows”.  My reaction was one of “Holy Cow, this really DOES happen!”  Now, I had no doubt intellectually that it should happen, or that I would most likely see it. But, actually seeing it was still amazing and while I really had no reason to have to verify the phenomenon, the fact that I personally had experienced it was incredible. I understood it at a visceral level, not just an intellectual one.

The second incident occurred just the other day. As a long-time sufferer of allergies, spring time has sometimes been a bit less than comfortable for me. And over the years, I had seen videos of pine trees releasing their pollen in a massive cloud (I’m thankfully NOT allergic to pine pollen it appears) but I had never actually experienced it myself. So again, I knew intellectually it happens.

So two days ago I was sitting outside looking at a pine tree when suddenly I saw this puff of what looked like smoke, and then a cloud of pollen waft away from the tree. Again, I had experienced that Holy Cow moment when the visceral experience matched the intellectual one. It was pretty cool.

That all said, I have stopped touching wet paint at this point in my life. But I still love these sorts of confirming experiences (though I’m not eager to start counting heads in China at this time).

What facts did you know that, when you finally experienced them first hand, had an impact upon you? I’d love to hear them.

Teamwork

I spent the majority of last week with some of the greatest people I know: my fellow NCRC (National Cave Rescue Commission) instructors and students. Let’s start with the obvious: it takes a strange twist of mind to be the sort of person that actually enjoys crawling through dark holes in the ground. Now take that same group of people and add a sense of altruism and you’ve got folks willing to rescue others trapped in caves.

Let me interject in here by the way, that caving is actually a fantastically safe sport. When an accident in a cave makes the news, it’s because it’s so rare and generally so unique as to attract attention. The NSS puts out a report every 2 years detailing the accidents across the US. It’s well worth the read if you can find a copy. Now back to the rest of my blog post.

I have been one of the many instructors who teach the “Level 2” class. I love this level because students have gotten beyond the basics and are starting to learn the “why” we do things more than simply the “how”.  And we start to really work on their team leadership and teamwork skills (as I mentioned last week in my post about the lost Sked).

Good teamwork isn’t just a bunch of people working to solve a task. It’s them working together and often anticipating the needs.  Two events last week illustrated a failure and a success.  At one point, I was a mock patient and the class had been broken into two teams. The first one had to get me out of a tight crawl-way (packaged in a Sked) and up to an open area. From there another team had set up a haul system to get me to the top of a 60′ or so (I’ve never really measured it) block of rock.  This was towards the end of the week when the students (and instructors) are a bit tired and overwhelmed with everything they had learned.  We had allocated 90 minutes for the exercise.

During the first part, I just felt like something was off. Nothing serious, nothing I could put my finger on. But the magic the students had shown all week just wasn’t quite there.

And then there I was, 2 hours into the exercise, lying on top of the tight area, waiting to be hauled up. Someone on the lower team said that I was ready to go. Someone at the top of the block said they were ready to go. And then I sat for 2-3 minutes until finally things started happening again. But for a few minutes the days of teamwork had fallen apart. They weren’t trying to anticipate needs or even work cooperatively. It wasn’t a huge issue, but it did get a reaction from our oldest, most irascible Level 2 instructor. I think the word ‘disappointed’ was used at least twice. It might have been a bit harsh, but… it worked.

As the final exercise of the day, the instructors decided to have a bit of fun with the students. We hid the previously lost (and then found) Sked.  The instructor in charge then informed the students that their lost “patient” for this particular exercise was about 4′ tall, last seen wearing bright orange, and was afraid of the dark. As details were added, it suddenly dawned on them what we had in mind.

But they took the task seriously. They did an excellent search and when they were then told the “patient” wasn’t responsive to stimuli, and they couldn’t rule out a c-spine injury, therefore the Sked had to be packaged in another litter, they did so with gusto and honestly, one of the best packaging jobs I had seen all week. All this while laughing.

They took an absolutely silly scenario, laughed while doing it, yet exhibited amazing teamwork. They were back on their game!

Sometimes teams can start to fall apart, but reminding them of how good they can be and providing a bit of levity can help elevate them back to their best!

 

A Lost Sked

Not much time to write this week. I’m off in Alabama crawling around in the bowels of the Earth teaching cave rescue to a bunch of enthusiastic students. The level I teach focuses on teamwork. And sometimes you find teams forming in the most interesting ways.

Yesterday our focus was on some activities in a cave (this one known as Pettyjohn’s) that included a type of a litter known as a Sked. When packaged it’s about 9″ in diameter and 4′ tall. It’s packaged in a bright orange carrier. It’s hard to miss.

And yet, at dinner, the students were a bit frantic; they could not account for the Sked. After some discussion they determined it was most likely left in the cave.

As an instructor, I wasn’t overly concerned; I figured it would be found and if not, it’s part of the reason our organization has a budget for lost or broken equipment, even if it’s expensive.

That said, what was quite reassuring was that the students completely gelled as a team. There was no finger pointing, no casting blame. Instead, they figured out a plan, determined who would go back to look for it and when. In the end, the Sked was found and everyone was happy.

The moral is, sometimes an incident like this can turn into a group of individuals who are blaming everyone else, or it can turn a group into a team where everyone is sharing responsibility. In this case it was the latter and I’m quite pleased.

Legacy

“…the good is oft interred with their bones.”

Or these days, lives on in the Internet. I never quite agreed with Shakespeare in this line. I think the good lives on beyond the grave.

In the book Lies My Teacher Told Me, author James Loewen talks about how certain African tribes divide people into three categories: those alive, the sasha or living-dead, and zamani or the dead.

This was brought to mind yesterday when I was trying to debug an issue which turned out to be a bug in SQL Server 2016 SP2.  While trying to debug it, I needed to add a user to an SSIS setup. This has been a problem in the past, but I recalled I had used #SQLHELP on Twitter to ask the question and gotten a great answer. So, a quick search later found the response I was looking for. The fully correct answer (since MSFT’s page leaves out a step) was available at: http://sqlsoldier.net/wp/sqlserver/howdoigrantaccesspermissionsforssistousers

Now, many of my readers won’t recognize the name, but some will: @SQLSoldier, a member of the #SQLFamily who passed away recently. At the time of his passing I had forgotten that he had reached out to help me last year. The search yesterday though brought it back to me. I never had the honor of meeting Mr. Davis in person, but I know many others spoke highly of him. It was comforting to me to know that even months later his legacy was still helping me (and presumably others).

After thinking about that, I got thinking about my dad. Soon after he passed in 2015, I picked up the hefty Milwaukee right-angle drill that had been his and was now mine. I was working on the addition (that he had helped design before his death) that has since become my office. I had always liked that particular tool. It has a certain heft and power to it. At the time, with his death so close at hand, it was a form of grief therapy for me. I had often used this in my youth, helping him out with various construction projects. To this day I’ll pick up one of the tools I inherited, or start a house project using the skills he taught me and I realize, he’s sasha, living-dead. He still lives on in me.

SQLSoldier is also living-dead, he’s very real in the hearts and minds of those who knew him and his legacy lives on, still helping others, such as myself.

As my age is now entering its 2nd half-century,  I wonder more and more what my legacy will be. I hope that when I’m sasha, my legacy can still help and aid others.

And with that, I will conclude with a scene from one of my favorite actors in one of my favorite movies:  “What will your verse be?”

 

RCA or “get it running!”

How often have any of us resorted to fixing a server issue by simply rebooting the server?  Yes, we’re all friends here, you can raise your hands. Don’t be shy. We all know we’ve done it at some point.

I ask the question because of a recent tweet I saw with the hashtag #sqlhelp where Allan Hirt made a great comment:

Finding root cause is nice, but my goal first and foremost is to get back up and running quickly. Uptime > root cause more often than not.

This got me thinking, when is this true versus when is it not? And I think the answer ends up being the classic DBA answer, “it depends”.

I’m going to pick two well studied disasters that we’re probably all familiar with. But we need some criteria.  In my book IT Disaster Response: Lessons Learned in the Field I used the definition:

Disaster: An unplanned interruption in business that has an adverse impact on finances or other resources.

Let’s go with that.  It’s pretty broad, but it’s a starting point. Now let’s ignore minor disasters like I mention in the book, like the check printer running out of toner or paper on payroll day. Let’s stick with the big ones; the ones that bring production to a halt and cost us real money.  And we’re not going to restrict ourselves to IT or databases, but we’ll come back to that.

The first example I’m going to use is the Challenger Disaster. I would highly recommend folks read Diane Vaughan’s seminal work: The Challenger Launch Decision: Risky Technology, Culture, and Deviance at NASA. That said, we all know that when this occurred, NASA did a complete stand-down of all shuttle flights until a full RCA was complete and many changes were made to the program.

On the other hand, in the famous Miracle on the Hudson, airlines did not stop flying after the water landing. But this doesn’t mean an RCA wasn’t done. It in fact was; just well after the incident.

So, back to making that decision. With Challenger, it was an easy decision. Shuttle flights were occurring every few months and other than delaying some satellite launches (which ironically may have led to issues with the Galileo probe’s antenna) there wasn’t much reason to fly immediately afterwards. Also, while the largest points were known, i.e. something caused a burn-thru of the SRB, it took months to determine all the details. So, in this case, NASA could and did stand-down for as long as it took to rectify the issues.

In the event of the Miracle on the Hudson, the cause was known immediately. That said, even then an RCA was done to determine the degree of the damage, whether Sullenberger and Skiles had done the right thing, and what procedural changes needed to be made. For example one item that came out of the post-landing analysis was that the engine restart checklist wasn’t really designed for low altitude failures such as they experienced.

Doing a full RCA of the bird strike on US Airways 1549 and stopping all other flights would have been an economic catastrophe. But it was more than simply that. It was clear, based on the millions of flights per year, that this was a very isolated incident. The exact scenario was unlikely to happen again. With Challenger, there had only been 24 previous flights, and ALL of them had experienced various issues, including blow-bys of the primary O-ring and other issues with the SRBs.

So back to our servers.  When can we just “get it running” versus taking downtime to do a  complete RCA vs other options?

I’d suggest one criteria is, “how often has this happened compared to our uptime?”

If we’ve just brought a database online and within the first week it has crashed, I’m probably going to want to do more of an immediate RCA. If it’s been running for years and this is the first time this issue has come up, I’m probably going to just get it running again and not be as adamant about an immediate RCA. I will most likely try to do an RCA afterwards, but again, I may not push for it as hard.

If the problem starts to repeat itself, I’m more likely to push for some sort of immediate RCA the next time the problem occurs.

What about the seriousness of the problem? If I have a server that’s consistently running at 20% CPU and every once in a while it leaps up to 100% CPU for a few seconds and then goes back to 20%, will I respond the same way as if it crashes and it takes me 10 minutes to get it back up? Maybe. Is it a web server for cat videos that I make a few hundred dollars off of every month? Probably not. Is it a stock-trading server where those few seconds are costing me thousands of dollars? Yes, then I almost certainly will be attempting an RCA of some sort.

Another factor would be, what’s involved in an RCA? Is it just a matter of copying some logs to someplace for later analysis and that will simply take a few seconds or minutes, or am I going to have to run a bunch of queries, collect data and do other items that may keep the server off-line for 30 minutes or more?

Ultimately, in most cases, it’s going to come down to balancing money and, in the most extreme cases, lives. Doing the RCA now may save money later, but cost money now. On the other hand, not doing an RCA now might save money now, but might cost money later. Some of it is a judgement call, and some of it depends on the factors above: how often the problem occurs, how serious it is, and what the RCA itself involves.

And yes, before anyone objects, I’m only very briefly touching upon the fact that often an RCA can still be done after getting things working again. I’m just touching upon the cases where it has to be done immediately or evidence may be lost.

So, what are your criteria for when you do an RCA immediately vs. getting things running as soon as you can? I’d love to hear them.

And credit for the Photo by j zamora on Unsplash

SQL Saturday Philly Followup

So last week I visited a client I have near King of Prussia, PA and then went to SQL Saturday.

This particular client I’ve worked with for over 5 years now and it’s been quite an interesting time. What started out as a 3-6 month project turned into a multi-year, basically full-time engagement and now it’s down to some piecemeal work. But that too is unfortunately slowly ending as they bring their new in-house DBA up to speed. I spent about 1/2 my time there doing a data-dump to him and my manager.

But, I’m not here to talk about that, I’m here to talk about SQL Saturday, customer service and a bit more.

But first, a joke:

“How many DBAs does it take to solve a hardware problem?”

By the count of it, at least a 1/2 dozen.

I got there and for my first session decided to attend Kathi (aka Aunt Kathi) Kellenberger’s session on windowing functions. Fortunately she showed up early because it turns out she could not get her laptop to talk to the monitor. We tried one fix using an existing cable until we realized we had the wrong end plugged in (basically the monitor end we stole from a monitor). This is one of the big fears of any presenter, showing up and not being able to project one’s screen! So, over the next 30 minutes several of us tried to help with a bit of everything, including the “reboot the projector” advice.

Finally, only after one of the organizers (with permission of the hosting organization) pried off the back of the podium was I able to realize “oh, THIS cable will work”. I handed it up to Kathi and she plugged in her laptop and was able to project. And it was, as I expected, a great, informative presentation. I definitely learned a few things.

I have Kathi to thank (or to blame!) for inspiring me to write my book. So I was more than glad to help her out.

My talk on presenting was well received with a good turnout and a number of questions from audience members. This was in contrast to when I gave it in DC where I had only a few audience members. And it was in definite contrast to my experience in Colorado Springs where I had no one show up for my presentation. I’ll admit, it was nice to get back on the horse and have such a successful presentation.

Later, I made a point of attending a session by Sarah Hutchins on how to Ace your Job Interview. It was her first time presenting at SQL Saturday and, besides being interested in the topic, I wanted to support her. She did great. It did turn out that she needed help with her clicker for PowerPoint so I loaned her mine. I in fact have a slide in my presentation about clickers and helping out fellow speakers, etc.

So, it was with a bit of a laugh that I saw Grant Fritchey’s blog post this week on Presentation Tools. Grant was one of the first speakers I ever saw at a SQL Saturday, back in Boston, I believe 4 years ago. Besides being a great speaker, I’ve appreciated that he’s felt a need to “give back” to the community, and in part he does that by supporting and encouraging up-and-coming speakers and writing informative posts like his most recent one cited here.

So a lot of this weekend was about how #SQLFamily helps each other. Kathi encouraged me to write a book, I was able to help her and Sarah with their hardware issues, Grant, funnily enough, followed up this week with advice on hardware for speakers, and so the circle continues.

Contrast that to my stay at Extended Stay America. There’s an adage in business:

It takes months to find a customer and only seconds to lose one.

ESA certainly lost one this weekend. After arriving at SQL Saturday, I realized I had left my shoes in my room at the hotel.  As soon as I got an opportunity I emailed them. I didn’t hear back right away, so I later called.  The response was less than stellar. First, they’d have to check with the housekeeper in question and they’d call me back. But additionally their policy was not to mail items to customers and in the event they did, they expected the customer to pay for shipping. Not the most customer friendly response, but I could deal with the shipping if they did in fact find my shoes.

No more response that day and I wasn’t about to drive 20 minutes in the opposite direction on the off-chance they had found my shoes because it wasn’t even clear the front desk would have access to them (since they couldn’t confirm anything until they spoke to the housekeeper in question.)

Sunday morning I woke up to an email which I will quote in its entirety:

We are unable to send these to you as our mail delivery does not pick up packages unless it is addressed for ESA business.

So, at least the way I read this, it still doesn’t answer my question of whether they had even found them.

Finally, last evening I spoke on the phone with the manager, who kept reiterating their policy but never said they had actually found them. I finally had to stop her and ask, “Do you even have them? You’ve never actually said that.” “Oh yes we do, but we can’t ship them to you.” “What if I pay for the shipping?” “We don’t do that.” Meanwhile she says repeatedly, “I’m doing everything I can to help you.”

I’m still not sure how “I can’t ship them to you” and “I’m doing everything I can to help you” jibe.

But let’s just say, this whole experience has left a sour taste in my mouth.

Again a little effort can go a long way.

So, that’s my experience this weekend.  Some great people who will help each other and others who are willing to write off paying customers.

But, despite not being a very code heavy blog, I’m going to toss out this tidbit for future reference:

# Source server, database, and output directory
$sourceserver = 'Myserver\sqlexpress'
$sourcedb = 'AdventureWorks2014'
$outputdirectory = 'c:\temp\'

# Get the schema and name of every user table in the source database
$tables = Invoke-Sqlcmd -ServerInstance $sourceserver -Database $sourcedb -Query "select ss.name as schema_name, so.name as table_name, ss.name+'.'+so.name as full_name from sysobjects so inner join sys.schemas ss on ss.schema_id=so.uid where type='U'"

# Export each table in native format to [schema].[table].bcp
ForEach ($table in $tables)
{
    $bcpstring = "bcp $($sourcedb).$($table.full_name) out $($outputdirectory)[$($table.schema_name)].[$($table.table_name)].bcp -S $sourceserver -T -E -n"
    #Write-Host $bcpstring
    Invoke-Expression $bcpstring
}

It’s not much, but I had a recent need to dump out every table of a particular database for a client. So I wrote this.  BTW, by including the [] in the filenames, when I go to load this data, the QUOTENAME version of the schema.table is automatically used.
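For what it’s worth, here is a rough sketch of what the load side might look like, since the bracketed filenames are what make it easy. This is not the script I actually ran; the function name Get-BcpImportCommand and the variables $targetserver, $targetdb and $inputdirectory are made up for illustration, and it assumes the target tables already exist. It only builds and prints the bcp commands, so you can look them over before running anything.

```powershell
# Hypothetical target settings; these names are assumptions, not from the export script
$targetserver = 'Myserver\sqlexpress'
$targetdb = 'AdventureWorks2014_Copy'
$inputdirectory = 'c:\temp\'

# Build a bcp "in" command from a file named like [schema].[table].bcp
function Get-BcpImportCommand {
    param(
        [string]$FilePath,
        [string]$Server,
        [string]$Database
    )
    # Dropping the .bcp extension leaves the quoted name, e.g. [dbo].[Orders]
    $tablename = [System.IO.Path]::GetFileNameWithoutExtension($FilePath)
    # The file path is wrapped in double quotes in case it contains spaces
    "bcp $($Database).$($tablename) in `"$FilePath`" -S $Server -T -E -n"
}

# Print one import command per exported file (nothing is executed here)
if (Test-Path $inputdirectory) {
    ForEach ($file in Get-ChildItem -Path $inputdirectory -Filter '*.bcp') {
        $bcpstring = Get-BcpImportCommand -FilePath $file.FullName -Server $targetserver -Database $targetdb
        Write-Host $bcpstring
    }
}
```

Because the schema and table names are already wrapped in [] in the filename, the sketch never has to do any quoting of its own; it just strips the extension.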