About Greg Moore

Founder and owner of Green Mountain Software, a consulting firm based in the Capital District of New York focusing on SQL Server. Lately I've been doing as much programming as I have DBA work so am learning a lot more about C# and VB.Net than I knew a couple of years ago. When I'm not in front of a computer or with my family I'm often out caving or teaching cave rescue skills.

Locked Out

As I’ve mentioned, not only am I fascinated by disasters and their root causes and how we react, I’m also fascinated by how we take steps to prevent them.  In my book IT Disaster Response: Lessons Learned in the Field I discuss the idea of blue-flagging on railroads.  The important concepts were two-fold: 1) a method of indicating that the train should not be moved and 2) controls on who could remove that indication.

During my recent power outage, I came across something similar.  It should be the featured image for this article.  It’s basically an orange flag locked to a utility pole.  Note the key word there, locked.

The photo doesn’t show the fact that this utility pole contained circuit breakers (I believe that’s the proper term in this context) for the overhead power lines. They had been tripped as a result of a tree further down the road taking out all three supply lines.

Close-up of the orange flag on the utility pole, along with the tag carrying lock-out information.

So let’s analyze this a bit:

The orange flag itself was VERY visible. This ensures that any other crews that might be in the area know there is something they need to notice.

There is a tag with detailed information. It’s hard to see in the above photo, but it includes who tagged it, the location, and date and some other information.

What’s not clear from the photo is that it’s padlocked to the pole.

Now, to be clear, this is NOT a physical lock-out like you see on some power panels (i.e. where the padlock physically prevents the circuit-breaker from being opened or closed).

In this case, a physical lock-out would most likely have to be placed 30′ in the air at the top of the pole where it wouldn’t be easily noticed.

But that said, this served its purpose. It alerted other crews to a danger in the area and presumably can only be removed by the person who put it there. And it contains information on that person so they can be reached if there are questions.

Since power was restored within 1 hour and I didn’t hear any reports of line workers getting electrocuted, this appears to have worked.

Today’s take-away: when you have a change from the normal state of operation, what steps can you take to ensure that others don’t try to return items to a normal state of operations without confirming things first? By the way, for a good read-up on how bad things can go when intentions about a non-standard mode of operation don’t get properly communicated, I recommend reading up on the events leading up to the Chernobyl disaster.

Procedures are important. Deviating from them can have serious consequences. Do what you can to minimize the possibility of deviations.
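For readers who want to carry the flag-and-padlock idea into their own systems, here is a minimal Python sketch of the two concepts above: a visible indication that something should not be returned to normal, and a control on who may remove that indication. Everything in it (the `Resource` and `LockoutTag` names, the fields, the example job name) is hypothetical and invented for illustration; it is not taken from any real tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


class LockedOutError(Exception):
    """Raised when someone tries to operate or untag a locked-out resource."""


@dataclass
class LockoutTag:
    # The same kind of details the tag on the pole carried: who, where, why, when.
    owner: str
    location: str
    reason: str
    placed_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class Resource:
    """A hypothetical server, scheduled job, or breaker that can be tagged out."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.tag: Optional[LockoutTag] = None
        self.running = True

    def lock_out(self, owner: str, location: str, reason: str) -> LockoutTag:
        """Stop the resource and attach a tag naming the person responsible."""
        if self.tag is not None:
            raise LockedOutError(f"{self.name} is already tagged by {self.tag.owner}")
        self.running = False
        self.tag = LockoutTag(owner, location, reason)
        return self.tag

    def remove_tag(self, requester: str) -> None:
        """Only the person who placed the tag may remove it."""
        if self.tag is None:
            return
        if requester != self.tag.owner:
            raise LockedOutError(
                f"{self.name} was tagged by {self.tag.owner}; ask them before removing it"
            )
        self.tag = None

    def restart(self, requester: str) -> None:
        """Refuse to return to normal operation while a tag is in place."""
        if self.tag is not None:
            raise LockedOutError(
                f"{requester} cannot restart {self.name}: locked out by "
                f"{self.tag.owner} ({self.tag.reason})"
            )
        self.running = True
```

As with the flag on the pole, this is an indication rather than a physical lock: it only helps if every path that can restart the resource actually checks for the tag first.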

 

Conspiracy Theories

If I told you I thought the Earth was flat, you’d probably think I was off my rocker. What if I told you that we never landed on the Moon? Probably a similar reaction. What if I told you that for decades the government ran a medical experiment on black men and denied them the proper treatment for their disease, a treatment that once discovered basically had a 100% success rate in curing the disease? If you’ve guessed that I’m referring to the Tuskegee Syphilis Experiment, you’d be right. But what if I told you this in 1952? You’d probably think I was nuts.

And yet.. that was the truth. The US Government knowingly withheld proper treatment to see what would happen. And didn’t really tell anyone.

If you think about it, this has all the hallmarks of a typical conspiracy theory. And at the time, most likely it would have been dismissed as one.

Now, to be clear, I’m convinced that the Earth is roughly an oblate spheroid, that we did land on the Moon and that vaccines do not cause autism. I also believe that there are over 1 billion people living in China.

But the truth is… how does any of us really know any of that? At some point we have to make a decision to believe certain facts.  Yes, we can say, “but there’s overwhelming evidence” but even then, much of the evidence is something we end up having to place faith in.  Someone can show us the multitude of studies that show no correlation between vaccines and autism, but ultimately, we have to believe THOSE studies.

Some things we can verify for ourselves, or hopefully we can build enough of a logical framework that it makes sense to believe what we’re told. For example, a good question to ask about the Moon landings is, “if they were a hoax, why didn’t the Soviet Union expose it?” (And by the way, I did once get talking to a moon hoaxer who simply and calmly explained that in exchange for them not revealing it we agreed to lose Vietnam.)

But even then, logic may fail us or steer us in the wrong direction. For millennia geometry was based on Euclid’s original 5 axioms. Until someone tossed out the one on parallel lines and we suddenly had various forms of non-Euclidean geometry.

For millennia we believed that we had an absolute reference frame. Until Einstein (and others) tossed out that idea.

Ultimately, even with logic, we have to make some assumptions, and occasionally question them.

For example, you have to believe that I’m really Greg Moore and I’m writing this. Perhaps even that is a lie.

But it’s not. 🙂

So my takeaway here is: don’t believe in conspiracy theories, but sometimes it’s worth questioning assumptions and sometimes some conspiracy theories MIGHT actually have a grain of truth to them. You decide.

 

 

Sharing and Building

I’ve mentioned in the past that I think it’s important to share and give back knowledge.

This week’s blog post will be short (sorry, they can’t all be great works of art.) But first I want to mention an event that just happened. I’m the leader of the local SQL Server User Group: CASSUG. We had our monthly meeting last night and I was grateful that Hilary Cotter was willing and able to drive up from New Jersey to present on Service Broker.

When I arrange for speakers, I always hope my group gets something out of it. Well, last night we had a new member visiting from out of town, so he’ll probably rarely make future meetings. And today, I read from him: “Hilary’s presentation was very informative and interesting.” and “Now it has piqued my interest and I’ve started a Pluralsight course to learn more.” To me, that’s success.

At our July meeting we had lightning rounds. Instead of a single presenter, we had four of our local members present on a topic of their choice for about 15 minutes each. One of them presented on using XML results in a SQL query to help build an HTML-based email. He adapted the idea from, I believe, this blog post. Twice now in the last month I’ve used it to help clean up emails I had a system sending out. Yesterday, I finally decided to clean up an old, ugly, hard-to-read text-based email that showed the status of several scheduled jobs we were running overnight. A few hours later, after some tweaking, I had a beautiful, easy-to-read email. Excellent work, and all based on an idea I never would have come up with if my colleague had not shared it from his source.
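To give a feel for the general idea of turning job-status rows into an HTML email, here is a small sketch. Note that it is not the technique from the lightning talk, which built the HTML inside the SQL query itself using XML results; this is a Python stand-in that does the formatting outside the database. The column layout, function names, and email addresses are all assumptions made up for the example.

```python
from email.message import EmailMessage
from html import escape


def jobs_to_html(jobs):
    """Render (job name, status, finished-at) rows as a small HTML table."""
    header = "<tr><th>Job</th><th>Status</th><th>Finished</th></tr>"
    rows = []
    for name, status, finished in jobs:
        # Escape every value so a stray < or & in a job name cannot break the markup.
        cells = "".join(
            "<td>" + escape(str(value)) + "</td>" for value in (name, status, finished)
        )
        rows.append("<tr>" + cells + "</tr>")
    return '<table border="1">' + header + "".join(rows) + "</table>"


def build_status_email(jobs, sender, recipient):
    """Build (but do not send) a message with a plain-text part and an HTML part."""
    msg = EmailMessage()
    msg["Subject"] = "Overnight job status"
    msg["From"] = sender
    msg["To"] = recipient
    # Plain-text fallback for mail clients that do not render HTML.
    msg.set_content("\n".join(f"{n}: {s} at {f}" for n, s, f in jobs))
    msg.add_alternative(jobs_to_html(jobs), subtype="html")
    return msg


if __name__ == "__main__":
    # Hypothetical rows, standing in for the results of a job-history query.
    sample = [("Nightly backup", "Succeeded", "02:15"), ("Index rebuild", "Failed", "03:40")]
    print(build_status_email(sample, "dba@example.com", "team@example.com"))
```

Sending the message is left out on purpose; how mail leaves the building varies a lot between environments.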

And that leads me to a bit of self-promotion. When I created this blog, my goal was not to have lots of posts around SQL Server. Several months ago, a mentor of mine (I don’t know if she considers herself that, but I do, since she’s the one that planted the seed in my head for my first book: IT Disaster Response: Lessons Learned in the Field) approached me at SQL Saturday Atlanta and mentioned she was now an editor for Red-Gate’s Simple-Talk blog section and asked me if I’d be interested in writing.  I was.

So I’m proud to say that the first of my blog posts at the Red-Gate Simple-Talk site is up. Go read it. I’m excited. As of today it’s had over 2000 views! Far more than I get here. And there’s more to come.

And here’s the kicker. Just today I had a client say, “Hey, I need to get this data from this SQL 2014 database to a SQL 2008 Database.”  I was able to say, “I’ve got JUST the answer for that!”

Sharing knowledge is a good thing. It makes us all far more capable and smarter.

 

Less than our Best

I’ve mentioned in the past that I participate a lot in SQL Saturday events and also teach cave rescue. These are ways I try to give back to at least two communities I am a member of. I generally take this engagement very seriously, for two reasons.

The first, which is especially true when I teach cave rescue, is that I’m teaching critical skills that may or may not put a life on the line. I can’t go into teaching these activities without being prepared or someone may get injured or even killed.

The second is that the audience deserves my best. In some cases, they’ve paid good money to attend events I’m talking or teaching at. In all cases, they’re taking some of their valuable time and giving it to me.

All the best SQL Saturday speakers and NCRC instructors I know feel generally the same about their presentations. They want to give their best.

But here’s the ugly truth: Sometimes we’re not on our A game. There could be a variety of reasons:

  • We might be jet-lagged
  • We may have partied a bit too much last night (though for me, this is not an issue, I was never much of a party animal, even when I was younger)
  • You might have lost your power and Internet the day before during the time you were going to practice and found yourself busy cutting up trees
  • A dozen other reasons

You’ll notice one of those became singular. Ayup, that was my excuse. At the SQL Saturday Albany event, due to unforeseen circumstances the day before, the time I had allocated to run through my presentation was spent removing trees from the road, clearing my phone line and trying to track down the cable company.

So, one of my presentations on Saturday was not up to the standard I would have liked it to be. And for that, to my audience, I apologize (and did so during the presentation).

But here’s the thing: the feedback I received was still all extremely positive. In fact, the only really non-positive feedback was very constructive criticism that would have been valid even had I been as prepared as I would have liked!

I guess the truth is, sometimes we hold ourselves to a higher standard than the audience does. And I think we should.

PS: a little teaser, if all goes as planned, tomorrow look for something new on Red-Gate’s Simple Talk page.

White (K)nights

I apologize for skipping two weeks of blog posts, but I was a bit busy; for about 11 days my family and I were visiting Europe for the first time. It was a wonderful trip. It started with a trip to Manchester UK for a SQL Saturday event.

I had sort of forgotten exactly how much further north we were until it dawned on me how early dawn was.  Actually we had noticed the night before as we walked back from the amazingly wonderful speakers’ dinner how light it was despite how late it was.  When I woke up at around 4:30 AM (a bit of jetlag there) I noticed despite the blackout curtains how bright it was around their edges. I later looked it up, and it appears that technically it never reached “night” there, but simply astronomical twilight.

Ever since seeing the movie “White Nights” my wife has always wanted to experience the white nights of Russia. This wasn’t that, but it was close.

This trip followed on the heels of the amazingly successful Thai Cave Rescue that I had previously commented on. As long-term readers know, I’m a caver who also teaches cave rescue and has a role as the Northeast Coordinator of the National Cave Rescue Commission. During the 18-day saga, I and others were called upon by various media outlets to give our insight and perspective. I was fortunate; I only did a little under a dozen media events. Our National Coordinator, Anmar Mirza, did well over 100, and most of those in about a 5-day period. A link to one of my media events is here: The Takeaway.

I don’t want to talk about the operation itself, but I want to talk about White Knights. We love our White Knights: the term often refers to a character who will ride into town and single-handedly solve the town’s problems. The truth is, white knights rarely if ever exist and most problems require a lot more effort to solve.

We’ve seen this in politics, and we saw this with this cave rescue. Let me start by saying I think the work Elon Musk has done with SpaceX is amazing. SpaceX has in fact single-handedly revolutionized the space launch market.

It was perhaps inevitable that Musk’s name would show up in relation to this cave rescue. Musk has previously gotten attention for attempting to help with the power outage crisis in Puerto Rico, and now for his vow to help the people of Flint (both, by the way, I think are worthy causes, and I wish him and, more importantly, the people he’s trying to help well).

But here’s the thing: a cave rescue isn’t solved by a white knight. It’s solved by a lot of effort and planning with a lot of people with a variety of skills and experience. There’s rarely a magic breakthrough that makes things easier.

And I’ll be blunt: his “submarine” idea, while interesting, was at best a PR distraction and at worst, possibly caused problems.

“But Greg, he was trying to help, how could this make things worse?”  I actually disengaged from an online debate with some Musk fanbois who couldn’t see why Musk’s offer was problematic. To them, he was the white knight that could never do wrong.

Here’s the thing: I know for a fact that several of us, myself included, had to take part of our allotted airtime or written coverage to address why Musk’s idea probably wouldn’t work. This meant less time or room for useful information to be passed on to the audience. Part of my role as regional coordinator is to educate people about cave rescue, and I can’t do this effectively when I’m asked to discuss distractions.

“But so what, that didn’t impact the rescue.” No, it didn’t. But, it appears from the Twitter fights I’ve seen, and other information, that at least some resources on the ground were tasked to deal with Musk. This does mean that people had to spend time dealing with both Musk and the publicity. This means those resources couldn’t be spent elsewhere. At least one report from Musk (which honestly I question) suggests he actually entered the cave during the rescue operations. This means that resources had to be spent on assuring his safety and possibly prevented another person who could have provided help in other ways (even if it was simply acting as a sherpa) from entering.

And apparently, there’s now a useless “submarine” sitting outside the cave.  I’ll leave discussion of why I had problems with the submarine itself for another post.

But here’s one final reason I have a problem with Musk bringing so much attention to himself and his idea: it could have led to second-guessing.

Let’s be clear: even the cave divers themselves felt that they would most likely lose some of the kids; this was exactly how dangerous the rescue was. This is coming from the folks who best knew the cave and best understood the risks and issues. Some of the best cave divers in the world, with rescue experience, who were on-site, thought that some kids would die in the attempt to rescue them. And, if reports are true, they were aware of Musk’s offer and obviously rejected it (and in fact one suggested later that Musk do something anatomically impossible with it).

Had the rescuers’ worst fears come true, Musk fanbois would have second-guessed every decision. In other words, people would have put more faith in their favorite white knight, who had zero practical experience in the ongoing operations, than they would have in the very people who were there and actively involved. I saw the comments before and during the operations from his fans, and all of them were upset that their favorite white knight wasn’t being called in to save the day. I can only imagine how bad it would have been had something tragic occurred.

This is why I’m against white knights. They rarely if ever solve the problem, and worse, when they do ride into town, they take time and energy away from those who are actually working on the problems. Leave the white knights on a chess board.

“Today is D-Day”

As I’m writing this, word has rocketed around the world that the 12 soccer players and their coach have been safely rescued from Tham Luang cave. We are still awaiting word on the rescuers themselves, including one of the doctors who had spent time with the boys since they were found, who are still on their way out.

Unfortunately, one former Thai SEAL diver, Saman Kunan, who had rejoined his former teammates to help in the rescue, lost his life. This tragic outcome should not be forgotten, nor should it cast too large a shadow on the amazing success.

What I want to talk about though is not the cave or the rescue operations, but the decision-making process. The title for this post comes from Narongsak Osottanakorn’s statement several days ago when they began the evacuation operations.

 

The term D-Day actually predates the famous Normandy landings that everyone associates it with. However, the success of the Normandy landings and their importance in the ultimate outcome of WWII has forever cemented that phrase in history.

One of the hardest parts of any large-scale operation like this is making the decision on whether to act. During the Apollo Program, they called them GO/NO GO decisions. Famously you can see this in the movie Apollo 13, where Gene Kranz goes around the room asking for a Go/No Go for launch. (It was pointed out in a Tindallgram before the Apollo 11 landing that the call after the Eagle landed should be changed to Stay/No Stay, so there was no confusion on whether they were “go to stay” or “go to leave”.)

While I’ve never been a Flight Director for a lunar mission, nor a Supreme Allied Commander for a European invasion, I have had to make life or death decisions on much smaller operations. A huge issue is not knowing the outcome. It’s like walking into a casino. If you knew you were always going to win, it would be an easy decision on how to bet. But obviously that’s not possible. The best you can do is gather as much information as you can, gather the best people you can around you, trust them and then make the decision.

What compounds the decision-making process in many cases, and especially in cave rescue, is the lack of communication and lack of information. It can be very frustrating to send rescuers into the cave and not know, sometimes for hours, what is going on. Compound this with what is sometimes intense media scrutiny (which was certainly present here with the entire world watching), and one can feel compelled to rush the decision-making process. It is hard, but generally necessary, to resist this. In an incident I’m familiar with, I recall a photograph of the cave rescue expert advising rescue operations, standing in the rain near the cave entrance, waiting for the waters to come down so they could send search teams in. Social media was blowing up with comments like, “they need to get divers in there now!” and “Why aren’t the authorities doing anything?” The fact is, the authorities were doing exactly what the cave rescue expert recommended: waiting for it to be safe enough to act. Once the waters came down, they could send people in and find the trapped cavers.

The incident in Thailand is a perfect example of the confluence of these factors:

  • There was media pressure from around the world, with people asking why they were taking so long to begin rescuing the boys and, once they did start to rescue them, why it took them three days. Offers and suggestions flowed in from around the world and varied from the absurd (one suggestion we received at the NCRC was the use of dolphins) to the unfortunately impractical (let’s just say Mr. Musk wasn’t the only one, nor the first, to suggest some sort of submarine or sealed bag).
  • There was always a lack of enough information. Even after the boys had been found, it could take hours to get information to the surface, or from the surface back to the players. This hinders the decision making process.
  • Finally of course are the unknowns:
    • When is the rain coming?
    • How much rain?
    • How will the boys react to being submerged?
    • What can they eat in their condition?

And finally, there is, in the back of the minds of the folks making the decisions, the knowledge that if the outcome turns tragic, everyone will second-guess them.

Narongsak Osottanakorn and others had to weigh all of the above against the facts they had, knowing that they couldn’t have as much information as they might want, and then make life-impacting decisions. For this I have a great deal of respect for them, and I don’t envy them.

Fortunately, in this case, the decisions led to a successful outcome which is a huge relief to the families and the world.

For any operation, especially complex ones such as this rescue, a moon landing or an invasion of the beaches of Normandy, the planning and decision-making process is critically important and often overshadowed by the folks executing the operation. As important as Neil Armstrong, Buzz Aldrin and Michael Collins (who all too often gets overlooked, despite writing one of the better autobiographies of the Apollo program) were to Apollo 11, without the support of Gene Kranz, Steve Bales, and hundreds of others on the ground, they would have very likely had to abort their landing.

So, let’s not forget the people behind the scenes making the decisions.

 

The Thai Cave Rescue

“When does a cave rescue become a recovery?” That was the question a friend of mine asked me online about a week ago. This was before the boys and their coach had been found in the Thai cave.

Before I continue, let me add a huge caveat: this is an ongoing dynamic situation and many of the details I mention here may already be based on inaccurate or outdated information. But that’s also part of the point I ultimately hope to make: plans have to evolve as more data is gathered.

My somewhat flippant answer was “when they’re dead.” This is a bit of a dark-humor answer, but there was actually some reasoning behind it. Before I go on, let me say that at that point I actually still had a lot of hope and reason to believe they were still alive. I’m very glad that they were in fact found alive and relatively safe.

There’s a truth about cave rescue: caves are literally a black-hole of information. Until you find the people you’re searching for, you have very little information.  Sometimes it may be as little as, “They went into this cave and haven’t come out yet.” (Actually sometimes it can be even less than that, “We think they went into one of these caves but we’re not even sure about that.”)

So when it comes to rescue, two of the items we try to teach students when teaching cave rescue are to look for clues and to try to establish communications. A clue might be a footprint or a food wrapper. It might be the smell of a sweaty caver wafting in a certain direction. A clue might be the sound of someone calling for help. And the ultimate clue of course is the caver themselves. But there are other clues we might look for: what equipment do we think they have? What experience do they have? What are the characteristics of the cave? These can all drive how we search and what decisions we make.

Going back to the Thai cave situation, based on the media reports (which should always be taken with a huge grain of salt) it appeared that the coach and boys probably knew enough to get above the flood level and that the cave temps were in the 80s (Fahrenheit).  These are two reasons I was hopeful. Honestly, had they not gotten above the flood zone, almost certainly we’d be talking about a tragedy instead. Had the cave been a typical northeast cave where the temps are in the 40s (F) I would have had a lot less hope.

Given the above details then, it was reasonable to believe the boys were still alive and to continue to treat the situation as a search and eventually rescue situation.  And fortunately, that’s the way it has turned out. What happens next is still open for speculation, but I’ll say don’t be surprised if they bring in gear and people and bivouac in place for weeks or even months until the water levels come down.

During the search process, apparently a lot of phone lines were laid into parts of the cave so that easier communications could be made with the surface. Now that they have found the cavers, I’d be shocked if some sort of real-time communications are not set up in short order. This will allow the incident commander to make better-informed decisions and to get the most accurate and up-to-date data.

So, let me relate this to IT and disasters. Typically a disaster will start with, “the server has crashed” or something similar. We have an idea of the problem, but again, we’re really in a black-hole of information at that moment. Did the server crash because a hard drive failed, or because someone kicked the power cord or something else?

The first thing we need to do is to get more information. And we may need to establish communications. We often take that for granted, but the truth is, often when a major disaster occurs, the first thing to go is good communications. Imagine that the crashed server is in a datacenter across the country. How can you find out what’s going on? Perhaps you call for hands-on support. But what if the reason the server has crashed is because the datacenter is on fire? You may not be able to reach anyone! You might need to call a friend in the same city and have them go over there. Or you might even turn on the news to see if there’s anything worth noting.

But the point is, you can’t react until you have more information. Once you start to have information, you can start to develop a reaction plan. But let’s take the above situation and imagine that you find your datacenter has in fact burned down. You might start to panic and think you need to order a new server. You start to call up your CFO to ask her to let you buy some new hardware when suddenly you get a call from your tech at the remote datacenter. They tell you, “Yeah, the building burned down, but we got real lucky and our server was in an area that was undamaged and I’ve got it in the trunk of my car, what do you want me to do with it?”

Now your previous data has been invalidated and you have new information and have to develop a new plan.

This is the situation in Thailand right now. They’re continually getting new information and updating their plans as they go. And this is the way you need to handle your disasters: establish communications, gather data, create a plan, and update your plan as the data changes. And don’t give up hope until you absolutely have to.