Failure is Required

Last week one of my readers, Derek Lyons, correctly called me out on some details in my post about lock-outs. Derek and I go back a long way, with a mutual interest in the space program. His background is in nuclear submarines, and the details of operations and procedures he’s shared with me over the years have been of interest. The US nuclear submarine program is built around “procedures,” and since the adoption of its SUBSAFE program it has suffered only one hull loss, and that was the non-SUBSAFE-certified USS Scorpion.

The space program is also well known for its heavy reliance on procedures and its attention to detail and safety. Out of the Apollo 13 incident we have the famous quote, “Failure is not an option,” attributed to Gene Kranz in the movie (but there’s no record of him saying it at the time).

Anyway, his comments got me thinking about failures in general.

And I’d argue that for certain activities, and at a certain level, “failure is not an option” is true. When it comes to bringing a crew home from the Moon, or launching nuclear missiles, or performing critical surgeries, failure is not an option.

But sometimes, not only is it an option, I’d say it’s almost a requirement. I was reminded of this last week at a small event where I was asked to be a panelist. It turned out there were three of us panelists and just two students from AlbanyCanCode, a local program that helps folks learn to code. The concept of agile development was brought up, along with the fact that agile development basically relies on failing fast and early. For software development, failing fast really only costs you time. And agile proponents will argue that it in fact saves you time and money, since you find your failures much earlier and spend less time going down the wrong path.

But I’m going to shift gears here to an area that’s even more near and dear to my heart: cave rescue. At an overarching, one might say strategic, level, failure is not an option. We teach in the NCRC that our goal is to get the patient(s) out in as good or better shape than we found them, as quickly and safely as possible. In other words, if we end up killing a patient, but get them out really quickly, that’s considered a failure; whereas if we take twice as long, but get them out alive, that’s considered a success.

But how do we do that?  Where does failure come into play?

One of the first lessons I was taught by one of my mentors was to avoid “the mother of all discussions.” This lesson hit home during an incident in my Level 1 training here in New York. We had a mock patient in a Sked. Up to this point it had been walking passage through a stream with about 1″ of water. But we had hit a choke point where the main part of the ceiling came down to about 12″ above the floor of the passage. There was an alternative route that would involve lifting the patient up several feet, then over some boulders and through some narrow and low (but not 12″ low) passage, and then we’d be back to walking passage. I and two others were near the head of the litter. At this point we had placed the litter on the ground (out of the water). We scouted ahead to see how far the low passage went and found it went about a body length: a very short distance.

Meanwhile the rest of our party was back in the larger passage having the mother of all discussions. They were discussing whether we could drag the litter along the floor, lift it up to go high, or perhaps even, for this part, remove the patient from the litter and have them drag themselves a bit. There may have been other ideas too.

My two partners and I looked at each other, looked at the low passage, looked at the patient, shrugged our shoulders and dragged the patient through the low passage to the other side.

About 10 seconds later someone from the group having the mother of all discussions exclaimed, “Where’s the patient?”

“Over here, we got him through, now can we move on?”

They crawled through and we completed the exercise.

So, our decision was a success. But what if it had been a failure? What if we had realized that the patient’s nose was really 13″ higher than the floor in the 12″ passage? Simple: we’d have pulled the patient back out. Then we could have shut down the mother of all discussions and said, “We have to go high; we know for a fact the low passage won’t work.”

Failure here WAS an option and by actually TRYING something, we were able to quickly succeed or fail and move on to the next option.

Now obviously one has to use judgement here. What if the water-filled passage was 14″ deep? Then no, my partners and I certainly would NOT have tried to move the patient with just the three of us. But perhaps we might have convinced the group to try.

The point is, it can often be faster and easier to actually attempt a concept than it is to discuss it to death and consider every possibility.

Time and time again I’ve seen students in our classes fall into the mother of all discussions rather than actually attempt something. If they actually attempt something, they can learn very quickly whether it will work or not. If it works, great: the discussion can now end and they can move on to the next challenge. If it doesn’t work, great: they’ve narrowed down their options and can discuss the remaining ones more intelligently (and then perhaps quickly iterate through those too).

So today’s take-away is: don’t be afraid of failure. Embrace it. Enjoy it. Experience it. It will lead to learning. Just make sure you understand the price of failure. Failure may be an option and is sometimes mandatory, but in other cases the old saw is true: failure is not an option, especially if failure means the loss of life.

 

Locked Out

As I’ve mentioned, not only am I fascinated by disasters, their root causes, and how we react to them, I’m also fascinated by how we take steps to prevent them. In my book, IT Disaster Response: Lessons Learned in the Field, I discuss the idea of blue-flagging on railroads. The important concepts were two-fold: 1) a method of indicating that the train should not be moved and 2) controls on who could remove that indication.

During my recent power outage, I came across something similar. It should be the featured image for this article. It’s basically an orange flag locked to a utility pole. Note the key word there: locked.

The photo doesn’t show the fact that this utility pole contained circuit breakers (I believe that’s the proper term in this context) for the overhead power lines. They had been tripped as a result of a tree further down the road taking out all three supply lines.

Close-up of the orange flag on the utility pole, along with the tag carrying the lock-out information.

So let’s analyze this a bit:

The orange flag itself was VERY visible. This alerts any other crews that might be in the area that there is something they need to notice.

There is a tag with detailed information. It’s hard to see in the above photo, but it includes who tagged it, the location, and date and some other information.

What’s not clear from the photo is that it’s padlocked to the pole.

Now, to be clear, this is NOT a physical lock-out like you see on some power panels (i.e. where the padlock physically prevents the circuit-breaker from being opened or closed).

In this case, a physical lock-out would most likely have to be placed 30′ in the air at the top of the pole where it wouldn’t be easily noticed.

But that said, this served its purpose. It alerted other crews to a danger in the area and presumably can only be removed by the person who put it there. And it contains information on that person so they can be reached if there are questions.

Since power was restored within an hour and I didn’t hear any reports of line workers getting electrocuted, this appears to have worked.

Today’s take-away: when you have a change from the normal state of operation, what steps can you take to ensure that others don’t try to return items to a normal state of operations without confirming things first? By the way, for a good read-up on how bad things can go when intentions about a non-standard mode of operation don’t get properly communicated, I recommend reading up on the events leading up to the Chernobyl disaster.
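For those of us in IT, one possible answer to that question borrows the same two ideas as the flag: a visible marker that says "don't touch," and a rule that only the person who placed it can take it off. Here is a minimal sketch in Python. The names (`Resource`, `LockoutTag`, `LockedOutError`) are purely hypothetical illustrations, not taken from any real tool or from the power company's procedures.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


class LockedOutError(Exception):
    """Raised when someone acts on a resource that is tagged out."""


@dataclass
class LockoutTag:
    # Like the paper tag on the pole: who placed it, how to reach them, and why.
    owner: str
    contact: str
    reason: str
    placed_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class Resource:
    """Hypothetical stand-in for anything that can be taken out of service."""

    def __init__(self, name):
        self.name = name
        self.tag = None  # None means normal operation

    def lock_out(self, owner, contact, reason):
        if self.tag is not None:
            raise LockedOutError(f"{self.name} is already tagged by {self.tag.owner}")
        self.tag = LockoutTag(owner, contact, reason)

    def release(self, requester):
        # Only the person who placed the tag may remove it.
        if self.tag is None:
            return
        if requester != self.tag.owner:
            raise LockedOutError(
                f"{self.name} was tagged by {self.tag.owner}; contact {self.tag.contact}"
            )
        self.tag = None

    def restore_service(self, requester):
        # Returning to the normal state is refused while a tag is in place.
        if self.tag is not None:
            raise LockedOutError(
                f"{self.name} is locked out ({self.tag.reason}); "
                f"contact {self.tag.owner} at {self.tag.contact}"
            )
        return f"{self.name} restored by {requester}"
```

Like the flag on the pole, this only works if everyone goes through `restore_service` rather than around it. It is a tag, not a physical lock.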

Procedures are important. Deviating from them can have serious consequences. Do what you can to minimize the possibility of deviations.