March | 2012 | Green Mountain Software

So, listening to the 24 hours of SQL Pass webinars. The current topic is “I Was Young and Didn’t Know Any Better” and the panelists are sharing war stories of mistakes they’ve made.

So far they all sound familiar. So I thought I’d share one of mine. Well technically not my mistake, but one that I adopted.

Many moons ago, I was advising a company that was involved in building websites for car dealerships. One day they needed to do an update to the live data. This was back in the days when all code and updates were cowboy updates. Of course you ran the query on the live database the first time. You didn’t necessarily have a stating database or even as was later discovered, good backups.

Apparently a customer needed to update a car in their inventory.

UPDATE AUTO set cartype=’White Ford Taurus’

Nice, syntactically valid… and a disaster. Ayup. Suddenly every car in the database at every dealership was now a White Ford Taurus.

Ever since then we called that the “White Ford Taurus” problem.

Now, I might mock doing updates on live data, but sometimes its necessary. I’m curious how others prevent their own “White Ford Taurus” problems.

Personally, I just now make EXTRA effort to make sure I have a WHERE clause.

But I also tend to almost always do it as:

BEGIN TRAN
UPDATE AUTO set cartype=’White Ford Taurus’
if @@rowcount<> 1 rollback tran else commit tran

Or sometimes I’ll simply hardcode the rollback tran, run it once, see what happens and then rerun it with a commit tran.

So, if rather than updating the 1 row I want, I find myself updating 1000s of rows, I’ll catch myself and be safe.

Sure, it’s not perfect, both it and using the WHERE clause require me to make sure I don’t forget them. But the more ways to catch it, the better.

Obviously avoiding ad-hoc updates on live data is preferable, but when you can’t, be extra careful. And of course make sure you have good backups. But that goes without saying.

It’s funny. In my personal life, I’m a pretty casual person. I don’t generally create grocery lists. I don’t write detailed lists of things I need to get done.

That said I’m a HUGE fan of “process” when performing tasks that absolutely have to be done in a specific manner. In my old job, I often had to do complex updates on web and databases servers with zero downtime. In some cases, this is like replacing the engines on a 747 while it’s in flight. In cases like that I or my staff would create what we called a “CRP – Change Review Plan”. (I wanted to call them Change Review Analysis Plan, but I decided I didn’t want to take CRAP from anyone.)

Anyway, a book I would highly recommend is The Checklist Manifesto. This delves into this concept far more than I can here.

However, one thing I learned years ago was when too much process actually can make things worse. There’s a story, possibly apocryphal, of an incident during the servicing of the space shuttle many years ago. In the VAB while it was being rotated from the horizontal position to the vertical for attachment to the ET, a loud klunk was heard from inside the engine compartment. Now, one doesn’t have to be a rocket scientist to realize that a loud klunk is probably NOT supposed to happen during this procedure.

So, what had happened? Had a procedure been violated? Well, in reality yes. On paper no. In order to provide quality control, almost anything NASA touched when it was servicing a shuttle would get signed off on by at least one if not multiple people. Supposedly there was a checklist that at least 5 people signed off on to ensure that all tools had been removed from the engine compartment. Sure enough, that list had 5 signatures on it. However, the tool, I believe a wrench, now sitting at the aft end of the shuttle proved otherwise.

More recently (like tonight) I was reminded of this as I sat in on a meeting. The meeting was at a local college student club and the purpose was to discuss the fact that some unauthorized people may have gained access to an area they were not allowed access to. There was some good discussion of what had occurred and how to avoid it in the future.

At one point someone suggested, “How about a video cam or something so the folks sitting at the desk can check to make sure the room really is empty?” That’s a nice high-tech solution. But it was honestly in search of a problem. The real problem appears to be that the people at the desk weren’t doing their job properly in the first place: making sure doors are locked and checking proper IDs. I pointed out, adding yet another task to their job description was unlikely to solve the root problem and was unlikely to have kept the unauthorized people out.

Ultimately, it looks like the approach the students will take is honestly, probably the simplest one: Asking to change the door lock so that the desk person isn’t responsible for locking it, but rather make it autolocking. This way, when the last authorized person does leave, it is locked automatically. No additional processes are required and in fact the existing ones are simplified and made more failsafe. The door in question should now be locked when it should be, whether or not the desk person checks it as they are supposed to.

Sometimes, the simplest solutions really are the better ones.

Green Mountain Software

Formerly thoughts on SQL, Disasters and thinking. Now musings on PA School, and more.

Monthly Archives: March 2012

White Ford Taurus

Processes