Legacy

“…the good is oft interred with their bones.”

Or these days, lives on in the Internet. I never quite agreed with Shakespeare in this line. I think the good lives on beyond the grave.

In the book Lies My Teacher Told Me, author James Loewen talks about how certain African tribes divide people into three categories: those alive, the sasha or living-dead, and the zamani or the dead.

This was brought to mind yesterday when I was trying to debug an issue which turned out to be a bug in SQL Server 2016 SP2.  While trying to debug it, I needed to add a user to an SSIS setup. This has been a problem in the past, but I recalled I had used #SQLHELP on Twitter to ask the question and gotten a great answer. So, a quick search later found the response I was looking for. The fully correct answer (since MSFT’s page leaves out a step) was available at: http://sqlsoldier.net/wp/sqlserver/howdoigrantaccesspermissionsforssistousers

Now, many of my readers won’t recognize the name, but some will: @SQLSoldier, a member of the #SQLFamily who passed away recently. At the time of his passing I had forgotten that he had reached out to help me last year. The search yesterday, though, brought it back to me. I never had the honor of meeting Mr. Davis in person, but I know many others spoke highly of him. It was comforting to me to know that even months later his legacy was still helping me (and presumably others).

After thinking about that, I got to thinking about my dad. Soon after he passed in 2015, I picked up the hefty Milwaukee right-angle drill that had been his and was now mine. I was working on the addition (that he had helped design before his death) that has since become my office. I had always liked that particular tool. It has a certain heft and power to it. At the time, with his death so close at hand, it was a form of grief therapy for me. I had often used this in my youth, helping him out with various construction projects. To this day I’ll pick up one of the tools I inherited, or start a house project using the skills he taught me, and I realize he’s sasha, living-dead. He still lives on in me.

SQLSoldier is also living-dead; he’s very real in the hearts and minds of those who knew him, and his legacy lives on, still helping others, such as myself.

As my age is now entering its 2nd half-century,  I wonder more and more what my legacy will be. I hope that when I’m sasha, my legacy can still help and aid others.

And with that, I will conclude with a scene from one of my favorite actors in one of my favorite movies:  “What will your verse be?”

 

RCA or “get it running!”

How often have any of us resorted to fixing a server issue by simply rebooting the server?  Yes, we’re all friends here, you can raise your hands. Don’t be shy. We all know we’ve done it at some point.

I ask the question because of a recent tweet I saw with the hashtag #sqlhelp where Allan Hirt made a great comment:

Finding root cause is nice, but my goal first and foremost is to get back up and running quickly. Uptime > root cause more often than not.

This got me thinking, when is this true versus when is it not? And I think the answer ends up being the classic DBA answer, “it depends”.

I’m going to pick two well studied disasters that we’re probably all familiar with. But we need some criteria.  In my book IT Disaster Response: Lessons Learned in the Field I used the definition:

Disaster: An unplanned interruption in business that has an adverse impact on finances or other resources.

Let’s go with that.  It’s pretty broad, but it’s a starting point. Now let’s ignore minor disasters like I mention in the book, like the check printer running out of toner or paper on payroll day. Let’s stick with the big ones; the ones that bring production to a halt and cost us real money.  And we’re not going to restrict ourselves to IT or databases, but we’ll come back to that.

The first example I’m going to use is the Challenger Disaster. I would highly recommend folks read Diane Vaughan’s seminal work: The Challenger Launch Decision: Risky Technology, Culture, and Deviance at NASA. That said, we all know that when this occurred, NASA did a complete stand-down of all shuttle flights until a full RCA was complete and many changes were made to the program.

On the other hand, after the famous Miracle on the Hudson, airlines did not stop flying. But this doesn’t mean an RCA wasn’t done. It in fact was; just well after the incident.

So, back to making that decision. With Challenger, it was an easy decision. Shuttle flights were occurring every few months and other than delaying some satellite launches (which ironically may have led to issues with the Galileo probe’s antenna) there wasn’t much reason to fly immediately afterwards. Also, while the largest points were known, i.e. something caused a burn-through of the SRB, it took months to determine all the details. So, in this case, NASA could and did stand down for as long as it took to rectify the issues.

In the case of the Miracle on the Hudson, the cause was known immediately. That said, even then an RCA was done to determine the degree of the damage, whether Sullenberger and Skiles had done the right thing, and what procedural changes needed to be made. For example, one item that came out of the post-landing analysis was that the engine restart checklist wasn’t really designed for low altitude failures such as they experienced.

Doing a full RCA of the bird strike on US Airways 1549 and stopping all other flights would have been an economic catastrophe. But it was more than simply that. It was clear, based on the millions of flights per year, that this was a very isolated incident. The exact scenario was unlikely to happen again. With Challenger, there had only been 24 previous flights, and ALL of them had experienced various issues, including blow-bys of the primary O-ring and other issues with the SRBs.

So back to our servers.  When can we just “get it running” versus taking downtime to do a  complete RCA vs other options?

I’d suggest one criterion is, “how often has this happened compared to our uptime?”

If we’ve just brought a database online and within the first week it has crashed, I’m probably going to want to do more of an immediate RCA. If it’s been running for years and this is the first time this issue has come up, I’m probably going to just get it running again and not be as adamant about an immediate RCA. I will most likely try to do an RCA afterwards, but again, I may not push for it as hard.

If the problem starts to repeat itself, I’m more likely to push for some sort of immediate RCA the next time the problem occurs.

What about the seriousness of the problem? If I have a server that’s consistently running at 20% CPU and every once in a while it leaps up to 100% CPU for a few seconds and then goes back to 20%, will I respond the same way as if it crashes and it takes me 10 minutes to get it back up? Maybe. Is it a web server for cat videos that I make a few hundred off of every month? Probably not. Is it a stock-trading server where those few seconds are costing me thousands of dollars? Yes, then I almost certainly will be attempting an RCA of some sort.

Another factor would be, what’s involved in an RCA? Is it just a matter of copying some logs to someplace for later analysis and that will simply take a few seconds or minutes, or am I going to have to run a bunch of queries, collect data and do other items that may keep the server off-line for 30 minutes or more?

Ultimately, in most cases, it’s going to come down to balancing money and, in the most extreme cases, lives. Doing the RCA now may save money later, but cost money now. On the other hand, not doing an RCA now might save money now, but might cost money later. Some of it is a judgement call; some of it depends on the factors you use to make your decision.

And yes, before anyone objects, I’m only very briefly touching upon the fact that often an RCA can still be done after getting things working again. I’m just touching upon the cases where it has to be done immediately or evidence may be lost.

So, what are your criteria for when you do an RCA immediately vs. getting things running as soon as you can? I’d love to hear them.

And credit for the Photo by j zamora on Unsplash

SQL Saturday Philly Followup

So last week I visited a client I have near King of Prussia, PA and then went to SQL Saturday.

This particular client I’ve worked with for over 5 years now and it’s been quite an interesting time. What started out as a 3-6 month project turned into a multi-year, basically full-time engagement and now it’s down to some piecemeal work. But that too is unfortunately slowly ending as they bring their new in-house DBA up to speed. I spent about 1/2 my time there doing a data-dump to him and my manager.

But, I’m not here to talk about that, I’m here to talk about SQL Saturday, customer service and a bit more.

But first, a joke:

“How many DBAs does it take to solve a hardware problem?”

By the count of it, at least a 1/2 dozen.

I got there and for my first session decided to attend Kathi (aka Aunt Kathi) Kellenberger’s session on windowing functions. Fortunately she showed up early because it turns out she could not get her laptop to talk to the monitor. We tried one fix using an existing cable until we realized we had the wrong end plugged in (basically the monitor end we stole from a monitor). This is one of the big fears of any presenter: showing up and not being able to project one’s screen! So, over the next 30 minutes several of us tried to help with a bit of everything, including the “reboot the projector” advice.

Only after one of the organizers (with permission of the hosting organization) pried off the back of the podium was I able to realize, “oh, THIS cable will work”. I handed it up to Kathi and she plugged in her laptop and was able to project. And it was, as I expected, a great, informative presentation. I definitely learned a few things.

I have Kathi to thank (or to blame!) for inspiring me to write my book. So I was more than glad to help her out.

My talk on presenting was well received with a good turnout and a number of questions from audience members. This was in contrast to when I gave it in DC, where I had only a few audience members. And it was in definite contrast to my experience in Colorado Springs, where I had no one show up for my presentation. I’ll admit, it was nice to get back on the horse and have such a successful presentation.

Later, I made a point of attending a session by Sarah Hutchins on how to Ace your Job Interview. It was her first time presenting at SQL Saturday and, besides being interested in the topic, I wanted to support her. She did great. It did turn out that she needed help with her clicker for PowerPoint, so I loaned her mine. I in fact have a slide in my presentation about clickers and helping out fellow speakers, etc.

So, it was with a bit of a laugh that I saw Grant Fritchey’s blog post this week on Presentation Tools. Grant was one of the first speakers I ever saw at a SQL Saturday, back in Boston, I believe 4 years ago. Besides his being a great speaker, I’ve appreciated that he’s felt a need to “give back” to the community, and in part he does that by supporting and encouraging up-and-coming speakers and writing informative posts like his most recent one cited here.

So a lot of this weekend was about how #SQLFamily helps each other. Kathi encouraged me to write a book, I was able to help her and Sarah with their hardware issues, Grant funny enough this week follows up on advice on hardware for speakers and so the circle continues.

Contrast that to my stay at Extended Stay America. There’s an adage in business:

It takes months to find a customer and only seconds to lose one.

ESA certainly lost one this weekend. After arriving at SQL Saturday, I realized I had left my shoes in my room at the hotel.  As soon as I got an opportunity I emailed them. I didn’t hear back right away, so I later called.  The response was less than stellar. First, they’d have to check with the housekeeper in question and they’d call me back. But additionally their policy was not to mail items to customers and in the event they did, they expected the customer to pay for shipping. Not the most customer friendly response, but I could deal with the shipping if they did in fact find my shoes.

No more response that day and I wasn’t about to drive 20 minutes in the opposite direction on the off-chance they had found my shoes because it wasn’t even clear the front desk would have access to them (since they couldn’t confirm anything until they spoke to the housekeeper in question.)

Sunday morning I woke up to an email which I will quote in its entirety:

We are unable to send these to you as our mail delivery does not pick up packages unless it is addressed for ESA business.

So, now at least the way I read this, it still doesn’t answer my question if they had even found them.

Finally, last evening I spoke on the phone with the manager, who kept reiterating their policy but never said they had actually found them. I finally had to stop her and ask, “Do you even have them? You’ve never actually said that.” “Oh yes we do, but we can’t ship them to you.” “What if I pay for the shipping?” “We don’t do that.” Meanwhile she says repeatedly, “I’m doing everything I can to help you.”

I’m still not sure how “I can’t ship them to you” and “I’m doing everything I can to help you” jibe.

But let’s just say, this whole experience has left a sour taste in my mouth.

Again a little effort can go a long way.

So, that’s my experience this weekend.  Some great people who will help each other and others who are willing to write off paying customers.

But, despite not being a very code heavy blog, I’m going to toss out this tidbit for future reference:

# Source instance, database and output folder (placeholders; adjust for your environment)
$sourceserver = 'Myserver\sqlexpress'
$sourcedb = 'Adventurework2014'
$outputdirectory = 'c:\temp\'

# Get every user table along with its schema name
$tables = Invoke-Sqlcmd -ServerInstance $sourceserver -Database $sourcedb -Query "select ss.name as schema_name, so.name as table_name, ss.name+'.'+so.name as full_name from sysobjects so inner join sys.schemas ss on ss.schema_id=so.uid where type='U'"

ForEach ($table in $tables)
{
    # One bcp call per table: native format (-n), trusted connection (-T), output file named [schema].[table].bcp
    $bcpstring = "bcp $($sourcedb).$($table.full_name) out $($outputdirectory)[$($table.schema_name)].[$($table.table_name)].bcp -S $sourceserver -T -E -n"
    #Write-Host $bcpstring
    Invoke-Expression $bcpstring
}

It’s not much, but I had a recent need to dump out every table of a particular database for a client. So I wrote this.  BTW, by including the [] in the filenames, when I go to load this data, the QUOTENAME version of the schema.table is automatically used.
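A minimal sketch of the load side, under some assumptions: the target tables already exist with matching definitions, bcp is on the path, and $targetserver, $targetdb and $inputdirectory are made-up placeholders that do not come from the original script. It shows how the [schema].[table] part of each filename can be reused directly as the table name.

```powershell
# Hypothetical target settings; adjust for your environment
$targetserver = 'Myserver\sqlexpress'
$targetdb = 'Adventurework2014_copy'
$inputdirectory = 'c:\temp\'

# Pick up every .bcp file written by the export loop
$files = Get-ChildItem -LiteralPath $inputdirectory -Filter '*.bcp'

ForEach ($file in $files)
{
    # BaseName drops the trailing ".bcp", leaving "[schema].[table]"
    $fullname = "$targetdb.$($file.BaseName)"

    # Native format (-n), trusted connection (-T), keep identity values (-E)
    & bcp $fullname 'in' $file.FullName -S $targetserver -T -E -n
}
```

The call operator (&) is used here instead of Invoke-Expression so that each value is passed to bcp as its own argument, which is a little safer if a path or name contains spaces.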

 

Oil Change Time and a Rubber Ducky

Sometimes, the inspiration for this blog comes from the strangest places. This time… it was an oil change.

I had been putting off changing my oil for far too long and finally took advantage of some free time last Friday to get it changed. I used to change it myself, but for some reason, in this new car (well new used car, but that’s a story for another day) I’ve always paid to get it changed. (And actually why I stopped changing it myself is also a blog post for another day.)

Anyway, I’ve twice now gone to the local Valvoline. This isn’t really an ad for Valvoline specifically but more a comment on what I found interesting there.

So, most places where I’ve had my oil changed, you park, go in, give them your name and car keys and wait. Not here, they actually have you drive the car into the bay itself and you sit in the car the entire time. I think this is a bit more efficient, but since, instead of lifting the car, they have a pit under the car, I suppose they do risk someone driving their car into the pit (yes, it’s guarded by a low rail on either side, but you know there are drivers just that bad out there).

So, while sitting there I observed them doing two things I’m a huge fan of: using a checklist and calling out.

As I’ve talked about in my book and here in my own blogs, I love checklists. I recommend the book The Checklist Manifesto. They help reduce errors.  And while changing oil is fairly simple, mistakes do happen; the wrong oil gets put in, the drain plug isn’t properly tightened, too much gets put in, etc.

So hearing them call out and seeing them check off on the computer what they were doing, helps instill confidence. Now, I’m sure most, if not all oil change places do this, but if you’re sitting in the waiting room, you don’t get to see it.

But they also did something else which I found particularly interesting: they did a version of Pointing and Calling. This is a very common practice in the Japanese railway system. One study showed it reduced mistakes by almost 85%. So while changing my oil, the guy above would call out what he was doing. It was tough to hear everything he was calling out, but I know at one point the call was “4.5 Maxlife.” He then proceeded to put what I presume was 4.5 quarts of the semi-synthetic oil into my engine (I know it was the right oil because I could see which nozzle he selected). I didn’t count the clicks, but I believe there were 9. Now, other than the feedback of the 9 clicks, the guy in the pit couldn’t know for sure that it was the right oil and amount, but I’m going to guess he had a computer terminal of his own, and had his screen said “4 quarts standard” he’d have spoken up. But even if he didn’t have a way of confirming the call, by speaking it out loud the guy above was engaging more of his brain in his task, which was likely to reduce the chances of him making a mistake.

I left the oil change with a high confidence that they had done it right. And I was glad to know they actually were taking active steps to ensure that.

So, what about the rubber duck?

Well, a while back I started to pick up the habit of rubber duck debugging. Working at home, alone, it’s often hard to show another developer my code and ask, “Why isn’t this working?” But if I encounter a problem and can’t seem to figure out why it’s not working, I now pull out a rubber duck and start working through the code line by line. It’s amazing how well this works. I suspect that by taking the time to slow down to process the information and by engaging more of my brain (now the verbal and auditory portions), like pointing and calling, it helps bring more of my limited brain power to bear on the problem. And if that doesn’t work, I still have my extended brain.

PS As a reminder, this coming Saturday I’m speaking at SQL Saturday Philadelphia. Don’t miss it!

Sharking

The title refers to a term I had not given much thought to in years, if not perhaps decades. But first let me mention what prompted the memory.

This weekend my daughter was competing at the State Odyssey of the Mind competition in Binghamton, NY. While waiting for her team to compete, I noticed a member of one of the other teams walking around with a stuffed, cloth sharkfin pinned to the back of a sport jacket.

This reminded me of a t-shirt my mom made for me years back with a similar design.

So, you may be asking yourself, “why?” and perhaps asking “what’s the point of this particular blog post”. I’ll endeavor to answer both. But first we have to jump back into the time machine and again go back to my days at RPI. The year is 1989 and I’m now helping out with the Student Orientation (SO) staff. We were a bunch of students who would return to RPI over the summer and help the incoming Freshman class get oriented while they visited RPI in prep for coming in as students in the fall.

Back then, the ratio at RPI was pretty lopsided: it was 5 men for every 1 woman. This among other things led to some women using the phrase, “The odds are good, but the goods are odd.” In a strictly mathematical sense this was in a way accurate: if a woman wanted to date, she had 5 men vying for her attention. The reality of course was much different. It meant that if a woman didn’t want to date, she still had 5 men vying for her attention. (Of course it was far more than that since things didn’t divvy up nearly as cleanly.)

This was a tough social environment and combine that with fairly geeky students who often didn’t develop good social skills in high school and you often ended up with a lot of awkward situations and honestly, some pretty bad behavior all around; hence the goods being odd.

And unfortunately, some SO staff weren’t immune from being problematic. We tried to self-police, but there were always the 1-2 men who would be extra friendly to the incoming women and, like a shark swimming the waters, look for their easy prey. We called this sharking. We would look out for it among ourselves and try to stop any SO advisor we thought was doing it and, if they were particularly egregious, make sure they weren’t invited back the next year. But the problem definitely existed.

My mom, bless her heart, made me a shirt with a shark fin on the back, not because I personally was a shark, or to mock the problem, but more to highlight the problem and help us be more self-aware.

So, this weekend I was reminded of sharking.

So why bring it up? Because, being a member of several communities, including IT savvy communities, caving, and others, I still see this as an ongoing problem; someone in a position of power or influence, preying upon the newcomers; often young women. Now it often can start out with the best of intentions and without the person meaning to. You see someone new, they ask for help. You decide to mentor them. You’re just being helpful, right? But then it becomes the extra friendly touch, the slight innuendo in a comment, the off-color joke or even the outright blatant consent violations.

Watch out for it. Don’t do it and if you catch others doing it, say something. Nip it in the bud. If you’re mentoring, mentor. Provide them with professional guidance and advice. Don’t use it as an opportunity to prey upon their naivete and lack of knowledge or experience. Remember, as a mentor, you are in a position of power and influence and so you should be like Spiderman and only use that power and influence for the greater good and to help them, not to help yourself.

And if you do for some reason find yourself slipping beyond the role of a mentor and your mentee also appears to be comfortable with this (hey, it does happen, we’re all human), then STOP BEING THEIR MENTOR.  Make it clear that you can’t do both. A mentor, by definition and nature, is a position of influence. Don’t mix that with relationships in a professional setting. Just don’t.

As many of you know, I love teaching, it’s a reason I’m a cave rescue instructor and a reason I teach at SQL Saturdays and at other events.  I encourage folks to teach and help mentor others.  But please, be aware of boundaries and keep it professional.

Oh and a final note, I’m not immune to my own follies and mistakes and if you ever catch me crossing a line, by all means call me out on it. I don’t want to be “that guy”.

 

Slacking on the Job: Or using someone else’s brain as your own!

This post by Thomas LaRock came across my feed a few weeks ago. I’ve had it in my queue to write about since then. Basically he talks about cutting back on using Slack communities. For those not familiar with Slack, it’s a tool used to basically chat with other folks. It can be used internally for companies, set for just two people, or for larger groups of people.

But before we go further, let’s jump in a time machine and go back to the fall of 1985. I was a freshly minted freshman at RPI. I was so freshly minted, one or two of my friends to this day joke about how minty fresh I smelled.

Back then the main computer on campus was Sybil, a dual processor IBM 3081D. Some students had written a program called *CB. Some of you may recall the popularity of CB radios in the 70s and 80s. (If you’re too young for that, click the link back there and read up on it). *CB was a computer based version of it. It had as I recall 10 channels; 0-9, 0 being a ‘public channel’ and the other 9 used for different types of discussions.

I found this early on and started to use it. Of course a problem was, this being a mainframe, all CPU cycles got billed to the student. Of course CPU cycles were cheaper after midnight, so, yes, I did a lot of late nights at a terminal (preparing me for a life in computers).

But, *CB had its limits and it wasn’t long before the powers that be decided to shut it down. But the students weren’t to be denied. Next came *CONNECT. This ran in a different mode so was better tolerated. But that too eventually went away.

At some point *CONNECT was replaced by Clover, which I believe ran on a Unix system. Clover was soon replaced by Lily. I’m not sure how long Lily has been around, but I know it’s been around for at least 24 years.

From *CB to Lily several features were added or improved upon. The number of discussions on Lily is infinite. Discussions can be private, so only allowed members can see the discussion and who is in it. Discussions can be moderated to control who can and can’t talk. Moderators of discussions can control who is or isn’t in them.

One of the coolest features, and as far as I know one first implemented here, was the concept of a detached user, i.e. you could leave the system, come back and reconnect, and review what you had missed. This predated by a number of years AOL introducing a similar feature (and other systems introducing it). Many features found in IRC and Slack and other systems were first tried out at RPI on Lily or one of its predecessors. (There’s more of a history at that preceding link.) (Yes, I’m bragging a bit about my alma mater and the students there.)

Anyway, I write all this because it leads me to Slack. I’ve used Slack. I’m not a fan of Slack. There’s no one specific reason and I’m not saying Slack is bad. That said, one issue I personally find is that everything is so separated that I end up with 3-4 separate Slack windows and I lose track of what’s going on.

But, I still use Lily. I continue to use Lily every day. I’m a member of 333 discussions, I own 16 and some of those I’m in and I own are private (no I’m not revealing any secrets, sorry!) Why?

Well first I should note, of the 333 discussions, probably 300 of them get little to no traffic. For example the discussion Usenet gets extremely little traffic.  Others, such as Space can be very popular at times (like yesterday around 4:30 PM during the Falcon 9 launch).

So why am I on Lily so much? There are two reasons. One is the obvious reason: we’re social creatures and I like the interaction. And since I work from home, it’s nice to chat with other folks. And I should note that Lily members are scattered around the country and even the world.

But there’s also another very important reason. I’m not as smart as some of you may think. My brain is limited to the size of my skull. BUT, I’m not limited to that. I call Lily my extended brain. And it really is. At one point I was having a particularly difficult time with some JavaScript (we’ve all been there, right?). So I asked one of my Lily friends for help. She’s a full-time web developer and she was able to help me out. When I’ve had Perl questions, various people have helped me out, including at least one member of the Perl Foundation (I may be misstating his actual role/title). I routinely answer questions about SQL Server.

Other RPI alumni or associated people have played a major role in writing or being involved with the development of Usenet, DNS, and the modern Internet infrastructure. Several work at Google, Microsoft, Amazon or other major companies and can provide a great deal of information I might not have access to otherwise.

I’ve also hired people I’ve met through Lily. It’s been a great resource for job hunting for myself and others.

Even long before we had the wealth of knowledge easily available on today’s Internet, Lily was my extended brain. And it continues to be my extended brain.

I’m not as smart as you think I am. But my friends are, and they’re even smarter than that. And this is one thing that basically makes us humans unique: our language and ability to communicate permits us to be smarter than we really are. We can and do share knowledge. If you really want to be smart and improve your lives and your careers, develop your network. Find your extended brain and exploit it. And remember your role in being an extended brain for others.

So no, I won’t develop a love for Slack. But I won’t give up Lily either. It’s part of who I am.

As a postscript, I will remind folks: if you like what I write, please subscribe so you get the updates when I write more.

And look for me at:

SQL Saturday Philadelphia: April 21st
SQL Saturday Atlanta: May 19th
SQL Saturday Albany: July 28th

You can pick my brain and extend yours there.

 

And so it Happened…

New Faces

Last year I made a decision to try to do at least one SQL Saturday outside my “normal” geographic region; which basically encompasses down to Washington DC and out to Rochester NY and Boston. I’ve spoken at a number of SQL Saturdays in this area. I’ve enjoyed all of them. And generally I’ve drawn a decent audience, with a few exceptions.

But, one of the problems of doing that is you keep seeing the same speakers over and over. And while we’ve got a great crowd of speakers, I wanted to hear from speakers I might not normally hear from. Also, unless you’re constantly creating new content (which you should for a multitude of reasons), after a while your possible audience has heard everything you have to say.

So last year, I put a bid in for SQL Saturday Chicago and was very pleased to be accepted. I had a great time staying with some friends in the area and also a great time at the Speakers Dinner and After Party as well as at the event itself. I met a number of speakers I had not met before and heard a few speakers that I had not previously heard. And, I had a fresh new audience who seemed to really enjoy my topic on “Tips that have saved my Bacon.”

Colorado Springs

So this year, I had a choice of places to put in bids for. I selected Colorado Springs and was pleasantly surprised to find they’d accepted me.  Since I’ve got a friend in the area, that cut down on costs considerably.  It was a win win.

I had a great time at the Speakers Dinner on Friday night and met more speakers that I had not previously met. A quick shout out to @toddkleinhans and Cyndi Johnson and @DBAKevlar among others. It was great. We talked a bit about using VR to navigate a query, about reprogramming our brains and more.

I was excited for the next day. Sure, mine was the last session of the day, but I showed up early so I could hang out in the Speakers’ Lounge, see some of the other sessions, and hang with my friends, the MidnightDBAs, Sean and Jenn McCown.

Then it happened

As a speaker you have a lot of fears: the slide deck crashing, your computer applying updates in the middle of your talk (it happens!) and more. But I think the one that perhaps you don’t necessarily dread the most, but you’re most disappointed by, is when… no one shows up! Cathrine Wilhelmsen has a great blog post about this and I have to agree with pretty much everything she says.

All I can say is… “it happens”. I know it’s happened to other speakers, many of whom I have a great deal of respect for and think are a tier above me in terms of their talks.

Sometimes it’s just luck of the draw. Sometimes, as I suspect played a role here, it’s the end of the day, a number of folks have gone home already and ALL the sessions have lower numbers than ones earlier in the day. It could be the organizers misjudged the topics the audience wanted. It could be my title or description just didn’t entice folks (I suspect this is part of the issue with a different talk I gave, where I got too cutesy with the title. I’ve changed the title and updated the description and I’m scheduled to present it again at another SQL Saturday. So at least the organizers there think it’ll draw folks.)

But overall, yeah, it’s frustrating, but a single talk doesn’t make or break me as a speaker. It happens and we move on.

Conclusion

It was still worth coming out to SQL Saturday Colorado Springs and I don’t regret it. I’m grateful to the organizers that gave me the opportunity.  So thanks.

Oh and one more thing I noticed while going back through notes for this blog entry: SQL Saturday Chicago 2017 was event 600, Colorado Springs 2018 was 700. That’s 100 in a year, almost 2 a week. And I was asked to speak at (including Chicago) 6 of them I believe. That’s a pretty good percentage.

I’m content.

That said, come see me next month at SQL Saturday Philadelphia! I’m not sure what time I’m scheduled for yet, but I’ll be speaking on “So you want to Present: Tips and Tricks of the Trade”. And yes, I will talk about when people don’t show up. That’s assuming I have an audience 🙂

 

 

Choices

“If you choose not to decide You still have made a choice” – Rush Freewill

One of the things that we believe makes us uniquely human is the concept of freewill; that we can rise above our base instincts and make choices based on things other than pure instinct. While there’s some question if that’s unique to humans, let’s stick with it for now.

Overall, we think choice is good. I can choose to eat cake for breakfast, or I can choose to eat a healthy breakfast. I can choose if I want to get up early and exercise, or sleep in.

Sometimes we may think it’s hard to decide between two such things as in the examples above, but the truth is, it’s not that hard.

But, what happens when the choices aren’t nearly as simple? What happens when we sit down with a menu with 3 items versus 30 or even 100? We can become paralyzed. With 3 options, our odds of making a “wrong” decision are only roughly 67%. I say “wrong” because it’s often purely subjective and may not necessarily have much impact. But when we have 100 different things to choose from, the odds of a “wrong” decision go up to 99%. In other words, we’re faced with the concept that no matter what we do, we’re virtually guaranteed to make a “wrong” decision.
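Here’s a quick sketch of that arithmetic. It assumes exactly one “right” item on the menu and a purely random pick, both of which are simplifications, and the function name is just something I made up for illustration.

```python
def odds_of_wrong_choice(options: int) -> float:
    """Chance of a "wrong" pick, assuming exactly one of `options` is "right"
    and the pick is made uniformly at random (both are assumptions)."""
    if options < 1:
        raise ValueError("need at least one option")
    return 1 - 1 / options


for n in (3, 30, 100):
    print(f'{n:>3} options: {odds_of_wrong_choice(n):.0%} chance of a "wrong" choice')
```

The shape of the curve is the point: going from 3 options to 30 does most of the damage, and after that every extra option barely moves the number.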

The Jam Experiment

One example of this effect was seen in what is often called the jam experiment. Simply put, when given the choice of 6 varieties of jam, consumers showed a bit less interest, but sales were higher. When 24 jams were presented, there was more interest, but sales actually dropped significantly. People were apparently paralyzed by having too many choices.

Locally there’s an outdoor hamburger/hot-dog stand I like to frequent called Jack’s Drive In. People will stand in long lines, in all sorts of weather (especially on opening day, like this year when the line was 20 people deep and with the windchill it was probably about 20F!). One can quibble over the quality of the burgers and fries, but there’s no doubt they do a booming business. And part of the reason is that they have few choices and keep the line moving. This makes it far faster for people to order and faster to cook. With only a few choices, patrons don’t spend 5 minutes dithering over a menu.

Hint: If you’re ever in the area, simply tell them you want “Two burgers and a small french”. Second hint: No matter how hungry you are, don’t, as a former co-worker once did, try “6 Burgers and a large french”. You will regret that particular choice.

Choices to Europe

What brought this particular post on was all the choices I’m facing in trying to plan our family vacation. It’s rather simple really, “we want to visit Europe”. But, I also am hoping to speak at SQL Saturday in Manchester, UK. And we want to visit London (where my cousin lives) and Paris. And we can fly out of the NYC area. Or Boston. Or possibly other areas if the price was cheap enough. So suddenly what one would hope is a simple thing becomes very complicated. And of course every airline has their own website design, which complicates things.

Of course the simple choice would be not to fly. The second simplest would be not to care about cost.  Of course neither of those work. So, I’m stuck in deciding between 24 types of jam. Wish me luck!

Getting Unlost

There’s a concept I teach people when I teach outdoor skills. If you’re going to be wrong, be confidently wrong. There are two reasons for this. For one, people are more likely to follow a leader who appears to be confident and knows what they’re doing. This can lead to better group dynamics and a better outcome.

The second applies when you’re lost. If for whatever reason you choose NOT to stay in one place (which, by the way, is often the best choice, especially for children), then making a plan and sticking to it makes you far more likely to get unlost. This isn’t just wishful thinking.

Imagine you’re lost and you decide, “I’m going to hike North!” And you start to hike north, and after 15 minutes you decide, “eh maybe that was the wrong decision. I should hike East!” And you do this for another 15 minutes, and then you decide, “Nah, now that I think about it, South is much better.” 15 minutes later you decide you’re going the wrong way and West was the right way all along. An hour later, you’re back where you started. But, if you had decided to stick with North the entire time, an hour later, depending on your pace, terrain and other factors, you could be 2-4 miles further north. “So what?” you might ask. Well, take a look at a map of almost any part of the country. In most cases you’re less than 10 miles from some sort of road. If you’ve spent 3 hours hiking in a single direction, you’ve probably hit a road, or a powerline or some other sign of civilization. (Note this is NOT advice to wander in the woods if you get lost or a promise this will work anyplace. There are definitely places in the US where this advice is bad advice.) Also obviously, if you hit a gorge or other impassable geologic feature, you may have to change directions. Or you might get another clue (like hearing a chainsaw or engine or something human-caused in a specific direction).
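If you want to see that arithmetic laid out, here’s a rough sketch. It assumes perfectly flat terrain, a steady 3 mph pace (my pick from the middle of that 2-4 mile range) and exact compass headings, none of which you’d get in real woods. The names are made up for illustration.

```python
import math

# Unit vectors for compass headings (x = east, y = north).
HEADINGS = {"N": (0, 1), "E": (1, 0), "S": (0, -1), "W": (-1, 0)}


def distance_from_start(legs, pace_mph=3.0):
    """Straight-line miles from the start after walking (heading, minutes) legs.

    pace_mph is an assumed constant walking speed.
    """
    x = y = 0.0
    for heading, minutes in legs:
        dx, dy = HEADINGS[heading]
        miles = pace_mph * minutes / 60
        x += dx * miles
        y += dy * miles
    return math.hypot(x, y)


second_guessing = [("N", 15), ("E", 15), ("S", 15), ("W", 15)]
sticking_to_it = [("N", 60)]

print(distance_from_start(second_guessing))  # 0.0 miles: back where you started
print(distance_from_start(sticking_to_it))   # 3.0 miles north of the start
```

Same hour of walking, same effort, and the only difference is whether you kept changing your mind.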

Final Thoughts

So, if you’re going to make a choice, make it confidently. And don’t second-guess yourself until new, solid reasons come along.

So, keep your choices simple and stick to them.

And with that, I choose to stop typing now.

 

Fail-safes

Dam it Jim, I’m a Doctor, not a civil engineer

I grew up near a small hydro-electric dam in CT. I was fascinated by it (and still am in many ways). One of the details I found interesting was that on top of this concrete structure they had what I later found are often called flashboards. These were 2x8s (perhaps a bit wider) running the length of the top of the dam, held in place by wooden supports. The general idea was they increased the pooling depth by 8″ or so, but in the event of a very heavy water flow or flood, they could be easily removed (in many cases removed simply by the force of the water itself). They safely provided more water, but were designed in fact to fail (i.e. give way) in a safe and predictable manner.

This is an important detail that some designers of systems often don’t think about: how to fail. They spend so much time trying to PREVENT a failure, they don’t think about how the system will react in the EVENT of a failure. Properly designed systems assume that at some point failure IS not only an option, it’s inevitable.

When I was first taught rigging for cave rescue, we were always taught “Have a mainline and a belay”. The assumption is that the system may fail. So, we spent a lot of time learning how to design a good belay system. The thinking has changed a bit these days; often we’re as likely to have TWO “mainlines” and switch between them. But the general concept is still the same: in the event of a failure EITHER line should be able to catch the load safely and be able to recover. (i.e. simply catching the fall but not being able to resume operations is insufficient.)

So, your systems. Do you think about failures and recovery?

Let me tell you about the one that prompted this post. Years ago, I built a log-shipping backup system for a client. It uses SSH and other tools to get the files from their office to the corporate datacenter. Because of the network setup, I can’t use the built-in SQL Server log-shipping copy commands.

But that’s not the real point. The real point is… “stuff happens”. Sometimes the network connection dies. Sometimes the copy hangs, or they reboot the server in the office in the middle of a copy, etc. Basically “things break”.

And, there’s another problem I have NOT been able to fix, that only started about 2 years ago (so for about 5 years it was not a problem.) Basically the SQL Server in the datacenter starts to have a memory leak and applying the log-files fails and I start to get errors.

Now, I HATE error emails. When this system fails, I can easily get like 60 an hour (every database, 4 times an hour plus a few other error emails). That’s annoying.

AND it was costing the customer every time I had to go in and fix things.

So, on the receiving side I set up a job to restart SQL Server and Agent every 12 hours (if we ever go into production we’ll have to solve the memory leak, but at this time we’ve decided it’s such a low priority as to not bother, and since it’s related to the log-shipping and if we failed over we’d be turning off log-shipping, it’s considered even less of an issue). This job comes in handy a bit later in the story.

Now, on the SENDING side, as I’ve said, sometimes the network would fail, they’d reboot in the middle of a copy or something random would make the copy job get “stuck”. This meant rather than simply failing, it would keep running, but not doing anything.

So, I eventually enabled a “deadman’s switch” in this job. If it runs for more than 12 hours, it will kill itself so that it can run normally again at the next scheduled time.
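For anyone who hasn’t built one, here’s a minimal sketch of the general idea. To be clear, this is NOT the actual job from this system; it’s a generic illustration in Python, and the function name and the stand-in “stuck” command are made up. It leans on the fact that `subprocess.run` kills the child process when its `timeout` expires.

```python
import subprocess
import sys


def run_with_deadman_switch(command, max_seconds):
    """Run `command`, killing it if it is still running after `max_seconds`.

    Returns the job's exit code, or None if the job had to be killed.
    """
    try:
        # subprocess.run kills the child and waits for it when the timeout expires,
        # then raises TimeoutExpired.
        completed = subprocess.run(command, timeout=max_seconds)
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode


if __name__ == "__main__":
    # Hypothetical stand-in for a hung copy: sleeps far longer than it is allowed to.
    stuck_job = [sys.executable, "-c", "import time; time.sleep(60)"]
    result = run_with_deadman_switch(stuck_job, max_seconds=1)
    print("killed by deadman's switch" if result is None else f"exited with {result}")
```

For a 12 hour limit like mine you’d pass `max_seconds=12 * 60 * 60`. The key design point is that the switch doesn’t try to fix anything; it just clears the wreckage so the next scheduled run starts clean.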

Now, here’s what often happens. The job will get stuck. I’ll start to get email alerts from the datacenter that it has been too long since logfiles have been applied. I’ll go in to the office server, kill the job and then manually run it. Then I’ll go into the datacenter, and make sure the jobs there are running.  It works and doesn’t take long. But, it takes time and I have to charge the customer.

So, this weekend…

the job on the office server got stuck. So I decided to test my failsafes/deadman switches.

I turned off SQL Agent in the datacenter, knowing that later that night my “cycle” job would turn it back on. This was simply so I wouldn’t get flooded with emails.

And, I left the stuck job in the office as is. I wanted to confirm the deadman’s switch would kick in and kill it and then restart it.

Sure enough later that day, the log files started flowing to the datacenter as expected.

Then a few hours later the SQL Agent in the datacenter started up again and log-shipping picked up where it left off.

So, basically I had an end to end test that when something breaks, on either end, the system can recover without human intervention. That’s pretty reassuring. I like knowing it’s that robust.

Failures Happen

And in this case… I’ve tested the system and it can handle them. That lets me sleep better at night.

Can your systems handle failure robustly?

 

 

Things Left Unsaid

Pop Quiz

You show up at an accident scene and see two patients. One is screaming in pain about a broken arm. The other is propped up against the wall seemingly fine, not saying a word. Which one do you check out first?

Many will answer, “the one screaming in pain about the broken arm, the other person is fine.” The experienced responder will most likely check out the 2nd person. Why? Because they’re NOT saying a word.

Here’s the thing. You know the 1st person has a pulse and an airway. They’re breathing just fine. Perhaps a bit too fine. A broken arm, by itself isn’t going to kill them.  But what about that 2nd person? Are they breathing? You don’t know. Perhaps they’re not saying a word because they’ve stopped breathing.  If you take the time to splint the broken arm and then get to the 2nd person, they may have died. So, check out the 2nd patient first, then determine your course of action.

We’re Safe! Really, we are. Trust us, because we keep repeating it!

I say this because in problem solving, I often find what’s NOT said is far more important than what is said. Several years ago my son received a letter saying he had been nominated for a program that took children to other countries on basically extended field trips. It actually sounded really interesting. We went to the presentation. I sat through it thinking, “this is really cool.” But, two things struck me. First, they kept emphasizing how safe it was. At first pass, and the first time they mentioned it, I wasn’t bothered. I mean as a parent, you want to know your kid is going to be safe if you put them in the hands of strangers for an extended period of time. But, they kept emphasizing it. It got to the point that all three of us (my wife, my son and I) started to wonder, “why the hell are they dwelling on this point?”

The other thing that was bothersome was once we got out of the lecture hall and tried to speak to some of the individuals, we asked them “How did our son get nominated?” “Oh it must have been a teacher at his school.” Which sounded great until we thought about it and thought it strange that no teacher had mentioned this to us or our son.

So, when we got home, we did some digging and found out there had been several incidents of accidents happening to students while overseas with this group. On one hand, nothing struck me as too statistically terrible, but the reports of the handling and the fact that we were only reading about the ones reported made me even more paranoid about how unsafe the program really was. I mean why emphasize safety unless you really feel like you have to?

The other detail we uncovered was most parents had the same experience about “your child has been nominated” without any word of by whom. The most troubling was at least one or two parents who chimed in to say that their child had been “nominated” even though the child had been killed in an accident or had otherwise died after their name appeared in the newspaper for being on the honor roll; that is, a fact that a teacher who might be in a position to nominate the said child would be well aware of. As far as we and other parents could determine, the “nomination” process was solely a matter of the group scanning the newspapers for honor roll students and the like.

So, relating this back to IT

As a person who loves troubleshooting, one of the things I’ve learned is NOT to trust what the user initially reports to me. “I haven’t changed a thing and this stopped working!”  That generally means, they changed something. 🙂

I once had a client that had a problem that took at least two winters to diagnose. Why so long you might ask? Because the problem only happened in the winter. The first year it was complaints of “ever since you networked our computers, they reboot without warning.” Now, I had networked them several months previously and they only started to report the problem come the late fall/early winter. I tried several things, but nothing really fixed the problem. I had an idea of what it was, but they wouldn’t listen. So, among other things, I ended up rewiring their entire network (sounds like a lot of work, but it was a total of 4-5 computers and I moved from thinwire Ethernet to 10baseT (I did say this was a long time ago, right?)).

Eventually I sort of gave up. Until the next winter rolled around and they started to call again. Again, I told them what I thought the problem was. Again, they dismissed it. I’m not sure what finally convinced them, but they finally took me up on my suggestion and put in a humidifier and had their office carpet treated with anti-static spray. Yes, despite all their insistence that “I was just sitting there typing and it rebooted”, what was really happening, and what they weren’t saying, was, “I just walked from one office to the other, across the carpet, in the drier than normal air and as soon as I touched my computer it rebooted.” It was the static build-up all the time.

So this week’s moral of the story: Look beyond what’s being said and pay attention to what’s NOT being said. It might shock you.