Unknown's avatar

About Greg Moore

Founder and owner of Green Mountain Software, a consulting firm based in the Capital District of New York focusing on SQL Server. Formerly, a consulting DBA ("and other duties as assigned") by day, and sometimes night, and caver by night (and sometimes day). Now, a PA student working to add PA-C after my name so I can work as a Physician Assistant. When I'm not in front of a computer or with my family I'm often out hiking, biking, caving or teaching cave rescue skills.

Bacon

Another SQLSaturday has come and gone. This one in Philadelphia. I had actually been to this location twice before for SQL Server User Group meetings. I have a client in the area, so sometimes my schedule syncs up and I manage to get in some time with the Philly SQL Server User Group. It’s free, and it’s always a good way to learn something new.

Several months ago there were several memes going around on the Internet about how much money it cost to rescue Matt Damon from various places (WWII Europe, from beyond a black hole, from Mars). Today I caught a clip of a deleted scene from The Martian where his character talks about how much was spent to rescue one person.

It comes down to the fact that humans are social creatures and we like being part of a community. Sometimes I stop and think about the communities I’m in and how they overlap or don’t overlap.

As I’ve mentioned, I’m an instructor with the National Cave Rescue Commission. I’m also the Northeast Regional Coordinator for said organization. We technically don’t rescue folks, we teach folks to rescue people. We do it because the people who are best suited to rescue cavers are other cavers.  We do it because we know that someday there will be someone stuck or injured or lost in a cave and they may need our help. And we do it because we know that person might be us.  And if we’re not willing to help others, who will help us?

The other community on my mind today is the SQLSaturday community. Sure, we’re generally not dealing with live or death decisions and I can’t think of the last time someone’s life was on the line as I was doing work as a  DBA. But the concept is still the same.  We’re only as good as the community. Sure, there are some brilliant DBAs out there. There are some folks who speak T-SQL more fluently than many of us speak our native language. But ultimately, even they often will rely on the knowledge of others because there’s simply too much to know about this platform. So we come together and share.

Both of these communities, caver rescuers and SQL Saturday presenters share a couple of things in common.  We have a love for what it is that we do. We desire to help others. And, we don’t do it for the profit or the glory.  When I teach any of the NCRC cave rescue courses I get paid with my meals. The same is true with SQL Saturdays. Sometimes I might get a free shirt. And generally I’m paying for my own travel to any of these events.

However, there’s one other form of payment I strive for in each case. That of passing on knowledge and of gaining knowledge. With the cave rescue classes, I can often recall distinctly the look on a student’s face when a concept I’m teaching registers and they have what I call the “aha” moment. That’s worth the time and effort for me.

The same is true with when I teach at SQL Saturdays. If I have a single attendee come up to me and say, “Hey, I really liked your talk. The idea for X really resonated” I feel like I’ve accomplished my goal.

Equally, I almost always come away with learning something myself.That too makes it worth my while.

My talk yesterday was “Tips that have saved my Bacon”. I’ll admit I was a bit nervous giving this talk, since the first time I gave it at SQLSaturday NYC 2015 I had only about 8 people show up. So I wasn’t sure how popular it would be this time and being slotted for the first sessions in the morning I wasn’t sure if folks would be  up yet.

Fortunately, over 25 people showed up and seemed to enjoy it. And yes, I did have several people come up to me afterwards thanking me for the session and that they enjoyed it. So, my mission was accomplished. I also received some good critical feedback for improving it next time.

I’ll be speaking again at SQLSaturday here in Albany NY on July 30th. This time though will be a different topic, one I really enjoy and so far I’ve received great feedback on. It’s not about SQL specifically, but more about how IT and Management can learn from plane crashes.

I hope you can make it.

Documentation

Do it, it’s important.

Ok, I suppose I should expand a bit upon that and in this case add an actual example.

So last night, I again attended the local SQL Server User Group meeting. The talk this month was by Ray Kim and was on Documentation for Techies.  While we all agree that documentation is good, it’s sort of interesting how rare most techs actually do it. Ray’s talk covered some of this and further talked about exactly how valuable it is. In addition, several audience members spoke about how proper documentation saved their company a great deal of money simply by giving their tech support people the ability to answer questions in a far faster form.

I got thinking about some of the clients I’ve worked for and how I’ve wanted to document stuff, but often they have very little actually setup in the way of procedures to handle documentation. This is unfortunate, because it can cost them money. For example, for a client right now I’m working on automating a task.  It turns out that there’s not much documentation, so I’m basically struggling to figure things out as a I go.

One thing you hear tech folks talk about a lot is “oh the code is self-documenting”. And sometimes it is.  Since I work in SQL, often, but not always it’s clear what the code is doing. For example

Select firstname, lastname from Clients where ClientID=@ClientID

probably doesn’t need a comment saying what it does.  It’s pretty clear.  But a more complex query might need some commenting, or it may need some explanation as why a particular approach was taken. For example I was recently writing a stored procedure where the where clause was not quite what one would expect if one were to naively write it in the most obvious manner.  However, the obvious manner would have resulted in a table scan of a very large table. By writing what I did, I could ensure a seek would occur.

I also had a habit, which after thinking about last night and testing today, I’m going to modify a bit. Often I’d write procedures such as:

-- Usage: Exec FOO
-- Author: Greg D. Moore
-- Date: 2016-03-15
-- Version: 1.0
-- This simply returns bar when executed
if OBJECT_ID('foo', 'p') is not null drop procedure foo
go
create procedure foo
as
select 'bar'
go

Now, note technically this is a script (T-SQL) that will drop and then create the procedure, so it’s more than just the script. But it’s useful for me because I can ensure I’m running the latest and greatest and drop the old one if it exists before running it.

But, last not got me thinking. What happens if 3 years down the road someone comes along and needs to edit my code. Let’s say the client didn’t do a good job of keeping track of source code and they have to extract the scripts to create the procedures from SQL Server itself using say SSMS?

The results end up looking much more like this:

USE [Baz]
GO
/****** Object:  StoredProcedure [dbo].[foo]    Script Date: 03/15/2016 10:47:22 ******/
IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[foo]') AND type in (N'P', N'PC'))
DROP PROCEDURE [dbo].[foo]
GO
USE [Baz]
GO
/****** Object:  StoredProcedure [dbo].[foo]    Script Date: 03/15/2016 10:47:22 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
create procedure [dbo].[foo]
as
select 'bar'
GO

Ignore the extra USE statements and the SSMS generated comments and SET statements. Notice my comments are gone.  This actually makes sense because in the first script, the comments occur before a GO statement so the SQL engine interprets them as completely separate from the statements to create the actual stored proc.  All my useful comments are now history.

BUT, there’s a simple solution. Move the comments to after the first GO statement.

if OBJECT_ID('foo', 'p') is not null drop procedure foo
 
go
 
-- Usage: Exec FOO
-- Author: Greg D. Moore
-- Date: 2016-03-15
-- Version: 1.0
-- This simply returns bar when executed
-- Version: 1.1
-- Comments moved below GO statement
 
create procedure foo
as
 
select 'bar'
go

Now if I use SSMS to generate my script I get:

USE [Baz]
GO

/****** Object: StoredProcedure [dbo].[foo] Script Date: 03/15/2016 10:48:53 ******/
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[foo]’) AND type in (N’P’, N’PC’))
DROP PROCEDURE [dbo].[foo]
GO

USE [Baz]
GO

/****** Object: StoredProcedure [dbo].[foo] Script Date: 03/15/2016 10:48:53 ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

— Usage: Exec FOO
— Author: Greg D. Moore
— Date: 2016-03-15
— Version: 1.0
— This simply returns bar when executed
— Version: 1.1
— Comments moved below GO statement

create procedure [dbo].[foo]
as

select ‘bar’

GO

Now my great documentation is preserved. This is a small thing but down the road could save the next developer a lot of trouble.

So, stop and think about not only documentation, but how to make sure it’s preserved and useful in the future.

Who’s Flying the Plane

I mentioned in an earlier post my interest in plane crashes. I had been toying with a presentation based on this concept for quite awhile.

A little over a month ago, at the local SQL Server User group here in Albany I offered to present for the February meeting. I gave them a choice of topics: A talk on Entity Framework and how its defaults can be bad for performance and a talk on plane crashes and what IT can learn from them.  They chose the latter. I guess plane crashes are more exciting than a dry talk on EF.

In any event, the core of the presentation is based on the two plane crashes mentioned in the earlier post, Eastern Airlines Flight 401, the L-1011 crash in Florida in 1972 and US Airways Flight 1549, the Miracle on the Hudson in 2009.

I don’t want to reproduce the entire talk here (in part because I’m hoping to present it elsewhere) but I want to highlight one slide:

Flight 401 vs 1549

  • Flight 401 – Perfectly good aircraft
  • Flight 1549 –About as bad as it gets
  • Flight 401 – 101 Fatalities/75 Survivors
  • Flight 1549 – 0 Fatalities

Flight 401 had a bad front nosegear landing light and crashed.

Flight 1549 had two non-functional engines and everyone got off safely.

The difference, was good communications, planning, and a focus at all times on who was actually flying the airplane.

This about this the next time you’re in a crisis.  Are you communicating well? How is your planning, and is someone actually focused on making sure things don’t get worse because you’re focusing on the wrong problem.  I touch upon that here when I talk about driving.

The moral: always make sure someone is “flying the plane”.

Avoiding Mistakes

People often treat mistakes as unavoidable. Or sometimes people think mistakes are made because the person making them is unfamiliar with the situation or environment.

The truth is far more complex and often mistakes are not only avoidable, but they’re a direct result of the person being overly familiar with the situation or environment.

A post that came across my desk the other day discusses the Normalization of Deviance. No, I’m not talking about how references to 50 Shades of Grey are all over the place. I’m talking about how we come to accept errors as “ok” or even normal.

A classic example of this are the cases of O-ring burnthrough on flights prior to the Challenger disaster. The original specs called for no burnthrough. Any burnthrough was not acceptable. Yet once it was observed the basic attitude was that it hadn’t caused a problem so it was acceptable. At one point apparently when one was burned through approximately 33% the way, a claim was made that the O-rings had a safety factor of 3. This is a gross misapplication of the concept of a safety-factor since the O-rings were specced to have zero burn-through. By moving the goalposts, they permitted further launches to occur and burnthroughs to continue to until 51L and seven lives were lost. This was a huge management error. In this case the mistake was to ignore the original rule and essentially rewrite it without adequate review. The engineers had become used to the new norm, despite it being wrong.

In the example given in the first link above, a different form of deviance occurred. This was a social deviance that apparently made the lack of use of checklists acceptable.

In the crash, a large red warning device was completely ignored. One would think this was the mistake that caused the crash. However it’s really secondary to the original problem. The original problem is that checklists were developed precisely because humans CAN fail to notice large red warning devices. By not performing the checklist a mistake was missed and lives lost. Everything else is sort of fluff.

For pilots, take-offs become a routine procedure. So routine they begin to make simple mistakes.  Had this been their first time flying or even their first time in that particular model of aircraft they most certainly would have been paying attention. This is why checklists exist in cases like this, to eliminate the mistakes routine can introduce. Either pilot should have questioned the lack of the take-off checklists and insisted on their use.

They didn’t and people died.

 

 

Post hoc ergo propter hoc

One of my favorite shows is The West Wing and there is an episode of the same name as this post. Unfortunately for you, Aaron Sorkin is a better writer than I.

That said, this concept, “After it, therefore because of it” is a common mistake many of us make when forming theories. It’s related to the concept that correlation is not causation.

I was reminded of this the other night when another phrase entered my mind: “Rain Follows The Plow”. This was a hopeful theory in the 19th century that as settlers settled past the 100th Meridian, the rain would follow where they plowed. Simply put, by farming the land, rainfall would increase.

The theory sounds a bit perverse until one considers that for awhile, increased rainfall did seem to increase as the more land came under the plow. So, there was some basis for the idea at first. The correlation seemed to match. However, this just ended up being a short-term climate change.

Unfortunately the theory was also a product of the idea that humans were the center of creation. As the subsequent Dust Bowl and other issues showed however, this theory was, (excuse the bad pun) all wet.

Sometimes correlation is not causation and we should not let our all too human biases influence our theories.

Fortunately, properly done, science is eventually self-correcting. Scientists make mistakes, but over time, the winnowing process eliminates them.  The idea of scientific racism was once extremely popular, but over time has clearly been shown to be false.  The idea of an ether was shown to be false.

Meanwhile, other theories have continued to hold up to intense scrutiny. As weird as quantum mechanics appears to be, evidence continues to mount that much of the current theory is in fact correct. When scientists discover particles that travel faster than light the default assumption continues to be (and so far correctly) that there is an error in the experiment.

Not much of a moral here other than just because the rooster crows when the sun rises, don’t mistake the crowing for the cause of the sunrise.

 

GIGO

A huge tenet of programming is GIGO: Garbage In/Garbage Out.

Years ago when I was practicing for a play (Night of January the 16th by Ayn Rand). I was the bailiff.  At one point in the play I’m handed a copy of a check that is evidence. I’m supposed to “read” what’s on the check. Of course since it’s a play, I have my lines memorized.

But during this dress rehearsal I’m given a piece of paper with actual writing on it. Unfortunately it was just some random writing. But my brain went into segfault and I stopped. Part of my brain wanted to read what’s on the piece of paper.  Part of my brain wanted to say my lines, but it could no longer remember them.

It was a perfect example of how easy it is to scramble the input for our brains.  In the actual performances we made sure the piece of paper was actually blank.

I was reminded of this the other night when Steve Harvey made his gaff on live television. I was curious how he could make such a mistake but I had my suspicions. And I was right.  The cue card apparently was VERY poorly designed and his visual input system (i.e. his eyes and brain) screwed up. Read here for more details. Bad input lead to bad output.

These are humorous examples, but in the software world, these can be very dangerous.

At one point during the shuttle program, they found an error where the arm thought it had rotated more than 360 degrees, a physical impossibility. This link has some details (though in my recollection the issue was not a rounding error but that the code went from 0-360 instead of 0-359 or 1-360).  Garbage in could have lead to potentially bad garbage out.

Much more recently however, here’s an example of intentional “garbage” in. This is part of the encryption software used on many firewalls. Your bank or other financial institution for example may be using this code.

Ironically true garbage, as in a purely random number, might be better. But here it seems someone poisoned the input with their specific number and then set it up to use the results in a dangerous manner. I say dangerous because the 3rd party using this code may not realize that they’re completely vulnerable to having all their data seen.  About the only thing worse than unencrypted data is data you think is encrypted but isn’t.  In the former, I’m probably going to pay far more attention to who has access.  I’ll add too that some of us suspect the NSA had a hand in this.

This is by the way I highly recommend folks don’t write their own encryption. Unless you’re an expert you’re liable to screw it up.

Moral: So be careful of your inputs, they definitely influence your outputs, both in code and in your brain.

Page 81

One of the things that has always fascinated me is human memory; how we create them, what sticks, what doesn’t and how it evolves.

Many people think that human memories are rather static. The truth is that’s far from the truth.  As we saw in the 1990s is easy to form false memories and easy to conflate them.

One detail that is interesting is that human memories are a bit like DRAM in a computer.  In essence when we recall a memory, we have to basically read out the memory space and write it back.  One of the side effects is this can actually help strengthen memories.  However, it also means when it’s written back, other memories can be conflated with it and a new, slightly different memory is formed.

There’s two main ways of remember something that stand out to me as I write this.  Repetition and what I’ll call “sudden shock”.

Many things we need to repeat until we remember them.  An example is a child learning their times tables. There’s really not much context and really only rote repetition will cause these to sink in.

At the other end of the spectrum are the memories that are etched in our minds. “Where were you when Challenger blew up?”  “How did you first hear about 9/11?”  If you ask someone of the right age, they’ll know exactly when/where they were and probably recall vivid details.

If you ask them where they were on the 3rd shuttle mission, they’d probably have no clue.  The same is true if you ask them what they were doing on 9/9.

In between are more general memories. Memories of childhood that don’t necessarily have a specific timestamp or even importance.  I recall playing in some woods behind my house growing up, but there was nothing really significant about the time or place. I have no idea why I have that memory.

I mentioned above that memories can be modified or manipulated. There’s some work on treating PTSD this way; helping patients recall specific events under controlled circumstances and essentially rewriting the memory into something that doesn’t cause an attack. (Propanolol is one drug being experimented with to do this.)

Strangely there’s one memory of mine that persists that while not a real issue is sort of pointless and annoying to me.  It’s “Page 81”.

What’s that you ask? Many years ago (let’s just say before I was a teenager I think) I was staying at my cousin’s grandmother’s house.  On the bookshelf they had a copy of Jaws 2. I started reading it but had to leave before I could finish it. Since I knew I’d be back the next summer I decided to remember what page I was on. I repeated the page number to myself over and over again. And to this day, I can remember, I was on page 81 of Jaws 2 when I stopped reading.  Of course decades later I have no idea what happened in pages 1-80 so the memory doesn’t do me much good. But there it is. It’s still there. Page 81.

As a note, most of this post was based on memory (I had to look up the name of the drug) so some details may be wrong.

Page 81.

 

 

On Call

I want to pass on a video I’ve finally gotten around to watching:

Dave O’Conner speaks

I’ve managed a number of on-call teams to various levels of success. One point I’d add that makes a difference is good buy-in from above.

He addresses several good points, most of which I would fully agree with and even at various times adopted at my various jobs.

One thing he mentions is availability.  Too often folks claim they need 99.999% uptime. My question has often been “why?” and then followed by, “Are you willing to pay for that?”  Often the why boils down to “umm.. because…” and the paying for it was “no”, at least once they realized the true cost.

I also had a rule that I sometimes used: “If there was no possible response or no response necessary, don’t bother alerting!”.

An example might be traffic flow.  I’ve seen setups where if the traffic exceeds a certain threshold once in say a one hour period (assume monitoring every 5 seconds) a page would go out.  Why? By the time you respond it’s gone and there’s nothing to do.

A far better response is to automate it such that if it happens more than X times in Y minutes, THEN send an alert.

In some cases, simply retrying works.  In the SQL world I’ve seen re-index jobs fail due to locking or other issues.  I like my sleep.  So I set up most of my jobs to retry at least once on failure.

Then, later I’ll review the logs. If I see constant issue of retries I’ll schedule time to fix it.

At one client, we had an issue where a job would randomly fail maybe once a month.  They would page someone about it, who would rerun the job and it would succeed.

I looked at the history and realized simply by putting a delay in of about 5 minutes on a failure and retrying would reduce the number of times someone had to be called from about once a month to once every 3 years or so.  Fifteen minutes of reviewing the problem during a normal 9-5 timeframe and 5 minutes of checking the math and implementing the fix meant the on-call person could get more sleep every month. A real win.

Moral of the story: Not every thing is critical and if it is, handle it as if it is, not as a second thought.

American Cave Accidents

I’ve mentioned in the past that I teach cave rescue and I work with some great people.  This week I received the biannual American Cave Accidents report that the NSS puts out. (I’ll mention I’m friends with the editor and several of the folks who helped advise on this issue.)

This report in theory covers all reported cave accidents in the last 2 years in the US (and sometimes Canada and Mexico and other parts of the Americas.) In reality, anyone who is familiar with caving knows it misses some.

As I was reading today, I noticed in  several of the reports, the names of several people that I recognized; most are fellow instructors, or at the very least students.

Now, one could apply correlation is causation and determine that perhaps the presence of NCRC trained people causes caving accidents.  Thankfully I’m pretty sure that’s not true. But it got me thinking, “Why does it seem such a high proportion of the accident reports involve people I know?”  I certainly do NOT know that many cavers.

Then it dawned on me: these people are perhaps far more likely to actually submit accident reports. I know of at least two accidents where the people involved did NOT want to report.  Why? Because they thought either it was a bit embarrassing or because they didn’t think there was anything to learn from their experience.

I posit those who go through the NCRC training realize more than many others the value of sharing their experiences and that it is perhaps it is worth risking a little embarrassment (and to be honest there never really is any embarrassment in my opinion) if others can learn from their experiences.

I happened to be talking to another friend of mine tonight who is also in the EMS field and he also brought up as a separate discussion how much value there was in shared knowledge like this and how some folks don’t like to do so.

The more you can broaden your horizons and experiences you can hear about and share in your field of expertise will make you a better person in that field.

Take the time to share your mistakes and your success. Share your unusual stories, you never know when someone else can benefit from them.

Sharing and teaching

I spent this past Saturday in NYC at another SQL Saturday event. This blog isn’t typically about my day job. But sometimes things overlap.

SQL Saturday is basically a community run, volunteer event where folks who use SQL Server, or are interested in it, get together, present and attend talks and share ideas.

I unfortunately missed the keynote given by Grant Fritchey, but from what I understand, he discussed the importance of networking for DBAs: and he didn’t mean the kind running on TCP/IP handshakes as much as the one running on human handshakes.

This is exactly why I’ve wanted to teach at a SQL Saturday for awhile. I can’t claim my presentation was as well attended as many others and I can’t claim my presentation was as useful as some, but it was well received. (I would be lying if I didn’t say I think handing out cookies didn’t help!)

I’m a firm believer in passing on what I’ve learned. It’s how I learned, others took the time to pass information on to me and I take the time to pass it on to others.

One thing I really like about the SQL Saturday community is how much folks leverage off of each other. Often an idea presented by one presenter is seized upon by another who then expands upon the idea and adds to the body of knowledge.  Then the first presenter will take the updated idea even further.

This is how we learn, by taking ideas, expanding and sharing them back.