Experimenting

There are times when you have to take at face value what you are told.

There are 1.31 billion people living in China. This according to several sources (that all probably go back to the same official document from the Chinese government.)  I’m willing to believe that number. I’m certainly not going to go to China and start counting heads. For one, I don’t have the time, for another, I might look awfully weird doing so. It’s also accurate enough for any discussions I might have about China. But if I were going to knit caps for every person in China I might want a more accurate number.

That said, sometimes one shouldn’t take facts at face value. A case in point is given below. Let me start out with saying the person who gave me this fact, wasn’t wrong.  At least they’re no more wrong than the person who tells me that the acceleration due to gravity is 9.8m/s².  No, they are at worst inaccurate and more likely imprecise. Acceleration due to gravity here on Earth IS roughly 9.8m/s². But it varies depending where on the surface I am. And if I’m on the Moon it’s a completely different value.

Sometimes it is in fact possible to actually test and often worth it. I work with SQL Server and this very true here. If a DBA tells you with absolute certainty that a specific setting should be set, or a query must be written a specific way or an index rebuilt automatically at certain times, ask why. The worst answer they can give is, “I read it some place.”  (Please note, this is a bit different from saying, “Generally it’s best practice to do X”. Now we’re back to saying 9.8m/s², which is good enough for most things, but may not be good enough if say you want to precisely calibrate a piece of laboratory equipment.)

The best answer is “because I tested it and found that it works best”.

So, last night I had the pleasure of listening to Thomas Grohser speak on the SQL IO engine at local SQL Server User Group meeting. As always it was a great talk. At one point he was talking about backups and various ways to optimize them. He made a comment about setting the maxtransfersize to 4MB being ideal. Now, I’m sure he’d be the first to add the caveat, “it depends”. He also mentioned how much compression can help.

But I was curious and wanted to test it. Fortunately I had access to a database that was approximately 15GB in size. This seemed liked the perfect size with which to test things.

I started with:

backup database TESTDB to disk=’Z:\backups\TESTDB_4MB.BAK’ with maxtransfersize=4194304

This took approximately 470 seconds and had a transfer rate of 31.151 MB/sec.

backup database TESTDB to disk=’Z:\backups\TESTDB_4MB_COMP.BAK’ with maxtransfersize=4194304, compression

This took approximately 237 seconds and a transfer rate of 61.681 MB/sec.

This is almost twice as fast.  While we’re chewing up a few more CPU cycles, we’re writing a lot less data.  So this makes a lot of sense. And of course now I can fit more backups on my disk. So compression is a nice win.

But what about the maxtransfersize?

backup database TESTDB to disk=’Z:\backups\TESTDB.BAK’

This took approximately 515 seconds and a transfer rate of 28.410 MB/sec. So far, it looks like changing the maxtransfersize does help a bit (about 8%) over the default.

backup database TESTDB to disk=’Z:\backups\TESTDB_comp.BAK’ with compression

This took approximately 184 seconds with a transfer rate of 79.651 MB/sec.  This is the fastest of the 4 tests and by a noticeable amount.

Why? I honestly, don’t know. If I was really trying to optimize my backups, most likely I’d run each of these tests 5-10 more times and take an average. This may be an outlier. Or perhaps the 4MB test with compression ran slower than normal.  Or there may be something about the disk setup in this particular case that makes it the fastest method.

The point is, this is something that is easy to setup and test. The entire testing took me about 30 minutes and was done while I was watching tv last night.

So before you simply read something on some blog someplace about “you should do X to SQL Server” take the time to test it. Perhaps it’s a great solution in your case. Perhaps it’s not. Perhaps you can end up finding an even better solution.

 

 

 

 

Sharing and teaching

I spent this past Saturday in NYC at another SQL Saturday event. This blog isn’t typically about my day job. But sometimes things overlap.

SQL Saturday is basically a community run, volunteer event where folks who use SQL Server, or are interested in it, get together, present and attend talks and share ideas.

I unfortunately missed the keynote given by Grant Fritchey, but from what I understand, he discussed the importance of networking for DBAs: and he didn’t mean the kind running on TCP/IP handshakes as much as the one running on human handshakes.

This is exactly why I’ve wanted to teach at a SQL Saturday for awhile. I can’t claim my presentation was as well attended as many others and I can’t claim my presentation was as useful as some, but it was well received. (I would be lying if I didn’t say I think handing out cookies didn’t help!)

I’m a firm believer in passing on what I’ve learned. It’s how I learned, others took the time to pass information on to me and I take the time to pass it on to others.

One thing I really like about the SQL Saturday community is how much folks leverage off of each other. Often an idea presented by one presenter is seized upon by another who then expands upon the idea and adds to the body of knowledge.  Then the first presenter will take the updated idea even further.

This is how we learn, by taking ideas, expanding and sharing them back.

Rolling in the Deep

I was at SQL Saturday in Boston this past weekend and I sat in on a session given by Paresh Motiwala: “Why do we shun using tools for DBA job?”

It’s a decent question and sometimes rolling your own (hence the title here and yes I’m listening to Adele right now) is the right answer.

But often, it’s not.

A case in point are DBAs who avoid using the built-in SQL Server maintenance plans, especially for simple tasks such as Backups.

Now, I’ll start out by saying straight up, sometimes they’re not the optimal solution. (You may want to backup certain partitions on a different a rotation schedule and the like as one example.)

Below is a recent situation I came across.

But, if you do decide to roll your own, please do NOT do the following:

  1. First write a sproc that has to be placed into each database as it’s created in order to run the backup. (If you DO decide to go this route, please make this a scripted part of your release procedure so it’s not missed.)
  2. If you DO in fact ignore suggestion #1, I will point out that it doesn’t do much good to make the first step of your sproc to check to see if the database exists. BIG HINT: If the database doesn’t exist, the sproc won’t run in the first place! Yes, I know you’re saying, “Obviously” but this is the situation I just came across.
  3. If you DO in fact ignore suggestion #2, you probably don’t need to call your own special “alert sproc”.
  4. But if you DO in fact ignore suggestion #3, make sure your “alert sproc” does more than call a “email sproc” and pass it a few parameters.
  5. And if you DO in fact ignore suggestion #4, please make sure your “email sproc” does a bit more than build a message and call the built-in SQL Server stored proc to send an email.

Now granted, there may be reasons to do some of the above. Perhaps your alert sproc also calls some sort of 3rd party monitoring tool. Or your email needs are very specific.

In the case I just rectified, none of that was true.  So there was a lot of additional complexity (which really didn’t work anyway) for no good reason.

Another problem this roll your own backup setup had was that it used the same filename every time for its backups. i.e. On Monday the backup name was M:\Backups\FOO_FULL.BAK.  On Tuesday it was M:\Backups\FOO_FULL.BAK, etc.

In theory (and generally in practice) each of these would be backed-up to a 3rd party so it was in theory possible to find the backup for a specific day, but that was an added complexity; and probably not one you want in an actual DR situation.  Also, if for some reason the backup to the the 3rd party failed (but the local backups continued) they’d definitely lose the ability to restore specific days of backups.

In addition, the person who built this procedure setup differential backups for MOST databases to run every 15 minutes.  Now, I wouldn’t necessarily call that a terrible idea, but in this case, almost certainly not the best approach in my opinion.  However, again, the same file name was used each time.

This means that in a DR event, the company could restore the previous nights backup and if they wanted, the most recent Diff backup and that was it.  If they wanted to restore to a point in time in between, that was impossible. And in my experience this is far more common than most other restore needs.

Finally, the developer who wrote all this clearly did not understand what the CHECKPOINT command did. He had scheduled a checkpoint job to run every 30 minutes.  Again, in general, not only not necessary, but probably a bad idea. However in this case it not only was not necessary, the reason given in the job comments was completely wrong.  He seemed to think it would keep the transaction logs from growing.

This is of course NOT what it does and sure enough on the one database still with FULL RECOVERY enabled the transaction log was far larger than the actual database. (Fortunately it was a lightly used database or the disk might have filled up years ago.)

Since discovering all this, I’ve gone and replaced all this complexity with a set of maintenance jobs.  These will guarantee each system database is backed up weekly (for their needs this should be fine) with unique names.  User databases will be backed up nightly and retained for 4 nights (and perhaps extended once we determine fully how much disk space we want to set aside for this.)  Transaction logs will be performed every 15 minutes. These too will have unique names.

Now the customer can restore to any point in time in the last 4 days (if they go to their 3rd party backup, even further back) up to 15 minutes before a failure (and in some cases if the log is still available and they can backup the tail of the log, up to the instant before the failure).

If they add additional databases, they don’t have to worry about remembering to put in 3 separate sprocs for each database added and adding new jobs to the SQL Server Agent.

Now they not only have a far more robust backup plan, they have one that is far easier to maintain.  Oh and one that will actually send an email if there’s a problem with the backup.

The morale is: Don’t make things more complex unless you absolutely need to and if you do, make sure you actually achieve the goals you’re trying to achieve.

Practicing for Disaster

I’ve had this post by Wayne Hale in my queue for awhile since I’ve wanted to comment on it for awhile and until lately have been to busy to do so.

One of my current contracts requires them to do an annual DR test.  Since the end of the year is approaching, they’re trying to get the test in. Part of the test requires an “official” test observed by an outside auditor.

So, being smart, and since a lot has changed in the past year, we decide to schedule a dry-run or two before hand.

Well let’s just say those have not gone as expected.

Some might consider the dry-runs failures.

I don’t. I consider them successes. We are finding out now, in a controlled environment with no real time pressures, where we are weak and need to fix things.

It’s far better to do this now than during the audited test or even better than during an actual disaster event! So the dry-runs are serving their purpose, they’re helping us find the holes before it’s too late.

That said, I have to claim the part that I’m most involved with, the SQL Log-Shipping has been working well.  The only issue this week with that was a human error made by another DBA that was completely unrelated to the DR test and within minutes of him discovering his error he executed the proper procedure to begin fixing it.  The total fix on his end took no more than 5 minutes and other than monitoring on my end, the effort on my end took no more than 5 minutes.  That’s an excellent example of a robust design and set of procedures.

Today’s moral is don’t just have a DR plan, practice it. And not every failure is really a failure.

All set to think

A short post today.

This past weekend I attended another SQL Saturday event. This one in Washington DC.  For anyone interested in SQL Server, I highly recommend these events. Typically the cost is just $10 to attend for the day.

I had put in a bid to present but unfortunately was declined.  That’s fine, there were plenty of other seminars worth my time to attend.

One of them, Common Coding Mistakes and how to Mitigate Them by William Wolf was a good example of that. @SQLWareWolf did a great job of illustrating a number of fairly common mistakes.  That itself was worth the price of admission. But what I really enjoyed was an observation of his, one that I’ve had in the past.  As a DBA, I can often spot stored procs written by a programmer with a non-SQL background.  The tell-tale is that many (certainly not all) programmers who come to SQL from another background often think in terms of rows, not sets.

An example of this would be the case of a programmer opening up a cursor and looping through the cursor to set the date on a number of records.  Of course for any person with a SQL background, we understand that can be a single statement which will execute far faster.

Now, there are certainly times when row by row is the only way to do it, but if you see that in a sproc or script, I’ll bet you 5:1 it was written by a programmer who didn’t know better.

Now, to be fair as a DBA who learned to program when “object oriented” wasn’t a buzz-word, I’ve been doing a lot of VB.net programming lately and I have to admit often I’ll find myself down a hole of twisty paths before I’ll realize I’m writing bad code and if I simply think of the data I’m trying to modify as an object, suddenly things get FAR more clear and easier.  I would not be surprised if 5 years from now the next programmer comes along, looks at my code and thinks, “What was he thinking? He obviously understood objects, but didn’t do everything he should have using objects.”

But in both cases, over time the non-SQL programmer coming to SQL will learn and I hope my OO code is slowly getting better.

No real moral here other than common observations and that hopefully overtime, we all improve.