And Fun Was Had By All

I want to give shout-outs to Rob Farley, Peter Shore, Deborah Melkin, and Rob Sewell for making the final Capital Area SQL Server User Group meeting of the year a rousing success. And this being T-SQL Tuesday, I'm going to try to very loosely tie this into this month's topic, "How much do you love meeting in person, where would you like for your next event to take place, and why Costa Rica?", as invited by Xavier Morera.

T-SQL Tuesday

First, let me get his questions out of the way:

  1. Favorite Conference: The easy answer has been PASS Summit, but honestly, at this point, anywhere I get to see folks in person!
  2. Best Venue: Ignoring PASS Summit in Seattle, I have to say Manchester, UK was nice, simply because it was my first overseas SQL Saturday; or perhaps the Virginia Beach SQL Saturday, because Monica Rathbun and her group provided a nice charcuterie board!
  3. Best Presenter: Oh, this is a tough one. I’m going to take a pass. But then cheat and answer below. Sort of.
  4. Next event and why it'll be Costa Rica: I suspect a bit of bias in this question, but to be honest, I'd love to go. I think 2022 will be a bit too busy for me to visit, but perhaps 2023 or 2024. Maybe I can work in some caving then too!

That all said, I want to get back to my shout-outs above and tie that into this T-SQL Tuesday.

As the coordinator for the Capital Area SQL Server User Group, one of my primary roles, in fact perhaps the most important, is finding speakers to present. Over the past few years I've tried to bring in a good variety of speakers and topics. We haven't really missed a meeting since the pandemic started, but we have been virtual for well over a year now. This has presented both drawbacks and opportunities. The biggest drawback, of course, is the lack of actual in-person interaction and the feeling of connectedness it brings. On a personal note it also means not only have I not gotten out of cooking dinner the night of meetings, but often I'm juggling getting something together for dinner and getting the session started (though last night my wonderful wife did take care of dinner for me).

On the flip side, being virtual has allowed me to invite speakers who might not otherwise be willing or able to travel in person to Albany NY and for attendees from across the country to show up. It has also given me the opportunity to experiment a bit more with formats.

Last year, instead of our traditional in-person holiday party format, we did a version of “Bluff the Listener” where I asked various presenters to tell their worst IT/SQL horror stories, but one was lying. It was a success and a lot of fun.

Not wanting to repeat that, this year I decided to ask the above four presenters to give lightning talks. That's not so bad, except I added a twist: they didn't get to choose their topics, they were given them, 10 minutes before they were scheduled to present. (And yes, some may say I stole this idea from Buck Woody; I'd like to say I was inspired.)

I’ll admit I was very nervous about this idea. It seemed a bit gimmicky and it could have been a complete disaster with lesser speakers. Fortunately, all four brought their A-Game.

Rob Farley, presenting from the future, in I believe a public workspace, managed to give one of the best talks on columnstore indexes I've seen. Given he had only 10 minutes of prep, I was impressed. His presentation included the use of PowerPoint in sort of a "green screen" mode so he could draw on his screen and we could see what he was drawing.

Peter Shore followed up talking about Tips in Advancing a Career in Data. Again, off-the-cuff with limited prep time, he did very well with this topic. I think in some ways this was almost harder than the more technical topics because you can’t fall back on a demo or graphics.

Deborah Melkin followed, talking about the best new SQL Server features (2019, 2022, Azure). I had announced previously that the best speaker would be awarded a prize. By what I think was unanimous declaration, even before Rob Sewell finished out the night with his presentation, the other speakers decided Deborah was the winner. She included some demos in her presentation, which, given the lead time, really impressed folks.

Closing out the evening, Rob Sewell entertained us with a demo of SQL Injection. Not surprisingly, he made use of PowerShell and Notebooks.

As I said, it was an entertaining and educational evening. I purposely set expectations low and made sure folks understood that the entertainment value was as much, if not more, important than the actual educational value. But I was very pleased with how educational it turned out to be. It was a nice way to end the year and, honestly, a decent way to get a break from the bad news that seems to have surrounded us lately.

I do have a theory, though, about why the educational part turned out as well as it did. In general I've always enjoyed lightning talks, and I honestly think they're among the hardest types of talk to give. Sometimes people promote them as a good introduction to speaking for novice speakers, but I'm not so sure. To give a successful lightning talk, one really has to strip a presentation to the bare essentials and focus on just one or two key concepts. This can be difficult. But done well, I think it really makes those concepts stick.

Now, combine that with topics only being given out 10 minutes in advance, and I think that forces a presenter to focus on key concepts even more. I wouldn't give an inexperienced presenter a random topic, and even with an experienced presenter, I'd give them a chance to decline a topic if they feel it's completely outside their wheelhouse. But otherwise, give them a chance to see what they can do. It might surprise you. Heck, it might surprise them.

So, to go back and answer a question from above: Best Presenter… at least last night, Deborah Melkin, who if nothing else proved her Google-fu was impressive.

And I think if I can find volunteers, I will definitely try to do an in-person version of this at a future SQL Saturday or Data Saturday or other conference.

Thanks to all who participated and joined us. It was a blast. But honestly, next year, I hope to see you all in person at our holiday party!

Summit Recap

As many of my readers know, last week was the 2021 PASS Data Community Summit hosted by Redgate. In the past I would have travelled to Seattle to attend in person. Last year, due to Covid, the Summit became a virtual event. I was a bit disappointed since I had finally been selected to present and was looking forward to doing so in person. I ended up presenting virtually. That alone would have been disappointing enough, but there were other issues, and basically the organization that supported PASS went belly-up and declared bankruptcy.

More Thoughts on Last Year’s Summit

My recap last year was positive, but I have to be honest now: I was certainly trying to put a good spin on things. The truth is, I was more frustrated than I originally let on. I'm not only still grossly disappointed by the poor closed-captioning, I'm still a bit offended. I'm as guilty as anyone of probably not being inclusive enough when it comes to things like color-blindness, difficulty hearing, etc., but knowing that the organization had weeks to get good closed-captioning done, and didn't, still offends me.

I was also extremely insulted when, weeks before the Summit, User Group leaders were asked to pony up money to attend. A benefit of doing the work of leading a User Group has traditionally been a free ticket to Summit. To have that change weeks or just a few months before Summit, especially one that was going virtual, definitely felt like a bait and switch.

This, and some behind-the-scenes factors regarding PASS, had left a bitter taste in my mouth. Then of course we finally got word that PASS as an organization was no more.

There were wails of anguish, and rending of garments, and wearing of sackcloth. (ok, I may be overdramatizing a bit). But I was hopeful. As I stated then, PASS really is the community. It’s the people. And they’re some of the best people I’ve known professionally: #sqlfamily.

This Year’s Summit

So, on to Summit this year. Within months of the demise of the former structure, Redgate announced it was buying the intellectual property associated with PASS. Microsoft in the meantime was rolling out tools to help the user groups. Steve Jones was working on SQL Saturday. Things were looking good. I was hopeful.

Redgate announced the PASS Data Community Summit would happen, albeit virtually. I was excited and looking forward to it. That said, I'll admit I did not put in to speak this year. I just didn't have the motivation. That was on me, not on Redgate's efforts. Redgate also announced that while the precons would cost money, the Summit itself would be free. Last year there was a lot of discussion about charging for a virtual summit, and while I defended the concept a bit, because there are still enormous costs associated with one, I also wasn't a fan of it, because I knew it would be a hard sell to managers. I think I was proven right there. The very preliminary numbers I heard for attendance this year appear to have FAR outpaced the numbers for last year. I think that's a good sign.

So, enough rambling, what about Summit this year?

First, two criticisms

A common complaint, and one I was guilty of myself, was that it was unclear that for many sessions one was supposed to watch the recorded session first and then attend what was essentially a live Q&A. I know for the first presentation I attended on Wednesday, even the presenter, who was in the Q&A, was confused by this. I'll admit, I never did figure out how to watch the recorded sessions beforehand.

The second was that I didn't discover the Spatial.Chat system until the second day, and that was only because, as a Friend of Redgate, I had received a specific email inviting me to a private chat.

Now, partly, I will put the blame for the above two criticisms squarely on myself for probably not reading the emails in enough detail. But, it does seem others had the same issue and perhaps more succinct, clear links or emails might have helped. I honestly don’t know.

The Positives

That said, despite the above criticisms, I really enjoyed Summit, in huge part because once I learned about the Spatial.Chat system and how to use it, I felt, for the first time, like I was in a virtual space that closely mimicked real life. The idea of being able to move closer to people to hear them better, or to move away if I wanted to focus on some work but still be "part of the crowd", worked REALLY well. Trying to translate a physical presence into a virtual one is often tough, but I think the Spatial.Chat stuff worked really well. I found myself hanging out there more than anything else.

Since I never did figure out how to watch the recorded stuff before the live Q&A, I focused on the actual live sessions and they did NOT disappoint. As usual, while folks talk about how great the social atmosphere is at Summit, the truth is we tell our bosses we go for the technical content, and it was top-notch as expected. Once again, great content!

I also really enjoyed the keynotes this year. I'll admit, because of jetlag and because I'm often up late talking with the college friends whose house I crash at while in Seattle, I am often late to the keynotes (if I make it at all) and sometimes doze off in the large, warm, dark room. This year, none of that happened. I was entertained and really enjoyed them. I especially enjoyed Brent Ozar's keynote on Friday about the impact of the cloud on one's career.

I think this year's Summit overall felt more positive for a number of reasons. For one, I think many of us are finally hoping to see a light at the end of the Covid tunnel. For another, I think most of us are far more hopeful about the overall #SQLFamily community and future PASS Summits than we were a year ago. Finally, I think we all just felt the need to socialize again, albeit virtually.

Redgate has already announced that next year's Summit will be in Seattle again, but will be a hybrid event. I will be very curious to see how that works, but I can tell you right now I'm already budgeting to attend in person.

  • End note: I am a Friend of Redgate and write for their Simple-Talk blog. That said, in this case Redgate isn’t paying for my thoughts and my thoughts are my own.

Inbound and Outbound to NYC

I still recall the first computer program I wrote. Or rather co-wrote. It was a rather simple program, in Fortran I believe, though that's really an educated guess. I don't have a copy of it. It was either in 7th or 8th grade when several of us were given an opportunity to go down to the local high school and learn a bit about the computer that they had there. I honestly have NO idea what kind of computer it was, perhaps a PDP-9 or PDP-11. We were asked for ideas on what to program, and the instructor quickly ruled out our suggestion of printing all numbers from 1 to 1 million. He made us estimate how much paper that would take.

So instead we wrote a program to convert temperatures from Fahrenheit to Celsius. The program was, as I recall, a few feet long. "A few feet long? What are you talking about, Greg?" No, this was not the printout. This wasn't how much it scrolled on the screen. It was the length of the yellow (as I recall) paper tape that contained it. The paper tape had holes punched into it that could be read by a reader. You'd write your program on one machine, then take it over to the computer, feed it into the reader, and it would run. I honestly don't recall how we entered the values to be converted, whether they were already on the tape or came through some other interface. In any case, I loved it and fell in love with computers then. Unfortunately, somewhere over the years, that paper tape has disappeared. That saddens me. It's a memento I wish I still had.

In four or five short years, the world was changing, and quickly. The IBM PC had been released while I was in high school, and I went from playing a text adventure game called CIA on a TRS-80 Model II to programming in UCSD Pascal on an original IBM PC. (I should note that this was my first encounter with the concept of a virtual machine and a p-code machine.) This was great, but I still wanted more. Somewhere along the line I encountered a copy of Microsoft's Flight Simulator. I loved it. In January of 1985 my dad took me on a vacation to St. Croix, USVI. Our first stop on that trip was a night in NYC before we caught our flight the next morning. To kill some time I stepped into 47th Street Photo and bought myself a copy of Flight Simulator. It was the first software I ever bought with my own money. (My best friend Peter Goodrich and I had previously acquired a legal copy of DOS 2.0, but "shared" it. Ok, not entirely legal, but hey, we were young.)

I still have the receipt.

For a High School Student in the 80s, this wasn’t cheap. But it was worth it!

I was reminded of this the other day when talking with some old buddies I had met back when the Usenet newsgroup sci.space.policy was still the place to go for the latest and greatest discussions on space programs. We were discussing our early introductions to computers and the like.

I haven't played this version in years and, honestly, am not entirely sure I have the hardware any more that could run it. For one thing, this version as I recall was designed around the 4.77 MHz speed of the original IBM PC. Some of my readers may recall that when the PC AT clones came out with the 80286 chip running at up to 8 MHz (and faster for some clones), there was often a switch to run the CPU at a slower speed, because many games otherwise simply ran twice as fast and users couldn't react quickly enough. So even if I could find a 5 1/4″ floppy drive and get my current machine to read the disks into a VM, I'm not sure I could clock a VM down slow enough to play this. But, I may have to try one of these days. Just for the fun of it.

I still have the original disks and documentation that came with it.

Flying outbound from NYC

A part of me does wonder if this is worth anything more than the memories. But for now, it remains in my collection, along with an original copy of MapInfo that was gifted to me by one of the founders. But that's a stroll down memory lane for another day.

And then I encountered SQL Server only a short 6 or so years later. And that ultimately has been a big part of where I am today.

SQL Server Scheduled Email Queries

Every once in a while I come across a problem that surprises me both that it exists and that the solution is often trivial. This is one of those. Basically, a developer at a client wanted to set up a scheduled task that would execute a query and email the results. Everything appeared to work, but the emails never went out. There were no obvious errors other than the emails never arriving. Some digging via Profiler showed that the SQL Agent user was having permission issues, but basically giving it every permission possible didn't solve it.

So let me walk you through it. First, let’s create a real simple stored procedure. This assumes you have AdventureWorks2014 installed (yeah, it’s old, but it’s what I had handy).

Use adventureworks2014
GO
Create or Alter Procedure Send_Test_Email
as
exec msdb.dbo.sp_send_dbmail @recipients='test@example.com', @body='This is the body of the email', @subject='Test Email w query embedded',
@query='select * from adventureworks2014.sales.SalesOrderHeader where subtotal > 150000';

Now, here's an important detail I did discover during testing: if my query didn't actually query a table, but instead was, say, @query='select getdate()', things worked fine.

That said, if you simply execute the query above in SSMS, it should work just fine. (I’d recommend you put your own email in it for testing. This way you’ll know if the email is actually being sent.)
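If you'd rather not sit refreshing your inbox, you can also check on the SQL side whether the message actually left the Database Mail queue. Here's a quick sketch (adjust the time window to taste):

-- Status of recently queued Database Mail messages (sent_status should end up 'sent')
SELECT mailitem_id, recipients, subject, sent_status, sent_date
FROM msdb.dbo.sysmail_allitems
WHERE send_request_date > DATEADD(HOUR, -1, GETDATE())
ORDER BY mailitem_id DESC;

-- Any errors logged by the Database Mail subsystem
SELECT TOP (20) log_date, description
FROM msdb.dbo.sysmail_event_log
ORDER BY log_date DESC;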

Before you do that, also create the following stored procedure:

Use adventureworks2014
GO
Create or Alter Procedure Send_Test_Email_Query_attached
as
exec msdb.dbo.sp_send_dbmail @recipients='test@example.com', @body='This is the body of the email', @subject='Test Email w query embedded',
@query='select * from adventureworks2014.sales.SalesOrderHeader where subtotal > 150000',
@attach_query_result_as_file = 1,
@query_result_no_padding = 1,
@query_attachment_filename = 'Sales Report.csv',
@query_result_separator = ';',
@exclude_query_output = 1,
@append_query_error = 0,
@query_result_header = 0

This will execute the same query as the original stored procedure, but place the contents in a CSV file as an attachment.

Again, if you execute the above directly from SSMS, you should receive the email without an issue. This is basically what the client was attempting to do.

Now create the following job:

USE [msdb]
GO

/* Object: Job [Send AdventureWorks Email] Script Date: 7/27/2021 9:12:49 AM */
BEGIN TRANSACTION
DECLARE @ReturnCode INT
SELECT @ReturnCode = 0
/* Object: JobCategory [[Uncategorized (Local)]] Script Date: 7/27/2021 9:12:49 AM */
IF NOT EXISTS (SELECT name FROM msdb.dbo.syscategories WHERE name=N'[Uncategorized (Local)]' AND category_class=1)
BEGIN
EXEC @ReturnCode = msdb.dbo.sp_add_category @class=N'JOB', @type=N'LOCAL', @name=N'[Uncategorized (Local)]'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

END

DECLARE @jobId BINARY(16)
EXEC @ReturnCode = msdb.dbo.sp_add_job @job_name=N'Send AdventureWorks Email',
@enabled=1,
@notify_level_eventlog=0,
@notify_level_email=0,
@notify_level_netsend=0,
@notify_level_page=0,
@delete_level=0,
@description=N'No description available.',
@category_name=N'[Uncategorized (Local)]',
@owner_login_name=N'sa', @job_id = @jobId OUTPUT
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
/* Object: Step [Send email] Script Date: 7/27/2021 9:12:50 AM */
EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'Send email',
@step_id=1,
@cmdexec_success_code=0,
@on_success_action=1,
@on_success_step_id=0,
@on_fail_action=2,
@on_fail_step_id=0,
@retry_attempts=0,
@retry_interval=0,
@os_run_priority=0, @subsystem=N'TSQL',
@command=N'exec Send_test_email_Query_attached',
@database_name=N'adventureworks2014',
@flags=0
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_update_job @job_id = @jobId, @start_step_id = 1
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobserver @job_id = @jobId, @server_name = N'(local)'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
COMMIT TRANSACTION
GOTO EndSave
QuitWithRollback:
IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION
EndSave:
GO

If you execute this job as is, you should see the following:

Success! Or is it?

But you will never get the email!

Modify the above job and replace the line:

@command=N'exec Send_test_email_Query_attached',

with

@command=N'exec Send_test_email',

Now if you run the job, it will fail!

This time it clearly failed!

But as is usual with sp_send_dbmail, the error message isn’t overly helpful:

Executed as user: NT AUTHORITY\SYSTEM. Failed to initialize sqlcmd library with error number -2147467259. [SQLSTATE 42000] (Error 22050). The step failed.
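(As an aside, if the job has already run and you need to dig that full step message back out later, the job history in msdb has it. Something along these lines, using the demo job's name, pulls the most recent step messages, and the message column shows the same text as above.)

-- Most recent history rows for the demo job, newest first
SELECT TOP (5) h.run_date, h.run_time, h.step_name, h.run_status, h.message
FROM msdb.dbo.sysjobhistory h
JOIN msdb.dbo.sysjobs j ON j.job_id = h.job_id
WHERE j.name = N'Send AdventureWorks Email'
ORDER BY h.instance_id DESC;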

But it does help give a clue. The problem seems to be some sort of permissions issue. So I’ll admit, at this point I tried all sorts of solutions, including setting up a proxy user, giving unfettered rights to my SQL Agent user and other things (thinking once I got it working I could then lock things back down). Instead, I found a much easier solution buried in a thread on the Microsoft site.

You'll note when I wrote the original stored procedure I fully qualified the table name. This is generally useful, but here I did it because I was cheating and knew I'd need it for this demo.

The solution is actually VERY simple and I’ll show it both graphically and via a script.

First: change the database to MSDB and then fully qualify your call to the stored procedure as below:

Execute it from the MSDB database

However, there's one more critical step: under the Advanced tab in the step, change the Run As User to dbo:

Seems simple, but it’s critical
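If you'd rather make those same two changes with T-SQL instead of the GUI, a sketch like this against the demo job's existing step should do the trick:

-- Point the existing step at msdb, run it as dbo, and fully qualify the procedure call
EXEC msdb.dbo.sp_update_jobstep
    @job_name = N'Send AdventureWorks Email',
    @step_id = 1,
    @database_name = N'msdb',
    @database_user_name = N'dbo',
    @command = N'exec adventureworks2014.dbo.Send_Test_Email_Query_attached';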

Now if you script out the scheduled task it should look like:

USE [msdb]
GO

/* Object: Job [Send AdventureWorks Email] Script Date: 7/27/2021 9:26:37 AM */
EXEC msdb.dbo.sp_delete_job @job_id=N'08a55e18-eecf-4d4c-8197-8135a0d7520b', @delete_unused_schedule=1
GO

/* Object: Job [Send AdventureWorks Email] Script Date: 7/27/2021 9:26:37 AM */
BEGIN TRANSACTION
DECLARE @ReturnCode INT
SELECT @ReturnCode = 0
/* Object: JobCategory [[Uncategorized (Local)]] Script Date: 7/27/2021 9:26:37 AM */
IF NOT EXISTS (SELECT name FROM msdb.dbo.syscategories WHERE name=N'[Uncategorized (Local)]' AND category_class=1)
BEGIN
EXEC @ReturnCode = msdb.dbo.sp_add_category @class=N'JOB', @type=N'LOCAL', @name=N'[Uncategorized (Local)]'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

END

DECLARE @jobId BINARY(16)
EXEC @ReturnCode = msdb.dbo.sp_add_job @job_name=N'Send AdventureWorks Email',
@enabled=1,
@notify_level_eventlog=0,
@notify_level_email=0,
@notify_level_netsend=0,
@notify_level_page=0,
@delete_level=0,
@description=N'No description available.',
@category_name=N'[Uncategorized (Local)]',
@owner_login_name=N'sa', @job_id = @jobId OUTPUT
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
/* Object: Step [Send email] Script Date: 7/27/2021 9:26:37 AM */
EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'Send email',
@step_id=1,
@cmdexec_success_code=0,
@on_success_action=1,
@on_success_step_id=0,
@on_fail_action=2,
@on_fail_step_id=0,
@retry_attempts=0,
@retry_interval=0,
@os_run_priority=0, @subsystem=N'TSQL',
@command=N'exec adventureworks2014.dbo.Send_Test_Email_Query_attached',
@database_name=N'msdb',
@database_user_name=N'dbo',
@flags=0
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_update_job @job_id = @jobId, @start_step_id = 1
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobserver @job_id = @jobId, @server_name = N'(local)'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
COMMIT TRANSACTION
GOTO EndSave
QuitWithRollback:
IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION
EndSave:
GO

If you then execute THIS scheduled task you’ll get both a success message:

And you should actually receive the email with the query results attached.

So, basically, an issue that surprised me by even existing ended up having a fairly simple solution.

… Other Duties as Assigned

I’ve mentioned once before that at one of my clients I describe my job as “DBA and other duties as assigned.”

This phrase has really been on my mind this week, especially during a phone call with another client yesterday. This second client is a local consulting company that has hired me a few times to back them with my skills in SQL Server and MS Access. This time around the work they’re looking for is definitely SQL Server related. It was refreshing.

But it reminded me of my last two weeks with two of my other clients. One is having an issue with their app (which they always call "the database") that is most likely a design issue I need to dig into. This is a perfect example of what I call "software archeology," where I at times have to sift through "pot shards" to determine what the original developer was thinking. At times it can be fun and interesting; at other times, frustrating. I'll be sifting through more pot shards in the near future to get to the bottom of this problem.

For my largest client, I spent most of my hours with them last week trying to true up a file with some financial data in it. In this case it’s part of an ETL process where I receive data, compile it and send it to a vendor. The process uses a combination of PowerShell and Pentaho. So while they interact with the database, the work I was doing wasn’t in T-SQL or directly on the database server.

The numbers weren't adding up. There was an undercurrent of "Greg, your numbers are wrong" or "You're filtering on the wrong criteria." I kept pointing out that "I simply add up the numbers you give me." Eventually the problem was narrowed down to the fact that in the source system, which is the system of record, they had deleted rows. Arguably, one should never be deleting rows in such a system, but rather issuing a second row (a credit if you want to reverse a debit, or a debit to reverse a credit), and that was typically what was done. But in this case the maintainers of the system of record decided to wholesale delete these rows. I explained that from day one, since deletions are never supposed to happen (and, given the way the system works, are extremely hard to detect), all I do is either insert new rows or update existing rows. In any event, with one minor schema change, some updates to the rows in question, and an updated PowerShell script, I was able to make the numbers come out to match theirs. So, is that really DBA work? Not in the traditional sense. But it's definitely other duties as assigned.

Now that's not to say I didn't do what some might consider actual DBA work. On Saturday morning I patched one of their servers. And at one point during the week, I deployed a script to production. So, out of 18 hours of work for the customer last week, I think I can say maybe 1-2 total were "DBA work", or roughly 5-10%.

Now, I want to be clear. This is not a rant or a complaint. I’ll admit I tend to prefer to work directly with SQL Server, but I was reminded of a quick discussion I had with a fellow DBA over the weekend about how they probably needed to start to learn PowerShell for their job.

I've been arguing for years that the role of a DBA has changed, and will continue to change dramatically over the next few years. Where once we might have spent days heads-down slinging T-SQL code, setting up backups and restores, tuning indices, etc., now much of that is automated or at least far easier to do. Which is a good thing. In years past, a DBA might be responsible for a dozen machines or so at the most. If it was more than that, we'd feel sorry for them. That's no longer uniformly true. I know a DBA who is responsible for over 100 machines. They're the sole DBA. But, through PowerShell and other modern tools, it's generally not an overwhelming job.

However, as the online presentation on SQL Database Edge from the Atlanta Azure Data User Group that I attended last night reminded me, there is a growing list of things DBAs need to learn. Steve Jones recently posted about whether DBAs need to learn Linux. The short takeaway is: not necessarily, but it's probably a good idea, and we definitely need to learn about containers.

I have heard for years, “Microsoft will automate everything and the DBA’s job will go away.” Not only is that not true in my experience, the exact opposite is. I think being a successful DBA is in some ways harder than it was a decade ago. There’s so much more to be aware of and to learn.

Off the top of my head, without any real priority, I came up with the list below of technologies that a modern DBA might find useful to know. This is not to say I know them all, or that one has to be an expert in all of them. And I will note, this is far from an exhaustive list. I also left out third-party tools, which are so commonplace. But I think it illustrates just how broad the required skillset of a good DBA is these days.

  • T-SQL
  • PowerShell
  • Query Store
  • Linux – at least at the most basic level
  • Containers
  • SSIS
  • SSAS
  • SSRS
  • Storage – (at least how different types can impact performance and the advantages and disadvantages of each)
  • Azure
  • SQL Database Edge
  • git or some form of version control

In conclusion, I'll say I'm not going to make any predictions about where the Microsoft data platform will be a decade from now, but I can tell you that DBAs will still be needed, though their skillset will be as different from today's as today's is from a decade ago.

And post conclusion, I’ll add I’ll continue to rely on #sqlfamily and all my fellow DBAs to help me out. And continue to help them.

Your Boss Doesn’t Care About Backups!

It’s true. Even if they don’t realize it. Or even if they claim they do. They really don’t.

I’ve made this point before. Of course this is hyperbole. But a recent post by Taryn Pratt reminded me of this. I would highly recommend you go read Taryn’s post. Seriously. Do it. It’s great. It’s better than my post. It actually has code and examples and the like. That makes it good.

That said, why the title here? Because again, I want to emphasize that what your boss really cares about is business continuity. At the end of the day they want to know, "If our server crashes, can we recover?" And the answer had better be "Yes." This means that you need to be able to restore those backups, or have another form of recovery.

Log-Shipping

It seems to me that over the years log-shipping has sort of fallen out of favor. “Oh we have SAN snapshots.” “We have Availability Groups!” “We have X.” “No one uses log-shipping any more, it’s old school.”

In fact this recently came up in a DR discussion I had with a client and their IT group. They use SAN replication software to replicate data from one data center to another. "Oh, you don't need to worry about shipping logs or anything, this is better."

So I asked questions: was it block-level, file-level, byte-level, or what? How much latency was there? How could we be sure that data was hardened on the receiving side? I actually never got really clear answers to any of that, other than, "It's never failed in testing."

So I asked the follow-up question, "How was it tested?" I'm sure their answer was supposed to reassure me. "Well, during a test, we'd stop writing to the primary, shut it down, and then redirect the clients to the secondary." And yes, that's a good test, but it's far from a complete test. Here's the thing: many disasters don't allow the luxury of cleanly stopping writes to the primary. They can occur for many reasons, but in many cases the failure is basically instantaneous. This means data was in flight. Where in flight? Was it hardened to the log? Was that data in flight to the secondary? Inquiring minds want to know.

Now this is not to say these many methods of disk-based replication (as opposed to SQL-based, which is a different beast) aren't effective or don't have their place. It's simply to say they're not perfect, and one has to understand their limitations.

So back to log-shipping. I LOVE log-shipping. Let me start with a huge caveat: in an unplanned outage, your secondary will only be as up to date as the most recent log backup you've restored. This could be an issue. But the upside is, you should have a very good idea of what's in the database, and your chances of a corrupted block of data or the like are very low.

But there are two facts I love about it.

  1. Every time I restore a log file, I've tested the backup of that log file. This may seem obvious, but it does give me a constant check on my backups. If my backups fail for any reason (lack of space, a bad block gets written and not noticed, etc.), I'll know as soon as my next restore fails. Granted, my FULL backups aren't being restored all the time, but I've got at least some more evidence that my backup scheme in general is working. (And honestly, if I really needed to, I could back up my copy and use that in a DR situation.)
  2. It can make me look like a miracle worker. I have, in the past, in a shop where developers had direct access to prod and had been known to mess up data, used log-shipping to save the day. Either on my DR box, or on a separate box I'd keep around that was too slow CPU-wise for DR but had plenty of disk space, I'd set it to delay applying logs for 3-4 hours. In the event of most DR events, it was fairly simple to catch up on log-shipping and bring the DR box online. But more often than not, I used it (or my CPU-weak but disk-heavy box) in a different way. I'd get a report from a developer: "Greg, umm, I, well, not sure how to say this, but I just updated the automobile table so that everyone has a White Ford Taurus." I'd simply reply, "Give me an hour or so, I'll see what I can do." Now the reality is, it never took me an hour. I'd simply look at the log-shipped copy I had, apply any logs I needed to catch up to just before their error, then script out the data and fix the data in production. They were always assuming I was restoring the entire backup or something like that. This wasn't the case, in part because doing so would have taken far more than an hour, and would have caused a complete production outage.

There was another advantage to my second use of log-shipping: I got practice manually applying logs, WITH NORECOVERY or STANDBY and the like. I'm a firm believer in "train as you fight."
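For the curious, the manual catch-up looked roughly like this. The database and file names below are made up; the key pieces are NORECOVERY for the intermediate logs, then STANDBY plus STOPAT on the last one so the copy stops just short of the bad statement and stays readable:

-- Apply intermediate log backups, leaving the database in a restoring state
RESTORE LOG WidgetDB
FROM DISK = N'D:\LogShip\WidgetDB_1300.trn'
WITH NORECOVERY;

-- Last log: stop just before the bad update and leave the copy read-only
RESTORE LOG WidgetDB
FROM DISK = N'D:\LogShip\WidgetDB_1400.trn'
WITH STANDBY = N'D:\LogShip\WidgetDB_undo.dat',
     STOPAT = '2021-11-30 13:55:00';

From there it's just a matter of scripting out the still-good rows and applying them to production.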

Yes, in an ideal world, a developer will never have such unrestricted access to production (and honestly it's gotten better; I rarely see that these days) and you should never need to deal with an actual DR, but we don't live in an ideal world.

So, at the end of the day, I don't care if you do log-shipping, Taryn Pratt's automated restores, or what, but do restores, both automated and manual. Automated because it'll test your backups. Manual because it'll hone your skills for when your primary is down and your CEO is breathing down your neck as you huddle over the keyboard trying to bring things back.

Reminder

As a consultant, I'm always looking for new clients. My primary focus is helping to outsource your on-prem DBA needs. If you need help, let me know!

“Houston, we’re venting something into Space…”

This post is the result of several different thoughts running through my head combined with a couple of items I've seen on social media in the past few days. The first was a question posted to #SQLHelp on Twitter asking what one would do if, as a DBA, one came into a situation with a SQL Server in an unknown configuration. The second was a comment a friend made about how "it can't get any worse," and several of us cheekily corrected him, saying it can always get worse. And of course I'm still dealing with my server that died last week.

To the question of what to do with an unknown SQL Server, there were some good answers, but I chimed in saying my absolute first thing would be to make backups. Several folks had made good suggestions about looking at system settings and possibly changing them, possibly re-indexing, etc. My point though was, all that could wait. If the server had been running up until now, fixing those might be very helpful, but not fixing them would not make things worse. On the other hand, if there were no up-to-date backups and the server failed, the owner would be in a world of hurt. Now, for full disclosure, I was "one-upped" when someone pointed out that, assuming they did have backups, what one really wanted to do was a restore. I had to agree. The truth is, no one needs backups; what they really need are restores. But the ultimate point is really the same: without a tested backup, your server can only get much worse if something goes wrong.
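To put that in concrete terms, the very first thing I'd run on that unknown server is something like the sketch below (the database name and path are obviously made up). COPY_ONLY keeps it from disturbing any existing backup chain someone else may be relying on, and CHECKSUM gives at least a basic sanity check on what gets written:

-- One-off, chain-safe backup before touching anything else
BACKUP DATABASE [ImportantDB]
TO DISK = N'E:\Backups\ImportantDB_BeforeChanges.bak'
WITH COPY_ONLY, CHECKSUM, COMPRESSION, STATS = 10;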

I’ve had to apply this thinking to my own dead server. Right now it’s running in a Frankenbeast mode on an old desktop with 2GB of RAM. Suffice to say, this is far from ideal. New hardware is on order, but in the meantime, most things work well enough.

I actually have a newer desktop in the house I could in theory move my server to. It would be a vast improvement over the current Frankenbeast: 8GB of RAM and a far faster CPU. But, I can't. It doesn't see the hard drive. Or more accurately, it won't see an OS on it. After researching, I believe the reason comes down to a technical detail about how the hard drive is set up (namely, the boot partition uses what's known as MBR and it needs to be GPT). I'll come back to this in a minute.

In the meantime, let's take a little detour to mid-April 1970. NASA has launched two successful Lunar landing missions and the third, Apollo 13, is on its way to the Moon. The crew had survived a launch anomaly that came within a hair's breadth of aborting their mission before they even made orbit. Hopes were high. Granted, Ken Mattingly was back in Houston, a bit disappointed he had been bumped from the flight due to his exposure to rubella. (The vaccine had just been released in 1969 and as such, he had never been vaccinated, and had not had it as a child. Vaccines work, folks. Get vaccinated lest you lose your chance to fly to the Moon!)

Stack of Swiss cheese slices showing holes lined up.

A routine mission operation was to stir the oxygen tanks during the flight. Unfortunately, due to a Swiss-cheese stack-up of issues, this nearly proved disastrous: it caused a spark, which caused an "explosion," which blew out the tank, ruptured a panel on the Service Module, and did further damage. Very quickly the crew found themselves in a craft rapidly losing oxygen and, more importantly, losing electrical power. Contrary to what some might think, the loss of oxygen wasn't an immediate concern in terms of breathing or astronaut health. But without oxygen to run through the fuel cells, there was no electricity. Without electricity, they would soon lose their radio communication to Earth, the onboard computer used for navigation and control of the spacecraft, and their ability to fire the engines. Things were quickly getting worse.

I won't continue to go into details, but through a lot of quick thinking as well as a lot of prior planning, the astronauts made it home safely. The movie Apollo 13, while a somewhat fictionalized account of the mission (for example, James Lovell said the argument among the crew never happened, and Ken Mattingly wasn't at KSC for the launch), is actually fairly accurate.

As you may be aware, part of the solution was to use the engine on the Lunar Module to change the trajectory of the combined spacecraft. This was a huge key in saving the mission.

But this leads to two questions that I’ve seen multiple times. The first is why they didn’t try to use the Service Module (SM) engine, since it was far more powerful and had far more fuel and they in theory could have turned around without having to loop around the Moon. This would have saved some days off the mission and gotten the astronauts home sooner.

NASA quickly rejected this idea for a variety of reasons. One was fairly direct: there didn't appear to be enough electrical power left in the CSM (Command/Service Module) stack to do so. The other, though, was somewhat indirect. They had no knowledge of the state of the SM engine. There was a fear that any attempt to use it would result in an explosion, destroying the SM and very likely the CM, or at the very least damaging the heatshield on the CM, and a bad heatshield would mean a dead crew. So NASA decided to loop around the Moon using the LM descent engine, a longer but far less risky maneuver.

Another question that has come up was why they didn't eject the now dead, deadweight SM. This would have meant less mass, and arguably been easier for the LM to handle. Again, the answer is because of the heatshield. NASA had no data on how the heatshield on the CM would hold up after being exposed to the cold of space for days, and feared it could develop cracks. It had been designed to be protected by the SM on the flight to and from the Moon. So, it stayed.

The overriding argument here was “don’t risk making things worse.” Personally, my guess is given the way things were, firing the main engine on the SM probably would have worked. And exposing the heatshield to space probably would have been fine (since it was so overspecced to begin with). BUT, why take the risk when they had known safer options? Convenience is generally a poor argument against potentially catastrophic outcomes.

So, in theory, these days it's trivial to upgrade an MBR disk to a GPT one. But if something goes wrong, or that's not really the root cause of my issues, I end up going from a crippled but working server to a dead server I have to rebuild from scratch. Fortunately, I have options (including now a new disk, so I can essentially mirror the existing one, have an exact copy, and try the MBR-to-GPT conversion on that one), but they may take another day or two to implement.

And in the same vein, if it's a known SQL Server, or an unknown one, you're working on, PLEASE make backups before you make changes, especially anything dramatic that risks data loss. (And I'll add a side note: if you can, avoid restarting SQL Server when diagnosing issues; you lose a LOT of valuable information in the DMVs.)
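As one small example of what a restart throws away: wait stats in sys.dm_os_wait_stats accumulate only since the last service start, so a query like the one below tells you next to nothing right after a reboot.

-- Top waits since the last SQL Server restart; restarting the service zeroes these counters
SELECT TOP (10) wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC;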

So things CAN get worse. But that doesn’t mean there’s any need to take steps that will. Be cautious. Have a backout plan.

A Speaker’s Timeline

This post will be short, for reasons that are hopefully obvious by the end.

Sometime in February

Hmm, I should put together some ideas to submit to present to SQL Summit in Houston (not Dallas as Mistress SQL pointed out to me) this year.

March 16th

An update, the call for speakers has been postponed. Darn.

March 23rd

Call for speakers is finally open!

March 30th

Submit 3 possible topics.

April 1st

Approach a fellow speaker about a possible joint session, but after discussion, decide not to go ahead with the idea.

June 3rd

Get an update, Summit will be virtual this year. Thankfully I didn’t book any tickets or hotel rooms in Dallas.

July 20th 6:49 PM EDT

Woohoo! I got the email! One of my submissions got selected to present!

July 20th 6:50 PM EDT

Crap, now I actually have to write the entire thing!

July 20th 6:51 PM EDT

Wait, and it’s going to be virtual too. That’s going to make it a bit more of a challenge to present. But I’m up to it!

Sometime in August

I really should get started. Hmm, here’s one of the scripts I want to present.

But honestly, I’m preparing to teach a bunch of cavers and medical students cave rescue, I need to concentrate on that first.

September 5th

I just biked over 100 miles. I’m certainly not working on my presentation THIS weekend.

Later in September

Ok, now I’m going to sit down and really work through this. Here’s a basic outline.

October 1st

Oh wait, it’s going to be virtual AND I have to prerecord it? How is that supposed to work? I had better read up at the speaker portal!

October 2nd

Huh, ok, that sorta makes sense, upload the slides, do a recording, but I still don’t get how it’ll work with a presentation like mine with lots of demos. Well I’ll figure it out.

October 6th around 11 PM EDT

Well the PowerPoint template deck they provided looks pretty slick. I should start prepping my slides.

October 6th, approximately 5 minutes later

There, got the first slide done. Of course it’s only my name and pronouns, etc. But it’s a start.

Oh and the 2nd slide is done, but that’s simply the default PASS slide talking about chapters, SQL Saturday etc, so technically I didn’t do anything there.

I’ll start working on the closing slides.

October 7th, sometime after midnight

Ok, about 5 slides done. I'll lie to myself and say I've made great progress!

October 9th, approximately 10:00 PM EDT

Ok, I’ll at least start writing out the scripts I need.

October 9th, 20 minutes later

What the bloody hell? Why is this script failing? I've got to present this. If I can't get this script working, how is anyone going to believe that I know PowerShell, let alone actually use it?

October 9th, 5 minutes later

Well, damn, that was an embarrassing mistake; I just had the comma in the wrong place.

October 10th around 9:00 PM EDT

Hmm, to properly demo this, I really need to run against 3-4 SQL Servers, and I really don't want to spin up a bunch of VMs, and I can't use my development one; too much proprietary data there.

I know, NOW is a perfect time to start to learn to use Docker! Why not? And besides, Cathrine Wilhelmsen has a great post on it. I'll simply follow that.

2 hours and 1 reboot later

Hey, would you look at that? I’ve actually got a docker container running SQL. This is awesome!

Another minute later

But why can’t I actually connect? What network is it on? Why did I decide docker was easier? Why did I even submit this proposal? What the heck am I doing here? What is the meaning of life?

5 more minutes

That’s it, I’m going to bed.

October 11th, late night

Oh, I get it now. I didn't set up a fully separate network; it's bridged, and that's why it's showing 0.0.0.0. I just need to change the port and I'm good to go!

A minute later

This is pretty awesome. Not what I'd do for a production setup, but it definitely works for my demos. Now if I were really smart, I'd also set up persistent storage and the like, but this is good enough. And honestly now, set up a loop, increment a variable, and bam, I've got 4 instances of SQL running in Docker: 2 are 2017 and 2 are 2019. This is really incredible. I'm proud of myself.

Oh and even better, I’m doing all this in a PowerShell script, so I can actually make it PART of my presentation!

October 12th 2:26 PM EDT

Send off an email to the Program folks at PASS asking about how the recording stuff works with demos. Eagerly awaiting a reply.

October 15th, another late night

Yes, there’s a theme here, much of my work is being done late at night. It seems to work for me. But dang that deadline is getting closer!

October 16th, late night, again

Watched some Schitt's Creek with the family. "Why didn't we start watching this sooner? It's hilarious! But I need to work on my presentation some more."

Get all the PowerShell scripts basically done. I’m happy with it, need to work on my speaking script some.

October 19th 3:00 PM EDT

Get off the phone with a fellow Cave Rescue expert. Just before I get off, I mention my upcoming virtual, prerecorded session I have to finish. He says, “Oh, you know I just did 2-3 of those for a rescue conference, exact same format. It worked out really well. I can send you some details and feedback.”

I find that reassuring.

Also recheck email, still no answer from the folks at PASS on my questions about demos, etc.

October 19th, guess what time

I’ve finished everything, even updated the slides and scripts a bit more. I’m a bit worried I’m going to run too long, but decide to do my first of several practice run throughs.

Do my first full run through. Stop and correct a few mistakes or rough edges here and there. I’m not too worried if I run over now since I know I’ve artificially added some time.

October 19th, 42 minutes later

I get done, look at the PowerPoint timer: 42 minutes. “CRAP! I need this to be 60 minutes!” I’m not too worried, I can add more, but I’m not sure where and I don’t want to simply add fluff for the sake of fluff. I need to give this some thought.

Later on October 19th

Talking to a friend of mine who among other things has a background in adult education. She doesn’t know SQL or PowerShell, but she’s a good sounding board and she’s going to sit through my next run-through, not so much for the technical details but to give feedback on the flow and perhaps suggestions on where I may be making too many assumptions on what my listeners will know.

October 20th Early Morning

It’s a Tuesday, time to blog. As always I face that question, what should I blog about?

“I know, I’ll blog about how I’m getting my presentation together and the deadline is fast approaching. I can’t be the only speaker that often finds themselves up against the deadline and panicking.”

Next 36 hours

Add a bit more content and run through it 2-3 more times and then… RECORD! (Technically it looks like I have until the 26th to upload my recording, but I want to get done early.)

Conclusion

The above may or may not be a wholly accurate timeline or description of the process I've gone through trying to get my presentation ready for PASS Virtual Summit. I may have elided a few details and over-hyped a few others, but in general it's close to true and accurate. Despite my always best intentions, I find myself often working up close to the deadline for submissions. Since for Summit they want NEW presentations, I can't simply dust off one of my previous presentations and use that, so there's definitely more work involved here.

And honestly up until I learned it was going to be prerecorded, I thought I’d have most of October to work on it. The deadline to get the slides and recordings submitted sort of threw my original timeline for working on it in the dumpster so I’m actually a bit further behind than I expected to be.

On the other hand, I really did learn to use Docker, and I think that's valuable, and I am making it part of my presentation. And, when all is said and done, I think I'll be happy with it. I think, though, like any good speaker, I'll look back and think, "Well, next time, I'll have to improve this or that." There's always room for improvement. I'm not keen on giving it prerecorded. I value the instantaneous feedback I get from the audience. So that will be different. But I at least can elicit questions during the presentation, and there's a live Q&A afterwards. But I'll still be nervous.

I'm in awe of speakers who get their presentations all prepped and prepared months in advance, but I suspect there's a number out there like me who don't operate that way. And I suspect there's a few who are even more nervous than I am, thinking, "OMG, am I the only one in this spot?" Nope, you're not. Or rather, "Please let me know I'm not the only one!"

See you all at Summit, at least virtually!

And in the meantime there’s another possible deadline coming up I need to think about…

#SQLFamily

I’ve mentioned this in the past and thought I’d write something quick about it today. The quick is because I’m lacking time, not because the topic isn’t important or worthy of exposition.

Anyone who has spent much time at any PASS events such as SQL Saturday or Pass Summit has an inkling of what #SQLFamily is.

At its base, it’s a group of professionals who all have SQL Server in common. That might be a start, but it’s hardly a good definition. It’s also:

  • Professional contacts
  • LOTS of people willing to give SQL help when you need to solve a problem
  • Folks that will fact check your blog or post
  • It’s the folks willing to step up for a User Group Meeting talk

And that might be enough, but that's not all. It's also:

  • Someone who loves bicycling as much or more than me.
  • At least one amateur radio operator (and quite the ham in other ways at times)
  • Several with 3D printers making mask band holders and the like
  • Several that sing karaoke
  • Someone who makes more homemade pizza than I do
  • At least one with cute puppies she’s been known to have on her webinars

But, honestly, it doesn’t end there. In this time of Covid it’s been more.

  • It's been the folks I get together with on Fridays for a long-distance social hour
  • It’s the ones I’ve been able to talk about fears of COVID and schooling and kids
  • It’s been the ones I’ve reached out to to make sure they’re ok
  • It’s been the ones that have checked in on me
  • It’s the folks that write blog posts, sometimes daily, about how to support others

In short it really is a family. We’re not together by blood but we still share our thoughts and feelings and support each other. And you know what, right now I’m extremely grateful for that family.

So to my #SQLFamily, if I haven’t said it enough, thank you for who you are and for being there, especially during this time of Covid. I know I’ve needed it. And I really appreciate it.

And I can’t wait to see you all in person again at some point.

P.S. – if you’re shy or don’t think you’re welcome in the family, don’t worry, you are welcome. Pop in, say hi, or even just reach out to one person and say hi or ask for an introduction.

The Value of Testing

This is one of those posts where I wish I could show actual code snippets, but since it involves a 3rd party vendor for one of my clients and I don’t have permission, I can’t.

So, I’m forced unfortunately to talk about the issue in a roundabout way.

My client uses a 3rd party tool to track documents. I’ve mentioned this before. They’ve been growing fairly fast and running into performance issues. I suppose growing fast is a good thing, but having performance issues is not.

In any case, using Query Store, I was able to send the vendor a list of queries and stats about them for them to review and to ideally improve the queries that needed work.

Yesterday they got back to me. The email was essentially, "We took this first query (let's call it Doubly-Joined) and rewrote it as this second query (let's call it Singly-Joined)." I looked at the two queries, which join 4 tables. They're very similar to each other, but the first one joined in the main table a second time (hence why I'm calling it Doubly-Joined). It's not clear why this was done. The second query basically removed that second join and, in the select clause, changed the aliases that referenced the second join to reference the first. This does give them a slightly different query plan, but ultimately they return the same number of rows.
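To make the shape of the change concrete, here's an invented example (not the vendor's actual tables or query) of what the before and after looked like:

-- "Doubly-Joined": the main table joined in a second time under another alias
SELECT d2.DocumentId, d2.Title, f.FolderName, u.UserName
FROM Documents d
JOIN Documents d2 ON d2.DocumentId = d.DocumentId
JOIN Folders f ON f.FolderId = d.FolderId
JOIN Users u ON u.UserId = d.OwnerId;

-- "Singly-Joined": the redundant self-join removed, aliases pointed back at the first copy
SELECT d.DocumentId, d.Title, f.FolderName, u.UserName
FROM Documents d
JOIN Folders f ON f.FolderId = d.FolderId
JOIN Users u ON u.UserId = d.OwnerId;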

The first query plan
The second query plan

As you can see, the 2nd query plan is definitely a bit simpler (ignore the one warning, it’s not something that appears to be fixable here).

So, a naive take would be “we removed an unnecessary join, so of course it should be faster!” But is it?

Sometimes intuition can be correct, sometimes not so much. In this case though, it’s easy to confirm by seeing exactly how many rows are being read in each query.

I wrapped each query in a

Set Statistics IO ON/OFF
Set Statistics TIME ON/OFF

block and ran it. Here are the results

The Doubly-Joined

Table 'Table1'. Scan count 0, logical reads 337264, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Table2'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Table3'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Table4'. Scan count 1, logical reads 396, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

The Singly-joined

Table 'Table1'. Scan count 0, logical reads 337260, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Table2'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Table3'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Table4'. Scan count 1, logical reads 396, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

I've highlighted the relevant change. The singly-joined query consistently performed with 4 fewer logical reads. Now, if the original number had been 8 and had dropped in half to 4, I'd be happy. But the change from 337264 to 337260 leaves me a bit underwhelmed. Furthermore, over multiple runs, the second query did not consistently use less CPU time; sometimes it ran faster, sometimes it didn't. Further testing was consistent in showing no apparent improvement.

Needless to say, I don’t think this query improvement will help much. I’ve reached out to the vendor to see if they can provide more details, but honestly, I’m not hoping for much.