About Greg Moore

Founder and owner of Green Mountain Software, a consulting firm based in the Capital District of New York focusing on SQL Server. Lately I've been doing as much programming as I have DBA work so am learning a lot more about C# and VB.Net than I knew a couple of years ago. When I'm not in front of a computer or with my family I'm often out caving or teaching cave rescue skills.

Giving Blood and Pride Month

I gave blood yesterday. It got me thinking. First, let me show a few screenshots:

male blood donor shot 1

Male Donor #1 screen shot

female blood donor shot 1

Female Donor #1 screen shot

Let me interject here I’m using the terms Male and Female based on the criteria I selected in the American Red Cross’s Fast Pass screen. More on why I make that distinction further on. But first two more screen shots.

female blood donor shot 2

Pregnancy question highlighted for female

male blood donor shot 2

No pregnancy question for males

Now, on the face of it, this second set of questions especially almost seems to make sense: I mean if I answered Male early on in the questionnaire, why be asked about a pregnancy? But what I’m asked at the beginning is about my gender, not my actual child-bearing capability. Let me quote from Merriam-Webster:

2-b: the behavioral, cultural, or psychological traits typically associated with one sex

Or from the World Health Organization:

Gender refers to the roles, behaviours, activities, attributes and opportunities that any society considers appropriate for girls and boys, and women and men. Gender interacts with, but is different from, the binary categories of biological sex.

Who can be pregnant?

So above, really what the Red Cross is asking isn’t about my gender, but really my ability to be pregnant. Now, this is a valid medical concern. There are risks they want to avoid in regards to pregnant women, or recently pregnant women giving blood. So their ultimate goal isn’t the problem, but their initial assumption might be. A trans-man might still be able to get pregnant, and a trans-woman might be incapable of getting pregnant (just as a cis-woman might be). And this is why I had the caveat above about using the terms male and female. I’m using the terms provided, which may not be the most accurate.

Assumptions on risk factors

The first set of images is problematic in another way: it is making assumptions about risk factors. Now, I think we can all agree that keeping blood-borne pathogens such as HIV out of the blood supply is a good goal. And yes, while donated blood is tested, it can be even safer if people who know they are HIV-positive or at risk for it can self-select themselves out of the donation process.

But…

Let me show the actual question:

Male Male 3 month contact question

Question 21, for Men

This is an improvement over the older restrictions that were at one year and at one point “any time since 1977”. Think about that. If a man had had sex with another man in 1986, but consistently tested negative for HIV/AIDS for the following 30+ years, they could not give blood under previous rules. By the way, I will make a note here that these rules are NOT set by the American Red Cross, but rather by the FDA. So don’t get too angry at the Red Cross for this.

The argument for a 3 month window apparently was based on the fact that HIV tests now are good enough that they can pick up viral particles after that window (i.e. at say 2 months, you may be infected, but the tests may not detect it.)

Based on the CDC information I found today, in 2018, male-to-male sexual contact resulted in 24,933 new infections. The 2nd highest category was heterosexual contact (note the CDC page doesn’t seem to specify the word sexual there.) So yes, statistically it appears male-male sexual contact is a high-risk category.

But…

I know a number of gay and bisexual men. I don’t inquire about their sexual habits. However, a number are either married or appear to be in monogamous relationships. This means if they want to give blood and not lie on the forms, they have to be celibate for at least 3 months at a time!  But hey if you’re a straight guy and had sex with 4 different women in the last week, no problem, as long as you didn’t pay any of them for sex! I’ll add that more than one gay man I know wants to give blood and based on their actual behavior are in a low risk category, but can’t because of the above question.

Why do I bring all this up at the end of Pride Month and what, if anything does it have to do with database design (something I do try to actually write about from time to time)?

As a cis-het male (assigned at birth and still fits me) it’s easy to be oblivious to the problematic nature of the questions on such an innocuous and arguably well-intended form. The FDA has certain mandates that the Red Cross (and other blood donation agencies) must follow. And I think the mandates are often well-intended. But, there are probably better ways of approaching the goals, in the examples given above, of helping to rule out higher-risk donations. I’ll be honest, I’m not always sure of the best way. To some extent, it might be as simple as rewording the question. In others, it might be necessary to redesign the database to better reflect the realities of gender and sex; after all, bits are cheap.

But I want to tie this into something I’ve said before: diversity in hiring is critical and I think we in the data world need to be aware of this. There are several reasons, but I want to focus on one for now.

Our Databases Model the World as We Know It.

The way we build databases is an attempt to model the world. If we are only aware of two genders, we will build our databases to reflect this. But sometimes we have to stop and ask, “do we even need to ask that question?” For one thing, we potentially add the issue of having to deal with Personally Identifiable Information that we don’t really need.  For another, we can make assumptions: “Oh they’re male, they can’t get pregnant so this drug won’t be an issue.”

Now, I’m fortunate enough to have a number of friends who fall into various places on the LGBTQIA+ (and constantly growing collection of letters) panoply and the more I listen, the more complexity I see in the world and how we record it.

This is not to say that you must go out instantly and hire 20 different DBAs, each representing a different identity. That’s obviously not practical. But, I suspect if your staff is made up of cis-het men, your data models may be suffering and you may not even be aware of it!

So, listen to others when they talk about their experiences, do research, get to know more people with experiences and genders and sexualities different from yours. You’ll learn something and you also might build better databases. But more importantly, you’ll get to know some great people and become a better person yourself. Trust me on that.

 

 

 

Yesterday was NOT a “Monday”

Last week I wrote about how the previous day was A Monday and by that I didn’t just mean by its position on the calendar.

Well I’m here to say that while yesterday was Monday, it was distinctly NOT A Monday. Part of my morning habit when I get up is to check the emails from my largest client. There are of course a bunch of processes that run overnight that I need to check on and fix and rerun if there’s a problem.

I wrote last week about how one ETL job had failed and I had to fix it. I fixed it, but Thursday, Friday and Saturday it failed again due to a related but different issue. Saturday I implemented a change that I was confident would solve the problem. Here I am 3 days later and I can say that it’s been running well since.

Another process that had broken last Monday ran fine. It was off to a good start.

A bit later in the day I received a copy of the weekly audit report. This shows security items that need to be reviewed and mitigated. There was a huge drop in issues from the previous week and one whole column had disappeared. This was a great day.

Later in the day I was able to complete some VB.NET code for an internal web app I maintain for the customer and add new functionality, and to boot, I found a far simpler way to implement what I wanted than I had started with.

About the only real issue yesterday that was work related was that the promised production data for a new project didn’t show up until dinner time. But, hey that’s on the customer.

So, not every Monday is necessarily A Monday. Some are distinctly better.

Yesterday was “A Monday”

Yesterday was a Monday. I don’t just mean it was Monday, but it was in the Garfield comic sense of things A Monday.

As a consultant, I’ve come to expect certain patterns in my work load. For one client, I know approximately every 2 months, over 2 weekends I’m going to have to patch their SQL Servers. I know certain passwords will need to be updated quarterly or annually. And I know sometimes I’ll have A Monday.

Yesterday was one of those. I woke up, checked my email and noticed two jobs had not run. So I logged in and it appeared that the PowerShell script on each server had hung. I killed it and tried to rerun it, but got an error. This wasn’t entirely surprising. This script, in its first part downloads a file from a 3rd party vendor and last week for example, their SFTP server had been down. At first I expected this to be the problem again. But further testing showed I was getting inconsistent errors. Finally the script ran. But, what normally took about 20 minutes to download, took about 2 hours. We learned later the vendor had done an upgrade to their product over the weekend. This shouldn’t have impacted their SFTP server performance, but here we were. Today (Tuesday) the process took 20 minutes again and is back to normal. Chalk yesterday’s issue up to being A Monday.

Then I took a look at another job that had failed. This one is purely internal. Basically SFTP a file from a Linux server to a NAS for a backup. A quick check showed that the NAS share was inaccessible. Reporting this triggered an avalanche of emails back and forth. The most interesting line basically came down to “Yes, the internal IT team did a migration of the NAS, but the migration was supposed to be completely transparent to the users.” Famous last words in my book. Actually, honestly, what I decided was more disturbing was that the failure was on the new NAS device apparently due to a typo. To me, this means, most likely, all the old shares were recreated on the new device by hand, rather than using a script that read out the old shares and recreated them. In any event, the problem was solved, the job was rerun and the backup created on the now new NAS. Chalk that one up to being A Monday.

Then one of the developers for one of the platforms at this client emailed me and said, “Hey database FOO is in recovery mode, what happened?” This one, fortunately I knew exactly what the problem was. Unfortunately I knew it was my fault. We had decided to reconfigure that database to be a log-shipped copy of the main database and I had set it up over the weekend. I had simply forgotten to set it up to place itself in Stand-by/Read-only mode after it had applied the most recent logs. I’ll chalk that one up to it being A Monday.

All of the above was taken care of before 10:00 AM. The rest of the day was filled with a variety of other issues and items, including looking at a Hyper-V host machine with 16 physical CPUs with hyperthreading turned on hosting 4 VMs, 1 with 4 vCPUs allocated, and the other 3 with 8 each. They’re having performance issues. I’m still tackling that one. Looking at that happened on Monday, but it’s not A Monday issue, it’s been an ongoing issue for months.

So what was it about this particular Monday, or Mondays in general?

Well in this case, all 3 of my early AM issues had one thing in common: upgrades or changes made over the weekend. I’m not going to debate the value or wisdom of the timing here, but just note that on this particular Monday, it wasn’t just one issue, but three. It was definitely A Monday. But I survived as did my customer.

Now back to my regularly scheduled workload.

 

“Houston, We Have an Opportunity”

This is not quite the famous quote from the movie Apollo 13, one of my favorite movies. And, well we have a problem. But also an opportunity.  I’ll get to the opportunity in a bit, but first, the problem.

My Event and Disappointment

The problem of course is COVID-19. At the end of the week in which I’m writing this, I was scheduled to help host the National Weeklong Cave Rescue training seminar for the NCRC. For years, I and others had been working on the planning of this event. It’s our seminal event and can attract 100 or more people from across the country and occasionally from other countries. Every year we have it in a different state in order to allow our students to train in different cave environments across the country (New York caves are different from Georgia caves which are different from Oregon caves and more) as well as to make it easier for attendees to attend a more local event.

Traditionally, the events held in NY (this would have been the fourth National Seminar) have attracted fewer people than our events in Alabama, which is famous for its caves. But this year was different: with lots of marketing and finding a great base camp, we were on track not only for the largest seminar yet in New York, but for one on par with our largest seminars anywhere.

Then, in February I started to get nervous. I had been following the news and seeing that unlike previous outbreaks of various flus and other diseases, COVID-19 looked like it would be something different. This wasn’t going to pass quite as quickly. This was going to have an impact. As a result, I started to make contingency plans with the rest of my planning staff. I consulted with our Medical Coordinator. I talked to the camp. By March I started to regain some optimism, but I still wasn’t 100% confident we could pull this off. And then, the questions from attendees started to come in. “Are we still having it?” “What are the plans?” etc.  Another week or two later, “I need to cancel. My agency/school/etc won’t cover the cost this year.”  Finally by the start of April, after talking to several of my planning staff and my fellow regional coordinators it became obvious, we could not, in good conscience host a seminar in June. Yes, here in upstate New York the incidence of COVID-19 is dropping quickly. We’re starting the re-opening process. Honestly, if folks came here, I would NOT be worried about them getting infected from a local source.

However, we would have nearly 100 people coming from across the country, including states where the infection rates are climbing. Many would be crammed into planes for hours, or making transfers at airports with other travelers. So, while locally we might be safe; if we held this event, where folks are in classrooms for 3-4 hours a day, then in cars to/from cave and cliff sites and then often in caves for hours, it’s likely we would have become ground zero for a spike in infections. That would not have been a wise nor ethical thing to do. So, we postponed until next year.

I mention all that because of what happened to me about three weeks ago and then the news from last week: 2020 PASS Summit is going Virtual. About three weeks ago I was asked to participate in a meeting with some of the folks who help to run PASS as well as other User Group leaders. The goal was to discuss how to make a Virtual Summit a great experience if it went virtual. This was one of several such meetings and I know a lot of ideas were brought up and discussed. NONE of us were looking forward to PASS 2020 being virtual, but we all agreed that it was better than nothing. And of course, as you’re well aware, last week PASS made the decision, I suspect due in large part to the very same reasons we postponed our cave rescue training event.

Sadness and Disappointment

Mostly I’ve heard, if not happy feedback, at least resigned feedback. People have accepted the reality that PASS will be virtual. I wrote above about my experience with having to postpone our cave rescue training (because it’s so hands-on, it’s impossible to host it virtually). It was not an easy decision. I’ll admit I was frustrated, hurt, disappointed and more. I and others had put in a LOT of hard work only to have it all delayed. I know the organizers of Summit must be feeling the same way. And I know many of us attendees must feel the same. Sure, Houston is not Seattle and I’ve come to have a particular fondness for Seattle, in part because of an opportunity to see friends there, but I was looking forward to going to Houston this year (as was my wife) and checking out a new city.

One thing that has helped buoy my emotions in regards to our weeklong cave rescue class is that over 1/2 the attendees said, “roll my registration over to next year. I’m still planning on coming!” That was refreshing and unexpected. Honestly, I was hoping for maybe 1/4 of them at best to say so. This gave me hope and the warm fuzzies.

Opportunity

Let me start with stating the obvious: a virtual event will NOT be the same as an in-person event. There will most definitely be things missing. Even with attempts that PASS will be making to try to recreate the so-called “Hallway track” of impromptu discussions and hosting other virtual events to mimic the real thing, it won’t be the same. You won’t get to check out the Redgate Booth in person, hang out on the sofa at Minionware, or get your free massage courtesy of VMWare.


After a great massage courtesy VMWare.

And we’ll miss out on:


Achievement unlocked: PASS Summit 2019 Selfie with Angela Tidwell!

But, we’ll still have a LOT of great training and vendors will have virtual rooms and more.

So what’s the opportunity? Accessibility!

Here’s the thing, I LOVE PASS Summit. I think it’s a great training and learning opportunity. But let’s face it. It can get expensive, especially when you figure in travel costs, hotel costs and food costs. This year though, most of those costs disappear. This means that when you go to your boss, they have even less of an excuse to say, “sorry it’s not in the budget”.  And honestly, if they DO say that, I would seriously suggest that you consider paying for it out of your own budget. Yes, I realize money might be tight, but after all the wonderful training you can then update your resume and start submitting it to companies that actually invest in training their employees.

I would also add, from my understanding, while convention centers are by law ADA accessible, this doesn’t necessarily mean everyone with a disability can attend. There can be non-physical barriers that interfere. Hosting virtually gives more people the ability to “attend” in a way that works for them. It might be in a quiet, darkened room if they’re sensitive to noise and lights. It might be replaying sessions over and over again if they need to hear things in that fashion. It very well could be taking advantage of recorded sessions and the like in ways that I, an able-bodied person, am not even aware of. So that’s a second way in which it’s accessible.

Now, I know folks are questioning “well if it’s virtual, why should we pay anything, especially if vendors are still paying a sponsorship fee?” There are several answers to that and none of them by themselves are complete, but I’ll list some. For one, I haven’t confirmed, but I’m fairly confident that vendors are paying a lot less for sponsorship, because they won’t get the same face to face contact. For another, PASS takes money to run. While we often think of it as a single big weeklong event, there’s planning and effort that goes on throughout the year. This is done by an outside organization that specializes in running organizations like PASS. (Note the PASS Board is still responsible for the decision making that goes on and the direction of PASS as a whole, but day to day operations are generally outsourced. This is far from uncommon.) Those costs don’t disappear. There are other costs that don’t automatically disappear because the event is no longer physical. And of course there are costs that a virtual event has that the physical event doesn’t. Now EVERY single session will be available as a live-stream (as well as recorded for later download) and this requires enough bandwidth and tools to manage them. And it requires people to help coordinate. Making an event virtual doesn’t automatically make it free to run.

The Future

Now, I know right now I’m on track for hosting the NCRC Weeklong Cave Rescue training event next year at the location we planned on for this year. Our hope of course is that by then COVID-19 will be a manageable problem. But in the meantime, I’ll keep practicing my skills and sharing my knowledge and when and where I can, caving safely. And as always willing to take new folks caving. If you’re interested, just ask!

I don’t know what PASS 2021 Summit will bring or even where it will be. But I know this year we can make the most of the current situation and turn this into an opportunity to turn PASS into something new and more affordable. Yes, it will be different. But we can deal with that. So, register today and let’s have a great PASS 2020 Summit in the meantime!  I look forward to seeing you there. Virtually of course!

“It’s a Jump to the left…

… and then a double-hop to the right.” Or something like that.

I’ve commented before on the fact that I’m a consultant. I enjoy it. People will ask me what I do, and it varies. At one client they refer to their VB app as “the database” and because they found an ad of mine on Google where I talked about database administration, they hired me. About 80% of the work I do for them is actually on the VB app or related, very little is actually what I’d consider traditional database work. But that’s ok, they’re a pleasure to work with and I enjoy the work. Another client I recently worked with, asked me to help them conduct an audit of their web based product and help them with some steps to make it more secure. I was more than happy to help.

And then there’s my largest, by far, client. I actually get to do a fair amount of work that most of my #sqlfamily would recognize as “database work”. But there, perhaps more than any other, I describe my duties as “DBA and other duties as assigned.”  So between the work at this client and all my other clients, I’m often jumping or stepping around stuff.

The Double-Hop

This time though I was asked to double-hop. What is that exactly? It’s an issue that has to do with how Windows can pass security credentials from one server to another. This article, while old, describes it well. This was essentially the situation my client was trying to solve: Users needed to use their Active Directory (AD) Credentials to log into the Reporting Server (RS_Server) and run a report that in turn accessed data on a separate database server (DB_Server), and thus, the double-hop. Now, from my point of view, this isn’t really database work, but since the reporting server talks to the database server it was dropped in my lap under “other duties as assigned.”

Now, honestly, this SHOULD be simple to solve. It wasn’t. Part of the reason was that, like many companies, this client has a separate team that handles much of their infrastructure needs, such as AD requests. And they have to go through tickets. To be clear, I support this concept, in theory. In practice, it can often take 2-3 weeks for even simple requests to go through. This meant that my first attempt at solving the double-hop failed. Their IT department did exactly what I requested. Unfortunately there was a typo in my ticket. So it failed. So round two. And round two didn’t work. Nor did round three. At this point though it wasn’t due to typos or mistakes on my end.

I started reading every article I could. My great editor at Red-Gate, Kathi Kellenberger has one, and trust me I wasn’t too shy to ask at that point! But nothing was working. I even asked another DBA at the client (she actually heads up a different group and is their lead DBA). She pointed me to one of her people saying, “talk to him, he solved it.” I did, and he hadn’t. His solution was the one we were trying to avoid (basically using a fixed user in the datasource).

Frustration Sets In

I was getting frustrated. Fortunately at this point I started to exploit a loophole in the ticketing process. Since the problem wasn’t being fixed, I was able to keep it open and ended up getting assigned someone from their IT group who was as interested in fixing this as I was. This meant rather than “open a ticket, wait 1-2 weeks, have ticket be closed as complete, test, find out it failed, rinse” we could now actually schedule Zoom sessions and make changes in real-time. AND…. nothing we tried worked.

At this point you’re probably saying, “Yeah, yeah, get to the point. Did you solve it?”  The answer is yes, but I wanted you to feel a bit of my pain first, and I needed to make this post long enough to make it worth posting.

A Solution!

Now, let me say, I wish I could write out an exact recipe card solution for you. For various reasons, I can’t. But bear with me.

Finally, we found an additional resource in the IT group who had solved this before. His first recommendation was yet again, the solutions I mentioned above. He saw they didn’t work, and agreed we were trying the right thing. So he said, “Well, let’s try a solution known as ‘Resource-based Kerberos Constrained Delegation’.” This didn’t work at first either.

But then he suggested that we turn on 128 and 256 bit encryption on the DB_Server SQL account. Bingo, that worked. Mostly. More on that in a second.

So here’s the setup we ended up with.

  1. RS_Server – running reporting services under an account domain\RS_Server_Service
  2. DB_Server – running SQL Server under an account domain\DB_Server_Service
  3. Set up some SPNs
    1. MSSQLsvc/DB_Server domain\DB_Server_Service
    2. MSSQLsvc/DB_Server.domain.com domain\DB_Server_Service
    3. Note in this case you do not appear to need one on the RS_Server side.
  4. Run a PowerShell script
    1. $FEIdentity1 = Get-ADUser -Identity domain\RS_Server_Service
    2. $BEIdentity = Get-ADUser -Identity domain\DB_Server_Service
    3. Set-ADUser $BEIdentity -PrincipalsAllowedToDelegateToAccount $FEIdentity1
  5. At this point things should have worked, but they didn’t until we then enabled the encryption options:

    encryption options for Kerberos

    Kerberos Encryption

  6. Then on our test box, things magically worked! Ok, not quite magically, but things worked. We had a solution.
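
If you would rather script step 5 than tick the boxes in the account’s properties dialog, something like the following should do the same thing. This is only a sketch, not the exact commands run at the client: it assumes the ActiveDirectory PowerShell module is installed, that you have rights to modify the account, and that DB_Server_Service is the sAMAccountName of the placeholder SQL Server service account from step 2.

```powershell
# Sketch only: assumes the ActiveDirectory module (RSAT) is available and that
# 'DB_Server_Service' is the sAMAccountName of the SQL Server service account.
Import-Module ActiveDirectory

# Scripted equivalent of enabling the AES 128 and AES 256 Kerberos options
# on the back-end (SQL Server) service account.
Set-ADUser -Identity 'DB_Server_Service' -KerberosEncryptionType AES128,AES256

# Read the raw attribute back to confirm the change was stored.
# In this bitmask, 0x08 is AES128 and 0x10 is AES256.
Get-ADUser -Identity 'DB_Server_Service' -Properties 'msDS-SupportedEncryptionTypes' |
    Select-Object SamAccountName, 'msDS-SupportedEncryptionTypes'
```

Whether you need this at all depends on how your domain is configured, so treat it as something to try rather than a required step.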

And I was even more ecstatic when later that day, I tested this on a second report server box we had and it too suddenly was working without any changes. And this was a box where we had NOT even set up an SPN for the original double-hop solution, so I was pretty confident that the Resource-based Kerberos Constrained Delegation was working. In addition, in the rsreportserver.config file, the only authentication enabled was NTLM.

The next step was to try this on a production server. In that case, I did have to reconfigure the service it was running under to use the domain account domain\RS_Server_Service.

And… my test failed.

I was at wit’s end. I couldn’t quite figure out what was different. I checked my service names, my SPNs, the rsreportserver.config file, and more. Nothing was working. I took a break and came back and had an idea. In the datasource I changed it from:

Data source=DB_Server;Initial Catalog=TestDB

to

Data source=DB_Server.domain.com;Initial Catalog=TestDB

Bingo, it worked! A little digging confirmed my suspicion. This client actually has multiple DNS domains and the ordering and the like under the TCP/IP settings was different on this box from the other two boxes. And that made the difference.

Sure enough when I tried deploying to a fourth box, I had the same issue, but changing it to the Fully Qualified Domain Name (FQDN) solved my issue.
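
A quick way to see how a given box treats the short name is to look at its DNS suffix search order and then ask .NET what the name resolves to. This is a minimal sketch: DB_Server is the placeholder server name used in this post, and the output depends entirely on that machine’s TCP/IP settings (the suffix list can also be influenced by connection-specific settings or group policy, which this does not show).

```powershell
# Sketch only: 'DB_Server' is the placeholder server name used in this post.

# Show the global DNS suffix search order configured on this machine.
(Get-DnsClientGlobalSetting).SuffixSearchList

# Ask .NET to resolve the short name using this machine's settings.
$entry = [System.Net.Dns]::GetHostEntry('DB_Server')
$entry.HostName      # the name this box resolved it to
$entry.AddressList   # the IP address(es) behind that name
```

Running this on a report server that works and on one that doesn’t should make any difference in resolution easy to spot.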

So, my take-aways for this week:

  • Resource-based Kerberos Constrained Delegation may be a better way of solving the Double-Hop problem than the solution generally proposed.
  • Once you’ve set up the “target” SQL Server service account and source Reporting Server Service accounts, additional reporting servers can be added to the mix without needing assistance from a domain admin.
  • It appears you still need an SPN (well multiple) for the SQL Server itself.
  • You need to run a PowerShell script to set up the accounts. Note that if you run it again, it overwrites the old settings, so you need to add ALL of the source accounts in a single step.
  • Depending on your domain’s security setup, you may need to enable 128/256 bit Kerberos encryption.
  • DNS resolution may determine if you can use just the NetBIOS name or the FQDN in your data sources.
  • This solution will NOT work if you need to cross domains or have more complex setups, but in general, it can be simpler to set up and to maintain, especially if you have limited access to making changes to AD in the first place.
  • My reading indicates this only works on Windows Server 2012 and beyond. But you shouldn’t be running older versions of Windows in any case!
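
Because rerunning the script overwrites the earlier setting, one way to handle more than one source account is to collect them all first and pass them in a single call. The following is a sketch under assumptions: the account names are the placeholders used above, RS_Server2_Service is a hypothetical second report server account invented for illustration, and the ActiveDirectory module is assumed to be installed.

```powershell
# Sketch only: account names are placeholders from this post.
Import-Module ActiveDirectory

# Collect EVERY front-end (report server) service account first...
$frontEnds = @(
    Get-ADUser -Identity 'RS_Server_Service'
    Get-ADUser -Identity 'RS_Server2_Service'   # hypothetical second account
)

# ...then set them in one call, because this parameter replaces the existing list.
Set-ADUser -Identity 'DB_Server_Service' -PrincipalsAllowedToDelegateToAccount $frontEnds

# Verify which principals are now allowed to delegate to the back-end account.
Get-ADUser -Identity 'DB_Server_Service' -Properties PrincipalsAllowedToDelegateToAccount |
    Select-Object -ExpandProperty PrincipalsAllowedToDelegateToAccount
```

The read-back at the end is also a handy check before you change anything, since it shows what a rerun would overwrite.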

If I get the time and energy, I may set up a test environment in my home lab to further experiment with this and write up better demos, but for now, use this as you can. Hopefully it’ll save you some of the stress I experienced.

And that’s it from here, back to other duties as assigned.

 

 

The Customer is Always Right?

You’ve heard this adage before. Some often believe it. And honestly, there’s a bit of truth to it. But the truth is, the customer pays your bills and if they stop paying your bills, they’re no longer your customer.

I was reminded of this actually during the testimony of Dr. Fauci before the Senate a few weeks ago. To be clear, neither the Senate nor the CDC is a customer here, nor is the President of the United States. But, I think it ties into the thesis I want to make.

Over the past few months there’s been a lot of discourse over whether states should shut down, for how long and how and when they should open up. At the extreme ends you have folks who seem to argue for a continual shutdown to save as many lives as possible and the people who seem to argue that the economy is far more important and that any shutdown is a bad idea.

Dr. Fauci has been accused of wanting to “quarantine the entire country” and is the subject of a hashtag campaign, #faucifraud. During the Senate hearing, Senator Rand Paul took several swipes at Dr. Fauci including a statement implying that some people might treat Dr. Fauci as the end-all. Finally, with only 32 seconds left, Dr. Fauci gave his reply:

“Well, first of all, Senator Paul, thank you for your comments. I have never made myself out to be the end all and only voice in this. I’m a scientist, a physician, and a public health official. I give advice, according to the best scientific evidence. There are a number of other people who come into that and give advice that are more related to the things that you spoke about, about the need to get the country back open again, and economically. I don’t give advice about economic things. I don’t give advice about anything other than public health. So I wanted to respond to that.”

I think this was an incredible reply and one that I think behooves any consultant to keep in mind. Dr. Fauci politely but firmly refutes Senator Paul’s comment about being the end all and then points out what his qualifications are. He then suggests that there are other experts, in other fields, who should be consulted. He then reiterates the limitations of his advice.

As a consultant, this mirrors my own experience. A client may ask me to recommend a HA/DR strategy for them. I might go ahead and recommend some sort of Always On Availability Group with multiple synchronous replicas in one data center and then an asynchronous replica to a second data center. I might then recommend daily backups to tape with the tapes taken off-site. Everything would be encrypted and we would test failovers on a regular basis. With that, the proper selection of hardware, a proper deployment setup, and a completely developed runbook for various scenarios, I could probably guarantee them nearly perfect uptime.

Then, the CFO steps in and points out that their budget is only 1/20th of what I had planned around and that trying to spend more would bankrupt the company.

Then the VP of Sales points out that the business model of the company is such that, in reality, they could tolerate several hours of downtime, and while it might hurt business a little bit, it wouldn’t bring them to a halt. In fact, they suggest that the order system just be done with a bunch of Excel spreadsheets that accounting can true up at the end of the month. After all, they just want to focus on sales, not on entering data into the system.

Finally, the CEO steps in. They decide that it’s true, the company can’t afford a 24/7 HA/DR setup that is the envy of NASA, at least not at this time. Nor do they think the VP of Sales’s plan has much merit, since it won’t allow future growth into online sales, and while it might be easier for the salespeople to just jot down stuff, it would mean hiring more people in accounting to figure out the data at the end of each month.

Instead, they direct the CTO to work with all the parties involved to develop a system that can have 3 hours of downtime but costs no more than 1/10th of what I proposed, that incorporates features that allow them to move to a more advanced HA/DR setup down the road, and that will eventually allow for online sales.

So who was right? Me, at my extreme of a huge investment in hardware, licenses and resources, or the VP of Sales, who wanted to do the whole thing using some Excel spreadsheets?

Both and neither. Either of our ways would have worked, but neither was the best solution for the company.

I think good experts realize exactly where their expertise begins and ends. My role as a consultant is to provide the best advice I can to a company and hope that they take my advice, at least as it is applicable. And I should understand that every situation is different. In these cases, the customer is always right. Their final decision might not be what I recommended, but ideally they’ve taken it into consideration.

Finally though, I have to recognize that there are situations where I may have to withdraw myself from the situation. In the above scenario, I crafted a situation where compromise is not only a viable option, it is perhaps the best option. But there are times when compromise is not an option. If a potential client came to me and they were dealing with PII data and refused all advice regarding encrypting data and other forms of data security, it would be in my best interests to simply say, “here, the customer is wrong” and recuse myself. So in some cases the customer may be outright wrong, but at that point they should also stop being my customer and would no longer be paying me.

So, do we re-open the country completely or do we shut down everything until the fall? Honestly, I’m glad I’m not the one making that decision. I don’t think there’s a single answer for every community. But I think the best leaders will take into account the best advice they can from a variety of experts, synthesize the best answer that they can, and adjust it as more data and experience come to light. It’s simply not practical to prevent every possible COVID-19 death. But it’s also not ethical to re-open without a plan or even thought as to the impact.

Neither extreme is fully right.

 

Advanced Braining

I’m currently reading the tome The Power Broker by Robert Caro. For those not familiar with it, it’s the Pulitzer Prize-winning biography of Robert Moses. “Robert who?” you may be asking. Robert Moses, perhaps more than any other single person, literally shaped New York City in the mid-20th century. Due to his power, he was responsible, in NYC alone, for getting the Triborough Bridge, Brooklyn-Battery Tunnel, West Side Highway, Cross-Bronx Expressway, and many other large-scale projects built. He outlived a number of borough presidents, mayors, governors and even Presidents. Arguably, for decades he was the most powerful man in NYC, at least in terms of how much money was spent and what projects were completed. In many ways he was a visionary.

However, as the chapter I’m currently in discusses, he also could be extremely short-sighted. I’ll come back to that in a moment.

In the past week, several small incidents occurred in my life. Separately, they don’t necessarily mean much, but taken together, I realized there was a little theme associated with them.

Last Tuesday I posted an update on my dryer repair and an issue at one of my clients. I described the work incident as an example of the normalization of deviance. A few hours later, someone I’ve known for decades, originally online, but have since met in person, Derek Lyons (who has a great blog of his own on anime, a subject about which I know nothing) posted a reply to me on Facebook and said he had read my article, liked it, but thought I was wrong. I was intrigued. You can see his comment and my reply at the bottom of last week’s post. The general point though is I think he showed my thinking was incomplete, or at least my explanation was. In either case, it made the overall article a better one.

Then on Wednesday, my editor at Redgate, Kathi Kellenberger, emailed me with changes to my most recently submitted article. One of the changes was to the title of the article. Now, I’ve come to value Kathi’s input, but I wasn’t keen on the title change, so I suggested something different. She wrote back and recommended we go with hers, How to Add Help to PowerShell Scripts, because she said “How to…” generates more hits on search engines and in fact a previous article of mine, How to Use Parameters in PowerShell, was one of their most read articles at the time (106K hits and climbing). I went with her advice.

Yesterday, a friend contacted me. He was in the middle of doing grading for his students and the numbers on his Excel spreadsheet weren’t quite making sense. The errors weren’t huge, but just enough to make him go “hmmm”. So, he reached out to me to take a look. After a few minutes of digging I understood what was happening and was able to write back to him and give him a better solution.

All these have something in common: the final product was better because of collaboration. This is a common theme of mine: I’ve talked about the chat system I use at RPI, I’ve talked about making mistakes. In general, I think that when trying to solve a problem, getting additional input is often valuable.

So back to Robert Moses. In the early part of his career, before his efforts focused mostly on NYC itself, he was responsible for other projects, such as the Northern State Parkway and the Southern State Parkway and Jones Beach on Long Island. He started his career in a time when cars were mostly a vehicle of the well-off and driving a parkway was expected to be a pleasant experience (hence the name). His efforts were built around more and more parkways and highways.

By the 1950s though, it was becoming apparent to most everyone else that additional highways actually generated more traffic than they routed away from the area surface roads. What was originally considered a blessing in disguise, where a bridge such as the Triborough would quickly generate more traffic (and hence more tolls) than expected, was soon seen as a curse. For every bridge or tunnel built in or around NYC, traffic increased far more than expected. And this came at a price. Urban planners around the country were starting to see the effects. Efforts to build more bridges or highways to ease traffic congestion were actually creating more. Even in NYC, as Moses was planning for his next large projects, opposition was slowly building. However, Robert Moses was blind to the problem. By the 1950s and 60s he had so surrounded himself with “yes men” that no dissent was permitted. In addition, opposition outside of his offices was silenced by almost any means Moses could use, including apparently the use of private detectives to dig up dirt on opponents.

In the current chapter I’m reading, Caro, the author, details exactly how much money the Triborough Bridge Authority (which was in practice, though not theory, under Moses’s absolute control) and the Port Authority had available for upcoming projects, including the planned Verrazzano-Narrows Bridge. He goes on to explain how badly the infrastructure of the NY Subway system and the LIRR had fallen into disrepair. Caro suggests how much better things could have been had just a portion of the money the TBA and PA had at their disposal been spent on things like the Second Avenue Subway (something that is only now coming to fruition and will take possibly decades more to complete). Part of the issues with the subway system can be laid directly at the feet of Moses due to earlier efforts of his to get the city to fund his other projects. The issues with the LIRR, however, were more an indirect result of his highway building out into Long Island.

I suspect some of Caro’s claims are a bit idealistic, and that the projects would have cost more than the projections at the time (like most projects). Still, I think most of the projects he touches upon probably should have been built in the 50s (the Second Avenue Subway being one of them and the LIRR East Side Access being another). They weren’t, because of a single man who brooked no disagreement and was unwilling to reconsider his plans.

Robert Moses was a man who got things done. Oftentimes that’s a good thing. And honestly, I think a number of his achievements are remarkable and worthy of praise.

But I have to wonder, how much better of a city could New York be, had Robert Moses listened to others, especially in the 1950s and 60s?

Today’s takeaway? Take the time to listen to input and ask for it. You may end up with a better solution in the long run.

 

 

 

Quiet Time and Errors

I wrote last week about finally taking apart our dryer to solve the loud thumping issue it had. The dryer noise had become an example of what is often called normalization of deviance. I’ve written about this before more than once. This is a very common occurrence and one I would argue is at times acceptable. If we reacted to every change in our lives, we’d be overwhelmed.

That said, like the dryer, some things shouldn’t be allowed to deviate too far from the norm and some things are more important than others. If I get a low gas warning, I can probably drive another 50 miles in my car. If I get an overheated engine warning, I probably shouldn’t try to drive another 50 miles. The trick is knowing what’s acceptable and what’s not.

Yesterday I wrote about some scripting I had done. This was in response to an issue that came up at a customer site. Nightly a script runs to restore a database from one server to a second server. Every morning we’d get an email saying it was successful. But, there was a separate email about a separate task that was designed to run against that database that indicated a failure. And that particular failure was actually pretty innocuous. In theory.

You can see where I’m going here. Because we were trusting the email of the restore job over the email from the second job, we assumed the restore was fine. It wasn’t. The restore was failing every night but sending us an email indicating a success.

We had unwittingly accepted a deviance from the norm. Fortunately the production need for this database hadn’t started yet. But it will soon. This is what led to my drive to rewrite and redeploy the scripts on Friday and on Monday.

And here’s the kicker. With the new script, we discovered the restore had also been failing on a second server (for a completely different reason!)

Going back to our dryer here, it really is amazing how much we had come to expect the thunking sound and how much quieter it is now. I’ve done nearly a half-dozen loads since I finally put in the new rollers and every time I push the start button I still cringe, waiting to hear the first thunk. I had lived with the sound so long I had internalized it as normal. It’s going to take a while to overcome that reaction.

And it’s going to take a few days or even weeks before I fully trust the restore scripts and don’t cringe a bit every morning when I open my email for that client and check for the status of overnight jobs.

But I’m happy now. I have a very quiet dryer and I have a better set of scripts and setup for deploying them. So the world is better. On to the next problems!

Checking the Setup

A quick post outside of my usual posting schedule.

I was rewriting a T-SQL sproc I have that runs nightly to restore a database from one server to another. It had been failing for reasons beyond the scope of this article. But one of the issues we had was, we didn’t know it was failing. The error-checking was not as good as I would have liked. I decided to add a step that would email me on an error.

That’s easy enough to do. In this case I wanted to be able to use the stored procedure sp_notify_operator. This is useful since I don’t have to worry about passing in an email address or changing it if I need to update things. I can update the operator. However, the various servers at this client had been installed over a period of several years and I wasn’t sure that all of them had the same operator configured. I was also curious as to who the operators’ emails went to on those machines. And I had a decent number of machines I wanted to check.

Fortunately, due to previous work (and you can read more here) I have a JSON file on my box so I can quickly loop through a list of servers (or if need be by servers in a particular environment like DEV or QA).

# Load the list of servers from the JSON file
$serverobjlist = Get-Content -Raw -Path "$env:HomeDrive$env:HomePath\documents\WindowsPowerShell\Scripts\SQLServerObjectlist.json" | ConvertFrom-Json

foreach ($computername in $serverobjlist.computername)
{
    # Operators defined on this server and where their email goes
    $results = Invoke-Sqlcmd -ServerInstance $computername -Query "select name, email_address from msdb.dbo.sysoperators"
    Write-Host $computername $results.name $results.email_address
    # Database Mail profiles configured on this server
    $results = Invoke-Sqlcmd -ServerInstance $computername -Query "select name from msdb.dbo.sysmail_profile"
    Write-Host $computername $results.name `n
}

This gave me a list of what operators were on what servers and who the emails went to. Now if this were a production script I’d probably have made things neater, but this worked well enough to do what I needed. Sure enough, one of the servers (ironically one of the ones more recently installed) was missing the standard mail profile we set up. That was easy to fix because of course I have that scripted out. Open the T-SQL script on that server, run it, and all my servers now had the standard mail profile.
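
For what it’s worth, a neater version might return one object per server and flag anything missing, rather than writing loose text to the console. What follows is only a sketch, not the script actually used: the function name Get-MailSetupStatus, the operator name 'DBA Team', the profile name 'Standard Mail', the server names and the sample rows are all made up for illustration.

```powershell
# Hypothetical names: the operator and mail profile every server should have.
$expectedOperator = 'DBA Team'
$expectedProfile  = 'Standard Mail'

function Get-MailSetupStatus {
    param(
        [string]   $ComputerName,
        [object[]] $Operators,        # rows shaped like msdb.dbo.sysoperators
        [object[]] $Profiles,         # rows shaped like msdb.dbo.sysmail_profile
        [string]   $ExpectedOperator,
        [string]   $ExpectedProfile
    )

    # One object per server, so results can be filtered, sorted or exported.
    [pscustomobject]@{
        ComputerName = $ComputerName
        HasOperator  = @($Operators.name) -contains $ExpectedOperator
        HasProfile   = @($Profiles.name) -contains $ExpectedProfile
        OperatorMail = @($Operators.email_address) -join '; '
    }
}

# Made-up sample rows standing in for what the two queries would return.
$sample = @(
    @{
        Name      = 'SQLDEV01'
        Operators = @([pscustomobject]@{ name = 'DBA Team'; email_address = 'dba@example.com' })
        Profiles  = @([pscustomobject]@{ name = 'Standard Mail' })
    },
    @{
        Name      = 'SQLQA01'
        Operators = @([pscustomobject]@{ name = 'DBA Team'; email_address = 'dba@example.com' })
        Profiles  = @()   # no mail profile configured on this one
    }
)

$status = foreach ($s in $sample) {
    Get-MailSetupStatus -ComputerName $s.Name -Operators $s.Operators -Profiles $s.Profiles `
        -ExpectedOperator $expectedOperator -ExpectedProfile $expectedProfile
}

# Show only the servers that are missing something.
$status | Where-Object { -not ($_.HasOperator -and $_.HasProfile) }
```

With the sample data, only SQLQA01 is flagged. In real use the Operators and Profiles rows would come from the same two Invoke-Sqlcmd queries shown in the loop above, and the expected names would be whatever the standard setup at that client actually uses.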

Once I had confirmed my new restore script could run on any of the servers and correctly send email if there was an error, it was time to roll it out.

deploy

Successful deploy to the UAT environment

So one quick PowerShell Script, an updated T-SQL Script and a PowerShell Deploy Script and my new sproc has been deployed to UAT and other environments.

And best of all, because it was logged, I knew exactly when I had done it and on what servers and that everything was consistent.

I call that a win for a Monday. How is your week starting?

 

 

Getting My Hands Dirty…

… and my clothes cleaned. Or more importantly, dried.

Before I was a programmer I worked for my dad in construction over the summers of high school. It was good solid work. I enjoy working with my hands at times. For one thing, you see and feel the results of your accomplishments in a very tangible manner. For another, you generally can measure the impact of your effort.

After my dad died, I wrote about using a drill of his to work on the addition that was to become my office. I liked the heft and feel of it. I knew I was accomplishing something with it. Being a programmer, sometimes it’s hard to experience that. Currently for example I’m working on an ETL script using PowerShell to SFTP down a file, extract it to some tables and then feed it into Salesforce. For me, it’s just a bunch of data. Yeah, there are some fun challenges: learning how to set up and deal with GPG, and designing something robust and secure because some of the data is sensitive. But, once the project is finished, it’ll run silently and other than an occasional email, I won’t think much about it. It won’t impact my day to day life in any way and I won’t be able to point to it and say, “See, THERE is something I did.”

But, my dryer on the other hand, now that’s different. For a while now (say at least a year or two) whenever I’ve run a load it’s made a fearsome rumbling sound. It’s been annoying, but we’ve managed to live with it up until 6 or 7 weeks ago. Generally I’d do most of the laundry on Sundays and if there was a load or two left, on Monday while everyone else was out of the house or at school. But obviously things changed. My wife’s office is in the room next to the laundry room. Whereas for me the rumbling was faint and simply background noise, for her it was quite noticeable. I tried to work the loads around her work schedule, especially since she’s on so many conference calls for her job, but it was getting less and less practical.

It was finally time for me to do something about it. Now, had I been smart, I’d have started the project on a Monday. But, I’m not always that smart. So, Saturday came around and I disassembled the dryer.

I was fairly confident I knew what the problem was. I assumed that either something had wrapped around one of the rollers for the drum, or a bearing in a roller had seized. If it was the former, the fix would be trivial and I’d have the whole thing back together before dinner. If it was the latter, I figured a shot of silicone or other lubricant and I could at least get a few more weeks out of it while I ordered the parts. And since the tight screws were now loosened and I knew how to take it apart, the final fix would go quickly.

Well, as they say, you know what happens when one assumes. I was wrong about the first guess; it was not something as simple as something wrapped around the roller. And I was even more wrong about it being a seized or flattened bearing. See, for that assumption to be valid, the bearing assembly inside the wheel has to actually exist.

20200502_162747

Bearing Assembly? What Bearing Assembly?

It’s a bit hard to make out, but inside the blue part of the wheel above, and behind the plastic triangle, there is supposed to be a nice little bearing assembly.  There is none.

20200502_163501

Better view of the roller

You can see the wear on the inner hub.  This is what in the trade is called “less than optimal.”

More seriously though, it unfortunately meant that this was not going to be a quick fix. I had been planning on ordering the parts, but this made it a bit more of a rush. The dryer contains four of these rollers and as such I ordered a four pack, since generally my assumption on items like this is that if one has worn, all four are worn. Now, none of the other three have shown nearly the damage, but I figure, while I’m in there, I might as well make it right.

What’s most interesting to me, is that there’s literally NO sign of the roller assembly in the dryer. However it got destroyed, it was pretty cataclysmic.

I also took the time to clean out the rest of the interior space and correctly deduced that the moisture sensor was covered with lint. Now that I know where it is, I can keep that clean in the future.

In any case, sometime later this week, I’ll get my package, swap out the rollers and reassemble the dryer and start doing laundry again. Quietly.

But, unlike the ETL I mentioned above, this change will have a direct, noticeable impact on my life, one I’ll be aware of every time I do a load of clothes. I like that.

This week’s takeaway? I do enjoy my job and the challenges that come with it, but there’s something to be said for doing work you can touch and feel and experience the tangible impact.

20200505_090057

My best sourdough yet!

And perhaps I shouldn’t be posting pictures of homemade bread after talking about dirty hands. Don’t worry, I washed my hands!