Also known as “things have changed”
For one my of clients I monitor and maintain some of the jobs that run on their various servers. One of them had started to fail about two weeks ago. The goal of the job was basically to download a file from one server, transfer it to another and upload it. Easy-peasy. However, sometimes the job fails because there’s no file to transfer (which really shouldn’t be a failure, but just a warning). So, despite the fact that it had failed multiple days in a row, I hadn’t looked at it. And of course no one was complaining (though that’s not always a good reason to ignore a job failure!)
So yesterday I took a look and realized the error message was in fact incorrect. It wasn’t failing because of a lack of a new file, but because it could no longer log into the primary server. A quick test showed the password had been changed. This didn’t really surprise me as this client is going through and updating a number of accounts and passwords. This was simply obviously one and we had missed this one. (Yes, this is where better documentation would obviously be a good idea. We’re working on that.)
So, I figured the fix would be easy, simply email the right person, get the new password and update the process. I also was taking the time to update the script to that the password would be encrypted moving forward, right now it’s in plain text and to give the correct error in the event of login in failure.
Well, the person who should have the password wasn’t even aware of this process. As we exchanged emails, and the lead developer chimed in, the conclusion was that this process probably shouldn’t be using this account, and that perhaps even then, this process may no longer be necessary.
So, now my job is to track down the person who did or does rely on this process, find out if they still are and then finish updating the password. Of course if they’re not, we’ll stop this process. In some ways that’s preferable since it’s one less place to worry about a password and one less place to maintain.
Now, the above details are somewhat specific to this particular job, but, I’m sure all of us have found a job running on a server someplace and wondered, “What is this doing?” Sometimes we find out it’s still important. Sometimes we discover that it’s no longer necessary. In a perfect world, our documentation would always be up to date and our procedures would be such that we immediately remove unnecessary jobs.
But the real world is far messier unfortunately.
(and since the full photo got cropped in the header, here it is again)
And as a reminder, if you enjoy my posts, please make sure to subscribe.