What if you see that something needs fixing, but no one listens to what you have to say?
It seems unreal, but I’ve been in situations where I had the hardest time convincing managers, colleagues and even third parties that there was something seriously wrong.
I’m going to share my experience with all three parties and how I made sure I was always in the clear when something went wrong.
This is not an article about how to spin the wheel of blame, because that’s never a healthy situation. This article will give you tips and tricks on how to get things done and how to avoid becoming the victim of a situation where you get the blame.
These are real situations I have been in, so there is no fiction in the stories that follow.
I started fresh at a new company as a DBA, and I was the only DBA, a.k.a. the Lone DBA. The former DBA had left (I never got to know the reason) and I had little to no documentation.
The first thing I did was check the backups and collect information from all the instances.
I noticed that all the backups were full backups. The backups ran for hours and there was a lot of contention with other processes that wanted to process their data.
It turned out that all the databases were in simple recovery mode. Normally that’s not a good sign, but there could be a good explanation.
I sent an e-mail to my colleagues in the IT department asking if anyone knew why this was set up the way it was. Of course nobody knew and it was a dead end. There was no documentation, and I was already happy that the backups had run for the last few weeks.
I was the only DBA and therefore responsible for the situation. Unfortunately, because of the cost of disk space and the impact on other processes, I wasn’t able to implement my solution without the consent of my manager, who was also the change manager at the time.
I went to my manager and had the following conversation:
Me: “Are you aware of the fact that all our databases are in simple recovery mode and that we make full backups of all our databases, which take a considerable amount of time and make other processes run longer than needed?”
Manager: “Yes! I know why we did that. The transaction log backups were too difficult to recover so we make full backups all the time. It’s easier right?!”
Me: “You’re also aware that due to the fact that we run in simple recovery mode we have no way of recovering to a point in time and in the case of data loss can only return to the full backup?”
Manager: “Yes I’m aware of that but it doesn’t matter because we have calculated that it doesn’t matter if we lose 24 hours of data because we’ll just redo all the work that’s lost.”
Me: “That’s ridiculous because it’s not that difficult to implement a backup strategy that could avoid that situation. Why wouldn’t we want to do that?”
Manager: “Because we don’t need it and why put in the extra effort if nobody in the company cares.”
Me: “We’re clearly not on the same page and I think you underestimate the situation.”
I stopped the meeting and walked out of the room to think of a plan to make sure this wouldn’t come back to me when all hell broke loose.
So let’s evaluate what went wrong:
- Nobody in the company knows why things were set up the way they are.
- The company has no idea what the impact of a data disaster could be.
- Nobody cared about the fact that such a situation could actually happen.
- I couldn’t convince my manager at that moment with good arguments.
First of all, let’s be blunt: if shit hits the fan, you as the DBA are ALWAYS, read it again, ALWAYS, responsible for recovering the data, even if you’re not responsible for the situation at hand. Management will not care that you mentioned all this months ago. They want everything fixed yesterday. It could even backfire (I had that situation) because you could be blamed for not taking responsibility.
So how could we act on the points in the evaluation?
Nobody in the company knows why things were set up the way they are
Start documenting the backups, the schedules, the databases and the servers. If you don’t have something like that already, read my article on how to document all your servers in minutes.
Also document the architecture of the different applications and their relationships with each other: which interfaces are running between the systems, and so on.
Are there any financial applications that rely on interfaces to the main database, for instance? What processes are running during the day that could be impacted?
Make a diagram of the connected applications/processes that depend on the database(s). Most people understand things better when they’re presented visually.
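A minimal sketch of how such a dependency map could be kept in a script, assuming you maintain the inventory by hand. The system names (`main_db`, `finance_app` and so on) and the helper functions are hypothetical and only there for illustration. The script lists everything that is affected when one system goes down and writes the map as Graphviz DOT text, which you can then render into a diagram.

```python
from collections import deque

# Hypothetical inventory: each system maps to the systems that read from it.
DEPENDENTS = {
    "main_db": ["finance_app", "reporting"],
    "finance_app": ["invoice_export"],
    "reporting": [],
    "invoice_export": [],
}


def impacted_by(source, dependents):
    """Return every system that is directly or indirectly affected when `source` is down."""
    seen = []
    queue = deque(dependents.get(source, []))
    while queue:
        name = queue.popleft()
        if name in seen:
            continue  # already visited, also protects against circular dependencies
        seen.append(name)
        queue.extend(dependents.get(name, []))
    return seen


def to_dot(dependents):
    """Render the map as Graphviz DOT text, one edge per dependency."""
    lines = ["digraph dependencies {"]
    for source in sorted(dependents):
        for target in dependents[source]:
            lines.append(f'  "{source}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(impacted_by("main_db", DEPENDENTS))
    print(to_dot(DEPENDENTS))
```

Keeping the map as plain data means the diagram and the impact list always come from the same source, so they can’t drift apart.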
Try to make sense of the current situation and make sure you have everything in writing. If it’s not documented you can’t prove that something is wrong.
I know this all sounds like a lot of work, but if nobody knows, you should. In the end this will save you loads of time and let you become the person that took the responsibility.
The company has no idea what the impact of a data disaster would be
This is where documentation is important to get the facts straight. I’ve seen a lot of companies underestimate how real the problem is. Like I said before, if nobody knows, make sure you do.
If there is a disaster recovery plan, see if it’s still up to date and, if not, make it so. Based on that information, try to estimate how long it would take to recover all the dependent databases/processes when the main database is down.
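The estimate itself can be as simple as adding up restore times. Here is a rough sketch, assuming one person restores the systems one after another. The system names and the minutes are made up, so replace them with timings from your own restore tests.

```python
# Hypothetical restore estimates in minutes; replace with timings from real restore tests.
RESTORE_MINUTES = {"main_db": 180, "finance_app": 45, "reporting": 30}


def estimate_recovery_timeline(restore_minutes, order):
    """Return (system, minutes elapsed when it is back) for a strictly sequential restore."""
    elapsed = 0
    timeline = []
    for name in order:
        elapsed += restore_minutes[name]
        timeline.append((name, elapsed))
    return timeline


if __name__ == "__main__":
    # The main database goes first because the other systems depend on it.
    for name, minutes in estimate_recovery_timeline(
        RESTORE_MINUTES, ["main_db", "finance_app", "reporting"]
    ):
        print(f"{name}: back after {minutes} minutes")
```

The running total matters more than the individual numbers: the last system in the list is the one whose users wait the longest.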
Make sure you know how long it takes to get everything back in order and make sure you have a procedure ready when it does happen. This not only shows you’re proactive in your work but that you can also act when needed.
And one thing you should do is test your DR plan. Your plan is worthless if it doesn’t work. Test it periodically to see if it’s still up to date.
Nobody cared about the fact that such a situation could actually happen
One thing I would do is manage expectations. I want everyone in the company to be on the same page about what happens in the case of an emergency.
The manager in this situation thought the loss of one day of data was good enough for the other departments. These decisions were made years ago, and the entire landscape had changed since then but the DR plan hadn’t.
I asked the managers of several departments the same questions I had asked my manager, and their reaction was a little different. Several managers explained that they would be in a lot of trouble if the application was offline for even half a day, and others if it was offline for even a few hours.
Because this was not going to be a healthy situation, I called a meeting with all the managers. In this meeting I would explain the situation using the documentation (like the diagrams) and come up with a plan to get the DR plan up to date.
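To bring the gap into such a meeting, you can put the answers from the departments next to your own recovery estimate. A small sketch, assuming you noted down the tolerated downtime per department in hours. The department names and numbers are hypothetical.

```python
# Hypothetical answers from department managers: tolerated downtime in hours.
TOLERATED_DOWNTIME_HOURS = {"finance": 4, "sales": 12, "warehouse": 2}


def find_gaps(tolerated_hours, estimated_recovery_hours):
    """Return the departments that tolerate less downtime than the estimated recovery takes."""
    return sorted(
        dept
        for dept, hours in tolerated_hours.items()
        if hours < estimated_recovery_hours
    )


if __name__ == "__main__":
    # Assumed estimate: a full recovery currently takes 6 hours.
    print(find_gaps(TOLERATED_DOWNTIME_HOURS, 6))
```

Every department in that list is an argument you can put in writing.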
I did everything but still they won’t listen
If you did everything to convince people and they still don’t want to set things up the way you would like, either because of costs or for other reasons, I would protect myself.
Make sure all the decisions that were made are in writing, the good and the less good decisions. I would send an e-mail to my manager with the decisions and explain the consequences. After that I would ask my manager to acknowledge the e-mail.
You don’t want a decision that was outside of your control to come back to haunt you. I’ve been in such a situation and you don’t want to end up there.
It all starts with taking responsibility. If you don’t take responsibility for the data as a DBA I suggest you go look for another kind of job. After that it’s important to get the facts straight. You can’t build a solution based on assumptions. One of the quotes I use: “Assume” makes an “Ass” out of “U” and “Me”, AssUMe.
It’s very important to feel comfortable in your work environment and you should do everything to make sure you go to work with a good feeling. You spend more time at work than you would spend at home (remote workers excluded 😉 ).