This is How We Handle Problems
I had a production issue tonight. Still do, actually. I’ve admitted to it, and here’s the email I’m sending to my management.
At 9:00PM I took a backup of database_a and database_b prior to running the database migration scripts. Once the backups were finished, I began the migration process at approximately 9:20.
I stopped the migration process at 10:15 after multiple failures and restarts. There are too many unknown cross-dependencies to go on with the roll forward. At this time I called SUPPORT PERSON and explained the situation. I also called MANAGEMENT and left a voice mail. I then began the process of restoring the production databases on SERVER_A.
No changes were made to SERVER_FIGHTING_MONGOOSE or SERVER_C.
Once I had restored database_b and database_a on SERVER_A, I began seeing multiple failures from replication and the rest of SQL Server indicating severe problems with the physical disk structure. I immediately stopped all replication involving database_a and database_b on SERVER_A and began an integrity check of both databases using SQL Server’s built-in tool, DBCC CHECKDB. The CHECKDB for database_b finished at 11:15 with a clean bill of health. The CHECKDB for database_a is still running as of 11:31PM.
Once the CHECKDB process for database_a is complete, I will begin re-initializing the subscription for database_a on SERVER_A. Following the successful completion of the database_a re-initialization, I will begin re-initializing the subscription to database_b.
If you have any questions, feel free to contact me at 867-5309.
See what I did there?
- I stated how we got into this mess – I dropped a running chainsaw into the SAN.
- I outlined my decision making process and took ownership of rolling back our production migration.
- I described the situation after the migration was rolled back and provided an assessment based on what I had observed.
- I outlined a course of action to mitigate our problems and restore our production database to an operational state as soon as possible.
Am I proud? Not really. I like it when things work. Am I tired and cranky? Yes.
Will I get this fixed before I go to bed? Hell yeah.
Is this something that I, in a sick way, live for? Only because it reminds me to keep studying and to stay on my toes.
This is Jeremiah
I live in Portland, OR. I have two dogs.
I recently received a Master of Science in Computer Science from Portland State University.
I was a Microsoft MVP from 2009 to 2018, with a pile of certifications. Somewhere along the way, I wrote a database client for Riak and then handed it off to the community.