Saturday, January 29, 2011

How to recover from failed Mysql schema update, with replication?

I have two MySQL servers configured with master-slave replication.

Before we deploy a new application version we:

1) STOP SLAVE

2) Take a MySQL dump of the slave.

However, if a mistake is done during the deployment of the new schema version (a table is dropped by mistake, for example), having the slave intact doesn't help. Our service is write-intensive, so we can't turn it back up until we have a master working.

If we now load the mysql dump back into the master, it will take a long time during which our service remains down.

What is the best-practice to recover from such a mistake? How can I setup the system so I can easily promote the slave, turn on our service and only then tend to the broken database? Mainly, I'm worried with re-syncing the slave and the master after changes are done on the slave.

  • The method you're using is certainly the safest but there are ways you can cheat to make the process more transparent.

    • Keep the synchronisation turned off while you perform the next few steps.
    • Make the slave the master (easier if you have master/master replication instead of master/slave) and switch your application over to it.
    • Restore the old master from the dump file.
    • Re-enable synchronisation.
    • Ensure the the old master is synced from the former slave.
    • Switch the application back if so desired.
    OmerGertel : I had the hunch the answer will be something like this. I need the log-position of the slave, before I turn the application back on, so I can correctly configure the old-master as a slave on the new-master. Anything other gotchas I need to be careful about?
    John Gardeniers : Nothing I can think of, although it's been a while since I've been through those steps, so don't take my word for it. ;)

0 comments:

Post a Comment