Failure scenarios and solutions in master-master replication
Posted in Databases on Aug 30, 2009
I’ve been thinking recently about the failure scenarios of MySQL replication clusters, such as master-master pairs or master-master-with-slaves. There are a few tools that are designed to help manage failover and load balancing in such clusters, by moving virtual IP addresses around. The ones I’m familiar with don’t always do the right thing when an irregularity is detected. I’ve been debating what the best way to do replication clustering with automatic failover really is.
I’d like to hear your thoughts on the following question: what types of scenarios require what kind of response from such a tool?
I can think of a number of failures. Let me give just a few simple examples in a master-master pair:
I don’t want to bias the jury, so I’ll stop there and ask you to contribute your failure scenarios and what you think the correct action should be.