Suppose you have a master-master replication setup, and you know one of the tables has the wrong data. How do you re-sync it with the other server?
Warning: don’t just use any tool for this job! You may destroy your good copy of the data.
If your table is large, you’ll probably want to use a tool that can smartly find the differences in a very large dataset, and fix only the rows that need to be fixed. There are several tools that are either able to do this, or claim to be able to do this. However, most of them are not replication-aware, and are likely to either break replication or destroy data.
To see why this is, let’s look at a typical scenario. You have server1 and server2 set up as co-masters. On server1, your copy of sakila.film has correct data. On server2, somehow you are missing a row in that table. A hypothetical sync tool will compare the two copies of the data and find the missing row, then insert it on server2. This INSERT statement will flow through replication to server1, where it will cause a duplicate key error and stop replication.
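To make that failure concrete, here is a toy Python sketch of the scenario. It is only a model under stated assumptions: each "server" is a plain dict keyed by primary key, replication is a list of logged inserts, and the `insert` helper and `DuplicateKeyError` class are hypothetical names, not part of MySQL or any sync tool. The two film titles are just sample rows.

```python
class DuplicateKeyError(Exception):
    """Stands in for MySQL's duplicate-key error (hypothetical class)."""


def insert(table, key, row):
    """INSERT-style write: fails if the primary key already exists."""
    if key in table:
        raise DuplicateKeyError(f"duplicate entry {key!r} for key PRIMARY")
    table[key] = row


server1 = {1: "ACADEMY DINOSAUR", 2: "ACE GOLDFINGER"}  # correct copy
server2 = {1: "ACADEMY DINOSAUR"}                       # row 2 is missing

# A naive sync tool finds the missing row and inserts it directly on server2.
missing = {k: v for k, v in server1.items() if k not in server2}
replication_log = []
for key, row in missing.items():
    insert(server2, key, row)
    # server2 is also a master, so its INSERT is logged for server1 to replay.
    replication_log.append((key, row))

# server1 replays the logged INSERT and collides with the row it already has.
replication_stopped = False
try:
    for key, row in replication_log:
        insert(server1, key, row)
except DuplicateKeyError as err:
    replication_stopped = True
    print("replication on server1 stopped:", err)
```

The data on both servers ends up identical, yet replication is broken, which is exactly the trap described above.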
You can probably think of many other scenarios with lots of bad side effects, so I won’t list any more. I’ll leave it at this: when you are synchronizing data on a slave (even if it is also a master), you must not change data on the slave. Changing data on the slave can cause so much trouble in so many different ways! The correct way to do this is to make the changes on the master, and let them flow through replication to the slave.
As far as I know, there is only one tool that is capable of doing this. It is mk-table-sync, which is part of Maatkit. However, even this tool will let you point the gun at your foot and pull the trigger, if you don’t use it correctly.
The correct way to sync a master-master setup with mk-table-sync is with the --synctomaster option, which tells it to make changes on the master:
mk-table-sync --synctomaster h=server2,D=sakila,t=film
Notice that I’m connecting to the slave, but instructing the tool to make its changes on the master. (Yes, it is able to find the master by inspecting the slave.)
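Why is changing the master safe? Here is a toy Python sketch of one way it can work. It assumes the fix is written on the master as a REPLACE-style statement that rewrites each differing row with the master's own current value. The dict-based "servers" and the `replace` helper are hypothetical and only illustrate the idea; they are not mk-table-sync's actual code.

```python
def replace(table, key, row):
    """REPLACE-style write: insert the row, overwriting any existing copy."""
    table[key] = row


master = {1: "ACADEMY DINOSAUR", 2: "ACE GOLDFINGER"}  # correct copy
slave = {1: "ACADEMY DINOSAUR"}                        # row 2 is missing
master_before = dict(master)

# Find rows that are missing or different on the slave.
diff = {k: v for k, v in master.items() if slave.get(k) != v}

# Rewrite those rows on the master with the values it already holds.
# For the master this changes nothing, but the statements are logged.
replication_log = []
for key, row in diff.items():
    replace(master, key, row)
    replication_log.append((key, row))

# The slave replays the log; a REPLACE cannot hit a duplicate-key error.
for key, row in replication_log:
    replace(slave, key, row)

print("in sync:", slave == master)
```

Rows that exist only on the slave would need a DELETE issued on the master instead; the sketch leaves that case out.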
If you do the following, you’ll probably cause problems:
mk-table-sync h=server1,D=sakila,t=film h=server2
I’ve just updated the documentation to point out the subtleties with master-master replication. However, keep in mind that this applies to more than master-master replication. Any replication configuration is best synchronized by making the changes on the master, and you should always avoid changing data on a slave, even to “fix” the slave. I might also add a feature to mk-table-sync to warn you when it detects that you are trying to change data on a slave.
Henceforth, I dub thee GLAMP
I’ve decided to start replacing L with GL in acronyms where L supposedly stands for Linux.
I’m not a big user of acronyms, because I think they are exclusionist and they obscure rather than reveal. (This wouldn’t matter if I wrote only for people who already knew what I meant and agreed with me, but that would be a waste of time.) However, LAMP is one that I’ve probably used a few times without thinking about the fact that it is supposed to stand for Linux, Apache, MySQL, and PHP/Perl/Python. In fact, it doesn’t refer to Linux, it refers to GNU/Linux. Therefore, it should be GLAMP.
Why does this matter? I try not to say Linux, unless I’m referring to a kernel, because a kernel is not an operating system. I try to be pretty careful about saying GNU/Linux when I’m talking about an operating system. An exception is a recruiting event yesterday at the University of Virginia, where I compromised my principles because of the noise. Trying to explain myself at that decibel level was just beyond my willingness, so I said we use Linux. If the potential recruits hire on with us, they’ll get to hear me say GNU/Linux. And if they don’t, maybe they’ll attend Richard Stallman’s upcoming talk at the engineering school there on March 27th or 28th (sorry, it’s not listed online, so I can’t link to it).
And you’ll see GNU/Linux used conscientiously if you read the book I’m helping to write, too.
GNU matters. A lot. You may not think so, but if it ceased to exist, you’d find out. That applies equally even if you don’t think you are a Free Software user. You have no idea how much you rely on Free Software in your daily life. And the GNU project has been and continues to be a keystone in that arch of freedom.
Thanks to MySQL’s Brian Aker for snapping me out of my LAMP carelessness.