Introducing MySQL Table Checksum
Posted in Databases on Feb 26, 2007
MySQL Table Checksum is a tool to efficiently verify the contents of any MySQL table in any storage engine. You can use it to compare tables across many servers at once. The output is friendly and easy to use, both by eyeball and in UNIX command-line scripts. The provided MySQL Checksum Filter helps you winnow output so you only see tables that have problems.
Following up on my earlier article about how to calculate a table checksum in MySQL, I’ve integrated that methodology, with improvements suggested by the commenters and others, into a single easy-to-use tool. It is distributed as part of the MySQL Toolkit, available from SourceForge.net.
The tool takes server-side checksums using user-variables, so it is very efficient. It can checksum tables on many servers at once, running in parallel for speed. It has options to help you guarantee your tables are in the same state on your master and replica servers, and you can even checksum only some rows. These features can help you verify replication without locking tables or taking servers offline.
Here’s some sample output, in this case generated by comparing my server against itself:
DATABASE TABLE HOST ENGINE COUNT CHECKSUM TIME WAIT STAT LAG test chapters localhost MyISAM 21 218345624 0 0 NULL NULL test chapters 127.0.0.1 MyISAM 21 218345624 0 0 NULL NULL test foo localhost InnoDB 1 f14825835a0c07091c7b6a28c8a9f7120667815d 0 0 NULL NULL test foo 127.0.0.1 InnoDB 1 f14825835a0c07091c7b6a28c8a9f7120667815d 0 0 NULL NULL test samples 127.0.0.1 MyISAM 7 2103838486 0 0 NULL NULL test samples localhost MyISAM 7 2103838486 0 0 NULL NULL
For efficiency reasons and to be as general-purpose as possible, the checksum tool itself doesn’t process its output, and in fact doesn’t even output in sorted order. However, the output is specifically designed to be easy to parse and manipulate with standard command-line tools like
It’s not in my nature to make you do that work yourself, so I included a tool that will do it for you. It sorts input and makes sure the checksums and row counts for a given table match on all servers. You can either pipe the checksums directly into it, or give it a list of files to process (handy when you need to run the checksum in different places, pipe their outputs to files, and then process the files).
If you use it to filter the output I showed above, you’ll see nothing by default, because the tables have identical checksums – thus there’s nothing to see.
MySQL Toolkit is a new project I started on SourceForge to contain many of the MySQL utilities I’ve written and am writing (yes, there are more goodies in progress). Eventually these and other tools will all be bundled together so you can get them in one package.
I like making you happy. Make me happy in return: donate.