Archive for the ‘transactions’ tag
My editor Andy Oram recently sent me an ACM article on BASE, a technique for improving scalability by being willing to give up some other properties of traditional transactional systems.
It’s a really good read. In many ways it preaches the same religion as everyone who has successfully scaled a system Really Really Big. But this is different: it’s a very clear article, with a great writing style that cuts out the fat and teaches the principles without being specific to any environment or sounding egotistical.
He mentions a lot of current thinking in the field, including the CAP principle, which Robert Hodges of Continuent first turned me on to a couple of months ago. It has been proven formally, though I have not read the proof myself.
One of the most important concepts he advances is giving up the illusion of control. As programmers and DBAs, I think we tend to like control too much. Foreign keys are a perfect example. I think the point here is that constraints like these make you feel safe, but they don’t really make you safe. Just as with so many things in life, recognizing our inability to really control the systems we build is key to working with their strengths, instead of trying to bind them with iron bands.
Another great point is idempotency. This is a great way to help avoid problems with MySQL replication, by the way. I’ll leave the “why” as an exercise for the reader, but let me just point out that the file MySQL uses to remember its current position in replication is not synced to disk, so it will almost certainly get out of whack if MySQL dies ungracefully. (Google has solved this problem.)
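To make the idempotency point a little more concrete, here is a minimal sketch of what happens when the same change gets applied twice, which is roughly what a replica risks if it restarts from a stale position. It uses Python with an in-memory SQLite database standing in for MySQL, and the table names, the event ID, and both helper functions are made up for illustration. They are not taken from the article or from MySQL itself.

```python
import sqlite3


def make_db():
    """Create a throwaway database with one account holding 100 units."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    # Hypothetical bookkeeping table: one row per change already applied.
    conn.execute("CREATE TABLE applied_event (event_id TEXT PRIMARY KEY)")
    conn.execute("INSERT INTO account VALUES (1, 100)")
    conn.commit()
    return conn


def apply_relative(conn, amount):
    """Not idempotent: every replay adds the amount again."""
    with conn:
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = 1", (amount,)
        )


def apply_once(conn, event_id, amount):
    """Idempotent: the change is skipped if this event was already recorded."""
    with conn:  # record the event and apply it in one transaction
        cur = conn.execute(
            "INSERT OR IGNORE INTO applied_event VALUES (?)", (event_id,)
        )
        if cur.rowcount == 1:  # the row was new, so this is the first delivery
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = 1", (amount,)
            )


def balance(conn):
    return conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]


if __name__ == "__main__":
    replayed = make_db()
    apply_relative(replayed, 10)
    apply_relative(replayed, 10)  # simulated replay after a crash
    print("relative update applied twice:", balance(replayed))

    safe = make_db()
    apply_once(safe, "evt-1", 10)
    apply_once(safe, "evt-1", 10)  # same replay, now harmless
    print("idempotent update applied twice:", balance(safe))
```

The second version costs an extra table and an extra write per change, but replaying it any number of times leaves the data in the same state.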
A highly recommended read — worth more than most case studies about how specific companies have scaled their specific systems.
This release of MySQL Toolkit fixes some minor bugs, and adds major new functionality to MySQL Parallel Dump.
Big News: MySQL Parallel Dump
I wrote a lot more tests and cleaned up MySQL Parallel Dump a lot (fixing bugs with failed dumps not being reported, for instance), but the really big news is that I added chunking functionality to it. Now you can say:
mysql-parallel-dump --chunksize 100000
and it will try to divide each table into chunks with 100,000 rows each. It can do the chunks in parallel, so it can actually be running several dumps from one table at the same time. The chunking is fuzzy: it’s a hard problem, and I adapted (and improved) the code from MySQL Table Checksum to do it. If you can improve it, please contribute your fixes (the SourceForge project page has several ways for you to do that).
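To give a feel for why the chunking is fuzzy, here is a rough Python sketch of one way to split a table into ranges on an integer key. It is not the code from MySQL Parallel Dump or MySQL Table Checksum. The function name and its parameters are invented for illustration, and it assumes a non-NULL integer key whose values are spread fairly evenly between the minimum and maximum.

```python
import math


def chunk_where_clauses(min_id, max_id, est_rows, rows_per_chunk, col="id"):
    """Return WHERE clauses that together cover every row exactly once.

    est_rows is only an estimate (for example, from table statistics), so
    the real number of rows per chunk can differ from rows_per_chunk,
    especially when the key values have gaps or are clustered.
    """
    if est_rows <= rows_per_chunk or max_id <= min_id:
        return ["1=1"]  # small table: dump it in one piece

    n_chunks = math.ceil(est_rows / rows_per_chunk)
    step = max(1, math.ceil((max_id - min_id + 1) / n_chunks))
    bounds = list(range(min_id + step, max_id + 1, step))
    if not bounds:
        return ["1=1"]

    # The first and last chunks are open-ended so that rows outside the
    # sampled min/max (inserted after we looked) are still included.
    clauses = [f"{col} < {bounds[0]}"]
    for lo, hi in zip(bounds, bounds[1:]):
        clauses.append(f"{col} >= {lo} AND {col} < {hi}")
    clauses.append(f"{col} >= {bounds[-1]}")
    return clauses


if __name__ == "__main__":
    for clause in chunk_where_clauses(1, 1000, est_rows=1000, rows_per_chunk=250):
        print(clause)
```

Each clause could then be handed to a separate dump process. With a skewed key distribution some chunks would come out much larger than others, which is part of what makes the problem hard.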
You can also dump by size, which is probably more useful for most people. To do 10MB per chunk (approximately), use this command:
mysql-parallel-dump --chunksize 10M
This is a big deal not just because it lets you parallelize dumps from a single table, but because having the dump split up makes it easier to restore in small chunks, which, as readers have pointed out, is a big help on transactional storage engines.
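For the two forms of --chunksize shown above, here is a small Python sketch of how a size such as 10M might be turned into an approximate row count. This is an assumption for illustration, not a description of what the tool does internally: the function name is made up, the suffixes are taken to be powers of 1024, and the average row length is assumed to come from table statistics.

```python
import re

# Assumed meaning of the suffixes; only "M" appears in the examples above.
_UNITS = {"k": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def rows_per_chunk(chunksize, avg_row_length):
    """Translate a --chunksize style value into a target row count.

    A bare number is treated as a row count; a number with a suffix is
    treated as a size in bytes and divided by the average row length.
    """
    match = re.fullmatch(r"(\d+)([kMG]?)", chunksize)
    if match is None:
        raise ValueError(f"unrecognized chunk size: {chunksize!r}")
    number, suffix = int(match.group(1)), match.group(2)
    if not suffix:
        return number
    size_in_bytes = number * _UNITS[suffix]
    # Guard against a zero average row length on empty tables.
    return max(1, size_in_bytes // max(1, avg_row_length))


if __name__ == "__main__":
    print(rows_per_chunk("100000", avg_row_length=200))
    print(rows_per_chunk("10M", avg_row_length=100))
```

Because the average row length is itself an estimate, the resulting chunks would only be approximately the requested size.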
The parallel restore tool is in incubation. In the meantime, please put this tool through its paces. Clearly it’s not yet well-tested and I look forward to your bug reports!
Changelog for mysql-find:
2007-10-03: version 0.9.5
* The --dbregex parameter didn't work right.

Changelog for mysql-heartbeat:
2007-10-03: version 1.0.1
* --check hung forever.

Changelog for mysql-parallel-dump:
2007-10-03: version 0.9.6
* Arguments to external program weren't honored.
* System exit codes were lost, so errors weren't reported.
* Added chunking.
* Modularized and tested.
* Added documentation.
* Made --locktables negatable.
* Changed default output to be less verbose and added --verbose option.
* Added summary output.
This release is part of the unstable 1.5 branch. Its features will ultimately go into the stable 1.6 branch. You can download it from the innotop-devel package.
The major change is I’ve ripped out the W (Lock Waits) mode and enabled innotop to discover not only what a transaction is waiting for, but what it holds too. The new mode that replaces W is L (Locks). My last article goes into more detail on this.