Archive for the 'Perl' Category

You have the right to see code samples in an interview

Joel Spolsky writes about 12 steps to better code, and elsewhere about how candidates should write code in interviews.

The reverse conditions are true, too. If you’re a candidate, you should evaluate the employer against the 12 steps, and you should also see code samples. How else will you know what you’re getting into? You really have the right to do this, and you should exercise the right. If you don’t, you’ll get stuck in a crap job maintaining crap code. [dramatic voice] It happened to me.

In many companies, you can see code they’ve released as open-source. (The fact that they’ve done this says a lot about them.) But in others, you’re going to need to surprise someone and say “pick some code that’s not sensitive and show it to me.” Something simple, like the HTML for the search form on their website, or a utility to do some systems administration task. Any company is going to have a lot of code like this that they can show you.

The other approaches I see are to ask about it, assume, or ask the interviewer to write some code for you.

  1. Asking is a valid approach. If you see hesitation, or if someone says “well, it’s not as nice as we’d like, and we’re hoping you will offset that” run don’t walk, is my advice. If you’re reading this as you consider your first job out of college or something, I strongly suggest not getting a job with a company that wants you to improve the way they do things. You should be learning from them, not vice versa.
  2. You can also assume. “Oh, they use Perl? Nevermind.” That’s a stupid approach. Really. Is it acceptable to judge people’s character by the color of their skin? Then why would you judge their code by the language? In all seriousness, I have actually written very elegant, clean VBScript. And I mean, good-quality code by anyone’s standards. It’s hard in VBScript. It’s easy in Perl if you follow the Dog, which is a sign of great intelligence. Think about it this way: people who write beautiful Perl are people you should be eager to work with; they are rocket scientists. You will be the dumbest person in the room, and that should make you happy.
  3. I’ve never asked an interviewer to write code for me. Let me know how it works out for you.
Technorati Tags:, , ,

No related posts.

Improved Cacti monitoring templates for MySQL

Download MySQL Cacti templates

As promised, I’ve created some improved software for monitoring MySQL via Cacti. I began using the de facto MySQL Cacti templates a while ago, but found some things I needed to improve about them. As time passed, I rewrote everything from scratch. The resulting templates are much improved.

You can grab the templates by browsing the source repository on the project’s homepage.

In no particular order, here are some things I improved:

  • Standard polling interval and graph size by default.
  • Full captions on every graph; you don’t have to guess at how big the values are. Each graph has current, max, and average values printed at the bottom for every value on it.
  • Much more data is captured. I’ve graphed almost everything I could think of.
  • The graphs are grouped better. Most graphs have only related values. There are some exceptions, but not many.
  • The templates don’t hijack your existing installation. They don’t depend on or alter anything in your default Cacti installation.
  • The script that gathers the data is totally rewritten from scratch, and much improved. For example, the math works on 32-bit systems. It has caching built-in so each poll cycle results in just one request to the server, instead of one request per graph. (This is a weakness of Cacti I’m trying to work around). It also has debugging aids and other good coding stuff.
  • By default, it assumes you have the same username and password across every server you’re monitoring, so you don’t have to fill in a username and password for every single graph you create.
  • One data template == one graph template. This helps work around another Cacti limitation.
  • Lots more. Honestly I can’t really remember everything I’ve done. I’m sure you’ll help me remember by asking me how to get X feature working the way you want, and I’ll go “oh, yeah, that’s another thing I improved…”

Cacti templates are very laborious to create if they’re complex at all; it takes a long time and is very error-prone. Instead of doing it through Cacti’s web interface and exporting a huge XML file, I eliminated the redundancies and created a small, easy-to-maintain file from which I generate the XML template with a Perl script. This gives the added benefit of letting me (or you) generate templates with different parameters such as polling interval or graph size. The README file has the full details. However, I’ve pre-generated a set of templates that matches Cacti’s defaults, so you can probably just use that.

This has taken a lot of time. In particular, I spent a lot of time working on it at my former employer, The Rimm-Kaufman Group (kudos to them for letting me open-source the work) and I just spent most of my weekend writing the scripts to convert from the compact format to XML templates, so it’s possible to maintain these beasts. Plus I had to develop the compact format, too. This took a lot of time because I had to understand the Cacti data model, which is pretty complex.

Please enter issue reports for bugs, feature requests, etc at the Google project homepage, not in the comments of this blog post. I do not look through comments on my blog when I’m trying to remember what I should be working on for a software project.

If these templates help you and you feel like visiting my Amazon.com wishlist and sending something my way, I’d appreciate it!

PS: You may also be interested in Alexey Kovyrin’s list of templates for monitoring servers.

Technorati Tags:, , , , , ,

You might also like:

  1. What’s the best way to choose graph colors?
  2. A new home for innotop in the new year

Review of Perl Best Practices

In my opinion, every Perl programmer needs at least these two books: the Camel, and the Dog (Perl Best Practices).

The Camel teaches you how to program Perl: the syntax and so on. But the Dog teaches you how to program Perl sanely, by recommending that you use only a subset of Perl’s syntax and abilities.

The Camel tells you that there’s more than one way to do it, which is supposed to be a strength of Perl. The Dog tells you which way is “best.” And it tells you why the other ways are worth avoiding, and in which circumstances, and how badly you can get burned if you’re not careful.

The Dog also shows you a lot of ways to do things that you might not otherwise think about. In that sense, it’s not just stripping away. It’s also adding things that you might not get elsewhere. Example: you might not know about List::Util.

The common complaint about Perl being a write-only language doesn’t have to be true if you follow the Dog’s suggestions. Perl can be a very readable language — and I mean readable for ordinary humans, so don’t think I’m just another Perl programmer insisting that something is readable when it looks like static to you. The Dog encourages readable coding. In fact, in the absence of any other coding standard, if you try to “follow the Dog” you’ll be doing a good job. (There’s really no need to invent your own coding standard, in my opinion. Just follow the Dog.)

It’s not perfect. I don’t always agree with its suggestions. For example, I think the use of “magic comments” to automagically add interactive progress bars to command-line programs is just a bad idea. (Comments should never change the behavior of a program). However, the vast majority of its suggestions have come from long experience. If you don’t agree with one of them, it’s worth following it anyway until you consider yourself expert enough to disagree with authority.

I consider this an essential Perl book. No Perl programmer should be without it.

Technorati Tags:, , ,

You might also like:

  1. You have the right to see code samples in an interview
  2. How to Break Web Software

Maatkit version 1877 released

Download Maatkit

Maatkit contains essential command-line utilities for MySQL, such as a table checksum tool and query profiler. It provides missing features such as checking slaves for data consistency, with emphasis on quality and scriptability.

This release contains major bug fixes and new features. Some of the changes are not backwards-compatible. It also contains new tools to help you discover replication slaves and move them around the replication hierarchy.

Changelog for mk-archiver:

2008-03-16: version 1.0.8

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added --charset option (bug #1877548).
   * Changed short form of --analyze to -Z to avoid conflict with --charset.

Changelog for mk-deadlock-logger:

2008-03-16: version 1.0.9

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added 'A' part to DSNs (bug #1877548).

Changelog for mk-duplicate-key-checker:

2008-03-16: version 1.1.5

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added --charset option (bug #1877548).

Changelog for mk-find:

2008-03-16: version 0.9.10

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added --charset option (bug #1877548).

Changelog for mk-heartbeat:

2008-03-16: version 1.0.8

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added --charset option (bug #1877548).

Changelog for mk-parallel-dump:

2008-03-16: version 1.0.7

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added --charset option (bug #1877548).
   * A global database connection was re-used by children, causing a hang.

Changelog for mk-parallel-restore:

2008-03-16: version 1.0.6

   * Added --setvars option (bug #1904689, bug #1911371).
   * Changed --charset to be compatible with other tools (bug #1877548).

Changelog for mk-query-profiler:

2008-03-16: version 1.1.9

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added --charset option (bug #1877548).

Changelog for mk-show-grants:

2008-03-16: version 1.0.9

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added --charset option (bug #1877548).

Changelog for mk-slave-delay:

2008-03-16: version 1.0.6

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added 'A' part to DSNs (bug #1877548).

Changelog for mk-slave-find:

2008-03-16: version 1.0.0

   * Initial release.

Changelog for mk-slave-move:

2008-03-16: version 0.9.0

   * Initial release.

Changelog for mk-slave-prefetch:

2008-03-16: version 1.0.1

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added --charset option (bug #1877548).

Changelog for mk-slave-restart:

2008-03-16: version 1.0.6

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added --charset option (bug #1877548).
   * Added logic to repair tables, and rewrote a lot of code.
   * Added --always option, disabled by default.  Not backwards compatible.
   * --daemonize did not work.
   * --quiet caused an undefined variable error.

Changelog for mk-table-checksum:

2008-03-16: version 1.1.26

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added 'A' part to DSNs (bug #1877548).
   * Added --unique option to mk-checksum-filter.
   * The exit status from mk-checksum-filter was always 0.
   * mk-table-checksum now prefers to discover slaves via SHOW PROCESSLIST.

Changelog for mk-table-sync:

2008-03-16: version 1.0.6

   * --chunksize was not being converted to rowcount (bug #1902341).
   * Added --setvars option (bug #1904689, bug #1911371).
   * Deprecated the --utf8 option in favor of the A part in DSNs.
   * Mixed-case identifiers caused case-sensitivity issues (bug #1910276).
   * Prefer SHOW PROCESSLIST when looking for slaves of a server.

Changelog for mk-visual-explain:

2008-03-16: version 1.0.7

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added --charset option (bug #1877548).
Technorati Tags:, ,

You might also like:

  1. Maatkit version 1314 released
  2. Maatkit version 1709 released
  3. Maatkit version 1674 released
  4. Maatkit version 1753 released
  5. Maatkit version 1508 released

Maatkit version 1753 released

Download Maatkit

This release contains minor bug fixes and new features. Besides the little bug fixes, there’s a fun new feature in mk-heartbeat: it can auto-discover slaves recursively, and show the replication delay on all of them, to wit:

baron@keywest ~ $ mk-heartbeat --check --host master -D rkdb --recurse 10
master 0
slave1 1
slave2 1
slave3 4

(Not actual results. Your mileage may vary. Closed course, professional driver. Do not attempt).

Nothing else in this release is very exciting. I just wanted to get the bug fixes out there.

Changelog for mk-heartbeat:

2008-02-10: version 1.0.7

   * Added --recurse option to check slaves to any depth.
   * Made mk-heartbeat explicitly close DB connection when done.

Changelog for mk-parallel-dump:

2008-02-10: version 1.0.6

   * Added the --losslessfp option.
   * Fixed child process exit status on Solaris (bug #1886444).

Changelog for mk-parallel-restore:

2008-02-10: version 1.0.5

   * Fixed forking issues with File::Find on Solaris (bug #1887102).
   * Fixed child process exit status on Solaris (bug #1886444).
   * The --defaults-file option caused a mysql error (bug #1886866).

Changelog for mk-show-grants:

2008-02-10: version 1.0.8

   * Added --timestamp option.

Changelog for mk-table-checksum:

2008-02-10: version 1.1.25

   * The --lock option did not work correctly (bug #1884712).

Changelog for mk-table-sync:

2008-02-10: version 1.0.5

   * The Stream algorithm wasn't chosen when a table had no key.
   * Numeric strings beginning with 0 weren't quoted (bug #1883019).
Technorati Tags:, ,

You might also like:

  1. Maatkit version 1709 released
  2. Maatkit version 1674 released
  3. Maatkit version 1314 released
  4. Maatkit version 1877 released
  5. Maatkit version 1579 released

Maatkit version 1709 released

Download Maatkit

This release contains bug fixes and new features. It also contains a new tool: my implementation of Paul Tuckfield’s relay log pipelining idea. I have had quite a few responses to that blog post, and requests for the code. So I’m releasing it as part of Maatkit.

Changelog for mk-archiver:

2008-01-24: version 1.0.7

   * Added --quiet option.
   * Added --plugin option.  The plugin interface is not backwards compatible.
   * Added --bulkins option.
   * Added --bulkdel option.
   * Added --nodelete option.
   * Changed negatable --ascend option to --noascend.

Changelog for mk-parallel-dump:

2008-01-24: version 1.0.5

   * The fix for bug #1863949 added an invalid argument to gzip (bug #1866137)
   * --quiet caused a crash.

Changelog for mk-parallel-restore:

2008-01-24: version 1.0.4

   * The -D option was used as a default DB for the connection (bug #1870415).

Changelog for mk-slave-prefetch:

2008-01-24: version 1.0.0

   * Initial release.

Changelog for mk-table-sync:

2008-01-24: version 1.0.4

   * Made the --algorithm option case-insensitive (bug #1873152).
   * Fixed a quoting bug.
   * Made the UTF-8 options configurable.
Technorati Tags:, , , ,

You might also like:

  1. Maatkit version 1877 released
  2. Maatkit version 1674 released
  3. Maatkit version 1753 released
  4. Maatkit version 1579 released
  5. Maatkit version 1508 released

How pre-fetching relay logs speeds up MySQL replication slaves

I dashed off a hasty post about speeding up replication slaves, and gave no references or explanation. That’s what happens when I write quickly! This post explains what the heck I was talking about.

I first heard Paul Tuckfield talk at the first MySQL Camp, in November 2006. He mentioned that he speeds up MySQL replication by “pre-fetching relay logs” on the slave. Actually, I think he used the term “pipelining” at that point. Next Spring, he mentioned the same thing in his keynote address at the 2007 MySQL Conference & Expo. You can download audio and video of his talk from that link. He mentions this approach pretty late in the talk, almost at the end. I’ve been meaning to try duplicating his idea since the first time I heard him talk.

The basic idea is to help overcome replication’s single-threadedness. Under the right conditions, the slave’s SQL thread can become I/O-bound, even though the slave server has lots of unused I/O capacity. As a result, it spends a lot of time just waiting for the disk to return some data, and becomes much slower than it has to be.

Paul’s solution to this problem is to read the statements from the relay log, just a little bit ahead of the SQL thread’s position, convert them into SELECT queries, and execute them on the slave. This causes MySQL to request some of the data from the disk in advance. Then when the slave’s SQL thread wants to update that data, it’s already in memory, and things can potentially go much faster.

How much faster is open to debate. I think Paul sees about 3-4 times faster, but please don’t quote me on that. Farhan Mashraqui also uses this hack and gets some speedup as well.

The problem is, it won’t automatically work for everyone. In theory, it can potentially help if the following are true:

  • Your data is much bigger than memory.
  • You use a storage engine with row-level locking, like InnoDB.
  • Your workload is mostly small (single-row is good), widely scattered, random UPDATE and DELETE statements. (INSERT is less likely to benefit, because the relevant indexes are likely to be “hot” already).
  • The slave’s SQL thread is I/O-bound, but the slave has lots of spare I/O capacity. In other words, lots of disk spindles.

My slaves don’t do this kind of work. They do a lot of big multi-table updates and summary queries. There is very little to gain from pre-fetching the indexes and data for these statements, because whatever big query the SQL thread is running is likely to just flush the pre-fetched pages out of memory again before they’re needed. I tried anyway, and sure enough, it didn’t work.

The other problem is, it’s hard to write a generically useful program to do this kind of pre-fetching. It’s not too hard to write something specific to your workload, as Farhan did. But getting it to work right in general requires a lot of smarts, such as figuring out how far ahead of the slave SQL thread to stay, which queries not to try to pre-execute, and so on. I wrote an implementation of it that’s generic and has some intelligence built in. If you’re interested in it, see my previous post (linked at the top of this post).

If you’re thinking about writing something like this yourself, be prepared: it could be a lot of work. I can see how it would be simpler on some workloads, but on mine it was far from simple. I did some silly things, like running out of disk space because of temp files for LOAD DATA INFILE statements. Fortunately, that was just one of my benchmark machines.

If conditions aren’t right, it could really screw you. For example, if your slave has only a single disk, or if you use MyISAM on the slave, you’ll probably just cause problems for yourself. You need to know your hardware and your workload really well. That’s why Paul didn’t release his code, and I’ve hesitated for the same reason.

There’s more information about this in the upcoming High Performance MySQL, 2nd Edition, which I’m helping to write. We also have a lot more information about how to understand your hardware and workload. There’s no way I can fit it all into this post, and I don’t want to try. Even if I weren’t working like a mad dog on the book and had time to put it here, I can’t give away all the book’s goodies, can I? :-)

Technorati Tags:, , , ,

You might also like:

  1. Speed up your MySQL replication slaves
  2. Maatkit version 1709 released
  3. Introducing MySQL Slave Delay
  4. How MySQL replication got out of sync
  5. What are your favorite MySQL replication filtering rules?

Speed up your MySQL replication slaves

Paul Tuckfield of YouTube has spoken about how he sped up his slaves by pre-fetching the slave’s relay logs. I wrote an implementation of this, tried it on my workload, and it didn’t speed them up. (I didn’t expect it to; I don’t have the right workload). I had a few email exchanges with Paul and some other experts on the topic and we agreed my workload isn’t going to benefit from the pre-fetching.

In the meantime, I’ve got a pretty sophisticated implementation of Paul’s idea just sitting around, unused. I haven’t released it for the same reasons Paul didn’t release his: I’m afraid it might do more harm than good.

However, if you’d like the code, send me an email at [baron at this domain] and I’ll share the code with you. In return, I would like you to tell me about your hardware and your workload, and to do at least some rudimentary benchmarks to show whether it works or not on your workload. If I find that this is beneficial for some people, I may go ahead and release the code as part of Maatkit.

Update: it’s part of Maatkit now.

Technorati Tags:, , ,

You might also like:

  1. How pre-fetching relay logs speeds up MySQL replication slaves
  2. Maatkit version 1709 released
  3. Maatkit version 1314 released
  4. A very fast FNV hash function for MySQL
  5. Maatkit version 1579 released

Maatkit version 1674 released

Download Maatkit

Maatkit contains essential command-line utilities for MySQL, such as a table checksum tool and query profiler. It provides missing features such as checking slaves for data consistency, with emphasis on quality and scriptability.

This release contains bug fixes and new features.

Changelog for mk-archiver:

2008-01-05: version 1.0.6

   * Made suffixes for time options optional (bug #1858696).

Changelog for mk-deadlock-logger:

2008-01-05: version 1.0.8

   * Made suffixes for time options optional (bug #1858696).

Changelog for mk-heartbeat:

2008-01-05: version 1.0.6

   * Made suffixes for time options optional (bug #1858696).

Changelog for mk-parallel-dump:

2008-01-05: version 1.0.4

   * Second and later chunks had DROP/CREATE TABLE (bug #1863949).
   * Made suffixes for time options optional (bug #1858696).
   * --locktables didn't disable --flushlock.

Changelog for mk-parallel-restore:

2008-01-05: version 1.0.3

   * Made suffixes for time options optional (bug #1858696).
   * --ignoretables was ignored.

Changelog for mk-slave-delay:

2008-01-05: version 1.0.5

   * Made suffixes for time options optional (bug #1858696).
   * The program was ignoring some connection parameters.
   * Made the program use master when the I/O thread waits for relay log space.

Changelog for mk-slave-restart:

2008-01-05: version 1.0.5

   * Made suffixes for time options optional (bug #1858696).
   * Added logic to discard corrupt relay logs.
   * Added --monitor, --sentinel, and --stop.
   * Added --quiet and changed --verbose to 1 by default.
   * Added the ability to monitor many servers with --recurse.

Changelog for mk-table-checksum:

2008-01-05: version 1.1.24

   * Added support for the FNV_64 UDF, which is distributed with Maatkit.
   * --emptyrepltbl didn't Do The Right Thing by default.
   * --explain didn't disable --emptyrepltbl
   * Made suffixes for time options optional (bug #1858696).
   * The --float-precision option was ignored.
   * (mk-checksum-filter) -i, -d options worked only on multiple files.

Changelog for mk-table-sync:

2008-01-05: version 1.0.3

   * Added the --function command-line option.
   * Added support for the FNV_64 hash function (see mk-table-checksum).
   * Made suffixes for time options optional (bug #1858696).
   * InnoDB tables use --transaction unless it's explicitly specified.
Technorati Tags:, , , , ,

You might also like:

  1. Maatkit version 1508 released
  2. Maatkit version 1877 released
  3. Maatkit version 1709 released
  4. Maatkit version 1314 released
  5. Maatkit version 1579 released

What is new in Maatkit

My posts lately have been mostly progress reports and release notices. That’s because we’re in the home stretch on the book, and I don’t have much spare time. However, a lot has also been changing with Maatkit, and I wanted to take some time to write about it properly. I’ll just write about each tool in no particular order.

Overall

I’ve been fixing a fair number of bugs, most of which have been in the code for a while. Every bug I fix these days gets a test case to guard against regressions. I’ve integrated the tests into the Makefile, so there’s no way for me to forget to run them.

The test suite has hundreds of tests, which is probably pretty good in comparison to many projects of this type. However, there will probably never be enough tests. I’ve moved much (in some cases, almost all) of the code into modules, which are easy to test, but it’s always a little harder to test programs themselves, so some things aren’t tested. (For example, it’s tedious to set up a test case that requires many MySQL instances to be running in a multi-tier replication setup).

Still, I think the quality has increased a lot in the last 6 months or so, since I’ve been more disciplined about tests. That discipline, by the way, was forced on me. The mk-table-sync tool was completely unmanageable. I was able to rewrite that tool in December, almost entirely using modularized, tested code.

mk-heartbeat

Jeremy Cole and Six Apart originally contributed this tool. Since then I’ve added a lot more features, allowed a lot more control over how it works, and it even works on PostgreSQL now. As an example, I added features that make it easy to run every hour from a crontab. It daemonizes, runs in the background, and then quits automatically when the new instance starts. I use it in production to give me a reliable metric for how up-to-date a slave is. When I need to know absolutely “has this slave received this update,” Seconds_behind_master won’t do, for many reasons. Load balancing and lots of other things hinge on up-to-date slaves.

mk-parallel-dump

I think this tool is probably the fastest, smartest way to do backups in tab-delimited format. I’ve been fixing a lot of bugs in this one, mostly for non-tab-delimited dumps. It has turned out to be harder to write this code because it uses shell commands to call mysqldump. (The tab-delimited dumps are done entirely via SQL, which is why it’s so good at what it does).

mk-slave-restart

I’ve been having a lot of trouble with relay log corruption, so unfortunately this tool has become necessary to use regularly in production. As a result I made it quite a bit smarter. It can detect relay log corruption, and instead of the usual skip-one-and-continue, it issues a CHANGE MASTER TO, so the slave will discard and re-fetch its relay logs. I’ve also made it capable of monitoring many slaves at once. (It discovers slaves via either SHOW SLAVE HOSTS or SHOW PROCESSLIST, so if you point it at a master, it can watch all the master’s slaves with a single command).

mk-table-checksum

I’ve made a lot of changes to this tool recently. Smarter chunking code to divide your tables into bits that are easier for the server to work with, TONS of small improvements and fixes, and much friendlier behavior.

The most recent release also includes a big speed improvement. Most of the time this tool spends is waiting for MySQL to run checksum queries. While my pure-SQL checksum queries are faster than most (all?) other ways to compare data in different servers, I’ve recently been trying to reduce the amount of work they cause.

As a result, I investigated Google’s MySQL patches. Mark Callaghan mentioned to me that he’d added a checksum function into their version of the server, and I wanted to look at that. They’re using the FNV hash function to checksum data. I decided that a UDF would be a fine way to write a faster row-checksum function, so I wrote a 64-bit FNV hash UDF. While I’m not the first person to do that, my version accepts any number of arguments, not just one. This makes it a lot more efficient to checksum every column in a row, because you don’t have to a) make multiple calls to the hash function or b) concatenate the arguments so you can make a single call. I also copied Google’s logic to make it simpler and more efficient to checksum NULLs, which avoids still more function calls. The UDF returns a 64-bit number, which can be fed directly to BIT_XOR to crush an entire table (or group of rows) into a single order-independent checksum. And finally, FNV is also a lot faster than, say, MD5 or SHA1.

The results are quite a bit faster for my hardware: 12.7 seconds instead of 80 seconds on a CPU-bound workload. So that’s at least a 6.2x speedup. (80 seconds was the best I was able to achieve before. Some of the checksum techniques used up to 197 seconds on the same data).

The UDF is really simple to compile and install, does no memory allocations or other nasty things, and should be safe for you to use. The source is included with the latest Maatkit release. (Older Maatkit versions won’t be able to take full advantage of it, by the way, but they can still be sped up somewhat). However, I would really appreciate some review from more experienced coders. I’m no C++ wizard. In fact, my first attempts at writing this thing were so blockheaded and wrong, I was almost embarrassed. (Thanks are due to the fine people hanging out on #mysql-dev).

mk-table-sync

After my week-long coding marathon on this in December, I’ve needed to continue working on this. I’ve needed it quite a few times to solve problems with replication. (Did I mention relay log corruption?). It’s much faster and less buggy now, and as a bonus, the latest release can also take advantage of the FNV UDF I just mentioned.

I think I should explain the general evolution in this tool’s life. It started out as “how to find differences in data efficiently.” This was a period where I did a lot of deep thinking on exploiting the structures inherent in data. It then progressed to “how to sync data efficiently.” At this point I was able to outperform another data-syncing tool by a wide margin, even though it was a multi-threaded C++ program and mine was just a Perl script. I did that by writing efficient queries and moving very little data across the network.

The most recent incarnation has thrown performance out the window, at least as measured by those criteria. The aforementioned C++ program now outperforms mine by a wide margin on the same tests.

What changed?

Two things: I’m focusing on quality, and I’m focusing on syncing running servers correctly with minimal interruption.

Once I have good-quality, well-tested code, I’ll be able to speed it up. I know this because I’m currently doing some things I know are slower than they could be.

But much more importantly, I’ve changed the whole angle of the tool. I want to be able to synchronize a busy master and slave, without locking tables, automatically ensuring that the data stays consistent and there are no race conditions. I do this with a lot of special tricks, such as syncing tables in small bits, using SELECT FOR UPDATE to lock only the rows I’m syncing, and so on. And I’m actively working to make the tool Do The Right Thing without needing 99 command-line arguments. (I think the latest release does this very well).

Instead of “make the sync use as little network traffic as possible,” I’ve changed the criteria of good-ness to “do it right, do it once, and don’t get in the way.”

As a result, I can sync a table that gets a ton of updates — one of the “hottest” tables in my application — without interfering with my application. Online. Correctly. In one pass. Through replication. Show me another tool that can do that, and I’ll re-run my benchmarks :-)

This doesn’t mean I don’t care about performance. I do, and I’ll bring back the earlier “go easy on the network” sync algorithms at some point. They are very useful when you have a slow network, or your tables aren’t being updated and you just want to sync things fast. I’ll also be able to speed up the “don’t interfere with the application” algorithms.

One interesting thing I did was divide up the functionality so the tool can use many different sync algorithms. I created something like a storage-engine API, except it’s a sync API. It’s really easy to add in new sync algorithms now. All I have to do is write the code that algorithm needs. This is really only about 200-300 lines of code for the current algorithms.

Tools that don’t yet exist

What I haven’t told you about is a lot of unreleased code and new tools. There’s some good stuff in the works. Also stay tuned — a third party might be about to contribute another tool to Maatkit, which will also be a very neat addition.

Conclusion

As Dana Carvey says, “If I had more time… the programs we have in place are getting the job done, so let’s stay on course, a thousand points of light. Well, unfortunately, I guess my time is up.” Maatkit is getting better all the time, just wait and see.

Technorati Tags:, , , , , , , , , ,

You might also like:

  1. How to sync tables in master-master MySQL replication
  2. Maatkit version 1417 released
  3. Progress on Maatkit bounty, part 3
  4. Maatkit version 1674 released
  5. Maatkit bounty begins tomorrow