Improved Cacti monitoring templates for MySQL

Download MySQL Cacti templates

As promised, I’ve created some improved software for monitoring MySQL via Cacti. I began using the de facto MySQL Cacti templates a while ago, but found some things I needed to improve about them. As time passed, I rewrote everything from scratch. The resulting templates are much improved.

You can grab the templates by browsing the source repository on the project’s homepage.

In no particular order, here are some things I improved:

  • Standard polling interval and graph size by default.
  • Full captions on every graph; you don’t have to guess at how big the values are. Each graph has current, max, and average values printed at the bottom for every value on it.
  • Much more data is captured. I’ve graphed almost everything I could think of.
  • The graphs are grouped better. Most graphs have only related values. There are some exceptions, but not many.
  • The templates don’t hijack your existing installation. They don’t depend on or alter anything in your default Cacti installation.
  • The script that gathers the data is totally rewritten from scratch, and much improved. For example, the math works on 32-bit systems. It has caching built-in so each poll cycle results in just one request to the server, instead of one request per graph. (This is a weakness of Cacti I’m trying to work around). It also has debugging aids and other good coding stuff.
  • By default, it assumes you have the same username and password across every server you’re monitoring, so you don’t have to fill in a username and password for every single graph you create.
  • One data template == one graph template. This helps work around another Cacti limitation.
  • Lots more. Honestly I can’t really remember everything I’ve done. I’m sure you’ll help me remember by asking me how to get X feature working the way you want, and I’ll go “oh, yeah, that’s another thing I improved…”
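To make the caching idea from the list above concrete, here's a minimal sketch in Python. It is not the actual polling script: the function name `cached_stats`, the cache file naming, and the `max_age` default are all assumptions made up for illustration. The point is only that the first graph polled in a cycle pays for the request to the server, and every other graph in that cycle reads the cached result.

```python
import json
import os
import tempfile
import time


def cached_stats(host, fetch, cache_dir, max_age=240):
    """Return stats for `host`, calling `fetch(host)` only when the cache is stale.

    `fetch` is any callable that returns a JSON-serializable dict (hypothetical).
    """
    path = os.path.join(cache_dir, f"{host}-mysql-stats.json")
    try:
        if time.time() - os.path.getmtime(path) < max_age:
            with open(path) as fh:
                return json.load(fh)
    except (OSError, ValueError):
        pass  # missing or corrupt cache file: fall through and re-fetch

    stats = fetch(host)
    # Write to a temp file and rename it, so a concurrent reader never
    # sees a half-written cache file.
    fd, tmp = tempfile.mkstemp(dir=cache_dir)
    with os.fdopen(fd, "w") as fh:
        json.dump(stats, fh)
    os.replace(tmp, path)
    return stats
```

The cache lifetime would need to be a bit shorter than the polling interval, otherwise a poll cycle could be served stale data from the previous one.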

Cacti templates are very laborious to create if they’re complex at all; it takes a long time and is very error-prone. Instead of doing it through Cacti’s web interface and exporting a huge XML file, I eliminated the redundancies and created a small, easy-to-maintain file from which I generate the XML template with a Perl script. This gives the added benefit of letting me (or you) generate templates with different parameters such as polling interval or graph size. The README file has the full details. However, I’ve pre-generated a set of templates that matches Cacti’s defaults, so you can probably just use that.
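As a rough illustration of the generate-from-a-compact-file approach, here's a sketch in Python (the real generator is a Perl script, and the README describes the real format). Everything here is hypothetical: the `GRAPHS` structure, the element and attribute names, and the default values are invented for the example and are not Cacti's actual XML schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical compact definition: one entry per graph, listing its data items.
GRAPHS = {
    "InnoDB Buffer Pool": ["pool_size", "database_pages", "free_pages"],
    "MySQL Connections": ["max_connections", "threads_connected"],
}


def build_templates(graphs, poll_interval=300, width=500, height=120):
    """Expand the compact definition into a (made-up) XML template document."""
    root = ET.Element("templates", poll_interval=str(poll_interval))
    for title, items in graphs.items():
        # The shared parameters are stamped onto every graph here, so they
        # never have to be repeated in the compact definition.
        graph = ET.SubElement(root, "graph", title=title,
                              width=str(width), height=str(height))
        for name in items:
            ET.SubElement(graph, "item", name=name)
    return ET.tostring(root, encoding="unicode")
```

Calling `build_templates(GRAPHS, poll_interval=60)` would regenerate the whole set for a one-minute polling interval without touching the definition itself, which is the benefit described above.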

This has taken a lot of time. I did much of the work at my former employer, The Rimm-Kaufman Group (kudos to them for letting me open-source it), and I just spent most of my weekend writing the scripts that convert the compact format to XML templates, so it’s possible to maintain these beasts. I had to develop the compact format too, which was slow going because I first had to understand the Cacti data model, and that is pretty complex.

Please enter issue reports for bugs, feature requests, etc. at the Google project homepage, not in the comments of this blog post. I do not look through comments on my blog when I’m trying to remember what I should be working on for a software project.

If these templates help you and you feel like visiting my Amazon.com wishlist and sending something my way, I’d appreciate it!

PS: You may also be interested in Alexey Kovyrin’s list of templates for monitoring servers.

Maatkit version 1877 released

Download Maatkit

Maatkit contains essential command-line utilities for MySQL, such as a table checksum tool and query profiler. It provides missing features such as checking slaves for data consistency, with emphasis on quality and scriptability.

This release contains major bug fixes and new features. Some of the changes are not backwards-compatible. It also contains new tools to help you discover replication slaves and move them around the replication hierarchy.

Changelog for mk-archiver:

2008-03-16: version 1.0.8

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added --charset option (bug #1877548).
   * Changed short form of --analyze to -Z to avoid conflict with --charset.

Changelog for mk-deadlock-logger:

2008-03-16: version 1.0.9

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added 'A' part to DSNs (bug #1877548).

Changelog for mk-duplicate-key-checker:

2008-03-16: version 1.1.5

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added --charset option (bug #1877548).

Changelog for mk-find:

2008-03-16: version 0.9.10

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added --charset option (bug #1877548).

Changelog for mk-heartbeat:

2008-03-16: version 1.0.8

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added --charset option (bug #1877548).

Changelog for mk-parallel-dump:

2008-03-16: version 1.0.7

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added --charset option (bug #1877548).
   * A global database connection was re-used by children, causing a hang.

Changelog for mk-parallel-restore:

2008-03-16: version 1.0.6

   * Added --setvars option (bug #1904689, bug #1911371).
   * Changed --charset to be compatible with other tools (bug #1877548).

Changelog for mk-query-profiler:

2008-03-16: version 1.1.9

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added --charset option (bug #1877548).

Changelog for mk-show-grants:

2008-03-16: version 1.0.9

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added --charset option (bug #1877548).

Changelog for mk-slave-delay:

2008-03-16: version 1.0.6

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added 'A' part to DSNs (bug #1877548).

Changelog for mk-slave-find:

2008-03-16: version 1.0.0

   * Initial release.

Changelog for mk-slave-move:

2008-03-16: version 0.9.0

   * Initial release.

Changelog for mk-slave-prefetch:

2008-03-16: version 1.0.1

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added --charset option (bug #1877548).

Changelog for mk-slave-restart:

2008-03-16: version 1.0.6

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added --charset option (bug #1877548).
   * Added logic to repair tables, and rewrote a lot of code.
   * Added --always option, disabled by default.  Not backwards compatible.
   * --daemonize did not work.
   * --quiet caused an undefined variable error.

Changelog for mk-table-checksum:

2008-03-16: version 1.1.26

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added 'A' part to DSNs (bug #1877548).
   * Added --unique option to mk-checksum-filter.
   * The exit status from mk-checksum-filter was always 0.
   * mk-table-checksum now prefers to discover slaves via SHOW PROCESSLIST.

Changelog for mk-table-sync:

2008-03-16: version 1.0.6

   * --chunksize was not being converted to rowcount (bug #1902341).
   * Added --setvars option (bug #1904689, bug #1911371).
   * Deprecated the --utf8 option in favor of the A part in DSNs.
   * Mixed-case identifiers caused case-sensitivity issues (bug #1910276).
   * Prefer SHOW PROCESSLIST when looking for slaves of a server.

Changelog for mk-visual-explain:

2008-03-16: version 1.0.7

   * Added --setvars option (bug #1904689, bug #1911371).
   * Added --charset option (bug #1877548).

Maatkit version 1674 released

Download Maatkit

Maatkit contains essential command-line utilities for MySQL, such as a table checksum tool and query profiler. It provides missing features such as checking slaves for data consistency, with emphasis on quality and scriptability.

This release contains bug fixes and new features.

Changelog for mk-archiver:

2008-01-05: version 1.0.6

   * Made suffixes for time options optional (bug #1858696).

Changelog for mk-deadlock-logger:

2008-01-05: version 1.0.8

   * Made suffixes for time options optional (bug #1858696).

Changelog for mk-heartbeat:

2008-01-05: version 1.0.6

   * Made suffixes for time options optional (bug #1858696).

Changelog for mk-parallel-dump:

2008-01-05: version 1.0.4

   * Second and later chunks had DROP/CREATE TABLE (bug #1863949).
   * Made suffixes for time options optional (bug #1858696).
   * --locktables didn't disable --flushlock.

Changelog for mk-parallel-restore:

2008-01-05: version 1.0.3

   * Made suffixes for time options optional (bug #1858696).
   * --ignoretables was ignored.

Changelog for mk-slave-delay:

2008-01-05: version 1.0.5

   * Made suffixes for time options optional (bug #1858696).
   * The program was ignoring some connection parameters.
   * Made the program use master when the I/O thread waits for relay log space.

Changelog for mk-slave-restart:

2008-01-05: version 1.0.5

   * Made suffixes for time options optional (bug #1858696).
   * Added logic to discard corrupt relay logs.
   * Added --monitor, --sentinel, and --stop.
   * Added --quiet and changed --verbose to 1 by default.
   * Added the ability to monitor many servers with --recurse.

Changelog for mk-table-checksum:

2008-01-05: version 1.1.24

   * Added support for the FNV_64 UDF, which is distributed with Maatkit.
   * --emptyrepltbl didn't Do The Right Thing by default.
   * --explain didn't disable --emptyrepltbl.
   * Made suffixes for time options optional (bug #1858696).
   * The --float-precision option was ignored.
   * (mk-checksum-filter) -i, -d options worked only on multiple files.

Changelog for mk-table-sync:

2008-01-05: version 1.0.3

   * Added the --function command-line option.
   * Added support for the FNV_64 hash function (see mk-table-checksum).
   * Made suffixes for time options optional (bug #1858696).
   * InnoDB tables use --transaction unless it's explicitly specified.

What is new in Maatkit

My posts lately have been mostly progress reports and release notices. That’s because we’re in the home stretch on the book, and I don’t have much spare time. However, a lot has also been changing with Maatkit, and I wanted to take some time to write about it properly. I’ll just write about each tool in no particular order.

Overall

I’ve been fixing a fair number of bugs, most of which have been in the code for a while. Every bug I fix these days gets a test case to guard against regressions. I’ve integrated the tests into the Makefile, so there’s no way for me to forget to run them.

The test suite has hundreds of tests, which is probably pretty good in comparison to many projects of this type. However, there will probably never be enough tests. I’ve moved much (in some cases, almost all) of the code into modules, which are easy to test, but it’s always a little harder to test programs themselves, so some things aren’t tested. (For example, it’s tedious to set up a test case that requires many MySQL instances to be running in a multi-tier replication setup).

Still, I think the quality has increased a lot in the last 6 months or so, since I’ve been more disciplined about tests. That discipline, by the way, was forced on me. The mk-table-sync tool was completely unmanageable. I was able to rewrite that tool in December, almost entirely using modularized, tested code.

mk-heartbeat

Jeremy Cole and Six Apart originally contributed this tool. Since then I’ve added a lot more features, allowed a lot more control over how it works, and it even works on PostgreSQL now. As an example, I added features that make it easy to run every hour from a crontab. It daemonizes, runs in the background, and then quits automatically when the new instance starts. I use it in production to give me a reliable metric for how up-to-date a slave is. When I need to know absolutely “has this slave received this update,” Seconds_behind_master won’t do, for many reasons. Load balancing and lots of other things hinge on up-to-date slaves.
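The heartbeat idea can be sketched in a few lines of Python. This assumes the usual pattern (not spelled out above) in which a timestamp is written on the master at regular intervals and read back on the slave after it has replicated; the function names are hypothetical, and fetching the timestamp from the database is left out.

```python
from datetime import datetime


def heartbeat_lag(replicated_ts, now):
    """Seconds between the heartbeat value seen on the slave and the current time."""
    # Clamp at zero so small clock skew between hosts never reports negative lag.
    return max(0.0, (now - replicated_ts).total_seconds())


def is_fresh_enough(replicated_ts, now, max_lag_seconds):
    """True if the slave has applied a heartbeat written within the allowed window."""
    return heartbeat_lag(replicated_ts, now) <= max_lag_seconds
```

A load balancer could then call something like `is_fresh_enough(ts, datetime.now(), 5)` before sending reads to a slave. Because the value measured is a row that actually went through replication, it answers “has this slave received this update” directly.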

mk-parallel-dump

I think this tool is probably the fastest, smartest way to do backups in tab-delimited format. I’ve been fixing a lot of bugs in this one, mostly for non-tab-delimited dumps. It has turned out to be harder to write this code because it uses shell commands to call mysqldump. (The tab-delimited dumps are done entirely via SQL, which is why it’s so good at what it does).

mk-slave-restart

I’ve been having a lot of trouble with relay log corruption, so unfortunately this tool has become necessary to use regularly in production. As a result I made it quite a bit smarter. It can detect relay log corruption, and instead of the usual skip-one-and-continue, it issues a CHANGE MASTER TO, so the slave will discard and re-fetch its relay logs. I’ve also made it capable of monitoring many slaves at once. (It discovers slaves via either SHOW SLAVE HOSTS or SHOW PROCESSLIST, so if you point it at a master, it can watch all the master’s slaves with a single command).

mk-table-checksum

I’ve made a lot of changes to this tool recently: smarter chunking code that divides your tables into bits that are easier for the server to work with, TONS of small improvements and fixes, and much friendlier behavior.

The most recent release also includes a big speed improvement. This tool spends most of its time waiting for MySQL to run checksum queries. While my pure-SQL checksum queries are faster than most (all?) other ways to compare data in different servers, I’ve recently been trying to reduce the amount of work they cause.

As a result, I investigated Google’s MySQL patches. Mark Callaghan mentioned to me that he’d added a checksum function into their version of the server, and I wanted to look at that. They’re using the FNV hash function to checksum data. I decided that a UDF would be a fine way to write a faster row-checksum function, so I wrote a 64-bit FNV hash UDF. While I’m not the first person to do that, my version accepts any number of arguments, not just one. This makes it a lot more efficient to checksum every column in a row, because you don’t have to a) make multiple calls to the hash function or b) concatenate the arguments so you can make a single call. I also copied Google’s logic to make it simpler and more efficient to checksum NULLs, which avoids still more function calls. The UDF returns a 64-bit number, which can be fed directly to BIT_XOR to crush an entire table (or group of rows) into a single order-independent checksum. And finally, FNV is also a lot faster than, say, MD5 or SHA1.
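Here's a small Python sketch of the two ideas in that paragraph: a hash that takes any number of arguments by carrying its state from one column to the next, and an XOR over the row hashes that doesn't care about row order. It is not the UDF's code. It uses the FNV-1a variant with the standard 64-bit constants, which may not be the variant the UDF uses, and the NULL handling is an assumption made up for the example.

```python
FNV_OFFSET_64 = 0xCBF29CE484222325
FNV_PRIME_64 = 0x100000001B3
MASK_64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data, seed=FNV_OFFSET_64):
    """64-bit FNV-1a over `data` (bytes), continuing from `seed`."""
    h = seed
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_64) & MASK_64
    return h


def row_hash(*columns):
    """Hash any number of column values in one call by chaining the hash state."""
    h = FNV_OFFSET_64
    for value in columns:
        if value is None:
            # Assumed NULL handling: mix in a fixed marker byte instead of
            # converting the NULL to a string first.
            h = fnv1a_64(b"\x00", h)
        else:
            h = fnv1a_64(str(value).encode("utf-8"), h)
    return h


def table_checksum(rows):
    """XOR all row hashes together, like BIT_XOR over a group of rows."""
    checksum = 0
    for row in rows:
        checksum ^= row_hash(*row)
    return checksum
```

Because XOR is commutative, `table_checksum` gives the same answer no matter what order the rows arrive in. One caveat of this simplified version: the column values run together with no separator, so ("ab", "c") and ("a", "bc") hash the same. A real implementation would need to deal with that.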

The results are quite a bit faster on my hardware: 12.7 seconds instead of 80 seconds on a CPU-bound workload, which is roughly a 6.3x speedup. (80 seconds was the best I was able to achieve before. Some of the checksum techniques took up to 197 seconds on the same data.)

The UDF is really simple to compile and install, does no memory allocations or other nasty things, and should be safe for you to use. The source is included with the latest Maatkit release. (Older Maatkit versions won’t be able to take full advantage of it, by the way, but they can still be sped up somewhat). However, I would really appreciate some review from more experienced coders. I’m no C++ wizard. In fact, my first attempts at writing this thing were so blockheaded and wrong, I was almost embarrassed. (Thanks are due to the fine people hanging out on #mysql-dev).

mk-table-sync

After my week-long coding marathon in December, I’ve had to keep working on this tool, because I’ve needed it quite a few times to solve problems with replication. (Did I mention relay log corruption?) It’s much faster and less buggy now, and as a bonus, the latest release can also take advantage of the FNV UDF I just mentioned.

I think I should explain the general evolution in this tool’s life. It started out as “how to find differences in data efficiently.” This was a period where I did a lot of deep thinking on exploiting the structures inherent in data. It then progressed to “how to sync data efficiently.” At this point I was able to outperform another data-syncing tool by a wide margin, even though it was a multi-threaded C++ program and mine was just a Perl script. I did that by writing efficient queries and moving very little data across the network.

The most recent incarnation has thrown performance out the window, at least as measured by those criteria. The aforementioned C++ program now outperforms mine by a wide margin on the same tests.

What changed?

Two things: I’m focusing on quality, and I’m focusing on syncing running servers correctly with minimal interruption.

Once I have good-quality, well-tested code, I’ll be able to speed it up. I know this because I’m currently doing some things I know are slower than they could be.

But much more importantly, I’ve changed the whole angle of the tool. I want to be able to synchronize a busy master and slave, without locking tables, automatically ensuring that the data stays consistent and there are no race conditions. I do this with a lot of special tricks, such as syncing tables in small bits, using SELECT FOR UPDATE to lock only the rows I’m syncing, and so on. And I’m actively working to make the tool Do The Right Thing without needing 99 command-line arguments. (I think the latest release does this very well).
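A rough Python sketch of the “small bits plus row locks” idea follows. It only builds the key ranges and the locking query text; it does not talk to a server, and it is not how mk-table-sync is written. It assumes an integer key, and the function names are hypothetical.

```python
def chunk_ranges(min_id, max_id, chunk_size):
    """Yield inclusive (low, high) key ranges covering min_id..max_id."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    low = min_id
    while low <= max_id:
        high = min(low + chunk_size - 1, max_id)
        yield low, high
        low = high + 1


def locking_select(table, key, low, high):
    """Build a query that locks only the rows in one chunk."""
    return (f"SELECT * FROM `{table}` "
            f"WHERE `{key}` BETWEEN {low} AND {high} FOR UPDATE")
```

Each query would have to run inside a transaction that also applies the fixes for that chunk, since the row locks are released when the transaction ends. Keeping the chunks small keeps each transaction short, which is what lets the rest of the application carry on.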

Instead of “make the sync use as little network traffic as possible,” I’ve changed the criteria of good-ness to “do it right, do it once, and don’t get in the way.”

As a result, I can sync a table that gets a ton of updates — one of the “hottest” tables in my application — without interfering with my application. Online. Correctly. In one pass. Through replication. Show me another tool that can do that, and I’ll re-run my benchmarks :-)

This doesn’t mean I don’t care about performance. I do, and I’ll bring back the earlier “go easy on the network” sync algorithms at some point. They are very useful when you have a slow network, or your tables aren’t being updated and you just want to sync things fast. I’ll also be able to speed up the “don’t interfere with the application” algorithms.

One interesting thing I did was divide up the functionality so the tool can use many different sync algorithms. I created something like a storage-engine API, except it’s a sync API. It’s really easy to add in new sync algorithms now. All I have to do is write the code that algorithm needs. This is really only about 200-300 lines of code for the current algorithms.
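The shape of such a sync API might look something like this Python sketch. The tool itself is Perl, and this is not its real interface: the class and method names are hypothetical, and the rule for when a chunking algorithm applies is an assumption made up for the example.

```python
from abc import ABC, abstractmethod


class SyncAlgorithm(ABC):
    """Hypothetical plugin interface: each algorithm says whether it applies."""
    name = "base"

    @abstractmethod
    def can_sync(self, table):
        """Return True if this algorithm can handle the described table."""


class Chunk(SyncAlgorithm):
    name = "Chunk"

    def can_sync(self, table):
        # Assumption for illustration: chunking needs a numeric indexed column.
        return table.get("numeric_index") is not None


class Stream(SyncAlgorithm):
    name = "Stream"

    def can_sync(self, table):
        return True  # the simple fallback works on any table


def choose_algorithm(table, candidates):
    """Pick the first candidate (ordered most to least preferred) that applies."""
    for algo in candidates:
        if algo.can_sync(table):
            return algo
    raise LookupError("no sync algorithm can handle this table")
```

With this layout, adding an algorithm means writing one more class and putting it in the candidate list; the locking, waiting, and row-changing code stays outside the algorithms.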

Tools that don’t yet exist

What I haven’t told you about is a lot of unreleased code and new tools. There’s some good stuff in the works. Also stay tuned — a third party might be about to contribute another tool to Maatkit, which will also be a very neat addition.

Conclusion

As Dana Carvey says, “If I had more time… the programs we have in place are getting the job done, so let’s stay on course, a thousand points of light. Well, unfortunately, I guess my time is up.” Maatkit is getting better all the time, just wait and see.

Maatkit version 1579 released

Download Maatkit

This release contains bug fixes and new features. The biggest new feature, in my opinion, is a new sync algorithm for mk-table-sync. Now you can sync any table with an index more efficiently than previously. This is the return of the speed I promised earlier. (Though I haven’t yet benchmarked it; I am very short on time these days. Your benchmarks and other contributions are welcome).

I’m finally feeling like the table sync tool is getting in good shape!

Let me know what you think, and of course, if you have questions or bug reports, please use the Sourceforge forums, bug tracker, etc., so others can benefit too.

Changelog for mk-heartbeat:

2007-12-27: version 1.0.5

   * Added --stop, --sentinel, and --quiet options.
   * Added --replace option.

Changelog for mk-parallel-dump:

2007-12-27: version 1.0.3

   * Views with functions caused a crash (bug #1850998, MySQL bug #29408).
   * --ignoreengine was ignored (bug #1851461).

Changelog for mk-table-checksum:

2007-12-27: version 1.1.23

   * Updated documentation about version compatibility.
   * Updated documentation for --replcheck.

Changelog for mk-table-sync:

2007-12-27: version 1.0.2

   * Syncing via replication did not use REPLACE on the master.
   * --transaction disabled waiting for a slave to catch up.
   * Allow one DSN without --replicate, as long as --synctomaster is given.
   * Added the Nibble sync algorithm.
   * MASTER_POS_WAIT() failed when the server was not a master (bug #1855480).
   * DBD::mysql died after 'commands out of sync' error (bug #1856046).

Maatkit version 1417 released

Download Maatkit

Thanks again to all the great sponsors for my week of work on the kit!

This is the long-awaited “Baron worked on table sync” release. Hooray!

I have resolved all of the issues I was facing in getting a release out the door. I now have individual test suites on all the programs in the kit (some of them trivial, some not) as well as a comprehensive unit test suite on the shared code. This is properly integrated into the Makefile, so it won’t let me release when a test is broken. Yay!

I also found and solved a number of other issues, mostly minor, with other tools in the kit. Yippee!

But before we all celebrate too much, I want to say a word of caution: mk-table-sync is rebuilt from the ground up. That means I probably busted a bunch of things. One thing I know I broke: performance. It has two sync algorithms — Stream and Chunk — and Stream is not high performance, but Chunk can’t always be used. I personally advise you to run the tool with the --test option and make sure the table you’re syncing will not use the Stream algorithm if it is large. And if you are doubtful about bugs, as I am, you would do well not to touch the --execute option for critical data. Instead, use --print and save the output in a file, inspect the file, and then feed the file into mysql.

Also, please be aware that I threw away the old tool’s 99 useless, confusing command-line options and started over. Some of them are similar. Some of them are the same but now mean different things. In other words, assuming backwards compatibility is probably not a good idea! Don’t just upgrade and drop this tool in place (in case you had cron jobs running it, for example).

Performance will come back, better than ever. I promise. But for now, please help me find bugs, and report them via the project’s Sourceforge bug tracker. Also, I would like to encourage you to post in the project’s forums and/or mailing lists instead of blog comments (unless you just have comments) so they are easy for others to find. (No one will search my blog for help on this toolkit, I feel sure).

Changelog:

Changelog for mk-archiver:

2007-12-07: version 1.0.4

   * Updated common code.

Changelog for mk-deadlock-logger:

2007-12-07: version 1.0.6

   * Updated common code.

Changelog for mk-duplicate-key-checker:

2007-12-07: version 1.1.3

   * Updated common code.
   * Corrected documentation.
   * Added --engine and --ignoreengine options.

Changelog for mk-find:

2007-12-07: version 0.9.8

   * Updated common code.

Changelog for mk-heartbeat:

2007-12-07: version 1.0.3

   * Updated common code.
   * Added --time, --interval and --skew options.
   * The combination of sleep() and alarm() did not work on some systems.

Changelog for mk-parallel-dump:

2007-12-07: version 1.0.1

   * Updated common code.

Changelog for mk-parallel-restore:

2007-12-07: version 1.0.1

   * Updated common code.

Changelog for mk-query-profiler:

2007-12-07: version 1.1.7

   * Updated common code.
   * Added --session command-line option.
   * Servers without session variables crashed the tool (bug #1840320).
   * The meaning of --innodb was reversed.

Changelog for mk-show-grants:

2007-12-07: version 1.0.6

   * Updated common code.

Changelog for mk-slave-delay:

2007-12-07: version 1.0.3

   * Updated common code.

Changelog for mk-slave-restart:

2007-12-07: version 1.0.3

   * Updated common code.

Changelog for mk-table-checksum:

2007-12-07: version 1.1.21

   * Updated common code.
   * --chunksize was broken when no suffix given (bug #1845018).
   * --replcheck replaces the --recursecheck option (bug #1841407).

Changelog for mk-table-sync:

2007-12-07: version 1.0.0

   * Complete rewrite.
   * Syncs multiple tables and servers.
   * Has no top-down or bottom-up algorithms.
   * Integrates with mk-table-checksum results.
   * Fixes many bugs, probably introduces new ones.

Changelog for mk-visual-explain:

2007-12-07: version 1.0.5

   * Updated common code.
   * Queries of the form "... FROM (SELECT 1) AS X" crashed the tool.

Progress on Maatkit bounty

My initial plans got waylaid! I didn’t pull out the checksumming code first, because the code wasn’t at all as I remembered it. Instead, I began writing code to handle the more abstract problem of accepting two sets of rows, finding the differences, and doing something with them. I’m ending up with a little more complicated system than I thought I would. However, it’s also significantly simpler in some ways. Instead of just passing references to subroutines to use as callbacks, I’m object-ifying the entire synchronization concept.

What’s the advantage of doing this? Well, as some of you may know, there are two fairly complex algorithms in the tool at present, which handle synchronization in a hierarchical manner, zooming in on the rows that need to be changed. There are a lot of complexities in them. If I wrap all that up into modules, and make them have a uniform interface (real OO interfaces would be delightful here, but Perl doesn’t support them), I can simplify the project significantly by…

…throwing them out the window! That’s right, I’m tossing out the ‘top-down’ and ‘bottom-up’ algorithms. What I want to develop, first and foremost, is the code that does the synchronization, not the really twisted code that does bitwise XORs on groupwise slices of checksums and has recursion and all that stuff. So I decided on a generic data-syncing interface, and wrote the simplest possible implementation of that, which I’m going to use to help me deal with complexity. This algorithm is called ‘stream’ (for lack of a better word). It has no hierarchical drill-down or any other complexities. It amounts to “select * from source, select * from dest, diff and resolve.”

It’s not a very efficient algorithm for comparing and syncing data, at least not by my standards. (It amounts to a FULL OUTER JOIN implemented in Perl). But boy, does it make it easier to start cleaning up the nasty spaghetti code that handles locking, waiting for a slave to catch up, actually changing the data that turns out to be different, and so on.
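The “select both sides, diff and resolve” idea can be sketched in Python as a merge over two row streams. This is an illustration of the general technique, not the tool's Perl code. It assumes both streams arrive sorted by key (for example via ORDER BY on the primary key), and the function name and action labels are hypothetical.

```python
def stream_diff(source_rows, dest_rows, key=lambda row: row[0]):
    """Walk two key-sorted row iterables and yield (action, row) changes for dest."""
    src, dst = iter(source_rows), iter(dest_rows)
    s, d = next(src, None), next(dst, None)
    while s is not None or d is not None:
        if d is None or (s is not None and key(s) < key(d)):
            yield ("INSERT", s)       # row exists only in source
            s = next(src, None)
        elif s is None or key(d) < key(s):
            yield ("DELETE", d)       # row exists only in dest
            d = next(dst, None)
        else:
            if s != d:
                yield ("UPDATE", s)   # same key, different contents
            s, d = next(src, None), next(dst, None)
```

Every row from both sides crosses the network, which is why this is not efficient, but only one row per side has to be held in memory at a time, and the logic is simple enough to test thoroughly.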

Of course, I’ll add back the top-down and bottom-up algorithms later, as well as some others. They should turn out to be pretty simple to implement, since they won’t have, for example, locking code intertwined with them. When done, the tool will examine the table and figure out the best algorithm to use. This will go a good way towards another of my goals, which is that you should be able to just point it at two tables and tell it to sync them, and it should do it in the most efficient way possible, without needing lots of command-line options.

Maatkit version 1314 released

Download Maatkit

Maatkit (formerly MySQL Toolkit) contains essential command-line utilities for MySQL, such as a table checksum tool and query profiler. It provides missing features such as checking slaves for data consistency, with emphasis on quality and scriptability.

This release fixes several minor bugs. It also renames all the tools to avoid trademark violation, completing the project rename. (Let me know if I missed anything.)

Changelog for mk-find:

2007-11-25: version 0.9.7

   * Added --sid option.

Changelog for mk-show-grants:

2007-11-25: version 1.0.5

   * --askpass ignored the entered password (bug #1838131).

Changelog for mk-table-checksum:

2007-11-25: version 1.1.20

   * --replcheck didn't recurse; it should recurse one level (to slaves).

Four companies to sponsor Maatkit development

A while ago I asked for people and/or organizations to sponsor development on Maatkit (formerly MySQL Toolkit) so I could take a week off work and improve the Table Sync tool. I asked for $2500 USD, but several companies have graciously offered to cover that and then some.

I’m very happy about this, as it will allow me to dedicate a solid week to fixing bugs and adding features. There’s a lot of demand for the tools, and there are a dozen or so bug reports unresolved for the table-sync tool, which I personally want to fix as much as anyone. So I’m very grateful for the support.

Here are the companies who have promised their financial support:

MySQL AB

MySQL AB have offered $3000 USD in support. I had an email conversation with Mårten Mickos, MySQL’s CEO, and he expressed his happiness about the project’s success, and his pleasure in supporting the project:

We have seen you operate in the community and you always have constructive and good ideas. That’s why we want to support you. Our goal with this is to stimulate innovation in the MySQL ecosystem.

I don’t know how the idea to support the project started at MySQL AB, but that quote really tells me “we get it: we have a symbiotic relationship with our community of users.” In a follow-up email, Jay Pipes wrote,

… MySQL wants to make it clear that we very much support and appreciate the work you’ve done on the toolkit. It has proven to be one of, if not the, most popular and successful open source ecosystem projects surrounding MySQL and for good reason. So, for your work and commitment to the project, a big thank you from MySQL. :)

Secondly, we would like to encourage you to be open and public about our support of you. The community team is always looking for opportunities such as the one which presented itself with your toolkit, and we want the outside community to know about our support and encouragement. Therefore, you have our blessing and encouragement to blog about the sponsorship of your development work. Please do let us know if and when you decide to blog about it. Remember also that this sponsorship is no strings attached. There is no expectation of specific work on our end.

Blue Ridge Internetworks

Blue Ridge Internetworks have offered $1000 USD in support. BRIworks, as they’re known locally, is headquartered in the town where I live, Charlottesville, Virginia. They offer networking consulting and services. Jeff Cornejo, who offered the support to me, is a friend and used to work where I used to work, and several other highly respected friends and ex-co-workers work at BRIworks too. BRIworks provides Internet service and hosting for my employer.

Percona

Percona have offered $500 USD in support. Percona do high-performance website consulting, and are perhaps best known for having some of the world’s top MySQL experts, including Peter Zaitsev and Vadim Tkachenko, two of the co-authors of High Performance MySQL, second edition.

The Rimm-Kaufman Group

Last, but absolutely not least, is my employer, The Rimm-Kaufman Group, who do paid search marketing and website effectiveness consulting. They have let me spend a significant amount of time writing these tools for use on our own systems, and instead of keeping them in our own Subversion repository, allowed the code to be released as Free Software. The time I’ve spent on the tools has gone well above and beyond what we needed to get our work done. Finally, RKG has blessed my unpaid week off to work on the tools.

A big thanks is due to all of these companies and individuals, as well as other people who have contributed financially and otherwise.

Closing thoughts

I’m grateful for the sponsorship, but I think the real winners are the MySQL community, who have benefited a lot from Maatkit. It has made a lot of hard things easier and impossible things possible. If you’re one of those who benefit from Free Software, I encourage you to patronize the businesses that believe in and support it. Four fine examples are listed above! Not coincidentally, all of them are the crème de la crème in their respective fields.

Finally, a quick journalistic note: I pre-approved this post with representatives from the companies I mentioned, because I respect their right to represent themselves as they wish, but the words are mine, not theirs.

Maatkit version 1297 released

Download Maatkit

Maatkit (formerly MySQL Toolkit) version 1297 contains a significant update to MySQL Table Checksum (which will be renamed soon to avoid trademark violations). The changelog follows. What you don’t see in the changelog is the unit test suite! I got a lot more of the code into modules that are tested and re-usable.

2007-11-18: version 1.1.19 

* Check for needed privileges on --replicate table before beginning. 
* Made some error messages more informative. 
* Fixed child process exit status with 8-bit right-shift. 
* Improved checksumming code auto-detects best algorithm and function. 
* Added --ignoreengine option; ignores federated and merge by default. 
* Added --columns and --checksum options. 
* Removed --chunkcol, --chunksize-exact, --index options. 
* --chunksize can be specified as a data size now. 
* Improved chunking algorithm handles more cases and uses fewer chunks. 
* Do not print --replcheck results for servers that are not slaves. 
* Create only one DB connection for each host, not one per host/tbl/chunk. 
* Code assumed backtick quoting, broke on SQL_MODE=ANSI (bug #1813030). 
* There were many potential bugs with database and table name quoting. 
* Child exit status errors could be masked by subsequent successes.
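
Two of the changelog entries concern child process exit statuses: recovering the exit code with an 8-bit right shift, and not letting a later success mask an earlier failure. Here is a minimal sketch of both ideas. It is written in Python purely as an illustration, it is not the tool's actual code, and the function names (exit_code, run_child, worst_exit_code) are hypothetical. It assumes a POSIX system, where the raw wait status keeps the exit code in the high byte.

```python
import os


def exit_code(wait_status: int) -> int:
    """Return the child's exit code from a raw POSIX wait status.

    The exit code lives in the high byte of the 16-bit status, so it is
    recovered with an 8-bit right shift (the low byte holds signal info).
    """
    return (wait_status >> 8) & 0xFF


def run_child(code: int) -> int:
    """Fork a child that exits with `code`; return the raw wait status."""
    pid = os.fork()
    if pid == 0:
        os._exit(code)  # child: exit immediately with the requested code
    _, status = os.waitpid(pid, 0)  # parent: collect the raw status
    return status


def worst_exit_code(codes) -> int:
    """Combine exit codes so a later success cannot hide a failure."""
    worst = 0
    for code in codes:
        worst = max(worst, code)  # keep the highest code seen so far
    return worst
```

For example, running three children that exit with 0, 3, and 0, decoding each status with exit_code, and passing the results to worst_exit_code yields 3 rather than the 0 of the last child. Simply overwriting a single status variable after each child would report success in that case.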