Archive for October, 2007

MySQL Table Sync bounty: let’s do it!

A little while ago I offered to take time off work and improve MySQL Table Sync. I’ve gotten a very positive response to that, with several organizations offering to contribute to the bounty, so I’ll go ahead and commit to doing this.

The conditions

  • The bounty is $2500 USD.
  • I’ll work on the following features and improvements. I have the grand plan in my head, so this list just kind of describes the plan; I’ll probably end up improving other things at the same time.
    • Modularizing and writing a test suite (this is mandatory; the script is collapsing under its own weight without them) (the good news is a lot of this is done already; I’ve been modularizing a lot of the code for other tools in the toolkit)
    • Bugs, bugs, bugs! The test suite should help a lot here.
    • Bi-directional syncing
    • Syncing many tables
    • Syncing tables without a primary key
    • Providing useful exit codes and more informational output
    • Syncing in chunks to block updates less
    • Checking privileges before syncing
    • Syncing based on pre-computed checksums
    • Automatically choosing sensible parameters based on table structure
    • Making default locking and other behaviors smarter
    • Full UTF8 support for 3-byte characters
  • I decide when the features are done. Actually, the test suite decides, but I decide on what’s reasonable and feasible; some of these features could get way out of hand if I don’t get to decide. Please feel free to submit your requirements for consideration — for example, how do you propose to use two-way sync? I know how I’d do it, but tell me your needs too. I’ll also look at feature requests in terms of total money contributed, and work on the ones people donated most for.
  • The code will be released through the normal Sourceforge channels. Nothing will change as regards licensing, copyright, etc.
  • No guarantees about bugs, how much I’ll be able to complete, or any other warranties. This is a bounty to sponsor me working on the code, not a contract to deliver certain results.
  • I’ll work on this between December and March. I can’t do anything before December because of the book. I think I can do it in December, but I’m not sure. The book’s final deadline is February 1, so I shouldn’t commit to anything before then. In other words, I’ll probably do this before the year is over, but I want a margin of safety.

How to donate

I was thinking about using a third party to handle the money, but I decided it’s not that much money, and the people I spoke to had no qualms about sending me money directly. Here’s how I’m thinking of handling it:

  • I’m just passing the hat, I’m not an official charitable organization or anything :-)
  • Leave a comment with your email, or send an email to me [thisdomain] @ [thisdomain].com.
  • We can work out how to transfer the money, in whatever way is kosher for you. I have a PayPal account, or I can give you my address and you can mail me a check, or we can figure out something else if that doesn’t work.
  • I’ll keep this post updated with the total amount pledged.
  • Let me know if you wish to be acknowledged. I’ll add a “donors” section to this post with your name and a link, if you want. Otherwise I’ll just list “anonymous.”
  • Let me know if there’s a particular feature you want.

Donors

The total pledged so far is $0 of $2500 USD.

Technorati Tags:, , , ,

You might also like:

  1. Proposed bounty on MySQL Table Sync features
  2. Maatkit bounty begins tomorrow
  3. JavaScript number-formatting library updated
  4. Progress on Maatkit bounty, part 2
  5. Progress on Maatkit bounty, part 3

How I built the NOW_USEC() UDF for MySQL

Last week I wrote about my efforts to measure MySQL’s replication speed precisely. The most important ingredient in that recipe was the user-defined function to get the system time with microsecond precision. This post is about that function, which turned out to be surprisingly easy to write.

The manual section on user-defined functions provides very good instructions on how they work and how to build them. But just for the record, on Ubuntu 7.04 on an AMD64 machine, all I had to do was install the libmysqlclient15-dev package, and I was then able to compile the UDF with no further ado. Also for the record, MySQL header files have some dependencies they shouldn’t that break building against a downloaded tarball. So don’t be surprised if you have troubles building against anything but Ubuntu’s provided header files.

Here’s the source, which I basically cribbed from a NOW_MSEC() function I saw in a bug report somewhere. Really, there’s not much to it besides the basic skeleton of a UDF, with a few lines to actually get the system time. And I actually believe if I took another ten minutes to learn about strftime(), there’s probably no need to do it in two steps; I could probably do the whole thing with one strftime() call and save a little memory and time. But that’s what I get for copying and pasting code of unknown quality:

#include <my_global.h>
#include <my_sys.h>
#include <mysql.h>

#include <stdio.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

extern "C" {
   my_bool now_usec_init(UDF_INIT *initid, UDF_ARGS *args, char *message);
   char *now_usec(
               UDF_INIT *initid,
               UDF_ARGS *args,
               char *result,
               unsigned long *length, char *is_null, char *error);
}

my_bool now_usec_init(UDF_INIT *initid, UDF_ARGS *args, char *message) {
   return 0;
}

char *now_usec(UDF_INIT *initid, UDF_ARGS *args, char *result,
               unsigned long *length, char *is_null, char *error) {

  struct timeval tv;
  struct tm* ptm;
  char time_string[20]; /* e.g. "2006-04-27 17:10:52" */
  char *usec_time_string = result;
  time_t t;

  /* Obtain the time of day, and convert it to a tm struct. */
  gettimeofday (&tv, NULL);
  t = (time_t)tv.tv_sec;
  ptm = localtime (&t);   

  /* Format the date and time, down to a single second.  */
  strftime (time_string, sizeof (time_string), "%Y-%m-%d %H:%M:%S", ptm);

  /* Print the formatted time, in seconds, followed by a decimal point
 *      and the microseconds.  */
  sprintf(usec_time_string, "%s.%06ld\n", time_string, tv.tv_usec);

  *length = 26;

  return(usec_time_string);
}

The installation looks like this:

baron@tigger now_usec $ make
gcc -fPIC -Wall -I/usr/include/mysql -shared -o now_usec.so now_usec.cc
baron@tigger now_usec $ sudo cp now_usec.so /lib
baron@tigger now_usec $ mysql test
mysql> create function now_usec returns string soname 'now_usec.so';
Query OK, 0 rows affected (0.00 sec)

mysql> select now_usec();
+----------------------------+
| now_usec()                 |
+----------------------------+
| 2007-10-23 10:28:13.862116 | 
+----------------------------+

For those who have reached this page via Google searches and are looking for more information, you should check out the MySQL User Defined Function Library project. Lots of good UDFs there.

Technorati Tags:, , , , , ,

You might also like:

  1. How fast is MySQL replication?
  2. More alternatives to openxml

How fast is MySQL replication?

Very fast, as it turns out.

While writing the chapter on replication for the upcoming second edition of High Performance MySQL, I decided to do a little test and measure replication speed more accurately than I’ve seen others do before. The first edition of the book measured replication speed by inserting on the master and polling on the slave. Giuseppe Maxia later followed up on that by improving the polling process, and found events typically replicated within a half a millisecond.

Polling can only get you so far; the extra overhead caused by polling skews the measurements (even if you poll smartly). I wanted to see if I could do this without polling the slave for results. It turned out to be easier than I thought it would be.

All I had to do was write a MySQL User-Defined Function that returns the system time to microsecond precision. I’ll write another post about that later; in this post I want to talk about the results.

The setup

After writing and installing the function, I tested it. Note that it’s non-deterministic, so you get different results even when you call it twice in the same query:

SELECT NOW_USEC(), NOW_USEC(); 
+----------------------------+----------------------------+ 
| NOW_USEC()                 | NOW_USEC()                 | 
+----------------------------+----------------------------+ 
| 2007-10-23 10:41:10.743917 | 2007-10-23 10:41:10.743934 | 
+----------------------------+----------------------------+ 

The rest is easy. I set up two MySQL instances on the same server (because there’s no way the clocks on two separate machines will be synced to the microsecond), and made one of them the master of the other. On the master,

CREATE TABLE test.lag_test( 
   id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, 
   now_usec VARCHAR(26) NOT NULL 
 ); 

INSERT INTO test.lag_test(now_usec) VALUES( NOW_USEC() ); 

The results

Now all that remains is to compare the difference from the slave and the master. A Federated table is an easy way to do this. On the slave:

CREATE TABLE test.master_val ( 
   id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, 
   now_usec VARCHAR(26) NOT NULL 
) ENGINE=FEDERATED 
   CONNECTION='mysql://user:pass@127.0.0.1/test/lag_test'; 

A simple join and the TIMESTAMPDIFF function can now show the microseconds of lag between the time the query executed on the master and the slave:

SELECT m.id, TIMESTAMPDIFF(FRAC_SECOND, m.now_usec, s.now_usec) AS usec_lag
FROM test.lag_test as s
   INNER JOIN test.master_val AS m USING(id); 
+----+----------+ 
| id | usec_lag | 
+----+----------+ 
|  1 |      476 | 
+----+----------+ 

I inserted a thousand rows into the master with a simple Perl script, with a 10-millisecond delay between rows to keep help the master and slave instances from fighting each other for CPU time. I then built a temporary table with the lag of each event, and grouped the results by lag time, to see what the most frequent lag times are:

SELECT ROUND(lag / 1000000.0, 4) * 1000 AS msec_lag, COUNT(*)
FROM lag
GROUP BY msec_lag
ORDER BY msec_lag;
+----------+----------+ 
| msec_lag | COUNT(*) | 
+----------+----------+ 
|   0.1000 |      392 | 
|   0.2000 |      468 | 
|   0.3000 |       75 | 
|   0.4000 |       32 | 
|   0.5000 |       15 | 
|   0.6000 |        9 | 
|   0.7000 |        2 | 
|   1.3000 |        2 | 
|   1.4000 |        1 | 
|   1.8000 |        1 | 
|   4.6000 |        1 | 
|   6.6000 |        1 | 
|  24.3000 |        1 | 
+----------+----------+

Not all that bad, eh? It looks to me like MySQL can replicate most small queries in 200 to 300 microseconds(!!!). Of course, the speed is bounded by a) how long it takes to transfer the binary log event across the network and b) how fast the query executes. In this case, both are very fast, showing that MySQL doesn’t add much overhead of its own to the replication process.

If anyone knows of a way to measure the delay between the event being logged in the master’s binary log, and the event being logged in the slave’s relay log, I’d be interested in seeing the results. I’m guessing it’s practically instantaneous for small events like this, and most of the lag is in reading, parsing, and executing the SQL.

Technorati Tags:, , , , , , , ,

You might also like:

  1. How I built the NOW_USEC() UDF for MySQL
  2. How to sync tables in master-master MySQL replication
  3. MySQL Table Checksum 1.1.0 released
  4. How to know if a MySQL slave is identical to its master
  5. Why MySQL says the server is not configured as a slave

What are your favorite MySQL replication filtering rules?

As I wrote a few days ago, I’m writing the replication chapter for the second edition of High Performance MySQL. I’m writing about replication filtering rules right now, and I thought it would be good to get input on this. If you have favorite replication filtering tricks you’d like to share, or tasks that always frustrate and/or confuse you, please post them in the comments. I’m making a section that shows how to accomplish common filtering and rewriting needs, such as preventing GRANT statements from replicating to the slaves.

Thanks very much! I hope the community involvement will make this book more useful for everyone.

Technorati Tags:, , , ,

You might also like:

  1. MySQL Toolkit’s Show Grants tool 0.9.1 released
  2. High Performance MySQL, Second Edition: Replication, Scaling and High Availability
  3. High Performance MySQL, Second Edition: Advanced SQL Functionality
  4. How pre-fetching relay logs speeds up MySQL replication slaves

High Performance MySQL, Second Edition: Replication, Scaling and High Availability

Continuing in the tradition, which I hope has been as helpful to you as it has been to me, I’m opening the floor for suggestions on chapter 9 of the upcoming High Performance MySQL, Second Edition. Unlike the other chapters for which I’ve listed outlines, this one isn’t substantially written yet. It’s in detailed outline form at this point (a tactic that has worked very well for us so far — I’ll write about that someday).

I’m trying to get feedback much earlier in this chapter’s lifecycle, for several reasons. Two of the most important are that this is one of the first chapters I’ve had a chance to really take from scratch, and the chapters I haven’t written from scratch have been harder to organize, as you’ve probably seen from the last few outlines I posted. There’s a lot of value in working top-down on this deep encyclopedia-style material.

The outline, as it stands now, is basically headings with bulleted lists of important details. Here are the top-level headings:

[Intro]
Scaling and High Availability Requirements
Replication Overview
Configuring Replication
Under the Hood of Replication
Replication Topologies
Replication Administration and Maintenance
Replication Problems and Solutions
The Future of MySQL Replication
Scaling MySQL Horizontally
Clustering with MySQL
   MySQL Cluster
   Other Clustering Solutions
Load Balancing

Just a few notes. These sections are top-level, and will likely be split into many sub-sections like other chapter outlines I’ve posted. A typical section has a couple dozen bullet-points in it, at a high level of granularity, such as “Using DRBD for log replication only.” I think we’ll also add in a separate section on fail-over and fail-back, but that’s not in the outline as of right now (what do you think belongs in it?).

I don’t know what it’s like for you to read outlines and see little bits of the book being assembled, but the process of writing this book is just fascinating to me. It’s endlessly interesting and educational — just the process of writing, let alone the subject matter! This is a really fun project. A heck of a lot of work, but fun nonetheless, and the openness of the project makes it even more fun for me. I’ve learned a lot of surprising and interesting things about writing. I keep wishing I had time to write about this process, but I really need to keep my eye on the deadlines and put that off for later.

Anyway, the usual requests apply: what’s missing, what do you think is cool and should be included, etc etc? Thanks, as usual, for your time and feedback.

Technorati Tags:, , , , , , ,

You might also like:

  1. More progress on High Performance MySQL, Second Edition
  2. What are your favorite MySQL replication filtering rules?
  3. Progress report on High Performance MySQL, Second Edition
  4. High Performance MySQL, Second Edition: Backup and Recovery
  5. Progress on High Performance MySQL, Second Edition

Introducing MySQL Parallel Restore

Download MySQL Toolkit

The new release of MySQL Toolkit (version 1051) updates MySQL Parallel Dump in minor ways, but more importantly, it adds MySQL Parallel Restore.

MySQL Parallel Restore is the reverse of MySQL Parallel Dump. You give it one or more files and/or directories, and it discovers all the files contained within them and loads them in parallel. It understands how to load SQL and/or TXT/CSV files. If you give it some of both, it loads the SQL first and then loads the TXT/CSV as delimited files with LOAD DATA INFILE.

It does not parallelize a single table. That is, it doesn’t try to load two files into a table at the same time. But if you’re loading multiple tables, it will do them in parallel.

It has what I consider to be smart defaults. For example, by default it commits between each delimited file it loads. And since the dump tool makes it easy to dump a table in chunks, this makes it much easier on the server to restore a very large table.

I’ve been following the “release early, release often” philosophy with these two tools. You should test carefully before you trust them with your data. If you can’t restore your data accurately, it’s probably a bug. I’ve been testing with the following procedure:

mysql-table-checksum -a ACCUM localhost > checksum-before
mysqldump --all-databases | gzip -c - > sanity.gz
mysql-parallel-dump ... options ... 
mysql-parallel-restore ... options ... default/
mysql-table-checksum -a ACCUM localhost > checksum-after
mysql-checksum-filter checksum-before checksum-after

If the checksums don’t match after restoring, you can restore the original data from the sanity dump. I encourage you to report any bugs you find with this procedure. Incidentally, this exercise taught me that LOAD DATA INFILE is pretty hard to get just right. It has all sorts of weird dependencies on character sets that aren’t documented. That’s why I’m a little cautious and I’m asking you to tell me if you can’t restore correctly.

Technorati Tags:, , , ,

You might also like:

  1. MySQL Toolkit version 1254 released
  2. MySQL Toolkit version 946 released
  3. Maatkit version 1674 released
  4. MySQL Toolkit version 1030 released
  5. MySQL Toolkit version 989 released

Not a bug: MySQL Connector/J doesn’t make toast

I was just searching for something and found this MySQL bug report:

Other JDBC drivers I have used make toast for breakfast… MySQL Connector/J doesn’t make toast, it can only pour a bowl of froot loops…

…Thank you for taking the time to write to us, but this is not a bug… I believe you should look into a device called a \”toaster\” to make your toast.

Technorati Tags:, ,

No related posts.

High Performance MySQL, Second Edition: Schema Optimization and Indexing

I’ve been trying to circle back and clean up things I left for later in several chapters of High Performance MySQL, second edition. This includes a lot of material in chapter 4, Schema Optimization and Indexing. At some point I’ll write more about the process of writing this book, and what we’ve done well and what we’ve learned to do better, but for right now I wanted to complete the picture of what material we have on schema, index, and query optimization. The last two chapters I’ve written about (Query Performance Optimization and Advanced MySQL Features) have generated lots of feed back along the lines of “don’t forget X!” to which I’m obliged to reply “It’s in a different chapter.”

The truth is, it’s difficult to separate these topics sensibly. I’d like to do it in the mythical “perfect” way that serializes into a nice narrative without cross-references, but even the perfectionist in me wilts under the glare of deadlines. As a result, I don’t know if it’s really possible for us to completely avoid cross-references. (I do know there’s room for improvement in how we’ve arranged the material, but I’ve spent a lot of the day today trying to de-dupe some topics we wrote about in two places, and I’m coming to appreciate that re-organizing is an extraordinary amount of work, especially in OpenOffice.org — but more on that later).

All this is a preface to the following sentence: schema, indexing, advanced features, and query optimization are intermingled to some extent in the three chapters, even though we tried to separate the topics sensibly. I haven’t yet taken some of the suggestions I got in comments on the last chapter I posted. Like I said, reorganizing is a lot of work :-)

Here’s the outline. I have the same kinds of questions as before: what are we forgetting, do you have any questions or topics you’d like us to cover, etc? Comments are welcome.

[Update: I forgot to mention the vital statistics. So far it’s about 55 pages printed.]

[Intro]
Choosing Optimal Data Types
  General Guidelines for Data Storage
    Smaller is Usually Better
    To NULL or not to NULL?
    Choose Identifiers Carefully
  How to Choose a Good Data Type
    Numeric Types
    BIT Strings
    String Types
      [sidebar: Generosity can be Unwise]
    BLOB and TEXT Types
      [sidebar: How to Avoid On-Disk Temporary Tables]
    Using ENUM Instead of a String Type
    Date and Time Types
      [sidebar: Watch out for automatic migration programs]
Indexing Basics
  Types of Indexes
    BTREE Indexes
      Types of Queries that can Use a BTREE Index
      Indexed Column Isolation
    Prefix Indexes
    HASH Indexes
    Rolling Your Own HASH Indexes
    RTREE Indexes
    FULLTEXT Indexes
    Clustered Indexes
    Covering Indexes
  Index Scans and Using Indexes for Sorting
  Packed (Prefix-Compressed) Indexes
  Redundant and Duplicate Indexes
  Indexes and Locking
  Indexing Strategies
  An Indexing Case Study
    Supporting Many Kinds of Filtering
    Avoiding Multiple Range Conditions
    Optimizing Sorts
  Index and Table Maintenance 
    Finding and Repairing Table Corruption
    Updating Index Statistics
    Reducing Index Fragmentation
Normalization and Denormalization
  Pros and Cons of a Normalized Schema 
  Pros and Cons of a Denormalized Schema
  A Mixture of Normalized and Denormalized
  Cache and Summary Tables
    [sidebar: The Principle of Faster SELECT and Slower UPDATE]
Notes on Storage Engines
  MyISAM
  Memory
  InnoDB

Here’s a snippet of “what it’s like to write this book” that I’ll throw out there. OpenOffice.org, at least the version I’m using, doesn’t like O’Reilly’s custom heading styles and won’t show me an outline view of the document. I’m copying and pasting into this blog post by scrolling from one heading to the next. This is always enlightening, because as you can see a lot of the material isn’t organized correctly in the hierarchy. Guess what, it’s my first look at the chapter’s real outline, too! This isn’t the outline we planned to have, but the chapter evolved because of making localized changes without any real way to zoom out and make sure the outline still made sense. So my two comments on this are a) OpenOffice.org hasn’t been the most helpful tool in some ways and b) these blog posts are, to some extent, airing the project’s dirty laundry (illogical outlining, difficult separation of material among chapters, etc). I’m not afraid of that; I think it’s healthy and will help the book be better as a result. I guess my experience with open source, combined with my employer’s open-books policy, has taught me to embrace transparency instead of fearing it. In the end this material will be organized and make a lot of sense, but that’s a process of evolution — not intelligent design.

As I said, at some point I’ll write more about the process of writing. It’s been educational, and most bloggers I know who’ve written a book don’t say much about it (they just pop their heads up every now and then to apologize for not blogging). Very briefly: if you dream of writing a book, do it. It helps that my boss and co-workers support me in this venture, but it’s worth it regardless.

Technorati Tags:, , , , ,

You might also like:

  1. Progress report on High Performance MySQL, Second Edition
  2. Organizing High Performance MySQL, 2nd Edition
  3. High Performance MySQL, Second Edition: Advanced SQL Functionality
  4. High Performance MySQL, Second Edition: Backup and Recovery
  5. High Performance MySQL, Second Edition: Query Performance Optimization

Proposed bounty on MySQL Table Sync features

I am considering taking some time off work to concentrate deeply on MySQL Table Sync, which has been getting usage in very large companies whose names we all know. There are a lot of bugs and feature requests outstanding for it. It is overly complex, needs a lot of work, and I can’t do it in one-hour or even three-hour chunks. I need to focus on it. I’m considering asking for a bounty of $2500 USD for this. Please let me know what you think of this; it seems to be a successful way to sponsor development on some other projects, like Vim.

For the amount of time I think this will take, $2500 is far below my per-hour consulting rate; I considered setting the bounty higher, but I think this will be a fair amount.

I would not begin this project before December at the earliest, so there’s some time to raise funds and time for me to continue working on High Performance MySQL. I would like a volunteer to coordinate the fund-raising for me. It should be trivial, but I don’t want to do it myself, for several reasons. I can publicize the bounty on this blog and the project mailing list, and contact some of the corporations that have asked me for features. I doubt it will be hard to raise the money.

I’m not committing to this, just proposing it, though I did run it by my employer, who is very supportive. Here’s the list of features I propose to implement:

  • Writing a test suite
  • Bi-directional syncing
  • Syncing many tables
  • Syncing tables without a primary key
  • Providing useful exit codes and more informational output
  • Syncing in chunks
  • Checking privileges before syncing
  • Syncing based on pre-computed checksums
  • Automatically choosing sensible parameters based on table structure
  • Making default locking and other behaviors smarter

Alternatively, if someone wants to do it and just contribute the code to the project, I’d be delighted. I doubt that will happen, though, and there’d still be a lot of work in it for me, so I think it’s probably more realistic that I will do it.

Technorati Tags:, , , ,

You might also like:

  1. MySQL Table Sync bounty: let’s do it!
  2. Introducing MySQL Table Sync
  3. Progress on Maatkit bounty, part 2
  4. MySQL Table Checksum 1.1.5 released
  5. Progress on Maatkit bounty

MySQL Toolkit version 1030 released

Download MySQL Toolkit

This release of MySQL Toolkit updates MySQL Parallel Dump. Together you and I found a few bugs in it (table locking, argument quoting, exit status code). The restore utility is in progress.

Technorati Tags:, , , ,

You might also like:

  1. MySQL Toolkit version 1011 released
  2. Progress on High Performance MySQL Backup and Recovery chapter
  3. MySQL Toolkit version 946 released
  4. Introducing MySQL Parallel Restore
  5. MySQL Toolkit version 1254 released