A very fast FNV hash function for MySQL

I wrote a User-Defined Function that implements the FNV (Fowler-Voll-No) hash function for MySQL. I’m not the first person to do this — in fact, I was inspired by the Google patches for MySQL. But my implementation is a little bit different from most, in a very important way that leads directly to much higher performance, especially suited for the Maatkit tools.

A bit of background: FNV hashing is a very fast hash algorithm that operates in fixed memory. It is widely used in lots of important areas in computer science. My implementation requires absolutely no malloc() calls, which is a darn good thing because I am not to be trusted with malloc(), having spent too many years programming in managed languages. I made it return a 64-bit integer, which matches the size MySQL uses internally for most integer arithmetic.

The most important thing I did was make my UDF accept 1 to infinity arguments. That means you can hash entire rows with a single function call. And that is very useful for the Maatkit table-checksumming tools, which tend to run about 8-10 times faster when they don’t have to make MySQL do a bunch of string concatenation. That translates directly to less impact on the server, and less slave lag (if that is a problem for you).

Here’s how my implementation works:

SELECT FNV_64(col1, col2, col3, .... colN) FROM ...

Compare this to MD5() hashing that accomplishes the same thing:

SELECT MD5(CONCAT_WS('#', col1, col2, col3, .... colN)) FROM ...

The UDF’s code is distributed with Maatkit, and I plan to eventually build it as a binary that can be installed without requiring you to compile it. However, compiling is very easy; there are instructions in the source code comments. Installing is also easy: just a simple SQL statement.

If you’re using Maatkit to make sure your slaves have the same data as their master, you should install the UDF on all your servers for a significant performance boost. You’ll save your servers a lot of work. You don’t need to do anything extra for Maatkit to take advantage of it. Maatkit will auto-detect it if you have it installed.

I’ve been running it in production for a couple of months now with nothing but good results. And the code is drop-dead simple, so I think the chance of bugs is very slim. But if you have questions, problems, bug reports etc, please use the Maatkit project page to report them.

Technorati Tags:, , , ,

You might also like:

  1. Get Maatkit fast from the command line
  2. Speed up your MySQL replication slaves

6 Responses to “A very fast FNV hash function for MySQL”


  1. 1 Robin

    Hi! Great work. I’ve been using FNV hash in several applications in the recent past (e.g. Ketama consistent hashing lib for Memcache).
    Last week I found an ever faster hash algorithm, called MurmurHash. It may be worth to check it out: http://tanjent.livejournal.com/756623.html

  2. 2 Mark Callaghan

    Have you submitted a feature request to get this function added to MySQL? md5() is slower and produces a much larger result. crc32() isn’t a good hash function. People are now forced to add your UDF or apply th e Google patch to get this and most will do neither.

  3. 3 Xaprb

    Robin, thanks for the tip.

  4. 4 Xaprb

    Mark, thanks for the suggestion. I just submitted it: http://bugs.mysql.com/35188

    My overall feature request is basically “Integrate the Google patches already!” :-)

  5. 5 jim

    making md5() and crc32() allow multiple arguments would be a good feature request, too.

  6. 6 Marcus

    A little off-topic, but something I’d like to see is more compact encodings of hash values - your FNV can scrape through as it can fit into a 64-bit int, but when using a longer 128- or 160-bit hash, it would be nice to produce base-62 (DNS-safe) or base-64 strings instead of the default inefficient [0-9a-f] hex encoding.

Leave a Reply

Please do not use this blog to get help with problems or bugs in Maatkit or innotop: use the Sourceforge forums, mailing list, or bug trackers. If you're asking for help with MySQL, please use the MySQL mailing list instead.