Archive for the ‘memcached’ tag
Key-value databases are catching fire these days. Memcached, Redis, Cassandra, Keyspace, Tokyo Tyrant, and a handful of others are surging in popularity, judging by the contents of my feed reader.
I find a number of things interesting about these tools.
- There are many more of them than open-source traditional relational databases. (edit: I mean that there are many options that all seem similar to each other, instead of 3 or 4 standing out as the giants.)
- It seems that a lot of people are simultaneously inventing solutions to their problems in private without being aware of each other, then open-sourcing the results. That points to a sudden sea change in architectures. Tipping points tend to be abrupt, which would explain isolated redundant development.
- Many of the products are feature-rich with things programmers need: diverse language bindings, APIs, embeddability, and the ability to speak familiar protocols such as the memcached protocol.
- I think there are more solutions here than the ecosystem will support, and in five years a few will stand out as the most popular.
- This process of paring down the gene pool is win-win because they’re open-source, and nothing will be lost.
- Choosing which one to use is no easy task even for a highly skilled, technical, up-to-date person. Perhaps the decision-makers will choose based on the availability of commercial support and consulting.
- Many of them offer built-in, dead-simple, distributed, synchronous replication. This is very difficult to achieve with traditional relational databases. What makes key-value databases different? They don’t have MVCC, for one thing; but I’m not sure of the complete answer to that question, to tell the truth.
We live in interesting times.
Ryan posted an article on the MySQL Performance Blog about how to use mk-query-digest to analyze and understand your memcached usage with the same techniques you use for MySQL query analysis. This is an idea that came to me during the 2009 MySQL Conference, while talking to our friends from Schooner, who sell a memcached appliance.
It suddenly struck me that the science of memcached performance is basically nonexistent, from the standpoint of developers and architects. Everyone treats it as a magical tool that just performs well and doesn’t need to be analyzed, which is demonstrably and self-evidently false. memcached itself is very fast, true, so it doesn’t usually become a performance bottleneck the way a database server does. But that’s not the point. There is a lot to win or lose in the way you use it, which can heavily influence your application’s performance. That’s what the new features in mk-query-digest are designed to analyze.
Here’s an example of the types of problems we’ve seen in production memcached usage, which are very hard to catch without a good tool. What if a “global” value is accidentally stored with a key that includes the user ID? This will cause the value to be duplicated again and again for every user, instead of being stored once. There are really only two ways to catch this: 1) know the application’s source code inside and out, or 2) analyze the memcached traffic scientifically. (Even if you know the source code well, there’s a good chance you can miss a bug like this.) I could go on listing the types of problems you can inadvertently create with a key-value database, but I’ll leave it at that.
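To make the duplicated-value problem concrete, here is a minimal Python sketch of the kind of check that traffic analysis makes possible. It is not how mk-query-digest works internally; the function names, the key format (`user:<id>:site_banner`), and the sample data are all hypothetical, and it assumes you have already captured the `set` operations as (key, value) pairs by some other means. The idea is simply to group stores by a digest of the value and flag any value that lives under several distinct keys.

```python
import hashlib
import re
from collections import defaultdict


def normalize_key(key):
    """Replace runs of digits with '?' so keys that differ only by an ID share a pattern."""
    return re.sub(r"\d+", "?", key)


def find_duplicated_values(set_ops, min_keys=2):
    """Report values that were stored under at least `min_keys` distinct keys.

    `set_ops` is an iterable of (key, value_bytes) pairs, assumed to have been
    extracted from captured memcached traffic (hypothetical input format).
    """
    keys_by_digest = defaultdict(set)
    size_by_digest = {}
    for key, value in set_ops:
        digest = hashlib.sha1(value).hexdigest()
        keys_by_digest[digest].add(key)
        size_by_digest[digest] = len(value)

    report = []
    for digest, keys in keys_by_digest.items():
        if len(keys) >= min_keys:
            report.append({
                "patterns": sorted({normalize_key(k) for k in keys}),
                "key_count": len(keys),
                # Bytes that would be saved if the value were stored only once.
                "wasted_bytes": size_by_digest[digest] * (len(keys) - 1),
            })
    # Worst offenders first.
    report.sort(key=lambda row: row["wasted_bytes"], reverse=True)
    return report


# Hypothetical captured traffic: a "global" banner stored once per user.
sample = [
    ("user:1:site_banner", b"Welcome to the site!"),
    ("user:2:site_banner", b"Welcome to the site!"),
    ("user:3:site_banner", b"Welcome to the site!"),
    ("user:1:profile", b"alice"),
    ("user:2:profile", b"bob"),
]

for row in find_duplicated_values(sample):
    print(row)
```

On the sample above, only the banner is flagged, because the per-user profile values really do differ. Collapsing the digits in the keys into a pattern is a crude heuristic, but it is usually enough to show which part of the key is the offender.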
The features are only available in trunk, and will be released with this month’s scheduled release.
I’ve packaged up and released version 1.1.2 of the Cacti templates I’ve written for MySQL, Apache, memcached, nginx etc.
Anyone who would like to help write documentation (or do anything else, for that matter) is welcome to participate. I’ll give commit access at the drop of a hat.
2009-05-07: version 1.1.2
* The parsing code did not handle InnoDB plugin / XtraDB (issue 52).
* The servername was hardcoded in ss_get_by_ssh.php (issue 57).
* Added Handler_ graphs (issue 47).
* Config files can be used instead of editing the .php file (issue 39).
* binary log space is now calculated without a MySQL query (issue 48).
* There was no easy way to force inputs to be filled (issue 45).
* Some graphs were partially hidden without --lower-limit (issue 43).
* Flipped some elements across the Y axis (issue 42).
* Added Apache, Nginx, and GNU/Linux templates.
* Unknown output is now -1 instead of 0 to prevent spikes in graphs.
* If you want to use a script server, you must now explicitly configure it.
* UNIX sockets weren't permitted for MySQL (issue 38).