The pace of MySQL engineering has been pretty brisk for the last few years. I think that most of the credit is due to Oracle, but one should not ignore Percona, Monty Program, Facebook, Google, Twitter, and others. Not only are these organizations (and the individuals I haven’t mentioned) innovating a lot, they’re putting pressure on Oracle to keep up the improvements, too.
But if you look back over the last few years, MySQL is still functionally a lot like it used to be. OK, we’ve got row-based binary logging — but we had binary logging and replication before, this is just a variation on a theme. Partitioning — that’s a variation on a theme (partitioned tables are a variation on non-partitioned tables). Performance — same thing, only faster. And so on.
I’m painting things with too broad a brush. There’s actually a lot of stuff that’s NOT just a variation.
But if you look around at what’s out there in other open-source DBs, there’s a lot of innovation, particularly in PostgreSQL, which has had CTEs (common table expressions) for a while. CTEs are not a variation on a theme. They are a major new feature, analogous to going from no-subquery-support to supports-subqueries. They enable a lot of things like recursive queries, making a SQL database useful in many more types of situations: think graph processing, for example, which is downright annoying without them.
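To make the graph-processing point concrete, here’s a minimal sketch of a recursive CTE that finds every node reachable from a starting node. It runs against an in-memory SQLite database through Python’s `sqlite3` module, purely because that is self-contained and easy to try; it is not MySQL. The `edges` table, its columns, and the `reachable` helper are names I made up for illustration.

```python
import sqlite3


def reachable(edges, start):
    """Return every node reachable from `start`, using a recursive CTE.

    `edges` is a list of (src, dst) pairs. The table and column names
    are hypothetical, chosen only for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
    conn.executemany("INSERT INTO edges VALUES (?, ?)", edges)
    rows = conn.execute(
        """
        WITH RECURSIVE reach(node) AS (
            SELECT ?                      -- anchor: the starting node
            UNION                         -- UNION (not UNION ALL) drops
                                          -- duplicates, so cycles terminate
            SELECT e.dst
            FROM edges e
            JOIN reach r ON e.src = r.node
        )
        SELECT node FROM reach ORDER BY node
        """,
        (start,),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    graph = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "e")]
    print(reachable(graph, "a"))
```

Without a recursive CTE, the usual workaround is to loop in application code and issue one query per hop, which is exactly the kind of annoyance I mean.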
Will we see CTEs in MySQL soon?
I’ve been considering using TokuDB for a large dataset, primarily because of its high compression. The data is append-only, never updated, rarely read, and purged after a configurable time.
I use partitions to drop old data a day at a time. It’s much more efficient than deleting rows, and it lets me avoid indexing the data on the time dimension. Partitioning serves as a crude form of indexing, as well as helping purge old data.
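As a rough sketch of what daily partition rotation can look like, the helper below builds the `ALTER TABLE ... ADD PARTITION` and `ALTER TABLE ... DROP PARTITION` statements for a table that is range-partitioned by day. The table name `events`, the `pYYYYMMDD` naming scheme, and the assumption that the partitioning expression is `TO_DAYS()` over a `DATETIME` column are all hypothetical; they are not details of my actual setup. The code only generates SQL strings and does not connect to a server.

```python
from datetime import date, timedelta


def partition_name(day):
    """Hypothetical naming scheme: one partition per day, e.g. p20130912."""
    return "p" + day.strftime("%Y%m%d")


def add_partition_sql(table, day):
    """Statement that adds the partition holding rows for `day`.

    Assumes the table is partitioned BY RANGE (TO_DAYS(<datetime column>)),
    so the partition's upper bound is the start of the following day.
    """
    upper = day + timedelta(days=1)
    return (
        f"ALTER TABLE {table} ADD PARTITION "
        f"(PARTITION {partition_name(day)} "
        f"VALUES LESS THAN (TO_DAYS('{upper.isoformat()}')))"
    )


def drop_partition_sql(table, day):
    """Statement that discards a whole day of data in one operation."""
    return f"ALTER TABLE {table} DROP PARTITION {partition_name(day)}"


def expired_days(partition_days, today, retention_days):
    """Days whose partitions are older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    return sorted(d for d in partition_days if d < cutoff)


if __name__ == "__main__":
    existing = [date(2013, 9, d) for d in (1, 2, 3, 4)]
    for day in expired_days(existing, date(2013, 9, 5), retention_days=2):
        print(drop_partition_sql("events", day))
    print(add_partition_sql("events", date(2013, 9, 6)))
```

Dropping a partition is a metadata-level operation on a whole chunk of the table, which is why it beats a `DELETE ... WHERE` over the same rows and why no timestamp index is needed just for purging.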
I wondered if TokuDB supports partitioning. Then I remembered some older posts from the Tokutek blog about partitioning. The claim is that “there are almost always better (higher performing, more robust, lower maintenance) alternatives to partitioning.”
I’m not sure this is true for my use case, for a couple of reasons.
First, I clearly fall into the only category that Tokutek’s flowchart acknowledges may be a good use case for partitioning: I do need instant block deletes. Paying for data ingestion as well as purging doesn’t make sense in my case. It’s like eating a hot hot curry: I don’t want to feel the pain on the way out too :-)
Second, data size matters a lot. If I need to create a redundant index on the timestamp dimension, no matter how good TokuDB’s compression is, it’ll inflate my storage and I/O costs, make my backups bigger, and so on. I don’t want an index that I don’t need. My queries operate very efficiently without the timestamp index, and creating one only to help delete older data fast wouldn’t make sense.
In the end I got sidetracked and decided to write this blog post. And I didn’t find out whether TokuDB supports partitioning or not! Silly me.
If you’re in the Washington, DC area on Sept 12th, be sure to attend Percona University. This is a free 1-day mini-conference to bring developers and system architects up to speed on the latest MySQL products, services and technologies. Some of the topics being covered include Continuent Tungsten; Percona XtraDB Cluster; MySQL Backups in the Real World; MariaDB 10.0; MySQL 5.6 and Percona Server 5.6; Apache Hadoop.
I’ll be speaking about using MySQL with Go. I’ll talk about idiomatic database/sql code, available drivers for MySQL, and tips and tricks that will save you time and frustration.
Continuent is sponsoring a complimentary breakfast and Percona will also provide refreshments throughout the day, along with a raffle for a chance to win cool t-shirts, copies of “High Performance MySQL,” and a few other great prizes.
It’s free but space is limited, so register!