Archive for the ‘SQL’ Category
I’ve been considering using TokuDB for a large dataset, primarily because of its high compression. The data is append-only, never updated, rarely read, and purged after a configurable time.
I use partitions to drop old data a day at a time. It’s much more efficient than deleting rows, and it lets me avoid indexing the data on the time dimension. Partitioning serves as a crude form of indexing, as well as helping purge old data.
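For illustration, here is a minimal sketch of that pattern in MySQL; the table and column names are hypothetical, but the partitioning technique is the standard one:

```sql
-- Hypothetical append-only table, range-partitioned by day.
-- Note there is no index on ts; the partitioning itself acts
-- as a crude time index.
CREATE TABLE metrics (
  ts        DATETIME NOT NULL,
  device_id INT      NOT NULL,
  value     DOUBLE   NOT NULL
)
PARTITION BY RANGE (TO_DAYS(ts)) (
  PARTITION p20130901 VALUES LESS THAN (TO_DAYS('2013-09-02')),
  PARTITION p20130902 VALUES LESS THAN (TO_DAYS('2013-09-03')),
  PARTITION p20130903 VALUES LESS THAN (TO_DAYS('2013-09-04'))
);

-- Purging a day of data is a fast metadata operation,
-- not a row-by-row DELETE:
ALTER TABLE metrics DROP PARTITION p20130901;
```

In practice you'd also add a new partition each day (`ALTER TABLE ... ADD PARTITION`) ahead of the incoming data, but the key point is that `DROP PARTITION` discards a whole day at once without scanning or logging individual rows.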
I wondered if TokuDB supports partitioning. Then I remembered some older posts from the Tokutek blog about partitioning. The claim is that “there are almost always better (higher performing, more robust, lower maintenance) alternatives to partitioning.”
I’m not sure this is true for my use case, for a couple of reasons.
First, I clearly fall into the only category that the flowchart acknowledges may be a good use case for partitioning: I do need instant block deletes. Paying for data ingestion as well as purging doesn't make sense in my case. It's like eating a very hot curry: I don't want to feel the pain on the way out too :-)
Second, data size matters a lot. If I need to create a redundant index on the timestamp dimension, then no matter how good TokuDB's compression is, it will inflate my storage and I/O costs. It will also make my backups bigger, and so on. I don't want an index that I don't need. My queries operate very efficiently without the timestamp index, and creating one only to help delete older data fast wouldn't make sense.
In the end I got sidetracked and decided to write this blog post. And I didn’t find out whether TokuDB supports partitioning or not! Silly me.
If you’re in the Washington, DC area on Sept 12th, be sure to attend Percona University. This is a free 1-day mini-conference to bring developers and system architects up to speed on the latest MySQL products, services and technologies. Some of the topics being covered include Continuent Tungsten; Percona XtraDB Cluster; MySQL Backups in the Real World; MariaDB 10.0; MySQL 5.6 and Percona Server 5.6; Apache Hadoop.
I’ll be speaking about using MySQL with Go. I’ll talk about idiomatic database/sql code, available drivers for MySQL, and tips and tricks that will save you time and frustration.
Continuent is sponsoring a complimentary breakfast and Percona will also provide refreshments throughout the day, along with a raffle for a chance to win cool t-shirts, copies of “High Performance MySQL,” and a few other great prizes.
It’s free but space is limited, so register!
I’ll be joining Percona for a free day of MySQL education and insight at their upcoming Percona University Washington DC event on September 12th. My topic is accessing MySQL from Google’s Go programming language. I’ve learned a lot about this over the past year or so, and hopefully I can help you get a quick start.
If you’re not familiar with Go, it’s the darling of the Hacker News crowd these days. Anything with “Go” in its title gets to the front page for at least a little while! Go is a great systems programming language. It’s safe to say I’ve fallen in love with it, and it’s now my favorite programming language of all those I’ve used over my entire career. I chose it because it’s ideally suited for VividCortex’s agent programs (zero dependencies, compiled, lightweight, high performance, robust, makes concurrency easy and safe), and we’re using it for our API servers and backend processing jobs for many of the same reasons.
There’s a lot of great content at these free Percona University events. If you’re not near Washington DC, you should sign up for Percona’s conferences and training newsletter so you find out about the next one near you.