Archive for June, 2008
MySQL challenge: LIMIT rows accessed, not rows returned
Dear reader, this is a challenge. How’s your MySQL prowess? You know about LIMIT: it cuts off the results at the specified number.
mysql>s; select actor_id from sakila.actor where actor_id % 5 = 0 limit 5; +----------+ | actor_id | +----------+ | 5 | | 10 | | 15 | | 20 | | 25 | +----------+ 5 rows in set (0.00 sec)
But that query actually accessed 25 rows. What if I want to say “return up to 5 rows, but don’t read any more than 20 rows to find them?”
Right now I’ve got the following:
mysql> select actor_id, @rows
-> from actor, (select @rows := 0) as x where
-> ((@rows := @rows + 1) <= 20)
-> and actor_id % 5 = 0
-> limit 5;
+----------+-------+
| actor_id | @rows |
+----------+-------+
| 5 | 5 |
| 10 | 10 |
| 15 | 15 |
| 20 | 20 |
+----------+-------+
4 rows in set (0.00 sec)
The derived table subquery x is only there to initialize the user variable at the beginning of the query.
This appears to work, but it doesn’t. If you profile this with SHOW STATUS, you see that it reads every row in the table (Handler_read_next = 200). This is actually worse, not better, than just LIMIT.
Any ideas?
I’ve got a few. But I don’t like them for various reasons. Extra props for really efficient solutions that don’t involve subqueries (so it’ll work on pre-4.0) or things that add extra overhead (subqueries, for example). I guess you probably see the direction I want to go with this — I don’t want to use subqueries.
What if you find errors in High Performance MySQL?
The book is done now, right? What’s next?
Don’t tell my wife this, but a book is never done.
Right now I’m proofreading the printed copy. I proofread PDF after PDF during production, but some problems will always slip through and make it to paper. I’m finding quite a few little mistakes. For example, at one point we refer to TPC as TCP three times in a row. Oops.
These problems will be corrected in the next printing. Please notify me if you find any errors yourself, and I’ll add them to the list of things to fix! Also let me know if you find things that should just be “fixed” in general. For example, the layout and page-breaking on pages 364 and 365 is totally confusing — it’s hard to tell which figures are associated with which text.
I’m not offering rewards like Donald Knuth, sorry…
I will place a list of errata on the official High Performance MySQL Second Edition website, including a non-mangled Figure 8-12. (Another thing that didn’t make it through production unscathed.)
Percona wants to hire a Maatkit developer
Percona is looking to hire someone to develop Maatkit, among other things.
If I weren’t having so much fun being the consulting team lead, I’d be doing it myself. (In fact, I’m still hacking on it a lot. Got some pretty fun stuff done this weekend.) I don’t know what the rest of the world thinks, but I think Maatkit is a damn enjoyable project to work on. Hopefully someone else will have the same kind of mindset and want to get paid for it, unlike poor working-on-the-weekends me.
I’m not stepping away from the project. It’s just grown a lot, and there is room and money to grow it much more. This is actually the best compliment to the project: that it is worth hiring someone to keep improving it. Lots of people are using it, and there’s a lot of stealth-mode stuff I/we want to do with it too.
On a related note, who wants me to order another batch of Maatkit t-shirts? I’ve gotten quite a few questions about it.

