Great things afoot in MySQL 5.5
I haven’t been blogging much about the changes in MySQL for a while. But the MySQL and InnoDB engineers have been doing a ton of work over the last couple years, and now it’s seeing the light of day, so it’s time to offer words of congratulations and appreciation about that.
I was holding my breath for a big-splash 5.5 GA announcement at last week’s conference, but I was wrong. Still, there’s a lot to talk about in 5.5, with a dozen or more substantial blog posts from the InnoDB and MySQL teams alone over the last week or so! Here are a few choice tidbits of the good, the bad, and the ugly.
InnoDB is the default storage engine
“No big deal,” I thought. “Serious users do this anyway.” But then Morgan Tocker pointed out that it really is a big deal. This is going to cause a sea change in the way MySQL is used. Instead of growing up on a storage engine that doesn’t give a damn about your data (MyISAM), and then learning about one that does (InnoDB), users will be plunged into a much more advanced and complex storage engine from day one. We’re going to be seeing a lot more people learning internals, a lot more pressure on InnoDB’s seams, a lot more of everything InnoDB. And a lot less MyISAM. Instead of “why would I switch from MyISAM to InnoDB?” we’ll be hearing “I hear there’s this MyISAM thing, when should I use that?”
This was a very smart move on MySQL’s part.
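Here is a minimal sketch of what the new default means in practice, assuming a 5.5 server with stock settings. The table names are made up for illustration, and the engine reported depends on how your server is configured.

```sql
-- The row whose Support column reads DEFAULT is the server's default engine.
SHOW ENGINES;

-- Hypothetical table: no ENGINE clause, so it picks up the server default.
CREATE TABLE example_default (id INT PRIMARY KEY);

-- On a stock 5.5 server this should report ENGINE=InnoDB.
SHOW CREATE TABLE example_default;

-- MyISAM is still there, but now you have to ask for it by name.
CREATE TABLE example_myisam (id INT PRIMARY KEY) ENGINE=MyISAM;
```

Anything that creates tables without naming an engine, including old schema scripts, will quietly land on InnoDB.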
InnoDB improvements
This is a mixed bag. Some improvements are awesome. Some look like improved implementations of the changes in XtraDB, which is also awesome.
In the “we peeked at XtraDB” department, I’m thinking about improvements to recovery time, a separate purge thread (inspired in XtraDB by a Sun employee’s patch), and changes to enable multiple rollback segments. The concepts for these are proven by XtraDB users to be tremendously effective in the real world, and I am hopeful that InnoDB has done a great job of implementing the concepts. The changes to recovery seem even better than Yasufumi’s work, although it’s not clear yet whether InnoDB’s recovery is really any faster than XtraDB’s. InnoDB took a tremendous leap forward by implementing these changes.
I am not that thrilled with multiple buffer pools. This looks to me like saying “it’s a hard problem and we don’t know the best solution, but how about we try this classic idea.” The buffer pool is already hard to manage (can’t be resized at runtime, can’t pin a table or index into it…) and it looks like this doesn’t improve that. Instead, it looks like a zero-sum game with respect to really advancing the internal architecture, done solely for performance reasons, and I think it’s a local optimization that won’t be very future-proof. I predict this will be changed somewhat in the future. Unless other problems with the buffer pool are fixed, any future enhancements to multiple pools will probably have ugly problems such as fragmentation. I know it’s not very helpful for me to criticize without offering suggestions, but the truth is I’m aware that all my suggestions would be hand-waving (“avoid mutexes on global structures,” duh).
The work on splitting up mutexes is complex and I don’t have an opinion on it yet. Benchmarks are great, but the real world often holds unpleasant surprises. So we’ll see about the true performance changes. I know InnoDB has done a ton of work on this, but it seems to me that Percona had reasons not to do things the way InnoDB did. The great thing about this is that if InnoDB’s approach suffers in some workload, then Percona will be able to construct a benchmark to show pretty graphs about it!
The async I/O worries me; I/O is not well instrumented, and that’s a complex change. Yet another mysterious thing that can behave badly and be difficult or impossible to understand.
I suspect that delete buffering can go completely off the rails, in the same ways that insert buffering can. At the time of writing, there is only very crude control over the buffering mechanism. There is no way to control how large the buffer is, or make InnoDB unload the buffer in the background (XtraDB lets you do those things). But I would not be surprised if InnoDB is working on this limitation. I think this is very low-hanging fruit. The behavior of the insert buffer is utterly bizarre sometimes (“I ran STOP SLAVE, why the heck did my completely idle server start flushing tens of gigs of data to disk and become unresponsive for half an hour afterwards?”). The implementation is just silly: you cannot prevent the insert buffer from putting pressure on the LRU list and forcing things out of the buffer pool that you really want there. And the last slide of Yasufumi’s presentation showed clearly how the lack of control over the buffer causes performance problems in InnoDB, and makes XtraDB beat even the newest InnoDB versions in some benchmarks.
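To make “very crude control” concrete, here is a sketch of roughly what stock InnoDB exposes, assuming the `innodb_change_buffering` variable as it appears in the 5.5 milestone releases. Treat the accepted values as an assumption to check against your own build.

```sql
-- See which kinds of changes are currently being buffered.
SHOW GLOBAL VARIABLES LIKE 'innodb_change_buffering';

-- Buffer inserts only, as older InnoDB versions did.
SET GLOBAL innodb_change_buffering = 'inserts';

-- Or switch buffering off entirely.
SET GLOBAL innodb_change_buffering = 'none';
```

That is the whole dial: you choose which operations get buffered. There is nothing here to cap the buffer's size or to make InnoDB merge it in the background.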
Performance Schema
I’m completely unimpressed with Performance Schema, and have been from day one. It was an ivory-tower project created and developed in secret, and it bears no evidence of input from people with practical experience. What I see is useless for normal people; it’s useful only for MySQL and InnoDB developers, and not even a good solution for them. If you read around the blog posts and docs about it, you find a lack of any practical examples — and IMO that’s because it’s not possible to create good examples of how it can be useful. Instead, you see phrases such as “trace issues back to the relevant file and line in the source code so you can really see what’s happening behind the scenes.” I’m not the only one; Robin Schumacher panned it too.
Conclusions
I am really heartened to see MySQL not only continuing to work really hard on the server and on InnoDB, but also making all the hard work from the last few years finally available where it can be reviewed, praised or criticized, and put into production. I think that it’s time to take Don MacAskill’s praise of Percona last year (“great things are afoot”) and pass it over to MySQL and InnoDB! I hope the pace of development, and getting it out the door, continues as it is now.
Well, on the performance schema bit I just have to disagree.
In fact, I think it’s one of the more promising features that got into 5.5, simply because it delivers something that clearly hasn’t been there before in an extensible and accessible manner without needing to write a lot of adhoc server patches.
Now, saying that “Robin has panned” it is overstating what he actually said a little, no?
Yes, it’s not the same thing that was in his design docs. So what.
And also yes, it doesn’t cover everything Robin or you might want to see (and trust me, I’ve asked for things, too).
Indeed, I’ve written real-world examples with it (as early as one year ago, btw). The data you can get out of it is simply amazing, and with more probes it’s only going to get better.
What most people don’t, and can’t, realize is what enormous effort it took to get this off the ground and eventually integrated, mainly by Marc Alff and Peter Gulutzan. For that reason alone I take issue with using the term ivory-tower (because, to be honest, it is slightly derogatory).
You might want to read chapter 20 of the 5.5 docs again. It should become apparent what you can do with it, even today.
Kay Röpke
19 Apr 10 at 6:53 pm
Robin’s actual words: “New MySQL performance schema has a ways to go in my opinion to be truly useful for bottleneck analysis”.
The method shown in http://dev.mysql.com/doc/refman/5.5/en/performance-schema-examples.html is only useful in a fantasy world.
Xaprb
19 Apr 10 at 7:22 pm
I agree that the example in the docs is not useful for day to day stuff, it serves merely as an example of how to use the interface.
I guess that http://dev.mysql.com/doc/refman/5.5/en/compile-and-link-options.html would also be the realm of fantasy then?
You conveniently ignore what he said after that, right?
It is kinda interesting that he talks about how important wait information is, then says that performance schema falls short of his design document, which was, well, a list.
But somehow it got spec’ed, implemented, passed reviews and landed in a beta release _and_ is easily extensible, just because someone took up the challenge to actually do it.
In other words, we ended up with a flexible way of recording and querying the information, which admittedly is missing instrumentation at this point (refer to http://blogs.mysql.com/peterg/2010/04/18/performance-schema-at-mysql-user-conference/) but works.
Even Robin says that, by the way. Starts at 52:58 in http://www.youtube.com/watch?v=G_iaJ8TFwy8
Does it need more work? Yes.
Is it useless? Sorry, no.
Kay Röpke
19 Apr 10 at 7:48 pm
I associate Robin with hype for Falcon. Why does this discussion focus on him?
Until the big 5.5.4-m3 reveal, I was very negative about the PS as it didn’t provide much data for me. I changed my mind. First, it provides a foundation for scalable performance instrumentation and we will get much more in the future. Second, the future is now and InnoDB has begun to use it. Third, it provides a lot of features that I can use (lock free hash table, portable cycle timers, and more) to get even more data.
Mark Callaghan
19 Apr 10 at 11:44 pm
My first mistake was mentioning another name in passing to support my opinion. My second was responding defensively to a comment attacking that opinion.
Xaprb
20 Apr 10 at 8:01 am
Ok, discounting the mentioning of any single person, I still disagree with you ;)
As Mark says, InnoDB is on board with it now. For various reasons we couldn’t do that ourselves back in the day, and the lack of InnoDB instrumentation was one of the major criticisms voiced internally about a year or so back.
I think if you look at what performance schema makes possible today and in the future (once more and more probes get written) you will realize the practical value of it, even if it is not quite there yet today.
Also, please don’t forget that this is a beta release :)
Kay Röpke
20 Apr 10 at 8:21 am
Wow – I was 4 years at MySQL, was involved in a lot of things, yet I’m only associated with the promotion of a storage engine that got nixed. Funny…
I am looking forward to where PS goes in the future and the additions that will be made. I’m also eager to see how Drizzle implements the PS I created with the help of the MySQL CAB, and other customers and community members.
Robin
20 Apr 10 at 10:58 am
Sorry Robin, that was just my way of pointing out that ‘Robin panned this’ wasn’t the best way for Baron to make his point. Everyone associates you with the amazing progress being made at InfiniDB.
The marketing of Falcon that began in 2006 is why Oracle doesn’t promote features years ahead of a potential release. I would rather be surprised in a good way by things like the 5.5.4-m3 announcement at the conference.
Mark Callaghan
20 Apr 10 at 11:45 am
Mark – no sweat. And while I’d love to take credit for all the stuff happening with InfiniDB, the truth is the engineers are the superstars. I think you’d love working and interacting with them. I certainly do.
And I agree with you 100% on how things were done PR-wise with Falcon and that it’s much better to not prematurely trumpet any new feature until it’s ready for prime time.
Robin
20 Apr 10 at 2:10 pm
I believe I was a pretty vocal critic of the marketing around Falcon, and maybe Robin by extension. I think what InfiniDB is doing now — just saying what’s in the roadmap — is very nice.
I’m not sure if I owe any apologies for thoughtless name-dropping, or if the only one I’ve offended was myself. If I offended anyone else, I’m sorry.
I disagree with Mark. I generally don’t like surprises, even good ones. Although, back to the point of this blog post, I am happy about the newest 5.5 work.
Xaprb
20 Apr 10 at 2:21 pm
We have Hive for DW and I don’t have much time beyond working on OLTP — but I enjoy reading about the progress of InfiniDB.
@Baron – would you change your opinion of surprises if the surprises were smaller and less frequent because the releases were more regular and more frequent?
Mark Callaghan
20 Apr 10 at 2:30 pm
Probably somewhat. All of my opinions are subject to change.
Xaprb
20 Apr 10 at 2:56 pm
Any idea where we can get a copy of the presentation by Yasufumi that you talked about, or any of the presentations by the various Percona people?
Harrison Fisk
20 Apr 10 at 4:20 pm
We’re compiling them all and will post a blog post about it. Unfortunately some people are taking the long road home around the volcano, and can’t be reached right now, so it might be slow.
Xaprb
20 Apr 10 at 8:39 pm