Archive for the ‘Oracle’ tag
That’s right, I said InnoDB+, with a “plus” at the end. I didn’t know it existed until, while following some links from Monty’s appeal to save MySQL, I decided to read a Groklaw post that links to Eben Moglen’s letter to the EU Commission, which includes this text:
Innobase could therefore have provided an enhanced version of InnoDB, like Oracle’s current InnoDB+, under non-GPL license, …
I don’t know anything more. Do you?
This is easily one of the best books I’ve ever read on performance optimization. I’ve just finished reading it for the second-and-a-half time in two weeks, and I very rarely read a book more than once. I’ve been telling a lot of people about it.
Despite the title, it is actually not about Oracle performance. It is a book on how to optimize a) any system, including a MySQL-based application, and b) the process of optimization itself. It is very, very good, and I highly recommend it to all database users. My bet is that most people will learn more by reading this book than by spending thousands of dollars on conferences and training, especially since a lot of what you’ll learn from those sources is wrong and harmful. So not only will you save money and time on learning, you’ll reap great rewards thereafter.
The book is organized into three sections. The first section explains a performance optimization methodology called Method R (for response time) that is designed to deterministically advance from identifying what needs to be optimized, to collecting diagnostic data, and then choosing the correct activities to optimize. Simply put, it is probably the clearest and most logical process I’ve ever seen for focusing on what matters. It is quite similar to our process at Percona, which we call Goal-Driven Performance Optimization. But Cary does a really good job explaining it. Cary told me that he wrote the book while simultaneously developing Method R training classes, and did dozens of classes before the book went to press. You can tell!
The second and third parts of the book are about putting the method into practice.
Cary shows many typical mistakes, such as focusing on ratios, working on things that “look bad,” ignoring Amdahl’s Law, trying to draw conclusions about specific activities by examining system-wide information, and so on. He brings the focus back to response time again and again.
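To make the Amdahl’s Law point concrete, here is a small Python sketch of my own (it is not taken from the book). The response time profile and its numbers are made up for illustration; the only real content is the formula, which says the overall speedup is limited by the fraction of response time the improved activity accounts for.

```python
def amdahl_speedup(fraction, local_speedup):
    """Overall speedup when `fraction` of the response time
    is made `local_speedup` times faster (Amdahl's Law)."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if local_speedup <= 0:
        raise ValueError("local_speedup must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


# Hypothetical response time profile for ONE slow task, in seconds.
profile = {"disk reads": 8.0, "cpu": 1.5, "latch waits": 0.5}
total = sum(profile.values())

# Rank activities by their share of response time, then ask what
# doubling the speed of each one would buy for the task as a whole.
for name, seconds in sorted(profile.items(), key=lambda kv: kv[1], reverse=True):
    fraction = seconds / total
    overall = amdahl_speedup(fraction, 2.0)
    print(f"{name:12s} {fraction:5.0%} of response time -> {overall:.2f}x overall if made 2x faster")
```

Under these assumed numbers, the activity that “looks bad” but accounts for only 5% of the task’s response time can never improve the task by much, no matter how dramatically you fix it.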
Another typical “tuning mistake” is not knowing when to quit. Cary shows how to know when further performance improvements, even if they’re possible, will not be economically justified. At the point where the gains don’t exceed the cost, you’re done. The system is optimized. Maybe it’s not performing as well as it could, but if it costs too much to get any more performance, it’s still optimal.
This is a book that backs up every assertion with references to source material, or even with a proof or derivation from first principles. It is significantly more rigorous than most books of this type. There’s a very good chapter on kernel timings that explains a ton of stuff about how processes work, measurement intrusion effect, quantization error, and so on.
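If quantization error is new to you, the following Python sketch may help. It is my own illustration, not the book’s treatment, and it assumes a made-up clock that only ticks once per millisecond and a made-up event that really takes 300 microseconds. Integer microseconds are used so the arithmetic is exact.

```python
def measured_us(start_us, end_us, tick_us):
    """Duration as reported by a clock that only advances every
    `tick_us` microseconds: both timestamps are rounded down to a tick."""
    return (end_us // tick_us - start_us // tick_us) * tick_us


TICK_US = 1000   # hypothetical clock resolution: 1 ms
TRUE_US = 300    # hypothetical true duration of the event

# The same 300 us event, started at two different moments.
inside_one_tick = measured_us(100, 100 + TRUE_US, TICK_US)
across_boundary = measured_us(900, 900 + TRUE_US, TICK_US)
print(inside_one_tick, across_boundary)

# Measure the event once for every possible start offset within a tick.
samples = [measured_us(s, s + TRUE_US, TICK_US) for s in range(TICK_US)]
mean_us = sum(samples) / len(samples)
print(mean_us)
```

Any single measurement is either 0 or 1000 microseconds, so each one is wrong, yet in this idealized setup the errors cancel and the average over all start offsets comes out to the true 300 microseconds. Real timers and real workloads are messier than this, which is why the chapter is worth reading.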
There is a very good section on queueing theory, with excellent examples of using it to prove that a desired improvement is mathematically impossible and/or impractically expensive. There are also examples of how to use queueing theory to predict when a hardware upgrade will worsen your performance problems. (And lots of other examples of when “optimizing” something will actually make it worse: for example, improving the performance of some task other than the one you’re interested in, so that it uses and contends for more resources and hurts the performance of the task you wanted to improve.)
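Here is a rough Python sketch of the kind of reasoning queueing theory makes possible. I am assuming the simplest textbook model, a single-server M/M/1 queue, which is not necessarily the model the book uses, and the arrival rate and service time are invented numbers. In that model the mean response time is R = S / (1 - λS), where S is the mean service time and λ is the arrival rate.

```python
def mm1_response_time(arrival_rate, service_time):
    """Mean response time of an M/M/1 queue: R = S / (1 - utilization)."""
    utilization = arrival_rate * service_time
    if utilization >= 1.0:
        return float("inf")  # the queue grows without bound
    return service_time / (1.0 - utilization)


def service_time_needed(arrival_rate, target_response_time):
    """Service time required to hit a target mean response time,
    from solving R = S / (1 - arrival_rate * S) for S."""
    return target_response_time / (1.0 + arrival_rate * target_response_time)


# Hypothetical workload: 80 requests per second, 10 ms of service each.
arrival_rate = 80.0
service_time = 0.010

current = mm1_response_time(arrival_rate, service_time)
needed = service_time_needed(arrival_rate, 0.020)
print(f"current mean response time: {current * 1000:.1f} ms")
print(f"service time needed for a 20 ms target: {needed * 1000:.2f} ms")
```

Two things fall out of even this toy model. Response time can never be lower than service time, so a target below the service time is impossible at any load. And the service time you would need for a given target shrinks as the arrival rate grows, which tells you how much faster the server has to be before you spend money finding out the hard way.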
What is it NOT about? It’s not about how to write more efficient SQL — there is practically nothing in the book about SQL. It’s not about how everything inside Oracle works, though it does have one or two chapters that are mostly about how to apply the principles to Oracle through extended SQL trace files, a discussion that will transfer well to any other system that emits trace data. (Read these, or you’ll miss learning about quantization error and such!)
Highly, highly recommended.
A while ago I posted about a comment a Sun performance engineer made about a scalable replacement for InnoDB. At the time, I did not believe it referred to Falcon. In hindsight, it seems even clearer that the Sun performance experts were already working hard on InnoDB itself.
Sun’s engineers have shown that they can produce great results when they really take the problems seriously. And I’m sure that InnoDB’s performance has untapped potential we don’t see right now. However, it does not follow that their work on InnoDB is what was meant by a scalable replacement for InnoDB. Or does it?
General-purpose MVCC transactional storage engines with row-level locking, whatever their performance and scaling characteristics in edge cases, fall into a category together. A person assembling a MySQL server for general-purpose use might choose a different storage engine for various uses (MyISAM here, Memory there…) and use “one of those transactional engines” for the bulk of the work. With PBXT, InnoDB, and Falcon, I don’t see a justification for running more than one of those side by side. The operational costs alone (backups, training the users, etc.) would be too high. It is also not at all clear that MySQL itself is ready for multiple transactional storage engines working together (e.g. cross-engine transactions) in the real world.
So what’s left for Falcon? I think they are asking themselves the same question (brilliant gallows humor, by the way). I think Falcon’s ideas and techniques are very interesting, but a storage engine — especially one with such lofty goals — is always a show-me undertaking that will require years to mature and prove itself even after the code is “ready.” With or without the Oracle acquisition, this question has loomed for years: where’s the justification for Falcon politically, functionally, economically? A third party engine such as PBXT, with eyes on replication at the storage engine level and other add-on functionality, has always seemed more likely to really add value than a straight-up InnoDB replacement.
But from my point of view, the biggest win in the short term would still be to drive InnoDB development forward at a consistent and accelerating pace to meet the needs of users and the advances in hardware. Of course, that’s what XtraDB set out to do, and I think the XtraDB project has helped snap InnoDB out of its Percheron-like plod towards improvement. This is nothing but good; when it comes to competition among storage engines, no one should be resting on their laurels. I also see that Sun’s team has more good things in the works, which is great. I’d love for InnoDB to stop being a workhorse and start being a quarter horse. We need it to be both scalable and high-performance.