Archive for October, 2011
Free webinar on preventing MySQL downtime
I’ll be presenting a free one-hour webinar on preventing downtime in production MySQL servers, in conjunction with ODTUG. It is scheduled for Thursday, November 10, 2011, 3:00 PM – 4:00 PM EST, and you can register for free.
Here’s an abstract of what you’ll learn:
Everyone wants to prevent database downtime by being proactive, but how effective are the common measures such as inspecting logs and analyzing SQL? To be truly proactive, one must prevent problems, which requires studying and understanding the reasons for downtime. We have analyzed a selection of emergency issues that we have solved, to better understand what types of problems really occur in production environments. The results are somewhat surprising, and will be detailed in this talk. Most incidents we found were not MySQL-specific and will be familiar to Oracle DBAs as well as MySQL DBAs. This presentation will be valuable for the new or seasoned DBA, as well as for operational managers/directors and CTOs responsible for business-critical implementations.
Please register today and I look forward to joining you.
Progress on High Performance MySQL 3rd Edition
A few people have asked me how it’s going, so I thought I’d just share it with everyone. Things are going great. I’m writing much more quickly than I thought I would, and as a result I’m finding I have time to make more changes than I thought I could, which makes me happy. I should be finished drafting the chapters by the end of the year.
In particular, the faster than expected pace is giving me a chance to address one of the big weaknesses of the second edition. In many places, the second edition is a collection of facts and experiences. It says what and why, but it doesn’t convey a process or method, and it doesn’t teach you how to think about things and apply the results to situations beyond what the book covers. I’m finding a number of key areas to remedy that: performance optimization, profiling, indexing, high availability, scalability, and so on. I’m adding an explanation of the principles and consequences, and actually showing real case studies selected from the most interesting customer issues I’ve seen at Percona. These illustrate the process or method I explain, and then I wrap that up by pointing out the key bits and reinforcing why they’re important.
I’m also adding chapter summaries. Summaries are hard to write. They should synthesize, not restate, the main points, and they should organize the reader’s thoughts for the transition to the next chapter. I hope I’m doing a good enough job at this, but it’s bound to be better than the second edition, which didn’t have summaries.
Detailed outlining is saving me. After writing the second edition, I realized that I’d spent about four times as much time rewriting as I did writing. This time I did a lot of outlining before I went to the writing stage, down to the level of individual thoughts and the relationship between them (nearly prose level, but not quite). Prose is much harder to reorganize and cut-and-paste than bullet points, because it requires correct grammar. Outlining doesn’t have to be grammatically correct. The effort I put into outlining in exhaustive detail is definitely paying off. When I find a section or chapter that I want to rewrite instead of just updating, and I didn’t plan for that, I back off and go back to the outlining. It’s so much more efficient. For example, I just caught myself starting to spin my wheels on partitioning, which is a section that I decided to rewrite to convey a how-to-think approach instead of what-to-think. I noticed the inefficiency, went to the outline in a text file, came back to the chapter, and ended up finishing it very quickly.
Finally, and I never thought I’d say this, but I’m using Microsoft Word and it’s making my life easier. The copyeditors use Word, and I knew I was going to have to use something that reads and writes .doc files eventually, so I decided to just use Word all the way through. I set up a VirtualBox virtual machine with Windows XP, went out and bought a copy of Word 2010 for $150 or so, and installed it. It runs great in the virtual machine, and it is so much better than Open Office it’s not even in the same league. It works very well, with a lot of features that really do enhance my productivity. It isn’t Free Software, but maybe I’m growing up and coming to terms with the real world: sometimes getting things done is more important than principles, and $150 is less valuable than the time I’m saving. (I knew that I definitely was not going to use Open Office this time around, no matter what. I was planning on plain-text and Vim. But Word is actually more productive, I believe, by the time I factor in the interaction with the editors and other back-and-forth.)
Features I’d like to see in InnoDB
I had some conversations with a few people at Oracle Open World about features that would be beneficial in InnoDB. They encouraged me to blog my thoughts, so I am.
Someday I’d like to have a clear mental list of features I want in MySQL in general, but that is a much harder list to organize and prioritize. InnoDB is an easier place to get started.
First, I’d like truly online, nonblocking DDL. I see two ways this could work: reading the old version of rows and writing the new version, or doing a shadow-copy and building the new table in the background. The first way has the advantage of being lazy, so the schema change is instantaneous, and you really never pay any additional cost. However, it has the disadvantage that it might be hard to implement multiple schema changes if a previous change isn’t fully finished, so to speak (I can see a lot of bugs if there are more than 2 versions of the schema at a time). This is a limitation I’d be okay with. If I need to make a schema change and then I can’t make another change for a while until I run some statement that updates every row to its current value, that’s okay. The second version would work something like Facebook’s online-schema-change tool, except it would be implemented internally. I’m frankly unfamiliar with the actual source code of fast index creation, but I imagine it could be used as a starting point. The disadvantage is that the schema would actually be changed; it wouldn’t be lazy, so it would add load to the server at the time of the change.
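To make the lazy variant concrete, here is a minimal sketch in Python. It is a toy model, not a description of how InnoDB works or would work: the LazyTable class and its methods are hypothetical names invented for illustration. Each row remembers the schema version it was written with, old rows are presented in the new format when they are read, and a second schema change is refused until every row has been rewritten.

```python
class LazyTable:
    """Toy table where a schema change is applied lazily, row by row."""

    def __init__(self, columns):
        self.columns = list(columns)  # current schema
        self.version = 0              # current schema version
        self.pending = None           # (column, default) while a change is unfinished
        self.rows = {}                # primary key -> (row_version, values)

    def insert(self, pk, values):
        # New writes always use the current schema version.
        row = {col: values.get(col) for col in self.columns}
        self.rows[pk] = (self.version, row)

    def add_column(self, name, default):
        # Allow at most two schema versions at a time (hypothetical rule).
        if self.pending is not None:
            raise RuntimeError("previous schema change is not finished")
        self.columns.append(name)
        self.version += 1
        self.pending = (name, default)  # instantaneous: no row is touched

    def read(self, pk):
        row_version, values = self.rows[pk]
        if row_version < self.version:
            # Old-format row: show it in the new format without rewriting it.
            name, default = self.pending
            values = {**values, name: default}
        return values

    def rewrite_all(self):
        # Stands in for the statement that updates every row to its current value.
        for pk in list(self.rows):
            self.rows[pk] = (self.version, self.read(pk))
        self.pending = None
```

In this model add_column() costs nothing up front, and the work is deferred to rewrite_all(). The shadow-copy approach would instead do all of that work at the time of the change.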
Second, I’d like the ability to extend secondary indexes with additional columns, similar to INCLUDE in Microsoft SQL Server. This could make it a lot cheaper to have many covering indexes without incurring the cost of keeping the columns in the non-leaf nodes that we don’t want to sort and index on (we just want the values available to cover the query). The benefits of this feature should be pretty obvious. I don’t know how hard this would be.
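As a rough illustration of the idea, here is a toy Python model of an index that sorts and searches only on its key columns but carries extra column values alongside each entry. It is a sketch under simplifying assumptions (a flat sorted list instead of a B-tree), and the IncludeIndex class is a hypothetical name, not anything from InnoDB or SQL Server.

```python
import bisect


class IncludeIndex:
    """Toy index: ordered by key columns, with extra payload columns per entry."""

    def __init__(self, rows, key_cols, include_cols):
        self.key_cols = list(key_cols)
        self.include_cols = list(include_cols)
        entries = [
            (tuple(r[c] for c in self.key_cols),
             tuple(r[c] for c in self.include_cols))
            for r in rows
        ]
        # Sort on the key columns only; the included values are never compared.
        entries.sort(key=lambda entry: entry[0])
        self.keys = [key for key, _ in entries]              # what gets searched
        self.payloads = [payload for _, payload in entries]  # carried along only

    def lookup(self, key):
        """Return key and included columns for matches, without reading the table."""
        lo = bisect.bisect_left(self.keys, key)
        hi = bisect.bisect_right(self.keys, key)
        names = self.key_cols + self.include_cols
        return [dict(zip(names, self.keys[i] + self.payloads[i]))
                for i in range(lo, hi)]
```

A query that needs only the key and included columns is answered entirely from the index, which is what makes it a covering index. The included values take no part in the ordering, so in a real B-tree they would not need to appear in the non-leaf nodes.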
Third, I’d like InnoDB to fadvise() the transaction log file blocks after it writes them, to tell the operating system it won’t need them again. This is something we did in XtraBackup and it makes a big difference on Linux. This could make it practical to have much larger transaction logs without causing swap pressure and competing with the buffer pool for memory. Linux is not smart about dropping blocks from the cache otherwise, and tends to keep blocks in cache even when they will not be used again until the logs are recycled. I assume, but am not as sure, that OS readahead should suffice to read those blocks back into the cache as the log writing circles around to the tail.
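The general technique looks something like the following Python sketch, which uses os.posix_fadvise() with POSIX_FADV_DONTNEED. This is only an illustration of the system call, not InnoDB or XtraBackup code. The function name is hypothetical, and the sketch appends to a file rather than writing into a circular log. It flushes before giving the hint because, as far as I know, Linux generally will not drop pages that are still dirty.

```python
import os


def write_and_drop_from_cache(path, data):
    """Append data to a file, flush it, then hint that the range won't be reread soon.

    Returns the file offset at which the data was written.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        offset = os.fstat(fd).st_size
        view = memoryview(data)
        while view:
            # os.write() may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Flush first, so the pages are clean before the kernel is asked to drop them.
        os.fsync(fd)
        # posix_fadvise is not available on every platform, so guard the call.
        if hasattr(os, "posix_fadvise"):
            os.posix_fadvise(fd, offset, len(data), os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return offset
```

The call is only advice: the kernel is free to ignore it, and the data remains readable from disk either way.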
Three feature requests should be enough for one day. Hopefully this is useful. What features would you like to see?





