Archive for the ‘High Performance MySQL’ tag
Progress report on High Performance MySQL, Second Edition
It’s been a while since I’ve written about progress on the book. I actually stopped working on it as much at the beginning of the month, because on October 31st I managed to finish a first draft of the last big chapter (Scaling and High Availability)! Now I’m back to full-time work at my employer, and I’m working on the book in the evenings and weekends only.
This doesn’t mean the book is close to being done, though. The editor is sending out some chapters for technical review, and there’s still a lot more writing and revising to be done.
Last weekend I revised the Security chapter from the first edition, which I think will be the only chapter that we’ll just revise and update, rather than completely rewriting (well, maybe the Architecture chapter could be considered a revision instead of a rewrite, but it’s a stretch; we changed it a lot). I removed a lot of the material that repeated the MySQL manual, and added a lot of information and best practices on grants, new privileges and objects in MySQL 5, common tasks, common mistakes, and so on. The chapter ended up being nearly as long, even though I stripped out all the code listings and so on from the first edition (in fact, I reduced the first edition’s material to a few paragraphs).
Beyond that, though, there are little details to finish out in many of the chapters. Examples that need to be finished, figures that need to be re-drawn, material that doesn’t quite fit and needs to be re-arranged or even moved to another chapter; it’s a lot of work. Peter Zaitsev has been reviewing some of the core chapters on query and schema optimization etc, and I’m revising them in response to his comments. That’s what I spent today doing.
I think the biggest chunks of work that remain are going to be making chapters 3, 4, 5, 6 and 7 (benchmarking, profiling, schema, indexing, query optimization, advanced features, and server tuning) flow together well. The challenge here is how to organize the vast amount of material so it reads well, without too many forward references, and still be useful as a reference work. The detail we’ve gone into is incredible. It makes it very hard to find the single best place to mention each little bit of wisdom, because all of this material is completely inter-related. It’s tough to flatten the graph of knowledge into a one-dimensional narrative.
It’s not just these chapters that have a lot of inter-related material, of course. It’s hard to talk about tuning the server settings (chapter 7) without bringing the OS and hardware (chapter 8) into it, and whenever you do this you also need to think about measuring and monitoring status information (chapter 14). Of course, you need to do that for benchmarking and profiling, too (chapter 3). I’m sure you see the dilemma!
The good news is, if we succeed in doing this well, you will find the book enormously useful. Stay tuned!
High Performance MySQL, Second Edition: Query Performance Optimization
Your comments on the Advanced MySQL Features chapter were great. A lot of the questions I got (in the comments and via email) about chapter 6 are really addressed in chapter 5, “Query Performance Optimization,” so I’m posting its outline too. I have the same questions: are there things you’d like to see us cover? Do you have any favorite techniques you’d like to see us include? Any other comments or questions?
A bit of basic chapter information: It’s currently 45 pages and very nearly a first draft, I think. The outline might be subject to change if we decide to rearrange the material, which actually might help in some places. As I pulled out the headings for this article, I noticed that some of the sections are just huge, and might benefit from being divided up a little bit more. But that’s all going to come out in the wash; the focus right now is getting the material down in some order that makes sense, and making sure it’s clear, concise, and correct.
[Update: I got curious how the page count in OpenOffice.org will translate to the page count on paper, so I opened one of the chapters from the first edition and it looks like every 10 pages in OpenOffice.org will be about 11 pages on paper. So this chapter really ought to be about 50 pages on paper].
Here’s the chapter outline:
[Intro]
Slow Query Basics: Optimize Data Access
Are You Asking the Database For Data You Don't Need?
Is MySQL Examining Too Much Data?
Rows Examined and Rows Returned
Ways to Restructure Queries
Complex Queries Vs. Many Queries
Chopping Up a Query
Join Decomposition
Query Execution Basics
Query States
The Query Cache
The Query Compiler and Optimizer
The MySQL Query Optimizer
Table and Index Statistics
MySQL's Join Execution Strategy
The Join Optimizer
Returning Results to the Client
MySQL Query Optimizer Limitations
Correlated Subqueries
When a Correlated Subquery is Good
UNION limitations
Index Merge Optimizations
Equality Propagation
Parallel Execution
Hash Joins
Loose Index Scans
MIN() and MAX()
SELECT and UPDATE on the same table
Optimizing Specific Types of Queries
Optimizing COUNT() Queries
What COUNT() Does
Myths About MyISAM
Simple Optimizations
More Complex Optimizations
Optimizing JOIN Queries
Optimizing Subqueries
Optimizing GROUP BY and DISTINCT
Optimizing GROUP BY WITH ROLLUP
Optimizing ORDER BY
Optimizing LIMIT and OFFSET
Optimizing UNION
Optimizing Range Queries
Query Optimizer Hints
User-Defined VariablesHigh Performance MySQL, Second Edition: Advanced SQL Functionality
Work continues apace on High Performance MySQL, Second Edition (the link leads to the chapter outline). I’m working now on Chapter 6, Advanced SQL Functionality, and thought I’d solicit input on it. Are there things you’d like to see us cover? Do you have any favorite techniques you’d like to see us include? Feel free to leave feedback in the comments. The chapter is already significantly done, with 26 pages written, but the ink’s not on paper yet, so there’s still time to correct omissions!
I should note that there are separate chapters on architecture, schema and indexing design, application optimization, query optimization, etc. We’re trying to focus this chapter on “advanced features” and how to get high performance out of them. Also, the outline is still subject to change: there’s so much material that it’s hard to decide the best place to put something, and exactly what is an “advanced feature” might be open to interpretation, or we might feel something is better placed elsewhere. That said, please throw your ideas at us and we’ll worry about the details for you. Here’s the outline of this chapter, as of three minutes ago:
[Intro]
The MySQL Query Cache
Operational Detail and Caveats
Optimizations
Alternatives
Prepared Statements and Multiple Query Execution
Client-side Prepared Statements
Client-side Prepared Statements
Optimization of the Execution Plan
Server-side Prepared Statements
Prepared Statements in Stored Procedures
Prepared Statement Caveats
Cursors
Client-side Cursors
Stored Procedures
Pros and Cons
Recommendation
Example
User Defined Functions
Events
Views
Updatable Views
Limitations of Updatable Views
Security
Performance
Triggers
Working with Multi-byte Character Sets
Full-Text Search
Full-Text Search Queries
Changes in MySQL 5.1
Full-Text Trade-Offs and Workarounds
Full-Text Tuning and Optimization
Foreign Key Constraints
Merge Tables and Partitioning
Merge Tables
Merge Table Performance Limitations
Merge Table Strengths
Partitioned Tables
Optimizing Queries Against Partitioned Tables
Maintaining Partitioned Tables
Distributed Transactions (XA)

