Archive for May, 2010
How to tune MySQL’s sort_buffer_size
I perpetually see something like the following:
My server load is high and my queries are slow and my server crashes. Can you help me tune my server? Here is some information.
[random sample of SHOW GLOBAL STATUS, like the query cache counters]
my.cnf:
[mysqld] key_buffer_size=1500M query_cache_size= 64M max_connections = 256 key_buffer = 8M sort_buffer_size = 100M read_buffer_size = 8M delay_key_write = ALL
There are many problems in this my.cnf file, but the sort_buffer_size is a glaring one that identifies the user as someone who should not be playing with live ammunition. Therefore, I have developed an advanced process for tuning sort_buffer_size, which you can follow to get amazing performance improvements. It’s magical.
- How expert are you?
- I know that there is a sort buffer, and that it is related to sort_merge_passes. When sort_merge_passes is high, I have been told to increase the sort_buffer_size. I also know that it is somehow related to the number of sorts the server does, so when there are a lot of sorts shown in variables like Sort_rows and Sort_scan, I think I should also increase it. You are a beginner.
- I have been administering MySQL for many years. I know that there are two sort algorithms inside MySQL. I know exactly how to optimize the key cache hit ratio. You are a novice.
- I have read every blog post Peter Zaitsev ever wrote, and I can improve on them all. You are an expert.
- Based on your score on the scale above, find your optimal sort_buffer_size tuning algorithm below:
- Beginners and novices should leave this setting at its default, and comment it out of the configuration file.
- Experts don’t need me to tell them what to do, but most of them will leave this setting at its default, and comment it out of the configuration file.
The most amazing thing about sort_buffer_size is how many people utterly ruin their server performance and stability with it, but insist that they know it’s vital to change it instead of leaving at its default. I do not know why this is always the case. Why don’t people choose random variables to destroy their performance? It’s not as though there is a shortage to choose from. Why does everyone always pick sort_buffer_size instead of something else? It’s like a flame drawing the moths in.
Feel free to ask questions if anything is unclear, but be prepared for a direct answer if you ask for tuning advice.
PS: I considered a simpler tuning guide, such as Domas’s guide to tuning the query cache, but I am convinced that people need more a complex guide for the sort_buffer_size, or they will not believe in the validity of the instructions. I base this on multiple experiences being paid a lot of money to suggest not setting sort_buffer_size to 256M, and being told that I must be an idiot.
Using Aspersa to capture diagnostic data
I frequently encounter MySQL servers with intermittent problems that don’t happen when I’m watching the server. Gathering good diagnostic data when the problem happens is a must. Aspersa includes two utilities to make this easier.
The first is called ‘stalk’. It would be called ‘watch’ but that’s already a name of a standard Unix utility. It simply watches for a condition to happen and fires off the second utility.
This second utility does most of the work. It is called ‘collect’ and by default, it gathers stats on a number of things for 30 seconds. It names these statistics according to the time it was started, and places them into a directory for analysis.
Here’s a sample of how to use the tools. In summary: get them and make them executable, then configure them; then start a screen session and run the ‘stalk’ utility as root. Go do something else and come back later to check! A code sample follows.
$ wget http://aspersa.googlecode.com/svn/trunk/stalk
$ wget http://aspersa.googlecode.com/svn/trunk/collect
$ chmod +x stalk collect
$ mkdir -p ~/bin
$ mv stalk collect ~/bin
$ vim ~/bin/stalk # Configure it
$ screen -S stalking.the.server
$ sudo ~/bin/stalk
Inside the ‘stalk’ tool, you’ll see a few things you can configure. By default, it tries to connect to mysqld via mysqladmin and see how many threads are connected to the server. If this increases over 100 (a sample number you should almost certainly change), or if it can’t connect to mysqld, then it fires off the ‘collect’ tool, or whatever else you configure it to execute.
The ‘collect’ tool, by default, captures a variety of things including disk usage, cpu usage, internal status from mysqld, and even oprofile (which it saves using the standard oprofile save feature; you must use opreport to get your report later). There is also a commented-out section to run GDB if you want stack traces. This is not enabled by default because that’ll freeze mysqld briefly. Usually this is OK if mysqld is already unresponsive during the problem!
Wishing I could be at ODTUG
Ronald asked me if I could present at ODTUG’s Kaleidoscope conference, which is only a couple hours from me, but I’ll be at the Netways OSDC that week. Matt Yonkovit will be there representing Percona. I wish I could go: I would really like to mingle with more Oracle users and developers, and I think that participation is the key to building relationships between MySQL and Oracle users — two groups of people who are going to be overlapping more in the future. If you can attend, I hope you will — and blog about it so I can read.





