
How we enabled threading in MySQL


MySQL on GNU/Linux appears to be able to either run multiple processes, or one process and multiple threads. We’ve noticed a significant CPU penalty for multiple processes, probably from the context switching overhead. The trouble was, one of our servers wouldn’t use threads; it wanted to use multiple processes. This article explains how we got it to use threads instead.

First, we noticed the master server’s CPU utilization was higher than the slave’s, even though we expected the queries running on the slave to produce about the same CPU utilization as on the master. We checked the configuration, but couldn’t find anything that would explain this. Then we noticed the slave showed only a single process in top, while the master showed dozens. My co-worker speculated that the single process on the slave might be running many threads, which have so much less context-switching overhead that they could account for the difference. Indeed, I was able to toggle the display of threads in top with the H key, and could see each connection being handled by a thread.
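
For anyone who would rather check this from a script than from top, here is a minimal Python sketch. It assumes a Linux kernel that exposes a "Threads:" field in /proc/<pid>/status; the function names are made up for illustration and are not part of any MySQL tooling.

```python
import re
from pathlib import Path


def parse_thread_count(status_text: str) -> int:
    """Return the value of the 'Threads:' field from /proc/<pid>/status text."""
    match = re.search(r"^Threads:\s+(\d+)\s*$", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no 'Threads:' field found")
    return int(match.group(1))


def thread_count(pid: int) -> int:
    """Read /proc/<pid>/status (Linux only) and return that process's thread count."""
    return parse_thread_count(Path(f"/proc/{pid}/status").read_text())
```

Calling thread_count() with the PID of mysqld should report one process with many threads when the threads are not listed as separate processes.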

Another clue was running vmstat and looking at the number of context switches in the cs column. The master’s number was much higher than the slave’s. We examined a number of other performance metrics (see my article about monitoring server load in GNU/Linux), but those ended up being the most obvious signs of difference between the two servers.
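
The same comparison can be roughed out without vmstat. The sketch below assumes the cumulative "ctxt" counter in /proc/stat on Linux, sampled twice and divided by the interval; the helper names are hypothetical.

```python
def parse_ctxt(stat_text: str) -> int:
    """Return the cumulative context-switch count from /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no 'ctxt' line found")


def switches_per_second(before: int, after: int, seconds: float) -> float:
    """Average context switches per second between two samples."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (after - before) / seconds
```

Read /proc/stat, wait a few seconds, read it again, and pass both counts to switches_per_second() to get a system-wide rate to compare between the two servers.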

The key ended up being NPTL. As I discussed in my article on Gentoo and NPTL, apparently certain software won’t multi-thread, even when it has LinuxThreads available. I’m not pretending to know a lot about compiling MySQL, but we did try multiple ways to get it to use threads, and it wasn’t until we figured out NPTL wasn’t built into glibc that we made any progress. After re-building glibc and restarting the mysql daemon, it came back up with just one process, but multiple threads. Success! Now our master server uses less CPU, leaving more available for queries.
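
One way to see which thread library glibc provides is the GNU_LIBPTHREAD_VERSION configuration string, the same value that `getconf GNU_LIBPTHREAD_VERSION` prints. The Python sketch below is only an illustration: it assumes a glibc-based system, returns an empty string anywhere the name is not available, and uses function names invented for this example.

```python
import os


def threading_library(version: str) -> str:
    """Classify a GNU_LIBPTHREAD_VERSION string such as 'NPTL 2.3.6'."""
    lowered = version.strip().lower()
    if lowered.startswith("nptl"):
        return "NPTL"
    if lowered.startswith("linuxthreads"):
        return "LinuxThreads"
    return "unknown"


def libpthread_version() -> str:
    """Ask the C library for its thread library version, or '' if unavailable."""
    try:
        return os.confstr("CS_GNU_LIBPTHREAD_VERSION") or ""
    except (ValueError, OSError, AttributeError):
        # The name is glibc-specific; other platforms may not define it.
        return ""
```

threading_library(libpthread_version()) then gives a quick answer on a given machine before and after rebuilding glibc.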

Written by Xaprb

July 16th, 2006 at 9:45 pm

Posted in GNU/Linux, SQL

10 Responses to 'How we enabled threading in MySQL'


  1. Nah… if you check the MySQL server code, you will not find a single fork() call. It’s just that unless you use NPTL, you’re likely to see multiple mysqld’s listed in “ps xa | grep mysqld”. However, that’s just the kernel lying to you. You will notice that all those “processes” in fact use the exact same amount of memory, etc. What in fact happens is that the threads are reported as processes. But they’re threads anyway.

    Arjen Lentz

    17 Jul 06 at 1:27 am

  2. MySQL is never multi-process, it is always threaded. But LinuxThreads are not “true” threads, they are processes that share the same address space using the clone() system call, so they are reported as distinct processes by the ps utility. But as far as the mysql server is concerned, they are threads.

    jim

    17 Jul 06 at 2:01 am

  3. I am afraid your effort was futile. You see, the only thing you achieved was to switch from LinuxThreads to NPTL. MySQL always uses threads, but depending on the library they look different to top, vmstat and other tools. When using LinuxThreads each thread has its own PID and that’s why they appear as separate processes.

    If you take closer look at top output for example you will see all mysqld “processes” have exactly the same numbers for memory and CPU. That is because it is same process information repeatedly displayed for each thread. With certain versions of top it was possible to hide/show threads.

    Some tools like pstree for example always show threads as such:

      ├─mysqld_safe───mysqld───8*[{mysqld}]

    With NPTL, threads don’t have their own PID so they don’t appear in top and others. You can easily verify if I am correct or wrong. If your Linux supports using LD_ASSUME_KERNEL to switch between LinuxThreads and NPTL, simply start mysqld once with each library and observe the difference in top.

    Now the real problem. On the Master, mysqld can utilize several threads better for a very simple reason. Each client session is handled by a separate thread in mysqld. Several sessions can run queries in parallel. However, in order to ensure data is replicated consistently, all statements are “serialized” in the binary log and repeated in exactly the same order at the Slave. This is only possible if at the Slave all the replicated statements are executed by a single thread.

    So if you have 10 threads on Master running 10 statements parallel these 10 statements will come to Slave in serialized manner and single thread will run them one at a time.

    I hope that helps.

    Best regards

    Alexander Keremidarski

    17 Jul 06 at 3:15 am

  4. Hi,

    This blog got us puzzled…

    MySQL uses LinuxThreads or NPTL (Native POSIX Thread Library), whichever is available on your Linux box. MySQL is always threaded; it doesn’t fork.

    The fact that you ‘see’ more threads in ps output on one machine might be because it used LinuxThreads, which has a hack to make them visible as separate processes.

    Geert Vanderkelen

    17 Jul 06 at 3:22 am

  5. Hi,

    MySQL always uses threads. However if LinuxThreads are used MySQL will show up in top and ps as multiple processes as this is how LinuxThreads are implemented. It would not be real multiple processes though as address space and other things are shared.

    Good to hear NPTL provided performance improvement in your case. It should as it was specially designed to solve number of LinuxThread design issues.

    Peter

    17 Jul 06 at 3:39 am

  6. Wow, thanks everyone for writing! I am learning a ton now that more people are reading and commenting (I am not that experienced myself). Our CPU utilization certainly went down after we got switched to NPTL, but it must just be that NPTL is more efficient than linuxthreads.

    Sorry for the delay in moderating, I was in Turkey for a week!

    Xaprb

    19 Jul 06 at 9:09 pm

  7. Hi, thank you all. I have a few servers and on just one of them I see only one process under top. I was scared that my mysql installation was screwed; instead it’s probably just that this machine has a different kernel, 2.6, and is showing one process as it should.

    Benedetto

    26 Jul 06 at 7:31 am

  8. Wow; thanks a bunch. I’ve been racking my brain on this problem for months! I had a hunch it had something to do with different library implementations. But now I understand my ps output is normal on my machine, because my machine uses NPTL instead of LinuxThreads.
    But from what I understand you rebuilt glibc and now your top/ps output _does_ show the threads _with_ NPTL? This doesn’t make sense, since you stated one of the things with NPTL is that it doesn’t come up in ps/top?

    Jan

    5 Feb 07 at 12:21 pm

  9. Yes, the H key toggles showing NPTL threads in top.

    Xaprb

    6 Feb 07 at 9:11 am

  10. Hi, I have a problem related to this subject. We use a Linux VPS (Virtual Private Server) running CentOS 4.5 and mysqld 5.0.41. When the VPS is restarted, mysqld starts at boot time with NPTL (threads don’t have a PID so they don’t appear in top, ‘ps aux’ and others, but you can see them using ps -eLf for example. We have only one mysqld process running). In this case, the server runs great, without any problems.
    But if mysqld is restarted (or I restart the VPS without starting mysqld at boot, and start mysqld after the boot sequence has completed), we’ll have lots of mysqld processes, as if using the LinuxThreads implementation (LinuxThreads has a hack to make threads visible as separate processes). In this LinuxThreads ‘state’, the mysqld server restart will fail (60 sec timeout while trying to do /bin/kill -0 “$MYSQLPID”). Also, the server will crash within the next 2 days.
    I’m not sure if it’s a virtualization/kernel issue.

    GNU_LIBPTHREAD_VERSION->linuxthreads-0.10

    Thanks

    LC

    2 Aug 07 at 6:54 pm
