Xaprb

Stay curious!

The MySQL init-script mess


I don’t think there is a single good-quality MySQL init script for a Unix-like operating system. On Windows, there is the service facility, and I used to write Windows services. Based on that, I believe Windows has a pretty good claim to better reliability for start/stop services with MySQL.

What’s wrong with the init scripts? Well, let me count the reasons! Wait, I’m out of fingers and toes.

I’ll just mention the two annoying ones that I’ve run into most recently. Both are on Debian, though there is nothing especially broken about Debian’s init scripts. The first one comes from parsing my.cnf wrong and not treating pid-file and pid_file identically. The server treats them identically, so any other program that reads the my.cnf file should too (there’s this program called my_print_defaults… use it!). The second bug arises because Debian uses two configuration files for start/stop services: the init script reads /etc/mysql/debian.cnf for no discernible reason. (I guess they never heard of using [sections] in the /etc/mysql/my.cnf file, or just reading the [mysqld] section.) So if you configure your server to place its socket in a non-default location, you have to redundantly update /etc/mysql/debian.cnf too, or the init script will fail. Duplication of configuration parameters is just stupid, period.
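As a rough illustration of the first point, here is a minimal Python sketch that lets my_print_defaults do the file parsing and then folds pid-file and pid_file into one key. It assumes my_print_defaults is on the PATH and prints one --name=value option per line; the function names are made up for this example, and the sketch does not resolve abbreviated option names.

```python
import subprocess


def normalize(name):
    """Fold pid_file and pid-file into a single spelling (dashes)."""
    return name.replace("_", "-")


def parse_defaults(output):
    """Turn my_print_defaults output into a dict; later options win."""
    options = {}
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("--"):
            continue  # ignore anything that is not an option line
        name, _, value = line[2:].partition("=")
        options[normalize(name)] = value
    return options


def read_options(groups=("mysqld",)):
    """Ask my_print_defaults for the given groups instead of parsing my.cnf."""
    result = subprocess.run(
        ["my_print_defaults", *groups],
        check=True, capture_output=True, text=True,
    )
    return parse_defaults(result.stdout)
```

An init script written this way would look up options["pid-file"] and get the same answer no matter which spelling the administrator used in my.cnf.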

These are fairly mundane bugs. I’ve seen literally dozens more. Part of the problem is that each distribution that packages up and redistributes MySQL tends to ship with its own init script, instead of reusing the official scripts provided by MySQL. Understandable, because mysqld_safe is generic and doesn’t really integrate well with any system’s init facilities. But man, do they reinvent a bunch of lovely bugs, mostly related to things like parsing the .cnf files, handling pid files, handling sockets, special user accounts, braindead look-before-you-leap patterns of pinging before actually doing a task, stupid timeouts, wrong handling of log files and log file rotation, dumb hacks with syslog, failing to check for real evidence of a running process (you can’t trust what a cache file on disk says!), adding a facepalm-worthy automatic CHECK TABLE on every table at server startup, and on and on.

The official mysqld_safe script tends to be a little less broken, in my experience, but still has many unlovely behaviors and missing features that I’d consider to be bugs.

I haven’t even mentioned the “manage multiple instances” scripts yet. Boy, do those have a ton of bugs. They do stupid things like grepping configuration files for strings that may or may not be in the configuration files. I remember one emergency case where MySQL couldn’t be started on a box because the string “mysql_multi” didn’t exist in a my.cnf file clearly designed for multiple instances to run. I added a comment to the effect of “# This comment is necessary for mysql_multi to work” and the problem was solved. A sane script would actually check for multiple instance definitions, not for some arbitrary string of characters. Anyway, this is just one tiny example, I don’t mean to dwell on it.
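To make the "check for multiple instance definitions" idea concrete, here is a small Python sketch. It assumes the usual convention that each instance lives in a numbered [mysqldN] group; the function name is hypothetical and the sketch does not follow !include directives.

```python
import re

# Matches group headers such as [mysqld1] or [mysqld27], but not [mysqld].
INSTANCE_GROUP = re.compile(r"^\s*\[mysqld(\d+)\]", re.MULTILINE)


def instance_numbers(cnf_text):
    """Return the sorted instance numbers defined in the config text."""
    return sorted({int(n) for n in INSTANCE_GROUP.findall(cnf_text)})
```

A script built on this would treat a non-empty result as evidence of a multi-instance setup, rather than depending on an arbitrary string appearing somewhere in the file.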

What happens when you have a bad init script? All kinds of things. You can’t shut down the server gracefully, so if you shut down the system, you hard-crash MySQL eventually, and good luck getting replication back after that in most cases. You can’t start the server correctly, or it reports the wrong thing and then tries to start several instances, and the second one borks the first one’s pid file and/or socket, causing the aforementioned shutdown problem or worse. And on it goes.

My principle is usually “don’t complain, do something about it.” But there’s a problem, in this case: writing a good init script is actually a significantly complex software engineering project. It is NOT “just a script.” (Insert my usual rant about the need for an actual test suite.) And that is not something I am working on at the moment, nor has it ever become my priority for the last several years. So in this case I’m complaining, because the writing on the wall says that I am probably never going to work on this, and I’d at least like there to be some visibility about what a serious problem this is.

Distribution maintainers could probably improve the situation significantly by taking a look at each other’s bug reports. If everyone solved the same bugs everyone else has solved (and don’t forget bugs in mysqld_safe, too) that would be a big step forward.

Written by Baron Schwartz

April 24th, 2012 at 9:25 pm

Posted in SQL

22 Responses to 'The MySQL init-script mess'


  1. I have never understood why there are so many rewrites of service start facilities on Linux. This really seems like a problem we could just solve once and then move on. Meanwhile, to add to your rant my favorite is locating the MySQL log file after MySQL has crashed. I usually end up putting ‘set -x’ in /etc/init.d/mysql[d] to figure out where it is actually trying to write things.

    Robert Hodges

    25 Apr 12 at 12:59 am

  2. Maybe start with logging bugs on the quick wins?

    Then see if the evolution of those issues reignites interest in providing ‘good’ init scripts.

    sime

    25 Apr 12 at 1:30 am

  3. I always thought it was weird that MySQL has two names for the same variable. You are right that anything that reads MySQL’s config files would have to normalize configuration variables till that is fixed.

    As you know, my.cnf comes pre-installed on Debian. To handle upgrades you can either blow the file away, somehow patch it, or, as Debian does, separate pre-installed files from user configuration, which goes in /etc/mysql/conf.d/. User settings should win if there are conflicts.

    Debian routinely reports bugs upstream. Not sure if this is the case for MySQL specifically. This seems like a better approach than trying to coordinate with n other distributions, no?

    Have you looked at the Debian init script? It is a wrapper for the mysqld_safe.

    Allan Wind

    25 Apr 12 at 1:55 am

  4. There’s always wrapping it with DJB’s daemontools. You get the same method of running a process across any *nix distribution, and by being forced to pass args on the command line you altogether avoid the issue of where this month’s “Linux flavor of choice” puts logs, reads cnf files from, or sources defaults from.

    brad fino

    25 Apr 12 at 4:11 am

  5. The actual init script that comes with the server distribution is mysql.server, which can be found in either the share/mysql or support-files directory (depending on distribution and version).

    #justsayn

    hartmut

    25 Apr 12 at 5:25 am

  6. @Allan

    The two different forms are due to the three different places where variables can be set and used (cmdline, my.cnf, SQL context).

    In the SQL context only ‘_’ works, ‘-’ would be interpreted as the minus operator and so can’t be part of an identifier name

    In the command line context ‘_’ could work, but the usual convention is --some-option-name, not --some_option_name

    In the config file context there is no real preference for either way

    So strictly speaking the ‘_’ form would be sufficient these days, but the ‘-’ form is kept for convenience reasons.

    I could imagine though that “in the old days” there were only command line options, with my.cnf and SQL level variable access only having been added later, so first there was only the ‘-’ form, then the ‘_’ form was needed for SQL, but nobody wanted to break the cmdline option names at that point, so the compromise was to simply treat ‘-’ and ‘_’ as the same thing … but that predates even my involvement with all this …

    hartmut

    25 Apr 12 at 5:33 am

  7. Hartmut, thanks. I did not know about mysql.server.

    Hartmut, not only is there pid-file and pid_file, but pid_fi will work too, because it’s an unambiguous prefix. I’ve written several my.cnf parsers and it is far from a trivial task. This is why nobody should ever do it, including me. They should use my_print_defaults instead and rely on something consistent that is supposed to be authoritative.

    Allan, I have indeed “looked at” Debian’s init script. Not only is it a wrapper, it’s a BAD wrapper that has practically zero error checking or consideration of errors that might happen :-)

    Xaprb

    25 Apr 12 at 6:58 am

  8. @Xaprb,

    I am truly amazed that you did not know about the mysql.server script. We wrote it back in 1998, just for these problems. We had thousands of issues / tickets created for the problems customers hit when they used “smart” distribution scripts which were not suited for MySQL server.

    We especially had problems with Red Hat scripts (I am a fan of Red Hat and appreciate what they are doing) and their usage of KILL instead of mysqladmin shutdown and some other quirks.

    And I stand fully behind what my dear friend Hartmut said above.

    If you have any other questions, you can catch me on FB, as we are friends …… ;o)

    Sinisa

    25 Apr 12 at 10:05 am

  9. The scripts distributed with the source are no peach either though – I haven’t verified if this has been fixed yet (The bug is still open)

    http://bugs.mysql.com/bug.php?id=61291

    Justin Rovang

    25 Apr 12 at 10:57 am

  10. @Justin

    Sorry, that bug is not about Unix init scripts ….

    Sinisa

    25 Apr 12 at 11:09 am

  11. Sinisa,

    I am constantly discovering things I did not know (or maybe I forgot). I just noticed the ‘identity’ variable and looked it up a few minutes ago. I am not as smart or experienced as some people say I am!

    Xaprb

    25 Apr 12 at 11:42 am

  12. @Xaprb

    We learn as long as we live …

    And learning brings so much pleasure if done correctly and on the subject of interest.

    My late father used to say “The more I learn the more I know how much I do not know !!!”

    Sinisa

    25 Apr 12 at 2:18 pm

  13. Baron,
    W/R mysql.server script:
    As I usually install MySQL server from binary distribution or via compilation (I rarely use packages), the safe way to install it as a service is, for example:

    sudo cp /path/to/installed/server/support-files/mysql.server /etc/init.d/mysql55

    When compiled from source, the mysql.server script already contains the correct “basedir” variable (e.g. /usr/local/mysql55 as I like to position it).

    If installed from binary tarball, this must be set.

    It may also be a good idea to set the “datadir” variable in the script. Both are open for setup at about line #30 of the script.

    I choose a service name of “mysql55” or “mysql51” etc., so as to make sure I do not collide with possible past or future installations of MySQL server.

    Last, I set the service to run at startup and terminate at shutdown (e.g. on Debian and derivatives, using rcconf).

    May write more about this on a dedicated blog post.

    Shlomi Noach

    25 Apr 12 at 11:56 pm

  14. Hmmm… perhaps my needs are simply simple, but the MacOS Server seems to start MySQL reliably, with no muss nor fuss.

    Apple has a rather nice startup system, with a dependency chain and compatible version tracking. If it’s part of the open-source Darwin core, perhaps it could be adopted by the various Linuxes?

    Or like I said, perhaps it works well because my needs are simple.

    Jan Steinman

    25 Apr 12 at 11:56 pm

  15. @Jan
    For very simple needs even broken Debian/Ubuntu scripts are fine.
    Also, in the Linux world some sophisticated startup systems have been developed in the last few years, such as Ubuntu’s Upstart and Fedora’s systemd, which have similar aims to Apple’s launchd. Unfortunately, the way MySQL service control is implemented in them still seems poor.

    Przemek

    26 Apr 12 at 9:01 am

  16. I agree. In the normal course of events, every distribution’s init scripts are fine. It’s when anything unexpected happens that they cause minor problems to become serious problems. I don’t use MySQL on a Mac (and the few Percona customers I am aware of who were running in production on Mac servers are now migrated to GNU/Linux), so I cannot comment on that.

    Xaprb

    26 Apr 12 at 11:22 am

  17. Till

    25 Jun 12 at 7:18 pm

  18. … and Gentoo’s init scripts are busted too, of course. Remove the socket file (perhaps by accidentally starting up a second instance, which detects the first instance is running and shuts itself down, in the process removing the pid file and socket file), and then try to “/etc/init.d/mysql restart” on Gentoo. Face-palming ensues. Why do people write init scripts that assume a pid/socket file is the source of truth about whether something is up and running?

    Xaprb

    20 Aug 12 at 2:04 pm
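One way to look for real evidence of a running server, sketched in Python: probe the process table instead of trusting a pid or socket file. This assumes a Unix-like system for the signal-0 probe and a Linux-style /proc with per-process comm files for the name scan; both function names are invented for the example, and matching on the process name alone does not tell multiple instances apart.

```python
import os


def pid_is_alive(pid):
    """Probe a PID with signal 0, which checks existence without killing."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process: the pid file was stale
    except PermissionError:
        return True   # process exists but belongs to another user
    return True


def find_pids_by_name(name, proc_root="/proc"):
    """Scan a Linux-style /proc tree for processes whose comm equals name."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if f.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            continue  # process exited while we were scanning
    return sorted(pids)
```

With something like find_pids_by_name("mysqld"), a restart can still find and stop the server after its pid file and socket file have been removed.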

  19. Till,

    The FreeBSD script you reference has some of the braindead behaviors I hate. Example: if the data directory is empty, it creates a default data directory and starts the server anyway. I can’t count how many times I have seen a volume fail to mount because of a configuration error, leaving the bare mount directory. The stupid init script then sees this and starts MySQL on the root filesystem anyway. This is dumb. I did not ask the script to create a blank empty database with insecure user privileges, I asked it to start an existing database. If there is no existing database, that should be a fatal error and the process of starting MySQL should fail.

    Another example is that it waits a hard-coded 15 seconds for the PID file to come into existence. This is completely naive.

    So again, it has zero realistic error checking, actively harmful behaviors, and is overly simplistic. I would not praise it as one of the best, I would grade it a 3 on the scale of 1 to 10.

    Xaprb

    20 Aug 12 at 2:14 pm
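The "missing database is a fatal error" rule could look roughly like this in Python. The check for a mysql subdirectory (the system database) is an assumption about the data directory layout, and check_datadir is a hypothetical helper name.

```python
import os


def check_datadir(datadir):
    """Return None if the data directory looks usable, else an error message.

    Nothing is ever created here: a missing or empty directory is fatal.
    """
    if not os.path.isdir(datadir):
        return "data directory %s does not exist" % datadir
    if not os.listdir(datadir):
        return "data directory %s is empty; is the volume mounted?" % datadir
    # Assumption: an initialized datadir contains the 'mysql' system database.
    if not os.path.isdir(os.path.join(datadir, "mysql")):
        return "no 'mysql' system database found in %s" % datadir
    return None
```

A start action would call this first and exit non-zero on any message, so a volume that failed to mount stops the startup instead of producing a blank database on the root filesystem.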

  20. Is 10 the best, or 1? ;)

    Regardless – I understand your troubles. I think I never ran into these before and maybe just got lucky. I think I agree that, e.g., creating the data_dir is not within the scope of an init script. Maybe I’ll work up a patch and send it to them.

    I also agree the 15 seconds are kind of arbitrary. How would you suggest this is done instead?

    I don’t find Debian’s approach of connecting to the database server for a ping the best either. Startup time can vary depending on the size of the database and whatever it has to do to complete.

    So yeah. Let me know what you would do instead or what a perfect start script would look like in your book, and I could roll this into a patch for them.

    Till

    20 Aug 12 at 2:35 pm

  21. As far as detecting whether the server has started, off the top of my head I’d do something like this: fork off the mysqld process, remember its PID, then go into a loop once a second. While the PID exists, check if you can connect to the server. If not, check the tail of its log file and see what is happening. It is likely to be performing recovery. It is also very possible that it’s in a loop trying to access the InnoDB files which another instance has locked. If that’s the case it should bail out and alert the user. But in the “good” case it either starts up, or delays for recovery and then starts, and the init script should just print progress indicators while that happens. I’m not sure a hardcoded timeout is a good idea. If there is a timeout, it should surely indicate whether the mysqld process is running or not, so the user has some idea that they should inspect manually before trying to start again.

    Xaprb

    20 Aug 12 at 4:41 pm
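The loop described in this comment might be sketched in Python as follows. The three probes are passed in as callables so that the sketch makes no claims about how to connect or where the log lives; every name here is hypothetical, and the fatal patterns are left to the caller because the exact log messages are not given in the discussion.

```python
import time


def wait_for_startup(is_alive, can_connect, log_tail,
                     fatal_patterns=(), report=print,
                     interval=1.0, sleep=time.sleep):
    """Poll once per interval while the forked mysqld process exists.

    Returns "started" once a connection succeeds, "blocked" if the log tail
    contains a fatal pattern, and "exited" if the process goes away first.
    There is deliberately no hard-coded timeout.
    """
    while is_alive():
        if can_connect():
            return "started"
        tail = log_tail()
        report(tail)  # progress indicator, e.g. recovery messages
        if any(pattern in tail for pattern in fatal_patterns):
            return "blocked"  # e.g. another instance holds the InnoDB files
        sleep(interval)
    return "exited"
```

A caller would pass a PID-existence probe for is_alive, a connection attempt for can_connect, and a function returning the last few lines of the error log for log_tail, then tell the user which of the three outcomes occurred.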

  22. Sinisa,

    What is the shortcoming of using kill instead of mysqladmin to shut down the server? I’ve always preferred kill, because it succeeds in more cases. See Eric’s blog post here for more details: http://ebergen.net/wordpress/2012/09/29/shutting-down-with-mysqld-mysqladmin-sigterm-or-sigkill/

    Xaprb

    30 Sep 12 at 8:30 am
