Xaprb

Stay curious!

Archive for January, 2009

MySQL disaster recovery by promoting a slave

I was just talking to someone who backs up their MySQL servers once a day with mysqldump. I said that in a catastrophe, they're going to have to reload from a backup: that's some amount of downtime, plus up to a day of lost data.

And they said “We can just promote a slave, we’ve done it before. It works fine.”

Granted, in many cases this is fine. There are all sorts of caveats: for example, you either know that your slave has the same data as the master, or you don’t care. But it’s fine for some things.

So then I said “what about DROP TABLE?”

And there was a pause. A DROP TABLE on the master replicates to the slave, so the slave loses the table too. I assume they were realizing that the chance of accidental or malicious destruction of data is much higher than the chance of multiple servers dying at once. This is why slave != backup.

How about you?

Granted, you can use a delayed slave to protect against this particular scenario. But you still need “real” backups, and you still have to think about the worst case — restoring that backup.

Written by Xaprb

January 20th, 2009 at 7:33 pm

iopp: a tool to print I/O operations per-process

Mark Wong’s entry titled “Following up a couple questions from the presentation at PSU on January 8, 2009” just caught my eye:

What is ‘iopp’?

It’s a custom tool to go through the Linux process table to get i/o statistics per process. It is open source and can be downloaded from:

http://git.postgresql.org/?p=~markwkm/iopp.git;a=summary

If you know me, you know I can’t pass up “I/O statistics per process.” No way. So, after a moment of browsing the code, which is short and to the point, I tried it out:

baron@kanga:~$ wget -q -O iopp.c "http://git.postgresql.org/?p=~markwkm/iopp.git;a=blob_plain;f=iopp.c;hb=HEAD"
baron@kanga:~$ gcc -o iopp iopp.c 
baron@kanga:~$ ./iopp --help
usage: iopp -h|--help
usage: iopp [-ci] [-k|-m] [delay [count]]
            -c, --command display full command line
            -h, --help display help
            -i, --idle hides idle processes
            -k, --kilobytes display data in kilobytes
            -m, --megabytes display data in megabytes

Sweet! Next,

baron@kanga:~$ ./iopp -i -k 5
  pid    rchar    wchar    syscr    syscw      rkb      wkb     cwkb command
 4912        2        1        0        0        0        0        0 dbus-daemon
 5713        0        1        0        0        0        0        0 hald
 5717       17        0        0        0        0        0        0 hald-runner
 5932        0        2        0        0        0        0        0 NetworkManager
22101       94       28        0        0        0        0        0 Xorg
22238        4        4        0        0        0        0        0 pulseaudio
22684       29       55        1        0        0        0        0 firefox
26860        0       43        0        0        0        0        0 gnome-terminal

It behaves just like vmstat — it loops every 5 seconds until I stop it.

So what are we looking at here? I don’t see any documentation, but I see from the source that it’s reading /proc/[PID]/io. Unfortunately that’s not documented in my proc manpage, but there’s a patch that provides documentation for the file’s contents.

According to that, the columns are: the pid; rchar and wchar, the number of kibibytes read and written (even if they came from the cache); syscr and syscw, the number of read and write system calls; and rkb and wkb, the number of kibibytes read from and written to the physical medium (i.e. not just the OS cache). Finally we have cwkb, the canceled write kibibytes, and the command name. I won’t repeat the documentation on the canceled write bytes. It is what it sounds like, but there’s a little more explanation in the patch I mentioned.
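
As a rough illustration of where those columns come from, here is a minimal Python sketch that parses the text of a /proc/[PID]/io file into a dictionary. This is not iopp's code: the function name parse_proc_io is hypothetical, the sample values are only illustrative, and the sketch assumes the file consists of "name: value" lines with the seven field names shown.

```python
def parse_proc_io(text):
    """Parse the contents of a /proc/[PID]/io file into a dict of ints.

    Hypothetical helper for illustration; not taken from iopp's source.
    """
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # skip blank or malformed lines
        stats[name.strip()] = int(value)
    return stats


# Illustrative sample in the "name: value" layout of /proc/[PID]/io.
SAMPLE = """\
rchar: 323934931
wchar: 323929600
syscr: 632687
syscw: 632675
read_bytes: 0
write_bytes: 323932160
cancelled_write_bytes: 0
"""

if __name__ == "__main__":
    for name, value in parse_proc_io(SAMPLE).items():
        print(f"{name:>22} {value}")
```

On a Linux machine whose kernel has per-task I/O accounting enabled, passing the contents of /proc/self/io to the same function should give the counters for the current process.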

This tool would have been very handy to know about last week. One of my clients was seeing a lot of disk writes from a MySQL server, and it would have made it considerably easier to diagnose the problem.

There is one small bug: the -i flag causes idle processes not to be printed, but it’s applied after bytes have been converted into kibibytes or mebibytes, so any process whose counters are all zero after that conversion gets filtered out, even if it did a small amount of I/O. As a result, you’ll get different output from -i -k than you will from -i or from -i -m. I’ll see if I can find the author’s email address and let him know about this…
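
To make the ordering problem concrete, here is a small Python sketch. It is not iopp's actual C code: the function names and the sample numbers are hypothetical, and it assumes the unit conversion truncates like integer division. It compares filtering idle processes after the conversion, which is the behavior described above, with filtering on the raw byte counts first.

```python
def to_unit(nbytes, divisor):
    """Convert a byte count to a larger unit using integer division."""
    return nbytes // divisor


def filter_after_conversion(procs, divisor):
    """Convert first, then drop rows that are all zero (the buggy order)."""
    rows = {pid: [to_unit(n, divisor) for n in counts]
            for pid, counts in procs.items()}
    return {pid: row for pid, row in rows.items() if any(row)}


def filter_before_conversion(procs, divisor):
    """Drop processes with no raw activity first, then convert."""
    return {pid: [to_unit(n, divisor) for n in counts]
            for pid, counts in procs.items() if any(counts)}


# Hypothetical per-interval byte counts: pid -> [bytes read, bytes written].
procs = {
    100: [0, 0],        # truly idle
    200: [500, 0],      # active, but under 1 KiB
    300: [8192, 4096],  # clearly active
}

if __name__ == "__main__":
    print(filter_after_conversion(procs, 1024))
    print(filter_before_conversion(procs, 1024))
```

With these numbers, pid 200 disappears when the filter runs after conversion to kibibytes, but it is kept (shown as zeroes) when the filter runs on the raw counts.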

Written by Xaprb

January 13th, 2009 at 8:16 pm

Where do you use Maatkit in real life?

I note that Maatkit has been deemed unworthy of mention on Wikipedia. Someone emailed me the deletion log today:

20:13, 10 July 2008 Djsasso (Talk | contribs) deleted “Maatkit” ‎ (WP:PROD, reason was ‘Non-notable application, single primary source’.)

I have never been a very big promoter. I prefer to let people promote things themselves, and I try to follow a policy of attraction rather than promotion.

With that said, I often hear people saying some variant of “Maatkit saved my behind, thanks so much. I use it all the time.”

But that’s in my private email and phone calls. So now is your chance to say so in public. Put your love story in the comments, and let’s see if Maatkit is notable or not. Oh, and feel free to bring back the Wikipedia page if you think Wikipedia is notable enough that it’s important for Maatkit to be there ;-)

Written by Xaprb

January 13th, 2009 at 10:45 am

Posted in Maatkit, SQL