Let’s suppose that your backup process looks like this: you stop a replication slave, shut down MySQL, and copy away the data directory. Assume that the slave is perfect and has the same data as the master. Nothing is broken, nothing is wrong, everything is working fine. In most cases, this should work, right?
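For concreteness, here’s a minimal sketch of that process; the paths are examples, and this assumes a typical Linux install with default locations:

mysql -e 'STOP SLAVE'        # stop replication so the data stops changing
mysqladmin shutdown          # take MySQL down cleanly
cp -a /var/lib/mysql /backups/mysql-$(date +%F)   # copy the data directory away
mysqld_safe &                # bring the server back up; by default the
                             # slave threads restart on their own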
Under what kinds of circumstances will you not get all your data back if you restore the file copy and start MySQL?
A while ago Peter Zaitsev wrote about his wishes for mysqldump. These included multi-threaded dumps and “safe” dumps that would wait for a crashed server to restart, then keep dumping the remaining tables. I’ve had sketches of this done for a while, and this week I fleshed it out while writing about backup and recovery for our upcoming book. I had my own list of features I wanted to add:
- Support for backup sets, with the backup configuration stored in the database itself.
- Emphasis on tab-delimited dumps.
- Sane defaults, focused on ease of use and ease of recovery.
- Support for compression by default.
The resulting script is satisfactory to me. If you just run it without arguments, it connects to the server specified in your .my.cnf file and dumps all databases and tables, one gzipped file per table, in parallel (by default it detects the number of CPUs and runs that many dump threads, with a minimum of two).
baron@kanga $ mysql-parallel-dump
SET      DATABASE  TABLE          TIME  STATUS  THREADS
default  mysql     columns_priv      0       0        2
default  mysql     db                0       0        2
default  mysql     help_category     0       0        2
default  mysql     func              0       0        2
default  mysql     help_keyword      0       0        2
...snip...
default  test      t1                0       0        2
default  test      t2                0       0        1
You can tell it to dump elsewhere, and it’s easy to dump all tables in tab-delimited format. Here it’s reading its configuration from the database, writing to /tmp, and not backing up tables that have been dumped in the last 5 minutes:
baron@kanga $ mysql-parallel-dump --basedir /tmp --tab --sets set1 \
   --settable test.backupset --age 5m
Nothing to do for set set1
baron@kanga $ ls -lR /tmp/set1
/tmp/set1:
total 8
-rw-rw-rw- 1 baron baron   40 2007-09-30 21:43 00_master_data.sql
drwxrwxrwx 2 baron baron 4096 2007-09-30 21:43 test

/tmp/set1/test:
total 16
-rw-rw-rw- 1 baron baron  549 2007-09-30 21:43 t1.sql.gz
-rw-rw-rw- 1 baron baron   31 2007-09-30 21:43 t1.txt.gz
-rw-rw-rw- 1 baron baron  550 2007-09-30 21:43 t2.sql.gz
-rw-rw-rw- 1 baron baron   29 2007-09-30 21:43 t2.txt.gz
And as you can see, it knows I’ve dumped those tables recently and didn’t dump them again. Pretty handy for scheduling and resuming backups, no? If something interrupts a backup partway through, you can just restart it and it picks up where it left off.
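The --age option also makes unattended scheduling straightforward. Here’s a hypothetical crontab entry (the path is made up, and I’m assuming --age accepts an hours suffix the same way it accepts the minutes suffix shown above):

# Try the backup every hour; anything dumped within the last 23 hours is
# skipped, so an interrupted run simply resumes on the next attempt.
0 * * * * mysql-parallel-dump --basedir /backups --sets set1 --age 23h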
I’m aware of the similar mysqlpdump script, and I generally don’t like duplicating other people’s efforts, but I decided to go ahead and finish what I’d started. To tell you the truth, neither script is complicated. It’s just a matter of providing a sensible wrapper around existing functionality (in my case, mysqldump and SELECT INTO OUTFILE, which I issue directly rather than asking mysqldump to do it with -T; that would just make mysqldump into the same kind of wrapper). I also wanted to provide it as part of MySQL Toolkit, so it’s all in one place. Frankly, I also built in a lot more functionality than mysqlpdump has, and I consider the defaults to be more useful. I’d love for mysqldump itself to have better defaults, especially for dumping large datasets, which it handles poorly right now. One of these days MySQL AB will make me obsolete, I just know it…
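In case the distinction isn’t obvious, here’s roughly what dumping one table that way boils down to; the table and file names are just examples, and the real script adds locking and compression on top:

mysqldump --no-data test t1 > /tmp/set1/test/t1.sql                    # schema
mysql -e "SELECT * INTO OUTFILE '/tmp/set1/test/t1.txt' FROM test.t1"  # data

Running mysqldump with -T /tmp/set1/test would produce essentially the same pair of files; it just issues the SELECT INTO OUTFILE for you, which is why I skip the middleman.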
Oh, and in keeping with my tradition, it’s sort of ridiculously sophisticated and overly generic. It has a little macro language that you can use to basically turn it into a loop iterator over the selected databases and tables, and run any command you wish. Here’s an example:
mysql-parallel-dump -- mysqldump --skip-lock-tables '%D' '%N' \
   \| gzip --fast -c - \> '%D.%N.gz'
That basically duplicates the built-in defaults (except the real defaults are a lot more complicated than that). The %D and %N macros expand to the current database and table name, so this produces one gzipped file per table. It illustrates how you could use the script as a shell that selects which tables to dump and forks off sub-processes, handling all the locking, error checking, and so forth for them. Here I’m spawning mysqldump, but it would be just as easy to execute a custom script.
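For example, something like this would run a script of your own once per selected table (my-table-backup is purely hypothetical, a stand-in for whatever you’d write):

mysql-parallel-dump -- /usr/local/bin/my-table-backup '%D' '%N'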
There’s one more wish Peter and I both have, but which is impossible right now, as far as we know: parallel SELECT INTO OUTFILE dumps of many tables in a single transaction. That won’t be possible until more than one connection can participate in the same transaction. Ask the MySQL guys about that one!
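To illustrate why: each connection that starts a consistent-read transaction gets its own InnoDB snapshot, so two parallel dumps can see different views of the data if writes happen in between. The table names here are just examples:

# Connection 1 takes one snapshot...
mysql -e "START TRANSACTION WITH CONSISTENT SNAPSHOT;
SELECT * INTO OUTFILE '/tmp/t1.txt' FROM test.t1;
COMMIT"

# ...and connection 2 takes another, possibly after new writes, so t1.txt
# and t2.txt aren't guaranteed to be consistent with each other.
mysql -e "START TRANSACTION WITH CONSISTENT SNAPSHOT;
SELECT * INTO OUTFILE '/tmp/t2.txt' FROM test.t2;
COMMIT"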
This script is part of MySQL Toolkit and will be released as soon as I have time. There are a few other bug fixes I want to include. In the meantime, if you’re dying to get it, you can grab it from the MySQL Toolkit subversion repository.
Progress on High Performance MySQL, Second Edition is coming along nicely. You have probably noticed the lack of epic multi-part articles on this blog lately; that’s because I’m spending most of my spare time on the book. At this point, we have significant work done on some of the hardest chapters, such as Schema Optimization and Query Optimization. I’ve been deep in the guts of those hard optimization chapters for a while now, so I decided to venture into lighter territory: Backup and Recovery, one of the few chapters we planned to “revise and expand” from the first edition rather than write completely from scratch.
Since we decided to take that approach, I began by following the outline from the first edition. I figured I’d re-read the first edition’s chapter, re-outline it, and add more material as appropriate. To my surprise, I found that this chapter is one of the most cursory in the first edition (I don’t mean to criticize too much; you’ll see where I’m going with this in a second). It’s quite short and doesn’t really discuss recovery at all, despite the chapter title. There’s one sub-section titled “Recovery,” but it’s only a few paragraphs, and it mostly discusses dumping, not recovery! [Edit: whoops, I see each subsection in “Tools and Techniques” has a few words about how to restore backups created with that specific tool. But there’s still not much general advice on restoring backups.]
The chapter devotes a lot of space to code listings and the like, and not enough, in my opinion, to how to do backups in a high-performance application. I quickly decided it needs to be significantly expanded, not just updated, so I scrapped the original text and took more liberties with the outline. I’m referring to the first edition as I write, but I’m not keeping any of the text. Chalk it up to perfectionism.
The outline, as I have it so far, is as follows. If you compare it to the first edition, you’ll see I’ve rearranged it quite a bit:
1 Why Backups? (very brief, even more so than the first edition)
2 Considerations and Tradeoffs
  2.1 How Much Can You Afford to Lose?
  2.2 Online or Offline?
  2.3 Dump or Raw Backup?
  2.4 Onsite or Offsite?
  2.5 What to Back Up
  2.6 Storage Engines and Consistency
  2.7 Replication
3 Restoring from a Backup
  3.1 Copying Files Across the Network
  3.2 Starting MySQL
  3.3 Point-In-Time Recovery
4 Tools and Techniques
  4.1 mysqldump
  4.2 mysqlhotcopy
  4.3 Zmanda Recovery Manager
  4.4 InnoDB Hot Backup
  4.5 Offline Backups
  4.6 Filesystem Snapshots
  4.7 MySQL Global Hot Backup
  4.8 Automating and Scripting Backups
5 Rolling Your Own Backup Script
At this point, I have written sections 1, 2, and 3, which come to about 11 pages in OpenOffice.org (compared to 6 printed pages in the first edition). I’m sure this will only grow as other things occur to me. The outline of section 4 is completely open to change, and section 5 might not even happen; if you can script, you hardly need me to show you how, and otherwise, you’re probably better off with one of the tools listed in section 4. All in all, I’d say we’re looking at about 25 to 30 pages, just based on what’s in my head and not yet written down.
Now, to come to my point: what would be helpful to you? Are there any challenges you’d like me to cover, such as backing up a data warehouse with terabytes of data? (I’ve already covered that in “What to Back Up,” but feel free to ask anyway.) Have you solved challenges whose solutions you think would help others? This chapter is largely open to suggestion at this point. If you tell me/us what you’d like to see, this is your opportunity to get at least four experts to solve your problems in depth.
The usual disclaimers apply: no guarantees, this is all open to change, this is top-secret pre-production material anyway and you never saw this web page. What is the first rule of Fight Club, again?
I’m looking forward to your feedback.