Archive for the ‘backups’ tag
After I wrote about things you need to know about MySQL backups, a customer contacted me and asked me what I know about SecoBackup for MySQL. I see it has a very low cost and Percona has Amazon accounts for testing purposes, so I quickly downloaded s3sql_188.8.131.52-2.01_i386.deb, installed it, configured it, and gave it a whirl.
Since I just want to see what it does to take a backup, I started up a sandbox running from /tmp/12345 and configured it to backup msandbox:email@example.com (I shut down my main mysqld on my laptop to make sure it can’t connect to the default instance).
Then I configured a backup set and tried to take a backup. Right away I saw it isn’t full featured enough. It doesn’t let you specify a port to connect to. This is fixable. They just have to add better support for all the command-line parameters like port, SSL, socket, etc. It is not enough to specify host, username and password. But it gives me a funny feeling about their level of experience with production MySQL servers.
And so I expected to see it using mysqldump. I tried a backup, and indeed I saw it’s using mysqldump. It could not connect to port 12345, which is where I set up my sandbox. I see no way to make it do that. I tried changing the port to the default 3306 and tried again, but it still doesn’t manage to connect to it to take a backup.
In any case, mysqldump might be a fine way to take a backup, for some people, but I need to see what command-line parameters it’s using before I can determine. So I hacked around for a while, got it to take a backup of my default instance, and eventually found out how to get the parameters it used to dump the data. Here are the parameters:
–opt –extended-insert –single-transaction –default-character-set=utf8 –create-options
I also turned on the log to my mysql instance and examined it afterwards to find out more about what this software really does to the database. I didn’t see anything unusual.
So having done that, how does SecoBackup stack up against my list of ten things to know?
- Does the backup require shutting down MySQL? If not, what is the impact on the running server? Blocking, I/O load, cache pollution, etc? Answer: It uses mysqldump, so you can use it online, but you’ll get all the usual stuff: cache pollution, extra load on the server, and so on. For large databases, mysqldump is unusable, so it won’t work for them.
- What technique is used for the backup? Is it mysqldump or a custom product that does something similar? Is it a filesystem copy? Answer: mysqldump.
- Does the backup system understand that you cannot back up InnoDB by simply copying its files? Answer: It doesn’t do file copies.
- Does the backup use FLUSH TABLES, LOCK TABLES, or FLUSH TABLES WITH READ LOCK? These all interrupt processing. Answer: None of the above.
- What other effects are there on MySQL? Iâ€™ve seen systems that do a RESET MASTER, which immediately breaks replication. Are there any FLUSH commands at all, like FLUSH LOGS? Answer: None of the above.
- How does the system guarantee that you can perform point-in-time recovery? Answer: It does not. It doesn’t capture the binary log positions.
- How does the system guarantee consistency with the binary log, InnoDB logs, and replication? Answer: It does not. It doesn’t capture the binary log positions.
- Can you use the system to set up new MySQL replication slaves? How? Answer: No. It does not capture the master log positions.
- Does the system verify that the backup is restorable, e.g. does it run InnoDB recovery before declaring success? Answer: It does not verify backups.
- Does anyone stand behind it with support, and guarantee working, recoverable backups? How strong is the legal guarantee of this and how much insurance do they have? Answer: I’ll skip this question.
And then there’s the other question someone asked in the comments on the original article: how long does it take to restore the backup? The answer is, for big databases it’s going to take A Very Long Time. This is another reason why mysqldump is unusable for backing up large databases.
Overall I’m not all that impressed with the quality of the software; I crashed it a number of times trying to set things up and take backups, and it does sort of naive things like print output without a trailing linebreak so the terminal gets messed up. But whether it’s a good choice really depends, I think, on your data. (Naturally.) It seems like it’s pretty convenient, but even if they fixed the problems and added –master-data to the mysqldump options, it would absolutely not work for a lot of the systems I work on. Even if you could back up some of those servers with mysqldump, it would take way too long to restore.
But here is the most important thing: It is neither Open Source nor Free Software, and I had to download it and try it out to find out that it uses mysqldump rather than some other technique. Nowhere on their website does SecoBackup mention that they take backups via mysqldump. That’s basic information that I would like to see right up front. Any backup system should disclose how it works.
I keep reading that Ma.gnolia’s data is permanently lost because “a specialist had been unable to recover any data from the corrupted hard drive.” This is not in itself a reason to consider data completely lost.
It is not clear to me whether the hard drive itself is unusable, e.g. the spindle won’t spin and the head won’t read the ones and zeroes, or whether the filesystem is corrupted. It sounds to me, from reading Larry Hallf’s comments, like it’s a simple matter of filesystem corruption. And even if the disk is dead, there is apparently a backup made from the corrupted filesystem, so there should be more than one way to try to recover this data: “Ma.gnolia’s database server suffered from file system corruption, which also corrupted it’s database backup, even though it was on a separate system.”
You don’t need to recover your filesystem to recover your MySQL data. Shameless plug: Percona can do it for you. We can get the raw data off a block device without even trying to mount it as a filesystem. Recovering MySQL data is not the same as recovering other types of data. If the disk spins, it might be possible to recover data from it.
Whether it’s worth it or not is another matter. Percona data recovery isn’t cheap, but it’s worth it for at least some people. I cannot name names, but you are using services from companies that have retained Percona to recover from worse cases of data loss than this appears to be, going by the limited available information. The original reason we built our data recovery toolset was to help one of the world’s largest corporations.
But cost and time may not have been the driving factor here. Whoever the unnamed data recovery specialist was, they took a long time and got no results. And now Ma.gnolia has given up and declared it a lost cause, which is sad for their users. I hope Larry Halff didn’t pay for the results he didn’t get. And I hope he didn’t wipe out his corrupted backup yet.
In the meantime, at least this incident is shining a bright light on the need for tested, verified backups. I’ve had two clients ask me how they can avoid ending up the same way as Ma.gnolia.
This is actually old news, but I never thought to file a bug report (until now) or say anything to anyone about it. If you use mysqldump to dump and restore a MySQL table that has INSERT triggers, you can get different data in your restored database than you had when you dumped. The problem? The tool dumps the triggers before the data, so they get added back to the table before the rows are inserted.
The fix for this is really trivial. Just move the triggers after the INSERTs. Someone should patch mysqldump, but I’ve been too lazy… and besides, you know what I say about me and C programming. Unsafe at any speed.
This isn’t a problem with mk-parallel-dump, by the way. It puts the triggers in separate files, and mk-parallel-restore adds them back after the data is restored.