Why MySQL replication is better than mysqlbinlog for recovery
You have a backup, and you have the binary logs between that backup and now. You need to do point-in-time recovery (PITR) for some reason. What do you do? The traditional answer is “restore the backup and then use mysqlbinlog to apply the binary logs.” But there’s a much better way to do it.
The better way is to set up a server instance with no data, and load the binary logs into it. I call this a “binlog server.” Then restore your backup and start the server as a replication slave of the binlog server. Let the roll-forward of the binlogs happen through replication, not through the mysqlbinlog tool.
Why is this better? Because replication is a more tested way of applying binary logs to a server. The results are much more likely to be correct, in my opinion. Plus, replication is easier and more convenient to use. You can do nice things like START SLAVE UNTIL, skip statements, stop and restart without having to figure out where you left off, and so on.
Replication also has the ability to correctly reproduce more types of changes than mysqlbinlog does. Try this with statement-based replication:
insert into tbl(col) values(connection_id());
That’ll work just fine through replication, because the SQL thread on the slave will change its connection ID to match the original. It won’t work through mysqlbinlog.
Further Reading:






and let’s not forget about when you have alot of blobs…
http://bugs.mysql.com/bug.php?id=33048
sbester
4 Sep 10 at 1:31 pm
Unless it’s INSERT INTO table SELECT * FROM information_schema.processlist WHERE id = CONNECTION_ID()…
Scott
4 Sep 10 at 2:17 pm
Baron – I wish there were a simpler way to that
Mark Callaghan
4 Sep 10 at 4:02 pm
Scott, I think you’re saying that INFORMATION_SCHEMA queries don’t replicate correctly in statement-based mode, which I agree with.
Xaprb
4 Sep 10 at 6:16 pm
Mark, what do you wish were simpler?
Xaprb
4 Sep 10 at 6:17 pm
I would rather not have to set up the binlog-only master
Mark Callaghan
4 Sep 10 at 7:29 pm
Will this SQL statement be right using mysqlbinglog:
INsert INTO `tbl` set dateling=NOW();
huarong
4 Sep 10 at 11:05 pm
insert into tbl(col) values(connection_id());
have no problem with mysqlbinlog.
Because there is
SET @@session.pseudo_thread_id=35451752/*!*/;
in the binlogs.
and
INsert INTO `tbl` set dateling=NOW();
will be right too.
SET TIMESTAMP=1283657440/*!*/;
huarong
4 Sep 10 at 11:37 pm
I did not know about pseudo_thread_id. It was not documented until recently, and even now its documentation lacks something: “for internal server use” is not very helpful. Thanks for pointing that out. So my example wasn’t a valid one after all. Still, there are plenty of others.
Xaprb
5 Sep 10 at 7:14 am
I’ve always figured that mysqlbinlog broke the cardinal rule of not writing code to do the same thing in two different places. I’ve had a lot of trouble with it in the past, although I’m not able to remember just now what those problems were. I fell back to using mysql replication itself much as you said above a long time ago.
Jeremy Cole
6 Sep 10 at 1:48 am
@Baron,
The new “MySQL High Availability” (O’Reilly) covers these issues (e.g. pseudo process Id, session variables, etc.), and then some. The first 100 pages are more of a binary-log specification & documentation.
Shlomi Noach
11 Sep 10 at 7:06 am
I have that book on my to-read list but I haven’t gotten to it yet.
Xaprb
12 Sep 10 at 7:28 am