Introducing MySQL Slave Delay
MySQL Slave Delay implements an oft-requested replication feature: the ability to make a slave lag its master, a.k.a scheduling binlog events for some time in the future. This will probably be built into MySQL replication someday, but in the meantime, you can use this tool. A delayed slave is great if disaster strikes and propagates through replication, because it lets you stop the delayed slave before it gets corrupted, restart the slave until just before the offending statement, and recover quickly. This is a lot faster than restoring last night’s full backup and replaying a day’s worth of binary logs on the master.
This is the first tool largely contributed by someone else. Sergey Zhuravlev sent me the original program over email, and I tweaked it to watch the slave’s relay logs so it doesn’t need to connect to the master, added documentation, and released it.
What it does
MySQL Slave Delay watches the slave’s relay log positions and makes the SQL thread lag the I/O thread by starting and stopping it as needed. It can also watch the master’s log positions, but this is usually not needed unless the slave’s I/O thread delay is large (typically it’s just milliseconds, but it depends on the network speed).
Since you can’t tell how far behind the slave is unless it’s running, MySQL Slave Delay samples and remembers the binlog positions. Later, when it needs to start the slave, it recalls the master’s position at the desired time ago, and makes the slave run until it hits that position.
Another way to think about it is that it schedules replication events for sometime in the future, instead of letting them be applied immediately.
All in all, a simple and easy way to accomplish delayed replication.
About MySQL Toolkit
MySQL Toolkit is a set of essential tools for MySQL users, developers and administrators. The project’s goal is to make high-quality command-line tools that follow the UNIX philosophy of doing one thing and doing it well. They are designed for scriptability and ease of processing with standard command-line utilities such as awk and sed.

Sergey and Baron,
Please send me your size information for twitter t-shirts.
You keep blowing me away with your tools (which we use!).
Also, please put up a donate link or wishlist.
Jeremy
Jeremy
5 Aug 07 at 3:38 am
Nice, very nice.
I’m giving it a bit of field trial now on some sandbox DBs, and if it holds up, yes it will be put into production.
NB: I saw a slave SQL thread kill “error” presented as a note in mysqld.log @ the slave machine, but am not worried ab’t it just yet – I want to see if a dropped table on the master’s DB actually waits 30 minutes –my setting– to propagate to the slave. If so, we’re golden here.
FYI, a typical default rig should look something like this, though:
mysql-slave-delay –quiet –delay 30m –interval 15s h=dhdb01,u=myLocalUser,p=myPassword dhdb01
(well, aside from the intentionally munged user/pass names and a quickie ampersand on the end, that’s exactly what I used).
Took a second to grok the syntax for user/pass, and a lot of folks lock down their DBs, so… there you go :)
Cheers!
/P
Penguinisto
7 Aug 07 at 5:20 pm
Hi! If it doesn’t hold up in production, it’s a bug. Report it via Sourceforge if so!
About the user/pass, I usually have these in a ~/.my.cnf file so they don’t have to be specified at the commandline.
Xaprb
7 Aug 07 at 10:03 pm
No worries… I had a bit of trouble getting it to read out of my.cnf (where the requisite info was already lodged). This may be to some paranoia that we tend to implement around here, though.
Otherwise it does great for setting a 30m delay, and I see no problems (yet – still stress-testing it) with putting it into production. Any bugs will get sent up the chain, I promise :)
The error msgs I spoke are these (sanitized) – doesn’t seem to slow anything down at all, though, so I’m not particularly worried about seeing them. That, and it was mentioned in the docs anyway. For the rubberneckers, here’s what pops up (success, failure, success, failure, etc – all notes):
—
070808 8:15:36 [Note] Slave SQL thread initialized, starting replication in log ‘(binlog).000001′ at position 3437288, relay log ‘./(my)-relay-bin.000004′ position: 2650095
070808 8:15:51 [Note] Error reading relay log event: slave SQL thread was killed
070808 8:45:52 [Note] Slave SQL thread initialized, starting replication in log ‘(binlog).000001′ at position 3454088, relay log ‘./(my)-relay-bin.000004′ position: 2666895
070808 8:46:07 [Note] Error reading relay log event: slave SQL thread was killed
070808 9:16:07 [Note] Slave SQL thread initialized, starting replication in log ‘(binlog).000001′ at position 3471168, relay log ‘./(my)-relay-bin.000004′ position: 2683975
070808 9:16:22 [Note] Error reading relay log event: slave SQL thread was killed
—
Like I said, no worries, and everything works as advertised. Just figured you’d want to peek at ‘em.
I do have something to ponder, though; the slave server I implemented this on is about to be in the middle of a daisy-chain where two-way replication occurs. This is actually good in that changes I make to this DB propagate out immediately, while inbound changes are delayed. That said, I’m about to add another DB machine parallel to it (for a poor-man’s RAC rig w/ heartbeat/STONITH/scripting).
Is there a way to rig a delay for only one specific master, while letting the other roll its changes right in? I suspect that’s part of what the master binary DB check can be used for, but I’m not sure.
If not, no worries – I can just set the new DB to replicate off the same one that my first one does, set up the STONITH/heartbeat and scripting, and possibly create a replication loop of sorts as a ’sanity check’ (I haven’t even checked to see if that’s possible w/o inducing problems yet). If that works I’ll document the whole thing (I have to document it anyway) and send it along w/ all the names/users/etc info sanitized out.
Besides, this is rather fun (and they pay me for it)… :)
Cheers!
/P
Penguinisto
8 Aug 07 at 12:24 pm
Hi,
It could be that the .my.cnf didn’t work because the tool only reads the [mysql] section of the file, so if you have a [client] section it wouldn’t read that. Maybe that should be changed; I can’t decide.
In the scenario you describe, I think you’re talking about A->B->A replication. If you run the tool against B, it will only slow B getting the changes from A. This is because (I assume) you’ll have both servers configured with log_slave_updates. So an event on A will go to B’s relay log, sit for half an hour, execute, go into B’s binary log, go to A’s relay log — and be ignored, since it came from A to begin with. However, events on B will go into B’s binary log, A’s relay log, and get executed on A right away.
If you are talking about A->B->C replication, though, it’s different. A’s event will be delayed half an hour on B, then go into B’s binary log and then to C’s relay log, delayed half an hour. So in case you’re thinking about trying to delay B but not C, I don’t think you can do it.
This conversation brings up an interesting thought: the relay logs could be delayed too. I can’t think of why that would be useful right now, though.
Xaprb
8 Aug 07 at 1:28 pm
[...] Zhuravlev on Xaprb wrote about new extension for MySQL replication – slave delay. Interesting idea with some obvious, [...]
</depesz> » Blog Archive » Log Buffer #57: a Carnival of the Vanities for DBAs
10 Aug 07 at 12:01 pm
I’m running,
mysql-slave-delay u=user,h=localhost,p=pass –delay 10m –interval 15s –time 20m localhost
and seeing the following error,
2007-08-24T16:09:17 slave running 0 seconds behind
Use of uninitialized value in concatenation (.) or string at /usr/bin/mysql-slave-delay line 269.
Any ideas?
Jodrell
24 Aug 07 at 11:12 am
I have just one idea: it’s a bug :) I’ll file it at the sourceforge project and take a look at reproducing and fixing it.
Xaprb
24 Aug 07 at 1:32 pm