Dump and reload InnoDB buffer pool in MySQL 5.6
After Gavin Towey’s recent blog post about Percona Server’s buffer pool dump locking the server for the duration of the operation, I thought I should re-examine MySQL 5.6’s implementation of a similar feature. When InnoDB engineers first announced the feature, I didn’t think it was complete enough to serve a DBA’s needs fully.
If you’re not familiar with this topic, MySQL 5.6 will allow the DBA to save the IDs of the database pages that are in the buffer pool, and reload the pages later. This technique can help a server to warm up in minutes instead of hours after a restart or failover.
I read through the documentation, and it looks good. I still think it might be good to have a built-in configuration variable to save the page IDs at regular intervals. But the approach MySQL 5.6 has taken will allow a DBA to use an event or a script to trigger that, so it’s more of an inconvenience than a showstopper. On the other hand, dumping on shutdown and reloading on startup are probably the most useful behaviors, and MySQL 5.6 does include that.
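As a rough illustration of the "script" option, here is a minimal Python sketch that triggers a dump on demand and reads back its status. It uses the `innodb_buffer_pool_dump_now` system variable and the `Innodb_buffer_pool_dump_status` status variable from the MySQL 5.6 documentation. The function name is made up for this example, and `cursor` is assumed to be any DB-API 2.0 cursor opened by an account allowed to run `SET GLOBAL`; connection setup and scheduling (cron, for instance) are left out.

```python
def dump_buffer_pool_now(cursor):
    """Ask InnoDB to save the buffer pool page IDs, then return the status text.

    `cursor` is assumed to be a DB-API 2.0 cursor; this helper is hypothetical
    and not part of MySQL or any connector library.
    """
    # Trigger the dump immediately instead of waiting for shutdown.
    cursor.execute("SET GLOBAL innodb_buffer_pool_dump_now = ON")
    # The status variable reports progress or completion of the last dump.
    cursor.execute("SHOW STATUS LIKE 'Innodb_buffer_pool_dump_status'")
    row = cursor.fetchone()
    # SHOW STATUS rows are (Variable_name, Value) pairs.
    return row[1] if row else None
```

Calling this from a scheduled job approximates a "dump on interval" setting; the dump-on-shutdown and load-on-startup behaviors are controlled separately by `innodb_buffer_pool_dump_at_shutdown` and `innodb_buffer_pool_load_at_startup`.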
There is also more visibility into status and progress of the operation, which is good.
The million-dollar question is whether InnoDB’s implementation blocks the server’s operation, or whether it works without interrupting service. I’ll be curious to see if anyone has tested that.



We have a fix that does not block server operation anymore. In fact, it is already in Percona Server 5.1.65-14.0 release and also will be in a forthcoming Percona Server 5.5 release. Feel free to track https://bugs.launchpad.net/percona-server/+bug/686534
Laurynas
18 Sep 12 at 9:50 am
Laurynas,
Have you tried to measure how much time is spent in scanning the buffer pool (just remembering each page somewhere in memory, without IO) and how much time is spent writing the list of pages to disk?
Joe Smith
18 Sep 12 at 12:14 pm
Joe -
Unfortunately we don’t have such measurements. Due to the way the fix works, I am not sure that the absolute CPU time spent in the dump thread would be the best metric for the feature, as the periodic acquisition and release of the mutex makes the dump thread behave more like yet another user transaction requiring buffer pool resources.
It would be interesting to do overall Percona Server performance comparison during the dump with and without the fix.
Laurynas
18 Sep 12 at 12:57 pm
When we ported this patch to our branch, we fixed this hang as well:
https://code.launchpad.net/~percona-dev/percona-server/fb_changes_auto_lru_dump/+merge/46012
Harrison Fisk
18 Sep 12 at 1:48 pm
Both implementations can block buffer pool operations for certain periods of time.
Looking at the fix for the above-mentioned bug by Laurynas (https://code.launchpad.net/~gl-az/percona-server/5.1-686534-881001/+merge/119062), I’d say that “does not block server operation anymore” is wrong, too. The LRU_list_mutex is still held whilst scanning the buffer pool pages; it’s just released whilst writing to the file, and reacquired immediately afterwards. And this is done for each individual page.
I really question parts of that fix too, especially the changes on lines 117/118 – the mutex should already be held there (line 109), and the goto skips releasing the mutex (and that is not done within end either).
The approach we have taken is very different. For those that want to follow along code wise see – http://bazaar.launchpad.net/~mysql/mysql-server/5.6/view/head:/storage/innobase/buf/buf0dump.cc#L178.
Instead of scanning the whole buffer pool LRU list in one go, we scan each buffer pool *instance* individually, and only hold the buf_pool mutex whilst we are getting the space# and page# number of each page, buffering the entire buffer pool instance’s pages along the way. We then release the buf_pool mutex, and write the entire buffer of page info per buffer pool instance out to the file.
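For readers who don’t want to open buf0dump.cc, the pattern described above can be modelled in a few lines of Python. This is only a toy sketch of the locking pattern, not the InnoDB code: the `BufferPoolInstance` class, the `dump_per_instance` function, and the "space,page" line format written here are all assumptions made for illustration.

```python
import threading


class BufferPoolInstance:
    """Toy stand-in for one buffer pool instance (hypothetical, not InnoDB's struct)."""

    def __init__(self, pages):
        self.mutex = threading.Lock()
        self.lru = list(pages)  # (space_id, page_no) tuples


def dump_per_instance(instances, out):
    """Snapshot each instance under its own mutex, then write with no mutex held."""
    for inst in instances:
        # Hold this instance's mutex only while copying the page IDs to memory.
        with inst.mutex:
            snapshot = list(inst.lru)
        # The file I/O happens after the mutex is released.
        for space_id, page_no in snapshot:
            out.write(f"{space_id},{page_no}\n")
```

In this model the hold time for each mutex grows with the number of pages in that instance, which is why splitting the pool into more instances shortens each individual hold.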
Overall this means that you may get better concurrency with larger buffer pools if you appropriately tune the innodb_buffer_pool_instances variable as well – http://dev.mysql.com/doc/refman/5.6/en/innodb-performance.html#innodb-multiple-buffer-pools – tuning this with the Percona approach would have no effect.
Which approach is better? Hard to say without a comparison benchmark. The Percona approach seems like it would acquire/release the mutex much more often, whereas our approach acquires different mutexes depending on the buffer pool instance, yet may hold each for slightly longer periods of time whilst generating the list to write out to the file.
Out of interest, why was the “dump on interval” done in the first place? Is that in case of a server crash (the only reason I can think of)?
Mark Leith
18 Sep 12 at 2:32 pm
Yes. In case of a server crash, if the LRU dump is on a DRBD or SAN along with the data, then a cold standby can be brought up quickly. This was one of the major use cases for the feature originally. It can be done with a script or event, but DBAs would rather that a feature like this be built in so there’s one less thing to remember and monitor.
Xaprb
18 Sep 12 at 3:31 pm
Mark -
Yes, our implementation holds the mutex for every 4096KB of dump, which amounts to scanning half a million pages on the LRU list and 8GB of buffer pool dumped. We may adjust this amount up or down depending on what results we see.
Re. lines 109 and 117/118, indeed they don’t look correct, thanks. We’ll take care of this. This would only have impact in the case of I/O failure of the write.
The MySQL 5.6 implementation seems to acquire the buffer pool mutex for every buffer pool instance, save the dump info, release the mutex, write, repeat. So it seems effectively similar (modulo the different InnoDB and XtraDB mutexes…) to our fix with the difference of the mutex holding duration set to the buffer pool instance size instead of 8GB.
Tuning the number of buffer pools for Percona Server would of course do nothing for that 8GB constant, except for the instances that are sized to a non-multiple of 8GB.
Laurynas
19 Sep 12 at 2:48 am
There was in fact a slight error in our Percona Server 5.1 implementation when an I/O error occurred during write. This has been corrected and should appear in our next 5.1 release.
In our Percona Server 5.5 implementation (lp:~gl-az/percona-server/5.5-686534-881001), which should also become available soon, we implemented similar functionality that also works with multiple buffer pools. The implementation will iterate through the buffer pools, dumping each and only holding the current buffer pool mutex while populating the dump page. We also modified some of the dump page logic to allow the number of dump pages filled in each mutex cycle to be changed at compile time. The default is one page.
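The "fill a dump page under the mutex, release, write, repeat" cycle can be sketched roughly as follows. This is a simplified Python model, not the Percona Server patch: `dump_in_chunks` and `records_per_cycle` are names invented for the example, and the sketch assumes the list does not change while the mutex is released, which a real LRU list would not guarantee.

```python
import threading


def dump_in_chunks(lru, mutex, out, records_per_cycle=2048):
    """Copy a bounded chunk of page IDs per mutex cycle, writing each chunk unlocked.

    `records_per_cycle` is a hypothetical knob standing in for "dump pages per
    mutex cycle". Returns the number of chunks written.
    """
    pos = 0
    cycles = 0
    while True:
        # Hold the mutex only long enough to copy one chunk of records.
        with mutex:
            chunk = lru[pos:pos + records_per_cycle]
        if not chunk:
            break
        # Write the chunk with the mutex released.
        for space_id, page_no in chunk:
            out.write(f"{space_id},{page_no}\n")
        pos += len(chunk)
        cycles += 1
    return cycles
```

A larger `records_per_cycle` means fewer acquire/release cycles but a longer hold each time, which is the trade-off discussed in this thread.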
The math posted above by Laurynas is slightly off, though, because of an initial typo that carries through the rest of the calculation. Each record is 8 bytes (space id, page id). With the default InnoDB page size of 16K, each dump page holds 2048 LRU records’ worth of dump, not 4096KB of records.
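The arithmetic is easy to check. The short Python sketch below uses only the figures quoted in this thread (8-byte records, 16K pages, an 8GB buffer pool); the constant and function names are made up for illustration.

```python
PAGE_SIZE = 16 * 1024   # default InnoDB page size, in bytes
RECORD_SIZE = 8         # one (space id, page id) record, in bytes


def records_per_dump_page():
    """How many LRU records fit into one 16K dump page."""
    return PAGE_SIZE // RECORD_SIZE


def buffer_pool_bytes_per_dump_page():
    """How much buffer pool one full dump page describes."""
    return records_per_dump_page() * PAGE_SIZE


def dump_pages_for_pool(pool_bytes):
    """Dump pages needed to list every page of a pool of `pool_bytes` bytes."""
    pool_pages = pool_bytes // PAGE_SIZE
    # Round up so a partially filled last dump page is counted.
    return -(-pool_pages // records_per_dump_page())


print(records_per_dump_page())                    # 2048
print(buffer_pool_bytes_per_dump_page() // 2**20)  # 32 (MiB)
print(dump_pages_for_pool(8 * 2**30))              # 256
```

So one dump page covers 32MB of buffer pool, and an 8GB pool needs 256 dump pages, or 4096KB of dump in total.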
This fix should have the effect of relieving stalls due to lengthy holding of the buffer pool mutexes at the possible expense of creating more context switching around those mutexes during dump. This can be reduced by increasing the number of dump pages filled for each mutex cycle by either changing the new #define and recompiling or by exposing that value as yet another system variable, but we will leave that for another day.
George O. Lorch III
21 Sep 12 at 1:48 pm