Comments on: Dump and reload InnoDB buffer pool in MySQL 5.6
http://www.xaprb.com/blog/2012/09/18/dump-and-reload-innodb-buffer-pool-in-mysql-5-6/
Thu, 02 May 2013 12:36:53 +0000

By: George O. Lorch III, Fri, 21 Sep 2012 17:48:58 +0000
http://www.xaprb.com/blog/2012/09/18/dump-and-reload-innodb-buffer-pool-in-mysql-5-6/#comment-20269

There was in fact a slight error in our Percona Server 5.1 implementation when an I/O error occurred during write. This has been corrected and should appear in our next 5.1 release.

In our Percona Server 5.5 implementation (lp:~gl-az/percona-server/5.5-686534-881001), which should also become available soon, we implemented similar functionality that also works with multiple buffer pools. The implementation iterates through the buffer pools, dumping each one and holding the current buffer pool mutex only while populating the dump page. We also modified some of the dump page logic so that the number of dump pages filled per mutex cycle can be changed at compile time. The default is one page.

The math Laurynas posted is slightly off, though: an initial typo carries through the rest of the calculation. Each record is 8 bytes (space id, page id). With the default InnoDB page size of 16K, each dump page holds 2048 LRU records' worth of dump, not 4096KB of records.
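To make the two figures easy to compare, here is the arithmetic as a small Python sketch. It uses only the numbers quoted in these comments (8-byte records, 16K pages, and the 4096KB figure); the variable names are made up for illustration.

```python
RECORD_BYTES = 8          # one (space id, page id) record
PAGE_BYTES = 16 * 1024    # default InnoDB page size

# One 16K dump page worth of records.
records_per_dump_page = PAGE_BYTES // RECORD_BYTES
pool_bytes_per_dump_page = records_per_dump_page * PAGE_BYTES

# The same calculation starting from 4096KB of records instead.
records_in_4096kb = (4096 * 1024) // RECORD_BYTES
pool_bytes_for_4096kb = records_in_4096kb * PAGE_BYTES

print(records_per_dump_page, pool_bytes_per_dump_page // 2**20, "MB")
print(records_in_4096kb, pool_bytes_for_4096kb // 2**30, "GB")
```

Read this way, one dump page of 2048 records corresponds to 32MB of buffer pool, while starting from 4096KB gives 524288 records and the 8GB figure.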

This fix should have the effect of relieving stalls due to lengthy holding of the buffer pool mutexes at the possible expense of creating more context switching around those mutexes during dump. This can be reduced by increasing the number of dump pages filled for each mutex cycle by either changing the new #define and recompiling or by exposing that value as yet another system variable, but we will leave that for another day.
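A toy Python model of that trade-off, not the actual XtraDB code: the class, the function, and the `pages_per_cycle` parameter are all hypothetical stand-ins (the last one for the compile-time #define), and the LRU list is a plain Python list that is assumed not to change between mutex cycles.

```python
import threading

RECORDS_PER_DUMP_PAGE = 2048  # 16K dump page / 8-byte records


class BufferPool:
    """Toy buffer pool: a mutex plus an LRU list of (space_id, page_id)."""

    def __init__(self, lru):
        self.mutex = threading.Lock()
        self.lru = list(lru)


def dump_pools(pools, write, pages_per_cycle=1):
    """Dump each pool in chunks, holding its mutex only while copying a chunk.

    Returns the number of mutex acquisitions so the cost is visible.
    """
    chunk = RECORDS_PER_DUMP_PAGE * pages_per_cycle
    acquisitions = 0
    for pool in pools:
        pos = 0
        while True:
            with pool.mutex:  # held only while populating the dump page(s)
                acquisitions += 1
                records = pool.lru[pos:pos + chunk]
            if not records:
                break
            write(records)    # the write happens with the mutex released
            pos += len(records)
    return acquisitions
```

Calling `dump_pools` with a larger `pages_per_cycle` takes the mutex fewer times but holds it longer on each pass, which is the knob described above.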

By: Laurynas, Wed, 19 Sep 2012 06:48:10 +0000
http://www.xaprb.com/blog/2012/09/18/dump-and-reload-innodb-buffer-pool-in-mysql-5-6/#comment-20258

Mark -

Yes, our implementation holds the mutex for every 4096KB of dump, which amounts to scanning half a million pages on the LRU list and 8GB of buffer pool dumped. We may adjust this amount up or down depending on what results we see.

Re. lines 109 and 117/118, indeed they don’t look correct, thanks. We’ll take care of this. It would only have an impact if the write hit an I/O failure.

The MySQL 5.6 implementation seems to acquire the buffer pool mutex for every buffer pool instance, save the dump info, release the mutex, write, repeat. So it seems effectively similar (modulo the different InnoDB and XtraDB mutexes…) to our fix, the difference being that the mutex is held for the span of one buffer pool instance rather than for 8GB.

Tuning the number of buffer pools for Percona Server would of course do nothing for that 8GB constant, except for the instances that are sized to a non-multiple of 8GB.

By: Xaprb, Tue, 18 Sep 2012 19:31:15 +0000
http://www.xaprb.com/blog/2012/09/18/dump-and-reload-innodb-buffer-pool-in-mysql-5-6/#comment-20254

Yes. In case of a server crash, if the LRU dump is on a DRBD or SAN along with the data, then a cold standby can be brought up quickly. This was one of the major use cases for the feature originally. It can be done with a script or event, but DBAs would rather that a feature like this be built in so there’s one less thing to remember and monitor.

By: Mark Leith, Tue, 18 Sep 2012 18:32:23 +0000
http://www.xaprb.com/blog/2012/09/18/dump-and-reload-innodb-buffer-pool-in-mysql-5-6/#comment-20253

Both implementations can block buffer pool operations for certain periods of time.

Looking at the fix for the above mentioned bug by Laurynas (https://code.launchpad.net/~gl-az/percona-server/5.1-686534-881001/+merge/119062), I’d say that “does not block server operation anymore” is wrong, too. The LRU_list_mutex is still held whilst scanning the buffer pool pages, it’s just released whilst writing to the file, and reacquired immediately afterwards. And this is done for each individual page.

I really question parts of that fix too, especially the changes on lines 117/118 – the mutex should already be held there (line 109), and the goto skips releasing the mutex (and that is not done within end either).

The approach we have taken is very different. For those that want to follow along code wise see – http://bazaar.launchpad.net/~mysql/mysql-server/5.6/view/head:/storage/innobase/buf/buf0dump.cc#L178.

Instead of scanning the whole buffer pool LRU list in one go, we scan each buffer pool *instance* individually, and only hold the buf_pool mutex whilst we are getting the space# and page# of each page, buffering the entire buffer pool instance’s pages along the way. We then release the buf_pool mutex, and write the entire buffer of page info for that buffer pool instance out to the file.
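As a rough illustration of that pattern only (this is a toy Python model, not the code in buf0dump.cc; the class and function names are invented for the example):

```python
import threading


class BufferPoolInstance:
    """Toy buffer pool instance: its own mutex and a list of (space, page)."""

    def __init__(self, pages):
        self.mutex = threading.Lock()
        self.lru = list(pages)


def dump_per_instance(instances, write):
    """Snapshot each instance under its own mutex, then write unlocked."""
    for inst in instances:
        with inst.mutex:
            # Only the (space#, page#) pairs are copied while the mutex is held.
            snapshot = [(space, page) for space, page in inst.lru]
        # The mutex is released before any file I/O happens.
        write(snapshot)
```

In this model each mutex is taken once per instance and held for as long as it takes to copy that instance's list, so more, smaller instances mean shorter individual hold times.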

Overall this means that you may get better concurrency with larger buffer pools if you appropriately tune the innodb_buffer_pool_instances variable as well – http://dev.mysql.com/doc/refman/5.6/en/innodb-performance.html#innodb-multiple-buffer-pools – tuning this with the Percona approach would have no effect.

Which approach is better? Hard to say without a comparison benchmark. The Percona approach seems like it would acquire/release the mutex much more often, whereas our approach acquires a different mutex for each buffer pool instance, yet may hold it for slightly longer periods of time whilst generating the list to write out to the file.

Out of interest, why was the “dump on interval” done in the first place? Is that in case of a server crash (the only reason I can think of)?

By: Harrison Fisk, Tue, 18 Sep 2012 17:48:24 +0000
http://www.xaprb.com/blog/2012/09/18/dump-and-reload-innodb-buffer-pool-in-mysql-5-6/#comment-20252

When we ported this patch to our branch, we fixed this hang as well:

https://code.launchpad.net/~percona-dev/percona-server/fb_changes_auto_lru_dump/+merge/46012
