<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress/2.2.2" -->
<rss version="2.0" 
	xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
	<title>Comments on: Introducing MySQL Parallel Dump</title>
	<link>http://www.xaprb.com/blog/2007/09/30/introducing-mysql-parallel-dump/</link>
	<description>Stay curious!</description>
	<pubDate>Fri, 16 May 2008 03:40:22 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.2.2</generator>

	<item>
		<title>By: Xaprb</title>
		<link>http://www.xaprb.com/blog/2007/09/30/introducing-mysql-parallel-dump/#comment-14257</link>
		<author>Xaprb</author>
		<pubDate>Thu, 28 Feb 2008 13:12:28 +0000</pubDate>
		<guid>http://www.xaprb.com/blog/2007/09/30/introducing-mysql-parallel-dump/#comment-14257</guid>
		<description>Hi,

There's a current forum conversation on the same topic.  It would be good to move this there so it's all in the same place:

http://sourceforge.net/forum/forum.php?thread_id=1952777&#038;forum_id=664350</description>
		<content:encoded><![CDATA[<p>Hi,</p>
<p>There&#8217;s a current forum conversation on the same topic.  It would be good to move this there so it&#8217;s all in the same place:</p>
<p><a href="http://sourceforge.net/forum/forum.php?thread_id=1952777&#038;forum_id=664350" rel="nofollow">http://sourceforge.net/forum/forum.php?thread_id=1952777&#038;forum_id=664350</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: RV</title>
		<link>http://www.xaprb.com/blog/2007/09/30/introducing-mysql-parallel-dump/#comment-14256</link>
		<author>RV</author>
		<pubDate>Thu, 28 Feb 2008 10:02:33 +0000</pubDate>
		<guid>http://www.xaprb.com/blog/2007/09/30/introducing-mysql-parallel-dump/#comment-14256</guid>
		<description>If you had multiple schemas, within one MySQL instance, and each one had no dependency/connection to the other, could one not spawn multiple (single-transaction, master-data) mysqldumps against each one and therefore reduce the time to perform full consistent InnoDB hot backups from a master?  If not, then I believe this type of tool is only usable for large InnoDB instances on slaves.  I know that performing dumps on masters is not recommended, but our replication confidence is low, so at the moment it's a must.  Hopefully that's where Maatkit will again come to the rescue.</description>
		<content:encoded><![CDATA[<p>If you had multiple schemas, within one MySQL instance, and each one had no dependency/connection to the other, could one not spawn multiple (single-transaction, master-data) mysqldumps against each one and therefore reduce the time to perform full consistent InnoDB hot backups from a master?  If not, then I believe this type of tool is only usable for large InnoDB instances on slaves.  I know that performing dumps on masters is not recommended, but our replication confidence is low, so at the moment it&#8217;s a must.  Hopefully that&#8217;s where Maatkit will again come to the rescue.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Xaprb</title>
		<link>http://www.xaprb.com/blog/2007/09/30/introducing-mysql-parallel-dump/#comment-14254</link>
		<author>Xaprb</author>
		<pubDate>Wed, 27 Feb 2008 15:42:22 +0000</pubDate>
		<guid>http://www.xaprb.com/blog/2007/09/30/introducing-mysql-parallel-dump/#comment-14254</guid>
		<description>It's not possible.  Locking tables is the only way to do it.  A single transaction would require many connections to be able to participate in the transaction, which is something I wouldn't assume MySQL will build anytime soon :-)

I suppose another way would be to lock tables briefly and allow all child processes to do START TRANSACTION, but that isn't the way it's built at the moment.</description>
		<content:encoded><![CDATA[<p>It&#8217;s not possible.  Locking tables is the only way to do it.  A single transaction would require many connections to be able to participate in the transaction, which is something I wouldn&#8217;t assume MySQL will build anytime soon :-)</p>
<p>I suppose another way would be to lock tables briefly and allow all child processes to do START TRANSACTION, but that isn&#8217;t the way it&#8217;s built at the moment.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: RV</title>
		<link>http://www.xaprb.com/blog/2007/09/30/introducing-mysql-parallel-dump/#comment-14253</link>
		<author>RV</author>
		<pubDate>Wed, 27 Feb 2008 14:05:43 +0000</pubDate>
		<guid>http://www.xaprb.com/blog/2007/09/30/introducing-mysql-parallel-dump/#comment-14253</guid>
		<description>Baron,

Great work, but would you mind confirming whether this takes a consistent snapshot, as mysqldump's single-transaction option does for InnoDB tables, or is that simply not possible with a parallel method?

Cheers</description>
		<content:encoded><![CDATA[<p>Baron,</p>
<p>Great work, but would you mind confirming whether this takes a consistent snapshot, as mysqldump&#8217;s single-transaction option does for InnoDB tables, or is that simply not possible with a parallel method?</p>
<p>Cheers</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Xaprb</title>
		<link>http://www.xaprb.com/blog/2007/09/30/introducing-mysql-parallel-dump/#comment-13448</link>
		<author>Xaprb</author>
		<pubDate>Thu, 04 Oct 2007 02:01:11 +0000</pubDate>
		<guid>http://www.xaprb.com/blog/2007/09/30/introducing-mysql-parallel-dump/#comment-13448</guid>
		<description>Just for information, the new version is released, and can do chunked dumps as promised.  Testing and benchmarks welcome!</description>
		<content:encoded><![CDATA[<p>Just for information, the new version is released, and can do chunked dumps as promised.  Testing and benchmarks welcome!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Xaprb</title>
		<link>http://www.xaprb.com/blog/2007/09/30/introducing-mysql-parallel-dump/#comment-13446</link>
		<author>Xaprb</author>
		<pubDate>Wed, 03 Oct 2007 14:32:34 +0000</pubDate>
		<guid>http://www.xaprb.com/blog/2007/09/30/introducing-mysql-parallel-dump/#comment-13446</guid>
		<description>By default, it does FLUSH TABLES WITH READ LOCK once at the beginning of the whole process, then passes the --skip-lock-tables option to mysqldump so it doesn't try to get its own lock.

If you give the --locktables option, it locks only tables it uses.  This is enabled by default when --sets is given.  Each set gets and releases locks on the tables it uses, so tables in each set will be consistent, but may not be consistent with tables from other sets.

You can disable the global lock with --noflushlock.</description>
		<content:encoded><![CDATA[<p>By default, it does FLUSH TABLES WITH READ LOCK once at the beginning of the whole process, then passes the --skip-lock-tables option to mysqldump so it doesn&#8217;t try to get its own lock.</p>
<p>If you give the --locktables option, it locks only tables it uses.  This is enabled by default when --sets is given.  Each set gets and releases locks on the tables it uses, so tables in each set will be consistent, but may not be consistent with tables from other sets.</p>
<p>You can disable the global lock with --noflushlock.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dan</title>
		<link>http://www.xaprb.com/blog/2007/09/30/introducing-mysql-parallel-dump/#comment-13445</link>
		<author>Dan</author>
		<pubDate>Wed, 03 Oct 2007 14:18:41 +0000</pubDate>
		<guid>http://www.xaprb.com/blog/2007/09/30/introducing-mysql-parallel-dump/#comment-13445</guid>
		<description>Baron - 

Could you be a little more explicit about how you lock tables?  Do you lock all tables in a database at the start of the dump and only release the locks when all the threads are done with every table?  Or are you locking tables as needed and releasing the lock when the dump for that table is done?

Thanks again,

Dan</description>
		<content:encoded><![CDATA[<p>Baron - </p>
<p>Could you be a little more explicit about how you lock tables?  Do you lock all tables in a database at the start of the dump and only release the locks when all the threads are done with every table?  Or are you locking tables as needed and releasing the lock when the dump for that table is done?</p>
<p>Thanks again,</p>
<p>Dan</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Xaprb</title>
		<link>http://www.xaprb.com/blog/2007/09/30/introducing-mysql-parallel-dump/#comment-13444</link>
		<author>Xaprb</author>
		<pubDate>Tue, 02 Oct 2007 20:27:20 +0000</pubDate>
		<guid>http://www.xaprb.com/blog/2007/09/30/introducing-mysql-parallel-dump/#comment-13444</guid>
		<description>Thanks, that's great.

The next version will perform even better on the single big table: it'll dump it into chunks in parallel (this is to help address the issue of loading in parallel and/or loading with commits between chunks).</description>
		<content:encoded><![CDATA[<p>Thanks, that&#8217;s great.</p>
<p>The next version will perform even better on the single big table: it&#8217;ll dump it into chunks in parallel (this is to help address the issue of loading in parallel and/or loading with commits between chunks).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dan</title>
		<link>http://www.xaprb.com/blog/2007/09/30/introducing-mysql-parallel-dump/#comment-13443</link>
		<author>Dan</author>
		<pubDate>Tue, 02 Oct 2007 20:24:50 +0000</pubDate>
		<guid>http://www.xaprb.com/blog/2007/09/30/introducing-mysql-parallel-dump/#comment-13443</guid>
		<description>Here are some benchmarks:

Performed on a 3.6GHz Xeon with 4 processors.  The filesystem is a RAID 10 array of 15k RPM disks, directly attached to the server over Fibre Channel.

&lt;pre&gt;Size of DB    # of tables    mysqldump    mysql-parallel-dump
1.4GB         22             5m 45sec     4m 29sec
117MB         232            37sec        7sec&lt;/pre&gt;

Note that the larger db had most of the space allocated for a single large table, so a lot of the time was spent on dumping that table.  Parallel threads were not that helpful in that case.  They were very helpful where there were lots of tables of roughly equivalent sizes.

I can probably run some more thorough tests, if you'd like, since my methods were pretty informal.

Thanks,

Dan</description>
		<content:encoded><![CDATA[<p>Here are some benchmarks:</p>
<p>Performed on a 3.6GHz Xeon with 4 processors.  The filesystem is a RAID 10 array of 15k RPM disks, directly attached to the server over Fibre Channel.</p>
<pre>Size of DB    # of tables    mysqldump    mysql-parallel-dump
1.4GB         22             5m 45sec     4m 29sec
117MB         232            37sec        7sec</pre>
<p>Note that the larger db had most of the space allocated for a single large table, so a lot of the time was spent on dumping that table.  Parallel threads were not that helpful in that case.  They were very helpful where there were lots of tables of roughly equivalent sizes.</p>
<p>I can probably run some more thorough tests, if you&#8217;d like, since my methods were pretty informal.</p>
<p>Thanks,</p>
<p>Dan</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Xaprb</title>
		<link>http://www.xaprb.com/blog/2007/09/30/introducing-mysql-parallel-dump/#comment-13442</link>
		<author>Xaprb</author>
		<pubDate>Tue, 02 Oct 2007 19:45:18 +0000</pubDate>
		<guid>http://www.xaprb.com/blog/2007/09/30/introducing-mysql-parallel-dump/#comment-13442</guid>
		<description>Hi Dan,

How large a speedup?  If you can share your benchmarks (data size, speed of mysqldump, speed with mysql-parallel-dump, number/speed of CPUs, number/speed/RAID configuration of disks) it would be great to have some representative numbers for the documentation.  I'm trying to give some general ideas in the documentation of when and how much this is likely to be helpful.  I've found one system where it's not all that helpful.

Multiple transactions shouldn't be a problem.  Good old LOCK TABLES keeps it consistent.</description>
		<content:encoded><![CDATA[<p>Hi Dan,</p>
<p>How large a speedup?  If you can share your benchmarks (data size, speed of mysqldump, speed with mysql-parallel-dump, number/speed of CPUs, number/speed/RAID configuration of disks) it would be great to have some representative numbers for the documentation.  I&#8217;m trying to give some general ideas in the documentation of when and how much this is likely to be helpful.  I&#8217;ve found one system where it&#8217;s not all that helpful.</p>
<p>Multiple transactions shouldn&#8217;t be a problem.  Good old LOCK TABLES keeps it consistent.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
