<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Xaprb &#187; PostgreSQL</title>
	<atom:link href="http://www.xaprb.com/blog/category/postgresql/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.xaprb.com/blog</link>
	<description>Stay curious!</description>
	<lastBuildDate>Mon, 08 Feb 2010 19:36:15 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>How PostgreSQL protects against partial page writes and data corruption</title>
		<link>http://www.xaprb.com/blog/2010/02/08/how-postgresql-protects-against-partial-page-writes-and-data-corruption/</link>
		<comments>http://www.xaprb.com/blog/2010/02/08/how-postgresql-protects-against-partial-page-writes-and-data-corruption/#comments</comments>
		<pubDate>Mon, 08 Feb 2010 19:36:15 +0000</pubDate>
		<dc:creator>Xaprb</dc:creator>
				<category><![CDATA[PostgreSQL]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[checkpoint]]></category>
		<category><![CDATA[checksums]]></category>
		<category><![CDATA[CRC32]]></category>
		<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[recovery]]></category>
		<category><![CDATA[wal]]></category>

		<guid isPermaLink="false">http://www.xaprb.com/blog/?p=1616</guid>
		<description><![CDATA[I explored two interesting topics today while learning more about Postgres.

Partial page writes

PostgreSQL&#8217;s partial page write protection is configured by the following setting, which defaults to &#8220;on&#8221;:

full_page_writes (boolean)

When this parameter is on, the PostgreSQL server writes the entire content of each disk page to WAL during the first modification of that page after a checkpoint&#8230; [...]


Related posts:<ol><li><a href='http://www.xaprb.com/blog/2009/02/19/the-magnolia-data-might-not-be-permanently-lost/' rel='bookmark' title='Permanent Link: The Ma.gnolia data might not be permanently lost'>The Ma.gnolia data might not be permanently lost</a> <small>I keep rea</small></li><li><a href='http://www.xaprb.com/blog/2009/09/29/what-data-types-does-your-innovative-storage-engine-not-support/' rel='bookmark' title='Permanent Link: What data types does your innovative storage engine NOT support?'>What data types does your innovative storage engine NOT support?</a> <small>I&#8217;ve</small></li><li><a href='http://www.xaprb.com/blog/2009/04/05/postgresql-conference-east-2009-day-three/' rel='bookmark' title='Permanent Link: PostgreSQL Conference East 2009, Day Three'>PostgreSQL Conference East 2009, Day Three</a> <small>As I said </small></li></ol>

Related posts brought to you by <a href='http://mitcho.com/code/yarpp/'>Yet Another Related Posts Plugin</a>.]]></description>
			<content:encoded><![CDATA[<p>I explored two interesting topics today while learning more about Postgres.</p>

<h3>Partial page writes</h3>

<p>PostgreSQL&#8217;s partial page write protection is configured by the following setting, which defaults to &#8220;on&#8221;:</p>

<blockquote cite="http://www.postgresql.org/docs/8.3/static/runtime-config-wal.html#GUC-FULL-PAGE-WRITES"><p>full_page_writes (boolean)</p>

<p>When this parameter is on, the PostgreSQL server writes the entire content of each disk page to WAL during the first modification of that page after a checkpoint&#8230; Storing the full page image guarantees that the page can be correctly restored, but at a price in increasing the amount of data that must be written to WAL. (Because WAL replay always starts from a checkpoint, it is sufficient to do this during the first change of each page after a checkpoint. Therefore, one way to reduce the cost of full-page writes is to increase the checkpoint interval parameters.)</p></blockquote>

<p>Trying to reduce the cost of full-page writes by increasing the checkpoint interval highlights a compromise.  If you decrease the interval, then you&#8217;ll be writing full pages to the WAL quite often.  This should in theory lead to surges in the number of bytes written to the WAL, immediately following each checkpoint. As pages are revisited over time for further changes, the number of bytes written should taper off gradually until the next checkpoint.   Hopefully someone who knows more can confirm this.  Does anyone graph the number of bytes written to their WAL?  That would be a nice illustration to see how dramatic this surging is.</p>
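<p>To make the expected surge concrete, here is a toy model in Python. It is only a sketch under loud assumptions: it is not how PostgreSQL is implemented, the 100-byte record size is made up, and a checkpoint is simply declared every N modifications. The only real number is 8192, PostgreSQL's default block size.</p>

```python
PAGE_BYTES = 8192    # PostgreSQL's default block size
RECORD_BYTES = 100   # made-up size for an ordinary WAL record (assumption)

def wal_bytes(touches, checkpoint_every):
    """Return the WAL bytes logged for each page modification in `touches`.

    The first modification of a page after a checkpoint logs the full page;
    later modifications of the same page log only a small record.
    """
    logged_since_checkpoint = set()
    out = []
    for i, page in enumerate(touches):
        if i % checkpoint_every == 0:
            # A checkpoint just happened: every page needs a full image again.
            logged_since_checkpoint.clear()
        if page in logged_since_checkpoint:
            out.append(RECORD_BYTES)
        else:
            out.append(PAGE_BYTES)
            logged_since_checkpoint.add(page)
    return out

touches = [0, 1, 0, 1, 2, 0, 1, 2]   # page numbers, in modification order
frequent = wal_bytes(touches, checkpoint_every=4)
rare = wal_bytes(touches, checkpoint_every=8)
```

<p>In this model the same eight modifications log 41260 bytes with a checkpoint every four modifications and 25076 bytes with a checkpoint every eight, and the big writes cluster right after each checkpoint.</p>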

<p>Increasing the checkpoint interval seems a bit scary, and is bound to have its own costs, for all the usual reasons.  A massive checkpoint once in a while should be really expensive, and would lead to a bad worst-case time for recovery.  Does the new bgwriter implementation in 8.3 fix any of this?  In theory it could, but I don&#8217;t know enough yet to say.  I have heard conflicting opinions on this point.  I have a lot more to read about it before I form my own opinion.</p>

<p>Storing full pages might not really be that expensive.  It could bloat the WAL, but is the cost (in terms of time) really that high?  InnoDB (in MySQL) protects against partial page writes with a double-write strategy, using a region in the tablespace called the doublewrite buffer.  Page writes are first sent to the doublewrite buffer, then to their actual location in the data file.  I don&#8217;t remember where, but I&#8217;ve seen benchmarks showing that this doesn&#8217;t hurt performance, even though it seems counter-intuitive.  Modern disks can do a lot of sequential writes, and the way InnoDB writes its data makes a lot of things sequential.  I doubt that putting full pages into the PostgreSQL WAL necessarily costs a lot, unless there is an implementation-specific aspect that makes it expensive.</p>

<p>The TODO has some <a href="http://wiki.postgresql.org/wiki/Todo#Write-Ahead_Log">items on the WAL</a>, which look interesting &#8212; &#8220;Eliminate need to write full pages to WAL before page modification&#8221; and a couple more items.  I need to understand PostgreSQL&#8217;s recovery process better before I know what these really mean.</p>

<h3>Detecting data corruption</h3>

<p>I was able to verify that the WAL entries have a checksum.  It is a CRC32.  This is in <a href="http://doxygen.postgresql.org/xlog_8c-source.html#l00567">xlog.c</a>.</p>

<p>However, as far as I can understand, the answer for detecting data corruption in normal data pages is &#8220;Postgres doesn&#8217;t do that.&#8221;  I was told on the IRC channel that normal data pages don&#8217;t have checksums.  I am not sure how to verify that, but if it&#8217;s true it seems like a weakness.  I&#8217;ve seen hardware-induced corruption on InnoDB data many times, and it could sometimes only be detected by page checksums.</p>

<p>What happens when a page is corrupt?  It probably depends on where the corruption is.  If a few bytes of the user&#8217;s data are changed, then I assume you could just get different data out of the database than you inserted into it.  But if non-user data is corrupted, do you get bizarre behavior, or do you get a crash or error?  I need to learn more about PostgreSQL&#8217;s data file layout to understand this.  Imagining (I haven&#8217;t verified this) that a page has a pointer to the next page, what happens if that pointer is flipped to refer to some other page, say, a page from a different table?  If TABLE1 and TABLE2 have identical structures but different data, could SELECT * FROM TABLE1 suddenly start showing rows from TABLE2 partway through the results?  Again I need to learn more about this.</p>

]]></content:encoded>
			<wfw:commentRss>http://www.xaprb.com/blog/2010/02/08/how-postgresql-protects-against-partial-page-writes-and-data-corruption/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>My wishlist for SQL: the UNTIL clause</title>
		<link>http://www.xaprb.com/blog/2010/01/22/my-wishlist-for-sql-the-until-clause/</link>
		<comments>http://www.xaprb.com/blog/2010/01/22/my-wishlist-for-sql-the-until-clause/#comments</comments>
		<pubDate>Fri, 22 Jan 2010 23:18:08 +0000</pubDate>
		<dc:creator>Xaprb</dc:creator>
				<category><![CDATA[PostgreSQL]]></category>
		<category><![CDATA[SQL]]></category>

		<guid isPermaLink="false">http://www.xaprb.com/blog/?p=1592</guid>
		<description><![CDATA[I&#8217;d like an UNTIL clause, please.  I&#8217;d use it sort of like LIMIT in MySQL and PostgreSQL, except that it would define when to stop looking for rows, instead of defining how many to return.  Example:

SELECT * FROM users ORDER BY user_id UNTIL user_id >= 100;

That would select users up to and [...]


]]></description>
			<content:encoded><![CDATA[<p>I&#8217;d like an UNTIL clause, please.  I&#8217;d use it sort of like LIMIT in MySQL and PostgreSQL, except that it would define when to stop <del datetime="2010-01-23T16:18:53+00:00">returning</del> looking for rows, instead of defining how many to return.  Example:</p>

<code><pre>SELECT * FROM users ORDER BY user_id UNTIL user_id >= 100;</pre></code>

<p>That would select users up to and including user 99.  Ideally the clause could accept any boolean predicate, including subqueries.  I&#8217;ll hold my breath and wait for this wish to come true now.</p>
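<p>Since no database I know of has this clause, here is a rough sketch of the semantics I have in mind, written in Python rather than SQL. The helper name <code>until</code> and the sample rows are made up for illustration; the only library call is <code>itertools.takewhile</code>.</p>

```python
from itertools import takewhile

def until(rows, stop):
    """Yield rows in order and stop at the first row for which stop(row) is true."""
    return takewhile(lambda row: not stop(row), rows)

# Hypothetical table contents, deliberately out of order.
users = [{"user_id": i} for i in (1, 50, 99, 100, 101, 7)]

# ORDER BY user_id ... UNTIL user_id >= 100
ordered = sorted(users, key=lambda u: u["user_id"])
result = [u["user_id"] for u in until(ordered, lambda u: u["user_id"] >= 100)]

# Without the ORDER BY, the scan still stops at the first match, so the
# trailing row with user_id 7 is never reached.
unordered = [u["user_id"] for u in until(users, lambda u: u["user_id"] >= 100)]
```

<p>The second result is what separates this from a WHERE clause: a filter would keep looking and return user 7, while UNTIL stops at the first row that satisfies the predicate.</p>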

]]></content:encoded>
			<wfw:commentRss>http://www.xaprb.com/blog/2010/01/22/my-wishlist-for-sql-the-until-clause/feed/</wfw:commentRss>
		<slash:comments>28</slash:comments>
		</item>
		<item>
		<title>How Linux iostat computes its results</title>
		<link>http://www.xaprb.com/blog/2010/01/09/how-linux-iostat-computes-its-results/</link>
		<comments>http://www.xaprb.com/blog/2010/01/09/how-linux-iostat-computes-its-results/#comments</comments>
		<pubDate>Sun, 10 Jan 2010 01:53:10 +0000</pubDate>
		<dc:creator>Xaprb</dc:creator>
				<category><![CDATA[GNU/Linux]]></category>
		<category><![CDATA[PostgreSQL]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[Sys-Admin]]></category>
		<category><![CDATA[Tools]]></category>
		<category><![CDATA[iostat]]></category>

		<guid isPermaLink="false">http://www.xaprb.com/blog/?p=1545</guid>
		<description><![CDATA[iostat is one of the most important tools for measuring disk performance, which of course is very relevant for database administrators, whether your chosen database is Postgres, MySQL, Oracle, or anything else that runs on GNU/Linux.  Have you ever wondered where statistics like await (average wait for the request to complete) come from?  [...]


]]></description>
			<content:encoded><![CDATA[<p><code>iostat</code> is one of the most important tools for measuring disk performance, which of course is very relevant for database administrators, whether your chosen database is Postgres, MySQL, Oracle, or anything else that runs on GNU/Linux.  Have you ever wondered where statistics like await (average wait for the request to complete) come from?  If you look at the disk statistics the <a href="http://www.mjmwired.net/kernel/Documentation/iostats.txt">Linux kernel makes available through files such as /proc/diskstats</a>, you won&#8217;t see await there.  How does iostat compute await?  For that matter, how does it compute the average queue size, service time, and utilization?  This blog post will show you how that&#8217;s computed.</p>

<p>First, let&#8217;s look at the fields in /proc/diskstats.  The order and location vary between kernels, but the following applies to 2.6 kernels.  For reads and writes, the file contains the number of operations, number of operations merged because they were adjacent, number of sectors, and number of milliseconds spent.  Those are available separately for reads and writes, although iostat groups them together in some cases.  Additionally, you can find the number of operations in progress, total number of milliseconds during which I/Os were in progress, and the weighted number of milliseconds spent doing I/Os.  Those are not available separately for reads and writes.</p>

<p>The last one is very important.  The field showing the number of operations in progress is transient &#8212; it shows you the instantaneous value, and this &#8220;memoryless&#8221; property means you can&#8217;t use it to infer the number of I/O operations that are in progress on average.  But the last field has memory, because it is defined as follows:</p>

<blockquote><p>
Field 11 &#8212; weighted # of milliseconds spent doing I/Os
    This field is incremented at each I/O start, I/O completion, I/O
    merge, or read of these stats by the number of I/Os in progress
    (field 9) times the number of milliseconds spent doing I/O since the
    last update of this field.  This can provide an easy measure of both
    I/O completion time and the backlog that may be accumulating.
</p></blockquote>

<p>So the field indicates the total number of milliseconds that all requests have been in progress.  If two requests have been waiting 100ms, then 200ms is added to the field.  And thus it records what happened over the duration of the sampling interval, not just what&#8217;s happening at the instant you look at the file.  We&#8217;ll come back to that later.</p>
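<p>A toy model may help here. The following Python sketch is not kernel code; the class and method names are made up, and it uses wall-clock milliseconds between updates, which is a simplification of the kernel&#8217;s wording. It only mimics the bookkeeping described in the quote: at every event, add the number of in-flight I/Os times the milliseconds since the last update.</p>

```python
class DiskStats:
    """Toy model of the in-flight counter and the weighted-time counter."""

    def __init__(self):
        self.in_flight = 0        # instantaneous, like field 9
        self.weighted_ms = 0      # accumulates, like field 11
        self.last_update_ms = 0

    def _account(self, now_ms):
        # Charge every in-flight request for the time since the last update.
        self.weighted_ms += self.in_flight * (now_ms - self.last_update_ms)
        self.last_update_ms = now_ms

    def start_io(self, now_ms):
        self._account(now_ms)
        self.in_flight += 1

    def finish_io(self, now_ms):
        self._account(now_ms)
        self.in_flight -= 1

stats = DiskStats()
stats.start_io(0)
stats.start_io(0)
stats.finish_io(100)
stats.finish_io(100)

# Two requests in progress for 100 ms each over a 100 ms window.
average_in_progress = stats.weighted_ms / 100
```

<p>After both requests finish, the in-flight counter is back to zero and tells you nothing, but the weighted counter has grown by 200, and dividing by the 100 ms window gives an average of 2 requests in progress.</p>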

<p>Now, given two samples of I/O statistics and the time elapsed between them, we can easily compute everything iostat outputs in -dx mode.  I&#8217;ll take them slightly out of order to reflect how the computations are done internally.</p>

<ul>
<li>rrqm/s is merely the incremental merges divided by the number of seconds elapsed.</li>
<li>wrqm/s is similarly simple, and r/s, w/s, rsec/s, and wsec/s are too.</li>
<li>avgrq-sz is the number of sectors divided by the number of I/O operations.</li>
<li>avgqu-sz is computed from the last field in the file &#8212; the one that has &#8220;memory&#8221; &#8212; divided by the milliseconds elapsed.  Hence the units cancel out and you just get the average number of operations in progress during the time period.  The name (short for &#8220;average queue size&#8221;) is a little bit ambiguous.  This value doesn&#8217;t show how many operations were queued but not yet being serviced &#8212; it shows how many were <em>either</em> in the queue waiting, <em>or</em> being serviced.  The exact wording of the kernel documentation is &#8220;&#8230;as requests are given to appropriate struct request_queue and decremented as they finish.&#8221;</li>
<li>%util is the total time spent doing I/Os, divided by the sampling interval.  This tells you how much of the time the device was busy, but it doesn&#8217;t really tell you whether it&#8217;s reaching its limit of throughput, because the device could be backed by many disks and hence capable of multiple I/O operations simultaneously.</li>
<li>await is the total time for all I/O operations summed, divided by the number of I/O operations completed.</li>
<li>svctm is the most complex to derive.  It is the utilization divided by the throughput.  You saw utilization above; the throughput is the number of I/O operations completed per unit of time during the interval.</li>
</ul>
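<p>The list above can be sketched in a few lines of Python. This is my own restatement and not the iostat source: the dictionary keys are hypothetical names for the /proc/diskstats counters, and the sample numbers are invented. Each sample is assumed to hold cumulative counters, so everything is computed from the difference between two samples.</p>

```python
def iostat_dx(prev, cur, elapsed_ms):
    """Derive iostat -dx style metrics from two cumulative counter samples."""
    d = {key: cur[key] - prev[key] for key in cur}
    secs = elapsed_ms / 1000.0
    n_ios = d["rd_ios"] + d["wr_ios"]
    return {
        "rrqm/s": d["rd_merges"] / secs,
        "wrqm/s": d["wr_merges"] / secs,
        "r/s": d["rd_ios"] / secs,
        "w/s": d["wr_ios"] / secs,
        "rsec/s": d["rd_sectors"] / secs,
        "wsec/s": d["wr_sectors"] / secs,
        # Sectors per operation.
        "avgrq-sz": (d["rd_sectors"] + d["wr_sectors"]) / n_ios if n_ios else 0.0,
        # Weighted milliseconds over elapsed milliseconds: units cancel.
        "avgqu-sz": d["weighted_ms"] / elapsed_ms,
        "%util": 100.0 * d["io_ms"] / elapsed_ms,
        # Summed read and write time per completed operation, in ms.
        "await": (d["rd_ms"] + d["wr_ms"]) / n_ios if n_ios else 0.0,
        # Utilization divided by throughput reduces to busy ms per operation.
        "svctm": d["io_ms"] / n_ios if n_ios else 0.0,
    }

zero = dict.fromkeys(
    ["rd_ios", "rd_merges", "rd_sectors", "rd_ms",
     "wr_ios", "wr_merges", "wr_sectors", "wr_ms",
     "io_ms", "weighted_ms"], 0)
later = dict(zero, rd_ios=100, rd_merges=10, rd_sectors=800, rd_ms=500,
             wr_ios=100, wr_merges=20, wr_sectors=1600, wr_ms=1500,
             io_ms=400, weighted_ms=2000)
stats = iostat_dx(zero, later, elapsed_ms=1000)
```

<p>With these invented numbers the device is 40% utilized, await is 10 ms, svctm is 2 ms, and the average queue size is 2. Note that await times the throughput (200 operations per 1000 ms) also gives 2, which is one of the relationships between these numbers that I find useful.</p>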

<p>Although the computations and their results seem both simple and cryptic, it turns out that you can derive a lot of information from the relationship between these various numbers.  This is one of those tools where a few lines of code have a surprising amount of meaning, which is left for the reader to understand.  I&#8217;ll get more into that in the future.</p>

]]></content:encoded>
			<wfw:commentRss>http://www.xaprb.com/blog/2010/01/09/how-linux-iostat-computes-its-results/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>A simple way to make birthday queries easier and faster</title>
		<link>http://www.xaprb.com/blog/2009/12/31/a-simple-way-to-make-birthday-queries-easier-and-faster/</link>
		<comments>http://www.xaprb.com/blog/2009/12/31/a-simple-way-to-make-birthday-queries-easier-and-faster/#comments</comments>
		<pubDate>Thu, 31 Dec 2009 20:48:50 +0000</pubDate>
		<dc:creator>Xaprb</dc:creator>
				<category><![CDATA[PostgreSQL]]></category>
		<category><![CDATA[SQL]]></category>

		<guid isPermaLink="false">http://www.xaprb.com/blog/?p=1514</guid>
		<description><![CDATA[It&#8217;s New Year&#8217;s Eve, a date that should strike terror into the hearts of many, because tomorrow a bunch of their queries are going to fail.

Queries to &#8220;find all birthdays in the next week&#8221; and similar are always a nightmare to write.  If you want to see a bunch of examples, go look at [...]


]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s New Year&#8217;s Eve, a date that should strike terror into the hearts of many, because tomorrow a bunch of their queries are going to fail.</p>

<p>Queries to &#8220;find all birthdays in the next week&#8221; and similar are always a nightmare to write.  If you want to see a bunch of examples, go look at the user-contributed comments on <a href="http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html">the MySQL date and time function reference</a>.  This post is about a slightly saner way to do that.  There&#8217;s still some nasty math involved, but a) a lot less of it, and b) at least the query will be able to use indexes[1].</p>

<p>So here&#8217;s my tip: instead of storing the user&#8217;s full birthdate, just store the month and day they were born.  Try it.  You&#8217;ll love it!</p>
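<p>Here is a small Python sketch of why that helps. It is an illustration only: the function names and the sample people are made up, and in a real database the same idea would be a lookup on indexed month and day columns. The point is that the application can enumerate the (month, day) pairs for the coming week, and the year boundary stops being a special case.</p>

```python
from datetime import date, timedelta

def upcoming_keys(today, days=7):
    """Return the (month, day) pairs for today and the following `days` days."""
    dates = (today + timedelta(days=n) for n in range(days + 1))
    return {(d.month, d.day) for d in dates}

def birthdays_soon(people, today, days=7):
    """Return names whose stored (month, day) falls within the window."""
    keys = upcoming_keys(today, days)
    return [name for name, month, day in people if (month, day) in keys]

# Hypothetical rows: (name, birth month, birth day). No birth year is stored.
people = [("ann", 1, 2), ("bob", 12, 30), ("cy", 6, 15), ("dee", 12, 31)]

# Run on New Year's Eve: the window spans December 31 through January 7.
soon = birthdays_soon(people, date(2009, 12, 31))
```

<p>The window crosses into January and still finds both matching people without any year arithmetic. One caveat of this sketch is that a February 29 birthday only matches in leap years, so you would have to decide how to treat it.</p>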

<p>[1] Yes, I know Postgres can index a function.  So this can be considered a jab at MySQL, which can&#8217;t.</p>

]]></content:encoded>
			<wfw:commentRss>http://www.xaprb.com/blog/2009/12/31/a-simple-way-to-make-birthday-queries-easier-and-faster/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Josh Berkus helps clarify clustering</title>
		<link>http://www.xaprb.com/blog/2009/11/21/josh-berkus-helps-clarify-clustering/</link>
		<comments>http://www.xaprb.com/blog/2009/11/21/josh-berkus-helps-clarify-clustering/#comments</comments>
		<pubDate>Sat, 21 Nov 2009 13:57:51 +0000</pubDate>
		<dc:creator>Xaprb</dc:creator>
				<category><![CDATA[PostgreSQL]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[clustering]]></category>
		<category><![CDATA[Josh Berkus]]></category>
		<category><![CDATA[MySQL-Cluster]]></category>

		<guid isPermaLink="false">http://www.xaprb.com/blog/?p=1449</guid>
		<description><![CDATA[If you haven&#8217;t seen it, Josh Berkus has a very concise way to look at the confusing mess that is database &#8220;clustering&#8221; from the point of view of three distinct types of users: transactional, analytic, and online. I think that using this kind of distinction could help keep discussions clear &#8212; I&#8217;ve seen a lot [...]


]]></description>
			<content:encoded><![CDATA[<p>If you haven&#8217;t seen it, <a href="http://it.toolbox.com/blogs/database-soup/the-three-database-clustering-users-35473">Josh Berkus has a very concise way</a> to look at the confusing mess that is database &#8220;clustering&#8221; from the point of view of three distinct types of users: transactional, analytic, and online.  I think that using this kind of distinction could help keep discussions clear.  I&#8217;ve seen a lot of conversations around clustering run off the rails due to disagreements about what clustering means.  MySQL Cluster, for example, is a huge red herring for a lot of people, and it seems to be difficult to learn it well enough to decide whether it fits.  If we called it a clustering solution for transactional users, but not for analytic or online users, it might help a lot.</p>

]]></content:encoded>
			<wfw:commentRss>http://www.xaprb.com/blog/2009/11/21/josh-berkus-helps-clarify-clustering/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
