<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: High Performance MySQL, Second Edition: Query Performance Optimization</title>
	<atom:link href="http://www.xaprb.com/blog/2007/10/07/high-performance-mysql-second-edition-query-performance-optimization/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.xaprb.com/blog/2007/10/07/high-performance-mysql-second-edition-query-performance-optimization/</link>
	<description>Stay curious!</description>
	<pubDate>Tue, 06 Jan 2009 04:22:01 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6.2</generator>
		<item>
		<title>By: Amit Shah</title>
		<link>http://www.xaprb.com/blog/2007/10/07/high-performance-mysql-second-edition-query-performance-optimization/#comment-15561</link>
		<dc:creator>Amit Shah</dc:creator>
		<pubDate>Mon, 15 Dec 2008 10:53:42 +0000</pubDate>
		<guid isPermaLink="false">http://www.xaprb.com/blog/2007/10/07/high-performance-mysql-second-edition-query-performance-optimization/#comment-15561</guid>
		<description>I have a table that contains duplicate data.

Some more clarifications on the data set. The table consists of:
id, url, ...a bunch of stored data for that url..., timestamp

The user can store historical data about different URLs in the table, so it may look something like this as time goes on:

id ¦ url ¦ ...stored data ... ¦ timestamp
1 ¦ www.mydomoain.com ¦ ...blah... ¦ 2005-08-04 13:03:12
2 ¦ www.webmasterworld.com ¦ ...blah... ¦ 2005-08-04 13:33:12
3 ¦ www.cnn.com ¦ ...blah... ¦ 2005-08-04 15:03:12
4 ¦ www.cnn.com ¦ ...blah... ¦ 2005-08-06 10:00:02
5 ¦ www.mydomoain.com ¦ ...blah... ¦ 2005-08-10 13:03:12
6 ¦ www.mydomoain.com ¦ ...blah... ¦ 2005-08-11 13:03:12
7 ¦ www.mydomoain.com ¦ ...blah... ¦ 2005-08-12 13:03:12
8 ¦ www.msn.com ¦ ...blah... ¦ 2005-08-20 13:13:12
9 ¦ www.mydomoain.com ¦ ...blah... ¦ 2005-08-31 13:03:12
...

The query I am trying to write needs to SELECT the last entry for a url (latest date or greatest id) relative to the other rows for the same url. (the last entry for mydomain.com for example). (This is why I was using GROUP BY, so that I would only get one row per url)

The query needs to return the most recent entry for each url.

id ¦ url ¦ ...stored data ... ¦ timestamp
2 ¦ www.webmasterworld.com ¦ ...blah... ¦ 2005-08-04 13:33:12
4 ¦ www.cnn.com ¦ ...blah... ¦ 2005-08-06 10:00:02
8 ¦ www.msn.com ¦ ...blah... ¦ 2005-08-20 13:13:12
9 ¦ www.mydomoain.com ¦ ...blah... ¦ 2005-08-31 13:03:12

Does this give a clearer picture of the query I am trying to create?

I am trying to pull all of the most recent rows from the table:

SELECT * FROM `data` GROUP BY `name` ORDER BY `timestamp`

Here the GROUP BY is applied first, and the ORDER BY is then applied to the grouped result. But I want the ORDER BY to be applied first, with the GROUP BY then keeping the first record of each group.

To overcome this issue, I am trying the query below:
SELECT data.* FROM data INNER JOIN (SELECT MAX(id) AS id FROM data GROUP BY url) ids ON data.id = ids.id

I am running into performance problems, because the subquery always executes slowly.

If there is a simpler or more efficient way to do this, I would love to know.

Any help would be appreciated.</description>
		<content:encoded><![CDATA[<p>I have a table that contains duplicate data.</p>
<p>Some more clarifications on the data set. The table consists of:<br />
id, url, &#8230;a bunch of stored data for that url&#8230;, timestamp</p>
<p>The user can store historical data about different URLs in the table, so it may look something like this as time goes on:</p>
<p>id ¦ url ¦ &#8230;stored data &#8230; ¦ timestamp<br />
1 ¦ <a href="http://www.mydomoain.com" rel="nofollow" onclick="javascript:urchinTracker ('/outbound/comment/www.mydomoain.com');">http://www.mydomoain.com</a> ¦ &#8230;blah&#8230; ¦ 2005-08-04 13:03:12<br />
2 ¦ <a href="http://www.webmasterworld.com" rel="nofollow" onclick="javascript:urchinTracker ('/outbound/comment/www.webmasterworld.com');">http://www.webmasterworld.com</a> ¦ &#8230;blah&#8230; ¦ 2005-08-04 13:33:12<br />
3 ¦ <a href="http://www.cnn.com" rel="nofollow" onclick="javascript:urchinTracker ('/outbound/comment/www.cnn.com');">http://www.cnn.com</a> ¦ &#8230;blah&#8230; ¦ 2005-08-04 15:03:12<br />
4 ¦ <a href="http://www.cnn.com" rel="nofollow" onclick="javascript:urchinTracker ('/outbound/comment/www.cnn.com');">http://www.cnn.com</a> ¦ &#8230;blah&#8230; ¦ 2005-08-06 10:00:02<br />
5 ¦ <a href="http://www.mydomoain.com" rel="nofollow" onclick="javascript:urchinTracker ('/outbound/comment/www.mydomoain.com');">http://www.mydomoain.com</a> ¦ &#8230;blah&#8230; ¦ 2005-08-10 13:03:12<br />
6 ¦ <a href="http://www.mydomoain.com" rel="nofollow" onclick="javascript:urchinTracker ('/outbound/comment/www.mydomoain.com');">http://www.mydomoain.com</a> ¦ &#8230;blah&#8230; ¦ 2005-08-11 13:03:12<br />
7 ¦ <a href="http://www.mydomoain.com" rel="nofollow" onclick="javascript:urchinTracker ('/outbound/comment/www.mydomoain.com');">http://www.mydomoain.com</a> ¦ &#8230;blah&#8230; ¦ 2005-08-12 13:03:12<br />
8 ¦ <a href="http://www.msn.com" rel="nofollow" onclick="javascript:urchinTracker ('/outbound/comment/www.msn.com');">http://www.msn.com</a> ¦ &#8230;blah&#8230; ¦ 2005-08-20 13:13:12<br />
9 ¦ <a href="http://www.mydomoain.com" rel="nofollow" onclick="javascript:urchinTracker ('/outbound/comment/www.mydomoain.com');">http://www.mydomoain.com</a> ¦ &#8230;blah&#8230; ¦ 2005-08-31 13:03:12<br />
&#8230;</p>
<p>The query I am trying to write needs to SELECT the last entry for a url (latest date or greatest id) relative to the other rows for the same url. (the last entry for mydomain.com for example). (This is why I was using GROUP BY, so that I would only get one row per url)</p>
<p>The query needs to return the most recent entry for each url.</p>
<p>id ¦ url ¦ &#8230;stored data &#8230; ¦ timestamp<br />
2 ¦ <a href="http://www.webmasterworld.com" rel="nofollow" onclick="javascript:urchinTracker ('/outbound/comment/www.webmasterworld.com');">http://www.webmasterworld.com</a> ¦ &#8230;blah&#8230; ¦ 2005-08-04 13:33:12<br />
4 ¦ <a href="http://www.cnn.com" rel="nofollow" onclick="javascript:urchinTracker ('/outbound/comment/www.cnn.com');">http://www.cnn.com</a> ¦ &#8230;blah&#8230; ¦ 2005-08-06 10:00:02<br />
8 ¦ <a href="http://www.msn.com" rel="nofollow" onclick="javascript:urchinTracker ('/outbound/comment/www.msn.com');">http://www.msn.com</a> ¦ &#8230;blah&#8230; ¦ 2005-08-20 13:13:12<br />
9 ¦ <a href="http://www.mydomoain.com" rel="nofollow" onclick="javascript:urchinTracker ('/outbound/comment/www.mydomoain.com');">http://www.mydomoain.com</a> ¦ &#8230;blah&#8230; ¦ 2005-08-31 13:03:12</p>
<p>Does this give a clearer picture of the query I am trying to create?</p>
<p>I am trying to pull all of the most recent rows from the table:</p>
<p>SELECT * FROM `data` GROUP BY `name` ORDER BY `timestamp`</p>
<p>Here the GROUP BY is applied first, and the ORDER BY is then applied to the grouped result. But I want the ORDER BY to be applied first, with the GROUP BY then keeping the first record of each group.</p>
<p>To overcome this issue, I am trying the query below:<br />
SELECT data.* FROM data INNER JOIN (SELECT MAX(id) AS id FROM data GROUP BY url) ids ON data.id = ids.id</p>
<p>I am running into performance problems, because the subquery always executes slowly.</p>
<p>If there is a simpler or more efficient way to do this, I would love to know.</p>
<p>Any help would be appreciated.</p>
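<p>For reference, here is a minimal sketch of the join on MAX(id) quoted above, run against the sample rows from this comment. It uses Python with SQLite only so that it is runnable anywhere; MySQL's optimizer behaves differently. The column name ts (standing in for timestamp), the omission of the stored-data columns, and the index idx_url_id are assumptions made for illustration, not part of the original table. A composite index on (url, id) is one thing that may help the grouped subquery, since the per-url maximum id can then potentially be read from the index alone.</p>

```python
import sqlite3

# Hypothetical stand-in for the `data` table described in the comment.
# The "...stored data..." columns are left out; `ts` stands in for `timestamp`.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data (id INTEGER PRIMARY KEY, url TEXT NOT NULL, ts TEXT NOT NULL)"
)

# Sample rows copied from the comment (URLs kept exactly as posted).
rows = [
    (1, "www.mydomoain.com", "2005-08-04 13:03:12"),
    (2, "www.webmasterworld.com", "2005-08-04 13:33:12"),
    (3, "www.cnn.com", "2005-08-04 15:03:12"),
    (4, "www.cnn.com", "2005-08-06 10:00:02"),
    (5, "www.mydomoain.com", "2005-08-10 13:03:12"),
    (6, "www.mydomoain.com", "2005-08-11 13:03:12"),
    (7, "www.mydomoain.com", "2005-08-12 13:03:12"),
    (8, "www.msn.com", "2005-08-20 13:13:12"),
    (9, "www.mydomoain.com", "2005-08-31 13:03:12"),
]
conn.executemany("INSERT INTO data VALUES (?, ?, ?)", rows)

# Assumed index (not in the original schema): groups by url and exposes
# the largest id per url without touching the full rows.
conn.execute("CREATE INDEX idx_url_id ON data (url, id)")

# Same shape as the query in the comment: find the greatest id per url in a
# derived table, then join back to fetch the full row for each of those ids.
LATEST_PER_URL = """
SELECT d.id, d.url, d.ts
FROM data AS d
INNER JOIN (SELECT MAX(id) AS id FROM data GROUP BY url) AS ids
        ON d.id = ids.id
ORDER BY d.id
"""

latest = conn.execute(LATEST_PER_URL).fetchall()
for row in latest:
    print(row)
```

<p>On the sample data this returns the rows with ids 2, 4, 8 and 9, matching the expected result listed in the comment. Whether the index actually helps on a real MySQL server would need to be confirmed with EXPLAIN against the real table.</p>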
]]></content:encoded>
	</item>
	<item>
		<title>By: High Performance MySQL, Second Edition: Schema Optimization and Indexing at Xaprb</title>
		<link>http://www.xaprb.com/blog/2007/10/07/high-performance-mysql-second-edition-query-performance-optimization/#comment-13526</link>
		<dc:creator>High Performance MySQL, Second Edition: Schema Optimization and Indexing at Xaprb</dc:creator>
		<pubDate>Mon, 15 Oct 2007 01:43:35 +0000</pubDate>
		<guid isPermaLink="false">http://www.xaprb.com/blog/2007/10/07/high-performance-mysql-second-edition-query-performance-optimization/#comment-13526</guid>
		<description>[...] we have on schema, index, and query optimization. The last two chapters I&#8217;ve written about (Query Performance Optimization and Advanced MySQL Features) have generated lots of feedback along the lines of &#8220;don&#8217;t [...]</description>
		<content:encoded><![CDATA[<p>[...] we have on schema, index, and query optimization. The last two chapters I&#8217;ve written about (Query Performance Optimization and Advanced MySQL Features) have generated lots of feedback along the lines of &#8220;don&#8217;t [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Xaprb</title>
		<link>http://www.xaprb.com/blog/2007/10/07/high-performance-mysql-second-edition-query-performance-optimization/#comment-13490</link>
		<dc:creator>Xaprb</dc:creator>
		<pubDate>Mon, 08 Oct 2007 12:57:58 +0000</pubDate>
		<guid isPermaLink="false">http://www.xaprb.com/blog/2007/10/07/high-performance-mysql-second-edition-query-performance-optimization/#comment-13490</guid>
		<description>Hi Lukas,

That's because the outline has huge sections that should really be split into subsections, I think :-)

Giuseppe Maxia also recommended SQL Performance Tuning.  I have a copy but haven't read it in a while.  Thanks for the reminder.</description>
		<content:encoded><![CDATA[<p>Hi Lukas,</p>
<p>That&#8217;s because the outline has huge sections that should really be split into subsections, I think :-)</p>
<p>Giuseppe Maxia also recommended SQL Performance Tuning.  I have a copy but haven&#8217;t read it in a while.  Thanks for the reminder.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Lukas</title>
		<link>http://www.xaprb.com/blog/2007/10/07/high-performance-mysql-second-edition-query-performance-optimization/#comment-13489</link>
		<dc:creator>Lukas</dc:creator>
		<pubDate>Mon, 08 Oct 2007 09:57:46 +0000</pubDate>
		<guid isPermaLink="false">http://www.xaprb.com/blog/2007/10/07/high-performance-mysql-second-edition-query-performance-optimization/#comment-13489</guid>
		<description>Hmm, I presume topics like the slow query log, EXPLAIN, etc. are discussed in another chapter, because I do not see where they would fit in the outline? I have given a number of generic talks (though with a MySQL slant) on this topic. You might find my slides useful:
http://pooteeweet.org/slides

Although the book is a bit dated and a lot of its comments about MySQL are no longer valid, I still recommend "SQL Performance Tuning" for its general approach and deep understanding. If you do not have this book yet, I suggest at least flipping through it to see how the authors approach explaining the various areas of SQL optimization.</description>
		<content:encoded><![CDATA[<p>Hmm, I presume topics like the slow query log, EXPLAIN, etc. are discussed in another chapter, because I do not see where they would fit in the outline? I have given a number of generic talks (though with a MySQL slant) on this topic. You might find my slides useful:<br />
<a href="http://pooteeweet.org/slides" rel="nofollow" onclick="javascript:urchinTracker ('/outbound/comment/pooteeweet.org');">http://pooteeweet.org/slides</a></p>
<p>Although the book is a bit dated and a lot of its comments about MySQL are no longer valid, I still recommend &#8220;SQL Performance Tuning&#8221; for its general approach and deep understanding. If you do not have this book yet, I suggest at least flipping through it to see how the authors approach explaining the various areas of SQL optimization.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Xaprb</title>
		<link>http://www.xaprb.com/blog/2007/10/07/high-performance-mysql-second-edition-query-performance-optimization/#comment-13486</link>
		<dc:creator>Xaprb</dc:creator>
		<pubDate>Mon, 08 Oct 2007 00:33:23 +0000</pubDate>
		<guid isPermaLink="false">http://www.xaprb.com/blog/2007/10/07/high-performance-mysql-second-edition-query-performance-optimization/#comment-13486</guid>
		<description>That's a useful observation.  You may have put your finger on a back-of-my-mind feeling about the chapter.

There's a lot of stuff in chapter 7, Optimizing Server Settings, which talks about volume of data.  I think in general we assume big datasets and/or heavy load.  I will run this by the other authors and see what they say -- frankly they have more experience with large-volume installations than I do (I manage data in the 10s to 100s of GB regularly, but not TB).</description>
		<content:encoded><![CDATA[<p>That&#8217;s a useful observation.  You may have put your finger on a back-of-my-mind feeling about the chapter.</p>
<p>There&#8217;s a lot of stuff in chapter 7, Optimizing Server Settings, which talks about volume of data.  I think in general we assume big datasets and/or heavy load.  I will run this by the other authors and see what they say &#8212; frankly they have more experience with large-volume installations than I do (I manage data in the 10s to 100s of GB regularly, but not TB).</p>
]]></content:encoded>
	</item>
</channel>
</rss>
