Comments on: High Performance MySQL, Second Edition: Query Performance Optimization
http://www.xaprb.com/blog/2007/10/07/high-performance-mysql-second-edition-query-performance-optimization/
Stay curious!
Mon, 13 May 2013 05:55:40 +0000

By: Amit Shah, Mon, 15 Dec 2008 10:53:42 +0000
http://www.xaprb.com/blog/2007/10/07/high-performance-mysql-second-edition-query-performance-optimization/#comment-15561

I have a table that contains duplicate data.

Some more clarifications on the data set. The table consists of:
id, url, …a bunch of stored data for that url…, timestamp

The user can store historical data about different urls in the table, so it may look something like this as time goes on:

id ¦ url ¦ …stored data … ¦ timestamp
1 ¦ http://www.mydomoain.com ¦ …blah… ¦ 2005-08-04 13:03:12
2 ¦ http://www.webmasterworld.com ¦ …blah… ¦ 2005-08-04 13:33:12
3 ¦ http://www.cnn.com ¦ …blah… ¦ 2005-08-04 15:03:12
4 ¦ http://www.cnn.com ¦ …blah… ¦ 2005-08-06 10:00:02
5 ¦ http://www.mydomoain.com ¦ …blah… ¦ 2005-08-10 13:03:12
6 ¦ http://www.mydomoain.com ¦ …blah… ¦ 2005-08-11 13:03:12
7 ¦ http://www.mydomoain.com ¦ …blah… ¦ 2005-08-12 13:03:12
8 ¦ http://www.msn.com ¦ …blah… ¦ 2005-08-20 13:13:12
9 ¦ http://www.mydomoain.com ¦ …blah… ¦ 2005-08-31 13:03:12

The query I am trying to write needs to SELECT the last entry for each url (latest date or greatest id) relative to the other rows for the same url, for example the last entry for mydomain.com. This is why I was using GROUP BY, so that I would only get one row per url.

The query needs to return the most recent entry for each url.

id ¦ url ¦ …stored data … ¦ timestamp
2 ¦ http://www.webmasterworld.com ¦ …blah… ¦ 2005-08-04 13:33:12
4 ¦ http://www.cnn.com ¦ …blah… ¦ 2005-08-06 10:00:02
8 ¦ http://www.msn.com ¦ …blah… ¦ 2005-08-20 13:13:12
9 ¦ http://www.mydomoain.com ¦ …blah… ¦ 2005-08-31 13:03:12

Does this give a clearer picture of the query I am trying to create?

I am trying to pull all of the most recent rows from the table:

SELECT * FROM `data` GROUP BY `url` ORDER BY `timestamp`

Here the GROUP BY is applied first, and the ORDER BY is then applied to the grouped result. I want the opposite: order the rows first, and then keep only the first row of each group.

To overcome this issue, I’m trying the query below:
SELECT data.* FROM data INNER JOIN (SELECT MAX(id) AS id FROM data GROUP BY url) ids ON data.id = ids.id
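
As a sanity check, the join above can be run against the sample rows from this comment. The following is only a minimal sketch: it uses Python's built-in sqlite3 module rather than MySQL, the function names are hypothetical, and the elided "stored data" column is filled with a placeholder string. It confirms that the query picks ids 2, 4, 8 and 9, but says nothing about MySQL performance.

```python
import sqlite3

# Sample rows copied from the comment; the stored-data column is elided in
# the original, so a placeholder value is inserted for it below.
ROWS = [
    (1, "http://www.mydomoain.com", "2005-08-04 13:03:12"),
    (2, "http://www.webmasterworld.com", "2005-08-04 13:33:12"),
    (3, "http://www.cnn.com", "2005-08-04 15:03:12"),
    (4, "http://www.cnn.com", "2005-08-06 10:00:02"),
    (5, "http://www.mydomoain.com", "2005-08-10 13:03:12"),
    (6, "http://www.mydomoain.com", "2005-08-11 13:03:12"),
    (7, "http://www.mydomoain.com", "2005-08-12 13:03:12"),
    (8, "http://www.msn.com", "2005-08-20 13:13:12"),
    (9, "http://www.mydomoain.com", "2005-08-31 13:03:12"),
]


def build_sample_db():
    """Create an in-memory SQLite table shaped like the one described."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data ("
        "id INTEGER PRIMARY KEY, url TEXT, stored TEXT, timestamp TEXT)"
    )
    conn.executemany(
        "INSERT INTO data (id, url, stored, timestamp) "
        "VALUES (?, ?, 'blah', ?)",
        ROWS,
    )
    return conn


def latest_per_url(conn):
    """Return the row with the greatest id for each url, ordered by id."""
    sql = """
        SELECT data.id, data.url, data.timestamp
        FROM data
        INNER JOIN (SELECT MAX(id) AS id FROM data GROUP BY url) ids
            ON data.id = ids.id
        ORDER BY data.id
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for row in latest_per_url(build_sample_db()):
        print(row)
```

Joining on MAX(id) assumes that a greater id always means a later timestamp, which holds for the sample rows shown.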

I’m having performance problems because the subquery always executes slowly.

If there is a simpler/more efficient way to do this – I would love to know.
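
One commonly suggested variation, sketched here under assumptions rather than measured: add a composite index on (url, id) and, optionally, express the same result as a self-join that keeps only rows with no newer row for the same url. The index name idx_url_id and the function names are hypothetical, and the sketch runs on Python's sqlite3 module, not MySQL, so it only demonstrates that the query returns the right rows. Whether either change is faster on a given MySQL version would need to be checked with EXPLAIN; the self-join in particular can get expensive when a single url has many rows.

```python
import sqlite3


def build_db(rows):
    """Create an in-memory table of (id, url, timestamp) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data (id INTEGER PRIMARY KEY, url TEXT, timestamp TEXT)"
    )
    conn.executemany("INSERT INTO data VALUES (?, ?, ?)", rows)
    # Assumption: a composite index so that per-url lookups of the greatest
    # id can be answered from the index. The name is made up for this sketch.
    conn.execute("CREATE INDEX idx_url_id ON data (url, id)")
    return conn


def latest_per_url_antijoin(conn):
    """Keep each row for which no row with the same url has a greater id."""
    sql = """
        SELECT d.id, d.url, d.timestamp
        FROM data AS d
        LEFT JOIN data AS newer
            ON newer.url = d.url AND newer.id > d.id
        WHERE newer.id IS NULL
        ORDER BY d.id
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    # A subset of the sample rows from the comment.
    sample = [
        (2, "http://www.webmasterworld.com", "2005-08-04 13:33:12"),
        (3, "http://www.cnn.com", "2005-08-04 15:03:12"),
        (4, "http://www.cnn.com", "2005-08-06 10:00:02"),
    ]
    for row in latest_per_url_antijoin(build_db(sample)):
        print(row)
```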

Any help would be appreciated.

By: High Performance MySQL, Second Edition: Schema Optimization and Indexing at Xaprb, Mon, 15 Oct 2007 01:43:35 +0000
http://www.xaprb.com/blog/2007/10/07/high-performance-mysql-second-edition-query-performance-optimization/#comment-13526

[...] we have on schema, index, and query optimization. The last two chapters I’ve written about (Query Performance Optimization and Advanced MySQL Features) have generated lots of feed back along the lines of “don’t [...]

By: Xaprb, Mon, 08 Oct 2007 12:57:58 +0000
http://www.xaprb.com/blog/2007/10/07/high-performance-mysql-second-edition-query-performance-optimization/#comment-13490

Hi Lukas,

That’s because the outline has huge sections that should really be split into subsections, I think :-)

Giuseppe Maxia also recommended SQL Performance Tuning. I have a copy but haven’t read it in a while. Thanks for the reminder.

By: Lukas, Mon, 08 Oct 2007 09:57:46 +0000
http://www.xaprb.com/blog/2007/10/07/high-performance-mysql-second-edition-query-performance-optimization/#comment-13489

Hmm, I presume stuff like the slow query log, EXPLAIN, etc. is discussed in another chapter, because I do not see where it would fit in the outline? I have given a number of generic talks (though with a MySQL slant) on this topic. You might find my slides useful:
http://pooteeweet.org/slides

Although the book is a bit dated and a lot of the comments about MySQL are no longer valid, I still recommend “SQL Performance Tuning” for its general approach and deep understanding. If you do not have this book yet, I suggest at least flipping through it to see how they approach explaining the various areas of SQL optimization.

By: Xaprb, Mon, 08 Oct 2007 00:33:23 +0000
http://www.xaprb.com/blog/2007/10/07/high-performance-mysql-second-edition-query-performance-optimization/#comment-13486

That’s a useful observation. You may have put your finger on a back-of-my-mind feeling about the chapter.

There’s a lot of stuff in chapter 7, Optimizing Server Settings, which talks about volume of data. I think in general we assume big datasets and/or heavy load. I will run this by the other authors and see what they say. Frankly, they have more experience with large-volume installations than I do (I manage data in the 10s to 100s of GB regularly, but not TB).
