How to select the first or last row per group in SQL
There is no “first” or “last” aggregate function in SQL. Sometimes you can use MIN() or MAX(), but often that won’t work either. There are a couple of ways to solve this vexing non-relational problem. Read on to find out how.
First, let’s be clear: I am posing a very non-relational problem. This is not about the minimum, maximum, top, most, least or any other relationally valid extreme in the group. It’s the first or last, in whatever order the rows happen to come. And we all know rows aren’t ordered — in theory. But in practice they are, and sometimes you need the first or last row in a group.
If you have a question this article doesn’t answer, you might like to read how to select the first/least/max row per group in SQL and how to find the maximum row per group in SQL without subqueries.
A MySQL user-variable solution
I’ll show a MySQL-specific solution with one of the queries I developed for MySQL Table Checksum.
Here’s the idea: crush an entire table down to a single checksum value by checksumming each row, mushing it together with the previous row’s checksum, and then checksumming the result again. It’s fairly easy to do this, but it’s hard to get the final result in one statement. This is necessary to use the statement in an INSERT .. SELECT, which I needed to do.
An example might clarify:
select * from fruit; +---------+ | variety | +---------+ | apple | | orange | | lemon | | pear | +---------+ set @crc := ''; select variety, @crc := md5(concat(@crc, md5(variety))) from fruit; +---------+-----------------------------------------+ | variety | @crc := md5(concat(@crc, md5(variety))) | +---------+-----------------------------------------+ | apple | ae6d32585ecc4d33cb8cd68a047d8434 | | orange | 7ec613c796f44ef5ccb0e24e94323e38 | | lemon | a2475f37be12cebf733ebfc7ee2ee473 | | pear | ec98fe57833bbd91790ebc7ccf84c7e9 | +---------+-----------------------------------------+
I want the “last” value of @crc after the statement is done processing. How can I do this? The solution I found is to use a counter variable. I’ll demonstrate:
set @crc := '', @cnt := 0; select variety, @cnt := @cnt + 1 as cnt, @crc := md5(concat(@crc, md5(variety))) as crc from fruit; +---------+------+----------------------------------+ | variety | cnt | crc | +---------+------+----------------------------------+ | apple | 1 | ae6d32585ecc4d33cb8cd68a047d8434 | | orange | 2 | 7ec613c796f44ef5ccb0e24e94323e38 | | lemon | 3 | a2475f37be12cebf733ebfc7ee2ee473 | | pear | 4 | ec98fe57833bbd91790ebc7ccf84c7e9 | +---------+------+----------------------------------+
The counter variable might make you want to write something like HAVING cnt = MAX(cnt), but that won’t work (try it!). Instead, I prefixed the checksum with the count so the last row is the stringwise maximum:
select variety,
@crc := concat(lpad(@cnt := @cnt + 1, 10, '0'),
md5(concat(right(@crc, 32), md5(variety)))) as crc
from fruit;
+---------+--------------------------------------------+
| variety | crc |
+---------+--------------------------------------------+
| apple | 0000000001ae6d32585ecc4d33cb8cd68a047d8434 |
| orange | 00000000027ec613c796f44ef5ccb0e24e94323e38 |
| lemon | 0000000003a2475f37be12cebf733ebfc7ee2ee473 |
| pear | 0000000004ec98fe57833bbd91790ebc7ccf84c7e9 |
+---------+--------------------------------------------+
You can see I also left-padded the count so a lexical sort will agree with a numeric sort, and so I can predict how many extra characters I’ll need to remove to get back the original value. Now I can use the MAX() function to select the last row, and simply lop off the leftmost ten digits (I use the RIGHT() function for convenience, but generally you want to use SUBSTRING()):
select right(max(
@crc := concat(lpad(@cnt := @cnt + 1, 10, '0'),
md5(concat(right(@crc, 32), md5(variety))))
), 32) as crc
from fruit;
+----------------------------------+
| crc |
+----------------------------------+
| ec98fe57833bbd91790ebc7ccf84c7e9 |
+----------------------------------+
Et voila, I got the last value in the group. By the way, this will work with ONLY_FULL_GROUP_BY in the server’s SQL mode.
Other methods
My solution relies on a MySQL user variable to do the counting, but there are many ways to number rows in SQL: you could simulate the ROW_NUMBER() function, for instance, or use techniques mentioned in the comments on how to number rows in MySQL (one of the comments shows a particularly clever solution with subqueries, but I didn’t want to use it because MySQL doesn’t support subqueries in older versions). Any of these should work one way or another. Of course, if you are using a product such as Microsoft SQL server 2005, which actually has the ROW_NUMBER() function, you can use that!
Conclusion
Finding the first or last row is a bit unintuitive, and it’s definitely non-relational, but sometimes it’s what you need. The technique I demonstrated in this article is easily adaptable to many kinds of queries. I hope it helped you!
If this article didn’t solve your problem, please read these before posting questions to the comments: how to select the first/least/max row per group in SQL and how to find the maximum row per group in SQL without subqueries.



[...] blog is focussed on MySQL. The author really knows his stuff, not only in MySQL esoterica but about SQL in general. Great place to go to learn more about [...]
MySQL Resources « Technikhil Writing
29 Aug 07 at 9:10 pm
Thank you. Nice tips!
Artem Nosulchik
5 Sep 07 at 3:38 am
This is not helping specially when you are shortening the queries , you are writing in detail but you come to practice you are shorting your only way to explain what are you talking about..
Zaman
21 Feb 08 at 1:29 am
I ran into a scenario which required me to use the last row of a group, and after a lot of searching and looking up all the usual methods I came up with one which I thought I’d share incase anyone else does.
I did not use any sub-queries or the likes due to how slow they are, but instead made use of the GROUP_CONCAT() function.
In my scenario, I had a table which had rows of data for every “minute”, each row contained 4 main values I needed: open, close, high, low. I made my query to group this data into 5 minute intervals, such as: 5:00PM, 5:05PM, 5:10PM and so on. The high/low was easily accomplished by MIN/MAX, the open was correct because “group by” returns the first rows “open” value, but obviously “close” was incorrect and I needed the last row of the group. The main method mentioned here did not do me any good due to the large amount of data I was working with, and my solution was simply this (this is not the exact query):
—————————-
select
open, MAX(high), MIN(low), SUBSTRING_INDEX(GROUP_CONCAT(CAST(close as CHAR)), ‘,’, -1)
from mytable
group by [xyz...]
—————————-
The group_concat() will return all “close” values within the group seperated by a comma, I used CAST() because the values are FLOAT and would otherwise result in BLOB, lastly the SUBSTRING_INDEX() function was used to retrieve the last item in the string after the final comma.
Works like a charm!
Scot
28 Sep 09 at 5:19 pm
[...] came across an interesting post at Xaprb.com on selecting the first and last row of a group in MySQL. This solution builds on a comment in [...]
Simulating First/Last aggregate functions in MySQL | zonalivre.org
11 Oct 09 at 8:17 pm
Here is a great simple and efficient solution to the problem. It’s close to just saying FIRST() LAST() in MySQL
http://topwebguy.com/first-and-last-in-mysql-a-working-solution/
Webmaster
17 Apr 10 at 3:40 am
There is a simpler solution
You would just do an inner select and sort by the order you want in descending/ascending order
SELECT ip, id, word
FROM
( SELECT * FROM tb_a ORDER BY id DESC )
GROUP BY ip
Ian Bugeja
13 Aug 12 at 1:44 pm