Xaprb

Stay curious!

A review of O’Reilly’s SQL Hacks

with 8 comments

I recently read O’Reilly’s SQL Hacks, by Andrew Cumming and Gordon Russell. It’s an interesting and rewarding mixture of tips and tricks for novice to expert users.

Overview

The book consists of 100 hacks, organized into 12 chapters whose topics and level roughly progress from complete novice to advanced. There are 366 pages of real content. The index is good; for example, on the topic of dates, which is one of the main things I use references for, it has dozens of entries, many pointing to the same entries but worded in a different way, which can be very helpful for such a topic.

I give it 3 out of 5 stars because there’s some really good content, but there’s also some that shouldn’t have made it past technical reviewers, and it could have been better organized and more carefully edited.

If you approach the book as the authoritative source of expertise on how to solve every problem, you’ll be disappointed (and if you don’t know any better, you might be misled into some very bad solutions to particular problems). If you approach it as the opinions and suggestions of several people, who have things to teach you but are by no means the last word on any given subject, you may get a lot out of it. I did.

Incidentally, many of the hacks appear on this and other websites, but that shouldn’t stop you from getting a copy.

There’s some really good content

Some of the hacks are great. For example, Hack #24, “Multiply Across a Result Set,” shows you how to make up for the lack of a built-in PRODUCT() aggregate function with EXP(SUM(LN(col))). It’s simple, but I never thought of it, and now I have it tucked away for easy reference when I need it. That’s worth something — I won’t know how much until I need it.

There are also some great hacks about date math, which is always a bear. Having the complex math to find the second Tuesday of the month, or the last Thursday, easily at hand is great.

These and other gems make the book worth both reading through once and owning permanently. It was worth it for me to read once for the great ideas it gave me, like #24, and so I’ll know whether it contains a solution to something I’m working on. It’s worth owning permanently because I want to have it at hand for quick reference when I’m solving problems that are tough to memorize, and don’t want to derive the solutions from first principles. Date math falls into that category.

Reviewers were asleep at the wheel

Update: I assumed too much below; perhaps the technical reviewers did raise concerns but were not listened to, or something else might have happened. I won’t change what I wrote, so you can see my thoughts as I originally wrote them, but I want to say I’ve thought better of it now.

Some of the hacks miss obvious good solutions and propose bad ones instead. Take #16, “Search for a String Across Columns.” The hack is introduced with the question “does anyone have yellow anywhere in their room?” Then it advocates bludgeoning this query to death with the following blunt instrument:

select name from bedroom
where concat(floorcolor, ceilingcolor, wallcolor) like '%yellow%'

That really makes me cringe. There are many ways to do that better, most obviously:

select name from bedroom
where 'yellow' in(floorcolor, ceilingcolor, wallcolor);

The query isn’t written to return results where someone’s room has ‘yellow-orange’ or some other partial string matching. It explicitly states that concatenating, and then using a pattern matching operator, is better because it helps you avoid errors in typing. I disagree with this cost-benefit analysis; it’s an ugly kludge that will completely defeat indexing on any system I know of (the book mentions that it “will not usually employ an index,” which is an understatement).

It also won’t work right, because it smashes values together and loses the boundaries between them, which they go on to solve (in most but not all cases, but they don’t say that) with an uglier version of the same thing, instead of writing the query right.

This trick shows up again in Hack #88, which is also better solved with standard SQL. I wrote an admonition not to use these techniques in my employer’s copy of the book. Where were the technical reviewers on these? These techniques are a DBA’s nightmare, and any reviewer decently familiar with SQL should have cried foul and presented the better ways to solve the problems.

There are some syntactic issues, too. The book shows you how to run each of the examples on several different RDBMSs, which is good, but sometimes incompatibilities they mention don’t really exist. For example, CREATE TABLE SELECT in MySQL can also accept an optional AS, so it can be compatible with other systems, but the authors don’t say that. You also don’t need to prefix literal dates with DATE in MySQL and SQL Server, so some of the date examples could be rewritten to run without change on more systems.

Sometimes standard good practices fall by the wayside. For example, Hack #24 uses the != operator, which isn’t standard SQL.

Better editing would increase its usefulness

The book isn’t very clearly organized. Sometimes topics are hidden inside other topics. For example, if you want to know how to reduce the precision of numbers and dates, you wouldn’t look for it inside a hack titled “Break it Down By Range,” which is about grouping data into buckets. Similarly, if I want to find out how to get the top row from each group, I’m not going to look in Hack #88, “Test Two Values from a Subquery.”

There’s also some repeated material. That same hack repeats Hack #83, “Find the Top n in Each Group.” Perhaps it’s just for the sake of having a realistic example to illustrate the hack. If that’s the case, I’d change my criticism to “some of the examples are about something other than the hack and distract from its purpose.”

There’s too much variability in the style of the SQL examples. There is no consistent coding style, which is an even better idea in a book than in real code. Query results are sometimes shown as fancy MSWord-style tables and sometimes as MySQL-style textual bars and dashes.

Finally, many of the best hacks aren’t really by the listed authors, but by Rudy Limeback. But his contributions aren’t edited for style and consistency, so they feel very out of place, as though they were just copied and pasted out of an email. This is fairly jarring to read. (The astute reader will have noticed that I like even technical books to be a delight to read; perhaps you won’t have the same neuroses as I).

Summary

Don’t take my criticism too strongly; the book is worth reading and then keeping, as I said. I learned a lot from it and it’s among the more valuable books on SQL I have read. In fact, given that most books on SQL are just regurgitations of freely available manuals, it’s a standout.

Written by Xaprb

January 2nd, 2007 at 11:13 pm

Posted in Uncategorized

8 Responses to 'A review of O’Reilly’s SQL Hacks'

Subscribe to comments with RSS

  1. My edition claims it was published in November 2007. Now that’s a hack I’d like to be able to do.

    Matt Doar

    4 Jan 07 at 8:15 pm

  2. Thanks for the kind words!!!

    ;o)

    rudy

    8 Jan 07 at 11:01 am

  3. I agree, this is a good book.

    GB

    6 Feb 07 at 9:26 pm

  4. Thanks for writing this intelligent review — I checked it out before buying the book (I did order it). I came to distrust reviews on Amazon and plan to start posting my own reviews on a personal site or a blog rather than on Amazon — and also, instead of just reading Amazon reviews I now google for them in general. That’s how I found your review and it’s been everything most Amz reviews aren’t — knowledgeable, balanced, and fair: obviously you wrote it to share your real impressions rather than in order to byte the potential buyers into buying the book. Thanks again! I’ll be checking out your site from now on.

    JJJS

    22 Mar 07 at 5:58 pm

  5. I’m reading this book right now and I’m having second thoughts. There’s a large number of typos/contradictory statements/and other strange stuff (that looks like errors to me, but I’ll work on it more). Haven’t you noticed any? The book appears to not have been edited and even proofread.

    JJJS

    3 Apr 07 at 2:45 pm

  6. I did indeed, hence the “reviewers were asleep at the wheel” heading. But bear in mind a) it’s a tech book, and most tech books are rushed to market — this is far better than most b) it has several authors, and that’s always a ton of work c) I may write a book someday, but I haven’t yet, so I try to imagine myself on the receiving end of a book review.

    Xaprb

    3 Apr 07 at 3:05 pm

  7. :-) OK.

    But — if I may — let us try to imagine ourselves on the receiving end of a book buyer too. It’s too early to say (I’m on page 50), but I have a strong first impression that this book is um… not good. Some of the “hacks” appear to be written while being high on drugs! Did you notice how in one of the initial sections the author declares his goal as finding the date on which most widgets were ordered and then he proceeds to solve this problem by MAX’ing on the _date_! Gee, well, he found the latest date, what a surprise — even though more widgets were ordered on an earler date. The he’s off to the next “hack”. How do you do this, I wonder? Or — always a possibility — am I missing something here? In another snippet he creates a view on a table (so that the data can be looked at); then he modifies the table by the stuff from the view while checking that the record id be in the view’s set. But this view is a copy of the table! Logically, that is: physically, it IS the table. So, how can a record NOT be in the set?

    That said, I’m no guru, and it’s quite possible that I’m mistaken, so I’m gonna work on it some more, before I unleash let the dogs of war.

    One good thing: at least this book forces you to think, just to make sure what looks like errors indeed is errors.

    A bit subjective: I wouldn’t call this book “hacks” anything. Hackery, to me, summons an image of a mad genius; something well past the mere solid competence. But nothing in this book appears terribly original. It’s a cookbook of sorts.

    All right, checking out for now (but I’ll be back). :-)

    JJJS

    3 Apr 07 at 4:11 pm

  8. I appreciate your thoughts. I also agree that some of the “hacks” are bad practices (re-read my review… I was quite critical in places) but thought there was a lot of good stuff there too. *shrug* I see it as the authors’ best effort, and parts of it I get a lot from. I don’t want to be critical of that to the point of overlooking the good.

    Xaprb

    3 Apr 07 at 4:48 pm

Leave a Reply