How to write SQL JOIN clauses more compactly

Here’s a way to write join clauses so they are more compact, more readable, and less confusing.

Standard SQL:2003 defines a USING clause that can serve the same function as the ON clause in the familiar JOIN syntax. For example, the following join

select a.col1, b.col2
from a
   inner join b on a.col3 = b.col3

May be written as follows:

select a.col1, b.col2
from a
   inner join b using(col3)

That may not look like much of an improvement, but it is a big help in larger joins where many tables have columns with the same names. In these cases, not only is it tedious to write out every pair of columns that must match in the join, you often have to refer to the tables with aliases too. And it’s tough to read such a join and understand it, or debug it. For example, what’s wrong with this join?

select tbl1.col1, tbl2.col2, tbl3.col2, tbl4.col1
from apples as tbl1
   inner join oranges as tbl2 on tbl1.col3 = tbl2.col3
   inner join grapes as tbl3 on tbl3.col3 = tbl2.col3
   inner join peaches as tbl4 on tbl3.col3 = tbl2.col3

The statement is valid and will execute, but it won’t give the results you probably wanted (tbl4’s join clause doesn’t refer to any columns from tbl4). The bug is even harder to find if the statement isn’t neatly indented and consistently organized. That statement is better written with USING clauses:

select tbl1.col1, tbl2.col2, tbl3.col2, tbl4.col1
from apples as tbl1
   inner join oranges as tbl2 using(col3)
   inner join grapes as tbl3 using(col3)
   inner join peaches as tbl4 using(col3)

USING matches the specified columns from each table, eliminating the need to write them out twice explicitly with aliases. In MySQL 5, you can see how the statement gets rewritten by the optimizer with EXPLAIN EXTENDED followed by SHOW WARNINGS. The result shows that it gets rewritten as an old-style join with the column-matching done in the WHERE clause.

USING isn’t a drop-in replacement for the ON clause in normal join syntax. There are some differences, especially on different platforms (and in MySQL’s case, even differences between versions of the product — see MySQL JOIN Syntax). This probably makes relational purists hot under the collar. Of course, everything about SQL makes relational purists mad, because SQL isn’t relational, and database products are even less relational than the SQL standard… oh well.

In my opinion, use it if it makes your life easier. Programming is hard — use whatever your tools give you to ease the burden.

One more note: there’s also a NATURAL JOIN syntax that automatically discovers columns with the same names in both tables and matches them in the join. I don’t use this in programs, because if someone adds more columns to one of the tables involved in such a join, the join criteria will silently change. I think join criteria should always be explicit, for the same reason I avoid blind inserts. However, this syntax can be convenient for writing one-off queries at the command line.

Technorati Tags:No Tags

You might also like:

  1. How to simulate optional parameters in SQL
  2. Why large IN clauses are problematic
  3. How to write a SQL exclusion join
  4. How to simulate FULL OUTER JOIN in MySQL
  5. My personal SQL coding standards

10 Responses to “How to write SQL JOIN clauses more compactly”


  1. 1 Ronald Bradford

    I didn’t know that syntax. Good one Baron!

  2. 2 Roland Bouman

    Hi Baron,

    I knew the syntax, but I don’t really like it. I am still waiting for what i think is the only sensible compact join syntax, the “foreign key constraint join”.

    Read more here: Intelligent SQL JOIN syntax?

  3. 3 Xaprb

    Roland, I like the RELATE USING syntax your post discusses too. But I think current syntaxes are adequate; there are so many different ways (especially outer joins) to join tables. The finer points of specific ways to join tables will always require wordy syntaxes. For example, a left outer join on a range, like so:

    ... left outer join tbl2 on col = col2 or col3 between col4 and col5...

    But again, for many simple cases, a foreign key constraint join would make a lot of sense. I say, bring it on… another way to make things easier. I like having lots of ways to do things.

    One reason NATURAL JOIN isn’t so useful for most things I do: people tend to add a TIMESTAMP column named ‘ts’ Obviously this shouldn’t be involved in a join :-)

  4. 4 Roland Bouman

    Adequate: well, a lot of people feel that way.

    What I don’t like about USING is that it makes you do one and the same thing (JOIN-ing) in two different ways (with ON and USING). It does happen quite often that you have a table that has more than one foreign key to one particular parent table, forcing you to use ON anyway (even if you use the convention of using the same column names for related tables - a convention which I don’t prefer). So I rather write all joins the same way - with ON. And indeed, the more exotic non-equi joins pose no problem at all to ON.

    What I think would make the RELATE syntax interesting is that it would allow you to write the join as if it is a sentence or statement. It would allow you to code with the semantics of the relationship rather than the implementation with foreign keys.

    To make this really work, you would actually need to be able to define foreign key constraints with additional role names. Something like

    ALTER TABLE City
    ADD CONSTRAINT city_resides_in_country
    FOREIGN KEY (Country_id)
    REFERENCES Country(id)
    PARENT_ROLE `contains`
    CHILD_ROLE `resides in`
    ;

    There should be a unicity constraint on the name of the table, the rolename and the name of the related table. Now the join could be written like

    SELECT *
    FROM  Country RELATE `Contains` City

    or, if you like

    SELECT *
    FROM  City RELATE `Resides in` Country

    This would totally solve the issue of having more than one foreign key to the same parent table because the two relationships will always play other roles (if not, the design is wrong)

  5. 5 David Shrewsbury

    Your first example of using the USING clause is missing the USING clause. :-)

    -Dave

  6. 6 Xaprb

    DOH! What makes it even funnier is that nobody else caught it. Fixed, thanks Dave.

  7. 7 rudy

    i don’t like USING either, because it’s vague and can, in certain circumstances, produce ridiculous results

    in the example you gave:

    select tbl1.col1, tbl2.col2, tbl3.col2, tbl4.col1
    from apples as tbl1
       inner join oranges as tbl2 using(col3)
       inner join grapes as tbl3 using(col3)
       inner join peaches as tbl4 using(col3)

    you say that “USING matches the specified columns from each table” but this is problematic — is it really each table?

    in this case, col3 exists in all 4 tables, so does this mean that the last USING is equivalent to:

     on tbl4.col3 = tbl3.col3
    and tbl4.col3 = tbl2.col3
    and tbl4.col3 = tbl1.col3

    and don’t get me started on NATURAL joins — what were they thinking?!!!

  8. 8 Xaprb

    Rudy, very good points.

    Disclosure: there is not a single USING in any codebase I’m involved with.

  9. 9 rye

    Well another difference between join with ‘on’ and join with ‘using’ is that when “select *” is used, with “on” the joined column will be presented twice, while with “using’ only once and placed as the first column.

    Another point I think noteworthy is that with “using” the joined column must be enclosed in parenthesis/”()”, without them it’s an error.

    ok the DBMS I play with is mysql 5.0.27 on NT.

  1. 1 Binary Look » links for 2006-12-09

Leave a Reply

Please do not use this blog to get help with problems or bugs in Maatkit or innotop: use the Sourceforge forums, mailing list, or bug trackers. If you're asking for help with MySQL, please use the MySQL mailing list instead.