There is usually more than one way to write a given query, but not all ways are created equal. Some mathematically equivalent queries can have drastically different performance. This article examines one of the motivations for inventing LEFT OUTER join and including it in the SQL standard: improved performance through exclusion joins.
LEFT OUTER join syntax was added to the SQL-92 standard specifically to address certain queries that had only been possible with NOT IN subqueries. The disadvantage of using subqueries in these situations is that they may require creating many anonymous tables and probing into them. A clever optimizer could generate the same plan for the subquery as for a LEFT OUTER join, but the standard had no such join at the time and query optimizers were much less capable, so query performance could take quite a hit. I should pause here and say that I wasn’t programming in 1992, so I’m only speaking from the history I’ve read and heard, not from personal experience. However, I definitely have personal experience with the performance hits of NOT IN queries!
Setup
I’ll use two tables of data, apples and oranges.
apples:

| Variety | Price |
|---|---|
| Fuji | 5.00 |
| Gala | 6.00 |

oranges:

| Variety | Price |
|---|---|
| Valencia | 4.00 |
| Navel | 5.00 |
The old-style way
In old-style SQL, one joined data sets by simply specifying the sets, and then specifying the match criteria in the WHERE clause, like so:
select *
from apples, oranges
where apples.Price = oranges.Price
and apples.Price = 5
Placing the join conditions in the WHERE clause is confusing when queries get more complex. It becomes hard to tell which conditions are used to join the tables (apples.Price = oranges.Price), and which are used to exclude results (apples.Price = 5). The two are equivalent in old-style joins, but as mentioned, some joins cannot be written in this style (more on this later).
The new way
The updated SQL standard addressed these issues by separating the join conditions from the WHERE clause. Join conditions now go in the FROM clause, greatly clarifying the syntax. Here is the simple join written in the newer style:
select *
from apples
inner join oranges
on apples.Price = oranges.Price
where apples.Price = 5
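To try the two styles side by side, here is a minimal sketch. It assumes Python's built-in sqlite3 module and an in-memory database as a stand-in for your own RDBMS; the column types are my guess, since the tables above only show values.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for a real server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table apples  (Variety text, Price real);
    create table oranges (Variety text, Price real);
    insert into apples  values ('Fuji', 5.00), ('Gala', 6.00);
    insert into oranges values ('Valencia', 4.00), ('Navel', 5.00);
""")

# Old style: join condition and filter both live in the WHERE clause.
old_style = conn.execute("""
    select *
    from apples, oranges
    where apples.Price = oranges.Price
    and apples.Price = 5
""").fetchall()

# New style: join condition in the FROM clause, filter in the WHERE clause.
new_style = conn.execute("""
    select *
    from apples
    inner join oranges
    on apples.Price = oranges.Price
    where apples.Price = 5
""").fetchall()

print(old_style)  # [('Fuji', 5.0, 'Navel', 5.0)]
print(new_style)  # [('Fuji', 5.0, 'Navel', 5.0)]
```

Both queries return the same single row; only the readability differs.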
Outer joins
Separating the join conditions from the WHERE clause allows OUTER joins. There are three kinds of OUTER joins: LEFT, RIGHT and FULL. The most common is a LEFT OUTER join, but all three keep rows that fail the join condition instead of eliminating them from the result set. When data does not match, the row is included from one table as usual, and the other table’s columns are filled with NULLs (since there is no matching data to insert).
In a LEFT OUTER join, every row from the left-hand table is included, whether there is a matching row in the right-hand table or not. When there is a matching row in the right-hand table, it is included; otherwise the right-hand table’s columns are filled with NULLs. A demonstration may clarify:
select *
from apples
left outer join oranges
on apples.Price = oranges.Price
| Variety | Price | Variety | Price |
|---|---|---|---|
| Fuji | 5.00 | Navel | 5.00 |
| Gala | 6.00 | NULL | NULL |
INNER joins keep only matching rows in the result set. It is possible to use an INNER join to select apples and oranges with matching prices, as above. With LEFT OUTER joins it is possible to answer the reverse query, “show me apples for which there are no oranges with a matching price.” This is the exclusion join: simply eliminate the matching rows in the WHERE clause, keeping only the rows whose right-hand columns were filled with NULLs:
select apples.Variety
from apples
left outer join oranges
on apples.Price = oranges.Price
where oranges.Price is null
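The following sketch runs both the plain LEFT OUTER join and the exclusion join against the sample data. As before, it assumes Python's built-in sqlite3 module with an in-memory database; the `order by` is my addition, there only to make the row order predictable.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for a real server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table apples  (Variety text, Price real);
    create table oranges (Variety text, Price real);
    insert into apples  values ('Fuji', 5.00), ('Gala', 6.00);
    insert into oranges values ('Valencia', 4.00), ('Navel', 5.00);
""")

# Every apple is kept; unmatched apples get NULLs (None in Python)
# in the columns that come from oranges.
outer_rows = conn.execute("""
    select *
    from apples
    left outer join oranges
    on apples.Price = oranges.Price
    order by apples.Price
""").fetchall()

# Exclusion join: keep only the apples whose orange columns are NULL.
exclusion_rows = conn.execute("""
    select apples.Variety
    from apples
    left outer join oranges
    on apples.Price = oranges.Price
    where oranges.Price is null
""").fetchall()

print(outer_rows)      # [('Fuji', 5.0, 'Navel', 5.0), ('Gala', 6.0, None, None)]
print(exclusion_rows)  # [('Gala',)]
```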
Exclusion joins are not possible with inner joins
The above query is not possible with INNER JOIN. The following query does not accomplish the same thing:
select apples.Variety
from apples
inner join oranges
on apples.Price = oranges.Price
where apples.Price <> oranges.Price
In fact, this query will return nothing, because the join condition contradicts the WHERE clause. This query is not the same thing either:
select apples.Variety
from apples
inner join oranges
on apples.Price <> oranges.Price
Why? Because if there are no rows in oranges, nothing will be returned. It also returns an apple whenever any orange has a different price, even if another orange matches, and it returns that apple once per non-matching orange. It is simply not possible to write this query with an INNER join or an old-style join, no matter what technique is used. Don’t be fooled by analyzing the two data sets presented in this article; for some cases you may be able to get the same behavior, but not for all possible data sets. There is a way to write this query using subqueries, though:
select apples.Variety
from apples
where apples.Price not in (
select Price from oranges)
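To see where the `<>` join goes wrong, the sketch below runs the exclusion join, the `<>` inner join, and the NOT IN subquery against two versions of oranges: the sample data and an empty table. It assumes Python's built-in sqlite3 module; `run_all` is a helper name I made up, and the `order by` clauses are only there for a stable row order. It also assumes oranges.Price never holds a NULL, because NOT IN behaves differently when the subquery returns one.

```python
import sqlite3

QUERIES = {
    "exclusion join": """
        select apples.Variety
        from apples
        left outer join oranges
        on apples.Price = oranges.Price
        where oranges.Price is null
        order by apples.Variety
    """,
    "<> inner join": """
        select apples.Variety
        from apples
        inner join oranges
        on apples.Price <> oranges.Price
        order by apples.Variety
    """,
    "NOT IN subquery": """
        select apples.Variety
        from apples
        where apples.Price not in (
            select Price from oranges)
        order by apples.Variety
    """,
}

def run_all(orange_rows):
    """Hypothetical helper: load the given oranges and run every query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table apples  (Variety text, Price real);
        create table oranges (Variety text, Price real);
        insert into apples values ('Fuji', 5.00), ('Gala', 6.00);
    """)
    conn.executemany("insert into oranges values (?, ?)", orange_rows)
    results = {name: [row[0] for row in conn.execute(sql)]
               for name, sql in QUERIES.items()}
    conn.close()
    return results

sample = run_all([("Valencia", 4.00), ("Navel", 5.00)])
empty = run_all([])

print(sample)
# {'exclusion join': ['Gala'], '<> inner join': ['Fuji', 'Gala', 'Gala'],
#  'NOT IN subquery': ['Gala']}
print(empty)
# {'exclusion join': ['Fuji', 'Gala'], '<> inner join': [],
#  'NOT IN subquery': ['Fuji', 'Gala']}
```

The exclusion join and the NOT IN subquery agree on both data sets. The `<>` join returns Fuji even though Navel matches its price, lists Gala twice, and returns nothing at all when oranges is empty.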
Outer joins and subqueries
Why use a LEFT OUTER join instead of a subquery? Depending on the query, a subquery may have to be evaluated for every row in the left-hand table (especially a correlated subquery, which refers to values from the left-hand table). A LEFT OUTER join, by contrast, can often use a much more efficient query plan. Again, the two forms may be mathematically equivalent, and a good query optimizer may generate the same query plan for both, but this is not always the case. It depends heavily on the query, the optimizer, and how the tables are indexed. I have seen queries perform orders of magnitude better when rewritten with an exclusion join.
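The only way to know what your optimizer does is to ask it. Here is a rough sketch of how to compare the plans for the two forms, assuming SQLite through Python's built-in sqlite3 module. `explain query plan` is SQLite's own command (other systems have their own variant of EXPLAIN), and the index `oranges_price` and the helper `query_plan` are hypothetical names I added for illustration. The plan text varies between SQLite versions, so I show no expected output.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for a real server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table apples  (Variety text, Price real);
    create table oranges (Variety text, Price real);
    insert into apples  values ('Fuji', 5.00), ('Gala', 6.00);
    insert into oranges values ('Valencia', 4.00), ('Navel', 5.00);
    -- Hypothetical index, added only so the optimizer has something to use.
    create index oranges_price on oranges (Price);
""")

exclusion_join = """
    select apples.Variety
    from apples
    left outer join oranges
    on apples.Price = oranges.Price
    where oranges.Price is null
"""

not_in_subquery = """
    select apples.Variety
    from apples
    where apples.Price not in (
        select Price from oranges)
"""

def query_plan(sql):
    """Return the human-readable steps of SQLite's plan for a query."""
    rows = conn.execute("explain query plan " + sql).fetchall()
    # The description is the last column of each plan row.
    return [row[-1] for row in rows]

for name, sql in [("exclusion join", exclusion_join),
                  ("NOT IN subquery", not_in_subquery)]:
    print(name)
    for step in query_plan(sql):
        print("   ", step)
```

On tables this small the plans say nothing about speed. The point is the habit: run both forms through your own database's plan output, against realistic data and indexes, before deciding which one to keep.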

What a helpful piece! I’m relatively new to SQL so it was great to see someone go in depth and explain what I was trying to do. Keep it up!
I was looking for a way to exclude rows without using the NOT IN. The LEFT OUTER join is yielding the correct results so far. Thanks for the article.
Nice article. I was looking for the same thing. Keep it up.
Very helpful to exclude rows when combining two tables.
Very useful! I was looking for a way to remove rows that had no association anymore, and the LEFT OUTER JOIN you posted is exactly what I needed. thanks!
About my previous comment, just to note that it is not a simple case of replacing SELECT with DELETE; you also have to tell DELETE which table to actually remove rows from. As in:
DELETE table1 FROM table1 LEFT OUTER JOIN table2 …. etc
Great article, really helped me shift from using multiple inner join queries for exclusion joins to single left outer joins to accomplish the same thing.
Actually, OUTER JOINS were available in many RDBMS packages before SQL92, but the SQL syntax for those joins was non-standard. Oracle used the weird (+) operator. MySQL supported this syntax until 4.0, but it was removed in 4.1 in favor of support only for the SQL standard.
The Oracle 7.3 syntax was:
SELECT *
FROM t1,
t2
WHERE t1.some_id = t2.some_id (+)
Thanks for the article, it helped me a lot because I am MYSTIFIED by joins.
My question is sort of an aside:
The only reason I use joins is for these specific cases, where they offer me the ability to exclude or miss fields. Conceptually, I think the JOIN syntax makes no sense at all, and is kind of a lame convenience kludge. I have always naturally used the ‘old school’ method of comparing tables because that one has a simple standard logic of joining, grouped in the where clause.
My question is, can you explain to me why you (or lots of people) think the JOIN syntax itself is somehow more sensical than the ‘old-style’ way, because I don’t see it.
Steve, the JOIN clauses can be written as WHERE clauses for inner joins, but I prefer to separate them anyway. It helps me see more clearly which code is there to match rows between tables, and which code is for filtering out rows from the logical result. (Even though I know that the query optimizer really applies the filters as it performs the join, not after, it makes complicated joins easier for me to write if I pretend it doesn’t).
I have a query which works fine in Oracle, but it is hogging memory in MySQL. How can I write this query in MySQL in an efficient way?
delete from table1
where not exists(select * from table2 where table1.id = table2.id)
Thanks in advance
Praveen