If you’re new to this series, you should start by reading our introductory post.
In the previous posts in this series (on single table joins and multi-column joins) we had matching records in our two tables.
But sometimes, you may not have data in your second table. This kind of thing can come up when (for example):
- All vendors don’t have a 1099 code
- All employees aren’t assigned to a department
- All GL accounts don’t have a sub account or cost center.
And this is where a left outer join comes in handy. We use it when we may or may not have a record in the second table, but we want all the data from the primary table.
To illustrate, let’s look at two tables: Customers (from the Northwind sample database) and Red3States (which I created).
As you can see, it has 91 rows. The region column is returning “NULL” in some rows. And in some places, it looks like it’s returning state abbreviations (e.g. “WA” for Customer LazyK and “CA” for Customer LETSS).
Let’s add the state name to the query:
Luckily, I have 50 rows in my state table.
Now, let’s put the two together:
So, we have state names that make sense. But only received 13 rows back because those were the only rows where the region was actually a state name.
Therefore, we need to change the join, using a left outer join instead:
Now, “OR” matches to “Oregon” and “SP” returns ‘NULL,” meaning no value was found. And we’re back to 91 rows, which is what we want.
The basic rule: If you’re adding a table to your query—and you don’t expect a match to always exist—use the left outer join.