MySQL Outer Joins Accross Multiple Tables


This is the third time in as many months that I’ve had to read the MySQL Reference Manual section on JOIN Syntax, so it’s time to summarize.

For example, let’s say you have tables:

books

+----+-----------------------------------------+-----------+
| id | title                                   | is_unread |
+----+-----------------------------------------+-----------+
|  1 | Ulysses                                 |         1 | 
|  2 | The Great Gatsby                        |         0 | 
|  3 | A Portrait of the Artist as a Young Man |         1 | 
|  4 | Lolita                                  |         1 | 
|  5 | Brave New World                         |         0 | 
+----+-----------------------------------------+-----------+

tags

+----+------------+
| id | name       |
+----+------------+
|  1 | classic    | 
|  2 | fiction    | 
|  3 | irish      | 
|  4 | dystopia   | 
|  5 | literature | 
+----+------------+

books_tags

+---------+--------+
| book_id | tag_id |
+---------+--------+
|       1 |      2 | 
|       1 |      3 | 
|       2 |      2 | 
|       2 |      1 | 
|       3 |      2 | 
|       3 |      3 | 
|       4 |      2 | 
|       5 |      1 | 
|       5 |      2 | 
|       5 |      4 | 
+---------+--------+

and we want to know how many books there are with each tag in the database. We could say

SELECT t.*, COUNT(bt.book_id)
FROM tags t
JOIN books_tags bt ON t.id = bt.tag_id
GROUP BY t.id;

and get

+----+----------+----------------+
| id | name     | count(book_id) |
+----+----------+----------------+
|  1 | classic  |              2 | 
|  2 | fiction  |              5 | 
|  3 | irish    |              2 | 
|  4 | dystopia |              1 | 
+----+----------+----------------+
4 rows in set (0.00 sec)

but we’re missing a tag! This is easily solved by using a LEFT JOIN, like so:

SELECT t.*, COUNT(bt.book_id)
FROM tags t
LEFT JOIN books_tags bt ON t.id = bt.tag_id
GROUP BY t.id;

+----+------------+----------------+
| id | name       | count(book_id) |
+----+------------+----------------+
|  1 | classic    |              2 | 
|  2 | fiction    |              5 | 
|  3 | irish      |              2 | 
|  4 | dystopia   |              1 | 
|  5 | literature |              0 | 
+----+------------+----------------+
5 rows in set (0.00 sec)

But what if you only want to consider books you’ve read? Now we must also join against the books table.
You may be tempted to try

SELECT t.*, COUNT(bt.book_id)
FROM tags t
LEFT JOIN books_tags bt ON t.id = bt.tag_id
LEFT JOIN books b ON b.id = bt.book_id
WHERE is_unread = 0
GROUP BY t.id;

but that simply returns

+----+----------+----------------+
| id | name     | count(book_id) |
+----+----------+----------------+
|  1 | classic  |              2 | 
|  2 | fiction  |              2 | 
|  4 | dystopia |              1 | 
+----+----------+----------------+

To return all tags the trick is to move the condition into the join clause and then count what’s returned from the third table. Thus

SELECT t.*, COUNT(b.id)
FROM tags t
LEFT JOIN books_tags bt ON t.id = bt.tag_id
LEFT JOIN books b ON b.id = bt.book_id AND b.is_unread = 0
WHERE 1 GROUP BY t.id;

+----+------------+-------------+
| id | name       | count(b.id) |
+----+------------+-------------+
|  1 | classic    |           2 | 
|  2 | fiction    |           2 | 
|  3 | irish      |           0 | 
|  4 | dystopia   |           1 | 
|  5 | literature |           0 | 
+----+------------+-------------+

I’ve read two classics, two works of fiction, and one about dystopia. Doesn’t that seem like a lot?

What if we want to go a step further? Let’s add authors:

authors:

+----+---------------------+
| id | name                |
+----+---------------------+
|  1 | James Joyce         | 
|  2 | F. Scott Fitzgerald | 
|  3 | Vladimir Nabokov    | 
|  4 | Aldous Huxley       | 
+----+---------------------+

authors_books:

+-----------+---------+
| author_id | book_id |
+-----------+---------+
|         1 |       1 | 
|         2 |       2 | 
|         1 |       3 | 
|         3 |       4 | 
|         4 |       5 | 
+-----------+---------+

If we want to do a standard inner join against the authors table (we don’t want books without authors, but we still want tags without books) we can do so with:

SELECT t.*, COUNT(b.id)
FROM tags t LEFT JOIN books_tags bt ON t.id = bt.tag_id
LEFT JOIN (books b, authors_books ab, authors a) ON b.id = bt.book_id AND b.id = ab.book_id AND ab.author_id = a.id AND a.name = ‘F. Scott Fitzgerald’ AND b.is_unread = 0
WHERE 1
GROUP BY t.id;


powered by WordPress     themed by Mukkamu     presented by ideaharbor.org     everything else by steve hulet