Chapter 2
Chapter 2 — INNER JOIN and LEFT JOIN
Chapter 1 split the bookshop into ten normalised tables. Now comes the payoff: JOINs reunite that data at query time, combining rows from multiple tables into a single result set. INNER JOIN and LEFT JOIN cover 90% of real-world multi-table queries. This chapter works through both in depth — what they return, when to use each, and the patterns that will appear in almost every query you write.
Why JOINs Exist
Normalisation splits data across tables to eliminate redundancy. The publisher's name lives in publishers, not in every books row. That's correct design — but when a user asks "what books does Doubleday publish?", we need both tables. A JOIN tells MySQL: for each row in books, find the matching row in publishers and glue them together.
The result of a JOIN looks like a single wide table. It isn't stored anywhere — MySQL builds it on the fly for this query, then discards it. The underlying tables stay exactly as they are.
SELECT b.title, p.name FROM books b, publishers p WHERE b.publisher_id = p.id; -- Mix of join condition and filter in WHERE -- If you forget the WHERE: 6 books × 4 publishers = 24 rows
SELECT b.title, p.name FROM books b JOIN publishers p ON p.id = b.publisher_id; -- Join condition is ON, filter conditions go in WHERE -- Clear, intentional, impossible to forget the ON
INNER JOIN
INNER JOIN returns only rows where the ON condition is satisfied in both tables. If a book has a NULL publisher_id, it won't appear. If a publisher has no books, it won't appear. Only matched pairs make it into the result.
Basic syntax
| title | publisher | price | year_published |
|---|---|---|---|
| Murder on the Orient Express | Hodder & Stoughton | 7.49 | 1934 |
| Foundation | Doubleday | 8.99 | 1951 |
| Dune | Ace Books | 9.99 | 1965 |
| Neuromancer | Ace Books | 7.99 | 1984 |
| Hyperion | Doubleday | 9.99 | 1989 |
| Gone Girl | Doubleday | 8.49 | 2012 |
JOIN and INNER JOIN identically. Most developers write JOIN for inner joins and spell out LEFT JOIN / RIGHT JOIN / CROSS JOIN in full — the asymmetry makes the type visible at a glance.
Adding a WHERE clause — filtering after joining
| title | publisher | country |
|---|---|---|
| Murder on the Orient Express | Hodder & Stoughton | UK |
Joining three tables — books + authors via book_authors
Books and authors have a many-to-many relationship through book_authors. To get author names on books, we join through the junction table:
| title | author | price |
|---|---|---|
| Foundation | Isaac Asimov | 8.99 |
| Murder on the Orient Express | Agatha Christie | 7.49 |
| Gone Girl | Gillian Flynn | 8.49 |
| Neuromancer | William Gibson | 7.99 |
| Dune | Frank Herbert | 9.99 |
| Hyperion | Dan Simmons | 9.99 |
AND condition on the ON clause is applied during the join itself — it's more efficient than filtering in WHERE.Aggregating across a join — average review rating per book
| title | avg_rating | review_count |
|---|---|---|
| Foundation | 5.0 | 1 |
| Murder on the Orient Express | 5.0 | 1 |
| Dune | 4.5 | 2 |
| Neuromancer | 4.0 | 1 |
Hyperion and Gone Girl are missing — they have no reviews yet. That's INNER JOIN's defining behaviour. To include them, we need LEFT JOIN.
LEFT JOIN
LEFT JOIN returns every row from the left table, regardless of whether a match exists in the right table. Where there is no match, all columns from the right table appear as NULL. This is exactly what you want when the relationship is optional: a book may or may not have reviews; you still want the book in the result.
All books with their average rating — including unreviewed books
| title | avg_rating | review_count |
|---|---|---|
| Dune | 4.5 | 2 |
| Foundation | 5.0 | 1 |
| Murder on the Orient Express | 5.0 | 1 |
| Neuromancer | 4.0 | 1 |
| Gone Girl | NULL | 0 |
| Hyperion | NULL | 0 |
COUNT(*) counts the NULL row for unmatched left-side entries — giving 1 instead of 0 for books with no reviews. COUNT(r.id) counts only non-NULL values in the right table's column — correctly giving 0 when there are no matching reviews. Always use COUNT(column_from_right_table) after a LEFT JOIN aggregation.
COALESCE — replacing NULLs in output
All customers with their order count — including customers who never ordered
| customer | order_count |
|---|---|
| Alice Martin | 2 |
| Bob Nguyen | 1 |
| Carol Patel | 1 |
LEFT JOIN with a publisher that has books with nullable publisher_id
The Anti-Join Pattern — LEFT JOIN + WHERE IS NULL
A LEFT JOIN where you filter for NULL on the right table's column finds rows in the left table that have no match at all in the right table. This is called an anti-join — it's how you answer "which X has no Y?"
| title | price |
|---|---|
| Gone Girl | 8.49 |
| Hyperion | 9.99 |
| Self-Published Manifesto | 4.99 |
r.id, o.id, and ba.book_id above are all primary or not-null keys — safe choices for the IS NULL test.
Joining on Non-Primary-Key Columns
The ON clause can match any column — not just primary and foreign keys. Joins on natural keys (ISBN, email, slug) and even on non-unique values are valid.
book_id in reviews) should have an index added explicitly, as we did in Chapter 1. For joins on natural keys like ISBN, add an index or UNIQUE constraint.
NULL Handling in JOINs
Common Mistakes
JOIN publishers p without ON p.id = b.publisher_id is a CROSS JOIN — every book row is combined with every publisher row. 6 books × 4 publishers = 24 rows of garbage. MySQL 8.0 will usually throw a syntax error for a missing ON, but MySQL 5.x may not. Always write the ON.
Practical Example — Order Summary Report
A real-world query combining multiple JOINs and LEFT JOINs to produce an order line-item report:
| customer | order_id | status | order_date | book | qty | line_total |
|---|---|---|---|---|---|---|
| Bob Nguyen | 3 | delivered | 2026-03-05 | Foundation | 1 | 8.99 |
| Bob Nguyen | 3 | delivered | 2026-03-05 | Murder on the Orient Express | 1 | 7.49 |
| Alice Martin | 1 | delivered | 2026-04-10 | Dune | 2 | 19.98 |
| Alice Martin | 1 | delivered | 2026-04-10 | Neuromancer | 1 | 7.99 |
| Alice Martin | 2 | shipped | 2026-05-20 | Hyperion | 1 | 9.99 |
| Carol Patel | 4 | pending | 2026-06-01 | Gone Girl | 1 | 8.49 |
Quick Reference — Chapter 2
| Pattern | What it does |
|---|---|
| JOIN … ON | INNER JOIN — returns only rows with a match in both tables. Excludes NULLs and unmatched rows from either side. |
| LEFT JOIN … ON | Returns all rows from the left table + matched rows from the right. Right-table columns are NULL where there is no match. |
| LEFT JOIN … WHERE right.id IS NULL | Anti-join — rows in the left table that have no matching row in the right table. Always filter on a NOT NULL column. |
| COUNT(right_table.id) | After a LEFT JOIN GROUP BY, counts only matched rows (0 for unmatched). COUNT(*) wrongly gives 1. |
| COALESCE(col, 'default') | Replaces NULL in output with a fallback value. Use on right-table columns in LEFT JOIN results. |
| table alias | FROM books b — alias b. Use b.column everywhere. Short aliases (b, p, a, c, o, r, oi) improve readability and remove ambiguity when tables share column names. |
| ON clause condition | The join condition. Can include AND for extra filters applied during the join (more efficient than WHERE for LEFT JOINs). Use WHERE for post-join filters on the result. |
| Three-table join | FROM a JOIN b ON … JOIN c ON … — chain as many JOINs as needed. Each adds one more table to the working result set. |
| NULL = NULL | Evaluates to NULL (unknown), not TRUE. A NULL FK never matches any row in an INNER JOIN. Use IS NULL / IS NOT NULL for null checks — never = NULL. |