Chapter 5

Chapter 5 — Common Table Expressions (CTEs)

A Common Table Expression — written with the WITH keyword — is a named, temporary result set that exists only for the duration of a single SQL statement. Think of it as giving a subquery a name so you can refer to it clearly, reuse it, and chain multiple steps together without nesting queries five levels deep.

MySQL added full CTE support in version 8.0, including the powerful recursive variant that can walk tree structures and graphs. If you're still on MySQL 5.7 you won't have CTEs — one more reason to upgrade.

Availability: CTEs (WITH clause, including recursive) require MySQL 8.0+. MariaDB added them in 10.2. Check your version: SELECT VERSION();

1. Basic Syntax — WITH … AS

WITH cte_name AS ( ← name you choose SELECT ... ← any valid SELECT FROM ... WHERE ... ) SELECT * FROM cte_name ← reference the CTE here WHERE ... ;

The CTE name is scoped to the statement that defines it. You cannot reference it from other queries or sessions. It disappears the moment the statement finishes.

Why bother? — CTE vs inline subquery

Both queries below produce identical results. The CTE version is dramatically easier to read, especially once queries grow:

-- ❌ Inline subquery — hard to parse, inner query buried SELECT customer, total_spent FROM ( SELECT CONCAT(c.first_name, ' ', c.last_name) AS customer, SUM(oi.quantity * oi.unit_price) AS total_spent FROM customers c JOIN orders o ON o.customer_id = c.customer_id JOIN order_items oi ON oi.order_id = o.order_id GROUP BY c.customer_id ) sub WHERE total_spent > 100 ORDER BY total_spent DESC;
-- ✅ CTE version — logic named up front, main query reads cleanly WITH customer_spend AS ( SELECT CONCAT(c.first_name, ' ', c.last_name) AS customer, SUM(oi.quantity * oi.unit_price) AS total_spent FROM customers c JOIN orders o ON o.customer_id = c.customer_id JOIN order_items oi ON oi.order_id = o.order_id GROUP BY c.customer_id ) SELECT customer, total_spent FROM customer_spend WHERE total_spent > 100 ORDER BY total_spent DESC;
Good CTE names are nouns, not verbs. Name the result set, not what you're doing: customer_spend, top_books, unpaid_orders — not get_customers or filter_orders. The name should describe what the data is.

2. Chaining Multiple CTEs

Multiple CTEs are separated by commas — only one WITH keyword, multiple named blocks. Each CTE can reference any CTE defined before it in the chain.

WITH first_cte AS ( SELECT ... ), second_cte AS ( SELECT ... FROM first_cte -- can reference first_cte ), third_cte AS ( SELECT ... FROM second_cte -- can reference first or second ) SELECT * FROM third_cte;

Example — monthly bestsellers with rank

A three-step chain: aggregate sales → rank by month → filter to top 3 per month:

WITH -- Step 1: total units sold per book per month monthly_sales AS ( SELECT b.book_id, b.title, DATE_FORMAT(o.order_date, '%Y-%m') AS month, SUM(oi.quantity) AS units_sold FROM order_items oi JOIN orders o ON o.order_id = oi.order_id JOIN books b ON b.book_id = oi.book_id GROUP BY b.book_id, b.title, month ), -- Step 2: rank books within each month ranked_sales AS ( SELECT month, title, units_sold, RANK() OVER (PARTITION BY month ORDER BY units_sold DESC) AS rnk FROM monthly_sales ) -- Final: keep only top 3 per month SELECT month, rnk, title, units_sold FROM ranked_sales WHERE rnk <= 3 ORDER BY month DESC, rnk;
RANK() is a window function — covered in full in Chapter 6. For now, just know it numbers rows within each partition (month) by the ORDER BY expression. Ranked CTEs like this are a very common pattern you'll write regularly.

Example — two-stage customer segmentation

WITH spend_by_customer AS ( SELECT c.customer_id, CONCAT(c.first_name, ' ', c.last_name) AS customer, COUNT(DISTINCT o.order_id) AS order_count, SUM(oi.quantity * oi.unit_price) AS total_spent FROM customers c JOIN orders o ON o.customer_id = c.customer_id JOIN order_items oi ON oi.order_id = o.order_id GROUP BY c.customer_id, customer ), segmented AS ( SELECT customer, order_count, total_spent, CASE WHEN total_spent >= 200 THEN 'VIP' WHEN total_spent >= 75 THEN 'Regular' ELSE 'Casual' END AS segment FROM spend_by_customer ) SELECT segment, COUNT(*) AS customers, ROUND(AVG(total_spent), 2) AS avg_spend FROM segmented GROUP BY segment ORDER BY FIELD(segment, 'VIP', 'Regular', 'Casual');
segmentcustomersavg_spend
VIP3284.50
Regular8112.75
Casual1438.20
CTEs are not materialised by default in MySQL 8. The optimiser may evaluate each CTE inline (as if it were a subquery) or materialise it into a temp table depending on cost. You can force materialisation with the MATERIALIZE optimiser hint if you want the CTE evaluated once and reused — useful when a CTE is referenced multiple times in the final query.

3. Recursive CTEs

A recursive CTE references itself. It's the standard SQL way to walk tree structures — category hierarchies, org charts, threaded comments, bill-of-materials — without application-side loops or multiple round-trips to the database.

Anchor member

The starting point — a normal SELECT that returns the root row(s). Runs once.

Recursive member

References the CTE by name to join the previous iteration's output back to the table. Runs repeatedly until it returns zero rows.

WITH RECURSIVE cte_name AS ( -- 1. Anchor: starting rows SELECT id, parent_id, name, 1 AS depth FROM tree_table WHERE parent_id IS NULL UNION ALL -- 2. Recursive: join previous result back to the table SELECT t.id, t.parent_id, t.name, c.depth + 1 FROM tree_table t JOIN cte_name c ON t.parent_id = c.id ) SELECT * FROM cte_name ORDER BY depth, name;
UNION ALL not UNION: Recursive CTEs must use UNION ALL (not UNION DISTINCT). Deduplication during recursion is not supported and would prevent the recursion from working correctly.

Setup — add a genre category tree

Add a self-referential genres table to the bookshop to model a category hierarchy:

CREATE TABLE IF NOT EXISTS genres ( genre_id INT PRIMARY KEY AUTO_INCREMENT, name VARCHAR(80) NOT NULL, parent_id INT DEFAULT NULL, FOREIGN KEY (parent_id) REFERENCES genres(genre_id) ); INSERT INTO genres (genre_id, name, parent_id) VALUES -- Root categories (1, 'Fiction', NULL), (2, 'Non-Fiction', NULL), (3, 'Children', NULL), -- Fiction sub-genres (4, 'Literary Fiction', 1), (5, 'Science Fiction', 1), (6, 'Fantasy', 1), (7, 'Crime & Thriller', 1), -- Non-Fiction sub-genres (8, 'History', 2), (9, 'Science', 2), (10, 'Self-Help', 2), -- Science Fiction sub-sub-genres (11, 'Space Opera', 5), (12, 'Cyberpunk', 5), (13, 'Hard SF', 5), -- Fantasy sub-sub-genres (14, 'Epic Fantasy', 6), (15, 'Urban Fantasy', 6);

Walk the entire tree — with indented display

WITH RECURSIVE genre_tree AS ( -- Anchor: root genres (no parent) SELECT genre_id, name, parent_id, 0 AS depth, CAST(name AS CHAR(500)) AS path -- breadcrumb path FROM genres WHERE parent_id IS NULL UNION ALL -- Recursive: join children of the current level SELECT g.genre_id, g.name, g.parent_id, gt.depth + 1, CONCAT(gt.path, ' > ', g.name) FROM genres g JOIN genre_tree gt ON g.parent_id = gt.genre_id ) SELECT genre_id, CONCAT(REPEAT(' ', depth), name) AS name_indented, depth, path FROM genre_tree ORDER BY path;
genre_idname_indenteddepthpath
3Children0Children
1Fiction0Fiction
7 Crime & Thriller1Fiction > Crime & Thriller
6 Fantasy1Fiction > Fantasy
14 Epic Fantasy2Fiction > Fantasy > Epic Fantasy
15 Urban Fantasy2Fiction > Fantasy > Urban Fantasy
5 Science Fiction1Fiction > Science Fiction
13 Hard SF2Fiction > Science Fiction > Hard SF
12 Cyberpunk2Fiction > Science Fiction > Cyberpunk
11 Space Opera2Fiction > Science Fiction > Space Opera
2Non-Fiction0Non-Fiction
8 History1Non-Fiction > History
9 Science1Non-Fiction > Science
10 Self-Help1Non-Fiction > Self-Help

Find all descendants of a single genre

Change the anchor to start at a specific node — the recursion will collect everything below it:

-- All sub-genres under "Fiction" (genre_id = 1) WITH RECURSIVE subtree AS ( SELECT genre_id, name, parent_id, 0 AS depth FROM genres WHERE genre_id = 1 -- ← start here UNION ALL SELECT g.genre_id, g.name, g.parent_id, s.depth + 1 FROM genres g JOIN subtree s ON g.parent_id = s.genre_id ) SELECT genre_id, name, depth FROM subtree WHERE depth > 0 -- exclude the root itself ORDER BY depth, name;

Find the ancestors of a leaf node (walk upward)

Flip the direction: start at a leaf and JOIN to the parent on each iteration:

-- Breadcrumb path from "Space Opera" up to root WITH RECURSIVE ancestors AS ( SELECT genre_id, name, parent_id, 0 AS depth FROM genres WHERE name = 'Space Opera' UNION ALL SELECT g.genre_id, g.name, g.parent_id, a.depth - 1 FROM genres g JOIN ancestors a ON g.genre_id = a.parent_id -- walk UP via parent_id ) SELECT name, depth FROM ancestors ORDER BY depth;
namedepth
Fiction-2
Science Fiction-1
Space Opera0

4. Depth Limiting and Cycle Detection

Recursive CTEs will loop forever if your data contains a cycle (A is the parent of B, B is the parent of A) or if the recursion has a bug. MySQL enforces a hard limit via the cte_max_recursion_depth system variable (default: 1000 iterations) — when hit, the query errors rather than running forever. For production code, add explicit guards.

Limit depth with a WHERE condition

WITH RECURSIVE genre_tree AS ( SELECT genre_id, name, parent_id, 0 AS depth FROM genres WHERE parent_id IS NULL UNION ALL SELECT g.genre_id, g.name, g.parent_id, gt.depth + 1 FROM genres g JOIN genre_tree gt ON g.parent_id = gt.genre_id WHERE gt.depth < 5 -- ← hard stop at depth 5 ) SELECT * FROM genre_tree ORDER BY depth, name;

Detect cycles with a path string

Track visited IDs in a path string. Before continuing, check that the next node's ID isn't already in the path:

WITH RECURSIVE safe_walk AS ( SELECT genre_id, name, parent_id, 0 AS depth, CAST(genre_id AS CHAR(1000)) AS visited_ids, FALSE AS is_cycle FROM genres WHERE parent_id IS NULL UNION ALL SELECT g.genre_id, g.name, g.parent_id, sw.depth + 1, CONCAT(sw.visited_ids, ',', g.genre_id), FIND_IN_SET(g.genre_id, sw.visited_ids) > 0 AS is_cycle FROM genres g JOIN safe_walk sw ON g.parent_id = sw.genre_id WHERE sw.is_cycle = FALSE -- stop if cycle found ) SELECT * FROM safe_walk WHERE is_cycle = FALSE;
Change the recursion depth limit for deep trees: SET SESSION cte_max_recursion_depth = 5000; Only increase this for legitimate deep hierarchies — not to paper over an infinite loop bug. Always add a depth guard in your recursive member too.

5. Practical CTE Patterns

Generate a date series (no date table needed)

-- Generate every day in June 2026 WITH RECURSIVE date_series AS ( SELECT DATE('2026-06-01') AS dt UNION ALL SELECT DATE_ADD(dt, INTERVAL 1 DAY) FROM date_series WHERE dt < '2026-06-30' ) -- Left join to orders so days with zero orders still appear SELECT ds.dt, COALESCE(COUNT(o.order_id), 0) AS orders_placed FROM date_series ds LEFT JOIN orders o ON DATE(o.order_date) = ds.dt GROUP BY ds.dt ORDER BY ds.dt;
Date series are one of the most useful recursive CTE patterns. Generating a spine of dates and left-joining to transactional data ensures your report includes every day — even ones with no activity — without needing a pre-populated calendar table.

Generate a number series

-- Numbers 1 to 100 — useful for cross joins, test data, pagination WITH RECURSIVE nums AS ( SELECT 1 AS n UNION ALL SELECT n + 1 FROM nums WHERE n < 100 ) SELECT n FROM nums;

CTE in an UPDATE or DELETE

CTEs can prefix UPDATE and DELETE statements too — useful for complex filtering:

-- Mark reviews as 'verified' where the customer actually purchased the book WITH verified_purchases AS ( SELECT DISTINCT oi.book_id, o.customer_id FROM order_items oi JOIN orders o ON o.order_id = oi.order_id ) UPDATE reviews r JOIN verified_purchases vp ON vp.book_id = r.book_id AND vp.customer_id = r.customer_id SET r.verified = TRUE WHERE r.verified = FALSE;

Quick Reference

ConceptSyntax / ruleNotes
Basic CTE WITH name AS ( SELECT … ) SELECT … FROM name Scoped to the single statement. Vanishes after execution.
Multiple CTEs WITH a AS (…), b AS (…) SELECT … One WITH, comma-separated. Later CTEs can reference earlier ones.
Recursive CTE WITH RECURSIVE name AS ( anchor UNION ALL recursive ) SELECT … MySQL 8.0+ only. Must use UNION ALL. Runs until recursive member returns zero rows.
Depth limit WHERE depth < N in the recursive member Always add this for production queries on real (potentially dirty) data.
System depth cap SET SESSION cte_max_recursion_depth = N; Default 1000. Increase for legitimately deep trees, not to hide bugs.
Path / cycle tracking Build a visited_ids string; check with FIND_IN_SET() Guards against cycles in data that should be acyclic but isn't.
Date/number series Recursive CTE starting at a seed value, adding 1 or 1 day each iteration Left-join against transactions to include zero-activity periods in reports.
CTE + UPDATE/DELETE WITH … UPDATE table JOIN cte … Useful for complex filter logic before modifying rows. MySQL syntax — no FROM needed.
Materialisation hint WITH name AS ( SELECT /*+ MATERIALIZE */ … ) Forces MySQL to evaluate the CTE into a temp table once. Useful if referenced multiple times.
Next: Chapter 6 — Window functions: OVER, PARTITION BY, ROW_NUMBER, RANK, DENSE_RANK, LAG, LEAD, and running totals with SUM OVER.