Recursive Common Table Expressions are immensely useful when you’re querying hierarchical data. Let’s explore what makes them work.
Common Table Expressions (CTEs) are some of the most useful constructions in SQL. Their main purpose is improving query design, which makes queries easier to read. One of the reasons CTEs are so popular is that they let you divide longer queries into shorter subqueries. These are easier to read and edit. (For a detailed introduction to CTEs, read this article.)
Another great feature of CTEs is that they can be recursive. In other words, a recursive query can refer back to itself until a certain threshold is reached. If you need to perform the same function numerous times, you only need to write the code once.
This type of CTE – the recursive CTE – will be the topic of this post.
Recursive CTE Syntax
Remember, CTEs are essentially named subqueries that function like a temporary view. When you write them (or when you’re looking for them in the code), look for the
WITH clause, followed by a
Recursive CTEs are usually used in collections that have some type of hierarchy, like bosses and employees, folder structures, or thread structures in an online forum.
Let’s say we have a database for a company that has a few levels of employee hierarchy:
We want to get a list of all workers and their immediate bosses. A recursive CTE is ideal for this task. This is what it would look like:
WITH all_workers AS ( SELECT Id, BossId, FirstName, LastName, 0 as Org_Level FROM Employees WHERE BossId is NULL UNION ALL SELECT e.Id, e.BossId, e.FirstName, e.LastName, Org_Level + 1 FROM Employees e INNER JOIN all_workers r ON e.BossId = r.Id ) SELECT * FROM all_workers
At first glance, this query can be a little complicated. Let’s analyze each statement:
SELECT Id, BossId, FirstName, LastName, 0 as Org_Level FROM Employees WHERE BossId is NULL
The first statement determines the “root” elements, or those at the top level of the hierarchy. In our example, these are the employees who don’t report to a boss, i.e. the Big Bosses themselves. We will expand our structure by working down from these elements.
UNION ALL SELECT e.Id, e.BossId, e.FirstName, e.LastName, Org_Level + 1 FROM Employees e INNER JOIN all_workers r ON e.BossId = r.Id
Here we’ve got our recursive query, which is correlated with the result of the previous query. At this point, we have our root elements (the Big Bosses). We’ll now create a new collection of employees, but this time it is connected to the previous collection. This is the recursive element: it will keep on connecting subordinates with their subordinates until it gets to the lowest level of the hierarchy.
UNION ALL command merges the results from each recursive round in a final result table. (Remember: in every step, a recursive CTE only links elements returned by the previous step. It does not go back to the beginning and link everyone with the root-level Big Bosses; it only links them to their immediate supervisor or manager.)
The result table should look like this:
This table only has three hierarchy levels. But what if our table had, say, 3,000 levels? We would need to stop it at some point. Every recursive CTE statement needs an ending condition, one that defines how many iterations are needed. When the number of iterations is hit, the query stops running. The
MAXRECURSION function is very helpful in this situation. All you need to do is add a line after the
SELECT clause that states how many recursions (iterations) you need:
OPTION (MAXRECURSION 2)
If we apply this condition to our previous CTE, the result table will look like this:
Adapting Recursive CTEs to Different Databases
Different SQL databases have different syntaxes, so don’t be surprised if your recursive CTE needs to be modified a bit. For example:
- In PostgreSQL and MySQL, use
WITH RECURSIVEinstead of
WITHto define a recursive CTE.
- Also in PostgreSQL, use
MAXRECURSIONto set the maximum number of iterations.
- In Oracle, you need to add a condition in the
WHEREclause to define iterations; the
MAXRECURSIONstatement won’t work.
CTEs are very powerful SQL tools, especially when used recursively. Recursion is perfect for handling huge tables with many hierarchical levels. To learn more about CTEs, check out Vertabelo Academy’s Recursive Queries interactive course!