FROM: A Clause with Plenty of Possibilities

hands chosing colorful donuts from paper boxes, many options and possibilities,

Think that FROM only specifies tables in an SQL statement? Think again …

When I first began to work with SQL, I saw the FROM clause as the simplest part of any query. You’d put only one table there, or may be two or more tables in the case of a join. That was all I thought about the FROM clause in those days.

As I began to learn more about SQL, I discovered that there are a lot of techniques that can be used in a FROM. Often, these techniques are the key that unlocks the solution of a complex query.

In this post, we’re going to consider everything that goes into a FROM clause: one table, multiple tables (in a join), a view, a materialized view, a temporary table, a subquery, or a blend of any of the above.

We will use the following tables, team and game, to illustrate our examples.

team table:

team_id name founded
1 Lions 01-01-1900
2 Penguins 04-01-1989
3 Blackbirds 30-12-1950

game table:

team_1 team_2 match_day stadium points_1 points_2
1 2 05-03-2017 PenguinHouse 2 1
1 3 12-03-2017 SuperArena 1 0
2 3 19-03-2017 SuperArena 1 1

The Simplest FROM: Just One Table

The simplest way to use the FROM clause is to put just one table in it. Suppose we want to obtain all the team names from the above table. We can use the following simple query:

SELECT name FROM team

RESULTS

Lions
Penguins
Blackbirds

Multiple Tables: A Join

We can also have more than one table in the FROM clause. When we need to read data from two or more tables, we join these tables by listing them in the FROM:

FROM team, game

A join is a really important operation in SQL and it is one of the building blocks of the relational database model. Please note that when join tables we also need to include in a WHERE or ON clause to associate records from one table with their corresponding records in another table. In our example, we use the WHERE clause to join records from the team and game tables. This specifies a join based on the values in the team_id and team_1 columns:

WHERE team.team_id = game.team_1

If we put two tables in the FROM clause and we forget the join condition, we have a problem. The query will return a result set called a Cartesian product, which includes all the possible combinations of all matching records in both tables. Cartesian products should almost always be avoided, although under some specific situations they can be necessary.

Here is an example of a proper join. Notice the WHERE clause:

SELECT *
FROM team, game
WHERE team.team_id = game.team_1

RESULTS

team_id	name	founded	    team_1	team_2	match_day	stadium	            points_1	   points_2
   1	Lions	1900-01-01	1	2	2017-03-05	PenguinHouse		2	     1
   1	Lions	1900-01-01	1	3	2017-03-12	SuperArena		1	     0
   1	Lions	1900-01-01	1	2	2017-03-05	PenguinHouse		2	     1

You can also use an alternative syntax to join tables. It looks different, but it returns the same results. This alternative syntax uses two reserved words: JOIN and ON. Here is the same query presented in the alternative syntax:

SELECT *
FROM team JOIN game ON team.team_id = game.team_1

Both syntaxes are equivalent in terms of results. However, using the JOIN clause gives you some extra options, like using an OUTER clause to manage rows that have no corresponding records.

The Cartesian Product

Let’s go back to the Cartesian product. When would we want this in our result set? Suppose we are planning a tournament where every team must play against every other team. We can obtain a list of all possible matches (a Cartesian product) with the following SQL:

SELECT t1.name, t2.name 
FROM team t1, team t2

RESULTS

Lions		Lions
Lions		Penguins
Lions		Blackbirds
Blackbirds	Lions
Blackbirds	Penguins
Blackbirds	Blackbirds
Penguins	Lions
Penguins	Penguins
Penguins	Blackbirds

This example is a pure Cartesian product. However, there are some pairs that we don’t want in the result – e.g. having a team play against itself (Lions vs. Lions) or having a redundant match (Lions vs. Penguins and Penguins vs. Lions). To exclude these pairs, we add a condition to the WHERE clause:

SELECT t1.name, t2.name 
FROM team t1, team t2
WHERE t1.name > t2.name

RESULTS

Lions		Blackbirds
Penguins	Blackbirds
Penguins	Lions	

The condition in the WHERE clause discards pairs of teams playing against itself or for duplicate matches to appear in our list.

A Self Join

A self join is a join where both tables are the same, mainly when a table have a foreign key pointing to the same table. Self joins can also be made by joining one table to itself. For example, you may use a self join to compare two columns in the same table against each other.

Suppose we have a table called employee with columns called employee_id and manager_id. The following SQL uses a self join to compare these columns:

SELECT 	E1.name as employee, E2.name as boss
FROM    	employee E1, employee E2
WHERE 	E1.manager_id = E2.employee_id

E1 and E2 are aliases for the employee table. SQL does not allow you to use the same table name more than once per statement, so you must use table aliases to get around that rule. In this example, SQL is treating the employee table as two different tables. The aliases differentiate between the “two tables”.

Using a View in the FROM Clause

Another option is to use a named view in the FROM clause. A view appears to be a table but is not. A view is part of a table – it is actually a subset of records displayed by a SELECT – but it doesn’t store records. As far as the FROM clause is concerned, a view is treated like a table.

Let’s see an example. Suppose we need a view that shows any teams that have never lost a game. We will create a view titled invictus to select the appropriate teams. The code is:

CREATE VIEW invictus AS
SELECT name 
FROM team T
WHERE NOT EXISTS (
SELECT * FROM game G 
WHERE (G.team_1 = T.team_id AND G.points_1 < G.points_2) 
              		        OR (G.team_2 = T.team_id AND G.points_1 > G.points_2)
)

Now we can use the view to obtain the unbeaten teams:

SELECT * FROM invictus

RESULTS

Lions

Using a Materialized View in the FROM Clause

A materialized view is similar to a view; it is also created using SELECT statements on base tables. However, materialized views can store records. These records can be obtained or calculated from regular tables or even other views, so materialized views introduce a kind of redundancy. The records in materialized views are synchronized with the relevant base tables using the REFRESH statement.

We can use a materialized view in the FROM clause, just as we did in the previous section. Let’s create a materialized view called mvw_results using the columns name and match_day. We will create a new column, result, which will store a “WON”, “TIE” or “LOST” value for every game played by the team.

CREATE MATERIALIZED VIEW  mvw_results AS
	-- won matches
	SELECT team_1 as team_id, match_day, ‘WON’ 
FROM    game WHERE points_1 > points_2
UNION
	SELECT team_2 as team_id, match_day, ‘WON’ 
FROM    game WHERE points_2 > points_2
-- tied matches
	UNION 
	SELECT team_1 as team_id, match_day, ‘TIE’ 
FROM    game WHERE points_1 = points_2
UNION
	SELECT team_2 as team_id, match_day, ‘TIE’ 
FROM    game WHERE points_2 = points_2
-- tied matches
	UNION 
	SELECT team_2 as team_id, match_day, ‘LOST’ 
FROM    game WHERE points_1 > points_2
UNION
	SELECT team_1 as team_id, match_day, ‘LOST’ 
FROM    game WHERE points_2 > points_2

Using a Subquery in the FROM Clause

We can also enclose a SELECT in parenthesis, assign a name to it (thus making it a named subquery), and put it in the FROM clause. The database will treat it as a regular table.

Let’s see an example. Suppose we want to rank every team in the tournament. To do this, we need to calculate how many points each team has. Two points are awarded for each win and one for each tie.

First, we will create two subqueries: one called q_won that uses the columns name of team and quantity of won matches, and another called q_tied with the same columns. Let’s use the previous materialized view when we create the two subqueries in the FROM clause:

SELECT 	
T.name, 
(COALESCE(sq_won.quantity_won,0) * 2  + COALESCE(sq_tie.quantity_tied,0) * 1 ) points
FROM   (team T 
	LEFT OUTER JOIN (SELECT team_id , count(*) quantity_won 
                   FROM mvw_results 
                   WHERE result ='WON'
             GROUP BY team_id) AS sq_won 
           ON T.team_id = sq_won.team_id )
           LEFT OUTER JOIN ( SELECT team_id , count(*) quantity_tied 
                                     FROM mvw_results 
                                     WHERE result ='TIE'
                                     GROUP BY team_id) AS sq_tie 
            ON T.team_id = sq_tie.team_id
ORDER BY points DESC

Using A Temporary Table in the FROM Clause

The last option we will review is the use of a temporary table in the FROM clause. A temporary table is created during the current Postgres session; it belongs to the session, and other sessions cannot access it. Moreover, other sessions can have a temporary table with the same name.

Temporary tables are destroyed when the session ends or when a DROP TABLE statement is executed. In the following example, we will do the same query as we did with the subquery example. The difference is that we will use two temporary tables instead of the subqueries. Here is the code:

CREATE TEMPORARY TABLE  sq_won  AS 
SELECT team_id , count(*) quantity_won 
FROM mvw_results 
WHERE result ='WON'
GROUP BY team_id;

CREATE TEMPORARY TABLE  sq_tie  AS
SELECT team_id , count(*) quantity_tied 
FROM mvw_results 
WHERE result ='TIE'
GROUP BY team_id;

SELECT 	
T.name, 
(coalesce(sq_won.quantity_won,0) * 2  + coalesce(sq_tie.quantity_tied,0) * 1 ) points
FROM   (team T 
	LEFT OUTER JOIN sq_won ON T.team_id = sq_won.team_id )
            LEFT OUTER JOIN sq_tie    ON T.team_id = sq_tie.team_id
ORDER BY points DESC;

So, we did the same query using subqueries and using temporary tables. Is there any difference between these approaches? Yes. We can create indexes on temporary tables; if we need to improve performance, the temporary table approach enables that.

Try It Yourself

In this article, we saw the many things we can use in the FROM clause: one table, joined tables, views, materialized views, temporary tables, and subqueries. SQL is flexible and allows us to combine all these elements in a single FROM.

SQL offers several ways to write the same query. As a challenge to the reader, I’m going to mention that it is possible to obtain a materialized view using a CASE statement instead of a UNION. How would you do this?

For more SQL information, check out Vertabelo SQL Academy. It’s free to try and covers lots of good stuff!

Maria Alcaraz

Former Freelance Database Developer, Mother of 4 children