Refine Results with SQL Set Operators

SQL Set Operators

Using UNION, UNION ALL, EXCEPT, and INTERSECT to manage SQL query results.

The function of SQL set operators is pretty simple; they allow us to combine results from different SQL queries into one result set.

The best way to understand how set operators work is to use visual methods like the Venn diagram. For those of you not familiar with Venn diagrams, they are two circles that represent items or collections of items. The area where the circles overlap shows what the two groups have in common.

In this article, we’ll look at the following SQL set operators and how they work:

  • UNION
  • UNION ALL
  • EXCEPT
  • INTERSECT

We’ll use Venn diagrams to demonstrate how each operator functions and what makes it different from other set operators. Ready? Let’s start with a basic one, UNION.

UNION

The UNION operator takes two sets and returns the union of those sets. So if we have a group of celebrities in a table called “celebrities” (circle A) and a group of politicians in a table called “politicians” (circle B), a UNION of people from both tables would list those people who are famous politicians, famous non-politicians, and non-famous politicians – in other words, anyone who appears in either table.

The UNION operator answers OR questions affirmatively for either or both parts of the question. So if the question is “Do you know celebrities or politicians?” the answer would be “Yes” if you knew either celebrities or politicians (or both).


The UNION operator

To be more concrete, let’s use an example.

Let’s say the table “celebrities” consists of:

ID PERSON
1 Donald Trump
2 Lady Gaga
3 Beyoncé

And “politicians” consists of:

ID PERSON
1 Donald Trump
2 Bernie Sanders

We would use the following statement to get a list of people in either table:

SELECT PERSON FROM POLITICIANS
UNION
SELECT PERSON FROM CELEBRITIES 

Here are the results of that query:

PERSON
Donald Trump
Bernie Sanders
Lady Gaga
Beyoncé

The important thing here is that if the same person is in both tables the UNION statement will display only one instance of that person. That’s why there is only one Donald Trump record shown.

UNION ALL

UNION ALL is closely related to UNION. The only difference is that UNION ALL keeps the number of records as found in all tables. So in the above example, UNION ALL would return two Donald Trump records, not just one.

Notice the code:

SELECT PERSON FROM POLITICIANS
UNION ALL
SELECT PERSON FROM CELEBRITIES 

The returns would be:

PERSON
Donald Trump
Donald Trump
Bernie Sanders
Lady Gaga
Beyoncé

Note:
In terms of execution time, UNION ALL is much faster than UNION. This is because UNION sorts the data to eliminate any duplicates and sorting takes time. UNION ALL does not sort data.

EXCEPT

The EXCEPT statement takes things away. It reduces the result set from the first table by taking away records found in the second table. EXCEPT tells us who or what is not in a table. For example, it would answer the question “Who is a celebrity but not a politician?”

It is worth knowing that EXCEPT is replaced by MINUS in Oracle and some other RDBMS.

The Venn diagram of the EXCEPT statement for our politicians (A) and celebrities (B) sets would look something like this:


The EXCEPT statement

The code to find out who is famous but not a politician is:

SELECT PERSON FROM POLITICIANS
EXCEPT
SELECT PERSON FROM CELEBRITIES 

The return set would be:

PERSON
Lady Gaga
Beyoncé

So there are no Donalds here. This is because Donald Trump is also a politician and the EXCEPT took the politicians away from the results set.

INTERSECT

This statement answers AND questions. For example “Who are both celebrities(A) and politicians (B)?”

The Venn diagram for this kind of question looks like this (the darker part is where the intersection occurs):


Intersection

To find the answer to our question, we would type the following code:

SELECT PERSON FROM POLITICIANS
INTERSECT
SELECT PERSON FROM CELEBRITIES 

The resulting rows would return results who are both celebrities and politicians. In this case, there is one and only one record:

PERSON
Donald Trump

And there you have it: very simple statements. You can find exercises to practice this and other concepts in Vertabelo Academy’s SQL courses.

Aldo Zelen

Aldo is a data architect with a passion for the cloud. From leading a team of data professionals to coding a data warehouse in the cloud, Aldo has experience with the whole lifecycle of data-intensive projects. Aldo spends his free time forecasting geopolitical events in forecasting tournaments.

comments powered by Disqus

GET ACCESS TO EXPERT CONTENT!

Over 85.000 happy students
and counting!