SQL Subqueries

SQL Subqueries1

The article describes what a subquery is and what these useful statements look like. We will cover basic examples with the IN, EXISTS, ANY, and ALL operators, look at subqueries in from and where clauses, and explore the difference between correlated and nested subqueries.

First, let’s start with an example database. To present some of these statements we need to have an example table and fill it with some data.



What is a Subquery?

A subquery is a SELECT statement with another SQL statement, like in the example below.

select *
from product
where id in
           (select product_id
            from provider_offer
            where provider_id = 156);

Subqueries are further classified as either a correlated subquery or a nested subquery. They are usually constructed in such a way to return:

  1. a table
    select max(average.average_price)
    from (select product_category, avg(price) as average_price
          from product
          group by product_category) average;
  2. or a value
    select id
    from purchase
    where  value > (select avg(value)
                    from purchase )

Nested Subqueries

Nested subqueries are subqueries that don’t rely on an outer query. In other words, both queries in a nested subquery may be run as separate queries.

This type of subquery could be used almost everywhere, but it usually takes one of these formats:

WHERE [NOT] IN (subquery)
Select *
from client
where city in (select city
               from provider);

The example subquery returns all clients that are from the same city as the product providers.
The IN operator checks if the value is within the table and retrieves the matching rows.

WHERE expression comparison_operator [ANY| ALL] (subquery)

Subquery With ALL Operator

The ALL operator compares a value to every value from the result table.

For example, the following query returns all of the models and producers of bikes that have a price greater than the most expensive headphones.

select producer, model
from product
where product_category = 'bike'
and price > ALL
            (select price
             from product
             where product_category = 'headphones');

Similar subquery but with ANY operator:

Subquery With ANY Operator

The ANY operator compares a value to each value in a table and evaluates whether or not the result of an inner query contains at least one row.

The following query returns all of the models and producers of bikes that have a price greater than at least one of the headphones.

select producer, model
from product
where product_category = 'bike'
and price > ANY
            (select price
             from product
             where product_category = 'headphones');

You can also nest a subquery in another aubquery. For example:

Subquery Nested in Another Subquery Using IN Operator

This query returns producers and models of bikes that exist in provider’s offers from the USA.

select producer, model
from product
where product_category = 'bike'
      and id in
               (select distinct product_id
                from provider_offer
                where provider_id in
                                    (select id
                                     from provider
                                     where country = 'USA'));

The same could be done using joins.

select product.producer, product.model
from product, provider_offer, provider
where provider_offer.product_id = product.id
and provider_offer.provider_id = provider.id
and product_category = 'bike'
and provider.country = 'USA';

Correlated Subqueries

Subqueries are correlated when the inner and outer queries are interdependent, that is, when the outer query is a query that contains a subquery and the subquery itself is an inner query. Users that know programming concepts may compare it to a nested loop structure.

Let’s start with a simple example.

The inner query calculates the average value and returns it. In the outer query’s where clause, we filter only those purchases which have a value greater than the inner query’s returned value.

Subquery Correlated in Where Clause

select id
from purchase p1
where date > '2013-07-15'
and value > (select avg(value)
             from purchase p2
             where p1.date = p2.date)

The query returns purchases after 15/07/2014 with a total price greater than the average value from the same day.

The equivalent example, but with joining tables.

Select  p1.id
from purchase p1, purchase p2
where p1.date = p2.date
and p1.date > '2013-07-15'
group by p1.id
having p1.value > avg(p2.value);

This example can also be written as a select statement with a subquery correlated in a from clause.

The subquery returns the table that contains the average value for each purchase for each day. We join this result with the Purchase table on column ‘date’ to check the condition date > ’15/07/2014′.

select id
from purchase, (select date, avg(value) as average_value
                from purchase
              	where date > '2013-07-15'
              	group by date) average
where purchase.date = average.date
and purchase.date > '2013-07-15'
and purchase.value > average.average_value;

Usually, this kind of subquery should be avoided because indexes can’t be used on a temporary table in memory.

Subquery With EXISTS

WHERE [NOT] EXISTS (subquery)

The EXISTS operator checks if the row from the subquery matches any row in the outer query. If there’s no data matched, the EXISTS operator returns FALSE.

This Query returns all clients that ordered after 10/07/2013.

select id, company_name
from client
where exists (select *
              from purchase
              where client.id = purchase.client_id
              where date > '2013-07-10')

When a subquery is used, the query optimizer performs additional steps before the results from the subquery are used. If a query that contains a subquery can be written using a join, it should be done this way. Joins usually allow the query optimizer to retrieve the data in a more efficient way.

Subquery Limitations

There are some limitations of using subqueries:

  • In Oracle you can nest up to 255 levels of subqueries in a WHERE clause.
  • In SQL Server you can nest up to 32 levels.

Maria Alcaraz

Former Freelance Database Developer, Mother of 4 children

comments powered by Disqus


Over 85.000 happy students
and counting!