Very good! So far, we've only used subqueries that were independent of the main query—you could first run the subquery alone and then manually hard-code its result into the main query.
We are now going to learn about subqueries that depend on the main query. They are called correlated subqueries.
Study the example:
SELECT
*
FROM Country
WHERE Country.Area <= (
SELECT
MIN(City.Area)
FROM City
WHERE City.CountryID = Country.ID
);
We want to find all countries whose area is less than or equal to the minimum city area in that particular country. In other words, if there is a country smaller than its smallest city, it will be shown. Why would we use such a query? After all, if a country contains a city, it doesn't make much sense for it to be smaller than its smallest city, right? Well, this query can be very convenient if we want to check whether there are any errors in our database. If this query returned anything but an empty set, we would know that something fishy is going on in our records.
What's the new piece here? Take a look at the WHERE
clause in the subquery. That's right, it uses Country.ID
. Which country does it refer to? The country from the main query, of course! This is the secret behind correlated subqueries—if you ran the subquery alone, your database would say
'Hey, you want me to compare City.CountryID
to Country.ID
, but there are tons of IDs in the Country
table, so I don't know which one to choose.'
But if you run the instruction as a subquery and the main query retrieves data from the Country
table, then the database will compare each Country.ID
from the subquery with the current Country.ID
of the main query.
Just remember the golden rule: subqueries can use tables from the main query, but the main query can't use tables from the subquery!