Opening the Door to SQL Window Functions
When you want to analyze data in a database, you need SQL window functions. What are they? How do they work? Read on to find out.
There are two ways people use databases. One way is for the creation, modification and deletion of data. The other is analyzing the data, which means getting answers to specific questions. The more precise the questions you ask, the more tools you’ll need to answer them efficiently. That’s why SQL incorporated window functions – also known as analytical functions or OVER functions – in 2003.
What Are SQL Window Functions?
Window functions help you query your data with the regard for context. This sounds complicated, so I’ll explain it as simply as I can.
Let’s imagine your data in the form of a table, the same kind you get after executing a SELECT
statement. Window Functions work with aggregate functions, like COUNT
, SUM
, and AVG
. Remember what those did? They allow you to say something about more than one row.
(If you have no idea what you just read, I suggest you first read the Beginner’s Guide to SQL Aggregate Functions and then come back.)
In GROUP BY
, we choose the columns we want to use to group our data. This returns aggregated information about the data in each group. But what if we wanted to get the non-aggregated rows (i.e. without a GROUP BY
) and the aggregated data at the same time? That’s when we’d use a window functions.
Why did I tell you to imagine the table? It explains the “window” in window functions. In this case, the “window” is like a small frame you place on the data in the table. Instead of dealing with all the data, you only have a subset. You can apply the “functions” to the set of rows defined by the “window”.
Looking at a Row in Relation to Other Rows
Every example of SQL window functions I have ever seen uses “employees and their salaries” data. But let’s do something closer to home: grocery shopping. Suppose that after you bought groceries, you entered a list of purchases and their cost into a database, like this:
id | item | type | price |
---|---|---|---|
1 | Yogurt | Dairy | $3 |
2 | Milk | Dairy | $2 |
3 | Apples | Fruit | $5 |
4 | Bread | Baked goods | $4 |
5 | Ham | Meat | $8 |
6 | Orange Soda | Drinks | $1 |
One of the things we may want to ask is “How much did I spend on each item compared to the total cost of my shopping?” So we want to select every row from our table but look at each one from the perspective of the whole table. We could achieve that using this SQL snippet:
SELECT item, price / (SELECT SUM(price) FROM shopping) FROM shopping
Here are the results:
Yogurt | 0.1304347826 |
Milk | 0.0869565217 |
Apples | 0.2173913043 |
Bread | 0.1739130435 |
Ham | 0.347826087 |
Orange Soda | 0.0434782609 |
We’ve run a subquery on every single item to get the total cost of our shopping. However, subqueries can cause our query to slow down if we repeat the same action several times. We can get exactly the same result using a window function instead of a subquery:
SELECT item, price / SUM(price) OVER() FROM shopping
The magic lies in the OVER()
function, which indicates that we are using a window with the SUM
aggregate function. Without any parameters (as we did in the above example), OVER()
applies an aggregate function to the entire selected set.
This is a simple example, and SQL window functions can do a great many things. Some other questions we could answer with window Functions include:
- What is the third most expensive item you bought?
- What is the total cost of items above and including this one?
- Which items cost more than the average?
- Which items are priced up to a dollar more than the cheapest item?
You can squeeze even more juice out of window functions when you’re using data that has been partitioned, or divided into groups. For example, you could find out:
- Which items cost more than the average of the same type?
- How expensive was each item compared to others of the same type?
- What is the average, total, lowest, and highest price of items of each type?
All those questions can be answered using window functions, although you may have to partition data with the OVER()
function or use some of the more complex analytical functions.
Check out our course on Window Functions, where you’ll find a ton of exercises that will get you familiar with SQL Windows Functions’ basic and advanced features.