*Want to learn how to use SQL window functions? Ranking functions are a good place to start!*

Learning about SQL window functions usually comes after you’ve built a foundation in the language, but these powerful functions take your skills up a level. As you master them, you’ll find better ways to solve query problems.

When used for business intelligence applications, SQL queries combine data retrieval and advanced computations. These operations are more complex than those used in OLTP systems. In fact, the SQL standard was expanded with analytical functions specifically to support this kind of activity.

SQL now has several types of analytical functions, including:

- Ranking functions
- Aggregation functions
- Positional functions
- Statistical functions

Ranking functions are the easiest windows functions to learn; they serve as a nice little gateway to the other, more complicated functions.

### SQL Window Functions Basics

Before we dive into ranking functions, it’s worth noting that all analytical function share some basic rules:

- The processing of the analytical function is done
**after**the`GROUP BY`

processing of the SQL statement and**before**the ordering of the returned rows. - Analytic function execution organizes results into partitions and then computes functions over these partitions in a specified order.
- Execution of the analytical function occurs
**after row filtering**(`WHERE`

)**and grouping**(`GROUP BY`

) so the calculations can be performed on the`GROUP BY`

results.

In SQL, window functions go by several names, including windowing functions, `OVER`

functions, and analytical functions. We’ll stick with “window function” or “analytical function” in this article.

### What Are Ranking Functions?

Ranking analytical functions rank data based on the orders stated in the function specification. Let’s look at four ranking functions that each compute rank in a different way: `RANK`

, `DENSE RANK`

, `ROW_NUMBER`

, and `NTILE`

.

We’ll use a simple “`sales`

” table to demonstrate each of these functions. The columns in this table are:

- “
`id`

” – The primary key of the table. - “
`salesman_id`

” – The ID of the person who made the sale. - “
`sales_item`

” – The ID of the item sold. - “
`sales_num`

” – The number of items sold. - “
`sales_price`

” – The price per individual item. - “
`datetime`

” – The date of the sale.

### RANK

`RANK`

is a simple analytical function. It does not take any column list arguments, like all other rank analytical functions except `NTILE`

. It returns a **rank** or a number based on the ordering of the rows by some condition.

Let’s say we want to rank our salespeople based on the number of items they’ve sold. We do this using the `RANK`

function over the “`sales_num`

” column, which records the number of sales for each rep. Here’s the code:

SELECT id, salesman_id, sales_item, sales_num, sales_price, rank() over (order by sales_num) FROM sales ORDER BY sales_num;

And the result set is displayed as :

ID | SALESMAN _ID |
SALES _ITEM |
SALES _NUM |
SALES _PRICE |
RANK()OVER (ORDERBYSALES_NUM) |
---|---|---|---|---|---|

1 | 10 | 100 | 4 | 50.5 | 1 |

2 | 30 | 200 | 7 | 50 | 2 |

5 | 60 | 200 | 8 | 30.5 | 3 |

4 | 50 | 200 | 15 | 110 | 4 |

6 | 70 | 200 | 22 | 10.5 | 5 |

3 | 40 | 200 | 40 | 10.5 | 6 |

`ORDER BY`

is mandatory when using `RANK`

, as it displays records in an ascending or descending order based on our requirements. In the above example, we ranked “sales_num” in ascending order so the largest sales number has a correspondingly higher rank. (Note: Ascending is the default order in an `ORDER BY`

clause; you do not need to specify it with the `ASC`

keyword.)

If we were to reverse the order by specifying the `DESC`

keyword – which you do have to specify – the sale reps’ ranks would change:

SELECT id, salesman_id, sales_item, sales_num, sales_price, rank() over (order by sales_num desc) FROM sales ORDER BY sales_num;

The results:

ID | SALESMAN _ID |
SALES _ITEM |
SALES _NUM |
SALES _PRICE |
RANK()OVER (ORDERBYSALES_NUMDESC) |
---|---|---|---|---|---|

1 | 10 | 100 | 4 | 50.5 | 6 |

2 | 30 | 200 | 7 | 50 | 5 |

5 | 60 | 200 | 8 | 30.5 | 4 |

4 | 50 | 200 | 15 | 110 | 3 |

6 | 70 | 200 | 22 | 10.5 | 2 |

3 | 40 | 200 | 40 | 10.5 | 1 |

The ranks have reversed: the lowest sales number now has the highest rank number.

An interesting situation arises when we have two records with identical values. Let’s look at that ascending example again. Notice what happens when two salespeople have made the same number of sales:

ID | SALESMAN _ID |
SALES _ITEM |
SALES _NUM |
SALES _PRICE |
RANK()OVER (ORDERBYSALES_NUM) |
---|---|---|---|---|---|

1 | 10 | 100 | 4 | 50.5 | 1 |

2 | 30 | 200 | 7 | 50 | 2 |

7 | 80 | 200 | 7 | 20 | 2 |

5 | 60 | 200 | 8 | 30.5 | 4 |

4 | 50 | 200 | 15 | 110 | 5 |

6 | 70 | 200 | 22 | 10.5 | 6 |

3 | 40 | 200 | 40 | 10.5 | 7 |

Two reps have the same number of total sales (7) so `RANK`

gives them the same rank (2). The next rank skips the next number (3) and that rep is ranked fourth.

What happened to that third-place spot? What if we don’t want to skip rank numbers?

### DENSE RANK

Like `RANK`

, `DENSE RANK`

takes no parameters. The difference between the two is in the handling of the rank number after records with identical ranking values, as demonstrated above. `DENSE RANK `

is “dense”; it does not return any gaps between records. This is useful for describing any kind of competition where a tie might occur (e.g. two teams tie for second place in a sports tournament).

As we saw in our previous example, two sales reps shared the second place, but the results skipped third place and moved right on to fourth place. This makes no sense.

Let’s rework our previous example, this time using the DENSE_RANK function:

SELECT id, salesman_id, sales_item, sales_num, sales_price, rank() over (order by sales_num) as rank, dense_rank() over (order by sales_num) as dense_rank FROM sales ORDER BY sales_num;

And the result set is:

ID | SALESMAN _ID |
SALES _ITEM |
SALES _NUM |
SALES _PRICE |
RANK | DENSE _RANK |
---|---|---|---|---|---|---|

1 | 10 | 100 | 4 | 50.5 | 1 | 1 |

2 | 30 | 200 | 7 | 50 | 2 | 2 |

7 | 80 | 200 | 7 | 20 | 2 | 2 |

5 | 60 | 200 | 8 | 30.5 | 4 | 3 |

4 | 50 | 200 | 15 | 110 | 5 | 4 |

6 | 70 | 200 | 22 | 10.5 | 6 | 5 |

3 | 40 | 200 | 40 | 10.5 | 7 | 6 |

Pay attention to the row with an “8” in the “sales_num” column. `RANK`

skipped “3” and gave this row a rank of “4”. If the tying records would have had more than two rows, RANK would skip them all. If there were ten identical values, `RANK`

would skip all ten and give the next unique record the rank of 13. `DENSE_RANK`

keeps it “dense” and just continues, grouping all identical values into one rank and assigning the next unique value the next rank number – in this case, “3”.

### ROW NUMBER

`ROW_NUMBER`

function is different than `DENSE_RANK`

and `RANK`

as it does not group records that share ranking values: it simply lists all records in sequential order. This is quite a commonly used function.

We can expand the results we got from `DENSE_RANK`

by adding `ROW_NUMBER`

over the “`sales_num`

” column:

SELECT id, salesman_id, sales_item, sales_num, sales_price, rank() over (order by sales_num) as rank, dense_rank() over (order by sales_num) as dense_rank, row_number() over (order by sales_num) as row_number FROM sales ORDER BY sales_num;

And the result is:

ID | SALESMAN _ID |
SALES _ITEM |
SALES _NUM |
SALES _PRICE |
RANK | DENSE _RANK |
ROW _NUMBER |
---|---|---|---|---|---|---|---|

1 | 10 | 100 | 4 | 50.5 | 1 | 1 | 1 |

2 | 30 | 200 | 7 | 50 | 2 | 2 | 2 |

7 | 80 | 200 | 7 | 20 | 2 | 2 | 3 |

5 | 60 | 200 | 8 | 30.5 | 4 | 3 | 4 |

4 | 50 | 200 | 15 | 110 | 5 | 4 | 5 |

6 | 70 | 200 | 22 | 10.5 | 6 | 5 | 6 |

3 | 40 | 200 | 40 | 10.5 | 7 | 6 | 7 |

You might wonder which of the identical-value records would receive which ranking. This is **totally undetermined, **which means it would change during different executions. So beware: many errors happen because of the misuse of `ROW_NUMBER`

on datasets with non-unique arguments.

*Note: *One interesting use case for ranking functions is filtering data. If we filtered the top two sales by price using `RANK`

, it would look like:

with top_price as ( SELECT id, salesman_id, sales_item, sales_num, sales_price, RANK() OVER (ORDER BY sales_price) rank_sales_price FROM sales) select * from top_price where rank_sales_price < 3

The results:

ID | SALESMAN _ID |
SALES _ITEM |
SALES _NUM |
SALES _PRICE |
RANK _SALES_PRICE |
---|---|---|---|---|---|

3 | 40 | 200 | 40 | 10.5 | 1 |

6 | 70 | 200 | 22 | 10.5 | 1 |

### NTILE

Unlike the other ranking functions we’ve discussed in this article,` NTILE`

takes arguments. And it’s not widely used. The `NTILE`

function divides rows into equal groups and returns the rank, or number, of these groups.

Suppose we want to divide our sales reps into three groups. We do this by specifying a 3 as the argument for `NTILE`

(i.e. `NTILE(3)`

) over the “`sales_num`

” column:

SELECT id, salesman_id, sales_item, sales_num, sales_price, NTILE(3) over (order by sales_num ) as NTILE FROM sales ORDER BY sales_num;

We get the following ranks, which are based on the values in the “`sales_num`

” column:

ID | SALESMAN _ID |
SALES _ITEM |
SALES _NUM |
SALES _PRICE |
NTILE |
---|---|---|---|---|---|

1 | 10 | 100 | 4 | 50.5 | 1 |

2 | 30 | 200 | 7 | 50 | 1 |

7 | 80 | 200 | 7 | 20 | 1 |

5 | 60 | 200 | 8 | 30.5 | 2 |

4 | 50 | 200 | 15 | 110 | 2 |

6 | 70 | 200 | 22 | 10.5 | 3 |

3 | 40 | 200 | 40 | 10.5 | 3 |

Notice that the first group has three records and the other two groups only have two each. This is how `NTILE`

works. If there is a remainder in the division of rows, `NTILE`

expands the groups by one apiece until the remainder is exhausted. In our case, the remainder was one, so only the first group was expanded.

### Want to Learn More About SQL Window Functions?

Now that you understand a bit about ranking functions, you’re on your way to the world of SQL window functions. But don’t just read about them – practice what you’re learning. I recommend Vertabelo Academy’s Window Functions course. It’s a good, hands-on way to really dig into using analytical functions to power up your SQL. And don’t forget to come back to our blog to learn more about the other types of window functions!