SQL Hacks To Control Family Budget On Black Friday Weekend

sql basics, how to in sql, JOIN, GROUP BY, reporting with SQL, data analysis, extras,

If you’re in the US, chances are you’ve been eagerly awaiting the approach of Black Friday just as much as Thanksgiving. Though the shopping frenzy takes hold of nearly everyone, some people have to stick to their budgets and shop prudently. In this article, we’ll take a look at how generating an SQL report can help you track how much your family spent shopping on Black Friday.

Storing Black Friday Purchases in a Database

Before we can create an SQL report, we first need some data we can use. Our example database consists of two tables: person and product.

id name
1 Ryan
2 Emily
3 Ava
4 Ethan

The person table is quite simple—it stores the names of family members who spent money on Black Friday, as well as their associated identifiers.

id name price discount person_id
1 leather wallet 120 22 3
2 Vertabelo “SQL Basics” course 29 40 1
3 tablet 234 30 2
4 headphones 40 10 1
5 travel book 10 25 4
6 bag 240 60 1
7 blouse 25 40 2
8 smart-phone 500 35 3
9 smart-watch 100 45 2
10 blender 320 65 1
11 toy blocks 80 20 1
12 mirror 210 70 4

The product table stores detailed information about all products the family purchased on Black Friday, including prices, discounts, and identifiers of the people who made the purchases.

Who Bought the Most Items?

Now that we’ve stored our information in a database, we can begin creating an SQL report to summarize our data. Let’s begin with the following simple query:

SELECT person_id, COUNT(id)
FROM product
GROUP BY person_id;

In the SELECT clause, we’ve chosen to work with the column named person_id, which stores the identifiers of our family members. For each person, we need to compute the number of products they purchased, so we’ll use the COUNT aggregate function. We’ll pass in the name of the column that stores product identifiers (id) as the argument for this function. Essentially, this allows us to count the number of times a family member’s identifier appears in the table of purchases.

Since we listed the column person_id in our SELECT clause, we also need to pass it in as a parameter to the GROUP BY clause. Remember: all columns outside aggregate functions must be listed in a GROUP BY clause.

Here’s the result of executing that query:

person_id count
4 2
1 5
3 2
2 3

The values stored in the column named person_id are the identifiers of the family members who made the purchases, and the values stored in the column count indicate how many products they purchased.

There’s one problem, though. We’d rather not have to go back and forth between the result table and the original person table to determine who’s who. Thus, we’ll modify our query so it displays the names of our family members instead of their identifiers.

SELECT ps.name, COUNT(p.id)
FROM product p
JOIN person ps ON p.person_id=ps.id
GROUP BY ps.name
ORDER BY COUNT(p.id) DESC;

In the query above, we’ve joined the tables product and person so we can retrieve our family members’ names. Now, in our SELECT clause, we specify the column that stores our family members’ names. With that set, we need to also modify the GROUP BY clause so it contains the name column from the person table. We’ll also use an ORDER BY clause to sort the data in descending order (DESC) in terms of the number of products purchased. Take a look at the result of our report:

name count
Ryan 5
Emily 3
Ethan 2
Ava 2

Our initial report indicates that Ryan bought the most items on Black Friday. With three items, Emily came in second.

It wasn’t too difficult to prepare this simple report with SQL. If you’d like to extend your knowledge of SQL and learn about joins, aggregate functions, data ordering, and many other concepts, or if your adventure with SQL has just begun, we encourage you to take a look at our SQL Basics course. We’re offering a special Black Friday promotion— get 40% OFF the course!

What Exactly Did Ryan Purchase?

All right, we know that Ryan bought the most items, but what exactly did he purchase? The following SQL query can help us answer this question:

WITH record_products AS
	 (
	SELECT person_id, COUNT(id)
	FROM product
	GROUP BY person_id
	ORDER BY COUNT(id) DESC
	LIMIT 1
	)
SELECT p.name FROM product p
JOIN record_products rp ON rp.person_id=p.person_id ;

To simplify our query, we used a Common Table Extension (CTE). A CTE functions as a temporary result set that can be used elsewhere within the same query. A CTE begins with the WITH keyword and a temporary name, followed by a pair of parentheses. Inside those parentheses, we define our temporary result set. Here, our CTE is temporarily named record_products and defines a query that returns the identifier of the person who purchased the most products. The result of this query is limited to the first record returned. Note that PostgreSQL uses LIMIT clauses with the number of rows to accomplish this.

After the CTE, we wrote a query that returns the name of the products that were purchased by this individual, using the information from the product table. We joined this table with the result of the CTE. In the join condition, we supplied the ID of the person returned by the CTE and the ID of the person from the product table. This is crucial to obtaining the correct number of rows.

CTEs are pretty cool, and if you’re interested in learning more about using them, go ahead and check out our Recursive Queries course. With our special Black Friday sitewide sale, you can save 40% on the course!

Here are the results of this new report:

name
Vertabelo “SQL Basics” course
headphones
bag
blender
toy blocks

Who Spent the Most Money?

Although we know who bought the most items, we don’t know who spent the most money. These two people may not necessarily be the same, so it’s worth investigating further.

Take a look at the query below:

SELECT ps.name,
SUM(p.price-(p.price*p.discount/100.0)) AS sum_price
FROM product p  
JOIN person ps ON p.person_id=ps.id
GROUP BY ps.name
ORDER BY sum_price DESC;

Here, we used data from the joined tables product and person to retrieve the name of each family member, alongside the amount they spent on Black Friday. The precise amount they spent is calculated with the SUM function. Inside the SUM function, we used an expression that calculates the discounted price by calculating the discount reduction and then subtracting it from the product’s base price, using values from the respective columns of discount and price.

In the GROUP BY clause, we listed the column containing each family member’s name. We also used an ORDER BY clause to sort the records in descending order by the amount spent by each family member. Take a look at the result of the updated report:

name sum_price
Ava 419
Ryan 326
Emily 234
Ethan 71

As you can tell, Ava spent the most on Black Friday, even though she bought fewer items than Ryan.

In our final report, we’ll answer the following question: what was the average discount for each family member?

Who Got the Most Discounts?

We’ll create another report to determine who took advantage of discounts the most. Here’s our new query:

SELECT ps.name,
AVG(p.discount) AS avg_discount
FROM product p  
JOIN person ps ON p.person_id=ps.id
GROUP BY ps.name
ORDER BY avg_discount DESC;

Again, we joined the two tables product and person in order to retrieve family members’ names and the average discounts they received on their purchases. We grouped our data by the family members’ names, hence why we passed in the name column from the person table to both the SELECT and GROUP BY clauses. We also used the AVG function to find the average percentage discount, passing in the discount column as its argument.

The resulting table was sorted in descending order by the average discount of products purchased on Black Friday.

name avg_discount
Ethan 47.50
Ryan 39.00
Emily 38.33
Ava 28.50

It seems Ethan enjoyed the greatest average discount on his purchases, even though he didn’t buy the most items or spend the most on his purchases.

Summary

SQL is a handy skill in our economy-driven world, as it can help you track your personal finances and generate a variety of useful reports. If you’re interested in learning more about SQL, Vertabelo Academy is a great place to get started. Check out our interactive courses today, and take advantage of our Black Friday discounts!

Dorota Wdzięczna

Dorota is an IT engineer and works as a Data Science Writer for Vertabelo. She has experience as a Java programmer, webmaster, teacher, lecturer, IT specialist, and coordinator of IT systems. In her free time, she loves working in the garden, taking photos of nature, especially macro photos of insects, and visiting beautiful locations in Poland.