Introduction to Reporting with SQL — the Ultimate Tutorial for Business Professionals

Working with the financial aspects of large and small enterprises can be a daunting task for a business professional. In this article, we’ll look at several ways of constructing the perfect SQL report.

You’ve probably already heard about SQL from your colleagues or in other areas of your career. If you’re here, you’ve likely concluded that learning SQL will make your professional life easier – and you’re right!

Today, we’ll take a look at some basic SQL to help you create great automated reports that you can run on your own or even send to your IT staff for optimization. Don’t worry if you have no SQL experience – Vertabelo Academy’s the perfect place to learn just the right amount of SQL to help you get your work done more efficiently.

Connecting to a database

As a business user, you’re probably already using some data reporting tool, such as Oracle Business Intelligence, or perhaps even a reporting service or data discovery tool, such as Tableau or Power BI. However, with SQL, you can go beyond these simple tools and communicate directly with a database.

Obviously, the first thing you need to do is connect to your organization’s database system. There are many popular enterprise-level databases that are not free, such as Microsoft SQL, Oracle, or DB2. Perhaps your organization uses some open-source database like MySQL or PostgreSQL. Whatever the case may be, you’ll need to connect to some sort of database to start.

Every popular database comes with its own program, called the client, that allows you to run SQL queries on it (e.g., SQL Developer or SQL Server Management Studio). There are even some applications, such as DataGrip, that allow you to connect to multiple databases at once. After you connect to the database, it’s time to start executing your commands.

Be warned—if you are just starting to play around with SQL, you should do so on Vertabelo Academy. It’s not a good idea to play around with your organization’s production environment without first consulting your IT team.

Your first query

As business-oriented professional, you should skip all the SQL functions concerning table creation, deletion, and filling as you learn SQL, since you likely won’t need these. It doesn’t hurt to learn them, of course, but you should focus on using the SELECT statement with different functions that help you calculate all the information you need.

We’ll look at some basic SELECT statements with special emphasis on aggregation and pagination. These concepts are so simple that you can start using them right away! For a deeper dive into different SELECT statements, be sure to visit Vertabelo Academy and take our SQL Basics and Standard SQL Functions courses. For now, we’ll focus on a single table containing sales data.

Here’s the basic composition of the sales table:


This table can serve as the basis for many analytical reports. To learn more about the specific column types, check out this video tutorial on our blog. Here’s a brief summary of the columns in the table above:

  • id — the unique identity for a particular sale
  • sales_units — the number of units sold in this sale
  • sales_cost — the cost of different units in this sale
  • customer_id — identifies the customer to whom we sold the articles
  • employee_id — identifies the employee who sold the articles
  • store_id — identifies the location where the articles were sold
  • product_id — identifies the product that was sold
  • date_str — text representing the date on which this sale occurred

Since you likely spend most of your time querying data on existing tables that someone with technical knowledge has already prepared for you, we’ll work on getting our hands dirty and mastering some basic SQL queries.

SELECT statements

Let’s look at all the sales we have in our table. To access all the values in the table columns, we would write the following simple query:

SELECT
  ID,
  SALES_UNITS,
  SALES_PRICE,
  SALES_COST,
  CUSTOMER_ID,
  EMPLOYEE_ID,
  STORE_ID,
  PRODUCT_ID,
  DATE_STR
FROM SALES

After executing the above query, we’d get the following data. We’ve limited our display to five rows here to keep things simple.


Note that querying all the data in a table is usually not a good idea, as there’s simply too much information to sift through, some of which is likely irrelevant. You should narrow down your results with the WHERE clause. The WHERE clause allows you to limit the size of the query’s result set to only those rows that match the condition you specified. It’s basically analogous to spreadsheet filtering.

So, if we’d like to filter our data to only display rows corresponding to the date 01-01-2016, we would write the following statement:

SELECT
  ID,
  SALES_UNITS,
  SALES_PRICE,
  SALES_COST,
  CUSTOMER_ID,
  EMPLOYEE_ID,
  STORE_ID,
  PRODUCT_ID,
  DATE_STR
FROM SALES
WHERE DATE_STR = '2016-01-01'

Of course, this is just a basic example of data filtering. Some of our other courses offer more in-depth examples.

Pagination

When constructing reports in SQL at an analytical level, you generally need to limit your result set in some way, too, and not just your data. This is where pagination comes in. Let’s take a look at how we can order our data.

If you want to present your data in a more organized way, you need to order your result set. This is similar to the sort function we find in Excel spreadsheets. To organize your data, you need to specify the column you’d like to use in your sort and the order you’d like to apply (desc for descending and asc for ascending). In our example, if we’d like to sort our results by the number of units that were sold, in descending order, we’d write the following query:

SELECT
  ID,
  SALES_UNITS,
  SALES_PRICE,
  SALES_COST,
  CUSTOMER_ID,
  EMPLOYEE_ID,
  STORE_ID,
  PRODUCT_ID,
  DATE_STR
FROM SALES
WHERE DATE_STR = '2016-01-01'
ORDER BY sales_units DESC;

Now, if we wanted to limit the number or rows that we see, we would simply use the FETCH FIRST clause when working with standard SQL. In our example, let’s apply this clause to limit our results to only the top 10 sales:

SELECT
  ID,
  SALES_UNITS,
  SALES_PRICE,
  SALES_COST,
  CUSTOMER_ID,
  EMPLOYEE_ID,
  STORE_ID,
  PRODUCT_ID,
  DATE_STR
FROM SALES
WHERE DATE_STR = '2016-01-01'
ORDER BY sales_units DESC
OFFSET 0 FETCH ROWS FIRST 10 ROWS ONLY;

You’ll notice that the pagination FETCH FIRST 10 ROWS ONLY above was used in conjunction with the OFFSET clause to offset our selection of rows by a certain number. Here, we used an offset value of “0”, so the pagination start at and includes the first row. If we had used an offset of “1” instead, we would have gotten the next 10 rows after the first one. You can learn more about OFFSET on Vertabelo Academy or the official Vertabelo Academy blog.

Aggregation

Sometimes, your report needs to have syntactical or aggregated data. Your supervisor doesn’t need to see the statistics for every transaction at the lowest level of granularity. And, as you know, general ledgers are comprised of only aggregated data.

There are many aggregation functions in spreadsheet software, but they are not nearly as flexible and powerful as the ones in SQL. To aggregate data with SQL, we need to specify the column that defines the group on which we would like to run our aggregation function. If we wanted to look at the total revenue we had every day, we would construct the following query:

SELECT
  DATE_STR,
  SUM(SALES_UNITS*SALES_PRICE) sum_revenue
FROM SALES
GROUP BY DATE_STR
ORDER BY DATE_STR;

Of course, we can certainly employ pagination in conjunction with aggregation to form an even clearer report. With the following query, we’re looking at the top 10 days in terms of revenue. First, we’re ordering the aggregated rows by descending revenue and then narrowing our view to the top 10 results.

SELECT
  DATE_STR,
  SUM(SALES_UNITS*SALES_PRICE) sum_revenue
FROM SALES
GROUP BY DATE_STR
ORDER BY sum_revenue DESC
FETCH FIRST 10;

Here is our result set after running the above query:


The alternative to these seven lines of code is pivoting and complex mapping inside spreadsheets. The SUM function is of course not the only one you can use to manipulate your data in SQL. You have access to many standard functions, such as min, max, avg, media, and others.

Next steps

There you have it! We’ve covered basic SQL syntax and created some ad hoc SQL reports that give you the basic knowledge you need to continue on your learning path. We hope we’ve convinced you that building reports with SQL is much easier, faster, and more intuitive than with spreadsheets.

After you master the basics of selecting, filtering, and manipulating data in SQL, you should expand your knowledge of reporting by mastering different kinds of JOIN statements, which are used to join data from multiple tables. There are also some specific SQL operators that come in handy, such as CUBE, ROLLUP, GROUPING, and GROUPING SETS. However, these operators are not all that intuitive or easy to learn, so we’ll publish a more technical article in the future to cover these in detail.

In the meantime, we encourage you to sign up for a Vertabelo Academy account today to start practicing the skills we covered in this article. After all, reading alone is not enough to master any subject—practice is key.

Aldo is a data architect with a passion for the cloud. From leading a team of data professionals to coding a data warehouse in the cloud, Aldo has experience with the whole lifecycle of data-intensive projects. Aldo spends his free time forecasting geopolitical events in forecasting tournaments.

GET ACCESS TO EXPERT SQL CONTENT!