Learning to Think in SQL

Learning to think in SQL will help you understand what SQL queries do, how to create and change them, and how to create reports from the data you obtain. One of the greatest skills you can have in SQL programming is this the ability to change an existing SQL query so you get the results you need. When you are first starting out, you likely will not create SQL queries from scratch but rather reuse what others have written. Thus, you need to understand the meaning of SQL queries and what they do.

Let’s get started with learning to think in SQL so you understand important SQL concepts and learn to write better queries!

The language of databases

As with learning any language—French, Chinese, Java, or SWIFT—you need a proper roadmap of a language and a description of its rules so you can start thinking in that language. Having vocabulary sheets and grammar rules on hand certainly helped me when I was learning a foreign language. It’s the same with any computer language.

In no time at all, you are going to learn how to think in SQL and how to use this odd new language: Structured Query Language. We’ll consider the basic syntax and “grammar” that allow us to craft SQL queries. To keep things at a high level, I’ll avoid diving into all the minute details of SQL rules and syntax. Instead, I’ll explain the most important and general aspects of SQL—concepts you need to understand and pitfalls you need to avoid. Once you’ve digested these, you’ll be ready to dive into more advanced content.

SQL works with sets of records

In SQL, we call rows in a table records. SQL is built to work with several records, rather than just a single record. As such, you need to think of working with sets of records, not individual records, when writing SQL queries. You typically use SQL to query some data and produce a report. Given a query, SQL will return all of the data (records) you requested.

Once you start digging deeper into how to use SQL, you find that you can use it for more than just creating reports. At that point, you’ll need to pay attention to what you tell a database to do, as each SQL query will work with a set of records—and that set may be empty, contain only a single record, or contain many records.

Where is your data coming from?

Next, where does the information you want to work with come from? To specify this, you choose the table you would like to work with by specifying its name in the FROM clause. Which data are you looking for? Where is the best place to retrieve them from? Do you have the correct name of the table? Is this table the most up-to-date source of the data? If you don’t know the answers to these questions, you may find some insight by looking in the system tables (sys.tables, etc.), depending on the type of database. Try:

SHOW TABLES

(I will be using some examples from Vertabelo Academy’s online SQL Basics course. If you’d like to follow along, this is from Section 5, “More on JOINs“, and works on an H2 database.)

What information will you work with?

What kind of information do you want to consider within each record? In SQL, the columns of a table are called attributes. When working with a particular table, you need to identify the attributes you want to zoom in on by using the SELECT statement. The following simple query retrieves all attributes (columns) of a table named student:

SELECT * FROM student

(Note: this will only work if our system has a student table, as we find in “More on JOINS“. If not, choose another table name for the FROM statement above.)

This query will return all records from the table and show all columns. Now you can determine which pieces of information you’re interested in!

What set of records will you work with?

Tables can contain hundreds, thousands, and even millions of records. However, you won’t always need to look at all these records to find what you’re looking for. Instead of returning the entire set of data from a table, you’ll want to narrow down the set of records you would like to work with by using the WHERE clause.

The WHERE clause allows you to specify a logical expression that filters a table’s records. SQL will return all records whose data match the condition(s) you’ve listed. You can use the standard logical operators (=, >, <, <>) or even more complex compound expressions involving BETWEEN, AND, OR, and NOT. In addition, there are many arithmetic and logical functions that let you do just about anything you can imagine. All of the conditions you specify will involve one or more table columns.

As an example, I’ll work with a public database that contains a table named student. Here’s a simple query I could write:

SELECT name, room_id FROM student WHERE name LIKE 'J%'

First, I told SQL that I would like to focus on the name and room_id attributes (columns) of the student table. Then, I told it to filter the records (rows) so that only the ones whose name entries begin with the letter “J” are shown in the resulting set.

Combining information from multiple tables

Now, let’s kick things up a notch. What if you need to work with records in more than one table? To that end, you’ll work with table JOINs: JOIN /LEFT/RIGHT.

Databases administrators like to keep things “tidy”. They call it “normalization”. If you keep an employee file on hand, you try to store each unique name only once. That way if there is a mistake or if the name changes it only needs to be corrected/changed in one place rather than running a “batch” change and “hoping” you got all the instances to the change fixed. What this means though is that whenever you are interested in employees, you must bring in data from the Employee table. In SQL that is done through a JOIN.

There four types of joins: INNER, LEFT, RIGHT, and the seldom used FULL. INNER and LEFT joins are the most common. An INNER JOIN takes two tables and returns only the records that match a certain condition in both tables. In other words, it returns the intersection of the two sets of records. A LEFT JOIN returns all records of the values specified on the left side of the equation and only the matching records from the data set on right side of the equation. A RIGHT JOIN, of course, is the opposite—it returns all records of the values specified on the right side of the equation and only the matching records from the dataset on left side of the equation. And finally, a FULL JOIN returns a cartesian product of the records, with NULLs where there are no matches

full join, inner join, left join, right join

Using the same example as before, here’s a demonstration of an INNER JOIN:

	SELECT r.room_number, e.name
	FROM equipment AS e 
	INNER JOIN room AS r
	ON e.room_id = r.id;

The SQL UNION statement allows you to combine the results of two SELECT statements, similar to a FULL JOIN.

Organizing our results

Sometimes, you’d like to organize the information you have retrieved from a table. To sort your retrieved data, you use the ORDER BY clause. ORDER BY does just what its name suggests—it orders the returned results by the column(s) specified in its condition. The default order is ASCending, but you can also specify DESCending order. Continuing with our earlier example, we now sort our records by room_id in ascending order:

       SELECT r.room_number, e.name
       FROM equipment AS e 
       INNER JOIN room AS r
       ON e.room_id = r.id
       ORDER BY r.room_number ASC

Taking things to the next level

When you’re ready to move on to more advanced material, you can look into subqueries and common table expressions (CTEs). Subqueries are a more complex and programmatic way to restrict the results you want a query to return. In other words, you use a nested query to restrict the results of the base query that you wrote. You must be very careful when writing subqueries—they can be very powerful but can also use significant system resources if written poorly.

Finally, I’ll mention the SQL MERGE (or UPSERT) statement, which either INSERTs new records or UPDATEs existing records, based on whether a condition is matched or not. This is where we get into advanced linguistic concepts that we’ll dive more into in other articles.

Finally!

Well, that’s about it for our introduction on thinking in SQL! I’ve given you the six key concepts you need to understand when working with SQL. These concepts offer insight into how SQL queries work and how to correctly modify SQL queries.

You should remember the following important information:

  1. SQL works with groups of records (rows) rather than a single record.
  2. You need to determine where the information you want to query is located (tables).
  3. You need to determine which attributes (columns) you want your query to report.
  4. You need to determine which records you would like to filter.
  5. You need to know how to combine information from multiple sources (tables)
  6. And you also need to know how to organize the resulting set of a query.

I hope you found this guide helpful!

CTO at NoSide

GET ACCESS TO EXPERT SQL CONTENT!