5 Tips for Managing Long SQL Queries

Long Queries

Long SQL queries are notoriously hard for beginners to structure and understand. These five tips will teach you the best practices for writing and formatting complex SQL code.

As we all know, SQL queries are essential to database management. Without them, it would be extremely difficult to find and work with the information in a database.

Query length depends on the type of information we need and the size of the database. When we read a simple piece of code, we are usually able to understand it quite easily. But what if we’re dealing with a 1,500-line query? If we want to understand what we’ve done (or let others understand it) we have to organize our code properly. Fortunately, remembering just five useful tips will help us manage long SQL queries.

Tip 1: Indent Your Code

Indentation helps keep your query clean by identifying where each block of code begins. This makes program structure more understandable and enables developers to easily find a specific instruction. When you correctly indent your code, you can quickly see what you are selecting, which tables provide the data, and what restrictions apply to it (i.e. the WHERE and GROUP BY clauses).

Let’s see an example of properly indented code:

SELECT column1
FROM table1
WHERE column3 IN
    SELECT TOP(1) column4
    FROM table2
    INNER JOIN table3
    ON table2.column1 = table3.column1

Compare the code above to this sample:

SELECT column1, column2 FROM table1 WHERE column3 IN ( SELECT TOP(1) column4 FROM table2 INNER JOIN table3 ON table2.column1 = table3.column1)

Now, imagine that there are 100 similar instructions, too. Without indentation, that query will be very hard to read and understand!

There is no single accepted indentation style. Every company or programmer usually develops their own. Even so, your indentation style should make sense to others. To help you get started, I’ve provided links to style guides by ApexSQL and Simon Holywell.

baner sql creating tables

Indentation is the first thing you should do when you start to write code. If you inherit a chunk of un-indented code, there are a lot of sites that let you format an SQL query for free.

Tip 2: Use the WITH Clause

The WITH clause allows you to name a subquery block and treat it like a normal table. In this case, the WITH clause is basically a replacement for a normal subquery.

Look at the query below. Its task is to show the number of hospitals in Los Angeles where patients whose last visit was more than 14 days ago received an average drug dosage over 1,000 units.

WITH patient_data AS (
	SELECT patient_id, patient_name, hospital, drug_dosage
	FROM hospital_registry
	WHERE (last_visit > now() - interval '14 days' OR last_visit IS NULL) 
	AND city = "Los Angeles"

WITH average_dosage AS (
	SELECT hospital, AVG(drug_dosage) AS Average
	FROM patient_data
	GROUP BY hospital

SELECT count(hospital) 
FROM average_dosage;
WHERE AVG(drug_dosage) > 1000

I’ll explain all elements in this query.

The first element is the WITH clause. Using it, we can specify two subqueries as patient_data and average_dosage and use them like normal tables. The patient_data subquery was used to create the average_dosage subquery. So then average_dosage becomes the SELECT statement’s final data source.

This arrangement is much cleaner than writing everything in one query and putting all the subqueries in the WHERE instructions.

Because it makes code easier to read, the WITH clause is ideal for simplifying complex SQL queries.

Tip 3: Explain Yourself with Comments

Code tells how, but comments explain why. Comments are some of the most important helpful features in any code, from a programming project in Java to an SQL query. They allow programmers to express their thought processes and their desired outcomes.

Always add comments to your code. You’ll be glad you did, and so will any other programmers who work on your SQL. Even if you are just taking a break, add comments – they will help you pick up where you left off.

sql cheat sheet

This is how a very simple query looks without comments:

SELECT name, 
FROM course_marks
WHERE points < 300 OR points IS NULL;

Do you really know exactly what the WHERE means? Without comments, it is hard to understand what the coder wanted to accomplish.

Just a few comments are enough to do the job. Let’s see this same code, but with comments and explanations:

/*Get the list of every student 
who failed or was absent on exam*/

SELECT name, 
FROM course_marks
WHERE points < 300 OR points IS NULL; --student fails when gets 300 or less points.
--When student was absent, instead of points there is NULL in column

Better, isn’t it? These comments have explained the developer’s actions and helped us understand what the query does.

Tip 4: Break Queries into Steps

Complex queries are really a collection of blocks of code, which are much easier to control and to check for mistakes. Building your query a step at a time is a best practice for SQL development. This will enable you to find logical errors faster and be more confident of your results.

Look how much easier it is to understand this complex bit of SQL because it’s been broken into logical steps:

WITH subject AS (
 sum(exam_score) AS total_score 
 FROM subject_marks 
 GROUP BY student_id, subject_id, school_id

student AS (
 avg(total_score) AS average_student 
 FROM subject
 GROUP BY student_id, school_id

FROM student 
GROUP BY school_id;

We want the result of this query to show us the average amount of points for every student from every school. The first stage (the WITH subject clause) finds the sum of points for every subject. The second stage (the WITH student clause) averages points for every student. By the time we reach the SELECT, all we need to do is find the average amount of points and group the results by school.

By building a query this way, it becomes apparent that there are no long or difficult queries. It just depends on how you choose to break them up and write them.

Tip 5: Stick with One Naming Convention

It is also very important to use a single naming convention for tables, columns, and queries. This makes the code easier to write and read.

When a legacy database has tables named PRODUCT, users, USERS_other, and UserSECOND_NEW, it looks very unprofessional and the coding gets chaotic. Follow the established rules for capitalization, underscores, spaces, etc. This also applies to query styles: stick with the same formatting rules and indentation patterns for the whole database. If you start a new line after a SELECT, do it in all your queries.

baner sql operating on data

Naming conventions can be very different from database to database. As with indentation, there is no single industry-recognized style. I recommend that you find which conventions are the most widely used and model your own style on those. These articles in the Vertabelo and Launch by Lunch blogs are a good place to start.

The best way to understand the difference between good and bad query styling is to see them for yourself. Below, we find an example of poor style:

SELECT name, SECONDNAME, Date_TIME, address, SecondAddress
FROM registry, other_USERSData
WHERE registry.id = other_USERSData.id;

This query, which uses one constant style, is much more readable:

SELECT name, second_name, date_time, address, second_address
FROM registry, other_users_data
WHERE registry.id = other_users_data.id;

To sum up, these tips will help you write better queries, long or short. The best thing you can do is to practice using good techniques. If you don’t know how to develop your skills, Vertabelo Academy is a great solution. There are a lot of lessons that will teach you about managing queries and their structures.

It will take time to learn how to write clean and smart SQL queries. You’ll need to understand table relationships, code and data planning, how to identify the parts of a query, and what results you can expect. Patience and hard work are the keys to success!