It’s inevitable that some data in the database has no value what in SQL is represented by the
NULL keyword. “No value” here is different from zero, false, or an empty string (but with exceptions! In Oracle database,
NULL is the same as string of zero length). During playing with the database, treating
NULLs are becoming more and more troublesome, therefore >MySQL standard provides help with some of the functions, like COALESCE.
Let’s introduce the
COALESCE function returns a first non
NULL argument from the passed list of arguments. Its syntax is as follows:
COALESCE(x, y, … , n)
What additionally is important, that:
- It takes at least two arguments.
- The expressions in the
COALESCEfunction must evaluate to the same data type (For example the SQL statement
SELECT COALESCE (1, 'aa', NULL, 23);will produce an error).
- If all listed arguments evaluate to
NULL, the function also returns
COALESCE is essentially a shorthanded
CASE statement presented below:
CASE WHEN x IS NOT NULL THEN x WHEN y IS NOT NULL THEN y WHEN ... IS NOT NULL THEN … ELSE n END
COALESCE – usage examples
Now, take a look at the presented below prototype of the university database model. I will show some examples that requires the
COALESCE function that refers to tables from this model.
— Vertabelo (@Vertabelo) October 25, 2016
Using COALESCE with value concatenation
Take a look at the
student table from the introduced university database model:
It may happen, that the particular person could not have the the middle name, therefore some records in the
middle_name column may be
NULLs like show the example data below.
Let’s try to concatenate student’s first name, middle name, last name. This one works in PostgreSQL, Oracle.
SELECT first_name || ' ' || middle_name || ' ' || last_name AS full_name FROM student;
The result table contains a column
full_name with the concatenated strings.
Notice that only one record has a proper full name. The rest of them are
NULLs, because the middle name was also
NULL (while concatenating strings with
NULL value, the result is also
Such a result doesn’t looks very nice. The result don’t show the full name of students with no middle name. Here the
COALESCE() function comes with the rescue. Using this we can replace
NULL fields with for example an empty string. Now, the new one query looks as follows:
SELECT first_name || COALESCE(' ' || middle_name || ' ', ' ') || last_name AS full_name FROM student;
The result is now complete 🙂
MySQL, MS SQL Server uses for the concatenation
CONCAT() function or ‘
Similarly, the query in MySQL or MS SQL Server will look as follows:
SELECT CONCAT(first_name, ' ', middle_name, ' ', last_name) FROM student;
… and the records that have
NULLs will also have :
COALESCE function, the
NULLs that appear in the
middle_name column will be replaced.
SELECT CONCAT(first_name, ' ', COALESCE(middle_name, ''), ' ', last_name) FROM student;
Using PostgreSQL COALESCE with aggregate functions
NULLs indeed can appear very problematic. Let’s imagine different situation. I want to get to know what’s the average grade for the course. It seems very easy, right? But, wait, wait… even if this is a simple query we can encounter some drawbacks. The issue concerns some of the SQL aggregate functions like
EXAMPLE: Using COALESCE with AVG function
Let’s try an example with
As you see below, students who are registered to the course of id 1 don’t have a grade yet.
The query, that returns the average grade for the course looks as follows:
SELECT course_instance_id, AVG(grade) AS average_grade_in_course FROM student_courses GROUP BY course_instance_id;
AVG() function , the non-
NULL fields are summed and the sum is divided by the number of non-
NULL fields. So, if all the records have
NULL in grade column, the average grade will be
NULL as well.
The result table below:
In such a case, we want to insert other value, like 0 (the grade can be a value from 2 to 5, so the average 0 can indicate for us that students don’t have grades)
COALESCE() function comes with help in replacing a values. The same query with
COALESCE will be:
SELECT course_instance_id, COALESCE(AVG(grade), 0) AS average_grade_in_course FROM student_courses GROUP BY course_instance_id;
And now we can get the result table as:
EXAMPLE: Using COALESCE with SUM() function
This one is a similar example. Let’s assume different situation. We want to count the total amount of the absence hours for each student. Take a look at selected part of the model:
The data in the table
student_course_attendance looks as follows.
While the data
Take a look at the data in those tables closely. As you see, two students are registered for the particular course, while only the one was missing classes.
The query, that will calculate the sum of the total number of the absence hours for a
course_id = 1 will for each student is:
SELECT student_id, SUM(absence_hours) AS total_absence_hours FROM student_courses sc LEFT JOIN student_course_attendance sca ON sc.id = sca.student_courses_id WHERE course_instance_id = 1 GROUP BY 1;
… and evaluates in a result table:
The total absence hour was evaluated to
NULL, because indeed, this student didn’t miss any classes in this course instance. There weren’t any record for this student in the
student_course_attendance table, and the
SUM() function returned
NULL. We can secure from this situation of course using
The new query is as follows:
SELECT student_id, COALESCE(SUM(absence_hours), 0) AS total_absence_hours FROM student_courses sc LEFT JOIN student_course_attendance sca ON sc.id = sca.student_courses_id WHERE course_instance_id = 1 GROUP BY 1;
And the result is now:
Let’s abandon now the university model and look at the different example.
Using COALESCE in creating PIVOT tables
This use case will be presented for the PostgreSQL database. Note, that other databases like Oracle and MS SQL Server have their equivalent functions for creating pivot tables.
Look at the table below. This is the result of some query, that shows what is the total sale for each brand in the particular month.
I want to transponse the rows with columns (create a pivot table). The result table should have columns:
dec. Notice that not in the every month product from the particular brand was sold. Let’s take a look at the Postgres example.
SELECT brand_id, jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec FROM CROSSTAB ( 'SELECT year, month, qty FROM sales ORDER BY 1', 'SELECT m FROM generate_series(1,12) m' ) AS ( year int, "jan" int, "feb" int, "mar" int, "apr" int, "may" int, "jun" int, "jul" int, "aug" int, "sep" int, "oct" int, "nov" int, "dec" int );
After executing such a query, we receive:
Unfortunately, the sell is recorded only in some of the months and only in those fields, the number of total sales is pasted. As you probably expected, the rest of the fields contains
To get rid of the unwanted
NULLs we can use the
COALESCE function, that when needed insert a 0 instead of
NULL. The repaired query looks as follows:
SELECT brand_id, COALESCE(jan, 0), COALESCE(feb, 0), COALESCE(mar, 0), COALESCE(apr, 0), COALESCE(may, 0), COALESCE(jun, 0), COALESCE(jul, 0), COALESCE(aug, 0), COALESCE(sep, 0), COALESCE(oct, 0), COALESCE(nov, 0), COALESCE(dec, 0) FROM CROSSTAB( 'SELECT year, month, qty FROM sales ORDER BY 1', 'SELECT m FROM generate_series(1,12) m' ) AS ( year int, "jan" int, "feb" int, "mar" int, "apr" int, "may" int, "jun" int, "jul" int, "aug" int, "sep" int, "oct" int, "nov" int, "dec" int );
NULLs can make life problematic. If you haven’t experienced the bad side of the missing values yet, you will come across it for sure. It’s a matter of time. For now, remember the
COALESCE function, that will help you tackle the unwanted