Computing Tardiness: Date, Time, and Interval Arithmetic in SQL
In this article, we’re going to discuss some interesting operations we can perform with date-related data types in SQL. The SQL standard, which most relational databases comply with these days, specifies the date-related data types that must be present in relational databases. The most important of such data types are date, time, timestamp, and interval. Here’s a brief rundown of the differences between these data types:
- date: represents a single day (e.g., 18/11/2017)
- time: represents time with hour, minutes, and floating-point seconds (e.g., 10:44:30.2)
- timestamp: represents a point in time, with complete information of date and time (e.g., 2017-11-18 10:44:30.2)
- interval: represents a time interval which can have different levels of precision (e.g., 20:10:30, which translates to 20 hours, 10 minutes, 30 seconds).
Logging Employee Hours
We’ll use these data types with the following example of a firehouse to log the start and end times of firefighters. The table stores scheduled (expected) and actual start/end times for employees. Examining the data in this table, you can tell that some firefighters arrived on time while others did not.
The SQL standard also specifies that certain arithmetic operations between date-related values must return specific data types. In the next few sections, we’ll perform some date-time arithmetic operations to determine which firefighters arrived late to work.
Operations Between Numerical and Date Values
We’ll begin with the simple operation of adding an integer to a date. When we add an integer “N” to a date, we obtain another date that is “N” days in the future or past, depending on whether the integer is positive or negative, respectively. With this in mind, let’s write a query that narrows our focus to all firefighters who worked over the past five days:
SELECT * FROM firestation_working_time WHERE SchedStartTime > '2017-11-10' - 5
We can perform the same operation using an interval instead of an integer. Moreover, using the interval approach, we can add/subtract not only the number of days but also the number of months, years, or a combination of all three from a particular date. Let’s see how we can obtain the firefighters who worked over the past 72 hours. Take a look at the following query:
SELECT * FROM firestation_working_time WHERE "SchedStartTime" > '2017-11-08 08:00:00' - interval '72 hours'
One difference between the two queries we’ve written so far is in the data types they return. While the first query returns a date, the second query returns a timestamp. When we add a date to an interval, the resulting data type must be a timestamp. Internally, SQL treats the date as a timestamp by simply appending to it the time “00:00:00”.
Before we move on to other arithmetic operations, I’d like to challenge you with a question: What would happen if we were to add 10 minutes to a date? I encourage you to write the appropriate SQL query to accomplish this task. Use the previous example as a model.
Identifying Tardy Workers: Subtracting Timestamps
Let’s now try another arithmetic operation. What happens if we subtract one date from another date or a timestamp from another timestamp? In both cases, the result will be an interval, representing the elapsed time between the two dates or timestamps.
Let’s see an example of this in action using the same table. Suppose we’d like to obtain data for the firefighters who arrived late on the date 2017-11-02. We can execute the following query:
SELECT "FiremanFullName", "RealStartTime" - "SchedStartTime" "Starting Delay" FROM firestation_working_time WHERE "RealStartTime"::date = '2017-11-02' AND "RealStartTime" - "SchedStartTime" > interval '0 seconds'
Average Tardiness: Aggregation With Intervals
We can also use aggregate functions such as
SUM with arithmetic operations. Suppose our boss reviews the information for all firefighters who were late on the date 2017-11-02 and asks that we send him a complete report with the average and total accumulative tardiness of these employees. The following query will return this report:
SELECT "FiremanFullName", AVG("RealStartTime"-"SchedStartTime")"AverageDelay", SUM("RealStartTime"-"SchedStartTime") "Accum. Delay" FROM firestation_working_time WHERE MONTH("RealStartTime"::date) = 11 AND YEAR("RealStartTime"::date) = 2017 GROUP BY "FiremanFullName"
We can also execute a query to obtain the firefighters who logged excess hours. We’ll calculate the amount of excess hours they worked and organize the results in descending order.
SELECT "FiremanFullName", SUM("RealEndTime"-"SchedEndTime") "Extra Work" FROM firestation_working_time WHERE MONTH("RealStartTime"::date) = 11 AND YEAR("RealStartTime"::date) = 2017 GROUP BY "FiremanFullName" ORDER BY "Extra Work" DESC
In the previous query, notice that we’re adding intervals. The excess time a firefighter logged on a specific day is represented as an interval. The
SUM function is then used to add these overtime intervals to obtain the firefighters’ cumulative excess hours for the month of November, 2017. It should be clear that the result of adding two or more intervals is another interval.
Naturally, you may be wondering: can we add all kinds of data types to each other? What if we try to add two dates or two timestamps? Simply put, we can’t perform these operations because they do not produce meaningful results. However, recall that subtracting one date from another is perfectly fine, since doing so merely produces a time interval.
As we learned, there are several arithmetic operations we can perform with date, timestamp, and interval data types. Moreover, the results of such operations can have different data types than those of their operands.
We didn’t cover all operations between date-related data types in this article, such as multiplying an interval by an integer. If you’re interested in learning more about these topics, go ahead and check out the courses we offer at Vertabelo Academy.