Useful SQL Patterns: Date Generator

SQL Patterns, date and time, generators, generator,

As you start coding in SQL, you will use some statements and techniques over and over again. We call these “SQL patterns”. This series will look at the most common SQL patterns and consider how to use them.

SQL patterns, such as the pivot pattern we discussed last week, can save you a lot of time and effort. Suppose you are asked to get a range of days in a financial quarter, but the only records you have are for the start and end dates of each quarter. Do you have to manually add all the missing days? No. You can use a technique known as a date generator to fill in the gaps. This SQL pattern is commonly used with date and time data, and it can be done granularly, down to every second of a year. Another name for this technique is “expanding to value”, but we’ll stick with “generator”.

Expanding Date Values with a Generator

So you want to have your data expanded to include every data point in a range. For the purposes of our example, we’ll say that you need to get the first day of every month in a year. Let’s use the following tables to see how this could be done.

The balance table contains:

  • id – The table’s primary key
  • account – The account ID number
  • customer – A unique number assigned to each customer
  • amount – The customer’s account balance during a specific time
  • datetime_from – The date the balance was entered
  • datetime_to – The date the balance changed

The dates table contains:

  • datetime – A date representing a certain period
  • day – The day in datetime
  • month – The month in datetime
  • year – The year in datetime

Data model

In the balance table, there are intervals in which the balance amount has a range of dates (defined by datetime_from and datetime_to) for a particular client and account.

This data looks like:

id datetime_from account customer amount date_to
29 2017-01-01 1 1 1000 2017-04-01
30 2017-04-01 1 1 500 2017-06-01
31 2017-06-01 1 1 1500 2017-08-01
32 2017-08-01 1 1 2100 2017-12-01

We need the dates table to perform the expansion; it contains all the dates needed for our case.

Let’s say it holds the first day of every month of 2017:

datetime day month year
2017-01-01 1 1 2017
2017-02-01 1 2 2017
2017-03-01 1 3 2017
2017-04-01 1 4 2017
2017-05-01 1 5 2017
2017-06-01 1 6 2017
2017-07-01 1 7 2017
2017-08-01 1 8 2017
2017-09-01 1 9 2017
2017-10-01 1 10 2017
2017-11-01 1 11 2017
2017-12-01 1 12 2017

To expand the row set, we use the BETWEEN operator. We specify the date range (using datetime_from and datetime_to from the balance table) and define what goes in between them. In this case, it will be the datetime attributes from the dates table.

The query looks like this :

  sum(amount) AS amount
  JOIN dates ON (dates.datetime BETWEEN balance.datetime_from AND balance.date_to-1)
GROUP BY customer, dates.datetime
ORDER BY datetime;

And the resulting data set is :

datetime customer amount
2017-01-01 1 1000
2017-02-01 1 1000
2017-03-01 1 1000
2017-04-01 1 500
2017-05-01 1 500
2017-06-01 1 1500
2017-07-01 1 1500
2017-08-01 1 2100
2017-09-01 1 2100
2017-10-01 1 2100
2017-11-01 1 2100

Learning More About Generators

Now we’ve seen a generator, a query which returns a bigger dataset than what it started with. Be careful writing these types of queries: a small mistake can really slow down the execution time.

You can read more about adding information to tables in this data warehousing article. Or if you want to find out more about SQL Patterns, check out the beginning of the series here.

Aldo Zelen

Aldo is a data architect with a passion for the cloud. From leading a team of data professionals to coding a data warehouse in the cloud, Aldo has experience with the whole lifecycle of data-intensive projects. Aldo spends his free time forecasting geopolitical events in forecasting tournaments.

comments powered by Disqus


Over 85.000 happy students
and counting!