Generating Recurring Timespans in Postgres

Pixelated Andrew Pixelated Andrew

At a previous job, my team worked on what at its core was a Gantt chart for scheduling shift-work.

We ended up pulling in a library at the application layer for the sole purpose of generating time ranges with which to populate the schedule. On top of adding a single-use dependency, the resulting implementation was convoluted and relatively slow. I was thinking this problem must be solvable at the database level and as it turns out, it is! Well, it is provided you are using postgres or any RDMS that supports a function for generating series.

Getting Started

First off, we’ll need a shifts table to hold our “rules” for describing recurring shifts, for example: “Every Monday from 9am - 1pm”. More complex rules like, “Every other Tuesday” is beyond the scope of this post.

CREATE TABLE shifts (
  name varchar(255),  -- Name of the shift, like "Morning" or "Afternoon"
  days integer[],     -- Array of ints representing the day of the week
  start_time time,    -- Start time (without date)
  end_time time       -- End time (without date)
);

days is an array of integers representing the day of the week in ISO format—so 1 is Monday, 2 is Tuesday, and so on. start_time and end_time are stored in postgres’ time type, which is a very convenient “time without date” type that is ever-missing from some (many?) popular languages. name is simply and arbitrary name we will give our shifts so we can identify them.

Now let’s populate our table:

INSERT INTO shifts
  (name,        days,          start_time, end_time  )
VALUES
  ('Full Time', '{1,2,3,4,5}', '09:00:00', '17:00:00'),  -- Mon-Fri, 9am-5pm
  ('Part Time', '{5,6}',       '12:00:00', '20:00:00'),  -- Thu-Fri, 12pm-8pm
  ('Weekend',   '{6,7}',       '12:00:00', '20:00:00');  -- Sat-Sun, 12pm-8pm

Now that we have that, let’s generate a schedule!

WITH days AS (
  SELECT * FROM
    generate_series(
      timestamp '2018-01-01T00:00:00', -- start_timestamp
      timestamp '2018-01-07T00:00:00', -- end_timestamp
      interval '1 day'
    ) AS day
)
SELECT
  shifts.name AS "Name",
  (days.day + shifts.start_time)::date AS "Date",
  (days.day + shifts.start_time)::time AS "Start",
  (days.day + shifts.end_time)::time AS "End",
  EXTRACT(isodow FROM days.day + shifts.start_time) AS "Day of Week"
FROM shifts
CROSS JOIN days
WHERE EXTRACT(isodow FROM days.day + shifts.start_time) = ANY(shifts.days)

Running this should give you the following result:

   Name    |    Date    |  Start   |   End    | Day of Week
-----------+------------+----------+----------+-------------
 Full Time | 2018-01-01 | 09:00:00 | 17:00:00 |           1
 Full Time | 2018-01-02 | 09:00:00 | 17:00:00 |           2
 Full Time | 2018-01-03 | 09:00:00 | 17:00:00 |           3
 Full Time | 2018-01-04 | 09:00:00 | 17:00:00 |           4
 Full Time | 2018-01-05 | 09:00:00 | 17:00:00 |           5
 Part Time | 2018-01-05 | 12:00:00 | 20:00:00 |           5
 Part Time | 2018-01-06 | 12:00:00 | 20:00:00 |           6
 Weekend   | 2018-01-06 | 12:00:00 | 20:00:00 |           6
 Weekend   | 2018-01-07 | 12:00:00 | 20:00:00 |           7

Breaking it down

First off, let’s take a look at the SELECT statement within the WITH, ie, the common table expression . Here we have our two pieces of variable data representing the time range we want to generate a schedule for. We pass these to postgres' generate series function with an interval of one day.

WITH days AS (
  SELECT * FROM
    generate_series(
      timestamp '2018-01-01T00:00:00',
      timestamp '2018-01-07T00:00:00',
      INTERVAL '1 day'
) AS day

Running this on its own will produce one row for each day of the time range:

         day
---------------------
 2018-01-01 00:00:00
 2018-01-02 00:00:00
 2018-01-03 00:00:00
 2018-01-04 00:00:00
 2018-01-05 00:00:00
 2018-01-06 00:00:00
 2018-01-07 00:00:00

Now let’s look at the next section.

SELECT
  shifts.name AS "Name",
  (days.day + shifts.start_time)::date AS "Date",
  (days.day + shifts.start_time)::time AS "Start",
  (days.day + shifts.end_time)::time AS "End",
  EXTRACT(isodow FROM days.day + shifts.start_time) AS "Day of Week"
FROM shifts
CROSS JOIN days

CROSS JOIN will generate the Cartesian product (all possible combinations) of the results. We can see this if we combine the queries;

SELECT WITH days AS (
  SELECT * FROM
    generate_series(
      timestamp '2018-01-01T00:00:00', -- start_timestamp
      timestamp '2018-01-07T00:00:00', -- end_timestamp
      interval '1 day'
    ) AS day
)
SELECT *
FROM shifts
CROSS JOIN days

…which produces:

   name    |    days     | start_time | end_time |         day
-----------+-------------+------------+----------+---------------------
 Full Time | {1,2,3,4,5} | 09:00:00   | 17:00:00 | 2018-01-01 00:00:00
 Part Time | {5,6}       | 12:00:00   | 20:00:00 | 2018-01-01 00:00:00
 Weekend   | {6,7}       | 12:00:00   | 20:00:00 | 2018-01-01 00:00:00
 Full Time | {1,2,3,4,5} | 09:00:00   | 17:00:00 | 2018-01-02 00:00:00
 Part Time | {5,6}       | 12:00:00   | 20:00:00 | 2018-01-02 00:00:00
 Weekend   | {6,7}       | 12:00:00   | 20:00:00 | 2018-01-02 00:00:00
 Full Time | {1,2,3,4,5} | 09:00:00   | 17:00:00 | 2018-01-03 00:00:00
 Part Time | {5,6}       | 12:00:00   | 20:00:00 | 2018-01-03 00:00:00
 Weekend   | {6,7}       | 12:00:00   | 20:00:00 | 2018-01-03 00:00:00
 Full Time | {1,2,3,4,5} | 09:00:00   | 17:00:00 | 2018-01-04 00:00:00
 Part Time | {5,6}       | 12:00:00   | 20:00:00 | 2018-01-04 00:00:00
 Weekend   | {6,7}       | 12:00:00   | 20:00:00 | 2018-01-04 00:00:00
 Full Time | {1,2,3,4,5} | 09:00:00   | 17:00:00 | 2018-01-05 00:00:00
 Part Time | {5,6}       | 12:00:00   | 20:00:00 | 2018-01-05 00:00:00
 Weekend   | {6,7}       | 12:00:00   | 20:00:00 | 2018-01-05 00:00:00
 Full Time | {1,2,3,4,5} | 09:00:00   | 17:00:00 | 2018-01-06 00:00:00
 Part Time | {5,6}       | 12:00:00   | 20:00:00 | 2018-01-06 00:00:00
 Weekend   | {6,7}       | 12:00:00   | 20:00:00 | 2018-01-06 00:00:00
 Full Time | {1,2,3,4,5} | 09:00:00   | 17:00:00 | 2018-01-07 00:00:00
 Part Time | {5,6}       | 12:00:00   | 20:00:00 | 2018-01-07 00:00:00
 Weekend   | {6,7}       | 12:00:00   | 20:00:00 | 2018-01-07 00:00:00

As we can see, we get every shift as if they all happen every day. Clearly we need to deal with this which is where the final piece comes in:

The English version of this line is Only keep results where the day of the week of a shift’s start time falls on one of the days it’s available on. EXTRACT is used here to grab the “ISO Day of Week” (isodow) and matching it against the days column. If that day is present in the days array for that row, then it’s a match. You can think of ANY as kind of like IN but for arrays (it has other uses too).

So putting all three together produces the result we already saw at the top:

WITH days AS (
  SELECT * FROM
    generate_series(
      timestamp '2018-01-01T00:00:00',
      timestamp '2018-01-07T00:00:00',
      interval '1 day'
    ) AS day
)
SELECT
  shifts.name AS "Name",
  (days.day + shifts.start_time)::date AS "Date",
  (days.day + shifts.start_time)::time AS "Start",
  (days.day + shifts.end_time)::time AS "End",
  EXTRACT(isodow FROM days.day + shifts.start_time) AS "Day of Week"
FROM shifts
CROSS JOIN days
WHERE EXTRACT(isodow FROM days.day + shifts.start_time) = ANY(shifts.days)
   Name    |    Date    |  Start   |   End    | Day of Week
-----------+------------+----------+----------+-------------
 Full Time | 2018-01-01 | 09:00:00 | 17:00:00 |           1
 Full Time | 2018-01-02 | 09:00:00 | 17:00:00 |           2
 Full Time | 2018-01-03 | 09:00:00 | 17:00:00 |           3
 Full Time | 2018-01-04 | 09:00:00 | 17:00:00 |           4
 Full Time | 2018-01-05 | 09:00:00 | 17:00:00 |           5
 Part Time | 2018-01-05 | 12:00:00 | 20:00:00 |           5
 Part Time | 2018-01-06 | 12:00:00 | 20:00:00 |           6
 Weekend   | 2018-01-06 | 12:00:00 | 20:00:00 |           6
 Weekend   | 2018-01-07 | 12:00:00 | 20:00:00 |           7

And there you have it.

Next steps

To actually use this, you would probably want more machine-friendly output, ie, don’t split up the Date and Start+End columns like I did. Plain ol' start+end datetimes is probably want you want. To take it a step further you could use postgres’ range types to store the the start and end dates as a single ranged unit. But that is a topic for another time.