Generating Recurring Timespans in Postgres
At a previous job, my team worked on what at its core was a Gantt chart for scheduling shift-work.
We ended up pulling in a library at the application layer for the sole purpose of generating time ranges with which to populate the schedule. On top of adding a single-use dependency, the resulting implementation was convoluted and relatively slow. I was thinking this problem must be solvable at the database level and as it turns out, it is! Well, it is provided you are using postgres or any RDMS that supports a function for generating series.
Getting Started
First off, we’ll need a shifts
table to hold our “rules” for
describing recurring shifts, for example: “Every Monday from 9am - 1pm”. More
complex rules like, “Every other Tuesday” is beyond the scope of this post.
CREATE TABLE shifts (
name varchar(255), -- Name of the shift, like "Morning" or "Afternoon"
days integer[], -- Array of ints representing the day of the week
start_time time, -- Start time (without date)
end_time time -- End time (without date)
);
days
is an array of integers representing the day of the
week in ISO format—so 1
is Monday, 2
is
Tuesday, and so on. start_time
and end_time
are
stored in postgres’ time
type, which is a very convenient “time
without date” type that is ever-missing from some (many?) popular languages.
name
is simply and arbitrary name we will give our shifts so we can
identify them.
Now let’s populate our table:
INSERT INTO shifts
(name, days, start_time, end_time )
VALUES
('Full Time', '{1,2,3,4,5}', '09:00:00', '17:00:00'), -- Mon-Fri, 9am-5pm
('Part Time', '{5,6}', '12:00:00', '20:00:00'), -- Thu-Fri, 12pm-8pm
('Weekend', '{6,7}', '12:00:00', '20:00:00'); -- Sat-Sun, 12pm-8pm
Now that we have that, let’s generate a schedule!
WITH days AS (
SELECT * FROM
generate_series(
timestamp '2018-01-01T00:00:00', -- start_timestamp
timestamp '2018-01-07T00:00:00', -- end_timestamp
interval '1 day'
) AS day
)
SELECT
shifts.name AS "Name",
(days.day + shifts.start_time)::date AS "Date",
(days.day + shifts.start_time)::time AS "Start",
(days.day + shifts.end_time)::time AS "End",
EXTRACT(isodow FROM days.day + shifts.start_time) AS "Day of Week"
FROM shifts
CROSS JOIN days
WHERE EXTRACT(isodow FROM days.day + shifts.start_time) = ANY(shifts.days)
Running this should give you the following result:
Name | Date | Start | End | Day of Week
-----------+------------+----------+----------+-------------
Full Time | 2018-01-01 | 09:00:00 | 17:00:00 | 1
Full Time | 2018-01-02 | 09:00:00 | 17:00:00 | 2
Full Time | 2018-01-03 | 09:00:00 | 17:00:00 | 3
Full Time | 2018-01-04 | 09:00:00 | 17:00:00 | 4
Full Time | 2018-01-05 | 09:00:00 | 17:00:00 | 5
Part Time | 2018-01-05 | 12:00:00 | 20:00:00 | 5
Part Time | 2018-01-06 | 12:00:00 | 20:00:00 | 6
Weekend | 2018-01-06 | 12:00:00 | 20:00:00 | 6
Weekend | 2018-01-07 | 12:00:00 | 20:00:00 | 7
Breaking it down
First off, let’s take a look at the SELECT
statement within
the WITH
, ie, the
common table expression
.
Here we have our two pieces of variable data representing the time range we want
to generate a schedule for. We pass these to postgres'
generate series
function
with an interval of one day.
WITH days AS (
SELECT * FROM
generate_series(
timestamp '2018-01-01T00:00:00',
timestamp '2018-01-07T00:00:00',
INTERVAL '1 day'
) AS day
Running this on its own will produce one row for each day of the time range:
day
---------------------
2018-01-01 00:00:00
2018-01-02 00:00:00
2018-01-03 00:00:00
2018-01-04 00:00:00
2018-01-05 00:00:00
2018-01-06 00:00:00
2018-01-07 00:00:00
Now let’s look at the next section.
SELECT
shifts.name AS "Name",
(days.day + shifts.start_time)::date AS "Date",
(days.day + shifts.start_time)::time AS "Start",
(days.day + shifts.end_time)::time AS "End",
EXTRACT(isodow FROM days.day + shifts.start_time) AS "Day of Week"
FROM shifts
CROSS JOIN days
CROSS JOIN
will generate the Cartesian product (all possible combinations) of
the results. We can see this if we combine the queries;
SELECT WITH days AS (
SELECT * FROM
generate_series(
timestamp '2018-01-01T00:00:00', -- start_timestamp
timestamp '2018-01-07T00:00:00', -- end_timestamp
interval '1 day'
) AS day
)
SELECT *
FROM shifts
CROSS JOIN days
…which produces:
name | days | start_time | end_time | day
-----------+-------------+------------+----------+---------------------
Full Time | {1,2,3,4,5} | 09:00:00 | 17:00:00 | 2018-01-01 00:00:00
Part Time | {5,6} | 12:00:00 | 20:00:00 | 2018-01-01 00:00:00
Weekend | {6,7} | 12:00:00 | 20:00:00 | 2018-01-01 00:00:00
Full Time | {1,2,3,4,5} | 09:00:00 | 17:00:00 | 2018-01-02 00:00:00
Part Time | {5,6} | 12:00:00 | 20:00:00 | 2018-01-02 00:00:00
Weekend | {6,7} | 12:00:00 | 20:00:00 | 2018-01-02 00:00:00
Full Time | {1,2,3,4,5} | 09:00:00 | 17:00:00 | 2018-01-03 00:00:00
Part Time | {5,6} | 12:00:00 | 20:00:00 | 2018-01-03 00:00:00
Weekend | {6,7} | 12:00:00 | 20:00:00 | 2018-01-03 00:00:00
Full Time | {1,2,3,4,5} | 09:00:00 | 17:00:00 | 2018-01-04 00:00:00
Part Time | {5,6} | 12:00:00 | 20:00:00 | 2018-01-04 00:00:00
Weekend | {6,7} | 12:00:00 | 20:00:00 | 2018-01-04 00:00:00
Full Time | {1,2,3,4,5} | 09:00:00 | 17:00:00 | 2018-01-05 00:00:00
Part Time | {5,6} | 12:00:00 | 20:00:00 | 2018-01-05 00:00:00
Weekend | {6,7} | 12:00:00 | 20:00:00 | 2018-01-05 00:00:00
Full Time | {1,2,3,4,5} | 09:00:00 | 17:00:00 | 2018-01-06 00:00:00
Part Time | {5,6} | 12:00:00 | 20:00:00 | 2018-01-06 00:00:00
Weekend | {6,7} | 12:00:00 | 20:00:00 | 2018-01-06 00:00:00
Full Time | {1,2,3,4,5} | 09:00:00 | 17:00:00 | 2018-01-07 00:00:00
Part Time | {5,6} | 12:00:00 | 20:00:00 | 2018-01-07 00:00:00
Weekend | {6,7} | 12:00:00 | 20:00:00 | 2018-01-07 00:00:00
As we can see, we get every shift as if they all happen every day. Clearly we need to deal with this which is where the final piece comes in:
The English version of this line is Only keep results where the day of
the week of a shift’s start time falls on one of the days it’s available
on. EXTRACT
is used here to grab the “ISO Day of Week”
(isodow
) and matching it against the days
column. If
that day is present in the days
array for that row, then it’s
a match. You can think of ANY
as kind of like IN
but
for arrays (it has other uses too).
So putting all three together produces the result we already saw at the top:
WITH days AS (
SELECT * FROM
generate_series(
timestamp '2018-01-01T00:00:00',
timestamp '2018-01-07T00:00:00',
interval '1 day'
) AS day
)
SELECT
shifts.name AS "Name",
(days.day + shifts.start_time)::date AS "Date",
(days.day + shifts.start_time)::time AS "Start",
(days.day + shifts.end_time)::time AS "End",
EXTRACT(isodow FROM days.day + shifts.start_time) AS "Day of Week"
FROM shifts
CROSS JOIN days
WHERE EXTRACT(isodow FROM days.day + shifts.start_time) = ANY(shifts.days)
Name | Date | Start | End | Day of Week
-----------+------------+----------+----------+-------------
Full Time | 2018-01-01 | 09:00:00 | 17:00:00 | 1
Full Time | 2018-01-02 | 09:00:00 | 17:00:00 | 2
Full Time | 2018-01-03 | 09:00:00 | 17:00:00 | 3
Full Time | 2018-01-04 | 09:00:00 | 17:00:00 | 4
Full Time | 2018-01-05 | 09:00:00 | 17:00:00 | 5
Part Time | 2018-01-05 | 12:00:00 | 20:00:00 | 5
Part Time | 2018-01-06 | 12:00:00 | 20:00:00 | 6
Weekend | 2018-01-06 | 12:00:00 | 20:00:00 | 6
Weekend | 2018-01-07 | 12:00:00 | 20:00:00 | 7
And there you have it.
Next steps
To actually use this, you would probably want more machine-friendly output, ie,
don’t split up the Date and Start+End columns like I did. Plain ol'
start+end datetime
s is probably want you want. To take it a step
further you could use postgres’ range
types to store the the start
and end dates as a single ranged unit. But that is a topic for another time.