By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
438,590 Members | 2,191 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 438,590 IT Pros & Developers. It's quick & easy.

consolidate time periods

P: n/a
Can I get a little help? Given a very simple table of events, with
their durations, I want to consolidate time periods for an event that
(1) touch each other or (2) overlap each other.

CREATE TABLE Events
(event_id VARCHAR(25) NOT NULL,
start_date DATE NOT NULL,
end_date DATE NOT NULL,
CHECK (start_date <= end_date),
PRIMARY KEY (event_id, start_date, end_date));

INSERT INTO Events
VALUES ('Pie Eating', '2009-01-01', '2009-01-02'),
('Pie Eating', '2009-01-03', '2009-01-05'),
('Pie Eating', '2009-01-04', '2009-01-07'),
('Pie Eating', '2009-02-01', '2009-02-07');

The goal is to reduce these three rows into fewer rows that show how
many days we were eating pies.

('Pie Eating', '2009-01-01', '2009-01-07')

The first and second rows in the INSERT INTO statement touch each
other and can be replaced with:

('Pie Eating', '2009-01-01', '2009-01-05')

The third row will overlap with this new row and can be consolidated
with it, as shown before. However, the fourth row has data that
occurs a month after the first three and it is disjoint from all the
other rows.

What I want is:

1) A portable, simple cursor solution
2) A Recursive CTE solution

I am going to use this in a book, so you will get credit.

Mar 24 '07 #1
Share this Question
Share on Google+
14 Replies


P: n/a
--CELKO-- wrote:
Can I get a little help? Given a very simple table of events, with
their durations, I want to consolidate time periods for an event that
(1) touch each other or (2) overlap each other.

CREATE TABLE Events
(event_id VARCHAR(25) NOT NULL,
start_date DATE NOT NULL,
end_date DATE NOT NULL,
CHECK (start_date <= end_date),
PRIMARY KEY (event_id, start_date, end_date));

INSERT INTO Events
VALUES ('Pie Eating', '2009-01-01', '2009-01-02'),
('Pie Eating', '2009-01-03', '2009-01-05'),
('Pie Eating', '2009-01-04', '2009-01-07'),
('Pie Eating', '2009-02-01', '2009-02-07');

The goal is to reduce these three rows into fewer rows that show how
many days we were eating pies.

('Pie Eating', '2009-01-01', '2009-01-07')

The first and second rows in the INSERT INTO statement touch each
other and can be replaced with:

('Pie Eating', '2009-01-01', '2009-01-05')

The third row will overlap with this new row and can be consolidated
with it, as shown before. However, the fourth row has data that
occurs a month after the first three and it is disjoint from all the
other rows.

What I want is:

1) A portable, simple cursor solution
2) A Recursive CTE solution

I am going to use this in a book, so you will get credit.
1) Too much work and with SQL Server supporting recursive CTE I claim
portability ;-)
2) WITH rec(event_id, start_date, end_date)
AS (SELECT event_id, start_date, end_date FROM events
UNION ALL
SELECT rec.event_id, rec.start_date, events.end_date
FROM events,
rec
WHERE events.start_date
BETWEEN rec.start_date AND rec.end_date + 1 day
AND events.end_date rec.end_date)
SELECT event_id, MIN(start_date) as start_date, end_date
FROM (SELECT event_id, start_date, MAX(end_date) AS end_date
FROM rec
GROUP BY event_id, start_date) AS rtrunc
GROUP BY event_id, end_date;

EVENT_ID START_DATE END_DATE
------------------------- ---------- ----------
SQL0347W The recursive common table expression "SRIELAU.REC" may contain an
infinite loop. SQLSTATE=01605

Pie Eating 01/01/2009 01/07/2009
Pie Eating 02/01/2009 02/07/2009

2 record(s) selected with 1 warning messages printed.

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Mar 25 '07 #2

P: n/a
PS: I suspect this can be written with OLAP expressions as well using
windowing

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Mar 25 '07 #3

P: n/a
Serge Rielau wrote:
PS: I suspect this can be written with OLAP expressions as well using
windowing
I guess something like:

SELECT min_start_date, MAX(end_date)
FROM (
SELECT
start_date, end_date,
MAX(CASE WHEN start_date <= max_end_date + 1 day
THEN NULL ELSE start_date END)
OVER (ORDER BY start_date, end_date
ROWS UNBOUNDED PRECEDING)
FROM (
SELECT
start_date, end_date,
MAX(end_date) OVER (
ORDER BY start_date, end_date
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
)
FROM events
) T1 (start_date, end_date, max_end_date)
) T2 (start_date, end_date, min_start_date)
GROUP BY min_start_date
ORDER BY min_start_date;

should work

/Lennart

Mar 25 '07 #4

P: n/a
On Mar 25, 10:58 am, Lennart <erik.lennart.jons...@gmail.comwrote:

Sorry, should take tare of different event_id's as well:

SELECT event_id, min_start_date, MAX(end_date)
FROM (
SELECT
event_id, start_date, end_date,
MAX(CASE WHEN start_date <= max_end_date + 1 day
THEN NULL ELSE start_date END)
OVER (PARTITION BY event_id
ORDER BY event_id, start_date, end_date
ROWS UNBOUNDED PRECEDING)
FROM (
SELECT
event_id, start_date, end_date,
MAX(end_date) OVER (
PARTITION BY event_id
ORDER BY event_id, start_date, end_date
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
)
FROM events
) T1 (event_id, start_date, end_date, max_end_date)
) T2 (event_id, start_date, end_date, min_start_date)
GROUP BY event_id, min_start_date
ORDER BY event_id, min_start_date;
/Lennart

Mar 25 '07 #5

P: n/a
>1) Too much work and with SQL Server supporting recursive CTE I claim portability ;-) <<

Actually, it is pretty easy:

DECLARE EventList CURSOR FOR
SELECT DISTINCT E.event_id, C.cal_date
FROM Events AS E, Calendar AS C
WHERE C.cal_date BETWEEN E.start_date AND E.end_date
ORDER BY E.event_id, C.cal_date ASC
FOR READ ONLY;

We now have a list of each day in an event.

1. Fetch the first row and put it into local storage, making the
end_date = start_date
2. Fetch the next row as teh current row
3. If this current row is in the same event and INTERVAL '1' DAY after
the local end_date, update the end_date with it.
4. If the current row is not in the same event or INTERVAL '1' DAY
after the local end_date, then
4.1. Insert the local storage into a working table as a row
4.2. Overwrite the local storage with the current row, making the
end_date = start_date
5. Loop until the end of the cursor.
6. The working table is the desired answer.

I can keep the code fairly portble and short.

Mar 25 '07 #6

P: n/a
--CELKO-- wrote:
Can I get a little help? Given a very simple table of events, with
their durations, I want to consolidate time periods for an event that
(1) touch each other or (2) overlap each other.

CREATE TABLE Events
(event_id VARCHAR(25) NOT NULL,
start_date DATE NOT NULL,
end_date DATE NOT NULL,
CHECK (start_date <= end_date),
PRIMARY KEY (event_id, start_date, end_date));

INSERT INTO Events
VALUES ('Pie Eating', '2009-01-01', '2009-01-02'),
('Pie Eating', '2009-01-03', '2009-01-05'),
('Pie Eating', '2009-01-04', '2009-01-07'),
('Pie Eating', '2009-02-01', '2009-02-07');

The goal is to reduce these three rows into fewer rows that show how
many days we were eating pies.

('Pie Eating', '2009-01-01', '2009-01-07')

The first and second rows in the INSERT INTO statement touch each
other and can be replaced with:

('Pie Eating', '2009-01-01', '2009-01-05')

The third row will overlap with this new row and can be consolidated
with it, as shown before. However, the fourth row has data that
occurs a month after the first three and it is disjoint from all the
other rows.

What I want is:

1) A portable, simple cursor solution
2) A Recursive CTE solution

I am going to use this in a book, so you will get credit.
Completely different approach: http://tinyurl.com/2plby8

--
Knut Stolze
DB2 z/OS Utilities Development
IBM Germany
Mar 25 '07 #7

P: n/a
--CELKO-- wrote:
>>>1) Too much work and with SQL Server supporting recursive CTE I claim portability ;-) <<


Actually, it is pretty easy:

DECLARE EventList CURSOR FOR
SELECT DISTINCT E.event_id, C.cal_date
FROM Events AS E, Calendar AS C
WHERE C.cal_date BETWEEN E.start_date AND E.end_date
ORDER BY E.event_id, C.cal_date ASC
FOR READ ONLY;
Joe,

This only works if some date record in the cal_date table exists within the
date range of each and every row for an event. Otherwise you will not see
the entire range of the event, only the range spanned by cal_date records.
If that's what you want, it's fine. However, if, instead, the request is to
fetch the entire range of dates for an event that will include a given
single date (or small date range). That would require one to push the date
matching into a sub-query IB.

Art S. Kagel
We now have a list of each day in an event.

1. Fetch the first row and put it into local storage, making the
end_date = start_date
2. Fetch the next row as teh current row
3. If this current row is in the same event and INTERVAL '1' DAY after
the local end_date, update the end_date with it.
4. If the current row is not in the same event or INTERVAL '1' DAY
after the local end_date, then
4.1. Insert the local storage into a working table as a row
4.2. Overwrite the local storage with the current row, making the
end_date = start_date
5. Loop until the end of the cursor.
6. The working table is the desired answer.

I can keep the code fairly portble and short.
Mar 26 '07 #8

P: n/a
--CELKO-- wrote:
Can I get a little help? Given a very simple table of events, with
their durations, I want to consolidate time periods for an event that
(1) touch each other or (2) overlap each other.

CREATE TABLE Events
(event_id VARCHAR(25) NOT NULL,
start_date DATE NOT NULL,
end_date DATE NOT NULL,
CHECK (start_date <= end_date),
PRIMARY KEY (event_id, start_date, end_date));

INSERT INTO Events
VALUES ('Pie Eating', '2009-01-01', '2009-01-02'),
('Pie Eating', '2009-01-03', '2009-01-05'),
('Pie Eating', '2009-01-04', '2009-01-07'),
('Pie Eating', '2009-02-01', '2009-02-07');

The goal is to reduce these three rows into fewer rows that show how
many days we were eating pies.
An aside: I prefer to encode these kinds of structures with a start date and
a duration as:

CREATE TABLE Events
(event_id VARCHAR(25) NOT NULL,
start_date DATE NOT NULL,
addl_days INTEGER, -- Where available I would use INTERVAL (DAYS) instead
CHECK (addl_days >= 0),
PRIMARY KEY (event_id, start_date));

I find that it makes finding events that occur on a particular date easier
to code, and allows for both compressed and single day records to coexist
and return correct results.

So, your data might look like:

INSERT INTO Events
VALUES ('Pie Eating', '2009-01-01', 1 ),
('Pie Eating', '2009-01-03', 2 ),
('Pie Eating', '2009-01-04', 3 ), -- I might prevent the overlap with
-- a trigger, but...
('Pie Eating', '2009-02-01', 6 );
So, a cursor solution to your problem:

DECLARE EventList CURSOR FOR
SELECT DISTINCT E.event_id, E.start_date, E.addl_days
FROM Events as E, Calendar as C
WHERE C.cal_date between E.start_date AND (E.start_date + E.addl_days)
FOR READ ONLY;

And the pseudo-code is similar to yours with the exception that you are only
incrementing the addl_days.

Obviously my previous comment about the dates in Calendar affecting the
results also apply to this schema.

FWIW.

Art S. Kagel
Mar 26 '07 #9

P: n/a
On Mar 25, 12:08 pm, Serge Rielau <srie...@ca.ibm.comwrote:
PS: I suspect this can be written with OLAP expressions as well using
windowing

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
This is my trial. I added data which are included in another row's
range.

INSERT INTO Events
VALUES ('Apple Eating', '2009-01-01', '2009-01-02')
, ('Apple Eating', '2009-01-03', '2009-01-05')
, ('Apple Eating', '2009-01-04', '2009-01-10')
, ('Apple Eating', '2009-01-05', '2009-01-06')
, ('Apple Eating', '2009-01-08', '2009-01-09')
, ('Apple Eating', '2009-01-11', '2009-01-12')
, ('Apple Eating', '2009-02-01', '2009-02-07')
, ('Apple Eating', '2009-03-01', '2009-03-07')
, ('Apple Eating', '2009-03-03', '2009-03-05')
, ('Apple Eating', '2009-04-01', '2009-04-03')
, ('Apple Eating', '2009-04-04', '2009-04-05')
, ('Apple Eating', '2009-04-04', '2009-04-06')
;

------------------------- Commands Entered -------------------------
WITH Flagged_Events AS (
SELECT event_id, start_date, end_date
, CASE
WHEN (pre_end_date < start_date - 1 DAYS
OR
pre_end_date IS NULL)
AND (fol_start_date end_date + 1 DAYS
OR
fol_start_date IS NULL
OR
fol_end_date < end_date) THEN
'Only'
WHEN (pre_end_date < start_date - 1 DAYS
OR
pre_end_date IS NULL)
AND fol_start_date <= end_date + 1 DAYS THEN
'Begin'
WHEN pre_end_date >= start_date - 1 DAYS
AND pre_end_date < end_date
AND (fol_start_date end_date + 1 DAYS
OR
fol_start_date IS NULL
OR
fol_end_date < end_date) THEN
'End'
ELSE 'Middle'
END AS Flag
FROM (SELECT event_id, start_date, end_date
, MAX(end_date) OVER(PARTITION BY event_id
ORDER BY start_date, end_date
ROWS BETWEEN UNBOUNDED PRECEDING
AND 1 PRECEDING
) pre_end_date
, MIN(end_date) OVER(PARTITION BY event_id
ORDER BY start_date, end_date
ROWS BETWEEN 1 FOLLOWING
AND UNBOUNDED FOLLOWING
) fol_end_date
, MIN(start_date) OVER(PARTITION BY event_id
ORDER BY start_date, end_date
ROWS BETWEEN 1 FOLLOWING
AND UNBOUNDED FOLLOWING
) fol_start_date
FROM Events
) R
)
SELECT event_id, start_date
, CASE Flag
WHEN 'Only' THEN
end_date
ELSE new_end_date
END AS end_date
FROM (SELECT event_id, start_date, end_date, Flag
, MAX(end_date) OVER(PARTITION BY event_id
ORDER BY start_date, end_date
ROWS BETWEEN CURRENT ROW
AND 1 FOLLOWING
) new_end_date
FROM (SELECT event_id, start_date, end_date, Flag
, MAX(Flag) OVER(PARTITION BY event_id
ORDER BY start_date,
end_date
ROWS BETWEEN 1 FOLLOWING
AND 1 FOLLOWING
) fol_Flag
FROM Flagged_Events
WHERE Flag <'Middle'
) Q
WHERE Flag <'End'
OR Flag = 'End' AND (fol_Flag <'End' OR fol_Flag IS
NULL)
) R
WHERE Flag <'End'
ORDER BY
event_id, start_date;
--------------------------------------------------------------------

EVENT_ID START_DATE END_DATE
------------------------- ---------- ----------
Apple Eating 2009-01-01 2009-01-12
Apple Eating 2009-02-01 2009-02-07
Apple Eating 2009-03-01 2009-03-07
Apple Eating 2009-04-01 2009-04-06
Pie Eating 2009-01-01 2009-01-07
Pie Eating 2009-02-01 2009-02-07

6 record(s) selected.

Mar 26 '07 #10

P: n/a
On Mar 25, 2:50 am, "--CELKO--" <jcelko...@earthlink.netwrote:
I am going to use this in a book, so you will get credit.
I want ask the people who reply to CELKO's question.
He is going to use your reply to use in his book without explicit your
permission.
Or, post in this forum means implicitly give permission to use any way
your writings for all peoples? In another word, did you agreed to give
up your copywrite?

Mar 26 '07 #11

P: n/a
Tonkuma wrote:
On Mar 25, 2:50 am, "--CELKO--" <jcelko...@earthlink.netwrote:
>>I am going to use this in a book, so you will get credit.


I want ask the people who reply to CELKO's question.
He is going to use your reply to use in his book without explicit your
permission.
Or, post in this forum means implicitly give permission to use any way
your writings for all peoples? In another word, did you agreed to give
up your copywrite?
Yes, you implicitely agree to let him use your reply.

Art S. Kagel
Mar 26 '07 #12

P: n/a
On Mar 26, 12:28 pm, "Tonkuma" <tonk...@jp.ibm.comwrote:
On Mar 25, 2:50 am, "--CELKO--" <jcelko...@earthlink.netwrote:
I am going to use this in a book, so you will get credit.

I want ask the people who reply to CELKO's question.
He is going to use your reply to use in his book without explicit your
permission.
Or, post in this forum means implicitly give permission to use any way
your writings for all peoples? In another word, did you agreed to give
up your copywrite?
Joe:

If you've not already, it may be of value to you to see chapter 6
(especially section 6.5.2) of Snodgrass's _Developing Time-Oriented
Database Applications in SQL_, which is now in the public domain:
http://www.cs.arizona.edu/~rts/publications.html.

I was working on a "find contigous slots of time for a given meeting
of x minute duration" query that went against a table of timeslots of
various lengths--some of which may or may not have been available--and
the section of Snodgrass's book I mention was very helpful to me. You
may also find Zimanyi's "Temporal Aggregates and Temporal
Universal..." paper in v35:1 (6/06) of the SIGMOD Record useful.

Regards,

--Jeff

Mar 26 '07 #13

P: n/a
> I want ask the people who reply to CELKO's question. He is going to use your reply to use in his book without explicit your permission. <<

I do not have to ask when you post in a public newsgroup without a
copyright on the material. I do because it is polite and it usually
gets me more help. I also have my publisher send out review copies to
contributors so they can highlight their name adn enjoy a $30-50
book.

Right now, there is a Blog called "Joe Celko, the SQL apprrentice"
that is nothing but a collection of **my** newsgroup postings (I wish
I knew who is doing it). He has made several hundred dollars in
advertising from it. The most that Google will do is post a
disclaimer that I have nothing to do with the site.

I also wish he would fix my spelling, but that would lead to problems
with improper use of public material. Welcome to the weird world of IP
and copyrights!

Mar 27 '07 #14

P: n/a
> the section of Snodgrass's book I mention was very helpful to me. You may also find Zimanyi's "Temporal Aggregates and Temporal Universal..." paper in v35:1 (6/06) of the SIGMOD Record useful. <<

I know Rick from our time on ANSI X3H2 together and I used some of his
material in one of my other books. I am trying to do an updated
version of it, with some of the new SQL-99 and SQL-2003 stuff. That
is why I am posting in newgroups that attract smart programmers :)

I did not know about Zimanyi's stuff; thanks!

Mar 27 '07 #15

This discussion thread is closed

Replies have been disabled for this discussion.