Merging overlapping spans/ranges

Max M

I am writing a "find-free-time" function for a calendar. There are a lot
of time spans with start end times, some overlapping, some not.

To find the free time spans, I first need to convert the events into a
list of non overlapping time spans "meta-spans".

This nice ascii graph should show what I mean.

1) ---
2) ---
3) ---
4) -----
5) -----

------- -------- # meta spans
I can then iterate through the meta-spans and find non-busy times.

I have written the class below, but it is rather O^2, so I wondered if
anybody has an idea for a better approach?
######################################
# -*- coding: latin-1 -*-

"""
1) ---
2) ---
3) ---
4) -----
5) -----
------- --------

"""

class MetaSpans:

"""
Populate with a list of span tuples [(start,end)], and it will make
"meta"
spans, with overlapping spans folded into one span.
"""

def __init__(self):
self.spans = []

def add(self, span):
start, end = span
overlapping = [span]
non_overlapping = []
for spn in self.spans:
spn_start, spn_end = spn
# span rules for iterated spans
starts_before = spn_start <= start
ends_after = spn_end >= end
is_outside = starts_before and ends_after
starts_inside = start <= spn_start <= end
ends_inside = start <= spn_end <= end
overlaps = starts_inside or ends_inside or is_outside
if overlaps:
overlapping.append(spn)
else:
non_overlapping.append(spn)
# overlapping spans are changed to one span
starts = []
ends = []
for start, end in overlapping:
starts.append(start)
ends.append(end)
min_start = min(starts)
max_end = max(ends)
non_overlapping.append( (min_start, max_end) )
self.spans = non_overlapping
def findFreeTime(self, duration):
self.spans.sort()

if __name__ == '__main__':

ms = MetaSpans()
ms.add((0,3))
ms.add((4,7))
ms.add((2,5))
ms.add((9,14))
ms.add((12,17))
print ms.spans
from datetime import datetime
ms = MetaSpans()
ms.add((datetime(2005, 1, 1, 0, 0, 0), datetime(2005, 1, 1, 3, 0, 0)))
ms.add((datetime(2005, 1, 1, 4, 0, 0), datetime(2005, 1, 1, 7, 0, 0)))
ms.add((datetime(2005, 1, 1, 2, 0, 0), datetime(2005, 1, 1, 5, 0, 0)))
ms.add((datetime(2005, 1, 1, 9, 0, 0), datetime(2005, 1, 1, 14, 0, 0)))
ms.add((datetime(2005, 1, 1, 12, 0, 0), datetime(2005, 1, 1, 17, 0,
0)))
print ms.spans

--

hilsen/regards Max M, Denmark

http://www.mxm.dk/
IT's Mad Science

Jul 19 '05 #1

Subscribe Reply

4643

bearophileHUGS

This is the problem of finding the connected components inside an
interval graph. You can implement the algorithms yourself, of you can
use my graph data structure here:

http://sourceforge.net/projects/pynetwork/

The graph methods:
createIntervalgGraph
And:
connectedComponents
can probably solve your problem quite fast, algorithmically, and with
few lines of code.
If you need more help, then ask for it and I'll give the little code
needed.

Bear hugs,
Bearophile

Jul 19 '05 #2

Jordan Rastrick

Max M wrote:

I am writing a "find-free-time" function for a calendar. There are a lot of time spans with start end times, some overlapping, some not.

To find the free time spans, I first need to convert the events into a list of non overlapping time spans "meta-spans".

This nice ascii graph should show what I mean.

1) ---
2) ---
3) ---
4) -----
5) -----
>> ------- -------- # meta spans
I can then iterate through the meta-spans and find non-busy times.

I have written the class below, but it is rather O^2, so I wondered if anybody has an idea for a better approach?
######################################
# -*- coding: latin-1 -*-

"""
1) ---
2) ---
3) ---
4) -----
5) -----
>> ------- -------- """

class MetaSpans:

"""
Populate with a list of span tuples [(start,end)], and it will make "meta"
spans, with overlapping spans folded into one span.
"""

def __init__(self):
self.spans = []

def add(self, span):
start, end = span
overlapping = [span]
non_overlapping = []
for spn in self.spans:
spn_start, spn_end = spn
# span rules for iterated spans
starts_before = spn_start <= start
ends_after = spn_end >= end
is_outside = starts_before and ends_after
starts_inside = start <= spn_start <= end
ends_inside = start <= spn_end <= end
overlaps = starts_inside or ends_inside or is_outside
if overlaps:
overlapping.append(spn)
else:
non_overlapping.append(spn)
# overlapping spans are changed to one span
starts = []
ends = []
for start, end in overlapping:
starts.append(start)
ends.append(end)
min_start = min(starts)
max_end = max(ends)
non_overlapping.append( (min_start, max_end) )
self.spans = non_overlapping
def findFreeTime(self, duration):
self.spans.sort()

if __name__ == '__main__':

ms = MetaSpans()
ms.add((0,3))
ms.add((4,7))
ms.add((2,5))
ms.add((9,14))
ms.add((12,17))
print ms.spans
from datetime import datetime
ms = MetaSpans()
ms.add((datetime(2005, 1, 1, 0, 0, 0), datetime(2005, 1, 1, 3, 0, 0))) ms.add((datetime(2005, 1, 1, 4, 0, 0), datetime(2005, 1, 1, 7, 0, 0))) ms.add((datetime(2005, 1, 1, 2, 0, 0), datetime(2005, 1, 1, 5, 0, 0))) ms.add((datetime(2005, 1, 1, 9, 0, 0), datetime(2005, 1, 1, 14, 0, 0))) ms.add((datetime(2005, 1, 1, 12, 0, 0), datetime(2005, 1, 1, 17, 0, 0)))
print ms.spans

I think the following code does what you want. It should be O(n log n)
- at least I hope thats what Python takes to sort the list of spans :)
Of course I've assumed you have the spans available to you all at once
as a list, and dont need to add them one at a time as you did in your
original code.

def get_metaspans(spans):
"""Given a list of span tuples [(start,end)], will generate all
meta spans, with overlapping spans folded into one span.
"""
spans.sort()
spans = iter(spans)
metaspan = spans.next()
for span in spans:
start, end = span
m_start, m_end = metaspan
if start > m_end:
yield metaspan
metaspan = span
elif end > m_end:
metaspan = (m_start, end)
# Need to yield the final metaspan once the span list is exhausted
yield metaspan

def get_breaks(metaspans):
"""Gets all the breaks in between a sequence of metaspans"""
metaspans = iter(metaspans)
_, prev_end = metaspans.next()
for metaspan in metaspans:
start, end = metaspan
yield (prev_end, start)
prev_end = end

I must admit I'm a bit of a generatoraholic, I'll tend to throw yields
at anything given half a chance :) Having to yield once more at the end
of the get_metaspans loop seems a little inelegant, maybe it could be
done a bit better. But its nice the way empty lists are handled
gracefully - the StopIteration thrown by the .next() calls are just
absorbed by the, uh, generatorness.

A little bit of testing:

spans = [(12, 13), (0,3), (2,5), (1,4), (4,6), (1,2), (8,9), (9, 10)] print list(get_metaspans(spans)) [(0, 6), (8, 10), (12, 13)] print list(get_breaks(get_metaspans(spans)))

[(6, 8), (10, 12)]

Is that more or less what was required?

Jul 19 '05 #3

elbertlev

The linear method:

You create an array - one bool per minute. For one day 24 * 60 entries
is enough. Spans (Start, End) are in minutes from midnight. Set array
slots in range(Start, End) to True for each input span.

Scan the array and find metaspans - contiguous sequences of False.

Jul 19 '05 #4

Mike Rovner

Max M wrote:

I am writing a "find-free-time" function for a calendar. There are a lot
of time spans with start end times, some overlapping, some not.

To find the free time spans, I first need to convert the events into a
list of non overlapping time spans "meta-spans".

"Almost" linear method (includes .sort() which is afaik N*logN):
[Set all priority to 1, or you will get changes in priorities as well.]

def merge(p):
""" p is a seq of (start, priority, end)
returns list of merged events:
(start1, pri), (end1, 0), (start2, pri), (end2, 0), ...
"""
all=[] # list of x, h, up, id
for id,(l,h,r) in enumerate(p):
all.append((l,h,1,id))
all.append((r,h,0,id))
all.sort()

sl = [] # skyline solution
slx = slh = 0 # current values
vis = {} # active {id:h}
for x,h,up,id in all:
if up:
vis[id]=h
if h>slh:
sl.append((x,h))
slh=h
else:
del vis[id]
assert h<=slh
if h==slh:
v = vis.values()
if v:
h = max(v)
else:
h = 0
sl.append((x,h))
slh=h
slx=x
# merge same time events
s=dict(sl)
sl=s.keys()
sl.sort()
return [(k,s[k]) for k in sl]

Jul 19 '05 #5

Steven Bethard

el*******@hotmail.com wrote:

The linear method:

You create an array - one bool per minute. For one day 24 * 60 entries
is enough. Spans (Start, End) are in minutes from midnight. Set array
slots in range(Start, End) to True for each input span.

Scan the array and find metaspans - contiguous sequences of False.

Yeah, this would basically have been my suggestion. One implementation:

py> def merge(spans, nbins):
.... bins = [0]*nbins
.... for start, end in spans:
.... for bin in xrange(start, end):
.... bins[bin] += 1
.... def key((i, count)):
.... return count != 0
.... index_groups = itertools.groupby(enumerate(bins), key=key)
.... for isnonzero, indices in index_groups:
.... if isnonzero:
.... indices = [i for (i, count) in indices]
.... yield (indices[0], indices[-1])
....
py> spans = [(0,3), (4,7), (2,5), (9,14), (12,17)]
py> list(merge(spans, 20))
[(0, 6), (9, 16)]

Obviously, this needs some modification to handle datetime objects.

STeVe

Jul 19 '05 #6

Bengt Richter

On Tue, 10 May 2005 15:14:47 +0200, Max M <ma**@mxm.dk> wrote:

I am writing a "find-free-time" function for a calendar. There are a lot
of time spans with start end times, some overlapping, some not.

To find the free time spans, I first need to convert the events into a
list of non overlapping time spans "meta-spans".

This nice ascii graph should show what I mean.

1) ---
2) ---
3) ---
4) -----
5) -----
------- -------- # meta spans
I can then iterate through the meta-spans and find non-busy times.

I have written the class below, but it is rather O^2, so I wondered if
anybody has an idea for a better approach?

Maybe (not tested beyond what you see ;-)

def mergespans(spans): ... start = end = None
... for s,e in sorted(spans):
... if start is None: start, end = s,e; continue
... if s <= end: end = e; continue
... yield start, end
... start,end = s,e
... if start is not None: yield start, end
... spans = [(0,3), (4,7), (2,5), (9,14), (12,17)]
list(mergespans(spans))

[(0, 7), (9, 17)]
Regards,
Bengt Richter

Jul 19 '05 #7

Bengt Richter

On Tue, 10 May 2005 15:14:47 +0200, Max M <ma**@mxm.dk> wrote:

I am writing a "find-free-time" function for a calendar. There are a lot
of time spans with start end times, some overlapping, some not.

To find the free time spans, I first need to convert the events into a
list of non overlapping time spans "meta-spans".

This nice ascii graph should show what I mean.

1) ---
2) ---
3) ---
4) -----
5) -----
------- -------- # meta spans
I can then iterate through the meta-spans and find non-busy times.

I have written the class below, but it is rather O^2, so I wondered if
anybody has an idea for a better approach?

spans = [(0,3), (4,7), (2,5), (9,14), (12,17)]
Non-recommended one-liner:
reduce(lambda L,se: (L and se[0]<=L[-1][1] and (L.append((L.pop()[0], se[1])) or L)) or L.append(se) or L ,sorted(spans), [])

[(0, 7), (9, 17)]

Regards,
Bengt Richter

Jul 19 '05 #8

Jim Sizelove

Bengt Richter wrote:

On Tue, 10 May 2005 15:14:47 +0200, Max M <ma**@mxm.dk> wrote:

I am writing a "find-free-time" function for a calendar. There are a lot
of time spans with start end times, some overlapping, some not.

To find the free time spans, I first need to convert the events into a
list of non overlapping time spans "meta-spans".

This nice ascii graph should show what I mean.

1) ---
2) ---
3) ---
4) -----
5) -----

------- -------- # meta spans

I can then iterate through the meta-spans and find non-busy times.

I have written the class below, but it is rather O^2, so I wondered if
anybody has an idea for a better approach?

Maybe (not tested beyond what you see ;-)

It is with some trepidation that I write this message; after hanging
around in c.l.p for the better part of a year, I have come to expect
Bengt's messages to be right-on and error-free.

However this time... :)

>>> def mergespans(spans): ... start = end = None
... for s,e in sorted(spans):
... if start is None: start, end = s,e; continue
... if s <= end: end = e; continue
... yield start, end
... start,end = s,e
... if start is not None: yield start, end
... >>> spans = [(0,3), (4,7), (2,5), (9,14), (12,17)]
>>> list(mergespans(spans)) [(0, 7), (9, 17)]

There can be a problem in mergespans if one span fits inside another:

spans = [(0,5), (4,7), (2,3), (9,14), (12,17)]
list(mergespans(spans)) [(0, 3), (4, 7), (9, 17)]

Here is a revised version (not tested beyond what you see ;-) def mergespans(spans): ... start = end = None
... for s,e in sorted(spans):
... if start is None:
... start, end = s,e
... continue
... if s <= end:
... if end < e:
... end = e
... continue
... yield start, end
... start,end = s,e
... if start is not None:
... yield start, end
... list(mergespans([(0,5), (4,7), (2,3), (9,14), (12,17)])) [(0, 7), (9, 17)] list(mergespans([(0,3), (4,7), (2,5), (9,14), (12,17)]))

[(0, 7), (9, 17)]

Can anyone find any other errors in this? ;-)

With humble regards,
Jim Sizelove

Jul 19 '05 #9

Jordan Rastrick

Should work fine as far as I can see. Of course, thats not very
'pythonic' - I should probably give at least 10 different unit tests
that it passes ;)

Its gratifying to know I'm not the only one who needed that final
yield.

Jim Sizelove wrote:

Bengt Richter wrote:
On Tue, 10 May 2005 15:14:47 +0200, Max M <ma**@mxm.dk> wrote:

I am writing a "find-free-time" function for a calendar. There are a lotof time spans with start end times, some overlapping, some not.

To find the free time spans, I first need to convert the events into alist of non overlapping time spans "meta-spans".

This nice ascii graph should show what I mean.

1) ---
2) ---
3) ---
4) -----
5) -----
>------- -------- # meta spans

I can then iterate through the meta-spans and find non-busy times.

I have written the class below, but it is rather O^2, so I wondered ifanybody has an idea for a better approach?

Maybe (not tested beyond what you see ;-)

It is with some trepidation that I write this message; after hanging
around in c.l.p for the better part of a year, I have come to expect
Bengt's messages to be right-on and error-free.

However this time... :)
>>> def mergespans(spans):

... start = end = None
... for s,e in sorted(spans):
... if start is None: start, end = s,e; continue
... if s <= end: end = e; continue
... yield start, end
... start,end = s,e
... if start is not None: yield start, end
...
>>> spans = [(0,3), (4,7), (2,5), (9,14), (12,17)]
>>> list(mergespans(spans))

[(0, 7), (9, 17)]

There can be a problem in mergespans if one span fits inside another:
>>> spans = [(0,5), (4,7), (2,3), (9,14), (12,17)]
>>> list(mergespans(spans)) [(0, 3), (4, 7), (9, 17)]

Here is a revised version (not tested beyond what you see ;-) >>> def mergespans(spans): ... start = end = None
... for s,e in sorted(spans):
... if start is None:
... start, end = s,e
... continue
... if s <= end:
... if end < e:
... end = e
... continue
... yield start, end
... start,end = s,e
... if start is not None:
... yield start, end
... >>> list(mergespans([(0,5), (4,7), (2,3), (9,14), (12,17)])) [(0, 7), (9, 17)] >>> list(mergespans([(0,3), (4,7), (2,5), (9,14), (12,17)]))

[(0, 7), (9, 17)]

Can anyone find any other errors in this? ;-)

With humble regards,
Jim Sizelove

Jul 19 '05 #10

Max M

Jim Sizelove wrote:
Wow! c.l.py is allmost like an AI program generator. But I guess it
helps to ask questions about problems that programmers find interresting :-)

The "linear" approach is pretty simple to code and understand. I am just
afraid what happens when many users tries to book that 14 day conference
in the calendar sometimes this year.

365 * 24 * 60 = 525,600 booleans in the array

Well ok a few MBytes ain't too bad on a modern machine. It would even be
possible to cut the resolution down to 15 minutes. Yielding 35,040
bools, but the solution is still O^2 with regards to size of "search
window" and number of events.

Bengts/Jims and Jordans solutions seems to be relatively similar. I will
use one of those instead of my own code.
Thanks!
--

hilsen/regards Max M, Denmark

http://www.mxm.dk/
IT's Mad Science

Jul 19 '05 #11

Bengt Richter

On Tue, 10 May 2005 23:53:43 GMT, Jim Sizelove <si*****@insightbb.com> wrote:

Bengt Richter wrote: [...]
Maybe (not tested beyond what you see ;-) I had that feeling ... somehow the word "max" was trying to get my attention,
but I posted without figuring out why ;-/
It is with some trepidation that I write this message; after hanging
around in c.l.p for the better part of a year, I have come to expect
Bengt's messages to be right-on and error-free. Would that 'twere so ;-)

However this time... :) >>> def mergespans(spans): ... start = end = None
... for s,e in sorted(spans):
... if start is None: start, end = s,e; continue
... if s <= end: end = e; continue
... yield start, end
... start,end = s,e
... if start is not None: yield start, end
...
>>> spans = [(0,3), (4,7), (2,5), (9,14), (12,17)]
>>> list(mergespans(spans))

[(0, 7), (9, 17)]

There can be a problem in mergespans if one span fits inside another:

Good call. I should have thought of it.

>>> spans = [(0,5), (4,7), (2,3), (9,14), (12,17)]
>>> list(mergespans(spans)) [(0, 3), (4, 7), (9, 17)]

Here is a revised version (not tested beyond what you see ;-) >>> def mergespans(spans): ... start = end = None
... for s,e in sorted(spans):
... if start is None:
... start, end = s,e
... continue
... if s <= end:
... if end < e:
... end = e
... continue
... yield start, end
... start,end = s,e
... if start is not None:
... yield start, end
... >>> list(mergespans([(0,5), (4,7), (2,3), (9,14), (12,17)])) [(0, 7), (9, 17)] >>> list(mergespans([(0,3), (4,7), (2,5), (9,14), (12,17)])) [(0, 7), (9, 17)]

Can anyone find any other errors in this? ;-)

Not off hand. OTOH, I really didn't like the aesthetics of all that crufty logic in my original.
Maybe (again, barely tested ;-)

def mergespans(spans): ... it = iter(sorted(spans))
... start, end = it.next()
... for s,e in it:
... if s <= end:
... end = max(end, e) # was what "max" was trying to tell me, I think
... else:
... yield start, end
... start,end = s,e
... yield start, end
... list(mergespans([(0,5), (4,7), (2,3), (9,14), (12,17)])) [(0, 7), (9, 17)] list(mergespans([(0,3), (4,7), (2,5), (9,14), (12,17)]))

[(0, 7), (9, 17)]

With humble regards,

Me too. UIAM we're all still human here (even my favorite 'bots, I think ;-)

Regards,
Bengt Richter

Jul 19 '05 #12

by: Phil Sandler | last post by:

All, I have a table with start and end dates/times in it, and would like to be able to calculate the number of hours represented, accounting for overlapping records. Note that I am looking...

Microsoft SQL Server

std::copy for overlapping ranges?

by: Simon Elliott | last post by:

Is there an equivalent of std::copy which works on STL containers for overlapping ranges? -- Simon Elliott http://www.ctsn.co.uk

C / C++

problem walking thru a collection of SPANS children of a DIV

by: Joseph | last post by:

I am attempting to create a function that will hide all SPANS within an ided DIV ="idMain", then only display one span, but this function gives an error when it gets to elChildElement as being...

Javascript

Problem defining an XSD to validate for overlapping ranges

by: Jim Di Griz | last post by:

Hi I have a problem to create an XSD that prevents overlapping ranges. Sample XML: <root> <element LowerLimit="1" UpperLimit="50000"> ..... </element>

.NET Framework

Overlapping months

by: Chesne | last post by:

I have a table with and which are in an accommodation db. I would like to be able to determine how many days are occupied for a particular month using the two fields as per above. By adding the ...

Microsoft Access / VBA

Date overlapping

by: Rsapru | last post by:

i have a table containing following data effdate termdate uid ----------- ----------- ----------- 1 2 1 3 4 2 5 8 3 7 ...

Microsoft SQL Server

converting text and spans to an ElementTree

by: Steven Bethard | last post by:

I have some text and a list of Element objects and their offsets, e.g.:: ... (etree.Element('a'), 0, 21), ... (etree.Element('b'), 11, 18), ... (etree.Element('c'), 18, 18), ... ] ...

Python

Overlapping dates logic

by: Jim | last post by:

Maybe I am making this too difficult; but I have the following issue: A record contains a price for a specific Start Date and End Date. If the user adds a new price, I need to make sure the...

Visual Basic .NET

Finding overlapping times...

by: Breal | last post by:

I have a list that looks like the following I would like to be able to determine which of these overlap each other. So, in this case, tuple 1 overlaps with tuples 2 and 3. Tuple 2 overlaps...

Python

Is that possible of reading the .csv file in column wise and the column have different lengths ?

by: Sonnysonu | last post by:

This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

C / C++

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...

Windows Server

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...

C / C++

Maximizing Business Potential: The Nexus of Website Design and Digital Marketing

by: jinu1996 | last post by:

In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

Online Marketing

Discussion: How does Zigbee compare with other wireless protocols in smart home applications?

by: tracyyun | last post by:

Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...

General

Access Europe - Using VBA to create a class based on a table - Wed 1 May

by: isladogs | last post by:

The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...

Microsoft Access / VBA

Couldn’t get equations in html when convert word .docx file to html file in C#.

by: conductexam | last post by:

I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and...

C# / C Sharp

Trying to create a lan-to-lan vpn between two differents networks

by: TSSRALBI | last post by:

Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The...

Networking - Hardware / Configuration

transfer the data from one system to another through ip address

by: 6302768590 | last post by:

Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated ...

C# / C Sharp

Merging overlapping spans/ranges

Similar topics