473,606 Members | 2,453 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Merging overlapping spans/ranges

I am writing a "find-free-time" function for a calendar. There are a lot
of time spans with start end times, some overlapping, some not.

To find the free time spans, I first need to convert the events into a
list of non overlapping time spans "meta-spans".

This nice ascii graph should show what I mean.

1) ---
2) ---
3) ---
4) -----
5) -----
------- -------- # meta spans
I can then iterate through the meta-spans and find non-busy times.

I have written the class below, but it is rather O^2, so I wondered if
anybody has an idea for a better approach?
############### ############### ########
# -*- coding: latin-1 -*-

"""
1) ---
2) ---
3) ---
4) -----
5) -----
------- --------

"""

class MetaSpans:

"""
Populate with a list of span tuples [(start,end)], and it will make
"meta"
spans, with overlapping spans folded into one span.
"""

def __init__(self):
self.spans = []

def add(self, span):
start, end = span
overlapping = [span]
non_overlapping = []
for spn in self.spans:
spn_start, spn_end = spn
# span rules for iterated spans
starts_before = spn_start <= start
ends_after = spn_end >= end
is_outside = starts_before and ends_after
starts_inside = start <= spn_start <= end
ends_inside = start <= spn_end <= end
overlaps = starts_inside or ends_inside or is_outside
if overlaps:
overlapping.app end(spn)
else:
non_overlapping .append(spn)
# overlapping spans are changed to one span
starts = []
ends = []
for start, end in overlapping:
starts.append(s tart)
ends.append(end )
min_start = min(starts)
max_end = max(ends)
non_overlapping .append( (min_start, max_end) )
self.spans = non_overlapping
def findFreeTime(se lf, duration):
self.spans.sort ()


if __name__ == '__main__':

ms = MetaSpans()
ms.add((0,3))
ms.add((4,7))
ms.add((2,5))
ms.add((9,14))
ms.add((12,17))
print ms.spans
from datetime import datetime
ms = MetaSpans()
ms.add((datetim e(2005, 1, 1, 0, 0, 0), datetime(2005, 1, 1, 3, 0, 0)))
ms.add((datetim e(2005, 1, 1, 4, 0, 0), datetime(2005, 1, 1, 7, 0, 0)))
ms.add((datetim e(2005, 1, 1, 2, 0, 0), datetime(2005, 1, 1, 5, 0, 0)))
ms.add((datetim e(2005, 1, 1, 9, 0, 0), datetime(2005, 1, 1, 14, 0, 0)))
ms.add((datetim e(2005, 1, 1, 12, 0, 0), datetime(2005, 1, 1, 17, 0,
0)))
print ms.spans

--

hilsen/regards Max M, Denmark

http://www.mxm.dk/
IT's Mad Science
Jul 19 '05 #1
11 4665
This is the problem of finding the connected components inside an
interval graph. You can implement the algorithms yourself, of you can
use my graph data structure here:

http://sourceforge.net/projects/pynetwork/

The graph methods:
createIntervalg Graph
And:
connectedCompon ents
can probably solve your problem quite fast, algorithmically , and with
few lines of code.
If you need more help, then ask for it and I'll give the little code
needed.

Bear hugs,
Bearophile

Jul 19 '05 #2

Max M wrote:
I am writing a "find-free-time" function for a calendar. There are a lot of time spans with start end times, some overlapping, some not.

To find the free time spans, I first need to convert the events into a list of non overlapping time spans "meta-spans".

This nice ascii graph should show what I mean.

1) ---
2) ---
3) ---
4) -----
5) -----
>> ------- -------- # meta spans
I can then iterate through the meta-spans and find non-busy times.

I have written the class below, but it is rather O^2, so I wondered if anybody has an idea for a better approach?
############### ############### ########
# -*- coding: latin-1 -*-

"""
1) ---
2) ---
3) ---
4) -----
5) -----
>> ------- -------- """

class MetaSpans:

"""
Populate with a list of span tuples [(start,end)], and it will make "meta"
spans, with overlapping spans folded into one span.
"""

def __init__(self):
self.spans = []

def add(self, span):
start, end = span
overlapping = [span]
non_overlapping = []
for spn in self.spans:
spn_start, spn_end = spn
# span rules for iterated spans
starts_before = spn_start <= start
ends_after = spn_end >= end
is_outside = starts_before and ends_after
starts_inside = start <= spn_start <= end
ends_inside = start <= spn_end <= end
overlaps = starts_inside or ends_inside or is_outside
if overlaps:
overlapping.app end(spn)
else:
non_overlapping .append(spn)
# overlapping spans are changed to one span
starts = []
ends = []
for start, end in overlapping:
starts.append(s tart)
ends.append(end )
min_start = min(starts)
max_end = max(ends)
non_overlapping .append( (min_start, max_end) )
self.spans = non_overlapping
def findFreeTime(se lf, duration):
self.spans.sort ()


if __name__ == '__main__':

ms = MetaSpans()
ms.add((0,3))
ms.add((4,7))
ms.add((2,5))
ms.add((9,14))
ms.add((12,17))
print ms.spans
from datetime import datetime
ms = MetaSpans()
ms.add((datetim e(2005, 1, 1, 0, 0, 0), datetime(2005, 1, 1, 3, 0, 0))) ms.add((datetim e(2005, 1, 1, 4, 0, 0), datetime(2005, 1, 1, 7, 0, 0))) ms.add((datetim e(2005, 1, 1, 2, 0, 0), datetime(2005, 1, 1, 5, 0, 0))) ms.add((datetim e(2005, 1, 1, 9, 0, 0), datetime(2005, 1, 1, 14, 0, 0))) ms.add((datetim e(2005, 1, 1, 12, 0, 0), datetime(2005, 1, 1, 17, 0, 0)))
print ms.spans


I think the following code does what you want. It should be O(n log n)
- at least I hope thats what Python takes to sort the list of spans :)
Of course I've assumed you have the spans available to you all at once
as a list, and dont need to add them one at a time as you did in your
original code.

def get_metaspans(s pans):
"""Given a list of span tuples [(start,end)], will generate all
meta spans, with overlapping spans folded into one span.
"""
spans.sort()
spans = iter(spans)
metaspan = spans.next()
for span in spans:
start, end = span
m_start, m_end = metaspan
if start > m_end:
yield metaspan
metaspan = span
elif end > m_end:
metaspan = (m_start, end)
# Need to yield the final metaspan once the span list is exhausted
yield metaspan

def get_breaks(meta spans):
"""Gets all the breaks in between a sequence of metaspans"""
metaspans = iter(metaspans)
_, prev_end = metaspans.next( )
for metaspan in metaspans:
start, end = metaspan
yield (prev_end, start)
prev_end = end

I must admit I'm a bit of a generatoraholic , I'll tend to throw yields
at anything given half a chance :) Having to yield once more at the end
of the get_metaspans loop seems a little inelegant, maybe it could be
done a bit better. But its nice the way empty lists are handled
gracefully - the StopIteration thrown by the .next() calls are just
absorbed by the, uh, generatorness.

A little bit of testing:
spans = [(12, 13), (0,3), (2,5), (1,4), (4,6), (1,2), (8,9), (9, 10)] print list(get_metasp ans(spans)) [(0, 6), (8, 10), (12, 13)] print list(get_breaks (get_metaspans( spans)))

[(6, 8), (10, 12)]

Is that more or less what was required?

Jul 19 '05 #3
The linear method:

You create an array - one bool per minute. For one day 24 * 60 entries
is enough. Spans (Start, End) are in minutes from midnight. Set array
slots in range(Start, End) to True for each input span.

Scan the array and find metaspans - contiguous sequences of False.

Jul 19 '05 #4
Max M wrote:
I am writing a "find-free-time" function for a calendar. There are a lot
of time spans with start end times, some overlapping, some not.

To find the free time spans, I first need to convert the events into a
list of non overlapping time spans "meta-spans".


"Almost" linear method (includes .sort() which is afaik N*logN):
[Set all priority to 1, or you will get changes in priorities as well.]

def merge(p):
""" p is a seq of (start, priority, end)
returns list of merged events:
(start1, pri), (end1, 0), (start2, pri), (end2, 0), ...
"""
all=[] # list of x, h, up, id
for id,(l,h,r) in enumerate(p):
all.append((l,h ,1,id))
all.append((r,h ,0,id))
all.sort()

sl = [] # skyline solution
slx = slh = 0 # current values
vis = {} # active {id:h}
for x,h,up,id in all:
if up:
vis[id]=h
if h>slh:
sl.append((x,h) )
slh=h
else:
del vis[id]
assert h<=slh
if h==slh:
v = vis.values()
if v:
h = max(v)
else:
h = 0
sl.append((x,h) )
slh=h
slx=x
# merge same time events
s=dict(sl)
sl=s.keys()
sl.sort()
return [(k,s[k]) for k in sl]

Jul 19 '05 #5
el*******@hotma il.com wrote:
The linear method:

You create an array - one bool per minute. For one day 24 * 60 entries
is enough. Spans (Start, End) are in minutes from midnight. Set array
slots in range(Start, End) to True for each input span.

Scan the array and find metaspans - contiguous sequences of False.


Yeah, this would basically have been my suggestion. One implementation:

py> def merge(spans, nbins):
.... bins = [0]*nbins
.... for start, end in spans:
.... for bin in xrange(start, end):
.... bins[bin] += 1
.... def key((i, count)):
.... return count != 0
.... index_groups = itertools.group by(enumerate(bi ns), key=key)
.... for isnonzero, indices in index_groups:
.... if isnonzero:
.... indices = [i for (i, count) in indices]
.... yield (indices[0], indices[-1])
....
py> spans = [(0,3), (4,7), (2,5), (9,14), (12,17)]
py> list(merge(span s, 20))
[(0, 6), (9, 16)]

Obviously, this needs some modification to handle datetime objects.

STeVe
Jul 19 '05 #6
On Tue, 10 May 2005 15:14:47 +0200, Max M <ma**@mxm.dk> wrote:
I am writing a "find-free-time" function for a calendar. There are a lot
of time spans with start end times, some overlapping, some not.

To find the free time spans, I first need to convert the events into a
list of non overlapping time spans "meta-spans".

This nice ascii graph should show what I mean.

1) ---
2) ---
3) ---
4) -----
5) -----
------- -------- # meta spans
I can then iterate through the meta-spans and find non-busy times.

I have written the class below, but it is rather O^2, so I wondered if
anybody has an idea for a better approach?

Maybe (not tested beyond what you see ;-)
def mergespans(span s): ... start = end = None
... for s,e in sorted(spans):
... if start is None: start, end = s,e; continue
... if s <= end: end = e; continue
... yield start, end
... start,end = s,e
... if start is not None: yield start, end
... spans = [(0,3), (4,7), (2,5), (9,14), (12,17)]
list(mergespans (spans))

[(0, 7), (9, 17)]
Regards,
Bengt Richter
Jul 19 '05 #7
On Tue, 10 May 2005 15:14:47 +0200, Max M <ma**@mxm.dk> wrote:
I am writing a "find-free-time" function for a calendar. There are a lot
of time spans with start end times, some overlapping, some not.

To find the free time spans, I first need to convert the events into a
list of non overlapping time spans "meta-spans".

This nice ascii graph should show what I mean.

1) ---
2) ---
3) ---
4) -----
5) -----
------- -------- # meta spans
I can then iterate through the meta-spans and find non-busy times.

I have written the class below, but it is rather O^2, so I wondered if
anybody has an idea for a better approach?

spans = [(0,3), (4,7), (2,5), (9,14), (12,17)]
Non-recommended one-liner:
reduce(lambda L,se: (L and se[0]<=L[-1][1] and (L.append((L.po p()[0], se[1])) or L)) or L.append(se) or L ,sorted(spans), [])

[(0, 7), (9, 17)]

Regards,
Bengt Richter
Jul 19 '05 #8
Bengt Richter wrote:
On Tue, 10 May 2005 15:14:47 +0200, Max M <ma**@mxm.dk> wrote:

I am writing a "find-free-time" function for a calendar. There are a lot
of time spans with start end times, some overlapping, some not.

To find the free time spans, I first need to convert the events into a
list of non overlapping time spans "meta-spans".

This nice ascii graph should show what I mean.

1) ---
2) ---
3) ---
4) -----
5) -----

------- -------- # meta spans


I can then iterate through the meta-spans and find non-busy times.

I have written the class below, but it is rather O^2, so I wondered if
anybody has an idea for a better approach?


Maybe (not tested beyond what you see ;-)


It is with some trepidation that I write this message; after hanging
around in c.l.p for the better part of a year, I have come to expect
Bengt's messages to be right-on and error-free.

However this time... :)
>>> def mergespans(span s): ... start = end = None
... for s,e in sorted(spans):
... if start is None: start, end = s,e; continue
... if s <= end: end = e; continue
... yield start, end
... start,end = s,e
... if start is not None: yield start, end
... >>> spans = [(0,3), (4,7), (2,5), (9,14), (12,17)]
>>> list(mergespans (spans)) [(0, 7), (9, 17)]


There can be a problem in mergespans if one span fits inside another:
spans = [(0,5), (4,7), (2,3), (9,14), (12,17)]
list(mergespans (spans)) [(0, 3), (4, 7), (9, 17)]

Here is a revised version (not tested beyond what you see ;-) def mergespans(span s): ... start = end = None
... for s,e in sorted(spans):
... if start is None:
... start, end = s,e
... continue
... if s <= end:
... if end < e:
... end = e
... continue
... yield start, end
... start,end = s,e
... if start is not None:
... yield start, end
... list(mergespans ([(0,5), (4,7), (2,3), (9,14), (12,17)])) [(0, 7), (9, 17)] list(mergespans ([(0,3), (4,7), (2,5), (9,14), (12,17)]))

[(0, 7), (9, 17)]

Can anyone find any other errors in this? ;-)

With humble regards,
Jim Sizelove
Jul 19 '05 #9
Should work fine as far as I can see. Of course, thats not very
'pythonic' - I should probably give at least 10 different unit tests
that it passes ;)

Its gratifying to know I'm not the only one who needed that final
yield.

Jim Sizelove wrote:
Bengt Richter wrote:
On Tue, 10 May 2005 15:14:47 +0200, Max M <ma**@mxm.dk> wrote:

I am writing a "find-free-time" function for a calendar. There are a lotof time spans with start end times, some overlapping, some not.

To find the free time spans, I first need to convert the events into alist of non overlapping time spans "meta-spans".

This nice ascii graph should show what I mean.

1) ---
2) ---
3) ---
4) -----
5) -----
>------- -------- # meta spans

I can then iterate through the meta-spans and find non-busy times.

I have written the class below, but it is rather O^2, so I wondered ifanybody has an idea for a better approach?


Maybe (not tested beyond what you see ;-)


It is with some trepidation that I write this message; after hanging
around in c.l.p for the better part of a year, I have come to expect
Bengt's messages to be right-on and error-free.

However this time... :)
>>> def mergespans(span s):

... start = end = None
... for s,e in sorted(spans):
... if start is None: start, end = s,e; continue
... if s <= end: end = e; continue
... yield start, end
... start,end = s,e
... if start is not None: yield start, end
...
>>> spans = [(0,3), (4,7), (2,5), (9,14), (12,17)]
>>> list(mergespans (spans))

[(0, 7), (9, 17)]


There can be a problem in mergespans if one span fits inside another:
>>> spans = [(0,5), (4,7), (2,3), (9,14), (12,17)]
>>> list(mergespans (spans)) [(0, 3), (4, 7), (9, 17)]

Here is a revised version (not tested beyond what you see ;-) >>> def mergespans(span s): ... start = end = None
... for s,e in sorted(spans):
... if start is None:
... start, end = s,e
... continue
... if s <= end:
... if end < e:
... end = e
... continue
... yield start, end
... start,end = s,e
... if start is not None:
... yield start, end
... >>> list(mergespans ([(0,5), (4,7), (2,3), (9,14), (12,17)])) [(0, 7), (9, 17)] >>> list(mergespans ([(0,3), (4,7), (2,5), (9,14), (12,17)]))

[(0, 7), (9, 17)]

Can anyone find any other errors in this? ;-)

With humble regards,
Jim Sizelove


Jul 19 '05 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

3
12803
by: Phil Sandler | last post by:
All, I have a table with start and end dates/times in it, and would like to be able to calculate the number of hours represented, accounting for overlapping records. Note that I am looking for an answer on HOW to do this--I don't necessarily need it to be written for me (although it would not go unappreciated!).
4
2961
by: Simon Elliott | last post by:
Is there an equivalent of std::copy which works on STL containers for overlapping ranges? -- Simon Elliott http://www.ctsn.co.uk
8
1494
by: Joseph | last post by:
I am attempting to create a function that will hide all SPANS within an ided DIV ="idMain", then only display one span, but this function gives an error when it gets to elChildElement as being undefined. The show_hide_Div function is meant to be called when a Radio button is selected. See below. Can anyone help function show_hide_Div(theDiv,DivsToHide,TheDivToShow) { elDiv = document.all.theDiv;
1
1683
by: Jim Di Griz | last post by:
Hi I have a problem to create an XSD that prevents overlapping ranges. Sample XML: <root> <element LowerLimit="1" UpperLimit="50000"> ..... </element>
2
1571
by: Chesne | last post by:
I have a table with and which are in an accommodation db. I would like to be able to determine how many days are occupied for a particular month using the two fields as per above. By adding the to tells me the date they leave. However, for statistical purposes I need to calculate the number of occupied days but when I reach closer to the end of the month some of the "Enddate' are in the following month. Can anyone tell me how to get...
8
5422
by: Rsapru | last post by:
i have a table containing following data effdate termdate uid ----------- ----------- ----------- 1 2 1 3 4 2 5 8 3 7 9 4 11 12 5 12 13 6 3 6 7
9
1984
by: Steven Bethard | last post by:
I have some text and a list of Element objects and their offsets, e.g.:: ... (etree.Element('a'), 0, 21), ... (etree.Element('b'), 11, 18), ... (etree.Element('c'), 18, 18), ... ] I'd like to produce the corresponding ElementTree. So I want to write a get_tree() function that works like::
4
3929
by: Jim | last post by:
Maybe I am making this too difficult; but I have the following issue: A record contains a price for a specific Start Date and End Date. If the user adds a new price, I need to make sure the Start Date doesn't overlap something that already exists. Meaning the new Start Date should not overlap the existing Start Date and End Date. Since I am new to VS2005 I thought perhaps there was a function to look at these dates... Your input is...
7
3178
by: Breal | last post by:
I have a list that looks like the following I would like to be able to determine which of these overlap each other. So, in this case, tuple 1 overlaps with tuples 2 and 3. Tuple 2 overlaps with 1. Tuple 3 overlaps with tuple 1. In my scenario I would have hundreds, if not thousands of these ranges. Any nice pythonic way to do this?
0
8024
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
8432
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
0
8310
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
6781
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
5968
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
5466
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
1
2451
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
1
1561
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
0
1305
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.