By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
455,538 Members | 1,428 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 455,538 IT Pros & Developers. It's quick & easy.

Chunking sequential values in a list

P: n/a
I have this function:

def sequentialChunks(l, stride=1):
chunks = []
chunk = []
for i,v in enumerate(l[:-1]):
v2 = l[i+1]
if v2-v == stride:
if not chunk:
chunk.append(v)
chunk.append(v2)
else:
if not chunk:
chunk.append(v)
chunks.append(chunk)
chunk = []
if chunk:
chunks.append(chunk)
return chunks

Which takes a list of numerical values "l" and splits it into chunks
where each chunk is sequential, where sequential means each value in a
chunk is
separated from the next by "stride".

So sequentialChunks([1,2,3,5,6,8,12]) returns:

[[1,2,3],[5,6],[8],[12]]

I don't think the code above is the most efficient way to do this, but
it is relatively clear. I tried fiddling with list-comprehension ways of
accomplishing it, but kept losing track of things...so if anyone has a
suggestion, I'd appreciate it.

Thanks,
-Dave
--
Presenting:
mediocre nebula.

Jul 13 '06 #1
Share this Question
Share on Google+
8 Replies


P: n/a
It looks like homework. Sometimes the simpler code is better:

def splitter(seq):
if not seq:
return []
result = []
current = [seq[0]]
for pos, el in enumerate(seq[1:]):
if el - current[-1] 1:
result.append(current[:])
current = []
current.append(el)
result.append(current)
return result

print splitter([1, 2, 3, 5, 6, 8, 12])
print splitter([1, 1, 1, 2, 2, 5, 6])
print splitter([1, 1, 1, 1, 1, 1, 1])
print splitter([1, 2, 3])
print splitter([1])
print splitter([])
print splitter([1.0, 1.1, 1.99, 4.01])
print splitter([1.0, 2.0, 3.0, 4.01])

To reduce memory used you can use islice instead of the slice seq[1:]
If you want something less easy to understand, you can try fixing the
following code that doesn't work:

from itertools import groupby
l = [1,2,3,5,6,8,12]
keyf = lambda (pos,x): x-l[pos]>1
[[el[1] for el in gr] for he,gr in groupby(enumerate(l[:-1]), keyf)]

Bye,
bearophile

Jul 13 '06 #2

P: n/a

David Hirschfield wrote:
I have this function:

def sequentialChunks(l, stride=1):
chunks = []
chunk = []
for i,v in enumerate(l[:-1]):
v2 = l[i+1]
if v2-v == stride:
if not chunk:
chunk.append(v)
chunk.append(v2)
else:
if not chunk:
chunk.append(v)
chunks.append(chunk)
chunk = []
if chunk:
chunks.append(chunk)
return chunks

Which takes a list of numerical values "l" and splits it into chunks
where each chunk is sequential, where sequential means each value in a
chunk is
separated from the next by "stride".

So sequentialChunks([1,2,3,5,6,8,12]) returns:

[[1,2,3],[5,6],[8],[12]]

I don't think the code above is the most efficient way to do this, but
it is relatively clear. I tried fiddling with list-comprehension ways of
accomplishing it, but kept losing track of things...so if anyone has a
suggestion, I'd appreciate it.
see the groupby example here:

http://docs.python.org/lib/itertools-example.html

Gerard

Jul 13 '06 #3

P: n/a

GerardDavid Hirschfield wrote:
>I have this function:

def sequentialChunks(l, stride=1):
...
>>
Which takes a list of numerical values "l" and splits it into chunks
where each chunk is sequential...
Gerardsee the groupby example here:

Gerardhttp://docs.python.org/lib/itertools-example.html

I've never been a fan of the operator module, so I find the example in the
docs ugly even though it's succinct. I'd also never used groupby but
thought it was the solution to the problem. As I was walking home from the
train I realized how it would work. You beat me to the punch with your
reply, but I'll post my solution anyway:

from itertools import groupby

class Counter:
def __init__(self):
self.last = None
self.value = True

def __call__(self, val):
if self.last is not None and self.last+1 != val:
self.value = not self.value
self.last = val
return self.value

for key, group in groupby([1,2,3, 5,6, 8, 12,13,14], Counter()):
print list(group)

Skip
Jul 14 '06 #4

P: n/a
David Hirschfield <da****@ilm.comwrites:
So sequentialChunks([1,2,3,5,6,8,12]) returns:
[[1,2,3],[5,6],[8],[12]]
Ugly and not too efficient: find the break points and use them to make
sub-lists.

def sequentialChunks(l, stride=1):
p = [0] + [i for i in xrange(1,len(l)) if l[i]-l[i-1] != stride] + [len(l)]
return [l[p[i]:p[i+1]] for i in xrange(len(p)-1)]

print sequentialChunks([1,2,3,5,6,8,12])
Jul 14 '06 #5

P: n/a

sk**@pobox.com wrote:
GerardDavid Hirschfield wrote:
>I have this function:
>>
>def sequentialChunks(l, stride=1):
...
>>
>Which takes a list of numerical values "l" and splits it into chunks
>where each chunk is sequential...

Gerardsee the groupby example here:

Gerardhttp://docs.python.org/lib/itertools-example.html

I've never been a fan of the operator module, so I find the example in the
docs ugly even though it's succinct. I'd also never used groupby but
thought it was the solution to the problem. As I was walking home from the
train I realized how it would work. You beat me to the punch with your
reply, but I'll post my solution anyway:

from itertools import groupby

class Counter:
def __init__(self):
self.last = None
self.value = True

def __call__(self, val):
if self.last is not None and self.last+1 != val:
self.value = not self.value
self.last = val
return self.value

for key, group in groupby([1,2,3, 5,6, 8, 12,13,14], Counter()):
print list(group)
I've no strong feelings about the operator module myself, or the
groupby example, but I do prefer your code in this case. I tweaked it
a little to take into account the stride, though I'm not sure it gives
what the OP wants.

all the best

Gerard

-----------------------------

class Counter:
def __init__(self, stride=1):
self.last = None
self.value = True
self.stride = stride

def __call__(self, val):
if self.last is not None:
d = val - self.last
if d self.stride:
self.value = not self.value
self.last = val
return self.value

def group(data,stride=1):
for key, group in groupby(data, Counter(stride)):
yield list(group)

-----------------------------------

Jul 14 '06 #6

P: n/a

David Hirschfield wrote:
I have this function:

def sequentialChunks(l, stride=1):
chunks = []
chunk = []
for i,v in enumerate(l[:-1]):
v2 = l[i+1]
if v2-v == stride:
if not chunk:
chunk.append(v)
chunk.append(v2)
else:
if not chunk:
chunk.append(v)
chunks.append(chunk)
chunk = []
if chunk:
chunks.append(chunk)
return chunks

Which takes a list of numerical values "l" and splits it into chunks
where each chunk is sequential, where sequential means each value in a
chunk is
separated from the next by "stride".

So sequentialChunks([1,2,3,5,6,8,12]) returns:

[[1,2,3],[5,6],[8],[12]]

I don't think the code above is the most efficient way to do this, but
it is relatively clear. I tried fiddling with list-comprehension ways of
accomplishing it, but kept losing track of things...so if anyone has a
suggestion, I'd appreciate it.
Gerard wrote:
>see the groupby example here:

http://docs.python.org/lib/itertools-example.html
tweaking the example from the docs to take the stride into account:

def stride(length):
i = 0
while True:
yield i
i += length

def group(data,d=1):
for k, g in groupby(zip(stride(d),data), lambda (i,x):i-x):
yield map(operator.itemgetter(1), g)

data = [1,2,3, 5, 6, 8, 12,13,14]

print list( group( data,2 ) )

hth

Gerard

Jul 14 '06 #7

P: n/a
be************@lycos.com wrote:
If you want something less easy to understand, you can try fixing the
following code that doesn't work:

from itertools import groupby
l = [1,2,3,5,6,8,12]
keyf = lambda (pos,x): x-l[pos]>1
[[el[1] for el in gr] for he,gr in groupby(enumerate(l[:-1]), keyf)]
>>from itertools import groupby
items = [1, 2, 3, 5, 6, 8, 12]
[[v for i, v in group] for key, group in groupby(enumerate(items),
lambda (i, v): i-v)]
[[1, 2, 3], [5, 6], [8], [12]]

which is almost identical to the last example in
http://docs.python.org/lib/itertools-example.html

Peter
Jul 14 '06 #8

P: n/a
Peter Otten:
which is almost identical to the last example in
http://docs.python.org/lib/itertools-example.html
I see, thank you. I haven't had enoug time and brain to fix it (and the
OP question seemed like homework, so leaving some other things to do is
positive).

I think still that too much clever code can be bad, it isn't easy to
understand, fix and modify if you need something a little different.
To solve such problem in 'production code' I probably prefer a longer
function like the one I have shown.

Bye,
bearophile

Jul 14 '06 #9

This discussion thread is closed

Replies have been disabled for this discussion.