473,388 Members | 1,352 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,388 software developers and data experts.

Break up list into groups

All,

I can't seem to find an answer to this question anywhere, but I'm
still looking. My problem is I have a list of values like this:

l = [0xF0, 1, 2, 3, 0xF0, 4, 5, 6, 0xF1, 7, 8, 0xF2, 9, 10, 11, 12,
13, 0xF0, 14, 0xF1, 15]

A value with bit 0x80 set delineates the start of a new packet of
information. What I want to do is to group the packets so that 1, 2, 3
go with the 1st packet tagged 0xF0, 4 ,5, 6 go with the 2nd packet
tagged 0xF0, 7 & 8 go with the packet tagged 0xF1 and so on. The
length of the data associated with each tag can vary. I've already
written an algorithm to do this but I was wondering if some
combination of itertools functions could do the job faster?

Here's what I've done and the expected output of the algorithm:

def splitIntoGroups(data):
groups = []
local = []

for value in data:
if 0x80 & value:
if len(local) 0:
groups.append(local)

local = []
local.append(value)
else:
local.append(value)

if len(local) 0:
groups.append(local)

return groups

l = [0xF0, 1, 2, 3, 0xF0, 4, 5, 6, 0xF1, 7, 8, 0xF2, 9, 10, 11, 12,
13, 0xF0, 14, 0xF1, 15]

print splitIntoGroups(l)

Desired result:

[[240, 1, 2, 3], [240, 4, 5, 6], [241, 7, 8], [242, 9, 10, 11, 12,
13], [240, 14], [241, 15]]

Thanks,

Dan McLeran

Jul 16 '07 #1
12 1547
On Mon, 2007-07-16 at 14:11 -0700, da********@yahoo.com wrote:
I can't seem to find an answer to this question anywhere, but I'm
still looking. My problem is I have a list of values like this:

l = [0xF0, 1, 2, 3, 0xF0, 4, 5, 6, 0xF1, 7, 8, 0xF2, 9, 10, 11, 12,
13, 0xF0, 14, 0xF1, 15]

A value with bit 0x80 set delineates the start of a new packet of
information. What I want to do is to group the packets so that 1, 2, 3
go with the 1st packet tagged 0xF0, 4 ,5, 6 go with the 2nd packet
tagged 0xF0, 7 & 8 go with the packet tagged 0xF1 and so on. The
length of the data associated with each tag can vary. I've already
written an algorithm to do this but I was wondering if some
combination of itertools functions could do the job faster?

Here's what I've done and the expected output of the algorithm:

def splitIntoGroups(data):
groups = []
local = []

for value in data:
if 0x80 & value:
if len(local) 0:
groups.append(local)

local = []
local.append(value)
else:
local.append(value)

if len(local) 0:
groups.append(local)

return groups

l = [0xF0, 1, 2, 3, 0xF0, 4, 5, 6, 0xF1, 7, 8, 0xF2, 9, 10, 11, 12,
13, 0xF0, 14, 0xF1, 15]

print splitIntoGroups(l)

Desired result:

[[240, 1, 2, 3], [240, 4, 5, 6], [241, 7, 8], [242, 9, 10, 11, 12,
13], [240, 14], [241, 15]]
Assuming you meant '0xF0' instead of '0x80'.... do you mean any value
>=240 starts a new group? If so:
groups = []
current = [] # probably not necessary, but as a safety
for i in l:
if i >= 240:
current = []
groups.append(current)
current.append(i)
Jul 16 '07 #2
On Mon, 2007-07-16 at 16:31 -0500, marduk wrote:
Assuming you meant '0xF0' instead of '0x80'.... do you mean any value
=240 starts a new group? If so:

groups = []
current = [] # probably not necessary, but as a safety
for i in l:
if i >= 240:
current = []
groups.append(current)
current.append(i)

Misunderstood... actually that should have read

groups = []
current = [] # probably not necessary, but as a safety
for i in l:
if 240 & i:
current = []
groups.append(current)
current.append(i)

Jul 16 '07 #3
This is a perfect place for a generator:

<code>
seq = [0xF0, 1, 2, 3, 0xF0, 4, 5, 6, 0xF1, 7, 8, 0xF2, 9, 10, 11, 12,
13, 0xF0, 14, 0xF1, 15]

def gengroups(seq):
group = []
for val in seq:
if val & 0x80 and group:
yield group
group = []
group.append(val)
yield group

if __name__ == "__main__":
print list(gengroups(seq))
</code>

The above assumes that the first value in the input sequence will have
0x80 set. Your implementation seems to makes the same assumption
though.

Also, just a note...
<code>
if len(local) 0:
...
</code>

is better written

<code>
if local:
...
</code>

Jul 16 '07 #4
"da********@yahoo.com" <da********@yahoo.comwrites:
I can't seem to find an answer to this question anywhere, but I'm
still looking. My problem is I have a list of values like this:

l = [0xF0, 1, 2, 3, 0xF0, 4, 5, 6, 0xF1, 7, 8, 0xF2, 9, 10, 11, 12,
13, 0xF0, 14, 0xF1, 15]

A value with bit 0x80 set delineates the start of a new packet of
information. What I want to do is to group the packets so that 1, 2, 3
go with the 1st packet tagged 0xF0, 4 ,5, 6 go with the 2nd packet
tagged 0xF0, 7 & 8 go with the packet tagged 0xF1 and so on. The
length of the data associated with each tag can vary. I've already
written an algorithm to do this but I was wondering if some
combination of itertools functions could do the job faster?
See:

http://groups.google.com/group/comp....10c95c7f3b3654
Jul 16 '07 #5
On Jul 16, 3:56 pm, marduk <mar...@nbk.hopto.orgwrote:
On Mon, 2007-07-16 at 16:31 -0500, marduk wrote:
Assuming you meant '0xF0' instead of '0x80'.... do you mean any value
>=240 starts a new group? If so:
No, I meant 0x80. 0x80 is only the most significant bit of the 0xF0
value. Every time this bit is set indicates the start of a new group.
Anyway, I like this algorithm better than mine, less code. I've
modified yours to make the fn a parameter:

def splitIntoGroups(data, fn):
groups = []
current = []
for i in data:
if fn(i):
current = []
groups.append(current)
current.append(i)

return groups

print splitIntoGroups(l, lambda x : x & 0x80)

Jul 16 '07 #6
Here's an ugly way I wouldn't recommended (using itertools groupby):
>
pyfrom itertools import groupby
pyalist = [0xF0, 1, 2, 3, 0xF0, 4, 5, 6,
... 0xF1, 7, 8, 0xF2, 9, 10, 11, 12, 13,
... 0xF0, 14, 0xF1, 15]
pydef doit(alist):
... i = (list(g) for k,g in groupby(alist, lambda x: 0xf0&x))
... return [k for k in [j + i.next() for j in i] if len(k)>1]
...
pydoit(alist)

[[240, 1, 2, 3],
[240, 4, 5, 6],
[241, 7, 8],
[242, 9, 10, 11, 12, 13],
[240, 14],
[241, 15]]

James
Wow that is ugly. I think I like the simpler way better.

Thanks

Jul 16 '07 #7


Matt McCredie wrote:
That certainly is fast, unfortunately it doesn't pass all of the tests.
I came up with those tests so I don't know how important they are to the
original poster. I modified it and came up with a generator and a
non-generator version based (roughly) on your algorithm, that are almost
as quick, and pass all of the tests. Some of the modifications were done
just to make it quicker, so it would be fair when comparing against the
other methods. I hard-coded the comparison instead of using a function
and created a function that directly generates and returns a list
instead of a generator. I would probably use the generator version in my
code, but wrapping `list' around a generator adds about 4us (on my
machine). Anyway, getgroups7 passes all of the tests I mentioned and it
was timed at 10.37usec/pass. The down side: the code doesn't seem nearly
as elegant.

Matt
In most cases you wouldn't wrap the generator version in a list(), but use
it directly as a loop iterator.
A little renaming of variables helps it be a bit more elegant I think ...

def getgroups8(seq):
groups = []
iseq = iter(xrange(len(seq)))
for start in iseq:
if seq[start] & 0x80:
for stop in iseq:
if seq[stop] & 0x80:
groups.append(seq[start:stop])
start = stop
groups.append(seq[start:])
return groups

This passes all the tests and runs about the same speed.
Cheers,
Ron


<code>
def gengroups7(seq):
iseq = iter(xrange(len(seq)))
start = 0
for i in iseq:
if seq[i]&0x80:
start = i
break
else:
return
for i in iseq:
if seq[i]&0x80:
yield seq[start:i]
start = i
yield seq[start:]
def getgroups7(seq):
groups = []
iseq = iter(xrange(len(seq)))
start = 0
for i in iseq:
if seq[i]&0x80:
start = i
break
else:
return groups
for i in iseq:
if seq[i]&0x80:
groups.append(seq[start:i])
start = i
groups.append(seq[start:])
return groups

</code>

Jul 18 '07 #8


Matt McCredie wrote:
That certainly is fast, unfortunately it doesn't pass all of the tests.
I came up with those tests so I don't know how important they are to the
original poster. I modified it and came up with a generator and a
non-generator version based (roughly) on your algorithm, that are almost
as quick, and pass all of the tests. Some of the modifications were done
just to make it quicker, so it would be fair when comparing against the
other methods. I hard-coded the comparison instead of using a function
and created a function that directly generates and returns a list
instead of a generator. I would probably use the generator version in my
code, but wrapping `list' around a generator adds about 4us (on my
machine). Anyway, getgroups7 passes all of the tests I mentioned and it
was timed at 10.37usec/pass. The down side: the code doesn't seem nearly
as elegant.

Matt
In most cases you wouldn't wrap the generator version in a list(), but use
it directly as a loop iterator.
A little renaming of variables helps it be a bit more elegant I think ...

def getgroups8(seq):
groups = []
iseq = iter(xrange(len(seq)))
for start in iseq:
if seq[start] & 0x80:
for stop in iseq:
if seq[stop] & 0x80:
groups.append(seq[start:stop])
start = stop
groups.append(seq[start:])
return groups

This passes all the tests and runs about the same speed.
Cheers,
Ron


<code>
def gengroups7(seq):
iseq = iter(xrange(len(seq)))
start = 0
for i in iseq:
if seq[i]&0x80:
start = i
break
else:
return
for i in iseq:
if seq[i]&0x80:
yield seq[start:i]
start = i
yield seq[start:]
def getgroups7(seq):
groups = []
iseq = iter(xrange(len(seq)))
start = 0
for i in iseq:
if seq[i]&0x80:
start = i
break
else:
return groups
for i in iseq:
if seq[i]&0x80:
groups.append(seq[start:i])
start = i
groups.append(seq[start:])
return groups

</code>

Jul 18 '07 #9
Hello Dan,

Yet another option (using itertools.groupby):

from itertools import groupby

class GrouperToggler:
def __init__(self):
self.group = 1

def __call__(self, value):
# New packet, toggle group
if value & 0x80:
self.group = 1 - self.group
return self.group

def group(items):
for group, items in groupby(items, GrouperToggler()):
# groupby return [key, group_iterator]
yield [item for item in items]

i = [
0xF0, 1, 2, 3,
0xF0, 4, 5, 6,
0xF1, 7, 8,
0xF2, 9, 10, 11, 12, 13,
0xF0, 14,
0xF1, 15
]

for g in group(i):
print g

HTH,
--
Miki <mi*********@gmail.com>
http://pythonwise.blogspot.com

Jul 18 '07 #10
A little renaming of variables helps it be a bit more elegant I think ...
>
def getgroups8(seq):
groups = []
iseq = iter(xrange(len(seq)))
for start in iseq:
if seq[start] & 0x80:
for stop in iseq:
if seq[stop] & 0x80:
groups.append(seq[start:stop])
start = stop
groups.append(seq[start:])
return groups
Excellent work! One more modification and I get below 10us/pass:

def getgroups(seq):
groups = []
push = groups.append
iseq = iter(xrange(len(seq)))
for start in iseq:
if seq[start] & 0x80:
for stop in iseq:
if seq[stop] & 0x80:
push(seq[start:stop])
start = stop
push(seq[start:])
return groups

-Matt

Jul 19 '07 #11

Matimus wrote:
Excellent work! One more modification and I get below 10us/pass:

def getgroups(seq):
groups = []
push = groups.append
iseq = iter(xrange(len(seq)))
for start in iseq:
if seq[start] & 0x80:
for stop in iseq:
if seq[stop] & 0x80:
push(seq[start:stop])
start = stop
push(seq[start:])
return groups

-Matt

Looks good to me. :-)

So a generator versions would be...

(not tested)

def getgroups(seq):
iseq = iter(xrange(len(seq)))
for start in iseq:
if seq[start] & 0x80:
for stop in iseq:
if seq[stop] & 0x80:
yield seq[start:stop]
start = stop
yield seq[start:]

(I also wanted to compare this to Georges solution, maybe later.)

Now if there is some way to generalize this so it can be used in a broader
range of situations without loosing too much of it's efficiency. Of course
then maybe group by would be better. <shrug>

Cheers,
Ron
Jul 20 '07 #12

Matimus wrote:
Excellent work! One more modification and I get below 10us/pass:

def getgroups(seq):
groups = []
push = groups.append
iseq = iter(xrange(len(seq)))
for start in iseq:
if seq[start] & 0x80:
for stop in iseq:
if seq[stop] & 0x80:
push(seq[start:stop])
start = stop
push(seq[start:])
return groups

-Matt

Looks good to me. :-)

So a generator versions would be...

(not tested)

def getgroups(seq):
iseq = iter(xrange(len(seq)))
for start in iseq:
if seq[start] & 0x80:
for stop in iseq:
if seq[stop] & 0x80:
yield seq[start:stop]
start = stop
yield seq[start:]

(I also wanted to compare this to Georges solution, maybe later.)

Now if there is some way to generalize this so it can be used in a broader
range of situations without loosing too much of it's efficiency. Of course
then maybe group by would be better. <shrug>

Cheers,
Ron

Jul 20 '07 #13

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
by: MetalOne | last post by:
The following does what I want, but I feel like this could maybe be a one liner. I just can't think of anything shorter. If there is nothing shorter, does this seem like a candidate for inclusion...
4
by: temp | last post by:
Hi All, I wonder could someone help me with this? What I want to do is search through a list of letters and look for adjacent groups of letters that form sequences, not in the usual way of...
1
by: Matt Chisholm | last post by:
Hi. I was wondering if there had ever been an official decision on the idea of adding labeled break and continue functionality to Python. I've found a few places where the idea has come up, in...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.