Break up list into groups

danmcleran

All,

I can't seem to find an answer to this question anywhere, but I'm
still looking. My problem is I have a list of values like this:

l = [0xF0, 1, 2, 3, 0xF0, 4, 5, 6, 0xF1, 7, 8, 0xF2, 9, 10, 11, 12,
13, 0xF0, 14, 0xF1, 15]

A value with bit 0x80 set delineates the start of a new packet of
information. What I want to do is to group the packets so that 1, 2, 3
go with the 1st packet tagged 0xF0, 4 ,5, 6 go with the 2nd packet
tagged 0xF0, 7 & 8 go with the packet tagged 0xF1 and so on. The
length of the data associated with each tag can vary. I've already
written an algorithm to do this but I was wondering if some
combination of itertools functions could do the job faster?

Here's what I've done and the expected output of the algorithm:

def splitIntoGroups(data):
groups = []
local = []

for value in data:
if 0x80 & value:
if len(local) 0:
groups.append(local)

local = []
local.append(value)
else:
local.append(value)

if len(local) 0:
groups.append(local)

return groups

l = [0xF0, 1, 2, 3, 0xF0, 4, 5, 6, 0xF1, 7, 8, 0xF2, 9, 10, 11, 12,
13, 0xF0, 14, 0xF1, 15]

print splitIntoGroups(l)

Desired result:

[[240, 1, 2, 3], [240, 4, 5, 6], [241, 7, 8], [242, 9, 10, 11, 12,
13], [240, 14], [241, 15]]

Thanks,

Dan McLeran

Jul 16 '07 #1

Subscribe Post Reply

1547

marduk

On Mon, 2007-07-16 at 14:11 -0700, da********@yahoo.com wrote:

I can't seem to find an answer to this question anywhere, but I'm
still looking. My problem is I have a list of values like this:

l = [0xF0, 1, 2, 3, 0xF0, 4, 5, 6, 0xF1, 7, 8, 0xF2, 9, 10, 11, 12,
13, 0xF0, 14, 0xF1, 15]

A value with bit 0x80 set delineates the start of a new packet of
information. What I want to do is to group the packets so that 1, 2, 3
go with the 1st packet tagged 0xF0, 4 ,5, 6 go with the 2nd packet
tagged 0xF0, 7 & 8 go with the packet tagged 0xF1 and so on. The
length of the data associated with each tag can vary. I've already
written an algorithm to do this but I was wondering if some
combination of itertools functions could do the job faster?

Here's what I've done and the expected output of the algorithm:

def splitIntoGroups(data):
groups = []
local = []

for value in data:
if 0x80 & value:
if len(local) 0:
groups.append(local)

local = []
local.append(value)
else:
local.append(value)

if len(local) 0:
groups.append(local)

return groups

l = [0xF0, 1, 2, 3, 0xF0, 4, 5, 6, 0xF1, 7, 8, 0xF2, 9, 10, 11, 12,
13, 0xF0, 14, 0xF1, 15]

print splitIntoGroups(l)

Desired result:

[[240, 1, 2, 3], [240, 4, 5, 6], [241, 7, 8], [242, 9, 10, 11, 12,
13], [240, 14], [241, 15]]

Assuming you meant '0xF0' instead of '0x80'.... do you mean any value

>=240 starts a new group? If so:

groups = []
current = [] # probably not necessary, but as a safety
for i in l:
if i >= 240:
current = []
groups.append(current)
current.append(i)

Jul 16 '07 #2

marduk

On Mon, 2007-07-16 at 16:31 -0500, marduk wrote:

Assuming you meant '0xF0' instead of '0x80'.... do you mean any value
=240 starts a new group? If so:

groups = []
current = [] # probably not necessary, but as a safety
for i in l:
if i >= 240:
current = []
groups.append(current)
current.append(i)

Misunderstood... actually that should have read

groups = []
current = [] # probably not necessary, but as a safety
for i in l:
if 240 & i:
current = []
groups.append(current)
current.append(i)

Jul 16 '07 #3

Matimus

This is a perfect place for a generator:

<code>
seq = [0xF0, 1, 2, 3, 0xF0, 4, 5, 6, 0xF1, 7, 8, 0xF2, 9, 10, 11, 12,
13, 0xF0, 14, 0xF1, 15]

def gengroups(seq):
group = []
for val in seq:
if val & 0x80 and group:
yield group
group = []
group.append(val)
yield group

if __name__ == "__main__":
print list(gengroups(seq))
</code>

The above assumes that the first value in the input sequence will have
0x80 set. Your implementation seems to makes the same assumption
though.

Also, just a note...
<code>
if len(local) 0:
...
</code>

is better written

<code>
if local:
...
</code>

Jul 16 '07 #4

Paul Rubin

"da********@yahoo.com" <da********@yahoo.comwrites:

I can't seem to find an answer to this question anywhere, but I'm
still looking. My problem is I have a list of values like this:

l = [0xF0, 1, 2, 3, 0xF0, 4, 5, 6, 0xF1, 7, 8, 0xF2, 9, 10, 11, 12,
13, 0xF0, 14, 0xF1, 15]

A value with bit 0x80 set delineates the start of a new packet of
information. What I want to do is to group the packets so that 1, 2, 3
go with the 1st packet tagged 0xF0, 4 ,5, 6 go with the 2nd packet
tagged 0xF0, 7 & 8 go with the packet tagged 0xF1 and so on. The
length of the data associated with each tag can vary. I've already
written an algorithm to do this but I was wondering if some
combination of itertools functions could do the job faster?

See:

http://groups.google.com/group/comp....10c95c7f3b3654

Jul 16 '07 #5

danmcleran

On Jul 16, 3:56 pm, marduk <mar...@nbk.hopto.orgwrote:

On Mon, 2007-07-16 at 16:31 -0500, marduk wrote:
Assuming you meant '0xF0' instead of '0x80'.... do you mean any value
>=240 starts a new group? If so:

No, I meant 0x80. 0x80 is only the most significant bit of the 0xF0
value. Every time this bit is set indicates the start of a new group.
Anyway, I like this algorithm better than mine, less code. I've
modified yours to make the fn a parameter:

def splitIntoGroups(data, fn):
groups = []
current = []
for i in data:
if fn(i):
current = []
groups.append(current)
current.append(i)

return groups

print splitIntoGroups(l, lambda x : x & 0x80)

Jul 16 '07 #6

danmcleran

Here's an ugly way I wouldn't recommended (using itertools groupby):

>
pyfrom itertools import groupby
pyalist = [0xF0, 1, 2, 3, 0xF0, 4, 5, 6,
... 0xF1, 7, 8, 0xF2, 9, 10, 11, 12, 13,
... 0xF0, 14, 0xF1, 15]
pydef doit(alist):
... i = (list(g) for k,g in groupby(alist, lambda x: 0xf0&x))
... return [k for k in [j + i.next() for j in i] if len(k)>1]
...
pydoit(alist)

[[240, 1, 2, 3],
[240, 4, 5, 6],
[241, 7, 8],
[242, 9, 10, 11, 12, 13],
[240, 14],
[241, 15]]

James

Wow that is ugly. I think I like the simpler way better.

Thanks

Jul 16 '07 #7

Ron Adam

Matt McCredie wrote:

That certainly is fast, unfortunately it doesn't pass all of the tests.
I came up with those tests so I don't know how important they are to the
original poster. I modified it and came up with a generator and a
non-generator version based (roughly) on your algorithm, that are almost
as quick, and pass all of the tests. Some of the modifications were done
just to make it quicker, so it would be fair when comparing against the
other methods. I hard-coded the comparison instead of using a function
and created a function that directly generates and returns a list
instead of a generator. I would probably use the generator version in my
code, but wrapping `list' around a generator adds about 4us (on my
machine). Anyway, getgroups7 passes all of the tests I mentioned and it
was timed at 10.37usec/pass. The down side: the code doesn't seem nearly
as elegant.

Matt

In most cases you wouldn't wrap the generator version in a list(), but use
it directly as a loop iterator.
A little renaming of variables helps it be a bit more elegant I think ...

def getgroups8(seq):
groups = []
iseq = iter(xrange(len(seq)))
for start in iseq:
if seq[start] & 0x80:
for stop in iseq:
if seq[stop] & 0x80:
groups.append(seq[start:stop])
start = stop
groups.append(seq[start:])
return groups

This passes all the tests and runs about the same speed.
Cheers,
Ron

<code>
def gengroups7(seq):
iseq = iter(xrange(len(seq)))
start = 0
for i in iseq:
if seq[i]&0x80:
start = i
break
else:
return
for i in iseq:
if seq[i]&0x80:
yield seq[start:i]
start = i
yield seq[start:]
def getgroups7(seq):
groups = []
iseq = iter(xrange(len(seq)))
start = 0
for i in iseq:
if seq[i]&0x80:
start = i
break
else:
return groups
for i in iseq:
if seq[i]&0x80:
groups.append(seq[start:i])
start = i
groups.append(seq[start:])
return groups

</code>

Jul 18 '07 #8

Ron Adam

Matt McCredie wrote:

That certainly is fast, unfortunately it doesn't pass all of the tests.
I came up with those tests so I don't know how important they are to the
original poster. I modified it and came up with a generator and a
non-generator version based (roughly) on your algorithm, that are almost
as quick, and pass all of the tests. Some of the modifications were done
just to make it quicker, so it would be fair when comparing against the
other methods. I hard-coded the comparison instead of using a function
and created a function that directly generates and returns a list
instead of a generator. I would probably use the generator version in my
code, but wrapping `list' around a generator adds about 4us (on my
machine). Anyway, getgroups7 passes all of the tests I mentioned and it
was timed at 10.37usec/pass. The down side: the code doesn't seem nearly
as elegant.

Matt

<code>
def gengroups7(seq):
iseq = iter(xrange(len(seq)))
start = 0
for i in iseq:
if seq[i]&0x80:
start = i
break
else:
return
for i in iseq:
if seq[i]&0x80:
yield seq[start:i]
start = i
yield seq[start:]
def getgroups7(seq):
groups = []
iseq = iter(xrange(len(seq)))
start = 0
for i in iseq:
if seq[i]&0x80:
start = i
break
else:
return groups
for i in iseq:
if seq[i]&0x80:
groups.append(seq[start:i])
start = i
groups.append(seq[start:])
return groups

</code>

Jul 18 '07 #9

Miki

Hello Dan,

Yet another option (using itertools.groupby):

from itertools import groupby

class GrouperToggler:
def __init__(self):
self.group = 1

def __call__(self, value):
# New packet, toggle group
if value & 0x80:
self.group = 1 - self.group
return self.group

def group(items):
for group, items in groupby(items, GrouperToggler()):
# groupby return [key, group_iterator]
yield [item for item in items]

i = [
0xF0, 1, 2, 3,
0xF0, 4, 5, 6,
0xF1, 7, 8,
0xF2, 9, 10, 11, 12, 13,
0xF0, 14,
0xF1, 15
]

for g in group(i):
print g

HTH,
--
Miki <mi*********@gmail.com>
http://pythonwise.blogspot.com

Jul 18 '07 #10

Matimus

A little renaming of variables helps it be a bit more elegant I think ...

>
def getgroups8(seq):
groups = []
iseq = iter(xrange(len(seq)))
for start in iseq:
if seq[start] & 0x80:
for stop in iseq:
if seq[stop] & 0x80:
groups.append(seq[start:stop])
start = stop
groups.append(seq[start:])
return groups

Excellent work! One more modification and I get below 10us/pass:

def getgroups(seq):
groups = []
push = groups.append
iseq = iter(xrange(len(seq)))
for start in iseq:
if seq[start] & 0x80:
for stop in iseq:
if seq[stop] & 0x80:
push(seq[start:stop])
start = stop
push(seq[start:])
return groups

-Matt

Jul 19 '07 #11

Ron Adam

Matimus wrote:

Excellent work! One more modification and I get below 10us/pass:

def getgroups(seq):
groups = []
push = groups.append
iseq = iter(xrange(len(seq)))
for start in iseq:
if seq[start] & 0x80:
for stop in iseq:
if seq[stop] & 0x80:
push(seq[start:stop])
start = stop
push(seq[start:])
return groups

-Matt

Looks good to me. :-)

So a generator versions would be...

(not tested)

def getgroups(seq):
iseq = iter(xrange(len(seq)))
for start in iseq:
if seq[start] & 0x80:
for stop in iseq:
if seq[stop] & 0x80:
yield seq[start:stop]
start = stop
yield seq[start:]

(I also wanted to compare this to Georges solution, maybe later.)

Now if there is some way to generalize this so it can be used in a broader
range of situations without loosing too much of it's efficiency. Of course
then maybe group by would be better. <shrug>

Cheers,
Ron

Jul 20 '07 #12

Ron Adam

Matimus wrote:

Excellent work! One more modification and I get below 10us/pass:

def getgroups(seq):
groups = []
push = groups.append
iseq = iter(xrange(len(seq)))
for start in iseq:
if seq[start] & 0x80:
for stop in iseq:
if seq[stop] & 0x80:
push(seq[start:stop])
start = stop
push(seq[start:])
return groups

-Matt

Jul 20 '07 #13

by: MetalOne | last post by:

The following does what I want, but I feel like this could maybe be a one liner. I just can't think of anything shorter. If there is nothing shorter, does this seem like a candidate for inclusion...

Python

Sequences in list

by: temp | last post by:

Hi All, I wonder could someone help me with this? What I want to do is search through a list of letters and look for adjacent groups of letters that form sequences, not in the usual way of...

Python

labeled break/continue

by: Matt Chisholm | last post by:

Hi. I was wondering if there had ever been an official decision on the idea of adding labeled break and continue functionality to Python. I've found a few places where the idea has come up, in...

Python

Easy Steps to Fix "Canon Printer Won't Connect to WiFi Network"

by: taylorcarr | last post by:

A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...

General

How to turn on java script in a villaon keypad mobile phone

by: Charles Arthur | last post by:

How do i turn on java script on a villaon, callus and itel keypad mobile phone

Java

Batch import of multiple excel files into the database

by: ryjfgjl | last post by:

If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...

Data Management

Navigating the Data Structures and Algorithms (DSA)

by: BarryA | last post by:

What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...

Algorithms / Advanced Math

Is that possible of reading the .csv file in column wise and the column have different lengths ?

by: Sonnysonu | last post by:

This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

C / C++

How to build RAID in BIOS?

by: Hystou | last post by:

There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

Computer Hardware

What is ONU?

by: marktang | last post by:

ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...

General

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...

Windows Server

Maximizing Business Potential: The Nexus of Website Design and Digital Marketing

by: jinu1996 | last post by:

In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

Online Marketing

Break up list into groups

Similar topics