Bytes IT Community

groupby

can someone explain why in the 2nd example, m doesn't print the list [1, 1, 1]
which i had expected?

>>> for k, g in groupby([1, 1, 1, 2, 2, 3]):
...     print k, list(g)
...
1 [1, 1, 1]
2 [2, 2]
3 [3]

>>> m = list(groupby([1, 1, 1, 2, 2, 3]))
>>> m
[(1, <itertools._grouper object at 0x00AAC600>), (2, <itertools._grouper object
at 0x00AAC5A0>), (3, <itertools._grouper object at 0x00AAC5B0>)]
>>> list(m[0][1])
[]

thanks,

bryan

May 23 '06 #1
4 Replies


Bryan wrote:
> can someone explain why in the 2nd example, m doesn't print the list [1, 1, 1]
> which i had expected?
> <snip>

I've tripped on this more than once, but it's in the docs
(http://docs.python.org/lib/itertools-functions.html):

"The returned group is itself an iterator that shares the underlying
iterable with groupby(). Because the source is shared, when the groupby
object is advanced, the previous group is no longer visible. So, if
that data is needed later, it should be stored as a list"

George
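
A minimal sketch of what the quoted passage describes, written in Python 3 syntax (the thread itself uses Python 2). The variable names `eager`, `lazy` and `stale` are made up for illustration. It compares consuming each group right away with storing the group iterators and reading them only after the groupby object has moved on:

```python
from itertools import groupby

data = [1, 1, 1, 2, 2, 3]

# Copy each group into a list while groupby is still positioned on it.
eager = [(key, list(group)) for key, group in groupby(data)]

# Exhaust the groupby object first, keeping only the group iterators.
# Each group reads from the same underlying iterator as groupby itself,
# so a group that groupby has already moved past has nothing left to yield.
lazy = list(groupby(data))
stale = [(key, list(group)) for key, group in lazy]

print(eager)
print(stale)
```

The keys survive in both cases; only the contents of the groups are lost when they are read too late.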

May 23 '06 #2

George Sakkis wrote:
> <snip>
> "The returned group is itself an iterator that shares the underlying
> iterable with groupby(). Because the source is shared, when the groupby
> object is advanced, the previous group is no longer visible. So, if
> that data is needed later, it should be stored as a list"

i read that description in the docs so many times before i posted here. now that
i read it about 10 more times, i finally get it. there's just something about
the wording that kept tripping me up, but i can't explain why :)

thanks,

bryan

May 23 '06 #3

"Bryan" <be****@gmail.com> wrote in message
news:ma***************************************@python.org...
> i read that description in the docs so many times before i posted here. now
> that i read it about 10 more times, i finally get it. <snip>

So here's how to save the values from the iterators while iterating over the
groupby:

>>> m = [(x,list(y)) for x,y in groupby([1, 1, 1, 2, 2, 3])]
>>> m
[(1, [1, 1, 1]), (2, [2, 2]), (3, [3])]

-- Paul
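
The same idea can be wrapped in a small helper. This is only a sketch in Python 3 syntax; `materialize_groups` and the sample `words` list are hypothetical names, not something from the thread or from itertools:

```python
from itertools import groupby

def materialize_groups(iterable, key=None):
    """Return [(key, [items, ...]), ...] with every group copied into a list."""
    # list(g) runs while groupby is still positioned on that group,
    # so the items are captured before the shared iterator moves on.
    return [(k, list(g)) for k, g in groupby(iterable, key)]

words = ["apple", "avocado", "banana", "blueberry", "cherry"]
by_letter = materialize_groups(words, key=lambda w: w[0])
print(by_letter)
# [('a', ['apple', 'avocado']), ('b', ['banana', 'blueberry']), ('c', ['cherry'])]
```

Because the result holds plain lists, it can be indexed or iterated again later without the empty-group surprise from the original question.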
May 23 '06 #4

"Paul McGuire" <pt***@austin.rr._bogus_.com> wrote in message
news:bz******************@tornado.texas.rr.com...
> So here's how to save the values from the iterators while iterating over the
> groupby:
> <snip>

Playing some more with groupby. Here's a one-liner to tally a list of
integers into a histogram:

import itertools
import random

# create data set, random selection of numbers from 1-10
dataValueRange = range(1,11)
data = [random.choice(dataValueRange) for i in xrange(10)]
print data

# tally values into histogram
# (from the inside out):
# - sort data into ascending order, so groupby will see all like values
#   together
# - call groupby, return iterator of (value,valueItemIterator) tuples
# - tally groupby results into a dict mapping value to valueFrequency
# - expand dict into histogram list, filling in zeroes for any keys that
#   didn't get a value
hist = [ (k1,dict((k,len(list(g))) for k,g in
          itertools.groupby(sorted(data))).get(k1,0)) for k1 in dataValueRange ]

print hist

Gives:
[9, 6, 8, 3, 2, 3, 10, 7, 6, 2]
[(1, 0), (2, 2), (3, 2), (4, 0), (5, 0), (6, 2), (7, 1), (8, 1), (9, 1),
(10, 1)]
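
The sort step matters because groupby only merges adjacent equal items. A small sketch in Python 3 syntax, using a made-up five-element list rather than the data above:

```python
from itertools import groupby

data = [2, 1, 2, 2, 1]

# Without sorting, each run of equal neighbours becomes its own group,
# so the same key can appear more than once.
unsorted_runs = [(k, len(list(g))) for k, g in groupby(data)]

# After sorting, all equal values are adjacent and each key appears once.
sorted_runs = [(k, len(list(g))) for k, g in groupby(sorted(data))]

print(unsorted_runs)  # [(2, 1), (1, 1), (2, 2), (1, 1)]
print(sorted_runs)    # [(1, 2), (2, 3)]
```

Building a dict from the unsorted runs would silently keep only the last run length for each key, which is why the tally sorts first.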

Change the generation of the original data list to 10,000 values, and you
get something like:
[(1, 995), (2, 986), (3, 941), (4, 998), (5, 978), (6, 1007), (7, 997), (8,
1033), (9, 1038), (10, 1027)]
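
For readers on Python 3, here is a rough equivalent of the tally, split into steps so the dict is built once instead of once per key. The names `value_range` and `counts` and the fixed seed are assumptions for illustration, not part of the original post:

```python
import random
from itertools import groupby

value_range = range(1, 11)
random.seed(0)  # assumption: fixed seed only so repeated runs give the same data
data = [random.choice(value_range) for _ in range(10)]

# Sort so equal values are adjacent, then count the items in each group.
counts = {k: sum(1 for _ in g) for k, g in groupby(sorted(data))}

# Expand into a histogram list, filling in zero for values that never occurred.
hist = [(v, counts.get(v, 0)) for v in value_range]

print(data)
print(hist)
```

Counting with `sum(1 for _ in g)` avoids building a throwaway list per group, though `len(list(g))` as in the one-liner gives the same numbers.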

If you know there won't be any zero frequency values (or don't care about
them), you can skip the fill-in-the-zeros step, with one of these
expressions:
histAsList = [ (k,len(list(g))) for k,g in itertools.groupby(sorted(data)) ]
histAsDict = dict((k,len(list(g))) for k,g in
itertools.groupby(sorted(data)))

-- Paul
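
As an alternative to groupby, and not what the post above uses, the standard library's `collections.Counter` can produce the same tally without sorting. A sketch in Python 3 syntax, reusing the ten sample values printed earlier in this post:

```python
from collections import Counter

data = [9, 6, 8, 3, 2, 3, 10, 7, 6, 2]  # the sample data shown earlier

# Counter tallies hashable items directly; missing keys read as 0.
counts = Counter(data)
hist = [(v, counts[v]) for v in range(1, 11)]

print(hist)
# [(1, 0), (2, 2), (3, 2), (4, 0), (5, 0), (6, 2), (7, 1), (8, 1), (9, 1), (10, 1)]
```

The result matches the histogram the groupby version gives for the same data.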
May 27 '06 #5
