
Summing a 2D list

P: n/a
Hi all,

I have a scenario where I have a list like this:

User Score
1 0
1 1
1 5
2 3
2 1
3 2
4 3
4 3
4 2

And I need to add up the score for each user to get something like
this:

User Score
1 6
2 4
3 2
4 8

Is this possible? If so, how can I do it? I've tried looping through
the arrays and not had much luck so far.

Any help much appreciated,

Mark
Jun 27 '08 #1
27 Replies


P: n/a
Mark wrote:
Hi all,

I have a scenario where I have a list like this:

User Score
1 0
1 1
1 5
2 3
2 1
3 2
4 3
4 3
4 2

And I need to add up the score for each user to get something like
this:

User Score
1 6
2 4
3 2
4 8

Is this possible? If so, how can I do it? I've tried looping through
the arrays and not had much luck so far.

Any help much appreciated,
Show us your efforts in code so far. Especially what the actual data looks
like. Then we can suggest a solution.

Diez
Jun 27 '08 #2

P: n/a
On Jun 12, 3:48 pm, Mark <markjtur...@gmail.com> wrote:
Hi all,

I have a scenario where I have a list like this:

User Score
1 0
1 1
1 5
2 3
2 1
3 2
4 3
4 3
4 2

And I need to add up the score for each user to get something like
this:

User Score
1 6
2 4
3 2
4 8

Is this possible? If so, how can I do it? I've tried looping through
the arrays and not had much luck so far.

Any help much appreciated,

Mark
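user_score = {}
for record in records:  # each record is a "user score" string
    user, score = record.split()
    # convert to int so scores are added rather than concatenated as strings
    if user in user_score: user_score[user] += int(score)
    else: user_score[user] = int(score)

print '\n'.join(['%s\t%s' % (user, score) for user, score in
                 sorted(user_score.items())])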

You don't mention what data structure you are keeping your records in
but hopefully this helps you in the right direction.
Jun 27 '08 #3

P: n/a
On Jun 12, 3:02 pm, "Diez B. Roggisch" <de...@nospam.web.de> wrote:
Mark wrote:
Hi all,
I have a scenario where I have a list like this:
User Score
1 0
1 1
1 5
2 3
2 1
3 2
4 3
4 3
4 2
And I need to add up the score for each user to get something like
this:
User Score
1 6
2 4
3 2
4 8
Is this possible? If so, how can I do it? I've tried looping through
the arrays and not had much luck so far.
Any help much appreciated,

Show us your efforts in code so far. Especially what the actual data looks
like. Then we can suggest a solution.

Diez
Hi Diez, thanks for the quick reply.

To be honest I'm relatively new to Python, so I don't know too much
about how all the loop constructs work and how they differ to other
languages. I'm building an app in Django and this data is coming out
of a database and it looks like what I put up there!

This was my (failed) attempt:

predictions = Prediction.objects.all()
scores = []
for prediction in predictions:
    i = [prediction.predictor.id, 0]
    if prediction.predictionscore:
        i[1] += int(prediction.predictionscore)
    scores.append(i)

I did have another loop in there (I'm fairly sure I need one) but that
didn't work either. I don't imagine that snippet is very helpful,
sorry!

Any tips would be gratefully received!

Thanks,

Mark
Jun 27 '08 #4

P: n/a
Mark wrote:
Hi all,

I have a scenario where I have a list like this:

User Score
1 0
1 1
1 5
2 3
2 1
3 2
4 3
4 3
4 2

And I need to add up the score for each user to get something like
this:

User Score
1 6
2 4
3 2
4 8

Is this possible? If so, how can I do it? I've tried looping through
the arrays and not had much luck so far.

Any help much appreciated,

Mark

does this work for you?
users = [1,1,1,2,2,3,4,4,4]
score = [0,1,5,3,1,2,3,3,2]

d = dict()

for u,s in zip(users,score):
    if d.has_key(u):
        d[u] += s
    else:
        d[u] = s

for key in d.keys():
    print 'user: %d\nscore: %d\n' % (key,d[key])
Jun 27 '08 #5

P: n/a
"Mark" <ma*********@gmail.com> wrote in message
news:c0**********************************@79g2000hsk.googlegroups.com...
On Jun 12, 3:02 pm, "Diez B. Roggisch" <de...@nospam.web.de> wrote:
Mark wrote:
---
This was my (failed) attempt:

predictions = Prediction.objects.all()
scores = []
for prediction in predictions:
    i = [prediction.predictor.id, 0]
    if prediction.predictionscore:
        i[1] += int(prediction.predictionscore)
    scores.append(i)
---

Your question sounds like a fun little project, but can you post what the
actual list of users/scores looks like? Is it a list of tuples like this:

[(1, 0), (1, 1) ... ]

Or something else?
Jun 27 '08 #6

P: n/a
To be honest I'm relatively new to Python, so I don't know too much
about how all the loop constructs work and how they differ to other
languages. I'm building an app in Django and this data is coming out
of a database and it looks like what I put up there!

This was my (failed) attempt:

predictions = Prediction.objects.all()
scores = []
for prediction in predictions:
    i = [prediction.predictor.id, 0]
    if prediction.predictionscore:
        i[1] += int(prediction.predictionscore)
    scores.append(i)

I did have another loop in there (I'm fairly sure I need one) but that
didn't work either. I don't imagine that snippet is very helpful,
sorry!
It is helpful because it tells us what your actual data looks like.

What you need is to get a list of (predictor, score)-pairs. These you should
be able to get like this:

l = [(p.predictor.id, p.predictionscore) for p in predictions]

Now you need to sort this list - because in the next step, we will aggregate
the values for each predictor.

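l.sort()  # equal predictors must be adjacent for the loop below to work

result = []
current_predictor = None
total_sum = 0
for predictor, score in l:
    if predictor != current_predictor:
        # only if we really have a current_predictor,
        # the non-existent first one doesn't count
        if current_predictor is not None:
            # store the total of the predictor we have just finished
            result.append((current_predictor, total_sum))
        total_sum = 0
        current_predictor = predictor
    total_sum += score

# the last predictor is never followed by a change, so store it here
if current_predictor is not None:
    result.append((current_predictor, total_sum))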

That should be roughly it.

Diez
Jun 27 '08 #7

P: n/a
John, it's a QuerySet coming from a database in Django. I don't know
enough about the structure of this object to go into detail I'm
afraid.

Aidan, I got an error trying your suggestion: 'zip argument #2 must
support iteration', I don't know what this means!

Thanks to all who have answered! Sorry I'm not being very specific!
Jun 27 '08 #8

P: n/a
Mark wrote:
John, it's a QuerySet coming from a database in Django. I don't know
enough about the structure of this object to go into detail I'm
afraid.

Aidan, I got an error trying your suggestion: 'zip argument #2 must
support iteration', I don't know what this means!
well, if we can create 2 iterable sequences one which contains the user
the other the scores, it should work

the error means that the second argument to the zip function was not an
iterable, such as a list, tuple or string
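
To make the error concrete, here is a minimal sketch in Python 3 syntax. The sample values are made up for illustration, and the exact wording of the error message differs between Python versions, so only the exception type is checked:

```python
users = [1, 1, 2]
scores = 5  # a single int instead of a list: not iterable

try:
    list(zip(users, scores))
    failed = False
except TypeError:
    # zip() refuses any argument it cannot iterate over
    failed = True

# With two real sequences, zip() pairs the items up position by position.
pairs = list(zip(users, [0, 1, 3]))
print(failed, pairs)
```
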

can you show me the lines you're using to retrieve the data sets from
the database? then i might be able to tell you how to build the 2 lists
you need.
Thanks to all who have answered! Sorry I'm not being very specific!
Jun 27 '08 #9

P: n/a
Aidan wrote:
Mark wrote:
>John, it's a QuerySet coming from a database in Django. I don't know
enough about the structure of this object to go into detail I'm
afraid.

Aidan, I got an error trying your suggestion: 'zip argument #2 must
support iteration', I don't know what this means!

well, if we can create 2 iterable sequences one which contains the user
the other the scores, it should work

the error means that the second argument to the zip function was not an
iterable, such as a list tuple or string

can you show me the lines you're using to retrieve the data sets from
the database? then i might be able to tell you how to build the 2 lists
you need.
wait you already did...

predictions = Prediction.objects.all()
pairs = [(p.predictor.id,p.predictionscore) for p in predictions]

those 2 lines will build a list of user/score pairs. you can then
replace the call to zip with pairs
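
Put together, that might look roughly like the sketch below (Python 3 syntax). The literal `pairs` list is a stand-in for the pairs built from the QuerySet, using the numbers from the original question, and `dict.get()` is swapped in for the `has_key` test; it also assumes every score is an integer rather than None:

```python
# Stand-in for [(p.predictor.id, p.predictionscore) for p in predictions]
pairs = [(1, 0), (1, 1), (1, 5), (2, 3), (2, 1), (3, 2), (4, 3), (4, 3), (4, 2)]

totals = {}
for user, score in pairs:
    # get() supplies 0 the first time a user is seen
    totals[user] = totals.get(user, 0) + score

for user in sorted(totals):
    print(user, totals[user])
```
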

any luck?

Jun 27 '08 #10

P: n/a
Mark wrote:
John, it's a QuerySet coming from a database in Django. I don't know
enough about the structure of this object to go into detail I'm
afraid. [...]
Then let the database do the summing up. That's what it's there for :-)

select user, sum(score) from score_table
group by user

or something very similar, depending on the actual database schema. I
don't know how to do this with Django's ORM, but this is the way to do
it in plain SQL.

-- Gerhard

Jun 27 '08 #11

P: n/a
Aidan wrote:
does this work for you?

users = [1,1,1,2,2,3,4,4,4]
score = [0,1,5,3,1,2,3,3,2]

d = dict()

for u,s in zip(users,score):
if d.has_key(u):
d[u] += s
else:
d[u] = s

for key in d.keys():
print 'user: %d\nscore: %d\n' % (key,d[key])
I've recently had the very same problem and needed to optimize for the
best solution. I've tried quite a few, including:

1) using a dictionary with a default value

d = collections.defaultdict(lambda: 0)
d[key] += value

2) Trying out if avoiding object allocation is worth the effort. Using
Cython:

cdef class Counter:
    cdef int _counter
    def __init__(self):
        self._counter = 0

    def inc(self):
        self._counter += 1

    def __int__(self):
        return self._counter

    def __iadd__(self, operand):
        self._counter += 1
        return self

And no, this was *not* faster than the final solution. This counter
class, which is basically a mutable int, is exactly as fast as just
using this one (final solution) - tada!

counter = {}
try:
    counter[key] += 1
except KeyError:
    counter[key] = 1

Using psyco makes this a bit faster still. psyco can't optimize
defaultdict or my custom Counter class, though.

-- Gerhard

Jun 27 '08 #12

P: n/a
On Jun 12, 3:45 pm, Aidan <awe...@gmail.com> wrote:
Aidan wrote:
Mark wrote:
John, it's a QuerySet coming from a database in Django. I don't know
enough about the structure of this object to go into detail I'm
afraid.
Aidan, I got an error trying your suggestion: 'zip argument #2 must
support iteration', I don't know what this means!
well, if we can create 2 iterable sequences one which contains the user
the other the scores, it should work
the error means that the second argument to the zip function was not an
iterable, such as a list tuple or string
can you show me the lines you're using to retrieve the data sets from
the database? then i might be able to tell you how to build the 2 lists
you need.

wait you already did...

predictions = Prediction.objects.all()
pairs = [(p.predictor.id,p.predictionscore) for p in predictions]

those 2 lines will build a list of user/score pairs. you can then
replace the call to zip with pairs

any luck?
Thanks Aidan, this works great!

Thanks also to everyone else, I'm sure your suggestions would have
worked too if I'd been competent enough to do them properly!
Jun 27 '08 #13

P: n/a
On Jun 12, 4:14 pm, Gerhard Häring <g...@ghaering.de> wrote:
Aidan wrote:
does this work for you?
users = [1,1,1,2,2,3,4,4,4]
score = [0,1,5,3,1,2,3,3,2]
d = dict()
for u,s in zip(users,score):
    if d.has_key(u):
        d[u] += s
    else:
        d[u] = s
for key in d.keys():
    print 'user: %d\nscore: %d\n' % (key,d[key])

I've recently had the very same problem and needed to optimize for the
best solution. I've tried quite a few, including:

1) using a dictionary with a default value

d = collections.defaultdict(lambda: 0)
d[key] += value
<<SNIP>>
-- Gerhard
This might be faster, by avoiding the lambda:

d = collections.defaultdict(int)
d[key] += value
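
Spelled out on the data from the original question, a minimal sketch could look like this (Python 3 syntax; the `pairs` list is typed in by hand purely for illustration):

```python
from collections import defaultdict

pairs = [(1, 0), (1, 1), (1, 5), (2, 3), (2, 1), (3, 2), (4, 3), (4, 3), (4, 2)]

totals = defaultdict(int)  # int() returns 0, so unseen users start at zero
for user, score in pairs:
    totals[user] += score

result = sorted(totals.items())
print(result)
```
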

- Paddy.
Jun 27 '08 #14

P: n/a
Hi Mark,

Mark <ma*********@gmail.com> writes:
I have a scenario where I have a list like this:

User Score
1 0
1 1
1 5
2 3
2 1
3 2
4 3
4 3
4 2

And I need to add up the score for each user to get something like
this:

User Score
1 6
2 4
3 2
4 8

Is this possible? If so, how can I do it? I've tried looping through
the arrays and not had much luck so far.
Although your problem has already been solved, I'd like to present a
different approach which can be quite a bit faster. The most common
approach seems to be using a dictionary:

summed_up={}
for user,vote in pairs:
    if summed_up.has_key(user):
        summed_up[user]+=vote
    else:
        summed_up[user]=vote

But if the list of users is compact and the maximum value is known
beforehand, then using a list and encoding the user in the list
position is much more elegant:

summed_up = [0] * (max_user + 1)  # one slot per user id 0..max_user
for user,vote in pairs:
    summed_up[user] += vote

I've run a quick and dirty test on these approaches and found that the
latter takes only about half the time of the former. More precisely,
with about 2 million pairs, I got:

* dict approach: 2s
(4s with "try: ... except KeyError:" instead of the "if")
* list approach: 0.9s
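
For anyone wanting to repeat the measurement, here is a rough sketch of such a test in Python 3 syntax. It is not the code behind the numbers above: the data is random, the sizes `MAX_USER` and `N_PAIRS` are arbitrary assumptions, and the timings will depend on the machine and the Python version:

```python
import random
import timeit

MAX_USER = 1000    # assumed: user ids are the integers 1..MAX_USER
N_PAIRS = 200000   # fewer than the 2 million above, to keep the run short

random.seed(0)
pairs = [(random.randint(1, MAX_USER), random.randint(0, 10))
         for _ in range(N_PAIRS)]

def sum_with_dict(pairs):
    totals = {}
    for user, vote in pairs:
        if user in totals:
            totals[user] += vote
        else:
            totals[user] = vote
    return totals

def sum_with_list(pairs, max_user):
    totals = [0] * (max_user + 1)  # index 0 stays unused
    for user, vote in pairs:
        totals[user] += vote
    return totals

dict_time = timeit.timeit(lambda: sum_with_dict(pairs), number=3)
list_time = timeit.timeit(lambda: sum_with_list(pairs, MAX_USER), number=3)
print("dict: %.3fs  list: %.3fs" % (dict_time, list_time))
```
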

BTW this was inspired by the book "Programming Pearls" I read some
years ago where a similar approach saved some magnitudes of time
(using a bit field instead of a list to store reserved/free phone
numbers IIRC).

Yours,
Karsten
Jun 27 '08 #15

P: n/a
On Fri, Jun 13, 2008 at 2:12 PM, Karsten Heymann
<ka*************@blue-cable.net> wrote:
Although your problem has already been solved, I'd like to present a
different approach which can be quite a bit faster. The most common
approach seems to be using a dictionary:

summed_up={}
for user,vote in pairs:
    if summed_up.has_key(user):
        summed_up[user]+=vote
    else:
        summed_up[user]=vote
You'll save even more by using:

if user in summed_up:

instead of has_key.

--
mvh Björn
Jun 27 '08 #16

P: n/a
On Thu, Jun 12, 2008 at 3:48 PM, Mark <ma*********@gmail.com> wrote:
Hi all,

I have a scenario where I have a list like this:

User Score
1 0
1 1
1 5
2 3
2 1
3 2
4 3
4 3
4 2

And I need to add up the score for each user to get something like
this:

User Score
1 6
2 4
3 2
4 8

Is this possible? If so, how can I do it? I've tried looping through
the arrays and not had much luck so far.
Here is another solution:

from itertools import groupby
from operator import itemgetter

users = [1, 1, 1, 2, 2, 3, 4, 4, 4]
scores = [0, 1, 5, 3, 1, 2, 3, 3, 2]

for u, s in groupby(zip(users, scores), itemgetter(0)):
    print u, sum(y for x, y in s)

You probably should reconsider how you can store your data in a more
efficient format.

--
mvh Björn
Jun 27 '08 #17

P: n/a
BJörn Lindqvist wrote:
[...]
Here is another solution:

from itertools import groupby
from operator import itemgetter

users = [1, 1, 1, 2, 2, 3, 4, 4, 4]
scores = [0, 1, 5, 3, 1, 2, 3, 3, 2]

for u, s in groupby(zip(users, scores), itemgetter(0)):
    print u, sum(y for x, y in s)
Except that this won't work unless users and scores are sorted by user
first. groupby() only coalesces identical values, and doesn't do what a
"GROUP BY" clause in SQL is doing.

Adding a sorted() call around zip() should be enough to make groupby()
actually useful.
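
A small sketch of the difference, in Python 3 syntax and with the question's data deliberately shuffled (the shuffled order is made up for illustration):

```python
from itertools import groupby
from operator import itemgetter

# Same user/score pairs as in the question, but no longer sorted by user
users = [4, 1, 2, 1, 4, 3, 1, 2, 4]
scores = [3, 0, 3, 1, 3, 2, 5, 1, 2]

# Without sorting, groupby() starts a new group at every change of key,
# so each user shows up several times.
unsorted_keys = [user for user, _ in groupby(zip(users, scores), key=itemgetter(0))]

# Sorting first puts equal users next to each other, one group per user.
totals = [
    (user, sum(score for _, score in group))
    for user, group in groupby(sorted(zip(users, scores)), key=itemgetter(0))
]
print(unsorted_keys)
print(totals)
```
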

But then, this code definitely starts to look like somebody desperately
wanted to find a use for Python's new gimmicks.

Here's more of the same sort ;-)
>>> import sqlite3
>>> sqlite3.connect(":memory:").execute("create table tmp(user, score)").executemany("insert into tmp(user, score) values (?, ?)", zip(users, scores)).execute("select user, sum(score) from tmp group by user").fetchall()
[(1, 6), (2, 4), (3, 2), (4, 8)]

-- Gerhard

Jun 27 '08 #18

P: n/a
Hi Björn,

"BJörn Lindqvist" <bj*****@gmail.com> writes:
On Fri, Jun 13, 2008 at 2:12 PM, Karsten Heymann
<ka*************@blue-cable.net> wrote:
summed_up={}
for user,vote in pairs:
    if summed_up.has_key(user):
        summed_up[user]+=vote
    else:
        summed_up[user]=vote

You'll save even more by using:

if user in summed_up:

instead of has_key.
You're right, then it goes down to 1.5s (compared to 0.9s for the pure
list version). Python's dictionaries are really great :-)

Yours
Karsten
Jun 27 '08 #19

P: n/a
On Jun 13, 1:12 pm, Karsten Heymann <karsten.heym...@blue-cable.net>
wrote:
Hi Mark,

Mark <markjtur...@gmail.com> writes:
I have a scenario where I have a list like this:
User Score
1 0
1 1
1 5
2 3
2 1
3 2
4 3
4 3
4 2
And I need to add up the score for each user to get something like
this:
User Score
1 6
2 4
3 2
4 8
Is this possible? If so, how can I do it? I've tried looping through
the arrays and not had much luck so far.

Although your problem has already been solved, I'd like to present a
different approach which can be quite a bit faster. The most common
approach seems to be using a dictionary:

summed_up={}
for user,vote in pairs:
    if summed_up.has_key(user):
        summed_up[user]+=vote
    else:
        summed_up[user]=vote

But if the list of users is compact and the maximum value is known
beforehand, then using a list and encoding the user in the list
position is much more elegant:

summed_up = [0] * (max_user + 1)  # one slot per user id 0..max_user
for user,vote in pairs:
    summed_up[user] += vote

I've run a quick and dirty test on these approaches and found that the
latter takes only about half the time of the former. More precisely,
with about 2 million pairs, I got:

* dict approach: 2s
(4s with "try: ... except KeyError:" instead of the "if")
* list approach: 0.9s

BTW this was inspired by the book "Programming Pearls" I read some
years ago where a similar approach saved some magnitudes of time
(using a bit field instead of a list to store reserved/free phone
numbers IIRC).

Yours,
Karsten
How does your solution fare against the defaultdict solution of:

d = collections.defaultdict(int)
for u,s in zip(users,score): d[u] += s
- Paddy
Jun 27 '08 #20

P: n/a
On Friday 13 June 2008 14:12:40, Karsten Heymann wrote:
Hi Mark,

Mark <ma*********@gmail.com> writes:
I have a scenario where I have a list like this:

User Score
1 0
1 1
1 5
2 3
2 1
3 2
4 3
4 3
4 2

And I need to add up the score for each user to get something like
this:

User Score
1 6
2 4
3 2
4 8

Is this possible? If so, how can I do it? I've tried looping through
the arrays and not had much luck so far.

Although your problem has already been solved, I'd like to present a
different approach which can be quite a bit faster. The most common
approach seems to be using a dictionary:

summed_up={}
for user,vote in pairs:
    if summed_up.has_key(user):
        summed_up[user]+=vote
    else:
        summed_up[user]=vote

But if the list of users is compact and the maximum value is known
beforehand, then using a list and encoding the user in the list
position is much more elegant:
So writing C in Python, which has the dictionary as a built-in type, should
be considered "more elegant"?
summed_up = [0] * (max_user + 1)  # one slot per user id 0..max_user
for user,vote in pairs:
    summed_up[user] += vote

I've run a quick and dirty test on these approaches and found that the
latter takes only about half the time of the former. More precisely,
with about 2 million pairs, I got:
* dict approach: 2s
  (4s with "try: ... except KeyError:" instead of the "if")
* list approach: 0.9s
You are comparing apples with lemons; there is no such difference between
list index access and dictionary key access in Python.
BTW this was inspired by the book "Programming Pearls" I read some
years ago where a similar approach saved some magnitudes of time
(using a bit field instead of a list to store reserved/free phone
numbers IIRC).
If you know the number and names of the users in advance, what prevents you
from fully initializing the target dictionary?

The following code compares the same algorithm, once with a list and once
with a dict:

#!/usr/bin/env python

def do(f, u, v) :
    from time import time
    n=time()
    f(u, v)
    return time() -n

def c_dict(users, votes) :
    d = dict(((e, 0) for e in users))
    for u, v in votes : d[u] += v
    return d.values()

def c_list(users, votes) :
    d = [ 0 for i in users ]
    for u, v in votes : d[u] += v
    return d

u = range(3000)

import random

v = list((u[r%3000], random.randint(0,10000)) for r in range(5*10**6))

print "with list", do(c_list, u, v)
print "with dict", do(c_dict, u, v)

The result is pretty close now :

maric@redflag1 17:04:36:~$ ./test.py
with list 1.40726399422
with dict 1.63094091415

So why use a list where the obvious and natural data structure is a
dictionary?
--
_____________

Maric Michaud

Jun 27 '08 #21

P: n/a
Paddy <pa*******@googlemail.com> writes:
How does your solution fare against the defaultdict solution of:

d = collections.defaultdict(int)
for u,s in zip(users,score): d[u] += s
list: 0.931s
dict + "in": 1.495s
defaultdict : 1.991s
dict + "if": ~2s
dict + "try": ~4s

I've posted the (very rough) code to dpaste:

http://dpaste.com/hold/56468/

Yours
Karsten
Jun 27 '08 #22

P: n/a
Hi Maric,

Maric Michaud <ma***@aristote.info> writes:
So writing C in Python, which has the dictionary as a built-in type,
should be considered "more elegant"?
IMO that's a bit harsh.
You are comparing apples with lemons; there is no such difference
between list index access and dictionary key access in Python.
[...]
If you know the number and names of the users in advance, what
prevents you from fully initializing the target dictionary?

The following code compares the same algorithm, once with a list and
once with a dict:
[...]
The result is pretty close now :

maric@redflag1 17:04:36:~$ ./test.py
with list 1.40726399422
with dict 1.63094091415

So why use a list where the obvious and natural data structure is a
dictionary?
I'd never argue that using a dictionary is the obvious and natural
data structure for this case. But is it the best? Honestly, as your
very nice example shows, we have two solutions that are equally fast,
equally complex to code and equally robust, but one needs roughly
twice the memory of the other. So, as much as I like dictionaries,
what do you gain from using one in this corner case?

Yours,
Karsten
Jun 27 '08 #23

P: n/a
Hello,

On Friday 13 June 2008 17:55:44, Karsten Heymann wrote:
Maric Michaud <ma***@aristote.info> writes:
So writing C in Python, which has the dictionary as a built-in type,
should be considered "more elegant"?

IMO that's a bit harsh.
Harsh? Sorry, I'm not sure I understand.
You are comparing apples with lemons; there is no such difference
between list index access and dictionary key access in Python.

[...]
If you know the number and names of the users in advance, what
prevents you from fully initializing the target dictionary?

The following code compares the same algorithm, once with a list and
once with a dict:

[...]
The result is pretty close now :

maric@redflag1 17:04:36:~$ ./test.py
with list 1.40726399422
with dict 1.63094091415

So why use a list where the obvious and natural data structure is a
dictionary?

I'd never argue that using a dictionary is the obvious and natural
data structure for this case. But is it the best? Honestly, as your
very nice example shows, we have two solutions that are equally fast,
equally complex to code and equally robust, but one needs
Yes, but my example takes ordered integers as keys (the users' names), which
they would not be in a real case, so retrieving the result is far easier
(and faster) with a dictionary.
roughly twice the memory of the other.
I don't see how you came to this conclusion. Are you sure the extra list
takes twice as much memory as the extra dictionary?
So, as much as I like dictionaries, what do you gain from using one
in this corner case?
That is the very purpose of a dictionary: to store and retrieve data by key.

Cheers,

--
_____________

Maric Michaud
Jun 27 '08 #24

P: n/a
On Friday 13 June 2008 18:55:24, Maric Michaud wrote:
roughly twice the memory of the other.

I don't see how you came to this conclusion. Are you sure the extra list
takes twice as much memory as the extra dictionary?
Half as much, I meant, of course...

--
_____________

Maric Michaud
Jun 27 '08 #25

P: n/a
Maric Michaud <ma***@aristote.info> writes:
On Friday 13 June 2008 17:55:44, Karsten Heymann wrote:
Maric Michaud <ma***@aristote.info> writes:
So writing C in Python, which has the dictionary as a built-in type,
should be considered "more elegant"?

IMO that's a bit harsh.
Harsh? Sorry, I'm not sure I understand.
Never mind, I got carried away.
I'd never argue that using a dictionary is the obvious and natural
data structure for this case. But is it the best? Honestly, as your
very nice example shows, we have two solutions that are equally
fast, equally complex to code and equally robust, but one needs

Yes, but my example takes ordered integers as keys (the users' names),
which they would not be in a real case, so retrieving the result is
far easier (and faster) with a dictionary.
Of course. As I wrote in my first post, my proposal depends upon the
users being a numerical and compact list with a known maximum, as in
the OP's example. Otherwise my approach makes no sense at all :-)
roughly twice the memory of the other.

I don't see how you came to this conclusion. Are you sure the extra
list takes twice as much memory as the extra dictionary?
I'm referring to the resulting data structure. A list with n items
should take approx. half the space of a dictionary with n keys and n
entries. Or, the other way round, if I have a dictionary with keys
1..n, why shouldn't I use a list instead.
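
One way to get a rough feel for the sizes is `sys.getsizeof`, as in the sketch below (Python 3 syntax). This is only an approximation under stated assumptions: it is specific to CPython, it measures the container alone and not the integers stored in it, and the ratio between the two varies between Python versions, so it neither confirms nor refutes the "half" estimate. The size `n` is arbitrary:

```python
import sys

n = 1000
as_list = [0] * (n + 1)  # index = user id, slot 0 unused
as_dict = dict((user, 0) for user in range(1, n + 1))

list_bytes = sys.getsizeof(as_list)
dict_bytes = sys.getsizeof(as_dict)
print("list: %d bytes, dict: %d bytes" % (list_bytes, dict_bytes))
```
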

Yours,
Karsten
Jun 27 '08 #26

P: n/a
On Jun 12, 3:48 pm, Mark <markjtur...@gmail.com> wrote:
Is this possible?
def foobar(user,score):
    sums = {}
    for u,s in zip(user,score):
        try:
            sums[u] += s
        except KeyError:
            sums[u] = s
    return [(u, sums[u]) for u in sums].sort()
usersum = foobar(user,score)
for u,s in usersum:
    print "%d %d" % (u,s)


Jun 27 '08 #27

P: n/a
On Jun 14, 4:05 pm, sturlamolden <sturlamol...@yahoo.no> wrote:
On Jun 12, 3:48 pm, Mark <markjtur...@gmail.com> wrote:
Is this possible?

def foobar(user,score):
    sums = {}
    for u,s in zip(user,score):
        try:
            sums[u] += s
        except KeyError:
            sums[u] = s
    return [(u, sums[u]) for u in sums].sort()
sort() sorts the list in-place and returns None. Try this instead:

    return sorted([(u, sums[u]) for u in sums])

or, better yet:

    return sorted(sums.items())
usersum = foobar(user,score)
for u,s in usersum:
    print "%d %d" % (u,s)
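
A quick way to see the difference for yourself, as a minimal sketch in Python 3 syntax (the `sums` dictionary is filled in by hand with the totals from the original question):

```python
sums = {2: 4, 1: 6, 4: 8, 3: 2}

in_place = [(u, sums[u]) for u in sums]
returned = in_place.sort()       # sorts in_place itself and returns None

new_list = sorted(sums.items())  # leaves sums alone, returns a new sorted list

print(returned)
print(new_list)
```
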
Jun 27 '08 #28
