
loop beats generator expr creating large dict!?

[Python 2.5c2 (r25c2:51859, Sep 17 2006, 19:57:40), sparc solaris]

I am puzzled that creating large dicts with an explicit iterable of
key,value pairs seems to be slow. I thought to save time by doing:

palettes = dict((w,set(w)) for w in words)

instead of:

palettes = {}
for w in words:
    palettes[w] = set(w)

where words is a list of 200,000 English words. But, in fact,
timeit shows the generator expression takes 3.0 seconds
and the "for" loop 2.1 seconds. Am I missing something?

For good measure, I did an (old fashioned?!) list comprehension:

palettes = dict([(w,set(w)) for w in words])

which took 3.1 seconds. So it seems that, by either list
comprehension or generator expression, creating a large
dictionary all at once is slower than looping to set individual
entries. This seems illogical. Shouldn't dict be able to take
advantage of having all its entries at once to create itself
faster?

-- George Young
--
"Are the gods not just?" "Oh no, child.
What would become of us if they were?" (C.S. Lewis)
Oct 3 '06 #1
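For reference, here is a minimal sketch of how the comparison above might be
reproduced with the timeit module on a recent Python (the word-list path is
an assumption; any list of roughly 200,000 strings will do):

import timeit

# Hypothetical word list; substitute any list of strings.
with open('/usr/share/dict/words') as f:
    words = [line.strip() for line in f]

def with_genexp():
    return dict((w, set(w)) for w in words)

def with_loop():
    palettes = {}
    for w in words:
        palettes[w] = set(w)
    return palettes

print(timeit.timeit(with_genexp, number=10))
print(timeit.timeit(with_loop, number=10))

Absolute numbers will differ from the ones quoted above; the relative
ordering is what the thread is about.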


George Young wrote:
I am puzzled that creating large dicts with an explicit iterable of
key,value pairs seems to be slow. I thought to save time by doing:

palettes = dict((w,set(w)) for w in words)

instead of:

palettes = {}
for w in words:
    palettes[w] = set(w)

where words is a list of 200,000 English words. But, in fact,
timeit shows the generator expression takes 3.0 seconds
and the "for" loop 2.1 seconds. Am I missing something?

Creating those 200,000 (w, set(w)) intermediate tuples isn't free. You
aren't doing that in the for loop version. If you were:

# Slowest of all!
palettes = {}
for w, s in ((w, set(w)) for w in words):
    palettes[w] = s

--Ben

Oct 3 '06 #2

Ant

George Young wrote:
....
I am puzzled that creating large dicts with an explicit iterable of
key,value pairs seems to be slow. I thought to save time by doing:

palettes = dict((w,set(w)) for w in words)

instead of:

palettes = {}
for w in words:
    palettes[w] = set(w)

In the generator case, you are first creating a tuple, and then
assigning the first element of the tuple as the key and the second
element as the value in the dictionary. In the loop, you are directly
setting the key and value in the dictionary.

Essentially, by using the generator (or the list comprehension), you are
packing and then unpacking a key/value pair for each word in your initial
list, and both of those operations carry an overhead.

Oct 3 '06 #3
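One way to see the packing step Ant describes is to disassemble both versions
on a recent Python (3.7 or later, where dis.dis also walks the nested
<genexpr> code object; exact opcode names vary between versions):

import dis

def with_genexp(words):
    return dict((w, set(w)) for w in words)

def with_loop(words):
    palettes = {}
    for w in words:
        palettes[w] = set(w)
    return palettes

# The <genexpr> loop packs each pair with a BUILD_TUPLE instruction
# before yielding it to dict(); the plain loop stores straight into
# the dictionary (STORE_SUBSCR) with no pair object at all.
dis.dis(with_genexp)
dis.dis(with_loop)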

Does George's example raise the question:
why do dictionaries not implement efficient creation
for two common cases?

- Making a dict from two sequences of the same length.
- Making a dict from a sequence and a function
(as in George's example in this thread).

The current situation is:
use a loop because the obvious generator approach
is not efficient. Something seems wrong here...

Cheers,
Alan Isaac
Oct 3 '06 #4
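For concreteness, the usual spellings of the two cases Alan lists might look
like this (the names and data are made up); note that both still route every
entry through an intermediate (key, value) pair, which is exactly the
overhead under discussion:

# Case 1: a dict from two equal-length sequences.
keys = ['a', 'b', 'c']
values = [1, 2, 3]
d1 = dict(zip(keys, values))            # zip hands dict() one pair per entry

# Case 2: a dict from a sequence and a function of each element
# (George's case, with set as the function).
words = ['apple', 'banana', 'cherry']
d2 = dict((w, set(w)) for w in words)   # again, one pair per entry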

David Isaac wrote:
The current situation is: use a loop because the obvious generator
approach is not efficient.
"not efficient" compared to what ?

</F>

Oct 3 '06 #5

David Isaac wrote:
Does George's example raise the question:
why do dictionaries not implement efficient creation
for two common cases?

- Making a dict from two sequences of the same length.
- Making a dict from a sequence and a function
(as in George's example in this thread).

Maybe you should lobby for dict comprehensions in Python 3000. It
already has set comprehensions:

http://www.python.org/dev/peps/pep-3100/

STeVe
Oct 3 '06 #6
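For the record, dict comprehensions did land in Python 3.0 and were later
backported to Python 2.7. With them, George's example can be written without
spelling out the pair tuples at all, though whether it actually beats the
explicit loop is something to measure rather than assume:

# Python 2.7 / 3.x dict comprehension.
palettes = {w: set(w) for w in words}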

Fredrik Lundh wrote:
David Isaac wrote:

>>The current situation is: use a loop because the obvious generator
>>approach is not efficient.


"not efficient" compared to what ?
Compared to understanding what's actually going on and working with
reality as opposed to some superstitious view of how the interpreter
operates, perhaps?

Dag nab it, I'm turning into the effbot!

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden

Oct 3 '06 #7

Alan Isaac wrote:
The current situation is: use a loop because the obvious generator
approach is not efficient.

"Fredrik Lundh" <fr*****@pythonware.comwrote in message
news:ma***************************************@pyt hon.org...
"not efficient" compared to what ?
I already guess that I've missed your point, but to prove it...
I was referring to the beginning of this thread
where George noted that
palettes = dict((w,set(w)) for w in words)
runs slower than
palettes = {}
for w in words:
    palettes[w] = set(w)
The reason seems obvious: the otiose tuple creation,
but there is no "more efficient" syntax to use with 'dict'.

Alan Isaac
Oct 4 '06 #8
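A side note on "no 'more efficient' syntax": dict.fromkeys covers the
constant-value case without building any pairs, but there is no analogous
constructor for a value computed per key, which is exactly the gap being
discussed (a sketch; the names are made up):

# One shared value for every key: no (key, value) tuples needed.
seen = dict.fromkeys(words, False)

# A value computed from each key still has to go through pair tuples
# (or an explicit loop), as discussed earlier in the thread.
palettes = dict((w, set(w)) for w in words)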
