Fast constant functions for Py2.5's defaultdict()

FWIW, here are three ways of writing constant functions for
collections.defaultdict():

    from collections import defaultdict
    import itertools

    d = defaultdict(int)                       # slowest way; works only for zero
    d = defaultdict(lambda: 0)                 # faster way; works for any constant
    d = defaultdict(itertools.repeat(0).next)  # fastest way; works for any constant

Another approach is to use the __missing__ method for dictionary
subclasses:

    class zerodict(dict):
        def __missing__(self, key):
            # fast on initial miss, but slow on non-misses; works for any constant
            return 0

The itertools.repeat(const).next approach wins on speed and
flexibility.
Raymond

Feb 13 '07 #1
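
For readers on Python 3, where iterators expose `__next__` instead of the Python 2 `.next` method, the four variants above might be written as in the sketch below. The `ZeroDict` name, the sample words, and the variable names are illustrative assumptions, not part of the original post, and the sketch checks behavior only, not speed.

```python
from collections import defaultdict
import itertools

# Assumption: Python 3, so the bound method is __next__ rather than .next.
d_int = defaultdict(int)
d_lambda = defaultdict(lambda: 0)
d_repeat = defaultdict(itertools.repeat(0).__next__)


class ZeroDict(dict):
    """Hypothetical dict subclass that treats missing keys as zero."""

    def __missing__(self, key):
        # Called by dict.__getitem__ on a miss; the key is not stored here.
        return 0


z = ZeroDict()

# Count words with each variant; all four should agree.
for word in "a b a c a".split():
    d_int[word] += 1
    d_lambda[word] += 1
    d_repeat[word] += 1
    z[word] += 1

print(dict(d_int), dict(d_lambda), dict(d_repeat), dict(z))
```

One behavioral difference worth knowing: a plain lookup such as `d_int["x"]` stores the default in a defaultdict, while `z["x"]` with this `__missing__` returns 0 without adding the key.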
On 13/02/2007 20.01, Raymond Hettinger wrote:
> FWIW, here are three ways of writing constant functions for
> collections.defaultdict():
>
>     d = defaultdict(int)                       # slowest way; works only for zero
>     d = defaultdict(lambda: 0)                 # faster way; works for any constant
>     d = defaultdict(itertools.repeat(0).next)  # fastest way; works for any constant
>
> Another approach is to use the __missing__ method for dictionary
> subclasses:
>
>     class zerodict(dict):
>         def __missing__(self, key):
>             # fast on initial miss, but slow on non-misses; works for any constant
>             return 0
>
> The itertools.repeat(const).next approach wins on speed and
> flexibility.

But it's the most unreadable too. I'm surprised that defaultdict(int) is
slower than the lambda one though. What's the reason?
--
Giovanni Bajo
Feb 14 '07 #2
On Feb 13, 5:09 pm, Giovanni Bajo <n...@ask.me> wrote:
> > The itertools.repeat(const).next approach wins on speed and
> > flexibility.
>
> But it's the most unreadable too.

Not really. It's unusual but plenty readable (no surprise that
repeat(0) repeatedly gives you zero). I think it more surprising that
int() with no arguments gives you a zero.

> I'm surprised that defaultdict(int) is
> slower than the lambda one though. What's the reason?

All that comes to mind is that int() has to call
PyArg_ParseTupleAndKeywords() while the lambda is unburdened by
argument passing.
Raymond
Feb 14 '07 #3
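
The speed ranking discussed above was reported for Python 2.5 and may not hold on other interpreter versions, so it is worth measuring. Below is a minimal timing sketch, assuming Python 3 (hence `__next__`); the helper name `time_misses`, the key count, and the repeat count are arbitrary choices for illustration.

```python
import itertools
import timeit
from collections import defaultdict


def time_misses(factory, n_keys=10_000, repeats=5):
    """Best wall-clock time (seconds) to trigger n_keys misses on a fresh defaultdict."""

    def run():
        d = defaultdict(factory)
        for k in range(n_keys):
            d[k]  # every lookup is a miss, so the factory is called each time

    # Take the minimum of several runs to reduce noise from other processes.
    return min(timeit.repeat(run, number=1, repeat=repeats))


factories = {
    "int": int,
    "lambda: 0": lambda: 0,
    "repeat(0).__next__": itertools.repeat(0).__next__,
}

timings = {name: time_misses(factory) for name, factory in factories.items()}

# Print fastest first; absolute numbers depend on machine and interpreter.
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name:20s} {seconds * 1e3:8.3f} ms")
```

This measures only the miss path. Lookups of keys that already exist never call the factory, so the three factories should behave identically there.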
On Feb 14, 9:11 am, "Raymond Hettinger" <pyt...@rcn.com> wrote:
> On Feb 13, 5:09 pm, Giovanni Bajo <n...@ask.me> wrote:
> > > The itertools.repeat(const).next approach wins on speed and
> > > flexibility.
> >
> > But it's the most unreadable too.
>
> Not really. It's unusual but plenty readable (no surprise that
> repeat(0) repeatedly gives you zero). I think it more surprising that
> int() with no arguments gives you a zero.

Well, if I were doing a code review for some of my coworkers, I would
ask them to use int if the constant was zero and a lambda otherwise. If
they wanted to use itertools.repeat(const).next, they should prove to
me that the speed increase is actually significant in their use case,
and they should put a big comment in the code explaining why they
preferred the cryptic defaultdict(itertools.repeat(0).next) over the
obvious defaultdict(int).

Michele Simionato

Feb 14 '07 #4
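
One way to reconcile the two positions in this thread is to hide the itertools idiom behind a small named helper, so the comment explaining it lives in one place. This is only a sketch, assuming Python 3; the helper name `constant_factory` and the `scores` example are illustrative and not taken from the posts above.

```python
from collections import defaultdict
import itertools


def constant_factory(value):
    """Return a zero-argument callable that always returns `value`."""
    # itertools.repeat(value) is an endless iterator over the same object;
    # its bound __next__ method takes no arguments, which is what
    # defaultdict expects from a default factory.
    return itertools.repeat(value).__next__


scores = defaultdict(constant_factory(100))
scores["alice"] -= 5  # miss: starts from 100, then stores 95

print(scores["alice"], scores["bob"])
```

Note that the factory hands back the same object on every miss, so this suits immutable constants such as numbers or strings; for a mutable default like a list, `defaultdict(list)` creates a fresh object per key instead.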
