
Fast constant functions for Py2.5's defaultdict()

FWIW, here are three ways of writing constant functions for
collections.defaultdict():

d = defaultdict(int)                       # slowest way; works only for zero
d = defaultdict(lambda: 0)                 # faster way; works for any constant
d = defaultdict(itertools.repeat(0).next)  # fastest way; works for any constant
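
For anyone trying this on a current interpreter, here is a minimal sketch of the three forms, assuming Python 3, where an iterator's bound method is spelled `__next__` rather than `next`. The variable names and the word list are made up for illustration.

```python
from collections import defaultdict
from itertools import repeat

# Python 3 spelling: repeat(0).__next__ replaces Py2.5's repeat(0).next.
by_int = defaultdict(int)                    # can only produce zero
by_lambda = defaultdict(lambda: 0)           # works for any constant
by_repeat = defaultdict(repeat(0).__next__)  # works for any constant

# Count words with each variant; a missing key starts from the default.
for d in (by_int, by_lambda, by_repeat):
    for word in "a b a c a b".split():
        d[word] += 1

counts = [dict(d) for d in (by_int, by_lambda, by_repeat)]
```

All three dictionaries end up with identical contents; only the cost of producing the default differs.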

Another approach is to use the __missing__ method for dictionary
subclasses:

class zerodict(dict):
    def __missing__(self, key):
        # fast on initial miss, but slow on non-misses; works for any constant
        return 0
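
One behavioural difference is worth seeing in code. The sketch below assumes Python 3 and uses a hypothetical class name `ZeroDict`; it shows that a `__missing__` method written this way returns the default without storing it, whereas `defaultdict` inserts the key on the first miss.

```python
from collections import defaultdict

class ZeroDict(dict):
    """dict subclass whose missing keys read as 0 (hypothetical name)."""
    def __missing__(self, key):
        return 0  # the value is returned but NOT stored in the dict

z = ZeroDict()
dd = defaultdict(int)

z_value = z["x"]    # read goes through __missing__; "x" stays absent
dd_value = dd["x"]  # read calls int(); "x" is now stored in dd

z["y"] += 1         # read via __missing__, then stored by the assignment
```

Counting with `+=` still works on the subclass because the augmented assignment performs the store itself.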

The itertools.repeat(const).next approach wins on speed and
flexibility.
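
The speed claim is easy to check locally. This is only a rough sketch, assuming Python 3 and the standard `timeit` module; the helper name `time_misses` and the key count are arbitrary choices, and the ranking reported here in 2007 may not hold on a current interpreter.

```python
import timeit
from collections import defaultdict
from itertools import repeat

factories = {
    "int": int,
    "lambda": lambda: 0,
    "repeat": repeat(0).__next__,  # Python 3 spelling of repeat(0).next
}

def time_misses(factory, n_keys=10_000, rounds=5):
    """Best-of-`rounds` time for n_keys first-time (missing-key) lookups."""
    keys = range(n_keys)

    def run():
        d = defaultdict(factory)
        for k in keys:
            d[k]  # every lookup is a miss, so the factory is called each time

    return min(timeit.repeat(run, number=1, repeat=rounds))

timings = {name: time_misses(f) for name, f in factories.items()}
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name:>6}: {seconds:.6f} s")
```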
Raymond

Feb 13 '07 #1
3 Replies


On 13/02/2007 20.01, Raymond Hettinger wrote:
> FWIW, here are three ways of writing constant functions for
> collections.defaultdict():
>
> d = defaultdict(int)                       # slowest way; works only for zero
> d = defaultdict(lambda: 0)                 # faster way; works for any constant
> d = defaultdict(itertools.repeat(0).next)  # fastest way; works for any constant
>
> Another approach is to use the __missing__ method for dictionary
> subclasses:
>
> class zerodict(dict):
>     def __missing__(self, key):
>         # fast on initial miss, but slow on non-misses; works for any constant
>         return 0
>
> The itertools.repeat(const).next approach wins on speed and
> flexibility.

But it's the most unreadable too. I'm surprised that defaultdict(int) is
slower than the lambda one though. What's the reason?
--
Giovanni Bajo
Feb 14 '07 #2

On Feb 13, 5:09 pm, Giovanni Bajo <n...@ask.me> wrote:
> > The itertools.repeat(const).next approach wins on speed and
> > flexibility.
>
> But it's the most unreadable too.

Not really. It's unusual but plenty readable (no surprise that
repeat(0) repeatedly gives you zero). I think it more surprising that
int() with no arguments gives you a zero.

> I'm surprised that defaultdict(int) is
> slower than the lambda one though. What's the reason?

All that comes to mind is that int() has to call
PyArg_ParseTupleAndKeywords() while the lambda is unburdened by
argument passing.
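
To separate the factory call from the dictionary machinery, the bare callables can be timed on their own. A minimal sketch, assuming Python 3 and `timeit`; the names and the call count are illustrative, and since CPython's calling conventions have changed a good deal since 2.5, the relative order may differ from what was measured in this thread.

```python
import timeit
from itertools import repeat

zero_from_lambda = lambda: 0
zero_from_repeat = repeat(0).__next__  # Python 3 spelling of repeat(0).next

# All three callables take no arguments and produce the same value.
values = (int(), zero_from_lambda(), zero_from_repeat())

calls = 200_000
overhead = {
    "int": timeit.timeit(int, number=calls),
    "lambda": timeit.timeit(zero_from_lambda, number=calls),
    "repeat": timeit.timeit(zero_from_repeat, number=calls),
}
for name, seconds in overhead.items():
    print(f"{name:>6}: {seconds:.6f} s for {calls} calls")
```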
Raymond
Feb 14 '07 #3

On Feb 14, 9:11 am, "Raymond Hettinger" <pyt...@rcn.com> wrote:
> On Feb 13, 5:09 pm, Giovanni Bajo <n...@ask.me> wrote:
> > > The itertools.repeat(const).next approach wins on speed and
> > > flexibility.
> >
> > But it's the most unreadable too.
>
> Not really. It's unusual but plenty readable (no surprise that
> repeat(0) repeatedly gives you zero). I think it more surprising that
> int() with no arguments gives you a zero.

Well, if I were doing a code review for some of my coworkers, I would
ask them to use int if the constant was zero and lambda otherwise. If
they wanted to use itertools.repeat(const).next, they would have to
prove to me that the speed increase is really significant in their
actual use case, and they should put a big comment in the code
explaining why they preferred the cryptic
defaultdict(itertools.repeat(0).next) over the obvious
defaultdict(int).
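
One possible middle ground, sketched here under the assumption of Python 3, is to hide the idiom behind a small named helper whose docstring carries the explanation. The name `constant_factory` and the sample values are illustrative only, not something proposed in this thread.

```python
from collections import defaultdict
from itertools import repeat

def constant_factory(value):
    """Return a zero-argument callable that always returns `value`.

    Built on repeat(value).__next__ (repeat(value).next in Python 2),
    so the "why" lives here instead of at every call site.
    """
    return repeat(value).__next__

# The call sites now read as plainly as defaultdict(int).
missing_marker = defaultdict(constant_factory("<missing>"))
counts = defaultdict(int)  # still the obvious choice when the constant is 0

marker = missing_marker["unknown key"]
```

Swapping the helper's body for `lambda: value` would leave the call sites unchanged, which keeps the speed question a local decision.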

Michele Simionato

Feb 14 '07 #4
