Preferred Python idiom for handling non-existing dictionary keys and why?


Hello again All!

(First, I would like to mention I did try to google
for the answer here!)

Say I am populating a dictionary with a list and
appending. I have written it thusly:

d={}
for (k,v) in somedata():
    try:
        d[k].append(v)
    except KeyError:
        d[k]=[v]

I could have written:

d={}
for (k,v) in somedata():
    if (k in d):
        d[k].append(v)
    else:
        d[k]=[v]

Which is preferred and why? Which is "faster"?

Thanks!!

Quentin
=====
-- Quentin Crain

------------------------------------------------
I care desperately about what I do.
Do I know what product I'm selling? No.
Do I know what I'm doing today? No.
But I'm here and I'm gonna give it my best shot.
-- Hansel


Jul 18 '05 #1
Quentin Crain <cz***@yahoo.com> writes:
(First, I would like to mention I did try to google
for the answer here!)

Say I am populating a dictionary with a list and
appending. I have written it thusly:

[...]

You want the dict.setdefault method.
John
Jul 18 '05 #2

"Quentin Crain" <cz***@yahoo.com> wrote in message
news:ma********************************@python.org ...

Hello again All!

(First, I would like to mention I did try to google
for the answer here!)
The question you asked comes up frequently but is hard to search for, and
the special-case answer (to a question you did not ask) is even harder to find.
Say I am populating a dictionary with a list and
appending. I have written it thusly:

d={}
for (k,v) in somedata():
    try:
        d[k].append(v)
    except KeyError:
        d[k]=[v]

I could have written:

d={}
for (k,v) in somedata():
    if (k in d):
        d[k].append(v)
    else:
        d[k]=[v]
Which is preferred and why?
Neither. For me, both are superseded by

d={}
for (k,v) in somedata():
    d[k] = d.get(k, []).append(v)

Read Library Reference 2.2.7 Mapping Types to learn current dict
methods.
Which is "faster"?


The tradeoff is a small extra overhead on every loop iteration (the
condition) versus an 'occasional' big overhead (exception catching). The
choice depends on how often the exception is raised. As I remember, one
data-based rule of thumb from years ago is to use the conditional if that
frequency is more than 10%. You could try the new timeit module on all
three versions.
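
A rough way to run that comparison on a current interpreter is sketched below. The function names, the synthetic `make_data` helper, and the chosen miss rates are assumptions made for illustration, not anything from the original post, and the absolute numbers will differ by machine and Python version.

```python
import timeit

def make_data(n_items, n_keys):
    """Synthetic (key, value) pairs; fewer distinct keys means fewer misses."""
    return [(i % n_keys, i) for i in range(n_items)]

def group_try(pairs):
    # try/except: cheap when the key exists, costly when KeyError is raised
    d = {}
    for k, v in pairs:
        try:
            d[k].append(v)
        except KeyError:
            d[k] = [v]
    return d

def group_in(pairs):
    # membership test: a small fixed cost on every iteration
    d = {}
    for k, v in pairs:
        if k in d:
            d[k].append(v)
        else:
            d[k] = [v]
    return d

def group_setdefault(pairs):
    # setdefault: one method call, but a new [] is built on every iteration
    d = {}
    for k, v in pairs:
        d.setdefault(k, []).append(v)
    return d

if __name__ == "__main__":
    # With 1000 items, n_keys distinct keys gives a miss rate of n_keys / 1000.
    for n_keys in (10, 500, 1000):
        pairs = make_data(1000, n_keys)
        for fn in (group_try, group_in, group_setdefault):
            secs = timeit.timeit(lambda: fn(pairs), number=50)
            print(f"{n_keys:5d} keys  {fn.__name__:18s} {secs:.4f} s")
```

All three functions build the same dictionary, so any difference in the printed times comes from how each one handles a missing key.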

Terry J. Reedy
Jul 18 '05 #3
>> Say I am populating a dictionary with a list and appending. I have
>> written it thusly:


John> You want the dict.setdefault method.

d.setdefault() never made any sense to me (IOW, to use it I always had to
look it up). The semantics of what it does just never stick in my brain.
Consequently, even though it's less efficient I generally write such loops
like this:

d = {}
for (key, val) in some_items:
    lst = d.get(key) or []
    lst.append(val)
    d[key] = lst

Note that the first statement of the loop is correct (though perhaps not
obvious at first glance): once initialized, d[key] always holds a non-empty
list, so it never tests as False. FYI, timeit tells the performance tale:

% timeit.py -s 'd={}' 'x = d.setdefault("x", [])'
1000000 loops, best of 3: 1.82 usec per loop
% timeit.py -s 'd={}' 'x = d.get("x") or [] ; d["x"] = x'
100000 loops, best of 3: 2.34 usec per loop

But my way isn't bad enough for me to change. ;-)

Skip

Jul 18 '05 #4
Skip Montanaro wrote:
>> Say I am populating a dictionary with a list and appending. I have
>> written it thusly:


John> You want the dict.setdefault method.

d.setdefault() never made any sense to me (IOW, to use it I always had to
look it up).


I had the wrong idea about setdefault for a very long time. To me, it
sounds like: "make this value the default value of the dict, so that after
this call, every non-existing key results in this value".
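
For contrast, a small sketch of what setdefault actually does, as far as the documented behaviour goes: it touches only the one key it is given. The keys and values here are made up for illustration.

```python
d = {}

# Key missing: the default is stored under that key and returned.
first = d.setdefault("a", [])
first.append(1)

# Key present: the stored value is returned and the second argument is ignored.
second = d.setdefault("a", ["ignored"])

# Only "a" was affected; other missing keys still raise KeyError.
try:
    d["b"]
    other_key_has_default = True
except KeyError:
    other_key_has_default = False

print(d)                      # {'a': [1]}
print(second is first)        # True
print(other_key_has_default)  # False
```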

Gerrit.

--
261. If any one hire a herdsman for cattle or sheep, he shall pay him
eight gur of corn per annum.
-- 1780 BC, Hammurabi, Code of Law
--
Asperger's Syndrome - a personal approach:
http://people.nl.linux.org/~gerrit/
Resist this cabinet:
http://www.sp.nl/

Jul 18 '05 #5
In article <jO********************@comcast.com>, Terry Reedy wrote:

"Quentin Crain" <cz***@yahoo.com> wrote in message
news:ma********************************@python.org ...

Which is preferred and why?


Neither. For me, both are superseded by

d={}
for (k,v) in somedata():
    d[k] = d.get(k, []).append(v)


Not quite... append returns None, so you'll need to write that as two
separate statements, i.e.:

d[k] = d.get(k, [])
d[k].append(v)

Or, just:

d.setdefault(k, []).append(v)
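
A minimal sketch of the failure and the fix, using made-up sample data: the one-statement version stores None on the first pass and then fails as soon as a key repeats.

```python
pairs = [("a", 1), ("a", 2), ("b", 3)]  # made-up sample data

broken = {}
error = None
try:
    for k, v in pairs:
        broken[k] = broken.get(k, []).append(v)  # append() returns None
except AttributeError as exc:
    # Second "a": get() returns the stored None, which has no append().
    error = exc

fixed = {}
for k, v in pairs:
    fixed.setdefault(k, []).append(v)

print(broken)                # {'a': None}
print(type(error).__name__)  # AttributeError
print(fixed)                 # {'a': [1, 2], 'b': [3]}
```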

--
..:[ dave benjamin (ramenboy) -:- www.ramenfest.com -:- www.3dex.com ]:.
: d r i n k i n g l i f e o u t o f t h e c o n t a i n e r :
Jul 18 '05 #6
Terry Reedy wrote:
Neither. For me, both are superseded by

d={}
for (k,v) in somedata():
    d[k] = d.get(k, []).append(v)


This would have to look up the key twice, so it has no advantage over

if k in d:
    d[k].append(v)
else:
    d[k] = [v]

Anyway, list.append() always returns None, so it does not work.
I think you mean

d = {}
for (k, v) in somedata:
    d.setdefault(k, []).append(v)

There is a small overhead for throwing away a new list object if the key is
already in the dictionary, but I don't really care.
(Or is the compiler smart enough to reuse the empty list?)
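
One way to check this is sketched below. Python evaluates call arguments before the call is made, so the expectation is that a fresh default is built on every iteration. The `new_list` helper is a hypothetical stand-in for the `[]` literal that records each list it builds, and the sample data is made up.

```python
created = []  # every list built for the default argument is recorded here

def new_list():
    """Stand-in for the [] literal; records each list it creates."""
    lst = []
    created.append(lst)
    return lst

pairs = [("a", 1), ("a", 2), ("a", 3), ("b", 4)]  # made-up sample data

d = {}
for k, v in pairs:
    # The argument is evaluated before setdefault() runs, key present or not.
    d.setdefault(k, new_list()).append(v)

print(len(created))  # 4: one list per iteration
print(len(d))        # 2: only two of those lists were actually stored
print(d)             # {'a': [1, 2, 3], 'b': [4]}
```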

Peter
Jul 18 '05 #7
Skip Montanaro wrote:
...
% timeit.py -s 'd={}' 'x = d.setdefault("x", [])'
1000000 loops, best of 3: 1.82 usec per loop
% timeit.py -s 'd={}' 'x = d.get("x") or [] ; d["x"] = x'
100000 loops, best of 3: 2.34 usec per loop

But my way isn't bad enough for me to change. ;-)


Actually, you can still do a bit better w/o using setdefault:

[alex@lancelot pop]$ timeit.py -s'd={}' 'x=d.setdefault("x",[])'
1000000 loops, best of 3: 0.925 usec per loop
[alex@lancelot pop]$ timeit.py -s'd={}' 'x=d.get("x") or []; d["x"]=x'
1000000 loops, best of 3: 1.21 usec per loop
[alex@lancelot pop]$ timeit.py -s'd={}' 'x=d.get("x",[]); d["x"]=x'
1000000 loops, best of 3: 1.13 usec per loop

As d.get takes an optional second argument, you can still save the 'or'.
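
Apart from the timing, the two spellings are not quite equivalent when the stored value is falsy. A small sketch, assuming an empty list is already stored under the key:

```python
d = {"x": []}  # an empty list is already stored under "x"

via_or = d.get("x") or []     # the stored list is falsy, so a new list is used
via_default = d.get("x", [])  # the key exists, so the stored list is returned

print(via_or is d["x"])       # False
print(via_default is d["x"])  # True
```

In the loops discussed in this thread the difference is harmless, since the result is assigned back to the dictionary, but it matters if the stored list is shared elsewhere.
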
Alex

Jul 18 '05 #8

"Terry Reedy" <tj*****@udel.edu> wrote in message
news:jO********************@comcast.com...
Read Library Reference 2.2.7 Mapping Types to learn current dict
methods.


Seeing the other responses, I see I need to do the same and read about
setdefault() ;-)
Jul 18 '05 #9
