Patch : dict.merge

Hi,

I've posted this patch on SourceForge :

http://sourceforge.net/tracker/index...70&atid=305470

If you want to update a dictionary with another one, you can simply use
update :

a = dict(a=1,c=3)
b = dict(a=0,b=2)
a.update(b)
assert a == dict(a=0,b=2,c=3)

However, sometimes you want to merge the second dict into the first,
all while keeping the values that are already defined in the first.
This is useful if you want to insert default values in the dictionary
without overriding what is already defined.

Currently this can be done in a few different ways, but all are awkward
and/or inefficient :

a = dict(a=1,c=3)
b = dict(a=0,b=2)

Method 1:
for k in b:
    if k not in a:
        a[k] = b[k]

Method 2:
temp = dict(b)
temp.update(a)
a = temp
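
One practical difference between the two methods is worth spelling out: Method 1 mutates the existing dict, while Method 2 builds a new dict and rebinds the name. A minimal sketch, where the names alias1, a2 and alias2 are only illustrative:

```python
b = dict(a=0, b=2)

# Method 1 changes the dict in place, so other references see the new keys.
a = dict(a=1, c=3)
alias1 = a
for k in b:
    if k not in a:
        a[k] = b[k]
assert alias1 is a
assert alias1 == dict(a=1, b=2, c=3)

# Method 2 rebinds the name to a new dict; the old object is left untouched.
a2 = dict(a=1, c=3)
alias2 = a2
temp = dict(b)
temp.update(a2)
a2 = temp
assert a2 == dict(a=1, b=2, c=3)
assert alias2 == dict(a=1, c=3)
```

So the two are only interchangeable when nothing else holds a reference to the first dict.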

This patch adds a merge() method to the dict object, with the same
signature and usage as the update() method. Under the hood, it simply
calls PyDict_Merge() with the override parameter set to 0 instead of 1.
So there is nothing really new here : the C API already provides this
functionality (though it is not used within dictobject.c itself), so why
not expose it ? The result is :

a = dict(a=1,c=3)
b = dict(a=0,b=2)
a.merge(b)
assert a == dict(a=1,b=2,c=3)
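
For readers who want to try these semantics without the patch, here is a rough pure-Python sketch. It is not the patch itself (which is written in C and calls PyDict_Merge()), and the class name MergeDict is hypothetical. It assumes merge() should accept the same kinds of arguments as update(): a mapping or an iterable of (key, value) pairs, plus keyword arguments.

```python
class MergeDict(dict):
    """Hypothetical dict subclass emulating the proposed merge() method."""

    def merge(self, other=(), **kwargs):
        # Accept a mapping or an iterable of (key, value) pairs, like update().
        if hasattr(other, "keys"):
            pairs = ((key, other[key]) for key in other.keys())
        else:
            pairs = iter(other)
        # setdefault() only stores the value when the key is missing,
        # so values already present in self are kept.
        for key, value in pairs:
            self.setdefault(key, value)
        for key, value in kwargs.items():
            self.setdefault(key, value)


a = MergeDict(a=1, c=3)
b = dict(a=0, b=2)
a.merge(b)
assert a == dict(a=1, b=2, c=3)
```

This will be slower than a C-level call, so it only illustrates the behaviour, not the performance.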

Does this seem a good idea to you guys ?

Regards,
Nicolas

Dec 27 '05 #1
Here's method 3 :

# Python 2.3 (no generator expression)
a.update([(k,v) for k,v in b.iteritems() if k not in a])

# Python 2.4 (with generator expression)
a.update((k,v) for k,v in b.iteritems() if k not in a)

It's a bit cleaner but still less efficient than using what's already
in the PyDict_Merge C API. It's even less efficient than methods 1 and 2 !
Here is the benchmark I used :

import timeit

init = '''a = dict((i,i) for i in xrange(1000) if i%2==0); b = dict((i,i+1) for i in xrange(1000))'''

t = timeit.Timer('''for k in b:\n\tif k not in a:\n\t\ta[k] = b[k]''',init)
print 'Method 1 : %.3f'%t.timeit(10000)

t = timeit.Timer('''temp = dict(b); temp.update(a); a = temp''',init)
print 'Method 2 : %.3f'%t.timeit(10000)

t = timeit.Timer('''a.update((k,v) for k,v in b.iteritems() if k not in a)''',init)
print 'Method 3 : %.3f'%t.timeit(10000)

t = timeit.Timer('''a.merge(b)''',init)
print 'Using dict.merge() : %.3f'%t.timeit(10000)

Here are the results :

Method 1 : 5.315
Method 2 : 3.855
Method 3 : 7.815
Using dict.merge() : 1.425
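
The ratios between these timings can be checked with a short sketch. The numbers are copied from the results above; the variable names are only illustrative.

```python
# Timings in seconds, copied from the results above.
timings = {
    "Method 1": 5.315,
    "Method 2": 3.855,
    "Method 3": 7.815,
    "dict.merge()": 1.425,
}

baseline = timings["dict.merge()"]

# How many times slower each existing method is than dict.merge().
speedups = {
    label: round(seconds / baseline, 2)
    for label, seconds in timings.items()
    if label != "dict.merge()"
}

for label, factor in sorted(speedups.items()):
    print("%s takes %.2f times as long as dict.merge()" % (label, factor))
```

Method 1 comes out at about 3.73, Method 2 at about 2.71 and Method 3 at about 5.48.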

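So the generator expression version is the slowest of the three existing
methods in this benchmark, and the new dict.merge() method gives an
appreciable performance boost (about 3.73 times faster than method 1
and about 2.7 times faster than method 2 here).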
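
The benchmark above is written for Python 2 (print statement, xrange, iteritems). For anyone wanting to rerun it on Python 3, a minimal port might look like the sketch below. The dict.merge() case is left out because that method only exists with the patch applied, and the helper name run_benchmark and the default of 1000 runs are assumptions made for illustration. Timings will of course differ from those reported above.

```python
import timeit

# Same data as the original benchmark, written for Python 3.
SETUP = (
    "a = dict((i, i) for i in range(1000) if i % 2 == 0); "
    "b = dict((i, i + 1) for i in range(1000))"
)

STATEMENTS = {
    "Method 1": "for k in b:\n\tif k not in a:\n\t\ta[k] = b[k]",
    "Method 2": "temp = dict(b); temp.update(a); a = temp",
    "Method 3": "a.update((k, v) for k, v in b.items() if k not in a)",
}


def run_benchmark(number=1000):
    """Return {label: seconds} for `number` runs of each statement."""
    return {
        label: timeit.Timer(stmt, SETUP).timeit(number)
        for label, stmt in STATEMENTS.items()
    }


if __name__ == "__main__":
    for label, seconds in run_benchmark().items():
        print("%s : %.3f" % (label, seconds))
```

One caveat that applies to both versions: timeit runs the setup statement once per timeit() call, not once per iteration, so only the first iteration finds keys missing from a. The remaining iterations measure the case where every key of b is already present.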

Regards,
Nicolas

Dec 28 '05 #2
