missing? dictionary methods

Well at least I find them missing.

For the moment I frequently come across the following cases.

1) Two files, each with key-value pairs for the same dictionary.
However it is an error if the second file contains a key that
was not in the first file.

In treating the second file I miss a 'set' method.
dct.set(key, value) would be equivalent to dct[key] = value,
except that it would raise a KeyError if the key wasn't
already in the dictionary.

2) One file with key-value pairs. However it is an error
if a key is duplicated in the file.

In treating such files I miss a 'make' method.
dct.make(key, value) would be equivalent to dct[key] = value,
except that it would raise a KeyError if the key was
already in the dictionary.

What do other people think about this?

--
Antoon Pardon
Jul 18 '05 #1
+1

I'm sure I've needed and implemented this functionality in the past, but it was too simple to
even think of extracting it into functions/methods. In contrast to the recent pre-PEP about dict
accumulating methods, set() and make() (or whatever they might be called) are meaningful for all
dicts, so they're good candidates for being added to the base dict class.

As for naming, I would suggest reset() instead of set(), to emphasize that the key must be there.
make() is ok; other candidates could be add() or put().

George
Jul 18 '05 #2

def safeset(dct, key, value):
    if key not in dct:
        raise KeyError(key)
    else:
        dct[key] = value

def make(dct, key, value):
    if key in dct:
        raise KeyError('%r already in dict' % key)
    else:
        dct[key] = value

I don't see a good reason to make these built in to dict type.

--
Robert Kern
rk***@ucsd.edu

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
Jul 18 '05 #3
On 2005-03-21, Robert Kern <rk***@ucsd.edu> wrote:
I don't see a good reason to make these built in to dict type.


I would say the same reason that we have get. There is no
reason to have a built-in get either; it is easily implemented
like this:

def get(dct, key, default):
    try:
        return dct[key]
    except KeyError:
        return default

I would even go so far as to say that there is more reason to have a
built-in safeset and make than there is to have a built-in get.

The reason is that a Python implementation of safeset and make
will mean two accesses in the dictionary, once for the test and
once for the assignment. This double access could be eliminated
with a built-in. The get on the other hand does only one dictionary
access, so having it implemented in Python is a lesser burden.
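
The double-access argument can be made concrete with a small sketch. It is written for Python 3 (an assumption; the thread predates it), and CountingKey is a hypothetical helper, not part of any library, that counts how often the dictionary asks for the key's hash. Hash calls are only a proxy for the real cost of a lookup.

```python
class CountingKey:
    """Hypothetical key type that counts calls to __hash__."""

    def __init__(self, name):
        self.name = name
        self.hash_calls = 0

    def __hash__(self):
        self.hash_calls += 1
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, CountingKey) and self.name == other.name


def safeset(dct, key, value):
    # Check-then-assign: one lookup for the test, one for the store.
    if key not in dct:
        raise KeyError(key)
    dct[key] = value


def get(dct, key, default):
    # A single subscript lookup; a miss is handled by the exception.
    try:
        return dct[key]
    except KeyError:
        return default


key = CountingKey("a")
dct = {key: 1}                    # building the dict hashes the key once
key.hash_calls = 0

safeset(dct, key, 2)
safeset_hashes = key.hash_calls   # hashed for `in` and again for the store

key.hash_calls = 0
get(dct, key, None)
get_hashes = key.hash_calls       # hashed once for the subscript
```

On CPython this should leave safeset_hashes at 2 and get_hashes at 1, which is the asymmetry described above.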

--
Antoon Pardon
Jul 18 '05 #4
Antoon Pardon wrote:
I would say the same reason that we have get. There is no
reason to have a built-in get either; it is easily implemented
like this:

def get(dct, key, default):
    try:
        return dct[key]
    except KeyError:
        return default

I would even go so far as to say that there is more reason to have a
built-in safeset and make than there is to have a built-in get.

The reason is that a Python implementation of safeset and make
will mean two accesses in the dictionary, once for the test and
once for the assignment. This double access could be eliminated
with a built-in. The get on the other hand does only one dictionary
access, so having it implemented in Python is a lesser burden.


That's not true; they're on more or less the same level
computation-wise. try:...except... doesn't relieve the burden; it's
expensive.

For me, the issue boils down to how often such constructs are used. I
don't think that I've ever run into use cases for safeset() and make().
dct.get(key, default) comes up *a lot*, and in places where speed can
matter. Searching through the standard library can give you an idea how
often.

--
Robert Kern
rk***@ucsd.edu

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
Jul 18 '05 #5
On 2005-03-21, Robert Kern <rk***@ucsd.edu> wrote:
That's not true; they're on more or less the same level
computation-wise. try:...except... doesn't relieve the burden; it's
expensive.


I have always heard that try: ... except is relatively inexpensive
in Python, particularly if no exception is raised.
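
One way to check this on a given interpreter is to time both styles, with and without a miss. The sketch below uses the standard timeit module; the function names are made up for illustration, and no particular timings are claimed, since they vary with the Python version and the machine.

```python
import timeit

d = {"present": 1}


def lookup_try(key):
    # try/except style: the exception machinery is only exercised
    # when the key is missing.
    try:
        return d[key]
    except KeyError:
        return None


def lookup_test(key):
    # test-first style: two dictionary accesses on a hit, one on a miss.
    if key in d:
        return d[key]
    return None


timings = {}
for func in (lookup_try, lookup_test):
    for key in ("present", "absent"):
        timings[(func.__name__, key)] = timeit.timeit(
            lambda: func(key), number=100_000
        )

for (name, key), seconds in sorted(timings.items()):
    print(f"{name:12s} {key:8s} {seconds:.4f}s")
```

Comparing the "present" and "absent" rows for lookup_try shows how much of the cost comes from actually raising and catching the exception.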

For me, the issue boils down to how often such constructs are used. I
don't think that I've ever run into use cases for safeset() and make().
dct.get(key, default) comes up *a lot*, and in places where speed can
matter. Searching through the standard library can give you an idea how
often.


It is always hard to compare the popularity/usefullness of two things when
one is already implemented and the other is not. IME it is not that
uncommon to know in some part of the code that the keys you use should
already be in the dictionary or contrary that you know the key should
not already be in the dictionary.

--
Antoon Pardon
Jul 18 '05 #6
Ron

There is a has_key(k) method that helps with these.

Adding these wouldn't be that hard and it can apply to all
dictionaries with any data.

class newdict(dict):
    def new_key(self, key, value):
        if self.has_key(key):
            raise KeyError, 'key already exists'
        else:
            self[key] = value
    def set_key(self, key, value):
        if self.has_key(key):
            self[key] = value
        else:
            raise KeyError, 'key does not exist'

d = newdict()
for x in list('abc'):
    d[x] = x
print d
d.new_key('z', 'z')
d.set_key('a', 'b')
print d

Which is faster? (has_key()) or (key in keys())?
Jul 18 '05 #7

To me, one of the major problems with OOP is that there are an unbounded
number of functions that we can think of to operate on a data structure and
thus a continual pressure to turn functions into methods and thus
indefinitely expand a data structure class. And whatever is the least used
current method, there will always be candidates which are arguably at least
or almost as useful. And the addition of one method will be seen as reason
to add another, and another, and another. I was almost opposed to .get for
this reason. I think dict has about enough 'basic' methods.

So, without support from many people, your two examples strike me as fairly
specialized usages best written, as easily done, as Python functions.

Terry J. Reedy

Jul 18 '05 #8


If (1) gets accepted, I propose the name .change(key, val). It's
simple, logical, and makes sense.

Jul 18 '05 #9
George Sakkis wrote:
As for naming, I would suggest reset() instead of set(), to emphasize that the key must be there.
make() is ok; other candidates could be add() or put().


How about 'new' and 'old'?

--
Greg Ewing, Computer Science Dept,
University of Canterbury,
Christchurch, New Zealand
http://www.cosc.canterbury.ac.nz/~greg
Jul 18 '05 #10
On 2005-03-21, Terry Reedy <tj*****@udel.edu> wrote:
To me, one of the major problems with OOP is that there are an unbounded
number of functions that we can think of to operate on a data structure and
thus a continual pressure to turn functions into methods and thus
indefinitely expand a data structure class. And whatever is the least used
current method, there will always be candidates which are arguably at least
or almost as useful. And the addition of one method will be seen as reason
to add another, and another, and another. I was almost opposed to .get for
this reason. I think dict has about enough 'basic' methods.

So, without support from many people, your two examples strike me as fairly
specialized usages best written, as easily done, as Python functions.


I don't know if they are so specialized. I would rather say the
map[key] = value semantics is specialized. If we work with a list
the key already has to exist. If you have a list with 4 elements
and you try to assign to the 6th element you get an IndexError.
If you want to assign to the 6th element you have to construct
that first. That, and reasons of symmetry with var = dct[key],
make me think that dct[key] = value shouldn't just construct
an entry when it isn't present.
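
The asymmetry can be seen in a few lines. This is only an illustration of the behaviour described above, written for Python 3; the variable names are arbitrary.

```python
items = [0, 1, 2, 3]
try:
    items[5] = "x"          # sixth element of a four-element list
    list_result = "assigned"
except IndexError:
    list_result = "IndexError"

table = {"a": 1}
table["b"] = 2              # key not present: the entry is silently created
try:
    value = table["c"]      # reading a missing key raises
    read_result = "found"
except KeyError:
    read_result = "KeyError"
```

Assigning past the end of the list raises, reading a missing dictionary key raises, but assigning to a missing dictionary key does not.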

I also was under the impression that a particular part of
my program almost doubled in execution time once I replaced
the naive dictionary assignment with these self implemented
methods. A rather heavy burden IMO for something that would
require almost no extra burden when implemented as a built-in.

But you are right that there doesn't seem to be much support
for this. So I won't press the matter.

--
Antoon Pardon
Jul 18 '05 #11
On 22 Mar 2005 07:40:50 GMT, Antoon Pardon <ap*****@forel.vub.ac.be> wrote:
[...]
I also was under the impression that a particular part of
my program almost doubled in execution time once I replaced
the naive dictionary assignment with these self implemented
methods. A rather heavy burden IMO for something that would
require almost no extra burden when implemented as a built-in.

I think I see a conflict of concerns between language design
and optimization. I call it "arms-length assembler programming"
when I see language features being proposed to achieve assembler-level
code improvements.

For example, what if subclassing could be optimized to have virtually
zero cost, with some kind of sticky-mro hint etc to the compiler/optimizer?
How many language features would be dismissed with "just do a sticky subclass?"

But you are right that there doesn't seem to be much support
for this. So I won't press the matter.

I think I would rather see efficient general composition mechanisms
such as subclassing, decoration, and metaclassing etc. for program elements,
if possible, than incremental aggregation of efficient elements into the built-in core.

Also, because optimization risks using more computation to optimize than the expression
being optimized, I suspect that some kind of evaluate-expression-once (at def-time or first
execution time) and optimize-particular-expression hints could pay off more in general
than particular useful methods. Maybe Pypy will be an easier place to experiment with
these kinds of things.

Regards,
Bengt Richter
Jul 18 '05 #12
On 2005-03-22, Bengt Richter <bo**@oz.net> wrote:
On 22 Mar 2005 07:40:50 GMT, Antoon Pardon <ap*****@forel.vub.ac.be> wrote:
[...]
I also was under the impression that a particular part of
my program almost doubled in execution time once I replaced
the naive dictionary assignment with these self implemented
methods. A rather heavy burden IMO for something that would
require almost no extra burden when implemented as a built-in.

I think I see a conflict of concerns between language design
and optimization. I call it "arms-length assembler programming"
when I see language features being proposed to achieve assembler-level
code improvements.

For example, what if subclassing could be optimized to have virtually
zero cost, with some kind of sticky-mro hint etc to the compiler/optimizer?
How many language features would be dismissed with "just do a sticky subclass?"


I'm sorry, you have lost me here. What do you mean by "sticky-mro"?

My feeling about this is the following. A[key] = value,
A.reset(key, value) and A.make(key, value) would do almost
identical things, so nearly identical that it would probably be easy
to unite them into something like A.assign(key, value, flag)
where flag would indicate which of the three options is wanted.
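
As a rough sketch of that interface only: FlagDict, assign and the three flag names below are all hypothetical, and being written in Python this version still does the test and the store as two separate lookups, which is exactly the cost a built-in could avoid.

```python
ANY, MUST_EXIST, MUST_BE_NEW = range(3)   # hypothetical flag values


class FlagDict(dict):
    """Hypothetical dict subclass; `assign` is not a real dict method."""

    def assign(self, key, value, flag=ANY):
        present = key in self
        if flag == MUST_EXIST and not present:
            raise KeyError(key)
        if flag == MUST_BE_NEW and present:
            raise KeyError("%r already in dict" % (key,))
        self[key] = value


d = FlagDict(a=1)
d.assign("a", 2, MUST_EXIST)     # like reset(): key is there, value replaced
d.assign("b", 3, MUST_BE_NEW)    # like make(): key is new, entry created
d.assign("c", 4)                 # like d["c"] = 4

try:
    d.assign("zzz", 0, MUST_EXIST)
    outcome = "assigned"
except KeyError:
    outcome = "KeyError"
```

The single entry point makes it visible that the three operations differ only in what happens after the key has been looked up.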

Also a lot of this code is identical to searching for a key.
Now because the implementation doesn't provide some of the
possibilities I have to duplicate some of the work.

One could argue that hashes are fast enough so that this
doesn't matter, but dictionaries are the template for
all mappings in Python. What if you are using a tree
and you have to go through it twice, or what if you
are working with a slower medium, such as one of
the dbm modules, where you have to go through your
structure on disk twice?

You can see it as assembler-level code improvements, but
you also can see it as an incomplete interface to your
structure. IMO it would be like only providing '<'
and if people wanted '==' they would have to implement
that like 'not (b < a or a < b)' and in this
case too, this would increase the cost compared with
a directly implemented '=='.
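
The comparison analogy can be sketched the same way. Version is a made-up class whose only native comparison is '<', with a counter added purely to show how much work the derived equality does.

```python
class Version:
    """Hypothetical type whose only native comparison is `<`."""

    lt_calls = 0

    def __init__(self, number):
        self.number = number

    def __lt__(self, other):
        Version.lt_calls += 1
        return self.number < other.number


def equal_via_lt(a, b):
    # a == b derived from '<' alone: neither is less than the other.
    return not (b < a or a < b)


a, b = Version(3), Version(3)
same = equal_via_lt(a, b)
calls_when_equal = Version.lt_calls    # both '<' tests had to run
```

For equal values both '<' tests are evaluated, where a direct '==' would need a single comparison.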

But you are right that there doesn't seem to be much support
for this. So I won't press the matter.

I think I would rather see efficient general composition mechanisms
such as subclassing, decoration, and metaclassing etc. for program elements,
if possible, than incremental aggregation of efficient elements into the built-in core.

Also, because optimization risks using more computation to optimize than the expression
being optimized,


I think this would hardly be the case here. The dictionary code already
has to find out if the key is already in the hash or not. Instead of
just continuing the branch it decided on as is now the case, the code
would test if the branch is appropriate for the demanded action
and raise an exception if not.

--
Antoon Pardon
Jul 18 '05 #13
