I would like to get everyone's thoughts on two new dictionary methods:
    def count(self, value, qty=1):
        try:
            self[key] += qty
        except KeyError:
            self[key] = qty

    def appendlist(self, key, *values):
        try:
            self[key].extend(values)
        except KeyError:
            self[key] = list(values)
The rationale is to replace the awkward and slow existing idioms for dictionary
based accumulation:
d[key] = d.get(key, 0) + qty
d.setdefault(key, []).extend(values)
In simplest form, those two statements would now be coded more readably as:
d.count(key)
d.appendlist(key, value)
In their multi-value forms, they would now be coded as:
d.count(key, qty)
d.appendlist(key, *values)
The error messages returned by the new methods are the same as those returned by
the existing idioms.
The get() method would continue to exist because it is useful for applications
other than accumulation.
The setdefault() method would continue to exist but would likely not make it
into Py3.0.
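For readers who want to try the proposed behavior, here is a minimal sketch written as a dict subclass. The class name AccumDict is hypothetical and not part of the proposal, and the first parameter of count() is spelled key so that the body runs as written.

```python
class AccumDict(dict):
    """Hypothetical dict subclass emulating the two proposed methods."""

    def count(self, key, qty=1):
        # Add qty to the running total for key, starting at qty if absent.
        try:
            self[key] += qty
        except KeyError:
            self[key] = qty

    def appendlist(self, key, *values):
        # Extend the list stored under key, creating it on first use.
        try:
            self[key].extend(values)
        except KeyError:
            self[key] = list(values)


d = AccumDict()
for word in "the quick the lazy the".split():
    d.count(word)

groups = AccumDict()
groups.appendlist("even", 2, 4)
groups.appendlist("odd", 1)
groups.appendlist("even", 6)
```

A subclass like this only imitates the semantics; it does not deliver the speed benefits claimed for a built-in implementation.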
PROBLEMS BEING SOLVED
---------------------
The readability issues with the existing constructs are:
* They are awkward to teach, create, read, and review.
* Their wording tends to hide the real meaning (accumulation).
* The meaning of setdefault() 's method name is not self-evident.
The performance issues with the existing constructs are:
* They translate into many opcodes which slows them considerably.
* The get() idiom requires two dictionary lookups of the same key.
* The setdefault() idiom instantiates a new, empty list prior to every call.
* That new list is often not needed and is immediately discarded.
* The setdefault() idiom requires an attribute lookup for extend/append.
* The setdefault() idiom makes two function calls.
The latter issues are evident from a disassembly:

>>> dis(compile('d[key] = d.get(key, 0) + qty', '', 'exec'))
  1           0 LOAD_NAME                0 (d)
              3 LOAD_ATTR                1 (get)
              6 LOAD_NAME                2 (key)
              9 LOAD_CONST               0 (0)
             12 CALL_FUNCTION            2
             15 LOAD_NAME                3 (qty)
             18 BINARY_ADD
             19 LOAD_NAME                0 (d)
             22 LOAD_NAME                2 (key)
             25 STORE_SUBSCR
             26 LOAD_CONST               1 (None)
             29 RETURN_VALUE

>>> dis(compile('d.setdefault(key, []).extend(values)', '', 'exec'))
  1           0 LOAD_NAME                0 (d)
              3 LOAD_ATTR                1 (setdefault)
              6 LOAD_NAME                2 (key)
              9 BUILD_LIST               0
             12 CALL_FUNCTION            2
             15 LOAD_ATTR                3 (extend)
             18 LOAD_NAME                4 (values)
             21 CALL_FUNCTION            1
             24 POP_TOP
             25 LOAD_CONST               0 (None)
             28 RETURN_VALUE
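The listings can be reproduced on a current interpreter with a short sketch like the one below. The helper name opnames is hypothetical. Opcode names and counts differ between Python versions, so the output will not match the listing above exactly.

```python
import dis


def opnames(source):
    # Compile one statement in 'exec' mode and list its opcode names.
    code = compile(source, "<sketch>", "exec")
    return [instr.opname for instr in dis.get_instructions(code)]


get_idiom = opnames("d[key] = d.get(key, 0) + qty")
setdefault_idiom = opnames("d.setdefault(key, []).extend(values)")

print(len(get_idiom), get_idiom)
print(len(setdefault_idiom), setdefault_idiom)
```

The BUILD_LIST entry in the second listing is the empty list that gets created on every call, whether or not the key already exists.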
In contrast, the proposed methods use only a single attribute lookup and
function call, they use only one dictionary lookup, they use very few opcodes,
and they directly access the accumulation functions, PyNumber_Add() or
PyList_Append(). IOW, the performance improvement matches the readability
improvement.
ISSUES
------
The proposed names could possibly be improved (perhaps tally() is more active
and clear than count()).
The appendlist() method is not as versatile as setdefault() which can be used
with other object types (perhaps for creating dictionaries of dictionaries).
However, most uses I've seen are with lists. For other uses, plain Python code
suffices in terms of speed, clarity, and avoiding unnecessary instantiation of
empty containers:
    if key not in d:
        d.key = {subkey:value}
    else:
        d[key][subkey] = value
Raymond Hettinger
Hi All--
Maybe I'm not getting it, but I'd think a better name for count would be
add. As in
d.add(key)
d.add(key,-1)
d.add(key,399)
etc.
Raymond Hettinger wrote:
> I would like to get everyone's thoughts on two new dictionary methods:
>
>     def count(self, value, qty=1):
>         try:
>             self[key] += qty
>         except KeyError:
>             self[key] = qty

There is no existing add() method for dictionaries. Given the name
change, I'd like to see it.
Metta,
Ivan
----------------------------------------------
Ivan Van Laningham
God N Locomotive Works http://www.pauahtun.org/ http://www.andi-holmes.com/
Army Signal Corps: Cu Chi, Class of '70
Author: Teach Yourself Python in 24 Hours
Raymond Hettinger wrote:
>     def count(self, value, qty=1):
>         try:
>             self[key] += qty
>         except KeyError:
>             self[key] = qty

I presume that the argument list is a typo, and should actually be
def count(self, key, qty=1): ...
Correct?
Jeff Shannon
In article <JbL_d.8237$qN3.2116@trndny01>,
Raymond Hettinger <py****@rcn.com> wrote:
> I would like to get everyone's thoughts on two new dictionary methods:
>
>     def count(self, value, qty=1):
>         try:
>             self[key] += qty
>         except KeyError:
>             self[key] = qty

You mean
def count(self, key, qty=1)
Right?
--
Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/
"The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code --
not in reams of trivial code that bores the reader to death." --GvR
Ivan Van Laningham wrote:
> Hi All--
> Maybe I'm not getting it, but I'd think a better name for count would be add. As in
>
>     d.add(key)
>     d.add(key,-1)
>     d.add(key,399)
> etc.

IMHO inc (for increment) is better.
d.inc(key)
add can be read as add key to d
Mike
> > def count(self, value, qty=1):
[Aahz]
> You mean
>     def count(self, key, qty=1)
> Right?

Yes.
Also, there is a typo in the final snippet (pure python version of dictionary of
dictionaries). It should read:
    if key not in d:
        d[key] = {subkey:value}
    else:
        d[key][subkey] = value
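
As a runnable illustration of the corrected snippet, here is a small sketch that wraps it in a helper. The function name add_entry and the sample data are hypothetical.

```python
def add_entry(d, key, subkey, value):
    # Create the inner dict only when the outer key is new,
    # so no empty container is built and thrown away.
    if key not in d:
        d[key] = {subkey: value}
    else:
        d[key][subkey] = value


inventory = {}
add_entry(inventory, "fruit", "apple", 3)
add_entry(inventory, "fruit", "pear", 5)
add_entry(inventory, "tools", "hammer", 1)
```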
Raymond
I like this, it is short, low impact, and makes things more readable. I
tend to go with just the literal way of doing it instead of using get and
setdefault, which I find awkward.
But alas, I had my own short, low-impact, useful suggestion and I think it
died. It was for any() and all() for lists. Actually Google just released
their "functional.py" module on code.google.com with the exact same thing.
Except they are missing the identity as a default which is very useful, i.e.
any(lst, f=lambda x: x) instead of any(lst, f).
Maybe you can tack that onto your PEP :)
That is kind of related, they are accumulators as well. They could probably
be generalized for dictionaries, but I don't know how useful that would be.
"Raymond Hettinger" <vz******@verizon.net> wrote in message
news:JbL_d.8237$qN3.2116@trndny01...
> I would like to get everyone's thoughts on two new dictionary methods:
<SNIP>
On Sat, 19 Mar 2005 01:24:57 GMT, "Raymond Hettinger" <vz******@verizon.net> wrote:
> I would like to get everyone's thoughts on two new dictionary methods:
>
>     def count(self, value, qty=1):
>         try:
>             self[key] += qty
>         except KeyError:
>             self[key] = qty
>
>     def appendlist(self, key, *values):
>         try:
>             self[key].extend(values)
>         except KeyError:
>             self[key] = list(values)
>
> The rationale is to replace the awkward and slow existing idioms for dictionary based accumulation:
>
>     d[key] = d.get(key, 0) + qty
>     d.setdefault(key, []).extend(values)
>
> In simplest form, those two statements would now be coded more readably as:
>
>     d.count(key)
>     d.appendlist(key, value)
>
> In their multi-value forms, they would now be coded as:
>
>     d.count(key, qty)
>     d.appendlist(key, *values)

How about an efficient duck-typing value-incrementer to replace both? E.g. functionally like:

>>> class xdict(dict):
...     def valadd(self, key, incr=1):
...         try: self[key] = self[key] + type(self[key])(incr)
...         except KeyError: self[key] = incr
...
>>> xd = xdict()
>>> xd
{}
>>> xd.valadd('x')
>>> xd
{'x': 1}
>>> xd.valadd('x', range(3))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 3, in valadd
TypeError: int() argument must be a string or a number
>>> xd.valadd('y', range(3))
>>> xd
{'y': [0, 1, 2], 'x': 1}
>>> xd.valadd('z', (1,2))
>>> xd
{'y': [0, 1, 2], 'x': 1, 'z': (1, 2)}
>>> xd.valadd('x', 100)
>>> xd['x']
101
>>> xd.valadd('y', range(3,6))
>>> xd['y']
[0, 1, 2, 3, 4, 5]
>>> xd.valadd('z', (3,4))
>>> xd['z']
(1, 2, 3, 4)
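
The session above is Python 2. A rough Python 3 port of the same idea might look like the sketch below. The class name XDict is hypothetical, and because range() no longer returns a list in Python 3, the first value stored under 'y' is passed as an explicit list.

```python
class XDict(dict):
    def valadd(self, key, incr=1):
        # Coerce incr to the stored value's type, then combine with "+".
        try:
            current = self[key]
        except KeyError:
            self[key] = incr
        else:
            self[key] = current + type(current)(incr)


xd = XDict()
xd.valadd("x")
xd.valadd("x", 100)
xd.valadd("y", [0, 1, 2])
xd.valadd("y", range(3, 6))
xd.valadd("z", (1, 2))
xd.valadd("z", (3, 4))

# Mixing incompatible types still fails, as in the original session.
try:
    xd.valadd("x", [1, 2])
    mixed_ok = True
except TypeError:
    mixed_ok = False
```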
> ISSUES
> ------
> The proposed names could possibly be improved (perhaps tally() is more active and clear than count()).

I'm thinking the idea that the counting is happening with the value corresponding
to the key should be emphasised more. Hence valadd or such?

> The appendlist() method is not as versatile as setdefault() which can be used with other object types (perhaps for creating dictionaries of dictionaries). However, most uses I've seen are with lists. For other uses, plain Python code suffices in terms of speed, clarity, and avoiding unnecessary instantiation of empty containers:
>
>     if key not in d:
>         d.key = {subkey:value}
>     else:
>         d[key][subkey] = value

Yes, but duck typing for any obj that supports "+" gets you a lot, ISTM at this stage
of this BF ;-)
Regards,
Bengt Richter
Maybe something for sets like 'appendlist' ('unionset'?)
Jeff
[Jeff Epler]
> Maybe something for sets like 'appendlist' ('unionset'?)

I do not follow. Can you provide a pure python equivalent?
Raymond
[Roose]
> I like this, it is short, low impact, and makes things more readable. I tend to go with just the literal way of doing it instead of using get and setdefault, which I find awkward.

Thanks. Many people find setdefault() to be an oddball.

> But alas, I had my own short, low-impact, useful suggestion and I think it died. It was for any() and all() for lists. Actually Google just released their "functional.py" module on code.google.com with the exact same thing. Except they are missing the identity as a default which is very useful, i.e. any(lst, f=lambda x: x) instead of any(lst, f).
>
> Maybe you can tack that onto your PEP :)

Py2.5 is already going to include any() and all() as builtins. The signature
does not include a function, identity or otherwise. Instead, the caller can
write a listcomp or genexp that evaluates to True or False:

    any(x >= 42 for x in data)

If you wanted an identity function, that simplifies to just:

    any(data)
Raymond Hettinger
Raymond Hettinger said unto the world upon 2005-03-18 20:24:
> I would like to get everyone's thoughts on two new dictionary methods:
>
>     def count(self, value, qty=1):
>         try:
>             self[key] += qty
>         except KeyError:
>             self[key] = qty
>
>     def appendlist(self, key, *values):
>         try:
>             self[key].extend(values)
>         except KeyError:
>             self[key] = list(values)
>
> The rationale is to replace the awkward and slow existing idioms for dictionary based accumulation:
>
>     d[key] = d.get(key, 0) + qty
>     d.setdefault(key, []).extend(values)

<SNIP>
Hi all,
I am *far* less experienced with Python and programming than those
who've weighed in as yet.
I quite like count, though I agree with posts up thread that `count'
might not be the best name.
For appendlist, I would have expected

    def appendlist(self, key, sequence):
        try:
            self[key].extend(sequence)
        except KeyError:
            self[key] = list(sequence)
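
To make the difference between the two signatures concrete, here is a small side-by-side sketch. The class names StarDict and SeqDict are hypothetical and exist only for the comparison.

```python
class StarDict(dict):
    # Signature from the proposal: extra positional arguments are the values.
    def appendlist(self, key, *values):
        try:
            self[key].extend(values)
        except KeyError:
            self[key] = list(values)


class SeqDict(dict):
    # Signature suggested in this reply: one iterable holding the values.
    def appendlist(self, key, sequence):
        try:
            self[key].extend(sequence)
        except KeyError:
            self[key] = list(sequence)


a = StarDict()
a.appendlist("k", 1)          # a single value needs no wrapping
a.appendlist("k", *[2, 3])    # an existing list must be unpacked

b = SeqDict()
b.appendlist("k", [1])        # a single value must be wrapped
b.appendlist("k", [2, 3])     # an existing list is passed as is
```

Both end up with the same contents; the choice mostly affects which call site, single value or existing sequence, reads more naturally.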
I am, however, very open to the possibility that this says more about
my level of experience than it does about which way is best :-)
Best to all,
Brian vdB
On Sat, 19 Mar 2005 03:14:07 GMT, bo**@oz.net (Bengt Richter) wrote:
[...]
> Yes, but duck typing for any obj that supports "+" gets you a lot, ISTM at this stage of this BF ;-)

Just in case, by "this BF," I meant to refer to my valadd idea,
with no offensive characterization of anyone else's ideas intended ;-)
Regards,
Bengt Richter
+1 for inc instead of count.
appendlist seems a bit too specific (I do not use dictionaries of lists
that often).
The problem with setdefault is the name, not the functionality.
get_or_set would be a better name: we could use it as an alias for
setdefault and then remove setdefault in Python 3000.
Just my 2 Eurocents,
Michele Simionato
"Michele Simionato" <mi***************@gmail.com> writes: +1 for inc instead of count.
I'd prefer incr or increment to inc. add is also ok. count isn't so great.
Something like add_count or inc_count or add_num or whatever could be ok.
Raymond Hettinger wrote:
> I would like to get everyone's thoughts on two new dictionary methods:

+1 count
? appendlist

> The proposed names could possibly be improved (perhaps tally() is more active and clear than count()).

IMO 'tally' is exactly the right method name.

One issue is with negative increments reaching zero, i.e., deleting the key once
the tally goes to 0: this behavior would match the automatic key creation.

> The appendlist() method is not as versatile as setdefault() which can be used

Simpler list initialization and augmentation would be very handy, but I would also
like the equivalent functionality for sets (I don't care about dicts of dicts, though).
Given the difference in set/list augmentation methods, this may present a
challenge. Alternatively, perhaps dict_of_list and dict_of_set could become
specialized containers; they might then have value_append/value_extend and
value_add methods respectively. I imagine (without any basis in fact) that it
would also be possible to optimize the performance of large mappings of
containers compared with the generic dict.
Michael
[Michele Simionato]
> +1 for inc instead of count.

Any takers for tally()?
We should avoid abbreviations like inc() or incr() that different people tend to
abbreviate differently (for example, that is why the new partial() function has
its "keywords" argument spelled-out). The only other issue I see with that name
is that historically incrementing is more associated with +=1 than with +=n.
Also, there are reasonable use cases for a negative n and it would be misleading
to call it incrementing when decrementing is what is intended.
The issue with add() is that other types with that method use it for a radically
different purpose. For example, aSet.add(n) is not at all similar in function
to the proposed aDict.tally(n) or whatever it ends up being called. Of course,
count() is also problematic because the meaning doesn't parallel that for
list.count().

> appendlist seems a bit too specific (I do not use dictionaries of lists that often).

I'm curious. When you do use setdefault, what is the typical second argument?
In all the code I've encountered, nine times out of ten it is []. In the rare
case of {}, the resulting statement is a mess because both the subkey and value
need to be applied -- a pure python equivalent is much clearer. That leaves two
other mutable containers, set() and collections.deque() neither of which I've
ever seen used with setdefault().
IOW, I believe that, in practice, setdefault() is all about dictionaries of
lists. If so, I'm recommending a method that gets straight to the point with no
fuss, no waste, and no obfuscation.
In order to have some unused and unneeded versatility with respect to the
default object, I'm asserting that we've been burdened with an awkward, slow
idiom that is unnecessarily hard to learn and explain.

> The problem with setdefault is the name, not the functionality.

Are you happy with the readability of the argument order? To me, the key and
default value are not at all related. Do you prefer having the default value
pre-instantiated on every call when the effort is likely to be wasted? Do you
like the current design of returning an object and then making a further (second
dot) method lookup and call for append or extend? When you first saw setdefault
explained, was it immediately obvious or did it take more learning effort than
other dictionary methods? To me, it is the least explainable dictionary method.
Even when given a good definition of setdefault(), it is not immediately obvious
that it is meant to be further combined with append() or some such. When showing
code to newbies or non-pythonistas, do they find the meaning of the current
idiom self-evident? That last question is not compelling, but it does contrast
with other Python code which tends to be grokkable by non-pythonistas and
clients.
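
The point about the default being pre-instantiated on every call can be observed directly. The sketch below is only an illustration: CountingList is a hypothetical list subclass that counts how many instances get built.

```python
class CountingList(list):
    # Hypothetical helper: counts every instance that is constructed.
    created = 0

    def __init__(self, *args):
        super().__init__(*args)
        CountingList.created += 1


d = {}
for key, value in [("a", 1), ("a", 2), ("a", 3), ("b", 4)]:
    # The default argument is built before setdefault() runs, on every pass.
    d.setdefault(key, CountingList()).append(value)

stored = len(d)                # lists actually kept in the dictionary
built = CountingList.created   # lists constructed in total
```

Here four lists are built but only two are kept; the other two are discarded immediately.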

> get_or_set would be a better name: we could use it as an alias for setdefault and then remove setdefault in Python 3000.

While get_or_set would be a bit of an improvement, it is still obtuse.
Even though a set operation only occurs conditionally, the get always occurs.
The proposed name doesn't make it clear that the method always returns an object.

Even if a wording is found that better describes both the get and set
operations, it is still a distractor from the intent of the combined statement,
the intent of building up a list. That is an intrinsic wording limitation that
cannot be solved by a better name for setdefault. If any change is made at all,
we ought to go the distance and provide a better designed tool rather than just
a name change.

> Just my 2 Eurocents,

I raise you by a ruble and a pound ;-)
Raymond Hettinger
"Raymond Hettinger" <vz******@verizon.net> writes: [Michele Simionato] +1 for inc instead of count. Any takers for tally()?
I'd say "tally" has some connotation of a counter that can never go
negative. I don't know if that behavior is desirable. Someone suggested
deleting the key if the tally is decremented to 0. I'd suggest instead
throwing an exception on an attempt to decrement it to less than 0.
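
The two policies mentioned so far for a tally reaching zero can be sketched side by side. The class names are hypothetical, and the choice of ValueError is an assumption; the thread does not say which exception would be raised.

```python
class DroppingTally(dict):
    # Policy 1: remove the key once its tally reaches zero.
    def tally(self, key, qty=1):
        total = self.get(key, 0) + qty
        if total == 0:
            self.pop(key, None)
        else:
            self[key] = total


class NonNegativeTally(dict):
    # Policy 2: refuse any change that would take the tally below zero.
    def tally(self, key, qty=1):
        total = self.get(key, 0) + qty
        if total < 0:
            # ValueError is an assumed choice of exception type.
            raise ValueError("tally for %r would drop below zero" % (key,))
        self[key] = total


t = DroppingTally()
t.tally("a")
t.tally("a", -1)          # "a" disappears from t

n = NonNegativeTally()
n.tally("a")
try:
    n.tally("a", -2)
    rejected = False
except ValueError:
    rejected = True       # n["a"] is left unchanged at 1
```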

> We should avoid abbreviations like inc() or incr() that different people tend to abbreviate differently (for example, that is why the new partial() function has its "keywords" argument spelled-out).

Ok, "increment" then.

> The only other issue I see with that name is that historically incrementing is more associated with +=1 than with +=n. Also, there are reasonable use cases for a negative n and it would be misleading to call it incrementing when decrementing is what is intended.

Setting the default to 1 is enough for that. I mean, adding a negative
number to something is normally called "subtraction", but you can still
pass a negative argument to __iadd__.

> The issue with add() is that other types with that method use it for a radically different purpose. For example, aSet.add(n) is not at all similar in function to the proposed aDict.tally(n)

Hmm, ok.

> I'm curious. When you do use setdefault, what is the typical second argument? In all the code I've encountered, nine times out of ten it is [].

Yeah, me too.
Raymond Hettinger wrote:
> [Michele Simionato]
> > +1 for inc instead of count.
>
> Any takers for tally()?

Well, as a non-native speaker, I had to look up this one in my
dictionary. That said, it may be bad luck on my side, but it may be that
this word is relatively uncommon and there are many others who would be
happier with increment.
Reinhold
> +1 for inc instead of count. appendlist seems a bit too specific (I do not use dictionaries of lists that often).
No way, I use that all the time. I use that more than count, I would say.
Roose
> > d.count(key, qty) d.appendlist(key, *values)
[Bengt Richter]
> How about an efficient duck-typing value-incrementer to replace both?

There is some Zen of Python that argues against this interesting idea. Also, I'm
concerned that by folding appendlist() into valadd() we would lose an important
cue that a list is being built-up.
Another issue is that duck-typed multiple-dispatch is only readable when the
type of the input argument is obvious from the surrounding code. Given
d.valadd(x), it is hard to grok if x was created by some code far away. Since a
primary goal is readability and clarity, having two separate, concrete methods
is likely better than having a single more-abstracted multi-purpose method. The
performance gains are just icing on the cake.

> I'm thinking the idea that the counting is happening with the value
> corresponding to the key should be emphasised more. Hence valadd or such?

How about countkey() or tabulate()?
Raymond Hettinger
Raymond Hettinger:
> Any takers for tally()?

Dunno, to me "tally" reads "counts the numbers of votes for a candidate
in an election".

> We should avoid abbreviations like inc() or incr() that different people tend
> to abbreviate differently (for example, that is why the new partial() function
> has its "keywords" argument spelled-out). The only other issue I see with that
> name is that historically incrementing is more associated with +=1 than with
> +=n. Also, there are reasonable use cases for a negative n and it would be
> misleading to call it incrementing when decrementing is what is intended.

I agree with Paul Rubin's argument on that issue: let's use increment() and not
worry about negative increments.

> > appendlist seems a bit too specific (I do not use dictionaries of lists that often).
>
> I'm curious. When you do use setdefault, what is the typical second argument?

Well, I have used setdefault *very few times* in years of heavy Python usage.
Its disappearance would not bother me that much. Grepping my source code, I find
that practically my main use case for setdefault is in a memoize recipe where the
result of a function call is stored in a dictionary (if not already there) and
returned. Then I have a second case with a list as second argument.

> > The problem with setdefault is the name, not the functionality.
>
> Are you happy with the readability of the argument order? To me, the key and
> default value are not at all related. Do you prefer having the default value
> pre-instantiated on every call when the effort is likely to be wasted? Do you
> like the current design of returning an object and then making a further
> (second dot) method lookup and call for append or extend? When you first saw
> setdefault explained, was it immediately obvious or did it take more learning
> effort than other dictionary methods? To me, it is the least explainable
> dictionary method. Even when given a good definition of setdefault(), it is not
> immediately obvious that it is meant to be further combined with append() or
> some such. When showing code to newbies or non-pythonistas, do they find the
> meaning of the current idiom self-evident? That last question is not
> compelling, but it does contrast with other Python code which tends to be
> grokkable by non-pythonistas and clients.
>
> > get_or_set would be a better name: we could use it as an alias for setdefault and then remove setdefault in Python 3000.
>
> While get_or_set would be a bit of an improvement, it is still obtuse.
> Even though a set operation only occurs conditionally, the get always occurs.
> The proposed name doesn't make it clear that the method always returns an
> object.

Honestly, I don't care about the performance arguments. However, I care a lot
about readability and clarity. setdefault is terrible in this respect, since most
of the time it does *not* set a default, it just gets a value. So I am always
confused and I have to read the documentation to remind myself what it is
doing. The only right name would be "get_and_possibly_set" but it is a bit long
to type.

> Even if a wording is found that better describes both the get and set
> operations, it is still a distractor from the intent of the combined statement,
> the intent of building up a list. That is an intrinsic wording limitation that
> cannot be solved by a better name for setdefault. If any change is made at all,
> we ought to go the distance and provide a better designed tool rather than just
> a name change.

Well, I never figured out that the intent of setdefault was to build up a list ;)

Anyway, if I think about how many times I have used setdefault in my code
(practically twice) and how much time I have spent trying to decipher it (any
time I reread the code using it), I think I would have been better served by NOT
having the setdefault method available ;)

About appendlist(): it still seems a bit special purpose to me. I mean,
dictionaries already have lots of methods and I would think twice before adding
new ones, especially methods that may turn out not that useful in the long run,
or easily replaceable by user code.
Michele Simionato
Reinhold Birkenfeld <re************************@wolke7.net> writes:
> > Any takers for tally()?
>
> Well, as a non-native speaker, I had to look up this one in my dictionary. That said, it may be bad luck on my side, but it may be that this word is relatively uncommon and there are many others who would be happier with increment.

It is sort of an uncommon word. As a US English speaker I'd say it
sounds a bit old-fashioned, except when used idiomatically ("let's
tally up the posts about accumulator messages") or in nonstandard
dialect ("Hey mister tally man, tally me banana" is a song about
working on plantations in Jamaica). It may be more common in UK
English. There's an expression "tally-ho!" which had something to do
with British fox hunts, but they don't have those any more.
I'd say I prefer most of the suggested alternatives (count, add,
incr/increment) to "tally".
> Py2.5 is already going to include any() and all() as builtins. The
> signature does not include a function, identity or otherwise. Instead, the caller
> can write a listcomp or genexp that evaluates to True or False:
>
>     any(x >= 42 for x in data)
>
> If you wanted an identity function, that simplifies to just:
>
>     any(data)

Oh great, I just saw that. I was referring to this, which didn't get much
discussion: http://mail.python.org/pipermail/pyt...ry/051556.html
but it looks like it went much further, to builtins! I'm surprised.
But I wish it could be included in Python 2.4.x. I really hope it won't
have any bugs in it. :) At my job we are probably going to upgrade to 2.4,
and that takes a long time, so it'll probably be a year or 18 months after
that happens (which itself might be months from now) that we would consider
upgrading again. Oh well...
[Michele Simionato]
> Dunno, to me "tally" reads "counts the numbers of votes for a candidate in an election".

That isn't a pleasant image ;-)

> The only right name would be "get_and_possibly_set" but it is a bit long to type.

> > Even if a wording is found that better describes both the get and set operations, it is still a distractor from the intent of the combined statement, the intent of building up a list. That is an intrinsic wording limitation that cannot be solved by a better name for setdefault. If any change is made at all, we ought to go the distance and provide a better designed tool rather than just a name change.
>
> Well, I never figured out that the intent of setdefault was to build up a list ;)

Right! What does have that intent is the full statement:

    d.setdefault(k, []).append(v)

My thought is that setdefault() is rarely used by itself. Instead, it is
typically part of a longer sentence whose intent and meaning is to accumulate or
build-up. That meaning is not well expressed by the current idiom.
Raymond Hettinger
> > Py2.5 is already going to include any() and all() as builtins. The signature does not include a function, identity or otherwise. Instead, the caller can write a listcomp or genexp that evaluates to True or False:
any(x >= 42 for x in data)
[Roose]
> Oh great, I just saw that.
> . . .
> But I wish it could be included in Python 2.4.x.

If it is any consolation, the any() can already be expressed somewhat cleanly
and efficiently in Py2.4 with genexps:
True in (x >= 42 for x in data)
The translation for all() is a little less elegant:
False not in (x >= 42 for x in data)
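
A quick sketch of how those two expressions line up with the builtins, using made-up sample data. One caveat worth noting: the membership form relies on the generator yielding real booleans (or values equal to True/False), whereas any() and all() test truthiness.

```python
data = [3, 17, 42, 8]

# any(): look for True among the generated booleans.
any_24_style = True in (x >= 42 for x in data)
# all(): no False may appear among the generated booleans.
all_24_style = False not in (x >= 0 for x in data)

# The same checks with the builtins.
same_any = any_24_style == any(x >= 42 for x in data)
same_all = all_24_style == all(x >= 0 for x in data)
```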
Raymond Hettinger
> Py2.5 is already going to include any() and all() as builtins. The
> signature does not include a function, identity or otherwise. Instead, the caller
> can write a listcomp or genexp that evaluates to True or False:

Actually I was just looking at Python 2.5 docs since you mentioned this. http://www.python.org/dev/doc/devel/whatsnew/node3.html
It says min() and max() will gain a key function parameter, and sort()
gained one in Python 2.4 (news to me).
And they do indeed default to the identity in all 3 cases, so this seems
very inconsistent. If one of them has it, and sort gained the argument even
in Python 2.4 with generator expressions, then they all should have it.
any(x >= 42 for x in data)
Not to belabor the point, but in the example on that page, max(L, key=len)
could be written max(len(x) for x in L).
Now I know why Guido said he didn't want a PEP for this... such a trivial
thing can produce a lot of opinions. : )
Roose
Roose wrote:
> Not to belabor the point, but in the example on that page, max(L, key=len) could be written max(len(x) for x in L).

No, it can't:

Python 2.5a0 (#2, Mar 5 2005, 17:44:37)
[GCC 3.3.3 (SuSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> max(["a", "bbb", "cc"], key=len)
'bbb'
Peter
[Roose]
> Actually I was just looking at Python 2.5 docs since you mentioned this.
> http://www.python.org/dev/doc/devel/whatsnew/node3.html
> It says min() and max() will gain a key function parameter, and sort() gained one in Python 2.4 (news to me).

It also appears in itertools.groupby() and, for Py2.5, in heapq.nsmallest() and
heapq.nlargest().

> And they do indeed default to the identity in all 3 cases, so this seems very inconsistent. If one of them has it, and sort gained the argument even in Python 2.4 with generator expressions, then they all should have it.
>
> > any(x >= 42 for x in data)
>
> Not to belabor the point, but in the example on that page, max(L, key=len) could be written max(len(x) for x in L).

Think about it. A key= function is quite a different thing. It provides a
*temporary* comparison key while retaining the original value. IOW, your
re-write is incorrect:

>>> L = ['the', 'quick', 'brownish', 'toad']
>>> max(L, key=len)
'brownish'
>>> max(len(x) for x in L)
8
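
To show what "a temporary comparison key while retaining the original value" means, here is a hand-written approximation of max() with a key function. The name max_by is hypothetical and this is a sketch of the idea, not how the builtin is implemented.

```python
def max_by(items, key):
    # Compare on key(item) but remember and return the original item.
    iterator = iter(items)
    try:
        best = next(iterator)
    except StopIteration:
        raise ValueError("max_by() arg is an empty sequence")
    best_key = key(best)
    for item in iterator:
        item_key = key(item)
        if item_key > best_key:
            best, best_key = item, item_key
    return best


L = ['the', 'quick', 'brownish', 'toad']
longest_word = max_by(L, len)            # the original element
longest_len = max(len(x) for x in L)     # only the transformed value
```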
Remain calm. Keep the faith. Guido's design works fine.
No important use cases were left unserved by any() and all().
Raymond Hettinger
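Raymond's distinction between a key= function and a generator expression can be sketched in a few lines. The helper name max_by_key is hypothetical and exists only to illustrate the idea; it is not how the builtin is implemented.

```python
def max_by_key(items, key):
    """Return the item whose key(item) is largest (the first one wins on ties)."""
    iterator = iter(items)
    best = next(iterator)            # raises StopIteration on empty input
    best_key = key(best)
    for item in iterator:
        k = key(item)
        if k > best_key:             # compare the keys, but remember the item
            best, best_key = item, k
    return best

L = ['the', 'quick', 'brownish', 'toad']
longest_word = max_by_key(L, len)          # the original value is retained
longest_length = max(len(x) for x in L)    # only the transformed value survives
```

The first call answers "which word is longest?", the second answers "how long is the longest word?", which is why one cannot replace the other.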
On Sat, 19 Mar 2005 01:24:57 GMT, "Raymond Hettinger"
<vz******@verizon.net> wrote:
I would like to get everyone's thoughts on two new dictionary methods:
def count(self, value, qty=1):
    try:
        self[key] += qty
    except KeyError:
        self[key] = qty
def appendlist(self, key, *values):
    try:
        self[key].extend(values)
    except KeyError:
        self[key] = list(values)
Bengt Richter wrote:
>>> class xdict(dict):
...     def valadd(self, key, incr=1):
...         try: self[key] = self[key] + type(self[key])(incr)
...         except KeyError: self[key] = incr
What about :
import copy
class safedict(dict):
    def __init__(self, default=None):
        self.default = default
    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return copy.copy(self.default)
x = safedict(0)
x[3] += 1
y = safedict([])
y[5] += range(3)
print x, y
print x[123], y[234]
Bengt Richter wrote:
On Sat, 19 Mar 2005 01:24:57 GMT, "Raymond Hettinger" <vz******@verizon.net> wrote:
I would like to get everyone's thoughts on two new dictionary methods:
def count(self, value, qty=1):
    try:
        self[key] += qty
    except KeyError:
        self[key] = qty
def appendlist(self, key, *values):
    try:
        self[key].extend(values)
    except KeyError:
        self[key] = list(values)
How about an efficient duck-typing value-incrementer to replace both? E.g. functionally like:
>>> class xdict(dict):
...     def valadd(self, key, incr=1):
...         try: self[key] = self[key] + type(self[key])(incr)
...         except KeyError: self[key] = incr
A big problem with this is that there are reasonable use cases for both
d.count(key, <some integer>)
and
d.appendlist(key, <some integer>)
Word counting is an obvious use for the first. Consolidating a list of key, value pairs where the
values are ints requires the second.
Combining count() and appendlist() into one function eliminates the second possibility.
Kent
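Kent's point can be sketched with the two proposed operations written as free functions. Using plain functions on an ordinary dict is an assumption made for illustration, since the proposed methods do not exist on dict. The same integer values are summed by one and collected by the other, which a single type-dispatching method could not tell apart.

```python
def count(d, key, qty=1):
    """Add qty to d[key], starting from qty when the key is new."""
    try:
        d[key] += qty
    except KeyError:
        d[key] = qty

def appendlist(d, key, *values):
    """Extend the list at d[key], creating it when the key is new."""
    try:
        d[key].extend(values)
    except KeyError:
        d[key] = list(values)

pairs = [('a', 1), ('b', 2), ('a', 3)]
totals, groups = {}, {}
for k, v in pairs:
    count(totals, k, v)         # integers are summed per key
    appendlist(groups, k, v)    # the same integers are collected per key
```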
Raymond Hettinger wrote:
I would like to get everyone's thoughts on two new dictionary methods:
def count(self, value, qty=1):
    try:
        self[key] += qty
    except KeyError:
        self[key] = qty
def appendlist(self, key, *values):
    try:
        self[key].extend(values)
    except KeyError:
        self[key] = list(values)
Emphatic +1
I use both of these idioms all the time. (Kind of surprised to see
people confused about the need for the latter; I do it regularly.)
This is just the kind of thing experience shows cropping up enough that
it makes sense to put it in the language.
About the names: Seeing that these have specific uses, and do something
that is hard to explain in one word, I would suggest that short names
like count might betray the complexity of the operations. Therefore,
I'd suggest:
increment_value() (or add_to_value())
append_to_value()
Although they don't explicitly communicate that a value would be
created if it didn't exist, they do at least make it clear that it
happens to the value, which kind of implies that it would be created.
If we do have to use short names:
I don't like increment (or inc or incr) at all because it has the air
of a mutator method. Maybe it's just my previous experience with Java
and C++, but to me, a.incr() looks like it's incrementing a, and
a.incr(b) looks like it might be adding b to a. I don't like count
because it's too vague; it's pretty obvious what it does as an
iterator, but not as a method of dict. I could live with tally,
though. As for a short name for the other one, maybe fileas or
fileunder?
--
CARL BANKS
Brian van den Broek wrote:
Raymond Hettinger said unto the world upon 2005-03-18 20:24:
I would like to get everyone's thoughts on two new dictionary methods:
def appendlist(self, key, *values):
    try:
        self[key].extend(values)
    except KeyError:
        self[key] = list(values)
For appendlist, I would have expected
def appendlist(self, key, sequence):
    try:
        self[key].extend(sequence)
    except KeyError:
        self[key] = list(sequence)
The original proposal reads better at the point of call when values is a single item. In my
experience this will be the typical usage:
d.appendlist(key, 'some value')
as opposed to your proposal which has to be written
d.appendlist(key, ['some value'])
The original allows values to be a sequence using
d.appendlist(key, *value_list)
Kent
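The two call signatures being compared can be tried side by side. This is a minimal sketch using free functions on a plain dict; the name appendlist_seq is hypothetical and stands for Brian's single-sequence variant.

```python
def appendlist(d, key, *values):
    """Signature from the original proposal: values are passed individually."""
    try:
        d[key].extend(values)
    except KeyError:
        d[key] = list(values)

def appendlist_seq(d, key, sequence):
    """Brian's variant (hypothetical name): one sequence argument."""
    try:
        d[key].extend(sequence)
    except KeyError:
        d[key] = list(sequence)

d1, d2, d3 = {}, {}, {}

appendlist(d1, 'k', 'some value')          # single item, no wrapping needed
appendlist(d1, 'k', *['x', 'y'])           # a sequence is unpacked with *

appendlist_seq(d2, 'k', ['some value'])    # single item must be wrapped
appendlist_seq(d2, 'k', ['x', 'y'])

appendlist_seq(d3, 'k', 'ab')              # a bare string is split into characters
```

The last call shows a pitfall of the sequence form: forgetting the wrapping list around a string silently stores its individual characters.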
Ivan Van Laningham wrote:
Hi All-- Maybe I'm not getting it, but I'd think a better name for count would be add. As in
d.add(key)
d.add(key,-1)
d.add(key,399)
etc.
[...]
There is no existing add() method for dictionaries. Given the name change, I'd like to see it.
Metta, Ivan
I don't think "add" is a good name ... even if it doesn't exist in
dictionaries, it exists in sets and, IMHO, this would add confusion ...
Pierre
---------------------------------------------- Ivan Van Laningham God N Locomotive Works http://www.pauahtun.org/ http://www.andi-holmes.com/ Army Signal Corps: Cu Chi, Class of '70 Author: Teach Yourself Python in 24 Hours
On Sat, 19 Mar 2005 01:24:57 GMT,
"Raymond Hettinger" <vz******@verizon.net> wrote: The proposed names could possibly be improved (perhaps tally() is more active and clear than count()).
Curious that in this lengthy discussion, a method name of "accumulate"
never came up. I'm not sure how to separate the two cases (accumulating
scalars vs. accumulating a list), though.
Regards,
Dan
--
Dan Sommers
<http://www.tombstonezero.net/dan/>
μ₀ × ε₀ × c² = 1
> [Jeff Epler] Maybe something for sets like 'appendlist' ('unionset'?)
On Sat, Mar 19, 2005 at 04:18:43AM +0000, Raymond Hettinger wrote: I do not follow. Can you provide a pure python equivalent?
Here's what I had in mind:
$ python /tmp/unionset.py
Set(['set', 'self', 'since', 's', 'sys', 'source', 'S', 'Set', 'sets', 'starting'])
#------------------------------------------------------------------------
try:
    set
except NameError:
    # Python 2.3 has no builtin set; fall back to the sets module.
    from sets import Set as set

def unionset(self, key, *values):
    try:
        self[key].update(values)
    except KeyError:
        self[key] = set(values)

if __name__ == '__main__':
    import sys, re
    index = {}
    # We need a source of words. This file will do.
    corpus = open(sys.argv[0]).read()
    words = re.findall(r'\w+', corpus)
    # Create an index of the words according to the first letter.
    # Repeated words are listed once since the values are sets.
    for word in words:
        unionset(index, word[0].lower(), word)
    # Display the words starting with 'S'
    print index['s']
#------------------------------------------------------------------------
Jeff
Hi All--
Raymond Hettinger wrote: [Michele Simionato] +1 for inc instead of count.
Any takers for tally()?
Sure. Given the reasons for avoiding add(), tally()'s a much better
choice than count().
What about d.tally(key,0) then? Deleting the key as was suggested by
Michael Spencer seems non-intuitive to me. Just my 2 Eurocents,
I raise you by a ruble and a pound ;-)
<hardly-anything-is-worth-less-than-vietnamese-dong>-ly y'rs,
Ivan
----------------------------------------------
Ivan Van Laningham
God N Locomotive Works http://www.andi-holmes.com/ http://www.foretec.com/python/worksh...oceedings.html
Army Signal Corps: Cu Chi, Class of '70
Author: Teach Yourself Python in 24 Hours
Michele Simionato wrote: +1 for inc instead of count.
-1 for inc, increment, or anything that carries a
connotation of *increasing* the value, so long as
the proposal allows for negative numbers to be
involved. "Incrementing by -1" is a pretty silly
picture.
+1 for add and, given the above, I'm unsure there's
a viable alternative (unless this is restricted to
positive values, or perhaps even to "+1" specifically).
appendlist seems a bit too specific (I do not use dictionaries of lists that often).
As Raymond does, I use this much more than the other.
The problem with setdefault is the name, not the functionality. get_or_set would be a better name: we could use it as an alias for setdefault and then remove setdefault in Python 3000.
Agreed...
-Peter
Peter Hansen wrote: Michele Simionato wrote: +1 for inc instead of count.
-1 for inc, increment, or anything that carries a connotation of *increasing* the value, so long as the proposal allows for negative numbers to be involved. "Incrementing by -1" is a pretty silly picture.
+1 for add and, given the above, I'm unsure there's a viable alternative (unless this is restricted to positive values, or perhaps even to "+1" specifically).
What about `addto()'? add() just has the connotation of adding something
to the dict and not to an item in it.
Reinhold
Reinhold Birkenfeld wrote: Peter Hansen wrote:+1 for add and, given the above, I'm unsure there's a viable alternative (unless this is restricted to positive values, or perhaps even to "+1" specifically).
What about `addto()'? add() just has the connotation of adding something to the dict and not to an item in it.
Hmm... better than add anyway. I take back my ill-considered
+1 above, and apply instead a +0 to "count". I don't actually
like any of the alternatives at this point... needs more thought
(for my part, anyway).
To be honest, the only time I've ever seen this particular
idiom is in tutorial code or examples of how you produce
a histogram of word usage in a text document. Never in real
code (not that it doesn't happen, just that I've never
stumbled across it). The "appending to a list" idiom, on
the other hand, I've seen and used quite often.
I'm just going to stay out of the "add/inc/count/addto"
debate and consider the other half of the thread now. :-)
-Peter
[Jeff Epler] Maybe something for sets like 'appendlist' ('unionset'?)
While this could work and potentially be useful, I think it is better to keep
the proposal focused on the two common use cases. Adding a third would reduce
the chance of acceptance.
Also, in all of my code base, I've not run across a single opportunity to use
something like unionset(). This is surprising because I'm the set() author and
frequently use set based algorithms. Your example was a good one and I can
also imagine a graph represented as a dictionary of sets. Still, I don't mind
writing out the plain Python for this one if it only comes up once in a blue
moon.
Raymond
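For reference, the "plain Python" for the graph case Raymond mentions might look like the following. It is only a sketch: the edge list is made-up sample data, and it relies on the existing setdefault() idiom rather than any proposed method.

```python
# Made-up sample data: directed edges, with one duplicate on purpose.
edges = [('a', 'b'), ('a', 'c'), ('b', 'c'), ('a', 'b')]

graph = {}
for src, dst in edges:
    graph.setdefault(src, set()).add(dst)   # duplicates collapse in the set
    graph.setdefault(dst, set())            # make sure every node has an entry
```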
[Dan Sommers] Curious that in this lengthy discussion, a method name of "accumulate" never came up. I'm not sure how to separate the two cases (accumulating scalars vs. accumulating a list), though.
Separating the two cases is essential. Also, the wording should contain strong
cues that remind you of addition and of building a list.
For the first, how about addup():
d = {}
for word in text.split():
    d.addup(word)
Raymond
On 18 Mar 2005 21:03:52 -0800 Michele Simionato wrote:
MS> +1 for inc instead of count.
MS> appendlist seems a bit too specific (I do not use dictionaries of
MS> lists that often).
inc is too specific too.
MS> The problem with setdefault is the name, not the functionality.
The problem is with the functionality: d.setdefault(k, v) can't be used as
an lvalue. If it could, we wouldn't need a count/inc/add/tally method.
MS> get_or_set would be a better name: we could use it as an alias for
MS> setdefault and then remove setdefault in Python 3000.
What about a d.get(k, setdefault=v) alternative? Not sure whether it's
a good idea to overload the get() method, just an idea.
--
Denis S. Otkidach http://www.python.ru/ [ru]
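Denis's lvalue remark can be checked directly. In this sketch the statement is compiled from a string so that the failure can be caught; the variable names are illustrative only.

```python
# Augmented assignment to the result of a call is rejected at compile time.
try:
    compile("d.setdefault(k, 0) += 1", "<example>", "exec")
except SyntaxError:
    usable_as_lvalue = False
else:
    usable_as_lvalue = True

# setdefault() still serves mutable values, because the stored list is
# mutated in place and no assignment is needed.
d = {}
d.setdefault('k', []).append(1)
d.setdefault('k', []).append(2)

# For immutable values such as ints, the get() idiom has to store the result.
counts = {}
counts['k'] = counts.get('k', 0) + 1
counts['k'] = counts.get('k', 0) + 1
```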
Hi All--
Raymond Hettinger wrote: Separating the two cases is essential. Also, the wording should contain strong cues that remind you of addition and of building a list.
For the first, how about addup():
d = {}
for word in text.split():
    d.addup(word)
I still prefer tally(), despite perceived political connotations.
They're only connotations, after all, and tally() comprises both
positive and negative incrementing, whereas add() and addup() will tease
users into thinking they are only for incrementing.
What about adding another method, "setincrement()"?
d={}
d.setincrement(-1)
for word in text.split():
    d.tally(word,1)
    if word.lower() in ["a","an","the"]:
        d.tally(word)
Not that there's any real utility in that.
Metta,
Ivan
----------------------------------------------
Ivan Van Laningham
God N Locomotive Works http://www.pauahtun.org/ http://www.andi-holmes.com/
Army Signal Corps: Cu Chi, Class of '70
Author: Teach Yourself Python in 24 Hours
Dan Sommers wrote: On Sat, 19 Mar 2005 01:24:57 GMT, "Raymond Hettinger" <vz******@verizon.net> wrote:
The proposed names could possibly be improved (perhaps tally() is
more active and clear than count()). Curious that in this lengthy discussion, a method name of
"accumulate" never came up. I'm not sure how to separate the two cases
(accumulating scalars vs. accumulating a list), though.
Is it even necessary to use a method name?
import copy
class safedict(dict):
    def __init__(self, default=None):
        self.default = default
    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return copy.copy(self.default)
x = safedict(0)
x[3] += 1
y = safedict([])
y[5] += range(3)
print x, y
print x[123], y[234]
[Ivan Van Laningham] What about adding another method, "setincrement()"?
. . .
Not that there's any real utility in that.
That was a short lived suggestion ;-)
Also, it would entail storing an extra value in the dictionary header. That
alone would be a killer.
Raymond
-1 on set increment.
I think this makes your intent much clearer:
..d={}
..for word in text.split():
.. d.tally(word)
.. if word.lower() in ["a","an","the"]:
.. d.tally(word,-1)
or perhaps simplest:
..d={}
..for word in text.split():
.. if word.lower() not in ["a","an","the"]:
.. d.tally(word)
Personally, I'm +1 for tally(), and possibly tallyList() and tallySet()
to complete the thought for the cumulative container cases. I think
there is something to be gained if these methods get named in some
similar manner.
For those dead set against tally() and its ilk, how about accum(),
accumList() and accumSet()?
-- Paul
In article <JbL_d.8237$qN3.2116@trndny01>,
Raymond Hettinger <py****@rcn.com> wrote: The proposed names could possibly be improved (perhaps tally() is more active and clear than count()).
+1 tally()
--
Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/
"The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code --
not in reams of trivial code that bores the reader to death." --GvR
Raymond Hettinger wrote: Separating the two cases is essential. Also, the wording should
contain strong cues that remind you of addition and of building a list.
For the first, how about addup():
d = {}
for word in text.split():
    d.addup(word)
import copy
class safedict(dict):
    def __init__(self, default=None):
        self.default = default
    def __getitem__(self, key):
        if not self.has_key(key):
            self[key] = copy.copy(self.default)
        return dict.__getitem__(self, key)
text = 'a b c b a'
words = text.split()
counts = safedict(0)
positions = safedict([])
for i, word in enumerate(words):
    counts[word] += 1
    positions[word].append(i)
print counts, positions
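The difference between this version of safedict and the earlier try/except version is whether the default is stored on first access. A minimal sketch of the two behaviours follows; the class names CopyOnMiss and StoreOnMiss are hypothetical, and the code is written to run on current Python rather than the 2.x interpreter used in the thread.

```python
import copy

class CopyOnMiss(dict):
    """Earlier variant: return a copy of the default without storing it."""
    def __init__(self, default=None):
        dict.__init__(self)
        self.default = default
    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return copy.copy(self.default)

class StoreOnMiss(CopyOnMiss):
    """Later variant: store the copy first, so in-place mutation persists."""
    def __getitem__(self, key):
        if key not in self:
            self[key] = copy.copy(self.default)
        return dict.__getitem__(self, key)

a = CopyOnMiss([])
a['x'].append(1)    # mutates a throwaway copy; nothing is stored
a['y'] += [2]       # += assigns the result back through __setitem__

b = StoreOnMiss([])
b['x'].append(1)    # the stored list itself is mutated
```

The storing variant makes .append() work, at the cost that merely reading a missing key also creates an entry for it.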
In article <mVQ_d.9216$u76.1850@trndny08>,
Raymond Hettinger <py****@rcn.com> wrote: How about countkey() or tabulate()?
Those rank roughly equal to tally() for me, with a slight edge to these
two for clarity and a slight edge to tally() for conciseness.
--
Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/
"The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code --
not in reams of trivial code that bores the reader to death." --GvR