magical expanding hash

braver

I need a magical expanding hash with the following properties:

* it creates all intermediate keys

meh['foo']['bar'] = 1

-- works even if meh['foo'] didn't exist before

* allows pushing new elements to leaves which are arrays

meh['foo']['list'] << elem1
meh['foo']['list'] << elem2

* allows incrementing numeric leaves

meh['foo']['count'] += 7

* serializable

I have such a class in ruby. Can python do that?

Jan 17 '06 #1

James Stroud

braver wrote:

I need a magical expanding hash with the following properties:

* it creates all intermediate keys

meh['foo']['bar'] = 1

-- works even if meh['foo'] didn't exist before

* allows pushing new elements to leaves which are arrays

meh['foo']['list'] << elem1
meh['foo']['list'] << elem2

* allows incrementing numeric leaves

meh['foo']['count'] += 7

* serializable

I have such a class in ruby. Can python do that?

Is this too magical?

class meh(dict):
    def __getitem__(self, item):
        if self.has_key(item):
            return dict.__getitem__(self, item)
        else:
            anitem = meh()
            dict.__setitem__(self, item, anitem)
            return anitem

m = meh()

m['bob']['carol']['ted'] = 2

print m['bob']['carol']['ted']
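
For readers on a later interpreter, here is a minimal sketch of the same idea. It assumes Python 3 and relies on the __missing__ hook that dict subclasses gained in Python 2.5, after this thread; the class name AutoDict is made up for illustration.

```python
class AutoDict(dict):
    """Dict that creates nested AutoDicts for missing keys (illustrative name)."""

    def __missing__(self, key):
        # dict.__getitem__ calls this only when the key is absent.
        child = type(self)()
        self[key] = child
        return child


m = AutoDict()
m['bob']['carol']['ted'] = 2
print(m['bob']['carol']['ted'])
print('alice' in m)  # membership tests do not create keys
```

Only subscripting with [] triggers the hook, so get() and "in" still behave like a plain dict.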

Jan 17 '06 #2

Paul Rubin

"braver" <de*********@gmail.com> writes:

I need a magical expanding hash with the following properties: ...
I have such a class in ruby. Can python do that?

Python's built-in dict objects don't do that but you could write such
a class pretty straightforwardly.

Jan 17 '06 #3

Giovanni Bajo

James Stroud wrote:

I need a magical expanding hash with the following properties:

* it creates all intermediate keys

meh['foo']['bar'] = 1

-- works even if meh['foo'] didn't exist before

* allows pushing new elements to leaves which are arrays

meh['foo']['list'] << elem1
meh['foo']['list'] << elem2

* allows incrementing numeric leaves

meh['foo']['count'] += 7

* serializable

I have such a class in ruby. Can python do that?

Is this too magical?

class meh(dict):
    def __getitem__(self, item):
        if self.has_key(item):
            return dict.__getitem__(self, item)
        else:
            anitem = meh()
            dict.__setitem__(self, item, anitem)
            return anitem

Actually what the OP wants is already a method of dict, it's called
setdefault(). It's not overloaded by "[]" because it's believed to be better to
be able to say "I want auto-generation" explicitly rather than implicitly: it
gives the user more power to control, and to enforce stricter rules.

>>> class meh(dict):
...     def __getitem__(self, item):
...         return dict.setdefault(self, item, meh())
...
>>> a = meh()
>>> a["foo"]["bar"] = 2
>>> a["foo"]["dup"] = 3
>>> print a["foo"]["bar"]
2
>>> print a
{'foo': {'dup': 3, 'bar': 2}}

So I advise using this class, and suggest that the OP try using setdefault()
explicitly to better understand Python's philosophy.
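
A small sketch of what calling setdefault() explicitly looks like, written for Python 3 (the variable names are illustrative only):

```python
d = {}

# Key absent: the default is stored, then returned.
first = d.setdefault('foo', [])
first.append(1)

# Key present: the stored value is returned and the new default is discarded.
second = d.setdefault('foo', ['ignored'])

print(d)                # {'foo': [1]}
print(first is second)  # True
```

Note that the default argument is evaluated on every call, even when the key already exists, so a class that passes meh() as the default builds a throwaway instance on each lookup.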

BTW: remember that setdefault() is written "setdefault()" but it's read
"getorset()".
--
Giovanni Bajo

Jan 17 '06 #4

Diez B. Roggisch

BTW: remember that setdefault() is written "setdefault()" but it's read
"getorset()".

I can only second that. The misleading name has - well, misled me :)

Regards,

Diez

Jan 17 '06 #5

Paul Rubin

"Diez B. Roggisch" <de***@nospam.web.de> writes:

BTW: remember that setdefault() is written "setdefault()" but it's read
"getorset()".

I can only second that. The misleading name has - well, misled me :)

Hmm,

x[a][b][c][d] = e # x is a "magic" dict

becomes

x.setdefault(a,{}).setdefault(b,{}).setdefault(c,{ })[d] = e

if I understand correctly. Ugh.
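
One way to hide that chain without any magic dict is a small helper that walks the keys. This is only a sketch for Python 3, and set_path is a made-up name:

```python
def set_path(d, keys, value):
    """Assign value at d[k0][k1]...[kn], creating plain dicts on the way."""
    *parents, last = keys
    for key in parents:
        # setdefault returns the existing sub-dict or stores a new one.
        d = d.setdefault(key, {})
    d[last] = value


x = {}
set_path(x, ['a', 'b', 'c', 'd'], 'e')
print(x)  # {'a': {'b': {'c': {'d': 'e'}}}}
```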

Jan 17 '06 #6

Steven Bethard

Paul Rubin wrote:

Hmm,

x[a][b][c][d] = e # x is a "magic" dict

becomes

x.setdefault(a,{}).setdefault(b,{}).setdefault(c,{ })[d] = e

if I understand correctly. Ugh.

Agreed. I really hope that Python 3.0 applies Raymond Hettinger's
suggestion "Improved default value logic for Dictionaries" from
http://wiki.python.org/moin/Python3%2e0Suggestions

This would allow you to make the setdefault() call only once, instead of
on every lookup:

class meh(dict):
    def __init__(self, *args, **kwargs):
        super(meh, self).__init__(*args, **kwargs)
        self.setdefault(function=meh)
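
The setdefault(function=...) call above was a proposal, not an existing API. A similar effect can be sketched with collections.defaultdict, which entered the standard library later (Python 2.5); the helper name tree is an assumption for illustration.

```python
from collections import defaultdict


def tree():
    """Return a defaultdict whose missing keys become new trees."""
    return defaultdict(tree)


t = tree()
t['foo']['bar'] = 2
t['foo']['dup'] = 3
print(t['foo']['bar'])  # 2
```

The factory is given once, at construction time, instead of on every lookup.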

STeVe

Jan 17 '06 #7

braver

Nice. What about pushing to leaves which are arrays, or incrementing
leaves which are numbers? If the array leaf didn't exist, or a number
wasn't set yet, << must create an empty array and push the element from
the RHS into it, and += must init the leaf to 0 and add the RHS to it.
Here's the corresponding ruby:

# ruby!

class MagicalExpandingHash < Hash
  def initialize(*params)
    if params.first.is_a? MagicalExpandingHash
      @parentObj, @parentKey = params[0..1]
      params = params[2..-1]
    end
    super(*params) { |h,k|
      h[k] = MagicalExpandingHash.new(self,k)
    }
  end
  def <<(elem)
    if @parentObj[@parentKey].empty?
      @parentObj[@parentKey] = [ elem ]
    else
      raise ArgumentError, "Can't push onto populated index", caller
    end
  end
  def +(elem)
    unless elem.is_a? Numeric
      raise ArgumentError, "Can't add a non-Numeric value", caller
    end
    if @parentObj[@parentKey].empty?
      @parentObj[@parentKey] = elem
    else
      raise ArgumentError, "Can't add to populated index", caller
    end
  end

  def to_hash
    h = Hash.new
    self.each_pair {|k,v| h[k]=(v.class==self.class)? v.to_hash : v }
    return h
  end

  def from_hash(h)
    h.each_pair {|k,v| self[k]=(v.is_a? Hash) ?
      self.class.new.from_hash(v) : v}
  end

  def marshal_dump
    self.to_hash
  end
  def marshal_load(h)
    from_hash(h)
  end
end

# examples
if $0 == __FILE__
  meh = MagicalExpandingHash.new

  meh['usa']['france'] << 'tocqueville'
  meh['usa']['france'] << 'freedom fries'
  meh['life']['meaning'] += 42

  puts meh.inspect
  # => {"usa"=>{"france"=>["tocqueville", "freedom fries"]},
  #     "life"=>{"meaning"=>42}}
end
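
For comparison, a rough Python 3 sketch of the same example data. It is not a translation of the Ruby class: it assumes the depth is fixed at two levels and that the leaf type is declared up front, and to_plain is a hypothetical helper that plays the role of to_hash for serialization.

```python
import json
from collections import defaultdict

# Two fixed levels; each leaf type is chosen up front rather than inferred.
lists = defaultdict(lambda: defaultdict(list))
counts = defaultdict(lambda: defaultdict(int))

lists['usa']['france'].append('tocqueville')
lists['usa']['france'].append('freedom fries')
counts['life']['meaning'] += 42


def to_plain(d):
    """Recursively copy nested dict subclasses into plain dicts."""
    return {k: to_plain(v) if isinstance(v, dict) else v for k, v in d.items()}


print(json.dumps(to_plain(lists)))
print(json.dumps(to_plain(counts)))
```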

Jan 17 '06 #8

Paul Rubin

"braver" <de*********@gmail.com> writes:

Nice. What about pushing to leaves which are arrays, or incrementing
leaves which are numbers? If the array leaf didn't exist, or a number
wasn't set yet, << must create an empty array and push the element from
the RHS into it, and += must init the leaf to 0 and add the RHS to it.

Are you trying to simulate Ruby syntax or just implement those functions?
Implementing the functions is easy enough. If you want Ruby syntax,
use Ruby.

Jan 17 '06 #9

braver

Actually, the behavior is important to translate perl into ruby. Can
it be implemented in python looking similarly?

Jan 17 '06 #10

Paul Rubin

"braver" <de*********@gmail.com> writes:

Actually, the behavior is important to translate perl into ruby. Can
it be implemented in python looking similarly?

It's kind of bizarre in Python to use << as a mutation operator, but I
guess you could do it. Sort of like 'cout << "hello world"' in C++.

Jan 17 '06 #11

Steve Holden

Steven Bethard wrote:

Paul Rubin wrote:
Hmm,

x[a][b][c][d] = e # x is a "magic" dict

becomes

x.setdefault(a,{}).setdefault(b,{}).setdefault(c,{ })[d] = e

if I understand correctly. Ugh.

Agreed. I really hope that Python 3.0 applies Raymond Hettinger's
suggestion "Improved default value logic for Dictionaries" from
http://wiki.python.org/moin/Python3%2e0Suggestions

This would allow you to make the setdefault() call only once, instead of
on every lookup:

class meh(dict):
    def __init__(self, *args, **kwargs):
        super(meh, self).__init__(*args, **kwargs)
        self.setdefault(function=meh)

STeVe

In fact, why not go one better and also add a "default" keyword
parameter to dict()?

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006 www.python.org/pycon/

Jan 17 '06 #12

braver

Exactly, << as in C++/ruby streams. But notice the extra checks needed
to see whether we want a new leaf which is an array or a number, or we
create an intermediate hash level. Would the checks look the same in
python?

Jan 17 '06 #13

Paul Rubin

"braver" <de*********@gmail.com> writes:

Exactly, << as in C++/ruby streams. But notice the extra checks needed
to see whether we want a new leaf which is an array or a number, or we
create an intermediate hash level. Would the checks look the same in
python?

You could check what is being shifted and make a new leaf of the
appropriate type. If you put a number there though, you wouldn't be
able to then add more nodes beneath that number.
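
A sketch of that approach, assuming Python 3: an empty placeholder node remembers its parent and key, and replaces itself with a leaf on the first << or +=. The names Node and PushList are invented for this example, and a list subclass is needed because the built-in list has no << operator.

```python
class PushList(list):
    """List that also accepts << as append (hypothetical helper)."""

    def __lshift__(self, elem):
        self.append(elem)
        return self


class Node(dict):
    """Auto-creating dict that remembers where it sits in its parent."""

    def __init__(self, parent=None, key=None):
        super().__init__()
        self._parent, self._key = parent, key

    def __missing__(self, key):
        child = Node(self, key)
        self[key] = child
        return child

    def __lshift__(self, elem):
        # An empty placeholder turns itself into a list leaf.
        if self or self._parent is None:
            raise TypeError("can only push onto an empty, nested node")
        leaf = self._parent[self._key] = PushList([elem])
        return leaf

    def __iadd__(self, number):
        # d[k] += n rebinds d[k] to whatever this returns.
        if self:
            raise TypeError("can't add to a populated node")
        return number


meh = Node()
meh['usa']['france'] << 'tocqueville'
meh['usa']['france'] << 'freedom fries'
meh['life']['meaning'] += 42
print(meh)
```

As noted above, once meh['life']['meaning'] holds a number, subscripting beneath it fails, because an int is not a container.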

Jan 17 '06 #14

Fredrik Lundh

"braver" wrote

Exactly, << as in C++/ruby streams. But notice the extra checks needed
to see whether we want a new leaf which is an array or a number, or we
create an intermediate hash level. Would the checks look the same in
python?

we?

trust me, the number of people who think it's a good idea to write perl

</F>

Jan 17 '06 #15

Fredrik Lundh

"braver" wrote

Exactly, << as in C++/ruby streams. But notice the extra checks needed
to see whether we want a new leaf which is an array or a number, or we
create an intermediate hash level. Would the checks look the same in
python?

we?

trust me, the number of people who think it's a good idea to write perl-
inspired ruby and run that code in a python interpreter is very limited.

and even if you succeed in persuading someone else to write the code
for you, don't you think your users will find out pretty quickly that python's
not ruby ?

</F>

Jan 17 '06 #16

braver

The point of this exercise is to compare how either ruby or python can
implement perl's default behavior when dealing with hashes. Since
these are bread and butter of scripting, having a MEH class handy can
enable fast semantically equivalent translation. This can be
beneficial for demonstrating feasibility of migrating to python.
Instead of debating philosophical justifications, I rather wonder
what's the most appropriate pythonic way to solve the problem as stated.

Jan 17 '06 #17

Fredrik Lundh

"braver" wrote:

The point of this exercise is to compare how either ruby or python can
implement perl's default behavior when dealing with hashes. Since
these are bread and butter of scripting, having a MEH class handy can
enable fast semantically equivalent translation. This can be
beneficial for demonstrating feasibility of migrating to python.

if you want to write perl code, why migrate to some other language ?

Instead of debating philosophical justifications, I rather wonder
what's the most appropriate pythonic way to solve the problem
as stated.

write python code.

</F>

Jan 17 '06 #18

braver

Can assigning to hash without intermediate levels, possibly adding to a
numeric leaf or adding an element to a leaf array, be python code?

h['a']['b']['c'] += 42

If it can, I'd like to have a class which supports it.

Is keeping a list at the leaf of a hash python code?

h['a']['b']['c'].push(7) # or override push as an operator of your
choosing

Hashes with accumulating lists or counters at the leaves are universal
data structures used in python as much as anywhere else. Python is
used for scripting purposes at least as much as for some abstract ones.
Having a useful data structure is handy.

The multi-level hashes with default or accumulation come up naturally
in text parsing. Designing a dedicated class structure may or may not
be a better choice, depending on expediency.

Jan 17 '06 #19

Paul Rubin

"braver" <de*********@gmail.com> writes:

Can assigning to hash without intermediate levels, possibly adding to a
numeric leaf or adding an element to a leaf array, be python code?

h['a']['b']['c'] += 42

If it can, I'd like to have a class which supports it.

Yes, it's simple enough to write a class like that. What is your
purpose in asking someone else to write it for you? If you're going
to write Python applications that use that class, you're going to have
to learn enough Python to easily write the class yourself.

The multi-level hashes with default or accumulation come up naturally
in text parsing. Designing a dedicated class structure may or may not
be a better choice, depending on expediency.

I understand that, I've written things like that in the past (not in
Python as it happens) and they were useful.

Jan 17 '06 #20

braver

Well, I know some python, but since there are powerful and magical
features in it, I just wonder whether there're some which address this
issue better than others.

Jan 17 '06 #21

James Stroud

braver wrote:

Well, I know some python, but since there are powerful and magical
features in it, I just wonder whether there're some which address this
issue better than others.

In python, "a += 1" is short, of course, for

a = a + 1

But if we haven't already assigned a, how does the interpreter know that
we want an int, float, complex, long, or some other data-type that
defines "+"?

Better, clearer or more pythonic would be:

a = 0.0 # we want a float, compiler didn't have to read mind
b = 0 # now we want an int, saving compiler lots of guesswork
a += 1 # incrementing a float by one

The "<<" operator corresponds to the __lshift__ magic method. You can
make a custom data-type here:

class lshiftinglist(list):
    def __lshift__(self, value):
        list.append(self, value)

class meh(dict):
    def __getitem__(self, item):
        return dict.setdefault(self, item, meh())

m = meh()
m['bob']['carol'] = 1
m['bob']['carol'] += 1
m['bob']['ted'] = lshiftinglist()
m['bob']['ted'] << 42
m['bob']['ted'] << 43

print m # {'bob': {'carol': 2, 'ted': [42, 43]}}

Other than magically reading the programmer's mind, this works pretty much
according to specification.

If you really want a lot of mindreading abilities, you have to write
your own mindreading code. Here is a tiny example:

class meh(dict):
    def __getitem__(self, item):
        return dict.setdefault(self, item, meh())
    def __getattr__(self, attr):
        raise AttributeError(attr)
    def __lshift__(self, value):
        print "You are thinking of '%s'." % value
    def __iadd__(self, other):
        # don't try this on a populated meh!!!!!
        return other

m = meh()

# mindreading way
m['carol'] += 4
m['carol'] += 5
m['bob'] << 44 # "You are thinking of '44'."

# better, not mindreading way
m['alice'] = [10]
m['alice'].append(11)
m['ted'] = 18
m['ted'] += 1

print m # "{'carol': 9, 'ted': 19, 'bob': {}, 'alice': [10, 11]}"

It would take a lot of coding to make that << work right. Better is the
pythonic

m[key] = [value]

It's really only one more keystroke than

m[key] << value

James

Jan 18 '06 #22

Steven Bethard

Steve Holden wrote:

Steven Bethard wrote:
Agreed. I really hope that Python 3.0 applies Raymond Hettinger's
suggestion "Improved default value logic for Dictionaries" from
http://wiki.python.org/moin/Python3%2e0Suggestions

This would allow you to make the setdefault() call only once, instead
of on every lookup:

class meh(dict):
    def __init__(self, *args, **kwargs):
        super(meh, self).__init__(*args, **kwargs)
        self.setdefault(function=meh)

STeVe

In fact, why not go one better and also add a "default" keyword
parameter to dict()?

It's not backwards compatible:

>>> dict(default=4)
{'default': 4}

And I use the **kwargs form of the dict constructor often enough to hope
that it doesn't go away in Python 3.0.

STeVe

Jan 18 '06 #23

braver

Thanks, James! This is really helpful.

: It would take a lot of coding to make that << work right. Better is
the pythonic
:
: m[key] = [value]
:
: Its really only one more keystroke than
:
: m[key] << value

But it's only for the first element, right? I'd have to say
meh[key1]...[keyN].append(elem2) after that, while I want an operator
to look the same.

Also, what's the shortest python idiom for get_or_set in expression?
E.g., when creating a numeric leaf, I'd say

if meh.has_key('a'): meh['a'] += 7
else: meh['a'] = 7

-- but I'd have to do it everywhere! That's why I'd like to override
+= to do the check/init for me.
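
For the record, the if/else above can be collapsed into a single expression with dict.get, and the list case with setdefault. A minimal sketch in Python 3, using a plain dict:

```python
meh = {}

# One expression instead of the if/else: read with a default, then store.
meh['a'] = meh.get('a', 0) + 7
meh['a'] = meh.get('a', 0) + 7
print(meh)  # {'a': 14}

# The list counterpart: setdefault returns the stored list, so append chains.
meh.setdefault('items', []).append('x')
meh.setdefault('items', []).append('y')
print(meh['items'])  # ['x', 'y']
```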

Jan 18 '06 #24

James Stroud

braver wrote:

Thanks, James! This is really helpful.

: It would take a lot of coding to make that << work right. Better is
the pythonic
:
: m[key] = [value]
:
: Its really only one more keystroke than
:
: m[key] << value

But it's only for the first element, right? I'd have to say
meh[key1]...[keyN].append(elem2) after that, while I want an operator
to look the same.

Yes, being explicit is only for the first element with the "<<". If you
use the lshiftinglist I provided, you could easily do

class lshiftinglist(list):
    def __lshift__(self, value):
        list.append(self, value)

class meh(dict):
    def __getitem__(self, item):
        return dict.setdefault(self, item, meh())
    def __getattr__(self, attr):
        raise AttributeError(attr)
    def __lshift__(self, value):
        print "You are thinking of '%s'." % value
    def __iadd__(self, other):
        # don't try this on a populated meh!!!!!
        return other

m = meh()

m['fred'] = lshiftinglist([18])
m['fred'] << 25
m['barney'] += 1
m['barney'] += 1

print m # {'barney': 2, 'fred': [18, 25]}

And so on. More pythonic, of course, is

m['fred'] = [18]
m['fred'].append(25)
m['barney'] = 1
m['barney'] += 1

Now the reason "m['barney'] += 1" works in the former is because "+="
actually returns a value to which the name on the left gets re-assigned.
"<<" does not work this way, so it can't be done as easily.

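The difference between the two operators can be made visible with a dict that logs item access. This is a sketch for Python 3, and Traced is a made-up class used only for the demonstration:

```python
class Traced(dict):
    """Dict that logs item access to show what each statement does."""

    def __init__(self):
        super().__init__()
        self.log = []

    def __getitem__(self, key):
        self.log.append('get')
        return dict.__getitem__(self, key)

    def __setitem__(self, key, value):
        self.log.append('set')
        dict.__setitem__(self, key, value)


t = Traced()
dict.__setitem__(t, 'n', 1)  # seed without logging

t['n'] += 1    # read, add, then store the result back
print(t.log)   # ['get', 'set']

t.log.clear()
t['n'] << 1    # read and shift; nothing is stored back
print(t.log)   # ['get']
```

Because += ends with a store, an __iadd__ that returns a new object can replace a placeholder; << has no such store, so the object on the left has to mutate itself or its container.
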
You might want to make a named method that thinks for you. The resulting
code is less terse but more clear (i.e. pythonic):

def meh_append(ameh, key, value):
    if not ameh.has_key(key):
        ameh[key] = [value]
    else:
        ameh[key].append(value)

def meh_addleaf(ameh, key, value={}):
    if value == {}:
        ameh[key] = {}
    else:
        ameh[key] = value

m = meh()

meh_addleaf(m['bob'], 'carol', None)
meh_append(m['ted'], 'alice', 14)
meh_append(m, 'fred', 1)
meh_append(m, 'fred', 2)

print m # {'bob': {'carol': None},
# 'ted': {'alice': [14]}, 'fred': [1, 2]}

But now it's getting very pythonic. And the only magic we need is the
original __getitem__ modification, which we could find a way of
eliminating if we tried.

James

Jan 18 '06 #25

Steve Holden

Steven Bethard wrote:

Steve Holden wrote:
Steven Bethard wrote:
Agreed. I really hope that Python 3.0 applies Raymond Hettinger's
suggestion "Improved default value logic for Dictionaries" from
http://wiki.python.org/moin/Python3%2e0Suggestions

This would allow you to make the setdefault() call only once, instead
of on every lookup:

class meh(dict):
    def __init__(self, *args, **kwargs):
        super(meh, self).__init__(*args, **kwargs)
        self.setdefault(function=meh)

STeVe

In fact, why not go one better and also add a "default" keyword
parameter to dict()?

It's not backwards compatible:
>>> dict(default=4)

{'default': 4}

And I use the **kwargs form of the dict constructor often enough to hope
that it doesn't go away in Python 3.0.

Nyargle. Thanks, you're quite right, of course: I was focussing on the
list-of-pairs argument style when I wrote that. So the best we could do
is provide a subtype, defaultdict(default, *args, **kw).

It still seems to me that would be better than having to call a method
(though I don't object to the method for use if the default must change
dynamically). Maybe I just liked Icon tables too much.

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006 www.python.org/pycon/

Jan 18 '06 #26

Steven D'Aprano

On Tue, 17 Jan 2006 18:00:00 -0700, Steven Bethard wrote:

Steve Holden wrote:
Steven Bethard wrote:
Agreed. I really hope that Python 3.0 applies Raymond Hettinger's
suggestion "Improved default value logic for Dictionaries" from
http://wiki.python.org/moin/Python3%2e0Suggestions

This would allow you to make the setdefault() call only once, instead
of on every lookup:

class meh(dict):
    def __init__(self, *args, **kwargs):
        super(meh, self).__init__(*args, **kwargs)
        self.setdefault(function=meh)

STeVe

In fact, why not go one better and also add a "default" keyword
parameter to dict()?

It's not backwards compatible:
>>> dict(default=4)

{'default': 4}

And I use the **kwargs form of the dict constructor often enough to hope
that it doesn't go away in Python 3.0.

I don't like the idea of all dicts having default values. Sometimes you
don't want a default value, you want an exception when the key isn't in
the dict.

And even if you do want defaults, sometimes you want a default which is
global to the dict, and sometimes you want a default which depends on the
key. More of a "missing value" than a default.

I vote to leave dict just as it is, and add a subclass, either in a module
or as a built in (I'm not fussed either way) for dicts-with-defaults.
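
A sketch of that split, assuming Python 3 and the __missing__ hook that dict subclasses gained after this thread (Python 2.5): the plain dict keeps raising KeyError, while a subclass computes a "missing value" from the key. The class name Lengths is hypothetical.

```python
class Lengths(dict):
    """Hypothetical example: the fallback is computed from the key itself."""

    def __missing__(self, key):
        # Returned but not stored, so the dict stays unchanged.
        return len(key)


plain = {'spam': 1}
try:
    plain['eggs']
except KeyError:
    print('plain dict raises KeyError')

d = Lengths(spam=1)
print(d['spam'])    # 1
print(d['eggs'])    # 4
print('eggs' in d)  # False
```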

--
Steven.

Jan 18 '06 #27

Giovanni Bajo

braver wrote:

Also, what's the shortest python idiom for get_or_set in expression?

dict.setdefault, as I already explained to you.

Again, I'd like to point out that what you're doing is *not* the correct
Pythonic way of doing things. In Python, there is simply no implicit
sub-dict creation, nor implicit type inference from operators. And there
are very good reasons for that. Python is a strongly typed language: objects
have a type and keep it, they don't change it when used with different
operators. setdefault() is your get'n'set, everything else has to be made
explicit for a good reason. Strong typing has its virtues, let me give you a
link about this:

http://wingware.com/python/success/astra
See specifically the paragraph "Python's Error Handling Improves Robustness"

I believe you're attacking the problem from a very bad point of view.
Instead of trying to write a Python data structure which behaves like
Perl's, convert a Perl code snippet into Python, using the *Pythonic* way of
doing it, and then compare things. Don't try to write Perl in Python, just
write Python and then compare the differences.
--
Giovanni Bajo

Jan 18 '06 #28

braver

Giovanni Bajo wrote,

dict.setdefault, as I already explained to you.

I wonder about numerics too. Say we have a = None somewhere.

I want to increment it, so I'd say a += 8. Now if this is a parsing
app, the increment may happen everywhere -- so I'd write a function to
do it if I worry about it being initialized before. Is there any
operator/expression level support for what in ruby looks like

a ||= 0
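
Python has no ||= operator. Two common spellings are sketched below for Python 3; note that "or" replaces any falsy value (0, '', [] as well as None), whereas Ruby's ||= only replaces nil and false.

```python
a = None

# Closest spelling of a ||= 0: replaces any falsy value, not only None.
a = a or 0
a += 8
print(a)  # 8

# Stricter variant: only a missing (None) value is initialised.
b = None
if b is None:
    b = 0
b += 8
print(b)  # 8
```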

Jan 19 '06 #29
