how to keep collection of existing instances and return one on instantiation

marduk
#1: Oct 5 '05
I couldn't think of a good subject..

Basically, say I have a class

class Spam:
    def __init__(self, x):
        self.x = x


then if I create two instances:

a = Spam('foo')
b = Spam('foo')

a == b # False

What I *really* want is to keep a collection of all the Spam instances,
and if I try to create a new Spam instance with the same constructor
parameters, then return the existing Spam instance. I thought new-style
classes would do it:

class Spam(object):
    cache = {}
    def __new__(cls, x):
        if cls.cache.has_key(x):
            return cls.cache[x]
    def __init__(self, x):
        self.x = x
        self.cache[x] = self

a = Spam('foo')
b = Spam('foo')

Well, in this case a and b are identical... to None! I assume this is
because the test in __new__ fails, so it returns None. I then need to
create a new Spam, but how do I do that without calling __new__ again?
I can't call __init__ because there's no self...

So what is the best/preferred way to do this?


Diez B. Roggisch
#2: Oct 5 '05

> What I *really* want is to keep a collection of all the Spam instances,
> and if I try to create a new Spam instance with the same constructor
> parameters, then return the existing Spam instance. I thought new-style
> classes would do it:
<snip>
> So what is the best/preferred way to do this?

Use the Borg pattern. See

http://aspn.activestate.com/ASPN/Coo...n/Recipe/66531

Together with your caching, that should do the trick.

Diez

Jonathan LaCour
#3: Oct 5 '05

> class Spam(object):
>     cache = {}
>     def __new__(cls, x):
>         if cls.cache.has_key(x):
>             return cls.cache[x]
>     def __init__(self, x):
>         self.x = x
>         self.cache[x] = self
>
> a = Spam('foo')
> b = Spam('foo')
>
> Well, in this case a and b are identical... to None! I assume this is
> because the test in __new__ fails so it returns None, I need to then
> create a new Spam.. but how do I do that without calling __new__
> again?
> I can't call __init__ because there's no self...

Oops, you forgot to return object.__new__(cls, x) in the case the
object isn't in the cache. That should fix it.
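
For readers on Python 3, a minimal sketch of this fix might look like the following. Two things are assumptions about the newer language rather than part of the advice above: `dict.has_key` no longer exists, so the sketch uses `in`, and `object.__new__` raises a TypeError when it receives extra arguments from a class that overrides `__new__`, so only `cls` is passed on.

```python
class Spam(object):
    cache = {}  # maps constructor argument -> existing instance

    def __new__(cls, x):
        if x in cls.cache:
            return cls.cache[x]
        # Cache miss: allocate a fresh instance. Only cls is passed,
        # because Python 3 rejects extra arguments at this point.
        inst = super().__new__(cls)
        cls.cache[x] = inst
        return inst

    def __init__(self, x):
        self.x = x


a = Spam('foo')
b = Spam('foo')
print(a is b)
```

Note that this cache keeps every instance alive for the lifetime of the class.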

Jonathan
http://cleverdevil.org


Peter Otten
#4: Oct 5 '05

marduk wrote:

> What I *really* want is to keep a collection of all the Spam instances,
> and if I try to create a new Spam instance with the same constructor
> parameters, then return the existing Spam instance. I thought new-style
> classes would do it:
>
> class Spam(object):
>     cache = {}
>     def __new__(cls, x):
>         if cls.cache.has_key(x):
>             return cls.cache[x]

On cache misses you implicitly return None. But your __init__() method will
only be called if __new__() returns a Spam instance.

>     def __init__(self, x):
>         self.x = x
>         self.cache[x] = self
>
> a = Spam('foo')
> b = Spam('foo')
>
> Well, in this case a and b are identical... to None! I assume this is
> because the test in __new__ fails so it returns None, I need to then
> create a new Spam.. but how do I do that without calling __new__ again?
> I can't call __init__ because there's no self...
>
> So what is the best/preferred way to do this?

class Spam(object):
    cache = {}
    def __new__(cls, x):
        try:
            inst = cls.cache[x]
            print "from cache"
        except KeyError:
            cls.cache[x] = inst = object.__new__(cls)
            print "new instance"
        return inst # always return a Spam instance

    def __init__(self, x):
        # put one-off initialization into __new__() because __init__()
        # will be called with instances from cache hits, too.
        print "init", x

a = Spam('foo')
b = Spam('foo')

print a, b, a is b
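
The point in the comment above, that `__init__()` also runs on cache hits, can be made visible with a small Python 3 sketch. The `init_calls` counter is a hypothetical addition used only for illustration; the one-off setup (`inst.x = x`) is moved into `__new__()` as the comment suggests.

```python
class Spam(object):
    cache = {}
    init_calls = 0  # hypothetical counter, only here to make the effect visible

    def __new__(cls, x):
        try:
            inst = cls.cache[x]
        except KeyError:
            inst = cls.cache[x] = object.__new__(cls)
            inst.x = x  # one-off initialization happens on cache misses only
        return inst

    def __init__(self, x):
        # Runs on every Spam(x) call, including cache hits.
        Spam.init_calls += 1


a = Spam('foo')
b = Spam('foo')
print(a is b, Spam.init_calls)
```

Here the two calls return the same object, yet `__init__()` has run twice, which is why anything that must happen exactly once per instance belongs in `__new__()`.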

Peter

marduk
#5: Oct 5 '05

On Wed, 2005-10-05 at 12:56 -0400, Jonathan LaCour wrote:
> Oops, you forgot to return object.__new__(cls, x) in the case the
> object isn't in the cache. That should fix it.

Ahh, that did it. I didn't even think of calling object...

so the new class looks like:

class Spam(object):
    cache = {}
    def __new__(cls, x):
        if cls.cache.has_key(x):
            return cls.cache[x]
        else:
            new_Spam = object.__new__(cls, x)
            cls.cache[x] = new_Spam
            return new_Spam
    def __init__(self, x):
        self.x = x

a = Spam(2)
b = Spam(2)

a == b # => True
id(a) == id(b) # => True

Thanks for all your help.

marduk
#6: Oct 5 '05

On Wed, 2005-10-05 at 18:28 +0200, Diez B. Roggisch wrote:
> Use the BORG-pattern. See
>
> http://aspn.activestate.com/ASPN/Coo...n/Recipe/66531
>
> Together with your caching, that should do the trick.

I looked at the Borg pattern, but I don't think it is exactly what I
want.

The Borg pattern appears to be for when you want multiple instances that
share the same "data".

What I want is for multiple calls that create an object with the same
parameters to return the "original" object instead of creating a new
one.
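
The distinction drawn here can be sketched in Python 3 as follows. The class names `Borg` and `Cached` are hypothetical, and the Borg class is only my reading of the shared-state idea in the linked recipe: every instance shares one `__dict__`, but the instances themselves stay distinct objects.

```python
class Borg(object):
    _shared_state = {}  # every instance uses this dict as its __dict__

    def __init__(self, x):
        self.__dict__ = self._shared_state
        self.x = x


class Cached(object):
    _cache = {}  # constructor argument -> the one instance for it

    def __new__(cls, x):
        if x not in cls._cache:
            cls._cache[x] = super().__new__(cls)
        return cls._cache[x]

    def __init__(self, x):
        self.x = x


b1 = Borg('foo')
b2 = Borg('bar')   # overwrites x for b1 as well, since state is shared
c1 = Cached('foo')
c2 = Cached('foo')

print(b1 is b2, b1.x)
print(c1 is c2)
```

With the Borg class the two names refer to different objects that see the same attributes; with the cached class the two names refer to one object.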

marduk
#7: Oct 5 '05

On Wed, 2005-10-05 at 12:56 -0400, Jonathan LaCour wrote:
> > class Spam(object):
> >     cache = {}
> >     def __new__(cls, x):
> >         if cls.cache.has_key(x):
> >             return cls.cache[x]
> >     def __init__(self, x):
> >         self.x = x
> >         self.cache[x] = self
> >
> > a = Spam('foo')
> > b = Spam('foo')
> >
> > Well, in this case a and b are identical... to None! I assume this is
> > because the test in __new__ fails so it returns None, I need to then
> > create a new Spam.. but how do I do that without calling __new__
> > again?
> > I can't call __init__ because there's no self...
>
> Oops, you forgot to return object.__new__(cls, x) in the case the
> object isn't in the cache. That should fix it.

Okay, one more question... say I then

c = Spam('bar')
del a
del b

I've removed all references to the object, except for the one in the
cache. Do I have to implement my own garbage collection, or is there some
"magical" way of doing this within Python? I pretty much want to drop a
cache entry as soon as there are no other references to the object (other
than the cache).

Peter Otten
#8: Oct 5 '05

marduk wrote:

> On Wed, 2005-10-05 at 12:56 -0400, Jonathan LaCour wrote:
>> > class Spam(object):
>> >     cache = {}
>> >     def __new__(cls, x):
>> >         if cls.cache.has_key(x):
>> >             return cls.cache[x]
>> >     def __init__(self, x):
>> >         self.x = x
>> >         self.cache[x] = self
>> >
>> > a = Spam('foo')
>> > b = Spam('foo')
>> >
>> > Well, in this case a and b are identical... to None! I assume this is
>> > because the test in __new__ fails so it returns None, I need to then
>> > create a new Spam.. but how do I do that without calling __new__
>> > again?
>> > I can't call __init__ because there's no self...
>>
>> Oops, you forgot to return object.__new__(cls, x) in the case the
>> object isn't in the cache. That should fix it.
>
> Okay, one more question... say I then
>
> c = Spam('bar')
> del a
> del b
>
> I've removed all references to the object, except for the cache. Do I
> have to implement my own garbage collecting or is there some "magical"
> way of doing this within Python? I pretty much want to get rid of the
> cache as soon as there are no other references (other than the cache).

Use a weakref.WeakValueDictionary as the cache instead of a normal dict.
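
A minimal Python 3 sketch of that suggestion, assuming the same `Spam` class as earlier in the thread. `weakref.WeakValueDictionary` is part of the standard library; the `gc.collect()` call is an assumption added for implementations that do not free objects immediately by reference counting.

```python
import gc
import weakref


class Spam(object):
    # Entries vanish automatically once no strong reference to the value remains.
    cache = weakref.WeakValueDictionary()

    def __new__(cls, x):
        inst = cls.cache.get(x)
        if inst is None:
            inst = super().__new__(cls)
            cls.cache[x] = inst
        return inst

    def __init__(self, x):
        self.x = x


a = Spam('foo')
b = Spam('foo')
same = a is b
size_before = len(Spam.cache)

del a, b
gc.collect()  # CPython frees by refcount right away; other runtimes may need this
size_after = len(Spam.cache)
print(same, size_before, size_after)
```

The cache holds one entry while `a` and `b` exist and is empty again once both names are deleted.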

Peter

Laszlo Zsolt Nagy
#9: Oct 5 '05

> I've removed all references to the object, except for the cache. Do I
> have to implement my own garbage collecting or is there some "magical"
> way of doing this within Python? I pretty much want to get rid of the
> cache as soon as there are no other references (other than the cache).

Store weak references to instances.

from weakref import ref

class Spam(object):
    cache = {}
    def __new__(cls, x):
        instance = None
        if cls.cache.has_key(x):
            instance = cls.cache[x]()
        if instance is None:
            instance = object.__new__(cls, x)
            cls.cache[x] = ref(instance)
        return instance
    def __init__(self, x):
        self.x = x

Then:
>>> a = Spam('foo')
>>> b = Spam('foo')
>>> a is b
True
>>> print Spam.cache
{'foo': <weakref at 00A1F690; to 'Spam' at 00A2A650>}
>>> del a
>>> del b
>>> print Spam.cache
{'foo': <weakref at 00A1F690; dead>}
>>>

Well, of course this is still not thread safe, and the dead weak references
left in the cache still use some memory (though much less than keeping the
instances alive).
You can garbage collect dead weak references periodically, if you wish.
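
Both caveats can be addressed in a rough Python 3 sketch. The `_lock` attribute and the `purge()` classmethod are hypothetical names introduced here, not part of the code above, and the lock only guards cache access (`__init__` still runs outside it).

```python
import gc
import threading
from weakref import ref


class Spam(object):
    cache = {}                 # key -> weak reference to the instance
    _lock = threading.Lock()   # hypothetical guard around cache access

    def __new__(cls, x):
        with cls._lock:
            r = cls.cache.get(x)
            instance = r() if r is not None else None
            if instance is None:
                instance = super().__new__(cls)
                cls.cache[x] = ref(instance)
            return instance

    def __init__(self, x):
        self.x = x

    @classmethod
    def purge(cls):
        """Drop entries whose referent has been collected; return how many."""
        with cls._lock:
            dead = [key for key, r in cls.cache.items() if r() is None]
            for key in dead:
                del cls.cache[key]
            return len(dead)


a = Spam('foo')
b = Spam('bar')
del a
gc.collect()  # only needed where objects are not freed by reference counting
removed = Spam.purge()
print(removed, list(Spam.cache))
```

Calling `purge()` from time to time keeps the dictionary from accumulating dead references; how often to call it is a design choice that depends on how many distinct keys the program sees.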

Best,

Les


Diez B. Roggisch
#10: Oct 5 '05

> I looked at the Borg Pattern, but I don't think it was exactly what I
> want.
>
> The Borg pattern appears to be if you want multiple instances that point
> to the same "data".
>
> What I wanted is multiple calls to create a new object with the same
> parameters points to the "original" object instead of creating a new
> one.

Read the comments. What you say is essentially the same - the data
matters, after all. What do you care if there are several instances around?

Diez

Fredrik Lundh
#11: Oct 5 '05

"marduk" wrote:

> Do I have to implement my own garbage collecting or is there some "magical"
> way of doing this within Python?

http://docs.python.org/lib/module-weakref.html

</F>



marduk
#12: Oct 5 '05

On Wed, 2005-10-05 at 19:37 +0200, Diez B. Roggisch wrote:
> > What I wanted is multiple calls to create a new object with the same
> > parameters points to the "original" object instead of creating a new
> > one.
>
> Read the comments. What you say is essentially the same - the data
> matters, after all. What do you care if there are several instances
> around?
>
> Diez

In my case it matters more that the objects are the same.

For example I want set([Spam(1), Spam(2),
Spam(3)]).intersection(set([Spam(1), Spam(2)])) to contain two items
instead of 0.

For this and many other reasons it's important that Spam(n) is Spam(n).
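
A short Python 3 sketch of the difference described here. `Plain` is a hypothetical uncached class added for contrast, and the cached `Spam` uses an ordinary dict for brevity.

```python
class Plain(object):
    def __init__(self, x):
        self.x = x


class Spam(object):
    cache = {}

    def __new__(cls, x):
        if x not in cls.cache:
            cls.cache[x] = super().__new__(cls)
        return cls.cache[x]

    def __init__(self, x):
        self.x = x


# Without caching, every call yields a distinct object, so nothing matches.
plain_common = set([Plain(1), Plain(2), Plain(3)]).intersection(
    set([Plain(1), Plain(2)]))

# With caching, Spam(n) is Spam(n), so the default identity-based hash works.
spam_common = set([Spam(1), Spam(2), Spam(3)]).intersection(
    set([Spam(1), Spam(2)]))

print(len(plain_common), len(spam_common))
```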

marduk
#13: Oct 5 '05

On Wed, 2005-10-05 at 19:24 +0200, Peter Otten wrote:
> Use a weakref.WeakValueDictionary as the cache instead of a normal
> dict.
>
> Peter

Thanks for the reference to the weakref module. Until now I've never
had a use for it, but it sounds like what I'm looking for.

-m

Diez B. Roggisch
#14: Oct 5 '05

Diez B. Roggisch wrote:
>>> Read the comments. What you say is essentially the same - the data
>>> matters, after all. What do you care if there are several instances
>>> around?
>>
>> In my case it matters more that the objects are the same.
>>
>> For example I want set([Spam(1), Spam(2),
>> Spam(3)]).intersection(set([Spam(1), Spam(2)])) to contain two items
>> instead of 0.
>>
>> For this and many other reasons it's important that Spam(n) is Spam(n).
>
> Ah, ok. Well, you could always use the __hash__ method to ensure that -
> might be better anyway, because then _you_ define what equality means.
> But YMMV.

And the __cmp__ or __eq__/__ne__ methods of course....
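
A minimal Python 3 sketch of this alternative, in which instances stay distinct but compare and hash by value. It assumes `x` is hashable and is not changed while the object sits in a set. `__cmp__` is not used by Python 3, and `__ne__` is derived from `__eq__` there, so only `__eq__` and `__hash__` are defined.

```python
class Spam(object):
    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        if not isinstance(other, Spam):
            return NotImplemented
        return self.x == other.x

    def __hash__(self):
        # Must agree with __eq__: equal objects need equal hashes.
        return hash(self.x)


common = set([Spam(1), Spam(2), Spam(3)]).intersection(
    set([Spam(1), Spam(2)]))
print(len(common), Spam(1) == Spam(1), Spam(1) is Spam(1))
```

The intersection has two items even though `Spam(1) is Spam(1)` is false, so this approach fits only when equality, rather than identity, is what matters.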

Diez

Diez B. Roggisch
#15: Oct 5 '05

>> Read the comments. What you say is essentially the same - the data
>> matters, after all. What do you care if there are several instances
>> around?
>
> In my case it matters more that the objects are the same.
>
> For example I want set([Spam(1), Spam(2),
> Spam(3)]).intersection(set([Spam(1), Spam(2)])) to contain two items
> instead of 0.
>
> For this and many other reasons it's important that Spam(n) is Spam(n).

Ah, ok. Well, you could always use the __hash__ method to ensure that -
might be better anyway, because then _you_ define what equality means.
But YMMV.

Diez