469,609 Members | 2,210 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 469,609 developers. It's quick & easy.

Are weak refs slower than strong refs?

Are weak refs slower than strong refs? I've been considering making the
"parent" links in BeautifulSoup into weak refs, so the trees will release
immediately when they're no longer needed. In general, all links back
towards the root of a tree should be weak refs; this breaks the loops
that give reference counting trouble.

John Nagle
Feb 25 '07 #1
3 3145
On Feb 24, 10:17 pm, John Nagle <n...@animats.comwrote:
Are weak refs slower than strong refs? I've been considering making the
"parent" links in BeautifulSoup into weak refs, so the trees will release
immediately when they're no longer needed. In general, all links back
towards the root of a tree should be weak refs; this breaks the loops
that give reference counting trouble.
I've never really benchmarked their overhead. Thinking about how they
work, I wouldn't expect a measurable difference in the time for
dereferencing the weakrefs. But internally objects must track who all
their weak reference holders are, so that will add some cost. I am
guessing if the hierarchy is built once and remains fairly static you
won't see the cost of this management. If the hierarchy is very
dynamic there will be more weakref overhead. On the other hand, you
will be freeing up the work done by the cyclic testing in the garbage
collector.

After doing similar work with weakrefs in hierarchies, I'd say it is
worth doing. If you do the work, it would be interesting to see what
the performance looks like. Be sure there is a gc.collect() in the
timing for the original strong-ref version. :-)

Feb 25 '07 #2
John Nagle <na***@animats.comwrote:
Are weak refs slower than strong refs? I've been considering
making the "parent" links in BeautifulSoup into weak refs, so the
trees will release immediately when they're no longer needed. In
general, all links back towards the root of a tree should be weak
refs; this breaks the loops that give reference counting trouble.
Yes, weak references are slower, but that may not be significant unless
people are following the parent links a lot: I would expect other
processing to make the overhead fairly insignificant.

Why not do a few timing tests with some typical use cases? Here's an
example which does nothing much but create and follow. Typical output:

Using proxy
100 loops, best of 3: 5.6 msec per loop
Call to dummy proxy
100 loops, best of 3: 3.11 msec per loop
Without proxy call
100 loops, best of 3: 2.71 msec per loop

------ timetest.py --------------------------------------------------------
from timeit import Timer, default_timer as timer
from weakref import proxy

class C:
'''Use a weak proxy (or whatever 'proxy' returns)'''
def __init__(self, parent=None):
self.parent = proxy(parent) if parent is not None else parent
self.child = None

def show_parents(self):
while self is not None:
# print "show_parents", id(self)
self = self.parent

def show_children(self):
while self is not None:
# print "show_children", id(self)
self = self.child
def t1(n):
base = C()
current = base
for i in range(n):
current.child = C(current)
current = current.child
base.show_children()
current.show_parents()

class D:
'''Strong parent reference'''
def __init__(self, parent=None):
self.parent = parent if parent is not None else parent
self.child = None

def show_parents(self):
while self is not None:
# print "show_parents", id(self)
self = self.parent

def show_children(self):
while self is not None:
# print "show_children", id(self)
self = self.child

def t2(n):
base = D()
current = base
for i in range(n):
current.child = D(current)
current = current.child
base.show_children()
current.show_parents()
def timetest(stmt):
repeat = 3
precision = 3
t = Timer(stmt, 'import __main__', timer)

# determine number so that 0.2 <= total time < 2.0
for i in range(1, 10):
number = 10**i
try:
x = t.timeit(number)
except:
t.print_exc()
return 1
if x >= 0.2:
break

try:
r = t.repeat(repeat, number)
except:
t.print_exc()
return 1
best = min(r)
print "%d loops," % number,
usec = best * 1e6 / number
if usec < 1000:
print "best of %d: %.*g usec per loop" % (repeat, precision, usec)
else:
msec = usec / 1000
if msec < 1000:
print "best of %d: %.*g msec per loop" % (repeat, precision,
msec)
else:
sec = msec / 1000
print "best of %d: %.*g sec per loop" % (repeat, precision,
sec)

if __name__=='__main__':
print "Using proxy"
timetest('__main__.t1(1000)')
print "Call to dummy proxy"
def proxy(x): return x
timetest('__main__.t1(1000)')
print "Without proxy call"
timetest('__main__.t2(1000)')
-------------------------------------------------------------------------
Feb 25 '07 #3
John Nagle wrote:
Are weak refs slower than strong refs? I've been considering making the
"parent" links in BeautifulSoup into weak refs, so the trees will release
immediately when they're no longer needed. In general, all links back
towards the root of a tree should be weak refs; this breaks the loops
that give reference counting trouble.

John Nagle
I just finished converting BeautifulSoup to use weak back references.
All that's necessary is to make the "parent", "previous", "previousSibling",
and "tagStack" links into weak proxy references. Those are the
"backlinks" of the Beautiful Soup tree; once they're weak links,
BeautifulSoup then becomes loop-free and GC in debug mode reports
zero garbage. This is useful, because BeautifulSoup's backlinks create huge
amounts of collectable garbage, leading to frequent GC cycles.

The "weakref" module could use some support functions. If you
try to create a weakref proxy from a weakref proxy, or from "None",
you get an exception, which is not usually what you want.
It's more useful to pass through "None" or an existing weakref proxy.
So I wrote the function below, which makes it much easier to
convert code to use weak proxy references.

"weakref.proxy()" probably should work that way.
Weakref proxies are supposed to be transparent, but they're not
quite transparent enough.

Patches to BeautifulSoup are below.

John Nagle

59a60,80
import weakref # Weak references for previous, parent, previousSibling
#
# Weakref allocation control
#
# The following links are always weak references, to avoid internal referenc
# require extra garbage collection.
# self.parent
# self.previous
# self.previousSibling
# These are all "back links".
#
# backref -- create a weak reference as a back pointer
#
# Generates a weakref proxy, but handles input of "none" or an existing weak
#
def backref(p) :
if p == None : # if none
return(None) # then none
if isinstance(p,weakref.ProxyType) or isinstance(p,weakref.CallableProxyTyp
return(p)
return(weakref.proxy(p)) # otherwise a new weakref
60a82
#
79,80c101,102
< self.parent = parent
< self.previous = previous
---
self.parent = backref(parent)
self.previous = backref(previous)
85c107
< self.previousSibling = self.parent.contents[-1]
---
self.previousSibling = backref(self.parent.contents[-1])
127c149
< self.nextSibling.previousSibling = self.previousSibling
---
self.nextSibling.previousSibling = backref(self.previousSibling)
157c179
< newChild.parent = self
---
newChild.parent = backref(self)
161c183
< newChild.previous = self
---
newChild.previous = backref(self)
164c186
< newChild.previousSibling = previousChild
---
newChild.previousSibling = backref(previousChild)
166c188
< newChild.previous = previousChild._lastRecursiveChild()
---
newChild.previous = backref(previousChild._lastRecursiveChild())
190c212
< newChild.nextSibling.previousSibling = newChild
---
newChild.nextSibling.previousSibling = backref(newChild)
194c216
< newChildsLastElement.next.previous = newChildsLastElement
---
newChildsLastElement.next.previous = backref(newChildsLastElemen
1052c1074
< self.tagStack.append(tag)
---
self.tagStack.append(backref(tag))
1074c1096
< self.previous = o
---
self.previous = backref(o)
1167c1189
< self.previous = tag
---
self.previous = backref(tag)
Feb 25 '07 #4

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

94 posts views Thread by Gabriel Zachmann | last post: by
3 posts views Thread by memememe | last post: by
2 posts views Thread by Matthew Herrmann | last post: by
1 post views Thread by seyong_choi | last post: by
1 post views Thread by Henri.Chinasque | last post: by
reply views Thread by gheharukoh7 | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.