Bytes | Software Development & Data Engineering Community

Function decorator that caches function results

After working through a fair number of the challenges at
www.mathschallenge.net, I noticed that some long-running functions can
be helped *a lot* by caching their results and retrieving them from
a cache instead of calculating them again. This means that I can often use
a natural recursive implementation instead of unwinding the recursive
calls to avoid big exponential running times.

Ok, so I thought, how about creating a decorator that caches the
function results and retrieves them from the cache if possible; otherwise it
calls the function and stores the value in the cache for the next invocation.

This is what I came up with so far:

def cache_function(fn):
    cache = {}
    def cached_result(*args, **kwargs):
        if args in cache:
            return cache[args]
        result = fn(*args, **kwargs)
        cache[args] = result
        return result
    return cached_result

Example of usage:

@cache_function
def fibonacci(idx):
    if idx in (1, 2):
        return 1
    else:
        return fibonacci(idx-1) + fibonacci(idx-2)

for index in range(1, 101):
    print index, fibonacci(index)

this example goes from taking exponential time to run to being near
instantaneous.

However, this:

for index in range(1, 101):
    print index, fibonacci(idx = index)

this one passes the argument as a keyword argument, and I'm not sure how to
change my function to take that into account.

Does anyone have any clues as to how to do that?

The solution would have to consider fibonacci(50) and fibonacci(idx =
50) as the same call and thus retrieve the second one from the cache.
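For what it's worth, one way to treat fibonacci(50) and fibonacci(idx = 50) as the same call is to fold positional arguments into keywords by name before building the cache key — essentially what the thread converges on below. A minimal sketch in modern Python 3 spelling (inspect.getfullargspec is the Python 3 name for what the Python 2 posts call inspect.getargspec):

```python
import inspect

def cache_function(fn):
    """Memoize fn so that fn(50) and fn(idx=50) share one cache entry."""
    cache = {}
    arg_names = inspect.getfullargspec(fn).args  # declared parameter names

    def cached_result(*args, **kwargs):
        # Fold positional arguments into the keyword dict by name,
        # then use the sorted (name, value) pairs as a hashable key.
        bound = dict(zip(arg_names, args))
        bound.update(kwargs)
        key = tuple(sorted(bound.items()))
        if key not in cache:
            cache[key] = fn(*args, **kwargs)
        return cache[key]

    return cached_result

@cache_function
def fibonacci(idx):
    return 1 if idx in (1, 2) else fibonacci(idx - 1) + fibonacci(idx - 2)
```

With this key scheme, fibonacci(50) and fibonacci(idx=50) hit the same entry.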

--
Lasse Vågsæther Karlsen
http://usinglvkblog.blogspot.com/
mailto:la***@vkarlsen.no
PGP KeyID: 0x2A42A1C2
Oct 8 '05 #1
Lasse Vågsæther Karlsen wrote:
However, this:

for index in range(1, 101):
    print index, fibonacci(idx = index)

this one uses the kwargs list of arguments, and I'm not sure how to
change my function to take that into account.

Does anyone have any clues as to how to do that?

The solution would have to consider fibonacci(50) and fibonacci(idx =
50) as the same call and thus retrieve the second one from the cache.


Most of your gain comes from caching the recursive calls, not from caching
the original call.

If the solution doesn't consider keyword arguments equivalent to positional
arguments, it will make *one* call to the real function, and then the
recursive call will still come out of the cache, so the net result will
still be to give you almost all the speedup you wanted.

Unless the results stored in the cache are very large data structures, I
would suggest that you simply store (args, kwargs) as the cache key and
accept the hit that sometimes you'll cache the same call multiple times.
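Since a dict itself isn't hashable, a literal (args, kwargs) tuple can't be a dict key; a hashable stand-in is to freeze the keyword dict. A minimal sketch of Duncan's suggestion (accepting the duplicate-caching hit he describes — f(7) and f(n=7) get separate entries):

```python
def memoize(fn):
    cache = {}
    calls = []  # purely for demonstration: record the keys of real invocations

    def wrapper(*args, **kwargs):
        # kwargs itself is unhashable, so freeze it into a frozenset of pairs
        key = (args, frozenset(kwargs.items()))
        if key not in cache:
            calls.append(key)
            cache[key] = fn(*args, **kwargs)
        return cache[key]

    wrapper.calls = calls
    return wrapper

@memoize
def square(n):
    return n * n

square(7); square(7)   # one real call
square(n=7)            # same value, but a *second* cache entry
```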
Oct 8 '05 #2
On Sat, Oct 08, 2005 at 01:53:28PM +0000, Duncan Booth wrote:
Unless the results stored in the cache are very large data structures, I
would suggest that you simply store (args, kwargs) as the cache key and
accept the hit that sometimes you'll cache the same call multiple times.


... except that dicts cannot be dict keys

Another 'memoize' decorator uses this to get the key:
kw = kwargs.items()
kw.sort()
key = (args, tuple(kw))
http://aspn.activestate.com/ASPN/Coo.../Recipe/325905

Jeff


Oct 8 '05 #3
What about not storing args at all? Something like this:

def cache_function(func, args_list):
    cache = {}
    def cached_result(*args, **kwargs):
        kwargs.update(dict(zip(args_list, args)))
        if kwargs in cache:
            return cache[kwargs]
        result = func(**kwargs)
        cache[kwargs] = result
        return result
    return cached_result

args_list is a list of all the argument names, so that they can be
converted into keyword arguments.

Oct 8 '05 #4
je****@unpythonic.net wrote:
On Sat, Oct 08, 2005 at 01:53:28PM +0000, Duncan Booth wrote:
Unless the results stored in the cache are very large data structures, I
would suggest that you simply store (args, kwargs) as the cache key and
accept the hit that sometimes you'll cache the same call multiple times.

... except that dicts cannot be dict keys

Another 'memoize' decorator uses this to get the key:
kw = kwargs.items()
kw.sort()
key = (args, tuple(kw))
http://aspn.activestate.com/ASPN/Coo.../Recipe/325905

Jeff


Yeah, but as far as I can see it, this one too fails to recognize
situations where the function is called twice with essentially the same
values, except that in one call it uses named arguments:

k1 = fibonacci(100)
k2 = fibonacci(idx = 100)

this is essentially the same call, except the second one uses a named
argument, which means the function will be invoked and a second cache
entry will be stored.

Granted, not a big problem in most such cases, but here's my augmented
function. Bear in mind that I'm 2 weeks into Python so there's bound to
be room for improvement :)

import inspect

def cache(fn):
    cache = {}
    arg_names = inspect.getargspec(fn)[0]
    def cached_result(*args, **kwargs):
        # If function is called without parameters, call it without using the cache
        if len(args) == 0 and len(kwargs) == 0:
            return fn()

        # Work out all parameter names and values
        values = {}
        for i in range(len(args)):
            values[arg_names[i]] = args[i]
        for key in kwargs:
            values[key] = kwargs[key]
        key = tuple([(key, value) for (key, value) in sorted(values.iteritems())])

        # Check cache and return cached value if possible
        if key in cache:
            return cache[key]

        # Work out new value, cache it and return it
        result = fn(*args, **kwargs)
        cache[key] = result
        return result

    # Return wrapper function
    return cached_result
--
Lasse Vågsæther Karlsen
http://usinglvkblog.blogspot.com/
mailto:la***@vkarlsen.no
PGP KeyID: 0x2A42A1C2
Oct 8 '05 #5
Sam Pointon wrote:
What about not storing args at all? Something like this:

def cache_function(func, args_list):
    cache = {}
    def cached_result(*args, **kwargs):
        kwargs.update(dict(zip(args_list, args)))
        if kwargs in cache:
            return cache[kwargs]
        result = func(**kwargs)
        cache[kwargs] = result
        return result
    return cached_result

args_list is a list of all the argument names, so that they can be
converted into keyword arguments.


I'll take a look at the zip function, didn't know about that one, but
your example also has the problem that dictionaries can't be used as
dictionary keys, but that can be easily solved. I think I can simplify
my solution with some of yours.

The parameter names can be obtained with inspect.getargspec(fn)[0],
so that can be done by the decorator function.

--
Lasse Vågsæther Karlsen
http://usinglvkblog.blogspot.com/
mailto:la***@vkarlsen.no
PGP KeyID: 0x2A42A1C2
Oct 8 '05 #6
Lasse Vågsæther Karlsen wrote:
Yeah, but as far as I can see it, this one too fails to recognize
situations where the function is called twice with essentially the same
values, except that in one call it uses named arguments:

k1 = fibonacci(100)
k2 = fibonacci(idx = 100)

this is essentially the same call, except the second one uses a named
argument, which means the function will be invoked and a second cache
entry will be stored.


whoever writes code like that deserves to be punished.

I'd say this thread points to a misunderstanding of what keyword arguments
are, and how they should be used. the basic rule is that you shouldn't mix and
match; use positional arguments for things that are documented to be positional
parameters, and keyword arguments for keyword parameters. if you don't,
you'll end up with code that depends on implementation details. and that's
never a good idea...

</F>

Oct 8 '05 #7
Lasse Vågsæther Karlsen wrote:
Sam Pointon wrote:
What about not storing args at all? Something like this:

<snip>

Ok, here's my updated version:

import inspect
import time

class cache(object):
    def __init__(self, timeout=0):
        self.timeout = timeout
        self.cache = {}

    def __call__(self, fn):
        arg_names = inspect.getargspec(fn)[0]
        def cached_result(*args, **kwargs):
            # Update named arguments with positional argument values
            kwargs.update(dict(zip(arg_names, args)))

            # Work out key as a tuple of ('argname', value) pairs
            key = tuple(sorted(kwargs.items()))

            # Check cache and return cached value if possible
            if key in self.cache:
                (value, last_time) = self.cache[key]
                if self.timeout <= 0 or time.time() - last_time <= self.timeout:
                    return value

            # Work out new value, cache it and return it
            result = fn(**kwargs)
            self.cache[key] = (result, time.time())
            return result

        # Return wrapper function
        return cached_result

Changed from previous versions:
- converted to class, must use () on decorator now
- added timeout, results will be recalculated when it expires
- timeout=0 means no timeout, results will always be reused
- now handles both positional and keyword arguments
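A condensed, self-contained version of the class above in Python 3 spelling (getfullargspec replaces the now-removed getargspec), with a quick usage sketch showing that positional and keyword calls share one cache entry:

```python
import inspect
import time

class cache(object):
    """Cache fn results; recompute after `timeout` seconds (0 = keep forever)."""
    def __init__(self, timeout=0):
        self.timeout = timeout
        self.cache = {}

    def __call__(self, fn):
        arg_names = inspect.getfullargspec(fn).args
        def cached_result(*args, **kwargs):
            # Fold positional values into kwargs so f(50) and f(idx=50) match
            kwargs.update(dict(zip(arg_names, args)))
            key = tuple(sorted(kwargs.items()))
            if key in self.cache:
                value, last_time = self.cache[key]
                if self.timeout <= 0 or time.time() - last_time <= self.timeout:
                    return value
            result = fn(**kwargs)
            self.cache[key] = (result, time.time())
            return result
        return cached_result

calls = []

@cache(timeout=0)      # the class-based decorator is applied with ()
def double(n):
    calls.append(n)    # record real invocations for the demo
    return n * 2

double(21)
double(n=21)           # cache hit: same key as double(21)
```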

--
Lasse Vågsæther Karlsen
http://usinglvkblog.blogspot.com/
mailto:la***@vkarlsen.no
PGP KeyID: 0x2A42A1C2
Oct 8 '05 #8
Fredrik Lundh wrote:
<snip>
k1 = fibonacci(100)
k2 = fibonacci(idx = 100)
<snip>

whoever writes code like that deserves to be punished.

I'd say this thread points to a misunderstanding of what keyword arguments
are, and how they should be used. the basic rule is that you shouldn't mix and
match; use positional arguments for things that are documented to be positional

<snip>

I agree completely.
However, it turns out I have to exchange code with people
who have a coding standard that dictates using keyword arguments for
all interfaces. Those functions will also benefit from the cache system
as many of them involve database lookups.

In any case, your response gave me another thing that my solution won't
handle so I'm going to just leave it as it is and look at it if I ever
need it for positional arguments, which I honestly don't believe I will.

--
Lasse Vågsæther Karlsen
http://usinglvkblog.blogspot.com/
mailto:la***@vkarlsen.no
PGP KeyID: 0x2A42A1C2
Oct 8 '05 #9
"Lasse Vågsæther Karlsen" <la***@vkarlsen.no> wrote:
[snip]

Ok, so I thought, how about creating a decorator that caches the
function results and retrieves them from cache if possible, otherwise it
calls the function and store the value in the cache for the next invokation.

[snip]


Cool, you re-invented the memoization pattern:
http://en.wikipedia.org/wiki/Memoization
http://aspn.activestate.com/ASPN/sea...ype=Subsection

Yes, it's kinda discouraging that most interesting ideas have already been conceived, implemented
and used by others...<wink>

George
Oct 8 '05 #10
George Sakkis wrote:
<snip>
Cool, you re-invented the memoization pattern:
http://en.wikipedia.org/wiki/Memoization
http://aspn.activestate.com/ASPN/sea...ype=Subsection

Yes, it's kinda discouraging that most interesting ideas have already been conceived, implemented
and used by others...<wink>

<snip>

I know, I've been scouring over the ASPN recipes and digging through
code like there's no tomorrow.

But, I don't view it as pointless even though there are existing
implementations and solutions out there.

First of all, I don't like to stuff a lot of what is obviously
library-type code into a project, unless I can reference a library that
has that function or class or whatnot. That creates a rather big
maintenance nightmare :)

So, I would have to stuff that into a library, which is what I did with
my "own" function (thank you for helping btw). The recipe on ASPN is
probably better than what I got, but... I understand how my solution
came about and what makes it tick.

Secondly, I've been "programming" Python for, what, about 11 days now or
so, so I want to re-implement as much as possible right now, even to the
point where I create a worse solution than an existing one, as long as
it works for me, just to be able to learn the nuances of Python, because
Python is ... different than what I'm used to.

For instance, in C# and .NET you have attributes, but they don't actually
do anything on their own. In other words, you can't tag a method and have
the operation of that method deviate from a similar method without the
attribute, unless you pick one of the attributes the compiler knows
about; it's just meta-data that sits silent until some other method
goes looking for it.

In Python I've now learned that a function is just an object like
everything else and I can wrap a new object around it to modify its
behaviour, and I can use the decorator pattern to do it.

I'm probably going to be back here in a few days or even hours with
another "task" where you can probably cough up dozens of existing source
code solutions that I could use.

For instance, there's this thing I've heard of called the "wheel".....

:)

--
Lasse Vågsæther Karlsen
http://usinglvkblog.blogspot.com/
mailto:la***@vkarlsen.no
PGP KeyID: 0x2A42A1C2
Oct 8 '05 #11
On Sat, 08 Oct 2005 15:20:12 +0200, Lasse Vågsæther Karlsen wrote:
Ok, so I thought, how about creating a decorator that caches the
function results and retrieves them from cache if possible, otherwise it
calls the function and store the value in the cache for the next invokation.

This is what I came up with so far:

def cache_function(fn):
    cache = {}
    def cached_result(*args, **kwargs):
        if args in cache:
            return cache[args]
        result = fn(*args, **kwargs)
        cache[args] = result
        return result
    return cached_result

I'm curious... where does cache live after you use cache_function to
memoize some function? It doesn't appear to be an attribute of the newly
memoized function, nor does it look like a global variable.
The solution would have to consider fibonacci(50) and fibonacci(idx =
50) as the same call and thus retrieve the second one from the cache.


In general, you probably can't do that without lots and lots of really
clever code searching through the function argument lists. If you know
that your function will always have the same keyword, then you can do
something like this:

if args in cache:
    return cache[args]
elif kwargs.keys() == ["idx"]:
    return cache[(kwargs["idx"],)]

but that's a special case and you really don't want to do it *wink*

Otherwise, just accept that you may be caching multiple copies of the same
data, and do something like this:

if args in cache:
    return cache[args]
elif tuple(kwargs.items()) in cache:
    return cache[tuple(kwargs.items())]

with the appropriate changes to the rest of the code.

You may also find that you could get a slight performance increase by
using the idiom

try:
    return cache[args]
except KeyError:
    # calculate the result and store it in the cache
    result = fn(*args, **kwargs)
    cache[args] = result
    return result

instead of testing for membership. As always, timing some test functions
is your friend...
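Steven's timing suggestion can be sketched with timeit. Which idiom wins depends on the hit rate: try/except is cheap when the key is present but pays for raising an exception on every miss (a rough benchmark, not a definitive result):

```python
import timeit

cache = {i: i * 2 for i in range(100)}

def lookup_in(key):
    # membership test first, then fetch
    if key in cache:
        return cache[key]

def lookup_try(key):
    # optimistic fetch; exception machinery only runs on a miss
    try:
        return cache[key]
    except KeyError:
        return None

hit_in = timeit.timeit(lambda: lookup_in(50), number=50000)
hit_try = timeit.timeit(lambda: lookup_try(50), number=50000)
miss_try = timeit.timeit(lambda: lookup_try(-1), number=50000)
print(hit_in, hit_try, miss_try)  # misses make the try version much slower
```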

--
Steven.

Oct 9 '05 #12
Steven D'Aprano <st***@REMOVETHIScyber.com.au> writes:
def cache_function(fn):
    cache = {}
    def cached_result(*args, **kwargs):
        if args in cache:
            return cache[args]
        result = fn(*args, **kwargs)
        cache[args] = result
        return result
    return cached_result


I'm curious... where does cache live after you use cache_function to
memoize some function? It doesn't appear to be an attribute of the newly
memoized function, nor does it look like a global variable.


It's in the closure returned by cache_function. cached_result doesn't
rebind 'cache' (it only mutates its contents), so cached_result
finds it in the outer scope, and you can get away with
that in Python. You couldn't do something like:

def counter():
    n = 0
    def k():
        n += 1  # rebinds n, so Python thinks n is local, oops
        return n
    return k

Since k changes the value of n, Python thinks n is local to k, and
you get a NameError when you try to increment it the first time.
That you can't set the value of a closure variable is something of a
Python wart and is one of the things that makes me want a "local"
declaration in Python.

Closures are a Scheme idiom and Pythonistas tend to use class
instances instead, but both techniques are useful.
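Paul's counter example has since been addressed by the `nonlocal` statement (added in Python 3); the class-instance idiom he mentions is shown alongside for comparison (a sketch):

```python
# Closure version: Python 3's `nonlocal` lets k rebind n -- the rebinding
# that raises UnboundLocalError in the Python 2 snippet above.
def counter():
    n = 0
    def k():
        nonlocal n
        n += 1
        return n
    return k

# The class-instance idiom mentioned above, for comparison.
class Counter:
    def __init__(self):
        self.n = 0
    def __call__(self):
        self.n += 1
        return self.n

c1, c2 = counter(), Counter()
print(c1(), c1(), c2())   # -> 1 2 1
```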
Oct 9 '05 #13
def cache_function(fn):
    cache = {}
    def cached_result(*args, **kwargs):
        if args in cache:
            return cache[args]
        result = fn(*args, **kwargs)
        cache[args] = result
        return result
    return cached_result

I'm curious... where does cache live after you use cache_function to
memoize some function? It doesn't appear to be an attribute of the newly
memoized function, nor does it look like a global variable.


It's part of the closure - as well as fn, btw. So it is
per-decorated-function.

Diez
Oct 9 '05 #14
On Sun, 09 Oct 2005 02:49:38 -0700, Paul Rubin wrote:
I'm curious... where does cache live after you use cache_function to
memoize some function? It doesn't appear to be an attribute of the newly
memoized function, nor does it look like a global variable.


It's in the closure returned by cache_function.


What's a closure?

Googling on define:closure isn't very useful. There are a *lot* of
different meanings for closure, none of them to do with programming.

No, wait, I tell a lie -- Wikipedia to the rescue again:

http://en.wikipedia.org/wiki/Closure_(programming)

I notice that type(some_closure) and type(ordinary_function) both return
the same result, <type 'function'>. I also notice that both the closure
and ordinary functions have an attribute "func_closure". Is there a way to
tell which is a closure and which is not? In my tests, the ordinary
function has func_closure == None, but I don't know if that will always be
the case or just for my tests.

If I wanted to inspect the value of cache, where would I find it? In other
words, if I say some_closure.SOMETHING I would get cache... what should
SOMETHING be set to?

That assumes that cache can be seen in this way. If it can't, is this a
way to create "really private" variables in Python?

--
Steven.

Oct 9 '05 #15
Steven D'Aprano wrote:
I notice that type(some_closure) and type(ordinary_function) both return
the same result, <type 'function'>. I also notice that both the closure
and ordinary functions have an attribute "func_closure". Is there a way to
tell which is a closure and which is not? In my tests, the ordinary
function has func_closure == None, but I don't know if that will always be
the case or just for my tests.

closure != function.

(did you read the wikipedia page? the closure is a combination of the function
code itself, the information needed to call it, and the context it was defined in.
in Python, the full closure includes stuff you can reach via assorted attributes
on the function object; most notably func_closure and func_globals)
If I wanted to inspect the value of cache, where would I find it?

nowhere. at least not officially; see "Rebinding names in enclosing
scopes" in the design document for a discussion:

http://www.python.org/peps/pep-0227.html
That assumes that cache can be seen in this way. If it can't, is this a
way to create "really private" variables in Python?


no, because Python doesn't prevent you from digging into the
internals:

http://aspn.activestate.com/ASPN/Coo.../Recipe/440515

</F>

Oct 9 '05 #16
On Sun, 09 Oct 2005 13:43:38 +0200, Fredrik Lundh wrote:
Steven D'Aprano wrote:
I notice that type(some_closure) and type(ordinary_function) both return
the same result, <type 'function'>. I also notice that both the closure
and ordinary functions have an attribute "func_closure". Is there a way to
tell which is a closure and which is not? In my tests, the ordinary
function has func_closure == None, but I don't know if that will always be
the case or just for my tests.


closure != function.

(did you read the wikipedia page?


Yes I did. Did you read my post?

If you create a closure, using a memoization technique as per the original
post, and then call type() on that closure, Python reports <type 'function'>.

If you use dir() on that closure-that-Python-calls-a-function, it tells
you that there is an attribute "func_closure". But ordinary functions that
aren't closures also have that attribute.

According to my tests, ordinary functions have None as the func_closure
attribute, but I don't know if that will always be the case or just for the
few examples I tested. With a sample size of three, I have no
confidence that I've found a bullet-proof test.
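For the bullet-proof test Steven is after: in CPython a function object carries closure cells exactly when its code object has free variables, so co_freevars is the reliable signal. A sketch in Python 3 spelling (where func_closure became __closure__ and func_code became __code__):

```python
def has_closure(fn):
    # A function object carries closure cells exactly when its code
    # object has free variables, so co_freevars is the reliable test
    # (fn.__closure__ -- func_closure in Python 2 -- is then non-None).
    return bool(fn.__code__.co_freevars)

def plain():
    return 0

def outer():
    x = 1
    def inner():
        return x
    return inner

print(has_closure(plain), has_closure(outer()))   # -> False True
```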
Other than that, thank you for the links.

--
Steven.

Oct 9 '05 #17
> If you create a closure, using a memoization technique as per the original
post, and then call type() on that closure, Python reports <type 'function'>.
Because it is. The closure is only sort of an extension to the locals()
available to that function. Not more, not less.
If you use dir() on that closure-that-Python-calls-a-function, it tells
you that there is an attribute "func_closure". But ordinary functions that
aren't closures also have that attribute.
Because there is no difference between them - a function _can_ have a
closure, the same way a HttpRequest _can_ have POST-data or not. That
doesn't make it different.

You are of course right that one _could_ have implemented this with
different classes. But as it happens, it isn't.

According to my tests, ordinary functions have None as the func_closure
attribute, but I don't know if that will always be the case or just for the
few examples I tested. With a sample size of three, I have no
confidence that I've found a bullet-proof test.


See this small test script for how to expose the closure's contents -- however,
I'm not familiar with the mechanics that bind "some_variable" to the
cell objects' contents. But actually I don't care...

def closure_test():
    some_variable = {}
    print "%x" % id(some_variable)
    def get(x):
        return some_variable[x]
    def set(x, v):
        some_variable[x] = v
    return get, set

g, s = closure_test()

s(10, 20)
print g(10)
print s.func_closure[0]

Diez
Oct 9 '05 #18
Steven D'Aprano wrote:
Yes I did. Did you read my post?
given that the wikipedia page says

"a closure is an abstraction representing a function, plus the
lexical environment (see static scoping) in which the function
was created."

and you're still focussing on type(function), it sure looks as if you missed
certain parts of that explanation.

let's take it again, with emphasis on some important words:

"a closure is an ABSTRACTION representing a function, PLUS the
lexical ENVIRONMENT (see static scoping) in which the function
was created."
If you create a closure, using a memoization technique as per the original
post, and then call type() on that closure, Python reports <type 'function'>.
that's because "closure" is an abstract concept. there is no "closure" object
in Python, just as there is no "program" object. function objects always contain
all the information they need about their context, and the sum of that is
what forms the closure.
If you use dir() on that closure-that-Python-calls-a-function, it tells
you that there is an attribute "func_closure". But ordinary functions that
aren't closures also have that attribute.


"ordinary functions" also have closures: the global context is also part of
the function's environment, and therefore part of the closure (in Python,
it can be argued that all attributes of the function object are part of the
closure)

</F>

Oct 9 '05 #19
Steven D'Aprano wrote:
On Sat, 08 Oct 2005 15:20:12 +0200, Lasse Vågsæther Karlsen wrote:

Ok, so I thought, how about creating a decorator that caches the
function results and retrieves them from cache if possible, otherwise it
calls the function and store the value in the cache for the next invokation.

This is what I came up with so far:

def cache_function(fn):
    cache = {}
    def cached_result(*args, **kwargs):
        if args in cache:
            return cache[args]
        result = fn(*args, **kwargs)
        cache[args] = result
        return result
    return cached_result

I'm curious... where does cache live after you use cache_function to
memoize some function? It doesn't appear to be an attribute of the newly
memoized function, nor does it look like a global variable.


If I understand this correctly...

if args in cache:

Creates a local binding to the "cache" object that gets returned with
the cached_result function, the same as...

result = fn(*args, **kwargs)

... the function 'fn' does here. So they remain local bindings to
objects in the scope they were first referenced from even after the
function is returned. In effect, 'cache' and 'fn' are replaced by the
objects they reference before the cached_result function is returned.

Is this correct?

Cheers,
Ron
Oct 9 '05 #20
On Sun, 09 Oct 2005 14:27:32 +0200, Fredrik Lundh wrote:
Steven D'Aprano wrote:
Yes I did. Did you read my post?
given that the wikipedia page says

"a closure is an abstraction representing a function, plus the
lexical environment (see static scoping) in which the function
was created."

and you're still focussing on type(function), it sure looks as if you missed
certain parts of that explanation.

let's take it again, with emphasis on some important words:

"a closure is an ABSTRACTION representing a function, PLUS the
lexical ENVIRONMENT (see static scoping) in which the function
was created."
If you create a closure, using a memoization technique as per the original
post, and then call type() on that closure, Python reports <type 'function'>.


that's because "closure" is an abstract concept.


So are functions, modules, classes, instances, ints and strings.
there is no "closure" object
in Python, just as there is no "program" object.
Clearly there is no DISTINCT closure object. If there were, I wouldn't
need to ask how one can tell them apart, because type() would just report
that one was a function and one was a closure. I don't have a problem with
that. But read on...
function objects always contain
all the information they need about their context, and the sum of that is
what forms the closure.


If what you say is true, then all functions are closures, and closure is
just a synonym for function, and there is no difference between a function
and a closure. Then why bother to create the term? Clearly, whoever
invented the term did so to distinguish the two.

At a practical, Python level, there is a difference between a function
before and after it gets made into a closure using, e.g. the original
poster's memoization technique. In Python at least, it is not true that a
function and a function-turned-into-closure is the same thing.

See, for example, this:

>>> def closefy(fn):
...     def do_nothing(*args, **kwargs):
...         return fn(*args, **kwargs)
...     return do_nothing
...
>>> def spam1():
...     return 0
...
>>> spam2 = closefy(spam1)
>>> spam1()
0
>>> spam2()
0
>>> spam2 is spam1
False
>>> spam1.func_closure is None
True
>>> spam2.func_closure
(<cell at 0xf70a2734: function object at 0xf6e0a144>,)
>>> hex(id(spam1))
'0xf6e0a144'

In one case, the "raw" function has None stored in its func_closure
attribute. In the other, it has a tuple containing at least one object of
type cell. That cell object appears to contain a reference back to the
original "raw" function.

Now, I'll tell you what I think is going on: spam1() obviously has lexical
scope, but that scope doesn't need to be saved with the function because
it is implicitly understood by Python.

The output of closefy(), on the other hand, has scope different from
the "normal" Python scope. So its output has to package up enough
information to re-create it's lexical scope in a cell object, and
that gets stored in the output's func_closure attribute.

So no, there is no "abstract" difference between a closure and a raw
function, but there is a practical difference. Now perhaps in some other
languages there is no practical difference either, but (as far as I can
see) Python is not that language.

Am I close?

Here are another two code snippets that show something of what Python is
doing:

>>> def thingo():
...     def shrubbery():
...         return 0
...     return shrubbery
...
>>> spam3 = thingo()
>>> spam3()
0
>>> spam3.func_closure is None
True

>>> def thingy():
...     n = 0
...     def shrubbery():
...         return n
...     return shrubbery
...
>>> spam4 = thingy()
>>> spam4.func_closure
(<cell at 0xf70a2b24: int object at 0x901a158>,)
--
Steven.

Oct 9 '05 #21
Ron Adam wrote:
In effect, 'cache' and 'fn' are replaced by the objects they reference
before the cached_result function is returned.


not really; accesses to "free variables" always go via special cell objects
(rather than direct object references), and the compiler uses special byte
codes for all accesses to such objects.

this snippet illustrates the differences:

def func1():
    return a

def func2(a):
    return a

def wrapper(a):
    def func3():
        return a
    return func3

def decode(func):
    import dis
    print func.__name__ + ":"
    dis.dis(func)
    print "co_freevars =", func.func_code.co_freevars
    print "func_closure =", func.func_closure
    print

decode(func1)
decode(func2)
decode(wrapper(10))

if you run this on a recent version of Python, you get something like:

func1:
2 0 LOAD_GLOBAL 0 (a)
3 RETURN_VALUE
co_freevars = ()
func_closure = None

func2:
5 0 LOAD_FAST 0 (a)
3 RETURN_VALUE
co_freevars = ()
func_closure = None

func3:
9 0 LOAD_DEREF 0 (a)
3 RETURN_VALUE
co_freevars = ('a',)
func_closure = (<cell at 0x0092C970: int object at 0x008A537C>,)

note the use of LOAD_DEREF for the variable access.

Other opcodes include STORE_DEREF (that updates a local variable
through a cell, used inside the "defining" scope), and MAKE_CLOSURE
(which sets up the function object for functions that actually uses
nested scopes; this is where the func_closure attribute is initialized).
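The STORE_DEREF path can be seen directly with Python 3's `nonlocal` (in current CPython the function object is built by MAKE_FUNCTION with a closure flag rather than a separate MAKE_CLOSURE opcode, but the cell mechanics are the same). A sketch:

```python
import dis

def make_counter():
    n = 0                 # n becomes a cell variable: it is shared with bump
    def bump():
        nonlocal n
        n += 1            # compiles to STORE_DEREF: writes through the cell
        return n
    return bump

counter = make_counter()
counter(); counter()
print(counter())          # -> 3
dis.dis(counter)          # disassembly shows LOAD_DEREF / STORE_DEREF
```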

(now, if I weren't limited to plain text, I could have drawn a nice little
diagram for you, but that's an entirely different thread...)

</F>

Oct 9 '05 #22
On Sun, 09 Oct 2005 13:43:38 +0200, Fredrik Lundh wrote:
If I wanted to inspect the value of cache, where would I find it?

nowhere. at least not officially; see "Rebinding names in enclosing
scopes" in the design document for a discussion:

http://www.python.org/peps/pep-0227.html


That's interesting to read, and does clarify a few things that weren't
straight in my head, but I notice that the author, Jeremy Hylton,
distinguishes between Python creating function definitions that are
closures from those that aren't:

"Each def or lambda expression that is executed will create a closure if
the body of the function or any contained function has free variables."

Presumably that means that if there are no free variables, no closure
is created.

That assumes that cache can be seen in this way. If it can't, is this a
way to create "really private" variables in Python?


no, because Python doesn't prevent you from digging into the
internals:

http://aspn.activestate.com/ASPN/Coo.../Recipe/440515


That's blackest voodoo, and as the warning says, could cause Python to
segfault if you get the opcodes wrong.

I think that this is close enough to "really private" to satisfy most
people. Maybe even Paul Rubin *wink*
--
Steven.

Oct 9 '05 #23
Steven D'Aprano wrote:
let's take it again, with emphasis on some important words:

"a closure is an ABSTRACTION representing a function, PLUS the
lexical ENVIRONMENT (see static scoping) in which the function
was created."
If you create a closure, using a memoization technique as per the original
post, and then call type() on that closure, Python reports <type 'function'>.
that's because "closure" is an abstract concept.


So are functions, modules, classes, instances, ints and strings.


they're hardly abstract; they all have distinct object implementations in the
Python runtime. there's an include file and a main implementation file for each
one of them, and you can access the type objects for each one of them from
Python code. any implementation of Python has to implement them. there's
no such thing for "closures"; they're the sum of a lot of different parts, all of
which can be implemented in many different ways.
If what you say is true, then all functions are closures, and closure is
just a synonym for function, and there is no difference between a function
and a closure.
having something isn't the same thing as being something.

if you replace "function" with "function object", you get a bit closer to the
truth. e.g. in this snippet

def wrapper(a):
    def func3():
        return a
    return func3

"func" and "wrapper" are functions, but if you call wrapper twice, you get
two different function objects, each with it's own distinct closure.

(to see this in action, take the snippet I posted in an earlier post and
add

f = wrapper(10); decode(f)
f = wrapper(10); decode(f)

to the end. this prints:

--------------------------------------------------------------------
func3:
9 0 LOAD_DEREF 0 (a)
3 RETURN_VALUE
4 LOAD_CONST 0 (None)
7 RETURN_VALUE
co_freevars = ('a',)
func_closure = (<cell at 0x0083EE50: int object at 0x00798250>,)

--------------------------------------------------------------------
func3:
9 0 LOAD_DEREF 0 (a)
3 RETURN_VALUE
4 LOAD_CONST 0 (None)
7 RETURN_VALUE
co_freevars = ('a',)
func_closure = (<cell at 0x0083EDB0: int object at 0x0079A960>,)

)

same function, same code, different closures.

(note that in CPython, functions only exist on the syntax level; when
you write something that you think is a function, the CPython compiler
will turn this into a code object and translate "def" to bytecode; the
actual function object isn't created until you execute those bytecodes)
At a practical, Python level, there is a difference between a function
before and after it gets made into a closure using, e.g. the original
poster's memoization technique. In Python at least, it is not true that a
function and a function-turned-into-closure is the same thing.
function or function object? all instances of the latter have a global
environment (func_globals), some also have default arguments (func_defaults),
and some have references to nested scopes (func_closure). all of them
have all these attributes; that they're set to None when empty is just a
memory optimization.
>>> spam1.func_closure is None
True
>>> spam2.func_closure
(<cell at 0xf70a2734: function object at 0xf6e0a144>,)
>>> hex(id(spam1))
'0xf6e0a144'

In one case, the "raw" function has None stored in its func_closure
attribute. In the other, it has a tuple containing at least one object of
type cell. That cell object appears to contain a reference back to the
original "raw" function.


func_closure is the name of an attribute. it contains information about
a certain part of a function object's closure (free variables in an existing
outer scope), but, despite its name, it's not *the* closure.
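The cell objects can be inspected directly; a small sketch in modern
Python, where the attribute the posts call func_closure is spelled
__closure__ and co_freevars lives on the code object:

```python
def wrapper(a):
    def func():
        return a  # a is a free variable here
    return func

f = wrapper(10)

# The attribute holds cell objects wrapping the free variables;
# it is one piece of the closure, not "the closure" itself.
print(f.__code__.co_freevars)          # -> ('a',)
print(f.__closure__[0].cell_contents)  # -> 10
```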
So no, there is no "abstract" difference between a closure and a raw
function, but there is a practical difference.


only if you're obsessed with CPython implementation details.

</F>

Oct 9 '05 #24
> Clearly there is no DISTINCT closure object. If there were, I wouldn't
need to ask how one can tell them apart, because type() would just report
that one was a function and one was a closure. I don't have a problem with
that. But read on...

function objects always contain all the information they need about
their context, and the sum of that is what forms the closure.

If what you say is true, then all functions are closures, and closure is
just a synonym for function, and there is no difference between a function
and a closure. Then why bother to create the term? Clearly, whoever
invented the term did so to distinguish the two.


You still seem not to be getting it: they aren't supposed to be
distinguished - a function is a function. But it _can_ have a closure,
which (shortly spoken) enriches its local variable lookup. Speaking in
terms of "is a" could be seen as some inheritance relation. Which
functions and functions with a closure _could_ have if one wanted to.
But that is true for other OO concepts too. See below.
At a practical, Python level, there is a difference between a function
before and after it gets made into a closure using, e.g. the original
poster's memoization technique. In Python at least, it is not true that a
function and a function-turned-into-closure is the same thing.

See, for example, this:-

<snip>

Nope. You mix things here. spam1 is a function. spam2 is actually bound
to the function-object created by do_nothing - a function that has a
closure because it is nested in closefy, and accesses variables from its
outer lexical scope - thus it needs a closure. Part of do_nothing's
closure is spam1 (or, better to say, the function reference that has
been originally bound to spam1 - spam1 is just a name). Important is
that the closure gets created at runtime, so each returned instance of
do_nothing has its own. But the original spam1 implementation is
untouched, contrary to what you state above. It still is the same
thing, and has no closure.

Consider this:

class Foo(object):
    pass

f = Foo()
g = Foo()
g.bar = "whatever"

f and g are both Foo - but one has a different attribute set. And if Foo
had some semantics that did something different based on bar, would you
also say they're supposed to have two different classes? You could in fact
do that - create one class for each incarnation of possible state. But
then the concept of classes and objects gets somewhat blurred, as each
instance would have its own class. The same is true for functions.
So no, there is no "abstract" difference between a closure and a raw
function, but there is a practical difference. Now perhaps in some other
languages there is no practical difference either, but (as far as I can
see) Python is not that language.

Am I close?


Closer :)

Regards,

Diez
Oct 9 '05 #25
Steven D'Aprano wrote:
"Each def or lambda expression that is executed will create a closure if
the body of the function or any contained function has free variables."

Presumably that means that if there are no free variables, no closure
is created.


you're quoting selectively; the word "closure" isn't used anywhere
in the design document except for a section that talks about "flat
closures", which is an implementation approach used in CPython
(as the sentence before the one you quoted explains).

I don't think it's necessarily a good idea to treat a reference to an
algorithm and an attribute on an object in a specific implementation
as the definition of a computer term...

</F>

Oct 9 '05 #26
Fredrik Lundh wrote:
Ron Adam wrote:

In effect, 'cache' and 'fn' are replaced by the objects they reference
before the cached_result function is returned.

not really; accesses to "free variables" always go via special cell objects
(rather than direct object references), and the compiler uses special byte
codes for all accesses to such objects.

this snippet illustrates the differences:

def func1():
    return a

def func2(a):
    return a

def wrapper(a):
    def func3():
        return a
    return func3

def decode(func):
    import dis
    print func.__name__ + ":"
    dis.dis(func)
    print "co_freevars =", func.func_code.co_freevars
    print "func_closure =", func.func_closure
    print

decode(func1)
decode(func2)
decode(wrapper(10))

if you run this on a recent version of Python, you get something like:

func1:
2 0 LOAD_GLOBAL 0 (a)
3 RETURN_VALUE
co_freevars = ()
func_closure = None

func2:
5 0 LOAD_FAST 0 (a)
3 RETURN_VALUE
co_freevars = ()
func_closure = None

func3:
9 0 LOAD_DEREF 0 (a)
3 RETURN_VALUE
co_freevars = ('a',)
func_closure = (<cell at 0x0092C970: int object at 0x008A537C>,)

note the use of LOAD_DEREF for the variable access.

Other opcodes include STORE_DEREF (that updates a local variable
through a cell, used inside the "defining" scope), and MAKE_CLOSURE
(which sets up the function object for functions that actually uses
nested scopes; this is where the func_closure attribute is initialized).

(now, if I weren't limited to plain text, I could have drawn a nice little
diagram for you, but that's an entirely different thread...)

</F>


Thanks Fred, That's something useful to know when working with
decorators I think. And it should answer Stevens question too.

Abstract terminology is great once you already know the things and
concepts it refers to. But if you don't, finding where the abstract
connects to reality is sometimes not easy. There's nothing better than
a good example to illustrate the concept. ;-)

Cheers,
Ron

Oct 9 '05 #27
On Sun, 09 Oct 2005 18:00:13 +0200, Fredrik Lundh wrote:
Steven D'Aprano wrote:
"Each def or lambda expression that is executed will create a closure if
the body of the function or any contained function has free variables."

Presumably that means that if there are no free variables, no closure
is created.
you're quoting selectively; the word "closure" isn't used anywhere
in the design document except for a section that talks about "flat
closures", which is an implementation approach used in CPython
(as the sentence before the one you quoted explains).


Given that when I asked about closures, you pointed me at that document
for further information, I thought maybe I could take it as authoritative.
I don't think it's necessarily a good idea to treat a reference to an
algorithm and an attribute on an object in a specific implementation as
the definition of a computer term...


Are you suggesting that Jeremy Hylton got it wrong? That's fine if he did,
but it would help if you said so rather than beating around the bush.

Is it correct to say that Python creates closures *ever*?

Is it correct to say that Python creates closures when a def or lambda
statement is executed?

Is it correct to say that Python *always* creates a closure whenever a def
or lambda is executed? Or is it only for *certain* defs/lambdas?

Is it correct to talk about functions *having* closures, or that they
*are* closures, or that the function is *part of* a closure?

If all functions are closures, what does it mean to say that some
languages (e.g. Java, C++, C#) don't have closures but merely simulate
them? Are there any languages that don't have closures?
Thanks for your patience, I'm sure I'll get there eventually.


--
Steven.

Oct 9 '05 #28
On Sun, 09 Oct 2005 17:55:16 +0200, Diez B. Roggisch wrote:
If what you say is true, then all functions are closures, and closure is
just a synonym for function, and there is no difference between a function
and a closure. Then why bother to create the term? Clearly, whoever
invented the term did so to distinguish the two.
You still seem not to be getting it: they aren't supposed to be
distinguished - a function is a function. But it _can_ have a closure,
which (shortly spoken) enriches its local variable lookup.


[penny drops] Now we're getting somewhere... a closure is something
_added_ to a function. So we should talk about functions-without-closures
and functions-with-closures.
Speaking in
terms of "is a" could be seen as some inheritance relation.
[penny rises again] Dammit, just when I thought I got it. So a closure is
like a subclass of function? Functions are a higher level classification,
like mammals. There are mammals that are primates (functions that _are_
closures) and mammals that aren't (functions that _aren't_ closures).

Which
functions and functions with a closure _could_ have if one wanted to.
But that is true for other OO-concepts two. See below.
At a practical, Python level, there is a difference between a function
before and after it gets made into a closure using, e.g. the original
poster's memoization technique. In Python at least, it is not true that a
function and a function-turned-into-closure is the same thing.

See, for example, this:-

<snip>

Nope. You mix things here. spam1 is a function.


Yes.
spam2 is actually bound
to the function-object created by do_nothing -
Yes.
a function that has a closure because it is nested in closefy, and
accesses variables from its outer lexical scope - thus i needs a
closure.
Right! So closures are something which functions _have_ -- which is why
Python functions have an attribute called func_closure, which is empty in
functions without closures, and contain a tuple of cells in those with
them.

Now somebody is going to tell me this is an implementation detail and it
isn't true in the general language independent case...

Part of do_nothing's closure is spam1 (or, better to say, the
function reference that has been originally bound to spam1 - spam1 is
just a name).
Yes, I can see that.
Important is that the closure gets created at runtime, so
each returned instance of do_nothing has its own. But the original spam1
implementation is untouched contrary to what you state above. It still
is teh same thing, and has no closure.

Consider this:

class Foo(object):
    pass

f = Foo()
g = Foo()
g.bar = "whatever"

f and g are both Foo
Now you're trying to trick me... they are both *instances* of Foo, not
Foo. Foo is a class, and f and g are not classes, let alone the same class.
- but one has a different attribute set. And if Foo
had some semantics that did something different based on bar, would you
also say they're supposed have two different classes? You could in fact
do that - create one class for each incarnation of possible state. But
then the concept of classes and objects gets somewhat blurred, as each
instance would have its own class. The same is true for functions.


Ah, but in languages like Pascal that can take functions as arguments, but
can't return functions as output, *all* functions have to be created
separately. To take your analogy and run with it, Pascal would be like a
language that didn't allow f and g to have different attributes unless
they belonged to different classes.

Python is not like that, which is why you can write a function to return
functions (a factory function?). If the output function needs access to
the namespace of the factory function, Python adds a closure to that
output function, giving it access to the objects in that namespace.
--
Steven.

Oct 9 '05 #29
On Sun, 09 Oct 2005 17:39:23 +0200, Fredrik Lundh wrote:
only if you're obsessed with CPython implementation details.


No. I'm obsessed with finding out what closures are, since nobody seems to
have a good definition of them!

However, I have learnt some things: closures are something which functions
HAVE, not ARE. The func_closure attribute is just part of the
implementation of the closure. Some languages have closures and some
don't. And a closure is something that lets a function object access the
lexical scope that existed when the function object was created, e.g. the
namespace of the function which created it.

Am I getting there now?
--
Steven
who is very grateful for the time folks have taken to explain this.

Oct 9 '05 #30
>>>>> Steven D'Aprano <st***@REMOVETHIScyber.com.au> (SD) wrote:
SD> [penny drops] Now we're getting somewhere... a closure is something
SD> _added_ to a function. So we should talk about functions-without-closures
SD> and functions-with-closures.


Well, I have never heard that definition. For more than half of my life I
only have heard the definition that a closure is the combination of a
function (i.e. its code) plus an environment containing the unbound
variables. Depending on the language the variables may be bound to values,
memory locations or some other things.
--
Piet van Oostrum <pi**@cs.uu.nl>
URL: http://www.cs.uu.nl/~piet [PGP 8DAE142BE17999C4]
Private email: pi**@vanoostrum.org
Oct 9 '05 #31
Steven D'Aprano <st***@REMOVETHIScyber.com.au> writes:
No. I'm obsessed with finding out what closures are, since nobody seems to
have a good definition of them!
Why don't you read SICP:

http://mitpress.mit.edu/sicp/full-text/book/book.html

You will be a much wiser person for having done so.
However, I have learnt some things: closures are something which
functions HAVE, not ARE.
Nah, that describes the data layout through which CPython implements
closures, not the bigger idea of what closures are.
The func_closure attribute is just part of the implementation of the
closure. Some languages have closures and some don't. And a closure
is something that lets a function object access the lexical scope
that existed when the function object was created, e.g. the
namespace of the function which created it.


I'd say, the closure is the function's executable code, plus the
lexical environment that got recorded at the time the function was
created.
Oct 9 '05 #32
Steven D'Aprano <st***@REMOVETHIScyber.com.au> writes:
Is it correct to say that Python *always* creates a closure whenever a def
or lambda is executed? Or is it only for *certain* defs/lambdas?
The word closure is being used two different ways. First of all the
computer-science term "closure" which means a combination of a
function and its lexical environment. Second of all the data field
named "closure" which lives inside certain function objects in Python.
The second one is created only in certain Python functions but that's
an implementation detail. It was added because in the old days,
Python didn't support closures at all, so implementing them was done
straightforwardly by adding a field that contains the closure
environment, for Python functions that needed it.

So the answer to your first question above is "always" in the CS sense
and "only for certain defs/lambdas" in the implementation sense.
Is it correct to talk about functions *having* closures, or that they
*are* closures, or that the function is *part of* a closure?
The function is part of the closure. I think you're a little confused
by implementation issues. In Python, functions are first-class
objects, like class instances or strings. If you say
x = 'foo' + 'bar'
a new string is created out of nowhere. If you say
def f(x): return x+3
a new function is created out of nowhere. Lambda notation maybe
shows more clearly that functions are values:
f = (lambda x: x+3)
is the same as the "def f" above.

Let's go back a step. Do you understand what a free variable is?
Let's say:
def f():
    print x
f refers to x but x is not defined in f. We say x is "free" in f.

In the old days, this could only mean one thing: x was a global
variable. Python didn't have nested scopes. If a variable wasn't
found in a function's locals, it could only come from the globals
that are shared across all functions.

Python now has nested scopes. That means x doesn't have to come
from the globals, it can be a local from some surrounding function:
def g(n):
    x = n  # x is local to g
    def f():
        print x
    f()
When you run g(3), Python prints 3. That's not too confusing, right?

Suppose instead of calling f, you return it. You can do that in Python:
def g(n):
    x = n  # x is local to g
    def f():
        print x
    return f

h3 = g(3) # h3 is the function g creates when you run it

What will happen when you call h3()? It prints 3. How did it know to
do that? Because h3 doesn't just contain executable bytecode, it also
contains f's lexical environment from when g ran. (f's lexical
environment means the values of f's free variables from outer scopes
at the time f was created). Now suppose you say:

h5 = g(5) # make a new function

See, h3 and h5 are two different functions:

h3() # still prints 3
h5() # prints 5

They contain the same executable code, but different lexical environments.

The combination of executable code plus lexical environment is called a
closure (CS term). In Python, every function is a (CS) closure. The
environment lives in a field called func_closure or something like that.

However, when there are no free variables, Python doesn't bother
creating that field. That was ALWAYS the case in old versions of
Python. Python didn't recognize nested scopes; there were only locals
and globals.
If all functions are closures, what does it mean to say that some
languages (e.g. Java, C++, C#) don't have closures but merely
simulate them? Are there any languages that don't have closures?


Those languages listed there don't have closures. They don't even
have first-class functions. Imagine what the h3 and h5 examples from
above would do in those languages.

To be fair, there's an argument that closures are un-Pythonic: you can
use class instances instead. You could say:

class G:
    def __init__(self, n): self.x = n
    def __call__(self): print self.x
h3 = G(3)
h5 = G(5)

and h3() prints 3 and h5() prints 5, just like the corresponding
closures. But now you've polluted the name space with this extra
class G, and (depending on the situation) maybe you've written more
cumbersome code. It's partly a matter of taste and preferences. To
Lisp and Scheme users, closures are very natural. To OOP programmers,
maybe they're confusing and unnecessary.
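Applied back to the thread's starting point: the memoizing decorator can
be written either way. A class-based sketch of the same cache idea, in
Python 3 syntax (the class name is illustrative; it ignores keyword
arguments just like the original posting):

```python
class CachedFunction:
    """Instance attributes replace the closure's cache variable."""
    def __init__(self, fn):
        self.fn = fn
        self.cache = {}
    def __call__(self, *args):
        if args not in self.cache:
            self.cache[args] = self.fn(*args)
        return self.cache[args]

@CachedFunction
def fibonacci(n):
    return 1 if n in (1, 2) else fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(50))  # -> 12586269025, near-instant with the cache
```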
Oct 9 '05 #33
>
[penny drops] Now we're getting somewhere... a closure is something
_added_ to a function. So we should talk about functions-without-closures
and functions-with-closures.
Yes.

Speaking in
terms of "is a" could be seen as some inheritance relation.

[penny rises again] Dammit, just when I thought I got it. So a closure is
like a subclass of function? Functions are a higher level classification,
like mammals. There are mammals that are primates (functions that _are_
closures) and mammals that aren't (functions that _aren't_ closures).


Hrmpf. No, I wanted to say the exact opposite. My bad - just ignore that
paragraph of mine in the earlier post, ok?

Right! So closures are something which functions _have_ -- which is why
Python functions have an attribute called func_closure, which is empty in
functions without closures, and contain a tuple of cells in those with
them.
Yes.

Now somebody is going to tell me this is an implementation detail and it
isn't true in the general language independent case...
Paul Rubin did so, and I guess it is, although I don't know for sure.
Now you're trying to trick me... they are both *instances* of Foo, not
Foo. Foo is a class, and f and g are not classes, let alone the same class.
As two functions are instances of the "class of functions". I just
wanted to point out that two instances of one class can have different
state. The analogy isn't perfect, as this special state "closure" isn't
available easily for you to alter, and gets created behind the scenes.
But as each function with a closure is created on the fly - same code,
different closure data - it sort of fitted, I hoped.

- but one has a different attribute set. And if Foo
had some semantics that did something different based on bar, would you
also say they're supposed to have two different classes? You could in fact
do that - create one class for each incarnation of possible state. But
then the concept of classes and objects gets somewhat blurred, as each
instance would have its own class. The same is true for functions.

Ah, but in languages like Pascal that can take functions as arguments, but
can't return functions as output, *all* functions have to be created
separately. To take your analogy and run with it, Pascal would be like a
language that didn't allow f and g to have different attributes unless
they belonged to different classes.

Python is not like that, which is why you can write a function to return
functions (a factory function?). If the output function needs access to
the namespace of the factory function, Python adds a closure to that
output function, giving it access to the objects in that namespace.


Hm, no. You can return functions in other languages, e.g. C. Don't know
Pascal well enough. But they will always be the same without any state
associated to them, whereas in Python you have some state. The same way
as bound methods know about their first arguments, functions with
closures know about - well, their closure :)

A closure is really only an environment - implemented as cells in
Python - that adds to the local variables when the function is
executed, and is dynamically created when a function with a closure
is defined. This works because of the interpreted nature of Python.
Regards,

Diez
Oct 9 '05 #34
On Mon, 10 Oct 2005, Steven D'Aprano wrote:
On Sun, 09 Oct 2005 17:39:23 +0200, Fredrik Lundh wrote:
only if you're obsessed with CPython implementation details.


No. I'm obsessed with finding out what closures are, since nobody seems
to have a good definition of them!


On the contrary - the problem is that several people have good but
incompatible definitions of them!

I think you pretty much understand the mechanics of what's going on; i've
spent god knows how long trying to write a clear but accurate definition
of closures, but i can't, so i'm just going to say that (a) closures are
functions, and (b) the things in func_closure are not closures - they're
the variables over which a closure (the function you're inspecting) is
closed; this is just sloppy terminology on the part of python's
implementors.

Okay, a crack at a definition: a closure is a function in which some of
the variable names refer to variables outside the function. And i don't
mean global variables - i mean somebody else's locals; call them 'remote
variables'. The 'somebody else' who owns those locals is a function whose
scope encloses the definition of the closure function. A crucial point is
that these names keep working even after the scope where they started out
dies - when a closure escapes the 'mother scope' that houses its remote
variables, those variables effectively transcend, becoming, well, i don't
know. Nonlocal variables or something.
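A minimal sketch of that "transcending" behaviour: the mother scope's
local keeps working after the mother scope itself is gone.

```python
def mother():
    secret = "still here"
    def child():
        return secret
    return child

escaped = mother()  # mother's frame is finished by now...
print(escaped())    # -> still here   ...but secret survives
```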

For some reason, this makes me think of Stephen Baxter novels. And a
little of H. P. Lovecraft. I guess the point is that you shouldn't delve
too deeply into the mysteries of functional programming if you wish to
retain your humanity.

tom

--
Orange paint menace
Oct 10 '05 #35
Tom Anderson <tw**@urchin.earth.li> writes:
Okay, a crack at a definition: a closure is a function in which some
of the variable names refer to variables outside the function.


That's misleading, I'd say a closure is a combination of a function
(executable code) and a lexical environment (the values of the
function's free variables, as taken from surrounding scopes at the
time the function was created). For example:
def f(n):
    def g(): return n
    return g
h1 = f(1)
h2 = f(2)

h1 and h2 are two different closures. They have the same executable
code but their environments are different. In h1, n=1, but in h2, n=2.
So, h1() will return 1 but h2() will return 2. Is there really anything
confusing about this? All that's happened is that when you call f, f
allocates a memory slot for n. g makes a reference to the slot and
then f returns. Since the reference to the slot still exists, the slot
doesn't get GC'd. When you call f again, it allocates a new slot.
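The slot is genuinely shared rather than copied; a sketch (Python 3,
using nonlocal) where two closures reference the same slot:

```python
def make_pair():
    value = 0
    def get():
        return value
    def bump():
        nonlocal value
        value += 1
    return get, bump

get, bump = make_pair()
bump()
bump()
print(get())  # -> 2: both functions see the same slot

# In CPython they literally share one cell object:
print(get.__closure__[0] is bump.__closure__[0])  # -> True
```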

This is all described in SICP (mitpress.mit.edu/sicp).

Oct 11 '05 #36
>>>>> Paul Rubin <http://ph****@NOSPAM.invalid> (PR) wrote:

PR> Tom Anderson <tw**@urchin.earth.li> writes:
PR> That's misleading, I'd say a closure is a combination of a function
PR> (executable code) and a lexical environment [snip]
PR> This is all described in SICP (mitpress.mit.edu/sicp).


Where the word closure is used for something completely different (it
mentions that others use the word closure as we do here).
:=(
--
Piet van Oostrum <pi**@cs.uu.nl>
URL: http://www.cs.uu.nl/~piet [PGP 8DAE142BE17999C4]
Private email: pi**@vanoostrum.org
Oct 11 '05 #37
Tom Anderson wrote:
Okay, a crack at a definition: a closure is a function in which some of
the variable names refer to variables outside the function. And i don't
mean global variables - i mean somebody else's locals; call them 'remote
variables'.


in Python, the global variables are someone else's locals.

</F>

Oct 11 '05 #38
On Tue, 10 Oct 2005, it was written:
Tom Anderson <tw**@urchin.earth.li> writes:
Okay, a crack at a definition: a closure is a function in which some of
the variable names refer to variables outside the function.
That's misleading,


You're right. I don't think it's wrong, but it depends on the reader
knowing what i mean by 'variable'. I didn't make it clear that variables
live in an *invocation* of a function, not a definition. IOW, in this
code:

def foo():
    x = 1
foo()
foo()

There are two occurrances of assignment, and they are to two *different*
variables.
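A sketch making that concrete - each invocation of foo creates a brand-new
x, so closures from different calls are closed over different variables:

```python
def foo():
    x = []  # a fresh x for every invocation
    def add(item):
        x.append(item)
        return x
    return add

a = foo()
b = foo()
a(1)
print(a(2))  # -> [1, 2]
print(b(9))  # -> [9]   b closed over a different x
```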

Hmm. Maybe i *am* wrong - looking at your example:
For example:
def f(n):
    def g(): return n
    return g
h1 = f(1)
h2 = f(2)
h1 and h2 are closures, while g is "a function in which some of the
variable names refer to variables outside the function", but is not a
closure.

So a closure is something that is created when a function is taken out of
its lexical environment (through being returned, or passed downwards, or
stored on the heap); when you take it out, it retains a connection to that
environment. What we need now is a good metaphor for this ...
I'd say a closure is a combination of a function (executable code) and a
lexical environment (the values of the function's free variables, as
taken from surrounding scopes at the time the function was created).
I was trying to avoid the word 'lexical', since a lot of people aren't too
clear on what that means.
h1 and h2 are two different closures. They have the same executable
code but their environments are different. In h1, n=1, but in h2, n=2.
So, h1() will return 1 but h2() will return 2. Is there really anything
confusing about this? All that's happened is that when you call f, f
allocates a memory slot for n. g makes a reference to the slot and
then f returns. Since the reference to the slot still exists, the slot
doesn't get GC'd. When you call f again, it allocates a new slot.
That's quite a good way of explaining it. The thing about closures is that
they're really obvious when you actually write them, but rather tricky to
define in words. So, perhaps the best explanation is to walk the reader
through an example.
This is all described in SICP (mitpress.mit.edu/sicp).


A lot of things are described in SICP. ISTM that someone should not have
to read the whole of SICP to understand what closures are.

tom

--
That's no moon!
Oct 13 '05 #39
