473,387 Members | 1,766 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,387 software developers and data experts.

Memoizing Generators

Hello all,

Peter Norvig has a great recipe for memoizing Python functions available at
http://www.norvig.com/python-iaq.html . I made good use of this until I
tried to memoize a generator. At this point, it breaks down because the
_generator_ object is stored as the value for the *args key.

So, I came up with this:

class MemoizeGenerator:
def __init__(self, fn):
self.cache={}
self.fn=fn
def __call__(self,*args):
if self.cache.has_key(args):
for _i in self.cache[args]:
yield _i
else:
self.cache[args]=()
for _i in self.fn(*args):
self.cache[args]+=(_i,)
yield _i

Basically, if the generator hasn't been called before with these args, it
will run the generator's iterator through and append a tuple in the
dictionary value field for these keys. If the args have been used before,
then it iterates through the previously computed list. It works like a
charm for me.

Now, the two remaining issues are 1) can we "unify" the generator
memoization with the "standard" memoization and 2) can we deal with lists
as well (i.e. my generator was returning tuples ... since I'm scared of
lists in recursive generator contexts *wink*). One other memoization
issues: what if the args (which will become keys) are mutables ... in
particular, lists. I suppose we could just tuple() them up?

Regards,
Mark
Jul 18 '05 #1
2 2465
> Now, the two remaining issues are 1) can we "unify" the generator
memoization with the "standard" memoization and 2) can we deal with lists
as well (i.e. my generator was returning tuples ... since I'm scared of
lists in recursive generator contexts *wink*).
One other memoization issues: what if the args (which will become
keys) are mutables ... in particular, lists. I suppose we could just
tuple() them up?


How would you deal with the following generator? (On the assumption you want a
a general purpose memoisation mechanism :)

def retrieveData(source=None, command=True,blocksize=1024):
if source:
if not command:
fh = open(source, "rb",0)
else:
fh = os.popen(self.command)
else:
fh = sys.stdin
try:
try:
while 1:
data = fh.read(blocksize)
if data:
yield data
else:
raise Finality
except Exception, e:
if source:
fh.close()
raise e
except Finality:
pass

If your source is a network connection, or a real user this becomes further
indeterminate... Not a suggestion to not try, just worth considering :)

The other problem as I see it with your implementation is it expects the
generator to terminate. This is far from guaranteed - generators are useful
for dealing with infinite sequences/lists and working with functions that
might not halt. (eg calculate the best approximation of pi that you can based
on time available, not based on number of iterations/memory available)

eg
def runGenerator(fn,args,timeAlloted):
tstart = time.time()
gen = fn(args).next
r = gen()
while 1:
if time.time()-tstart >=timeAlloted:
break
r = gen()
return r

Consider the possibility that the function is a "next move" in chess and the
time is set by the first player to "hard" - ie lots of time. The next time a
player comes along with the time set to "easy", the memoised version won't be
aware of this in this scenario and returns the hard result (but very
quickly...), or worse hits stop iteration - which the original version
wouldn't. Or even worse - occasionally hits the StopIteration and normally
returns the "hard" result aroung 80-90% of the time.

That said memoising _good_ chess results from certain positions _might_ be
considered a good thing. It's still an issue though.

Regards,
Michael.
Jul 18 '05 #2
Hi Michael,
Michael Sparks wrote:
How would you deal with the following generator? (On the assumption you
want a a general purpose memoisation mechanism :) <*snip* a generator reading from an input source> If your source is a network connection, or a real user this becomes
further indeterminate... Not a suggestion to not try, just worth
considering :)
I think one of the standard assumptions with memoization is that you have a
deterministic algorithm ... specifically, given an input, the output is
fixed. So, your example of entering a file and getting different data,
throws typical (in my opinion ... which is by no means sacred) memoization
out the window.

def runGenerator(fn,args,timeAlloted):
tstart = time.time()
gen = fn(args).next
r = gen()
while 1:
if time.time()-tstart >=timeAlloted:
break
r = gen()
return r
Now, this is a problem I like. It would be pretty damn awsome to memoize a
function with 1) a time constraint and 2) a parameter that would indicate
when a time constraint is no longer sufficient and thus more work needs to
be done. Finally, if the generator could _restart_ at the point of work
already done, you would be saving the work done to that point which is
memoization's purpose:

Something like:

runGenerator(PIdigits, None, 10)

and then later

runGenerator(PIdigits, None, 20)

Then our memoizer would say, well, I can get more digits in the extra 10
seconds .... so, I need to recompute this answer. But wait, hopefully,
I've stored both the numerical result for 10 seconds along with the
generator that got me there. So, all I need to do is look up the 10 second
generator and run it for another 10 seconds. Humm, can we copy generators?
I don't seem to think so.

So, the memoizer class would need some more logic and storage to determine
whether to return a stored result, computer a brand new result, or
bootstrap off a previous generator. There would also need to be a
mechanism for determining just how long is "long enough" that more time
needs to be allocated (a really hard problem might not get any further in
12 seconds than it does in 9 seconds). And we'd need a copy of the
generator so we could go from, say 10 second generator to 20 second, but
later go from 10 to 15 seconds. Although, we could get around the problem
by using the maximally accurate result computed so far (presuming we can
look it up in the user specified time interval). Then we'd only need the
generator at the farther point ... and we could just keep running it when
the user gives us more time. Yeah, that could work nicely!

Consider the possibility that the function is a "next move" in chess and
the time is set by the first player to "hard" - ie lots of time. The next
time a player comes along with the time set to "easy", the memoised
version won't be aware of this in this scenario and returns the hard
We could paramaterize the generator (and/or the memoizer) based on the
difficulty setting .... couldn't we?
result (but very quickly...), or worse hits stop iteration - which the
This is another interesting problem .... the "quick result" is actually a
means of making the game harder.
Regards,
Michael.


Great ideas. I might code up the timing based memoizer for kicks.

Regards,
Mark
Jul 18 '05 #3

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

23
by: Francis Avila | last post by:
Below is an implementation a 'flattening' recursive generator (take a nested iterator and remove all its nesting). Is this possibly general and useful enough to be included in itertools? (I know...
9
by: Francis Avila | last post by:
A little annoyed one day that I couldn't use the statefulness of generators as "resumable functions", I came across Hettinger's PEP 288 (http://www.python.org/peps/pep-0288.html, still listed as...
8
by: Timothy Fitz | last post by:
It seems to me that in python, generators are not truly coroutines. I do not understand why. What I see is that generators are used almost exclusively for generation of lists just-in-time. Side...
3
by: Carlos Ribeiro | last post by:
As a side track of my latest investigations, I began to rely heavily on generators for some stuff where I would previsouly use a more conventional approach. Whenever I need to process a list, I'm...
5
by: Robert Oschler | last post by:
Preamble: - I know this is the Python forum - I know about (and have used) Jython I already posted this question in comp.lang.java. But after a week I have still not received a single reply....
3
by: Michael Sparks | last post by:
Hi, I'm posting a link to this since I hope it's of interest to people here :) I've written up the talk I gave at ACCU Python UK on the Kamaelia Framework, and it's been published as a BBC...
6
by: Talin | last post by:
I've been using generators to implement backtracking search for a while now. Unfortunately, my code is large and complex enough (doing unification on math expressions) that its hard to post a...
6
by: Daishi Harada | last post by:
Hi, I'm trying to find/write a memoizing decorator that works both for functions and methods. I've been looking at the following two samples: ...
13
by: Martin Sand Christensen | last post by:
Hi! First a bit of context. Yesterday I spent a lot of time debugging the following method in a rather slim database abstraction layer we've developed: ,---- | def selectColumn(self,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.