Hello all,
I've been messing about for fun creating a trial division factorizing
function and I'm naturally interested in optimising it as much as
possible.
I've been told that iteration in python is generally more
timeefficient than recursion. Is that true?
Here is my algorithm as it stands. Any suggestions appreciated!
from math import *
def factorize(x):
""" Return factors of x """
factors = []
lim = int(sqrt(x))
y = divmod(x,2)
if y[1] == 0: # x is even
factors = factors + [2] + factorize(y[0])
else: # x is odd
i = 3
while i <= lim:
y = divmod(x,i)
if y[1] == 0:
factors = factors + [i] + factorize(y[0])
i = lim+1
else:
i += 2
if factors == []: # necessary step for correct recursion
factors = [x]
return factors 18 3557
MTD wrote: Hello all,
(snip)
I've been told that iteration in python is generally more timeefficient than recursion. Is that true?
(snip)
AFAIK, in most languages it's a memory thing. Each time a function
calls itself, the 'state' of that function has to be stored somewhere
so that it may continue, as was, when the recursive function returns.
Therefore, you can effectively think of it as creating N many objects
which don't start getting released until the very last nested call
returns. This (depending on the stack size and implementation etc...)
may force several memory allocations. Then of course, as it starts
going back 'upwards' (towards the initiator of the recursive call that
is), the garbage collector may kick in freeing memory...
Depending on return values, iterating will just require space for the
returned value from each function in term (which in most cases  I
would imagine fits on the stack), so although it's doing effectively
the same thing, it's doing so with less memory.
I probably haven't explained too well, and I may not even be right. So
take with a pinch of salt.
All the best,
Jon.
In article <11**********************@h76g2000cwa.googlegroups .com>,
"Jon Clements" <jo****@googlemail.com> writes:
> MTD wrote:
>
> > I've been told that iteration in python is generally more
> > timeefficient than recursion. Is that true?
>
> AFAIK, in most languages it's a memory thing. Each time a function
> calls itself, the 'state' of that function has to be stored somewhere
> so that it may continue, as was, when the recursive function returns.
Well, vaguely. That is probably the case in Python, but it isn't always
true, and is not the only issue.
Tail recursion removal can often eliminate the memory drain, but the
code has to be written so that will work  and I don't know offhand
whether Python does it.
In compiled "3rd GL" languages, there are a lot of optimisation issues
where iteration will often produce better code than recursion, but few
(if any) are relevant to Python.
Regards,
Nick Maclaren.
Nick Maclaren wrote:
(snip) Tail recursion removal can often eliminate the memory drain, but the code has to be written so that will work  and I don't know offhand whether Python does it.
It doesn't. Technical possible, but BDFL's decision...

bruno desthuilliers
python c "print '@'.join(['.'.join([w[::1] for w in p.split('.')]) for
p in 'o****@xiludom.gro'.split('@')])"
Bruno, It doesn't. Technical possible, but BDFL's decision...
Sure. But why bother?
Anything that can be done with recursion can be done with iteration.
Turng proved that in 1936.
Recursion was just an attempt to "unify" design approach by abstracting
itteration and creating a new context. It allowed the programmer to
isolate himself from the reality that he was actually iterating. Talk
about mind fuck.
It seems things were just to simple the way they were.
Like all fashion, this too shall pass.
Sudden Disruption

Sudden View...
the radical option for editing text http://www.sudden.net/ http://suddendisruption.blogspot.com
Sudden Disruption wrote: Bruno,
It doesn't. Technical possible, but BDFL's decision... Sure. But why bother?
I agree.
Anything that can be done with recursion can be done with iteration. Turng proved that in 1936.
Recursion was just an attempt to "unify" design approach by abstracting itteration and creating a new context. It allowed the programmer to isolate himself from the reality that he was actually iterating. Talk about mind fuck.
Well, unless I'm seriously mistaken, it also breaks good design. If a
function calls another function, it's because it requires that
function's specific service. If the service it requires is itself, then
the function should iterate over a set of data and accumulate/reduce
or whatever else it needs to do. As well as that, I can imagine
exception handling becoming quite cumbersome/clumsy.
It seems things were just to simple the way they were.
Like all fashion, this too shall pass.
Be great if it does; but I don't imagine this will happen until
examples of traversing a binary tree using recursion disappear from
computer science text books (the ones I have seen anyway...). Unless,
later in the course (they might do this, I don't know for sure), they
then say, "BTW people, this is the correct way to do it, because the
previous way isn't too good an idea...".
Sudden Disruption  Sudden View... the radical option for editing text http://www.sudden.net/ http://suddendisruption.blogspot.com
Just my little rant,
Jon.
Sudden Disruption a écrit : Bruno,
It doesn't. Technical possible, but BDFL's decision...
Sure. But why bother?
Because I do like recursion, and would personnally prefer tailrecursion
optimisation over nice tracebacks. But I'm not in position to decide
anything here.
Anything that can be done with recursion can be done with iteration. Turng proved that in 1936.
Yes. And everything done with Python can be done with assembly language
too.
Recursion was just an attempt to "unify" design approach by abstracting itteration and creating a new context. It allowed the programmer to isolate himself from the reality that he was actually iterating. Talk about mind fuck.
Recursion is the most convenient way to express some common algorithms.
Too bad for you if it does some nasty things to your mind.
Kay Schluehr wrote: Nick Maclaren wrote:
Tail recursion removal can often eliminate the memory drain, but the code has to be written so that will work  and I don't know offhand whether Python does it.
http://aspn.activestate.com/ASPN/Coo.../Recipe/496691
Regards, Kay
Interesting.....
Thanks Kay.
Jon.
In article <44**********************@news.free.fr>,
Bruno Desthuilliers <bd*****************@free.quelquepart.fr> writes:
> Sudden Disruption a écrit :
>
> Because I do like recursion, and would personnally prefer tailrecursion
> optimisation over nice tracebacks. But I'm not in position to decide
> anything here.
[ See below before you jump to conclusions :) ]
If you have ever had to unpick a really nasty problem where tail recursion
removal has destroyed all evidence of how the program got there AND the
program was written so that it plain did not work without it (memory
exhaustion), you will have cursed the concept to hell and back again.
Been there  done that :(
> > Recursion was just an attempt to "unify" design approach by abstracting
> > itteration and creating a new context. It allowed the programmer to
> > isolate himself from the reality that he was actually iterating. Talk
> > about mind fuck.
As someone who was in this area when the Algol versus Fortran wars were
being fought, that is almost entirely incorrect. Recursion plus tail
removal AS A SUBSTITUTE FOR ITERATION is what you are describing, but
that came a decade after recursion became widespread in programming
languages.
> Recursion is the most convenient way to express some common algorithms.
> Too bad for you if it does some nasty things to your mind.
Agreed. Recursion should be used when it is the right technology to
clarify the code, and not as a gimmicky, obfuscatory and often dogmatic
substitute for iteration! There are algorithms that become almost
incomprehensible without recursion, and I have implemented a recursion
layer in both assembler AND Fortran just to enable me to write them
without going bonkers.
Regards,
Nick Maclaren.
[MTD] I've been messing about for fun creating a trial division factorizing function and I'm naturally interested in optimising it as much as possible.
I've been told that iteration in python is generally more timeefficient than recursion. Is that true?
Since you heard it from me to begin with, I suppose saying "yes" again
won't really help ;) You can use time.time() or time.clock(), or the
`timeit` module, to measure speed. Try timing a deadsimple factorial
functions both ways.
BTW, I've worked on Python's implementation for close to 15 years now,
so when I say something about Python, it's _usually_ safe to just
believe it :) Questioning is fine, but in cases like this where it's
easy to find out for yourself, I probably won't make time to respond.
Here is my algorithm as it stands. Any suggestions appreciated!
from math import *
def factorize(x): """ Return factors of x """ factors = [] lim = int(sqrt(x))
Here you're limited to integers for which a floating sqrt is accurate.
That's 53 bits on most platforms. For integers larger than that, the
code may produce incorrect results in mysterious ways at times
(because the square root may be inaccurate).
y = divmod(x,2)
Most people would use "unpacking assignment" here instead, like
q, r = divmod(x, 2)
Then later code gets to use meaningful names instead of messing with
mysterious little integer subscripts. It's also quicker to use local
names than to build subscripting expressions.
if y[1] == 0: # x is even factors = factors + [2] + factorize(y[0])
Since `factors` upon entry to this block must be [], it's kinda
contorted. More obvious (if you _have_ to use recursion) would be:
return [2] + factorize(y[0])
and then unindent the rest of the code a level.
else: # x is odd i = 3 while i <= lim: y = divmod(x,i)
Again more commonly written:
q, r = divmod(x, i)
if y[1] == 0: factors = factors + [i] + factorize(y[0])
It hardly matters in this context, but appending to a list is
_generally_ much faster than building a brand new list from scratch.
At a higher level, your recursions always start from 2 again; e.g., if
i happens to be 434349 here, the factorize(y[0]) call starts over from
2, despite that you already know that no divisor less than 434349 can
succeed. To do this _efficiently_ with recursion requires passing
more knowledge across recursion levels.
i = lim+1 else: i += 2
if factors == []: # necessary step for correct recursion factors = [x]
return factors
Recursion is pretty bizarre here. Not _wrong_, just "pretty bizarre",
and is leading you into higherlevel inefficiences. There are all
sorts of ways to repair that, but I'm not going to talk about them
because it's easy to write an iterative algorithm instead.
Here's one way to do that, avoiding float sqrt (this works correctly
for integers of any size, although trial division is hopelessly
_inefficient_ for finding "large" factors), and skipping multiples of
3 in the inner loop too:
def trial(n):
if n <= 1:
return [n]
factors = []
for tiny in 2, 3:
while n % tiny == 0:
factors.append(tiny)
n //= tiny
delta = 2
d = 5
while 1:
q, r = divmod(n, d)
if r:
if q < d:
break
d += delta
delta = 6  delta
else:
factors.append(d)
n = q
if n > 1:
factors.append(n)
return factors
If we call the original input N, the key invariants holding at the top
of the loop are:
1. N >= 2, and N == n * the product of the integers in `factors`
2. n is not divisible by any prime < d
3. all elements in `factors` are prime
It may take a bit of thought to see that d marches over all integers= 5 that aren't divisible by either 2 or 3: 5, 7, 11, 13, 17, 19,
23, 25, 29, 31, ...
It's moreorless generally true that establishing and proving loop
invariants in an iterative algorithm is "the same thing" as
establishing and proving pre and postconditions in a recursive
algorithm. While it may feel strained at first, it's very much
worthwhile learning how to do that. For example, from #2 it's easy to
prove that if q < d, n must either be 1 or a prime. That's what
justifies the "break" statement inside the loop. Then since that's
the only way to get out of the loop, n is either 1 or a prime when the
loop finishes, and that explains the rest of the code.
Bruno, Yes. And everything done with Python can be done with assembly language too.
It's even more general than that...
To simply Turning, anything that can compute, can compute anything that
can be computed  hardware OR software.
But that doesn't mean you should use it. Imagine if we all took that
to heart and only coded for the Turing Machine!
So yes, I'll concede the point. Some tools ARE better than others.
It's just that recursion is more often misused than not.
Recursion is the most convenient way to express some common algorithms.
Yes. But very few.
Too bad for you if it does some nasty things to your mind.
It's not recursion per se that does the nasty things. It's the way
it's been promoted over the years when the simplier iterative solution
is a better choice.
Sudden Disruption

Sudden View...
the radical option for editing text http://www.sudden.net/ http://suddendisruption.blogspot.com
Thanks a lot! More food for thought!
Nick, you will have cursed the concept to hell and back again. Been there  done that :(
My point exactly. Years ago with Pascal I took the recursive approach
way too often with much distress. I began to learn that if I move
enough stuff out of the loop and set up a context that could easily see
what was getting "recursed" (great term), iteration was often much
easier to debug and FAR more effective to execute.
Since those times I can count on one hand the times I've used recursion
 and then only because I was late for lunch and I knew "i" wouldn't
get away from me.
As someone who was in this area when the Algol versus Fortran wars were
I'll take your word for it. My start with recursion was Pascal.
Agreed. Recursion should be used when it is the right technology to clarify the code, and not as a gimmicky, obfuscatory and often dogmatic substitute for iteration!
Well put.
There are algorithms that become almost incomprehensible without recursion, and I have implemented a recursion layer in both assembler AND Fortran just to enable me to write them without going bonkers.
With a reasonable exception.
Sudden Disruption

Sudden View...
the radical option for editing text http://www.sudden.net/ http://suddendisruption.blogspot.com
You might use a separate prime generator to produce prime factors. The
factorize algorithm becomes quite simple and configurable by prime
generators. For demonstration purposes I use the eratosthenes sieve.
def eratosthenes():
memo = {}
q = 2
while True:
p = memo.pop(q, None)
if p is None:
yield q
memo[q*q] = q
else:
x = p + q
while x in memo:
x += p
memo[x] = p
q+=1
def factorize(n, sieve = eratosthenes):
if n <= 1:
return [n]
factors = []
primes = sieve()
for q in primes:
while n % q == 0:
factors.append(q)
n //= q
if n == 1:
return factors
Regards,
Kay
In article <11**********************@m73g2000cwd.googlegroups .com>,
"Kay Schluehr" <ka**********@gmx.net> writes:
>
> You might use a separate prime generator to produce prime factors. The
> factorize algorithm becomes quite simple and configurable by prime
> generators. For demonstration purposes I use the eratosthenes sieve.
That is a good point. The differences between iteration and recursion
are wellunderstood (by some people, at least), but the difference
between those two and generators is not. I have mixed feelings whether
they are a good idea or not, largely because I have never seen a
language that provides a declaration to guarantee that a generator is
'clean'. And an unclean generator (e.g. one with sideeffects) is a
most revolting object, from a software engineering (including validation)
point of view.
One good example of this is streaming input (I/O). Traditional,
clean streaming input can be implemented efficiently and with good
error diagnostics. The unclean C and POSIXlike streaming input can't
be, or at least only one of the two can be provided at once.
Regards,
Nick Maclaren.
[Kay Schluehr] You might use a separate prime generator to produce prime factors. The factorize algorithm becomes quite simple and configurable by prime generators.
Alas, yours was _so_ simple that it always takes time proportional to
the largest prime factor of n (which may be n) instead of worstcase
time proportional to sqrt(n). Repair that, and it's not quite so
simple anymore. The speed of the sieve generator would also benefit a
lot by never looking at even integers, beyond an initial "yield 2" ...
the OP said he was looking for speed more than elegance, and they're
not good buddies here ;)
def eratosthenes(): memo = {} q = 2 while True: p = memo.pop(q, None) if p is None: yield q memo[q*q] = q else: x = p + q while x in memo: x += p memo[x] = p q+=1
def factorize(n, sieve = eratosthenes): if n <= 1: return [n] factors = [] primes = sieve() for q in primes: while n % q == 0: factors.append(q) n //= q if n == 1: return factors
At _some_ point you might think that's bound to be faster than only
skipping multiples of 2 and 3, because it does fewer divisions. But
to get to a particular prime p, the sieve there has to go around its
outer loop about 3x as often as the trial() function goes around its
inner loop, and grows a dict with about p/ln(p) entries (while the
trial() function's memory use is constant), and those aren't free.
Applied to 9999991**2 (constructed so that both functions take time
proportional to sqrt(n)), on my box the factorize() function took 4x
longer than trial() (approximately 16 seconds versus 4). Trial
division isn't practical for such large inputs, but I picked the
square of a "large" prime to give factorize() maximum advantage in
skipping division attempts (by the time we get to a prime p,
factorize() tries about p/ln(p) divisions, while trial() does about
p/3; so trial() does about ln(p)/3 times as many divisions as
factorize(): the larger the p we need, the larger that _factor_
gets).
Alas, it appears that made the dict so big that cache faults on dict
accesses more than wiped out the division advantage. At the smaller
999983**2, factorize() took only about 3.6x longer, despite losing
some of its relative division advantage.
In short, it's never what you think it is ;)
I've been testing my recursive function against your iterative
function, and yours is generally a quite steady 50% faster on
factorizing 2**n +/ 1 for 0 < n < 60. I think that, for a challenge,
I'll try to make a recursive function that matche or beats the
iterative function  it's worth the experiment!
Cheers,
MTD
[MTD <ma***********@gmail.com>] I've been testing my recursive function against your iterative function, and yours is generally a quite steady 50% faster on factorizing 2**n +/ 1 for 0 < n < 60.
If you're still not skipping multiples of 3, that should account for most of it.
I think that, for a challenge, I'll try to make a recursive function that matche or beats the iterative function  it's worth the experiment!
Don't stop on that one before you succeed ;) Provided you make no
more than one recursive call per factor, you can't make more than
log(n, 2) recursive calls in total, and _typically_ far fewer than
that. IOW, the number of recursive calls should generally be trivial,
and factoring 2**n will be the worst case. This discussion thread is closed Replies have been disabled for this discussion. Similar topics
226 posts
views
Thread by Stephen C. Waterbury 
last post: by

4 posts
views
Thread by Ed Davis 
last post: by

2 posts
views
Thread by Francois Vanderseypen 
last post: by

75 posts
views
Thread by Sathyaish 
last post: by

59 posts
views
Thread by PyFun 
last post: by
          