By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
464,806 Members | 1,283 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 464,806 IT Pros & Developers. It's quick & easy.

How smart is the Python interpreter?

P: n/a
def str_sort(string):
s = ""
for a in sorted(string):
s+=a
return s
if i instead do:

def str_sort(string):
s = ""
so = sorted(string)
for a in so:
s+=a
return s
will that be faster or the interpreter can figure out that it only has
to do sorted(string) once? or that kind of cleverness is usually
reserved for compilers and not interpreters?
Jul 31 '08 #1
Share this Question
Share on Google+
6 Replies

P: n/a
ssecorp wrote:
def str_sort(string):
s = ""
for a in sorted(string):
s+=a
return s
if i instead do:

def str_sort(string):
s = ""
so = sorted(string)
for a in so:
s+=a
return s

will that be faster or the interpreter can figure out that it only has
to do sorted(string) once?
Actually, by replacing sorted() with a function that outputs when it is
called you could have seen that this is only called once in both cases. It
must not be called more than once in fact, consider e.g. the case that it
introduces side effects (like e.g. reading a file).

Uli

--
Sator Laser GmbH
Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932

Jul 31 '08 #2

P: n/a
Am Donnerstag, 31. Juli 2008 13:09:57 schrieb ssecorp:
def str_sort(string):
s = ""
for a in sorted(string):
s+=a
return s
if i instead do:

def str_sort(string):
s = ""
so = sorted(string)
for a in so:
s+=a
return s
will that be faster or the interpreter can figure out that it only has
to do sorted(string) once? or that kind of cleverness is usually
reserved for compilers and not interpreters?
In a statement of the form

for <namein <iterable>:

the expression <iterablewill only be evaluated once (to retrieve an
iterator), so basically, both ways of stating it are equivalent and make
negligible difference in runtime (the second version will be slower, because
you have additional code to assign/fetch a local).

Anyway, if you care about speed, probably:

def str_sort(string):
return "".join(sorted(string))

will be the fastest way of stating this.

--
Heiko Wundram
Jul 31 '08 #3

P: n/a
ssecorp wrote:
def str_sort(string):
s = ""
for a in sorted(string):
s+=a
return s
if i instead do:

def str_sort(string):
s = ""
so = sorted(string)
for a in so:
s+=a
return s
will that be faster or the interpreter can figure out that it only has
to do sorted(string) once? or that kind of cleverness is usually
reserved for compilers and not interpreters?
There isn't much cleverness involved here - why on earth should one execute
the sorted(string) several times?

The

for <namein <iterable_yielding_expression>

construct will evaluate the <iterable_yielding_expressionof course only
once.

Diez
Jul 31 '08 #4

P: n/a
ssecorp wrote:
def str_sort(string):
s = ""
for a in sorted(string):
s+=a
return s
if i instead do:

def str_sort(string):
s = ""
so = sorted(string)
for a in so:
s+=a
return s
will that be faster or the interpreter can figure out that it only has
to do sorted(string) once? or that kind of cleverness is usually
reserved for compilers and not interpreters?
--
http://mail.python.org/mailman/listinfo/python-list
The 'for' statement is only executed once of course. It's the body of
the 'for' which is executed multiple times. So in both pieces of code,
the 'sorted' is only executed once, and the returned string is bound to
a name in the second but not the first.

However, you are worrying about optimizing the wrong things here. The
's+=a' line has terrible (quadratic) performance. Instead use the
string method 'join' which has linear performance.

def str_sort(string):
return "".join(sorted(string))

No for loop, no inefficient accumulation.

Gary Herron
Jul 31 '08 #5

P: n/a
On Thu, 31 Jul 2008 04:09:57 -0700, ssecorp wrote:
def str_sort(string):
s = ""
for a in sorted(string):
s+=a
return s
if i instead do:

def str_sort(string):
s = ""
so = sorted(string)
for a in so:
s+=a
return s
will that be faster

Oh dear. Premature optimization, the root of all (programming) evil.

You can test which is faster by using the timeit module. In the
interactive interpreter, define the two functions above with different
names, and a string to supply as argument. Then call:

from timeit import Timer
t1 = Timer('str_sort1(s)', 'from __main__ import str_sort1, s')
t2 = Timer('str_sort2(s)', 'from __main__ import str_sort2, s')
t1.repeat(number=1000)
t2.repeat(number=1000)

I'll be hugely surprised if there was any meaningful difference.

or the interpreter can figure out that it only has
to do sorted(string) once? or that kind of cleverness is usually
reserved for compilers and not interpreters?
Python uses a compiler. It doesn't do a lot of clever optimization, but
it does some. In this case, no, it doesn't optimize your function, so
technically the first may be a tiny bit faster. But, frankly, your
function is so painfully inefficient, doing a lot of useless work, that
you probably won't notice any real difference.

The correct way to do what you have done above is ''.join(sorted(s)).
Anything else is much slower.
>>def sort_str(s):
.... ss = ""
.... for c in sorted(s):
.... ss += c
.... return ss
....
>>s = "abcdefghijklmnopqrstuvwxyz"*100
from timeit import Timer
t1 = Timer('"".join(sorted(s))', 'from __main__ import s')
t2 = Timer('sort_str(s)', 'from __main__ import sort_str, s')
t1.repeat(number=1000)
[1.6792540550231934, 1.6882510185241699, 1.660383939743042]
>>t2.repeat(number=1000)
[2.5500221252441406, 2.4761130809783936, 2.5888760089874268]
--
Steven
Jul 31 '08 #6

P: n/a


ssecorp wrote:
def str_sort(string):
s = ""
for a in sorted(string):
s+=a
return s
if i instead do:

def str_sort(string):
s = ""
so = sorted(string)
for a in so:
s+=a
return s
will that be faster or the interpreter can figure out that it only has
to do sorted(string) once? or that kind of cleverness is usually
reserved for compilers and not interpreters?
The optimizations performed by a Python interpreter and where they are
performed depend on the implementation and version. CPython is
conservative about optimizations. Not only do the developers want to be
sure they are 100% correct (unlike too many optimizing compilers), but
Guido also rejects some that are too tricky and too fragile (easily
broken by new maintainers). In Python 3.0, here are two compiler
optimizations
>>from dis import dis
def f(): return 1+2
>>dis(f)
1 0 LOAD_CONST 3 (3)
3 RETURN_VALUE

# constant arithmetic (folding); done with floats also
>>def f():
a,b = 1,2
return a+b
>>dis(f)
2 0 LOAD_CONST 3 ((1, 2))
3 UNPACK_SEQUENCE 2
6 STORE_FAST 0 (a)
9 STORE_FAST 1 (b)

3 12 LOAD_FAST 0 (a)
15 LOAD_FAST 1 (b)
18 BINARY_ADD
19 RETURN_VALUE

# tuples with constant members are pre-built by the compiler and stored
in the code object. What you don't see (if it is still there) is an
optimization in the interpreter loop for BINARY_ADD that takes a
shortcut if both operands are ints.

tjr

Aug 1 '08 #7

This discussion thread is closed

Replies have been disabled for this discussion.