464,806 Members | 1,283 Online Need help? Post your question and get tips & solutions from a community of 464,806 IT Pros & Developers. It's quick & easy.

# How smart is the Python interpreter?

 P: n/a def str_sort(string): s = "" for a in sorted(string): s+=a return s if i instead do: def str_sort(string): s = "" so = sorted(string) for a in so: s+=a return s will that be faster or the interpreter can figure out that it only has to do sorted(string) once? or that kind of cleverness is usually reserved for compilers and not interpreters? Jul 31 '08 #1
6 Replies

 P: n/a ssecorp wrote: def str_sort(string): s = "" for a in sorted(string): s+=a return s if i instead do: def str_sort(string): s = "" so = sorted(string) for a in so: s+=a return s will that be faster or the interpreter can figure out that it only has to do sorted(string) once? Actually, by replacing sorted() with a function that outputs when it is called you could have seen that this is only called once in both cases. It must not be called more than once in fact, consider e.g. the case that it introduces side effects (like e.g. reading a file). Uli -- Sator Laser GmbH GeschÃ¤ftsfÃ¼hrer: Thorsten FÃ¶cking, Amtsgericht Hamburg HR B62 932 Jul 31 '08 #2

 P: n/a Am Donnerstag, 31. Juli 2008 13:09:57 schrieb ssecorp: def str_sort(string): s = "" for a in sorted(string): s+=a return s if i instead do: def str_sort(string): s = "" so = sorted(string) for a in so: s+=a return s will that be faster or the interpreter can figure out that it only has to do sorted(string) once? or that kind of cleverness is usually reserved for compilers and not interpreters? In a statement of the form for : the expression

 P: n/a ssecorp wrote: def str_sort(string): s = "" for a in sorted(string): s+=a return s if i instead do: def str_sort(string): s = "" so = sorted(string) for a in so: s+=a return s will that be faster or the interpreter can figure out that it only has to do sorted(string) once? or that kind of cleverness is usually reserved for compilers and not interpreters? There isn't much cleverness involved here - why on earth should one execute the sorted(string) several times? The for construct will evaluate the

 P: n/a ssecorp wrote: def str_sort(string): s = "" for a in sorted(string): s+=a return s if i instead do: def str_sort(string): s = "" so = sorted(string) for a in so: s+=a return s will that be faster or the interpreter can figure out that it only has to do sorted(string) once? or that kind of cleverness is usually reserved for compilers and not interpreters? -- http://mail.python.org/mailman/listinfo/python-list The 'for' statement is only executed once of course. It's the body of the 'for' which is executed multiple times. So in both pieces of code, the 'sorted' is only executed once, and the returned string is bound to a name in the second but not the first. However, you are worrying about optimizing the wrong things here. The 's+=a' line has terrible (quadratic) performance. Instead use the string method 'join' which has linear performance. def str_sort(string): return "".join(sorted(string)) No for loop, no inefficient accumulation. Gary Herron Jul 31 '08 #5

 P: n/a On Thu, 31 Jul 2008 04:09:57 -0700, ssecorp wrote: def str_sort(string): s = "" for a in sorted(string): s+=a return s if i instead do: def str_sort(string): s = "" so = sorted(string) for a in so: s+=a return s will that be faster Oh dear. Premature optimization, the root of all (programming) evil. You can test which is faster by using the timeit module. In the interactive interpreter, define the two functions above with different names, and a string to supply as argument. Then call: from timeit import Timer t1 = Timer('str_sort1(s)', 'from __main__ import str_sort1, s') t2 = Timer('str_sort2(s)', 'from __main__ import str_sort2, s') t1.repeat(number=1000) t2.repeat(number=1000) I'll be hugely surprised if there was any meaningful difference. or the interpreter can figure out that it only has to do sorted(string) once? or that kind of cleverness is usually reserved for compilers and not interpreters? Python uses a compiler. It doesn't do a lot of clever optimization, but it does some. In this case, no, it doesn't optimize your function, so technically the first may be a tiny bit faster. But, frankly, your function is so painfully inefficient, doing a lot of useless work, that you probably won't notice any real difference. The correct way to do what you have done above is ''.join(sorted(s)). Anything else is much slower. >>def sort_str(s): .... ss = "" .... for c in sorted(s): .... ss += c .... return ss .... >>s = "abcdefghijklmnopqrstuvwxyz"*100from timeit import Timert1 = Timer('"".join(sorted(s))', 'from __main__ import s')t2 = Timer('sort_str(s)', 'from __main__ import sort_str, s')t1.repeat(number=1000) [1.6792540550231934, 1.6882510185241699, 1.660383939743042] >>t2.repeat(number=1000) [2.5500221252441406, 2.4761130809783936, 2.5888760089874268] -- Steven Jul 31 '08 #6

 P: n/a ssecorp wrote: def str_sort(string): s = "" for a in sorted(string): s+=a return s if i instead do: def str_sort(string): s = "" so = sorted(string) for a in so: s+=a return s will that be faster or the interpreter can figure out that it only has to do sorted(string) once? or that kind of cleverness is usually reserved for compilers and not interpreters? The optimizations performed by a Python interpreter and where they are performed depend on the implementation and version. CPython is conservative about optimizations. Not only do the developers want to be sure they are 100% correct (unlike too many optimizing compilers), but Guido also rejects some that are too tricky and too fragile (easily broken by new maintainers). In Python 3.0, here are two compiler optimizations >>from dis import disdef f(): return 1+2 >>dis(f) 1 0 LOAD_CONST 3 (3) 3 RETURN_VALUE # constant arithmetic (folding); done with floats also >>def f(): a,b = 1,2 return a+b >>dis(f) 2 0 LOAD_CONST 3 ((1, 2)) 3 UNPACK_SEQUENCE 2 6 STORE_FAST 0 (a) 9 STORE_FAST 1 (b) 3 12 LOAD_FAST 0 (a) 15 LOAD_FAST 1 (b) 18 BINARY_ADD 19 RETURN_VALUE # tuples with constant members are pre-built by the compiler and stored in the code object. What you don't see (if it is still there) is an optimization in the interpreter loop for BINARY_ADD that takes a shortcut if both operands are ints. tjr Aug 1 '08 #7

### This discussion thread is closed

Replies have been disabled for this discussion. 