Building Python 2.4 with icc and processor-specific optimizations

Just out of curiosity, I was wondering if anyone has
compiled Python 2.4 with the Intel C Compiler and its
processor specific optimizations. I can build it fine
with OPT="-O3" or OPT="-xN" but when I try to combine
them I get this as soon as ./python is run:

"""
case $MAKEFLAGS in \
*-s*) CC='icc -pthread' LDSHARED='icc -pthread -shared' OPT='-DNDEBUG -O3 -xN' ./python -E ./setup.py -q build;; \
*) CC='icc -pthread' LDSHARED='icc -pthread -shared' OPT='-DNDEBUG -O3 -xN' ./python -E ./setup.py build;; \
esac
'import site' failed; use -v for traceback
Traceback (most recent call last):
File "./setup.py", line 6, in ?
import sys, os, getopt, imp, re
File "/usr/local/src/Python-2.4/Lib/os.py", line 130, in ?
raise ImportError, 'no os specific module found'
ImportError: no os specific module found
make: *** [sharedmods] Error 1
"""

Also, if I run ./python, I have this interesting result:

"""
$ ./python
'import site' failed; use -v for traceback
Python 2.4 (#34, Mar 12 2005, 18:46:28)
[GCC Intel(R) C++ gcc 3.0 mode] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.builtin_module_names
('__main__', '__builtin__', '__builtin__', '__builtin__', '__builtin__', '__builtin__', '__builtin__', '__builtin__', '__builtin__', '__builtin__', '__builtin__', '__builtin__', '__builtin__', 'exceptions', 'gc', 'gc')
"""
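
The symptom in that session (repeated entries, names out of order) can be checked mechanically. The following is only a minimal sketch: `check_module_names` is a hypothetical helper name, not part of Python, and it only inspects the tuple without saying anything about the cause.

```python
import sys
from collections import Counter


def check_module_names(names):
    """Return (duplicates, is_sorted) for a sequence of module names.

    Hypothetical helper for illustration; not part of Python itself.
    """
    names = list(names)
    counts = Counter(names)
    # Names that appear more than once, in alphabetical order.
    duplicates = sorted(name for name, n in counts.items() if n > 1)
    # True when every name is <= the name that follows it.
    is_sorted = all(a <= b for a, b in zip(names, names[1:]))
    return duplicates, is_sorted


# Inspect the interpreter that is running this script.
print(check_module_names(sys.builtin_module_names))
```

Fed the tuple printed above, it reports `'__builtin__'` and `'gc'` as duplicates and the order as unsorted.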

Whoa--what's going on? Any ideas?

--
Michael Hoffman
Jul 18 '05 #1
Further investigation reveals that the function that sets
sys.builtin_module_names sorts the list before turning it into a
tuple. And binarysort() in Objects/listobject.c doesn't work when
optimized in that fashion. Adding #pragma optimize("", off)
beforehand solves the problem. Why that is, I have no idea. Is
anyone else curious?
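
For readers who have not looked at it: binarysort() is a binary insertion sort. Below is a rough Python sketch of that general technique, not a transcription of the C code in Objects/listobject.c. The function name and the `less_than` parameter are assumptions made for illustration.

```python
def binary_insertion_sort(items, less_than=lambda a, b: a < b):
    """Sort `items` in place with binary insertion sort and return it.

    Illustrative sketch only; names here are hypothetical.
    """
    for start in range(1, len(items)):
        pivot = items[start]
        left, right = 0, start
        # Binary search in the sorted prefix items[:start] for the slot
        # where pivot belongs. Equal elements stay to the left of pivot,
        # which keeps the sort stable.
        while left < right:
            mid = left + ((right - left) >> 1)
            if less_than(pivot, items[mid]):
                right = mid
            else:
                left = mid + 1
        # Shift items[left:start] one slot to the right, then drop pivot in.
        items[left + 1:start + 1] = items[left:start]
        items[left] = pivot
    return items


print(binary_insertion_sort(['thread', 'signal', 'posix', 'errno', '_sre']))
```

Every decision the loop makes comes from the result of one comparison, so a single miscompiled comparison or index update is enough to produce duplicated or misplaced entries.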

Also, if anyone is looking for a way to squeeze a little extra time
out of the startup, perhaps sorting the list at build time,
rather than when Python starts, would be good. Although it's probably
not worth the trouble. ;-)
--
Michael Hoffman
Jul 18 '05 #2
Michael Hoffman wrote:
> Further investigation reveals that the function that sets
> sys.builtin_module_names sorts the list before turning it into a
> tuple. And binarysort() in Objects/listobject.c doesn't work when
> optimized in that fashion. Adding #pragma optimize("", off)
> beforehand solves the problem. Why that is, I have no idea. Is
> anyone else curious?

I would really like to know, indeed. OTOH, I probably don't have the
time to analyse it myself.

Looks like a compiler bug to me: perhaps the compiler assumes at compile
time that some condition is always true, even though it can be false at
run time.

OTOH, it could also be Python's failure to follow C's aliasing rules
correctly; Python casts between C pointers which, in strict C, causes
undefined behaviour. So if your compiler has something similar to GCC's
-fno-strict-aliasing, you could see whether this helps.

If not, just try comparing the assembler output of either code, on
a function-by-function basis. Alternatively, try to annotate the
calls that go out of the sorting (e.g. to RichCompareBool) so that
you get tracing, and then see where the traces differ.
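
The same tracing idea can be tried at the Python level before touching the C source. This is only a sketch of the approach: `traced_sort` is a hypothetical name, and it logs the comparisons made by Python's own `sorted()`, which is an analogue of annotating the RichCompareBool calls, not the same thing.

```python
import functools


def traced_sort(items, log):
    """Return a sorted copy of `items`, appending each (a, b) pair
    that gets compared to `log`. Hypothetical helper for illustration.
    """
    def compare(a, b):
        log.append((a, b))
        # Classic three-way result: negative, zero, or positive.
        return (a > b) - (a < b)

    return sorted(items, key=functools.cmp_to_key(compare))


log = []
print(traced_sort(['thread', 'signal', 'posix', 'errno'], log))
for a, b in log:
    print('compare(%r, %r)' % (a, b))
```

Running the same input through two builds and saving each log gives two traces that can be compared line by line.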
> Also, if anyone is looking for a way to squeeze a little extra time
> out of the startup, perhaps sorting the list at build-time,
> rather than when Python starts would be good. Although probably
> not worth the trouble. ;-)

Probably not. config.c is hand-written in some (embedded Python)
environments, and expecting it to be sorted would break these
environments.

Regards,
Martin
Jul 18 '05 #3
Martin v. Löwis wrote:
> OTOH, it could also be Python's failure to follow C's aliasing rules
> correctly; Python casts between C pointers which, in strict C, causes
> undefined behaviour. So if your compiler has something similar to GCC's
> -fno-strict-aliasing, you could see whether this helps.

There's nothing like that specifically. There is an -falias option,
which the manual describes only as "assume aliasing."

> If not, just try comparing the assembler output of either code, on
> a function-by-function basis.

Oh boy, it's a 10,000 line diff. The joys of interprocedural
optimization. I think I'll quit while I'm ahead...

> Alternatively, try to annotate the
> calls that go out of the sorting (e.g. to RichCompareBool) so that
> you get tracing, and then see where the traces differ.

Well, they go wrong almost right away:

non-optimized:

PyObject_RichCompareBool('signal', 'thread', 0)
PyObject_RichCompareBool('posix', 'signal', 0)
PyObject_RichCompareBool('errno', 'posix', 0)
PyObject_RichCompareBool('_sre', 'errno', 0)
PyObject_RichCompareBool('_codecs', '_sre', 0)
PyObject_RichCompareBool('zipimport', '_codecs', 0)
PyObject_RichCompareBool('zipimport', 'posix', 0)
PyObject_RichCompareBool('zipimport', 'thread', 0)
PyObject_RichCompareBool('_symtable', 'posix', 0)

optimized:

PyObject_RichCompareBool('signal', 'thread', 0)
PyObject_RichCompareBool('posix', 'errno', 0) # hmmm, comparing in the wrong direction
PyObject_RichCompareBool('posix', 'thread', 0)
PyObject_RichCompareBool('posix', 'signal', 0)
PyObject_RichCompareBool('errno', 'errno', 0) # totally bogus!
PyObject_RichCompareBool('errno', 'errno', 0) # and repeating it twice for good measure!
PyObject_RichCompareBool('_sre', 'errno', 0)
PyObject_RichCompareBool('_sre', 'errno', 0)
PyObject_RichCompareBool('_sre', 'posix', 0)

Well, I have probably spent too much time on this already. To top things off, Python
compiled with -O3 and without -xN actually runs faster, so I shouldn't even be going
down this road.
--
Michael Hoffman
Jul 18 '05 #4
