Shared vs static link performance hit --and Windows?

Last night I was compiling the latest python snapshot at my home Linux
system (a K6-III @420 --the extra 20 MHz is overclocking :); then I tried
building a shared version of the interpreter. I did some speed
comparisons, and pystone reported ~6090 pystones for the shared and
~7680 pystones for the (default) static build.

This is quite a difference, and while I do know what impact position
independent code has on libraries, this was the first time I measured a
difference of 25% in performance...
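
For what it's worth, the exact percentage depends on which build is taken as the baseline. A quick sketch of the arithmetic, using only the two pystone figures quoted above (the helper name `relative_change` is just for illustration):

```python
def relative_change(baseline: float, other: float) -> float:
    """Return the change from baseline to other, as a percentage of baseline."""
    return (other - baseline) / baseline * 100.0


static_pystones = 7680.0  # static build, figure from the post
shared_pystones = 6090.0  # shared build, figure from the post

# Static build measured against the shared build.
speedup = relative_change(shared_pystones, static_pystones)
# Shared build measured against the static build (negative = slower).
slowdown = relative_change(static_pystones, shared_pystones)

print(f"static is {speedup:.1f}% faster than shared")
print(f"shared is {-slowdown:.1f}% slower than static")
```

With these numbers the static build comes out about 26.1% faster, or equivalently the shared build is about 20.7% slower, so "25%" is a fair round figure either way.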

This is not a complaint (I am happy with the static build and its
speed), but it's a question because I know that in Windows the build is
"shared" (there's a python23.dll which is about the same as the .so of
the shared lib), and pystone performance is close to ~6100 pystones.

I might say something stupid since I am not an "expert" on Windows
architectures, but would it be feasible (and useful) to build a static
python.exe on windows? Or is it that there would be no difference
performance-wise in the final interpreter?

Any replies are welcome.
--
TZOTZIOY, I speak England very best,
Microsoft Security Alert: the Matrix began as open source.
Jul 18 '05 #1

Christos> .... I did some speed comparisons, and pystone reported ~6090
Christos> pystones for the shared and ~7680 pystones for the (default)
Christos> static build.

Christos> This is quite a difference, and while I do know what impact
Christos> position independent code has on libraries, this was the first
Christos> time I measured a difference of 25% in performance...

It's possible that the Python interpreter makes (percentage-wise) more calls
through the indirect links of the position-independent code. We know it
makes lots of function calls (often to extension modules) to implement its
functionality, so it's quite possible that the interpreter takes a much
larger hit than your typical C program would.
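
When comparing numbers like these it helps to know which kind of build is actually running. A minimal sketch, assuming a modern Python where the `sysconfig` module is available (it did not exist in the 2.3 era discussed here) and a Unix-style build that records `Py_ENABLE_SHARED`; the function name `describe_build` is made up for this example:

```python
import sys
import sysconfig


def describe_build() -> str:
    """Report whether this interpreter was configured with --enable-shared."""
    flag = sysconfig.get_config_var("Py_ENABLE_SHARED")
    if flag is None:
        # Assumption: some platforms (typically Windows) do not record it.
        return "unknown (Py_ENABLE_SHARED is not defined on this platform)"
    return "shared (libpython)" if flag else "static"


print(sys.version.split()[0], "->", describe_build())
```

Running this under each interpreter before benchmarking makes it harder to compare the wrong pair of builds by accident.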

Skip
Jul 18 '05 #2
[I: statically-linked python is 25% faster than dynamically-linked on my
machine]

[Skip: explains that python possibly makes an unusually large number of
function calls to and from shared libraries, which would explain the
difference]

Thanks, Skip; now if only someone more Windows-knowledgeable than me
could also comment on whether we could see a similar speed increase by
building a static Windows EXE...

I do have a computer with Visual Studio 6 available at work, but I don't
know enough about VS[1] to answer this myself; I gave it a try, but I got
lost in the Project Settings and how to merge the pythoncore (the dll I
believe) and the python (the exe) projects.

[1] Although lately I did give a shot to compile a custom module for
haar image transforms following the directions in the docs; it was
trivial in Linux --kudos to all of you guys who worked on distutils--
and fairly easy on Windows, so I have both an .so and a .pyd with the
same functionality for my program...
--
TZOTZIOY, I speak England very best,
Microsoft Security Alert: the Matrix began as open source.
Jul 18 '05 #3
On Tue, 8 Jul 2003, Christos TZOTZIOY Georgiou wrote:
Christos> Thanks, Skip; now if only someone more Windows-knowledgeable than me
Christos> could also comment on whether we could see a similar speed increase by
Christos> building a static Windows EXE...


AFAIK, Windows & OS/2 don't really use the concept of PIC code for DLLs,
as the DLLs get mapped into system address space rather than user address
space, and all processes that access DLL code access the code at the same
address.

--
Andrew I MacIntyre "These thoughts are mine alone..."
E-mail: an*****@bullseye.apana.org.au (pref) | Snail: PO Box 370
an*****@pcug.org.au (alt) | Belconnen ACT 2616
Web: http://www.andymac.org/ | Australia

Jul 18 '05 #4
Andrew MacIntyre <an*****@bullseye.apana.org.au> writes:
Andrew> AFAIK, Windows & OS/2 don't really use the concept of PIC code for DLLs,
Andrew> as the DLLs get mapped into system address space rather than user address
Andrew> space, and all processes that access DLL code access the code at the same
Andrew> address.


Wrong, at least for Win32. In Win32, there is no "system address
space"; each process has its separate address space.

Each DLL has a preferred load address, and it is linked to that
address. If the dynamic loader can, at run-time, link the DLL to that
address also, there won't be any relocations. If that address is
already in use, the dynamic loader chooses a different address, and
the relocated code pages are not shared across address spaces anymore.

For Python standard DLLs and PYDs, PC/dllbase_nt.txt lists the
addresses that have been assigned for each DLL to avoid conflicts. For
custom extension DLLs, most likely, the linker default is used, which
will conflict with all other DLLs which don't use the /BASE linker
flag, either.
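
To make the conflict concrete, here is a toy model of the behaviour described above. It is only a sketch, not how the Win32 loader is implemented: the `Dll` class, `load_all`, the fallback address, and the sizes are invented for illustration, the base shown for python23.dll is hypothetical rather than taken from PC/dllbase_nt.txt, and 0x10000000 is, as far as I know, the default DLL base of the 32-bit MSVC linker.

```python
from dataclasses import dataclass


@dataclass
class Dll:
    name: str
    preferred_base: int
    size: int


def load_all(dlls, fallback_start=0x60000000, align=0x10000):
    """Toy model: map each DLL at its preferred base, or rebase on conflict.

    Returns {name: (actual_base, was_relocated)}. Simplification: the
    fallback region is assumed to be free and is handed out sequentially.
    """
    mapped = []  # (start, end) ranges already in use in this process
    result = {}
    next_free = fallback_start
    for dll in dlls:
        start = dll.preferred_base
        end = start + dll.size
        # Two half-open ranges overlap if each starts before the other ends.
        conflict = any(start < e and s < end for s, e in mapped)
        if conflict:
            start = next_free
            end = start + dll.size
            # Round the next fallback slot up to the alignment boundary.
            next_free = (end + align - 1) // align * align
        mapped.append((start, end))
        result[dll.name] = (start, conflict)
    return result


DEFAULT_BASE = 0x10000000  # assumed linker default when /BASE is not given
dlls = [
    Dll("python23.dll", 0x1E000000, 0x100000),  # hypothetical assigned base
    Dll("ext_a.pyd", DEFAULT_BASE, 0x20000),    # custom extension, no /BASE
    Dll("ext_b.pyd", DEFAULT_BASE, 0x20000),    # second one, same default
]
for name, (base, relocated) in load_all(dlls).items():
    print(f"{name:14s} base=0x{base:08X} relocated={relocated}")
```

In this model the first extension gets the default base, and the second one collides with it and has to be relocated, which is the case where its code pages stop being shareable.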

Regards,
Martin
Jul 18 '05 #5
Christos "TZOTZIOY" Georgiou <tz**@sil-tec.gr> wrote in message news:<k7********************************@4ax.com>...
Christos> Last night I was compiling the latest python snapshot at my home Linux
Christos> system (a K6-III @420 --the extra 20 MHz is overclocking :); then I tried
Christos> building a shared version of the interpreter. I did some speed
Christos> comparisons, and pystone reported ~6090 pystones for the shared and
Christos> ~7680 pystones for the (default) static build.


Yes, today I recommend not using the -fPIC option for certain
libraries when compiling a .so library. If you use it you get one more
indirection, and this can be very bad on systems with long CPU
pipelines (PIV systems). If you don't use -fPIC then the shared
library will be patched at load time and is only shared on disk but
not in memory.

I hope that the UNIX community gives up this 20-year-old ELF format
and starts using a new one with better performance - look at KDE to
see the pain.
Jul 18 '05 #6
At some point, ll*****@web.de (Lothar Scholz) wrote:
Lothar> Christos "TZOTZIOY" Georgiou <tz**@sil-tec.gr> wrote in message news:<k7********************************@4ax.com>...
Lothar> Christos> Last night I was compiling the latest python snapshot at my home Linux
Lothar> Christos> system (a K6-III @420 --the extra 20 MHz is overclocking :); then I tried
Lothar> Christos> building a shared version of the interpreter. I did some speed
Lothar> Christos> comparisons, and pystone reported ~6090 pystones for the shared and
Lothar> Christos> ~7680 pystones for the (default) static build.

Lothar> Yes, today I recommend not using the -fPIC option for certain
Lothar> libraries when compiling a .so library. If you use it you get one more
Lothar> indirection, and this can be very bad on systems with long CPU
Lothar> pipelines (PIV systems). If you don't use -fPIC then the shared
Lothar> library will be patched and is only shared on disk but not in memory.


On PowerPC Linux (or any PPC Unix using ELF), AFAIK, compiling shared
libraries *without* -fPIC will run you into trouble. This is especially
true if you then link to a library that was compiled with -fPIC.
You'll get errors like 'R_PPC_REL24 relocation ...'. I certainly see
problems with bad code that doesn't use -fPIC that runs on x86, but
not on PPC.

x86 has a different way of doing things from PPC that doesn't bite you
on the ass when you compile without -fPIC. You especially run into
trouble when you mix non-fPIC code with -fPIC code.

[while we're on the subject, I'll point out that -fpic and -fPIC are
not equivalent. -fpic can run out of space in the global offset table,
while -fPIC avoids that. x86 doesn't have that limit, so I see people
using -fpic where they should be using -fPIC.]
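
If you want to see which of the two spellings your own Python uses when it builds shared extensions, something like the following may help. It is a sketch that assumes a modern Python with the `sysconfig` module and a Unix-style build that records the `CCSHARED` make variable; on other platforms the variable may simply be missing. The function name `pic_flags` is made up for this example.

```python
import sysconfig


def pic_flags() -> list:
    """Return the PIC-related flags recorded for building shared extensions."""
    # CCSHARED may be None where no Makefile-derived config is available.
    ccshared = sysconfig.get_config_var("CCSHARED") or ""
    # Compare case-insensitively so both -fpic and -fPIC are reported as-is.
    return [flag for flag in ccshared.split() if flag.lower() == "-fpic"]


print("CCSHARED PIC flags:", pic_flags() or "none recorded")
```

The flag is returned with its original capitalization, so the output shows directly whether the build was configured with -fpic or -fPIC.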

--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
Jul 18 '05 #7
Christos TZOTZIOY Georgiou wrote:
Christos> I might say something stupid since I am not an "expert" on Windows
Christos> architectures, but would it be feasible (and useful) to build a static
Christos> python.exe on windows? Or is it that there would be no difference
Christos> performance-wise in the final interpreter?


On Windows, DLLs (and .pyd files) can only call functions in other DLLs,
not in the EXE they are called from. So building a statically linked
python would make it impossible to dynamically load .pyd files.

OT: One could of course change the Python extending API so that every
extension function gets a pointer to the interpreter instance and all
API functions are members of the interpreter object. This adds an
additional indirection, so it wouldn't help the performance. On the plus
side, this would make it at least feasible for extensions to be binary
compatible with newer Pythons, the same way they often are on Unix.
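
A very rough sketch of the shape of that idea, written in Python only to show the calling convention. This is not an existing or proposed Python API: `InterpreterAPI`, `make_int`, `make_str`, and `ext_describe` are all hypothetical names.

```python
class InterpreterAPI:
    """Hypothetical table of API entry points handed to every extension."""

    def __init__(self, version: str):
        self.version = version

    def make_int(self, value):
        return int(value)

    def make_str(self, value):
        return str(value)


def ext_describe(api: InterpreterAPI, value):
    """Hypothetical extension function: it never calls the interpreter
    directly, every API call goes through the `api` object it was given."""
    return api.make_str(api.make_int(value) * 2)


# The same extension code runs against two different interpreter objects.
old = InterpreterAPI("2.3")
new = InterpreterAPI("2.4")
print(ext_describe(old, "21"), ext_describe(new, "21"))
```

Each API call costs one extra lookup through `api`, which is the additional indirection mentioned above; in exchange the extension depends only on the table's layout, not on symbols exported by one particular interpreter binary.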

Daniel

Jul 18 '05 #8
