Bytes IT Community

Is __import__ known to be slow in windows?

[I tried googling for this, didn't find anything relevant.]

We've recently been doing some profiling on a project of ours. It runs
quite fast on Linux but *really* bogs down on Windows 2003. We initially
thought it was the simplejson libraries (we don't use the C extensions) but
profiling proved otherwise.

We have a function that does some runtime imports via calls to __import__.
We ran 1000 iterations (we used cProfile) of the application (web app).
There were eight calls to __import__ per iteration, so 8000 calls total.
Identical hardware, by the way.

On Linux (Debian Etch, Python 2.5.1)
Total time was 2.793 CPU seconds, with __import__ using 1.059 seconds of
that. So, 37% of the time was spent in import. Not great, but not a show
stopper.

On Windows 2003 (R2, Python 2.5.1)
Total time was 18.532 CPU seconds, with __import__ using 16.330 seconds
(88%) of that.

So, Linux spends 1.734 seconds on non-import activities, and Windows spends
2.202 seconds on non-import activities. Pretty close. But 16.3 seconds on
import!?

Is this a known deficiency in Windows' Python import calls, or is there
something deeper going on here?
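The measurement described above can be sketched roughly like this: profile a batch of dynamic imports with cProfile and restrict the report to `__import__`. The module names here are stdlib stand-ins, not the project's actual modules, and the loop shape is an assumption about how the web app iterates:

```python
import cProfile
import io
import pstats

def run_imports():
    # Stand-in for the framework's eight dynamic imports per iteration;
    # these stdlib names are illustrative, not the project's modules.
    for name in ['json', 'os', 'sys', 'math']:
        __import__(name)

profiler = cProfile.Profile()
profiler.enable()
for _ in range(1000):
    run_imports()
profiler.disable()

# Restrict the report to lines mentioning __import__ so its share of
# the total runtime is easy to read off.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats('cumulative').print_stats('__import__')
report = stream.getvalue()
print(report)
```

Comparing the `cumtime` column for `__import__` against the grand total gives the percentage figures quoted above.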

Pointers, suggestions, and URLs welcome.

j

Nov 30 '07 #1
4 Replies


On Nov 30, 5:08 pm, Joshua Kugler <jkug...@bigfoot.com> wrote:
> [original question quoted in full; snipped]
Imagine you have two laundry baskets, one is green, one is purple.
Both contain 10 pairs of pants, but the pockets of the pants in the
purple basket are filled with rocks.

You pick up the green basket - kind of heavy, but not terrible. Then
you pick up the purple basket - wow! Really heavy!

Who would have had any idea that the color of the laundry basket would
make such a difference in the weight? :)

Of course, to clear up this question, you empty both baskets, try to
lift each one, and then find out that they both weigh about the same.
(Or one does weigh more than the other, but now you have ruled out the
possibility that the baskets' contents were a factor in the
comparison.)

Ok, back to your question. __import__ doesn't just load up a module.
At import time, all of the top-level code in the module is also executed.
So if you want to really assess the impact of *just* __import__ on
different platforms, then you should try importing:
- empty .py modules
- .py modules containing no top-level executable statements, but some
class or method definitions
- modules that have already been imported
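The first and third cases above can be sketched in one measurement: time a first import of a deliberately empty module against repeat imports that are satisfied from the sys.modules cache. The temp-directory setup is just scaffolding for the experiment:

```python
import os
import sys
import tempfile
import timeit

# Create an empty module in a throwaway directory so the import machinery
# itself is the only thing being measured (no top-level code to execute).
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, 'empty_mod.py'), 'w') as f:
    f.write('')  # deliberately empty: no top-level statements
sys.path.insert(0, tmpdir)

# First import: searches sys.path, reads and compiles the file,
# and caches the result in sys.modules.
first = timeit.timeit("__import__('empty_mod')", number=1)

# Repeat imports: satisfied straight from the sys.modules cache,
# so they should be dramatically cheaper per call.
cached = timeit.timeit("__import__('empty_mod')", number=1000)

print('first: %.6fs  1000 cached: %.6fs' % (first, cached))
```

If the Windows numbers stay huge even for the cached case, the problem is not the filesystem; if only the first import is slow, it is.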

It's possible that your imported module imports additional modules
which have significant platform-dependent logic, or import modules
which are compiled builtins on one platform, but written in Python on
the other.
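One way to check for that last possibility, sketched here: a module with no __file__ attribute is compiled into the interpreter, while a __file__ ending in .py is pure Python. (`classify_module` is a hypothetical helper for illustration, not part of the original code.)

```python
import sys

def classify_module(name):
    # Hypothetical helper: report whether an imported module is a
    # compiled builtin (no __file__ attribute), a pure-Python source
    # file, or a compiled extension module.
    module = __import__(name)
    filename = getattr(module, '__file__', None)
    if filename is None:
        return (name, 'built-in')
    if filename.endswith(('.py', '.pyc', '.pyo')):
        return (name, 'pure Python')
    return (name, 'extension')

for name in ['sys', 'json', 'math']:
    print(name, classify_module(name)[1])
```

Running this on both platforms would reveal whether the same dependency is a cheap builtin on one system but an expensive pure-Python import on the other.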

As it stands, your experiment still has too many unknowns to draw any
decent conclusion, and it certainly is too early to jump to "wow!
__import__ on Windows is sure slower than __import__ on Linux!"

-- Paul
Nov 30 '07 #2

On Dec 1, 10:08 am, Joshua Kugler <jkug...@bigfoot.com> wrote:
> [original question quoted in full; snipped]
What modules are you __import__ing, and what is platform-dependent in
each?

Dec 1 '07 #3

> What modules are you __import__ing, and what is platform-dependent in
> each?
The only things we're importing via __import__ are some modules of ours,
with no system-dependent code in them at all. Some of them are even empty
modules, as one response suggested (for benchmarking purposes).

Turning off Symantec's on-access scanning shaved a good chunk of time off
the time spent in __import__ (10 seconds vs. 16), but that is still too
high.

Commenting out the import of the only system dependent code we have in our
project (which we don't "manually" import via an __import__ call) produced
no change in run time.

So, we've found a major culprit. I'll see if I can find more.

Thanks for the pointers so far.

j

Dec 1 '07 #4

John Machin wrote:
> On Dec 1, 2:12 pm, Joshua Kugler <jkug...@bigfoot.com> wrote:
>> x = __import__(m)

> Have you ever tried print m, x.__file__ here to check that the modules
> are being found where you expect them to be found?
No, I haven't, but I do know for a fact that the only location of the module
found is where I think it is. There are no other modules on the system (or
in the search path) named <ourprefix>_modulename.
>> except ImportError, e:
>>     if not e.message.startswith('No module named'):
>>         raise
>
> Why are you blindly ignoring the possibility that the module is not
> found? Note that loading a module after it is found is not a zero-cost
> operation.
That was in the original code to silently ignore things like a README file
or a directory that didn't have a __init__.py file. The code I posted was
from my benchmarking script. The code in the framework does a listdir()
and goes through that list trying to import. We'll probably "smarten up"
that code a bit, but for now it works.
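A "smartened up" version of that listdir() scan might look something like the sketch below: filter out README files and directories without an __init__.py before ever calling __import__, so the ImportError swallowing becomes unnecessary. (`find_importable_names` is a hypothetical helper, not the framework's actual code.)

```python
import os
import tempfile

def find_importable_names(path):
    # Hypothetical replacement for a bare listdir() import loop: only
    # return names that can plausibly be imported, skipping stray files
    # and directories that are not packages.
    names = []
    for entry in sorted(os.listdir(path)):
        full = os.path.join(path, entry)
        if os.path.isdir(full):
            # Only packages (directories with an __init__.py) count.
            if os.path.isfile(os.path.join(full, '__init__.py')):
                names.append(entry)
        elif entry.endswith('.py') and entry != '__init__.py':
            names.append(entry[:-3])
    return names

# Demo: a directory mixing a package, a bare directory, and stray files.
demo = tempfile.mkdtemp()
os.makedirs(os.path.join(demo, 'pkg'))
open(os.path.join(demo, 'pkg', '__init__.py'), 'w').close()
os.makedirs(os.path.join(demo, 'bare'))          # no __init__.py: skipped
open(os.path.join(demo, 'README'), 'w').close()  # not a module: skipped
open(os.path.join(demo, 'module.py'), 'w').close()
found = find_importable_names(demo)
print(found)
```

Only the names this returns would then be handed to __import__, avoiding failed probes entirely.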
>> x = None

>> Each of those three module names is a directory under /path/to/code with
>> an empty __init__.py.
>
> Is there anything else in the /path/to/code directory?
A directory and a file that aren't referenced in my test, but are
in the framework's import attempts (see listdir() comment above).
Regardless, setting sys.path to just one directory speeds up both the
benchmark and the framework.
>> def r():
>>     for m in ['three','module','names']:
>>         x = __import__(m)
>
> Have you tried print m, x.__file__ here to check that the modules are
> being found where you expect them to be found?
As I said above, no I haven't, but I will, just to double-check.

On Linux, print x, x.__name__ produces the expected result whether I have
one element in sys.path or all of them. On Windows, same result.
> Call me crazy, but: First experiment, sys.path was ['/path/to/code',
> '', etc etc]. Now it's only ['/path/to/code']. How can that still load
> properly but run faster??
I don't know, but it does. Right now, we're just importing empty modules.
Once those modules have imports of their own, my "fix" will probably fall
apart, and we'll be back to square one.
> What directory tree walking?? Should be none if the modules are found
> in /path/to/code.
I know, that's the crazy thing...there should be no directory tree walking
or other filesystem activity after finding the module. But setting
sys.path to one element does make a difference; that can be seen. I don't
know the internal implementation of __import__, so I can't really comment
beyond my results. Maybe I need to strace a python run.
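That single-element sys.path trick can be wrapped up so it applies only around the dynamic imports and is then undone. This is just a sketch: in real use the narrowed list would also need every directory the imported modules themselves depend on, and '/path/to/code' is the thread's placeholder path.

```python
import contextlib
import sys

@contextlib.contextmanager
def narrowed_sys_path(directory):
    # Temporarily shrink sys.path to a single directory so each failed
    # probe (and any on-access virus scan it triggers) touches only one
    # place. The original path is restored afterwards, even on error.
    saved = sys.path[:]
    sys.path[:] = [directory]
    try:
        yield
    finally:
        sys.path[:] = saved

# Demo with the thread's placeholder path (no actual import attempted).
before = list(sys.path)
with narrowed_sys_path('/path/to/code'):
    inside = list(sys.path)
after = list(sys.path)
print(inside)
```

Inside the `with` block, calls like `x = __import__(m)` would probe exactly one directory instead of the whole search path.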
> Are you sure the modules are always found in /path/to/code? What is in
> the current directory [matched by '' in sys.path]?
'' would be wherever I start the benchmark script (~/bin), or wherever the
framework is started from, probably from the web server's home directory,
or from ~/bin when I'm profiling the framework.

Thanks for your responses. I'm still trying to figure this out too. :)

j

Dec 1 '07 #5
