import in threads: crashes & strange exceptions on dual core machines

I get Python crashes and (in better cases) strange Python exceptions when, in most cases, importing and using cookielib lazily on demand in a thread.
It is mainly with cookielib, but I remember the problem with other imports as well (e.g. urllib2).
And very often, in all the cases where I get weird Python exceptions, the problem is around re functions, usually during re.compile calls during import (see some of the exceptions below). But not only there.

Very strange: The errors occur almost only on machines with dual core/multi processors - and very very rarely on very fast single core machines (>3GHz).

I'm using Python2.3.5 on Win with win32ui (build 210) - the cookielib taken from Python 2.5.

I took care that I'm not starting off thread things or main application loop etc. during an import (which would cause a simple & explainable deadlock freeze on the import lock)

With real OS-level crashes I know from user reports (packaged app), that these errors occur very likely early after app start - thus when lazy imports are likely to do real execution.

I researched this bug for some time. I think I can meanwhile exclude (ref-count, mem.leak) problems in win32ui (the only complex extension lib I use) as cause for this. All statistics point towards import problems.

Any ideas?
Are there problems known with the import lock (Python 2.3.5) ?

(I cannot easily change from Python 2.3 and it takes weeks to get significant feedback after random improvements)

-robert

PS:

The basic pattern of usage is:

==================
def f():
    ...
    opener = urlcookie_openers.get(user)
    if not opener:
        import cookielib                    #<----1
        cj = cookielib.CookieJar()          #<----2
        build_opener = urllib2.build_opener
        httpCookieProcessor = urllib2.HTTPCookieProcessor(cj)
        if url2_proxy:
            opener = build_opener(url2_proxy, httpCookieProcessor)
        else:
            opener = build_opener(httpCookieProcessor)
        opener.addheaders   #$pycheck_no
        opener.addheaders = app_addheaders
        urlcookie_openers[user] = opener
    ufile = opener.open(urllib2.Request(url, data, dict(headers)))
    ...
thread.start_new(f, ())
=========================

Symptoms:
__________

Sometimes ufile is None, along with other weird invalid states.

Typical Python exceptions when, in better cases, there is no OS-level crash:

---------

# Attributes randomly missing like:
#<----2

AttributeError: 'module' object has no attribute 'CookieJar'

---------

# weird invalid states during computation like:
#<----1

....
  File "cookielib.pyo", line 184, in ?
  File "sre.pyo", line 179, in compile
  File "sre.pyo", line 228, in _compile
  File "sre_compile.pyo", line 467, in compile
  File "sre_parse.pyo", line 624, in parse
  File "sre_parse.pyo", line 317, in _parse_sub
  File "sre_parse.pyo", line 588, in _parse
  File "sre_parse.pyo", line 92, in closegroup
ValueError: list.remove(x): x not in list
....
('windows', "(5, 1, 2600, 2, 'Service Pack 2')/NP=2")
---------

#<----1
  File "cookielib.pyo", line 116, in ?
  File "sre.pyo", line 179, in compile
  File "sre.pyo", line 228, in _compile
  File "sre_compile.pyo", line 467, in compile
  File "sre_parse.pyo", line 624, in parse
  File "sre_parse.pyo", line 317, in _parse_sub
  File "sre_parse.pyo", line 494, in _parse
  File "sre_parse.pyo", line 140, in __setitem__
IndexError: list assignment index out of range

('windows', "(5, 1, 2600, 2, 'Service Pack 2')/NP=2")

---------

# weird errors in other threads:

# after dlg.DoModal() in main thread

  File "wintools.pyo", line 115, in PreTranslateMessage
TypeError: an integer is required

('windows', "(5, 1, 2600, 2, 'Service Pack 2')/NP=2")

---------

# after win32ui.PumpWaitingMessages(wc.WM_PAINT, wc.WM_MOUSELAST) in main thread

TypeError: argument list must be a tuple
....
Oct 30 '06 #1


It seems clear that the import lock does not include fully-executing
the module contents. To fix this, just import cookielib before the
threads are spawned. Better yet, use your own locks around the
acquisition of the opener instance (this code seems fraughtfully
thread-unsafe--fix that and you solve other problems besides this one).
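
A minimal sketch of both suggestions, assuming a modern Python 3 interpreter where cookielib and urllib2 are spelled http.cookiejar and urllib.request. The names get_opener and _openers_lock are hypothetical, not taken from the original application; the imports are done once at module level, before any thread exists, and a single lock guards the get-or-create step:

```python
import threading
import http.cookiejar as cookielib      # Python 3 name for cookielib (assumption)
import urllib.request as urllib2        # Python 3 name for urllib2 (assumption)

urlcookie_openers = {}                  # user -> opener, shared by all threads
_openers_lock = threading.Lock()        # hypothetical name; guards the dict above


def get_opener(user, proxy_handler=None, addheaders=()):
    """Return the cached opener for user, creating it at most once."""
    with _openers_lock:
        opener = urlcookie_openers.get(user)
        if opener is None:
            handlers = [urllib2.HTTPCookieProcessor(cookielib.CookieJar())]
            if proxy_handler is not None:
                handlers.insert(0, proxy_handler)
            opener = urllib2.build_opener(*handlers)
            if addheaders:
                opener.addheaders = list(addheaders)
            urlcookie_openers[user] = opener
        return opener
```

Worker threads would then call get_opener(user) and do the slow opener.open(...) call outside the lock, so that network I/O is not serialized.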

regards,
-Mike

Oct 31 '06 #2

Klaas wrote:
> It seems clear that the import lock does not include fully-executing
> the module contents. To fix this, just import cookielib before the

What is the exact meaning of "not include fully-executing" with regard to the example "import cookielib"?
Do you really mean the import statement can return without having executed the cookielib module code fully?
(As said, a simple deadlock is not at all my problem)

> threads are spawned. Better yet, use your own locks around the
> acquisition of the opener instance (this code seems fraughtfully
> thread-unsafe--fix that and you solve other problems besides this one).

Thanks. I will probably have to do the costly pre-import of things in the main thread and spread locks around, as I have no other real idea so far.

Yet this costs the smoothness of app startup and corrupts my belief in Python's capability for "lazy execution on demand".
I'd like to get a more fundamental understanding of the real problems than just a general "stay away and lock and lock everything without real understanding".

* I have no real explanation why the import of a module like cookielib is not thread-safe. And in no way can I explain the real OS-level crashes on dual cores/fast CPUs. Python may throw this and that, Python variable states may be wrong, but how can it crash on OS level when no extension libs are (hopefully) responsible?
* The Import Lock should be a very hard lock: As soon as any thread imports something, all other threads are guaranteed to be out of any imports. A deadlock is not the problem here.
* The cookielib module's execution code consists only of definitions and re.compile calls. re.compile should be thread-safe?
* Everything in my code pattern is function-local code except "opener = urlcookie_openers.get(user)" and "urlcookie_openers[user] = opener": simple dictionary accesses, which are atomic as far as I know from experience. I think I have thought enough about what could be non-thread-safe. The only questionable things have to do with rare changes of some globals, but this has nothing to do with the severe problems here and could only cause e.g. a wrong url2_proxy or a double/unnecessary re-creation of an opener, which is uncritical in my app.

I'm still puzzled and suspect there is a major problem in Python, maybe in win32ui or - no idea ... ?
-robert

Oct 31 '06 #3

robert wrote:
> Klaas wrote:
>> It seems clear that the import lock does not include fully-executing
>> the module contents. To fix this, just import cookielib before the
>
> What is the exact meaning of "not include fully-executing" with regard to the example "import cookielib"?
> Do you really mean the import statement can return without having executed the cookielib module code fully?
> (As said, a simple deadlock is not at all my problem)

No, I mean that the import lock seems to not be held while the module
contents are being executed (which would be why you are getting
partially-initialized module in sys.modules). Perhaps it _is_ held,
but released at various points of the import process. Someone more
knowledgeable of python internals will have to answer the question of
what _should_ be occurring.
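
To illustrate what a partially-initialized module in sys.modules looks like, here is a small sketch, assuming Python 3. The module partial_demo is a hypothetical stand-in generated on the fly; it looks up its own sys.modules entry before and after defining CookieJar. This only shows that the entry exists while the module body is still running; it says nothing about whether the import lock is held at that point, and it is not a diagnosis of the crashes:

```python
import importlib
import os
import shutil
import sys
import tempfile

# Hypothetical module source: it inspects its own sys.modules entry
# before and after the class definition has been executed.
SOURCE = '''
import sys
_me = sys.modules[__name__]
seen_before = hasattr(_me, "CookieJar")   # class statement not executed yet
class CookieJar(object):
    pass
seen_after = hasattr(_me, "CookieJar")    # class is now in the namespace
'''


def import_demo_module():
    """Write the demo module to a temp dir and import it from there."""
    tmpdir = tempfile.mkdtemp()
    with open(os.path.join(tmpdir, "partial_demo.py"), "w") as fh:
        fh.write(SOURCE)
    sys.path.insert(0, tmpdir)
    try:
        importlib.invalidate_caches()
        return importlib.import_module("partial_demo")
    finally:
        sys.path.remove(tmpdir)
        shutil.rmtree(tmpdir, ignore_errors=True)


mod = import_demo_module()
print(mod.seen_before, mod.seen_after)    # prints: False True
```

Any code that reaches the sys.modules entry during that window without waiting for the import to finish would see a module object that lacks CookieJar, which is the shape of the AttributeError reported above.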

> Thanks. I will probably have to do the costly pre-import of things in the main thread and spread locks around, as I have no other real idea so far.

Costly?

> Yet this costs the smoothness of app startup and corrupts my belief in Python's capability for "lazy execution on demand".

If you lock your code properly, you can do the import anytime you wish.

> I'd like to get a more fundamental understanding of the real problems than just a general "stay away and lock and lock everything without real understanding".

Of course. But you have so far provided no information to that
regard--not even a stack trace. If you suspect a bug in python, have
you submitted a bug report at sourceforge?

> * I have no real explanation why the import of a module like cookielib is not thread-safe. And in no way can I explain the real OS-level crashes on dual cores/fast CPUs. Python may throw this and that, Python variable states may be wrong, but how can it crash on OS level when no extension libs are (hopefully) responsible?

If you are certain (and not just hopeful) that no extension modules are
involved, this points to a bug in python.

> * The Import Lock should be a very hard lock: As soon as any thread imports something, all other threads are guaranteed to be out of any imports. A deadlock is not the problem here.

What do you mean by "should"? Is this based on your knowledge of
python internals?

> * Everything in my code pattern is function-local code except "opener = urlcookie_openers.get(user)" and "urlcookie_openers[user] = opener": simple dictionary accesses, which are atomic as far as I know from experience. I think I have thought enough about what could be non-thread-safe. The only questionable things have to do with rare changes of some globals,

It is very easy for dictionary accesses to be thread-unsafe, as they
can call into python-level __hash__ and __eq__ code. If this happens,
a context switch is possible. Are you sure this isn't the case?
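
A short sketch of how that can happen, using a hypothetical UserKey class (nothing like it appears in the posted code). When the dictionary key is an instance of a class that defines __hash__ and __eq__ in Python, a plain .get() runs Python-level code, and the interpreter may switch threads while that code is executing:

```python
class UserKey(object):
    """Hypothetical key type with Python-level __hash__ and __eq__."""

    calls = []                      # records which hooks the dict invoked

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        UserKey.calls.append("hash")
        return hash(self.name)

    def __eq__(self, other):
        UserKey.calls.append("eq")
        return isinstance(other, UserKey) and self.name == other.name


openers = {UserKey("alice"): "opener-for-alice"}
UserKey.calls[:] = []               # ignore the calls made while building the dict

# Look up with a different but equal key object.
found = openers.get(UserKey("alice"))
print(found, UserKey.calls)
```

With plain str keys the hashing and comparison stay inside C code, so this particular hazard does not arise there.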

> but this has nothing to do with the severe problems here and could only cause e.g. a wrong url2_proxy or a double/unnecessary re-creation of an opener, which is uncritical in my app.

Your code contains the following pattern, which can cause any number of
application errors, depending on the app:

a = getA()
if a is None:
    <lots of code>
    setA()

If duplicating the creation of an opener isn't a problem, why not just
create one for a user to begin with?
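
The hazard in the getA()/setA() pattern above can be made visible with a small sketch, assuming Python 3. The barrier is an artificial device, added here only to force both threads past the check before either of them stores a value; in real code the same interleaving happens by chance:

```python
import threading

cache = {}
created = []                            # every object that got built
both_checked = threading.Barrier(2)     # artificial: forces the bad interleaving


def get_or_create_unlocked(key):
    value = cache.get(key)              # a = getA()
    both_checked.wait()                 # both threads have now seen None
    if value is None:
        value = object()                # <lots of code>
        created.append(value)
        cache[key] = value              # setA()
    return value


threads = [threading.Thread(target=get_or_create_unlocked, args=("user",))
           for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(created))                     # prints: 2
```

Both threads build their own object and the later store silently replaces the earlier one. Whether that is harmless depends entirely on what the discarded object owned.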

> I'm still puzzled and suspect there is a major problem in Python, maybe in win32ui or - no idea ... ?

Python does a relatively decent job of maintaining thread security for
its most basic operations, but this is no substitute for caring about
thread safety in your own application. It is only true in the most
basic cases that a single line of code corresponds to a single opcode,
and determining that the code is correct is even more difficult than
when using explicit locking. The advantages just aren't worth it:

$ python -m timeit -s "import thread; t=thread.allocate_lock()" "t.acquire(); t.release()"
1000000 loops, best of 3: 1.34 usec per loop

Note that this is actually less expensive than the handful of python
code that dummy_threading executes:

$ python -m timeit -s "import dummy_threading; t = dummy_threading.Lock()" "t.acquire(); t.release()"
100000 loops, best of 3: 2.05 usec per loop
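
The same kind of measurement can be taken from inside a script with the timeit module. This is a rough sketch, assuming Python 3 and threading.Lock rather than the old thread module; the figure includes the overhead of one Python function call per loop, so it will read somewhat higher than the command-line numbers, and it varies by machine:

```python
import threading
import timeit

lock = threading.Lock()


def acquire_release():
    """One uncontended acquire/release pair."""
    lock.acquire()
    lock.release()


N = 100000
# Best of three runs, as the command-line tool reports it.
best = min(timeit.repeat(acquire_release, number=N, repeat=3))
usec_per_loop = best / N * 1e6
print("%.2f usec per acquire/release pair" % usec_per_loop)
```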

Note that this _doesn't_ mean that you should "lock everything without
real understanding", but in my experience there is very little
meaningful python code that the GIL locks adequately.

As for your crashes, those should be investigated. But without really
any hints, I don't see that happening. If you can't reproduce it, it
seems unlikely that anyone else will be able to.

-Mike

Oct 31 '06 #4

Klaas wrote:
> robert wrote:
>> Klaas wrote:
>>> It seems clear that the import lock does not include fully-executing
>>> the module contents. To fix this, just import cookielib before the
>>
>> What is the exact meaning of "not include fully-executing" - regarding the examples "import cookielib" ?
>> Do you really mean the import statement can return without having executed the cookielib module code fully?
>> (As said, a simple deadlock is not at all my problem)
>
> No, I mean that the import lock seems to not be held while the module
> contents are being executed (which would be why you are getting
> partially-initialized module in sys.modules). Perhaps it _is_ held,
> but released at various points of the import process. Someone more
> knowledgeable of python internals will have to answer the question of
> what _should_ be occurring.

... and who better than Tim Peters?

http://mail.python.org/pipermail/pyt...er/254497.html

HTH

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden

Nov 1 '06 #5

Steve Holden wrote:
> Klaas wrote:
>> robert wrote:
>>> Klaas wrote:
>>>> It seems clear that the import lock does not include fully-executing
>>>> the module contents. To fix this, just import cookielib before the
>>>
>>> What is the exact meaning of "not include fully-executing" -
>>> regarding the examples "import cookielib" ?
>>> Do you really mean the import statement can return without having
>>> executed the cookielib module code fully?
>>> (As said, a simple deadlock is not at all my problem)
>>
>> No, I mean that the import lock seems to not be held while the module
>> contents are being executed (which would be why you are getting
>> partially-initialized module in sys.modules). Perhaps it _is_ held,
>> but released at various points of the import process. Someone more
>> knowledgeable of python internals will have to answer the question of
>> what _should_ be occurring.
>
> ... and who better than Tim Peters?
>
> http://mail.python.org/pipermail/pyt...er/254497.html

This also describes a deadlock condition, where (thread-starting) main-loop code is executed during import rather than via a "module.start_loop()" scheme.

Here the situation is reversed: imports are done in threads. That should go well.

The problem seems to be module namespace corruption (partial execution?) and, worse, OS-level crashes.
I'm using zip files on the sys.path.

(I also inspected the Py2.3.5 zipimport scheme for that, but found no indication of flaws so far. The lock is taken very early in PyImport_ImportModuleEx. Only if there were multiple interpreters (Py_NewInterpreter), which I don't use, would the "static void lock_import(void)" possibly be weak.)
I made an isolated thread race test with dozens of patterns like the code in the original post (reloads enforced) executing in parallel and found no problem.
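
A race test of that kind might look roughly like the following sketch. It is not the test that was actually run: it assumes Python 3, and race_demo is a hypothetical generated module that, like cookielib, consists of definitions plus a number of re.compile calls. All threads are released at once by a barrier and each one imports the module and instantiates the class:

```python
import importlib
import os
import shutil
import sys
import tempfile
import threading

# Hypothetical stand-in for cookielib: definitions plus re.compile calls.
SOURCE = '''
import re
PATTERNS = [re.compile(r"([0-9]+)-([a-z]+)-%d" % i) for i in range(200)]
class CookieJar(object):
    pass
'''

N_THREADS = 16


def race_import(module_name="race_demo"):
    """Import module_name from N_THREADS threads at once; return errors."""
    tmpdir = tempfile.mkdtemp()
    with open(os.path.join(tmpdir, module_name + ".py"), "w") as fh:
        fh.write(SOURCE)
    sys.path.insert(0, tmpdir)
    importlib.invalidate_caches()
    start = threading.Barrier(N_THREADS)
    errors = []

    def worker():
        start.wait()                      # release all threads together
        try:
            mod = importlib.import_module(module_name)
            mod.CookieJar()               # fails on a half-built module
        except Exception as exc:
            errors.append(exc)

    threads = [threading.Thread(target=worker) for _ in range(N_THREADS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Clean up so the function can be called again with a fresh import.
    sys.path.remove(tmpdir)
    sys.modules.pop(module_name, None)
    shutil.rmtree(tmpdir, ignore_errors=True)
    return errors


errors = race_import()
print("errors:", errors)
```

A clean run of such a test does not rule out a race; it only fails to reproduce one under these particular conditions.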
I've now spread locks and main-thread (global) imports through my app code, but it takes time until I get significant feedback on whether that succeeded. As I have no explanation that gets to the point .. a low percentage task :-)

win32ui/win32api (the only extension libs) are still on my radar for general ref-count/memleak problems (I found a few in the past), but that leads into the abyss. I thought that the pattern of the crashes (crash early after app start; strange exceptions frequently around cookielib / re.compile) points to something else in Python itself ...
-robert
Nov 1 '06 #6

