import in threads: crashes & strange exceptions on dual core machines

I get Python crashes and (in better cases) strange Python exceptions when (in most cases) importing and using cookielib lazily, on demand, in a thread.
It is mainly with cookielib, but I remember the problem with other imports too (e.g. urllib2).
And very often, in the cases where I get weird Python exceptions, the problem is around re functions, usually re.compile calls during import (see some of the exceptions below). But not only there.

Very strange: the errors occur almost only on dual-core/multi-processor machines, and very rarely on very fast single-core machines (>3GHz).

I'm using Python 2.3.5 on Windows with win32ui (build 210); cookielib is taken from Python 2.5.

I took care not to start threads or the main application loop during an import (which would cause a simple and explainable deadlock on the import lock).

For the real OS-level crashes, I know from user reports (packaged app) that they most likely occur early after app start, which is when lazy imports are likely to do real execution.

I have researched this bug for some time. I think I can by now exclude ref-count or memory-leak problems in win32ui (the only complex extension lib I use) as the cause. All statistics point towards import problems.

Any ideas?
Are there known problems with the import lock (Python 2.3.5)?

(I cannot easily move off Python 2.3, and it takes weeks to get significant feedback after each trial change.)

-robert

PS:

The basic pattern of usage is:

==================
def f():
    ...
    opener = urlcookie_openers.get(user)
    if not opener:
        import cookielib                    #<----1
        cj = cookielib.CookieJar()          #<----2
        build_opener = urllib2.build_opener
        httpCookieProcessor = urllib2.HTTPCookieProcessor(cj)
        if url2_proxy:
            opener = build_opener(url2_proxy, httpCookieProcessor)
        else:
            opener = build_opener(httpCookieProcessor)
        opener.addheaders   #$pycheck_no
        opener.addheaders = app_addheaders
        urlcookie_openers[user] = opener
    ufile = opener.open(urllib2.Request(url, data, dict(headers)))
    ...

thread.start_new(f, ())
=========================

Symptoms:
__________

Sometimes ufile is None, along with other weird invalid states.

Typical Python exceptions in the better cases, when there is no OS-level crash:

---------

# Attributes randomly missing like:
#<----2

"AttributeError: \'module\' object has no attribute \'CookieJar\'\\n"]
---------

# weird invalid states during computation, like:
#<----1

.... File "cookielib.pyo", line 184, in ?\\n\', \' File
"sre.pyo", line 179, in compile\\n\', \' File "sre.pyo", line 228, in _compile\\n\', \' File
"sre_compile.pyo", line 467, in compile\\n\', \' File "sre_parse.pyo", line 624, in parse\\n\', \'
File "sre_parse.pyo", line 317, in _parse_sub\\n\', \' File "sre_parse.pyo", line 588, in
_parse\\n\', \' File "sre_parse.pyo", line 92, in closegroup\\n\', \'ValueError: list.remove(x): x
not in list\\n\']
....
'windows', "(5, 1, 2600, 2, 'Service Pack 2')/NP=2")
---------

#<----1
File "cookielib.pyo", line 116, in ?\\n\', \' File "sre.pyo", line 179, in compile\\n\', \' File "sre.pyo", line 228, in _compile\\n\', \' File "sre_compile.pyo", line 467, in compile\\n\', \' File "sre_parse.pyo", line 624, in parse\\n\', \' File "sre_parse.pyo", line 317, in _parse_sub\\n\', \' File "sre_parse.pyo", line 494, in _parse\\n\', \' File "sre_parse.pyo", line 140, in __setitem__\\n\', \'IndexError: list assignment index out of range\\n\']

('windows', "(5, 1, 2600, 2, 'Service Pack 2')/NP=2"

---------

# weird errors in other threads:

# after dlg.DoModal() in main thread

File "wintools.pyo", line 115, in PreTranslateMessage\\n\', \'TypeError: an integer is required\\n\']

('windows', "(5, 1, 2600, 2, 'Service Pack 2')/NP=2")

---------

# after win32ui.PumpWaitingMessages(wc.WM_PAINT, wc.WM_MOUSELAST) in main thread

\'TypeError: argument list must be a tuple\\n\'
....
Oct 30 '06 #1
It seems clear that the import lock does not include fully-executing
the module contents. To fix this, just import cookielib before the
threads are spawned. Better yet, use your own locks around the
acquisition of the opener instance (this code seems fraughtfully
thread-unsafe--fix that and you solve other problems besides this one).

regards,
-Mike

Oct 31 '06 #2
Klaas wrote:
> It seems clear that the import lock does not include fully-executing
> the module contents. To fix this, just import cookielib before the

What is the exact meaning of "not include fully-executing" with regard to the example "import cookielib"?
Do you really mean the import statement can return without having fully executed the cookielib module code?
(As said, a simple deadlock is not at all my problem.)

> threads are spawned. Better yet, use your own locks around the
> acquisition of the opener instance (this code seems fraughtfully
> thread-unsafe--fix that and you solve other problems besides this one).

Thanks. I will probably have to do the costly pre-import of things in the main thread and spread locks around, as I have no other real idea so far.

Yet this costs the smoothness of app startup and undermines my belief in Python's capability for "lazy execution on demand".
I'd like to get a more fundamental understanding of the real problems than just a general "stay away and lock everything without real understanding".

* I have no real explanation why the import of a module like cookielib is not thread-safe. And in no way can I explain the real OS-level crashes on dual cores/fast CPUs. Python may throw this and that, Python variable states may be wrong, but how can it crash at OS level when no extension libs are (hopefully) responsible?
* The import lock should be a very hard lock: as soon as any thread imports something, all other threads are guaranteed to be out of any imports. A deadlock is not the problem here.
* The cookielib module-level code consists only of definitions and re.compile calls. re.compile should be thread-safe?
* The things in my code pattern are function-local code except "opener = urlcookie_openers.get(user)" and "urlcookie_openers[user] = opener": simple dictionary accesses, which are atomic as far as I know and from experience. I think I have thought enough about what could be non-thread-safe. The only questionable things have to do with rare changes of some globals, but this has nothing to do with the severe problems here and could only cause e.g. a wrong url2_proxy or a double/unnecessary re-creation of an opener, which is uncritical in my app.
I'm still puzzled and suspect there is a major problem in Python, maybe in win32ui or - no idea ... ?
-robert

Oct 31 '06 #3
robert wrote:
> Klaas wrote:
>> It seems clear that the import lock does not include fully-executing
>> the module contents. To fix this, just import cookielib before the
>
> What is the exact meaning of "not include fully-executing" - regarding the examples "import cookielib" ?
> Do you really mean the import statement can return without having executed the cookielib module code fully?
> (As said, a simple deadlock is not at all my problem)

No, I mean that the import lock seems not to be held while the module
contents are being executed (which would be why you are getting a
partially-initialized module in sys.modules). Perhaps it _is_ held,
but released at various points of the import process. Someone more
knowledgeable about python internals will have to answer the question of
what _should_ be occurring.
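
For illustration, a small Python 3 sketch of what "partially-initialized module in sys.modules" means: a module object is registered in sys.modules before its body has finished running, so attributes defined further down are not there yet. The module name demo_partial_module and its contents are made up for this demo; it shows the mechanism only and does not establish that this is what went wrong in the 2.3.5 app.

```python
import importlib
import os
import sys
import tempfile
import textwrap

# Body of a throwaway module (hypothetical name) that inspects itself mid-import.
source = textwrap.dedent('''
    import sys
    me = sys.modules[__name__]           # already registered while the body runs
    early = hasattr(me, "CookieJar")     # the class below is not defined yet
    class CookieJar(object):
        pass
    late = hasattr(me, "CookieJar")      # now it is
''')

tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "demo_partial_module.py"), "w") as fh:
    fh.write(source)

sys.path.insert(0, tmpdir)
importlib.invalidate_caches()
demo = importlib.import_module("demo_partial_module")

print(demo.early, demo.late)   # prints: False True
```

Any code that gets hold of the module object between those two points would see exactly the "'module' object has no attribute 'CookieJar'" symptom.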
> thanks. I will probably have to do the costly pre-import of things in main thread and spread locks as I have also no other real idea so far.

Costly?

> Yet this costs the smoothness of app startup and corrupts my belief in Python's capability for "lazy execution on demand".

If you lock your code properly, you can do the import anytime you wish.

> I'd like to get a more fundamental understanding of the real problems than just a general "stay away and lock and lock everything without real understanding".

Of course. But you have so far provided no information in that
regard--not even a stack trace. If you suspect a bug in python, have
you submitted a bug report at sourceforge?

> * I have no real explanation why the import of a module like cookielib is not thread-safe. And in no way I can really explain the real OS-level crashes on dual cores/fast CPU's. Python may throw this and that, Python variable states may be wrong, but how can it crash on OS-level when no extension libs are (hopefully) responsible?

If you are certain (and not just hopeful) that no extension modules are
involved, this points to a bug in python.

> * The Import Lock should be a very hard lock: As soon as any thread imports something, all other threads are guaranteed to be out of any imports. A dead lock is not the problem here.

What do you mean by "should"? Is this based on your knowledge of
python internals?

> * the things in my code pattern are function local code except "opener = urlcookie_openers.get(user)" and "urlcookie_openers[user] = opener" : Simple dictionary accesses which are atomic from all my knowledge and experience. I think, I have thought about enough, what could be not thread safe. The only questionable things have to do with rare change of some globals,

It is very easy for dictionary accesses to be thread-unsafe, as they
can call into python-level __hash__ and __eq__ code. If this happens,
a context switch is possible. Are you sure this isn't the case?
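
A short sketch of that point, assuming a made-up key class (Key is hypothetical; if `user` in the original code is a plain string, its hashing happens in C and none of this applies). Each dictionary store or lookup with such a key runs Python-level code, and the interpreter may switch threads while that code is running.

```python
class Key:
    """Hypothetical dictionary key with Python-level hashing and equality."""

    def __init__(self, name):
        self.name = name
        self.hash_calls = 0

    def __hash__(self):
        self.hash_calls += 1          # ordinary bytecode runs inside the dict operation
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, Key) and self.name == other.name


key = Key("alice")
table = {}
table[key] = "opener"                 # store: calls key.__hash__()
found = table.get(key)                # lookup: calls key.__hash__() again

print(found, key.hash_calls)          # prints: opener 2
```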
> but this has not at all to do with the severe problems here and could only affect e.g wrong url2_proxy or double/unnecessary re-creation of an opener, which is uncritical in my app.

Your code contains the following pattern, which can cause any number of
application errors, depending on the app:

a = getA()
if a is None:
    <lots of code>
    setA()

If duplicating the creation of an opener isn't a problem, why not just
create one for a user to begin with?
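
The check-then-act pattern above can be made to misbehave deterministically. In this Python 3 sketch the barrier exists only to force the unlucky interleaving (both threads finish the check before either one acts); all names are hypothetical.

```python
import threading

cache = {}
created = []                              # every object that was ever built
both_checked = threading.Barrier(2)       # demo only: forces the bad interleaving


def unsafe_get_or_create(key):
    value = cache.get(key)                # check: both threads see None here
    both_checked.wait()                   # wait until the other thread has checked too
    if value is None:
        value = object()                  # stands in for "<lots of code>"
        created.append(value)
        cache[key] = value                # act: the later writer silently wins
    return value


threads = [threading.Thread(target=unsafe_get_or_create, args=("alice",))
           for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(created))                       # prints: 2
```

Two objects get built for one key, and each thread goes on using its own copy; whether that is harmless depends entirely on what the object is.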
> I'm still puzzled and suspect there is a major problem in Python, maybe in win32ui or - no idea ... ?

Python does a relatively decent job of maintaining thread security for
its most basic operations, but this is no substitute for caring about
thread safety in your own application. It is only true in the most
basic cases that a single line of code corresponds to a single opcode,
and determining that the code is correct is even more difficult than
when using explicit locking. The advantages just aren't worth it:

$ python -m timeit -s "import thread; t=thread.allocate_lock()" "t.acquire(); t.release()"
1000000 loops, best of 3: 1.34 usec per loop

Note that this is actually less expensive than the handful of python
code that dummy_threading does:

$ python -m timeit -s "import dummy_threading; t = dummy_threading.Lock()" "t.acquire(); t.release()"
100000 loops, best of 3: 2.05 usec per loop

Note that this _doesn't_ mean that you should "lock everything without
real understanding", but in my experience there is very little
meaningful python code that the GIL locks adequately.
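
The lock-overhead measurement above can also be repeated from inside a program. This is a rough Python 3 sketch using the standard timeit module (thread.allocate_lock is threading.Lock there); the function name and the loop count are arbitrary choices, and the absolute number will differ from the 2006 figures.

```python
import threading
import timeit

lock = threading.Lock()


def locked_noop():
    # One uncontended acquire/release pair, the same work the command line times.
    with lock:
        pass


loops = 200000
per_call = timeit.timeit(locked_noop, number=loops) / loops
print("%.3f usec per acquire/release" % (per_call * 1e6))
```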

As for your crashes, those should be investigated. But without really
any hints, I don't see that happening. If you can't reproduce it, it
seems unlikely that anyone else will be able to.

-Mike

Oct 31 '06 #4
Klaas wrote:
> robert wrote:
>> Klaas wrote:
>>> It seems clear that the import lock does not include fully-executing
>>> the module contents. To fix this, just import cookielib before the
>>
>> What is the exact meaning of "not include fully-executing" - regarding the examples "import cookielib" ?
>> Do you really mean the import statement can return without having executed the cookielib module code fully?
>> (As said, a simple deadlock is not at all my problem)
>
> No, I mean that the import lock seems to not be held while the module
> contents are being executed (which would be why you are getting
> partially-initialized module in sys.modules). Perhaps it _is_ held,
> but released at various points of the import process. Someone more
> knowledgable of python internals will have to answer the question of
> what _should_ be occurring.

.... and who better than Tim Peters?

http://mail.python.org/pipermail/pyt...er/254497.html

HTH

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden

Nov 1 '06 #5
Steve Holden wrote:
> Klaas wrote:
>> robert wrote:
>>> Klaas wrote:
>>>> It seems clear that the import lock does not include fully-executing
>>>> the module contents. To fix this, just import cookielib before the
>>>
>>> What is the exact meaning of "not include fully-executing" -
>>> regarding the examples "import cookielib" ?
>>> Do you really mean the import statement can return without having
>>> executed the cookielib module code fully?
>>> (As said, a simple deadlock is not at all my problem)
>>
>> No, I mean that the import lock seems to not be held while the module
>> contents are being executed (which would be why you are getting
>> partially-initialized module in sys.modules). Perhaps it _is_ held,
>> but released at various points of the import process. Someone more
>> knowledgable of python internals will have to answer the question of
>> what _should_ be occurring.
>
> ... and who better than Tim Peters?
>
> http://mail.python.org/pipermail/pyt...er/254497.html

This also describes a deadlock condition, which arises when (thread-starting) main-loop code is executed during import rather than via a "module.start_loop()" scheme.

Here the situation is the reverse: imports are done in threads. That should go well.

The problem seems to be module namespace corruption (partial execution?) and, worse, OS-level crashes.
I'm using zip files on the sys.path.

(
I also inspected the Py2.3.5 zipimport scheme for that, but found no indication of flaws so far. The lock is taken very early in PyImport_ImportModuleEx.
Only if there were multiple interpreters (Py_NewInterpreter), which I don't use, would the "static void lock_import(void)" possibly be weak.
)
I made an isolated thread race test with dozens of patterns like the code in the original post (reloads enforced) executing in parallel and found no problem.
I've now spread locks and main-thread (global) imports through my app code, but it takes time until I get significant feedback on whether that helped. As I have no explanation to the point ... a low-percentage task :-)
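
A rough Python 3 sketch of what such an isolated race test might look like. This is not the test from the post: the patterns, thread count and round count are made up, and re.purge() stands in for the enforced reloads by clearing the regex cache so that every round really recompiles.

```python
import re
import threading

# Hypothetical patterns; the real test exercised cookielib's module-level regexes.
PATTERNS = [r"(\d+)-(\w+)", r"^(?P<key>[a-z]+)=(?P<value>.*)$", r"(a|b)*c"]
errors = []


def worker(rounds):
    try:
        for _ in range(rounds):
            re.purge()                    # drop the regex cache to force real compilation
            for pattern in PATTERNS:
                re.compile(pattern)
    except Exception as exc:              # record instead of dying silently in the thread
        errors.append(exc)


threads = [threading.Thread(target=worker, args=(50,)) for _ in range(16)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("errors:", errors)
```

A clean run proves little, as noted above: a race that needs a particular interleaving on a particular machine can survive many such runs.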

win32ui/win32api (the only extension libs) are still on my radar for general ref-count/memleak problems (I found a few in the past), but that way leads into the abyss. I thought the pattern of the crashes (crash early after app start; strange exceptions frequently around cookielib / re.compile) pointed to something else in Python itself ...
-robert
Nov 1 '06 #6
