I get Python crashes and (in better cases) strange Python exceptions when (in most cases) importing and using cookielib lazily, on demand, in a thread.
It is mainly with cookielib, but I remember the problem with other imports too (e.g. urllib2 etc.).
And again, very often in all these cases where I get weird Python exceptions, the problem is around re functions - usually during re.compile calls during import (see some of the exceptions below). But not only.
Very strange: the errors occur almost only on machines with dual-core/multiple processors - and very, very rarely on very fast single-core machines (>3GHz).
I'm using Python 2.3.5 on Windows with win32ui (build 210) - the cookielib is taken from Python 2.5.
I took care that I'm not starting threads or the main application loop etc. during an import (which would cause a simple & explainable deadlock freeze on the import lock).
For the real OS-level crashes, I know from user reports (packaged app) that these errors very likely occur early after app start - that is, when lazy imports are likely to do real execution.
I have researched this bug for some time. I think I can by now exclude ref-count/memory-leak problems in win32ui (the only complex extension lib I use) as the cause. All statistics point towards import problems.
Any ideas?
Are there known problems with the import lock (Python 2.3.5)?
(I cannot easily move away from Python 2.3, and it takes weeks to get significant feedback after random improvements.)
-robert
PS:
The basic pattern of usage is:
==================
def f():
    ...
    opener = urlcookie_openers.get(user)
    if not opener:
        import cookielib                    #<----1
        cj = cookielib.CookieJar()          #<----2
        build_opener = urllib2.build_opener
        httpCookieProcessor = urllib2.HTTPCookieProcessor(cj)
        if url2_proxy:
            opener = build_opener(url2_proxy, httpCookieProcessor)
        else:
            opener = build_opener(httpCookieProcessor)
        opener.addheaders   #$pycheck_no
        opener.addheaders = app_addheaders
        urlcookie_openers[user] = opener
    ufile = opener.open(urllib2.Request(url, data, dict(headers)))
    ...

thread.start_new(f,())
=========================
Symptoms:
__________
Sometimes ufile is None, plus other weird invalid states.
Typical Python exceptions in the better cases, when there is no OS-level crash:
---------
# Attributes randomly missing like:
#<----2
"AttributeError: \'module\' object has no attribute \'CookieJar\'\\n"]
---------
# weird invalid states during computation like:
#<----1
.... File "cookielib.pyo", line 184, in ?\\n\', \' File
"sre.pyo", line 179, in compile\\n\', \' File "sre.pyo", line 228, in _compile\\n\', \' File
"sre_compile.pyo", line 467, in compile\\n\', \' File "sre_parse.pyo", line 624, in parse\\n\', \'
File "sre_parse.pyo", line 317, in _parse_sub\\n\', \' File "sre_parse.pyo", line 588, in
_parse\\n\', \' File "sre_parse.pyo", line 92, in closegroup\\n\', \'ValueError: list.remove(x): x
not in list\\n\']
....
'windows', "(5, 1, 2600, 2, 'Service Pack 2')/NP=2")
---------
#<----1
File "cookielib.pyo", line 116, in ?\\n\', \' File "sre.pyo", line 179, in compile\\n\', \' File "sre.pyo", line 228, in _compile\\n\', \' File "sre_compile.pyo", line 467, in compile\\n\', \' File "sre_parse.pyo", line 624, in parse\\n\', \' File "sre_parse.pyo", line 317, in _parse_sub\\n\', \' File "sre_parse.pyo", line 494, in _parse\\n\', \' File "sre_parse.pyo", line 140, in __setitem__\\n\', \'IndexError: list assignment index out of range\\n\']
('windows', "(5, 1, 2600, 2, 'Service Pack 2')/NP=2"
---------
# weird errors in other threads:
# after dlg.DoModal() in main thread
File "wintools.pyo", line 115, in PreTranslateMessage\\n\', \'TypeError: an integer is required\\n\']
('windows', "(5, 1, 2600, 2, 'Service Pack 2')/NP=2")
---------
# after win32ui.PumpWaitingMessages(wc.WM_PAINT, wc.WM_MOUSELAST) in main thread
\'TypeError: argument list must be a tuple\\n\'
...
It seems clear that the import lock does not include fully-executing
the module contents. To fix this, just import cookielib before the
threads are spawned. Better yet, use your own locks around the
acquisition of the opener instance (this code seems fraughtfully
thread-unsafe--fix that and you solve other problems besides this one).
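
A minimal sketch of the two suggestions above (import eagerly before any thread starts, and guard the opener cache with a lock). It is written for Python 3, where cookielib and urllib2 were renamed to http.cookiejar and urllib.request; the names `get_opener`, `_openers_lock` and `proxy_handler` are made up for illustration and are not from the original code.

```python
import threading
import http.cookiejar      # imported in the main thread, before any worker starts
import urllib.request

# Hypothetical module-level cache mirroring urlcookie_openers in the post.
urlcookie_openers = {}
_openers_lock = threading.Lock()


def get_opener(user, proxy_handler=None):
    """Return the cached opener for `user`, creating it at most once."""
    with _openers_lock:
        opener = urlcookie_openers.get(user)
        if opener is None:
            cj = http.cookiejar.CookieJar()
            handlers = [urllib.request.HTTPCookieProcessor(cj)]
            if proxy_handler is not None:
                handlers.insert(0, proxy_handler)
            opener = urllib.request.build_opener(*handlers)
            urlcookie_openers[user] = opener
        return opener


def demo(n_threads=8):
    """Ask for the same user's opener from several threads at once."""
    results = []
    threads = [
        threading.Thread(target=lambda: results.append(get_opener("alice")))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

The lock covers only the lookup-or-create step; the actual `opener.open(...)` call would happen outside it, so slow network requests do not serialize the threads.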
regards,
-Mike
Klaas wrote:
> It seems clear that the import lock does not include fully-executing
> the module contents. To fix this, just import cookielib before the
What is the exact meaning of "not include fully-executing" - regarding the examples "import cookielib" ?
Do you really mean the import statement can return without having executed the cookielib module code fully?
(As said, a simple deadlock is not at all my problem)
> threads are spawned. Better yet, use your own locks around the
> acquisition of the opener instance (this code seems fraughtfully
> thread-unsafe--fix that and you solve other problems besides this one).
Thanks. I will probably have to do the costly pre-import of things in the main thread and spread locks around, as I have no other real idea so far.
Yet this costs the smoothness of app startup and corrupts my belief in Python's capability for "lazy execution on demand".
I'd like to get a more fundamental understanding of the real problems than just a general "stay away and lock and lock everything without real understanding".
* I have no real explanation why the import of a module like cookielib is not thread-safe. And I can in no way explain the real OS-level crashes on dual-core/fast CPUs. Python may throw this and that, Python variable states may be wrong, but how can it crash at the OS level when (hopefully) no extension libs are responsible?
* The import lock should be a very hard lock: as soon as any thread imports something, all other threads are guaranteed to be out of any imports. A deadlock is not the problem here.
* cookielib's module-level code consists only of definitions and re.compile calls. re.compile should be thread-safe?
* The things in my code pattern are function-local code, except "opener = urlcookie_openers.get(user)" and "urlcookie_openers[user] = opener": simple dictionary accesses, which are atomic to the best of my knowledge and experience. I think I have thought enough about what might not be thread-safe. The only questionable things have to do with rare changes of some globals, but this has nothing to do with the severe problems here and could only cause e.g. a wrong url2_proxy or a double/unnecessary re-creation of an opener, which is uncritical in my app.
I'm still puzzled and suspect there is a major problem in Python, maybe in win32ui or - no idea ...?
-robert
robert wrote:
> Klaas wrote:
>> It seems clear that the import lock does not include fully-executing
>> the module contents. To fix this, just import cookielib before the
> What is the exact meaning of "not include fully-executing" - regarding the examples "import cookielib" ?
> Do you really mean the import statement can return without having executed the cookielib module code fully?
> (As said, a simple deadlock is not at all my problem)
No, I mean that the import lock seems to not be held while the module
contents are being executed (which would be why you are getting
partially-initialized module in sys.modules). Perhaps it _is_ held,
but released at various points of the import process. Someone more
knowledgeable of python internals will have to answer the question of
what _should_ be occurring.
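
To make "partially-initialized module in sys.modules" concrete, here is a small sketch (Python 3, single-threaded) showing that a module object is registered in sys.modules before its body has finished running. The module name `slowmod_demo` and its contents are invented for the demonstration. It says nothing either way about whether the import lock is held during execution; it only shows that a half-built module object is reachable while the body runs.

```python
import importlib
import os
import sys
import tempfile

MODULE_NAME = "slowmod_demo"  # hypothetical module name, for illustration only

# Body of the throwaway module: it looks itself up in sys.modules while
# it is still executing and records which attributes exist at that point.
SOURCE = (
    "import sys\n"
    "me = sys.modules[__name__]\n"
    "SEEN_EARLY = hasattr(me, 'CookieJar')   # class not defined yet\n"
    "class CookieJar(object):\n"
    "    pass\n"
    "SEEN_LATE = hasattr(me, 'CookieJar')    # class defined by now\n"
)


def import_demo_module():
    """Write the demo module to a temp dir, import it, and return it."""
    tmpdir = tempfile.mkdtemp()
    with open(os.path.join(tmpdir, MODULE_NAME + ".py"), "w") as fh:
        fh.write(SOURCE)
    sys.path.insert(0, tmpdir)
    importlib.invalidate_caches()
    try:
        return importlib.import_module(MODULE_NAME)
    finally:
        sys.path.remove(tmpdir)
```

After `mod = import_demo_module()`, `mod.SEEN_EARLY` is False and `mod.SEEN_LATE` is True: code that got hold of the module object between those two points would see a module without `CookieJar`, which matches the shape of the AttributeError in the original report.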
> thanks. I will probably have to do the costly pre-import of things in main thread and spread locks as I have also no other real idea so far.
Costly?
> Yet this costs the smoothness of app startup and corrupts my believe in Python capabs of "lazy execution on demand".
If you lock your code properly, you can do the import anytime you wish
> I'd like to get a more fundamental understanding of the real problems than just a general "stay away and lock and lock everything without real understanding".
Of course. But you have so far provided no information to that
regard--not even a stack trace. If you suspect a bug in python, have
you submitted a bug report at sourceforge?
> * I have no real explanation why the import of a module like cookielib is not thread-safe. And in no way I can really explain the real OS-level crashes on dual cores/fast CPU's. Python may throw this and that, Python variable states maybe wrong, but how can it crash on OS-level when no extension libs are (hopefully) responsible?
If you are certain (and not just hopeful) that no extension modules are
involved, this points to a bug in python.
> * The Import Lock should be a very hard lock: As soon as any thread imports something, all other threads are guaranteed to be out of any imports. A dead lock is not the problem here.
What do you mean by "should"? Is this based on your knowledge of
python internals?
> * the things in my code patter are function local code except "opener = urlcookie_openers.get(user)" and "urlcookie_openers[user] = opener" : Simple dictionary accesses which are atomic from all my knowledge and experience. I think, I have thought about enough, what could be not thread safe. The only questionable things have to do with rare change of some globals,
It is very easy for dictionary accesses to be thread-unsafe, as they
can call into python-level __hash__ and __eq__ code. If this happens,
a context switch is possible. Are you sure this isn't the case?
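
A small illustration of that point, assuming a key type that defines its own `__hash__` and `__eq__` (the class `NoisyKey` is invented here). With plain string keys no Python-level code of this kind runs, so whether it applies depends on what `user` actually is in the application.

```python
class NoisyKey(object):
    """Hypothetical key type whose hashing and comparison run Python code."""

    calls = []  # records every Python-level method the dict invokes

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        NoisyKey.calls.append("__hash__")
        return hash(self.name)

    def __eq__(self, other):
        NoisyKey.calls.append("__eq__")
        return isinstance(other, NoisyKey) and self.name == other.name


def lookup_trace():
    """Return the value found and the methods one dict lookup invoked."""
    table = {NoisyKey("alice"): "opener-for-alice"}
    del NoisyKey.calls[:]               # ignore calls made while building the dict
    value = table[NoisyKey("alice")]    # a different but equal key object
    return value, list(NoisyKey.calls)
```

Here a single `table[key]` expression runs two Python-level methods, and the interpreter is free to switch threads inside either of them.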
> but this has not at all to do with the severe problems here and could only affect e.g wrong url2_proxy or double/unecessary re-creation of an opener, which is uncritical in my app.
Your code contains the following pattern, which can cause any number of
application errors, depending on the app:
a = getA()
if a is None:
    <lots of code>
    setA()
If duplicating the creation of an opener isn't a problem, why not just
create one for a user to begin with?
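
The check-then-set pattern can be made to misbehave on purpose. The sketch below (Python 3; the function names and the use of `threading.Barrier` to hold every thread inside the gap are choices made for this demonstration) shows the unlocked version creating one object per thread, and the same logic under a lock creating exactly one.

```python
import threading


def _run(worker, n_threads):
    """Start n_threads running worker and wait for all of them."""
    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def run_unlocked(n_threads=2):
    """Return how many objects get created without any locking."""
    cache, created = {}, []
    gate = threading.Barrier(n_threads)

    def worker():
        a = cache.get("a")          # a = getA()
        if a is None:
            gate.wait()             # stand-in for <lots of code>
            a = object()
            created.append(a)
            cache["a"] = a          # setA()

    _run(worker, n_threads)
    return len(created)


def run_locked(n_threads=2):
    """Same logic, but the check and the set happen under one lock."""
    cache, created = {}, []
    lock = threading.Lock()

    def worker():
        with lock:
            a = cache.get("a")
            if a is None:
                a = object()
                created.append(a)
                cache["a"] = a

    _run(worker, n_threads)
    return len(created)
```

The barrier only makes the interleaving deterministic; in a real application the same overlap happens by chance, which is why it shows up rarely and mostly on multi-processor machines.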
> I'm still puzzled and suspect there is a major problem in Python, maybe in win32ui or - no idea ... ?
Python does a relatively decent job of maintaining thread security for
its most basic operations, but this is no substitute for caring about
thread safety in your own application. It is only true in the most
basic cases that a single line of code corresponds to a single opcode,
and determining that the code is correct is even more difficult than
when using explicit locking. The advantages just aren't worth it:
$ python -m timeit -s "import thread; t=thread.allocate_lock()" "t.acquire(); t.release()"
1000000 loops, best of 3: 1.34 usec per loop
Note that this is actually less expensive than the handful of python
code that dummy_threading does:
$ python -m timeit -s "import dummy_threading; t = dummy_threading.Lock()" "t.acquire(); t.release()"
100000 loops, best of 3: 2.05 usec per loop
Note that this _doesn't_ mean that you should "lock everything without
real understanding", but in my experience there is very little
meaningful python code that the GIL locks adequately.
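
The remark above that a single line rarely corresponds to a single opcode can be checked with the standard dis module. This is a sketch for Python 3; the helper name `opcode_names` is made up, and the exact instruction names differ between interpreter versions, so none are listed here.

```python
import dis


def opcode_names(statement):
    """Return the bytecode instruction names compiled from one statement."""
    code = compile(statement, "<example>", "exec")
    return [ins.opname for ins in dis.get_instructions(code)]


def show(statements):
    """Print each statement next to the instructions it compiles to."""
    for statement in statements:
        print(statement, "->", opcode_names(statement))


# The two "simple dictionary accesses" discussed in this thread.
STATEMENTS = (
    "opener = urlcookie_openers.get(user)",
    "urlcookie_openers[user] = opener",
)
```

Calling `show(STATEMENTS)` lists several instructions for each line. A thread switch can occur between any two of them, and nothing ties the lookup in the first statement to the store in the second.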
As for your crashes, those should be investigated. But without really
any hints, I don't see that happening. If you can't reproduce it, it
seems unlikely that anyone else will be able to.
-Mike
Klaas wrote:
> robert wrote:
>> Klaas wrote:
>>> It seems clear that the import lock does not include fully-executing the module contents. To fix this, just import cookielib before the
>> What is the exact meaning of "not include fully-executing" - regarding the examples "import cookielib" ? Do you really mean the import statement can return without having executed the cookielib module code fully? (As said, a simple deadlock is not at all my problem)
> No, I mean that the import lock seems to not be held while the module
> contents are being executed (which would be why you are getting
> partially-initialized module in sys.modules). Perhaps it _is_ held,
> but released at various points of the import process. Someone more
> knowledgable of python internals will have to answer the question of
> what _should_ be occurring.
... and who better than Tim Peters? http://mail.python.org/pipermail/pyt...er/254497.html
HTH
regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden
Steve Holden wrote:
> Klaas wrote:
>> robert wrote:
>>> Klaas wrote:
>>>> It seems clear that the import lock does not include fully-executing the module contents. To fix this, just import cookielib before the
>>> What is the exact meaning of "not include fully-executing" - regarding the examples "import cookielib" ? Do you really mean the import statement can return without having executed the cookielib module code fully? (As said, a simple deadlock is not at all my problem)
>> No, I mean that the import lock seems to not be held while the module contents are being executed (which would be why you are getting partially-initialized module in sys.modules). Perhaps it _is_ held, but released at various points of the import process. Someone more knowledgable of python internals will have to answer the question of what _should_ be occurring.
> ... and who better than Tim Peters?
> http://mail.python.org/pipermail/pyt...er/254497.html
This also describes a deadlock condition, when (thread-starting) main-loop code is already executed during import and not via a "module.start_loop()" scheme.
Here the situation is the reverse: imports are done in threads. That should go well.
The problem seems to be module namespace corruption (partial execution?) and, worse, OS-level crashes.
I'm using zip files on the sys.path.
(I also inspected the Py2.3.5 zipimport scheme for that, but found no indication of flaws so far. The lock is taken very early in PyImport_ImportModuleEx. Only if there were multiple interpreters (Py_NewInterpreter) - which I don't use - would the "static void lock_import(void)" possibly be weak.)
I made an isolated thread race test with dozens of patterns like the code in the original post (reloads enforced) executing in parallel and found no problem.
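
For readers who want to try something similar, here is a rough sketch of such a race test. It is not the test described above: it is written for Python 3, uses a throwaway module named `race_demo_mod` (invented here) whose body does a batch of re.compile calls and then sleeps briefly, and forces a fresh import once rather than repeated reloads.

```python
import importlib
import os
import sys
import tempfile
import threading

MODULE_NAME = "race_demo_mod"  # hypothetical module name

# Module body: some re.compile work, a short pause, then a class definition,
# so there is a real window in which the module is only half built.
SOURCE = (
    "import re, time\n"
    "PATTERNS = [re.compile(r'(a|b)+%d' % i) for i in range(50)]\n"
    "time.sleep(0.05)\n"
    "class CookieJar(object):\n"
    "    pass\n"
)


def race_import(n_threads=8):
    """Import the demo module from n_threads at once; return any errors."""
    tmpdir = tempfile.mkdtemp()
    with open(os.path.join(tmpdir, MODULE_NAME + ".py"), "w") as fh:
        fh.write(SOURCE)
    sys.path.insert(0, tmpdir)
    importlib.invalidate_caches()
    sys.modules.pop(MODULE_NAME, None)      # force a fresh import
    errors = []
    gate = threading.Barrier(n_threads)     # release all threads together

    def worker():
        gate.wait()
        try:
            mod = importlib.import_module(MODULE_NAME)
            mod.CookieJar()                 # the attribute that went missing
        except Exception as exc:
            errors.append(repr(exc))

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    sys.path.remove(tmpdir)
    return errors
```

On a current Python 3 interpreter this is expected to return an empty list, since the import system makes other threads wait until a module has finished initializing. A clean run there does not tell you how Python 2.3.5 behaves.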
I've now spread locks and main-thread (global) imports through my app code, but it takes time until I get statistically significant feedback. As I have no explanation that gets to the point ... a low-percentage task :-)
win32ui/win32api (the only extension libs) are still on my radar for general ref-count/memory-leak problems (I found a few in the past), but that leads into the abyss. I thought that the pattern of the crashes (crash early after app start; strange exceptions frequently around cookielib / re.compile) points to something else in Python itself ...
-robert