when does the GIL really block?

I have followed the GIL debate in python for some time. I don't want
to get into the regular debate about if it should be gotten rid of
(though I am curious about the status of that for Python 3)...
personally I think I can do multi-threaded programming well, but I
also see the benefits of a multiprocess approach. I'm not so
egotistical that I don't realize perhaps my mt programming has not
been "right" (though it worked and was debuggable) or more likely that
doing it right I have avoided even trying some things people want mt
programming to do... i.e. to do mt programming right you start to use
queues a lot, inter-thread asynchronous, non-blocking communication,
which is essentially the multi-process approach, using IPC (except
that the threads can see the same memory when, in your special case,
you know that's ok). Given something like a reader-writer lock, this
can have benefits... but again, whatever.
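
For readers unfamiliar with the queue-based style described above, here is a minimal sketch, assuming Python 3 and the standard `queue` and `threading` modules. The `worker` function, the squaring task, and the `None` sentinel are illustrative choices, not anything from the original programs.

```python
import queue
import threading

def worker(inbox, outbox):
    # Pull items until the sentinel None arrives, then stop.
    while True:
        item = inbox.get()
        if item is None:
            break
        outbox.put(item * item)

inbox, outbox = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(inbox, outbox))
t.start()

for n in range(5):
    inbox.put(n)
inbox.put(None)  # sentinel: tell the worker to exit
t.join()

# One worker and FIFO queues, so results come back in submission order.
results = [outbox.get() for _ in range(5)]
print(results)
```

The threads share no state except the two queues, which is why this style ports fairly easily to a multi-process design.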

My question is that given this problem, years ago before I started
writing in python I wrote some short programs in python which could,
in fact, busy both my CPUs. In retrospect I assume I did not have
code in my run function that causes a GIL lock... so I have done this
again.

I start two threads... I use gkrellm to watch my processors (dual
processor machine). If I merely print a number... both CPUs are
getting 90% simultaneous loads. If I increment a counter and print it
too, the same, and if I create a small list and sort it, the same. I
did not expect this... I expected to see one processor pegged at
around 100%, which should sometimes switch to the other processor.
Granted, the same program in C/C++ would peg both processors at
100%... but given that the overhead in the interpreter cannot explain
the extra usage, I assume the code in my thread's run functions is
actually executing non-serially.

I assume this is because what I am doing does not require the GIL to
be locked for a significant part of the time my code is running...
what code could I put in my run function to see the behavior I
expected? What code could I put there to take advantage of the
possibility that really the GIL is not locked enough to cause actual
serialization of the threads... anyone care to explain?
Aug 1 '08 #1
On Jul 31, 7:27 pm, Craig Allen <callen...@gmail.com> wrote:
> I start two threads... I use gkrellm to watch my processors (dual
> processor machine). If I merely print a number... both CPUs are
> getting 90% simultaneous loads. If I increment a counter and print it
> too, the same, and if I create a small list and sort it, the same. I
> did not expect this... I expected to see one processor pegged at
> around 100%, which should sometimes switch to the other processor.
> Granted, the same program in C/C++ would peg both processors at
> 100%... but given that the overhead in the interpreter cannot explain
> the extra usage, I assume the code in my thread's run functions is
> actually executing non-serially.

Try using sys.setcheckinterval(10000) (or even larger), overriding the
default of 100. This will reduce the locking overhead, which might be
why you see both CPUs as busy.
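
For anyone trying this on a current interpreter: sys.setcheckinterval was deprecated in Python 3.2 and removed in 3.9. The closest replacement is the time-based sys.setswitchinterval. A rough sketch of the same experiment, assuming Python 3; the value 0.05 is an arbitrary choice for illustration:

```python
import sys

old = sys.getswitchinterval()   # documented default is 0.005 seconds
sys.setswitchinterval(0.05)     # ask running threads to hold the GIL longer
new_interval = sys.getswitchinterval()
print("switch interval: %.3f s -> %.3f s" % (old, new_interval))

# ... start the threads and watch CPU usage here ...

sys.setswitchinterval(old)      # restore the previous setting
```

A longer interval means fewer forced hand-offs of the GIL between threads, which is the same effect the larger check interval was meant to have.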

> I assume this is because what I am doing does not require the GIL to
> be locked for a significant part of the time my code is running...
> what code could I put in my run function to see the behavior I
> expected? What code could I put there to take advantage of the
> possibility that really the GIL is not locked enough to cause actual
> serialization of the threads... anyone care to explain?

The GIL is locked during *all* access to the python interpreter.
There's nothing pure python code can do to avoid it - only a C
extension that doesn't access python could.
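
To see the other side of this, a call that blocks inside C code can release the GIL while it waits. time.sleep is one such call in CPython. A minimal sketch, assuming Python 3; the 0.3 second delay and the `wait_a_bit` name are only for illustration, and exact timings will vary by machine:

```python
import threading
import time

def wait_a_bit():
    time.sleep(0.3)  # CPython releases the GIL for the duration of the sleep

start = time.perf_counter()
threads = [threading.Thread(target=wait_a_bit) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# If the sleeps were serialized this would take about 0.6 s.
print("two 0.3 s sleeps took %.2f s" % elapsed)
```

The two sleeps overlap, so the total stays close to the length of one sleep rather than the sum of both.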
Aug 1 '08 #2
On Aug 1, 12:06 pm, Rhamphoryncus <rha...@gmail.com> wrote:
> Try using sys.setcheckinterval(10000) (or even larger), overriding the
> default of 100. This will reduce the locking overhead, which might be
> why you see both CPUs as busy.
>
> The GIL is locked during *all* access to the python interpreter.
> There's nothing pure python code can do to avoid it - only a C
> extension that doesn't access python could.

thanks
Aug 2 '08 #3

On Thu, 2008-07-31 at 18:27 -0700, Craig Allen wrote:
> I start two threads... I use gkrellm to watch my processors (dual
> processor machine). If I merely print a number... both CPUs are
> getting 90% simultaneous loads. If I increment a counter and print it
> too, the same, and if I create a small list and sort it, the same. I
> did not expect this... I expected to see one processor pegged at
> around 100%, which should sometimes switch to the other processor.
> Granted, the same program in C/C++ would peg both processors at
> 100%... but given that the overhead in the interpreter cannot explain
> the extra usage, I assume the code in my thread's run functions is
> actually executing non-serially.

It's worth mentioning that the most common place for the python
interpreter to release the GIL is during I/O, which printing a number to
the screen certainly counts as. You might try again with a set of loops
that only increment, and don't print, and you may more obviously see the
GIL in action.
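
Something along these lines is what I have in mind. It is only a sketch, assuming Python 3 on a standard (GIL-enabled) CPython build; the `count` function and the loop size `N` are made up for the example, and the timings will differ from machine to machine:

```python
import threading
import time

N = 2_000_000

def count(n):
    # Pure-Python arithmetic with no I/O: the thread only gives up the GIL
    # at the interpreter's periodic switch points.
    i = 0
    while i < n:
        i += 1
    return i

# Baseline: run the loop twice in the main thread.
t0 = time.perf_counter()
count(N)
count(N)
sequential = time.perf_counter() - t0

# Same work split across two threads.
t0 = time.perf_counter()
threads = [threading.Thread(target=count, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - t0

print("sequential: %.2f s, two threads: %.2f s" % (sequential, threaded))
```

With the GIL in place you would expect the two-thread time to be about the same as the sequential time, or a bit worse, rather than roughly half of it.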
--
John Krukoff <jk******@ltgc.com>
Land Title Guarantee Company

Aug 2 '08 #4
On Aug 1, 2:28 pm, John Krukoff <jkruk...@ltgc.com> wrote:
> It's worth mentioning that the most common place for the python
> interpreter to release the GIL is during I/O, which printing a number to
> the screen certainly counts as. You might try again with a set of loops
> that only increment, and don't print, and you may more obviously see the
> GIL in action.

thanks, good idea, I think I'll try that.
Aug 5 '08 #5
