
when does the GIL really block?

I have followed the GIL debate in Python for some time. I don't want
to get into the usual argument about whether it should be removed
(though I am curious about the status of that for Python 3).
Personally I think I can do multi-threaded programming well, but I
also see the benefits of a multiprocess approach. I'm not so
egotistical that I can't admit my MT programming may not have been
"right" (though it worked and was debuggable), or, more likely, that
doing it right has meant avoiding some of the things people want MT
programming to do. That is, to do MT programming right you end up
using queues a lot for asynchronous, non-blocking, inter-thread
communication, which is essentially the multiprocess approach using
IPC (except that the threads can see the same memory when, in your
special case, you know that's OK; given something like a
reader-writer lock, this can have benefits). But again, whatever.

My question comes from the following observation. Years ago, before I
started writing seriously in Python, I wrote some short Python
programs which could, in fact, keep both of my CPUs busy. In
retrospect I assume the code in my run functions simply did not need
to hold the GIL for long, so I have now tried the experiment again.

I start two threads and use gkrellm to watch my processors (it's a
dual-processor machine). If I merely print a number, both CPUs show
about 90% load simultaneously. If I increment a counter and print it
too, the same; if I create a small list and sort it, the same. I did
not expect this. I expected to see one processor pegged near 100%,
with the load occasionally switching to the other processor. Granted,
the same program in C/C++ would peg both processors at 100%, but
given that interpreter overhead cannot explain the extra usage, I
assume the code in my threads' run functions is actually executing
non-serially.

I assume this is because what I am doing does not require the GIL to
be held for a significant part of the time my code is running. What
code could I put in my run function to see the behavior I expected?
And what code could I put there to take advantage of the possibility
that the GIL really isn't held enough to cause actual serialization
of the threads? Anyone care to explain?
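
Concretely, the kind of test I mean is something like this minimal
sketch (the loop counts and names are illustrative, not the exact
program I ran):

    import threading

    def worker(n_iterations=10**6):
        # Each thread increments a counter, builds and sorts a small
        # list, and prints now and then -- roughly the workload
        # described above.
        counter = 0
        for _ in range(n_iterations):
            counter += 1
            data = [5, 3, 1, 4, 2]
            data.sort()
            if counter % 100000 == 0:
                print(counter)

    threads = [threading.Thread(target=worker) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
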
Aug 1 '08 #1
4 Replies

On Jul 31, 7:27 pm, Craig Allen <callen...@gmail.com> wrote:
> I start two threads and use gkrellm to watch my processors (it's a
> dual-processor machine). If I merely print a number, both CPUs show
> about 90% load simultaneously. [...] I assume the code in my
> threads' run functions is actually executing non-serially.

Try using sys.setcheckinterval(10000) (or even larger), overriding
the default of 100. This will reduce the locking overhead, which
might be why you see both CPUs as busy.
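
For example (sys.setcheckinterval is the Python 2-era knob; Python
3.2 and later replaced it with sys.setswitchinterval, which takes a
time in seconds rather than a bytecode count):

    import sys

    # Only ask the interpreter to consider a thread switch every
    # 10000 bytecode instructions instead of the default 100.
    sys.setcheckinterval(10000)
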

> What code could I put in my run function to see the behavior I
> expected? And what code could I put there to take advantage of the
> possibility that the GIL really isn't held enough to cause actual
> serialization of the threads? Anyone care to explain?

The GIL is held during *all* access to the Python interpreter.
There's nothing pure Python code can do to avoid it - only a C
extension that does its work without touching Python objects can
release the GIL while it runs.
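
For instance, zlib's compression routines are C code that, in
CPython, release the GIL while they run, so two threads doing this
kind of work really can keep two CPUs busy. A rough sketch (the data
size and iteration count are arbitrary):

    import os
    import threading
    import zlib

    # A blob big enough that compression dominates the runtime.
    blob = os.urandom(10 * 1024 * 1024)

    def compress_worker():
        # zlib.compress drops the GIL while deflate() runs in C, so
        # the two threads can execute this loop in parallel.
        for _ in range(20):
            zlib.compress(blob)

    threads = [threading.Thread(target=compress_worker) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
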
Aug 1 '08 #2

On Aug 1, 12:06 pm, Rhamphoryncus <rha...@gmail.com> wrote:
> [...] The GIL is held during *all* access to the Python interpreter.
> There's nothing pure Python code can do to avoid it - only a C
> extension that does its work without touching Python objects can
> release the GIL while it runs.

thanks
Aug 2 '08 #3


On Thu, 2008-07-31 at 18:27 -0700, Craig Allen wrote:
> I start two threads and use gkrellm to watch my processors (it's a
> dual-processor machine). If I merely print a number, both CPUs show
> about 90% load simultaneously. [...] What code could I put in my run
> function to see the behavior I expected?

It's worth mentioning that the most common place for the Python
interpreter to release the GIL is during I/O, and printing a number
to the screen certainly counts as I/O. You might try again with a set
of loops that only increment and don't print; you should then see the
GIL in action more clearly.
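
A minimal sketch of such a test (the loop count is arbitrary):

    import threading

    def spin(n=10**7):
        # Pure bytecode work with no I/O: the GIL is held for
        # essentially the whole loop, so two of these threads should
        # share roughly one core's worth of CPU between them.
        count = 0
        while count < n:
            count += 1

    threads = [threading.Thread(target=spin) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
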
--
John Krukoff <jk******@ltgc.com>
Land Title Guarantee Company

Aug 2 '08 #4

On Aug 1, 2:28 pm, John Krukoff <jkruk...@ltgc.com> wrote:
> [...] You might try again with a set of loops that only increment
> and don't print; you should then see the GIL in action more clearly.

thanks, good idea, I think I'll try that.
Aug 5 '08 #5
