Threads, GIL and re.match() performance

Hi all

I understand that the C implementation of Python uses a global interpreter
lock to avoid problems, so doing CPU bound tasks in multiple threads
will not result in better performance on multi-CPU systems.

However, I assumed that calls to (thread safe) C Library functions
release the global interpreter lock.

Today I checked the performance of some slow re.match() calls and found
that they do not run in parallel on a multi-CPU system.

1) Is there a reason for this?
2) Is the regex library not thread-safe?
3) Is it possible to release the GIL in re.match() to
get more performance?

I'm using Python 2.5
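
A minimal sketch of the kind of measurement described above. It is written for a current CPython 3 rather than the 2.5 mentioned in the post, and the pattern, subject string, and function names are hypothetical stand-ins for "some slow re.match() calls", not the poster's actual code.

```python
import re
import threading
import time

# Hypothetical stand-in for a "slow" pattern: nested quantifiers backtrack
# exponentially when the subject almost matches but fails on the last character.
SLOW_PATTERN = re.compile(r"(a+)+$")
SUBJECT = "a" * 18 + "b"


def slow_match():
    """Run one expensive re.match() call and report whether it failed to match."""
    return SLOW_PATTERN.match(SUBJECT) is None


def run_sequential(n):
    """Call slow_match() n times in the current thread; return (results, seconds)."""
    start = time.perf_counter()
    results = [slow_match() for _ in range(n)]
    return results, time.perf_counter() - start


def run_threaded(n):
    """Call slow_match() once in each of n threads; return (results, seconds)."""
    results = []
    threads = [
        threading.Thread(target=lambda: results.append(slow_match()))
        for _ in range(n)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


if __name__ == "__main__":
    _, seq = run_sequential(4)
    _, thr = run_threaded(4)
    print("sequential: %.2fs  threaded: %.2fs" % (seq, thr))
```

If the two timings come out roughly equal on a multi-core machine, the matches ran one at a time, which is the behaviour reported here.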

Thanks for your help

Mirko
--
"I've found that people who are great at something are not so much
convinced of their own greatness as mystified at why everyone else seems
so incompetent."
Paul Graham in "Great Hackers"
Jun 27 '08 #1
In article <sl********************************@dziadzka.de>,
Mirko Dziadzka <mi************@gmail.com> wrote:
> I understand that the C implementation of Python uses a global interpreter
> lock to avoid problems, so doing CPU bound tasks in multiple threads
> will not result in better performance on multi-CPU systems.
>
> However, I assumed that calls to (thread safe) C Library functions
> release the global interpreter lock.

Generally speaking that only applies to I/O calls.

> Today I checked the performance of some slow re.match() calls and found
> that they do not run in parallel on a multi-CPU system.
>
> 1) Is there a reason for this?
> 2) Is the regex library not thread-safe?
> 3) Is it possible to release the GIL in re.match() to
> get more performance?

Theoretically possible, but the usual rule applies: patches welcome.
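
To illustrate the point about I/O calls, here is a small sketch assuming a current CPython 3. It uses time.sleep() as a stand-in for a blocking I/O call, since CPython releases the GIL while a thread is waiting in it; the function names are made up for the example.

```python
import threading
import time


def blocking_call(seconds):
    # time.sleep() stands in for blocking I/O: the thread waits outside the
    # interpreter, so the GIL is released and other threads can run.
    time.sleep(seconds)


def timed_in_threads(n, seconds):
    """Run n blocking calls in n threads and return the total wall-clock time."""
    threads = [
        threading.Thread(target=blocking_call, args=(seconds,))
        for _ in range(n)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = timed_in_threads(4, 0.2)
    # Four 0.2 s waits would take about 0.8 s back to back.
    print("4 threads, 0.2s each: %.2fs total" % elapsed)
```

The total should be close to a single wait rather than the sum of all four, which is the opposite of what the CPU-bound re.match() calls show.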
--
Aahz (aa**@pythoncraft.com) <* http://www.pythoncraft.com/

"as long as we like the same operating system, things are cool." --piranha
Jun 27 '08 #2
On Jun 25, 9:05 am, Mirko Dziadzka <mirko.dziad...@gmail.com> wrote:
> 1) Is there a reason for this?

I think it is because the Python re library uses the Python C-API,
which is not thread-safe.

> 2) Is the regex library not thread-safe?
> 3) Is it possible to release the GIL in re.match() to
> get more performance?
Jun 27 '08 #3
Hi,

The C-API uses reference counts as well, so it is not thread-safe.
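
A rough way to see the reference counts in question from Python, assuming CPython (sys.getrefcount() is CPython-specific, and refcount_delta is a name invented for this sketch). Every new reference to an object updates a counter stored on the object itself, and those updates are what the GIL protects.

```python
import sys


def refcount_delta(extra_refs):
    """Return how much an object's reference count grows after creating
    extra_refs additional references to it."""
    obj = object()
    before = sys.getrefcount(obj)
    holders = [obj for _ in range(extra_refs)]  # each list slot is one more reference
    after = sys.getrefcount(obj)
    return after - before


if __name__ == "__main__":
    print("count grew by", refcount_delta(3))
```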

Matthieu

2008/6/26 Pau Freixes <pf******@gmail.com>:
> But the Python C-API [1] is the main basis for extending Python with C/C++,
> and it is not thread-safe? I don't understand.
>
> [1] http://docs.python.org/api/api.html
>
> On Thu, Jun 26, 2008 at 4:49 AM, Benjamin <mu**************@gmail.com>
> wrote:
>> On Jun 25, 9:05 am, Mirko Dziadzka <mirko.dziad...@gmail.com> wrote:
>>> 1) Is there a reason for this?
>>
>> I think it is because the Python re library uses the Python C-API,
>> which is not thread-safe.
>>
>>> 2) Is the regex library not thread-safe?
>>> 3) Is it possible to release the GIL in re.match() to
>>> get more performance?
>
> --
> Pau Freixes
> Linux GNU/User


--
French PhD student
Website : http://matthieu-brucher.developpez.com/
Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn : http://www.linkedin.com/in/matthieubrucher
Jun 27 '08 #4
> However, I assumed that calls to (thread safe) C Library functions
> release the global interpreter lock.

This is mainly applicable to external C libraries. The interface to
them may not be thread-safe; anything that uses the Python API to
create/manage Python objects will require use of the GIL. So the
actual regex search may release the GIL, but the storing of results
(and possibly intermediate results) would not.
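
As a sketch of that split between C-level work and result handling, assuming a current CPython 3: hashlib is documented to release the GIL while hashing large buffers, whereas building the returned string is ordinary object creation that needs the GIL. The helper names and buffer size are arbitrary choices for the example, and re itself is not claimed to behave this way.

```python
import hashlib
import threading
import time


def digest(data):
    # The hashing runs in C; CPython's hashlib releases the GIL while it
    # processes large buffers. Building the returned str needs the GIL again.
    return hashlib.sha256(data).hexdigest()


def hash_in_threads(data, n):
    """Hash the same buffer in n threads; return (digests, seconds)."""
    digests = [None] * n

    def worker(i):
        digests[i] = digest(data)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return digests, time.perf_counter() - start


if __name__ == "__main__":
    payload = b"x" * (16 * 1024 * 1024)  # arbitrary 16 MiB buffer
    start = time.perf_counter()
    sequential = [digest(payload) for _ in range(4)]
    seq_time = time.perf_counter() - start
    threaded, thr_time = hash_in_threads(payload, 4)
    print("sequential: %.2fs  threaded: %.2fs" % (seq_time, thr_time))
```

On a multi-core machine the threaded run may finish noticeably faster than the sequential one, in contrast to the re.match() measurement from the original question.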

Jun 27 '08 #5
