high performance/threaded applications in Python - your experiences?

All,

In studying Python, I have predictably run across quite a bit of talk about the GIL and threading in Python. In my day job, I work with a (mostly Java) application that is heavily threaded. As such, our application takes good advantage of multiple processors, and we can often scale simply by adding processing power to a server.

I was hoping for some experiences that some of you on the list may have had in dealing with Python in a high performance and/or threaded environment. In essence, I'm wondering how big of a deal the GIL can be in a real-world scenario where you need to take advantage of multiple processor machines, thread pools, etc. How much does it get in the way (or not), and how difficult have you found it to architect applications for high performance? I have read a number of articles and opinions on whether or not the GIL is a good thing, and how it affects threaded performance on multiple processor machines, but what I haven't seen is experiences of people who have actually done it and reported back "it was a nightmare" or "it's no big deal" ;)

Your thoughts and opinions are welcome, especially those with relevant experiences. Thanks!

-Jay

Jun 23 '07 #1
Jay Loden wrote:
> I was hoping for some experiences that some of you on the list may have had in dealing with Python in a high performance and/or threaded environment. In essence, I'm wondering how big of a deal the GIL can be in a real-world scenario where you need to take advantage of multiple processor machines, thread pools, etc. How much does it get in the way (or not), and how difficult have you found it to architect applications for high performance? I have read a number of articles and opinions on whether or not the GIL is a good thing, and how it affects threaded performance on multiple processor machines, but what I haven't seen is experiences of people who have actually done it and reported back "it was a nightmare" or "it's no big deal" ;)

The theory: If your threads mostly do IO, you can get decent CPU usage
even with Python. If the threads are CPU-bound (e.g. you do a lot of
computational work), you'll effectively only make use of one processor.
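
A minimal sketch of the theory above, assuming a standard CPython build with the GIL. Here time.sleep stands in for blocking I/O and a pure-Python loop stands in for computational work; the function names and workload sizes are illustrative and not from the thread, and timings will vary by machine.

```python
import threading
import time


def io_task(delay):
    # Stand-in for blocking I/O: time.sleep releases the GIL while waiting.
    time.sleep(delay)


def cpu_task(n):
    # Pure-Python arithmetic: the thread needs the GIL for every step.
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_in_threads(func, arg, num_threads):
    """Run func(arg) in num_threads threads; return elapsed wall-clock seconds."""
    threads = [threading.Thread(target=func, args=(arg,)) for _ in range(num_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


def run_sequentially(func, arg, repeats):
    """Call func(arg) repeats times in a row; return elapsed wall-clock seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        func(arg)
    return time.perf_counter() - start


if __name__ == "__main__":
    print("I/O-bound, 4 threads:  %.2fs" % run_in_threads(io_task, 0.2, 4))
    print("I/O-bound, sequential: %.2fs" % run_sequentially(io_task, 0.2, 4))
    print("CPU-bound, 4 threads:  %.2fs" % run_in_threads(cpu_task, 1000000, 4))
    print("CPU-bound, sequential: %.2fs" % run_sequentially(cpu_task, 1000000, 4))
```

If the theory holds, the threaded I/O-bound run should finish in roughly the time of a single sleep, while the threaded CPU-bound run should take about as long as the sequential one.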

In practice, I've noticed that Python applications don't scale very much
across CPUs even if they're doing mostly IO. I blame cache thrashing or
a similar effect caused by too many global synchronization events. I
didn't measure, but the speedup may even be negative with a large-ish
number of CPUs (>=4).

OTOH, if you can get by with using forking instead of threads (given
enough effort) you can achieve very good scaling.
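
A rough sketch of the forking approach, assuming a POSIX system where os.fork is available. Each child process has its own interpreter state and its own GIL, so CPU-bound work can run on several processors at once. The helper name fork_map and the workload function are hypothetical, and the sketch assumes small integer results that fit comfortably in a pipe buffer.

```python
import os


def cpu_task(n):
    # Pure-Python arithmetic used as a stand-in for CPU-bound work.
    total = 0
    for i in range(n):
        total += i * i
    return total


def fork_map(func, args):
    """Run func(arg) for each arg in its own forked child (POSIX only)."""
    children = []
    for arg in args:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: compute, send the result through the pipe, and exit
            # immediately without running the parent's cleanup handlers.
            os.close(read_fd)
            status = 0
            try:
                with os.fdopen(write_fd, "w") as out:
                    out.write(repr(func(arg)))
            except BaseException:
                status = 1
            finally:
                os._exit(status)
        # Parent: keep the read end and go on to start the next child.
        os.close(write_fd)
        children.append((pid, read_fd))

    results = []
    for pid, read_fd in children:
        # Read each child's result in submission order, then reap the child.
        with os.fdopen(read_fd) as inp:
            results.append(int(inp.read()))
        os.waitpid(pid, 0)
    return results


if __name__ == "__main__":
    print(fork_map(cpu_task, [100000, 200000, 300000]))
```

The "given enough effort" caveat shows up even here: results have to be serialized and passed back explicitly, because the children do not share memory with the parent.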

Jun 23 '07 #2
Ivan Voras wrote:
> The theory: If your threads mostly do IO, you can get decent CPU usage
> even with Python. If the threads are CPU-bound (e.g. you do a lot of
> computational work), you'll effectively only make use of one processor.
>
> In practice, I've noticed that Python applications don't scale very much
> across CPUs even if they're doing mostly IO. I blame cache thrashing or
> a similar effect caused by too many global synchronization events. I
> didn't measure, but the speedup may even be negative with a large-ish
> number of CPUs (>=4).
>
> OTOH, if you can get by with using forking instead of threads (given
> enough effort) you can achieve very good scaling.

Also, see the 'processing' package in the Python Cheese Shop (PyPI). It allows
you to use processes rather than threads with most of the same
abstractions. I hear it recently acquired the ability to pass file
handles between processes on the same machine :)
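
For readers on later Python versions: the 'processing' package entered the standard library as multiprocessing in Python 2.6. The sketch below uses that standard-library module rather than the original package, and its Process and Queue classes mirror the threading API. The function names are illustrative, and the "fork" start method is an assumption that limits the example to POSIX systems.

```python
import multiprocessing


def sum_of_squares(index, n, results):
    # Runs in a separate process with its own interpreter and its own GIL.
    results.put((index, sum(i * i for i in range(n))))


def run_in_processes(inputs):
    """Start one process per input and return the results in input order."""
    # Assumption: the "fork" start method (POSIX only) is requested explicitly.
    ctx = multiprocessing.get_context("fork")
    results = ctx.Queue()
    workers = [
        ctx.Process(target=sum_of_squares, args=(index, n, results))
        for index, n in enumerate(inputs)
    ]
    for w in workers:
        w.start()
    # Drain the queue before joining so no worker blocks while writing.
    collected = dict(results.get() for _ in workers)
    for w in workers:
        w.join()
    return [collected[index] for index in range(len(inputs))]


if __name__ == "__main__":
    print(run_in_processes([10, 100, 1000]))
```

Compared with a threading.Thread version, the structure is nearly identical; the main difference is that results travel through a queue instead of shared memory.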

- Josiah
Jun 24 '07 #3
