What's the cost of using hundreds of threads?

Hello,

I have written some code that creates many threads for each connection
('main connection'). The purpose of this code is to balance the load
between several connections ('pipes'). The number of spawned threads
depends on how many pipes I create (= 2*n+2, where n is the number of
pipes).

For good results I'll presumably share the main connection's load between
10 pipes, so 22 threads will be spawned per main connection. If about 50
connections are forwarded, the number of threads rises to over a thousand
(or several thousand if even more connections are established).

My questions are:
- What is the cost (in memory / CPU usage) of creating that many
threads?
- Is there any upper bound on the number of threads? (Is it imposed by
Python or by the OS?)
- Is this a sign of clumsy programming, i.e. is creating so many
threads a bad habit? (I must say that it simplified the solution to my
problem very much.)

Limiting the number of threads is possible, but would affect the
independence of the data flows. (OK, I admit that a trickier algorithm
could perhaps guarantee concurrency without spawning so many threads,
but this is the simplest solution to the problem. :) )
Jul 18 '05 #1
PR,
I notice there's a resource module with a
getrusage(who) that looks like it would support
a test to get what you need.
wes
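
A minimal sketch of the kind of test suggested above, assuming a Unix-like system (the resource module is not available on Windows). It starts a batch of idle threads and compares the process's peak resident set size before and after. The names peak_rss and measure_thread_cost are made up for illustration; ru_maxrss is reported in kilobytes on Linux and in bytes on macOS, so treat the number as a rough indication only.

```python
import resource
import threading


def peak_rss():
    """Peak resident set size of this process (kilobytes on Linux, bytes on macOS)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def measure_thread_cost(n_threads):
    """Start n_threads idle threads and return how much the peak RSS grew."""
    stop = threading.Event()
    before = peak_rss()
    # Each thread simply blocks on the event, like a thread waiting on I/O.
    threads = [threading.Thread(target=stop.wait) for _ in range(n_threads)]
    for t in threads:
        t.start()
    after = peak_rss()
    stop.set()  # release all threads
    for t in threads:
        t.join()
    return after - before


if __name__ == "__main__":
    n = 200
    print(f"{n} idle threads grew peak RSS by about {measure_thread_cost(n)} units")
```

Resident memory typically grows much less than the stack size reserved per thread, because stack pages are normally only backed by real memory once they are touched; the reserved address space is a separate limit.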

Jul 18 '05 #2
Przemysław Różycki wrote:
> - Is there any 'upper boundary' that limits the number of threads? (is
> it python / OS related)
> - Is that the sign of 'clumsy programming' - i.e. Is creating so many
> threads a bad habit? (I must say that it simplified the solution of my
> problem very much).


I've read somewhere (I can't recall where, though; was it MSDN?) that
Windows is not well suited to running more than 32 threads per process.
Most of the code I've seen doesn't spawn more than half that many.

--
Jarek Zgoda
http://jpa.berlios.de/ | http://www.zgodowie.org/
Jul 18 '05 #3
Jarek Zgoda wrote:
> Przemysław Różycki wrote:
>> - Is there any 'upper boundary' that limits the number of threads? (is
>> it python / OS related)
>> - Is that the sign of 'clumsy programming' - i.e. Is creating so many
>> threads a bad habit? (I must say that it simplified the solution of my
>> problem very much).
>
> I've read somewhere (I can't recall where, though, was it MSDN?) that
> Windows is not well suited to run more than 32 threads per process. Most
> of the code I saw doesn't spawn more threads than a half of this.

This is apocryphal. Do you have any hard evidence for this assertion?

Apache, for example, can easily spawn more threads under Windows, and
I've written code that uses 200 threads with excellent performance.
Things seem to slow down around the 2,000 mark for some reason I'm not
familiar with.

regards
Steve
--
Meet the Python developers and your c.l.py favorites March 23-25
Come to PyCon DC 2005 http://www.pycon.org/
Steve Holden http://www.holdenweb.com/
Jul 18 '05 #4
In article <d0**********@julia.coi.pw.edu.pl>,
Przemysław Różycki
<P.*******@elka.pw.edu.pl> wrote:

> I have written some code, which creates many threads for each connection
> ('main connection'). The purpose of this code is to balance the load
> between several connections ('pipes'). The number of spawned threads
> depends on how many pipes I create (= 2*n+2, where n is the number of
> pipes).
>
> For good results I'll presumably share main connection's load between 10
> pipes - therefore 22 threads will be spawned. Now if about 50
> connections are forwarded the number of threads rises to thousand of
> threads (or several thousands if even more connections are established).

I'm a bit confused by your math. Fifty connections should be 102
threads, which is quite reasonable.

> My questions are:
> - What is the cost (in memory / CPU usage) of creating such amounts of
> threads?
> - Is there any 'upper boundary' that limits the number of threads? (is
> it python / OS related)
> - Is that the sign of 'clumsy programming' - i.e. Is creating so many
> threads a bad habit? (I must say that it simplified the solution of my
> problem very much).
>
> Limiting the number of threads is possible, but would affect the
> independence of data flows. (ok I admit - creating tricky algorithm
> could perhaps guarantee concurrency without spawning so many threads -
> but it's the simplest solution to this problem :) ).


My experience with lots of threads dates back to Python 1.5.2, but I
rarely saw much improvement with more than a hundred threads, even for
heavily I/O-bound applications on a multi-CPU system. However, if your
focus is algorithmic complexity, you should be able to handle a couple of
thousand threads easily enough.
--
Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/

"The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code --
not in reams of trivial code that bores the reader to death." --GvR
Jul 18 '05 #5
In article <4R5Vd.37291$%U2.33444@lakeread01>,
Steve Holden <st***@holdenweb.com> wrote:
Jul 18 '05 #6
> I'm a bit confused by your math. Fifty connections should be 102
> threads, which is quite reasonable.

My formula applies to one forwarded ('load-balanced') connection. Every
such connection creates a further n connections (pipes) which share the
load. Every pipe requires two threads, and every 'main connection' spawns
two more threads of its own, so my formula, 2*pipes+2, gives the number
of threads spawned per 'main connection'.

Now if conn_count connections are established, the thread count
equals:
conn_count * threads_per_main_connection = conn_count * (2*pipes+2)

For 50 connections and about 10 pipes that gives 1100 threads.
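
The arithmetic above, written out as a small sketch. The function names are chosen here for illustration and do not come from the poster's program.

```python
def threads_per_main_connection(pipes):
    """Two threads per pipe plus two for the main connection itself."""
    return 2 * pipes + 2


def total_threads(conn_count, pipes):
    """Thread count when conn_count main connections are forwarded."""
    return conn_count * threads_per_main_connection(pipes)


print(threads_per_main_connection(10))  # 22
print(total_threads(50, 10))            # 1100
```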

> My experience with lots of threads dates back to Python 1.5.2, but I
> rarely saw much improvement with more than a hundred threads, even for
> heavily I/O-bound applications on a multi-CPU system. However, if your
> focus is algorithmic complexity, you should be able to handle a couple of
> thousand threads easily enough.


I don't spawn them for computational reasons, but because it makes my
code much simpler. I use built-in TCP features to achieve load
balancing: every flow (directed through a pipe) has its own dedicated
threads, one for download and one for upload. For every 'main connection'
these threads share a send buffer and a receive buffer. If any of the
pipes is congested, the corresponding threads block in their send / recv
calls without affecting the independence of the other data flows.
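
To illustrate the thread-per-direction idea, here is a rough sketch, not the poster's actual code: one thread copies bytes in each direction and simply blocks in recv / sendall when its peer is slow. The helpers pump and start_pipe are hypothetical names, the shared send and receive buffers of the real program are left out, and socket.socketpair stands in for real TCP connections.

```python
import socket
import threading


def pump(src, dst, bufsize=4096):
    """Copy bytes from src to dst until src reaches EOF; blocks in recv/sendall."""
    while True:
        data = src.recv(bufsize)
        if not data:
            break
        dst.sendall(data)  # blocks if the receiving side is congested
    try:
        dst.shutdown(socket.SHUT_WR)  # pass the EOF on to the other side
    except OSError:
        pass  # the peer may already be closed


def start_pipe(a, b):
    """Start one thread per direction between sockets a and b."""
    threads = [
        threading.Thread(target=pump, args=(a, b), daemon=True),
        threading.Thread(target=pump, args=(b, a), daemon=True),
    ]
    for t in threads:
        t.start()
    return threads


if __name__ == "__main__":
    client, relay_in = socket.socketpair()
    relay_out, server = socket.socketpair()
    start_pipe(relay_in, relay_out)
    client.sendall(b"ping")
    print(server.recv(4096))
    server.sendall(b"pong")
    print(client.recv(4096))
```

Because each direction has its own thread, a stalled sendall on one pipe delays only that pipe's thread, which is the independence property described above.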

Using threads gives me VERY simple code. Achieving this with poll /
select would be much more difficult. And to guarantee concurrency and
maximal throughput for all of the pipes I would probably have to mirror
code from the Linux TCP stack (I mean window shifting, data
acknowledgement, retransmission queues). Or perhaps I exaggerate.
Jul 18 '05 #7
Thanks for your comments on WinXP's thread implementation. You have
confirmed my conviction that I shouldn't use Windows.
Personally I use Linux with a 2.6.10 kernel, so hopefully I don't have to
share your grief. ;)
Jul 18 '05 #8
Steve Holden wrote:
> Apache, for example, can easily spawn more threads under Windows, and
> I've written code that uses 200 threads with excellent performance.
> Things seem to slow down around the 2,000 mark for some reason I'm not
> familiar with.


As far as I know, the default Windows thread stack size is 2 MB. Do the math :)

On NT4, beyond a couple of hundred threads a *heck* of a lot of time ends up
being spent in the kernel doing context switches (and you can kiss even vaguely
deterministic response times good-bye).

Using a more recent version of Windows improves matters significantly.
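
A rough sketch of "the math", taking the 2 MB figure from the post as an assumption (as far as I know, Microsoft's documentation gives 1 MB as the default stack reservation, so check the value for your platform). It only computes the address space reserved for stacks and shows that Python lets you request a smaller stack through threading.stack_size; reserved_stack_mb and run_with_stack_size are illustrative names.

```python
import threading

MB = 1024 * 1024


def reserved_stack_mb(n_threads, stack_bytes):
    """Address space reserved for thread stacks alone, in megabytes."""
    return n_threads * stack_bytes / MB


def run_with_stack_size(n_threads, stack_bytes):
    """Start n_threads idle threads with the given stack size; return how many started."""
    previous = threading.stack_size(stack_bytes)  # returns the old setting
    stop = threading.Event()
    try:
        threads = [threading.Thread(target=stop.wait) for _ in range(n_threads)]
        for t in threads:
            t.start()
        started = len(threads)
    finally:
        stop.set()                       # release the idle threads
        threading.stack_size(previous)   # restore the earlier setting
    for t in threads:
        t.join()
    return started


print(reserved_stack_mb(2000, 2 * MB))  # 4000.0
print(reserved_stack_mb(1100, 2 * MB))  # 2200.0
print(run_with_stack_size(50, 256 * 1024))
```

With a 2 MB stack, 2,000 threads would reserve about 4,000 MB, more than the 2 GB of user address space a 32-bit Windows process gets by default, which is one plausible reason for trouble near that mark.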

Cheers,
Nick.

--
Nick Coghlan | nc******@email.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.skystorm.net
Jul 18 '05 #9
In article <d0**********@julia.coi.pw.edu.pl>,
Przemysław Różycki <P.*******@elka.pw.edu.pl> wrote:
> Thanks for your comments on winXP threads implementation. You confirmed
> me in conviction that I shouldn't use windows.
> Personally I use linux with 2.6.10 kernel, so hopefully I don't have to
> share your grief. ;)


? !? I'm confused, and apparently I'm confusing others.
The one message I posted in this thread--largely reinforced
by others--emphasizes only that WinXP is far *better* than
earlier Win* flavors in its thread management. While I not
only agree that Windows has disadvantages, but have stopped
buying it for our company, my reasons have absolutely nothing
to do with the details of implementation of WinXP.
Jul 18 '05 #10
> In article <d0**********@julia.coi.pw.edu.pl>,
> Przemysław Różycki <P.*******@elka.pw.edu.pl> wrote:
>> Thanks for your comments on winXP threads implementation. You confirmed
>> me in conviction that I shouldn't use windows.
>> Personally I use linux with 2.6.10 kernel, so hopefully I don't have to
>> share your grief. ;)
>
> ? !? I'm confused, and apparently I'm confusing others.
> The one message I posted in this thread--largely reinforced
> by others--emphasizes only that WinXP is far *better* than
> earlier Win* flavors in its thread management. While I not
> only agree that Windows has disadvantages, but have stopped
> buying it for our company, my reasons have absolutely nothing
> to do with the details of implementation of WinXP.

:) OK, perhaps my answer wasn't that precise. I wrote my post only to
say that your discussion of Windows' threading performance doesn't
concern me, because my program is written for a Linux environment. And
yes, I agree that my comment could sound a bit enigmatic.
Jul 18 '05 #11
In article <d0**********@julia.coi.pw.edu.pl>,
Przemysław Różycki
<P.*******@elka.pw.edu.pl> wrote:

> I don't spawn them because of computational reasons, but due to the fact
> that it makes my code much simpler. I use built-in tcp features
> to achieve loadbalancing - every flow (directed through pipe) has its
> own dedicated threads - separate for down- and upload. For every 'main
> connection' these threads share send and receive buffer. If any of
> pipes is congested the corresponding threads block on their send / recv
> functions - without affecting independence of data flows.
>
> Using threads gives me VERY simple code. To achieve this with poll /
> select would be much more difficult. And to guarantee concurrency and
> maximal throughput for all of pipes I would probably have to mirror
> code from linux TCP stack (I mean window shifting, data acknowledgement,
> retransmission queues). Or perhaps I exaggerate.


Maybe it would help if you explained what these "pipes" do. Based on
what you've said so far, what I'd do in your situation is create one
thread per pipe and one thread per connection, then use Queue to move
data between threads.
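
One way that suggestion might look, as a sketch under stated assumptions: a single reader thread per connection puts chunks on a bounded queue, and one worker thread per pipe takes them off. It uses the Python 3 queue module (spelled Queue in the Python of this thread's era); connection_reader, pipe_worker and run are hypothetical names, and the "send" is a stand-in that only records which pipe handled each chunk.

```python
import queue
import threading

SENTINEL = None  # tells a pipe worker that no more data is coming


def connection_reader(chunks, outbound, n_pipes):
    """One thread per connection: queue each chunk for whichever pipe is free."""
    for chunk in chunks:
        outbound.put(chunk)  # blocks when the queue is full (back-pressure)
    for _ in range(n_pipes):
        outbound.put(SENTINEL)


def pipe_worker(name, outbound, delivered):
    """One thread per pipe: take chunks off the shared queue and 'send' them."""
    while True:
        chunk = outbound.get()
        if chunk is SENTINEL:
            break
        delivered.put((name, chunk))  # stand-in for a blocking socket send


def run(chunks, n_pipes=3):
    outbound = queue.Queue(maxsize=16)
    delivered = queue.Queue()
    threads = [
        threading.Thread(target=pipe_worker, args=(f"pipe-{i}", outbound, delivered))
        for i in range(n_pipes)
    ]
    threads.append(
        threading.Thread(target=connection_reader, args=(chunks, outbound, n_pipes))
    )
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    results = []
    while not delivered.empty():
        results.append(delivered.get())
    return results


if __name__ == "__main__":
    for name, chunk in run([b"a", b"b", b"c", b"d"], n_pipes=2):
        print(name, chunk)
```

For one direction of traffic this needs n+1 threads per connection rather than 2*n+2, and a congested pipe only stalls its own worker while the others keep draining the queue.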
--
Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/

"The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code --
not in reams of trivial code that bores the reader to death." --GvR
Jul 18 '05 #12