Here is a small sample program that I have.
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <iostream>
#include <string>
using namespace std;

pthread_t threads[10];
pthread_attr_t thr_attr;
int thr_in[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
int totalIter = 0;
int thr_cnt = 0;
bool debug = false;

extern "C" void *do_something(void *tid);

int main( int argc, const char* argv[] )
{
    int thr_var = 0;

    // need at least the thread count and the iteration count
    if (argc < 3)
    {
        cout << "usage: testThr <threads> <iterations> [debug]" << endl;
        return 1;
    }

    //------------------
    // how many threads?
    //------------------
    thr_cnt = atoi(argv[1]);
    if (thr_cnt > 8)
    {
        cout << "WARNING: Limiting the thread count to 8" << endl;
        thr_cnt = 8;
    }

    //--------------------------
    // how much work to be done?
    //--------------------------
    totalIter = atoi(argv[2]);
    if (totalIter > 5000000)
    {
        cout << "WARNING: Limiting the iteration count to 5000000" << endl;
        totalIter = 5000000;
    }

    //-------------------------------
    // do you want to check up on me?
    //-------------------------------
    if (argv[3] != NULL) debug = true;

    //--------
    // threads
    //--------
    pthread_attr_init(&thr_attr);
    pthread_attr_setdetachstate(&thr_attr, PTHREAD_CREATE_JOINABLE);

    // thread ids run from 1 to thr_cnt
    for (thr_var = 1; thr_var <= thr_cnt; thr_var++)
        pthread_create(&threads[thr_var], &thr_attr,
                       do_something, (void *) &(thr_in[thr_var]));

    // join the same slots that were created (1..thr_cnt, not 0..thr_cnt-1)
    for (thr_var = 1; thr_var <= thr_cnt; thr_var++)
        pthread_join(threads[thr_var], NULL);

    pthread_attr_destroy(&thr_attr);
    return 0;
}

void *do_something(void *tid)
{
    int myThreadId = *((int *)tid);
    FILE *fp = NULL;

    if (debug)
    {
        char filename[50] = "";
        sprintf(filename, "%d.out", myThreadId);
        fp = fopen(filename, "w");
        fprintf(fp, "thread #%d processing starts\n", myThreadId);
    }

    // every thread walks all indexes but only works on its own share
    for (int i = 1; i <= totalIter; i++)
    {
        if (i % thr_cnt == myThreadId - 1)
        {
            if (debug)
            {
                fprintf(fp, "thread #%d processing index %d\n", myThreadId, i);
            }
            string a("abc"), b;
            b = a;
        }
    }

    if (debug)
    {
        fprintf(fp, "thread #%d processing finish\n", myThreadId);
        fflush(fp);
        fclose(fp);
    }
    pthread_exit(NULL);
    return NULL;
}
Now when I run this with 1 thread, here is the time taken.
/home/skher/testIPC/testThr> time $BIN/testThr 1 5000000
real 0m0.65s
user 0m0.47s
sys 0m0.14s
/home/skher/testIPC/testThr>
Impressive, considering I am doing 5 million iterations. So I thought
that when I run with 2 or more threads, I should be done in even less
time. But here is what I found.
/home/skher/testIPC/testThr> time $BIN/testThr 2 5000000
real 0m34.67s
user 0m58.48s
sys 0m5.20s
/home/skher/testIPC/testThr>
Why is this? My guess is that whenever I allocate any STL object, the
_node_alloc template defined in _alloc.c locks and unlocks through the
class _Node_Alloc_Lock, whose mutex (_S_lock) is a static member and is
therefore shared by all threads.
Part of that class code is shown here.
template <bool __threads, int __inst>
class _Node_Alloc_Lock {
public:
    _Node_Alloc_Lock() {
# ifdef _STLP_SGI_THREADS
        if (__threads && __us_rsthread_malloc)
# else /* !_STLP_SGI_THREADS */
        if (__threads)
# endif
            _S_lock._M_acquire_lock();
    }
    ~_Node_Alloc_Lock() {
# ifdef _STLP_SGI_THREADS
        if (__threads && __us_rsthread_malloc)
# else /* !_STLP_SGI_THREADS */
        if (__threads)
# endif
            _S_lock._M_release_lock();
    }
    static _STLP_STATIC_MUTEX _S_lock;
};
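To make the pattern concrete, here is a stripped-down sketch of what I think is going on. It is not STLport code: AllocGuard, fake_allocate, hammer_allocator and run_contended are names I made up, and a plain pthread mutex stands in for _STLP_STATIC_MUTEX. The point is only that a guard built around one static lock makes every thread queue up at the same place, no matter how independent their data is.

```cpp
#include <cassert>
#include <cstddef>
#include <pthread.h>

// Hypothetical stand-in for the allocator lock: one mutex for the whole process.
static pthread_mutex_t g_alloc_lock = PTHREAD_MUTEX_INITIALIZER;
static long g_alloc_calls = 0;   // only touched while g_alloc_lock is held

// RAII guard in the style of _Node_Alloc_Lock: lock in the constructor,
// unlock in the destructor.
struct AllocGuard {
    AllocGuard()  { pthread_mutex_lock(&g_alloc_lock); }
    ~AllocGuard() { pthread_mutex_unlock(&g_alloc_lock); }
};

// Every "allocation" from every thread goes through the same mutex.
static void fake_allocate() {
    AllocGuard guard;
    ++g_alloc_calls;
}

extern "C" void *hammer_allocator(void *arg) {
    long n = *static_cast<long *>(arg);
    for (long i = 0; i < n; ++i)
        fake_allocate();
    return NULL;
}

// Starts nthreads workers that each make per_thread guarded calls and
// returns the total number of calls recorded.
long run_contended(int nthreads, long per_thread) {
    assert(nthreads > 0 && nthreads <= 16);
    pthread_t tids[16];
    g_alloc_calls = 0;
    for (int t = 0; t < nthreads; ++t) {
        int rc = pthread_create(&tids[t], NULL, hammer_allocator, &per_thread);
        assert(rc == 0);
        (void) rc;
    }
    for (int t = 0; t < nthreads; ++t)
        pthread_join(tids[t], NULL);
    return g_alloc_calls;
}
```

Calling run_contended(4, 10000) gives the right total, but the four threads take turns on g_alloc_lock rather than running side by side, which is the behaviour I suspect in my program.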
OK. Now my question (worth a million dollars, but only to me).
How do I get around this? How do I make my program run faster with more
threads? As you can see, the threads are really independent, since they
work on different indexes (indices), but they still compete with each
other for a resource, namely the lock taken while creating an STL
object. How do I make this competition go away, so that my program runs
faster with more threads?
Any help will be appreciated. Thanx, Sunil.
On 2005-11-23, su*******@hotmail.com <su*******@hotmail.com> wrote:
> OK. Now my (worth million dollar only to me) question.
> How do I get around this? How do I make my program run faster with more threads. If you see, the threads are really mutually exclusive since they are working on different indexes (indices) but still compete with each other for resources viz. lock while creating STL object. How do I make this competition go away thus making my program run faster with more threads.
> Any help will be appreciated. Thanx, Sunil.
Unless you literally have more than one processor, I'd say forget
it. Unfortunately, the question is off-topic for this group. Try
comp.programming, or a group specific to your C++ implementation.
--
Neil Cerutti
su*******@hotmail.com wrote:
> Here is a small sample program that I have.
> [snip]
> Now when I run this with 1 thread, here is the time taken.
> /home/skher/testIPC/testThr> time $BIN/testThr 1 5000000
> real 0m0.65s
> user 0m0.47s
> sys 0m0.14s
> impressive, considering I am doing 5 million iterations. So, I thought when I run with 2 or more threads, I should be done even in less time. But here is what I found.
> /home/skher/testIPC/testThr> time $BIN/testThr 2 5000000
> real 0m34.67s
> user 0m58.48s
> sys 0m5.20s
> Why is this? I guess this is because whenever I allocate any STL object, using the _node_alloc template defined in _alloc.c, it has a lock and unlock mechanism using a static class _Node_Alloc_Lock which has a static member variable.
> [snip]
I have not examined your code in depth, but it looks like the work done
in the one-thread scenario differs from the work done in the n-thread
scenario. Still, it does not matter, as this probably is not the
correct group. Offhand, I know that memory allocation can be much more
expensive in a multithreaded environment, so this is certainly a
factor. Also, it is not certain that using more threads will make your
program faster: if the task is CPU-bound, a speedup requires hardware
with multiple CPUs. But you had better ask in comp.programming.threads
or a group dedicated to your platform.
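If allocation is the factor, one thing to try is keeping the allocator out of the inner loop altogether. What follows is only a rough sketch under my own assumptions, not a tested fix for your STL: WorkerArgs, do_strided_work and run_strided are made-up names, each thread steps directly over its own indexes instead of testing every index with a modulo, and the strings are built once per thread rather than once per iteration. Whether the assignment inside the loop still reaches the allocator depends on the string implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <pthread.h>
#include <string>

// Hypothetical per-thread argument block.
struct WorkerArgs {
    int id;            // 0-based thread index
    int thr_cnt;       // number of threads sharing the work
    int total_iter;    // indexes run from 1 to total_iter
    long processed;    // out: number of indexes this thread handled
};

extern "C" void *do_strided_work(void *p) {
    WorkerArgs *w = static_cast<WorkerArgs *>(p);

    // Build the strings once per thread, outside the hot loop.
    std::string a("abc"), b;
    b.reserve(a.size());

    long count = 0;    // local counter, written back once at the end
    for (int i = w->id + 1; i <= w->total_iter; i += w->thr_cnt) {
        b = a;         // reuses b; no new string object per iteration
        ++count;
    }
    w->processed = count;
    return NULL;
}

// Splits indexes 1..total_iter across thr_cnt threads and returns how many
// indexes were handled in total.
long run_strided(int thr_cnt, int total_iter) {
    assert(thr_cnt > 0 && thr_cnt <= 8);
    pthread_t tids[8];
    WorkerArgs args[8];
    for (int t = 0; t < thr_cnt; ++t) {
        args[t].id = t;
        args[t].thr_cnt = thr_cnt;
        args[t].total_iter = total_iter;
        args[t].processed = 0;
        int rc = pthread_create(&tids[t], NULL, do_strided_work, &args[t]);
        assert(rc == 0);
        (void) rc;
    }
    long total = 0;
    for (int t = 0; t < thr_cnt; ++t) {
        pthread_join(tids[t], NULL);
        total += args[t].processed;
    }
    return total;
}
```

Thread t handles indexes t+1, t+1+thr_cnt, and so on, so every index is visited exactly once and no thread spends time skipping over the others' work.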
Peter
Yes, I would ask it in that group, but since both of you have raised
the same question: I am indeed using a 4-CPU machine running Sun Solaris.
Thanx, Sunil.
su*******@hotmail.com wrote:
> Yes, I would ask it in that group but since both of you have asked the same question - I am indeed using a 4 CPU machine running Sun Solaris. Thanx, Sunil.
It's been a long time since I used Solaris, but I seem to remember
that the pthreads implementation on Solaris did not use multiple
CPUs, yet the Solaris native thread library (lwp ?) did use
multiple CPUs. You'd better ask on a Solaris newsgroup to
get definitive answers.
Pthreads on Linux (2.6+), I believe, will use multiple CPUs.
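If it does turn out that your threads are not being spread over the CPUs, one standard POSIX knob to experiment with is the contention scope attribute. This is a sketch from memory and an assumption on my part, not a confirmed cure for your Solaris release: noop_worker and create_bound_thread are made-up names, and the only real calls are pthread_attr_setscope and pthread_attr_getscope with PTHREAD_SCOPE_SYSTEM, which ask for a thread scheduled against all threads in the system instead of only those in the process.

```cpp
#include <cassert>
#include <cstddef>
#include <pthread.h>

extern "C" void *noop_worker(void *) { return NULL; }

// Creates and joins one thread that requests system contention scope.
// Returns 0 on success, or the first pthread error code encountered.
int create_bound_thread() {
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    // Ask for a thread that competes system-wide for the CPUs.
    rc = pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
    if (rc == 0) {
        int scope = 0;
        pthread_attr_getscope(&attr, &scope);
        assert(scope == PTHREAD_SCOPE_SYSTEM);

        pthread_t tid;
        rc = pthread_create(&tid, &attr, noop_worker, NULL);
        if (rc == 0)
            rc = pthread_join(tid, NULL);
    }
    pthread_attr_destroy(&attr);
    return rc;
}
```

In your program the same pthread_attr_setscope call would go right after pthread_attr_init(&thr_attr). A non-zero return means the platform refused the request, which is worth checking before drawing conclusions from the timings.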
Regards,
Larry