Threads with STL - can I run faster?

Here is a small sample program that I have.

#include <stdlib.h>
#include <stdio.h>    // FILE, fopen, fprintf, sprintf
#include <pthread.h>
#include <iostream>   // cout, endl
#include <string>

using namespace std;

pthread_t threads[10];
pthread_attr_t thr_attr;
int thr_in[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
int totalIter = 0;
int thr_cnt = 0;
bool debug = false;

extern "C" void *do_something(void *tid);

int main( int argc, const char* argv[] )
{
    int thr_var = 0;

    // Both the thread count and the iteration count are required.
    if (argc < 3)
    {
        cout << "usage: " << argv[0] << " <threads> <iterations> [debug]" << endl;
        return 1;
    }

    //------------------
    // how many threads?
    //------------------
    thr_cnt = atoi(argv[1]);
    if (thr_cnt > 8)
    {
        cout << "WARNING: Limiting the thread count to 8" << endl;
        thr_cnt = 8;
    }

    //--------------------------
    // how much work to be done?
    //--------------------------
    totalIter = atoi(argv[2]);
    if (totalIter > 5000000)
    {
        cout << "WARNING: Limiting the iteration count to 5000000" << endl;
        totalIter = 5000000;
    }

    //-------------------------------
    // do you want to check up on me?
    //-------------------------------
    if (argv[3] != NULL) debug = true;

    //--------
    // threads
    //--------
    pthread_attr_init(&thr_attr);
    pthread_attr_setdetachstate(&thr_attr, PTHREAD_CREATE_JOINABLE);

    // Create into threads[0..thr_cnt-1], the same slots that are joined below.
    // Thread ids stay 1-based (thr_in[thr_var + 1]) to match do_something().
    for (thr_var = 0; thr_var < thr_cnt; thr_var++)
        pthread_create(&threads[thr_var], &thr_attr,
                       do_something, (void *) &(thr_in[thr_var + 1]));

    for (thr_var = 0; thr_var < thr_cnt; thr_var++)
        pthread_join(threads[thr_var], NULL);

    pthread_attr_destroy(&thr_attr);

    return 0;
}
void *do_something(void *tid)
{
    int myThreadId = *((int *)tid);
    FILE *fp = NULL;
    if (debug)
    {
        char filename[50] = "";
        sprintf(filename, "%d.out", myThreadId);
        fp = fopen(filename, "w");
    }

    // Log only when the debug file was actually opened.
    if (fp != NULL)
        fprintf(fp, "thread #%d processing starts\n", myThreadId);

    for (int i = 1; i <= totalIter; i++)
    {
        // Each thread handles only the indices assigned to its 1-based id.
        if (i % thr_cnt == myThreadId - 1)
        {
            if (fp != NULL)
            {
                fprintf(fp, "thread #%d processing index %d\n", myThreadId, i);
            }

            string a("abc"), b;
            b = a;
        }
    }

    if (fp != NULL)
    {
        fprintf(fp, "thread #%d processing finish\n", myThreadId);
        fflush(fp);
        fclose(fp);
    }

    pthread_exit(NULL);
    return NULL;
}
Now when I run this with 1 thread, here is the time taken.

/home/skher/testIPC/testThr> time $BIN/testThr 1 5000000

real 0m0.65s
user 0m0.47s
sys 0m0.14s
/home/skher/testIPC/testThr>

Impressive, considering I am doing 5 million iterations. So I thought
that with 2 or more threads I would be done in even less time.
But here is what I found.

/home/skher/testIPC/testThr> time $BIN/testThr 2 5000000

real 0m34.67s
user 0m58.48s
sys 0m5.20s
/home/skher/testIPC/testThr>

Why is this? My guess is that whenever I allocate any STL object
through the _node_alloc template defined in _alloc.c, the allocation
acquires and releases a lock via the class _Node_Alloc_Lock, whose
mutex is a static member variable and is therefore shared by all threads.

Part of that class code is shown here.

template <bool __threads, int __inst>
class _Node_Alloc_Lock {
public:
    _Node_Alloc_Lock() {

#  ifdef _STLP_SGI_THREADS
        if (__threads && __us_rsthread_malloc)
#  else /* !_STLP_SGI_THREADS */
        if (__threads)
#  endif
            _S_lock._M_acquire_lock();
    }

    ~_Node_Alloc_Lock() {
#  ifdef _STLP_SGI_THREADS
        if (__threads && __us_rsthread_malloc)
#  else /* !_STLP_SGI_THREADS */
        if (__threads)
#  endif
            _S_lock._M_release_lock();
    }

    static _STLP_STATIC_MUTEX _S_lock;
};
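To make the suspected pattern concrete, here is a minimal sketch of a guard object that takes one process-wide mutex in its constructor and releases it in its destructor. It is not the STLport code: SharedLockGuard, fake_allocate and run_allocations are hypothetical names, and std::mutex and std::thread are assumed to be available (C++11 or later). Every "allocation" funnels through the same lock, so the threads take turns no matter how independent their work is.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a guard like _Node_Alloc_Lock: one static mutex
// shared by every thread, acquired in the constructor, released in the destructor.
class SharedLockGuard {
public:
    SharedLockGuard()  { s_lock.lock(); }
    ~SharedLockGuard() { s_lock.unlock(); }
private:
    static std::mutex s_lock;
};
std::mutex SharedLockGuard::s_lock;

static long g_allocations = 0;  // only modified while SharedLockGuard is held

// Stand-in for "allocate one node": all callers serialize on the same lock.
void fake_allocate()
{
    SharedLockGuard guard;
    ++g_allocations;
}

// Runs per_thread fake allocations on each of thread_count threads and
// returns the total number of allocations performed.
long run_allocations(int thread_count, long per_thread)
{
    assert(thread_count > 0 && per_thread >= 0);
    g_allocations = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < thread_count; ++t)
        workers.emplace_back([per_thread] {
            for (long i = 0; i < per_thread; ++i)
                fake_allocate();
        });
    for (std::thread &w : workers)
        w.join();
    return g_allocations;
}
```

Timing run_allocations(1, n) against run_allocations(4, n / 4) on a multi-CPU machine would show whether a single shared lock alone can explain a slowdown of this kind; the sketch makes no claim about the actual numbers.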
OK. Now my question (worth a million dollars, if only to me).

How do I get around this? How do I make my program run faster with more
threads? As you can see, the threads are really independent of each other,
since they work on different indices, yet they still compete for a shared
resource, namely the lock taken while creating an STL object. How do I
make this competition go away, so that my program runs faster with
more threads?

Any help will be appreciated. Thanks, Sunil.

Nov 23 '05 #1
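One way to reduce pressure on a shared allocator lock, assuming the lock really is the bottleneck, is to stop allocating inside the hot loop. The sketch below is a hypothetical variant of the loop body in do_something(): process_with_reuse is an invented name, and the two strings are constructed once per thread and reused. Whether the assignment then avoids the allocator depends on the string implementation, so treat this as an experiment to time rather than a guaranteed fix.

```cpp
#include <cassert>
#include <string>

// Hypothetical variant of the loop in do_something(): the strings live outside
// the loop, so they are constructed once per thread instead of once per index.
// Returns how many indices this thread handled.
long process_with_reuse(int my_id, int thread_count, long total_iter)
{
    assert(thread_count > 0 && my_id >= 1 && my_id <= thread_count);

    const std::string a("abc");  // constructed once per thread
    std::string b;
    b.reserve(a.size());         // request storage up front, outside the loop

    long handled = 0;
    for (long i = 1; i <= total_iter; ++i) {
        if (i % thread_count == my_id - 1) {
            b = a;               // can reuse b's storage if capacity suffices
            ++handled;
        }
    }
    return handled;
}
```

With 2 threads and 10 iterations, each thread handles 5 indices, the same split as in the posted program.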
On 2005-11-23, su*******@hotmail.com <su*******@hotmail.com> wrote:
OK. Now my (worth million dollar only to me) question.

How do I get around this? How do I make my program run faster
with more threads. If you see, the threads are really mutually
exclusive since they are working on different indexes (indices)
but still compete with each other for resources viz. lock while
creating STL object. How do I make this competition go away
thus making my program run faster with more threads.

Any help will be appreciated. Thanx, Sunil.


Unless you literally have more than one processor, I'd say forget
it. Unfortunately, the question is off-topic for this group. Try
comp.programming, or a group specific to your C++ implementation.

--
Neil Cerutti
Nov 23 '05 #2
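The point about needing more than one processor can be checked from inside the program. A minimal sketch, assuming a C++11 or later compiler (std::thread::hardware_concurrency did not exist when this thread was written); usable_thread_count is a hypothetical helper name.

```cpp
#include <cassert>
#include <thread>

// Hypothetical helper: cap the requested thread count at the number of
// hardware threads the implementation reports.
unsigned usable_thread_count(unsigned requested)
{
    assert(requested > 0);
    unsigned hw = std::thread::hardware_concurrency();  // may return 0 if unknown
    if (hw == 0)
        hw = 1;                                         // conservative fallback
    return requested < hw ? requested : hw;
}
```

For a CPU-bound loop like the one posted, asking for more threads than this value is unlikely to help.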

su*******@hotmail.com skrev:
Here is a small sample program that I have.
[snip]
Now when I run this with 1 thread, here is the time taken.

/home/skher/testIPC/testThr> time $BIN/testThr 1 5000000

real 0m0.65s
user 0m0.47s
sys 0m0.14s
/home/skher/testIPC/testThr>

impressive, considering I am doing 5 million iterations. So, I thought
when I run with 2 or more threads, I should be done even in less time.
But here is what I found.

/home/skher/testIPC/testThr> time $BIN/testThr 2 5000000

real 0m34.67s
user 0m58.48s
sys 0m5.20s
/home/skher/testIPC/testThr>

Why is this? I guess this is because whenever I allocate any STL
object, using the _node_alloc template defined in _alloc.c, it has a
lock and unlock mechanism using a static class _Node_Alloc_Lock which
has a static member variable.

[snip]
I have not examined your code in depth, but it looks like there is a
difference between what is done in the one-thread scenario and in the
n-thread scenario. Still, it does not matter, as this is probably not the
correct group. Offhand, I know that memory allocation can be much more
expensive in a multithreaded environment, so this is certainly a
factor. Also, it is not certain that using more threads will make your
program faster. If the task is CPU-bound, a speedup requires hardware with
multiple CPUs. But you had better ask in comp.programming.threads or a group
dedicated to your platform.

Peter

Nov 24 '05 #3
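On the difference between the one-thread and n-thread cases: in the posted loop every thread still walks all totalIter indices and merely skips the ones it does not own. A strided loop visits only the owned indices. The sketch below assumes the same 1-based thread ids and the same ownership rule (i % thr_cnt == myThreadId - 1); strided_indices is a hypothetical name, and it returns the indices instead of doing work so the partition is easy to inspect.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: list the indices in [1, total_iter] owned by thread
// my_id under the rule i % thread_count == my_id - 1, without scanning the rest.
std::vector<long> strided_indices(int my_id, int thread_count, long total_iter)
{
    assert(thread_count > 0 && my_id >= 1 && my_id <= thread_count);

    // Smallest i >= 1 that satisfies the ownership rule.
    long first = (my_id == 1) ? thread_count : my_id - 1;

    std::vector<long> owned;
    for (long i = first; i <= total_iter; i += thread_count)
        owned.push_back(i);
    return owned;
}
```

This trims the per-thread loop overhead but does nothing about allocator contention, which has to be measured separately.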
Yes, I would ask it in that group but since both of you have asked the
same question - I am indeed using a 4 CPU machine running Sun Solaris.
Thanx, Sunil.

Nov 24 '05 #4
su*******@hotmail.com wrote:
Yes, I would ask it in that group but since both of you have asked the
same question - I am indeed using a 4 CPU machine running Sun Solaris.
Thanx, Sunil.


It's been a long time since I used Solaris, but I seem to remember
that the pthreads implementation on Solaris did not use multiple
CPUs, whereas the Solaris native thread library (lwp?) did use
multiple CPUs. You'd better ask on a Solaris newsgroup to
get definitive answers.

Pthreads on Linux (2.6+), I believe, will use multiple CPUs.

Regards,
Larry
Nov 24 '05 #5
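If thread-to-CPU scheduling is the concern, POSIX lets a program request system contention scope for a thread through pthread_attr_setscope with PTHREAD_SCOPE_SYSTEM. Whether that changes anything on a particular Solaris release is an assumption that has not been verified here; the sketch only shows the calls. scope_demo_worker and run_system_scope_thread are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <pthread.h>

// Hypothetical worker: doubles the int it is handed.
extern "C" void *scope_demo_worker(void *arg)
{
    int *value = static_cast<int *>(arg);
    *value *= 2;
    return arg;
}

// Creates one joinable thread with system contention scope and waits for it.
// Returns 0 on success, otherwise the first pthread error code encountered.
int run_system_scope_thread(int *value)
{
    assert(value != NULL);

    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    // Ask for a thread that competes for CPUs system-wide.
    rc = pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
    if (rc == 0) {
        pthread_t tid;
        rc = pthread_create(&tid, &attr, scope_demo_worker, value);
        if (rc == 0)
            rc = pthread_join(tid, NULL);
    }

    pthread_attr_destroy(&attr);
    return rc;
}
```

In the posted program the equivalent change would be a single pthread_attr_setscope call on thr_attr before the pthread_create loop.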
