
for loop speed problem

P: n/a
Hi

this code:

float tmp = 0.;
for (int i = 0; i < 2000; i++) {
    for (int j = i; j < 2000; j++) {
        for (int k = 0; k < 9000; k++) {
            tmp += tmp_mat(k,i) * tmp_mat(k,j);
        }
        tmp = 0.;
    }
    cout << "J: " << i << endl;
}
is 100 times faster than this

float tmp = 0.;
for (int i = 0; i < 2000; i++) {
    for (int j = i; j < 2000; j++) {
        for (int k = 0; k < 9000; k++) {
            tmp += tmp_mat(k,i) * tmp_mat(k,j);
        }
        data2d[i][j] = tmp; // the problem
        tmp = 0.;
    }
    cout << "J: " << i << endl;
}


can anybody help me?

greetings

Mario
Oct 28 '05 #1
9 Replies


P: n/a
L
Mario Lüttich wrote [10/28/2005 08:32 AM] :
Hi

this code:

> float tmp = 0.;
> for (int i = 0; i < 2000; i++) {
> for (int j = i; j < 2000; j++) {
> for (int k = 0; k < 9000; k++) {
> tmp += tmp_mat(k,i) * tmp_mat(k,j);
> }
> tmp = 0.;
>
> }
> cout << "J: " << i << endl;
> }


is 100 times faster than this

> float tmp = 0.;
> for (int i = 0; i < 2000; i++) {
> for (int j = i; j < 2000; j++) {
> for (int k = 0; k < 9000; k++) {
> tmp += tmp_mat(k,i) * tmp_mat(k,j);
> }
> data2d[i][j] = tmp; // the problem
> tmp = 0.;
>
> }
> cout << "J: " << i << endl;
> }


can anybody help me?

greetings

Mario


What is the type of data2d? Is it a 2D array?
Oct 28 '05 #2

P: n/a
On Fri, 28 Oct 2005 15:32:51 +0200, Mario Lüttich <ml*****@gwdg.de>
wrote:
Hi

this code:

float tmp = 0.;
for (int i = 0; i < 2000; i++) {
for (int j = i; j < 2000; j++) {
for (int k = 0; k < 9000; k++) {
tmp += tmp_mat(k,i) * tmp_mat(k,j);
}
tmp = 0.;

}
cout << "J: " << i << endl;
}


is 100 times faster than this

float tmp = 0.;
for (int i = 0; i < 2000; i++) {
for (int j = i; j < 2000; j++) {
for (int k = 0; k < 9000; k++) {
tmp += tmp_mat(k,i) * tmp_mat(k,j);
}
data2d[i][j] = tmp; // the problem
tmp = 0.;

}
cout << "J: " << i << endl;
}


can anybody help me?

greetings

Mario

What about the following?

for (int i = 0; i < 2000; i++) {
float& where_to_put_it=data2d[i][j];
where_to_put_it=0;
for (int j = i; j < 2000; j++) {
for (int k = 0; k < 9000; k++) {
where_to_put_it += tmp_mat(k,i) * tmp_mat(k,j);
}
}
cout << "J: " << i << endl;
}

... and if we knew more details, we could give more suggestions.

Best regards,

-- Zara

Oct 28 '05 #3

P: n/a
Hi,
This doesn't look like the same code. I think you're using j
before declaring it; possibly you want to move the lines around like
this:

for (int i = 0; i < 2000; i++) {
for (int j = i; j < 2000; j++) {
float& where_to_put_it=data2d[i][j];
where_to_put_it=0;
for (int k = 0; k < 9000; k++) {
where_to_put_it += tmp_mat(k,i) * tmp_mat(k,j);
}
}
cout << "J: " << i << endl;
}

Nice idea using a reference, though I suspect this will be the
same speed. How about this instead:
for (int i = 0; i < 2000; i++)
{
float* where_is_i = data2d[i];
for (int j = i; j < 2000; j++)
{
float& where_to_put_it = where_is_i[j];
where_to_put_it=0;
for (int k = 0; k < 9000; k++)
{
where_to_put_it += tmp_mat(k,i) * tmp_mat(k,j);
}
}
cout << "J: " << i << endl;
}

The idea here is to reduce the number of times that you de-reference
data2d with the same value of i.

Oct 28 '05 #4
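
A minimal sketch that combines the row-pointer idea from the post above with the local accumulator from the original question, so that each result is stored exactly once. The name `gram_upper`, the flat row-major input `a`, and the `std::vector` output are assumptions made for illustration; the thread never shows how `tmp_mat` or `data2d` are declared.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the thread): fills the upper triangle of
// out (n x n) with the dot products of the columns of a, where a is a
// rows x n matrix stored row by row in one flat vector.
void gram_upper(const std::vector<float>& a, std::size_t rows, std::size_t n,
                std::vector<std::vector<float>>& out) {
    assert(a.size() == rows * n);
    assert(out.size() == n);
    for (std::size_t i = 0; i < n; ++i) {
        assert(out[i].size() == n);
        float* row_i = out[i].data();      // row looked up once per i
        for (std::size_t j = i; j < n; ++j) {
            float sum = 0.0f;              // accumulate in a local variable
            for (std::size_t k = 0; k < rows; ++k) {
                sum += a[k * n + i] * a[k * n + j];
            }
            row_i[j] = sum;                // one store per (i, j) pair
        }
    }
}
```

Whether this is faster than accumulating through a reference depends on the compiler and its settings, so it is worth measuring both variants.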

P: n/a
BRG
Mario Lüttich wrote:
Hi

this code:

float tmp = 0.;
for (int i = 0; i < 2000; i++) {
for (int j = i; j < 2000; j++) {
for (int k = 0; k < 9000; k++) {
tmp += tmp_mat(k,i) * tmp_mat(k,j);
}
tmp = 0.;

}
cout << "J: " << i << endl;
}


is 100 times faster than this

float tmp = 0.;
for (int i = 0; i < 2000; i++) {
for (int j = i; j < 2000; j++) {
for (int k = 0; k < 9000; k++) {
tmp += tmp_mat(k,i) * tmp_mat(k,j);
}
data2d[i][j] = tmp; // the problem
tmp = 0.;

}
cout << "J: " << i << endl;
}


can anybody help me?


The speed difference is not surprising since the compiler has probably
noticed that the loop on k in the first example can be eliminated since
its output value (in tmp) is not used.
Oct 28 '05 #5
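
The effect described in the post above can be sketched as two functions that differ only in whether the sum is stored. This is an illustration under stated assumptions: `Matrix` is a hypothetical stand-in for `tmp_mat`, and `out` is a flat array standing in for `data2d`. With optimisation enabled, a compiler is allowed to drop the work in the first function because nothing observable depends on it; whether it actually does so depends on the compiler and flags.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for tmp_mat: a rows x cols matrix stored row by row.
struct Matrix {
    std::size_t rows, cols;
    std::vector<float> data;
    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0f) {}
    float& operator()(std::size_t k, std::size_t i) { return data[k * cols + i]; }
    float operator()(std::size_t k, std::size_t i) const { return data[k * cols + i]; }
};

// Variant A: the sum is computed and then thrown away. Nothing observable
// depends on tmp, so an optimiser may remove the loops entirely.
void discard_products(const Matrix& m) {
    float tmp = 0.0f;
    for (std::size_t i = 0; i < m.cols; ++i) {
        for (std::size_t j = i; j < m.cols; ++j) {
            for (std::size_t k = 0; k < m.rows; ++k) {
                tmp += m(k, i) * m(k, j);
            }
            tmp = 0.0f;
        }
    }
}

// Variant B: each sum is written to out, so the loops must really run.
void store_products(const Matrix& m, std::vector<float>& out) {
    assert(out.size() == m.cols * m.cols);
    for (std::size_t i = 0; i < m.cols; ++i) {
        for (std::size_t j = i; j < m.cols; ++j) {
            float tmp = 0.0f;
            for (std::size_t k = 0; k < m.rows; ++k) {
                tmp += m(k, i) * m(k, j);
            }
            out[i * m.cols + j] = tmp;  // the store keeps the computation alive
        }
    }
}
```

Comparing the run time of the two variants therefore measures the optimiser rather than the cost of the store itself.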

P: n/a
Mario Lüttich wrote:
Hi

this code:

> float tmp = 0.;
> for (int i = 0; i < 2000; i++) {
> for (int j = i; j < 2000; j++) {
> for (int k = 0; k < 9000; k++) {
> tmp += tmp_mat(k,i) * tmp_mat(k,j);
> }
> tmp = 0.;
>
> }
> cout << "J: " << i << endl;
> }


is 100 times faster than this

> float tmp = 0.;
> for (int i = 0; i < 2000; i++) {
> for (int j = i; j < 2000; j++) {
> for (int k = 0; k < 9000; k++) {
> tmp += tmp_mat(k,i) * tmp_mat(k,j);
> }
> data2d[i][j] = tmp; // the problem
> tmp = 0.;
>
> }
> cout << "J: " << i << endl;
> }


can anybody help me?

greetings

Mario

Mario,

Use pointers to loop through data2d incrementing them as required
like the code shown below. Incrementing pointers is quicker than the
calculations to index the array with i and j.

int data[10][10];
int (*pRow)[10] = data,*pCol;
int i,j;

for (i = 0;i < 10;++i,++pRow)
{
pCol = *pRow;
for (j = 0;j < 10;++j,++pCol)
{
*pCol = i * j;
}
}

JFJB
Oct 28 '05 #6

P: n/a
n2xssvv g02gfr12930 wrote:
Mario Lüttich wrote:
Hi

this code:

> float tmp = 0.;
> for (int i = 0; i < 2000; i++) {
> for (int j = i; j < 2000; j++) {
> for (int k = 0; k < 9000; k++) {
> tmp += tmp_mat(k,i) * tmp_mat(k,j);
> }
> tmp = 0.;
>
> }
> cout << "J: " << i << endl;
> }


is 100 times faster than this

> float tmp = 0.;
> for (int i = 0; i < 2000; i++) {
> for (int j = i; j < 2000; j++) {
> for (int k = 0; k < 9000; k++) {
> tmp += tmp_mat(k,i) * tmp_mat(k,j);
> }
> data2d[i][j] = tmp; // the problem
> tmp = 0.;
>
> }
> cout << "J: " << i << endl;
> }


can anybody help me?

greetings

Mario


Mario,

Use pointers to loop through data2d incrementing them as required
like the code shown below. Incrementing pointers is quicker than the
calculations to index the array with i and j.

int data[10][10];
int (*pRow)[10] = data,*pCol;
int i,j;

for (i = 0;i < 10;++i,++pRow)
{
pCol = *pRow;
for (j = 0;j < 10;++j,++pCol)
{
*pCol = i * j;
}
}

JFJB


Thanks, everybody!

greetings

Mario
Oct 31 '05 #7

P: n/a
Mario Lüttich wrote:
Hi

this code:
[snip code that does not store the results anywhere]
is 100 times faster than this
[snip other code]
can anybody help me?


Look at the assembler file generated by the compiler and see what it is
doing. Step through the code instruction by instruction in a debugger.
If the optimiser is eliminating the inner loop, then that code will not
be exercised. Try turning the optimiser off and see whether the versions
are still so dissimilar. If one version is calling a library routine
instead of using built-in instructions, you will notice that too. If
the assembly looks fine, then it is possible that architecture issues
(cache lines, FPU pipelines) are dominating. It's interesting to see
what your compiler is actually doing.

Oct 31 '05 #8
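
One way to act on the advice above is to time each version while making sure the measured work cannot be optimised away. The sketch below is only an illustration: `time_ms`, `dot`, and `timed_dot_ms` are hypothetical names, and the thread does not say which compiler was used. With GCC, for example, `-S` writes the generated assembly to a file and `-O0` or `-O2` selects the optimisation level, which makes it possible to compare both the timings and the emitted code.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: runs f once and returns the elapsed time in ms.
template <typename F>
double time_ms(F&& f) {
    auto start = std::chrono::steady_clock::now();
    f();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Dot product of two equally sized vectors, like the inner loop on k.
float dot(const std::vector<float>& x, const std::vector<float>& y) {
    assert(x.size() == y.size());
    float sum = 0.0f;
    for (std::size_t k = 0; k < x.size(); ++k) {
        sum += x[k] * y[k];
    }
    return sum;
}

// Times one call of dot(). The result goes through a volatile variable so
// the optimiser cannot discard the computation that is being measured.
double timed_dot_ms(const std::vector<float>& x, const std::vector<float>& y,
                    float& result) {
    volatile float sink = 0.0f;
    double ms = time_ms([&] { sink = dot(x, y); });
    result = sink;
    return ms;
}
```

A single short run is noisy, so in practice the call would be repeated many times and the results averaged before drawing conclusions.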

P: n/a
On 28 Oct 2005 08:45:40 -0700, "chrisg67" <ch******@gmail.com> wrote:
Hi,
This doesn't look like the same code. I think you're using j
before declaring it; possibly you want to move the lines around like
this:

for (int i = 0; i < 2000; i++) {
for (int j = i; j < 2000; j++) {
float& where_to_put_it=data2d[i][j];
where_to_put_it=0;
for (int k = 0; k < 9000; k++) {
where_to_put_it += tmp_mat(k,i) * tmp_mat(k,j);
}
}
cout << "J: " << i << endl;
}
Sure. Writing and not testing => bugging ;-)

Nice idea using a reference, though I suspect this will be the
same speed. How about this instead:
for (int i = 0; i < 2000; i++)
{
float* where_is_i = data2d[i];
for (int j = i; j < 2000; j++)
{
float& where_to_put_it = where_is_i[j];
where_to_put_it=0;
for (int k = 0; k < 9000; k++)
{
where_to_put_it += tmp_mat(k,i) * tmp_mat(k,j);
}
}
cout << "J: " << i << endl;
}

The idea here is to reduce the number of times that you de-reference
data2d with the same value of i.


OK. What about this one?

float **where_is_i = data2d;  // assumes data2d is an array of row pointers
for (int i = 0; i < 2000; ++i, ++where_is_i) {
    // start at element [i][i] of the current row
    float *where_to_put_it = *where_is_i + i;
    for (int j = i; j < 2000; ++j, ++where_to_put_it) {
        *where_to_put_it = 0;
        for (int k = 0; k < 9000; k++) {
            *where_to_put_it += tmp_mat(k,i) * tmp_mat(k,j);
        }
    }
    cout << "J: " << i << endl;
}

It has more additions and fewer multiplications/shifts. BTW, I have not
tested it; it's just for fun.
Oct 31 '05 #9

P: n/a
Zara wrote:
On 28 Oct 2005 08:45:40 -0700, "chrisg67" <ch******@gmail.com> wrote:

Hi,
This doesn't look like the same code. I think you're using j
before declaring it; possibly you want to move the lines around like
this:

for (int i = 0; i < 2000; i++) {
for (int j = i; j < 2000; j++) {
float& where_to_put_it=data2d[i][j];
where_to_put_it=0;
for (int k = 0; k < 9000; k++) {
where_to_put_it += tmp_mat(k,i) * tmp_mat(k,j);
}
}
cout << "J: " << i << endl;
}

Sure. Writing and not testing => bugging ;-)

Nice idea using a reference, though I suspect this will be the
same speed. How about this instead:
for (int i = 0; i < 2000; i++)
{
float* where_is_i = data2d[i];
for (int j = i; j < 2000; j++)
{
float& where_to_put_it = where_is_i[j];
where_to_put_it=0;
for (int k = 0; k < 9000; k++)
{
where_to_put_it += tmp_mat(k,i) * tmp_mat(k,j);
}
}
cout << "J: " << i << endl;
}

The idea here is to reduce the number of times that you de-reference
data2d with the same value of i.

OK. What about this one?

float **where_is_i = data2d;  // assumes data2d is an array of row pointers
for (int i = 0; i < 2000; ++i, ++where_is_i) {
    // start at element [i][i] of the current row
    float *where_to_put_it = *where_is_i + i;
    for (int j = i; j < 2000; ++j, ++where_to_put_it) {
        *where_to_put_it = 0;
        for (int k = 0; k < 9000; k++) {
            *where_to_put_it += tmp_mat(k,i) * tmp_mat(k,j);
        }
    }
    cout << "J: " << i << endl;
}

It has more additions and fewer multiplications/shifts. BTW, I have not
tested it; it's just for fun.

This one is faster. And my first case was without the last loop.

Thanks to everyone who helped me.
Nov 1 '05 #10
