for loop speed problem

Hi

this code:

float tmp = 0.;
for (int i = 0; i < 2000; i++) {
    for (int j = i; j < 2000; j++) {
        for (int k = 0; k < 9000; k++) {
            tmp += tmp_mat(k,i) * tmp_mat(k,j);
        }
        tmp = 0.;
    }
    cout << "J: " << i << endl;
}
is 100 times faster than this

float tmp = 0.;
for (int i = 0; i < 2000; i++) {
    for (int j = i; j < 2000; j++) {
        for (int k = 0; k < 9000; k++) {
            tmp += tmp_mat(k,i) * tmp_mat(k,j);
        }
        data2d[i][j] = tmp; // the problem
        tmp = 0.;
    }
    cout << "J: " << i << endl;
}


can anybody help me?

greetings

Mario
Oct 28 '05 #1
Mario Lüttich wrote [10/28/2005 08:32 AM] :
[snip original post]

What type is data2d? Is it a 2D array?
Oct 28 '05 #2
On Fri, 28 Oct 2005 15:32:51 +0200, Mario Lüttich <ml*****@gwdg.de>
wrote:
[snip original post]

What about the following?

for (int i = 0; i < 2000; i++) {
float& where_to_put_it=data2d[i][j];
where_to_put_it=0;
for (int j = i; j < 2000; j++) {
for (int k = 0; k < 9000; k++) {
where_to_put_i += tmp_mat(k,i) * tmp_mat(k,j);
}
}
cout << "J: " << i << endl;
}

... and if we knew more details, we could give more suggestions.

Best regards,

-- Zara

Oct 28 '05 #3
Hi,
This doesn't look like the same code. I think you're using j before
declaring it; possibly you want to move the lines around like this:

for (int i = 0; i < 2000; i++) {
    for (int j = i; j < 2000; j++) {
        float& where_to_put_it = data2d[i][j];
        where_to_put_it = 0;
        for (int k = 0; k < 9000; k++) {
            where_to_put_it += tmp_mat(k,i) * tmp_mat(k,j);
        }
    }
    cout << "J: " << i << endl;
}

Nice idea using a reference, though I suspect this will run at the same
speed. How about this instead:

for (int i = 0; i < 2000; i++)
{
    float* where_is_i = data2d[i];
    for (int j = i; j < 2000; j++)
    {
        float& where_to_put_it = where_is_i[j];
        where_to_put_it = 0;
        for (int k = 0; k < 9000; k++)
        {
            where_to_put_it += tmp_mat(k,i) * tmp_mat(k,j);
        }
    }
    cout << "J: " << i << endl;
}

The idea here is to reduce the number of times that you de-reference
data2d with the same value of i.
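
A minimal sketch of the same idea taken one step further: look up the row once per i, accumulate in a local variable, and store once per (i, j) pair. The thread never shows how tmp_mat or data2d are declared, so the function name upper_gram, the column-major std::vector<float> standing in for tmp_mat, and the vector of vectors standing in for data2d are all assumptions made for illustration. Whether this is measurably faster depends on the compiler and the optimisation level.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: fills the upper triangle of `out` (cols x cols) with
// the dot products of the columns of `m`.
// `m` is assumed to be column-major: element (k, i) lives at m[i * rows + k].
void upper_gram(const std::vector<float>& m, std::size_t rows, std::size_t cols,
                std::vector<std::vector<float>>& out) {
    for (std::size_t i = 0; i < cols; ++i) {
        float* out_row = out[i].data();            // row lookup done once per i
        const float* col_i = m.data() + i * rows;
        for (std::size_t j = i; j < cols; ++j) {
            const float* col_j = m.data() + j * rows;
            float acc = 0.0f;                      // local accumulator
            for (std::size_t k = 0; k < rows; ++k) {
                acc += col_i[k] * col_j[k];
            }
            out_row[j] = acc;                      // one store per (i, j)
        }
    }
}
```

With the poster's sizes this would be called with rows = 9000 and cols = 2000, and `out` sized 2000 x 2000 beforehand. Entries below the diagonal are left untouched.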

Oct 28 '05 #4
BRG
Mario Lüttich wrote:
[snip original post]


The speed difference is not surprising since the compiler has probably
noticed that the loop on k in the first example can be eliminated since
its output value (in tmp) is not used.
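
To make this concrete, here is a small sketch of the two situations. ColMatrix is a hypothetical stand-in for tmp_mat (the thread never shows its type), and both function names are made up for illustration. In discard_products nothing observable depends on tmp, so an optimising compiler is allowed to delete the loops entirely. In sum_products the result is returned, so the arithmetic has to be performed.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for tmp_mat: a dense matrix stored column by column,
// so that element (k, i) lives at data[i * rows + k].
struct ColMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<float> data;

    ColMatrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0f) {}

    float& operator()(std::size_t k, std::size_t i) { return data[i * rows + k]; }
    float operator()(std::size_t k, std::size_t i) const { return data[i * rows + k]; }
};

// Same shape as the first loop in the question: tmp is computed and then
// thrown away, so the compiler may treat all of this as dead code.
void discard_products(const ColMatrix& m) {
    float tmp = 0.0f;
    for (std::size_t i = 0; i < m.cols; ++i) {
        for (std::size_t j = i; j < m.cols; ++j) {
            for (std::size_t k = 0; k < m.rows; ++k) {
                tmp += m(k, i) * m(k, j);
            }
            tmp = 0.0f;  // result discarded
        }
    }
}

// The result escapes through the return value, so the work cannot be removed.
float sum_products(const ColMatrix& m) {
    float total = 0.0f;
    for (std::size_t i = 0; i < m.cols; ++i) {
        for (std::size_t j = i; j < m.cols; ++j) {
            float tmp = 0.0f;
            for (std::size_t k = 0; k < m.rows; ++k) {
                tmp += m(k, i) * m(k, j);
            }
            total += tmp;
        }
    }
    return total;
}
```

A fair timing comparison would be between two versions that both use their results, for example sum_products against a version that stores each tmp into data2d.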
Oct 28 '05 #5
Mario Lüttich wrote:
[snip original post]

Mario,

Use pointers to loop through data2d, incrementing them as required, as
in the code shown below. Incrementing pointers can be quicker than
recomputing the index into the array from i and j.

int data[10][10];
int (*pRow)[10] = data, *pCol;
int i, j;

for (i = 0; i < 10; ++i, ++pRow)
{
    pCol = *pRow;
    for (j = 0; j < 10; ++j, ++pCol)
    {
        *pCol = i * j;
    }
}

JFJB
Oct 28 '05 #6
n2xssvv g02gfr12930 wrote:
[snip quoted original post and pointer example]


Thanks, everybody!

greetings

Mario
Oct 31 '05 #7
Mario Lüttich wrote:
Hi

this code:
[snip code that does not store the results anywhere]
is 100 times faster than this
[snip other code]
can anybody help me?


Look at the assembler file generated by the compiler and see what it is
doing. Step through the code instruction by instruction in a debugger.
If the optimiser is eliminating the inner loop, then that code will not
be exercised. Try turning the optimiser off and seeing whether the
versions are still so dissimilar. If one version is calling a library
routine instead of using built-in instructions, you will notice that
too. If the assembly looks fine, then it is possible that architecture
issues (cache lines, FPU pipelines) are dominating. It's interesting to
see what your compiler is actually doing.
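
As a rough sketch of how to follow this advice: the thread does not say which compiler is in use, but with GCC, for example, `g++ -O2 -S file.cpp` writes the generated assembly to file.s, and `-O0` turns the optimiser off. For the timing side, a small helper like the one below can wrap each version of the loop. The name time_ms is made up for this example.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: runs `work` once and returns the elapsed wall-clock
// time in milliseconds, measured with a monotonic clock.
template <typename F>
double time_ms(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<F>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

It could be used as `double ms = time_ms([&] { /* one version of the loop */ });`. The measured code should write its result somewhere that is read afterwards, otherwise the optimiser may remove it and the timing will be meaningless.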

Oct 31 '05 #8
On 28 Oct 2005 08:45:40 -0700, "chrisg67" <ch******@gmail.com> wrote:
> This doesn't look like the same code, I think you're dereferencing j
> before declaring it
> [snip corrected code]
Sure. Writing and not testing => bugging ;-)

> Nice idea using a reference though
> [snip row-pointer version]
> The idea here is to reduce the number of times that you de-reference
> data2d with the same value of i.


OK. What about this one?

// assumes data2d is an array of row pointers (float**)
float **where_is_i = data2d;
for (int i = 0; i < 2000; ++i, ++where_is_i) {
    // start at column i of row i, since j starts at i
    float *where_to_put_it = *where_is_i + i;
    for (int j = i; j < 2000; ++j, ++where_to_put_it) {
        *where_to_put_it = 0;
        for (int k = 0; k < 9000; k++) {
            *where_to_put_it += tmp_mat(k,i) * tmp_mat(k,j);
        }
    }
    cout << "J: " << i << endl;
}

It has more additions and fewer multiplications/shifts. BTW, I have not
tested it; it's just for fun.
Oct 31 '05 #9
Zara wrote:
[snip quoted discussion and pointer version]

This one is faster. And my first case was without the last loop.

Thanks to everyone who helped me.
Nov 1 '05 #10
