for loop speed problem

Hi

this code:

float tmp = 0.;
for (int i = 0; i < 2000; i++) {
    for (int j = i; j < 2000; j++) {
        for (int k = 0; k < 9000; k++) {
            tmp += tmp_mat(k,i) * tmp_mat(k,j);
        }
        tmp = 0.;
    }
    cout << "J: " << i << endl;
}
is 100 times faster than this

float tmp = 0.;
for (int i = 0; i < 2000; i++) {
    for (int j = i; j < 2000; j++) {
        for (int k = 0; k < 9000; k++) {
            tmp += tmp_mat(k,i) * tmp_mat(k,j);
        }
        data2d[i][j] = tmp; // the problem
        tmp = 0.;
    }
    cout << "J: " << i << endl;
}

can anybody help me?

greetings

Mario

Oct 28 '05 #1

Mario Lüttich wrote [10/28/2005 08:32 AM] :

> this code:
> [snip code that does not store the results anywhere]
> is 100 times faster than this
> [snip code that stores each result in data2d[i][j]]
> can anybody help me?

What is the type of data2d? Is it a 2D array?

Oct 28 '05 #2

Zara

On Fri, 28 Oct 2005 15:32:51 +0200, Mario Lüttich <ml*****@gwdg.de>
wrote:

> this code:
> [snip code that does not store the results anywhere]
> is 100 times faster than this
> [snip code that stores each result in data2d[i][j]]
> can anybody help me?

What about the following?

for (int i = 0; i < 2000; i++) {
    float& where_to_put_it = data2d[i][j];
    where_to_put_it = 0;
    for (int j = i; j < 2000; j++) {
        for (int k = 0; k < 9000; k++) {
            where_to_put_it += tmp_mat(k,i) * tmp_mat(k,j);
        }
    }
    cout << "J: " << i << endl;
}

... and if we knew more details, we could give more suggestions.

Best regards,

-- Zara

Oct 28 '05 #3

chrisg67

Hi,
This doesn't look like the same code. I think you're using j before
declaring it; possibly you want to move the lines around like this:

for (int i = 0; i < 2000; i++) {
    for (int j = i; j < 2000; j++) {
        float& where_to_put_it = data2d[i][j];
        where_to_put_it = 0;
        for (int k = 0; k < 9000; k++) {
            where_to_put_it += tmp_mat(k,i) * tmp_mat(k,j);
        }
    }
    cout << "J: " << i << endl;
}

Nice idea using a reference, though I suspect this will run at the
same speed. How about this instead:

for (int i = 0; i < 2000; i++)
{
    float* where_is_i = data2d[i];
    for (int j = i; j < 2000; j++)
    {
        float& where_to_put_it = where_is_i[j];
        where_to_put_it = 0;
        for (int k = 0; k < 9000; k++)
        {
            where_to_put_it += tmp_mat(k,i) * tmp_mat(k,j);
        }
    }
    cout << "J: " << i << endl;
}

The idea here is to reduce the number of times that you de-reference
data2d with the same value of i.

Oct 28 '05 #4

BRG

Mario Lüttich wrote:

> this code:
> [snip code that does not store the results anywhere]
> is 100 times faster than this
> [snip code that stores each result in data2d[i][j]]
> can anybody help me?

The speed difference is not surprising: the compiler has probably
noticed that the loop over k in the first example can be eliminated,
because its result (in tmp) is never used.
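
To make the point above concrete, here is a minimal sketch of the two situations. The Matrix type is a hypothetical stand-in for tmp_mat (its real type is not shown in the thread), and the flat output vector plays the role of data2d; both are assumptions made only for illustration. Whether a given compiler really removes the first loop depends on the compiler and the optimization level.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for tmp_mat: a dense float matrix stored column by
// column, so that m(k, i) walks contiguous memory as k grows.
struct Matrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<float> data;

    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0f) {}

    float& operator()(std::size_t k, std::size_t i) { return data[i * rows + k]; }
    float operator()(std::size_t k, std::size_t i) const { return data[i * rows + k]; }
};

// Like the first loop in the question: tmp is reset and never read, so an
// optimizing compiler is allowed to delete the k loop (or all three loops).
void products_discarded(const Matrix& m) {
    float tmp = 0.0f;
    for (std::size_t i = 0; i < m.cols; ++i) {
        for (std::size_t j = i; j < m.cols; ++j) {
            for (std::size_t k = 0; k < m.rows; ++k) {
                tmp += m(k, i) * m(k, j);
            }
            tmp = 0.0f;  // result thrown away
        }
    }
}

// Like the second loop: every dot product is stored and returned, so the
// work is observable and has to be carried out.
std::vector<float> products_stored(const Matrix& m) {
    std::vector<float> out(m.cols * m.cols, 0.0f);
    for (std::size_t i = 0; i < m.cols; ++i) {
        for (std::size_t j = i; j < m.cols; ++j) {
            float tmp = 0.0f;
            for (std::size_t k = 0; k < m.rows; ++k) {
                tmp += m(k, i) * m(k, j);
            }
            out[i * m.cols + j] = tmp;  // plays the role of data2d[i][j]
        }
    }
    return out;
}
```

Timing products_discarded against products_stored therefore compares "possibly no work at all" with the full 2000 x 2000 x 9000 computation, which would be consistent with the 100x gap reported in the question.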

Oct 28 '05 #5

n2xssvv g02gfr12930

Mario Lüttich wrote:

> this code:
> [snip code that does not store the results anywhere]
> is 100 times faster than this
> [snip code that stores each result in data2d[i][j]]
> can anybody help me?

Mario,

Use pointers to loop through data2d, incrementing them as required,
as in the code shown below. Incrementing pointers can be quicker than
the calculations needed to index the array with i and j.

int data[10][10];
int (*pRow)[10] = data, *pCol;
int i, j;

for (i = 0; i < 10; ++i, ++pRow)
{
    pCol = *pRow;
    for (j = 0; j < 10; ++j, ++pCol)
    {
        *pCol = i * j;
    }
}

JFJB

Oct 28 '05 #6

Mario Lüttich

n2xssvv g02gfr12930 wrote:

> [snip nested quote of the original post]
> Use pointers to loop through data2d incrementing them as required
> like the code shown below. Incrementing pointers is quicker than the
> calculations to index the array with i and j.
> [snip pointer example]

Thanks, everybody!

greetings

Mario

Oct 31 '05 #7

carlmuller

Mario Lüttich wrote:

> Hi
>
> this code:
> [snip code that does not store the results anywhere]
> is 100 times faster than this
> [snip other code]
> can anybody help me?

Look at the assembler file generated by the compiler and see what it is
doing. Step through the code instruction by instruction in a debugger.
If the optimiser is eliminating the inner loop, then that code will not
be exercised. Try turning the optimiser off and see whether the two
versions are still so dissimilar. If one version is calling a library
routine instead of using built-in instructions, you will notice that
too. If the assembly looks fine, then it is possible that architecture
issues (cache lines, FPU pipelines) are dominating. It's interesting to
see what your compiler is actually doing.
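
As a companion to the advice above, here is a small sketch of how the two versions could be timed on an equal footing. The names seconds_taken, g_sink and dot are hypothetical helpers invented for this example and are not part of the original code. The volatile sink is one common way to keep a result "used" so that the optimiser cannot drop the loop being measured. With GCC, for instance, the -S option writes the generated assembly to a file and -O0 / -O2 switch the optimisation level.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: runs work() once and returns the elapsed wall-clock
// time in seconds, measured with a monotonic clock.
template <typename Work>
double seconds_taken(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Work>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// A store to a volatile object is an observable side effect, so the
// optimiser may not discard the computation whose result is written here.
volatile float g_sink = 0.0f;

// Small dot product, standing in for the inner k loop of the question.
float dot(const float* a, const float* b, int n) {
    float sum = 0.0f;
    for (int k = 0; k < n; ++k) {
        sum += a[k] * b[k];
    }
    return sum;
}
```

A call such as seconds_taken([&] { g_sink = dot(a, b, n); }) times the loop while keeping its result alive. If the "fast" version slows down to match the "slow" one once its result is written to g_sink, the original difference came from dead-code elimination rather than from the store to data2d.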

Oct 31 '05 #8

Zara

On 28 Oct 2005 08:45:40 -0700, "chrisg67" <ch******@gmail.com> wrote:

> Hi,
> This doesn't look like the same code, I think you're dereferencing j
> before declaring it, possibly you want to move the lines around like
> this: -
> [snip corrected code]

[...] bugging ;-)

> Nice idea using a reference though, although I suspect this will be the
> same speed, how about this instead: -
> [snip code that hoists data2d[i] out of the j loop]
>
> The idea here is to reduce the number of times that you de-reference
> data2d with the same value of i.

OK. What about this one?

float **where_is_i = data2d;
for (int i = 0; i < 2000; ++i, ++where_is_i) {
    // start at column i of the current row, since j starts at i
    float *where_to_put_it = *where_is_i + i;
    for (int j = i; j < 2000; ++j, ++where_to_put_it) {
        *where_to_put_it = 0;
        for (int k = 0; k < 9000; k++) {
            *where_to_put_it += tmp_mat(k,i) * tmp_mat(k,j);
        }
    }
    cout << "J: " << i << endl;
}

It has more additions and fewer multiplications/shifts. BTW, I have
not tested it; it's just for fun.

Oct 31 '05 #9

Mario Lüttich

Zara wrote:

> [snip nested quote of chrisg67's post]
> OK. What about this one?
> [snip code]
> It has more additions, and less multiplications/shiftings. BTW, not
> tested it, just for fun.

This one is faster. And my first case was without the last loop.

Thanks to everyone who helped me.

Nov 1 '05 #10
