Which if faster?

Ganesh

Hi,

This is a question that pertains to pointers in general (C or C++).
Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a[i]...

(or)

for (int i = 0; i < N; i++)
= ...*a++....
'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why? Or is that dependent on architecture or compiler?

Oct 24 '06 #1

Subscribe Reply

1690

Michael

Which of the following is faster and why?

>
for (int i = 0; i < N; i++)
= ... a[i]...

(or)

for (int i = 0; i < N; i++)
= ...*a++....
'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why? Or is that dependent on architecture or compiler?

Two answers:
1) For any reasonably modern optimizing compiler, they should be the
same speed.
2) This kind of micro-optimization will almost never matter; better to
spend time on fixing slow algorithms, handling slow IO, etc. Whatever
a profiler tells you is the part of the code that is *actually* taking
time.

Michael

Oct 24 '06 #2

Salt_Peter

Ganesh wrote:

Hi,

This is a question that pertains to pointers in general (C or C++).
Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a[i]...

(or)

for (int i = 0; i < N; i++)
= ...*a++....
'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why? Or is that dependent on architecture or compiler?

Probably because the first implies a cast.

Ironically, you'ld probably get "faster" with:
for( size_t i = 0; i < N; ++i)
{
// your choice
}

Oct 24 '06 #3

Frederick Gotham

Ganesh posted:

Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a[i]...

(or)

for (int i = 0; i < N; i++)
= ...*a++....

On most systems, the following is the fastest:

int *p = arr;
int const *const pover = arr+len;

do *p++ = ...
while (pover != p);

On systems which have a CPU instruction which takes both a pointer and an
offset, the following is fastest:

size_t i = 0;

do arr[i++] = ...
while (len != i);

When writing portable code, I opt for the former.

--

Frederick Gotham

Oct 24 '06 #4

Howard

"Salt_Peter" <pj*****@yahoo.comwrote in message
news:11*********************@m73g2000cwd.googlegro ups.com...

>
Ganesh wrote:
>Hi,

This is a question that pertains to pointers in general (C or C++).
Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a[i]...

(or)

for (int i = 0; i < N; i++)
= ...*a++....
'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why? Or is that dependent on architecture or compiler?

Probably because the first implies a cast.

Ironically, you'ld probably get "faster" with:
for( size_t i = 0; i < N; ++i)
{
// your choice
}

Huh? The "int" type is the "native" integral type on any machine. Why
would there be a cast using "int i" to index a dynamically allocated array
(or any array for that matter)? The question wasn't about vectors, but
about raw pointers and arrays. I'd think that a "conversion" from size_t to
int would be needed if using your code and indexing the array with it. But
I see no cast implied or required anywhere in the OP's code.
The "usual" assumption is that the latter method (using pointers and
incrementing) is faster, on the assumption that indexing requires
multiplying the index by the size of an element, and adding that as an
offset to the array start. The assumption is false. That's one way to
implement indexing, but compilers are not required to implement it that way.

Indeed, a decent optimizing compiler is likely to take advantage of whatever
native machine code would do the job best, whether that's by using code more
like the pointer version, or by pre-loading some registers and calling a
built-in machine instruction which handles loops with one simple command.
Or any other solution they see fit to use.

Compiler makers undoubtedly profile their solutions. You should do the same
if you want to know whether you need to change your code, and whether a
proposed solution is a good one for your particular platform.

-Howard

Oct 24 '06 #5

Rolf Magnus

Ganesh wrote:

Hi,

This is a question that pertains to pointers in general (C or C++).
Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a[i]...

(or)

for (int i = 0; i < N; i++)
= ...*a++....
'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why?

I have seen that happen on some compiler. Depending on the optimization
capabilities of the compiler and on the type that a points to, it might
happen that i gets multiplied with that type's size on each iteration for
the first solution, while the latter does an addition instead. Since on
many architectures, an addition is quicker than a multiplication, it can
happen that the second one is faster.

Or is that dependent on architecture or compiler?

It is.

Oct 24 '06 #6

Howard

"Frederick Gotham" <fg*******@SPAM.comwrote in message
news:j3*******************@news.indigo.ie...

Ganesh posted:

>Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a[i]...

(or)

for (int i = 0; i < N; i++)
= ...*a++....

On most systems, the following is the fastest:

int *p = arr;
int const *const pover = arr+len;

do *p++ = ...
while (pover != p);

On systems which have a CPU instruction which takes both a pointer and an
offset, the following is fastest:

size_t i = 0;

do arr[i++] = ...
while (len != i);

Can you back those statements up with documentation?

There are many forms of optimizations a compiler might use to implement a
loop, regardless of what the original code looks like. The best way to know
what's fastest is to test it on your platform. And that's _still_ no
guarantee that you've actually hit upon the absolute fastest method for your
compiler and platform.

-Howard

Oct 24 '06 #7

Frederick Gotham

Howard posted:

Can you back those statements up with documentation?

There are many forms of optimizations a compiler might use to implement
a loop, regardless of what the original code looks like. The best way
to know what's fastest is to test it on your platform. And that's
_still_ no guarantee that you've actually hit upon the absolute fastest
method for your compiler and platform.

A few of us tested them over on comp.lang.c. Here's the relevant thread:

http://groups.google.ie/group/comp.l...c11a76eb728ad/
2e09898d4ca430be?lnk=st&q=&rnum=6&hl=en#2e09898d4c a430be

--

Frederick Gotham

Oct 24 '06 #8

Victor Bazarov

Frederick Gotham wrote:

Ganesh posted:

>Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a[i]...

(or)

for (int i = 0; i < N; i++)
= ...*a++....

On most systems, the following is the fastest:

int *p = arr;
int const *const pover = arr+len;

do *p++ = ...
while (pover != p);

On systems which have a CPU instruction which takes both a pointer
and an offset, the following is fastest:

size_t i = 0;

do arr[i++] = ...
while (len != i);

When writing portable code, I opt for the former.

They are not semantially equivalent. You need an 'if' in front of
the 'do' to get them there.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Oct 24 '06 #9

Frederick Gotham

Victor Bazarov posted:

>On most systems, the following is the fastest:

int *p = arr;
int const *const pover = arr+len;

do *p++ = ...
while (pover != p);

On systems which have a CPU instruction which takes both a pointer
and an offset, the following is fastest:

size_t i = 0;

do arr[i++] = ...
while (len != i);

When writing portable code, I opt for the former.

They are not semantially equivalent. You need an 'if' in front of
the 'do' to get them there.

Sorry, I don't follow. . .

(Also, the code assumes that len is non-zero)

--

Frederick Gotham

Oct 24 '06 #10

Victor Bazarov

Frederick Gotham wrote:

Victor Bazarov posted:

>>On most systems, the following is the fastest:

int *p = arr;
int const *const pover = arr+len;

do *p++ = ...
while (pover != p);

On systems which have a CPU instruction which takes both a pointer
and an offset, the following is fastest:

size_t i = 0;

do arr[i++] = ...
while (len != i);

When writing portable code, I opt for the former.

They are not semantially equivalent. You need an 'if' in front of
the 'do' to get them there.

Sorry, I don't follow. . .

(Also, the code assumes that len is non-zero)

Assumption is the mother of all f***-ups.

for (int i = 0; i < len; +i)
blah;

is not semantically equivalent to

int i = 0;
do
blah;
while (++i < len);

, that's all.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Oct 24 '06 #11

Howard

"Frederick Gotham" <fg*******@SPAM.comwrote in message
news:7f*******************@news.indigo.ie...

Howard posted:

>Can you back those statements up with documentation?

Why did you clip what I was responding to? Now nobody reading this will
know what the heck I was talking about. Here are your statements that I was
challenging:

On most systems, the following is the fastest:

int *p = arr;
int const *const pover = arr+len;

do *p++ = ...
while (pover != p);

On systems which have a CPU instruction which takes both a pointer and an
offset, the following is fastest:

size_t i = 0;

do arr[i++] = ...
while (len != i);

>
A few of us tested them over on comp.lang.c. Here's the relevant thread:

http://groups.google.ie/group/comp.l...c11a76eb728ad/
2e09898d4ca430be?lnk=st&q=&rnum=6&hl=en#2e09898d4c a430be

I see a couple tests, but nothing which makes your statements always true,
even for the specific code you gave (let alone for the more general case the
OP asked about). How do you know that my compiler won't be able to unroll a
loop? Or that it won't load up a couple registers and call a single machine
instruction that performs the loop? (I've used those many times when
writing assembly code to handle simple operations on arrays.)

I think that you can really only say that, on your machine, using your
version of your compiler, your specific tests showed some specific results.
It's not valid to extend that to: "on any machine which has such-and-such a
feature, the following code is always fastest". There's simply no way to
know that for sure.

And even if you somehow demonstrated that it's true on all existing
compilers on all existing machines, that would be no guarantee that someone
couldn't write a compiler to implement it faster on one or more of those
machines. Or make a machine which will do it faster using some other
implementation.

-Howard

Oct 24 '06 #12

Howard

"Victor Bazarov" <v.********@comAcast.netwrote in message
news:eh**********@news.datemas.de...

Frederick Gotham wrote:
>Victor Bazarov posted:

>
Assumption is the mother of all f***-ups.

for (int i = 0; i < len; +i)
blah;

Typing mistakes are certainly a close second. :-)

That should obviously be "++i", not "+i".

-Howard

Oct 24 '06 #13

Clark S. Cox III

Salt_Peter wrote:

Ganesh wrote:
>Hi,

This is a question that pertains to pointers in general (C or C++).
Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a[i]...

(or)

for (int i = 0; i < N; i++)
= ...*a++....
'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why? Or is that dependent on architecture or compiler?

Probably because the first implies a cast.

It implies no such thing.

Ironically, you'ld probably get "faster" with:
for( size_t i = 0; i < N; ++i)
{
// your choice
}

I'd consider any compiler that produces significantly different code broken.

--
Clark S. Cox III
cl*******@gmail.com

Oct 24 '06 #14

Victor Bazarov

Howard wrote:

"Victor Bazarov" <v.********@comAcast.netwrote in message
news:eh**********@news.datemas.de...
>Frederick Gotham wrote:
>>Victor Bazarov posted:

>>
Assumption is the mother of all f***-ups.

for (int i = 0; i < len; +i)
blah;

Typing mistakes are certainly a close second. :-)

That should obviously be "++i", not "+i".

That's all my keyboard's fault! And since I'm using an MS Natural
thing, it's all the fault of MS!

Oct 24 '06 #15

Victor Bazarov

Clark S. Cox III wrote:

Salt_Peter wrote:
>Ganesh wrote:
>>Hi,

This is a question that pertains to pointers in general (C or C++).
Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a[i]...

(or)

for (int i = 0; i < N; i++)
= ...*a++....
'a' is a dynamically allocated array of any valid scalar type and
of size N. I have read somewhere that the latter is faster. But
could not understand why? Or is that dependent on architecture or
compiler?

Probably because the first implies a cast.

It implies no such thing.

>Ironically, you'ld probably get "faster" with:
for( size_t i = 0; i < N; ++i)
{
// your choice
}

I'd consider any compiler that produces significantly different code
broken.

Just to throw some fire into the oil, the former case is much easier
parallelizable than then latter.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Oct 24 '06 #16

Clark S. Cox III

Frederick Gotham wrote:

Howard posted:

>Can you back those statements up with documentation?

There are many forms of optimizations a compiler might use to implement
a loop, regardless of what the original code looks like. The best way
to know what's fastest is to test it on your platform. And that's
_still_ no guarantee that you've actually hit upon the absolute fastest
method for your compiler and platform.

A few of us tested them over on comp.lang.c. Here's the relevant thread:

http://groups.google.ie/group/comp.l...c11a76eb728ad/
2e09898d4ca430be?lnk=st&q=&rnum=6&hl=en#2e09898d4c a430be

For every assertion that you made in that thread about which one was
generally faster, someone presented a counter-example.
--
Clark S. Cox III
cl*******@gmail.com

Oct 24 '06 #17

Ganesh

Ganesh wrote:

Hi,

This is a question that pertains to pointers in general (C or C++).
Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a[i]...

(or)

for (int i = 0; i < N; i++)
= ...*a++....
'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why? Or is that dependent on architecture or compiler?

In my architecture (Intel Pentium IV 3 GHz, 1MG Cache ) I got the
following times for 10 runs on a reasonably huge array size for huge
number of iterations.

Array Access (seconds): 5 5 5 4 5 5 5 5 5 5
Pointer Access: 4 4 3 4 4 4 4 4 4
3

On the average, pointer access is faster, but array access also equals
the pointer access sometimes. Not sure why this happens. The assembly
code is different for each case. For array access, it shows
..
addl (%esi, %eax,4), %ebx
incl %eax
....
,
while for pointer access it generates
....
addl %eax, %ebx,
addl $4, %eax
....

Oct 24 '06 #18

BobR

Victor Bazarov wrote in message ...

>Howard wrote:
>"Victor Bazarov" wrote in message
>>Frederick Gotham wrote:
Victor Bazarov posted:

Assumption is the mother of all f***-ups.

for (int i = 0; i < len; +i)
blah;

Typing mistakes are certainly a close second. :-)

That should obviously be "++i", not "+i".

That's all my keyboard's fault! And since I'm using an MS Natural
thing, it's all the fault of MS!

Nice save, Victor!!
Does *that* keyboard have a BSOD feature? Crash daily?

--
Bob <GR
POVrookie

Oct 24 '06 #19

Frederick Gotham

Howard posted:

Why did you clip what I was responding to? Now nobody reading this will
know what the heck I was talking about.

Of course they will, by the miracle of threads! You do have threads... don't
you? Or do you still have candles for lightbulbs?

And even if you somehow demonstrated that it's true on all existing
compilers on all existing machines, that would be no guarantee that
someone couldn't write a compiler to implement it faster on one or more
of those machines. Or make a machine which will do it faster using some
other implementation.

Indeed you're right. When writing portable code, I make an educated guess as
to what strategy will produce high efficiency across the board. Looking at
the tests done all today's computers, I opt for the *p++ solution.

--

Frederick Gotham

Oct 24 '06 #20

Frederick Gotham

Victor Bazarov posted:

Assumption is the mother of all f***-ups.

for (int i = 0; i < len; +i)
blah;

is not semantically equivalent to

int i = 0;
do
blah;
while (++i < len);

, that's all.

I make assumptions all the time -- I belive we call them "pre-conditions" in
the programming world. For instance, I can write an algorithm which deals
with an array, and say:

"The array must have at least two elements."

Then, if somebody gets bogus results when they supply it with either:

(1) A null pointer.
(2) An empty array (if there's such a thing).
(3) A sole-element array.

, then I'll just say tough luck. The C++ Standard seems to agree with me,
guess what happens when you supply "strlen" with a null pointer.

That said though, I do make liberal use of "assert".

--

Frederick Gotham

Oct 24 '06 #21

Gernot Frisch

>for (int i = 0; i < N; i++)
= ... a[i]...

On most systems, the following is the fastest:

int *p = arr;
int const *const pover = arr+len;

do *p++ = ...
while (pover != p);

Even for N==0 ? :/

Oct 25 '06 #22

Victor Bazarov

Frederick Gotham wrote:

Victor Bazarov posted:

>Assumption is the mother of all f***-ups.

for (int i = 0; i < len; +i)
blah;

is not semantically equivalent to

int i = 0;
do
blah;
while (++i < len);

, that's all.

I make assumptions all the time -- I belive we call them
"pre-conditions" in the programming world. For instance, I can write
an algorithm which deals with an array, and say:

"The array must have at least two elements."
[..]
That said though, I do make liberal use of "assert".

Good. The code you posted didn't have any.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Oct 25 '06 #23

Frederick Gotham

Victor Bazarov posted:

>That said though, I do make liberal use of "assert".

Good. The code you posted didn't have any.

But "assert" isn't fool-proof (unless you decide to not define NDEBUG in
Release Mode).

--

Frederick Gotham

Oct 25 '06 #24

Victor Bazarov

Frederick Gotham wrote:

Victor Bazarov posted:

>>That said though, I do make liberal use of "assert".

Good. The code you posted didn't have any.

But "assert" isn't fool-proof (unless you decide to not define NDEBUG
in Release Mode).

Nothing is fool-proof. Your code, however, didn't even have any
comments explaining what assumptions had been made.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Oct 25 '06 #25

persenaama

Howard wrote:

On most systems, the following is the fastest:

int *p = arr;
int const *const pover = arr+len;

do *p++ = ...
while (pover != p);

On systems which have a CPU instruction which takes both a pointer and an
offset, the following is fastest:

size_t i = 0;

do arr[i++] = ...
while (len != i);

Can you back those statements up with documentation?

As "int" is used there as the type the point is less relevant than when
something else is being used. Example:

mytype* p = ...;
p[i] ... ++i
and..
*p++;

Have one major difference from the point of view of large number of
platforms. It's common to have somesort of
register+register*scale(+constant) addressing mode. This works nicely
when the scale can match the size of the type in question.

If however, say, the size isn't a power of two or for some other
practical reason not matching the scale supported by a particular
platform, the p[i] implies multiplication (or efficient implementation
of multiplication by a constant), where as the p++ will imply ADDITION.

This is not really a C++ consideration and off-topic anyways, but a
practical consideration. Even so, I myself prefer the array indexing
over pointer dereference-and-increase. Why? Because it's more readable.

I could loop towards zero. I could use do-while or while loop if I know
some interesting detail about architechture and compiler I am using.
But I still wouldn't take advantage of this so-called knowledge. The
reason is simple: readability and clarity. I'm well aware that most
code that I write isn't performance critical, so I really don't want to
pay the cost of intentionally making code more complicated that it
needs to be.

On a register starved system it might be advantageous to use the
pointer -and- terminating end-of-iteration pointer to avoid spilling if
nothing else. But for the same reason as before, optimizing at cost of
clarity should be avoided unless necessary.. which for most code simply
isn't.

There's more to be said but most points converge to unnecessary
optimization and clarity issues..

Oct 25 '06 #26

Which if faster?

Similar topics