473,486 Members | 2,136 Online
Bytes | Software Development & Data Engineering Community
Create Post

Home Posts Topics Members FAQ

Which if faster?

Hi,

This is a question that pertains to pointers in general (C or C++).
Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a[i]...

(or)

for (int i = 0; i < N; i++)
= ...*a++....
'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why? Or is that dependent on architecture or compiler?

Oct 24 '06 #1
25 1690
Which of the following is faster and why?
>
for (int i = 0; i < N; i++)
= ... a[i]...

(or)

for (int i = 0; i < N; i++)
= ...*a++....
'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why? Or is that dependent on architecture or compiler?
Two answers:
1) For any reasonably modern optimizing compiler, they should be the
same speed.
2) This kind of micro-optimization will almost never matter; better to
spend time on fixing slow algorithms, handling slow IO, etc. Whatever
a profiler tells you is the part of the code that is *actually* taking
time.

Michael

Oct 24 '06 #2

Ganesh wrote:
Hi,

This is a question that pertains to pointers in general (C or C++).
Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a[i]...

(or)

for (int i = 0; i < N; i++)
= ...*a++....
'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why? Or is that dependent on architecture or compiler?
Probably because the first implies a cast.

Ironically, you'ld probably get "faster" with:
for( size_t i = 0; i < N; ++i)
{
// your choice
}

Oct 24 '06 #3
Ganesh posted:
Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a[i]...

(or)

for (int i = 0; i < N; i++)
= ...*a++....

On most systems, the following is the fastest:

int *p = arr;
int const *const pover = arr+len;

do *p++ = ...
while (pover != p);

On systems which have a CPU instruction which takes both a pointer and an
offset, the following is fastest:

size_t i = 0;

do arr[i++] = ...
while (len != i);

When writing portable code, I opt for the former.

--

Frederick Gotham
Oct 24 '06 #4

"Salt_Peter" <pj*****@yahoo.comwrote in message
news:11*********************@m73g2000cwd.googlegro ups.com...
>
Ganesh wrote:
>Hi,

This is a question that pertains to pointers in general (C or C++).
Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a[i]...

(or)

for (int i = 0; i < N; i++)
= ...*a++....
'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why? Or is that dependent on architecture or compiler?

Probably because the first implies a cast.

Ironically, you'ld probably get "faster" with:
for( size_t i = 0; i < N; ++i)
{
// your choice
}
Huh? The "int" type is the "native" integral type on any machine. Why
would there be a cast using "int i" to index a dynamically allocated array
(or any array for that matter)? The question wasn't about vectors, but
about raw pointers and arrays. I'd think that a "conversion" from size_t to
int would be needed if using your code and indexing the array with it. But
I see no cast implied or required anywhere in the OP's code.
The "usual" assumption is that the latter method (using pointers and
incrementing) is faster, on the assumption that indexing requires
multiplying the index by the size of an element, and adding that as an
offset to the array start. The assumption is false. That's one way to
implement indexing, but compilers are not required to implement it that way.

Indeed, a decent optimizing compiler is likely to take advantage of whatever
native machine code would do the job best, whether that's by using code more
like the pointer version, or by pre-loading some registers and calling a
built-in machine instruction which handles loops with one simple command.
Or any other solution they see fit to use.

Compiler makers undoubtedly profile their solutions. You should do the same
if you want to know whether you need to change your code, and whether a
proposed solution is a good one for your particular platform.

-Howard

Oct 24 '06 #5
Ganesh wrote:
Hi,

This is a question that pertains to pointers in general (C or C++).
Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a[i]...

(or)

for (int i = 0; i < N; i++)
= ...*a++....
'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why?
I have seen that happen on some compiler. Depending on the optimization
capabilities of the compiler and on the type that a points to, it might
happen that i gets multiplied with that type's size on each iteration for
the first solution, while the latter does an addition instead. Since on
many architectures, an addition is quicker than a multiplication, it can
happen that the second one is faster.
Or is that dependent on architecture or compiler?
It is.

Oct 24 '06 #6

"Frederick Gotham" <fg*******@SPAM.comwrote in message
news:j3*******************@news.indigo.ie...
Ganesh posted:
>Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a[i]...

(or)

for (int i = 0; i < N; i++)
= ...*a++....


On most systems, the following is the fastest:

int *p = arr;
int const *const pover = arr+len;

do *p++ = ...
while (pover != p);

On systems which have a CPU instruction which takes both a pointer and an
offset, the following is fastest:

size_t i = 0;

do arr[i++] = ...
while (len != i);
Can you back those statements up with documentation?

There are many forms of optimizations a compiler might use to implement a
loop, regardless of what the original code looks like. The best way to know
what's fastest is to test it on your platform. And that's _still_ no
guarantee that you've actually hit upon the absolute fastest method for your
compiler and platform.

-Howard

Oct 24 '06 #7
Howard posted:
Can you back those statements up with documentation?

There are many forms of optimizations a compiler might use to implement
a loop, regardless of what the original code looks like. The best way
to know what's fastest is to test it on your platform. And that's
_still_ no guarantee that you've actually hit upon the absolute fastest
method for your compiler and platform.

A few of us tested them over on comp.lang.c. Here's the relevant thread:

http://groups.google.ie/group/comp.l...c11a76eb728ad/
2e09898d4ca430be?lnk=st&q=&rnum=6&hl=en#2e09898d4c a430be

--

Frederick Gotham
Oct 24 '06 #8
Frederick Gotham wrote:
Ganesh posted:
>Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a[i]...

(or)

for (int i = 0; i < N; i++)
= ...*a++....


On most systems, the following is the fastest:

int *p = arr;
int const *const pover = arr+len;

do *p++ = ...
while (pover != p);

On systems which have a CPU instruction which takes both a pointer
and an offset, the following is fastest:

size_t i = 0;

do arr[i++] = ...
while (len != i);

When writing portable code, I opt for the former.
They are not semantially equivalent. You need an 'if' in front of
the 'do' to get them there.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Oct 24 '06 #9
Victor Bazarov posted:
>On most systems, the following is the fastest:

int *p = arr;
int const *const pover = arr+len;

do *p++ = ...
while (pover != p);

On systems which have a CPU instruction which takes both a pointer
and an offset, the following is fastest:

size_t i = 0;

do arr[i++] = ...
while (len != i);

When writing portable code, I opt for the former.

They are not semantially equivalent. You need an 'if' in front of
the 'do' to get them there.

Sorry, I don't follow. . .

(Also, the code assumes that len is non-zero)

--

Frederick Gotham
Oct 24 '06 #10
Frederick Gotham wrote:
Victor Bazarov posted:
>>On most systems, the following is the fastest:

int *p = arr;
int const *const pover = arr+len;

do *p++ = ...
while (pover != p);

On systems which have a CPU instruction which takes both a pointer
and an offset, the following is fastest:

size_t i = 0;

do arr[i++] = ...
while (len != i);

When writing portable code, I opt for the former.

They are not semantially equivalent. You need an 'if' in front of
the 'do' to get them there.


Sorry, I don't follow. . .

(Also, the code assumes that len is non-zero)
Assumption is the mother of all f***-ups.

for (int i = 0; i < len; +i)
blah;

is not semantically equivalent to

int i = 0;
do
blah;
while (++i < len);

, that's all.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Oct 24 '06 #11

"Frederick Gotham" <fg*******@SPAM.comwrote in message
news:7f*******************@news.indigo.ie...
Howard posted:
>Can you back those statements up with documentation?
Why did you clip what I was responding to? Now nobody reading this will
know what the heck I was talking about. Here are your statements that I was
challenging:
On most systems, the following is the fastest:

int *p = arr;
int const *const pover = arr+len;

do *p++ = ...
while (pover != p);

On systems which have a CPU instruction which takes both a pointer and an
offset, the following is fastest:

size_t i = 0;

do arr[i++] = ...
while (len != i);
>
A few of us tested them over on comp.lang.c. Here's the relevant thread:

http://groups.google.ie/group/comp.l...c11a76eb728ad/
2e09898d4ca430be?lnk=st&q=&rnum=6&hl=en#2e09898d4c a430be
I see a couple tests, but nothing which makes your statements always true,
even for the specific code you gave (let alone for the more general case the
OP asked about). How do you know that my compiler won't be able to unroll a
loop? Or that it won't load up a couple registers and call a single machine
instruction that performs the loop? (I've used those many times when
writing assembly code to handle simple operations on arrays.)

I think that you can really only say that, on your machine, using your
version of your compiler, your specific tests showed some specific results.
It's not valid to extend that to: "on any machine which has such-and-such a
feature, the following code is always fastest". There's simply no way to
know that for sure.

And even if you somehow demonstrated that it's true on all existing
compilers on all existing machines, that would be no guarantee that someone
couldn't write a compiler to implement it faster on one or more of those
machines. Or make a machine which will do it faster using some other
implementation.

-Howard
Oct 24 '06 #12

"Victor Bazarov" <v.********@comAcast.netwrote in message
news:eh**********@news.datemas.de...
Frederick Gotham wrote:
>Victor Bazarov posted:
>
Assumption is the mother of all f***-ups.

for (int i = 0; i < len; +i)
blah;
Typing mistakes are certainly a close second. :-)

That should obviously be "++i", not "+i".

-Howard

Oct 24 '06 #13
Salt_Peter wrote:
Ganesh wrote:
>Hi,

This is a question that pertains to pointers in general (C or C++).
Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a[i]...

(or)

for (int i = 0; i < N; i++)
= ...*a++....
'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why? Or is that dependent on architecture or compiler?

Probably because the first implies a cast.
It implies no such thing.
Ironically, you'ld probably get "faster" with:
for( size_t i = 0; i < N; ++i)
{
// your choice
}
I'd consider any compiler that produces significantly different code broken.

--
Clark S. Cox III
cl*******@gmail.com
Oct 24 '06 #14
Howard wrote:
"Victor Bazarov" <v.********@comAcast.netwrote in message
news:eh**********@news.datemas.de...
>Frederick Gotham wrote:
>>Victor Bazarov posted:
>>
Assumption is the mother of all f***-ups.

for (int i = 0; i < len; +i)
blah;

Typing mistakes are certainly a close second. :-)

That should obviously be "++i", not "+i".
That's all my keyboard's fault! And since I'm using an MS Natural
thing, it's all the fault of MS!
Oct 24 '06 #15
Clark S. Cox III wrote:
Salt_Peter wrote:
>Ganesh wrote:
>>Hi,

This is a question that pertains to pointers in general (C or C++).
Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a[i]...

(or)

for (int i = 0; i < N; i++)
= ...*a++....
'a' is a dynamically allocated array of any valid scalar type and
of size N. I have read somewhere that the latter is faster. But
could not understand why? Or is that dependent on architecture or
compiler?

Probably because the first implies a cast.

It implies no such thing.
>Ironically, you'ld probably get "faster" with:
for( size_t i = 0; i < N; ++i)
{
// your choice
}

I'd consider any compiler that produces significantly different code
broken.
Just to throw some fire into the oil, the former case is much easier
parallelizable than then latter.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Oct 24 '06 #16
Frederick Gotham wrote:
Howard posted:
>Can you back those statements up with documentation?

There are many forms of optimizations a compiler might use to implement
a loop, regardless of what the original code looks like. The best way
to know what's fastest is to test it on your platform. And that's
_still_ no guarantee that you've actually hit upon the absolute fastest
method for your compiler and platform.


A few of us tested them over on comp.lang.c. Here's the relevant thread:

http://groups.google.ie/group/comp.l...c11a76eb728ad/
2e09898d4ca430be?lnk=st&q=&rnum=6&hl=en#2e09898d4c a430be
For every assertion that you made in that thread about which one was
generally faster, someone presented a counter-example.
--
Clark S. Cox III
cl*******@gmail.com
Oct 24 '06 #17

Ganesh wrote:
Hi,

This is a question that pertains to pointers in general (C or C++).
Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a[i]...

(or)

for (int i = 0; i < N; i++)
= ...*a++....
'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why? Or is that dependent on architecture or compiler?

In my architecture (Intel Pentium IV 3 GHz, 1MG Cache ) I got the
following times for 10 runs on a reasonably huge array size for huge
number of iterations.

Array Access (seconds): 5 5 5 4 5 5 5 5 5 5
Pointer Access: 4 4 3 4 4 4 4 4 4
3

On the average, pointer access is faster, but array access also equals
the pointer access sometimes. Not sure why this happens. The assembly
code is different for each case. For array access, it shows
..
addl (%esi, %eax,4), %ebx
incl %eax
....
,
while for pointer access it generates
....
addl %eax, %ebx,
addl $4, %eax
....

Oct 24 '06 #18

Victor Bazarov wrote in message ...
>Howard wrote:
>"Victor Bazarov" wrote in message
>>Frederick Gotham wrote:
Victor Bazarov posted:

Assumption is the mother of all f***-ups.

for (int i = 0; i < len; +i)
blah;
Typing mistakes are certainly a close second. :-)

That should obviously be "++i", not "+i".

That's all my keyboard's fault! And since I'm using an MS Natural
thing, it's all the fault of MS!
Nice save, Victor!!
Does *that* keyboard have a BSOD feature? Crash daily?

--
Bob <GR
POVrookie
Oct 24 '06 #19
Howard posted:
Why did you clip what I was responding to? Now nobody reading this will
know what the heck I was talking about.

Of course they will, by the miracle of threads! You do have threads... don't
you? Or do you still have candles for lightbulbs?

And even if you somehow demonstrated that it's true on all existing
compilers on all existing machines, that would be no guarantee that
someone couldn't write a compiler to implement it faster on one or more
of those machines. Or make a machine which will do it faster using some
other implementation.

Indeed you're right. When writing portable code, I make an educated guess as
to what strategy will produce high efficiency across the board. Looking at
the tests done all today's computers, I opt for the *p++ solution.

--

Frederick Gotham
Oct 24 '06 #20
Victor Bazarov posted:
Assumption is the mother of all f***-ups.

for (int i = 0; i < len; +i)
blah;

is not semantically equivalent to

int i = 0;
do
blah;
while (++i < len);

, that's all.

I make assumptions all the time -- I belive we call them "pre-conditions" in
the programming world. For instance, I can write an algorithm which deals
with an array, and say:

"The array must have at least two elements."

Then, if somebody gets bogus results when they supply it with either:

(1) A null pointer.
(2) An empty array (if there's such a thing).
(3) A sole-element array.

, then I'll just say tough luck. The C++ Standard seems to agree with me,
guess what happens when you supply "strlen" with a null pointer.

That said though, I do make liberal use of "assert".

--

Frederick Gotham
Oct 24 '06 #21
>for (int i = 0; i < N; i++)
= ... a[i]...
On most systems, the following is the fastest:

int *p = arr;
int const *const pover = arr+len;

do *p++ = ...
while (pover != p);
Even for N==0 ? :/
Oct 25 '06 #22
Frederick Gotham wrote:
Victor Bazarov posted:
>Assumption is the mother of all f***-ups.

for (int i = 0; i < len; +i)
blah;

is not semantically equivalent to

int i = 0;
do
blah;
while (++i < len);

, that's all.


I make assumptions all the time -- I belive we call them
"pre-conditions" in the programming world. For instance, I can write
an algorithm which deals with an array, and say:

"The array must have at least two elements."
[..]
That said though, I do make liberal use of "assert".
Good. The code you posted didn't have any.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Oct 25 '06 #23
Victor Bazarov posted:
>That said though, I do make liberal use of "assert".

Good. The code you posted didn't have any.

But "assert" isn't fool-proof (unless you decide to not define NDEBUG in
Release Mode).

--

Frederick Gotham
Oct 25 '06 #24
Frederick Gotham wrote:
Victor Bazarov posted:
>>That said though, I do make liberal use of "assert".

Good. The code you posted didn't have any.


But "assert" isn't fool-proof (unless you decide to not define NDEBUG
in Release Mode).
Nothing is fool-proof. Your code, however, didn't even have any
comments explaining what assumptions had been made.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Oct 25 '06 #25

Howard wrote:
On most systems, the following is the fastest:

int *p = arr;
int const *const pover = arr+len;

do *p++ = ...
while (pover != p);

On systems which have a CPU instruction which takes both a pointer and an
offset, the following is fastest:

size_t i = 0;

do arr[i++] = ...
while (len != i);

Can you back those statements up with documentation?
As "int" is used there as the type the point is less relevant than when
something else is being used. Example:

mytype* p = ...;
p[i] ... ++i
and..
*p++;

Have one major difference from the point of view of large number of
platforms. It's common to have somesort of
register+register*scale(+constant) addressing mode. This works nicely
when the scale can match the size of the type in question.

If however, say, the size isn't a power of two or for some other
practical reason not matching the scale supported by a particular
platform, the p[i] implies multiplication (or efficient implementation
of multiplication by a constant), where as the p++ will imply ADDITION.

This is not really a C++ consideration and off-topic anyways, but a
practical consideration. Even so, I myself prefer the array indexing
over pointer dereference-and-increase. Why? Because it's more readable.

I could loop towards zero. I could use do-while or while loop if I know
some interesting detail about architechture and compiler I am using.
But I still wouldn't take advantage of this so-called knowledge. The
reason is simple: readability and clarity. I'm well aware that most
code that I write isn't performance critical, so I really don't want to
pay the cost of intentionally making code more complicated that it
needs to be.

On a register starved system it might be advantageous to use the
pointer -and- terminating end-of-iteration pointer to avoid spilling if
nothing else. But for the same reason as before, optimizing at cost of
clarity should be avoided unless necessary.. which for most code simply
isn't.

There's more to be said but most points converge to unnecessary
optimization and clarity issues..

Oct 25 '06 #26

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

17
6099
by: John Bentley | last post by:
John Bentley: INTRO The phrase "decimal number" within a programming context is ambiguous. It could refer to the decimal datatype or the related but separate concept of a generic decimal number....
5
5832
by: MLH | last post by:
I have a table I can open as table type recordset or a dynaset. Searching for a particular value in the table's main keyfield, which would be faster and less strain on the application......
18
1638
by: junky_fellow | last post by:
which of the following is faster and why ? if ( x == 1 ) { } or if ( x != 0 ) {
65
12501
by: Skybuck Flying | last post by:
Hi, I needed a method to determine if a point was on a line segment in 2D. So I googled for some help and so far I have evaluated two methods. The first method was only a formula, the second...
14
14999
by: Bob | last post by:
I have a function that takes in a list of IDs (hundreds) as input parameter and needs to pass the data to another step as a comma delimited string. The source can easily create this list of IDs in...
7
1515
by: Sunil Varma | last post by:
Is accessing function by it's name faster or accessing it by its address faster?
4
2637
by: Sonnich | last post by:
Hi I have a costum function for a special search, which sort strings. This is currently the place where I can save a lot of time (~70%) if possible. So, which is faster: for($j =...
8
3368
by: Sing | last post by:
Dear C Gurus, I would like to optimise a max() algo that my program uses many times. Which one is faster or are they the same? 1. #define max(a,b) (((a) (b)) ? (a) : (b)) 2. if (a>b) return a...
36
336
by: lak | last post by:
Which is faster? Post Increment or assignment? Why? I was not able to get any things.
0
7100
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
6964
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
7175
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
7330
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
5434
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
0
4559
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and...
0
3070
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The...
0
3070
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
0
1378
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated ...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.