Hi,
This is a question that pertains to pointers in general (C or C++).
Which of the following is faster and why?
for (int i = 0; i < N; i++)
= ... a[i]...
(or)
for (int i = 0; i < N; i++)
= ...*a++....
'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why? Or is that dependent on architecture or compiler? 25 1690
Which of the following is faster and why?
>
for (int i = 0; i < N; i++)
= ... a[i]...
(or)
for (int i = 0; i < N; i++)
= ...*a++....
'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why? Or is that dependent on architecture or compiler?
Two answers:
1) For any reasonably modern optimizing compiler, they should be the
same speed.
2) This kind of micro-optimization will almost never matter; better to
spend time on fixing slow algorithms, handling slow IO, etc. Whatever
a profiler tells you is the part of the code that is *actually* taking
time.
Michael
Ganesh wrote:
Hi,
This is a question that pertains to pointers in general (C or C++).
Which of the following is faster and why?
for (int i = 0; i < N; i++)
= ... a[i]...
(or)
for (int i = 0; i < N; i++)
= ...*a++....
'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why? Or is that dependent on architecture or compiler?
Probably because the first implies a cast.
Ironically, you'ld probably get "faster" with:
for( size_t i = 0; i < N; ++i)
{
// your choice
}
Ganesh posted:
Which of the following is faster and why?
for (int i = 0; i < N; i++)
= ... a[i]...
(or)
for (int i = 0; i < N; i++)
= ...*a++....
On most systems, the following is the fastest:
int *p = arr;
int const *const pover = arr+len;
do *p++ = ...
while (pover != p);
On systems which have a CPU instruction which takes both a pointer and an
offset, the following is fastest:
size_t i = 0;
do arr[i++] = ...
while (len != i);
When writing portable code, I opt for the former.
--
Frederick Gotham
"Salt_Peter" <pj*****@yahoo.comwrote in message
news:11*********************@m73g2000cwd.googlegro ups.com...
>
Ganesh wrote:
>Hi,
This is a question that pertains to pointers in general (C or C++). Which of the following is faster and why?
for (int i = 0; i < N; i++) = ... a[i]...
(or)
for (int i = 0; i < N; i++) = ...*a++....
'a' is a dynamically allocated array of any valid scalar type and of size N. I have read somewhere that the latter is faster. But could not understand why? Or is that dependent on architecture or compiler?
Probably because the first implies a cast.
Ironically, you'ld probably get "faster" with:
for( size_t i = 0; i < N; ++i)
{
// your choice
}
Huh? The "int" type is the "native" integral type on any machine. Why
would there be a cast using "int i" to index a dynamically allocated array
(or any array for that matter)? The question wasn't about vectors, but
about raw pointers and arrays. I'd think that a "conversion" from size_t to
int would be needed if using your code and indexing the array with it. But
I see no cast implied or required anywhere in the OP's code.
The "usual" assumption is that the latter method (using pointers and
incrementing) is faster, on the assumption that indexing requires
multiplying the index by the size of an element, and adding that as an
offset to the array start. The assumption is false. That's one way to
implement indexing, but compilers are not required to implement it that way.
Indeed, a decent optimizing compiler is likely to take advantage of whatever
native machine code would do the job best, whether that's by using code more
like the pointer version, or by pre-loading some registers and calling a
built-in machine instruction which handles loops with one simple command.
Or any other solution they see fit to use.
Compiler makers undoubtedly profile their solutions. You should do the same
if you want to know whether you need to change your code, and whether a
proposed solution is a good one for your particular platform.
-Howard
Ganesh wrote:
Hi,
This is a question that pertains to pointers in general (C or C++).
Which of the following is faster and why?
for (int i = 0; i < N; i++)
= ... a[i]...
(or)
for (int i = 0; i < N; i++)
= ...*a++....
'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why?
I have seen that happen on some compiler. Depending on the optimization
capabilities of the compiler and on the type that a points to, it might
happen that i gets multiplied with that type's size on each iteration for
the first solution, while the latter does an addition instead. Since on
many architectures, an addition is quicker than a multiplication, it can
happen that the second one is faster.
Or is that dependent on architecture or compiler?
It is.
"Frederick Gotham" <fg*******@SPAM.comwrote in message
news:j3*******************@news.indigo.ie...
Ganesh posted:
>Which of the following is faster and why?
for (int i = 0; i < N; i++) = ... a[i]...
(or)
for (int i = 0; i < N; i++) = ...*a++....
On most systems, the following is the fastest:
int *p = arr;
int const *const pover = arr+len;
do *p++ = ...
while (pover != p);
On systems which have a CPU instruction which takes both a pointer and an
offset, the following is fastest:
size_t i = 0;
do arr[i++] = ...
while (len != i);
Can you back those statements up with documentation?
There are many forms of optimizations a compiler might use to implement a
loop, regardless of what the original code looks like. The best way to know
what's fastest is to test it on your platform. And that's _still_ no
guarantee that you've actually hit upon the absolute fastest method for your
compiler and platform.
-Howard
Howard posted:
Can you back those statements up with documentation?
There are many forms of optimizations a compiler might use to implement
a loop, regardless of what the original code looks like. The best way
to know what's fastest is to test it on your platform. And that's
_still_ no guarantee that you've actually hit upon the absolute fastest
method for your compiler and platform.
A few of us tested them over on comp.lang.c. Here's the relevant thread: http://groups.google.ie/group/comp.l...c11a76eb728ad/
2e09898d4ca430be?lnk=st&q=&rnum=6&hl=en#2e09898d4c a430be
--
Frederick Gotham
Frederick Gotham wrote:
Ganesh posted:
>Which of the following is faster and why?
for (int i = 0; i < N; i++) = ... a[i]...
(or)
for (int i = 0; i < N; i++) = ...*a++....
On most systems, the following is the fastest:
int *p = arr;
int const *const pover = arr+len;
do *p++ = ...
while (pover != p);
On systems which have a CPU instruction which takes both a pointer
and an offset, the following is fastest:
size_t i = 0;
do arr[i++] = ...
while (len != i);
When writing portable code, I opt for the former.
They are not semantially equivalent. You need an 'if' in front of
the 'do' to get them there.
V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Victor Bazarov posted:
>On most systems, the following is the fastest:
int *p = arr; int const *const pover = arr+len;
do *p++ = ... while (pover != p);
On systems which have a CPU instruction which takes both a pointer and an offset, the following is fastest:
size_t i = 0;
do arr[i++] = ... while (len != i);
When writing portable code, I opt for the former.
They are not semantially equivalent. You need an 'if' in front of
the 'do' to get them there.
Sorry, I don't follow. . .
(Also, the code assumes that len is non-zero)
--
Frederick Gotham
Frederick Gotham wrote:
Victor Bazarov posted:
>>On most systems, the following is the fastest:
int *p = arr; int const *const pover = arr+len;
do *p++ = ... while (pover != p);
On systems which have a CPU instruction which takes both a pointer and an offset, the following is fastest:
size_t i = 0;
do arr[i++] = ... while (len != i);
When writing portable code, I opt for the former.
They are not semantially equivalent. You need an 'if' in front of the 'do' to get them there.
Sorry, I don't follow. . .
(Also, the code assumes that len is non-zero)
Assumption is the mother of all f***-ups.
for (int i = 0; i < len; +i)
blah;
is not semantically equivalent to
int i = 0;
do
blah;
while (++i < len);
, that's all.
V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
"Frederick Gotham" <fg*******@SPAM.comwrote in message
news:7f*******************@news.indigo.ie...
Howard posted:
>Can you back those statements up with documentation?
Why did you clip what I was responding to? Now nobody reading this will
know what the heck I was talking about. Here are your statements that I was
challenging:
On most systems, the following is the fastest:
int *p = arr;
int const *const pover = arr+len;
do *p++ = ...
while (pover != p);
On systems which have a CPU instruction which takes both a pointer and an
offset, the following is fastest:
size_t i = 0;
do arr[i++] = ...
while (len != i);
>
A few of us tested them over on comp.lang.c. Here's the relevant thread:
http://groups.google.ie/group/comp.l...c11a76eb728ad/
2e09898d4ca430be?lnk=st&q=&rnum=6&hl=en#2e09898d4c a430be
I see a couple tests, but nothing which makes your statements always true,
even for the specific code you gave (let alone for the more general case the
OP asked about). How do you know that my compiler won't be able to unroll a
loop? Or that it won't load up a couple registers and call a single machine
instruction that performs the loop? (I've used those many times when
writing assembly code to handle simple operations on arrays.)
I think that you can really only say that, on your machine, using your
version of your compiler, your specific tests showed some specific results.
It's not valid to extend that to: "on any machine which has such-and-such a
feature, the following code is always fastest". There's simply no way to
know that for sure.
And even if you somehow demonstrated that it's true on all existing
compilers on all existing machines, that would be no guarantee that someone
couldn't write a compiler to implement it faster on one or more of those
machines. Or make a machine which will do it faster using some other
implementation.
-Howard
"Victor Bazarov" <v.********@comAcast.netwrote in message
news:eh**********@news.datemas.de...
Frederick Gotham wrote:
>Victor Bazarov posted:
>
Assumption is the mother of all f***-ups.
for (int i = 0; i < len; +i)
blah;
Typing mistakes are certainly a close second. :-)
That should obviously be "++i", not "+i".
-Howard
Salt_Peter wrote:
Ganesh wrote:
>Hi,
This is a question that pertains to pointers in general (C or C++). Which of the following is faster and why?
for (int i = 0; i < N; i++) = ... a[i]...
(or)
for (int i = 0; i < N; i++) = ...*a++....
'a' is a dynamically allocated array of any valid scalar type and of size N. I have read somewhere that the latter is faster. But could not understand why? Or is that dependent on architecture or compiler?
Probably because the first implies a cast.
It implies no such thing.
Ironically, you'ld probably get "faster" with:
for( size_t i = 0; i < N; ++i)
{
// your choice
}
I'd consider any compiler that produces significantly different code broken.
--
Clark S. Cox III cl*******@gmail.com
Howard wrote:
"Victor Bazarov" <v.********@comAcast.netwrote in message
news:eh**********@news.datemas.de...
>Frederick Gotham wrote:
>>Victor Bazarov posted:
>> Assumption is the mother of all f***-ups.
for (int i = 0; i < len; +i) blah;
Typing mistakes are certainly a close second. :-)
That should obviously be "++i", not "+i".
That's all my keyboard's fault! And since I'm using an MS Natural
thing, it's all the fault of MS!
Clark S. Cox III wrote:
Salt_Peter wrote:
>Ganesh wrote:
>>Hi,
This is a question that pertains to pointers in general (C or C++). Which of the following is faster and why?
for (int i = 0; i < N; i++) = ... a[i]...
(or)
for (int i = 0; i < N; i++) = ...*a++....
'a' is a dynamically allocated array of any valid scalar type and of size N. I have read somewhere that the latter is faster. But could not understand why? Or is that dependent on architecture or compiler?
Probably because the first implies a cast.
It implies no such thing.
>Ironically, you'ld probably get "faster" with: for( size_t i = 0; i < N; ++i) { // your choice }
I'd consider any compiler that produces significantly different code
broken.
Just to throw some fire into the oil, the former case is much easier
parallelizable than then latter.
V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Frederick Gotham wrote:
Howard posted:
>Can you back those statements up with documentation?
There are many forms of optimizations a compiler might use to implement a loop, regardless of what the original code looks like. The best way to know what's fastest is to test it on your platform. And that's _still_ no guarantee that you've actually hit upon the absolute fastest method for your compiler and platform.
A few of us tested them over on comp.lang.c. Here's the relevant thread:
http://groups.google.ie/group/comp.l...c11a76eb728ad/
2e09898d4ca430be?lnk=st&q=&rnum=6&hl=en#2e09898d4c a430be
For every assertion that you made in that thread about which one was
generally faster, someone presented a counter-example.
--
Clark S. Cox III cl*******@gmail.com
Ganesh wrote:
Hi,
This is a question that pertains to pointers in general (C or C++).
Which of the following is faster and why?
for (int i = 0; i < N; i++)
= ... a[i]...
(or)
for (int i = 0; i < N; i++)
= ...*a++....
'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why? Or is that dependent on architecture or compiler?
In my architecture (Intel Pentium IV 3 GHz, 1MG Cache ) I got the
following times for 10 runs on a reasonably huge array size for huge
number of iterations.
Array Access (seconds): 5 5 5 4 5 5 5 5 5 5
Pointer Access: 4 4 3 4 4 4 4 4 4
3
On the average, pointer access is faster, but array access also equals
the pointer access sometimes. Not sure why this happens. The assembly
code is different for each case. For array access, it shows
..
addl (%esi, %eax,4), %ebx
incl %eax
....
,
while for pointer access it generates
....
addl %eax, %ebx,
addl $4, %eax
....
Victor Bazarov wrote in message ...
>Howard wrote:
>"Victor Bazarov" wrote in message
>>Frederick Gotham wrote: Victor Bazarov posted:
Assumption is the mother of all f***-ups.
for (int i = 0; i < len; +i) blah;
Typing mistakes are certainly a close second. :-)
That should obviously be "++i", not "+i".
That's all my keyboard's fault! And since I'm using an MS Natural thing, it's all the fault of MS!
Nice save, Victor!!
Does *that* keyboard have a BSOD feature? Crash daily?
--
Bob <GR
POVrookie
Howard posted:
Why did you clip what I was responding to? Now nobody reading this will
know what the heck I was talking about.
Of course they will, by the miracle of threads! You do have threads... don't
you? Or do you still have candles for lightbulbs?
And even if you somehow demonstrated that it's true on all existing
compilers on all existing machines, that would be no guarantee that
someone couldn't write a compiler to implement it faster on one or more
of those machines. Or make a machine which will do it faster using some
other implementation.
Indeed you're right. When writing portable code, I make an educated guess as
to what strategy will produce high efficiency across the board. Looking at
the tests done all today's computers, I opt for the *p++ solution.
--
Frederick Gotham
Victor Bazarov posted:
Assumption is the mother of all f***-ups.
for (int i = 0; i < len; +i)
blah;
is not semantically equivalent to
int i = 0;
do
blah;
while (++i < len);
, that's all.
I make assumptions all the time -- I belive we call them "pre-conditions" in
the programming world. For instance, I can write an algorithm which deals
with an array, and say:
"The array must have at least two elements."
Then, if somebody gets bogus results when they supply it with either:
(1) A null pointer.
(2) An empty array (if there's such a thing).
(3) A sole-element array.
, then I'll just say tough luck. The C++ Standard seems to agree with me,
guess what happens when you supply "strlen" with a null pointer.
That said though, I do make liberal use of "assert".
--
Frederick Gotham
>for (int i = 0; i < N; i++) = ... a[i]...
On most systems, the following is the fastest:
int *p = arr;
int const *const pover = arr+len;
do *p++ = ...
while (pover != p);
Even for N==0 ? :/
Frederick Gotham wrote:
Victor Bazarov posted:
>Assumption is the mother of all f***-ups.
for (int i = 0; i < len; +i) blah;
is not semantically equivalent to
int i = 0; do blah; while (++i < len);
, that's all.
I make assumptions all the time -- I belive we call them
"pre-conditions" in the programming world. For instance, I can write
an algorithm which deals with an array, and say:
"The array must have at least two elements."
[..]
That said though, I do make liberal use of "assert".
Good. The code you posted didn't have any.
V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Victor Bazarov posted:
>That said though, I do make liberal use of "assert".
Good. The code you posted didn't have any.
But "assert" isn't fool-proof (unless you decide to not define NDEBUG in
Release Mode).
--
Frederick Gotham
Frederick Gotham wrote:
Victor Bazarov posted:
>>That said though, I do make liberal use of "assert".
Good. The code you posted didn't have any.
But "assert" isn't fool-proof (unless you decide to not define NDEBUG
in Release Mode).
Nothing is fool-proof. Your code, however, didn't even have any
comments explaining what assumptions had been made.
V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Howard wrote:
On most systems, the following is the fastest:
int *p = arr;
int const *const pover = arr+len;
do *p++ = ...
while (pover != p);
On systems which have a CPU instruction which takes both a pointer and an
offset, the following is fastest:
size_t i = 0;
do arr[i++] = ...
while (len != i);
Can you back those statements up with documentation?
As "int" is used there as the type the point is less relevant than when
something else is being used. Example:
mytype* p = ...;
p[i] ... ++i
and..
*p++;
Have one major difference from the point of view of large number of
platforms. It's common to have somesort of
register+register*scale(+constant) addressing mode. This works nicely
when the scale can match the size of the type in question.
If however, say, the size isn't a power of two or for some other
practical reason not matching the scale supported by a particular
platform, the p[i] implies multiplication (or efficient implementation
of multiplication by a constant), where as the p++ will imply ADDITION.
This is not really a C++ consideration and off-topic anyways, but a
practical consideration. Even so, I myself prefer the array indexing
over pointer dereference-and-increase. Why? Because it's more readable.
I could loop towards zero. I could use do-while or while loop if I know
some interesting detail about architechture and compiler I am using.
But I still wouldn't take advantage of this so-called knowledge. The
reason is simple: readability and clarity. I'm well aware that most
code that I write isn't performance critical, so I really don't want to
pay the cost of intentionally making code more complicated that it
needs to be.
On a register starved system it might be advantageous to use the
pointer -and- terminating end-of-iteration pointer to avoid spilling if
nothing else. But for the same reason as before, optimizing at cost of
clarity should be avoided unless necessary.. which for most code simply
isn't.
There's more to be said but most points converge to unnecessary
optimization and clarity issues.. This thread has been closed and replies have been disabled. Please start a new discussion. Similar topics |
by: John Bentley |
last post by:
John Bentley:
INTRO
The phrase "decimal number" within a programming context is ambiguous. It could
refer to the decimal datatype or the related but separate concept of a generic
decimal number....
|
by: MLH |
last post by:
I have a table I can open as table type
recordset or a dynaset. Searching for
a particular value in the table's main
keyfield, which would be faster and
less strain on the application......
|
by: junky_fellow |
last post by:
which of the following is faster and why ?
if ( x == 1 ) {
}
or
if ( x != 0 ) {
|
by: Skybuck Flying |
last post by:
Hi,
I needed a method to determine if a point was on a line segment in 2D. So I
googled for some help and so far I have evaluated two methods.
The first method was only a formula, the second...
|
by: Bob |
last post by:
I have a function that takes in a list of IDs (hundreds) as input parameter
and needs to pass the data to another step as a comma delimited string. The
source can easily create this list of IDs in...
| |
by: Sunil Varma |
last post by:
Is accessing function by it's name faster or accessing it by its
address faster?
|
by: Sonnich |
last post by:
Hi
I have a costum function for a special search, which sort strings.
This is currently the place where I can save a lot of time (~70%) if
possible.
So, which is faster:
for($j =...
|
by: Sing |
last post by:
Dear C Gurus,
I would like to optimise a max() algo that my program uses many times.
Which one is faster or are they the same?
1. #define max(a,b) (((a) (b)) ? (a) : (b))
2. if (a>b) return a...
|
by: lak |
last post by:
Which is faster? Post Increment or assignment? Why?
I was not able to get any things.
|
by: marktang |
last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
|
by: Hystou |
last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
| |
by: jinu1996 |
last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
|
by: tracyyun |
last post by:
Dear forum friends,
With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
|
by: agi2029 |
last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
|
by: conductexam |
last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and...
|
by: TSSRALBI |
last post by:
Hello
I'm a network technician in training and I need your help.
I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs.
The...
|
by: adsilva |
last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
| |
by: 6302768590 |
last post by:
Hai team
i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated ...
| |