
Costs of function calls

P: n/a
Hello everybody,
I am currently tuning one of my projects and wondered how
high the cost of calling a (member) function is. I tried several
possibilities and got a surprising result.
Maybe someone can explain what I might improve in my test, or how
these results came about?
For testing I commented out different lines in turn. The code is below.

Regards,
Thomas Kowalski

// MS C++ 8.0
// 1 billion additions:
// simple method call: 2134 ms, 2123 ms
// inline call: 2037 ms, 2053 ms, 2073 ms
// simple method, ref param: 2153 ms, 2034 ms, 2093 ms
// addition, no method: 2034 ms, 2037 ms, 2032 ms
// virtual function call: ~2100 ms
// function pointer: 7060 ms, 7260 ms, 7260 ms

// Results with gcc 3.3.6 are similar, but function pointers are as fast
// as normal calls

#include "stdafx.h"
#include <iostream>
#include <cstdlib>
#include <vector>
#include <time.h>

using namespace std;

class Abstract {
public:
virtual size_t add(size_t& a, size_t& b) = 0;
};

class B :public Abstract {
public:
virtual size_t add(size_t& a, size_t& b);
};
class C :public Abstract {
public:
virtual size_t add(size_t& a, size_t& b);
};

class D :public Abstract {
public:
virtual size_t add(size_t& a, size_t& b);
};
/**/

class A {
public:
static size_t add(size_t& a, size_t& b);
};
//inline
inline size_t A::add(size_t& a, size_t& b) {
    return a + b;
}

size_t B::add(size_t& a, size_t& b) {
    return a + b;
}
size_t C::add(size_t& a, size_t& b) {
    return a + b;
}

size_t D::add(size_t& a, size_t& b) {
    return a + b;
}
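int main(int argc, char *argv[])
{
    // Abstract* a = new D();
    A a;
    size_t (*addition)(size_t&, size_t&);
    addition = &A::add;
    const size_t REPEAT = 1000000000;
    size_t i, j, k;
    j = 5;
    k = 0; // initialise k so the value printed below is well defined

    clock_t start = clock();
    for (i = 0; i < REPEAT; ++i) {
        k = addition(i, j); // store the result so the call is not dead code
        // k = a.add(i,j);
        // k = i+j;
    }
    clock_t ende = clock();
    cout << "k is: " << k << endl;

    // clock() returns ticks; convert to milliseconds via CLOCKS_PER_SEC
    cout << "test took " << 1000.0 * (ende - start) / CLOCKS_PER_SEC << " ms" << endl;
    cin >> i; // keep the console window open
    return 0;
}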

Dec 19 '06 #1
31 Replies


P: n/a
Thomas Kowalski wrote:
[..]
To make a real case of inline versus non-inline, you _must_ place your
non-inlined member functions in a separate translation unit. Otherwise
any decent compiler should be able to simply inline all your calls if
it can see the actual body.

I _know_ that if a function is not inlined, it's gonna costcha. Build
your project with optimization for size, use 'std::vector' and its
indexing argument, and you'll notice that if you switch to indexing
from a pointer to the first element instead, you'll get plenty of
performance improvement. Up to ten times, as a matter of fact, in
some cases.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Dec 19 '06 #2
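
The point about the compiler seeing the function body can be illustrated without a second translation unit. The following is only a minimal sketch: the names `add`, `sum_direct` and `sum_indirect` are hypothetical, and routing the call through a `volatile` function pointer is one common way to discourage inlining, not the method used in the original benchmark.

```cpp
#include <cassert>
#include <cstddef>

// Trivial function whose call overhead we want to observe.
std::size_t add(std::size_t a, std::size_t b) { return a + b; }

// Direct calls: an optimizer that can see the body of add() may inline them.
std::size_t sum_direct(std::size_t n) {
    std::size_t total = 0;
    for (std::size_t i = 0; i < n; ++i) total += add(i, 5);
    return total;
}

// Calls through a volatile function pointer: the pointer has to be re-read
// on every iteration, so the compiler cannot assume which function it targets.
std::size_t sum_indirect(std::size_t (*target)(std::size_t, std::size_t),
                         std::size_t n) {
    assert(target != nullptr);
    std::size_t (*volatile fn)(std::size_t, std::size_t) = target;
    std::size_t total = 0;
    for (std::size_t i = 0; i < n; ++i) total += fn(i, 5);
    return total;
}
```

Both functions compute the same sum, so any timing difference between `sum_direct(n)` and `sum_indirect(&add, n)` comes from how the call is made, not from the work done.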

P: n/a
> To make a real case of inline versus non-inline, you _must_ place your
> non-inlined member functions in a separate translation unit. Otherwise
> any decent compiler should be able to simply inline all your calls if
> it can see the actual body.

OK, that makes sense. But how can it inline the virtual function call?
Abstract *a = new B();
a->add(i,j);

Can this still be inlined?

> I _know_ that if a function is not inlined, it's gonna costcha. Build
> your project with optimization for size, use 'std::vector' and its
> indexing argument, and you'll notice that if you switch to indexing
> from a pointer to the first element instead, you'll get plenty of
> performance improvement. Up to ten times, as a matter of fact, in
> some cases.

Do you mean that if you have a std::vector and use a pointer to its
values, accessing a given element is several times slower than using
an index of type int or unsigned int?

Thanks a lot,
Thomas Kowalski

Dec 20 '06 #3

P: n/a
On Dec 20, 7:00 am, "Thomas Kowalski" <t...@gmx.de> wrote:
>> I _know_ that if a function is not inlined, it's gonna costcha. Build
>> your project with optimization for size, use 'std::vector' and its
>> indexing argument, and you'll notice that if you switch to indexing
>> from a pointer to the first element instead, you'll get plenty of
>> performance improvement. Up to ten times, as a matter of fact, in
>> some cases.
>
> You mean that if you have an std::vector and use a pointer to its
> values its several times slower to access a certain element then using
> an index of type int or unsigned int?

No, the other way around:

std::vector<int> vec;
// Push some elements

int& a = vec[5]; // Using index

int* ptr = &(vec[0]);
int& b = *(ptr + 5); // Using pointer

The second (using pointer) will be faster than the first (using index)
provided that you don't count the cost of getting the pointer in the
first place. This can be useful when optimising loops iterating over
all the elements in a vector like:

int* ptr = &(vec[0]);
int size = vec.size();
for (int i = 0; i < size; ++i)
*(ptr + i) += 2; // Instead of vec[i] += 2;

--
Erik Wikström

Dec 20 '06 #4
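
The two loop styles described above can be put side by side. This is a minimal sketch with hypothetical function names; it adds a guard for the empty vector, because taking `&vec[0]` is only valid when the vector has at least one element.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Adds 2 to every element through vector::operator[].
void add_two_by_index(std::vector<int>& vec) {
    for (std::size_t i = 0; i < vec.size(); ++i) vec[i] += 2;
}

// Adds 2 to every element through a raw pointer to the first element.
void add_two_by_pointer(std::vector<int>& vec) {
    if (vec.empty()) return;  // &vec[0] must not be formed for an empty vector
    int* ptr = &vec[0];
    const std::size_t size = vec.size();
    for (std::size_t i = 0; i < size; ++i) *(ptr + i) += 2;
    assert(ptr == &vec[0]);   // the loop never reallocates, so ptr stays valid
}
```

Both functions leave the vector in the same state; whether one is faster depends on the compiler and its settings, which is what the rest of the thread argues about.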

P: n/a
Erik Wikström <er****@student.chalmers.se> wrote:
> int& a = vec[5]; // Using index
>
> int* ptr = &(vec[0]);
> int& b = *(ptr + 5) // Using pointer
>
> The second (using pointer) will be faster than the first (using index)
> provided that you don't count the cost of getting the pointer in the
> first place. This can be useful when optimising loops iterating over
> all the elements in a vector like:
>
> int* ptr = &(vec[0]);
> int size = vec.size();
> for (int i = 0; i < size; ++i)
> *(ptr + i) += 2; // Instead of vec[i] += 2;

Or shorter:

for (std::vector<int>::iterator i = vec.begin(); i < vec.end(); i++)
*i += 2;

which does the same and produces similar code on modern
compilers tuned to deal with STL (tested on vc8).
Dec 20 '06 #5

P: n/a
Thomas Kowalski wrote:
>> To make a real case of inline versus non-inline, you _must_ place your
>> non-inlined member functions in a separate translation unit. Otherwise
>> any decent compiler should be able to simply inline all your calls if
>> it can see the actual body.
>
> Ok, that makes sense. But how can it inline the virtual funtion call?
> Abstract *a = new B();
> a->add(i,j);
>
> Can this still be inlined?

In principle it can, because the compiler could know that at that particular
place the object can only be a B, but I have no idea whether compilers perform
such optimizations.

>> I _know_ that if a function is not inlined, it's gonna costcha. Build
>> your project with optimization for size, use 'std::vector' and its
>> indexing argument, and you'll notice that if you switch to indexing
>> from a pointer to the first element instead, you'll get plenty of
>> performance improvement. Up to ten times, as a matter of fact, in
>> some cases.
>
> You mean that if you have an std::vector and use a pointer to its
> values its several times slower to access a certain element then using
> an index of type int or unsigned int?

I think that he means just the opposite. The last time I measured iterating
over a vector, indexing was generally slower than incrementing a pointer or
an iterator in the loop. There was no difference between indexing through a
pointer and using vector's operator[].

Dec 20 '06 #6
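
The difference between a call site where the dynamic type is visible and one where it is not can be sketched as follows. The class names mirror the original post, but the two free functions are hypothetical, and whether a given compiler actually turns the first call into a direct (or inlined) call is an assumption that has to be checked per compiler.

```cpp
#include <cassert>
#include <cstddef>

struct Abstract {
    virtual ~Abstract() {}
    virtual std::size_t add(std::size_t a, std::size_t b) = 0;
};

struct B : Abstract {
    virtual std::size_t add(std::size_t a, std::size_t b) { return a + b; }
};

// The object is created right here, so the compiler can see that the
// pointer refers to a B and may resolve the call statically.
std::size_t call_known_type(std::size_t x, std::size_t y) {
    B b;
    Abstract* a = &b;
    return a->add(x, y);
}

// The pointer comes from the caller; without further knowledge the call
// has to go through the virtual dispatch mechanism.
std::size_t call_unknown_type(Abstract* a, std::size_t x, std::size_t y) {
    assert(a != nullptr);
    return a->add(x, y);
}
```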

P: n/a
Ole Nielsby wrote:
>> int* ptr = &(vec[0]);
>> int size = vec.size();
>> for (int i = 0; i < size; ++i)
>> *(ptr + i) += 2; // Instead of vec[i] += 2;
>
> Or shorter:
>
> for (std::vector<int>::iterator i = vec.begin(); i < vec.end(); i++)
> *i += 2;

And here you can see what it looks like if you go one step further and use
the standard algorithms:

std::transform(vec.begin(), vec.end(), vec.begin(),
std::bind2nd(std::plus<int>(), 2));

This is harder to write, harder to read, and no shorter than the for loop.

Dec 20 '06 #7
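
For readers on a newer compiler: `std::bind2nd` was deprecated in C++11 and removed in C++17. A minimal sketch of the same operation, assuming a C++11 (or later) compiler and using hypothetical function names, could look like this:

```cpp
#include <algorithm>
#include <vector>

// Same transformation as the bind2nd version, written with a lambda.
void add_two_transform(std::vector<int>& vec) {
    std::transform(vec.begin(), vec.end(), vec.begin(),
                   [](int x) { return x + 2; });
}

// Range-based for loop; no explicit iterator, index or pointer.
void add_two_range_for(std::vector<int>& vec) {
    for (int& x : vec) x += 2;
}
```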

P: n/a
Ole Nielsby wrote:
> Erik Wikström <er****@student.chalmers.se> wrote:
>> int& a = vec[5]; // Using index
>>
>> int* ptr = &(vec[0]);
>> int& b = *(ptr + 5) // Using pointer
>>
>> The second (using pointer) will be faster than the first (using
>> index) provided that you don't count the cost of getting the pointer
>> in the first place. This can be useful when optimising loops
>> iterating over all the elements in a vector like:
>>
>> int* ptr = &(vec[0]);
>> int size = vec.size();
>> for (int i = 0; i < size; ++i)
>> *(ptr + i) += 2; // Instead of vec[i] += 2;
>
> Or shorter:
>
> for (std::vector<int>::iterator i = vec.begin(); i < vec.end(); i++)
> *i += 2;
>
> which does the same and produces similar code on modern
> compilers tuned to deal with STL (tested on vc8).

Tested with what optimization settings? This code has to instantiate
the vector<int>::iterator vec.size() times (since you used i++ and not
++i). Besides, it has to compare the iterators, it has to dereference
the 'i'. Don't just jump to conclusions. Didn't I say I profiled
that sort of code enough to claim pointers are the fastest?

Bottomline: avoid function calls like the plague!

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Dec 20 '06 #8

P: n/a
Rolf Magnus wrote:
> [..] Last time I measured
> iterating a vector, indexing was generally slower than using a
> pointer or an iterator and increment it in the loop. No difference
> between indexing using a pointer and using vector's operator[].

Not true. Indexing using a pointer is still faster because it does
not involve a function call. That's the whole point!

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Dec 20 '06 #9

P: n/a
On Wed, 20 Dec 2006 08:48:27 -0500, Victor Bazarov wrote:
> Ole Nielsby wrote:
>> Erik Wikström <er****@student.chalmers.se> wrote:
>>> int& a = vec[5]; // Using index
>>>
>>> int* ptr = &(vec[0]);
>>> int& b = *(ptr + 5) // Using pointer
>>>
>>> The second (using pointer) will be faster than the first (using
>>> index) provided that you don't count the cost of getting the pointer
>>> in the first place. This can be useful when optimising loops
>>> iterating over all the elements in a vector like:
>>>
>>> int* ptr = &(vec[0]);
>>> int size = vec.size();
>>> for (int i = 0; i < size; ++i)
>>> *(ptr + i) += 2; // Instead of vec[i] += 2;
>>
>> Or shorter:
>>
>> for (std::vector<int>::iterator i = vec.begin(); i < vec.end(); i++)
>> *i += 2;
>>
>> which does the same and produces similar code on modern
>> compilers tuned to deal with STL (tested on vc8).
>
> Tested with what optimization settings? This code has to instantiate
> the vector<int>::iterator vec.size() times (since you used i++ and not
> ++i). Besides, it has to compare the iterators, it has to dereference
> the 'i'.

Surely you have to compare pointers (or ints, or whatever) to detect the
loop termination condition, and you have to dereference a pointer too...?
Ok, the iterator version may involve function calls but these are quite
likely trivial enough to be optimised away by any compiler worth its salt.

> Don't just jump to conclusions. Didn't I say I profiled
> that sort of code enough to claim pointers are the fastest?
>
> Bottomline: avoid function calls like the plague!

...except if you are certain they will be magicked away by your
compiler.

--
Lionel B
Dec 20 '06 #10

P: n/a
Lionel B wrote:
> On Wed, 20 Dec 2006 08:48:27 -0500, Victor Bazarov wrote:
>> Ole Nielsby wrote:
>>> Erik Wikström <er****@student.chalmers.se> wrote:
>>>> int& a = vec[5]; // Using index
>>>>
>>>> int* ptr = &(vec[0]);
>>>> int& b = *(ptr + 5) // Using pointer
>>>>
>>>> The second (using pointer) will be faster than the first (using
>>>> index) provided that you don't count the cost of getting the
>>>> pointer in the first place. This can be useful when optimising
>>>> loops iterating over all the elements in a vector like:
>>>>
>>>> int* ptr = &(vec[0]);
>>>> int size = vec.size();
>>>> for (int i = 0; i < size; ++i)
>>>> *(ptr + i) += 2; // Instead of vec[i] += 2;
>>>
>>> Or shorter:
>>>
>>> for (std::vector<int>::iterator i = vec.begin(); i < vec.end(); i++)
>>> *i += 2;
>>>
>>> which does the same and produces similar code on modern
>>> compilers tuned to deal with STL (tested on vc8).
>>
>> Tested with what optimization settings? This code has to instantiate
>> the vector<int>::iterator vec.size() times (since you used i++ and
>> not ++i). Besides, it has to compare the iterators, it has to
>> dereference the 'i'.
>
> Surely you have to compare pointers (or ints, or whatever) to detect
> the loop termination condition, and you have to dereference a pointer
> too...? Ok, the iterator version may involve function calls but these
> are quite likely trivial enough to be optimised away by any compiler
> worth its salt.

Generally speaking, no. Never assume. Especially about your compiler
(which you may be required to use, instead of picking one "worth its
salt").

>> Don't just jump to conclusions. Didn't I say I profiled
>> that sort of code enough to claim pointers are the fastest?
>>
>> Bottomline: avoid function calls like the plague!
>
> ...except if you are certain they will be magicked away by your
> compiler.

You can never be certain unless you actually see the assembly code.
You can never have enough time to look through all assembly code if
your project is beyond the simplest few KLOC. Conclusion: never
trust your compiler to do "the right thing" when it comes to your
program's performance. Measure and make appropriate changes.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Dec 20 '06 #11
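
Since the advice here is to measure, a small timing helper may be useful. This is a minimal sketch that assumes a C++11 compiler; `time_ms` is a hypothetical name, and `std::chrono::steady_clock` is swapped in for the `clock()` calls used elsewhere in the thread.

```cpp
#include <chrono>

// Runs the callable once and returns the elapsed wall-clock time in
// milliseconds, measured with a monotonic clock.
template <typename F>
double time_ms(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A call such as `time_ms([&] { for (int& x : vec) x += 2; })` times one variant; running each variant several times and comparing the spread gives a better picture than a single run.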

P: n/a
On Wed, 20 Dec 2006 14:32:48 +0000, Lionel B wrote:
On Wed, 20 Dec 2006 08:48:27 -0500, Victor Bazarov wrote:

/.../
>Bottomline: avoid function calls like the plague!

...except if you are certain they will be magicked away by your
compiler.
Ok, here's a small test program:

--- begin code ---

#include <iostream>
#include <vector>
#include <time.h>

int main()
{
const int s = 100000000;
std::vector<int> vec(s,1);

clock_t start,endt;

start = clock();
for (int i=0; i<s; ++i) vec[i] += 2;
endt = clock();
std::cout << "index : time = " << endt-start << '\n';

start = clock();
for (std::vector<int>::iterator i=vec.begin(); i!=vec.end(); ++i) *i += 2;
endt = clock();
std::cout << "iterator : time = " << endt-start << '\n';

int* p = &(vec[0]);
start = clock();
for (int i=0; i<s; ++i) *(p+i) += 2;
endt = clock();
std::cout << "pointer : time = " << endt-start << '\n';

return 0;
}

--- end code ---

Compiled with default optimisation (gcc 4.1.1 on linux x86-64):

index : time = 3240000
iterator : time = 3140000
pointer : time = 640000

Compiled with -O1:

index : time = 270000
iterator : time = 230000
pointer : time = 230000

Compiled with -O3:

index : time = 230000
iterator : time = 220000
pointer : time = 230000

--
Lionel B
Dec 20 '06 #12
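
The numbers above are raw `clock()` ticks, not milliseconds. A minimal sketch of the conversion, with the hypothetical helper name `ticks_to_seconds`:

```cpp
#include <ctime>

// Converts a clock() tick count into seconds of processor time.
double ticks_to_seconds(std::clock_t ticks) {
    return static_cast<double>(ticks) / CLOCKS_PER_SEC;
}
```

Assuming the usual POSIX value of 1,000,000 for `CLOCKS_PER_SEC` on that Linux system, 3240000 ticks would correspond to about 3.24 s and 230000 ticks to about 0.23 s.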

P: n/a
Lionel B wrote:
> On Wed, 20 Dec 2006 14:32:48 +0000, Lionel B wrote:
>> On Wed, 20 Dec 2006 08:48:27 -0500, Victor Bazarov wrote:
>>
>> /.../
>>> Bottomline: avoid function calls like the plague!
>>
>> ...except if you are certain they will be magicked away by your
>> compiler.
>
> Ok, here's a small test program:
>
> --- begin code ---
> [..]
> --- end code ---
>
> Compiled with default optimisation (gcc 4.1.1 on linux x86-64):
> [..]

I think you're missing the point. If you want to discuss the
capabilities of any particular compiler to perform some function
call optimisations, the newsgroup dedicated to that compiler is
probably a better place. Here we talk C++ *in general*, and *in
general* nothing is "magicked away". That's all.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Dec 20 '06 #13

P: n/a
On Wed, 20 Dec 2006 10:28:25 -0500, Victor Bazarov wrote:
> Lionel B wrote:
>> On Wed, 20 Dec 2006 14:32:48 +0000, Lionel B wrote:
>>> On Wed, 20 Dec 2006 08:48:27 -0500, Victor Bazarov wrote:
>>>
>>> /.../
>>>
>>>> Bottomline: avoid function calls like the plague!
>>>
>>> ...except if you are certain they will be magicked away by your
>>> compiler.
>>
>> Ok, here's a small test program:
>>
>> --- begin code ---
>> [..]
>> --- end code ---
>>
>> Compiled with default optimisation (gcc 4.1.1 on linux x86-64):
>> [..]
>
> I think you're missing the point. If you want to discuss the
> capabilities of any particular compiler to perform some function
> call optimisations,

I don't. I only included that for information. The point I would like
to make is that I would not wilfully obfuscate my code with pointers where
indexing or iterators are more appropriate, for the sole purpose of
second-guessing compiler optimisation capabilities - and this seems to be
exactly what you are suggesting when you say "avoid function calls like the
plague". To me that is premature optimisation pure and simple.

--
Lionel B
Dec 20 '06 #14

P: n/a
Victor Bazarov wrote:
> Tested with what optimization settings? This code has to instantiate
> the vector<int>::iterator vec.size() times (since you used i++ and not
> ++i). Besides, it has to compare the iterators, it has to dereference
> the 'i'. Don't just jump to conclusions. Didn't I say I profiled
> that sort of code enough to claim pointers are the fastest?

Modern compilers generate exactly the same assembler when using vector
iterators or pointers.
Dec 20 '06 #15

P: n/a
Victor Bazarov wrote:
> Rolf Magnus wrote:
>> [..] Last time I measured
>> iterating a vector, indexing was generally slower than using a
>> pointer or an iterator and increment it in the loop. No difference
>> between indexing using a pointer and using vector's operator[].
>
> Not true.

Well, it's truly what I measured.

> Indexing using a pointer is still faster because it does
> not involve a function call. That's the whole point!

This happens only if I switch optimizations off completely. Size
optimization - as you suggested - isn't sufficient. I wouldn't expect
anything different, since the operator usually contains nothing more
than the pointer indexing, and the code for that is smaller than a function
call, so I would expect a size-optimizing compiler to inline the operator,
producing exactly the same code as the direct pointer indexing.
Dec 20 '06 #16
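
The argument that the operator "contains nothing more than the pointer indexing" can be made concrete with a stripped-down stand-in. This is only a sketch: `IntView` is a hypothetical class, and real `std::vector` implementations differ in detail and may add checks in some build modes.

```cpp
#include <cstddef>

// Non-owning view over an int array with a vector-like operator[].
class IntView {
public:
    IntView(int* data, std::size_t size) : data_(data), size_(size) {}

    // The whole body is one pointer index; once inlined, nothing is left
    // of the "function call" except the memory access itself.
    int& operator[](std::size_t i) { return data_[i]; }

    std::size_t size() const { return size_; }

private:
    int* data_;
    std::size_t size_;
};
```

With such a definition, `view[i]` and `*(ptr + i)` name the same memory location, so any speed difference has to come from the call not being inlined.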

P: n/a

Victor Bazarov wrote:
> [snip]
> To make a real case of inline versus non-inline, you _must_ place your
> non-inlined member functions in a separate translation unit. Otherwise
> any decent compiler should be able to simply inline all your calls if
> it can see the actual body.
>
> I _know_ that if a function is not inlined, it's gonna costcha. Build
> your project with optimization for size, use 'std::vector' and its
> indexing argument, and you'll notice that if you switch to indexing
> from a pointer to the first element instead, you'll get plenty of
> performance improvement. Up to ten times, as a matter of fact, in
> some cases.

Your argument is valid for a specific compiler and a specific
optimisation setting and not very generic. On most compilers and with
most optimisation-settings, there should be no difference at all.
Actually, some advocate using indexing rather than pointer-iterations
for raw vectors, claiming better optimisation capabilities due to fewer
aliasing issues for the compiler.

/Peter

Dec 20 '06 #17
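
For readers unfamiliar with the term, the following minimal sketch shows what "aliasing" means for the compiler. The function `add_value` is hypothetical, and the example only illustrates the aliasing problem itself; it does not demonstrate the claim that indexing optimises better than pointer iteration.

```cpp
#include <cstddef>

// Adds *value to every element of out. Because value may point into out,
// the compiler has to re-read *value after each store instead of keeping
// it in a register.
void add_value(int* out, std::size_t n, const int* value) {
    for (std::size_t i = 0; i < n; ++i) out[i] += *value;
}
```

If `value` points at a separate variable holding 1, an array `{1, 1, 1}` becomes `{2, 2, 2}`. If `value` points at the array's own first element, the first store changes `*value` to 2 and the result is `{2, 3, 3}`, which is why the compiler cannot hoist the load out of the loop on its own.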

P: n/a
Victor Bazarov wrote:
>> Compiled with default optimisation (gcc 4.1.1 on linux x86-64):
>> [..]
>
> I think you're missing the point. If you want to discuss the
> capabilities of any particular compiler to perform some function
> call optimisations, the newsgroup dedicated to that compiler is
> probably a better place. Here we talk C++ *in general*, and *in
> general* nothing is "magicked away". That's all.

Well, *in general*, you can't do any optimizations, because C++ doesn't
specify how much time the generated code must take for a specific piece of
source code. So if you suggest any optimization techniques, you have to
look at the real world, not just at the standard.

Dec 20 '06 #18

P: n/a
Victor Bazarov wrote:
Thomas Kowalski wrote:
[..]

> To make a real case of inline versus non-inline, you _must_ place
> your non-inlined member functions in a separate translation unit.
> Otherwise any decent compiler should be able to simply inline all
> your calls if it can see the actual body.

With the compiler used here that won't help, as it is perfectly capable of
inlining across module boundaries.

Writing a synthetic benchmark for a global optimizer is extremely difficult.
:-)
Bo Persson
Dec 20 '06 #19

P: n/a
Rolf Magnus wrote:
> Victor Bazarov wrote:
>>> Compiled with default optimisation (gcc 4.1.1 on linux x86-64):
>>> [..]
>>
>> I think you're missing the point. If you want to discuss the
>> capabilities of any particular compiler to perform some function
>> call optimisations, the newsgroup dedicated to that compiler is
>> probably a better place. Here we talk C++ *in general*, and *in
>> general* nothing is "magicked away". That's all.
>
> Well, *in general*, you can't do any optimizations, because C++
> doesn't specify how much time the generated code must take for a
> specific piece of source code. So if you suggest any optimization
> techniques, you have to look at the real world, not just at the
> standard.

*In general* if one's concerned with overhead of calling a function,
one should inline the function body. Whether your compiler is up
to snuff for that or you do it manually does not matter. In general,
however, you can't rely on your compiler, so doing it manually is
your only *generally* good approach. Are you suggesting that, given

int foo(int a, int b)
{
return a+b;
}
int bar(int a, int b)
{
return foo(a, b);
}

, calling 'bar' *can* be faster than calling 'foo'?

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Dec 20 '06 #20

P: n/a
Mathias Gaunard wrote:
> Victor Bazarov wrote:
>> Tested with what optimization settings? This code has to instantiate
>> the vector<int>::iterator vec.size() times (since you used i++ and
>> not ++i). Besides, it has to compare the iterators, it has to
>> dereference the 'i'. Don't just jump to conclusions. Didn't I say
>> I profiled that sort of code enough to claim pointers are the
>> fastest?
>
> Modern compilers generate exactly the same assembler when using vector
> iterators or pointers.

No, they don't. It depends on what you ask them, and it depends on the
implementation of the Standard library. I am not going to go into the
details, try different optimization levels with your favourite compiler
and you will see. And keep in mind that maximum optimization is not
always the right choice.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Dec 20 '06 #21

P: n/a
Rolf Magnus wrote:
> Victor Bazarov wrote:
>> Rolf Magnus wrote:
>>> [..] Last time I measured
>>> iterating a vector, indexing was generally slower than using a
>>> pointer or an iterator and increment it in the loop. No difference
>>> between indexing using a pointer and using vector's operator[].
>>
>> Not true.
>
> Well, it's truly what I measured.

I am not disputing what you measured. I am disagreeing with the
last statement of that paragraph. It is unclear whether it's what
you measured or what you deduced to be the reality.

>> Indexing using a pointer is still faster because it does
>> not involve a function call. That's the whole point!
>
> This happens only if I switch optimizations off completely. Size
> optimization - as you suggested - isn't sufficient.

Isn't sufficient for what? For getting the "correct" results in
indexing a pointer versus calling the vector::operator[]?

> I wouldn't expect
> anything different, since the operator usually just contains nothing
> more than the pointer indexing, and the code for that is smaller than
> a function call, so I would expect a size optimizing compiler to
> inline the operator, producing the exact same code as the direct
> pointer indexing.

Hey, maybe you and many others would _expect_ world peace next
year. It's not gonna happen, though. Just like _expecting_ the
compiler to do something (or forgo doing something) does not mean
it is so.

In certain situations the indexing operator implementation _can_ contain
other stuff (whatever the library implementors decided to put in
there), which the compiler has to convert into code; it doesn't
have to do that if you just index a pointer. IOW, it's not just
the overhead of preparing the stack frame for the call, it's also the
contents of the function, when they are not inlined, that we
have to compare with a plain pointer dereferencing/indexing.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Dec 20 '06 #22

P: n/a
peter koch wrote:
> Victor Bazarov wrote:
> [snip]
>> To make a real case of inline versus non-inline, you _must_ place
>> your non-inlined member functions in a separate translation unit.
>> Otherwise any decent compiler should be able to simply inline all
>> your calls if it can see the actual body.
>>
>> I _know_ that if a function is not inlined, it's gonna costcha.
>> Build your project with optimization for size, use 'std::vector' and
>> its indexing argument, and you'll notice that if you switch to
>> indexing from a pointer to the first element instead, you'll get
>> plenty of performance improvement. Up to ten times, as a matter of
>> fact, in some cases.
>
> Your argument is valid for a specific compiler and a specific
> optimisation setting and not very generic. On most compilers and with
> most optimisation-settings, there should be no difference at all.

If among one million outcomes there are ten that are not like the other
nine hundred ninety nine thousand nine hundred and ninety, you cannot
call those others "the general case". If there is at least one compiler
that doesn't perform like the others, we cannot say the general case is
that optimization takes care of that. You can call that "mainly" or
"chiefly" or "usually". Not "generally". *Generally* you cannot rely
on the compiler to do that.

> Actually, some advocate using indexing rather than pointer-iterations
> for raw vectors,

What are those? Do you mean "arrays"?

> claiming better optimisation capabilities due to
> fewer aliasing issues for the compiler.

Actually indexing is a better technique for parallel computations, by
far, than dereferencing the pointer (although I've seen indexing code
run about 10% slower than dereferencing an incrementing pointer).
Indexing the pointer, that is, not calling 'vector::operator[]' of
course.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Dec 20 '06 #23

P: n/a

Victor Bazarov wrote:
Rolf Magnus wrote:
Victor Bazarov wrote:
Rolf Magnus wrote:
[..] Last time I measured
iterating a vector, indexing was generally slower than using a
pointer or an iterator and increment it in the loop. No difference
between indexing using a pointer and using vector's operator[].

Not true.
Well, it's truely what I measured.

I am not disputing what you measured. I am disagreeing with the
last statement of that paragraph. It is unclear whether it's what
you measured or what you deduced to be the reality.
Indexing using a pointer is still faster because it does
not involve a function call. That's the whole point!
This happens only if I switch optimizations off completely. Size
optimitation - as you suggested - isn't sufficient.

Isn't sufficient for what? For getting the "correct" results in
indexing a pointer versus calling the vector::operator[]?
I wouldn't expect
anything different, since the operator usually just contains nothing
more than the pointer indexing, and the code for that is smaller than
a function call, so I would expect a size optimizing compiler to
inline the operator, producing the exact same code as the direct
pointer indexing.

Hey, maybe you and many others would _expect_ the world peace next
year. It's not gonna happen, though. Just like _expecting_ the
compiler to do something (or forgo doing something) does not mean
it is so.
I do not expect world peace even if that would be a nice feature, but I
do expect you to know your compiler and e.g. set the appropriate
switches in order to get optimal performance.
>
In certain situations indexing operator implemenation _can_ contain
other stuff (whatever the library implementors decided to put in
there), which the compiler has to convert into code; it doesn't
have to do that if you just index a pointer. IOW, it's not just
the overhead of preparing the stack frame for the call, it's the
contents of the function that when they are not inlined, which we
have to compare with a plain pointer dereferencing/indexing.
And in that case you'd be comparing apples to oranges. If the vendor should
decide to, e.g., put bounds checking into the indexing operator but not
put checking code into a "pointer" version, the two implementations are
not the same. You should - if you could - disable the extra code put in
by the vendor to get a fair comparison.

/Peter

Dec 20 '06 #24
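
The standard library itself offers a checked and an unchecked accessor, which shows why such a comparison is apples to oranges. A minimal sketch, with the hypothetical helper `checked_read`:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Reads vec[i] through at(), which is required to check the index and
// throw std::out_of_range when it is not smaller than size().
// Returns false (and leaves out untouched) for an invalid index.
bool checked_read(const std::vector<int>& vec, std::size_t i, int& out) {
    try {
        out = vec.at(i);
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

`operator[]` and raw pointer indexing carry no such requirement, so timing `at()` against `*(ptr + i)` measures the cost of the check, not the cost of a function call.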

P: n/a

Victor Bazarov wrote:
peter koch wrote:
Victor Bazarov skrev:
[snip]
To make a real case of inline versus non-inline, you _must_ place
your non-inlined member functions in a separate translation unit.
Otherwise any decent compiler should be able to simply inline all
your calls if it can see the actual body.

I _know_ that if a function is not inlined, it's gonna costcha.
Build your project with optimization for size, use 'std::vector' and
its indexing argument, and you'll notice that if you switch to
indexing from a pointer to the first element instead, you'll get
plenty of performance improvement. Up to ten times, as a matter of
fact, in some cases.
Your argument is valid for a specific compiler and a specific
optimisation setting and not very generic. On most compilers and with
most optimisation-settings, there should be no difference at all.

If among one million outcomes there are ten that are not like the other
nine hundred ninety nine thousand nine hundred and ninety, you cannot
call those others "the general case". If there is at least one compiler
that doesn't perform like the others, we cannot say the general case is
that optimization takes care of that. You can call that "mainly" or
"chiefly" or "usually". Not "generally". *Generally* you cannot rely
on the compiler to do that.
Actually, some advocate using indexing rather than pointer-iterations
for raw vectors,

What are those? Do you mean "arrays"?
Right - too sloppy and too quick to respond here. I'm slightly surprised
you did not understand what I meant.
>
claiming better optimisation capabilities due to
fewer aliasing issues for the compiler.

Actually indexing is a better technique for parallel computations, by
far, than dereferencing the pointer (although I've seen indexing code
to be slower about 10% than dereferencing an incrementing pointer).
Indexing the pointer, that is, not calling 'vector::operator[]' of
course.
I believe it is time you came out of the bush. Which compilers, apart
from MSVC 8.0, give you an overhead when using operator[]? And which
compiler out there gives you performance where indexing is
significantly (say 10%) worse than pointer/iterator usage and that can't be
cured by changing the compiler settings, without changing the
source code?

/Peter

Dec 20 '06 #25

Victor Bazarov wrote:
Rolf Magnus wrote:
>Victor Bazarov wrote:
>>Rolf Magnus wrote:
[..] Last time I measured
iterating a vector, indexing was generally slower than using a
pointer or an iterator and increment it in the loop. No difference
between indexing using a pointer and using vector's operator[].

Not true.

Well, it's truly what I measured.

I am not disputing what you measured. I am disagreeing with the
last statement of that paragraph. It is unclear whether it's what
you measured or what you deduced to be the reality.
It's what I measured.
>>Indexing using a pointer is still faster because it does
not involve a function call. That's the whole point!

This happens only if I switch optimizations off completely. Size
optimization - as you suggested - isn't sufficient.

Isn't sufficient for what? For getting the "correct" results in
indexing a pointer versus calling the vector::operator[]?
It isn't sufficient to make my compiler not inline the function. As I
understand, you were suggesting that size optimization would disable
inlining of that operator.
>I wouldn't expect
anything different, since the operator usually just contains nothing
more than the pointer indexing, and the code for that is smaller than
a function call, so I would expect a size optimizing compiler to
inline the operator, producing the exact same code as the direct
pointer indexing.

Hey, maybe you and many others would _expect_ the world peace next
year. It's not gonna happen, though. Just like _expecting_ the
compiler to do something (or forgo doing something) does not mean
it is so.
It's actually the result of an observation. I saw the compiler do that, and
then I thought "hey, that's logical. It's what one would expect from a good
compiler".
In certain situations the indexing operator implementation _can_
contain other stuff (whatever the library implementors decided to put
in there), which the compiler has to convert into code; it doesn't
have to do that if you just index a pointer. IOW, it's not just
the overhead of preparing the stack frame for the call, it's also the
contents of the function, when they are not inlined, that we
have to compare with a plain pointer dereferencing/indexing.
Right. But again, this is all implementation specific, and when
discussing whether one way of doing something is faster than another,
you have to pick an implementation and look at what it does. Otherwise,
you are just discussing what could, might or should happen.

Dec 20 '06 #26

Rolf Magnus wrote:
[..] this is all implementation specific, and when discussing
whether one way of doing something is faster than another, you
have to pick an implementation and look at what it does. Otherwise,
you are just discussing what could, might or should happen.
<shrug> The discussion started with the question about the cost of
calling a member function. The results were specific (to VC++ v8),
but the question is generic. The answer is usually "you have to
measure to know". However, I found that with all other things
different, if you can avoid calling a function, do that. You all
seem to disagree with that, and I for the life of me cannot figure
out what (if anything) you have to say to prove me wrong, except
"duh, compilers should take care of it".

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Dec 20 '06 #27

peter koch wrote:
Victor Bazarov skrev:
>peter koch wrote:
[..]
>>Actually, some advocate using indexing rather than
pointer-iterations for raw vectors,

What are those? Do you mean "arrays"?
Right - too sloppy and too fast to respond here. I'm slightly surprised
you did not understand what I meant.
I understood (otherwise how would I be able to suggest "arrays"?)
But for the slim chance that there is somebody who is even less
attentive than I am, it was necessary to confirm.
>>claiming better optimisation capabilities due to
fewer aliasing issues for the compiler.

Actually indexing is a better technique for parallel computations, by
far, than dereferencing the pointer (although I've seen indexing code
to be slower about 10% than dereferencing an incrementing pointer).
Indexing the pointer, that is, not calling 'vector::operator[]' of
course.

I believe it is time you come out of the bush. Which compilers apart
from MSVC 8.0 give you an overhead using operator[]?
I don't know. I (not surprisingly) don't actually have all compilers
in the world available to me to check all their optimization settings.
And which
compiler out there gives you a performance where indexing is
significantly (say 10%) worse than pointer/iterator usage that can't
be cured by changing the compiler settings, without changing the
source code?
I am sure that the behaviour of the code can usually be affected/fixed
by tweaking compiler settings (instead of changing the code). But
that is usually a freedom that only programmers of small projects
enjoy. As soon as you have a larger organization consuming your
product (code), there are always things in the way, like coding
standards, like the need to justify or prove the benefit of changing
compiler settings, and so forth. It is much easier to fix the code
than to fix the project settings where large organisations like
that are concerned.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Dec 20 '06 #28

On Wed, 20 Dec 2006 14:19:21 +0100 in comp.lang.c++, Rolf Magnus
<ra******@t-online.de> wrote,
>std::transform(vec.begin(), vec.end(), vec.begin(),
std::bind2nd(std::plus<int>(), 2));

This is harder to write, harder to read and not shorter than the for loop.
#include <algorithm>
#include <boost/lambda/lambda.hpp>

using boost::lambda::_1; // placeholder for the current element
std::for_each(vec.begin(), vec.end(), _1 += 2);

Dec 21 '06 #29

Victor Bazarov wrote:
Rolf Magnus wrote:
>>[..] this is all implementation specific, and when discussing
whether one way of doing something is faster than another, you
have to pick an implementation and look at what it does. Otherwise,
you are just discussing what could, might or should happen.


<shrug> The discussion started with the question about the cost of
calling a member function. The results were specific (to VC++ v8),
but the question is generic. The answer is usually "you have to
measure to know". However, I found that with all other things
different, if you can avoid calling a function, do that. You all
seem to disagree with that, and I for the life of me cannot figure
out what (if anything) you have to say to prove me wrong, except
"duh, compilers should take care of it".
Well, I for one would agree with you. Avoiding the function call will
speed things up. Mostly you won't notice, as it won't have any
significant effect, but when it does, it does. So, as you say, you need
to measure first.

Being overly concerned about what one compiler does can be
counterproductive, as can the statement "the compiler should take care
of it". The GNU compiler rarely generates the same code for a
significant function from one release to the next; if you were to rely
on the output of that, you may find yourself stuck with one particular
release, bugs and all.

Dec 21 '06 #30

Victor Bazarov wrote:
Rolf Magnus wrote:
>[..] this is all implementation specific, and when discussing
whether one way of doing something is faster than another, you
have to pick an implementation and look at what it does. Otherwise,
you are just discussing what could, might or should happen.

<shrug> The discussion started with the question about the cost of
calling a member function. The results were specific (to VC++ v8),
but the question is generic. The answer is usually "you have to
measure to know".
Agreed.
However, I found that with all other things different, if you can avoid
calling a function, do that.
If by "function call" you mean machine level (as opposed to a function call
in C++ that gets inlined).
You all seem to disagree with that, and I for the life of me cannot figure
out what (if anything) you have to say to prove me wrong, except "duh,
compilers should take care of it".
I wasn't really disagreeing on that, but rather on your suggestions on how
to measure the difference.
Anyway, I actually have heard (but not verified myself) that it can happen
that inlining a function increases execution time because of increased code
size and thus more cache load. If the inlined function contains branches
and is called from many places, branch prediction can also become less
efficient.

Dec 21 '06 #31

Rolf Magnus wrote:
[..]
Anyway, I actually have heard (but not verified myself) that it can
happen that inlining a function increases execution time because of
increased code size and thus more cache load. If the inlined function
contains branches and is called from many places, branch prediction
can also become less efficient.
I've heard of it too. I also heard that any attempt to improve code
performance by caching more information in objects can have the adverse
effect of ballooning memory to the point where cache misses and page
faults accessing objects in memory start hindering performance... I
think we already agreed that it can only be discovered by measuring
the code execution, and is very difficult to guess. I have measured the
effects of reducing function calls. I haven't seen any direct evidence
of code size affecting performance, as of yet.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Dec 21 '06 #32
