Bytes IT Community

simple code performance question

P: n/a
Hi,

Given this code:

A** ppA = new A*[10];
A *pA = NULL;
for(int i = 0; i < 10; ++i)
{
pA = ppA[i];
//do something with pA
}

Is there a performance penalty if the declaration and assignment of
pA are moved inside the for-block, as in:

A *pA = ppA[i];

Regards,
ren

Oct 30 '07 #1
30 Replies


P: n/a
ga********@gmail.com wrote:
Given this code:

A** ppA = new A*[10];
A *pA = NULL;
for(int i = 0; i < 10; ++i)
{
pA = ppA[i];
//do something with pA
}

- is there some performance penalty if pA declaration and assignment
will be inside the for-block as:

A *pA = ppA[i];
Of course not. However, the declaration/definition/initialisation
inside "the for-block" is much better from the code maintenance POV.
Presence of 'pA' variable outside has *no merit*. It only pollutes
the scope. Unless there is a need to use 'pA' in the same scope as
'ppA', 'pA' should be defined inside.
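To make the two variants concrete, here is a minimal sketch. The struct `A` with a `value` member and the helpers `sum_outer` and `sum_inner` are made up for illustration; the original post never says what `A` is or what the loop does with `pA`. A reasonable optimizing compiler would be expected to emit the same code for both functions, so the only real difference is how far the name `pA` leaks.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the poster's class A; the thread never defines it.
struct A { int value; };

// Variant 1: pA lives in the enclosing scope and is reassigned on every pass.
inline int sum_outer(A* const* ppA, std::size_t n)
{
    assert(ppA != nullptr || n == 0);
    int total = 0;
    A* pA = nullptr;                  // still visible after the loop
    for (std::size_t i = 0; i < n; ++i)
    {
        pA = ppA[i];
        total += pA->value;
    }
    return total;
}

// Variant 2: pA is declared and initialised where it is used.
inline int sum_inner(A* const* ppA, std::size_t n)
{
    assert(ppA != nullptr || n == 0);
    int total = 0;
    for (std::size_t i = 0; i < n; ++i)
    {
        A* const pA = ppA[i];         // scope ends with each iteration
        total += pA->value;
    }
    return total;
}
```

In the second variant the pointer can also be made `const`, which is not possible when it has to be reassigned from outside the loop.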

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Oct 30 '07 #2

P: n/a
Thanks for the answer, but can you supply some more details?
Why is there no performance penalty here? Is it because pA is a
pointer type?
TIA
Oren

Oct 31 '07 #3

P: n/a
gl****@smile.net.il wrote:
Thanks for the answer, but can you supply some more details?
Why there is no performance penalty here, is it because it is a
pointer type?
Yes. What details are you looking for?

Let's say this: with a very bad compiler (which doesn't emit the
same code in both cases) there might be some difference, but so
minuscule that it would not matter when the big[ger] picture is
concerned.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Oct 31 '07 #4

P: n/a
What is the difference between these cases in a bad compiler?
Is it that the second case allocates the pointer's memory on each loop
and this causes the (negligible) performance degradation?

Oct 31 '07 #5

P: n/a
gl****@smile.net.il wrote:
What is the difference between these cases in a bad compiler?
Is it that the second case allocates the pointer's memory on each loop
and this causes the (negligible) performance degradation?
Something like that. You should not expect such allocation to take too
much time, of course, even with a bad compiler. Automatic storage is
usually not overly slow. That's why the degradation (if any) is truly
negligible.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Oct 31 '07 #6

P: n/a
On Oct 31, 9:36 pm, "Victor Bazarov" <v.Abaza...@comAcast.net> wrote:
glo...@smile.net.il wrote:
Thanks for the answer, but can you supply some more details?
Why there is no performance penalty here, is it because it is a
pointer type?
Yes. What details are you looking for?
Let's say this: with a very bad compiler (which doesn't emit the
same code in both cases) there might be some difference, but so
miniscule that it would not matter when the big[ger] picture is
concerned.
More importantly, the difference can go both ways.

A while back, someone suggested declaring an std::string outside
the loop, on the grounds that that would improve performance,
since the constructor and the destructor would only have to be
invoked once, and not each time through the loop. I wrote up a
small benchmark to see just how much difference it would make,
and with g++ (2.95.2 at the time, I think), it turned out that
declaring the variable in the loop was actually significantly
faster. For my benchmark---it all depended on what you were
doing.

The correct answer is (and I know you know it): don't worry
about it until the profiler says you have to, and then try both,
measuring, to see what actually happens on your implementation.
Anything else is just pure stupidity.
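For anyone who wants to try "measuring both" on their own implementation, a rough harness could look like the sketch below. Everything in it is an assumption made for illustration: `make_text` is a hypothetical workload, and `run_assign`, `run_construct` and `time_us` are names invented here, not the benchmark described above. The checksum is returned only so that the loops have an observable result.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>

// Hypothetical workload; stands in for whatever the real loop calls.
inline std::string make_text(std::size_t i)
{
    return std::string(32 + i % 8, 'x');
}

// Variant 1: one string object declared outside, assigned on every pass.
inline std::size_t run_assign(std::size_t passes)
{
    std::size_t checksum = 0;
    std::string data;
    for (std::size_t i = 0; i < passes; ++i) {
        data = make_text(i);
        checksum += data.size();
    }
    return checksum;
}

// Variant 2: a fresh string constructed inside the loop on every pass.
inline std::size_t run_construct(std::size_t passes)
{
    std::size_t checksum = 0;
    for (std::size_t i = 0; i < passes; ++i) {
        std::string data(make_text(i));
        checksum += data.size();
    }
    return checksum;
}

// Times one variant with a monotonic clock; returns elapsed microseconds.
template <typename Fn>
long long time_us(Fn fn, std::size_t passes, std::size_t& checksum_out)
{
    assert(passes > 0);   // a zero-pass run measures nothing
    const auto start = std::chrono::steady_clock::now();
    checksum_out = fn(passes);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Calling `time_us(run_assign, n, sum)` and `time_us(run_construct, n, sum)` with a large `n`, several times and with optimization enabled, gives numbers for one compiler and one library only. Which variant wins, if either, is exactly what cannot be predicted in advance.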

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Nov 1 '07 #7

P: n/a
On 1 nov, 06:36, James Kanze <james.ka...@gmail.com> wrote:
On Oct 31, 9:36 pm, "Victor Bazarov" <v.Abaza...@comAcast.net> wrote:
glo...@smile.net.il wrote:
Thanks for the answer, but can you supply some more details?
Why there is no performance penalty here, is it because it is a
pointer type?
Yes. What details are you looking for?
Let's say this: with a very bad compiler (which doesn't emit the
same code in both cases) there might be some difference, but so
miniscule that it would not matter when the big[ger] picture is
concerned.

More importantly, the difference can go both ways.

A while back, someone suggested declaring an std::string outside
the loop, on the grounds that that would improve performance,
since the constructor and the destructor would only have to be
invoked once, and not each time through the loop. I wrote up a
small benchmark to see just how much difference it would make,
and with g++ (2.95.2 at the time, I think), it turned out that
declaring the variable in the loop was actually significantly
faster. For my benchmark---it all depended on what you were
doing.
How could that be possible? With a minimum of smartness in memory
management, both ways should be equally efficient in this regard. The
execution of any kind of code in constructors/destructors implicit
calls could also be easily avoided for std::strings when gcc detects
such constructs (though such optimization may not always be possible
for user-defined classes). But this only means that gcc should have
the same performance in both cases, otherwise it must not be doing a
good job on optimizing the constructor-outside-the-loop case.

In my humble opinion, it is safer to believe that code like this will
run faster:

#include <string>

void get_string_somehow( std::string& str ); // Implemented elsewhere.
void do_something_with_string( std::string& str ); // Implemented elsewhere.

int main()
{
    {
        std::string str;
        for ( unsigned i( 0 ); i < 100; ++i )
        {
            get_string_somehow( str );
            do_something_with_string( str );
        }
    }

    return( 0 );
}

Notice that concerns about name leaking are addressed by simply placing
braces {} around the relevant code. When it comes to pointers and
other fundamental types, however, I pretty much agree that declaring
inside loops is the way to go.
The correct answer is (and I know you know it): don't worry
about it until the profiler says you have to, and then try both,
measuring, to see what actually happens on your implementation.
Anything else is just pure stupidity.
I would say that taking some precautions while coding, i.e. before
proper profiling is possible, cannot be considered stupidity, but I do
agree that no a priori assertion can be made in this respect.

Elias Salomão Helou Neto

Nov 2 '07 #8

P: n/a
On Nov 2, 1:45 am, Elias Salomão Helou Neto <eshn...@gmail.com> wrote:
On 1 nov, 06:36, James Kanze <james.ka...@gmail.com> wrote:
On Oct 31, 9:36 pm, "Victor Bazarov" <v.Abaza...@comAcast.net> wrote:
glo...@smile.net.il wrote:
Thanks for the answer, but can you supply some more details?
Why there is no performance penalty here, is it because it is a
pointer type?
Yes. What details are you looking for?
Let's say this: with a very bad compiler (which doesn't emit the
same code in both cases) there might be some difference, but so
miniscule that it would not matter when the big[ger] picture is
concerned.
More importantly, the difference can go both ways.
A while back, someone suggested declaring an std::string outside
the loop, on the grounds that that would improve performance,
since the constructor and the destructor would only have to be
invoked once, and not each time through the loop. I wrote up a
small benchmark to see just how much difference it would make,
and with g++ (2.95.2 at the time, I think), it turned out that
declaring the variable in the loop was actually significantly
faster. For my benchmark---it all depended on what you were
doing.
How could that be possible?
Who knows? Who cares? I didn't do an extensive analysis of the
implementation of g++. The point remains that you cannot say
which is faster (if either) until you've measured the case which
interests you.
With a minimum of smartness in memory management, both ways
should be equally efficient in this regard. The execution of
any kind of code in constructors/destructors implicit calls
could also be easily avoided for std::strings when gcc detects
such constructs (though such optimization may not always be
possible for user-defined classes). But this only means that
gcc should have the same performance in both cases, otherwise
it must not be doing a good job on optimizing the
constructor-outside-the-loop case.
Or the implementation of std::string does something funny. Or
whatever. All it means is that assignment is more expensive
than construction/destruction, which shouldn't really surprise
anyone.
In my humble opinion, it is safier to believe that code like
this will run faster:
#include <string>
void get_string_somehow( std::string& str ); //Elsewhere implemented.
void do_something_with_string( std::string& str ); //Elsewhere
implemented.
int main();
{
{
std::string str;
for ( unsigned i( 0 ); i < 100; ++i )
{
get_string_somehow( str );
do_something_with_string( str );
}
}
return( 0 );
}
I don't know what you mean by "safer", but there's absolutely no
reason to believe anything of the kind. Depending on the
implementation of std::string, what get_string_somehow does, and
possibly any number of other factors, it's impossible to say
without measuring whether the above will be faster or slower
than any other particular solution.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Nov 2 '07 #9

P: n/a
Thanks to all posters.
Oren

Nov 2 '07 #10

P: n/a
More importantly, the difference can go both ways.
A while back, someone suggested declaring an std::string outside
the loop, on the grounds that that would improve performance,
since the constructor and the destructor would only have to be
invoked once, and not each time through the loop. I wrote up a
small benchmark to see just how much difference it would make,
and with g++ (2.95.2 at the time, I think), it turned out that
declaring the variable in the loop was actually significantly
faster. For my benchmark---it all depended on what you were
doing.
How could that be possible?

Who knows? Who cares? I didn't do an extensive analysis of the
implementation of g++. The point remains that you cannot say
which is faster (if either) until you've measured the case which
interests you.
I understand this.
With a minimum of smartness in memory management, both ways
should be equally efficient in this regard. The execution of
any kind of code in constructors/destructors implicit calls
could also be easily avoided for std::strings when gcc detects
such constructs (though such optimization may not always be
possible for user-defined classes). But this only means that
gcc should have the same performance in both cases, otherwise
it must not be doing a good job on optimizing the
constructor-outside-the-loop case.

Or the implementation of std::string does something funny. Or
whatever. All it means is that assignment is more expensive
than construction/destruction, which shouldn't really supprise
anyone.
I think it actually should be surprising. Since memory is already
allocated (things are not that simple, I know, but that does not
invalidate the conclusion), assignment should be faster than, or at
least as fast as, construction. If it is not, that is, in my view, a
flaw in the compiler. I would like to see the code you benchmarked.
In my humble opinion, it is safier to believe that code like
this will run faster:
#include <string>
void get_string_somehow( std::string& str ); //Elsewhere implemented.
void do_something_with_string( std::string& str ); //Elsewhere
implemented.
int main();
{
{
std::string str;
for ( unsigned i( 0 ); i < 100; ++i )
{
get_string_somehow( str );
do_something_with_string( str );
}
}
return( 0 );
}

I don't know what you mean by "safer", but there's absolutely no
reason to believe anything of the kind. Depending on the
implementation of std::string, what get_string_somehow does, and
possibly any number of other factors, it's impossible to say
without measuring whether the above will be faster or slower
than any other particular solution.
Sorry, I was not completely clear, but I meant that it is more likely
that the mentioned code would run faster than one with the constructor
inside the loop, i.e., just above the get_string_somehow().

What I do not understand are the reasons which could make assignment
slower than construction. Again, I think it cannot be justified except
as a compiler issue.

Nov 2 '07 #11

P: n/a
On Nov 2, 3:52 pm, Elias Salomão Helou Neto <eshn...@gmail.com> wrote:

[...]
Or the implementation of std::string does something funny. Or
whatever. All it means is that assignment is more expensive
than construction/destruction, which shouldn't really supprise
anyone.
I think it actually should surprise. Since memory is already
allocated (things are not that simple, I know, but it does not
invalidate the conclusion) assignment should be faster, or, at
least as fast as construction.
That's really a very naïve point of view.

As I said, I measured; the actual results don't support your
conclusion, at least in the specific case I measured. (I might
also add that I sort of suspected this, since I am familiar with
the g++ implementation of std::string.)
Not being is, in my point of view, a flaw in the compiler. I
would like to see the code you benchmarked.
It was quite some time ago, but if I recall correctly, it was
something like:

std::string data ;
for ( ... ) {
    data = someFunction() ;
}

vs.

for ( ... ) {
    std::string data( someFunction() ) ;
}

[...]
What I do not understand are the reasons which could make
assignment slower than construction.
std::string is a complicated class, with not a few constraints.
Implementations try to optimize frequent operations, like the
above (with the definition in the loop). In the case of g++,
for example, construction of a copy of a string does NOT
generally allocate memory, nor copy any text---assignment to the
string will almost always copy text. Other implementations
don't allocate memory for smaller strings, but just copy the
data. And so on.
Again, I think it cannot be justified unless as a compiler
issue.
What can I say? You're wrong. (And there's nothing to
"justify". I would consider it a sign of a good implementation
that the cleaner, more frequent use runs faster.)

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Nov 3 '07 #12

P: n/a
On 3 nov, 06:15, James Kanze <james.ka...@gmail.com> wrote:
[...]

It was quite some time ago, but if I recall correctly, it was
something like:

std::string data ;
for ( ... ) {
data = someFunction() ;
}

vs.

for ( ... ) {
std::string data( someFunction() ) ;
}

[...]
What I do not understand are the reasons which could make
assignment slower than construction.

std::string is a complicated class, with not a few constraints.
Implementations try to optimize frequent operations, like the
above (with the definition in the loop). In the case of g++,
for example, construction of a copy of a string does NOT
generally allocate memory, nor copy any text---assignment to the
string will almost always copy text. Other implementations
don't allocate memory for smaller strings, but just copy the
data. And so on.
Well, IMHO whenever copy construction doesn't need copying, neither
should assignment.

But the real point here is that you were using something like str =
someFunction() instead of someFunction( str ). Do you see? In such a
case, it is much easier to optimize away the creation of the temporary
in std::string str( someFunction() ) than in str = someFunction(). This
is not the compiler's fault, nor does it fall under my example. In the
second case, a temporary needs to be created for the assignment to
be possible, while in the first it does not.

The bottleneck here most certainly was the creation of the temporary,
not the assignment operation versus the construction/destruction cycle.
It was not the compiler, after all!

As I see, even gurus like you (and I always appreciate our
enlightening discussions) eventually get lost with c++ subtleties. I
am perhaps naive, but not as much as you may be thinking.

Also, in normal cases, optimizations may not be as simple as they are
with std::string, which is under the compiler's control (in fact, the
implementation of std::string should be near optimal without any
compiler optimization anyway). Within user-defined classes, copy
construction may execute startup code that will be cleaned up upon
destruction, which would not happen with assignment operations, so
the construction/destruction cycle within a loop is best avoided in
most cases when it is not needed, unless assignment is especially
poorly implemented. In such cases, however, it would be better to
reimplement the assignment properly. That is why my advice is to place
constructors outside the loop, though exceptions to the rule may exist
(yours doesn't seem to be one), AND to refrain from returning large
objects by value, which would avoid the creation of temporaries.
Elias Salomão Helou neto.

Nov 4 '07 #13

P: n/a
Elias Salomão Helou Neto wrote:
: On 3 nov, 06:15, James Kanze <james.ka...@gmail.com> wrote:
::
:: [...]
::
:: It was quite some time ago, but if I recall correctly, it was
:: something like:
::
:: std::string data ;
:: for ( ... ) {
:: data = someFunction() ;
:: }
::
:: vs.
::
:: for ( ... ) {
:: std::string data( someFunction() ) ;
:: }
::
:: [...]
:
::: What I do not understand are the reasons which could make
::: assignment slower than construction.
::
:: std::string is a complicated class, with not a few constraints.
:: Implementations try to optimize frequent operations, like the
:: above (with the definition in the loop). In the case of g++,
:: for example, construction of a copy of a string does NOT
:: generally allocate memory, nor copy any text---assignment to the
:: string will almost always copy text. Other implementations
:: don't allocate memory for smaller strings, but just copy the
:: data. And so on.
:
: Well, IMHO whenever copy construction doesn't need copying, neither
: should assignement.

But assignment has the additional problem of dealing with the old
value.

:
: But the real point here is that you were using something like str =
: someFunction() instead of someFunction( str ). Do you see?

No, I don't! :-)

We have a perfectly good and idiomatic piece of code in section 2. It
is simple, easy to read, and actually runs faster. What more could we
ask?

In section 1, the programmer tries some kind of premature optimization
which just complicates the code. That he also gets worse run-time
performance is well deserved!

: In such a
: case, it is much easier to optimize away the creation of the
: temporary in std::string str( someFunction ) than in str =
: someFunction().

Now you are just making the code even more complicated, attempting to
match the performance of the smaller and the simpler code. In
addition, you also force the user of the function to create the target
value before calling the function. This forces me to write

std::string data;
someFunction(data);

whether I have a loop to "optimize" or not!

: Also, in normal cases, optimizations may not be as simple as they
: are with std::string, which is under the compiler control (in fact,
: the implementation of std::string should be near optimal without any
: compiler optimization anyway). Within user-defined classes, copy
: construction may execute startup code that will be cleaned up upon
: destruction, which would not happen with assignement operations, so
: the construction/destruction cycle within a loop should best be
: avoided in most cases when it is not needed, unless assignement is
: specially poorly implemented. In such cases, however, it would be
: better to reimplement the assignement properly. That is the reason
: my advice is to place constructors outside the loop, though
: exceptions to the rule may exist (yours doesn't seem to be one) AND
: to refrain from returning large objects by value, wich would avoid
: the creation of temporaries.

The fact is that std::string has no overhead in its copy constructor;
all it does is store a copy of the other string. On many compilers, it
also has a definite advantage in combination with RVO/NRVO for
value-returning functions.

The assignment operator is much more complicated, as it also has to
decide what to do with the existing value. It doesn't help if you move
the assignment to inside the function, further complicating it by
passing a parameter.
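The difference between the two forms can be made visible with a small instrumented type. This is only a sketch: `Tracer`, `make_tracer` and the `fill_by_*` helpers are hypothetical names introduced here, and the counters say nothing about how any real std::string is implemented. When the function returns a prvalue, the constructing form builds the object in place (guaranteed since C++17, and commonly done by earlier compilers as RVO), while the assigning form still has to run an assignment that overwrites the old value.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical instrumented type: counts how objects receive their value.
struct Tracer {
    static int constructed;   // built directly from a string
    static int copied;        // copy-constructed from another Tracer
    static int assigned;      // overwritten by assignment

    std::string text;

    Tracer() {}
    explicit Tracer(const std::string& s) : text(s) { ++constructed; }
    Tracer(const Tracer& other) : text(other.text) { ++copied; }
    Tracer& operator=(const Tracer& other)
    {
        text = other.text;    // must replace whatever the target held before
        ++assigned;
        return *this;
    }
};

int Tracer::constructed = 0;
int Tracer::copied = 0;
int Tracer::assigned = 0;

inline void reset_counts()
{
    Tracer::constructed = Tracer::copied = Tracer::assigned = 0;
}

// Returns by value; the return expression is a prvalue.
inline Tracer make_tracer()
{
    return Tracer("payload");
}

// Declare first, then assign the function result.
inline std::size_t fill_by_assignment()
{
    Tracer t;
    t = make_tracer();
    return t.text.size();
}

// Construct directly from the function result.
inline std::size_t fill_by_construction()
{
    Tracer t(make_tracer());
    return t.text.size();
}

// Usage example: the constructing form never reaches operator=.
inline void usage_example()
{
    reset_counts();
    fill_by_construction();
    assert(Tracer::constructed == 1 && Tracer::assigned == 0);

    reset_counts();
    fill_by_assignment();
    assert(Tracer::constructed == 1 && Tracer::assigned == 1);
}
```

Counting calls is not the same as measuring time, so this shows only which operations run, not which loop is faster on a given implementation.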
Bo Persson
Nov 4 '07 #14

P: n/a
On 4 Nov., 02:14, Elias Salomão Helou Neto <eshn...@gmail.com> wrote:
On 3 nov, 06:15, James Kanze <james.ka...@gmail.com> wrote:


[...]
It was quite some time ago, but if I recall correctly, it was
something like:
std::string data ;
for ( ... ) {
data = someFunction() ;
}
vs.
for ( ... ) {
std::string data( someFunction() ) ;
}
[...]
What I do not understand are the reasons which could make
assignment slower than construction.
std::string is a complicated class, with not a few constraints.
Implementations try to optimize frequent operations, like the
above (with the definition in the loop). In the case of g++,
for example, construction of a copy of a string does NOT
generally allocate memory, nor copy any text---assignment to the
string will almost always copy text. Other implementations
don't allocate memory for smaller strings, but just copy the
data. And so on.

Well, IMHO whenever copy construction doesn't need copying, neither
should assignement.

But the real point here is that you were using something like str =
someFunction() instead of someFunction( str ). Do you see? In such a
case, it is much easier to optimize away the creation of the temporary
in std::string str( someFunction ) than in str = someFunction(). This
is not the compiler fault, nor it falls under my example. In the
second case, a temporary needed to be created for the assignement to
be possible, while in the former no.
I believe James gave a realistic example, where the object gets
initialised once for every entrance in the loop. After all, since the
object logically belongs inside the loop (or we would not have this
discussion), it should be initialised for every entry.
The bottleneck here most certainly was the creation of the temporary,
not assignement operation versus construction/destruction cycle. It
was not the compiler, after all!
Surely, RVO helps James here - no doubt about that. But I know of no
modern compiler that does not implement RVO and you would be a fool
not to exploit it.
As I see, even gurus like you (and I always appreciate our
enlightening discussions) eventually get lost with c++ subtleties. I
am perhaps naive, but not as much as you may be thinking.

Also, in normal cases, optimizations may not be as simple as they are
with std::string, which is under the compiler control (in fact, the
implementation of std::string should be near optimal without any
compiler optimization anyway). Within user-defined classes, copy
construction may execute startup code that will be cleaned up upon
destruction, which would not happen with assignement operations, so
the construction/destruction cycle within a loop should best be
avoided in most cases when it is not needed, unless assignement is
specially poorly implemented.
There is no reason copy construction should add any overhead compared
to assignment. Actually it is the other way around.

If you have the choice between
Class c; somefunction(c) and
Class c(somefunction()),

the second choice will be faster than the first one. More importantly,
the second choice is shorter and clearer in intent (esp. when the
definition and the initialisation are separated in space). The second
choice is also much easier to write if you care about exception
guarantees. As a matter of fact, you will probably end up writing
somefunction as

void somefunction(Class &c)
{
    Class tmp;
    // initialise tmp
    std::swap(c,tmp);
}
In such cases, however, it would be
better to reimplement the assignement properly. That is the reason my
advice is to place constructors outside the loop, though exceptions to
the rule may exist (yours doesn't seem to be one) AND to refrain from
returning large objects by value, wich would avoid the creation of
temporaries.
The best practice is the opposite of yours: Prefer to return classes
by value, and declare your classes locally. This gives clearer code,
more stable code and - as a side effect - optimises the more common
case, where you actually want to construct and initialise the object
in one go.
There might be exceptions to this rule, of course, but do not exchange
clarity for perceived efficiency unless your profiler shows so.
Elias Salomão Helou neto.
/Peter

Nov 4 '07 #15

P: n/a
On 4 nov, 08:42, "Bo Persson" <b...@gmb.dk> wrote:
But assignment has the additional problem of dealing with the old
value.
As I see, copy on write with reference counting for the data is the
only reason for "the additional problem of dealing with the old value"
you mention (even though your wording is awful).

If so, why would it not hold for copying as well? Think about it!
Copying may be done on write, all right, but why could assignment not
do exactly the same, i.e. copy on write with reference counting for
the data? If it is done that way, then when a temporary argument is
passed to either the copy constructor or the assignment operator,
actual copying of the data would never take place.
: But the real point here is that you were using something like str =
: someFunction() instead of someFunction( str ). Do you see?

No, I don't! :-)

We have a perfectly good and idiomatic piece of code in section 2. It
is simple, easy to read, and actualy runs faster. What more could we
ask??
I ask that it be compared to what it should be compared to, not to an
oversimplified version. I mean that str( someFunction() ) should not be
benchmarked against str = someFunction() when optimization is turned
on because, in the former, the temporaries are rather easy to
eliminate, while in the latter they are not!
In section 1, the programmer tries some kind of premature optimzation
which just complicates the code. That he also gets worse run-time
performance, is well deserved!
Is it well deserved to receive a performance penalty solely for
deviating from an idiom? Now, an affirmative answer to this would be
dumb!

You should realize, and nobody will be able to seriously disagree with
me on this, that there is no reason for (now, this is the correct
comparison) my RVO version to run slower than construction.

Remember that if we want to return an object, it must be created
within the function (and copied to be returned), but both extra
constructions can be optimized away. Actually, your glorified idiom is
nothing but a way to support such optimization to be done by the
compiler!
: In such a
: case, it is much easier to optimize away the creation of the
: temporary in std::string str( someFunction ) than in str =
: someFunction().

Now you are just making the code even more complicated, attempting to
match the performance of the smaller and the simpler code. In
addition, you also force the user of the function to create the target
value before calling the function. This forces me to write

std::string data;
someFunction(data);

whether I have a loop to "optimize" or not!
Overloading the someFunction would allow both options to be used.
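A minimal sketch of such an overload pair follows. The body of `someFunction` and the string it returns are placeholders invented for illustration; the thread never shows the real function. The out-parameter overload is written on top of the by-value one and swaps the finished value in, so the caller's string keeps its old contents if building the new value throws.

```cpp
#include <cassert>
#include <string>

// Hypothetical producer; stands in for the thread's someFunction().
inline std::string someFunction()
{
    return std::string("fresh value");
}

// Out-parameter overload built on top of the by-value one.
// The new value is completed in a local first and then swapped in.
inline void someFunction(std::string& out)
{
    std::string tmp(someFunction());
    out.swap(tmp);
}

// Usage example: both call styles end up with the same value.
inline void usage_example()
{
    std::string a(someFunction());   // construct from the result
    std::string b("old value");
    someFunction(b);                 // overwrite an existing object
    assert(a == b);
}
```

Whether offering both forms is worth the extra interface surface is a design choice; the sketch only shows that the two can coexist without ambiguity.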
: Also, in normal cases, optimizations may not be as simple as they
: are with std::string, which is under the compiler control (in fact,
: the implementation of std::string should be near optimal without any
: compiler optimization anyway). Within user-defined classes, copy
: construction may execute startup code that will be cleaned up upon
: destruction, which would not happen with assignement operations, so
: the construction/destruction cycle within a loop should best be
: avoided in most cases when it is not needed, unless assignement is
: specially poorly implemented. In such cases, however, it would be
: better to reimplement the assignement properly. That is the reason
: my advice is to place constructors outside the loop, though
: exceptions to the rule may exist (yours doesn't seem to be one) AND
: to refrain from returning large objects by value, wich would avoid
: the creation of temporaries.

The fact is that std::string has no overhead in its copy constructor,
all it does is store a copy of the other string.
I cannot figure out what you are meaning. Should not every copy-
constructed object store a copy of the copied object? I guess you
wanted to mean that they share data (copy on write).
On many compilers, it
also has a definite advantage in combination with RVO/NRVO
optimization for value returning functions.
Really? Why?
The assignment operator is much more complicated, as it also has to
decide what to do with the existing value.
Again! How is it possible that operator= (or any other function)
would need to know what to do with its argument after it has
completed execution (if this is what you mean)? The only reason I can
see for something vaguely like this is, again, copy on write, but I
will not discuss that again.
It doesn't help if you move
the assignment to inside the function, further complicating it by
passing a parameter.
That would be really smart, huh? Did you come up with that by
yourself? Why would anybody do it?

Elias Salomão Helou Neto

Nov 4 '07 #16

P: n/a
On Nov 4, 2:14 am, Elias Salomão Helou Neto <eshn...@gmail.com> wrote:
On 3 nov, 06:15, James Kanze <james.ka...@gmail.com> wrote:
[...]
It was quite some time ago, but if I recall correctly, it
was something like:
std::string data ;
for ( ... ) {
data = someFunction() ;
}
vs.
for ( ... ) {
std::string data( someFunction() ) ;
}
[...]
What I do not understand are the reasons which could make
assignment slower than construction.
std::string is a complicated class, with not a few constraints.
Implementations try to optimize frequent operations, like the
above (with the definition in the loop). In the case of g++,
for example, construction of a copy of a string does NOT
generally allocate memory, nor copy any text---assignment to the
string will almost always copy text. Other implementations
don't allocate memory for smaller strings, but just copy the
data. And so on.
Well, IMHO whenever copy construction doesn't need copying,
neither should assignement.
Well, as you say, that's your opinion (humble or not). You're
free to believe that the earth is flat as well. An objective
analysis of the facts doesn't give you any reason to believe it,
but opinions are opinions. The issues are simply complex enough
that you can't make any assumptions.
But the real point here is that you were using something like
str = someFunction() instead of someFunction( str ). Do you
see?
I don't see where it makes any real difference in my argument.
My argument is simple: for any given case, you can't know until
you've measured it. Guessing is totally unreliable.

As it happens, I know exactly how g++ implements basic_string,
and I know that it wouldn't be too difficult to create a
benchmark in which using someFunction( data ) also runs slower.
Similarly, I could easily create cases where the reverse was
true. But that's neither here nor there. My point remains:
what you (or someone else) naïvely expects to be faster may not
be. Until you've measured the specific case which interests
you, you don't know which solution will be faster.
In such a case, it is much easier to optimize away the
creation of the temporary in std::string str( someFunction() )
than in str = someFunction(). This is not the compiler's fault,
nor does it fall under my example. In the second case, a temporary
needs to be created for the assignment to be possible, while
in the first it does not.
The bottleneck here most certainly was the creation of the
temporary, not the assignment operation versus the
construction/destruction cycle. It was not the compiler, after
all!
That's not what the profiler said. The compiler implements
NRVO, and the code in the function was designed to take
advantage of it.
As I see, even gurus like you (and I always appreciate our
enlightening discussions) eventually get lost with c++
subtleties. I am perhaps naive, but not as much as you may be
thinking.
I'm afraid in this case you're completely wrong.
Also, in normal cases, optimizations may not be as simple as they are
with std::string, which is under the compiler control
It's true in theory. In practice, all of the compilers I know
treat std::basic_string exactly like they do a user defined
class.
(in fact, the implementation of std::string should be near
optimal without any compiler optimization anyway). Within
user-defined classes, copy construction may execute startup
code that will be cleaned up upon destruction, which would not
happen with assignment operations,
In the most frequent idiom for complex classes, the user defined
assignment operator starts by constructing a temporary copy
using the copy constructor. So assignment is very, very likely
to be slower than copy construction. In general, in fact,
assignment is likely to be slower than copy construction.
so the construction/destruction cycle within a loop should
best be avoided in most cases when it is not needed, unless
assignment is especially poorly implemented.
And that is simply wrong. It's a classical example of naïve
premature optimization: replacing clean code with something less
clean on the grounds that it is faster, when you've not
measured, and when in fact it isn't necessarily faster.
In such cases, however, it would be better to reimplement the
assignment properly. That is the reason my advice is to place
constructors outside the loop, though exceptions to the rule
may exist (yours doesn't seem to be one) AND to refrain from
returning large objects by value, which would avoid the
creation of temporaries.
In sum, make your code as unreadable as possible, so that it
can't be optimized later, if the need really does exist, for what
are probably non-issues, and for what (in this case at least)
may actually be a pessimization.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Nov 4 '07 #17
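
An illustration of the idiom mentioned in the post above, where a user-defined assignment operator starts by making a copy through the copy constructor (often called copy-and-swap). This is only a minimal sketch, not code from the thread: the class `Buffer` and all of its members are hypothetical, and it assumes a C++11 compiler.

```cpp
#include <algorithm>  // std::copy
#include <cstddef>    // std::size_t
#include <utility>    // std::swap

// Hypothetical class, for illustration only.
class Buffer {
public:
    explicit Buffer(std::size_t n = 0)
        : size_(n), data_(n ? new double[n]() : nullptr) {}

    // Copy construction: one allocation and one copy; there is no old
    // value to deal with.
    Buffer(const Buffer& other)
        : size_(other.size_),
          data_(other.size_ ? new double[other.size_] : nullptr) {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    // Copy-and-swap assignment: the by-value parameter is already a full
    // copy (made by the copy constructor); we swap with it and let its
    // destructor release our old contents.
    Buffer& operator=(Buffer other) {
        swap(other);
        return *this;
    }

    ~Buffer() { delete[] data_; }

    void swap(Buffer& other) noexcept {
        std::swap(size_, other.size_);
        std::swap(data_, other.data_);
    }

    std::size_t size() const { return size_; }
    double& operator[](std::size_t i) { return data_[i]; }
    const double& operator[](std::size_t i) const { return data_[i]; }

private:
    std::size_t size_;
    double* data_;
};
```

With an operator written this way, `b = a` does everything `Buffer b(a)` does, plus a swap and the destruction of the old contents, which is one reason assignment is not automatically the cheaper operation. Whether that matters in a particular loop still has to be measured.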

Elias Salomão Helou Neto wrote:
: On 4 nov, 08:42, "Bo Persson" <b...@gmb.dk> wrote:
:: But assignment has the additional problem of dealing with the old
:: value.
:
: As I see, copy on write with reference counting for the data is the
: only reason for "the additional problem of dealing with the old
: value" you mention (even though your wording is awful).

This has nothing to do with reference counting. The problem with
assignment is that the string assigned to already has a value. What
are we going to do with that? How long does that take?

(and I'm not trying to win a literature prize)

: If so, why would not it hold as well for copying? Think about it!
: Copying may be done on write, all right, but why could not
: assignment do exactly the same, i.e. copy on write with reference
: counting for the data? If it is done so, when a temporary argument
: is passed for either the copy constructor or the assignment
: operator, actual copying of the data would never take place.
:
::: But the real point here is that you were using something like str
::: = someFunction() instead of someFunction( str ). Do you see?
::
:: No, I don't! :-)
::
:: We have a perfectly good and idiomatic piece of code in section 2.
:: It is simple, easy to read, and actually runs faster. What more
:: could we ask??
:
: I ask to compare it to what it should be compared to, not to an
: oversimplified version. I mean that str( someFunction() ) is not to
: be benchmarked against str = someFunction() when optimization is
: turned on because, in the former, temporaries are rather easy to be
: eliminated, while in the second no!

I don't get this one.

:
:: In section 1, the programmer tries some kind of premature
:: optimization which just complicates the code. That he also gets
:: worse run-time performance, is well deserved!
:
: Is it well deserved to receive a performance penalty solely for
: deviating from an idiom? Now, an affirmative answer to this would be
: dumb!

Ok. :-)

Trying to outsmart the compiler, and getting slower code is well
deserved IMO.

:
::: Also, in normal cases, optimizations may not be as simple as they
::: are with std::string, which is under the compiler control (in
::: fact, the implementation of std::string should be near optimal
::: without any compiler optimization anyway). Within user-defined
::: classes, copy construction may execute startup code that will be
::: cleaned up upon destruction, which would not happen with
::: assignment operations, so the construction/destruction cycle
::: within a loop should best be avoided in most cases when it is not
::: needed, unless assignment is especially poorly implemented. In
::: such cases, however, it would be better to reimplement the
::: assignment properly. That is the reason my advice is to place
::: constructors outside the loop, though exceptions to the rule may
::: exist (yours doesn't seem to be one) AND to refrain from
::: returning large objects by value, which would avoid the creation
::: of temporaries.
::
:: The fact is that std::string has no overhead in its copy
:: constructor, all it does is store a copy of the other string.
:
: I cannot figure out what you are meaning. Should not every copy-
: constructed object store a copy of the copied object? I guess you
: wanted to mean that they share data (copy on write).

No, I mean that constructing a string object from scratch can be
faster than destroying the old value, and then copying the new one.
Bo Persson
Nov 4 '07 #18

This has nothing to do with reference counting. The problem with
assignment is that the string assigned to already has a value. What
are we going to do with that? How long does that take?
Now I see what you meant!

Release memory, not longer than destroying the object every iteration
of the loop, right?
(and I'm not trying to win a literature prize)
But you are trying to be understood.
: If so, why would not it hold as well for copying? Think about it!
: Copying may be done on write, all right, but why could not
: assignment do exactly the same, i.e. copy on write with reference
: counting for the data? If it is done so, when a temporary argument
: is passed for either the copy constructor or the assignment
: operator, actual copying of the data would never take place.
:
::: But the real point here is that you were using something like str
::: = someFunction() instead of someFunction( str ). Do you see?
::
:: No, I don't! :-)
::
:: We have a perfectly good and idiomatic piece of code in section 2.
:: It is simple, easy to read, and actually runs faster. What more
:: could we ask??
:
: I ask to compare it to what it should be compared to, not to an
: oversimplified version. I mean that str( someFunction() ) is not to
: be benchmarked against str = someFunction() when optimization is
: turned on because, in the former, temporaries are rather easy to be
: eliminated, while in the second no!

I don't get this one.
What can I say, then? Try to squeeze your brain.
:
:: In section 1, the programmer tries some kind of premature
:: optimization which just complicates the code. That he also gets
:: worse run-time performance, is well deserved!
:
: Is it well deserved to receive a performance penalty solely for
: deviating from an idiom? Now, an affirmative answer to this would be
: dumb!

Ok. :-)

Trying to outsmart the compiler, and getting slower code is well
deserved IMO.
When did I try to outsmart the compiler?
::: Also, in normal cases, optimizations may not be as simple as they
::: are with std::string, which is under the compiler control (in
::: fact, the implementation of std::string should be near optimal
::: without any compiler optimization anyway). Within user-defined
::: classes, copy construction may execute startup code that will be
::: cleaned up upon destruction, which would not happen with
::: assignment operations, so the construction/destruction cycle
::: within a loop should best be avoided in most cases when it is not
::: needed, unless assignment is especially poorly implemented. In
::: such cases, however, it would be better to reimplement the
::: assignment properly. That is the reason my advice is to place
::: constructors outside the loop, though exceptions to the rule may
::: exist (yours doesn't seem to be one) AND to refrain from
::: returning large objects by value, which would avoid the creation
::: of temporaries.
::
:: The fact is that std::string has no overhead in its copy
:: constructor, all it does is store a copy of the other string.
:
: I cannot figure out what you are meaning. Should not every copy-
: constructed object store a copy of the copied object? I guess you
: wanted to mean that they share data (copy on write).

No, I mean that constructing a string object from scratch can be
faster than destroying the old value, and then copying the new one.
Well, again you forget that your idiom has an implied destruction of
the object at every loop iteration, resulting in the need to deal with
exactly the same problem! How could that be different?

I will give you an example. Take the following two simple programs:

//Program 1:
#include <string>

std::string myFunction()
{
    std::string str;
    for ( unsigned i( 0 ); i < 1000; ++i )
        str.append( "supercalifragilisomethingidonotremebmberandd"
                    "donotwantotsearchintheinternet" );

    return( str );
}

int main()
{
    for( unsigned i( 0 ); i < 100000; ++i )
        std::string str( myFunction() );

    return( 0 );
}

//Program 2:
#include <string>

void myFunction( std::string& str )
{
    str.clear();
    for ( unsigned i( 0 ); i < 1000; ++i )
        str.append( "supercalifragilisomethingidonotremebmberandd"
                    "donotwantotsearchintheinternet" );
}

int main()
{
    std::string str;
    for( unsigned i( 0 ); i < 100000; ++i )
        myFunction( str );

    return( 0 );
}

According to you, Program 1 should run faster, right? But it is just
the opposite. Compiling both with no optimization (the default) using
gcc 4.1.2 20070502 Program 1 takes around 21 seconds to run against
around 15 seconds for Program 2. Now, let us turn optimization to its
higher level and see what happens. With the -O3 flag used when
compiling, Program 1's execution time falls to around 19 seconds,
while Program 2 goes down to amazing 12 seconds! Can you explain me
that?

It's time for another listing:

//Program 3:
#include <string>

std::string myFunction()
{
    std::string str;
    for ( unsigned i( 0 ); i < 1000; ++i )
        str.append( "supercalifragilisomethingidonotremebmberandd"
                    "donotwantotsearchintheinternet" );

    return( str );
}

int main()
{
    std::string str;
    for( unsigned i( 0 ); i < 100000; ++i )
        str = myFunction();

    return( 0 );
}

Program 3 takes little more than 17 seconds to run without
optimization turned on, explain it to me, please. When optimized, it
will take around 15 seconds to run.

Even though it is a contrived example, it shows who knows what he is
talking about here. And I, for sure, did not have to look for some odd
example; it was the first one I tried.

From now on, refrain from making statements without the required
knowledge, all right? And always remember to try things out before
blindly believing what people say to you (start by trying the
listings here).

Elias Salomão Helou Neto

Nov 4 '07 #19
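
The three listings in the post above were timed as separate programs. One possible way to compare the same three variants inside a single program is sketched below. It is an assumption-laden sketch rather than the poster's setup: the names `buildByValue`, `buildInPlace`, `secondsFor`, `Timings` and `runAll` are invented for illustration, the timer is `std::chrono::steady_clock` (C++11), and each string's size is accumulated so that an optimizer has less room to discard the work.

```cpp
#include <chrono>
#include <cstddef>
#include <string>

// Same literal as in the listings (spelling preserved on purpose).
const char* const kChunk = "supercalifragilisomethingidonotremebmberandd"
                           "donotwantotsearchintheinternet";

// Variant used by Programs 1 and 3: build and return by value.
std::string buildByValue() {
    std::string str;
    for (unsigned i = 0; i < 1000; ++i) str.append(kChunk);
    return str;
}

// Variant used by Program 2: refill a caller-owned string.
void buildInPlace(std::string& str) {
    str.clear();
    for (unsigned i = 0; i < 1000; ++i) str.append(kChunk);
}

// Runs `body` `rounds` times and returns the elapsed wall time in seconds.
template <typename F>
double secondsFor(unsigned rounds, F body) {
    const auto start = std::chrono::steady_clock::now();
    for (unsigned i = 0; i < rounds; ++i) body();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

struct Timings {
    double construct;      // std::string s( buildByValue() ) per iteration
    double outParam;       // buildInPlace( reused ) per iteration
    double assign;         // assigned = buildByValue() per iteration
    std::size_t checksum;  // total characters produced by all variants
};

Timings runAll(unsigned rounds) {
    Timings t{};
    std::size_t total = 0;

    t.construct = secondsFor(rounds, [&] {
        std::string s(buildByValue());
        total += s.size();
    });

    std::string reused;
    t.outParam = secondsFor(rounds, [&] {
        buildInPlace(reused);
        total += reused.size();
    });

    std::string assigned;
    t.assign = secondsFor(rounds, [&] {
        assigned = buildByValue();
        total += assigned.size();
    });

    t.checksum = total;
    return t;
}
```

Calling `runAll(100000)` from `main` and printing the three fields reproduces the shape of the experiment. The numbers will depend on the compiler, the standard library's string implementation and the optimization flags, so no particular ordering should be expected in advance.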

On Nov 5, 2:32 am, Elias Salomão Helou Neto <eshn...@gmail.com> wrote:

[...]
In section 1, the programmer tries some kind of premature
optimization which just complicates the code. That he also
gets worse run-time performance, is well deserved!
Is it well deserved to receive a performance penalty solely
for deviating from an idiom? Now, an affirmative answer to
this would be dumb!
It's expecting automatically that one idiom will be faster than
the other, without actually having measured the specific case in
question, which is dumb. It's choosing a particular idiom on
the grounds that it will be faster without having measured.
Of course it is, but people here seem to religiously prefer T
t( someFunction() ) over someFunction( t ) even within tight
loops and for absolutely every class. So, are they dumb?
It's dumb to choose an idiom on the basis of supposed performance
issues, without having first determined that there is a need,
and then measured to ensure that changing the idiom improves
things. In the past, I've actually changed one or two functions
to use your idiom. Because the profiler said I had to.
Depending on the class, the function, etc., it can make a
difference.
And anyone with any real experience will automatically
disagree when you claim differences without actually having
measured, and will disagree that the measurements you made
for one case apply to the next.
But you measured, right?
I measured one particular case. I measured intentionally in
answer to a posting which was making a similar claim to yours;
that declaring the variable outside the loop was faster.

In the past, I've profiled once or twice when the code was too
slow. In at least one case, using your technique made a
significant improvement. In others, it made no change, or even
made things worse. Choosing the solution "up front" because it
"will be faster" is counter productive; most of the time, it
doesn't matter, and when it does, you don't know up front which
will be faster. (Obviously, there are exceptions. If you're
constantly writing similar applications, using the same classes,
and you've already had to fix two or three in the same way,
well, experience is there for us to learn from. But you still
have to be aware that any changes in the implementation of
anything could invalidate your experience, and be prepared to
remeasure---even in the simple case of a compiler upgrade.)
I was trying to understand the reasons for something that
seemed unlikely to happen, at least from my point of view.
Off hand, I don't know the reasons. I knew them at the time,
but I've forgotten them. I do know that the C++ object model is
not always trivial, and that even in the case of simple code in
a simple language, it's not usually possible to predict where
the bottlenecks will be in advance. (Again, with some
exceptions: if you know in advance that you'll have to process
several hundred thousand elements, or more, it seems a safe
guess that an O(n!) won't cut it, and that even an O(n^2) will
probably be a bottleneck. But I'm very sceptical of people who
claim different k's for the same big O, without measuring.)
In fact, I am still not convinced, as I have never actually
seen an example in which "repeated construction/destruction
cycles" could not be replaced with performance gain by
"creating outside the loop and passing the object as
reference". There may be, I look forward to see, but have
never seen. Perhaps, with your knowledge of std::string
internals, you could craft one for us. I would appreciate
that.
It would depend on the exact implementation of basic_string, and
it's been some time since I last looked into it.
You claim to have an ancient piece of code where your idiom
used to perform better than "creating once and repeatedly
assigning", but this is not what I want.
Let us make things clear. I am not here claiming that this is
always the way to go. One must, for sure, try several
solutions. However trying every possibility is not usually
feasible, and one should try those which are more likely to
work well. This is why it is worth to know why your code
behaved like that, if it could have been improved, and such.
One must only start trying when there is a problem. In most
domains, that's very rarely---most applications don't deal with
extremely large sets of data in memory. For those that do, you
can't really make any assumptions before profiling. Except
those concerning big-O---if the data set is large enough, an
O(n^2) implementation will be measurably slower than an O(n)
one, regardless of any other implementation details. But even
then... it takes a fair amount of data for the difference
between O(n) and O(n lg n) to become significant. And you'd be
surprised at how fast just copying can be on some machines. Off
hand, I'd expect returning something like an
std::vector<double>( 1000000 ) to be expensive, but in many
cases, the difference is significantly less than the time it
will take to generate the data anyhow.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Nov 5 '07 #20

On 5 nov, 07:53, James Kanze <james.ka...@gmail.com> wrote:
On Nov 5, 2:32 am, Elias Salomão Helou Neto <eshn...@gmail.com> wrote:

[...]
In section 1, the programmer tries some kind of premature
optimization which just complicates the code. That he also
gets worse run-time performance, is well deserved!
Is it well deserved to receive a performance penalty solely
for deviating from an idiom? Now, an affirmative answer to
this would be dumb!
It's expecting automatically that one idiom will be faster than
the other, without actually having measured the specific case in
question, which is dumb. It's choosing a particular idiom on
the grounds that it will be faster without having measured.
Of course it is, but people here seem to religiously prefer T
t( someFunction() ) over someFunction( t ) even within tight
loops and for absolutely every class. So, are they dumb?

It's dumb to choose an idiom on the basis of supposed performance
issues, without having first determined that there is a need,
and then measured to ensure that changing the idiom improves
things. In the past, I've actually changed one or two functions
to use your idiom. Because the profiler said I had to.
Depending on the class, the function, etc., it can make a
difference.
Alright, you have made your point. However, as you said, unless the
profiler says the opposite, an idiom is mostly a matter of choice
(code readability is important, but whether some piece of code is more
readable than another is a matter of how used you are to each idiom).
That may be untrue in a few cases, but my own experience says that it
is worth sticking to the choice I have made.
And anyone with any real experience will automatically
disagree when you claim differences without actually having
measured, and will disagree that the measurements you made
for one case apply to the next.
But you measured, right?

I measured one particular case. I measured intentionally in
answer to a posting which was making a similar claim to yours;
that declaring the variable outside the loop was faster.
While the mentioned posting may have had a similar wording in its
claim, it was far from being the same. I am not only avoiding the
construction/destruction cycle, but also avoiding to return by value.
It is not only declaring outside the loop, but I see you've already
got it.

Still, my examples have shown that declaring outside and repeatedly
assigning provided a performance boost over declaring within the loop,
at least in that extremely simple case using std::string. If I had the
time I would investigate that further, for the difference is much
larger than it should be (likely because destruction releases memory
that will have to be reallocated, while assignment, in this case where
the string never grows, does not).

It is worth repeating, I am not saying that declaring outside should
be preferred over copy-construction/destruction. In fact, my feeling
is that they should be nearly the same!
In the past, I've profiled once or twice when the code was too
slow. In at least one case, using your technique made a
significant improvement. In others, it made no change, or even
made things worse. Choosing the solution "up front" because it
"will be faster" is counter productive; most of the time, it
doesn't matter, and when it does, you don't know up front which
will be faster. (Obviously, there are exceptions. If you're
constantly writing similar applications, using the same classes,
and you've already had to fix two or three in the same way,
well, experience is there for us to learn from. But you still
have to be aware that any changes in the implementation of
anything could invalidate your experience, and be prepared to
remeasure---even in the simple case of a compiler upgrade.)
I guess we should say that there is no "simple code performance
question" after all. But it is also counter productive to choose a
style based solely on code readability, being this a secondary issue
in some domains, AND to believe that this idiom cannot degrade
performance (this is obviously not your case, but there are many who
do think this way).
I was trying to understand the reasons for something that
seemed unlikely to happen, at least from my point of view.

Off hand, I don't know the reasons. I knew them at the time,
but I've forgotten them. I do know that the C++ object model is
not always trivial, and that even in the case of simple code in
a simple language, it's not usually possible to predict where
the bottlenecks will be in advance. (Again, with some
exceptions: if you know in advance that you'll have to process
several hundred thousand elements, or more, it seems a safe
guess that an O(n!) won't cut it, and that even an O(n^2) will
probably be a bottleneck. But I'm very sceptical of people who
claim different k's for the same big O, without measuring.)
Our point here does not even regard operation count, it is more like
implementation details for some already chosen algorithm.
In fact, I am still not convinced, as I have never actually
seen an example in which "repeated construction/destruction
cycles" could not be replaced with performance gain by
"creating outside the loop and passing the object as
reference". There may be, I look forward to see, but have
never seen. Perhaps, with your knowledge of std::string
internals, you could craft one for us. I would appreciate
that.

It would depend on the exact implementation of basic_string, and
it's been some time since I last looked into it.
Hum...
You claim to have an ancient piece of code where your idiom
used to perform better than "creating once and repeatedly
assigning", but this is not what I want.
Let us make things clear. I am not here claiming that this is
always the way to go. One must, for sure, try several
solutions. However trying every possibility is not usually
feasible, and one should try those which are more likely to
work well. This is why it is worth to know why your code
behaved like that, if it could have been improved, and such.

One must only start trying when there is a problem. In most
domains, that's very rarely---most applications don't deal with
extremely large sets of data in memory. For those that do, you
can't really make any assumptions before profiling. Except
those concerning big-O---if the data set is large enough, an
O(n^2) implementation will be measurably slower than an O(n)
one, regardless of any other implementation details. But even
then... it takes a fair amount of data for the difference
between O(n) and O(n lg n) to become significant. And you'd be
surprised at how fast just copying can be on some machines. Off
hand, I'd expect returning something like an
std::vector<double>( 1000000 ) to be expensive, but in many
cases, the difference is significantly less than the time it
will take to generate the data anyhow.
Well, at least in my problem domain, implementation details do matter.
The algorithms are out there (even those you invented, it is likely
that you will want to make public through a paper); once you have
chosen one, either based on its convergence rate, reconstruction
quality, numerical stability or any other reasons, you have to
actually provide a sound implementation. This, for sure, means heavy
experimentation, but it also precludes any unnecessary copying of the
data, whether it is 100 or 1000000 elements; even before profiling I
am quite sure I should avoid that.

In many cases, however, there is no need for extreme performance and
such attention to details would make the code unnecessarily hard to
read and lead to an extra time budget, which is not always available.
This is, by the way, the reason for many to advocate Java, Python,
etc. over C++ for such applications.

I have chosen C++ because, even if sometimes sacrificing code
readability and having to pay lots of attention to the details, code
written in C++ can be made as fast as "pure" C code, but with much
more expressive power.

All said, I guess we agree on most points and the main one is:

"There is no simple code performance question, so always ask the
profiler."

Though experience will be handy when looking for solutions.

Elias Salomão Helou Neto

Nov 5 '07 #21

According to you, Program 1 should run faster, right? But it is just
the opposite. Compiling both with no optimization (the default) using
gcc 4.1.2 20070502 Program 1 takes around 21 seconds to run against
around 15 seconds for Program 2. Now, let us turn optimization to its
higher level and see what happens. With the -O3 flag used when
compiling, Program 1's execution time falls to around 19 seconds,
while Program 2 goes down to amazing 12 seconds! Can you explain me
that?
I ran the same test as you did and I can confirm your result. I didn't
bother without optimization in any case.

However, since std::string::append did strike me as a bit unorthodox
(appending is the first thing you thought of?), I took the liberty of
running the same tests with:
str = "supercalifragili...";
instead of appending. With optimization -O3 on gcc 4.2.1, program 1
ran at 26.5 sec. Program 2 ran at 28.0 sec. If I remove the
"str.clear();" line in Program 2, I get 26.4sec.

(Yeah slow PC)

Can you explain me that? :)

Hint hint: Program 2 returns a reference (to an increasingly large
std::string), so the two programs are not fair imo.

On another similar note. Which one would you _prefer_?

for (std::list<...>::iterator i = myList.begin();
i != myList.end(); ++i)
{}

versus

std::list<...>::iterator i = myList.begin();
std::list<...>::iterator j = myList.end();
for ( ; i != j ; ++i)
{}

Nov 5 '07 #22

Ioannis Gyftos wrote:
[..]
On another similar note. Which one would you _prefer_?

for (std::list<...>::iterator i = myList.begin();
i != myList.end(); ++i)
{}

versus

std::list<...>::iterator i = myList.begin();
std::list<...>::iterator j = myList.end();
for ( ; i != j ; ++i)
{}
Actually, I probably would prefer the former (for a list, of
course), since there is no difference, AFAIK (unless profiling
tells me otherwise).

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Nov 5 '07 #23

Ioannis Gyftos <io************@gmail.com> wrote in news:1194279646.643915.166410@o38g2000hse.googlegroups.com:
On another similar note. Which one would you _prefer_?

for (std::list<...>::iterator i = myList.begin();
i != myList.end(); ++i)
{}

versus

std::list<...>::iterator i = myList.begin();
std::list<...>::iterator j = myList.end();
for ( ; i != j ; ++i)
{}

I've used:

for (std::list<...>::iterator i = myList.begin(), iEnd = myList.end();
     i != iEnd; ++i)
{}

in the past. It gives the best of both worlds. However, one should recognize
that this is an optimisation that will make very little difference in most
cases, but somehow calling a function the minimum number of times necessary
satisfies an inner urge of mine. :)

joe
Nov 5 '07 #24

Joe Greer wrote:
[..] somehow calling a function the minimum
number of times necessary satisfies an inner urge of mine. :)
:-)

It's an inherent mistrust between a human and a machine. In most
cases the compiler should replace the function call with the value
it returns (hopefully just a null pointer). If you trust your
compiler, "e = lst.end(); i < e;" and "; i < lst.end();" should be
the same to you.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Nov 5 '07 #25

On Nov 5, 6:44 pm, "Victor Bazarov" <v.Abaza...@comAcast.net> wrote:
Joe Greer wrote:
[..] somehow calling a function the minimum
number of times necessary satisfies an inner urge of mine. :)
:-)
It's an inherent mistrust between a human and a machine. In most
cases the compiler should replace the function call with the value
it returns (hopefully just a null pointer). If you trust your
compiler, "e = lst.end(); i < e;" and "; i < lst.end();" should be
the same to you.
Interestingly enough, the benchmarks I've run suggest that they
aren't. Moving the call to end() out of the loop does speed it
up, at least with g++ on Sun Sparc.

Of course, the difference is very, very small, so unless you're
doing almost nothing in the loop, there's no point in
programming to it. I normally use the comparison it != c.end()
everywhere, and it's yet to cause a real performance problem.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Nov 5 '07 #26
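
The two loop forms being compared in the last few posts can be put side by side as follows. This is a small sketch for illustration only: the function names and the choice of summing a `std::list<int>` are assumptions, not anything taken from the thread.

```cpp
#include <list>

// Form 1: end() is called in the loop condition on every iteration.
long sumCallingEnd(const std::list<int>& values) {
    long sum = 0;
    for (std::list<int>::const_iterator it = values.begin();
         it != values.end(); ++it)
        sum += *it;
    return sum;
}

// Form 2: end() is called once; both iterators stay local to the loop.
long sumCachedEnd(const std::list<int>& values) {
    long sum = 0;
    for (std::list<int>::const_iterator it = values.begin(),
                                        last = values.end();
         it != last; ++it)
        sum += *it;
    return sum;
}
```

Both functions compute the same result; any speed difference is the kind of thing that has to be measured for a given compiler and container. Note that caching the end iterator is only safe when the loop body does not modify the container in a way that invalidates it.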

"Victor Bazarov" <v.********@comAcast.net> wrote in news:fgnkqe$8og$1@news.datemas.de:
Joe Greer wrote:
[..] somehow calling a function the minimum
number of times necessary satisfies an inner urge of mine. :)

:-)

It's an inherent mistrust between a human and a machine. In most
cases the compiler should replace the function call with the value
it returns (hopefully just a null pointer). If you trust your
compiler, "e = lst.end(); i < e;" and "; i < lst.end();" should be
the same to you.
Sadly, it's experience that causes this mistrust. :) Since the form I use
is pretty much idiomatic to me and is no more expensive, I tend to use it.

joe
Nov 5 '07 #27

Elias Salomão Helou Neto wrote:
:
: Well, again you forget that your idiom has an implied destruction of
: the object at every loop iteration, resulting in the need to deal
: with exactly the same problem! How could that be different?
:
: I will give you an example. Take the following two simple programs:
:
: //Program 1:
: #include <string>
:
: std::string myFunction()
: {
: std::string str;
: for ( unsigned i( 0 ); i < 1000; ++i )
: str.append( "supercalifragilisomethingidonotremebmberandd"
: "donotwantotsearchintheinternet" );
:
: return( str );
: }
:
: int main()
: {
: for( unsigned i( 0 ); i < 100000; ++i )
: std::string str( myFunction() );
:
: return( 0 );
: }
:
: //Program 2:
: #include <string>
:
: void myFunction( std::string& str )
: {
: str.clear();
: for ( unsigned i( 0 ); i < 1000; ++i )
: str.append( "supercalifragilisomethingidonotremebmberandd"
: "donotwantotsearchintheinternet" );
: }
:
: int main()
: {
: std::string str;
: for( unsigned i( 0 ); i < 100000; ++i )
: myFunction( str );
:
: return( 0 );
: }
:
: According to you, Program 1 should run faster, right? But it is just
: the opposite. Compiling both with no optimization (the default)
: using gcc 4.1.2 20070502 Program 1 takes around 21 seconds to run
: against around 15 seconds for Program 2. Now, let us turn
: optimization to its higher level and see what happens. With the -O3
: flag used when compiling, Program 1's execution time falls to
: around 19 seconds, while Program 2 goes down to amazing 12 seconds!
: Can you explain me that?

Yes, you are benchmarking the memory allocation for std::string.

On my machine, using another compiler, I get:

Program 1: 22.5 s
Program 2: 3.3 s

Then I notice that Program 2 reuses the same internal string buffer
for all calls, saving calls to the string growth code for the last
99,999 calls. To even the score a bit, I add a "str.reserve(100000)"
to myFunction.
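
As a sketch, the modified function presumably looks something like the following, shown for Program 2. Only the `reserve` call itself is quoted above, so its exact placement is an assumption, and the name `myFunction2B` is made up for this illustration.

```cpp
#include <string>

// Program 2 with an up-front reserve, so that the appends do not have to
// grow the buffer step by step.
void myFunction2B( std::string& str )
{
    str.clear();
    str.reserve( 100000 );   // assumed placement of the added call
    for ( unsigned i( 0 ); i < 1000; ++i )
        str.append( "supercalifragilisomethingidonotremebmberandd"
                    "donotwantotsearchintheinternet" );
}
```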

Program 1B: 3.5 s
Program 2B: 3.4 s

: It's time for another listing:
:
: //Program 3:
: #include <string>
:
: std::string myFunction()
: {
: std::string str;
: for ( unsigned i( 0 ); i < 1000; ++i )
: str.append( "supercalifragilisomethingidonotremebmberandd"
: "donotwantotsearchintheinternet" );
:
: return( str );
: }
:
: int main()
: {
: std::string str;
: for( unsigned i( 0 ); i < 100000; ++i )
: str = myFunction();
:
: return( 0 );
: }
:
: Program 3 takes little more than 17 seconds to run without
: optimization turned on, explain it to me, please. When optimized, it
: will take around 15 seconds to run.

On my machine it takes 24 s unmodified.

Adding the same "str.reserve(100000)" to myFunction.

Program 3B: 5.6 s

Rewriting main, making it equivalent to Program 1:

int main()
{
    for( unsigned i( 0 ); i < 100000; ++i )
        std::string str = myFunction();

    return( 0 );
}

Program 3C: 3.5 s

The last case shows that, in this test, constructing a new string on
each iteration is faster than assigning a new value to an existing
string.

Bo Persson
Nov 5 '07 #28

P: n/a
On 5 nov, 14:20, Ioannis Gyftos <ioannis.gyf...@gmail.com> wrote:
>> According to you, Program 1 should run faster, right? But it is just
>> the opposite. Compiling both with no optimization (the default) using
>> gcc 4.1.2 20070502 Program 1 takes around 21 seconds to run against
>> around 15 seconds for Program 2. Now, let us turn optimization to its
>> higher level and see what happens. With the -O3 flag used when
>> compiling, Program 1's execution time falls to around 19 seconds,
>> while Program 2 goes down to amazing 12 seconds! Can you explain me
>> that?
>
> I ran the same test as you did and I can confirm your result. I didn't
> bother without optimization in any case.
>
> However, since std::string::append did strike me as a bit unorthodox
> (appending is the first you thought of?), i took the liberty to run

Yes, because I wanted a huge string, not only to mimic assignment.
Without delving into obscure std::string methods, how do you create a
huge std::string without appending content to it?

> the same tests with:
> str = "supercalifragili...";
> instead of appending. With optimization -O3 on gcc 4.2.1, program 1
> ran at 26.5 sec. Program 2 ran at 28.0 sec. If I remove the
> "str.clear();" line in Program 2, I get 26.4sec.
> (Yeah slow PC)
>
> Can you explain me that? :)

Well, of course clearing the string was not necessary in your case.
The explanation in my example was memory management, as you can easily
deduce from the 3.7 seconds of system time required for Program 1 to
run to completion, compared to 0.3 for Program 2. Under Linux, the
time command gives you this info. Now, in your case both versions had
the same execution time, so there should be no preferred way, and I
will stick to mine. Also, everything being the same, what do you want
me to explain?
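
The user/system split that the time command reports can also be read from inside a program. The following is a minimal sketch, assuming a POSIX system: it uses `getrusage` from `<sys/resource.h>`, which is not part of standard C++, and the `CpuTimes` struct and `cpuTimesSoFar` name are made up for the example.

```cpp
#include <sys/resource.h>   // POSIX header, not standard C++

struct CpuTimes
{
    double userSeconds;     // CPU time spent in user mode
    double systemSeconds;   // CPU time spent in the kernel on the process's behalf
};

// Returns the CPU time consumed so far by the calling process.
// Both fields stay at zero if getrusage reports an error.
CpuTimes cpuTimesSoFar()
{
    struct rusage ru;
    CpuTimes t = { 0.0, 0.0 };
    if ( getrusage( RUSAGE_SELF, &ru ) == 0 )
    {
        t.userSeconds   = ru.ru_utime.tv_sec + ru.ru_utime.tv_usec / 1e6;
        t.systemSeconds = ru.ru_stime.tv_sec + ru.ru_stime.tv_usec / 1e6;
    }
    return t;
}
```

Calling it before and after the loop in main and subtracting gives the cost of the loop alone, separated into the two components.
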

> Hint hint: Program 2 returns a reference (to an increasingly large
> std::string), so the two programs are not fair imo.

Neither the program nor the void myFunction returns any reference, so
I don't see what you mean.

> On another similar note. Which one would you _prefer_?
>
> for (std::list<...>::iterator i = myList.begin();
> i != myList.end(); ++i)
> {}
>
> versus
>
> std::list<...>::iterator i = myList.begin();
> std::list<...>::iterator j = myList.end();
> for ( ; i != j ; ++i)
> {}

The former. As Bazarov has already noted, both should be equivalent
unless the compiler does any calculation to arrive at myList.end(),
which is unlikely.

On the other hand, it could be interesting to avoid things like:

for ( std::vector<...>::size_type i( 0 ); i < myVector.size(); ++i );

because the size() member is likely to end up being inline expanded to
myVector.end() - myVector.begin(), so saving it in a variable could be
a good idea in performance sensitive applications. In such cases I do
write:

std::vector<...>::size_type size( myVector.size() );
for ( std::vector<...>::size_type i( 0 ); i < size; ++i );

Notice that the second form _cannot_ be much slower than the first (it
will be at most one integer variable creation slower), but can
potentially avoid thousands of recalculations. When you come across a
loop that only executes one subtraction within each iteration (and it
does happen a lot to me), the first form will double the execution
time unless it is optimized by the compiler (which would result in
code similar to mine), but such an optimization may not be an easy
task for it.
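
A complete version of the second form might look like the sketch below. The function name and the `int` element type are assumptions, since the element type is elided above. The hoisting is only valid when the loop body does not change the size of the vector.

```cpp
#include <vector>

// Sums a vector by index, reading size() once before the loop.
long sumByIndex( const std::vector<int>& myVector )
{
    long sum = 0;
    const std::vector<int>::size_type size( myVector.size() );
    for ( std::vector<int>::size_type i( 0 ); i < size; ++i )
        sum += myVector[ i ];
    return sum;
}
```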

Of course using iterators here would be much more elegant, but some
algorithms are best implemented with indexes. It is much easier to
use indexes when working with matrices representing images than to
work with iterators. By the way, why is there no std::matrix? It makes
me unhappy.
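
For illustration, an index-based matrix is easy to sketch on top of std::vector. The `Matrix` class below is hypothetical, not part of the standard library or of any code in this thread; it assumes row-major storage and does no bounds checking.

```cpp
#include <cstddef>
#include <vector>

// Minimal matrix: all elements live in one std::vector, stored row by row.
template < typename T >
class Matrix
{
public:
    Matrix( std::size_t rows, std::size_t cols, const T& init = T() )
        : rows_( rows ), cols_( cols ), data_( rows * cols, init ) {}

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

    // Element access by (row, column); the caller must stay in range.
    T& operator()( std::size_t r, std::size_t c ) { return data_[ r * cols_ + c ]; }
    const T& operator()( std::size_t r, std::size_t c ) const { return data_[ r * cols_ + c ]; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector< T > data_;
};
```

A pixel at row `y` and column `x` is then simply `image( y, x )`, and the row and column counts can be hoisted out of nested loops in the same way as `size()` above.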

Perhaps I am neurotic about performance, but I do need to be. That's
why I chose C++ over Python. If one is to be writing sloppy C++ code,
why not move to Python?

Elias Salomão Helou Neto

Nov 5 '07 #29

P: n/a
On 5 nov, 17:09, "Bo Persson" <b...@gmb.dk> wrote:
> Elias Salomão Helou Neto wrote:
:
: Well, again you forget that your idiom has an implied destruction of
: the object at every loop iteration, resulting in the need to deal
: with exactly the same problem! How could that be different?
:
: I will give you an example. Take the following two simple programs:
:
: //Program 1:
: #include <string>
:
: std::string myFunction()
: {
: std::string str;
: for ( unsigned i( 0 ); i < 1000; ++i )
: str.append( "supercalifragilisomethingidonotremebmberandd"
: "donotwantotsearchintheinternet" );
:
: return( str );
: }
:
: int main()
: {
: for( unsigned i( 0 ); i < 100000; ++i )
: std::string str( myFunction() );
:
: return( 0 );
: }
:
: //Program 2:
: #include <string>
:
: void myFunction( std::string& str )
: {
: str.clear();
: for ( unsigned i( 0 ); i < 1000; ++i )
: str.append( "supercalifragilisomethingidonotremebmberandd"
: "donotwantotsearchintheinternet" );
: }
:
: int main()
: {
: std::string str;
: for( unsigned i( 0 ); i < 100000; ++i )
: myFunction( str );
:
: return( 0 );
: }
:
: According to you, Program 1 should run faster, right? But it is just
: the opposite. Compiling both with no optimization (the default)
: using gcc 4.1.2 20070502 Program 1 takes around 21 seconds to run
: against around 15 seconds for Program 2. Now, let us turn
: optimization to its higher level and see what happens. With the -O3
: flag used when compiling, Program 1's execution time falls to
: around 19 seconds, while Program 2 goes down to amazing 12 seconds!
: Can you explain me that?

> Yes, you are benchmarking the memory allocation for std::string.

Well, it is in fact easier to deal with memory allocation once than
doing it in every loop iteration. But, as I said, my example is
contrived.

> On my machine, using another compiler, I get:
>
> Program 1: 22.5 s
> Program 2: 3.3 s
>
> Then I notice that Program 2 reuses the same internal string buffer
> for all calls, saving calls to the string growth code for the last
> 99,999 calls.

It happens all the time with this idiom.
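
The buffer reuse can be observed directly, at least on common implementations. The sketch below uses a made-up helper, `refill`, that reports how much storage the string still holds right after `clear()`. The standard does not promise that `clear()` keeps the storage, so treat the behaviour as an implementation property, not a guarantee.

```cpp
#include <string>

// Clears 'str', then appends 'count' copies of 'piece'.
// Returns the capacity the string had right after clear(), before any append.
std::string::size_type refill( std::string& str, const std::string& piece, unsigned count )
{
    str.clear();
    const std::string::size_type kept = str.capacity();
    for ( unsigned i( 0 ); i < count; ++i )
        str.append( piece );
    return kept;
}
```

On the first call the reported capacity is small; on later calls with the same string it is typically already large enough for the whole result, which is why no growth steps are needed any more.
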

> To even the score a bit, I add a "str.reserve(100000)"
> to myFunction.
>
> Program 1B: 3.5 s
> Program 2B: 3.4 s

Assuming also that reserving much more memory than needed is not a
problem, yes, it should work, but 2 is still (marginally) faster; it
would be fairer to say as fast as. No one has yet shown an opposite
example, i.e., one where passing an object by reference degrades
performance (although some claim that it is possible, and I do
believe them).

I can imagine extremely contrived examples involving somewhat absurd
classes, but never when the class to which the object belongs allows
efficient manipulation of the data. If std::string did not allow such
manipulations it would be useless, since char[] already existed in C.
In fact, if a class provides no means of manipulating its data other
than its constructors, why should it exist at all, when a C struct
would have done just as well? This seems to apply even more to classes
whose instances are supposed to hold large amounts of data.

: It's time for another listing:
:
: //Program 3:
: #include <string>
:
: std::string myFunction()
: {
: std::string str;
: for ( unsigned i( 0 ); i < 1000; ++i )
: str.append( "supercalifragilisomethingidonotremebmberandd"
: "donotwantotsearchintheinternet" );
:
: return( str );
: }
:
: int main()
: {
: std::string str;
: for( unsigned i( 0 ); i < 100000; ++i )
: str = myFunction();
:
: return( 0 );
: }
:
: Program 3 takes little more than 17 seconds to run without
: optimization turned on, explain it to me, please. When optimized, it
: will take around 15 seconds to run.

> On my machine it takes 24 s unmodified.
> Adding the same "str.reserve(100000)" to myFunction.
> Program 3B: 5.6 s

I guess there is no copy on write in your compiler's std::string
implementation, so that assignment from a temporary will actually move
data around (whether this is a good design decision or not, I do not
know), but this would not be needed with your idiom because the
standard allows the copy constructor call to be optimized away (I am
willing to bet that if you forbid optimization both will be
equivalent). Compiled with gcc, all of your versions run equally fast
on my machine (actually equally slow when compared to your machine)
whether optimized or not. Now I really want to know which compiler you
are using.
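
The difference between the two forms can be made visible with a small counting class. This is a sketch with made-up names (`Probe`, `makeProbe`) and says nothing about any particular std::string implementation. Initializing a new object from the function result lets the compiler build the result in place when it elides the copy, whereas assigning to an existing object always has to run the assignment operator.

```cpp
// Hypothetical probe type that counts which special members run.
struct Probe
{
    static int copyConstructions;
    static int assignments;

    Probe() {}
    Probe( const Probe& ) { ++copyConstructions; }
    Probe& operator=( const Probe& ) { ++assignments; return *this; }
};

int Probe::copyConstructions = 0;
int Probe::assignments = 0;

// Returns a named local by value, like myFunction in Programs 1 and 3.
Probe makeProbe()
{
    Probe p;
    return p;   // the copy out of 'p' may be elided by the compiler
}
```

With this class, `Probe a = makeProbe();` never touches `operator=`, while `a = makeProbe();` must. How many copy constructions happen in either case depends on whether the compiler performs the elision.
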

> Rewriting main, making it equivalent to Program 1:
>
> int main()
> {
>     for( unsigned i( 0 ); i < 100000; ++i )
>         std::string str = myFunction();
>
>     return( 0 );
> }
>
> Program 3C: 3.5 s
>
> The last case shows that, in this test, constructing a new string on
> each iteration is faster than assigning a new value to an existing
> string.

This is just the same as std::string str( myFunction() ). We did not
even need this case to reach the conclusion, but the dramatic effect
is interesting. Are you a lawyer? Just kidding...

Well it is for your compiler, but what I would really love to know is
why is your idiom so overhauled that no one can realize that passing
the string as a reference (within tight loops, of course) is much less
likely to suffer from performance penalties?

Also, try comparing 1B against 3B with optimization disabled to see
what a non-optimizing compiler may be doing with your idiom. Please do
it, or say which compiler you are using. I am curious.

I argue that, when not optimized, 1B should be equivalent to 3B in
every realistic implementation of std::string. With optimization, 1B
should perform better on some implementations. But for really good
implementations (recent versions of gcc), both should be nearly the
same even with optimization turned on. The conclusion is that 1B has
more chances of being successful, so should be preferred over 3B. But
we can go further and say that 2B is much more likely to beat both in
most cases.

Elias Salomão Helou Neto

Nov 5 '07 #30

P: n/a
Elias Salomão Helou Neto wrote:
: On 5 nov, 17:09, "Bo Persson" <b...@gmb.dk> wrote:
:: Elias Salomão Helou Neto wrote:
:::
::: According to you, Program 1 should run faster, right? But it is
::: just the opposite. Compiling both with no optimization (the
::: default) using gcc 4.1.2 20070502 Program 1 takes around 21
::: seconds to run against around 15 seconds for Program 2. Now, let
::: us turn optimization to its higher level and see what happens.
::: With the -O3 flag used when compiling, Program 1's execution time
::: falls to around 19 seconds, while Program 2 goes down to amazing
::: 12 seconds! Can you explain me that?
::
:: Yes, you are benchmarking the memory allocation for std::string.
:
: Well, it is in fact easier to deal with memory allocation once than
: doing it in every loop iteration. But, as I said, my example is
: contrived.
:
:: On my machine, using another compiler, I get:
::
:: Program 1: 22.5 s
:: Program 2: 3.3 s
::
:: Then I notice that Program 2 reuses the same internal string buffer
:: for all calls, saving calls to the string growth code for the last
:: 99,999 calls.
:
: It happens all the time with this idiom.

The benefit is exaggerated by the fact that the string is the same
size for every call. Otherwise there would be reallocations here too.

:
:: To even the score a bit, I add a "str.reserve(100000)"
:: to myFunction.
::
:: Program 1B: 3.5 s
:: Program 2B: 3.4 s
:
: Assuming also that reserving much more memory than needed is not a
: problem, yes, it should work,

It's not *much* more memory than needed, I just allocate enough to
hold 1000 appends of about 100 characters each. (74, is it, if
counting?)
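
The count is easy to check mechanically. In this sketch (the names `piece`, `pieceLength` and `buildExact` are made up) the length is taken from the literal itself, and the reserve is sized from it instead of being rounded up to 100000.

```cpp
#include <cstddef>
#include <string>

const char piece[] = "supercalifragilisomethingidonotremebmberandd"
                     "donotwantotsearchintheinternet";

// Length of the literal without its terminating '\0'.
const std::size_t pieceLength = sizeof( piece ) - 1;

// Builds the string with a reserve sized to exactly what the appends need.
std::string buildExact( unsigned count )
{
    std::string str;
    str.reserve( count * pieceLength );
    for ( unsigned i( 0 ); i < count; ++i )
        str.append( piece, pieceLength );
    return str;
}
```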

: but 2 is still (marginally) faster, it
: would be fairer to say as fast as. It is yet to appear someone to
: show an opposite example, i.e., where passing an object as
: reference will degrade performance (although some claim that it is
: possible, and I do believe).

Being 0.1 s faster per 100,000 iterations is very marginally faster in
my book. :-)

:
::: It's time for another listing:
:::
::: //Program 3:
::: #include <string>
:::
::: std::string myFunction()
::: {
::: std::string str;
::: for ( unsigned i( 0 ); i < 1000; ++i )
::: str.append( "supercalifragilisomethingidonotremebmberandd"
::: "donotwantotsearchintheinternet" );
:::
::: return( str );
::: }
:::
::: int main()
::: {
::: std::string str;
::: for( unsigned i( 0 ); i < 100000; ++i )
::: str = myFunction();
:::
::: return( 0 );
::: }
:::
::: Program 3 takes little more than 17 seconds to run without
::: optimization turned on, explain it to me, please. When optimized,
::: it will take around 15 seconds to run.
::
:: On my machine it takes 24 s unmodified.
:: Adding the same "str.reserve(100000)" to myFunction.
:: Program 3B: 5.6 s
:
: I guess there is no copy on write on your compiler's std::string

Right.

: implementation, so that assignment to a temporary will actually move
: data around (whether this is a good design decision or not, I do not
: know), but this would not be needed with your idiom because the
: standard allows to optimize away the copy constructor (I am willing
: to bet that if you forbid optimization both will be equivalent).

I don't find it very interesting to compare the speed of unoptimized
compiles. If I want the code to be fast, I use a good compiler with
appropriate settings. If I don't care (or need) the speed, it doesn't
really matter.

: Compiled with gcc, all of your versions run equally fast on my
: machine (actually equally slow when compared to your machine)
: whether optimized or not. Now I really want to know which compiler
: you are using.

It's the other free compiler, Visual C++ 2005 Express (using an
alternate version of the standard library).

:
: Well it is for your compiler, but what I would really love to know
: is why is your idiom so overhauled that no one can realize that
: passing the string as a reference (within tight loops, of course)
: is much less likely to suffer from performance penalties?

The argument was the other way around, that constructing a string
inside the loop was not killing performance.

:
: Also, try comparing 1B against 3B forbidding optimization to see
: what an non-optimizing compiler may be doing with your idiom.
: Please, do it or say which compiler you are using. I am curious.

Ok, without optimization (debug build) we get

Program 1B: 95 s
Program 2B: 87 s
Program 3B: 96 s

From earlier experiments I believe that the main effect here is from
disabled inlining. Actually having to call a lot of accessor functions
out-of-line seems to cost between 10 and 100 times as much in my
code.

Bo Persson

Nov 5 '07 #31
