469,613 Members | 1,342 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 469,613 developers. It's quick & easy.

Q: Return value optimization thru several function calls?

Hi!

I am using VC++ 7.1 and have a question about return value optimization.
Consider the following code:

#include <list>
#include <string>

struct test
{
std::list <std::string> m_strings;
};

test func1 (int r)
{
if (r == 0)
return test ();

test t = func1 (r - 1);
t.m_strings.push_back ("lala");
return t;
}

int main ()
{
test t = func1 (3);
}

Now, stepping thru the code, one would assume that 'test' objects are
copy constructed and destroyed often. My question is, would a decent
compiler, like VC++ 7.1, be able to optimize this in such a way that only a
single 'test' object is used throughout the code? To make this post more
on-topic, would the Standard even allow such an optimization, as it is a
pretty great change to the original code ..

Thanks in advance!
--
jb

(replace y with x if you want to reply by e-mail)
Jul 22 '05 #1
20 2080
Jakob Bieling wrote:
Hi!

I am using VC++ 7.1 and have a question about return value optimization.
Consider the following code:

#include <list>
#include <string>

struct test
{
std::list <std::string> m_strings;
};

test func1 (int r)
{
if (r == 0)
return test ();

test t = func1 (r - 1);
t.m_strings.push_back ("lala");
return t;
}

int main ()
{
test t = func1 (3);
}

Now, stepping thru the code, one would assume that 'test' objects are
copy constructed and destroyed often. My question is, would a decent
compiler, like VC++ 7.1, be able to optimize this in such a way that only a
single 'test' object is used throughout the code? To make this post more
on-topic, would the Standard even allow such an optimization, as it is a
pretty great change to the original code ..


The standard allows the optimization (6.3.3).
Jul 22 '05 #2
"Jeff Schwab" <je******@comcast.net> wrote in message
news:bt********************@comcast.com...
The standard allows the optimization (6.3.3).


In my copy, 6.3 is "Compound statement or block" and only has a single
paragraph. IOW, there is no 6.3.3?

regards
--
jb

(replace y with x if you want to reply by e-mail)
Jul 22 '05 #3
Jakob Bieling wrote:

Hi!

I am using VC++ 7.1 and have a question about return value optimization.
Consider the following code:

#include <list>
#include <string>

struct test
{
std::list <std::string> m_strings;
};

test func1 (int r)
{
if (r == 0)
return test ();

test t = func1 (r - 1);
t.m_strings.push_back ("lala");
return t;
}

int main ()
{
test t = func1 (3);
}


As Jeff said, in the above example the compiler is allowed to optimise
away unnecessary locals (in func1()), temporaries and the corresponding
copy constructor calls. If func1() were defined like this:

test func1 (int r)
{
test t;
t.m_strings.push_back ("lala");
return t;
}

with the same main(), I would definitely expect just one construction and
destruction from a decent compiler. The recursion, however, may complicate
things a bit. You can add a couple of ctors to test (default and copy) and
a dtor that cout a message and see.

Consider this too:
int main()
{
test t;
//do something to t ...

t = func1 (3);
}

Not much luck in this case: t will be constructed first, then at least one
object of class test will be created as a result of func1(), then operator =
will be called.

When I care about performance, I usually define func1() as
void func1 (int r, test& t) {...}
If the functional notation were important, I would consider using proxies
(and accomodate the "actual" objects acordingly), but I would not return
really big objects from a function.

Denis
Jul 22 '05 #4
"Denis Remezov" <de*********************@yahoo.ca.removethis> wrote in
message news:40***************@yahoo.ca.removethis...
When I care about performance, I usually define func1() as
void func1 (int r, test& t) {...}
If the functional notation were important,
Well, it is not important, but I do care about design and the above
seems like a rather ugly solution; tho it would serve the purpose pretty
well.
I would consider using proxies
(and accomodate the "actual" objects acordingly), but I would not return
really big objects from a function.


It is not that the object is big, but it contains a linked list of
non-POD objects, so creating copies of it where it is not necessary, seems
like a bad idea.

I assume the proxy would take care of doing a fast swap rather than a
deep-copy of the objects, correct?

Thanks!
--
jb

(replace y with x if you want to reply by e-mail)
Jul 22 '05 #5
"Jakob Bieling" <ne*****@gmy.net> wrote...
"Jeff Schwab" <je******@comcast.net> wrote in message
news:bt********************@comcast.com...
The standard allows the optimization (6.3.3).


In my copy, 6.3 is "Compound statement or block" and only has a single
paragraph. IOW, there is no 6.3.3?


Jeff probably meant 6.6.3 "The return statement".

V
Jul 22 '05 #6
Jakob Bieling wrote:
"Jeff Schwab" <je******@comcast.net> wrote in message
news:bt********************@comcast.com...
The standard allows the optimization (6.3.3).


In my copy, 6.3 is "Compound statement or block" and only has a
single paragraph. IOW, there is no 6.3.3?

regards


It's not in mine (2003 revision) either...

- Pete
Jul 22 '05 #7
"Victor Bazarov" <v.********@comAcast.net> wrote in message
news:7nygc.20953$ru4.18150@attbi_s52...
"Jakob Bieling" <ne*****@gmy.net> wrote...
"Jeff Schwab" <je******@comcast.net> wrote in message
news:bt********************@comcast.com...
The standard allows the optimization (6.3.3).


In my copy, 6.3 is "Compound statement or block" and only has a single paragraph. IOW, there is no 6.3.3?


Jeff probably meant 6.6.3 "The return statement".

Thanks .. there is a reference to 12.2 which explicitly allows the
optimization of constructing the return value directly in place of the
variable that is used to hold the return value, thus eliminating the
temporary copy.

Guess I will go for Dennis' proxy-class approach, since it does not look
like a decent compiler will do the optimization I am after.

Thanks for the comments!
--
jb

(replace y with x if you want to reply by e-mail)
Jul 22 '05 #8
On Sun, 18 Apr 2004 19:38:26 +0200 in comp.lang.c++, "Jakob Bieling"
<ne*****@gmy.net> wrote,
Guess I will go for Dennis' proxy-class approach, since it does not look
like a decent compiler will do the optimization I am after.


A decent compiler will certainly do that optimization!
Now the question remains, whether there is one yet that does? Perhaps
not.

Jul 22 '05 #9
"David Harmon" <so****@netcom.com> wrote in message
news:40****************@news.west.earthlink.net...
On Sun, 18 Apr 2004 19:38:26 +0200 in comp.lang.c++, "Jakob Bieling"
<ne*****@gmy.net> wrote,
Guess I will go for Dennis' proxy-class approach, since it does not looklike a decent compiler will do the optimization I am after.


A decent compiler will certainly do that optimization!
Now the question remains, whether there is one yet that does? Perhaps
not.

;)


--
jb

(replace y with x if you want to reply by e-mail)
Jul 22 '05 #10
Jakob Bieling wrote:

"Denis Remezov" <de*********************@yahoo.ca.removethis> wrote in
message news:40***************@yahoo.ca.removethis...
When I care about performance, I usually define func1() as
void func1 (int r, test& t) {...}
If the functional notation were important,
Well, it is not important, but I do care about design and the above
seems like a rather ugly solution;

[snip]

It depends. In some respect, the syntax is inferior, I agree; func1() now
also needs to be careful to handle or dispose of the existing contents
of t properly (that could be a bad thing). Potential advantages are
(performance issues aside):
0) Class test does not need to be constructible in the given scope.
1) It is further guaranteed and made explicit that no construction
or copying of objects of class test will take place.

I would consider using proxies
(and accomodate the "actual" objects acordingly), but I would not return
really big objects from a function.


It is not that the object is big, but it contains a linked list of
non-POD objects,

[snip]

Sorry I didn't make it clear, but I meant to include that into the
definition of a "big object".

I assume the proxy would take care of doing a fast swap rather than a
deep-copy of the objects, correct?


Something of the kind. I was thinking about proxy objects sharing pointers
to the "real" objects and reference-counting them, deferring the
actual creation and copying until such point in time that a write-operation
is requested.

Denis
Jul 22 '05 #11
On Sun, 18 Apr 2004 18:30:47 +0200 in comp.lang.c++, "Jakob Bieling"
<ne*****@gmy.net> wrote,
When I care about performance, I usually define func1() as
void func1 (int r, test& t) {...}
If the functional notation were important,


Well, it is not important, but I do care about design and the above
seems like a rather ugly solution; tho it would serve the purpose pretty
well.


I am sure your example is simplified and that there would be other
concerns in a complex application. Nevertheless, for the purpose of the
simple example I would consider constructing a single object and using a
member function with the implied 'this' to make it perhaps a bit less
ugly than the explicit reference argument. In my view, this gets closer
to the implementation originally desired, albeit at the cost of a member
function that you might prefer as a non-menber.

#include <iostream>
using std::cerr;
#include <list>
#include <string>

struct test {
std::list<std::string> m_strings;
test & func1(int r);

// instrumentation:
test() { cerr << "Construct!\n"; }
test(const test & x)
: m_strings(x.m_strings)
{ cerr << "Copy construct " << m_strings.size() << "!\n"; }
~test() { cerr << "Destruct " << m_strings.size() << "!\n"; }
//
};

test & test::func1(int r)
{
if (r == 0)
return *this;

func1(r - 1);
m_strings.push_back("lala");
return *this;
}

int main()
{
test t;
t.func1(3);
}

Jul 22 '05 #12
Victor Bazarov wrote:
"Jakob Bieling" <ne*****@gmy.net> wrote...
"Jeff Schwab" <je******@comcast.net> wrote
The standard allows the optimization (6.3.3).


In my copy, 6.3 is "Compound statement or block" and only has a single
paragraph. IOW, there is no 6.3.3?


Jeff probably meant 6.6.3 "The return statement".


Thanks, Victor. You're right, I did mean 6.6.3.
Jul 22 '05 #13
On Sun, 18 Apr 2004 14:57:20 +0200, "Jakob Bieling" <ne*****@gmy.net>
wrote:
Hi!

I am using VC++ 7.1 and have a question about return value optimization.
Consider the following code:

#include <list>
#include <string>

struct test
{
std::list <std::string> m_strings;
};

test func1 (int r)
{
if (r == 0)
return test ();
The line above disables the optimization for most compilers.
Generally, for NRVO to work, all return statements must return the
same object (although this isn't a requirement of 12.8/15, just common
practice).

test t = func1 (r - 1);
t.m_strings.push_back ("lala");
return t;
So the above probably won't be optimized.
}

int main ()
{
test t = func1 (3);
}

Now, stepping thru the code, one would assume that 'test' objects are
copy constructed and destroyed often. My question is, would a decent
compiler, like VC++ 7.1, be able to optimize this in such a way that only a
single 'test' object is used throughout the code? To make this post more
on-topic, would the Standard even allow such an optimization, as it is a
pretty great change to the original code ..


Both GCC and VC7.1 failed to optimize the original code (I added
instrumentation). Here is a version that on GCC 3.3 prints:

Default
Destroy

which is what you were after. VC++ 7.1 doesn't implement NRVO, so that
prints:

Default
Copy
Destroy
Copy
Destroy
Copy
Destroy
Copy
Destroy
Copy
Destroy
Copy
Destroy
Copy
Destroy
Copy
Destroy
Destroy

Oh dear! Here's the code:

#include <list>
#include <string>
#include <iostream>
using namespace std;

struct test
{
test()
{
cout << "Default\n";
}

test(test const& t)
:m_strings(t.m_strings)
{
cout << "Copy\n";
}

~test()
{
cout << "Destroy\n";
}

test& operator=(test const& t)
{
m_strings = t.m_strings;
cout << "Assign\n";
return *this;
}

std::list <std::string> m_strings;
};

test func1 (int r)
{
test t(r == 0? test() : func1 (r - 1));
if (r != 0)
t.m_strings.push_back ("lala");
return t;
}

int main ()
{
test t = func1 (3);
}

And what's the moral of this? Well, to enable NRVO, make sure you:

1. Ensure all return statements return the same local object.
2. Avoid assignments.
3. Use GCC.

I assume MS will implement NRVO eventually...

Tom
--
C++ FAQ: http://www.parashift.com/c++-faq-lite/
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
Jul 22 '05 #14
tom_usenet wrote:
Oh dear! Here's the code: cat main.cc #include <list>
#include <string>
#include <iostream>

struct test {
test(void) {
std::cout << "Default" << std::endl;
}

test(test const& t):
m_strings(t.m_strings) {
std::cout << "Copy" << std::endl;
}

~test(void) {
std::cout << "Destroy" << std::endl;
}

test& operator=(test const& t) {
m_strings = t.m_strings;
std::cout << "Assign" << std::endl;
return *this;
}

std::list <std::string> m_strings;
};

test func1(int r) {
test t(r == 0? test(): func1(r - 1));
if (r != 0)
t.m_strings.push_back("lala");
return t;
}

int main (int argc, char* argv[]) {
test t = func1(3);
return 0;
}
g++ -Wall -ansi -pedantic -O2 -o main main.cc
./main

Default
Destroy

Jul 22 '05 #15
David Harmon wrote:
I am sure your example is simplified and that
there would be other concerns in a complex application.
Nevertheless, for the purpose of the simple example
I would consider constructing a single object
and using a member function with the implied 'this'
This is *very* bad advice.
to make it perhaps a bit less ugly
than the explicit reference argument. In my view, this gets closer
to the implementation originally desired, albeit at the cost of a member
function that you might prefer as a non-menber.

#include <iostream>
// using std::cerr;
#include <list>
#include <string>

struct test {
using std::cerr;
std::list<std::string> m_strings;
test & func1(int r);

// instrumentation:
test() { cerr << "Construct!\n"; }
test(const test & x):
m_strings(x.m_strings) {
cerr << "Copy construct " << m_strings.size() << "!\n"; }
~test() { cerr << "Destruct " << m_strings.size() << "!\n"; }
//
};

test& test::func1(int r) {

if (0 == r)
return *this;

func1(r - 1);
m_strings.push_back("lala");
return *this;
}

int main() {
test t;
t.func1(3);
return 0; }

Jul 22 '05 #16
David Harmon wrote:
A decent compiler will certainly do that optimization!
Now the question remains, whether there is one yet that does?
Perhaps not.
My GNU C++ compiler:
g++ --version

g++ (GCC) 3.2 20020903 (Red Hat Linux 8.0 3.2-7)

implements the NRVO. So do several other quality C++ compilers.
If your compiler does not implement the NVRO,
you should be shopping for a better compiler.

Jul 22 '05 #17
"tom_usenet" <to********@hotmail.com> wrote in message
news:uf********************************@4ax.com...
On Sun, 18 Apr 2004 14:57:20 +0200, "Jakob Bieling" <ne*****@gmy.net>

test func1 (int r)
{
if (r == 0)
return test ();


The line above disables the optimization for most compilers.
Generally, for NRVO to work, all return statements must return the
same object (although this isn't a requirement of 12.8/15, just common
practice).


Compilers should allow NVRO if the different return statements return
objects declared after the previous return statement.
Maybe explicit braces will help, but I'm not sure:

test func1 (int r)
{
if (r == 0)
return test ();

/*else*/ {
test t = func1 (r - 1);
t.m_strings.push_back ("lala");
return t;
}
}
Or even mutual recursion

test func1 (int r)
{
return r == 0 ? test () : private_func1_assumes_r_greater_than_zero(r);
}

test private_func1_assumes_r_greater_than_zero (int r)
{
test t = func1 (r - 1);
t.m_strings.push_back ("lala");
return t;
}

Jul 22 '05 #18
On Mon, 19 Apr 2004 10:17:55 -0700 in comp.lang.c++, "E. Robert Tisdale"
<E.**************@jpl.nasa.gov> wrote,
David Harmon wrote:
I am sure your example is simplified and that
there would be other concerns in a complex application.
Nevertheless, for the purpose of the simple example
I would consider constructing a single object
and using a member function with the implied 'this'


This is *very* bad advice.


Then what is so bad about it? It works as intended. I see you made
some useless and irrelevant changes to my suggested code, but nothing of
substance. So how would you code it to squeeze the desired result out
of a compiler that does not include NRVO?
Jul 22 '05 #19
On Mon, 19 Apr 2004 21:12:58 GMT, "Siemel Naran"
<Si*********@REMOVE.att.net> wrote:
"tom_usenet" <to********@hotmail.com> wrote in message
news:uf********************************@4ax.com.. .
On Sun, 18 Apr 2004 14:57:20 +0200, "Jakob Bieling" <ne*****@gmy.net>
>test func1 (int r)
>{
> if (r == 0)
> return test ();


The line above disables the optimization for most compilers.
Generally, for NRVO to work, all return statements must return the
same object (although this isn't a requirement of 12.8/15, just common
practice).


Compilers should allow NVRO if the different return statements return
objects declared after the previous return statement.


They don't in general though. NRVO is generally implemented by passing
a hidden parameter for a pointer to some storage in which the return
value should be constructed, but compilers generally only bother if
there is only one object to return, since in the general case it is
very hard to do otherwise. e.g.

test f()
{
test t(5);
if (today == TUESDAY)
return test(10);
else
return t;
}

The compiler doesn't know whether to construct t in the return storage
space until runtime, so it doesn't bother, and just copies u or v into
the return value without optimizing it.
Maybe explicit braces will help, but I'm not sure:

test func1 (int r)
{
if (r == 0)
return test ();

/*else*/ {
test t = func1 (r - 1);
t.m_strings.push_back ("lala");
return t;
}
}
That doesn't do it - the braces make no difference.

Or even mutual recursion

test func1 (int r)
{
return r == 0 ? test () : private_func1_assumes_r_greater_than_zero(r);
}

test private_func1_assumes_r_greater_than_zero (int r)
{
test t = func1 (r - 1);
t.m_strings.push_back ("lala");
return t;
}


That does do it, since it follows the rules I posted in my previous
post (if a named local is returned, all returns return the same one).
Still, simpler is the version I posted:

test func1 (int r)
{
test t(r == 0? test() : func1 (r - 1));
if (r != 0)
t.m_strings.push_back ("lala");
return t;
}

also does the job.

Tom
--
C++ FAQ: http://www.parashift.com/c++-faq-lite/
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
Jul 22 '05 #20
"tom_usenet" <to********@hotmail.com> wrote in message
news:9a********************************@4ax.com...
Still, simpler is the version I posted:

test func1 (int r)
{
test t(r == 0? test() : func1 (r - 1));
if (r != 0)
t.m_strings.push_back ("lala");
return t;
}

also does the job.

Thanks Tom, but as I am bound to using VC++ 7.1, I guess I will try to
use the proxy-method mentioned by Denis.

Thanks!
--
jb

(replace y with x if you want to reply by e-mail)
Jul 22 '05 #21

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

29 posts views Thread by pmatos | last post: by
32 posts views Thread by zl2k | last post: by
2 posts views Thread by Eric Lilja | last post: by
32 posts views Thread by Axel Bock | last post: by
3 posts views Thread by Igor.Smirnov | last post: by
18 posts views Thread by terminator(jam) | last post: by
reply views Thread by devrayhaan | last post: by
reply views Thread by gheharukoh7 | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.