
RVO alternative

P: 17
Hi,

I'm having problems with dynamic memory, and returning values from functions. Consider the following function signature:
    Thing buildThing() { return ThingImpl(); }
And the call:
    const Thing& t = buildThing();
The call can be optimized through return value optimization (RVO), such that no copying needs to take place. The caller gets the value on its stack directly, and no heap memory is used.
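As a rough illustration of what RVO does, here is a minimal sketch. The counters `copies` and `builtAt` are hypothetical instrumentation added for this example, not part of the code in the question. Since C++17 this form of elision is mandatory; before that, compilers were merely permitted to do it, so older toolchains could in principle report copies.

```cpp
#include <cassert>

// Hypothetical type that records how often it is copied and where it was built.
struct Thing {
    static int copies;            // number of copy-constructor calls so far
    static const Thing* builtAt;  // address seen by the most recent normal constructor
    int value;

    explicit Thing(int v) : value(v) { builtAt = this; }
    Thing(const Thing& other) : value(other.value) { ++copies; }
};

int Thing::copies = 0;
const Thing* Thing::builtAt = nullptr;

// Returns a temporary; the compiler may (C++17: must) build it in the caller's slot.
Thing buildThing() { return Thing(7); }

// Returns true when the object was constructed directly in the caller's variable
// and no copy constructor ran along the way.
bool builtInPlace() {
    Thing t = buildThing();
    return Thing::builtAt == &t && Thing::copies == 0;
}
```

Calling `builtInPlace()` from `main` and printing the result is one way to check what a given compiler does with this code.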

Now, my question is: can the same thing be accomplished without relying on RVO (i.e., can it be done explicitly)? If not, are there any specific reasons why? I think I am confused about when dynamic memory should and should not be used. I find myself wanting to achieve this very often.

Thanks
Nov 16 '08 #1
12 Replies


Expert 100+
P: 671
Your question doesn't really make sense. On one hand you ask about dynamic memory allocation, which is part of C++. Then you talk about RVO, which is a compiler optimization that has nothing to do with the code you write.

You allocate dynamically when you can't allocate directly on the stack. Examples are when the object is too large for the stack, or when you don't know the size until runtime.
Nov 17 '08 #2

P: 17
Thank you for your answer. What I tried to show with my example is that I often want to put things on the caller's stack. This is true on many occasions, where methods build, initialize and return an instance. The example I showed was one where the implementation class is unknown to the caller, which I felt was an important example, since it prevents the caller from instantiating the variable on its own stack before calling the builder method (this independence is what I want to achieve quite often).

In order to make an object survive into the parent scope (put it on the caller's stack), it seems as if you need to return variables by value. This generally means extra copy overhead when copying back to the caller's stack. I would like to know what the rationale behind the copy semantics is. It seems inefficient, and it would not be too hard to have a "construct onto caller's stack" keyword or idiom (much like RVO works). Maybe there are some hardware considerations, or other issues which make copy semantics necessary. I don't understand why there is no way of explicitly asking for RVO behavior (or I simply don't understand how to write it).
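One way to spell out "construct onto the caller's stack" by hand is placement new into storage the caller provides. The sketch below is only an illustration under assumptions: `Thing`, `buildThingInto` and `useCallerStorage` are hypothetical names, and the caller has to know the size and alignment of the type, which is exactly the dependency the example in the question tries to avoid.

```cpp
#include <cassert>
#include <new>

// Hypothetical value type used only for this illustration.
struct Thing {
    int id;
    explicit Thing(int i) : id(i) {}
};

// Hypothetical builder: constructs a Thing directly in caller-supplied storage.
// No heap allocation and no copy takes place.
Thing* buildThingInto(void* storage) {
    return new (storage) Thing(42);
}

// The buffer lives in this function's own stack frame.
int useCallerStorage() {
    alignas(Thing) unsigned char buffer[sizeof(Thing)];
    Thing* t = buildThingInto(buffer);
    int id = t->id;
    t->~Thing();  // objects created with placement new are destroyed by hand
    return id;
}
```

The manual destructor call shows the cost of this approach: the compiler no longer manages the object's lifetime for you.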
Nov 17 '08 #3

weaknessforcats
Expert Mod 5K+
P: 9,197
As a general rule in C++ avoid stack memory like the Black Death.

Passing pointers to stack items is the single most dangerous thing you can do in a C++ program. Right when you need them, the stack item has gone out of scope.

Instead, use the heap. The new operator returns a pointer to the allocation. I recommend you learn how to use a handle with pointers. You might read this:
http://bytes.com/forum/thread651599.html.

Beyond that, the return semantics are that returning a reference does not make a copy. Otherwise a copy is made, and this is a copy constructor call which is usually a deep copy. So don't do this. On the other hand, returning a reference is useless if it refers to a local stack variable. So don't do that either. Otherwise the reference is to a function argument that is also a reference. So there's no point in returning anything. That leaves returning a reference based on dereferencing a heap pointer. So don't do that either. Just return the pointer, er, handle.
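A minimal sketch of the "return a handle" advice, using `std::shared_ptr` from C++11 as a stand-in for the reference-counted handle class described in the linked article (the article's own template is not reproduced here). The class bodies and the `name()` member are assumptions made for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Interface the caller sees; the concrete type stays hidden behind it.
class Thing {
public:
    virtual ~Thing() = default;
    virtual std::string name() const = 0;
};

namespace {
// Hypothetical implementation class, invisible to callers of buildThing().
class ThingImpl : public Thing {
public:
    std::string name() const override { return "ThingImpl"; }
};
}  // namespace

// The factory allocates on the heap and hands back a reference-counted handle.
std::shared_ptr<Thing> buildThing() {
    return std::make_shared<ThingImpl>();
}
```

The caller writes `auto t = buildThing();` and never needs to know the size or name of `ThingImpl`; the object is destroyed when the last handle goes away.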
Nov 17 '08 #4

P: 17
Weaknessforcats, thanks for taking your time answering.

As a general rule in C++ avoid stack memory like the Black Death.
What do you mean by this? Generally avoid or only passing pointers to stack memory which goes out of scope?

Passing pointers to stack items is the single most dangerous thing you can do in a C++ program. Right when you need them, the stack item has gone out of scope.
This I understand. I understand that variables do go out of scope, but what I meant was that it sometimes might be useful if they don't (if they survive to the parent scope).

Instead, use the heap. The new operator returns a pointer to the allocation. I recommend you learn how to use a handle with pointers. You might read this:
http://bytes.com/forum/thread651599.html.
What I don't understand is why I should use the heap for situations where I only want the caller to get the instance back. As oler1s said, it makes sense to use the heap when you want to share an instance, or you don't have enough space on the stack. But using heap space just because you want to avoid copying an instance doesn't make sense. But in C++ you cannot make a variable survive into the parent's stack frame, so you are forced to use the heap or copy the variable back (return by value). The only way to avoid this is to trust RVO. This is what I don't think makes much sense.

The handle class achieves that level of indirection, and could work without the heap, thank you for that tip.

Beyond that, the return semantics are that returning a reference does not make a copy. Otherwise a copy is made, and this is a copy constructor call which is usually a deep copy. So don't do this. On the other hand, returning a reference is useless if it refers to a local stack variable. So don't do that either. Otherwise the reference is to a function argument that is also a reference. So there's no point in returning anything. That leaves returning a reference based on dereferencing a heap pointer. So don't do that either. Just return the pointer, er, handle.
This is exactly what I mean, there is something missing. You have to choose between return by value or pointer to heap-allocated memory.
Nov 18 '08 #5

weaknessforcats
Expert Mod 5K+
P: 9,197
What do you mean by this? Generally avoid or only passing pointers to stack memory which goes out of scope?
I mean avoid using stack memory for anything leaving the function. You don't pass back references to local variables and so you don't pass back addresses to local variables. The problem is when you have a pointer, you can't tell if a) it's any good, b) it's on the heap. Now you can't clean up.

What I don't understand is why I should use the heap for situations where I only want the caller to get the instance back. As oler1s said, it makes sense to use the heap when you want to share an instance, or you don't have enough space on the stack. But using heap space just because you want to avoid copying an instance doesn't make sense. But in C++ you cannot make a variable survive into the parent's stack frame, so you are forced to use the heap or copy the variable back (return by value). The only way to avoid this is to trust RVO. This is what I don't think makes much sense.
I repeat. Create your object on the heap and return a handle (not a pointer) to it. Did you read the handle article? Why spend time making a copy to return? Rely on RVO? Never. Code that makes assumptions about what the compiler will or will not do is weak code. You avoid the stack because you don't want the compiler controlling the life of your data. You want to be in charge of that.

Also, it is not true that you use the heap only when you want to share an instance. You share an instance by having the address of (or a reference to) that instance inside other objects. This has nothing to do with having a handle (or pointer) member variable. If your object has the only address to that heap object, then it's not shared.

This is exactly what I mean, there is something missing. You have to choose between return by value or pointer to heap-allocated memory.
Correct. So use a handle instead of returning a pointer so that proper cleanup occurs. Note that returning a value and returning a pointer are each return by value. In the case of a handle, you make a copy of the handle and you do not want RVO stepping in because the copy needs to adjust reference counts and RVO will screw that up. You have to be sure the handle copy constructor is called.
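To make the reference-counting behavior concrete, here is a small sketch that again uses `std::shared_ptr` as a stand-in for the handle class under discussion. `makeHandle` and `ownersAfterCopy` are hypothetical names. It only shows how the owner count behaves when a handle is returned and then copied.

```cpp
#include <cassert>
#include <memory>

// Hypothetical factory returning a reference-counted handle to a heap int.
std::shared_ptr<int> makeHandle() {
    return std::make_shared<int>(5);
}

// Returns the owner count observed after taking one extra copy of the handle.
long ownersAfterCopy() {
    std::shared_ptr<int> first = makeHandle();  // one owner after the return
    std::shared_ptr<int> second = first;        // copy constructor bumps the count
    return second.use_count();                  // both handles report the same count
}
```

In this sketch the count tracks the number of live handles: one after the factory returns, two after the explicit copy.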

Lastly, if you start using handles, be sure to never use a reference to a handle. Be certain copies are made that are managed by the handle member functions.
Nov 18 '08 #6

P: 17
I mean avoid using stack memory for anything leaving the function. You don't pass back references to local variables and so you don't pass back addresses to local variables. The problem is when you have a pointer, you can't tell if a) it's any good, b) it's on the heap. Now you can't clean up.
agreed.

I repeat. Create your object on the heap and return a handle (not a pointer) to it. Did you read the handle article? Why spend time making a copy to return? Rely on RVO? Never. Code that makes assumptions about what the compiler will or will not do is weak code.
I did read the article. A smart pointer (handle object) is sort of a flyweight pattern which does not directly address the real problem, but rather avoids it. You will still need to copy your flyweight smart pointer around, plus you get all the other problems related to reference counting.

My point is that you a) do not want to spend time making a copy to return, b) do not want to be _forced_ to do reference counting and c) do not want to have ambiguities about who is responsible for the allocation. And how is that solved? Easy, the variable needs to enter the parent scope. That's it. Simple. How do we do that? By copying? Well, that probably worked well for C, but for large objects it is a mess. So we can employ auto pointers or handle objects. That is a workaround. What is the real solution? According to me, the real solution is to support RVO-like behavior intrinsically in the language. You want to be able to have a subroutine create a variable on the parent stack.

If you look at this article, first answer, you see what I mean. C++0x addresses the problem by assuming flyweight wrappers, and allowing you to write special assignment operators which swap the impl class between flyweights. Again, I say, avoiding the real problem.
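The C++0x feature referred to here became move semantics in C++11. The following is a minimal sketch of a move constructor and a swap-based move assignment operator. The `Buffer` class is a hypothetical example, not code from the linked article.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical buffer type owning a heap array.
class Buffer {
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new int[n]()) {}
    ~Buffer() { delete[] data_; }

    // Move constructor: steal the pointer instead of copying the elements.
    Buffer(Buffer&& other) noexcept : size_(other.size_), data_(other.data_) {
        other.size_ = 0;
        other.data_ = nullptr;
    }

    // Move assignment implemented by swapping internals with the source.
    Buffer& operator=(Buffer&& other) noexcept {
        std::swap(size_, other.size_);
        std::swap(data_, other.data_);
        return *this;
    }

    // Copying is disabled so that every transfer is an explicit move.
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    std::size_t size() const { return size_; }
    const int* data() const { return data_; }

private:
    std::size_t size_;
    int* data_;
};
```

Only the pointer and the size change hands; the elements themselves stay where they are on the heap, which is why the post describes it as swapping the implementation between wrappers.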

You avoid the stack because you don't want the compiler controlling the life of your data. You want to be in charge of that.
Yes, and I say let's include it in the syntax. Let the programmer explicitly ask for this behavior.

Also, it is not true that you use the heap only when you want to share an instance. You share an instance by having the address of (or a reference to) that instance inside other objects. This has nothing to do with having a handle (or pointer) member variable. If your object has the only address to that heap object, then it's not shared.
Ok fine. But my point was that heap memory should be used when it is really needed, not when it's just for workarounds.

Correct. So use a handle instead of returning a pointer so that proper cleanup occurs. Note that returning a value and returning a pointer are each return by value. In the case of a handle, you make a copy of the handle and you do not want RVO stepping in because the copy needs to adjust reference counts and RVO will screw that up. You have to be sure the handle copy constructor is called.

Lastly, if you start using handles, be sure to never use a reference to a handle. Be certain copies are made that are managed by the handle member functions.
I thank you for the tips, and I understand that you know what you are talking about. But if you try to see it from my point of view, perhaps you could just explain to me why it is a bad idea (infeasible or what) to just expand the scope of returned variables. To me, it just seems to be the simplest and most intuitive solution. Perhaps this is related to keeping C compatibility, or the way that call stacks are typically implemented, I don't know.

My suggestion would be to let the child stack allocate return variables on top of its stack. Upon return, the parent stack would be expanded to include the return variables from the subroutine, after which the remaining parts of the child stack would be destroyed. Thus a part of the child stack would be passed to the parent stack without copying. That is my concrete example. Now maybe you can tell me why this is not feasible or why it doesn't make sense? Thank you.
Nov 19 '08 #7

weaknessforcats
Expert Mod 5K+
P: 9,197
You will still need to copy your flyweight smart pointer around, plus you get all the other problems related to reference counting.
Correct. You will need to copy two pointers. But that is buried in the handle constructors and assignment operators. There is no problem with reference counting. The handle member functions do it. The article on handle classes on the HowTos forum has a template for a reference-counted handle. Or you can use the new TR1 shared_ptr (std::tr1::shared_ptr).

What is the real solution? According to me, the real solution is to support RVO-like behavior intrinsically in the language. You want to be able to have a subroutine create a variable on the parent stack.
That assumes the parent scope is always higher in the stack than the current stack frame. That's a bad assumption. Your variable may be on any stack frame in existence, not just the caller's stack frame. So what you suggest here simply won't work.

Besides, C++ doesn't work that way and never will so this is all academic.

Ok fine. But my point was that heap memory should be used when it is really needed, not when it's just for workarounds.
What's this supposed to mean? I see nothing holy about the stack other than it allows programmers to avoid cleaning up after themselves.

Besides, I suspect you are talking about the CRT heap. I assume you are aware that many programs use a private process heap and don't use the CRT heap at all.

My suggestion would be to let the child stack allocate return variables on top of its stack. Upon return, the parent stack would be expanded to include the return variables from the subroutine, after which the remaining parts of the child stack would be destroyed. Thus a part of the child stack would be passed to the parent stack without copying. That is my concrete example. Now maybe you can tell me why this is not feasible or why it doesn't make sense? Thank you.
As I said C++ doesn't work this way.
Nov 20 '08 #8

Banfa
Expert Mod 5K+
P: 8,916
My suggestion would be to let the child stack allocate return variables on top of its stack. Upon return, the parent stack would be expanded to include the return variables from the subroutine, after which the remaining parts of the child stack would be destroyed. Thus a part of the child stack would be passed to the parent stack without copying. That is my concrete example. Now maybe you can tell me why this is not feasible or why it doesn't make sense? Thank you.
Because it wouldn't work, all that would happen is your stacks would expand until there was no more space for them (i.e. you blow your stack).

Think of it like this. Assume that it was possible, when returning, to discard all of the called function's stack apart from the bit of data you wanted. I doubt that is possible without copying the data to another location on the stack, and copying is what you are trying to avoid, but let's assume it.

Many (most?) programs of any consequence are designed to run for a long time, if not indefinitely. However, every time you call a function with these return semantics the stack of the calling function is increased in size. Indefinite running generally means that somewhere there is a loop that is preventing at least one function from exiting for the lifetime of the program; in fact there must be an instance of some function in this state for every thread that the program is running.

Currently, when a function exits all its data is returned to the stack; the size of the stack frame for a function is based solely on its own data use plus any processor-required register copies. If all functions had these calling semantics, then it is clear that any function returning a value would increase the size of the calling function's stack. For an indefinitely running thread whose top-level function must be making calls to other functions, that means a permanently growing stack.
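The growth argument can be put into numbers with a toy model. The function below is purely illustrative: `depthAfterLoop` is a hypothetical name, it simulates byte counts only, and it does not reflect how any real calling convention lays out frames.

```cpp
#include <cassert>
#include <cstddef>

// Toy model: counts the bytes of stack held by a long-running caller that
// repeatedly calls a function whose return value is left on the stack.
std::size_t depthAfterLoop(std::size_t iterations,
                           std::size_t frameSize,
                           std::size_t returnSize) {
    std::size_t depth = 0;  // bytes currently in use below the caller
    for (std::size_t i = 0; i < iterations; ++i) {
        depth += frameSize;   // callee frame pushed on the call
        depth -= frameSize;   // callee frame popped on return...
        depth += returnSize;  // ...but its return value stays behind
    }
    return depth;
}
```

With a 16-byte return value, a loop of 1000 calls leaves 16000 bytes behind in this model, and nothing reclaims them until the calling function itself returns.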

I suppose you could add keywords that would let you specify whether to use these calling semantics and then also add the stipulation that functions with these semantics should not be called from functions that are expected to be in the running scope for an indefinite period of time. But all you are really doing is designing new and interesting ways of creating maintenance issues that may well be very easily hidden.

Why do this when there are already plenty of good ways to return data from a function without lots of copying that don't introduce any chance of destroying the stack.

IMO the suggested solution is very poor in that it creates the opportunity for many problems while attempting to solve something that is not really a problem at all.


BTW in C the answer was very simple. Don't return structures from a function! That is because too many C compilers did it in a manner that was not thread safe. So in C, best practice is that if you want to return something that is not a basic type (and a pointer is a basic type), then the calling function should pass a pointer into the called function giving the location in which to store the data. Allocating the data then becomes the responsibility of the calling function and it may use the stack or the heap or any other data segment as appropriate.
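The out-parameter idiom described above could look roughly like this. It is written so that it compiles as C++, the language used in the rest of the thread; `Point3` and `makeOrigin` are hypothetical names chosen for illustration.

```cpp
#include <cassert>

// Hypothetical plain struct standing in for "something that is not a basic type".
struct Point3 {
    double x, y, z;
};

// The caller owns the storage; the function only fills it in.
// Returns false when no destination was supplied.
bool makeOrigin(Point3* out) {
    if (out == nullptr) return false;
    out->x = 0.0;
    out->y = 0.0;
    out->z = 0.0;
    return true;
}
```

The caller decides where the result lives, for example `Point3 p; makeOrigin(&p);` for stack storage, or a heap-allocated `Point3` passed the same way.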
Nov 20 '08 #9

P: 17
Hi again. Sorry for the delay, I wanted to do some research before I felt like I could provide any useful feedback. By talking to colleagues I've found out that the stack is usually located in the L1 cache in the CPU, and that this abstraction's functionality is thus hardware bound. I've never really considered that, I thought it was merely a way of handling automatic memory.

@weaknessforcats
This I simply don't understand. I always assumed that a stack is linear (I'm thinking stack traces here). How could the subroutine's stack frame be anywhere else but on top of the caller's stack frame? Could this confusion be because I talked about two stacks before, when I really meant one stack and two frames (I've read up on it a bit now)?

Besides, C++ doesn't work that way and never will so this is all academic.
Very true. But I feel like I need to understand the logic behind why things work like they do before I can understand how to become a good programmer. And this question has puzzled me for a very long time.

What's this supposed to mean? I see nothing holy about the stack other than
it allows programmers to avoid cleaning up after themselves.
I always thought that the heap was meant to be used for situations when you want to bypass automatic memory, which I thought was supposed to be very unusual. What I mean is that if you use stack memory, you will never get any memory leaks, so you should stick with this as much as possible, until you need to make an exception, and then spend effort at those places to make sure that you don't make mistakes.

Besides, I suspect you are talking about the CRT heap. I assume you are aware that many programs use a private process heap and don't use the CRT heap at all.
I did not imply any specific heap.

As I said C++ doesn't work this way.
I believe you, I would just like to understand why it was designed like it is. The stack abstraction is even built into the x86 architecture, so there have to be some good reasons behind it :) I just don't understand it, and I believe that I cannot be a good C/C++ programmer before I understand this.
Nov 25 '08 #10

P: 17
Banfa, thank you for your elaborate post.

@Banfa
I agree with you that this is the case. But with a small modification, it would work. One could check if the return value is replacing any old objects, and throw these orphans away. This might require deleting parts of the stack (random access), and might not be implemented for hardware reasons, I don't know. But as far as I see it, it could work. Or am I missing something?

Why do this when there are already plenty of good ways to return data from a function without lots of copying that don't introduce any chance of destroying the stack.
Mostly because I feel that the alternatives feel like workarounds and that I don't understand the logic behind when and why I need to use which method. I guess I would like to first get a good answer to what we are talking about now, and when I understand that, I would like some guide to memory management with logical reasoning behind it. Perhaps I need some book, take some courses in computer hardware or just learn assembly, but I feel like I want to understand why C++ is like it is, it just doesn't make sense to me right now.

BTW in C the answer was very simple. Don't return structures from a function! That is because too many C compilers did it in a manner that was not thread safe. So in C, best practice is that if you want to return something that is not a basic type (and a pointer is a basic type), then the calling function should pass a pointer into the called function giving the location in which to store the data. Allocating the data then becomes the responsibility of the calling function and it may use the stack or the heap or any other data segment as appropriate.
This approach is sometimes acceptable, but it does require the caller to know exactly what comes back, and doesn't allow you to make a "template function" abstraction. Not that this is the most important thing in the world, but I think that that consequence is merely a symptom indicating that something is wrong with that idiom. It would also be interesting to know why C compilers decided to return structs in a non-thread-safe way, in case that gives any hints as to why the calling convention and call stacks are implemented the way they are.
Nov 25 '08 #11

Banfa
Expert Mod 5K+
P: 8,916
@disown
Yes, that is the way stacks generally work. You see, you don't get random access to a stack; you can't deallocate bits of it. When a function is called, a stack frame is created on the stack. The stack frame consists of everything required to exit from the function plus all the data for any variables in the called function with automatic storage duration. "Everything required to exit from the function" is normally a copy of some or all of the processor registers (or at least the general-purpose ones), plus the instruction pointer value giving the address to return to, plus the stack pointer of the previous stack frame, plus the size of this stack frame. When the function exits, the entire stack frame is removed.

The stack is not random access; it is allocated contiguously. That is, there are no gaps in the stack: all the data below the current stack frame consists of stack frames of earlier function calls, and all the space above the current stack frame is free.

So you cannot just replace one object with another. You would have to copy the object to the new location; otherwise you are going to have to subsume a large chunk of the stack frame of the called function into the calling function. As already stated, anything that grows the stack frame is a recipe for disaster.

You cannot avoid a copy when returning by value.

@disown
I did mention that that was C best practice not C++.
Nov 25 '08 #12

P: 17
I get your point about fragmentation and stacks being contiguous, and also that the new variable needs to be exactly the same size if you want to do an overwrite.

_But_ if you would extend the stack with a random access range delete (think of pulling out a plate from the middle of the stack), it would work. If simple variable 'a' refers to item 5 in the stack, and 'a' is assigned the value of a recently pushed return value (a=f()), item number 5 can be zapped (no longer referenced), and 'a' can be "re-pointed" to item number 1 in the stack.

I realize that compilers would need to be aware that stack items can disappear, but I believe that it wouldn't be that complex, since you always delete things that are currently in scope (since you can only assign to things currently in scope).

Now the real question is whether the stack needs to be stored physically contiguously in memory (if the stack is implemented as a linked-list-like structure, there would be almost no overhead in random access delete). If it is important for the stack to be physically contiguous, it would probably be possible to implement the stack collapse operation in hardware fairly quickly (I can think of shift register implementations that take one clock cycle).
Nov 27 '08 #13
