Getter returning strings by value -- good or bad?

Hi,

I was wondering why library implementors often
make getter functions return strings by value
(copies). For example, in boost::filesystem the
leaf() function returns an std::string by value.
So does Gnome::Vfs::FileInfo::get_name().

Isn't that unnecessary overhead? I could as well
return by reference to const-string and avoid
copying the string, right? That's what I always
do, and it works quite well.
Okay, for strings it's maybe not that much of a
performance hit, but with more complex objects I'd
reckon it's a bad idea to return copies.

I'm asking because I have this scenario:
I have a class called File, which defines an
interface for accessing information about a file.
Its state is mostly defined by a
Gnome::Vfs::FileInfo object.
Many functions in the File interface are just
wrappers; they delegate calls to the encapsulated
FileInfo object.

One such example is a function get_name(), which
returns the name of a file. It's implemented like
this:

Glib::ustring File::get_name() const {
    return m_finfo_ptr->get_name();
}

I previously wanted to implement it like this:

const Glib::ustring& File::get_name() const {
    return m_finfo_ptr->get_name();
}

That however issues a warning, because I'm
referencing a temporary (FileInfo::get_name()
returns a copy of the name-string, that's actually
why this question came up).

But back to the first version:

Glib::ustring File::get_name() const {
    return m_finfo_ptr->get_name();
}

How many copies are created here?
FileInfo::get_name() creates a copy, and I'm then
returning a copy of that copy, right? That's terrible!

Please, someone shed some light on this.

Regards,
Matthias
Jul 23 '05 #1
Matthias Kaeppler wrote:
I was wondering why library implementors often make getter functions
return strings by value (copies). For example, in boost::filesystem the
leaf() function returns an std::string by value. So does
Gnome::Vfs::FileInfo::get_name().

Isn't that unnecessary overhead? I could as well return by reference to
const-string and avoid copying the string, right? [...]


What would that be a reference to, and how is it going to fare in a
multi-threaded program? Returning a string by value is perfectly fine from
a member function called for a temporary, but returning a reference
most likely isn't...

Compilers can optimize but they can't fix broken code for you. Keep that
in mind :-)

V
Jul 23 '05 #2
Victor Bazarov wrote:
What would that be a reference to, and how is it going to fare in a
multi-threaded program? Returning a string by value is perfectly fine from
a member function called for a temporary, but returning a reference
most likely isn't...
Well, it would be a reference to an object which
already exists. Like this:

class FileInfo {
private:
    Glib::ustring name;
public:
    const Glib::ustring& get_name() const {
        return name;
    }
};

I don't see why this is a problem in multithreaded
environments. The reference returned is a
reference to const, so you can't write it anyway;
you don't need a lock on this object and several
threads may read it at will without breaking
integrity.
Compilers can optimize but they can't fix broken code for you. Keep that
in mind :-)
I don't see where the code above is broken.

V

Jul 23 '05 #3
Matthias Kaeppler wrote:
Victor Bazarov wrote:
What would that be a reference to, and how is it going to fare in a
multi-threaded program? Returning a string by value is perfectly fine from
a member function called for a temporary, but returning a reference
most likely isn't...

Well, it would be a reference to an object which already exists. Like this:

class FileInfo {
private:
    Glib::ustring name;
public:
    const Glib::ustring& get_name() const {
        return name;
    }
};

I don't see why this is a problem in multithreaded environments. The
reference returned is a reference to const, so you can't write it
anyway; you don't need a lock on this object and several threads may
read it at will without breaking integrity.


Yes, but if one of the threads wants to write to it, the readers would
have to wait, no? But if they decided to use the reference without
locking it, what's going to happen if somebody changes the contents of
that value in some writer thread?
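One way to sidestep the problem raised here is to let the getter copy the string while a lock is held, so each caller receives a private snapshot that a later write cannot touch. The following is only a minimal sketch, assuming C++11; `SharedFileInfo` and `set_name` are hypothetical names (not the real Gnome::Vfs::FileInfo API), and std::string stands in for Glib::ustring.

```cpp
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a FileInfo-like class shared between threads.
class SharedFileInfo {
public:
    explicit SharedFileInfo(std::string name) : name_(std::move(name)) {}

    // Returns a copy made while the lock is held; the caller's string
    // cannot be changed by a later set_name() in another thread.
    std::string get_name() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return name_;
    }

    // Replaces the stored name under the same lock.
    void set_name(std::string name) {
        std::lock_guard<std::mutex> lock(mutex_);
        name_ = std::move(name);
    }

private:
    mutable std::mutex mutex_;  // mutable so that const getters can lock it
    std::string name_;
};
```

A getter returning `const std::string&` could take the same lock, but the lock would already be released by the time the caller reads through the reference, which is the situation described above.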
Compilers can optimize but they can't fix broken code for you. Keep that
in mind :-)

I don't see where the code above is broken.


The code above doesn't do anything hence it can't be broken. Now, imagine
somebody writes

std::string const& sr = someclass().somememberreturningaconstref();

a temporary is gone and 'sr' has become invalid as soon as it was defined.
It's more difficult if you return a string by value.
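The lifetime issue can be made visible without ever reading the dangling reference. The sketch below uses a hypothetical `Holder` class that counts its live instances; all names are invented for illustration and std::string stands in for any string class.

```cpp
#include <string>

// Hypothetical class that counts how many instances are currently alive.
struct Holder {
    static int alive;
    std::string name_;

    Holder() : name_("report.txt") { ++alive; }
    Holder(const Holder& other) : name_(other.name_) { ++alive; }
    ~Holder() { --alive; }

    const std::string& name_ref() const { return name_; }  // reference into *this
    std::string name_copy() const { return name_; }        // independent copy
};

int Holder::alive = 0;

// Binds a reference to a member of a temporary and reports how many
// Holder objects are still alive afterwards. The reference is never read,
// because reading it would be undefined behaviour.
int alive_after_binding_reference() {
    const std::string& sr = Holder().name_ref();
    (void)sizeof(sr);      // unevaluated; only silences the unused warning
    return Holder::alive;  // the temporary Holder is already destroyed
}

// Same call chain, but the getter returns by value. Binding the returned
// temporary to a const reference extends its lifetime to the end of the scope.
std::string safe_with_value() {
    const std::string& sv = Holder().name_copy();
    return sv;             // valid: sv refers to a lifetime-extended temporary
}
```

In the first function the `Holder` temporary dies at the end of the full expression, so `sr` refers to a destroyed member. In the second, the returned string is itself the temporary bound to `sv`, and the language keeps it alive for as long as `sv` is in scope.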

V
Jul 23 '05 #4
Matthias Kaeppler wrote:
I was wondering why library implementors often
make getter functions return strings by value
(copies).
Sometimes because they assume a reference-counted string class.
For example, in boost::filesystem the
leaf() function returns an std::string by value.
Boost is 'off topic'. They just imitate the inefficient STL-style where
everything is 'by value'.
So does Gnome::Vfs::FileInfo::get_name().
Isn't that unnecessary overhead?
Yes, if the string is duplicated. No, if the string is ref-counted.
I could as well
return by reference to const-string and avoid
copying the string, right?
Yes, if it's possible (i.e. if you have an object for which you can
return a reference).
That's what I always
do, and it works quite well.
Okay, for strings it's maybe not that much of a
performance hit, but with more complex objects I'd
reckon it's a bad idea to return copies.

I'm asking because I have this scenario:
I have a class called File, which defines an
interface for accessing information about a file.
Its state is mostly defined by a
Gnome::Vfs::FileInfo object.
Many functions in the File interface are just
wrappers; they delegate calls to the encapsulated
FileInfo object.

One such example is a function get_name(), which
returns the name of a file. It's implemented like
this:

Glib::ustring File::get_name() const {
    return m_finfo_ptr->get_name();
}

I previously wanted to implement it like this:

const Glib::ustring& File::get_name() const {
    return m_finfo_ptr->get_name();
}

That however issues a warning, because I'm
referencing a temporary (FileInfo::get_name()
returns a copy of the name-string, that's actually
why this question came up).

But back to the first version:

Glib::ustring File::get_name() const {
    return m_finfo_ptr->get_name();
}

How many copies are created here?
FileInfo::get_name() creates a copy, and I'm then
returning a copy of that copy, right? That's terrible!


You are right. But there is a huge difference between what is described
in tutorials/books and what is used in the real world. Do you know if
Glib::ustring is ref-counted? If not then the implementation is
probably 'suboptimal'.

Jul 23 '05 #5
Ian
Matthias Kaeppler wrote:

Well, it would be a reference to an object which already exists. Like this:

class FileInfo {
private:
    Glib::ustring name;
public:
    const Glib::ustring& get_name() const {
        return name;
    }
};
Why would you want to return a reference in a more general case? The
example you quote is fine, but on some systems it may be more efficient
to return by value.
I don't see why this is a problem in multithreaded environments. The
reference returned is a reference to const, so you can't write it
anyway; you don't need a lock on this object and several threads may
read it at will without breaking integrity.

In this trivial case, no. But in situations where the string is
constructed in the function, maybe using a member variable, you hit
problems.
Compilers can optimize but they can't fix broken code for you. Keep that
in mind :-)

I don't see where the code above is broken.

As a trivial case that can get inlined away, it isn't!

Ian
Jul 23 '05 #6
Panjandrum wrote:
Matthias Kaeppler wrote:
I was wondering why library implementors often
make getter functions return strings by value
(copies).

Sometimes because they assume a reference-counted string class.


That's a good point, I didn't think about that. Actually I don't know if
it's ref-counted. I'll look into that.
How many copies are created here?
FileInfo::get_name() creates a copy, and I'm then
returning a copy of that copy, right? That's terrible!

You are right. But there is a huge difference between what is described
in tutorials/books and what is used in the real world. Do you know if
Glib::ustring is ref-counted? If not then the implementation is
probably 'suboptimal'.


But the compiler will optimize away the second copy, right? (assuming
the worst case, that the string is NOT ref-counted)?
I mean, it doesn't look like I have a choice here.

--
Matthias Kaeppler
Jul 23 '05 #7
Victor Bazarov wrote:

Yes, but if one of the threads wants to write to it, the readers would
have to wait, no? But if they decided to use the reference without
locking it, what's going to happen if somebody changes the contents of
that value in some writer thread?

Okay, you have a point here.

But: If the string should be indeed reference counted, doesn't that mean
all those problems are still there? The "copy" returned by the getter
would probably have a pointer to the string data which it shares with
another string object and this data could at the same time be accessed
by a second thread--because locking the object itself doesn't do
anything. Both objects are internally pointing to the same string data,
but they are still different objects!
That seems to be even worse than returning by reference.
The code above doesn't do anything hence it can't be broken. Now, imagine
somebody writes

std::string const& sr = someclass().somememberreturningaconstref();

a temporary is gone and 'sr' has become invalid as soon as it was defined.
It's more difficult if you return a string by value.


Not sure what you want to tell me here. Why would sr be invalid? sr is
initialized with the string returned by the member function which
returns a reference to it. After that line, working with sr is just like
working with the "real" string, just that you have read-only access.
Nothing is invalid here.

--
Matthias Kaeppler
Jul 23 '05 #8
Ian wrote:
Why would you want to return a reference in a more general case? The
example you quote is fine, but on some systems it may be more efficient
to return by value.
Under what circumstances would it be more efficient to copy-construct an
object than to return a reference to it? ^^
As a trivial case that can get inlined away, it isn't!


Hm. What would happen if the code can't get inlined?

--
Matthias Kaeppler
Jul 23 '05 #9
Ian
Matthias Kaeppler wrote:
Ian wrote:
Why would you want to return a reference in a more general case? The
example you quote is fine, but on some systems it may be more
efficient to return by value.

Under what circumstances would it be more efficient to copy-construct an
object than to return a reference to it? ^^

Consider a reference-counted string: it could have a size of 8 bytes
(two pointers). Eight bytes is 64 bits, which is a single register on a
number of machines. Compilers often reserve one or more registers for
function returns. Some architectures use register wheels, so the string
could be in a single out register.
As a trivial case that can get inlined away, it isn't!

Hm. What would happen if the code can't get inlined?

OK, inlined away and returns a constant - the last bit being the
important bit.

Ian
Jul 23 '05 #10
Ian
Matthias Kaeppler wrote:
Victor Bazarov wrote:

Yes, but if one of the threads wants to write to it, the readers would
have to wait, no? But if they decided to use the reference without
locking it, what's going to happen if somebody changes the contents of
that value in some writer thread?


Okay, you have a point here.

But: If the string should be indeed reference counted, doesn't that mean
all those problems are still there? The "copy" returned by the getter
would probably have a pointer to the string data which it shares with
another string object and this data could at the same time be accessed
by a second thread--because locking the object itself doesn't do
anything. Both objects are internally pointing to the same string data,
but they are still different objects!
That seems to be even worse than returning by reference.

That's where copy-on-write comes in. The same representation will be
passed around until something changes it. At this time, the
representation will be cloned and then modified.
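The mechanism described above can be sketched in a few lines. This is an illustration only, assuming C++11: `CowString` is a hypothetical class, it makes no claim about how Glib::ustring is implemented, and it is not thread-safe. Note also that std::string implementations conforming to C++11 or later are not copy-on-write.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical copy-on-write string; illustration only, not thread-safe.
class CowString {
public:
    explicit CowString(std::string text)
        : rep_(std::make_shared<std::string>(std::move(text))) {}

    // The implicitly generated copy constructor only copies the shared_ptr,
    // so a copy shares its representation with the original.

    const std::string& str() const { return *rep_; }

    bool shares_with(const CowString& other) const { return rep_ == other.rep_; }

    // Any mutating operation first detaches from a shared representation.
    void append(const std::string& suffix) {
        if (rep_.use_count() > 1) {
            rep_ = std::make_shared<std::string>(*rep_);  // clone before writing
        }
        rep_->append(suffix);
    }

private:
    std::shared_ptr<std::string> rep_;
};
```

Copying a `CowString` costs one reference-count increment. The character data is duplicated only when `append` is called on an object that still shares its representation.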
The code above doesn't do anything hence it can't be broken. Now,
imagine
somebody writes

std::string const& sr = someclass().somememberreturningaconstref();

a temporary is gone and 'sr' has become invalid as soon as it was
defined.
It's more difficult if you return a string by value.

Not sure what you want to tell me here. Why would sr be invalid? sr is
initialized with the string returned by the member function which
returns a reference to it. After that line, working with sr is just like
working with the "real" string, just that you have read-only access.
Nothing is invalid here.

I think Victor was thinking of a member returning a local string variable.

Ian
Jul 23 '05 #11
Matthias Kaeppler wrote:
But: If the string should be indeed reference counted, doesn't that mean
all those problems are still there? The "copy" returned by the getter
would probably have a pointer to the string data which it shares with
another string object and this data could at the same time be accessed
by a second thread--because locking the object itself doesn't do
anything. Both objects are internally pointing to the same string data,
but they are still different objects!
That seems to be even worse than returning by reference.


Whether the string class is ref counted or not is irrelevant to the
client, and if it is ref counted then it's the class's responsibility
to ensure that the references are not messed up. As the client you
should read the documentation to see what guarantees the class makes
about thread safety and so forth.

Jul 23 '05 #12
Matthias Kaeppler wrote:
Panjandrum wrote:
Do you know if
Glib::ustring is ref-counted? If not then the implementation is
probably 'suboptimal'.
But the compiler will optimize away the second copy, right?


Maybe, maybe not. Still one needless heap allocation.
(assuming
the worst case, that the string is NOT ref-counted)?
I mean, it doesn't look like I have a choice here.


It also depends on how often you copy the string. I wouldn't care for a
few dozens superfluous copies (that's the abstraction penalty of C++).
OTOH, things are probably different if you want to create and sort a
large vector<Glib::ustring>.

Jul 23 '05 #13

Matthias Kaeppler wrote:
Ian wrote:
Why would you want to return a reference in a more general case? The
example you quote is fine, but on some systems it may be more efficient
to return by value.


Under what circumstances would it be more efficient to copy-construct an
object than to return a reference to it? ^^
As a trivial case that can get inlined away, it isn't!


Hm. What would happen if the code can't get inlined?

--
Matthias Kaeppler


Inlined or not, the getter function can be optimized with the Return
Value Optimization or the Named Return Value Optimization (RVO and
NRVO, respectively), which eliminate the temporary and the copy and
construct the returned object in place at the call site.
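The effect on the wrapper from the original question can be measured with a type that counts its own copies. The sketch below assumes C++17, where eliding the copy of a returned temporary is guaranteed (earlier standards merely permit it). `Tracked`, `Info` and `File` are hypothetical stand-ins for Glib::ustring, Gnome::Vfs::FileInfo and the poster's wrapper class.

```cpp
#include <string>
#include <utility>

// Hypothetical string wrapper that counts copy and move constructions.
struct Tracked {
    static int copies;
    static int moves;
    std::string text;

    explicit Tracked(std::string t) : text(std::move(t)) {}
    Tracked(const Tracked& other) : text(other.text) { ++copies; }
    Tracked(Tracked&& other) noexcept : text(std::move(other.text)) { ++moves; }
};

int Tracked::copies = 0;
int Tracked::moves = 0;

// Stand-in for the wrapped info object: the getter returns by value.
class Info {
public:
    explicit Info(std::string name) : name_(std::move(name)) {}
    Tracked get_name() const { return name_; }  // copies the member once
private:
    Tracked name_;
};

// Stand-in for the File wrapper from the original post.
class File {
public:
    explicit File(std::string name) : info_(std::move(name)) {}
    Tracked get_name() const { return info_.get_name(); }  // forwards the temporary
private:
    Info info_;
};

// Returns the number of copy constructions caused by one File::get_name() call.
int copies_for_one_call() {
    File file("notes.txt");
    Tracked::copies = 0;
    Tracked::moves = 0;
    Tracked name = file.get_name();
    (void)name;
    return Tracked::copies;
}
```

Under the stated assumption, `copies_for_one_call()` reports a single copy: the one that duplicates the stored member inside `Info::get_name()`. The wrapper layer and the initialization of `name` add no further copies or moves.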

In fact, returning by value can often be more efficient than the usual
alternative of having the client pass a reference to an object to use
to hold the result. The drawback with this approach:
    void getString(std::string& outResult) const;
is the inconvenient syntax that forces the client to declare a
variable, even in the case that the result is to be passed onto another
function. One workaround is for the function to return a reference to
the supplied object:
    std::string& getString(std::string& outResult) const;
and while an improvement in some ways, it makes the overall interface
more complicated than needed.

But once the RVO optimization is applied, neither of those two routines
is as efficient as the most natural declaration:
    std::string getString() const;
Returning by value need not be inefficient, as many still think.
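The difference in convenience between the three signatures shows up at the call site. A rough sketch, using a hypothetical `Record` class and a hypothetical consumer `length_of`; the member names are altered so that all three variants can coexist in one class.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical class offering the three getter styles discussed above.
class Record {
public:
    explicit Record(std::string s) : value_(std::move(s)) {}

    // Style 1: the caller supplies the object that receives the result.
    void getStringInto(std::string& outResult) const { outResult = value_; }

    // Style 2: same, but the supplied object is returned for chaining.
    std::string& getStringChained(std::string& outResult) const {
        outResult = value_;
        return outResult;
    }

    // Style 3: plain return by value.
    std::string getString() const { return value_; }

private:
    std::string value_;
};

// Hypothetical function that consumes the result.
std::size_t length_of(const std::string& s) { return s.size(); }

std::size_t via_out_param(const Record& r) {
    std::string tmp;                  // caller must declare a variable first
    r.getStringInto(tmp);
    return length_of(tmp);
}

std::size_t via_chained(const Record& r) {
    std::string tmp;                  // the variable is still required
    return length_of(r.getStringChained(tmp));
}

std::size_t via_value(const Record& r) {
    return length_of(r.getString());  // no named temporary needed
}
```

All three helpers compute the same result; only the by-value version passes it on in a single expression.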

A class object should certainly never return a non-const reference to
one of its own data members. Doing so essentially allows clients to
change its state behind its back, and pretty much renders useless any
sort of encapsulation the class was meant to provide. One may as well
make the data member a global variable at that point.
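A small sketch of how a non-const reference defeats a class invariant. `Entry` and its member functions are hypothetical; the assumed invariant is that the stored name is never empty.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical class whose invariant is "the name is never empty".
class Entry {
public:
    explicit Entry(std::string name) { set_name(std::move(name)); }

    // The only sanctioned way to change the name; enforces the invariant.
    void set_name(std::string name) {
        if (name.empty()) throw std::invalid_argument("name must not be empty");
        name_ = std::move(name);
    }

    const std::string& name() const { return name_; }  // read-only access
    std::string& name_unchecked() { return name_; }    // leaks mutable state

private:
    std::string name_;
};

// Shows that the invariant can be broken through the non-const reference.
bool invariant_broken_via_reference() {
    Entry e("a.txt");
    e.name_unchecked().clear();  // bypasses the check in set_name()
    return e.name().empty();
}

// Shows that the setter rejects the same change and leaves the name intact.
bool setter_rejects_empty() {
    Entry e("a.txt");
    try {
        e.set_name("");
    } catch (const std::invalid_argument&) {
        return e.name() == "a.txt";
    }
    return false;
}
```

Both functions return true: the setter refuses an empty name, while the non-const reference lets a client empty it without the class noticing.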

Greg

Jul 23 '05 #14
