How is string::c_str() usually implemented?

I'm curious about the performance of string::c_str,
so I'm wondering how it's commonly implemented. Do
most std::string implementations just keep an extra
char allocated for the NULL termination so they can
return a pointer to their internal buffer, or are
they equally likely to create a new buffer on demand?
I know the standard doesn't require any particular
implementation, which is why I'm curious if there is
a consensus among implementations.
Jul 22 '05 #1
On Tue, 25 May 2004 10:18:43 -0400, Derek <us**@nospam.org> wrote:
I'm curious about the performance of string::c_str,
so I'm wondering how it's commonly implemented. Do
most std::string implementations just keep an extra
char allocated for the NULL termination so they can
return a pointer to their internal buffer, or are
they equally likely to create a new buffer on demand?
I know the standard doesn't require any particular
implementation, which is why I'm curious if there is
a consensus among implementations.


I looked at about 5, and they all either return the pointer directly, or
first do a conditional test and return one of several pointers (I suspect
this has to do with the small string optimization). But none of the ones
I've looked at go creating a new buffer. I don't see how they /could/,
because it would change the dynamics of the pointer returned. The caller
would become responsible for deleting the memory, rendering code using that
library implementation incompatible with code from "normal"
implementations.

Looking up the Standard's spec for c_str (21.3.6), I didn't see anything
that explicitly disallows allocating a copy (I perhaps just don't know
where to look), but the way it says the pointer will become invalidated
after any call to a non-const string member function (on the associated
string) pretty much tells the tale.
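
The "conditional test" mentioned above might look roughly like the sketch below. It is only an illustration of the small string optimization idea, not code from any of the libraries examined; the class name SmallString, the 16-byte inline capacity, and the is_inline() helper are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical toy class illustrating the small string optimization.
// Short strings live in an inline array; longer ones live on the heap.
class SmallString {
public:
    explicit SmallString(const char* text) : len_(std::strlen(text)) {
        if (len_ < kInlineCap) {
            std::memcpy(inline_, text, len_ + 1);   // copy text plus terminator
        } else {
            heap_.assign(text, text + len_ + 1);    // copy text plus terminator
        }
    }

    // One conditional test, then return one of two pointers. No allocation.
    const char* c_str() const {
        return len_ < kInlineCap ? inline_ : heap_.data();
    }

    bool is_inline() const { return len_ < kInlineCap; }

private:
    static constexpr std::size_t kInlineCap = 16;   // assumed capacity
    std::size_t len_;
    char inline_[kInlineCap];
    std::vector<char> heap_;
};
```

Either branch hands back a pointer the string already owns, which is why the call costs next to nothing in such a design.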
-leor
--
Leor Zolman --- BD Software --- www.bdsoft.com
On-Site Training in C/C++, Java, Perl and Unix
C++ users: download BD Software's free STL Error Message Decryptor at:
www.bdsoft.com/tools/stlfilt.html
Jul 22 '05 #2
On Tue, 25 May 2004 10:18:43 -0400, Derek wrote:
I'm curious about the performance of string::c_str, so I'm wondering how
it's commonly implemented. Do most std::string implementations just
keep an extra char allocated for the NULL termination so they can return
a pointer to their internal buffer, or are they equally likely to create
a new buffer on demand? I know the standard doesn't require any
particular implementation, which is why I'm curious if there is a
consensus among implementations.


Here's from the GCC-3.4 C++ standard library basic_string.h:
// _Rep: string representation
//   Invariants:
//   1. String really contains _M_length + 1 characters; last is set
//      to 0 only on call to c_str().  We avoid instantiating
//      _CharT() where the interface does not require it.

I believe other implementations also keep an extra byte around but I'm not
sure if they keep it set to 0 or not.
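
A minimal sketch of the lazy-terminator idea that comment describes, using a hypothetical LazyString class (this is not the libstdc++ code; the names and the '?' filler value are assumptions for illustration). The buffer always has one spare slot, but the terminator is only written when c_str() is called.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical toy class, NOT the GCC implementation.
// Invariant: buf_.size() == len_ + 1; the last slot is unspecified
// until c_str() writes the terminator into it.
class LazyString {
public:
    LazyString() : buf_(1, '?'), len_(0) {}   // spare slot, deliberately not '\0'

    void push_back(char c) {
        buf_[len_] = c;        // reuse the spare slot for the new character
        buf_.push_back('?');   // keep one spare slot; no terminator written here
        ++len_;
    }

    std::size_t size() const { return len_; }

    const char* c_str() const {
        buf_[len_] = '\0';     // the terminator is written only here
        return buf_.data();
    }

private:
    mutable std::vector<char> buf_;   // mutable so c_str() can stay const
    std::size_t len_;
};
```

The buffer has to be mutable (or the write has to go through a cast) because c_str() is a const member function.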

-Ryan Mack
email: [first letter of first name][last name]@[last name]man.net
Jul 22 '05 #3
Leor Zolman wrote:

I looked at about 5, and they all either return the pointer directly, or
first do a conditional test and return one of several pointers (I suspect
this has to do with the small string optimization). But none of the ones
I've looked at go creating a new buffer. I don't see how they /could/,
because it would change the dynamics of the pointer returned. The caller
would become responsible for deleting the memory, rendering code using that
library implementation incompatible with code from "normal"
implementations.

Looking up the Standard's spec for c_str (21.3.6), I didn't see anything
that explicitly disallows allocating a copy (I perhaps just don't know
where to look), but the way it says the pointer will become invalidated
after any call to a non-const string member function (on the associated
string) pretty much tells the tale.


The basic_string object can delete the buffer at those points if it
allocated one, or it can wait until the next call to c_str or until
destruction. Not that anybody does it that way, of course. But the idea
was that basic_string could be implemented to hold its text in
non-contiguous chunks, and only be required to gather their contents
together on a call to c_str.
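
As a rough sketch of that design, assuming a hypothetical ChunkedString class (no real library is known to work this way): the text is stored in pieces, and a contiguous, null-terminated copy is built only when c_str() is called. The string keeps ownership of the copy, so the caller never deletes anything.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical toy class. std::string is used for the chunks only to
// keep the example short.
class ChunkedString {
public:
    void append(const std::string& piece) {
        chunks_.push_back(piece);
        flat_valid_ = false;   // cached copy is out of date; rebuilt on next c_str()
    }

    const char* c_str() const {
        if (!flat_valid_) {
            flat_.clear();
            for (const std::string& c : chunks_) {
                flat_.insert(flat_.end(), c.begin(), c.end());
            }
            flat_.push_back('\0');   // gather, then terminate
            flat_valid_ = true;
        }
        return flat_.data();         // owned by the string, not the caller
    }

private:
    std::vector<std::string> chunks_;
    mutable std::vector<char> flat_;     // gathered copy, built on demand
    mutable bool flat_valid_ = false;
};
```

A pointer obtained before append() must not be used afterwards, which matches the invalidation rule for non-const member functions quoted earlier.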

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Jul 22 '05 #4
Leor Zolman wrote:
I'm curious about the performance of string::c_str, so
I'm wondering how it's commonly implemented. Do most
std::string implementations just keep an extra char
allocated for the NULL termination so they can return a
pointer to their internal buffer, or are they equally
likely to create a new buffer on demand? I know the
standard doesn't require any particular implementation,
which is why I'm curious if there is a consensus among
implementations.
I looked at about 5, and they all either return the
pointer directly, or first do a conditional test and
return one of several pointers (I suspect this has to do
with the small string optimization).


Thanks for taking the time. I appreciate it.
But none of the ones I've looked at go creating a
new buffer. I don't see how they /could/, because it
would change the dynamics of the pointer returned.
The caller would become responsible for deleting the
memory, rendering code using that library implementation
incompatible with code from "normal" implementations.
Looking up the Standard's spec for c_str (21.3.6), I
didn't see anything that explicitly disallows allocating
a copy (I perhaps just don't know where to look), but the
way it says the pointer will become invalidated after
any call to a non-const string member function (on the
associated string) pretty much tells the tale.


I think a new buffer can be returned without transferring
ownership to the caller. (Though I can't really imagine
why an implementation would want to do this.) As long as
the string keeps an internal pointer to the buffer, the
string can be responsible for ownership and the caller
doesn't have to worry about deleting it. The string
could delete the c_str() buffer according to some policy
of its choosing, like when a non-const member is called.

Of course this scheme would require an extra pointer in the
string, memory allocation on c_str(), copying, and other
inefficiencies, but I think it's allowed. (Though I'm glad
most implementations seem to take a more efficient approach.)
Jul 22 '05 #5
"Leor Zolman" <le**@bdsoft.com> wrote in message
I looked at about 5, and they all either return the pointer directly, or


My version returns the pointer directly. But I think an implementation
could instead set the null char inside c_str() and then return the pointer.
That way calls to string+=c just append 1 char, like vector.push_back(c),
and don't set the null char over and over again.
Jul 22 '05 #6
On Tue, 25 May 2004 11:13:12 -0400, Pete Becker <pe********@acm.org> wrote:
Leor Zolman wrote:

I looked at about 5, and they all either return the pointer directly, or
first do a conditional test and return one of several pointers (I suspect
this has to do with the small string optimization). But none of the ones
I've looked at go creating a new buffer. I don't see how they /could/,
because it would change the dynamics of the pointer returned. The caller
would become responsible for deleting the memory, rendering code using that
library implementation incompatible with code from "normal"
implementations.

Looking up the Standard's spec for c_str (21.3.6), I didn't see anything
that explicitly disallows allocating a copy (I perhaps just don't know
where to look), but the way it says the pointer will become invalidated
after any call to a non-const string member function (on the associated
string) pretty much tells the tale.


The basic_string object can delete the buffer at those points if it
allocated one, or it can wait until the next call to c_str or until
destruction. Not that anybody does it that way, of course. But the idea
was that basic_string could be implemented to hold its text in
non-contiguous chunks, and only be required to gather their contents
together on a call to c_str.


Ah yes, of course it /could/ be set up that way, and it does help to
understand that the Standard allows for this. But it also does seem that
the practical concern most folks run up against when first exposed to
c_str() runs along the lines of wanting to know its potential cost under
the real platform in use; empirically, the call is an effective freebie.
-leor

--
Leor Zolman --- BD Software --- www.bdsoft.com
Jul 22 '05 #7
On Tue, 25 May 2004 11:27:45 -0400, Derek <us**@nospam.org> wrote:
I think a new buffer can be returned without transferring
ownership to the caller. (Though I can't really imagine
why an implementation would want to do this.) As long as
the string keeps an internal pointer to the buffer, the
string can be responsible for ownership and the caller
doesn't have to worry about deleting it. The string
could delete the c_str() buffer according to some policy
of its choosing, like when a non-const member is called.

Of course this scheme would require an extra pointer in the
string, memory allocation on c_str(), copying, and other
inefficiencies, but I think it's allowed. (Though I'm glad
most implementations seem to take a more efficient approach.)


It indeed does not sound useful, or at all in the spirit of C/C++ to be
doing all that work when it usually would not be necessary. That's probably
why doing it that way didn't even occur to me. If the caller wants a
mutable version of the text out of a c_str() call, they just make the copy
themselves. No muss, no fuss, no unsolicited overhead.
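
For instance, a caller who needs a mutable buffer could do something like the following. This is just one way to make the copy; the helper name mutable_copy is made up for the example.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: copy the text, terminator included, into a
// buffer that the caller owns and is free to modify.
std::vector<char> mutable_copy(const std::string& s) {
    const char* p = s.c_str();
    return std::vector<char>(p, p + s.size() + 1);   // +1 for the '\0'
}
```

The caller can then write through buf.data() as much as it likes; the original string is untouched and the vector frees the memory automatically.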

It seems to me that none of the approaches that make an "unnecessary" copy
of the buffer (the only "necessary" case being the one Pete outlined) would
really be consistent with the "const char *" return type from c_str()
anyway. I mean, if you wanted to advertise that you were providing a copy
that the caller could futz with, you wouldn't have that "const" there...and
what practical reason would there be for making a copy, other than to
provide a futz-able (tm) copy?
-leor
--
Leor Zolman --- BD Software --- www.bdsoft.com
Jul 22 '05 #8
Leor Zolman <le**@bdsoft.com> spoke thus:
Ah yes, of course it /could/ be set up that way, and it does help to
understand that the Standard allows for this. But it also does seem that
the practical concern most folks run up against when first exposed to
c_str() runs along the lines of wanting to know its potential cost under
the real platform in use; empirically, the call is an effective freebie.


Does "empirically" include implementations as old as 1999? And don't
think I'm talking about my implementation again, I never talk about my
implementation ;)

--
Christopher Benson-Manica | I *should* know what I'm talking about - if I
ataru(at)cyberspace.org | don't, I need to know. Flames welcome.
Jul 22 '05 #9
Leor Zolman wrote:
...
The basic_string object can delete the buffer at those points if it
allocated one, or it can wait until the next call to c_str or until
destruction. Not that anybody does it that way, of course. But the idea
was that basic_string could be implemented to hold its text in
non-contiguous chunks, and only be required to gather their contents
together on a call to c_str.


Ah yes, of course it /could/ be set up that way, and it does help to
understand that the Standard allows for this. But it also does seem that
the practical concern most folks run up against when first exposed to
c_str() runs along the lines of wanting to know its potential cost under
the real platform in use; empirically, the call is an effective freebie.


Unfortunately, in many cases (and I've seen it more than once) once some
folks find out that in practice the pointer returned by 'c_str()' really
points to the beginning of the actual controlled sequence, they proceed
right on to casting away the constness and using the resultant pointer
to modify the stored string. In my opinion, in order to keep beginner
C++ programmers from getting this nasty habit it might be quite useful
to continue to perpetuate the idea that 'c_str()' might return a pointer
to an independently allocated buffer (which is, BTW, at least partially
true in implementations that return a pointer to a static empty-string
literal "" for 'std::string's of zero length).
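
As an illustration of the alternative, a string can be modified in place through its own non-const interface, with no cast on the result of 'c_str()'. The helper below is a made-up example (the name to_upper_in_place is an assumption), not something taken from the thread.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: upper-case a string in place through its
// mutable interface instead of const_cast<char*>(s.c_str()).
void to_upper_in_place(std::string& s) {
    for (char& c : s) {
        // toupper expects a value representable as unsigned char
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
}
```

Because the writes go through the string's own element access, the string stays in charge of its buffer and any pointer previously obtained from 'c_str()' should simply be fetched again afterwards.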

--
Best regards,
Andrey Tarasevich

Jul 22 '05 #10
