sizeof(std::string) seems too small

Dear C++ community,

I have a question regarding the size of C++ std::strings.
Basically, I compiled the following code under two different compilers:

    std::string someString = "Hello, world!";
    int size1 = sizeof(std::string);
    int size2 = sizeof(someString);

and printed out the values of size1 and size2. size1 and size2 always
matched in value (in other words, size1 == size2). That makes sense to
me.

Under the Visual C++ 6.0, size1 and size2 both equalled 16, but
under a GNU C++ compiler (under Linux), size1 and size2 were both 4. I
understand that different compilers are allowed to implement
std::string differently which allows for the differences between the
results of sizeof(std::string) by the different compilers.

What I don't understand is why sizeof(std::string) returns 4 with
any compiler. I mean, a value of 4 just seems too small for me. I
figure that any std::string implementation should have at least a
pointer (which points to the main string), an integer storing the
already allocated space for the main string (whose value gets returned
in the call to std::string::capacity()), and possibly even an integer
storing the length of the string.
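The layout described in this paragraph can be written down as a minimal sketch. The name NaiveString and its members are hypothetical, chosen only for illustration; no real library is claimed to use this exact struct. With a 4-byte pointer and 4-byte integers, as on the system in the question, it would occupy at least 12 bytes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical "obvious" layout: everything stored directly in the object.
struct NaiveString {
    char*       data;      // points to the characters
    std::size_t capacity;  // allocated space, as reported by capacity()
    std::size_t length;    // number of characters in use
};

// The object can never be smaller than the sum of its members.
static_assert(sizeof(NaiveString) >= sizeof(char*) + 2 * sizeof(std::size_t),
              "three members cannot fit into the space of one pointer");

// Returns the minimum number of bytes this layout needs.
constexpr std::size_t naive_minimum_size() {
    return sizeof(char*) + 2 * sizeof(std::size_t);
}
```

So a 4-byte std::string cannot be using this layout; the bookkeeping has to live somewhere other than the object itself.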

Just the pointer alone would take up 4 bytes (I tested it and
sizeof(char*) does indeed equal 4), so I can't see how there could
possibly be any more room for anything else, like the integer that
holds the already allocated space (the one used in
std::string::capacity()). The fact that Visual C++ has a
sizeof(std::string) of 16 makes a lot more sense to me, as it clearly
has enough space to hold these integers.

So my main question is: Assuming that sizeof(char*) equals 4, how
is it possible that sizeof(std::string) can be 4 on any compiler?

Also, shouldn't sizeof(std::string) be AT LEAST sizeof(char*) +
sizeof(unsigned int) ? I'm curious why it isn't on the GNU C++
compiler that I'm using.

Thank you in advance for any responses.

-- Jean-Luc

Sep 28 '05 #1

The library could implement it using the Pimpl idiom (cf.
http://www.gotw.ca/gotw/024.htm):

    class stringImpl;

    class string
    {
    public:
        // Forwarding functions
    private:
        stringImpl* pImpl;
    };

Thus you have only a pointer as a member.
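A slightly fuller sketch of the same idea, assuming C++11 or later. The names String and StringImpl are hypothetical, and backing the implementation with a std::vector<char> is just a convenient choice for the example, not a claim about how any real library does it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

class StringImpl;  // full definition would normally live in a .cpp file

class String {
public:
    explicit String(const char* s);
    ~String();
    String(const String&) = delete;             // copying omitted for brevity
    String& operator=(const String&) = delete;

    // Forwarding functions
    std::size_t size() const;
    std::size_t capacity() const;
    const char* c_str() const;

private:
    StringImpl* pImpl;  // the only data member
};

// Hidden implementation: all bookkeeping lives here, not in String.
class StringImpl {
public:
    explicit StringImpl(const char* s) : chars(s, s + std::strlen(s) + 1) {}
    std::vector<char> chars;  // characters plus the terminating '\0'
};

String::String(const char* s) : pImpl(new StringImpl(s)) {}
String::~String() { delete pImpl; }
std::size_t String::size() const { return pImpl->chars.size() - 1; }
std::size_t String::capacity() const { return pImpl->chars.capacity() - 1; }
const char* String::c_str() const { return pImpl->chars.data(); }

static_assert(sizeof(String) == sizeof(StringImpl*),
              "the handle is exactly one pointer wide");
```

Constructing String s("Hello, world!") gives s.size() == 13, while sizeof(s) stays at the size of one pointer no matter how long the text is.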

Cheers! --M

Sep 28 '05 #2
jl*****@hotmail.com wrote:
> What I don't understand is why sizeof(std::string) returns 4 with
> any compiler.

It does? Really? Wait, didn't you just say that "Under the Visual C++
6.0, size1 .. equalled 16"? And 'size1' _is_ 'sizeof(std::string)', no?
So, why do you say "sizeof(std::string) returns 4 with any compiler"? It
apparently does NOT in VC++ 6.0...

> [...]
>
> So my main question is: Assuming that sizeof(char*) equals 4, how
> is it possible that sizeof(std::string) can be 4 on any compiler?

It isn't.

> Also, shouldn't sizeof(std::string) be AT LEAST sizeof(char*) +
> sizeof(unsigned int) ? I'm curious why it isn't on the GNU C++
> compiler that I'm using.


"Use the Source, Luke!" Just look at their implementation. They
may have a simple thing like

    class blah {
        blah_internal *pimpl;
    public:
        /// all members simply forwarding the requests to 'pimpl'
    };

V
Sep 28 '05 #3
Any type in C++ has a fixed size. This is because the compiler needs to
know the exact size of each type to lay out the stack frame.

A std::string object typically does not contain the string data in the
object itself. Rather, it dynamically manages the string content somewhere
else in memory. Usually, and by default, it allocates, manages, and
eventually deallocates the string content on the free store.

A std::string only needs a pointer to the content and an integer to cache
the size of the string. Of course, more complex representations are
possible. Some string implementations even manage their contents as a
number of memory "chunks".
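A small sketch of the first point: sizeof is a compile-time constant of the type, so it cannot depend on how many characters a particular string holds. The helper name same_object_size is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Fixed when the program is compiled; the value itself differs between
// standard library implementations.
constexpr std::size_t kStringObjectSize = sizeof(std::string);

// Builds a string of n characters and checks that the object itself is
// still kStringObjectSize bytes; only the separately managed content grows.
bool same_object_size(std::size_t n) {
    std::string s(n, 'x');
    return sizeof(s) == kStringObjectSize && s.size() == n;
}
```

Calling same_object_size(0), same_object_size(13) and same_object_size(100000) all return true.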

Ben
Sep 28 '05 #4

jl*****@hotmail.com wrote:
> So my main question is: Assuming that sizeof(char*) equals 4, how
> is it possible that sizeof(std::string) can be 4 on any compiler?

If std::string contains only a char*, then its size is 4.

> so I can't see how there could possibly be any more room for anything else
>
> Also, shouldn't sizeof(std::string) be AT LEAST sizeof(char*) +
> sizeof(unsigned int) ?


It could be implemented so that they're using some of the first few
bytes in the memory pointed to by this char * for something other than
the characters of the string. Then you only need one char *, you don't
need to store anything else as members of the object. Then,
std::string::c_str() could return this char * + (some number to skip
the non character data)*sizeof(char), std::string::capacity could
return *((int*)( pointer to char* + capacity location )),... I'm just
speculating, but this is at least one way that it could be implemented
with a single char *.
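The speculation above can be turned into a minimal sketch, assuming C++11 or later. TinyString and Header are hypothetical names; the bookkeeping sits in the same allocation, immediately before the characters, and the object keeps only a char* to the first character. As far as I know, the reference-counted std::string in older GNU libstdc++ releases used a broadly similar trick, but treat that as background rather than a description of its actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical bookkeeping block stored just before the characters.
struct Header {
    std::size_t capacity;
    std::size_t length;
};

class TinyString {
public:
    explicit TinyString(const char* s) {
        const std::size_t len = std::strlen(s);
        // One allocation: [Header][characters...]['\0']
        void* raw = std::malloc(sizeof(Header) + len + 1);
        if (!raw) throw std::bad_alloc();
        new (raw) Header{len, len};                       // construct header in place
        data_ = static_cast<char*>(raw) + sizeof(Header); // skip past the header
        std::memcpy(data_, s, len + 1);
    }
    ~TinyString() { std::free(data_ - sizeof(Header)); }
    TinyString(const TinyString&) = delete;               // copying omitted for brevity
    TinyString& operator=(const TinyString&) = delete;

    const char* c_str() const { return data_; }
    std::size_t size() const { return header()->length; }
    std::size_t capacity() const { return header()->capacity; }

private:
    // Step back from the characters to the header that precedes them.
    const Header* header() const {
        return reinterpret_cast<const Header*>(data_ - sizeof(Header));
    }
    char* data_;  // the only data member
};

static_assert(sizeof(TinyString) == sizeof(char*),
              "the object is exactly one char* wide");
```

Here c_str() needs no arithmetic at all because the stored pointer already addresses the characters; size() and capacity() pay for one subtraction instead.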

-Brian

Sep 28 '05 #5
jl*****@hotmail.com wrote:
> So my main question is: Assuming that sizeof(char*) equals 4, how
> is it possible that sizeof(std::string) can be 4 on any compiler?

I really don't know how the GNU people implemented std::string.
But who says that the pointer in std::string has to point to the
characters?
What about an intermediate structure which holds, amongst the other
things you mentioned, a reference counter? Then it would be possible
that in

std::string st1 = "hello world";
std::string st2 = "hello world";

both strings internally point to the very same memory area

      st1                          st2
    +-------+                   +--------+
    |   o-----------+   +-----------o    |
    +-------+       |   |       +--------+
                    |   |
                    |   |
                    v   v
                 +----------+
                 | cap: 12  |
                 | len: 11  |
                 | ref:  2  |
                 | data: o--------+
                 +----------+     |
                                  |
        +-------------------------+
        |
        v
      +---+---+---+---+---+---+---+---+---+---+---+---+
      | h | e | l | l | o |   | w | o | r | l | d |   |
      +---+---+---+---+---+---+---+---+---+---+---+---+

> Also, shouldn't sizeof(std::string) be AT LEAST sizeof(char*) +
> sizeof(unsigned int) ? I'm curious why it isn't on the GNU C++
> compiler that I'm using.


As said: I don't know if the GNU people did it that way. But it would
be possible.
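A minimal sketch of the structure in the diagram, assuming C++11 or later. Rep and SharedString are hypothetical names. In this sketch two strings share a Rep only when one is copied from the other; whether a real library would also share storage for two separately constructed strings is not something the sketch addresses. It is also not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <utility>

// Hypothetical shared representation, matching the box in the diagram.
struct Rep {
    std::size_t cap;   // allocated characters, including '\0'
    std::size_t len;   // characters in use
    std::size_t ref;   // number of SharedString objects pointing here
    char*       data;  // the characters themselves
};

class SharedString {
public:
    explicit SharedString(const char* s) {
        const std::size_t len = std::strlen(s);
        std::unique_ptr<char[]> buf(new char[len + 1]);
        std::memcpy(buf.get(), s, len + 1);
        rep_ = new Rep{len + 1, len, 1, buf.get()};
        buf.release();  // ownership now belongs to the Rep
    }
    // Copying shares the representation and bumps the counter.
    SharedString(const SharedString& other) : rep_(other.rep_) { ++rep_->ref; }
    // Copy-and-swap: the by-value parameter releases the old Rep on exit.
    SharedString& operator=(SharedString other) {
        std::swap(rep_, other.rep_);
        return *this;
    }
    ~SharedString() {
        if (--rep_->ref == 0) {  // last owner frees everything
            delete[] rep_->data;
            delete rep_;
        }
    }

    const char* c_str() const { return rep_->data; }
    std::size_t size() const { return rep_->len; }
    std::size_t capacity() const { return rep_->cap; }
    std::size_t use_count() const { return rep_->ref; }

private:
    Rep* rep_;  // the only data member
};

static_assert(sizeof(SharedString) == sizeof(Rep*),
              "the handle is exactly one pointer wide");
```

After SharedString st1("hello world"); SharedString st2(st1); both objects report use_count() == 2 and return the same address from c_str(). A real copy-on-write string would also have to un-share the Rep before any modification, which is left out here.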

--
Karl Heinz Buchegger
kb******@gascad.at
Sep 28 '05 #6
Victor Bazarov <v.********@comAcast.net> wrote in
news:jsy_e.36813$Tf5.5443@newsread1.mlpsca01.us.to.verio.net:

>> What I don't understand is why sizeof(std::string) returns 4 with
>> any compiler.
>
> It does? Really? Wait, didn't you just say that "Under the Visual C++
> 6.0, size1 .. equalled 16"? And 'size1' _is_ 'sizeof(std::string)', no?
> So, why do you say "sizeof(std::string) returns 4 with any compiler"? It
> apparently does NOT in VC++ 6.0...

Uh, Victor... the OP is expressing his surprise that there exists at least
one compiler for which sizeof(std::string) is 4, not that every compiler
returns 4... (existential vs. universal quantifier....)
Sep 28 '05 #7
jl*****@hotmail.com wrote:
> What I don't understand is why sizeof(std::string)
> returns 4 with any compiler.

Victor Bazarov replied:
> It does? Really? Wait, didn't you just say that
> "Under the Visual C++ 6.0, size1 .. equalled 16"?
> And 'size1' _is_ 'sizeof(std::string)', no?
> So, why do you say "sizeof(std::string) returns 4
> with any compiler"? It apparently does NOT in VC++
> 6.0...

I apologize, Victor. When I said "it returns 4 with any compiler" I
did not mean "it returns 4 with EVERY compiler." By using the word
"any" I meant to say that "if there exists any compiler with which
sizeof(std::string) returns 4, then I have trouble understanding why 4
is returned."

And you are right in saying that it apparently does not return 4 in
VC++ 6.0. My point was that it made sense to me that VC++ returned a
value greater than 4, but I was confused that some compilers (by which
I mean GNU C++ and not VC++ 6.0) returned 4.

By using the word "any," I didn't mean "every."

Sorry for the misunderstanding, Victor.

-- Jean-Luc

Sep 28 '05 #8
Andre Kostur wrote:
> Victor Bazarov <v.********@comAcast.net> wrote in
> news:jsy_e.36813$Tf5.5443@newsread1.mlpsca01.us.to.verio.net:
>
>>> What I don't understand is why sizeof(std::string) returns 4 with
>>> any compiler.
>>
>> It does? Really? Wait, didn't you just say that "Under the Visual C++
>> 6.0, size1 .. equalled 16"? And 'size1' _is_ 'sizeof(std::string)', no?
>> So, why do you say "sizeof(std::string) returns 4 with any compiler"? It
>> apparently does NOT in VC++ 6.0...
>
> Uh, Victor... the OP is expressing his surprise that there exists at least
> one compiler for which sizeof(std::string) is 4, not that every compiler
> returns 4... (existential vs. universal quantifier....)

My apologies. English is not my native tongue, I sometimes have trouble
with it.

V
Sep 28 '05 #9
In article <43***************@gascad.at>,
Karl Heinz Buchegger <kb******@gascad.at> wrote:
> std::string st1 = "hello world";
> std::string st2 = "hello world";
>
> both strings internally point to the very same memory area
>
> [...]


Nice ASCII art! :-)

Just FYI, "Effective STL" by Scott Meyers does a nice survey of std::string
layouts circa 2000 (Item 15). Things have changed since then in at
least one implementation I'm aware of (CodeWarrior) but it is still a
nice survey.

The above diagram is very close to "Implementation B" from this survey.
"Implementation C" from the survey gives an example where sizeof(string)
would be equal to sizeof(char*).

-Howard
Sep 28 '05 #10
Wow! Thank you all for the extremely fast responses!

I never really thought of the pointer pointing to another structure
altogether (as opposed to pointing to an array of chars), but it makes
a lot more sense now, especially having peeked at the source (at
Victor's suggestion).

(To find the file it was using for the #include, I ran:

c++ -E source.cpp

and used the output to find the full pathname of the "string" header
file.)

Let me say that the header file is much, much larger than I thought
it would be! Yet, strangely enough, sizeof(std::string) is still just
a tiny value of 4. (I guess that's not so strange when you think about
it: the methods don't really contribute to the literal size of the
object -- they're only there to be used when they're needed.)
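A small sketch of that last observation. PlainHandle and BusyHandle are hypothetical names; the point is only that non-virtual member functions are code, not per-object data, so adding them leaves sizeof unchanged.

```cpp
#include <cassert>
#include <cstddef>

// One pointer, no member functions.
struct PlainHandle {
    char* p;
};

// The same single pointer plus several non-virtual member functions.
struct BusyHandle {
    char* p;

    bool empty() const { return p == nullptr || *p == '\0'; }
    char front() const { return *p; }
    void reset() { p = nullptr; }
    static int answer() { return 42; }
};

// Member functions are not stored inside each object.
static_assert(sizeof(BusyHandle) == sizeof(PlainHandle),
              "non-virtual member functions do not change the object size");
```

Virtual functions are a different matter: implementations typically add a hidden pointer to each object of a class that has them, so the size usually does grow in that case.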

Anyway, thanks for answering my questions. I'm a big proponent of
using std::string (which sometimes results in lots of friction with
fellow programmers who oppose them "because of all their extra baggage
and overhead"), so it's nice to see that they are quite small and not
really any larger than they need to be.

Thanks again.

-- Jean-Luc

Sep 28 '05 #11

"Victor Bazarov" <v.********@comAcast.net> wrote in message
news:Q%******************@newsread1.mlpsca01.us.to.verio.net...
> Andre Kostur wrote:
>> Victor Bazarov <v.********@comAcast.net> wrote in
>> news:jsy_e.36813$Tf5.5443@newsread1.mlpsca01.us.to.verio.net:
>>
>>>> What I don't understand is why sizeof(std::string) returns 4 with
>>>> any compiler.
>>>
>>> It does? Really? Wait, didn't you just say that "Under the Visual C++
>>> 6.0, size1 .. equalled 16"? And 'size1' _is_ 'sizeof(std::string)', no?
>>> So, why do you say "sizeof(std::string) returns 4 with any compiler"? It
>>> apparently does NOT in VC++ 6.0...
>>
>> Uh, Victor... the OP is expressing his surprise that there exists at
>> least one compiler for which sizeof(std::string) is 4, not that every
>> compiler returns 4... (existential vs. universal quantifier....)
>
> My apologies. English is not my native tongue, I sometimes have trouble
> with it.

Don't feel too bad, but I interpreted it the same way you originally did. :)

Jeff
Sep 28 '05 #12
Victor Bazarov <v.********@comAcast.net> wrote in
news:Q%******************@newsread1.mlpsca01.us.to.verio.net:
> Andre Kostur wrote:
>> Victor Bazarov <v.********@comAcast.net> wrote in
>> news:jsy_e.36813$Tf5.5443@newsread1.mlpsca01.us.to.verio.net:
>>
>>>> What I don't understand is why sizeof(std::string) returns 4 with
>>>> any compiler.
>>>
>>> It does? Really? Wait, didn't you just say that "Under the Visual
>>> C++ 6.0, size1 .. equalled 16"? And 'size1' _is_
>>> 'sizeof(std::string)', no? So, why do you say "sizeof(std::string)
>>> returns 4 with any compiler"? It apparently does NOT in VC++ 6.0...
>>
>> Uh, Victor... the OP is expressing his surprise that there exists at
>> least one compiler for which sizeof(std::string) is 4, not that every
>> compiler returns 4... (existential vs. universal quantifier....)
>
> My apologies. English is not my native tongue, I sometimes have
> trouble with it.

This would be the first indication to me that English isn't your native
tongue; your English is quite good! I've read many, many of your posts
over the years, and I didn't see any significant errors.... :)
Sep 28 '05 #13
