Bytes | Software Development & Data Engineering Community

std::string as data array

If s is a std::string, does &s[0] refer to the contiguous block of
characters representing s?
Oct 25 '05 #1
>If s is a std::string, does &s[0] refer to the contiguous block of
>characters representing s?

The return type of operator[] is a reference to char, and that
reference may be invalidated by a reallocation or any other
modification of a non-const string.

Oct 25 '05 #2

"Jason Heyes" <ja********@optusnet.com.au> wrote in message
news:43***********************@news.optusnet.com.au...
| If s is a std::string, does &s[0] refer to the contiguous block of
| characters representing s?
|

Of course not, s[0] is a single char. If what you seek is a constant
pointer to a legacy char array, try s.c_str(). C++ is not in the habit
of decaying containers into pointers (although it can due to backward
compatibility with certain ancient aspects of C).

In fact, C++ replaces pointer-manipulation with iterators where the end
iterator(s) having the value(s) of null are deemed to be part of the
container, not part of the dataset. And it's customary to refer to an
instance of a std::string by reference, not by pointer.

Hence, a std::string initialized like so...

std::string s("abcde");

....has 5 values in its dataset. While an old type char array initialized
like so...

char str[] = {'a', 'b', 'c', 'd', 'e', '0x0'};

....has 6 values in it. Many of the ancient C functions rely on the
presence of that 0x0 terminator. Standard C++ does not. A pointer to a
char is a pointer to a char, nothing more, nothing less.

Instead of passing a pointer to the first element, you pass the entire
string by reference. Since you can't break the relationship between a
reference and its object, the transfer is nuke proof, and no more
debugging for hours due to invalid pointers either.

void foo(std::string & r_s) { ... } // is evident in its purpose

Consider an empty array of chars...

char str[]; // this decays to a pointer with an undetermined value at pointee
            // and arrays have a fixed compile-time size

....not so with a std::string...

std::string s;

.... where the iterators at...

std::string::iterator s_iter = s.begin();
s_iter = s.end();

....have a guaranteed value of null. And unlike an array, you can modify
its size at runtime on the fly...

s += "a short phrase"; // no need to preset the std::string's size

And that's but a glimpse of a std::string's deep capabilities. Combine
all of the above and add its own algorithms + member functions +
overloaded operators and the std::string repays the effort of learning
it by the first day. You'll freak out over the simplicity of the concept
too.
Oct 25 '05 #3
Hmm, I don't think Peter's reply solves Jason's puzzle, though he gives
a lot of comparison between C and C++'s string.
Jason wants to know whether &s[0] is the start address of the
contiguous block of characters that s holds.
I think the answer to this question depends on the implementation
details of std::string.
If std::string is implemented via a character array, the answer is
true; however, if std::string is implemented via some other linked list
method, the answer is definitely false.
You can test the following program on vc6.

#include <string>
#include <iostream>

using namespace std;

int main()
{
    string s("abcde");

    char *pstr = &s[0];

    cout << pstr << endl;
    cout << hex << int(pstr) << endl;
    cout << hex << int(s.c_str()) << endl;

    return 0;
}

Output:
abcde
481cf1
481cf1

The output shows that std::string in vc6 is implemented with the first
method.

Certainly, just as Peter and Alan say, we should seldom manipulate s
through the pstr char pointer, because std::string is a wrapper around
the conventional C string, and users can shrink and stretch a
std::string freely without the annoying memory-management problems that
plagued old C strings.

Oct 25 '05 #4
In message <43***********************@news.optusnet.com.au>, Jason Heyes
<ja********@optusnet.com.au> writes
>If s is a std::string, does &s[0] refer to the contiguous block of
>characters representing s?

What makes you think it's contiguous?

--
Richard Herring
Oct 25 '05 #5
Richard Herring wrote:
>In message <43***********************@news.optusnet.com.au>, Jason Heyes
><ja********@optusnet.com.au> writes
>>If s is a std::string, does &s[0] refer to the contiguous block of
>>characters representing s?
>
>What makes you think it's contiguous?


Since the Standard specifies that std::string's operator[] provides
indexed access to std::string::data() - and since the Standard
specifies that std::string::data() points to a character array (that is,
a block of contiguously allocated memory), it must be the case by
transitive logic that &s[0] refers to the first character in a block of
contiguously-allocated character data.

Greg

Oct 25 '05 #6
yepp wrote:
>Hmm, I don't think Peter's reply solves Jason's puzzle, though he gives
>a lot of comparison between C and C++'s string.
>Jason wants to know whether &s[0] is the start address of the
>contiguous block of characters that s holds.

It's the starting address of a block of characters identical to those
that comprise s, but it is not (necessarily) a pointer to the character
data of s itself.

>I think the answer to this question depends on the implementation
>details of std::string.

No, the implementation details of std::string make no difference.

>If std::string is implemented via a character array, the answer is
>true; however, if std::string is implemented via some other linked list
>method, the answer is definitely false.

No, the answer is always true.

The client has no access to std::string's internal data representation
(that is why it is called "internal"). So however std::string stores
its character data is of interest only to itself.

>You can test the following program on vc6.
>
>#include <string>
>#include <iostream>
>
>using namespace std;
>
>int main()
>{
>    string s("abcde");
>
>    char *pstr = &s[0];
>
>    cout << pstr << endl;
>    cout << hex << int(pstr) << endl;
>    cout << hex << int(s.c_str()) << endl;
>
>    return 0;
>}
>
>Output:
>abcde
>481cf1
>481cf1
>
>The output shows that std::string in vc6 is implemented with the first
>method.


No, it does not show how vc6's std::string is implemented internally.
After all, internal implementations are not observable from the outside
by definition. It does show &s[0] == data() but that relationship is a
requirement.

Greg

Oct 25 '05 #7
In message <11**********************@g14g2000cwa.googlegroups.com>, Greg
<gr****@pacbell.net> writes
>Richard Herring wrote:
>>In message <43***********************@news.optusnet.com.au>, Jason Heyes
>><ja********@optusnet.com.au> writes
>>>If s is a std::string, does &s[0] refer to the contiguous block of
>>>characters representing s?
>>
>>What makes you think it's contiguous?
>
>Since the Standard specifies that std::string's operator[] provides
>indexed access to std::string::data() - and since the Standard
>specifies that std::string::data() points to a character array (that is,
>a block of contiguously allocated memory), it must be the case by
>transitive logic that &s[0] refers to the first character in a block of
>contiguously-allocated character data.


Is that something that's been corrected in the latest version of the
Standard, then? The 1998 version appears to be inconsistent:

21.3.4:
const_reference operator[] (size_type pos) const;
reference operator[](size_type pos);

Returns: If pos < size(), returns data()[pos]. Otherwise [...]

But:

21.3.6

const charT* data() const;

Returns: If size() is nonzero, the member returns a pointer to the
initial element of an array whose first size() elements equal the
corresponding elements of the string controlled by *this [...]
Requires: The program shall not alter any of the values stored in the
character array [...]

So data() returns a const pointer to something which the standard takes
pains not to say is "the" string, but may be a copy of it, yet
operator[] magically converts it to a non-const reference which can
modify the string.

Hmmm.

--
Richard Herring
Oct 25 '05 #8
Peter_Julian wrote:
>"Jason Heyes" wrote:
>| If s is a std::string, does &s[0] refer to the contiguous block of
>| characters representing s?
>
>Of course not, s[0] is a single char.

Actually s[0] is a reference to a single char.
It may or may not have the other chars following it.

>If what you seek is a constant pointer to a legacy char array,
>try s.c_str().

Better would be s.data(), which does not bother to null-terminate
the array.

>In fact, C++ replaces pointer-manipulation with iterators where the
>end iterator(s) having the value(s) of null are deemed to be part
>of the container, not part of the dataset.

Most iterators cannot have a value of NULL.
Try writing:
std::string::iterator it = NULL;
and see how far you get.
(Whether or not this works is implementation-specific.)

>Hence, a std::string initialized like so...
>
>std::string s("abcde");
>
>...has 5 values in its dataset. While an old type char array
>initialized like so...
>
>char str[] = {'a', 'b', 'c', 'd', 'e', '0x0'};
>
>...has 6 values in it. Many of the ancient C functions rely on the
>presence of that 0x0 terminator.

Note that '0x0' is not an end-of-string marker (it's a
multi-character constant). You might be thinking of 0, or '\0'.

>Standard C++ does not. A pointer to a char is a pointer to
>a char, nothing more, nothing less.

Standard C++ has the same rules and expectations about char
arrays as Standard C does. (Except that string literals
are const in C++).

>Consider an empty array of chars...
>
>char str[]; // this decays to a pointer with an undetermined
>// value at pointee and arrays have a fixed compile-time size

Actually this is a syntax error; arrays must have a specified size.

>...not so with a std::string...
>std::string s;
>... where the iterators at...
>
>std::string::iterator s_iter = s.begin();
>s_iter = s.end();
>
>...have a guaranteed value of null.

No, they don't. In fact you are not guaranteed to be able to
compare these iterators to anything except for other iterators
of the same type. Dereferencing any of these iterators
causes undefined behaviour.

Oct 26 '05 #9
