
<string> class with support of Null-Bytes?

Hi!

I asked a similar question before but then changed everything to using
char-Arrays instead of the string class, but I would rather not do this
again.

So, does anyone know of a string-Class similar to the STL-<string> that
supports null-bytes?

I tried with standard <string> but this definitely does not support
them... :(

Tnx
Karl
Jul 22 '05 #1
17 Replies


* Karl Ebener:

I asked a similar question before but then changed everything to using
char-Arrays instead of the string class, but I would rather not do this
again.

So, does anyone know of a string-Class similar to the STL-<string> that
supports null-bytes?

I tried with standard <string> but this definitely does not support
them... :(


Depends what you mean by "support", but with usual definitions that's
not correct.

Perhaps post a simple program that shows what you mean by "not support"?

Then we can see whether the problem is in the code or with std::string,
and give better suggestions on how to proceed.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Jul 22 '05 #2

Little change:
I tried with standard <string> but this definitely does not support
them... :(


-> I tried the length() method, which stops at null bytes, and c_str()
of course only extracts the part up to the first null byte.
Have I simply overlooked a way to extract the content as char*?

Tnx
Karl
Jul 22 '05 #3

* Karl Ebener:
Little change:
I tried with standard <string> but this definitely does not support
them... :(
-> I tried using length()-method which stops at null-bytes


It doesn't.

and c_str() of course extracts only part till null-byte.

It doesn't, see 21.3.6/1.

Have I only not seen any possibility to extract the content as char* ?


Post some code.

Jul 22 '05 #4

Alf P. Steinbach wrote:
Depends what you mean by "support", but with usual definitions that's
not correct.

Perhaps post a simple program that shows what you mean by "not support"?

Then we can see whether the problem is in the code or with std::string,
and give better suggestions on how to proceed.

Okay, this is my test program.
What I ultimately want to do is read a complete (binary) file into a
string and then send it over a socket to/from a server.
I am using socket-routines that use strings because it is much easier
this way and I would love to leave it at that and not recode everything...

Tnx
Karl

#include <string>
#include <iostream>

using namespace std;

int main()
{
string abc = "abc\0abc\0"; // string contains Null-bytes
cout << abc << ":" << abc.length() << endl; // output is: 3
FILE* fp;

fp = fopen("ABC", "w");
fwrite(abc.c_str(), 8, 1, fp); // file will contain: "abc" and Garbage
fclose(fp);
}
Jul 22 '05 #5

Karl Ebener wrote:
Alf P. Steinbach wrote:
Depends what you mean by "support", but with usual definitions that's
not correct.

Perhaps post a simple program that shows what you mean by "not support"?

Then we can see whether the problem is in the code or with std::string,
and give better suggestions on how to proceed.
Okay, this is my test program.
What I want to do finally, is read a complete (binary) file into a
string and then send this via using socket to/from server.
I am using socket-routines that use strings because it is much easier
this way and I would love to leave it at that and not recode everything...

Tnx
Karl

#include <string>
#include <iostream>

using namespace std;

int main()
{
string abc = "abc\0abc\0"; // string contains Null-bytes


No. Your literal contains 0-bytes. The conversion constructor from C style
strings to std::string of course has to stop at \0, since that's the value
that marks the end of a C style string. Try:

const char c[] = "abc\0abc\0";

string abc(c, sizeof(c));

This tells the constructor to not stop at \0, but read the specified number
of characters.

cout << abc << ":" << abc.length() << endl; // output is: 3

That's because only the first 3 characters were actually copied into the
string.

FILE* fp;

fp = fopen("ABC", "w");
fwrite(abc.c_str(), 8, 1, fp); // file will contain: "abc" and Garbage

Again, that's because the string only contains the first 3 characters.

fclose(fp);
}


Jul 22 '05 #6

* Karl Ebener:

#include <string>
#include <iostream>

using namespace std;

int main()
{
string abc = "abc\0abc\0"; // string contains Null-bytes
cout << abc << ":" << abc.length() << endl; // output is: 3
FILE* fp;

fp = fopen("ABC", "w");
fwrite(abc.c_str(), 8, 1, fp); // file will contain: "abc" and Garbage
fclose(fp);
}


The problem in the abc declaration is that you invoke the constructor
that takes a C string as argument, and by definition that C string ends
at the first nullbyte.

Try
#include <string>
#include <iostream>

#define ELEMCOUNT( array ) (sizeof(array)/sizeof(*array))

int main()
{
static char const abc_data[] = "abc\0abc\0";
std::string abc( abc_data, ELEMCOUNT( abc_data ) );

std::cout << abc << ":" << abc.length() << std::endl;
}

But you might instead (for efficiency) want to use std::vector<char>.

Also, the file should be opened in binary mode.
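
To illustrate the binary-mode point, here is a minimal sketch of writing a std::string that holds null bytes to disk. The helper names write_all and file_size are made up for this example and are not from the thread; the sketch assumes the string was already built with an explicit length, as shown above.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: writes every byte of `bytes` to `path`,
// embedded '\0' characters included. Returns true on success.
bool write_all(const char* path, const std::string& bytes)
{
    assert(path != nullptr);
    std::FILE* fp = std::fopen(path, "wb");   // "b": no newline translation
    if (!fp) return false;
    // The length comes from size(), not from a search for '\0'.
    const std::size_t written = std::fwrite(bytes.data(), 1, bytes.size(), fp);
    const bool closed = (std::fclose(fp) == 0);
    return written == bytes.size() && closed;
}

// Hypothetical helper: counts the bytes in the file at `path`, or -1 on failure.
long file_size(const char* path)
{
    std::FILE* fp = std::fopen(path, "rb");
    if (!fp) return -1;
    long n = 0;
    while (std::fgetc(fp) != EOF) ++n;
    std::fclose(fp);
    return n;
}
```

Called as write_all("ABC", abc) with an 8-character abc, this should leave a file of exactly 8 bytes, because the byte count is taken from the string itself.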

Jul 22 '05 #7

Karl Ebener wrote:
Okay, this is my test program.


My guess is that std::string's functions (including constructors) that take
a C-Style string as an argument, *do* treat it as a C-style (i.e.
null-terminated) string.

Makes sense, doesn't it? You don't want

char s[15] = "sth";
string s1(s);

to allocate 11 extra null characters in s1 for no reason :-)

If, OTOH, you put a '\0' in an std::string, it will not be treated as a
terminating character.

Check out this example to see what I mean:

#include <iostream>
#include <string>

int main(){
std::string s("abc\0abc\0");
std::cout<<s.length()<<std::endl; //prints 3, not 9
std::string s2;
s2.push_back('a');
s2.push_back('\0');
s2.push_back('b');
std::cout<<s2.length()<<std::endl; //prints 3, not 1
}
Note: c_str() returns a const char *, so for any code that treats the result
as a C string (e.g. strlen or strcpy) the data will appear to stop at the
first null byte. Better use a vector<char> if you want byte semantics.
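
A small sketch of that difference, for illustration only (the function names make_a_nul_b and c_view_length are invented here): the string knows its own length, while C functions that receive c_str() only see the part before the first null byte.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Builds "a\0b" one character at a time, so no C-string conversion is involved.
std::string make_a_nul_b()
{
    std::string s;
    s.push_back('a');
    s.push_back('\0');
    s.push_back('b');
    assert(s.size() == 3);   // the embedded '\0' counts like any other char
    return s;
}

// What a C function sees through c_str(): strlen stops at the first '\0'.
std::size_t c_view_length(const std::string& s)
{
    return std::strlen(s.c_str());
}
```

For the string built above, size() reports 3 while c_view_length reports 1. The bytes after the null are still in the buffer; strlen simply never looks at them.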
Jul 22 '05 #8

Karl Ebener wrote:
fwrite(abc.c_str(), 8, 1, fp); // file will contain: "abc"
// and Garbage


As a separate issue, data() would be better than c_str() here. c_str()
may expand the string's internal buffer, to make room for an extra null
character past the end. You don't need a null-terminated C-string to
call fwrite, so you can just use data().

--
Dave O'Hearn

Jul 22 '05 #9

Dimitris Kamenopoulos wrote:
Karl Ebener wrote:
Okay, this is my test program.


My guess is that std::string's functions (including constructors) that
take a C-Style string as an argument, *do* treat it as a C-style (i.e.
null-terminated) string.

Makes sense, doesn't it? You don't want

char s[15] = "sth";
string s1(s);

to allocate 11 extra null characters in s1 for no reason :-)


That's not the main point. The constructor takes a pointer, which doesn't
contain any information about the size of the array pointed to. So the \0
is the _only_ way at all to know where a C style string ends.
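
One way to keep the size information is to accept the literal as a reference to an array instead of as a pointer. The following is only a sketch under that assumption; from_literal is a made-up helper name, not a standard function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Takes the literal by reference to array, so N (which includes the implicit
// terminating '\0') is still known inside the function.
template <std::size_t N>
std::string from_literal(const char (&text)[N])
{
    assert(text[N - 1] == '\0');       // expects a terminated literal
    return std::string(text, N - 1);   // drop only the implicit terminator
}
```

With this, from_literal("abc\0abc\0") yields a string of length 8, whereas std::string("abc\0abc\0") decays the literal to a pointer and yields length 3.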

Jul 22 '05 #10


"Karl Ebener" <my*****@vollbio.de> wrote in message
news:41***********************@newsread2.arcor-online.net...
Little change:
I tried with standard <string> but this definitely does not support
them... :(


-> I tried using length()-method which stops at null-bytes and c_str()
of course extracts only part till null-byte.


What you are saying is totally false. std::string fully supports strings
with embedded NULLs. You just need to know the functions to use.

First, use the right constructor. The std::string has a few constructors --
a good C++ book that goes into the standard library will show you the
various constructors. The proper constructor is the one that takes a const
char * and an integer denoting the number of characters.

#include <string>
std::string s("abc\0123", 7);

Second, use the std::string::data( ) member function instead of
std::string::c_str(). This respects the length of the string and does not
terminate on the first NULL.

Third, if you need to add binary data to a std::string, use the append( )
function. If you need to reassign binary data, use the
std::string::append() on an empty string, or the std::string::assign( )
member function.
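
As a rough sketch of the assign( ) and append( ) overloads that take a pointer plus a count (the function name build_packet and the byte values are invented for this example):

```cpp
#include <cassert>
#include <string>

// The (pointer, count) overloads copy exactly `count` bytes and never
// search for a terminator.
std::string build_packet()
{
    const char header[] = {'H', 'D', '\0', '\1'};        // 4 raw bytes
    const char body[]   = {'\0', 'x', '\0', 'y', '\0'};  // 5 raw bytes

    std::string packet;
    packet.assign(header, sizeof header);   // replaces the contents
    packet.append(body, sizeof body);       // adds to the end
    assert(packet.size() == sizeof header + sizeof body);
    return packet;
}
```

The resulting string has 9 characters, four of which are null bytes.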

Paul
Jul 22 '05 #11

Karl Ebener wrote:
Little change:
I tried with standard <string> but this definitely does not support
them... :(

-> I tried using length()-method which stops at null-bytes and c_str()
of course extracts only part till null-byte.
Have I only not seen any possibility to extract the content as char* ?


Multibyte does not contain nulls. I'm confused as what you are asking.
Neither c_str() nor length() cares anything about embedded nulls.

Now that being said, there is NO real multibyte handling in std::string
either.
Jul 22 '05 #12

Karl Ebener wrote:
So, does anyone know of a string-Class similar to the STL-<string> that
supports null-bytes?


std::string handles null bytes just fine. The only thing that you have to
be careful with is that if you use the conversions to/from char*, you need
to pass/retrieve the actual length because the default strlen() calculations
won't work.

std::string s;
s.push_back('a');
s.push_back('\0');
s.push_back('b');

std::cout << s.size(); // prints 3
const char* cp = s.c_str();

std::cout << cp[0] << cp[2]; // prints ab
Jul 22 '05 #13

Paul wrote:
"Karl Ebener" <my*****@vollbio.de> wrote:

#include <string>
std::string s("abc\0123", 7);

Undefined behaviour. "abc\0123" is an array of 6 chars, so a count of 7
reads past its end:
{'a', 'b', 'c', '\012', '3', '\0'}

Second, use the std::string::data( ) member function instead of
std::string::c_str(). This respects the length of the string
and does not terminate on the first NULL.


std::string::c_str() does not terminate on the first null
character. The only difference between c_str() and data()
is that c_str() appends a null character.

std::string s("abc\0def", 7);
std::cout << (s.c_str() + 4) << std::endl;

will output "def".
BTW, the macro NULL is not really relevant to null characters.
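
The octal-escape trap can be checked at compile time. This is a small sketch, assuming a C++11 compiler for static_assert; splitting the literal in two is one common workaround, and the function name abc_nul_123 is made up.

```cpp
#include <cassert>
#include <string>

// "abc\0123" parses \012 as ONE octal escape, so the array holds
// 'a','b','c','\012','3' plus the terminator.
static_assert(sizeof("abc\0123") == 6, "octal escape swallowed the digits 1 and 2");

// Adjacent literals are concatenated only after escapes are processed,
// so here the '\0' stays separate from the digits.
static_assert(sizeof("abc\0" "123") == 8, "7 characters plus the terminator");

std::string abc_nul_123()
{
    const std::string s("abc\0" "123", 7);
    assert(s[3] == '\0' && s[4] == '1');
    return s;
}
```

The string returned here has length 7, and s.c_str() + 4 points at "123".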

Jul 22 '05 #14

Karl Ebener wrote:
Alf P. Steinbach wrote:
Perhaps post a simple program that shows what you mean by "not support"?

#include <string>
#include <iostream>

using namespace std;

int main()
{
string abc = "abc\0abc\0"; // string contains Null-bytes
cout << abc << ":" << abc.length() << endl; // output is: 3
...
}


Nothing wrong with string. You lost your trailing data because C-style
string literals end at the first '\0'. This one works:

#include <string>
#include <iostream>

using namespace std;

int main()
{
string abc = "abcdabcd";
abc[3] = abc[7] = '\0';
cout << abc << ":" << abc.length() << endl;
return 0;
}

Prints:

abcabc:8

--
Ron House ho***@usq.edu.au
http://www.sci.usq.edu.au/staff/house
Jul 22 '05 #16


"Old Wolf" <ol*****@inspire.net.nz> wrote in message
news:11**********************@z14g2000cwz.googlegroups.com...
Paul wrote:
"Karl Ebener" <my*****@vollbio.de> wrote:

#include <string>
std::string s("abc\0123", 7);


Undefined behaviour. "abc\0123" is an array of 6 chars:
{'a', 'b', 'c', '\012', '3', '\0'}

Sorry, that was my attempt to put together a string in haste. The following
is what I meant:

#include <string>
int main( )
{
char s1[] = {'0','1','2',0,'4','5','6'};
std::string s(s1, 7);
}

Paul
Jul 22 '05 #17

Karl Ebener <my*****@vollbio.de> wrote in
news:41***********************@newsread2.arcor-online.net:
What I want to do finally, is read a complete (binary) file into a
string and then send this via using socket to/from server.
I am using socket-routines that use strings because it is much easier
this way and I would love to leave it at that and not recode
everything...


OK, in case of large and/or binary strings assign(), append() and swap()
member functions are your friends. E.g.

void read_from_file(std::string& content) {
char buffer[N];
std::string collector;
while(!eof(the_file)) {
// ... read chunk of file into the buffer, say of length n.
collector.append(buffer, n);
}
content.swap(collector);
}

void send_to_socket() {
std::string packet;
read_from_file(packet);
// assume there is a nice C++ object around called socket:
socket.write(packet.data(), packet.length());
}

Note that using c_str() instead of data() might imply a performance
penalty here as the c_str() function might have to add a NUL terminator
at the end of the buffer, which can cause a reallocation and extra
unneeded copy of the whole string. As you must be managing the lengths
anyway explicitly the terminating NUL is not needed.

OK, swap() is not really necessary in this example, but it might be
useful in other similar situations where you have a large string to be
passed around.
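
For reference, a runnable sketch of the chunked read outlined above, using C stdio. The 4096-byte buffer size and the path parameter are assumptions made for this example, and the loop tests the result of fread rather than an end-of-file flag.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Reads the whole file at `path` into `content`, chunk by chunk.
// Returns false if the file cannot be opened or a read error occurs.
bool read_from_file(const char* path, std::string& content)
{
    assert(path != nullptr);
    std::FILE* fp = std::fopen(path, "rb");   // binary mode
    if (!fp) return false;

    char buffer[4096];
    std::string collector;
    std::size_t n;
    while ((n = std::fread(buffer, 1, sizeof buffer, fp)) > 0) {
        collector.append(buffer, n);      // explicit length: '\0' bytes are kept
    }
    const bool ok = !std::ferror(fp);
    std::fclose(fp);
    if (ok) content.swap(collector);      // content is replaced only on success
    return ok;
}
```

Because the swap happens only after a successful read, the caller's string is left untouched if the read fails partway through.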

In case of binary data the first rule is to avoid all std::string member
functions which take a single char* pointer - there is no way to specify
the actual length of data for such parameter.

HTH
Paavo

Jul 22 '05 #18
