std::string - undefined behavior?

Hi all,
Two distinct compilers give different output for the following code:

#include <iostream>
#include <string>

int main(void)
{
    std::string s("0124");
    s.replace(0, 3, s).replace(s.size(), 6, s);

    std::cout << s << " size=" << s.size() << std::endl;
    return 0;
}

GCC(3.4.4):
0124401244 size=10

VC7.1:
012401244 size=9

Which one of the two is correct? Or is this maybe undefined behavior
of std::string?

Oct 25 '05 #1
I think you're running into undefined order of execution here. The
compiler is free, for example, to first execute that rightmost size()
call, and then start with the whole replace issue. Or it could first do
the replace and then call the rightmost size() etc.

--
Regards,

Ferdi Smit (M.Sc.)
Email: Fe********@cwi.nl
Room: C0.07 Phone: 4229
INS3 Visualization and 3D Interfaces
CWI Amsterdam, The Netherlands
Oct 25 '05 #2
Ferdi Smit wrote:
I think you're running into undefined order of execution here. The
compiler is free, for example, to first execute that rightmost size()
call, and then start with the whole replace issue. Or it could first do
the replace and then call the rightmost size() etc.


I believe the order of execution is merely unspecified, not undefined.
Undefined means the program may do anything - it has essentially ceased
performing any useful operation. Unspecified means that the program has
a choice of several possible execution paths to take, and it's entirely
up to the program to decide which one to take.

For example, in the above program, the compiler gets to choose the
order it will evaluate the expressions. Any order is as good as any
other; so even though two programs compiled from identical source files
produce different output, neither program is incorrect. Interestingly,
since a C++ program must always produce identical behavior given
identical inputs, even unspecified behavior is consistent behavior.

Greg

Oct 25 '05 #3
Greg wrote:

Undefined means the program may do anything - it has essentially ceased
performing any useful operation.


Well, that's a bit extreme. Undefined means that the C++ language
definition doesn't tell you what the program will do. In many cases your
compiler documentation will tell you what it does, or observation and a
little thought will tell you what various implementations actually do.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Oct 25 '05 #4
Pete Becker wrote:
Greg wrote:

Undefined means the program may do anything - it has essentially ceased
performing any useful operation.


Well, that's a bit extreme. Undefined means that the C++ language
definition doesn't tell you what the program will do. In many cases your
compiler documentation will tell you what it does, or observation and a
little thought will tell you what various implementations actually do.


Then your compiler uses some non-standard extension(s), almost
guaranteed not to work on someone else's machine.

Kristo

Oct 25 '05 #5
Kristo wrote:
Pete Becker wrote:
Greg wrote:
Undefined means the program may do anything - it has essentially ceased
performing any useful operation.


Well, that's a bit extreme. Undefined means that the C++ language
definition doesn't tell you what the program will do. In many cases your
compiler documentation will tell you what it does, or observation and a
little thought will tell you what various implementations actually do.

Then your compiler uses some non-standard extension(s), almost
guaranteed not to work on someone else's machine.


Nope. Doesn't vary from machine to machine, but possibly from compiler
to compiler. Which is why that last sentence started with "In many
cases...". Beginners, of course, shouldn't try to take advantage of
undefined behavior.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Oct 25 '05 #6
Greg wrote:
Ferdi Smit wrote:
I think you're running into undefined order of execution here. The
compiler is free, for example, to first execute that rightmost size()
call, and then start with the whole replace issue. Or it could first do
the replace and then call the rightmost size() etc.

I believe the order of execution is merely unspecified, not undefined.
Undefined means the program may do anything - it has essentially ceased
performing any useful operation. Unspecified means that the program has
a choice of several possible execution paths to take, and it's entirely
up to the program to decide which one to take.

For example, in the above program, the compiler gets to choose the
order it will evaluate the expressions. Any order is as good as any
other; so even though two programs compiled from identical source files
produce different output, neither program is incorrect. Interestingly,
since a C++ program must always produce identical behavior given
identical inputs, even unspecified behavior is consistent behavior.


I didn't mean undefined in the C++ sense of undefined behaviour, but to
indicate it's not 'fixed' by the standard and different compilers use
different schemes (it's well known that gcc and vc do this in reverse of
each other). It's more of an English vocabulary problem than a C++ one;
though I do get the fine difference between undefined and unspecified now,
I think. Another day, another word. So you'd say undefined means it can
do anything, and unspecified means it _is_ defined (somewhere), just not
specified how. Interesting indeed, never thought of it that way.

--
Regards,

Ferdi Smit (M.Sc.)
Email: Fe********@cwi.nl
Room: C0.07 Phone: 4229
INS3 Visualization and 3D Interfaces
CWI Amsterdam, The Netherlands
Oct 25 '05 #7
"Pete Becker" <pe********@acm.org> wrote in message
news:Dq******************************@rcn.net...
Kristo wrote:
Pete Becker wrote:
Greg wrote:

Undefined means the program may do anything - it has essentially ceased
performing any useful operation.

Well, that's a bit extreme. Undefined means that the C++ language
definition doesn't tell you what the program will do. In many cases your
compiler documentation will tell you what it does, or observation and a
little thought will tell you what various implementations actually do.

Then your compiler uses some non-standard extension(s), almost
guaranteed not to work on someone else's machine.


Nope. Doesn't vary from machine to machine, but possibly from compiler to
compiler. Which is why that last sentence started with "In many cases...".
Beginners, of course, shouldn't try to take advantage of undefined
behavior.


I'm not convinced it's a particularly great idea for anyone to take
advantage of undefined behaviour if it can be avoided to be honest :) This
isn't aimed at you(!), as I know you're more than competent, but as a
general thought, I think a lot of problems arise when otherwise good
programmers decide that they know what they're doing and that "just this
once" it's ok to write a bit of dodgy code because it works fine with their
compiler (this is somewhat along the lines of "a little knowledge can be a
dangerous thing"). It is admittedly a slightly different story if your
compiler documentation defines what happens in a particular situation and
it's guaranteed not to change from one version to the next. In that case, if
you're writing a program which will never be ported to another compiler, you
could argue that it's ok to take advantage of the compiler-specific feature.
In that sort of case, what happens is only undefined according to the
standard; in practice you know what will happen because it's defined by your
compiler. It's still better to make things as portable as possible, though,
you never know if you (or indeed someone else) might need to port your code
to a different compiler at some point.

Stu

Oct 25 '05 #8
Stuart Golodetz wrote:
It's still better to make things as portable as possible, though,
you never know if you (or indeed someone else) might need to port your code
to a different compiler at some point.


Portability is not an absolute. For example, there are many things that
are required by the language definition but aren't portable, because
some compilers don't get them right. So you have to know compiler quirks
if you're going to write portable code. The way to handle non-portable
code is to identify it and test it.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Oct 25 '05 #9
Ferdi Smit wrote:

So you'd say undefined means it can
do anything,


No, it means that the C++ standard doesn't say what it does. There can
be other sources for that information.

Ferdi Smit wrote:

and unspecified means it _is_ defined (somewhere), just not
specified how.


Unspecified means that the C++ standard allows a range of (usually
reasonable) behaviors. Implementation defined in most cases means that
the C++ standard allows a range of (usually reasonable) behaviors, and
the compiler documentation must document the compiler's behavior.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Oct 25 '05 #10
"Pete Becker" <pe********@acm.org> wrote in message
news:Va********************@rcn.net...
Stuart Golodetz wrote:
It's still better to make things as portable as possible, though, you
never know if you (or indeed someone else) might need to port your code
to a different compiler at some point.


Portability is not an absolute. For example, there are many things that
are required by the language definition but aren't portable, because some
compilers don't get them right. So you have to know compiler quirks if
you're going to write portable code. The way to handle non-portable code
is to identify it and test it.


Fair enough :) I was probably being a little dogmatic. It's always a good
idea to know the quirks of any compiler you're using. (Actually, it's always
a good idea to know as much as possible, in general, I reckon. Or perhaps
not? Something for the philosophers, maybe...but I digress :)) I do think
it's the case though that what I said about making it "as portable as
possible" was reasonably sound, in that if certain compilers aren't
standard-compliant (and as far as I know, most of them aren't in one way or
another), then of course you have to work around any defects, but to the
extent that you can write standard, portable code, you should. If you can't
make it portable, then yes, clearly marking it so that everyone knows where
it is and testing it are definitely good things to be doing.

Cheers,
Stu
Oct 25 '05 #11
Stuart Golodetz wrote:

Fair enough :) I was probably being a little dogmatic. It's always a good
idea to know the quirks of any compiler you're using. (Actually, it's always
a good idea to know as much as possible, in general, I reckon. Or perhaps
not? Something for the philosophers, maybe...but I digress :)) I do think
it's the case though that what I said about making it "as portable as
possible" was reasonably sound, in that if certain compilers aren't
standard-compliant (and as far as I know, most of them aren't in one way or
another), then of course you have to work around any defects, but to the
extent that you can write standard, portable code, you should. If you can't
make it portable, then yes, clearly marking it so that everyone knows where
it is and testing it are definitely good things to be doing.


In general, that's right. But if writing portable code involves
something that's far more complicated than a hypothetically non-portable
construct that in fact works, go for the non-portable version. For example,

void show_chars(const char *begin, const char *end)
{
    while (begin != end)
        cout << *begin++;
    cout << '\n';
}

int main()
{
    vector<char> vec;
    // insert some chars into vec
    show_chars(&*vec.begin(), &*vec.end()); // GASP!! UNDEFINED BEHAVIOR!!
    return 0;
}

"Improved" version:

int main()
{
    vector<char> vec;
    // insert some chars into vec
    show_chars(&*vec.begin(), &*(vec.end() - 1) + 1);
    return 0;
}

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Oct 25 '05 #12
Pete Becker wrote:

In general, that's right. But if writing portable code involves
something that's far more complicated than a hypothetically
non-portable construct that in fact works, go for the
non-portable version. For example,


I agree in principle, but not for this particular example:

Pete Becker wrote:

int main()
{
    vector<char> vec;
    // insert some chars into vec
    show_chars(&*vec.begin(), &*vec.end()); // GASP!! UB!!
    return 0;
}

"Improved" version:
    show_chars(&*vec.begin(), &*(vec.end() - 1) + 1);

The GNU C++ standard library doesn't use pointers as
iterators for std::vector. Without seeing their code,
it seems possible that *vec.end() might actually have
undesirable results. And I certainly do know of
an implementation where &*deq.end() blows up, where
deq is a deque.

My 'improved' version would be:
show_chars( &vec[0], &vec[0] + vec.size() );

Of course, vec[0] and *vec.begin() are equivalent, I just
find &* a bit unaesthetic.

Oct 26 '05 #13
Old Wolf wrote:

I agree in principle, but not for this particular example:


The point wasn't to design the code, but to look at what people actually
do. The code I posted was paraphrased from something someone sent me.
They'd have been better off spending less time avoiding undefined
behavior and just writing something that worked.

Old Wolf wrote:

The GNU C++ standard library ...


The code wasn't written for the GNU C++ standard library.

Old Wolf wrote:

... doesn't use pointers as
iterators for std::vector. Without seeing their code,
it seems possible that *vec.end() might actually have
undesirable results.


Nevertheless, it worked. And that's the point: writing code that works
is sometimes more important than writing code that is theoretically better.

Old Wolf wrote:

And I certainly do know of
an implementation where &*deq.end() blows up, where
deq is a deque.


The code didn't use a deque.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Oct 26 '05 #14
please use namespace (std::)

Oct 27 '05 #15
ti****@kingsoft.net wrote:
please use namespace (std::)


You forgot to mention #include directives, and a host of other things
that are needed in compilable code. But none of that is needed in code
snippets when the meaning is clear.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Oct 27 '05 #16
