Bytes IT Community

std::string - undefined behavior?

Hi all,
Two distinct compilers give different output for the following code:

#include <iostream>
#include <string>

int main(void)
{
    std::string s("0124");
    s.replace(0, 3, s).replace(s.size(), 6, s);

    std::cout << s << " size=" << s.size() << std::endl;
    return 0;
}

GCC(3.4.4):
0124401244 size=10

VC7.1:
012401244 size=9

Which one of the two is correct? Or is this perhaps undefined behavior
of std::string?

Oct 25 '05 #1
15 Replies


shablool wrote:
Hi all,
Two distinct compilers give different output for the following code:

#include <iostream>
#include <string>

int main(void)
{
    std::string s("0124");
    s.replace(0, 3, s).replace(s.size(), 6, s);

    std::cout << s << " size=" << s.size() << std::endl;
    return 0;
}

GCC(3.4.4):
0124401244 size=10

VC7.1:
012401244 size=9

Which one of the two is correct? Or is this perhaps undefined behavior
of std::string?


I think you're running into undefined order of execution here. The
compiler is free, for example, to first execute that rightmost size()
call, and then start with the whole replace issue. Or it could first do
the replace and then call the rightmost size() etc.

--
Regards,

Ferdi Smit (M.Sc.)
Email: Fe********@cwi.nl
Room: C0.07 Phone: 4229
INS3 Visualization and 3D Interfaces
CWI Amsterdam, The Netherlands
Oct 25 '05 #2
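
Both reported outputs can be reproduced by fixing the evaluation order by hand. The following is only a sketch under stated assumptions: the two function names are hypothetical, and each replace is handed a copy of the string so that the example does not depend on how a particular library treats a self-referencing argument.

```cpp
#include <string>

// Hypothetical helper: s.size() is read AFTER the first replace has run.
std::string size_read_after_first_replace()
{
    std::string s("0124");
    const std::string first(s);                   // copy, to avoid aliasing s
    s.replace(0, 3, first);                       // "012" -> "0124", s == "01244"
    const std::string::size_type pos = s.size();  // 5
    const std::string second(s);
    s.replace(pos, 6, second);                    // pos == size(): nothing removed, pure append
    return s;                                     // "0124401244"
}

// Hypothetical helper: s.size() is read BEFORE the first replace has run.
std::string size_read_before_first_replace()
{
    std::string s("0124");
    const std::string::size_type pos = s.size();  // 4
    const std::string first(s);
    s.replace(0, 3, first);                       // s == "01244"
    const std::string second(s);
    s.replace(pos, 6, second);                    // only the trailing '4' is replaced
    return s;                                     // "012401244"
}
```

The first function matches the GCC output (size 10) and the second matches the VC7.1 output (size 9). As far as I know, the evaluation-order rules introduced with C++17 sequence the object expression of a member call before its arguments, so a conforming C++17 compiler should give the first result; the compilers in this thread predate that change.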

Ferdi Smit wrote:
I think you're running into undefined order of execution here. The
compiler is free, for example, to first execute that rightmost size()
call, and then start with the whole replace issue. Or it could first do
the replace and then call the rightmost size() etc.


I believe the order of execution is merely unspecified, not undefined.
Undefined means the program may do anything - it has essentially ceased
performing any useful operation. Unspecified means that the program has
a choice of several possible execution paths to take, and it's entirely
up to the program to decide which one to take.

For example, in the above program, the compiler gets to choose the
order it will evaluate the expressions. Any order is as good as any
other; so even though two programs compiled from identical source files
produce different output, neither program is incorrect. Interestingly,
since a C++ program must always produce identical behavior given
identical inputs, even unspecified behavior is consistent behavior.

Greg

Oct 25 '05 #3
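
A smaller case of the same kind of unspecified ordering is the evaluation of function arguments. The sketch below uses hypothetical names throughout (call_log, first_operand, second_operand, add) and is meant only to illustrate the point: the result is well defined, but the order of the side effects is not pinned down.

```cpp
#include <string>

// Hypothetical names, for illustration only.
std::string call_log;   // records the order in which the operands ran

int first_operand()  { call_log += 'a'; return 1; }
int second_operand() { call_log += 'b'; return 2; }

int add(int x, int y) { return x + y; }

int sum_in_unspecified_order()
{
    call_log.clear();
    // The two argument expressions may be evaluated in either order,
    // so call_log ends up as "ab" or "ba"; the sum is 3 either way.
    return add(first_operand(), second_operand());
}
```

Code that relies only on the returned sum is portable; code that relies on the contents of call_log is not.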

Greg wrote:

Undefined means the program may do anything - it has essentially ceased
performing any useful operation.


Well, that's a bit extreme. Undefined means that the C++ language
definition doesn't tell you what the program will do. In many cases your
compiler documentation will tell you what it does, or observation and a
little thought will tell you what various implementations actually do.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Oct 25 '05 #4

Pete Becker wrote:
Greg wrote:

Undefined means the program may do anything - it has essentially ceased
performing any useful operation.


Well, that's a bit extreme. Undefined means that the C++ language
definition doesn't tell you what the program will do. In many cases your
compiler documentation will tell you what it does, or observation and a
little thought will tell you what various implementations actually do.


Then your compiler uses some non-standard extension(s), almost
guaranteed not to work on someone else's machine.

Kristo

Oct 25 '05 #5

Kristo wrote:
Pete Becker wrote:
Greg wrote:
Undefined means the program may do anything - it has essentially ceased
performing any useful operation.


Well, that's a bit extreme. Undefined means that the C++ language
definition doesn't tell you what the program will do. In many cases your
compiler documentation will tell you what it does, or observation and a
little thought will tell you what various implementations actually do.

Then your compiler uses some non-standard extension(s), almost
guaranteed not to work on someone else's machine.


Nope. Doesn't vary from machine to machine, but possibly from compiler
to compiler. Which is why that last sentence started with "In many
cases...". Beginners, of course, shouldn't try to take advantage of
undefined behavior.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Oct 25 '05 #6

Greg wrote:
Ferdi Smit wrote:
I think you're running into undefined order of execution here. The
compiler is free, for example, to first execute that rightmost size()
call, and then start with the whole replace issue. Or it could first do
the replace and then call the rightmost size() etc.

I believe the order of execution is merely unspecified, not undefined.
Undefined means the program may do anything - it has essentially ceased
performing any useful operation. Unspecified means that the program has
a choice of several possible execution paths to take, and it's entirely
up to the program to decide which one to take.

For example, in the above program, the compiler gets to choose the
order it will evaluate the expressions. Any order is as good as any
other; so even though two programs compiled from identical source files
produce different output, neither program is incorrect. Interestingly,
since a C++ program must always produce identical behavior given
identical inputs, even unspecified behavior is consistent behavior.


I didn't mean undefined in the C++ sense of undefined behaviour, but to
indicate that it's not fixed by the standard and that different compilers use
different schemes (it's well known that gcc and vc do this in reverse of
each other). It's more of an English vocabulary problem than a C++ one;
though I do get the fine difference between undefined and unspecified now,
I think. Another day, another word. So you'd say undefined means it can
do anything, and unspecified means it _is_ defined (somewhere), just not
specified how. Interesting indeed; I never thought of it that way.

--
Regards,

Ferdi Smit (M.Sc.)
Email: Fe********@cwi.nl
Room: C0.07 Phone: 4229
INS3 Visualization and 3D Interfaces
CWI Amsterdam, The Netherlands
Oct 25 '05 #7

"Pete Becker" <pe********@acm.org> wrote in message
news:Dq******************************@rcn.net...
Kristo wrote:
Pete Becker wrote:
Greg wrote:

Undefined means the program may do anything - it has essentially ceased
performing any useful operation.

Well, that's a bit extreme. Undefined means that the C++ language
definition doesn't tell you what the program will do. In many cases your
compiler documentation will tell you what it does, or observation and a
little thought will tell you what various implementations actually do.

Then your compiler uses some non-standard extension(s), almost
guaranteed not to work on someone else's machine.


Nope. Doesn't vary from machine to machine, but possibly from compiler to
compiler. Which is why that last sentence started with "In many cases...".
Beginners, of course, shouldn't try to take advantage of undefined
behavior.


I'm not convinced it's a particularly great idea for anyone to take
advantage of undefined behaviour if it can be avoided, to be honest :) This
isn't aimed at you(!), as I know you're more than competent, but as a
general thought, I think a lot of problems arise when otherwise good
programmers decide that they know what they're doing and that "just this
once" it's ok to write a bit of dodgy code because it works fine with their
compiler (this is somewhat along the lines of "a little knowledge can be a
dangerous thing"). It is admittedly a slightly different story if your
compiler documentation defines what happens in a particular situation and
it's guaranteed not to change from one version to the next. In that case, if
you're writing a program which will never be ported to another compiler, you
could argue that it's ok to take advantage of the compiler-specific feature.
In that sort of case, what happens is only undefined according to the
standard; in practice you know what will happen because it's defined by your
compiler. It's still better to make things as portable as possible, though,
you never know if you (or indeed someone else) might need to port your code
to a different compiler at some point.

Stu
Oct 25 '05 #8

Stuart Golodetz wrote:
It's still better to make things as portable as possible, though,
you never know if you (or indeed someone else) might need to port your code
to a different compiler at some point.


Portability is not an absolute. For example, there are many things that
are required by the language definition but aren't portable, because
some compilers don't get them right. So you have to know compiler quirks
if you're going to write portable code. The way to handle non-portable
code is to identify it and test it.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Oct 25 '05 #9

Ferdi Smit wrote:

So you'd say undefined means it can
do anything,

No, it means that the C++ standard doesn't say what it does. There can
be other sources for that information.

and unspecified means it _is_ defined (somewhere), just not
specified how.


Unspecified means that the C++ standard allows a range of (usually
reasonable) behaviors. Implementation defined in most cases means that
the C++ standard allows a range of (usually reasonable) behaviors, and
the compiler documentation must document the compiler's behavior.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Oct 25 '05 #10
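
For contrast with unspecified behavior, here is a small sketch of two implementation-defined properties: the signedness of plain char and the width of int. The function names are hypothetical; the values they return are whatever the compiler in use documents, so nothing below should be read as what any particular compiler does.

```cpp
#include <climits>
#include <cstddef>
#include <limits>

// Whether plain char can hold negative values is implementation-defined.
bool plain_char_is_signed()
{
    return std::numeric_limits<char>::is_signed;
}

// sizeof(int) and CHAR_BIT are both implementation-defined; the standard
// only guarantees a minimum range for int.
std::size_t bits_in_int()
{
    return sizeof(int) * CHAR_BIT;
}
```

Code that needs a particular answer can check these values and fail early, which is one way to "identify and test" a dependency on the implementation.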

"Pete Becker" <pe********@acm.org> wrote in message
news:Va********************@rcn.net...
Stuart Golodetz wrote:
It's still better to make things as portable as possible, though, you
never know if you (or indeed someone else) might need to port your code
to a different compiler at some point.


Portability is not an absolute. For example, there are many things that
are required by the language definition but aren't portable, because some
compilers don't get them right. So you have to know compiler quirks if
you're going to write portable code. The way to handle non-portable code
is to identify it and test it.


Fair enough :) I was probably being a little dogmatic. It's always a good
idea to know the quirks of any compiler you're using. (Actually, it's always
a good idea to know as much as possible, in general, I reckon. Or perhaps
not? Something for the philosophers, maybe...but I digress :)) I do think
it's the case though that what I said about making it "as portable as
possible" was reasonably sound, in that if certain compilers aren't
standard-compliant (and as far as I know, most of them aren't in one way or
another), then of course you have to work around any defects, but to the
extent that you can write standard, portable code, you should. If you can't
make it portable, then yes, clearly marking it so that everyone knows where
it is and testing it are definitely good things to be doing.

Cheers,
Stu
Oct 25 '05 #11

Stuart Golodetz wrote:

Fair enough :) I was probably being a little dogmatic. It's always a good
idea to know the quirks of any compiler you're using. (Actually, it's always
a good idea to know as much as possible, in general, I reckon. Or perhaps
not? Something for the philosophers, maybe...but I digress :)) I do think
it's the case though that what I said about making it "as portable as
possible" was reasonably sound, in that if certain compilers aren't
standard-compliant (and as far as I know, most of them aren't in one way or
another), then of course you have to work around any defects, but to the
extent that you can write standard, portable code, you should. If you can't
make it portable, then yes, clearly marking it so that everyone knows where
it is and testing it are definitely good things to be doing.


In general, that's right. But if writing portable code involves
something that's far more complicated than a hypothetically non-portable
construct that in fact works, go for the non-portable version. For example,

void show_chars(const char *begin, const char *end)
{
    while (begin != end)
        cout << *begin++;
    cout << '\n';
}

int main()
{
    vector<char> vec;
    // insert some chars into vec
    show_chars(&*vec.begin(), &*vec.end()); // GASP!! UNDEFINED BEHAVIOR!!
    return 0;
}

"Improved" version:

int main()
{
    vector<char> vec;
    // insert some chars into vec
    show_chars(&*vec.begin(), &*(vec.end() - 1) + 1);
    return 0;
}

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Oct 25 '05 #12

Pete Becker wrote:

In general, that's right. But if writing portable code involves
something that's far more complicated than a hypothetically
non-portable construct that in fact works, go for the
non-portable version. For example,

I agree in principle, but not for this particular example:

int main()
{
    vector<char> vec;
    // insert some chars into vec
    show_chars(&*vec.begin(), &*vec.end()); // GASP!! UB!!
    return 0;
}

"Improved" version:
    show_chars(&*vec.begin(), &*(vec.end() - 1) + 1);


The GNU C++ standard library doesn't use pointers as
iterators for std::vector. Without seeing their code,
it seems possible that *vec.end() might actually have
undesirable results. And I certainly do know of
an implementation where &*deq.end() blows up, where
deq is a deque.

My 'improved' version would be:
    show_chars( &vec[0], &vec[0] + vec.size() );

Of course, vec[0] and *vec.begin() are equivalent; I just
find &* a bit unaesthetic.

Oct 26 '05 #13
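
The &vec[0] + vec.size() form can be sketched as a compilable fragment. Two things here are assumptions rather than anything from the thread: show_chars takes the output stream as a parameter so the result can be inspected, and render is a hypothetical wrapper that adds a guard for the empty vector, where vec[0] must not be evaluated.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Writes the characters in [begin, end) followed by a newline.
// The stream parameter is an addition made for this sketch.
void show_chars(std::ostream& out, const char* begin, const char* end)
{
    while (begin != end)
        out << *begin++;
    out << '\n';
}

// Hypothetical wrapper: renders the vector's contents into a string.
std::string render(const std::vector<char>& vec)
{
    std::ostringstream out;
    if (vec.empty()) {
        // vec[0] is off limits for an empty vector, so pass an empty range.
        const char* none = 0;
        show_chars(out, none, none);
    } else {
        // Vector storage is contiguous, so plain pointer arithmetic
        // from the first element reaches one past the last.
        const char* first = &vec[0];
        show_chars(out, first, first + vec.size());
    }
    return out.str();
}
```

For a vector holding 'a', 'b', 'c', render returns "abc" followed by a newline. With a C++11 or later library, vec.data() should be able to replace &vec[0] and makes the empty-vector guard unnecessary.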

Old Wolf wrote:

I agree in principle, but not for this particular example:

The point wasn't to design the code, but to look at what people actually
do. The code I posted was paraphrased from something someone sent me.
They'd have been better off spending less time avoiding undefined
behavior and just writing something that worked.

The GNU C++ standard library

The code wasn't written for the GNU C++ standard library.

doesn't use pointers as
iterators for std::vector. Without seeing their code,
it seems possible that *vec.end() might actually have
undesirable results.

Nevertheless, it worked. And that's the point: writing code that works
is sometimes more important than writing code that is theoretically better.

And I certainly do know of
an implementation where &*deq.end() blows up, where
deq is a deque.


The code didn't use a deque.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Oct 26 '05 #14

please use namespace (std::)

Oct 27 '05 #15

ti****@kingsoft.net wrote:
please use namespace (std::)


You forgot to mention #include directives, and a host of other things
that are needed in compilable code. But none of that is needed in code
snippets when the meaning is clear.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Oct 27 '05 #16
