Bytes | Software Development & Data Engineering Community

C++ teaser: Is this a compiler bug, or is this expected behavior?

Compile the following snippet of code and run it. If the program spits
out bat:bat instead of bat:zat, what would you say? Would you say that
the compiler has a problem, or would you lay the blame on "undefined
execution of function parameters" in the C/C++ standard and "sequence
points"?

/////// Code snippet begins ///////

#include <iostream>

char foo[10]="cat";

char* writestring()
{
    foo[0]='b';
    return foo;
}

char* write2()
{
    foo[0]='z';
    return foo;
}

int main(void)
{
    std::cout << writestring() << ":" << write2() << std::endl;
}

/////// Code snippet ends ///////

Thanks,
Bhat
[Purists who hold that this NG is meant to discuss compiler neutral,
standard C++ issues only may not proceed beyond this point;-)]




For those of you who are "trivially inclined", here's some
background.
I stumbled upon a "bug" in my C++ compiler (g++ 3.3.1), which I
promptly reported to Bugzilla. The code snippet above was actually
provided by someone from the GCC volunteer community. They attributed
the unexpected behavior to the undefined behavior of execution of
function parameters and sequence points. In my original code snippet,
I was maintaining an STL map from one IP address and port to another,
e.g. from 105.52.20.33, 5000 to 47.32.68.95, 6000.

When I displayed the entries in the map, the second IP address was
displayed incorrectly. So instead of the mapping:

105.52.20.33, 5000 >>-->> 47.32.68.95, 6000
I got

105.52.20.33, 5000 >>-->> 105.52.20.33, 6000

The bug does not manifest when the code is compiled using native
Solaris C++ compiler version "WorkShop Compilers 5.0 02/04/10 C++ 5.0
Patch 107311-17"

Here's my original code snippet

/////// Code snippet begins ///////
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#include <functional>
#include <string>
#include <map>
#include <iostream>

using namespace std;

// Orders addresses by their dotted-quad text, then by port.
struct addrLessThan : public binary_function<const struct sockaddr_in,
                                             const struct sockaddr_in, bool>
{
    bool operator()(const struct sockaddr_in addr1,
                    const struct sockaddr_in addr2) const
    {
        bool retVal = true;

        // Each result is copied into a string before the next call,
        // because inet_ntoa() reuses one static buffer.
        string addrStr1 = inet_ntoa(addr1.sin_addr);
        string addrStr2 = inet_ntoa(addr2.sin_addr);

        if(addrStr1 > addrStr2)
            retVal = false;
        else if(addrStr1 == addrStr2)
            retVal = (addr1.sin_port < addr2.sin_port);

        return retVal;
    }
};

typedef map<struct sockaddr_in, struct sockaddr_in, addrLessThan>
    IpV4AddrMap;

int main()
{
    struct sockaddr_in actualAddress, mappedAddress;

    actualAddress.sin_port=5000;
    actualAddress.sin_addr.s_addr = inet_addr("105.52.20.33");

    mappedAddress.sin_port=6000;
    mappedAddress.sin_addr.s_addr = inet_addr("47.32.68.95");

    IpV4AddrMap map;

    map[actualAddress] = mappedAddress;

    IpV4AddrMap::iterator itor = map.find(actualAddress);

    if(itor != map.end())
    {
        // Both inet_ntoa() calls below return the same static buffer,
        // which is where the surprising output comes from.
        cout << "Key: " << inet_ntoa(itor->first.sin_addr)
             << ", " << itor->first.sin_port << endl
             << "Value: " << inet_ntoa(itor->second.sin_addr)
             << ", " << itor->second.sin_port << endl
             << endl;
    }
    return 0;
}

/////// Code snippet ends ///////
For more details, you can go to
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22265

Jul 23 '05 #1
Generic Usenet Account wrote:
Compile the following snippet of code and run it. If the program spits
out bat:bat instead of bat:zat, what would you say? Would you say that
the compiler has a problem, or would you lay the blame on "undefined
execution of function parameters" in the C/C++ standard and "sequence
points"?

/////// Code snippet begins ///////

#include <iostream>
char foo[10]="cat";
char* writestring()
{
foo[0]='b';
return foo;
}

char* write2()
{
foo[0]='z';
return foo;
}
int main(void)
{ std::cout << writestring() << ":" << write2() << std::endl; }
[..]


Yes, the latter. The correct term is "the order of evaluation of the
function arguments is unspecified". A simpler expression is

cout << writestring() << write2();

which is the same as

operator<<( operator<<( cout, writestring() ), write2() );

(for char* operands the inserter is a non-member function), in which
'write2()' is allowed to be evaluated before 'writestring()' as I
understand it. 'write2()' is one argument of the outer call. The other
argument is the inner call, which also needs to be evaluated...

V
Jul 23 '05 #2
Generic Usenet Account wrote:
Compile the following snippet of code and run it. If the program
spits out bat:bat instead of bat:zat, what would you say? Would
you say that the compiler has a problem, or would you lay the blame
on "undefined execution of function parameters" in the C/C++
standard and "sequence points"?

#include <iostream>
char foo[10]="cat";
char* writestring()
{
foo[0]='b';
return foo;
}

char* write2()
{
foo[0]='z';
return foo;
}

int main(void)
{ std::cout << writestring() << ":" << write2() << std::endl; }
The behaviour is unspecified (NOT undefined), and
"bat:bat", "bat:zat", and "zat:zat" are all valid outputs.
(But "zat:bat" is not.)

You must remember that writestring() can be called at any point
between the start of this statement's execution, and the point
where its return value is needed. The same goes for write2().

Sequence points are not an issue here, because there are
no instances of multiple side-effects occurring without an
intervening sequence point (a function call has a sequence
point after its arguments have been evaluated, and another one
as it returns).

The example has some similarities to:

foo( a(), b() );

where there is no reason to suspect that a() will be called
before b().

BTW, why are you posting to comp.sources.d?

> The code snippet above was actually provided by someone from
> the GCC volunteer community. They attributed the unexpected
> behavior to the undefined behavior of execution of function
> parameters and sequence points.

If that was their exact wording, then they are wrong (or
expressed their intention incorrectly).

The behaviour is only unexpected if you were expecting
the wrong thing :)

> I stumbled upon a "bug" in my C++ compiler (g++ 3.3.1), which I
> promptly reported to Bugzilla.
>
> cout << "Key: " << inet_ntoa(itor->first.sin_addr)
> << ", " << itor->first.sin_port << endl
> << "Value: " << inet_ntoa(itor->second.sin_addr)
> << ", " << itor->second.sin_port << endl
> << endl;


Unfortunately you have wasted the time of the Bugzilla people.
You have correctly identified the essence of the "problem",
namely that inet_ntoa() returns a pointer into a static buffer.
In fact, on my system, the inet_ntoa manpage specifically
says:
The string is
returned in a statically allocated buffer, which
subsequent calls will overwrite.

If you still think this is a bug, then what do you think the
'fix' should be? The most common suggestion that people make
on comp.lang.c (or c++) is to force left-to-right evaluation
of function parameters.

This has been discussed to death before, but the main reason
for opposing it is that it would force compilers to produce
slower code in many cases. For example, some calling conventions
push parameters onto a stack, right-most parameter first. With such
a convention the compiler can simply evaluate each argument from
right to left and push its value straight away. Forcing left-to-right
evaluation would make it jump through some hoops, such as holding
the earlier results in temporaries until they can be pushed.

Jul 23 '05 #3
Geo
Old Wolf wrote:
The behaviour is unspecified (NOT undefined), and
"bat:bat", "bat:zat", and "zat:zat" are all valid outputs.
(But "zat:bat" is not.)


Attempting to modify a literal value is undefined behaviour, surely ?

Jul 23 '05 #4


Geo wrote:
Old Wolf wrote:
The behaviour is unspecified (NOT undefined), and
"bat:bat", "bat:zat", and "zat:zat" are all valid outputs.
(But "zat:bat" is not.)


Attempting to modify a literal value is undefined behaviour, surely ?


It would be. However, char[10] is not a literal. It can be modified.
It's equivalent to { int foo = 10; ++foo; } That doesn't modify 10.

HTH,
Michiel Salters

Jul 23 '05 #5
Geo


msalters wrote:
Geo wrote:
Old Wolf wrote:
The behaviour is unspecified (NOT undefined), and
"bat:bat", "bat:zat", and "zat:zat" are all valid outputs.
(But "zat:bat" is not.)


Attempting to modify a literal value is undefined behaviour, surely ?


It would be. However, char[10] is not a literal. It can be modified.
It's equivalent to { int foo = 10; ++foo; } That doesn't modify 10.

HTH,
Michiel Salters


No it's not equivalent at all,
char foo[10]="cat";

reserves 10 character slots and points char[0] at the address of "cat",
which is a literal. Later, foo[0] = 'z' is an attempt to modify the
first character of "cat", i.e. modify the literal, which is undefined
behaviour.

Jul 23 '05 #6
Geo wrote:

msalters wrote:
Geo wrote:
Old Wolf wrote:

> The behaviour is unspecified (NOT undefined), and
> "bat:bat", "bat:zat", and "zat:zat" are all valid outputs.
> (But "zat:bat" is not.)

Attempting to modify a literal value is undefined behaviour, surely ?


It would be. However, char[10] is not a literal. It can be modified.
It's equivalent to { int foo = 10; ++foo; } That doesn't modify 10.

HTH,
Michiel Salters


No it's not equivalent at all,

char foo[10]="cat";

reserves 10 character slots and points char[0] at the address of "cat",
which is a literal. Later, foo[0] = 'z' is an attempt to modify the
first character of "cat", i.e. modify the literal, which is undefined
behaviour.


You might want to reread your 'C++ beginning programmers introduction'
to figure out what
char foo[10] = "cat";
really does.
Hint: It does not do what you describe above.

--
Karl Heinz Buchegger
kb******@gascad.at
Jul 23 '05 #7
Geo wrote:

msalters wrote:
Geo wrote:

Attempting to modify a literal value is undefined behaviour, surely ?
It would be. However, char[10] is not a literal. It can be modified.
It's equivalent to { int foo = 10; ++foo; } That doesn't modify 10.



No it's not equivalent at all,
char foo[10]="cat";

reserves 10 character slots and points char[0] at the address of "cat",
which is a literal. Later, foo[0] = 'z' is an attempt to modify the
first character of "cat", i.e. modify the literal, which is undefined
behaviour.


I may be wrong, but I was under the impression that:
char foo[10]="cat";
results in an array of size 10 in which members are initialised from
the string "cat" (including terminating \0). Whereas
char *bar="cat";
results in a pointer which points to the address of the literal "cat".

--
imalone
Jul 23 '05 #8


Geo wrote:
msalters wrote:
Geo wrote:
Old Wolf wrote:

> The behaviour is unspecified (NOT undefined), and
> "bat:bat", "bat:zat", and "zat:zat" are all valid outputs.
> (But "zat:bat" is not.)

Attempting to modify a literal value is undefined behaviour, surely ?


It would be. However, char[10] is not a literal. It can be modified.
It's equivalent to { int foo = 10; ++foo; } That doesn't modify 10.

HTH,
Michiel Salters


No it's not equivalent at all,
char foo[10]="cat";

reserves 10 character slots and points char[0] at the address of "cat",
which is a literal. Later, foo[0] = 'z' is an attempt to modify the
first character of "cat", i.e. modify the literal, which is undefined
behaviour.


That's the description for { const char* foo = "cat"; }

You can't even point foo[0] to "cat". foo[0] is a char, check typeid()
or sizeof() if you don't believe me. A 'char' is not a 'char*', and
only the latter points.

Also, if you could point foo to "cat", you surely could later point
it to "dog". However, the compiler will tell you that

char foo[10]="cat";
foo = "dog";

is illegal. Of course,

const char* foo = "cat";
foo = "dog";

is legal.

HTH,
Michiel Salters

Jul 23 '05 #9
Geo
Sorry my mistake, you are all of course correct, I'll shut up now.

Jul 23 '05 #10


Old Wolf wrote:
The behaviour is unspecified (NOT undefined), and
"bat:bat", "bat:zat", and "zat:zat" are all valid outputs.
(But "zat:bat" is not.)

I find it somewhat hard to accept that "bat:bat" and "zat:zat" are
valid outputs. In fact I would be more willing to accept "zat:bat" as
valid output. The reason is that I can live with the fact that within
the same statement, the order of evaluating the functions is undefined.

Rick

Jul 23 '05 #11


ri******@yahoo.com wrote:
Old Wolf wrote:
The behaviour is unspecified (NOT undefined), and
"bat:bat", "bat:zat", and "zat:zat" are all valid outputs.
(But "zat:bat" is not.)

I find it somewhat hard to accept that "bat:bat" and "zat:zat" are
valid outputs. In fact I would be more willing to accept "zat:bat" as
valid output. The reason is that I can live with the fact that within
the same statement, the order of evaluating the functions is undefined.


True. So if write2 is called first, the string is changed to 'zat' and
then later to 'bat'. The char* returned is the same in both cases. If
cout only looks at that char* after both write*s have returned, it
will see "bat" twice, since the char* returned from write2 points
to memory later overwritten by a 'b'.

The point to remember is that write2 doesn't return a char* pointing
to a historical state of memory. It points to some memory, and the
user of write2 has to be aware that the contents of that memory can
change even after write2 returns.

Regards
Michiel Salters

Jul 23 '05 #12
ri******@yahoo.com wrote:

Old Wolf wrote:
The behaviour is unspecified (NOT undefined), and
"bat:bat", "bat:zat", and "zat:zat" are all valid outputs.
(But "zat:bat" is not.)


I find it somewhat hard to accept that "bat:bat" and "zat:zat" are
valid outputs. In fact I would be more willing to accept "zat:bat" as
valid output. The reason is that I can live with the fact that within
the same statement, the order of evaluating the functions is undefined.


But writestring() and write2() both return the same value, which is the
address of foo[0], *regardless* of what foo[] contains at any given
point in time. So why wouldn't they be the same?

--
Mike Smith
Jul 23 '05 #13
On 5 Jul 2005 09:20:55 -0700, "msalters"
<Mi*************@logicacmg.com> did courageously avow:


ri******@yahoo.com wrote:
Old Wolf wrote:
> The behaviour is unspecified (NOT undefined), and
> "bat:bat", "bat:zat", and "zat:zat" are all valid outputs.
> (But "zat:bat" is not.)
>

I find it somewhat hard to accept that "bat:bat" and "zat:zat" are
valid outputs. In fact I would be more willing to accept "zat:bat" as
valid output. The reason is that I can live with the fact that within
the same statement, the order of evaluating the functions is undefined.


True. So if write2 is called first, the string is changed to 'zat' and
then later to 'bat'. The char* returned is the same in both cases. If
cout only looks at that char* after both write*s have returned, it
will see "bat" twice, since the char* returned from write2 points
to memory later overwritten by a 'b'.

The point to remember is that write2 doesn't return a char* pointing
to a historical state of memory. It points to some memory, and the
user of write2 has to be aware that the contents of that memory can
change even after write2 returns.

Regards
Michiel Salters


Avoid the whole issue altogether. Make two independent char *
variables to capture the output from the functions and then use the
variables in the cout statement. Now, everything is defined, no
surprises about the running order of things, and the debate ceases.
I find the best practice for me, when I am not sure of any potential
side effects or questions, is to use function calls in-line in
statements only where I am absolutely sure of the consequences and
consign the return of all others to variables I can safely use
wherever and whenever I choose.

Ken Wilson

Amer. Dlx. Tele, Gary Moore LP, LP DC Classic w/P90s,
Jeff Beck Strat, Morgan OM Acoustic,
Rick 360/12, Std. Strat (MIM), Mesa 100 Nomad,
Mesa F-30

"Goodnight Austin, Texas, wherever you are."
Jul 23 '05 #14
