Bytes IT Community

Undefined Behavior...

P: n/a
Will the following statement invoke undefined behavior :

a^=b,b^=a,a^=b ;

given that a and b are of int-type ??

Be cautious, I have not written a^=b^=a^=b ; which, of course, is
undefined. I am having some confusion with the former statement!

Also, state the reason for the statement being undefined!
Nov 15 '08 #1
33 Replies


P: n/a
co**************@gmail.com wrote:
> Will the following statement invoke undefined behavior :
>
> a^=b,b^=a,a^=b ;
>
> given that a and b are of int-type ??

No, because the comma operator introduces a sequence point.

> Be cautious, I have not written a^=b^=a^=b ; which, of course, is
> undefined. I am having some confusion with the former statement!
>
> Also, state the reason for the statement being undefined!

6.5.17p2:
The left operand of a comma operator is evaluated as a void expression;
there is a sequence point after its evaluation.

Bye, Jojo
Nov 15 '08 #2

P: n/a
<<Be cautious, I have not written a^=b^=a^=b ; which, of course, is
undefined. I am having some confusion with the former statement!>>

I think this expression is valid, as the assignment operator is evaluated
from right to left, so it is parsed as
(a^=(b^=(a^=b)))
The above expression swaps two numbers.
Please correct me if I am wrong...
Nov 15 '08 #3

P: n/a
c.***********@gmail.com wrote:
> <<Be cautious, I have not written a^=b^=a^=b ; which, of course, is
> undefined. I am having some confusion with the former statement!>>
>
> I think this expression is valid as assignment operator is evaluated
> from right to left so it is parsed as
> (a^=(b^=(a^=b)))
> above expression swaps two numbers
> please correct me if i am wrong...

This modifies 'a' twice without an intervening sequence point ->
undefined behavior

Bye, Jojo
Nov 15 '08 #4

P: n/a
On Nov 15, 3:18 pm, "c.lang.mys...@gmail.com"
<c.lang.mys...@gmail.com> wrote:
> <<Be cautious, I have not written a^=b^=a^=b ; which, of course, is
> undefined. I am having some confusion with the former statement!>>
>
> I think this expression is valid as assignment operator is evaluated
> from right to left so it is parsed as
> (a^=(b^=(a^=b)))

6.5.17

1.
expression:
assignment-expression
expression , assignment-expression

2.
The left operand of a comma operator is evaluated as a void
expression; there is a sequence point after its evaluation. Then the
right operand is evaluated; the result has its type and value.

So, you have:

a^=b, b^=a, a^=b
[expression][assignment-expression]
|
a^=b, b^=a
[expression][assignment-expression]
|
a^=b
[assignment-expression]

OTOH, a^=b, b^=a, a^=b
[left operand] [right operand]

The left operand is evaluated as void, and the comma expression has the
type and value of the right operand: a^=b.
But the left operand is: a^=b, b^=a
And again: a^=b, b^=a
[left operand] [right operand]
The left operand is evaluated as void, and the comma expression has the
type and value of the right operand: b^=a.

So, the sequence should be:
evaluate a^=b, then b^=a, then a^=b.
You should get the same result as in this case:

a^=b;
b^=a;
a^=b;
Nov 15 '08 #5

P: n/a
maverik wrote:
> On Nov 15, 3:18 pm, "c.lang.mys...@gmail.com"
> <c.lang.mys...@gmail.com> wrote:
>> <<Be cautious, I have not written a^=b^=a^=b ; which, of course, is
>> undefined. I am having some confusion with the former statement!>>
>>
>> I think this expression is valid as assignment operator is evaluated
>> from right to left so it is parsed as
>> (a^=(b^=(a^=b)))
>
> 6.5.17
>
> 1.
> expression:
> assignment-expression
> expression , assignment-expression
>
> 2.
> The left operand of a comma operator is evaluated as a void
> expression; there is a sequence point after its evaluation. Then the
> right operand is evaluated; the result has its type and value.
>
> So, you have:
>
> a^=b, b^=a, a^=b

You don't have any comma operators in the code that you quoted.

--
pete
Nov 15 '08 #6

P: n/a
co**************@gmail.com wrote:
> Will the following statement invoke undefined behavior :
>
> a^=b,b^=a,a^=b ;
>
> given that a and b are of int-type ??

That's perfectly safe. As a stand-alone statement, it is exactly
equivalent to

a^=b;
b^=a;
a^=b;

The use of the comma operator is only needed in cases like for(A; B; C),
where A, B, and C can each only be a single expression. In any other
context, I'd recommend breaking it out into three separate statements, but
only for the sake of clarity - it's perfectly legal as a single statement.

> Be cautious, I have not written a^=b^=a^=b ; which, of course, is
> undefined. I am having some confusion with the former statement!
>
> Also, state the reason for the statement being undefined!

6.5p2: "Between the previous and next sequence point an object shall
have its stored value modified at most once by the evaluation of an
expression".

Because the second version violates a "shall" occurring outside of a
"Constraints" section, the behavior is undefined.

The key difference between the two statements is the presence of the ','
operators in the first version. The ',' operator inserts a sequence
point separating its two operands.
Nov 15 '08 #7
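
For readers following along, the equivalence described in the post above can be sketched roughly as follows. The function names swap_comma and swap_statements are made up for illustration and do not come from the thread; the pointer parameters and the assert are also assumptions, added only so that the two forms can be called and compared.

```c
#include <assert.h>

/* One expression statement; each comma operator is a sequence point,
 * so *a and *b are never modified twice between sequence points. */
void swap_comma(int *a, int *b)
{
    assert(a != b);  /* XOR-swapping an object with itself would zero it */
    *a ^= *b, *b ^= *a, *a ^= *b;
}

/* The same three steps written as three separate statements. */
void swap_statements(int *a, int *b)
{
    assert(a != b);
    *a ^= *b;
    *b ^= *a;
    *a ^= *b;
}
```

Called on two distinct ints holding 5 and 9, either function should leave them holding 9 and 5. The single-expression a^=b^=a^=b form is deliberately not shown as code, since the thread agrees its behavior is undefined.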

P: n/a
On Nov 15, 5:24 pm, "Joachim Schmitz" <nospam.j...@schmitz-digital.de>
wrote:
> c.lang.mys...@gmail.com wrote:
>> <<Be cautious, I have not written a^=b^=a^=b ; which, of course, is
>> undefined. I am having some confusion with the former statement!>>
>> I think this expression is valid as assignment operator is evaluated
>> from right to left so it is parsed as
>> (a^=(b^=(a^=b)))
>> above expression swaps two numbers
>> please correct me if i am wrong...
>
> This is modifying 'a' twice without an intermediate sequence point ->
> undefined behavior
>
> Bye, Jojo

Sorry, I am still not able to get it...

Now again consider a^=b^=a^=b.
When the compiler sees this expression, it starts from the RHS, as the
assignment operator is right associative.
So first it has to calculate the rightmost a^=b, and then the value of a
must change... It then proceeds to calculate b^=a, then again a^=b...

We can also use a=b=3;
Do you mean that after the rightmost a^=b, we cannot be sure that the value
of a has changed until we reach a sequence point?
But a=b=3 is valid.

Please clear my confusion...
thanks,

Nov 15 '08 #8

P: n/a
On Nov 15, 3:34 pm, pete <pfil...@mindspring.com> wrote:
> You don't have any comma operators in the code that you quoted.

OK, I meant this, of course: a^=b,b^=a,a^=b;

Nov 15 '08 #9

P: n/a
On Nov 15, 2:39 pm, "c.lang.mys...@gmail.com"
<c.lang.mys...@gmail.com> wrote:
> On Nov 15, 5:24 pm, "Joachim Schmitz" <nospam.j...@schmitz-digital.de>
> wrote:
>> c.lang.mys...@gmail.com wrote:
>>> <<Be cautious, I have not written a^=b^=a^=b ; which, of course, is
>>> undefined. I am having some confusion with the former statement!>>
>>> I think this expression is valid as assignment operator is evaluated
>>> from right to left so it is parsed as
>>> (a^=(b^=(a^=b)))
>>> above expression swaps two numbers
>>> please correct me if i am wrong...
>> This is modifying 'a' twice without an intermediate sequence point ->
>> undefined behavior
>> Bye, Jojo
>
> sorry,I am still not able to get it...

What don't you get?

> now again consider a^=b^=a^=b
> when compiler see this expression it starts from RHS as assignment

Wrong, when the compiler sees that expression, he can do whatever he
wants because that expression invokes undefined behavior.

> operator is right associative

Uh, no, it's UB.

> so first it has to calculate rightmost a^=b then value of a must have

Uh, no, it's UB.

> to change....it proceeds then to calculate b^=a ..then again a^=b....

Uh, no, it's UB.

> as we can also use a=b=3;

That's different from the former expression, because in that
expression a and b are each modified only once between sequence points.

> Do you mean that after right most a^=b,we can not be sure that value

No, a^=b^=a^=b invokes UB and if you have that in your code, you can't
be sure of anything at all.

> but a=b=3 is valid
> Please clear my confusion...

For the reasons I explained before. Now I have a question: How the
hell do you know that a = b = 3 is valid if you don't understand and
you're confused?
Nov 15 '08 #10
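
The point about a=b=3 in the post above can be sketched as follows. This is only an illustration under stated assumptions: assign_both is a made-up name, and the pointer parameters are there just so the "each object is modified once" condition is visible as a check.

```c
#include <assert.h>

/* Chained assignment parses as *a = (*b = 3). Each of the two objects is
 * modified exactly once, so 6.5p2 is satisfied as long as a and b really
 * are different objects. */
int assign_both(int *a, int *b)
{
    /* If a == b, the one object would be modified twice with no sequence
     * point in between, which is the same problem as a^=b^=a^=b. */
    assert(a != b);
    return *a = *b = 3;
}
```

With two distinct ints x and y, assign_both(&x, &y) should return 3 and leave both set to 3.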

P: n/a
On Sat, 15 Nov 2008 04:56:25 -0800, vippstar wrote:
> On Nov 15, 2:39 pm, "c.lang.mys...@gmail.com" <c.lang.mys...@gmail.com>
> wrote:
>> now again consider a^=b^=a^=b
>> when compiler see this expression it starts from RHS as assignment
>
> Wrong, when the compiler sees that expression, he can do whatever he
> wants because that expression invokes undefined behavior.

When the compiler sees that expression, it cannot be sure whether it will
be executed. It must compile the code, and if the program reaches the
point where the expression is evaluated, _then_ the behaviour is
undefined. The compiler cannot do whatever it wants. The compiler can make
the resulting code do whatever it wants.

>> operator is right associative
>
> Uh, no, it's UB.

Uh, yes, ^= is right associative. The behaviour is undefined but that does
not affect parsing.
Nov 15 '08 #11

P: n/a
On Nov 15, 3:11 pm, Harald van Dijk <true...@gmail.com> wrote:
> On Sat, 15 Nov 2008 04:56:25 -0800, vippstar wrote:
>> On Nov 15, 2:39 pm, "c.lang.mys...@gmail.com" <c.lang.mys...@gmail.com>
>> wrote:
>>> now again consider a^=b^=a^=b
>>> when compiler see this expression it starts from RHS as assignment
>> Wrong, when the compiler sees that expression, he can do whatever he
>> wants because that expression invokes undefined behavior.
>
> When the compiler sees that expression, it cannot be sure whether it will
> be executed. It must compile the code, and if the program reaches the
> point where the expression is evaluated, _then_ the behaviour is
> undefined. The compiler cannot do whatever it wants. The compiler can make
> the resulting code do whatever it wants.

What ELSE could the compiler do other than having the output be
whatever he wants? I don't think you've really added to what I said.

>>> operator is right associative
>> Uh, no, it's UB.
>
> Uh, yes, ^= is right associative. The behaviour is undefined but that does
> not affect parsing.

Well, UB is when the standard doesn't impose any requirements for the
behavior of the implementation. You'd argue that 'parsing' is not
included in 'behavior'? You might be right but I'd like quotes from
the standard to trust you. (you are certainly credible, but c&v from
the standard is always nice)
Nov 15 '08 #12

P: n/a
Harald van Dijk wrote:
> On Sat, 15 Nov 2008 04:56:25 -0800, vippstar wrote:
>> On Nov 15, 2:39 pm, "c.lang.mys...@gmail.com" <c.lang.mys...@gmail.com>
>> wrote:
>>> now again consider a^=b^=a^=b
>>> when compiler see this expression it starts from RHS as assignment
>> Wrong, when the compiler sees that expression, he can do whatever he
>> wants because that expression invokes undefined behavior.
>
> When the compiler sees that expression, it cannot be sure whether it will
> be executed. It must compile the code, and if the program reaches the
> point where the expression is evaluated, _then_ the behaviour is
> undefined. The compiler cannot do whatever it wants. The compiler can make
> the resulting code do whatever it wants.

The standard quite explicitly says that the consequences of undefined
behavior can include failing to compile, which clearly indicates that it
can precede execution of the relevant code. The standard does not
explain this in any detail, but I believe that the relevant rule is that
the undefined behavior is allowed at any point after execution of the
relevant code becomes inevitable.

Example:

if(some condition)
a^=b^=a^=b;

For code like this, the behavior of the code becomes undefined as soon
as it becomes inevitable that the if() clause will be executed. This
means that at points in the code prior to the if() statement, the compiler
is allowed to generate code using optimizations that only work if the
if-condition is not true. As a result, those optimizations may cause
your code to misbehave long before evaluation of the offending statement.
Nov 15 '08 #13

P: n/a
On Sat, 15 Nov 2008 13:38:21 +0000, James Kuyper wrote:
> Harald van Dijk wrote:
>> On Sat, 15 Nov 2008 04:56:25 -0800, vippstar wrote:
>>> On Nov 15, 2:39 pm, "c.lang.mys...@gmail.com"
>>> <c.lang.mys...@gmail.com> wrote:
>>>> now again consider a^=b^=a^=b
>>>> when compiler see this expression it starts from RHS as assignment
>>> Wrong, when the compiler sees that expression, he can do whatever he
>>> wants because that expression invokes undefined behavior.
>>
>> When the compiler sees that expression, it cannot be sure whether it
>> will be executed. It must compile the code, and if the program reaches
>> the point where the expression is evaluated, _then_ the behaviour is
>> undefined. The compiler cannot do whatever it wants. The compiler can
>> make the resulting code do whatever it wants.
>
> The standard quite explicitly says that the consequences of undefined
> behavior can include failing to compile,

It does quite explicitly say so, but there is a distinction between
compile-time undefined behaviour and run-time undefined behaviour (not
spelled out in the standard, but made explicit in DRs). This is run-time
undefined behaviour, where refusing to compile is permitted only if the
compiler can prove that the code would always be executed.

> which clearly indicates that it
> can precede execution of the relevant code. The standard does not
> explain this in any detail, but I believe that the relevant rule is that
> the undefined behavior is allowed at any point after execution of the
> relevant code becomes inevitable.

I agree with this.
Nov 15 '08 #14

P: n/a
On Sat, 15 Nov 2008 05:37:35 -0800, vippstar wrote:
> On Nov 15, 3:11 pm, Harald van Dijk <true...@gmail.com> wrote:
>> On Sat, 15 Nov 2008 04:56:25 -0800, vippstar wrote:
>>> On Nov 15, 2:39 pm, "c.lang.mys...@gmail.com"
>>> <c.lang.mys...@gmail.com> wrote:
>>>> now again consider a^=b^=a^=b
>>>> when compiler see this expression it starts from RHS as assignment
>>> Wrong, when the compiler sees that expression, he can do whatever he
>>> wants because that expression invokes undefined behavior.
>>
>> When the compiler sees that expression, it cannot be sure whether it
>> will be executed. It must compile the code, and if the program reaches
>> the point where the expression is evaluated, _then_ the behaviour is
>> undefined. The compiler cannot do whatever it wants. The compiler can
>> make the resulting code do whatever it wants.
>
> What ELSE could the compiler do other than having the output be whatever
> he wants? I don't think you've really added to what I said.

A non-conforming compiler could give an error message and refuse to
compile any program containing a^=b^=a^=b, but a conforming compiler is
not allowed to do so (or at least not always).

>>>> operator is right associative
>>> Uh, no, it's UB.
>>
>> Uh, yes, ^= is right associative. The behaviour is undefined but that
>> does not affect parsing.
>
> Well, UB is when the standard doesn't impose any requirements for the
> behavior of the implementation.

Yes, but to get to the part where the standard says the behaviour is
undefined you need to have already parsed the expression. I can't give an
exact chapter and verse, but can you explain how you can determine that
the behaviour is undefined without parsing?
Nov 15 '08 #15

P: n/a
On Nov 15, 4:04 pm, Harald van Dijk <true...@gmail.com> wrote:
> On Sat, 15 Nov 2008 05:37:35 -0800, vippstar wrote:
>> On Nov 15, 3:11 pm, Harald van Dijk <true...@gmail.com> wrote:
>>> On Sat, 15 Nov 2008 04:56:25 -0800, vippstar wrote:
>>>> On Nov 15, 2:39 pm, "c.lang.mys...@gmail.com"
>>>> <c.lang.mys...@gmail.com> wrote:
>>>>> now again consider a^=b^=a^=b
>>>>> when compiler see this expression it starts from RHS as assignment
>>>> Wrong, when the compiler sees that expression, he can do whatever he
>>>> wants because that expression invokes undefined behavior.
>>> When the compiler sees that expression, it cannot be sure whether it
>>> will be executed. It must compile the code, and if the program reaches
>>> the point where the expression is evaluated, _then_ the behaviour is
>>> undefined. The compiler cannot do whatever it wants. The compiler can
>>> make the resulting code do whatever it wants.
>> What ELSE could the compiler do other than having the output be whatever
>> he wants? I don't think you've really added to what I said.
>
> A non-conforming compiler could give an error message and refuse to
> compile any program containing a^=b^=a^=b, but a conforming compiler is
> not allowed to do so (or at least not always).

Wrong, a conforming compiler can reject that code. The reason might be
quite different than the expression, but it serves more like an excuse
to reject this. For example, the compiler could claim (assuming a, b,
int) that the source exceeds the environmental limits.

>>>>> operator is right associative
>>>> Uh, no, it's UB.
>>> Uh, yes, ^= is right associative. The behaviour is undefined but that
>>> does not affect parsing.
>> Well, UB is when the standard doesn't impose any requirements for the
>> behavior of the implementation.
>
> Yes, but to get to the part where the standard says the behaviour is
> undefined you need to have already parsed the expression. I can't give an
> exact chapter and verse, but can you explain how you can determine that
> the behaviour is undefined without parsing?

If I understand, what you're saying is that the semantics of something
are not affected by UB, well that's wrong I believe. ^= doesn't need
to be anything meaningful after the first UB in source code.
Nov 15 '08 #16

P: n/a
On Sat, 15 Nov 2008 06:19:14 -0800, vippstar wrote:
> On Nov 15, 4:04 pm, Harald van Dijk <true...@gmail.com> wrote:
>> A non-conforming compiler could give an error message and refuse to
>> compile any program containing a^=b^=a^=b, but a conforming compiler is
>> not allowed to do so (or at least not always).
>
> Wrong, a conforming compiler can reject that code. The reason might be
> quite different than the expression, but it serves more like an excuse
> to reject this. For example, the compiler could claim (assuming a, b,
> int) that the source exceeds the environmental limits.

Heh, that's nasty. I suppose it has to also reject a^=b^=c^=d, then, to
cover up its lie? I can't think of a limit right now that a^=b^=a^=b might
exceed that a^=b^=c^=d doesn't, and if you can, then I can try to come up
with another valid example that also exceeds that limit. :)

>>>> Uh, yes, ^= is right associative. The behaviour is undefined but
>>>> that does not affect parsing.
>>> Well, UB is when the standard doesn't impose any requirements for the
>>> behavior of the implementation.
>>
>> Yes, but to get to the part where the standard says the behaviour is
>> undefined you need to have already parsed the expression. I can't give
>> an exact chapter and verse, but can you explain how you can determine
>> that the behaviour is undefined without parsing?
>
> If I understand, what you're saying is that the semantics of something
> are not affected by UB, well that's wrong I believe.

Yes, that is wrong, and no, that is not what I'm saying.

> ^= doesn't need to
> be anything meaningful after the first UB in source code.

To determine that a^=b^=a^=b modifies a at all, you need to already have
determined that this is an assignment-expression with only a as its left
operand. In other words, that it is equivalent to a^=(b^=(a^=b)) instead
of ((a^=b)^=a)^=b. The latter is simply a constraint violation and
attempts to modify a only once: the other ^= operators attempt to modify
the result of an assignment.
Nov 15 '08 #17
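
The parsing point in the post above, and the well-defined a^=b^=c^=d mentioned there, can be sketched as follows. The function name chain_xor and the copies a2, b2, c2 are made up for illustration; this is a sketch under the assumption that all four operands are distinct objects, which passing them by value guarantees.

```c
#include <assert.h>

/* With four distinct objects, every object is modified at most once, so
 * the chained form is well defined. It groups right to left, the same as
 * the explicitly parenthesized form below. */
int chain_xor(int a, int b, int c, int d)
{
    int a2 = a, b2 = b, c2 = c;   /* copies for the explicitly grouped form */

    a ^= b ^= c ^= d;             /* parsed as a ^= (b ^= (c ^= d)) */
    a2 ^= (b2 ^= (c2 ^= d));

    /* Both spellings must leave identical values behind. */
    assert(a == a2 && b == b2 && c == c2);
    return a;
}
```

For example, chain_xor(1, 2, 4, 8) should return 15. The left-to-right grouping ((a^=b)^=c)^=d is not shown because, as the post says, it is a constraint violation in C: the result of an assignment is not an lvalue.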

P: n/a
On Nov 15, 5:01 pm, Harald van Dijk <true...@gmail.com> wrote:
> On Sat, 15 Nov 2008 06:19:14 -0800, vippstar wrote:
>> On Nov 15, 4:04 pm, Harald van Dijk <true...@gmail.com> wrote:
>>> A non-conforming compiler could give an error message and refuse to
>>> compile any program containing a^=b^=a^=b, but a conforming compiler is
>>> not allowed to do so (or at least not always).
>> Wrong, a conforming compiler can reject that code. The reason might be
>> quite different than the expression, but it serves more like an excuse
>> to reject this. For example, the compiler could claim (assuming a, b,
>> int) that the source exceeds the environmental limits.
>
> Heh, that's nasty. I suppose it has to also reject a^=b^=c^=d, then, to
> cover up its lie? I can't think of a limit right now that a^=b^=a^=b might
> exceed that a^=b^=c^=d doesn't, and if you can, then I can try to come up
> with another valid example that also exceeds that limit. :)

There's no need for that, a conforming compiler can also reject
a^=b^=c^=d for the same reason.
Also, a compiler doesn't have to cover up its lies as there's no such
requirement. My point is that a conforming compiler can reject
anything, and a compiler can output anything when UB is encountered. I
believe that to be true, and I think the requirements the standard
sets on implementations agree with those words.

>>>>> Uh, yes, ^= is right associative. The behaviour is undefined but
>>>>> that does not affect parsing.
>>>> Well, UB is when the standard doesn't impose any requirements for the
>>>> behavior of the implementation.
>>> Yes, but to get to the part where the standard says the behaviour is
>>> undefined you need to have already parsed the expression. I can't give
>>> an exact chapter and verse, but can you explain how you can determine
>>> that the behaviour is undefined without parsing?
>> If I understand, what you're saying is that the semantics of something
>> are not affected by UB, well that's wrong I believe.
>
> Yes, that is wrong, and no, that is not what I'm saying.

Alright then...

>> ^= doesn't need to
>> be anything meaningful after the first UB in source code.
>
> To determine that a^=b^=a^=b modifies a at all, you need to already have
> determined that this is an assignment-expression with only a as its left
> operand. In other words, that it is equivalent to a^=(b^=(a^=b)) instead
> of ((a^=b)^=a)^=b. The latter is simply a constraint violation and
> attempts to modify a only once: the other ^= operators attempts to modify
> the result of an assignment.

Hmm... then yes, I was wrong on the original matter.
Nov 15 '08 #18

P: n/a
On Nov 15, 5:56 pm, vipps...@gmail.com wrote:
> On Nov 15, 2:39 pm, "c.lang.mys...@gmail.com"
> <c.lang.mys...@gmail.com> wrote:
>> On Nov 15, 5:24 pm, "Joachim Schmitz" <nospam.j...@schmitz-digital.de>
>> wrote:
>>> c.lang.mys...@gmail.com wrote:
>>>> <<Be cautious, I have not written a^=b^=a^=b ; which, of course, is
>>>> undefined. I am having some confusion with the former statement!>>
>>>> I think this expression is valid as assignment operator is evaluated
>>>> from right to left so it is parsed as
>>>> (a^=(b^=(a^=b)))
>>>> above expression swaps two numbers
>>>> please correct me if i am wrong...
>>> This is modifying 'a' twice without an intermediate sequence point ->
>>> undefined behavior
>>> Bye, Jojo
>> sorry,I am still not able to get it...
>> but a=b=3 is valid
>> Please clear my confusion...
>
> For the reasons I explained before. Now I have a question: How the
> hell do you know that a = b = 3 is valid if you don't understand and
> you're confused

Now I am able to get it.
I know a=b=3 is valid because I read it in my C textbook, but I had not
read about the other, undefined statement discussed in this thread, so I
was confused about it...
Nov 15 '08 #19

P: n/a
On Nov 15, 7:12 pm, "c.lang.mys...@gmail.com"
<c.lang.mys...@gmail.com> wrote:
> I know a=b=3 is valid as i read it in my C text book but i had not
> read other undefined statement in this thread so i was confused about
> it..

Which book and page?
Nov 15 '08 #20

P: n/a
On Nov 15, 10:35 pm, vipps...@gmail.com wrote:
> On Nov 15, 7:12 pm, "c.lang.mys...@gmail.com"
> <c.lang.mys...@gmail.com> wrote:
>> I know a=b=3 is valid as i read it in my C text book but i had not
>> read other undefined statement in this thread so i was confused about
>> it..
>
> Which book and page?

K&R, page 48... In the function strcat,
the 2nd line is i=j=0;
Nov 15 '08 #21

P: n/a
On Nov 15, 7:53 pm, "c.lang.mys...@gmail.com"
<c.lang.mys...@gmail.com> wrote:
> On Nov 15, 10:35 pm, vipps...@gmail.com wrote:
>> On Nov 15, 7:12 pm, "c.lang.mys...@gmail.com"
>> <c.lang.mys...@gmail.com> wrote:
>>> I know a=b=3 is valid as i read it in my C text book but i had not
>>> read other undefined statement in this thread so i was confused about
>>> it..
>> Which book and page?
>
> K n R page-48...in function strcat
> 2nd line is i=j=0;

Thanks.
Nov 15 '08 #22

P: n/a
James Kuyper wrote:
> co**************@gmail.com wrote:
>> Will the following statement invoke undefined behavior :
>>
>> a^=b,b^=a,a^=b ;
>>
>> given that a and b are of int-type ??
>
> That's perfectly safe.

Not perfectly.
Given that a and b are of int-type,
(a^=b) is capable of generating a trap representation.

--
pete
Nov 15 '08 #23

P: n/a
On Nov 16, 2:33 am, pete <pfil...@mindspring.com> wrote:
> James Kuyper wrote:
>> coolguyaround...@gmail.com wrote:
>>> Will the following statement invoke undefined behavior :
>>> a^=b,b^=a,a^=b ;
>>> given that a and b are of int-type ??
>> That's perfectly safe.
>
> Not perfectly.
> Given that a and b are of int-type,
> (a^=b) is capable of generating a trap representation.
>
> --
> pete

Can you please elaborate on this trap representation?
Nov 16 '08 #24

P: n/a
c.***********@gmail.com wrote:
> On Nov 16, 2:33 am, pete <pfil...@mindspring.com> wrote:
>> James Kuyper wrote:
>>> coolguyaround...@gmail.com wrote:
>>>> Will the following statement invoke undefined behavior :
>>>> a^=b,b^=a,a^=b ;
>>>> given that a and b are of int-type ??
>>> That's perfectly safe.
>> Not perfectly.
>> Given that a and b are of int-type,
>> (a^=b) is capable of generating a trap representation.
>>
>> --
>> pete
>
> Can you please elaborate about this trap representation.....

Type int is allowed to have representations,
which don't represent any value;
those would be trap representations.

The bit pattern for negative zero
is allowed to be a trap representation.
In one's complement, it's all bits set to 1.
In signed magnitude, it's a 1 followed by all 0's.

--
pete
Nov 16 '08 #25

P: n/a
pete wrote:
> c.***********@gmail.com wrote:
>> On Nov 16, 2:33 am, pete <pfil...@mindspring.com> wrote:
>>> James Kuyper wrote:
>>>> coolguyaround...@gmail.com wrote:
>>>>> Will the following statement invoke undefined behavior :
>>>>> a^=b,b^=a,a^=b ;
>>>>> given that a and b are of int-type ??
>>>> That's perfectly safe.
>>> Not perfectly.
>>> Given that a and b are of int-type,
>>> (a^=b) is capable of generating a trap representation.
>>>
>>> --
>>> pete
>>
>> Can you please elaborate about this trap representation.....
>
> Type int is allowed to have representations,
> which don't represent any value;
> those would be trap representations.
>
> The bit pattern for negative zero
> is allowed to be a trap representation.
> In one's complement, it's all bits set to 1.
> In signed magnitude, it's a 1 followed by all 0's.

Also, in two's complement, a 1 followed by all 0's
can be either INT_MIN or a trap.

--
pete
Nov 16 '08 #26
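
One way to probe the two's complement case from the post above, as a rough sketch only: if INT_MIN is exactly one below -INT_MAX, then the "1 followed by all 0's" pattern is being used as a value rather than a trap. The function name sign_only_pattern_is_value is made up for illustration, and the assert encodes an assumption (that the implementation uses one of the three representations C99 describes), not something the thread states.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if INT_MIN == -INT_MAX - 1, i.e. the sign-bit-only pattern in
 * two's complement is an ordinary value on this implementation.
 * Returns 0 otherwise (ones' complement, sign-magnitude, or a two's
 * complement implementation that treats that pattern as a trap). */
int sign_only_pattern_is_value(void)
{
    int sum = INT_MIN + INT_MAX;   /* cannot overflow: opposite signs */

    /* Assumption: one of the three C99 representations is in use, so the
     * sum can only be 0 or -1. */
    assert(sum == 0 || sum == -1);
    return sum == -1;
}
```

On typical modern desktop implementations I would expect this to return 1, but that is an expectation rather than something checked on every platform.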

P: n/a
On 16 Nov 2008 at 4:50, c.***********@gmail.com wrote:
> On Nov 16, 2:33 am, pete <pfil...@mindspring.com> wrote:
>> (a^=b) is capable of generating a trap representation.
>
> Can you please elaborate about this trap representation.....

It's something you don't need to know about or worry about - it's just
something many of the clc regulars have a morbid obsession with.

If you're writing real-world C programs on a modern desktop, you will
never encounter trap representations, so they may as well not exist.

If you're programming mainframes from the 1970s, or if you have an
academic interest in the most obscure technicalities of the ISO
Standard, then you should live and breathe trap representations, and
expect to find them everywhere you look. Then you can join the clc
regulars' club.

Nov 16 '08 #27

P: n/a
c.***********@gmail.com wrote:
> On Nov 16, 2:33 am, pete <pfil...@mindspring.com> wrote:
>> James Kuyper wrote:
>>> coolguyaround...@gmail.com wrote:
>>>> Will the following statement invoke undefined behavior :
>>>> a^=b,b^=a,a^=b ;
>>>> given that a and b are of int-type ??
>>> That's perfectly safe.
>> Not perfectly.
>> Given that a and b are of int-type,
>> (a^=b) is capable of generating a trap representation.
>>
>> --
>> pete
>
> Can you please elaborate about this trap representation.....

A trap representation for a given data type is a bit pattern that
doesn't actually represent a valid value in that type. Most data types
are allowed to have trap representations. Whether or not they actually
do is up to your implementation. The main exceptions are the character
types. Also, a struct or a union object is never a trap representation,
even though it may have a member which is a trap representation.

Uninitialized objects might contain a trap representation. You can
create a trap representation in an object by accessing the object as if
it were an array of bytes and changing the value of some of those bytes.
A pointer object whose value used to point into a memory block allocated
by calling a member of the malloc() family can become a trap
representation as a result of free()ing that memory block. A pointer
that points at or into a non-static object that is local to a function
can become a trap representation when that function returns. A FILE *
value can become a trap representation as a result of fclose()ing the
file. There are also several other more obscure ways in which an object
may acquire a trap representation.

Any attempt to read the value of an object containing a trap
representation has undefined behavior. Any attempt to store a trap
representation in an object has undefined behavior.

The standard explicitly states (6.2.6.2p2) that any given signed integer
type has one bit pattern that might or might not be a trap
representation - it's up to the implementation to decide (and to
document their decision). For types that use a one's complement or
sign-magnitude representation, this is the bit pattern that would
otherwise represent negative 0. If the type uses a twos-complement
representation, this is the bit pattern that would otherwise represent
-2^N, where N is the number of value bits in the type.

Some people read that clause as allowing only that one trap
representation, and requiring that all other bit patterns must be valid.
I don't read it that way. It seems to me that what it says still allows
for the possibility of other trap representations as well. An
implementation that used 1 padding bit, 1 sign bit, and 30 value bits
for 'int' could set INT_MAX to 1000000000, and INT_MIN to -1000000000,
and declare that all bit patterns that would seem to represent values
outside that range are actually trap representations. It's been argued
that this violates the requirement that for any signed type "Each bit
that is a value bit shall have the same value as the same bit in the
object representation of the corresponding unsigned type." But every
such bit does have that value, in every non-trap representation that has
that bit set.
Nov 16 '08 #28
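
The remark above that the character types are the main exception is what makes it safe to look at an object's bytes at all. A minimal sketch of that, with the made-up name roundtrip_through_bytes, assuming only standard memcpy:

```c
#include <assert.h>
#include <string.h>

/* Copy an int's object representation out to unsigned char and back.
 * Reading the bytes through unsigned char cannot hit a trap
 * representation, and copying the same bytes back reproduces the
 * original, valid value. */
int roundtrip_through_bytes(int value)
{
    unsigned char bytes[sizeof value];
    int copy;

    memcpy(bytes, &value, sizeof value);  /* view the object as raw bytes */
    memcpy(&copy, bytes, sizeof copy);    /* unchanged bytes, same value */
    assert(copy == value);
    return copy;
}
```

The sketch deliberately leaves the bytes unmodified. Changing some of them before copying back is exactly the route the post describes for creating a trap representation on an implementation that has them.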

P: n/a
James Kuyper <ja*********@verizon.net> writes:
> Harald van Dijk wrote:
>> On Sat, 15 Nov 2008 04:56:25 -0800, vippstar wrote:
>>> On Nov 15, 2:39 pm, "c.lang.mys...@gmail.com" <c.lang.mys...@gmail.com>
>>> wrote:
>>>> now again consider a^=b^=a^=b
>>>> when compiler see this expression it starts from RHS as assignment
>>> Wrong, when the compiler sees that expression, he can do whatever he
>>> wants because that expression invokes undefined behavior.
>> When the compiler sees that expression, it cannot be sure whether it will
>> be executed. It must compile the code, and if the program reaches the
>> point where the expression is evaluated, _then_ the behaviour is
>> undefined. The compiler cannot do whatever it wants. The compiler can make
>> the resulting code do whatever it wants.
>
> The standard quite explicitly says that the consequences of undefined
> behavior can include failing to compile, which clearly indicates that it
> can precede execution of the relevant code. The standard does not
> explain this in any detail, but I believe that the relevant rule is that
> the undefined behavior is allowed at any point after execution of the
> relevant code becomes inevitable.

I had to read this a couple of times before I understood the main
point. I mostly agree with the main point (explained further below),
but the inference stated in the first sentence seems wrong to me.
It's true that undefined behavior during, e.g., macro processing might
cause compilation to fail, but execution-time undefined behavior can't
start until program execution is started. Whatever execution-time
undefined behavior there might be isn't allowed to creep across the
boundary into program translation.

Example:

if(some condition)
a^=b^=a^=b;

For code like this, the behavior of the code becomes undefined as soon
as it becomes inevitable that the if() clause will be executed. This
means that at points in the code prior to the if() statement, the compiler
is allowed to generate code using optimizations that only work if the
if-condition is not true. As a result, those optimizations may cause
your code to misbehave long before evaluation of the offending statement.
That's true, but only up to a point, under the as-if rule. In
particular, the undefined behavior must not be allowed to go back
before any access of a volatile variable, or any operation on any
potentially interactive device. In practical terms, the undefined
behavior usually cannot be moved back before a call to an external
function, since any unknown function might call longjmp(), preventing
the code in question from being reached.

I don't think any of these refinements are at odds with the intended
point, but they seem important enough to be worth mentioning.
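
A minimal C sketch of the situation discussed above. The names steps_logged, log_step and guarded_swap are hypothetical, chosen only for illustration. The swap uses the comma form from the start of the thread, so this code is well defined; the comment marks where the undefined chained form would sit, and the volatile access before the if() is the kind of observable side effect that undefined behavior is argued not to reach back past.

```c
#include <assert.h>

/* Hypothetical counter; volatile makes each access an observable side effect. */
static volatile int steps_logged = 0;

static void log_step(void)
{
    steps_logged = steps_logged + 1;
}

/* Swap *a and *b when cond is nonzero. */
static void guarded_swap(int cond, int *a, int *b)
{
    assert(a != 0 && b != 0);

    log_step();   /* volatile access that precedes the conditional code */

    if (cond && a != b) {
        /* Comma operator: a sequence point follows each assignment. */
        *a ^= *b, *b ^= *a, *a ^= *b;

        /* Writing  *a ^= *b ^= *a ^= *b;  here instead would modify *a
           twice with no intervening sequence point, which is undefined
           whenever this branch is reached. */
    }
}
```

Calling guarded_swap(1, &x, &y) with x = 3 and y = 5 exchanges the two values; calling it with cond equal to 0 leaves them alone. Either way the counter is incremented first.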
Nov 18 '08 #29

P: n/a
James Kuyper <ja*********@verizon.net> writes:
[...]

The standard explicitly states (6.2.6.2p2) that any given signed integer
type has one bit pattern that might or might not be a trap
representation - it's up to the implementation to decide (and to
document their decision). For types that use a one's complement or
sign-magnitude representation, this is the bit pattern that would
otherwise represent negative 0. If the type uses a twos-complement
representation, this is the bit pattern that would otherwise represent
-2^N, where N is the number of value bits in the type.

Some people read that clause as allowing only that one trap
representation, and requiring that all other bit patterns must be valid.
I don't read it that way. It seems to me that what it says still allows
for the possibility of other trap representations as well. An
implementation that used 1 padding bit, 1 sign bit, and 30 value bits
for 'int' could set INT_MAX to 1000000000, and INT_MIN to -1000000000,
and declare that all bit patterns that would seem to represent values
outside that range are actually trap representations. It's been argued
that this violates the requirement that for any signed type "Each bit
that is a value bit shall have the same value as the same bit in the
object representation of the corresponding unsigned type." But every bit
does have that value, in every non-trap representation that has that bit
set.
I must admit to having trouble with this one. What's the basis for
the position you state? In the example given the presence of a
padding bit seems completely irrelevant (except perhaps to make the
number of bits a multiple of 8 while preserving round limits?) -- is
there any difference between this example and one using 31 value bits
to represent values in [-2000000000 .. 2000000000]? It seems like
all you are saying is that you think some combinations of value
bits are allowed to be trap representations whereas other people
think they aren't (not counting the distinguished ones explicitly
identified in the Standard, of course). What's the argument to
support this position?
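
The arithmetic behind the two hypothetical ranges can be checked with a short C sketch. The helper names (bits_needed, uses_every_pattern, patterns_above) are made up for this illustration and come from no library. It shows that a maximum of 1000000000 needs 30 value bits and a maximum of 2000000000 needs 31, and that in both cases some combinations of value bits fall above the maximum; those are the patterns the hypothetical implementation would have to declare trap representations.

```c
#include <limits.h>

/* Smallest N such that max < 2^N, i.e. the value bits needed to hold max. */
static unsigned bits_needed(unsigned long long max)
{
    unsigned n = 0;
    while (max != 0) {
        n++;
        max >>= 1;
    }
    return n;
}

/* Nonzero if max has the form 2^N - 1, so that every combination of
   N value bits lies within [0, max]. */
static int uses_every_pattern(unsigned long long max)
{
    return (max & (max + 1)) == 0;
}

/* Number of N-bit value patterns that exceed max, where N = bits_needed(max). */
static unsigned long long patterns_above(unsigned long long max)
{
    unsigned n = bits_needed(max);
    unsigned long long all_ones =
        (n >= sizeof max * CHAR_BIT) ? ULLONG_MAX : ((1ULL << n) - 1);
    return all_ones - max;
}
```

On an implementation where INT_MAX is 2147483647, uses_every_pattern(INT_MAX) is nonzero, which is the usual case where no combination of value bits is left over.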
Nov 18 '08 #30

P: n/a


Tim Rentsch wrote:
James Kuyper <ja*********@verizon.net> writes:
[...]

The standard explicitly states (6.2.6.2p2) that any given signed integer
type has one bit pattern that might or might not be a trap
representation - it's up to the implementation to decide (and to
document their decision). For types that use a one's complement or
sign-magnitude representation, this is the bit pattern that would
otherwise represent negative 0. If the type uses a twos-complement
representation, this is the bit pattern that would otherwise represent
-2^N, where N is the number of value bits in the type.

Some people read that clause as allowing only that one trap
representation, and requiring that all other bit patterns must be valid.
I don't read it that way. It seems to me that what it says still allows
for the possibility of other trap representations as well. An
implementation that used 1 padding bit, 1 sign bit, and 30 value bits
for 'int' could set INT_MAX to 1000000000, and INT_MIN to -1000000000,
and declare that all bit patterns that would seem to represent values
outside that range are actually trap representations. It's been argued
that this violates the requirement that for any signed type "Each bit
that is a value bit shall have the same value as the same bit in the
object representation of the corresponding unsigned type." But every bit
does have that value, in every non-trap representation that has that bit
set.

I must admit to having trouble with this one. What's the basis for
the position you state? In the example given the presence of a
padding bit seems completely irrelevant (except perhaps to make the
number of bits a multiple of 8 while preserving round limits?)
Exactly - it serves to make an extremely implausible but legal
implementation very marginally more plausible.
-- is
there any difference between this example and one using 31 value bits
to represent values in [-2000000000 .. 2000000000]?
Yes - I can imagine reasons for wanting INT_MAX to be a power of 10; I
find it much harder to come up with reasons for wanting INT_MAX to be
2 times a power of 10. Again, it's just a matter of making an
intrinsically implausible implementation a little more plausible.
... It seems like
all you are saying is that you think some combinations of value
bits are allowed to be trap representations whereas other people
think they aren't (not counting the distinguished ones explicitly
identified in the Standard, of course). What's the argument to
support this position?
Which position - mine or theirs? My position is based upon the fact
that the standard explicitly allows for trap representations, and says
nothing to limit how many any given type may have. The opposing
position is based upon the claim that 6.2.6.2p2 defines the only trap
representation involving value bits that a signed integer type is
allowed to have. As I read it, 6.2.6.2p2 serves primarily to explain
the fact that the bit pattern that would otherwise represent negative
zero in 1's complement or sign-magnitude representations is allowed to
be a trap representation. This clears up any ambiguity that might
arise due to the fact that 0 has two distinct representations for such
types. It doesn't imply in any way that negative zero is the only
allowed trap representation. The fact that it also defines a bit
pattern for 2's complement representations that is allowed to be a
trap representation is a weak point in my argument. If my argument is
correct, that clause is redundant; but it doesn't directly contradict
my conclusion.
Nov 18 '08 #31

P: n/a
I had to read this a couple of times before I understood the main
point. I mostly agree with the main point (explained further below),
but the inference stated in the first sentence seems wrong to me.
It's true that undefined behavior during, e.g., macro processing might
cause compilation to fail, but execution-time undefined behavior can't
start until program execution is started. Whatever execution-time
undefined behavior there might be isn't allowed to creep across the
boundary into program translation.
Get a good textbook on Temporal Mechanics. Miles O'Brien of Deep
Space Nine has one. The subject isn't as simple as you might think.

One call to fflush(stdin) retroactively removed the 'ant' keyword
from C89 (const ant int xyz = 42; defined an actual integer constant
expression in a variable). Too bad; it was a useful feature :-) .
Nov 19 '08 #32

P: n/a
go****@hammy.burditt.org (Gordon Burditt) writes:
I had to read this a couple of times before I understood the main
point. I mostly agree with the main point (explained further below),
but the inference stated in the first sentence seems wrong to me.
It's true that undefined behavior during, e.g., macro processing might
cause compilation to fail, but execution-time undefined behavior can't
start until program execution is started. Whatever execution-time
undefined behavior there might be isn't allowed to creep across the
boundary into program translation.

Get a good textbook on Temporal Mechanics. Miles O'Brien of Deep
Space Nine has one. The subject isn't as simple as you might think.
I already have one. It was required reading in the Temporal Mechanics
course that I took ten years from now.
Nov 19 '08 #33

P: n/a
jameskuyper <ja*********@verizon.net> writes:
Tim Rentsch wrote:
James Kuyper <ja*********@verizon.net> writes:
[...]

The standard explicitly states (6.2.6.2p2) that any given signed integer
type has one bit pattern that might or might not be a trap
representation - it's up to the implementation to decide (and to
document their decision). For types that use a one's complement or
sign-magnitude representation, this is the bit pattern that would
otherwise represent negative 0. If the type uses a twos-complement
representation, this is the bit pattern that would otherwise represent
-2^N, where N is the number of value bits in the type.

Some people read that clause as allowing only that one trap
representation, and requiring that all other bit patterns must be valid.
I don't read it that way. It seems to me that what it says still allows
for the possibility of other trap representations as well. An
implementation that used 1 padding bit, 1 sign bit, and 30 value bits
for 'int' could set INT_MAX to 1000000000, and INT_MIN to -1000000000,
and declare that all bit patterns that would seem to represent values
outside that range are actually trap representations. It's been argued
that this violates the requirement that for any signed type "Each bit
that is a value bit shall have the same value as the same bit in the
object representation of the corresponding unsigned type." But every bit
does have that value, in every non-trap representation that has that bit
set.
I must admit to having trouble with this one. What's the basis for
the position you state? In the example given the presence of a
padding bit seems completely irrelevant (except perhaps to make the
number of bits a multiple of 8 while preserving round limits?)

[..minor detour on padding bits..]
... It seems like
all you are saying is that you think some combinations of value
bits are allowed to be trap representations whereas other people
think they aren't (not counting the distinguished ones explicitly
identified in the Standard, of course). What's the argument to
support this position?

Which position - mine or theirs? My position is based upon the fact
that the standard explicitly allows for trap representations, and says
nothing to limit how many any given type may have. The opposing
position is based upon the claim that 6.2.6.2p2 defines the only trap
representation involving value bits that a signed integer type is
allowed to have. As I read it, 6.2.6.2p2 serves primarily to explain
the fact that the bit pattern that would otherwise represent negative
zero in 1's complement or sign-magnitude representations is allowed to
be a trap representation. This clears up any ambiguity that might
arise due to the fact that 0 has two distinct representations for such
types. It doesn't imply in any way that negative zero is the only
allowed trap representation. The fact that it also defines a bit
pattern for 2's complement representations that is allowed to be a
trap representation is a weak point in my argument. If my argument is
correct, that clause is redundant; but it doesn't directly contradict
my conclusion.
So if I may paraphrase/summarize your argument, you're saying that
trap representations involving some combination of value bits
aren't explicitly forbidden, and therefore they are allowed.
Is that right?

If that's what you're saying, I agree that TR aren't explicitly
forbidden, in the sense that there is no statement in the standard
that says directly that other combinations of value bits may not be
TR. However, I think this requirement is given implicitly by how
integer values are represented, specifically in the definition of
value bits. For unsigned types, from 6.2.6.2p1:

If there are N value bits, each bit shall represent a
different power of 2 between 1 and 2**(N-1)

And for signed types, from 6.2.6.2p2:

Each bit that is a value bit shall have the same value as the same
bit in the object representation of the corresponding unsigned
type (if there are M value bits in the signed type and N in the
unsigned type, then M<=N).

The statements seem specific enough so that TR involving combinations
of value bits aren't allowed, except by explicit exception, which
explains why the three distinguished representations are identified
specifically as possibly being TR. I agree that those representations
being associated with negative zero (in two cases) muddies the point;
however, I think that's incidental, since some implementations support
negative zero, and there is some other text discussing negative zeros
that specifically are not TR. Of course I agree with your point that
the TR for 2's complement weakens the case that other TR are allowed.
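
To make the three distinguished representations concrete, here is a small C sketch that decodes a bit pattern under each of the three schemes. The enum and the function decode are hypothetical names for this illustration; the pattern is taken to have n value bits in its low-order positions and the sign bit directly above them, which is an assumption of the sketch and not something the standard requires.

```c
enum scheme { SIGN_MAGNITUDE, ONES_COMPLEMENT, TWOS_COMPLEMENT };

/* Decode a pattern with n value bits (bits 0..n-1) and a sign bit (bit n).
   Each value bit contributes its power of 2; the schemes differ only in
   what a set sign bit means. Intended for small n. */
static long decode(unsigned long bits, unsigned n, enum scheme s)
{
    unsigned long mask = (1UL << n) - 1;      /* all n value bits set */
    long magnitude = (long)(bits & mask);
    int sign = (int)((bits >> n) & 1UL);

    if (!sign)
        return magnitude;

    switch (s) {
    case SIGN_MAGNITUDE:
        return -magnitude;                    /* sign bit negates the value */
    case ONES_COMPLEMENT:
        return magnitude - (long)mask;        /* sign bit is worth -(2^n - 1) */
    case TWOS_COMPLEMENT:
        return magnitude - (long)mask - 1;    /* sign bit is worth -(2^n) */
    }
    return 0;
}
```

With n = 4, the pattern 10000 decodes to 0 under sign-magnitude (the negative zero), 11111 decodes to 0 under ones' complement (also the negative zero), and 10000 decodes to -16 under two's complement. Those are the three patterns that 6.2.6.2p2 singles out as possibly being trap representations.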

If we look for corroborating evidence, think about this. If what
you're saying is right, then bitfields should also be allowed TR
(besides the distinguished three) as part of their value bit range.
In fact, since there aren't any limits set on what values signed
bitfields must represent (with 0 as a possible exception, for example
in (signed _Bool)), signed bitfields couldn't be used portably at
all (not counting the trivial case where they only ever have the
value 0). There is support for the proposition that bitfields
may not have TR in 6.7.7p6:

EXAMPLE 3 The following obscure constructions

typedef signed int t;
typedef int plain;
struct tag {
    unsigned t:4;
    const t:5;
    plain r:5;
};

declare a typedef name t with type signed int, a typedef name
plain with type int, and a structure with three bit-field
members, one named t that contains values in the range [0, 15],
an unnamed const-qualified bit-field which (if it could be
accessed) would contain values in either the range [-15, +15] or
[-16, +15], and one named r that contains values in one of the
ranges [0, 31], [-15, +15], or [-16, +15]. (The choice of range
is implementation-defined.)

The numeric ranges given indicate pretty clearly that bitfields may
not have TR in their usual value range. Unless there is other text
that differentiates bitfields and non-bitfield integer types with
respect to TR (and I haven't been able to find any), this evidence
indicates pretty strongly that TR are not allowed in the usual
value range for integer types.
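
A short C sketch of what those ranges imply in practice. The struct five_bits and the function all_round_trip are hypothetical names for this illustration; the member is declared explicitly signed so that the result does not depend on whether plain int bit-fields are signed. If every value in [-15, +15] is guaranteed to be representable in a 5-bit signed bit-field, as the example in the standard indicates, then storing and reading back each of them must preserve it.

```c
/* Hypothetical struct: a single 5-bit signed bit-field. */
struct five_bits {
    signed int r : 5;
};

/* Returns 1 if every value in [lo, hi] survives a store and reload
   through the bit-field, 0 otherwise. */
static int all_round_trip(int lo, int hi)
{
    int v;
    for (v = lo; v <= hi; v++) {
        struct five_bits s;
        s.r = v;
        if (s.r != v)
            return 0;
    }
    return 1;
}
```

all_round_trip(-15, 15) should return 1 on any implementation if the quoted ranges are taken at face value. Whether -16 also survives depends on the representation, which is the implementation-defined choice the example mentions.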

Nov 19 '08 #34
