
# Undefined Behavior

Will the following statement invoke undefined behavior:

a^=b,b^=a,a^=b;

given that a and b are of type int? Be cautious: I have not written

a^=b^=a^=b;

which, of course, is undefined. I am confused about the former statement. Also, please state the reason for the statement being undefined.
Nov 15 '08 #1
33 Replies

co**************@gmail.com wrote:
> Will the following statement invoke undefined behavior:
> a^=b,b^=a,a^=b;
> given that a and b are of type int?

No, because the comma operator introduces a sequence point.

> Be cautious: I have not written a^=b^=a^=b; which, of course, is undefined. I am confused about the former statement. Also, please state the reason for the statement being undefined.

6.5.17p2: "The left operand of a comma operator is evaluated as a void expression; there is a sequence point after its evaluation."

Bye, Jojo
Nov 15 '08 #2

I think this expression is valid, as the assignment operator is evaluated from right to left, so it is parsed as (a^=(b^=(a^=b))). The above expression swaps two numbers. Please correct me if I am wrong...
Nov 15 '08 #3

c.***********@gmail.com wrote:
> I think this expression is valid, as the assignment operator is evaluated from right to left, so it is parsed as (a^=(b^=(a^=b))). The above expression swaps two numbers. Please correct me if I am wrong...

This modifies 'a' twice without an intervening sequence point -> undefined behavior.

Bye, Jojo
Nov 15 '08 #4
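For comparison, a swap that avoids the question altogether can go through a temporary. This is a minimal sketch and not something posted in the thread; the name `swap_with_temp` is made up for illustration. Each object is modified once per statement, and the end of each statement is a sequence point.

```c
/* Hypothetical helper (not from the thread): swaps two ints through a
   temporary. Safe even when a and b point to the same object. */
void swap_with_temp(int *a, int *b)
{
    int tmp = *a;   /* *a is only read here */
    *a = *b;        /* *a is modified once in this statement */
    *b = tmp;       /* *b is modified once in this statement */
}
```

Called as `swap_with_temp(&x, &y)`, it leaves `x` and `y` exchanged without relying on any property of XOR.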

On Nov 15, 3:18 pm, "c.lang.mys...@gmail.com" wrote:
> I think this expression is valid as assignment operator is evaluated from right to left so it is parsed as (a^=(b^=(a^=b)))

6.5.17

1. expression:
       assignment-expression
       expression , assignment-expression

2. The left operand of a comma operator is evaluated as a void expression; there is a sequence point after its evaluation. Then the right operand is evaluated; the result has its type and value.

So, you have:

    a^=b, b^=a, a^=b    is    [expression] , [assignment-expression]
    a^=b, b^=a          is    [expression] , [assignment-expression]
    a^=b                is    [assignment-expression]

OTOH, in

    a^=b, b^=a, a^=b    as    [left operand] , [right operand]

the left operand is evaluated as void, and the comma expression has the type and value of the right operand: a^=b. But the left operand is a^=b, b^=a. And again, in

    a^=b, b^=a          as    [left operand] , [right operand]

the left operand is evaluated as void, and the comma expression has the type and value of the right operand: b^=a.

So, the sequence should be: evaluate a^=b, then b^=a, then a^=b. You should get the same result as in this case:

    a^=b; b^=a; a^=b;
Nov 15 '08 #5
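The evaluation order worked out above can be checked with a small function. This is only a sketch; `comma_swap_value` is a hypothetical name, and it works on local copies of its arguments. It returns the value of the whole comma expression, which by 6.5.17p2 is the value of the rightmost operand.

```c
/* Hypothetical helper (not from the thread): evaluates the three XOR
   assignments as one comma expression and returns its value. */
int comma_swap_value(int a, int b)
{
    /* Operands are evaluated left to right, with a sequence point after
       each comma. The result is the value of the last a ^= b, that is,
       the final value of a, which equals the original value of b. */
    return (a ^= b, b ^= a, a ^= b);
}
```

With `a = 3` and `b = 5` the steps are `a = 6`, `b = 3`, `a = 5`, so the call returns 5.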

maverik wrote:
> On Nov 15, 3:18 pm, "c.lang.mys...@gmail.com" wrote:
> > I think this expression is valid as assignment operator is evaluated from right to left so it is parsed as (a^=(b^=(a^=b)))
> 6.5.17
> 1. expression: assignment-expression, or expression , assignment-expression
> 2. The left operand of a comma operator is evaluated as a void expression; there is a sequence point after its evaluation. Then the right operand is evaluated; the result has its type and value.
> So, you have: a^=b, b^=a, a^=b

You don't have any comma operators in the code that you quoted.

-- pete
Nov 15 '08 #6

co**************@gmail.com wrote:
> Will the following statement invoke undefined behavior:
> a^=b,b^=a,a^=b;
> given that a and b are of int-type?

That's perfectly safe. As a stand-alone statement, it is exactly equivalent to

a^=b; b^=a; a^=b;

The use of the comma operator is only needed in cases like for(A; B; C), where A, B, and C can each only be a single expression. In any other context, I'd recommend breaking it out into three separate statements, but only for the sake of clarity - it's perfectly legal as a single statement.

> Be cautious, I have not written a^=b^=a^=b; which, of course, is undefined. I am having some confusion with the former statement! Also, state the reason for the statement being undefined!

6.5p2: "Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression". Because the second version violates a "shall" occurring outside of a "Constraints" section, the behavior is undefined.

The key difference between the two statements is the presence of the ',' operators in the first version. The ',' operator inserts a sequence point separating its two operands.
Nov 15 '08 #7
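A place where the comma operator is hard to avoid is the clauses of a for statement, each of which has room for one expression only. The sketch below is illustrative rather than anything posted in the thread; `xor_swap` and `reverse` are hypothetical names, and `xor_swap` assumes its two pointers refer to distinct objects.

```c
#include <stddef.h>

/* Hypothetical helper: XOR swap written as a single comma expression.
   Assumes a and b point to distinct objects; if they alias, the object
   ends up holding 0. */
void xor_swap(int *a, int *b)
{
    *a ^= *b, *b ^= *a, *a ^= *b;
}

/* Hypothetical helper: reverses an array in place. The comma operator
   lets the for clauses initialise and step two indices at once. */
void reverse(int *v, size_t n)
{
    size_t i, j;

    if (n < 2)
        return;
    /* i < j guarantees that v[i] and v[j] are distinct objects. */
    for (i = 0, j = n - 1; i < j; i++, j--)
        xor_swap(&v[i], &v[j]);
}
```

Outside a for clause, writing the swap as three separate statements is the clearer choice.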

On Nov 15, 5:24 pm, "Joachim Schmitz" wrote:
> c.lang.mys...@gmail.com wrote:
> > I think this expression is valid as assignment operator is evaluated from right to left so it is parsed as (a^=(b^=(a^=b))) above expression swaps two numbers please correct me if i am wrong...
> This is modifying 'a' twice without an intermediate sequence point -> undefined behavior
> Bye, Jojo

Sorry, I am still not able to get it... Now again consider a^=b^=a^=b. When the compiler sees this expression, it starts from the RHS, as the assignment operator is right associative. So first it has to calculate the rightmost a^=b, and then the value of a has to change. It then proceeds to calculate b^=a, then again a^=b, just as we can also use a=b=3;

Do you mean that after the rightmost a^=b, we cannot be sure that the value of a has changed until we reach a sequence point? But a=b=3 is valid. Please clear up my confusion...

thanks,
Nov 15 '08 #8

On Nov 15, 3:34 pm, pete

On Nov 15, 2:39 pm, "c.lang.mys...@gmail.com" wrote:
> > c.lang.mys...@gmail.com wrote:
> > > I think this expression is valid as assignment operator is evaluated from right to left so it is parsed as (a^=(b^=(a^=b))) above expression swaps two numbers please correct me if i am wrong...
> > This is modifying 'a' twice without an intermediate sequence point -> undefined behavior
> > Bye, Jojo
> sorry, I am still not able to get it...

What don't you get?

> now again consider a^=b^=a^=b when compiler see this expression it starts from RHS as assignment

Wrong, when the compiler sees that expression, he can do whatever he wants because that expression invokes undefined behavior.

> operator is right associative

Uh, no, it's UB.

> so first it has to calculate rightmost a^=b then value of a must have

Uh, no, it's UB.

> to change....it proceeds then to calculate b^=a ..then again a^=b....

Uh, no, it's UB.

> as we can also use a=b=3;

That's different from the former expression, because in that expression a and b are each modified only once between sequence points.

> Do you mean that after right most a^=b, we can not be sure that value

No, a^=b^=a^=b invokes UB and if you have that in your code, you can't be sure of anything at all.

> but a=b=3 is valid Please clear my confusion...

For the reasons I explained before. Now I have a question: How the hell do you know that a = b = 3 is valid if you don't understand and you're confused?
Nov 15 '08 #10

On Sat, 15 Nov 2008 04:56:25 -0800, vippstar wrote:
> On Nov 15, 2:39 pm, "c.lang.mys...@gmail.com" wrote:
> > now again consider a^=b^=a^=b when compiler see this expression it starts from RHS as assignment
> Wrong, when the compiler sees that expression, he can do whatever he wants because that expression invokes undefined behavior.

When the compiler sees that expression, it cannot be sure whether it will be executed. It must compile the code, and if the program reaches the point where the expression is evaluated, _then_ the behaviour is undefined. The compiler cannot do whatever it wants. The compiler can make the resulting code do whatever it wants.

> > operator is right associative
> Uh, no, it's UB.

Uh, yes, ^= is right associative. The behaviour is undefined but that does not affect parsing.
Nov 15 '08 #11

On Nov 15, 3:11 pm, Harald van Dĳk wrote:
> > > now again consider a^=b^=a^=b when compiler see this expression it starts from RHS as assignment
> > Wrong, when the compiler sees that expression, he can do whatever he wants because that expression invokes undefined behavior.
> When the compiler sees that expression, it cannot be sure whether it will be executed. It must compile the code, and if the program reaches the point where the expression is evaluated, _then_ the behaviour is undefined. The compiler cannot do whatever it wants. The compiler can make the resulting code do whatever it wants.

What ELSE could the compiler do other than having the output be whatever he wants? I don't think you've really added to what I said.

> > > operator is right associative
> > Uh, no, it's UB.
> Uh, yes, ^= is right associative. The behaviour is undefined but that does not affect parsing.

Well, UB is when the standard doesn't impose any requirements on the behavior of the implementation. You'd argue that 'parsing' is not included in 'behavior'? You might be right, but I'd like quotes from the standard to trust you. (You are certainly credible, but c&v from the standard is always nice.)
Nov 15 '08 #12

Harald van Dĳk wrote:
> On Sat, 15 Nov 2008 04:56:25 -0800, vippstar wrote:
> > On Nov 15, 2:39 pm, "c.lang.mys...@gmail.com" wrote:
> > > now again consider a^=b^=a^=b when compiler see this expression it starts from RHS as assignment
> > Wrong, when the compiler sees that expression, he can do whatever he wants because that expression invokes undefined behavior.
> When the compiler sees that expression, it cannot be sure whether it will be executed. It must compile the code, and if the program reaches the point where the expression is evaluated, _then_ the behaviour is undefined. The compiler cannot do whatever it wants. The compiler can make the resulting code do whatever it wants.

The standard quite explicitly says that the consequences of undefined behavior can include failing to compile, which clearly indicates that it can precede execution of the relevant code. The standard does not explain this in any detail, but I believe that the relevant rule is that the undefined behavior is allowed at any point after execution of the relevant code becomes inevitable. Example:

if(some condition) a^=b^=a^=b;

For code like this, the behavior of the code becomes undefined as soon as it becomes inevitable that the if() clause will be executed. This means that at points in the code prior to the if() statement, the compiler is allowed to generate code using optimizations that only work if the if-condition is not true. As a result, those optimizations may cause your code to misbehave long before evaluation of the offending statement.
Nov 15 '08 #13

On Sat, 15 Nov 2008 13:38:21 +0000, James Kuyper wrote: Harald van Dĳk wrote: > On Sat, 15 Nov 2008 04:56:25 -0800, vippstar wrote: > > On Nov 15, 2:39 pm, "c.lang.mys...@gmail.com"

On Sat, 15 Nov 2008 05:37:35 -0800, vippstar wrote:
> On Nov 15, 3:11 pm, Harald van Dĳk wrote:
> > On Sat, 15 Nov 2008 04:56:25 -0800, vippstar wrote:
> > > On Nov 15, 2:39 pm, "c.lang.mys...@gmail.com" wrote:
> > > > operator is right associative
> > > Uh, no, it's UB.
> > Uh, yes, ^= is right associative. The behaviour is undefined but that does not affect parsing.
> Well, UB is when the standard doesn't impose any requirements for the behavior of the implementation.

Yes, but to get to the part where the standard says the behaviour is undefined you need to have already parsed the expression. I can't give an exact chapter and verse, but can you explain how you can determine that the behaviour is undefined without parsing?
Nov 15 '08 #15

On Nov 15, 4:04 pm, Harald van Dĳk

On Sat, 15 Nov 2008 06:19:14 -0800, vippstar wrote:
> On Nov 15, 4:04 pm, Harald van Dĳk wrote:
> > A non-conforming compiler could give an error message and refuse to compile any program containing a^=b^=a^=b, but a conforming compiler is not allowed to do so (or at least not always).
> Wrong, a conforming compiler can reject that code. The reason might be quite different than the expression, but it serves more like an excuse to reject this. For example, the compiler could claim (assuming a, b, int) that the source exceeds the environmental limits.

Heh, that's nasty. I suppose it has to also reject a^=b^=c^=d, then, to cover up its lie? I can't think of a limit right now that a^=b^=a^=b might exceed that a^=b^=c^=d doesn't, and if you can, then I can try to come up with another valid example that also exceeds that limit. :)

> > > > Uh, yes, ^= is right associative. The behaviour is undefined but that does not affect parsing.
> > > Well, UB is when the standard doesn't impose any requirements for the behavior of the implementation.
> > Yes, but to get to the part where the standard says the behaviour is undefined you need to have already parsed the expression. I can't give an exact chapter and verse, but can you explain how you can determine that the behaviour is undefined without parsing?
> If I understand, what you're saying is that the semantics of something are not affected by UB, well that's wrong I believe.

Yes, that is wrong, and no, that is not what I'm saying.

> ^= doesn't need to be anything meaningful after the first UB in source code.

To determine that a^=b^=a^=b modifies a at all, you need to already have determined that this is an assignment-expression with only a as its left operand. In other words, that it is equivalent to a^=(b^=(a^=b)) instead of ((a^=b)^=a)^=b. The latter is simply a constraint violation and attempts to modify a only once: the other ^= operators attempt to modify the result of an assignment.
Nov 15 '08 #17
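The a^=b^=c^=d case mentioned above makes a useful contrast with a^=b^=a^=b: it groups the same way, but no object is modified more than once. The following is a sketch under the assumption that all four pointers refer to distinct objects; `xor_chain` is a hypothetical name.

```c
/* Hypothetical helper (not from the thread): chained compound
   assignment over four objects. Right associativity groups it as
   *a ^= (*b ^= (*c ^= *d)). Assumes the four pointers refer to
   distinct objects, so each object is modified at most once. */
void xor_chain(int *a, int *b, int *c, const int *d)
{
    *a ^= *b ^= *c ^= *d;
}
```

Starting from 1, 2, 4 and 8, the objects end up as 15, 14, 12 and 8. If `a` and `c` pointed to the same object, the call would have the same problem as the original expression.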

On Nov 15, 5:01 pm, Harald van Dĳk

On Nov 15, 5:56 pm, vipps...@gmail.com wrote:
> On Nov 15, 2:39 pm, "c.lang.mys...@gmail.com" wrote:
> > > c.lang.mys...@gmail.com wrote:
> > > > I think this expression is valid as assignment operator is evaluated from right to left so it is parsed as (a^=(b^=(a^=b))) above expression swaps two numbers please correct me if i am wrong...
> > > This is modifying 'a' twice without an intermediate sequence point -> undefined behavior
> > > Bye, Jojo
> > sorry, I am still not able to get it...
> > but a=b=3 is valid Please clear my confusion...
> For the reasons I explained before. Now I have a question: How the hell do you know that a = b = 3 is valid if you don't understand and you're confused?

Now I am able to get it. I know a=b=3 is valid because I read it in my C textbook, but I had not read about the other, undefined statement discussed in this thread, so I was confused about it...
Nov 15 '08 #19

On Nov 15, 7:12 pm, "c.lang.mys...@gmail.com"

On Nov 15, 10:35 pm, vipps...@gmail.com wrote: On Nov 15, 7:12 pm, "c.lang.mys...@gmail.com"

On Nov 15, 7:53 pm, "c.lang.mys...@gmail.com"

James Kuyper wrote:
> co**************@gmail.com wrote:
> > Will the following statement invoke undefined behavior:
> > a^=b,b^=a,a^=b;
> > given that a and b are of int-type?
> That's perfectly safe.

Not perfectly. Given that a and b are of int-type, (a^=b) is capable of generating a trap representation.

-- pete
Nov 15 '08 #23

On Nov 16, 2:33 am, pete

c.***********@gmail.com wrote:
> On Nov 16, 2:33 am, pete wrote:
> > James Kuyper wrote:
> > > coolguyaround...@gmail.com wrote:
> > > > Will the following statement invoke undefined behavior:
> > > > a^=b,b^=a,a^=b;
> > > > given that a and b are of int-type?
> > > That's perfectly safe.
> > Not perfectly. Given that a and b are of int-type, (a^=b) is capable of generating a trap representation.
> Can you please elaborate about this trap representation.....

Type int is allowed to have representations which don't represent any value; those would be trap representations. The bit pattern for negative zero is allowed to be a trap representation. In ones' complement, it's all bits set to 1. In sign-magnitude, it's a 1 followed by all 0's.

-- pete
Nov 16 '08 #25

pete wrote:
> c.***********@gmail.com wrote:
> > Can you please elaborate about this trap representation.....
> Type int is allowed to have representations which don't represent any value; those would be trap representations. The bit pattern for negative zero is allowed to be a trap representation. In ones' complement, it's all bits set to 1. In sign-magnitude, it's a 1 followed by all 0's.

Also, in two's complement, a 1 followed by all 0's can be either INT_MIN or a trap.

-- pete
Nov 16 '08 #26
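Whether that last bit pattern is an ordinary value can be read off the limits the implementation publishes. A minimal sketch, assuming the C99-era rules discussed in this thread; `int_min_is_extra_value` is a hypothetical name.

```c
#include <limits.h>

/* Hypothetical helper (not from the thread): returns 1 when INT_MIN is
   below -INT_MAX. That is only possible with two's complement where the
   pattern "sign bit set, all value bits zero" is an ordinary value
   rather than a trap representation. Returns 0 otherwise. */
int int_min_is_extra_value(void)
{
    return INT_MIN < -INT_MAX;
}
```

A result of 0 would mean INT_MIN equals -INT_MAX, which is what ones' complement, sign-magnitude, or a two's complement implementation treating that pattern as a trap would report.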

On 16 Nov 2008 at 4:50, c.***********@gmail.com wrote:
> On Nov 16, 2:33 am, pete wrote:
> > (a^=b) is capable of generating a trap representation.
> Can you please elaborate about this trap representation.....

It's something you don't need to know about or worry about - it's just something many of the clc regulars have a morbid obsession with.

If you're writing real-world C programs on a modern desktop, you will never encounter trap representations, so they may as well not exist. If you're programming mainframes from the 1970s, or if you have an academic interest in the most obscure technicalities of the ISO Standard, then you should live and breathe trap representations, and expect to find them everywhere you look. Then you can join the clc regulars' club.
Nov 16 '08 #27

c.***********@gmail.com wrote:
> On Nov 16, 2:33 am, pete wrote:
> > James Kuyper wrote:
> > > coolguyaround...@gmail.com wrote:
> > > > Will the following statement invoke undefined behavior:
> > > > a^=b,b^=a,a^=b;
> > > > given that a and b are of int-type?
> > > That's perfectly safe.
> > Not perfectly. Given that a and b are of int-type, (a^=b) is capable of generating a trap representation.
> Can you please elaborate about this trap representation.....

A trap representation for a given data type is a bit pattern that doesn't actually represent a valid value in that type. Most data types are allowed to have trap representations. Whether or not they actually do is up to your implementation. The main exceptions are the character types. Also, a struct or a union object is never a trap representation, even though it may have a member which is a trap representation.

Uninitialized objects might contain a trap representation. You can create a trap representation in an object by accessing the object as if it were an array of bytes and changing the value of some of those bytes. A pointer object whose value used to point into a memory block allocated by calling a member of the malloc() family can become a trap representation as a result of free()ing that memory block. A pointer that points at or into a non-static object that is local to a function can become a trap representation when that function returns. A FILE * value can become a trap representation as a result of fclose()ing the file. There are also several other more obscure ways in which an object may acquire a trap representation.

Any attempt to read the value of an object containing a trap representation has undefined behavior. Any attempt to store a trap representation in an object has undefined behavior.

The standard explicitly states (6.2.6.2p2) that any given signed integer type has one bit pattern that might or might not be a trap representation - it's up to the implementation to decide (and to document their decision). For types that use a ones' complement or sign-magnitude representation, this is the bit pattern that would otherwise represent negative 0. If the type uses a two's complement representation, this is the bit pattern that would otherwise represent -2^N, where N is the number of value bits in the type.

Some people read that clause as allowing only that one trap representation, and requiring that all other bit patterns must be valid. I don't read it that way. It seems to me that what it says still allows for the possibility of other trap representations as well. An implementation that used 1 padding bit, 1 sign bit, and 30 value bits for 'int' could set INT_MAX to 1000000000, and INT_MIN to -1000000000, and declare that all bit patterns that would seem to represent values outside that range are actually trap representations.

It's been argued that this violates the requirement that for any signed type "Each bit that is a value bit shall have the same value as the same bit in the object representation of the corresponding unsigned type." But every such bit does have that value, in every non-trap representation that has that bit set.
Nov 16 '08 #28
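The padding bits in the hypothetical implementation above are something a program can look for. The sketch below counts the value bits of unsigned int and derives the number of padding bits from the object size; the function names are made up for illustration, and the result says nothing about which bit patterns an implementation actually treats as trap representations.

```c
#include <limits.h>

/* Hypothetical helper (not from the thread): counts the value bits of
   unsigned int by shifting UINT_MAX down to zero. */
int unsigned_value_bits(void)
{
    unsigned int max = UINT_MAX;
    int bits = 0;

    while (max != 0) {
        bits++;
        max >>= 1;
    }
    return bits;
}

/* Hypothetical helper: bits in the object representation of unsigned
   int that are not value bits, i.e. padding bits. */
int unsigned_padding_bits(void)
{
    return (int)(sizeof(unsigned int) * CHAR_BIT) - unsigned_value_bits();
}
```

On common desktop implementations the second function is expected to return 0, which is one reason trap representations of integer types are rarely met in practice.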

James Kuyper wrote:
> > > > now again consider a^=b^=a^=b when compiler see this expression it starts from RHS as assignment
> > > Wrong, when the compiler sees that expression, he can do whatever he wants because that expression invokes undefined behavior.
> > When the compiler sees that expression, it cannot be sure whether it will be executed. It must compile the code, and if the program reaches the point where the expression is evaluated, _then_ the behaviour is undefined. The compiler cannot do whatever it wants. The compiler can make the resulting code do whatever it wants.
> The standard quite explicitly says that the consequences of undefined behavior can include failing to compile, which clearly indicates that it can precede execution of the relevant code. The standard does not explain this in any detail, but I believe that the relevant rule is that the undefined behavior is allowed at any point after execution of the relevant code becomes inevitable.

I had to read this a couple of times before I understood the main point. I mostly agree with the main point (explained further below), but the inference stated in the first sentence seems wrong to me. It's true that undefined behavior during, e.g., macro processing might cause compilation to fail, but execution-time undefined behavior can't start until program execution is started. Whatever execution-time undefined behavior there might be isn't allowed to creep across the boundary into program translation.

> Example: if(some condition) a^=b^=a^=b; For code like this, the behavior of the code becomes undefined as soon as it becomes inevitable that the if() clause will be executed. This means that at points in the code prior to the if() statement, the compiler is allowed to generate code using optimizations that only work if the if-condition is not true. As a result, those optimizations may cause your code to misbehave long before evaluation of the offending statement.

That's true, but only up to a point, under the as-if rule. In particular, the undefined behavior must not be allowed to go back before any access of a volatile variable, or any operation on any potentially interactive device. In practical terms, the undefined behavior usually cannot be moved back before a call to an external function, since any unknown function might call longjmp(), preventing the code in question from being reached.

I don't think any of these refinements are at odds with the intended point, but they seem important enough to be worth mentioning.
Nov 18 '08 #29

James Kuyper

Tim Rentsch wrote: James Kuyper

> I had to read this a couple of times before I understood the main point. I mostly agree with the main point (explained further below), but the inference stated in the first sentence seems wrong to me. It's true that undefined behavior during, e.g., macro processing might cause compilation to fail, but execution-time undefined behavior can't start until program execution is started. Whatever execution-time undefined behavior there might be isn't allowed to creep across the boundary into program translation.

Get a good textbook on Temporal Mechanics. Miles O'Brien of Deep Space Nine has one. The subject isn't as simple as you might think.

One call to fflush(stdin) retroactively removed the 'ant' keyword from C89 (const ant int xyz = 42; defined an actual integer constant expression in a variable). Too bad; it was a useful feature :-)
Nov 19 '08 #32

go****@hammy.burditt.org (Gordon Burditt) writes:
> > I had to read this a couple of times before I understood the main point. I mostly agree with the main point (explained further below), but the inference stated in the first sentence seems wrong to me. It's true that undefined behavior during, e.g., macro processing might cause compilation to fail, but execution-time undefined behavior can't start until program execution is started. Whatever execution-time undefined behavior there might be isn't allowed to creep across the boundary into program translation.
> Get a good textbook on Temporal Mechanics. Miles O'Brien of Deep Space Nine has one. The subject isn't as simple as you might think.

I already have one. It was required reading in the Temporal Mechanics course that I took ten years from now.
Nov 19 '08 #33