why is it so ?

junky_fellow

Why

i = ++i; is undefined
but
i = func(++i); is defined ?

I know there is a sequence point after a call to a function
after the arguments have been evaluated. But I am not able
to visualise how the sequence point makes the second statement
defined ? Also can someone give me a detailed explanation of
why first statement is undefined ?

Thanx for any help in advance...

Nov 15 '05 #1


Robert Gamble

ju**********@yahoo.co.in wrote:

> Why
>
> i = ++i; is undefined
> but
> i = func(++i); is defined ?
>
> I know there is a sequence point after a call to a function
> after the arguments have been evaluated. But I am not able
> to visualise how the sequence point makes the second statement
> defined ?

See the recent thread entitled "Sequence points and function calls" on
comp.std.c.

> Also can someone give me a detailed explanation of
> why first statement is undefined?

The standard states that an object shall not have its stored value
modified more than once between sequence points. Your example modifies
i twice without an intervening sequence point; it's pretty straightforward, I think.
If you still don't understand, you'll have to be more specific about
what you are having trouble grasping.

Robert Gamble

Nov 15 '05 #2

junky_fellow

Robert Gamble wrote:
<snip>

> > Also can someone give me a detailed explanation of
> > why first statement is undefined?
>
> The standard states that an object shall not have its stored value
> modified more than once between sequence points. Your example modifies
> i twice without an intervening sequence point; it's pretty straightforward, I think.
> If you still don't understand, you'll have to be more specific about
> what you are having trouble grasping.
>
> Robert Gamble

Still I don't understand what's the harm in doing that. Can
you please show (step by step) how this will lead to different results
on different compilers ?

Nov 15 '05 #3

Robert Gamble

ju**********@yahoo.co.in wrote:

> Robert Gamble wrote:
> <snip>
> > > Also can someone give me a detailed explanation of
> > > why first statement is undefined?
> >
> > The standard states that an object shall not have its stored value
> > modified more than once between sequence points. Your example modifies
> > i twice without an intervening sequence point; it's pretty straightforward, I think.
> > If you still don't understand, you'll have to be more specific about
> > what you are having trouble grasping.
> >
> > Robert Gamble
>
> Still I don't understand what's the harm in doing that.

The harm? It invokes undefined behavior.

> Can you please show (step by step) how this will lead to different results
> on different compilers ?

Outside of the aforementioned restriction placed by the Standard, there
is only one logical way to evaluate the expression i=++i, the same way
that i=++j would be evaluated, with the value of i being increased by
one in the former example. But the Standard does place the restriction
there, and since i is modified twice (once by the increment operator
and once by the assignment operator) the behavior is undefined. I
don't know the exact reasoning behind the restriction with regard to
non-ambiguous expressions, but I suspect it has to do with the complexity
involved in trying to come up with wording to define what is
ambiguous and what is not.

Robert Gamble

Nov 15 '05 #4

Chris Torek

In article <11**********************@z14g2000cwz.googlegroups.com>
Robert Gamble <rg*******@gmail.com> wrote:

Outside of the aforementioned restriction placed by the Standard, there
is only one logical way to evaluate the expression i=++i ...

Which "one way" is that? Is that the one that goes:

"increment variable i as stored in register %l2, then
copy register %l2 to register %l2"

or would perhaps be:

"remember to increment register %l2; then compute %l2+1
with result going into %l2; then execute remembered increment,
so increment register %l2"

? Note that the latter winds up with "i" increasing by two
instead of 1, but both emit two instructions:

inc %l2
mov %l2,%l2

or:

add %l2,1,%l2
inc %l2

and "add" and "mov" are both single-cycle instructions ("mov" is
an assembler alias for "logical-or with register %g0").
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

Nov 15 '05 #5

Keith Thompson

ju**********@yahoo.co.in writes:

> Robert Gamble wrote:
> <snip>
> > > Also can someone give me a detailed explanation of
> > > why first statement is undefined?
> >
> > The standard states that an object shall not have its stored value
> > modified more than once between sequence points. Your example modifies
> > i twice without an intervening sequence point; it's pretty straightforward, I think.
> > If you still don't understand, you'll have to be more specific about
> > what you are having trouble grasping.
> >
> > Robert Gamble
>
> Still I don't understand what's the harm in doing that. Can
> you please show (step by step) how this will lead to different results
> on different compilers ?

The statement in question was

i = i++;

It exhibits undefined behavior because the standard says it invokes
undefined behavior.

Optimizing compilers can perform various transformations on your code;
the generated assembly or machine code may bear little obvious
resemblance to what you wrote, except that it behaves the same way.
Since undefined behavior can do anything, the optimizer is allowed to
assume that there is no undefined behavior. If this assumption turns
out to be incorrect, the results can be arbitrarily bad -- and it's
your fault, not the optimizer's. "If you lie to the compiler, it will
get its revenge." -- Henry Spencer

And in this particular case, nothing would be gained by defining the
behavior. There is no reason to write
i = i++;
in the first place. If you want to increment i, just write
i++;
Assigning the result back to i would be superfluous even if it worked.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 15 '05 #6

Robert Gamble

Chris Torek wrote:

> In article <11**********************@z14g2000cwz.googlegroups.com>
> Robert Gamble <rg*******@gmail.com> wrote:
> > Outside of the aforementioned restriction placed by the Standard, there
> > is only one logical way to evaluate the expression i=++i ...
>
> Which "one way" is that? Is that the one that goes:
>
> "increment variable i as stored in register %l2, then
> copy register %l2 to register %l2"

This makes sense (assuming a non-optimizing compiler).

> or would perhaps be:
>
> "remember to increment register %l2; then compute %l2+1
> with result going into %l2; then execute remembered increment,
> so increment register %l2"

This might make sense if the expression was i = i++ but I don't think
this would be a valid way to handle i = ++i.

Robert Gamble

Nov 15 '05 #7

Flash Gordon

Robert Gamble wrote:

> Chris Torek wrote:
> > In article <11**********************@z14g2000cwz.googlegroups.com>
> > Robert Gamble <rg*******@gmail.com> wrote:
> > > Outside of the aforementioned restriction placed by the Standard, there
> > > is only one logical way to evaluate the expression i=++i ...
> >
> > Which "one way" is that? Is that the one that goes:
> >
> > "increment variable i as stored in register %l2, then
> > copy register %l2 to register %l2"
>
> This makes sense (assuming a non-optimizing compiler)
>
> > or would perhaps be:
> >
> > "remember to increment register %l2; then compute %l2+1
> > with result going into %l2; then execute remembered increment,
> > so increment register %l2"
>
> This might make sense if the expression was i = i++ but I don't think
> this would be a valid way to handle i = ++i.

Well, here is a way it could do it on a processor that can run
instructions in parallel:
t = i + 1
In parallel write t to i and increment i
SIGINSTRUCTIONCLASH
i.e. bang because two instructions are writing to the same location at
the same time.

I don't know if there are any such implementations, but there are
definitely processors with multiple pipelines, so it could happen.
--
Flash Gordon
Living in interesting times.
Although my email address says spam, it is real and I read it.

Nov 15 '05 #8

Mark McIntyre

On 19 Jul 2005 04:10:33 -0700, in comp.lang.c ,
ju**********@yahoo.co.in wrote:

> Why
>
> i = ++i; is undefined
> but
> i = func(++i); is defined ?
>
> I know there is a sequence point after a call to a function
> after the arguments have been evaluated. But I am not able
> to visualise how the sequence point makes the second statement
> defined ?

At a sequence point, the effect of all preceding operations must be
complete. Therefore the effect of ++ must be complete. Therefore
passing it to a function is defined.

> Also can someone give me a detailed explanation of
> why first statement is undefined ?

Axiomatically, the order in which things between sequence points are
done is undefined.

--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.ungerhu.com/jxh/clc.welcome.txt>

----== Posted via Newsfeeds.Com - Unlimited-Uncensored-Secure Usenet News==----
http://www.newsfeeds.com The #1 Newsgroup Service in the World! 120,000+ Newsgroups
----= East and West-Coast Server Farms - Total Privacy via Encryption =----

Nov 15 '05 #9

Mark McIntyre

On 19 Jul 2005 07:22:55 -0700, in comp.lang.c ,
ju**********@yahoo.co.in wrote:

(of invoking UB via modifying an object twice between sequence points.)

Still I don't understand what's the harm in doing that. Can
you please show (step by step) how this will lead to different results
on different complilers ?

First you tell us what the result is. Hint: there are multiple
possibilities, even excluding the use of parallel processor pipelines
to execute the store and increment.
--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.ungerhu.com/jxh/clc.welcome.txt>


Nov 15 '05 #10

Barry Schwarz

On 19 Jul 2005 04:10:33 -0700, ju**********@yahoo.co.in wrote:

Why

i = ++i; is undefined
but
i = func(++i); is defined ?

I know there is a sequence point after a call to a function
after the arguments have been evaluated. But I am not able
to visualise how the sequence point makes the second statement
defined ? Also can someone give me a detailed explanation of
why first statement is undefined ?

In the first example, the ++ operator returns the incremented value of
i but does not necessarily increment storage until a sequence point is
reached. The only sequence point is the terminating semicolon. So it
would be possible for the ++ operator to increment the value in i
before or after the value is changed by the = operator. This allows
for two different possible results in i which is obviously
intolerable. So the standard requires this to be undefined behavior.

In the second example, there is a sequence point after the argument is
evaluated but before the function is called. Therefore, the update to
i due to the ++ operator is performed before the function is called.
After the function returns, the value of i is updated again by the =
operator but there is no other update "pending" and therefore no
potential ambiguity.

<<Remove the del for email>>

Nov 15 '05 #11

Eric Sosman

Barry Schwarz wrote:

On 19 Jul 2005 04:10:33 -0700, ju**********@yahoo.co.in wrote:

Why

i = ++i; is undefined
but
i = func(++i); is defined ?

I know there is a sequence point after a call to a function
after the arguments have been evaluated. But I am not able
to visualise how the sequence point makes the second statement
defined ? Also can someone give me a detailed explanation of
why first statement is undefined ?

In the first example, the ++ operator returns the incremented value of
i but does not necessarily increment storage until a sequence point is
reached. The only sequence point is the terminating semicolon. So it
would be possible for the ++ operator to increment the value in i
before or after the value is changed by the = operator. This allows
for two different possible results in i which is obviously
intolerable. So the standard requires this to be undefined behavior.

In the second example, there is a sequence point after the argument is
evaluated but before the function is called. Therefore, the update to
i due to the ++ operator is performed before the function is called.
After the function returns, the value of i is updated again by the =
operator but there is no other update "pending" and therefore no
potential ambiguity.

Personally, I've never been entirely convinced that
`i = f(++i)' is bulletproof. Yes, there's a sequence point
between evaluating `++i' and starting to execute `f', and
there's even another one before `f' returns its value. But
I don't think this guarantees a sequence point between the
evaluation of `++i' and the assignment to the l.h.s.

Now, "it stands to reason" that the assignment cannot
occur until after `f' returns its value, and `f' cannot
return its value until it's been called, so the sequence
point at the call should suffice. But I don't think this
argument is reliable: What if the compiler can predict the
value `f' will return without actually calling it at all?
For example,

int f(int x) {
    printf("%d bottles of beer on the wall\n", x);
    return 0; /* 0 = success, -1 = failure */
}

is "predictable," and I think the compiler would be within
its rights to zero `i', call `f', and ignore the returned
value. Lord only knows what would happen to the `++',
or what might appear on stdout. Old Frothingslosh, if
your luck is bad.

--
Er*********@sun.com

Nov 15 '05 #12

Robert Gamble

Eric Sosman wrote:

Barry Schwarz wrote:
On 19 Jul 2005 04:10:33 -0700, ju**********@yahoo.co.in wrote:

Why

i = ++i; is undefined
but
i = func(++i); is defined ?

I know there is a sequence point after a call to a function
after the arguments have been evaluated. But I am not able
to visualise how the sequence point makes the second statement
defined ? Also can someone give me a detailed explanation of
why first statement is undefined ?

In the first example, the ++ operator returns the incremented value of
i but does not necessarily increment storage until a sequence point is
reached. The only sequence point is the terminating semicolon. So it
would be possible for the ++ operator to increment the value in i
before or after the value is changed by the = operator. This allows
for two different possible results in i which is obviously
intolerable. So the standard requires this to be undefined behavior.

In the second example, there is a sequence point after the argument is
evaluated but before the function is called. Therefore, the update to
i due to the ++ operator is performed before the function is called.
After the function returns, the value of i is updated again by the =
operator but there is no other update "pending" and therefore no
potential ambiguity.

Personally, I've never been entirely convinced that
`i = f(++i)' is bulletproof. Yes, there's a sequence point
between evaluating `++i' and starting to execute `f', and
there's even another one before `f' returns its value. But
I don't think this guarantees a sequence point between the
evaluation of `++i' and the assignment to the l.h.s.

Now, "it stands to reason" that the assignment cannot
occur until after `f' returns its value, and `f' cannot
return its value until it's been called, so the sequence
point at the call should suffice. But I don't think this
argument is reliable: What if the compiler can predict the
value `f' will return without actually calling it at all?
For example,

int f(int x) {
printf ("%d bottles of beer on the wall\n", x);
return 0; /* 0 = success, -1 = failure */
}

is "predictable," and I think the compiler would be within
its rights to zero `i', call `f', and ignore the returned
value.

I don't think so. A very similar example was provided recently in the
thread entitled "Sequence points and function calls" in comp.std.c. In
response, Peter Nilsson noted that such optimization "cannot induce
undefined behavior in a case where the abstract semantics are well
defined" and pointed out an example in the Standard to support this
statement (Example 6 in 5.1.2.3). Antoine Leca also noted "it is possible
to allow the compiler to optimise by storing 0 in i early (or for example,
on a different CPU), but not earlier than the fetch of the previous value
of i to feed into the call." and provided relevant citations from the
Standard. I agree with their reasoning and can't find fault with their
arguments.

Robert Gamble

Nov 15 '05 #13

Keith Thompson

Eric Sosman <er*********@sun.com> writes:
[...]

Personally, I've never been entirely convinced that
`i = f(++i)' is bulletproof. Yes, there's a sequence point
between evaluating `++i' and starting to execute `f', and
there's even another one before `f' returns its value. But
I don't think this guarantees a sequence point between the
evaluation of `++i' and the assignment to the l.h.s.

Now, "it stands to reason" that the assignment cannot
occur until after `f' returns its value, and `f' cannot
return its value until it's been called, so the sequence
point at the call should suffice. But I don't think this
argument is reliable: What if the compiler can predict the
value `f' will return without actually calling it at all?
For example,

int f(int x) {
printf ("%d bottles of beer on the wall\n", x);
return 0; /* 0 = success, -1 = failure */
}

is "predictable," and I think the compiler would be within
its rights to zero `i', call `f', and ignore the returned
value. Lord only knows what would happen to the `++',
or what might appear on stdout. Old Frothingslosh, if
your luck is bad.

The whole purpose of sequence points, I think, is to impose a
reasonable set of restrictions on what optimizations a compiler is
allowed to perform. An optimizer *can* move a side effect across a
sequence point, but it's allowed to do so only if it doesn't destroy
the semantics of the program. The actual program needs to behave,
within certain limits, in a manner consistent with the way the program
behaves in the C abstract machine.

For example:

int i = 3;
printf("i = %d\n", i);
i ++;

Moving the increment before the printf, causing it to print "i = 4",
would be non-conforming. I believe the transformation you describe
would be non-conforming for the same reasons.

But I'm not 100% certain that I'm correct about this, and I could
probably shoot some holes in my own arguments if I put my mind to it.

As a programmer, I'll just avoid things like "i = f(++i);". If I were
implementing a compiler, I'd try to be conservative enough in my
optimizations so that "i = f(++i);" works as expected, even if I can
justify breaking it by invoking undefined behavior if I squint while
reading the standard.

The real question is, if an implementation does something other than
the obvious for "i = f(++i);", can I complain about non-conformance in
my bug report? I *think* the answer is yes.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 15 '05 #14

CBFalconer

Keith Thompson wrote:

.... snip ...
The real question is, if an implementation does something other
than the obvious for "i = f(++i);", can I complain about non-
conformance in my bug report? I *think* the answer is yes.

And who would ever write that when "i = f(i + 1);" works without
any question, and is even clear to the reader? ++ and -- are
overused IMO.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!

Nov 15 '05 #15

S.Tobias

Keith Thompson <ks***@mib.org> wrote:

Eric Sosman <er*********@sun.com> writes:
[...]
Personally, I've never been entirely convinced that
`i = f(++i)' is bulletproof. Yes, there's a sequence point
[snip]
The whole purpose of sequence points, I think, is to impose a
reasonable set of restrictions on what optimizations a compiler is
allowed to perform. An optimizer *can* move a side effect across a
sequence point, but it's allowed to do so only if it doesn't destroy
the semantics of the program. The actual program needs to behave, [snip]
As a programmer, I'll just avoid things like "i = f(++i);". If I were
implementing a compiler, I'd try to be conservative enough in my
optimizations so that "i = f(++i);" works as expected, even if I can

[snip]

I think there's a worse pit-fall:
int i=0;
a[i] = f(i++);
Which element is being set?
I think this is unspecified (6.5.16#4), but the behaviour
is defined.

Here's a test program and results:

gcc: 1 0
como: 0 1

#include <stdio.h>

int one(int unused) { return 1; }

int main()
{
    int a[2] = {0};
    int i = 0;
    a[i] = one(i++);
    printf("%d %d\n", a[0], a[1]);
    return 0;
}

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`

Nov 15 '05 #16

pete

CBFalconer wrote:

Keith Thompson wrote:

... snip ...

The real question is, if an implementation does something other
than the obvious for "i = f(++i);", can I complain about non-
conformance in my bug report? I *think* the answer is yes.

And who would ever write that when "i = f(i + 1);" works without
any question, and is even clear to the reader? ++ and -- are
overused IMO.

For style reasons, I don't like arguments with side effects.

--
pete

Nov 15 '05 #17

Barry Schwarz

On 21 Jul 2005 09:23:50 GMT, "S.Tobias"
<si***@FamOuS.BedBuG.pAlS.INVALID> wrote:

Keith Thompson <ks***@mib.org> wrote:
Eric Sosman <er*********@sun.com> writes:
[...]
Personally, I've never been entirely convinced that
`i = f(++i)' is bulletproof. Yes, there's a sequence point
[snip]
The whole purpose of sequence points, I think, is to impose a
reasonable set of restrictions on what optimizations a compiler is
allowed to perform. An optimizer *can* move a side effect across a
sequence point, but it's allowed to do so only if it doesn't destroy
the semantics of the program. The actual program needs to behave,

[snip]

As a programmer, I'll just avoid things like "i = f(++i);". If I were
implementing a compiler, I'd try to be conservative enough in my
optimizations so that "i = f(++i);" works as expected, even if I can

[snip]

I think there's a worse pit-fall:
int i=0;
a[i] = f(i++);
Which element is being set?
I think this is unspecified (6.5.16#4), but the behaviour
is defined.

I don't think the behavior is defined. While i is being updated only
once, there is a second requirement that i be evaluated at most once
as part of the process. Here i is being evaluated twice.

Here's a test program and results:

gcc: 1 0
como: 0 1

#include <stdio.h>

int one(int unused) { return 1; }

int main()
{
int a[2] = {0};
int i = 0;
a[i] = one(i++);
printf("%d %d\n", a[0], a[1]);
return 0;
}

The fact that undefined behavior appears to work as expected doesn't
make the behavior defined.

<<Remove the del for email>>

Nov 15 '05 #18

Robert Gamble

Barry Schwarz wrote:

On 21 Jul 2005 09:23:50 GMT, "S.Tobias"
<si***@FamOuS.BedBuG.pAlS.INVALID> wrote:
Keith Thompson <ks***@mib.org> wrote:
Eric Sosman <er*********@sun.com> writes:
[...]
Personally, I've never been entirely convinced that
`i = f(++i)' is bulletproof. Yes, there's a sequence point

[snip]
The whole purpose of sequence points, I think, is to impose a
reasonable set of restrictions on what optimizations a compiler is
allowed to perform. An optimizer *can* move a side effect across a
sequence point, but it's allowed to do so only if it doesn't destroy
the semantics of the program. The actual program needs to behave,

[snip]

As a programmer, I'll just avoid things like "i = f(++i);". If I were
implementing a compiler, I'd try to be conservative enough in my
optimizations so that "i = f(++i);" works as expected, even if I can

[snip]

I think there's a worse pit-fall:
int i=0;
a[i] = f(i++);
Which element is being set?
I think this is unspecified (6.5.16#4), but the behaviour
is defined.

I don't think the behavior is defined. While i is being updated only
once, there is a second requirement that i be evaluated at most once
as part of the process. Here i is being evaluated twice.

That would mean that the expression "i = i + i;" is undefined, as i is
evaluated more than once. The second requirement you speak of actually
states that "the prior value shall be read only to determine the value
to be stored". The example still doesn't meet this requirement, though:
it is not guaranteed that i won't be both incremented and evaluated
before the call to f, and the read of i in a[i] is not made to
"determine the value to be stored" into i.

Here's a test program and results:

gcc: 1 0
como: 0 1

#include <stdio.h>

int one(int unused) { return 1; }

int main()
{
int a[2] = {0};
int i = 0;
a[i] = one(i++);
printf("%d %d\n", a[0], a[1]);
return 0;
}

The fact that undefined behavior appears to work as expected doesn't
make the behavior defined.

I don't think Stan was implying that it did, just stating that he
thought it was unspecified and providing an example to demonstrate the
danger of the scenario.

It could be argued that whether or not the behavior is undefined is
unspecified, since the order of evaluation that could make the behavior
undefined is itself unspecified.

Robert Gamble

Nov 15 '05 #19

Barry Schwarz

On 21 Jul 2005 17:31:20 -0700, "Robert Gamble" <rg*******@gmail.com>
wrote:

Barry Schwarz wrote:
On 21 Jul 2005 09:23:50 GMT, "S.Tobias"
<si***@FamOuS.BedBuG.pAlS.INVALID> wrote:
>Keith Thompson <ks***@mib.org> wrote:
>> Eric Sosman <er*********@sun.com> writes:
>> [...]
>>> Personally, I've never been entirely convinced that
>>> `i = f(++i)' is bulletproof. Yes, there's a sequence point
>[snip]
>
>> The whole purpose of sequence points, I think, is to impose a
>> reasonable set of restrictions on what optimizations a compiler is
>> allowed to perform. An optimizer *can* move a side effect across a
>> sequence point, but it's allowed to do so only if it doesn't destroy
>> the semantics of the program. The actual program needs to behave,
>[snip]
>>
>> As a programmer, I'll just avoid things like "i = f(++i);". If I were
>> implementing a compiler, I'd try to be conservative enough in my
>> optimizations so that "i = f(++i);" works as expected, even if I can
>[snip]
>
>I think there's a worse pit-fall:
> int i=0;
> a[i] = f(i++);
>Which element is being set?
>I think this is unspecified (6.5.16#4), but the behaviour
>is defined.

I don't think the behavior is defined. While i is being updated only
once, there is a second requirement that i be evaluated at most once
as part of the process. Here i is being evaluated twice.

That would mean that the expression "i = i + i;" is undefined as i is
evaluated more than once. The second requirement you speak of actually

I don't think the i on the left of the = operator is evaluated. If it
were, then a sequence like
int i;
i = 0;
would evaluate the uninitialized i which is another example of
undefined behavior.

<<Remove the del for email>>

Nov 15 '05 #20

Robert Gamble

Barry Schwarz wrote:

On 21 Jul 2005 17:31:20 -0700, "Robert Gamble" <rg*******@gmail.com>
wrote:
Barry Schwarz wrote:
On 21 Jul 2005 09:23:50 GMT, "S.Tobias"
<si***@FamOuS.BedBuG.pAlS.INVALID> wrote:

>Keith Thompson <ks***@mib.org> wrote:
>> Eric Sosman <er*********@sun.com> writes:
>> [...]
>>> Personally, I've never been entirely convinced that
>>> `i = f(++i)' is bulletproof. Yes, there's a sequence point
>[snip]
>
>> The whole purpose of sequence points, I think, is to impose a
>> reasonable set of restrictions on what optimizations a compiler is
>> allowed to perform. An optimizer *can* move a side effect across a
>> sequence point, but it's allowed to do so only if it doesn't destroy
>> the semantics of the program. The actual program needs to behave,
>[snip]
>>
>> As a programmer, I'll just avoid things like "i = f(++i);". If I were
>> implementing a compiler, I'd try to be conservative enough in my
>> optimizations so that "i = f(++i);" works as expected, even if I can
>[snip]
>
>I think there's a worse pit-fall:
> int i=0;
> a[i] = f(i++);
>Which element is being set?
>I think this is unspecified (6.5.16#4), but the behaviour
>is defined.

I don't think the behavior is defined. While i is being updated only
once, there is a second requirement that i be evaluated at most once
as part of the process. Here i is being evaluated twice.

That would mean that the expression "i = i + i;" is undefined as i is
evaluated more than once. The second requirement you speak of actually

I don't think the i on the left of the = operator is evaluated.

Neither do I. I do think that it is evaluated twice on the right-hand
side though.

Robert Gamble

Nov 15 '05 #21

CBFalconer

Robert Gamble wrote:

Barry Schwarz wrote:
"Robert Gamble" <rg*******@gmail.com> wrote:

.... snip ...

That would mean that the expression "i = i + i;" is undefined as
i is evaluated more than once. The second requirement you speak

I don't think the i on the left of the = operator is evaluated.

Neither do I. I do think that it is evaluated twice on the
right-hand side though.

One possible code generation sequence for a stack machine is:

instr.   stack content after
lda i    &i, ....
lda i    &i, &i, ....
load     i, &i, ....
lda i    &i, i, &i, ....
load     i, i, &i, ....
add      i+i, &i, ....
store    ....

where lda is load address, load is *TOS->TOS, etc. Another
possible sequence could be:

lda i
dup
load
dup
add
store

after some elementary optimization.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!

Nov 15 '05 #22

S.Tobias

Barry Schwarz <sc******@deloz.net> wrote:

On 21 Jul 2005 09:23:50 GMT, "S.Tobias"
<si***@FamOuS.BedBuG.pAlS.INVALID> wrote:

I think there's a worse pit-fall:
int i=0;
a[i] = f(i++);
Which element is being set?
I think this is unspecified (6.5.16#4), but the behaviour
is defined.

I don't think the behavior is defined. While i is being updated only
once, there is a second requirement that i be evaluated at most once
as part of the process. Here i is being evaluated twice.

Thanks, I think you're referring to:
(n8??.txt, 6.5)
[#2] Between the previous and next sequence point an object
shall have its stored value modified at most once by the
evaluation of an expression. Furthermore, the prior value
shall be accessed only to determine the value to be
stored.60)

Note that this requirement is made only between two consecutive
sequence points. Indeed, (only) if lhs is evaluated first, the
last sentence is violated for the `i' object.

Have a look at this, I believe it is correct now:

Here's a test program and results:

gcc: 1 0 (that's gcc3.3)
gcc2.95: 0 1
como: 0 1

#include <stdio.h>

int one(int unused) { return 1; }

int main()
{
    int a[2] = {0};
    int i = 0;
#if 0
    a[i] = one(i++);
#endif
    a[i] = one(0 || i++);
    printf("%d %d\n", a[0], a[1]);
    return 0;
}

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`

Nov 15 '05 #23

Robert Gamble

S.Tobias wrote:

Barry Schwarz <sc******@deloz.net> wrote:
On 21 Jul 2005 09:23:50 GMT, "S.Tobias"
<si***@FamOuS.BedBuG.pAlS.INVALID> wrote:

I think there's a worse pit-fall:
int i=0;
a[i] = f(i++);
Which element is being set?
I think this is unspecified (6.5.16#4), but the behaviour
is defined.

I don't think the behavior is defined. While i is being updated only
once, there is a second requirement that i be evaluated at most once
as part of the process. Here i is being evaluated twice.

Thanks, I think you're referring to:
(n8??.txt, 6.5)
[#2] Between the previous and next sequence point an object
shall have its stored value modified at most once by the
evaluation of an expression. Furthermore, the prior value
shall be accessed only to determine the value to be
stored.60)

Note that this requirement is made only between two consecutive
sequence points. Indeed, (only) if lhs is evaluated first, the
last sentence is violated for the `i' object.

Have a look at this; I believe it is correct now:
Here's a test program and results:

gcc: 1 0
that's gcc3.3
gcc2.95: 0 1
como: 0 1

#include <stdio.h>

int one(int unused) { return 1; }

int main(void)
{
    int a[2] = {0};
    int i = 0;
#if 0
    a[i] = one(i++);
#endif
    a[i] = one(0 || i++);
    printf("%d %d\n", a[0], a[1]);
    return 0;
}

Still has the possibility for undefined behavior, the "0 || i++" could
be evaluated followed by i in "a[i]" before the function is called
without an intervening sequence point.

Robert Gamble

Nov 15 '05 #24

S.Tobias

Robert Gamble <rg*******@gmail.com> wrote:

S.Tobias wrote:
a[i] = one(0 || i++);

Still has the possibility for undefined behavior, the "0 || i++" could
be evaluated followed by i in "a[i]" before the function is called
without an intervening sequence point.

I think it can be evaluated in one of the two ways:

(lhs, rhs)
a[i] , 0 [SEQP] i++ [SEQP] one() , return 1 [SEQP] =

or:

(rhs, lhs)
0 [SEQP] i++ [SEQP] one() , return 1 [SEQP] a[i] , =

The first case is more interesting. There's a sequence point
between evaluation of `a[i]' and `i++', so the behaviour is defined.
As for the whole expression, the behaviour is unspecified (I think).

Or is it that lhs and rhs can be evaluated in parallel? Then
there's indeed UB, but then there would be one in `i=f(++i)' too.

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`

Nov 15 '05 #25

Tim Rentsch

"S.Tobias" <si***@FamOuS.BedBuG.pAlS.INVALID> writes:

Robert Gamble <rg*******@gmail.com> wrote:
S.Tobias wrote:

a[i] = one(0 || i++);

Still has the possibility for undefined behavior, the "0 || i++" could
be evaluated followed by i in "a[i]" before the function is called
without an intervening sequence point.

I think it can be evaluated in one of the two ways:

(lhs, rhs)
a[i] , 0 [SEQP] i++ [SEQP] one() , return 1 [SEQP] =

or:

(rhs, lhs)
0 [SEQP] i++ [SEQP] one() , return 1 [SEQP] a[i] , =

You missed at least one way:

0 [SEQP] a[i] , i++ [SEQP] one() , return 1 [SEQP] =

This ordering shows how this statement could evoke undefined behavior.

Nov 15 '05 #26

Robert Gamble

S.Tobias wrote:

Robert Gamble <rg*******@gmail.com> wrote:
S.Tobias wrote:

a[i] = one(0 || i++);

Still has the possibility for undefined behavior, the "0 || i++" could
be evaluated followed by i in "a[i]" before the function is called
without an intervening sequence point.

I think it can be evaluated in one of the two ways:

(lhs, rhs)
a[i] , 0 [SEQP] i++ [SEQP] one() , return 1 [SEQP] =

or:

(rhs, lhs)
0 [SEQP] i++ [SEQP] one() , return 1 [SEQP] a[i] , =

The first case is more interesting. There's a sequence point
between evaluation of `a[i]' and `i++', so the behaviour is defined.
As for the whole expression, the behaviour is unspecified (I think).

Or is it that lhs and rhs can be evaluated in parallel? Then
there's indeed UB, but then there would be one in `i=f(++i)' too.

I am not aware of anything in the Standard that states that both ++i in
f(++i) and i in a[i] can't be evaluated before the actual function
call; this would invoke undefined behavior. See Tim's response.

As for i=f(++i), I still think this is well-defined. The i on the lhs
is not evaluated and cannot be assigned to until after f returns which
guarantees a sequence point between the two modifications.

To summarize my stance:
i = f(++i); well-defined
i = f(i++); well-defined
a[i] = f(++i); undefined
a[i] = f(i++); undefined

Also undefined would be:
a[++i] = f(i);
a[i++] = f(i);

Robert Gamble

Nov 15 '05 #27

S.Tobias

Tim Rentsch <tx*@alumnus.caltech.edu> wrote:

"S.Tobias" <si***@FamOuS.BedBuG.pAlS.INVALID> writes:
Robert Gamble <rg*******@gmail.com> wrote:
> S.Tobias wrote:

>> a[i] = one(0 || i++);

> Still has the possibility for undefined behavior, the "0 || i++" could
> be evaluated followed by i in "a[i]" before the function is called
> without an intervening sequence point.

I think it can be evaluated in one of the two ways:

(lhs, rhs)
a[i] , 0 [SEQP] i++ [SEQP] one() , return 1 [SEQP] =

or:

(rhs, lhs)
0 [SEQP] i++ [SEQP] one() , return 1 [SEQP] a[i] , =

You missed at least one way:

0 [SEQP] a[i] , i++ [SEQP] one() , return 1 [SEQP] =

Could you please tell me by which rule you have interleaved evaluation
of lhs and rhs?

# 6.5.16 Assignment operators
# 4 The order of evaluation of the operands is unspecified. [...]

Compare it with:

# 6.5 Expressions
# 3 The grouping of operators and operands is indicated by the
# syntax.71) Except as specified later (for the function-call (),
# &&, ||, ?:, and comma operators), the order of evaluation of
# subexpressions and the order in which side effects take place
# are both unspecified.
#
# 6.5.2.2 Function calls
# 10 The order of evaluation of the function designator, the actual
# arguments, and subexpressions within the actual arguments is
# unspecified, but there is a sequence point before the actual call.

I interpret the above wording to mean that for the "=" operator either
the lhs is fully evaluated and then the rhs is evaluated, or the rhs is
fully evaluated and then the lhs. If the Standard meant otherwise, 6.5.16p.4
could be dropped (as it is for other operators); 6.5p.3 would be enough.
Note also the explicit wording for function calls ("and subexpressions...").

(6.5.2.2p.10 is actually needed and is not covered by 6.5p.3, because
the function designator and arguments are not operands to a common operator;
rather, the arguments parameterize the operator, whose (single)
operand is the function expression.)

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`

Nov 15 '05 #28

Robert Gamble

S.Tobias wrote:

Tim Rentsch <tx*@alumnus.caltech.edu> wrote:
"S.Tobias" <si***@FamOuS.BedBuG.pAlS.INVALID> writes:
Robert Gamble <rg*******@gmail.com> wrote:
> S.Tobias wrote:

>> a[i] = one(0 || i++);

> Still has the possibility for undefined behavior, the "0 || i++" could
> be evaluated followed by i in "a[i]" before the function is called
> without an intervening sequence point.

I think it can be evaluated in one of the two ways:

(lhs, rhs)
a[i] , 0 [SEQP] i++ [SEQP] one() , return 1 [SEQP] =

or:

(rhs, lhs)
0 [SEQP] i++ [SEQP] one() , return 1 [SEQP] a[i] , =

You missed at least one way:

0 [SEQP] a[i] , i++ [SEQP] one() , return 1 [SEQP] =

Could you please tell me by which rule you have interleaved evaluation
of lhs and rhs?

# 6.5.16 Assignment operators
# 4 The order of evaluation of the operands is unspecified. [...]

Compare it with:

# 6.5 Expressions
# 3 The grouping of operators and operands is indicated by the
# syntax.71) Except as specified later (for the function-call (),
# &&, ||, ?:, and comma operators), the order of evaluation of
# subexpressions and the order in which side effects take place
# are both unspecified.
#
# 6.5.2.2 Function calls
# 10 The order of evaluation of the function designator, the actual
# arguments, and subexpressions within the actual arguments is
# unspecified, but there is a sequence point before the actual call.

I interpret the above wording to mean that for the "=" operator either
the lhs is fully evaluated and then the rhs is evaluated, or the rhs is
fully evaluated and then the lhs. If the Standard meant otherwise, 6.5.16p.4
could be dropped (as it is for other operators); 6.5p.3 would be enough.
Note also the explicit wording for function calls ("and subexpressions...").

I don't think this is a solid argument; on the contrary, one could just
as easily conclude that the wording is there specifically to allow what
you say it doesn't. If the intent was that the rhs be evaluated before
the lhs, it would have been very easy to add a clause saying just that.

There is a document entitled "Sequence Point Analysis" authored by
Raymond Mak of IBM and published as WG14 N926 in which he presents a
method of breaking up expressions using partial ordering by creating an
abstract syntax tree to better understand these types of issues
regarding sequence points, how expressions are evaluated, and what
types of expressions fall into the categories of defined, undefined,
and unspecified behavior. The document is freely available at
http://www.open-std.org/jtc1/sc22/wg.../docs/n926.htm. I recommend
reading it; the material is well-presented and easy to follow.

The method presented in this document provides a relatively clear and
concise way to understand the nature of sequence points in expressions.
Many examples are discussed including two that are relevant to this
thread. Below are those two examples and their conclusions; refer to the
actual document for the details.

EXAMPLE 4

int x;
extern int f(int);
x = f(x++);

....

There is an intervening sequence point. The expression is
well-defined.

and

EXAMPLE 13

int x[2], *y;
y=x;
*y = f(y++);

....

Even though there is a sequence point in between, the two nodes are
not ordered. The expression is undefined.

Example 13 is equivalent in nature to the example we are discussing
now. It is undefined for the same reasons that a[i] = f(i++); is
undefined.

It might be worth noting that gcc 3.3.5 seems to agree with all of this
(not to imply of course that the compiler dictates the Standard, etc.).

Robert Gamble

Nov 15 '05 #29

Tim Rentsch

"S.Tobias" <si***@FamOuS.BedBuG.pAlS.INVALID> writes:

Tim Rentsch <tx*@alumnus.caltech.edu> wrote:
"S.Tobias" <si***@FamOuS.BedBuG.pAlS.INVALID> writes:
Robert Gamble <rg*******@gmail.com> wrote:
> S.Tobias wrote:

>> a[i] = one(0 || i++);

> Still has the possibility for undefined behavior, the "0 || i++" could
> be evaluated followed by i in "a[i]" before the function is called
> without an intervening sequence point.

I think it can be evaluated in one of the two ways:

(lhs, rhs)
a[i] , 0 [SEQP] i++ [SEQP] one() , return 1 [SEQP] =

or:

(rhs, lhs)
0 [SEQP] i++ [SEQP] one() , return 1 [SEQP] a[i] , =

You missed at least one way:

0 [SEQP] a[i] , i++ [SEQP] one() , return 1 [SEQP] =

Could you please tell me by which rule you have interleaved evaluation
of lhs and rhs? [snip]

Sorry for the late reply here...

I don't have specific text I can point to that shows this
clearly and unambiguously. It's just how expression
evaluation works, with the clause that "order of evaluation
of subexpressions is unspecified".

A way that might be useful as an explanation is this: start
at the top of the abstract syntax tree, and recurse
non-deterministically down both subtrees of ordinary
operators (basically everything but &&, ||, comma, and ?:)
and deterministically down the left branch then the other
branch (if necessary) for the remaining operators; any
order of execution in this non-deterministic expansion
is a possible order of execution.

I also echo the comment made in another reply to look
at the document describing the formal model done by
Raymond Mak.

Nov 15 '05 #30
