Aliasing in assignment

The following code crashes on Solaris 10 when compiled without
optimization:

typedef struct Node Node;

struct Node {
    int val;
    Node *next;
};

int main(void)
{
    Node b = { 2, 0 };
    Node a = { 1, &b };

    a = *(a.next);

    return a.val;
}

What happens is that after a has been written to, and a.next has been
set to null, a.next is dereferenced again (for some obscure reason).

Whose fault is this, the programmer's or the compiler's? Initially I
thought that it's the programmer's fault since there are no sequence
points between the access to a.next and the writing to a, but if that
made the code illegal, how about the following ubiquitous idiom:

Node* p = &a;
p = p->next;

Here, too, we both access p and write to p without sequence points in
between. What's the difference, or is there any?

Thanks in advance.
Lauri
Mar 22 '07 #1
16 Replies


In article <et**********@oravannahka.helsinki.fi>,
Lauri Alanko <la@iki.fi> wrote:
> The following code crashes on Solaris 10 when compiled without
> optimization:

[snip bits that make it compilable]

>     Node b = { 2, 0 };
>     Node a = { 1, &b };
>
>     a = *(a.next);
>
> What happens is that after a has been written to, and a.next has been
> set to null, a.next is dereferenced again (for some obscure reason).
>
> Whose fault is this, the programmer's or the compiler's? Initially I
> thought that it's the programmer's fault since there are no sequence
> points between the access to a.next and the writing to a,

As far as I can tell, the old value is accessed only to determine the
new value to be stored (access pointer in a.next to determine where to
find value to store, dereference pointer to get value to store), which
means it's perfectly acceptable according to 6.5#2 of n869, which is
the paragraph that would make it undefined if that were the problem.

So unless I'm missing something, this looks like a compiler bug.
dave

--
Dave Vandervies dj******@csclub.uwaterloo.ca
Well, it's logically consistent and interesting. That appears to be
all mathematicians need.
--James Riden in the scary devil monastery
Mar 22 '07 #2
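
The reading of 6.5#2 given in the reply above can be illustrated with a minimal sketch. It assumes only the Node type from the original post; the function name copy_through_next and the variables spelled_out and tmp are hypothetical, introduced here for illustration. The one-statement form and the spelled-out form should leave the node in the same state, because the old value of a is read only to work out what to store.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node Node;

struct Node {
    int val;
    Node *next;
};

/* Hypothetical helper: performs the assignment from the original post
 * and returns the resulting a.val. */
int copy_through_next(void)
{
    Node b, a, spelled_out, tmp;

    b.val = 2;
    b.next = NULL;
    a.val = 1;
    a.next = &b;
    spelled_out = a;            /* second copy, used for the long form */

    /* One statement: a.next is read only to locate the value to store. */
    a = *(a.next);

    /* The same steps spelled out with a temporary. */
    tmp = *(spelled_out.next);  /* read the pointer, then the object behind it */
    spelled_out = tmp;          /* store the value that was read */

    /* Both forms should agree member by member. */
    assert(a.val == spelled_out.val);
    assert(a.next == spelled_out.next);

    return a.val;
}
```

Calling copy_through_next() from main and returning its result mirrors the original test program.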

Lauri Alanko <l...@iki.fi> wrote:
> The following code crashes on Solaris 10 when compiled without
> optimization:
>
> typedef struct Node Node;
>
> struct Node {
>     int val;
>     Node *next;
> };
>
> int main(void)
> {
>     Node b = { 2, 0 };
>     Node a = { 1, &b };

This violates constraint 6.7.8p4 which states that an
initialiser has to be a constant.

Try...

    static Node b = { 2, 0 };
    static Node a = { 1, &b };

>     a = *(a.next);
>     return a.val;

Note that neither 1 nor 2 are portable values for main to
return to the host. Use 0, EXIT_SUCCESS or EXIT_FAILURE;
the latter two from <stdlib.h>.

> }
>
> What happens is that after a has been written to, and a.next
> has been set to null, a.next is dereferenced again (for some
> obscure reason).
>
> Whose fault is this, the programmer's or the compiler's?

One fault is the programmer's. If fixing that doesn't fix the
problem, then it appears to be the compiler's.

--
Peter

Mar 22 '07 #3

Peter Nilsson wrote:
> Lauri Alanko <l...@iki.fi> wrote:
>> The following code crashes on Solaris 10 when compiled without
>> optimization:
>>
>> typedef struct Node Node;
>>
>> struct Node {
>>     int val;
>>     Node *next;
>> };
>>
>> int main(void)
>> {
>>     Node b = { 2, 0 };
>>     Node a = { 1, &b };
>
> This violates constraint 6.7.8p4 which states that an
> initialiser has to be a constant.

No; the cited paragraph says

    All the expressions in an initializer for an object that
                                          ^^^^^^^^^^^^^^^^^^
    has static storage duration shall be constant expressions
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    or string literals.

(Emphasis mine.)

>> Whose fault is this, the programmer's or the compiler's?
>
> One fault is the programmer's. If fixing that doesn't fix the
> problem, then it appears to be the compiler's.

I think (mind you, I say I "think") it's the compiler's fault.

--
Eric Sosman
es*****@acm-dot-org.invalid
Mar 22 '07 #4

Eric Sosman <es*****@acm-dot-org.invalid> writes:
> Peter Nilsson wrote:
>> Lauri Alanko <l...@iki.fi> wrote:
>>> int main(void)
>>> {
>>>     Node b = { 2, 0 };
>>>     Node a = { 1, &b };
>>
>> This violates constraint 6.7.8p4 which states that an
>> initialiser has to be a constant.
>
> No; the cited paragraph says
>
>     All the expressions in an initializer for an object that
>                                           ^^^^^^^^^^^^^^^^^^
>     has static storage duration shall be constant expressions
>     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
>     or string literals.

But in C89, the corresponding paragraph said:

    All the expressions in an initializer for an object that has static
    storage duration or in an initializer list for an object that has
    aggregate or union type shall be constant expressions.

(or something similar; I'm quoting from a draft.)
--
Comp-sci PhD expected before end of 2007
Seeking industrial or academic position *outside California* in 2008
Mar 22 '07 #5

Ben Pfaff wrote:
> Eric Sosman <es*****@acm-dot-org.invalid> writes:
>> Peter Nilsson wrote:
>>> Lauri Alanko <l...@iki.fi> wrote:
>>>> int main(void)
>>>> {
>>>>     Node b = { 2, 0 };
>>>>     Node a = { 1, &b };
>>>
>>> This violates constraint 6.7.8p4 which states that an
>>> initialiser has to be a constant.
>>
>> No; the cited paragraph says
>>
>>     All the expressions in an initializer for an object that
>>                                           ^^^^^^^^^^^^^^^^^^
>>     has static storage duration shall be constant expressions
>>     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>     or string literals.
>
> But in C89, the corresponding paragraph said:
>
>     All the expressions in an initializer for an object that has static
>     storage duration or in an initializer list for an object that has
>     aggregate or union type shall be constant expressions.
>
> (or something similar; I'm quoting from a draft.)

The same words are in ISO/IEC 9899:1990.

--
pete
Mar 23 '07 #6

On Mar 23, 2:52 am, Lauri Alanko <l...@iki.fi> wrote:
> The following code crashes on Solaris 10 when compiled without
> optimization:
>
> typedef struct Node Node;
>
> struct Node {
>     int val;
>     Node *next;
> };
>
> int main(void)
> {
>     Node b = { 2, 0 };
>     Node a = { 1, &b };
>
>     a = *(a.next);
>
>     return a.val;
> }
>
> What happens is that after a has been written to, and a.next has been
> set to null, a.next is dereferenced again (for some obscure reason).

How do you know?

> Whose fault is this, the programmer's or the compiler's? Initially I
> thought that it's the programmer's fault since there are no sequence
> points between the access to a.next and the writing to a.

There's an ugly clause in the standard that defines what's legal.
Informally, it is legal to both read from and write to a variable
without an intervening sequence point, if and only if you must perform
the read in order to compute the value to be written -- i.e. there is a
temporal relationship.

In this case, it is OK because you cannot dereference a.next without
first evaluating a.next, which in turn requires that you have already
evaluated 'a'.

Mar 23 '07 #7
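
The informal rule described in the reply above is the one the "p = p->next" idiom from the original question relies on: p is read only in order to compute the value that is then stored into p. A minimal sketch, assuming the Node type from the thread; the names sum_list and sum_example are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node Node;

struct Node {
    int val;
    Node *next;
};

/* Hypothetical helper: sums the val members of a NULL-terminated list. */
int sum_list(const Node *p)
{
    int total = 0;

    while (p != NULL) {
        total += p->val;
        /* The old p is read only to compute the new value of p. */
        p = p->next;
    }
    return total;
}

/* Hypothetical helper: builds the two-node list used in the thread. */
int sum_example(void)
{
    Node b, a;

    b.val = 2;
    b.next = NULL;
    a.val = 1;
    a.next = &b;

    /* An empty list sums to zero. */
    assert(sum_list(NULL) == 0);

    return sum_list(&a);
}
```

By contrast, an expression such as "i = i++" modifies i twice with no intervening sequence point, which is the kind of case the clause makes undefined; it is deliberately left out of the sketch.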

On Mar 22, 6:52 am, Lauri Alanko <l...@iki.fi> wrote:
> The following code crashes on Solaris 10 when compiled without
> optimization:
>
> typedef struct Node Node;
>
> struct Node {
>     int val;
>     Node *next;
> };
>
> int main(void)
> {
>     Node b = { 2, 0 };
>     Node a = { 1, &b };
>
>     a = *(a.next);
>
>     return a.val;
> }
>
> What happens is that after a has been written to, and a.next has been
> set to null, a.next is dereferenced again (for some obscure reason).
>
> Whose fault is this, the programmer's or the compiler's? [...]

<snip>

If you're using C89, it's the programmer's fault. Quoting from my
BS/EN 29899:1993 A4 hardcopy:

    6.5.7 Initialization

    Constraints [...]

    All the expressions in an initializer for an object that has static
    storage duration or in an initializer list for an object that has
    aggregate or union type shall be constant expressions.

Obviously the code "Node a = { 1, &b };" does not satisfy this
constraint. In fact, my compiler complains about it when invoked in
strict C89 mode:

    [mark@icepick ~]$ gcc -Wall -O2 foo.c -o foo -ansi -pedantic -std=c89
    foo.c: In function 'main':
    foo.c:14: warning: initializer element is not computable at load time

With the advent of GNU C and C99, the rules have changed. Quoting
9899:1999 TC2 draft N1124:

    6.7.8 Initialization

    Constraints [...]

    4. All the expressions in an initializer for an object that has
    static storage duration shall be constant expressions or string
    literals.

Notice that the constraints have been loosened for aggregate and union
types.

Although I certainly cannot speak for the WG, it appears that one of
the reasons this feature has been adopted in C99 is the prevalent use
of the following GNU C extension (which pre-dates C99):

    [GCC 2.95.3 Manual, Extensions to the C Language Family]

    4.18 Non-Constant Initializers

    As in standard C++, the elements of an aggregate initializer for an
    automatic variable are not required to be constant expressions in GNU
    C.

> Initially I
> thought that it's the programmer's fault since there are no sequence
> points between the access to a.next and the writing to a, but if that
> made the code illegal, how about the following ubiquitous idiom:
>
>     Node* p = &a;
>     p = p->next;
>
> Here, too, we both access p and write to p without sequence points in
> between. What's the difference, or is there any?

The difference is that the undefined behavior is invoked in the
initialization of the type. This undefined behavior is apparently
causing the assignment to crash, which is one of the things that
undefined behavior often does.

In contrast, the pointer object Node* p is not an aggregate type, and
not subject to the same initialization rules that an aggregate or
union type is.

Mark F. Haigh
mf*****@sbcglobal.net
Mar 23 '07 #8
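
The C89 constraint quoted in the reply above concerns initializer lists, so one way to stay within it is to give the nodes static storage duration, where the address of another static object is an address constant. The following is a minimal sketch under that assumption; the names tail, head, and second_value are hypothetical, and the claim that it is acceptable in strict C89 mode is an expectation, not something checked against a particular compiler.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node Node;

struct Node {
    int val;
    Node *next;
};

/* Static storage duration: &tail is an address constant, so these
 * initializer lists contain only constant expressions. */
static Node tail = { 2, NULL };
static Node head = { 1, &tail };

/* Hypothetical helper: works on an automatic copy so that the static
 * objects keep their values between calls. */
int second_value(void)
{
    Node a = head;   /* a single expression of struct type, not a list */

    assert(a.next != NULL);
    a = *(a.next);

    return a.val;
}
```

The other route, assigning each member after the declaration, avoids initializer lists for automatic objects altogether.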

On Mar 22, 2:52 pm, Lauri Alanko <l...@iki.fi> wrote:
> The following code crashes on Solaris 10 when compiled without
> optimization:
>
> typedef struct Node Node;
>
> struct Node {
>     int val;
>     Node *next;
> };
>
> int main(void)
> {
>     Node b = { 2, 0 };
>     Node a = { 1, &b };
>
>     a = *(a.next);
>
>     return a.val;
> }
>
> What happens is that after a has been written to, and a.next has been
> set to null, a.next is dereferenced again (for some obscure reason).
>
> Whose fault is this, the programmer's or the compiler's? Initially I
> thought that it's the programmer's fault since there are no sequence
> points between the access to a.next and the writing to a, but if that
> made the code illegal, how about the following ubiquitous idiom:
>
>     Node* p = &a;
>     p = p->next;
>
> Here, too, we both access p and write to p without sequence points in
> between. What's the difference, or is there any?
>
> Thanks in advance.
>
> Lauri

Since there are complaints about the initialisers (why the hell would
a compiler accept an initialisation if it invokes undefined
behavior?), could you tell us what happens if you write

int main(void)
{
    Node b, a;
    b.val = 2; b.next = 0;
    a.val = 1; a.next = &b;

    a = *(a.next);

    return a.val;
}

Mar 24 '07 #9

christian.bau wrote, On 24/03/07 00:39:

<snip>
> Since there are complaints about the initialisers (why the hell would
> a compiler accept an initialisation if it invokes undefined
> behavior?), could you tell us what happens if you write
<snip>

Look up undefined in a dictionary or the C standard. It means it is not
defined, part of not being defined is that it does not define that a
diagnostic should be produced.
--
Flash Gordon
Mar 24 '07 #10

On Mar 23, 6:19 pm, Flash Gordon <s...@flash-gordon.me.uk> wrote:
> christian.bau wrote, On 24/03/07 00:39:
>
> <snip>
>> Since there are complaints about the initialisers (why the hell would
>> a compiler accept an initialisation if it invokes undefined
>> behavior?), could you tell us what happens if you write
>
> <snip>
>
> Look up undefined in a dictionary or the C standard. It means it is not
> defined, part of not being defined is that it does not define that a
> diagnostic should be produced.

The C Standard Rationale has some interesting things to say:

    3 Terms and Definitions

    25 The terms unspecified behavior, undefined behavior, and
    implementation-defined behavior are used to categorize the result of
    writing programs whose properties the Standard does not, or cannot,
    completely describe. The goal of adopting this categorization is to
    allow a certain variety among implementations which permits quality of
    implementation to be an active force in the marketplace as well as to
    allow certain popular extensions, without removing the cachet of
    conformance to the Standard.
    [...]

Ah, yes. "Quality of implementation". Good-quality implementations
warn the user and try to do something reasonable. Poor-quality
implementations silently produce broken code.

I'd wager that Christian understands the definition of 'undefined'.
His point is that an implementation that cannot warn the user over
such a simple and minor transgression is a bit too DeathStation-ish on
the QoI scale to be allowed to roam free in the wild.

Mark F. Haigh
mf*****@sbcglobal.net

Mar 24 '07 #11

On Mar 22, 9:52 pm, Lauri Alanko <l...@iki.fi> wrote:
> The following code crashes on Solaris 10 when compiled without
> optimization:
> ... int main(void)
> {
>     Node b = { 2, 0 };
>     Node a = { 1, &b };
>     a = *(a.next);
>     return a.val;
> }

Lawyerly types are debating what the compiler *may*
or *must* do, but I'm very curious about what it *did*
do. Please let us see the compiler output
(eg, output of ``cc -S'').

IIRC, Sun's compiler for Sparc would sometimes
(because of pipelining and to save space
in branches) allow an unwilled statement to execute,
but only if it were harmless, and (I thought) only
with optimization. Anyway that shouldn't arise in
your unbranching non-inlined function.

James

Mar 24 '07 #12

Mark F. Haigh wrote, On 24/03/07 07:24:
> On Mar 23, 6:19 pm, Flash Gordon <s...@flash-gordon.me.uk> wrote:
>> christian.bau wrote, On 24/03/07 00:39:
>>
>> <snip>
>>> Since there are complaints about the initialisers (why the hell would
>>> a compiler accept an initialisation if it invokes undefined
>>> behavior?), could you tell us what happens if you write
>> <snip>
>>
>> Look up undefined in a dictionary or the C standard. It means it is not
>> defined, part of not being defined is that it does not define that a
>> diagnostic should be produced.
>
> The C Standard Rationale has some interesting things to say:
>
>     3 Terms and Definitions
>
>     25 The terms unspecified behavior, undefined behavior, and
>     implementation-defined behavior are used to categorize the result of
>     writing programs whose properties the Standard does not, or cannot,
>     completely describe. The goal of adopting this categorization is to
>     allow a certain variety among implementations which permits quality of
>     implementation to be an active force in the marketplace as well as to
>     allow certain popular extensions, without removing the cachet of
>     conformance to the Standard.
>     [...]
>
> Ah, yes. "Quality of implementation". Good-quality implementations
> warn the user and try to do something reasonable. Poor-quality
> implementations silently produce broken code.
>
> I'd wager that Christian understands the definition of 'undefined'.
> His point is that an implementation that cannot warn the user over
> such a simple and minor transgression is a bit too DeathStation-ish on
> the QoI scale to be allowed to roam free in the wild.

In this particular case it could be that it does not warn because it
allows it as an extension which is allowed by what you quote above. So
there might be a very good reason for not producing a warning in default
mode.
--
Flash Gordon
Mar 24 '07 #13

Thanks to Dave and Wolf for informative answers: 6.5#2 indeed seems to
justify both "p = p->next" and "a = *(a.next)", so I can conclude that
this is a compiler bug.

To those interested in the details:

typedef struct Node Node;

struct Node {
    int val;
    Node *next;
};

int main(void)
{
    Node a, b;
    b.val = 2;
    b.next = 0;
    a.val = 1;
    a.next = &b;

    a = *(a.next);

    return a.val;
}

$ uname -a
SunOS xxxxxxxx 5.10 Generic sun4u sparc SUNW,Sun-Fire-V210 Solaris
$ /opt/SUNWspro/bin/cc -g -V -S t.c -o t.s
cc: Sun C 5.8 2005/10/13
acomp: Sun C 5.8 2005/10/13
$ /opt/SUNWspro/bin/cc -g -V -o t t.s
cc: Sun C 5.8 2005/10/13
ld: Software Generation Utilities - Solaris Link Editors: 5.10-1.479
$ ./t
Segmentation Fault

Here's the relevant part from t.s:

! 14 a.next = &b;

add %fp,-20,%l0
st %l0,[%fp-8]

! block 5
..L21:

! 16 a = *(a.next);

ld [%fp-8],%l2
add %fp,-12,%l0
..L_y0:
ld [%l2+0],%l1
st %l1,[%l0+0]
..L_y1:
ld [%l2+4],%l1
st %l1,[%l0+4]
ld [%fp-8],%l0
or %g0,4,%g1
1:
subcc %g1,4,%g1
..L_y2:
ld [%l0+%g1],%l2
bg 1b+4
subcc %g1,4,%g1

The segfault happens in the last ld instruction, since %l0 is zero.
("How do I know?" I use dbx, doh.) The last six instructions don't seem
to make any sense in any case. It's as if there were a dummy *(a.next)
dereference after the assignment was completed. This happens both with
and without -g, but not with -O.

Finally, to the numerous would-be language lawyers who responded: please
try to get your act together. Comp.lang.c must be in a sorry state
nowadays, if you can't find better remarks than "All right, maybe it's
legal _now_, but it's only been legal for seven years. If you'd tried
pulling that trick before then, you'd be in _real_ trouble now!" Somehow
that seems to lack the desired punch...

For what it's worth, Sun cc's man page explicitly says that C99 language
features are supported by default.
Lauri
Mar 26 '07 #14

Lauri Alanko wrote:
>
> Thanks to Dave and Wolf for informative answers: 6.5#2 indeed
> seems to justify both "p = p->next" and "a = *(a.next)", so I can
> conclude that this is a compiler bug.

No you can't.

.... snip ...

> typedef struct Node Node;
>
> struct Node {
>     int val;
>     Node *next;
> };
>
> int main(void)
> {
>     Node a, b;
>     b.val = 2;
>     b.next = 0;
>     a.val = 1;
>     a.next = &b;
>
>     a = *(a.next);
>
>     return a.val;
> }

If you follow the action, you will find you are dereferencing a
NULL pointer. Boom.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Mar 27 '07 #15

In article <46***************@yahoo.com>,
CBFalconer <cb********@maineline.net> wrote:
> Lauri Alanko wrote:
>>
>> Thanks to Dave and Wolf for informative answers: 6.5#2 indeed
>> seems to justify both "p = p->next" and "a = *(a.next)", so I can
>> conclude that this is a compiler bug.
>
> No you can't.
>
> ... snip ...
>> typedef struct Node Node;
>>
>> struct Node {
>>     int val;
>>     Node *next;
>> };
>>
>> int main(void)
>> {
>>     Node a, b;
>>     b.val = 2;
>>     b.next = 0;
>>     a.val = 1;
>>     a.next = &b;
>>
>>     a = *(a.next);
>>
>>     return a.val;
>> }
>
> If you follow the action, you will find you are dereferencing a
> NULL pointer. Boom.

Where?

dave

--
Dave Vandervies dj******@csclub.uwaterloo.ca

I forget the details. It seemed pretty clever when I was about 9 years old.
--Ben Ketcham in comp.lang.c
Mar 27 '07 #16

Dave Vandervies wrote:
> CBFalconer <cb********@maineline.net> wrote:
>> Lauri Alanko wrote:
>>>
>>> Thanks to Dave and Wolf for informative answers: 6.5#2 indeed
>>> seems to justify both "p = p->next" and "a = *(a.next)", so I can
>>> conclude that this is a compiler bug.
>>
>> No you can't.
>>
>> ... snip ...
>>> typedef struct Node Node;
>>>
>>> struct Node {
>>>     int val;
>>>     Node *next;
>>> };
>>>
>>> int main(void)
>>> {
>>>     Node a, b;        /*1*/
>>>     b.val = 2;        /*2*/
>>>     b.next = 0;       /*3*/
>>>     a.val = 1;        /*4*/
>>>     a.next = &b;      /*5*/
>>>
>>>     a = *(a.next);    /*6*/
>>>
>>>     return a.val;     /*7*/ /* ids added - cbf */
>>> }
>>
>> If you follow the action, you will find you are dereferencing a
>> NULL pointer. Boom.
>
> Where?

Now I don't see it myself. 6 sets a = b, so a.next is NULL. Yet
a.val is 2. Now it looks like a bug to me.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Mar 27 '07 #17
