About casts (and pointers)

Some time ago, in a discussion here in comp.lang.c, I learnt that it's
better not to use a (sometype **) where a (void **) is expected (using
a cast). Part of the discussion boiled down to the rule: if I cast a
(sometype **) to a (void **) I am making a number of assumptions about
the implementation's (void **) representation and length. Specifically,
if I do the above cast I'm assuming that a (sometype **) and a (void
**) have the same size and representation, and this might not always be
true. Ok, all clear up to this point.

But now my question is: does the above rule generalize to *every*
possible cast (especially those where pointers are involved)? Is every
explicit cast unsafe? (Here I'm not talking about conversions to/from
void *, which I know are safe if performed implicitly by the language.
I also know that casts are better avoided unless strictly needed. But
I'm curious to know how things work).

As an example, look at the widely used cast (struct sockaddr_in *) to
(struct sockaddr *). Is that safe?
Seems to me that, as in the case of (void **), that cast assumes that
the size and representation of a (struct sockaddr *) are the same as those of a
(struct sockaddr_in *). Should this be done using an intermediate (void
*)?

What is the general rule, if any? Or, in general, what can be said and
assumed about the converted object?

Thanks and sorry for the possibly silly questions.

Nov 14 '05 #1


su****@katamail.com wrote:
Some time ago, in a discussion here in comp.lang.c, I learnt that it's
better not to use a (sometype **) where a (void **) is expected (using
a cast). Part of the discussion boiled down to the rule: if I cast a
(sometype **) to a (void **) I am making a number of assumptions about
the implementation's (void **) representation and length. Specifically,
if I do the above cast I'm assuming that a (sometype **) and a (void
**) have the same size and representation, and this might not always be
true. Ok, all clear up to this point.
Almost clear, but still slightly murky. When you (try
to) use a `sometype **' and a `void **' interchangeably, the
assumption is that `sometype *' (one asterisk) and `void *'
have the same representation.
But now my question is: does the above rule generalize to *every*
possible cast (especially those where pointers are involved)? Is every
explicit cast unsafe? (Here I'm not talking about conversions to/from
void *, which I know are safe if performed implicitly by the language.
I also know that casts are better avoided unless strictly needed. But
I'm curious to know how things work).
Some explicit casts are safe, some are not. For example,
any data pointer can be converted to `unsigned char *' and
then used to access the individual bytes of the original object;
this is perfectly safe (although what you actually do to the
bytes might not be).
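
To make the `unsigned char *' case concrete, here is a minimal sketch of inspecting an object's bytes. The helper name count_zero_bytes is made up for illustration and does not come from any library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the zero-valued bytes in any object.
   Converting the object's address to unsigned char * and reading
   through it is allowed for every object type. */
size_t count_zero_bytes(const void *obj, size_t size)
{
    const unsigned char *bytes = obj;   /* void * converts implicitly */
    size_t zeros = 0;

    assert(obj != NULL);
    for (size_t i = 0; i < size; i++) {
        if (bytes[i] == 0)
            zeros++;
    }
    return zeros;
}
```

A call such as count_zero_bytes(&some_object, sizeof some_object) is fine for any object type; which bytes turn out to be zero depends on the implementation's representation.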

Honesty is the best policy. If you've got an object
of type `sometype', use a `sometype *' to point to it.
If you've got an object of type `sometype *', point to it
with a `sometype **'. If that's not possible (for example,
when writing a qsort() comparison function), then converting
a data pointer to `void *' (one asterisk) and back is all
right. Other pointer conversions should be viewed with
suspicion, although not necessarily with horror.
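
The qsort() case mentioned above might look like the following sketch. The wrapper sort_ints is a hypothetical name added only for illustration; compare_ints converts the `void *' arguments back to the element's real pointer type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort() hands the comparison function const void * arguments; converting
   them back to the element's real pointer type is the sanctioned use. */
int compare_ints(const void *lhs, const void *rhs)
{
    const int *a = lhs;   /* implicit conversion, no cast needed in C */
    const int *b = rhs;

    return (*a > *b) - (*a < *b);   /* avoids the overflow risk of *a - *b */
}

/* Hypothetical wrapper so that callers never handle a void * themselves. */
void sort_ints(int *values, size_t count)
{
    assert(values != NULL);
    qsort(values, count, sizeof *values, compare_ints);
}
```

Note that no cast appears anywhere: both directions of the `void *' conversion happen implicitly.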
As an example, look at the widely used cast (struct sockaddr_in *) to
(struct sockaddr *). Is that safe?
Seems to me that, as in the case of (void **), that cast assumes that
the size and representation of a (struct sockaddr *) are the same as those of a
(struct sockaddr_in *). Should this be done using an intermediate (void
*)?


Ah, yes, well, you've encountered a special case. If
you've got two struct types that begin with the same sequence
of elements, then you can use a pointer to either type to get
at those initial elements. A `struct sockaddr' (presumably)
starts with a few elements that let the called function
decide whether it's been given a `struct sockaddr_in' or a
`struct sockaddr_something_else', and thereafter the called
function can convert the pointer to the proper struct type
so as to access the remaining elements.

Fans of object-oriented languages sneer at this "poor
man's polymorphism," and some of their sneers are perhaps
justified: it works, but it's fragile in the sense that the
compiler usually cannot warn you about simple errors. If you
must use an API that indulges in this sort of thing -- well,
that's what the API demands, and you haven't a lot of choice.
When designing your own functions, though, I'd suggest you avoid
this practice unless you find compelling reasons to adopt it.

--
Er*********@sun.com

Nov 14 '05 #2
In article <d4**********@news1brm.Central.Sun.COM>,
Eric Sosman <er*********@sun.com> wrote:
Ah, yes, well, you've encountered a special case. If
you've got two struct types that begin with the same sequence
of elements, then you can use a pointer to either type to get
at those initial elements. A `struct sockaddr' (presumably)
starts with a few elements that let the called function
decide whether it's been given a `struct sockaddr_in' or a
`struct sockaddr_something_else', and thereafter the called
function can convert the pointer to the proper struct type
so as to access the remaining elements.


Strictly speaking, this is only true if you define a union containing a
"struct sockaddr_in" and a "struct sockaddr_something_else", and the
compiler must have seen the declaration of that union before your code
handles "struct sockaddr_in" and "struct sockaddr_something_else"
interchangeably.

In practice, it will always work, except perhaps on the DeathStation 9000,
because it would be quite difficult for a compiler to make it work where
the standard forces it to while breaking it in all other cases.
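
As a rough sketch of the union arrangement described here: the types addr_inet, addr_local and any_addr, and the tag values, are hypothetical stand-ins, not the real socket structures, whose layout is defined by the platform.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the socket address structures; every
   variant starts with the same member, the "common initial sequence". */
struct addr_inet  { int family; unsigned short port; unsigned char host[4]; };
struct addr_local { int family; char path[16]; };

/* With this union declared and visible, the tag may be read through
   either member, whichever one was stored last. */
union any_addr {
    struct addr_inet  inet;
    struct addr_local local;
};

enum { FAMILY_INET = 1, FAMILY_LOCAL = 2 };   /* made-up tag values */

int family_of(const union any_addr *addr)
{
    assert(addr != NULL);
    return addr->local.family;   /* valid even if addr->inet was stored */
}
```

The called function can then branch on the tag and use the matching union member for the remaining fields.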
Nov 14 '05 #3
Christian Bau wrote:
In article <d4**********@news1brm.Central.Sun.COM>,
Eric Sosman <er*********@sun.com> wrote:
Ah, yes, well, you've encountered a special case. If
you've got two struct types that begin with the same sequence
of elements, then you can use a pointer to either type to get
at those initial elements. A `struct sockaddr' (presumably)
starts with a few elements that let the called function
decide whether it's been given a `struct sockaddr_in' or a
`struct sockaddr_something_else', and thereafter the called
function can convert the pointer to the proper struct type
so as to access the remaining elements.
Strictly speaking, this is only true if you define a union containing a
"struct sockaddr_in" and a "struct sockaddr_something_else", and the
compiler must have seen the declaration of that union before your code
handles "struct sockaddr_in" and "struct sockaddr_something_else"
interchangeably.


Yeah, I thought about the "in a union" thing when composing
my post, but decided to ignore it. As far as I can tell, the
compiler could only behave perversely if it could somehow prove
that the two structs could never possibly appear as members
of the same union, even in translation units the compiler has not
yet seen that might or might not be linked into the same final
program. I believe such a proof is beyond the capabilities of
current compilers, and is likely to remain so until I die and no
longer care about it ...
In practice, it will always work except perhaps on the DeathStation 9000
because it would be quite difficult for a compiler to make it work when
the compiler is forced to make it work but not in other cases.


Perversity is always a possibility. It's not very marketable,
though, outside the realm of popular "music."

--
Eric Sosman
es*****@acm.org
Nov 14 '05 #4
su****@katamail.com wrote:
Some time ago, in a discussion here in comp.lang.c, I learnt that it's better not to use a (sometype **) where a (void **) is expected (using a cast). Part of the discussion boiled down to the rule: if I cast a
(sometype **) to a (void **) I am making a number of assumptions about the implementation's (void **) representation and length. Specifically, if I do the above cast I'm assuming that a (sometype **) and a (void
**) have the same size and representation, and this might not always be true. Ok, all clear up to this point.

But now my question is: does the above rule generalize to *every*
possible cast (especially those where pointers are involved)? Is every explicit cast unsafe? (Here I'm not talking about conversions to/from
void *, which I know are safe if performed implicitly by the language.

The standard defines exact width integer types [u]intN_t in a strict
way saying that there must be no padding etc. As a result, IMO/AFAIK
it's safe to convert a pointer from one [u]intN_t type to a different
[u]intM_t type. For example, the following is safe:

uint32_t * a; ...
uint8_t * b = a;
b[3] = b[5];
a = b+4;
a[7] = a[9];

(On the other hand, the [u]intN_t types need not be supported of
course.)

Daniel Vallstrom

Nov 14 '05 #5
"Daniel Vallstrom" <da**************@gmail.com> writes:
[...]
The standard defines exact width integer types [u]intN_t in a strict
way saying that there must be no padding etc.
Yes.
As a result, IMO/AFAIK
it's safe to convert a pointer from one [u]intN_t type to a different
[u]intM_t type.
How does that follow?
For example, the following is safe:

uint32_t * a; ...
uint8_t * b = a;


There is no implicit conversion from uint32_t* to uint8_t*, so you
need an explicit cast here (though some compilers may allow it).

uint8_t is a special case because, if it exists, it's very likely
(certain?) to be a typedef for unsigned char. But let's change the
example to:

uint32_t arr[10];
uint32_t *a = arr;
uint16_t *b = a;

You can safely convert a pointer to one type to a pointer to another
type *and back again* if the intermediate pointer is correctly
aligned; if it isn't, the conversion invokes undefined behavior. It's
likely that int32_t has stricter alignment requirements than int16_t;
if so, converting from int16_t* to int32_t* can invoke undefined
behavior. (It's possible, but unlikely in reality, that int16_t has
stricter alignment requirements than int32_t.)

Even if the conversion is allowed, the result isn't necessarily going
to be useful.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 14 '05 #6
Keith Thompson wrote:
"Daniel Vallstrom" <da**************@gmail.com> writes:
[...]
The standard defines exact width integer types [u]intN_t in a strict way saying that there must be no padding etc.
Yes.
As a result, IMO/AFAIK it's safe to convert a pointer from one [u]intN_t type to a different [u]intM_t type.


How does that follow?


It doesn't. (It would if pointers were addresses.)

For example, the following is safe:

uint32_t * a; ...
uint8_t * b = a;


There is no implicit conversion from uint32_t* to uint8_t*, so you
need an explicit cast here (though some compilers may allow it).


Right.
uint8_t is a special case because, if it exists, it's very likely
(certain?) to be a typedef for unsigned char.
Right. Change 8 to 16.
But let's change the
example to:
Let's not ;) I'll instead apologise for a very poor post riddled
with errors and try again with a proper example:

This, I believe, should be safe --- even though it's rather pointless:

int16_t * a = malloc( sizeof *a * 10 ); ...
int32_t * b = (int32_t*)(a+2);
int16_t * c = (int16_t*)b - 2;
assert( a == c );

You can safely convert a pointer to one type to a pointer to another
type *and back again* if the intermediate pointer is correctly
aligned; if it isn't, the conversion invokes undefined behavior. It's likely that int32_t has stricter alignment requirements than int16_t;
if so, converting from int16_t* to int32_t* can invoke undefined
behavior. (It's possible, but unlikely in reality, that int16_t has
stricter alignment requirements than int32_t.)

Even if the conversion is allowed, the result isn't necessarily going
to be useful.


Right. Thanks for all the corrections.
Daniel Vallstrom

Nov 14 '05 #7
On 19 Apr 2005 04:31:03 -0700, Daniel Vallstrom
<da**************@gmail.com> wrote:
This, I believe, should be safe --- even though it's rather pointless:

int16_t * a = malloc( sizeof *a * 10 ); ...
int32_t * b = (int32_t*)(a+2);
int16_t * c = (int16_t*)b - 2;
assert( a == c );


No, because the intX_t types are guaranteed only to be of a type with at
least X bits. An int16_t could have 16 bits and an int32_t 64 bits, for
instance. Or an int32_t might have to be aligned at 8 byte boundaries
but an int16_t only need to be aligned at 1 byte boundaries. Even
setting a pointer to the type might result in a trap.

A pathological but possible implementation:

char (int8_t) is 12 bits (1 byte)
short (int16_t) is 24 bits (2 bytes), aligned on a 2-byte boundary
int (int32_t) is 36 bits (3 bytes), aligned on a 3-byte boundary.

Your code would generate b as (char*)a + 4, which is not on the 3-byte
boundary required for an int. The compiler could legitimately round the
address up (or down) when converting to an int32_t* (it could also
launch ICBMs at the White House, cause a plague of frogs or any other
undefined behaviour) and converting it back to an int16_t* could do the
same...

You can convert a pointer to any type to a pointer to any other type,
providing that its alignment is satisfactory. I can find no guarantee
that doing pointer arithmetic on it and converting it back will result
in anything defined.

Chris C
Nov 14 '05 #8
Chris Croughton <ch***@keristor.net> wrote:
On 19 Apr 2005 04:31:03 -0700, Daniel Vallstrom
<da**************@gmail.com> wrote:
This, I believe, should be safe --- even though it's rather pointless:

int16_t * a = malloc( sizeof *a * 10 ); ...
int32_t * b = (int32_t*)(a+2);
int16_t * c = (int16_t*)b - 2;
assert( a == c );


No, because the intX_t types are guaranteed only to be of a type with at
least X bits. An int16_t could have 16 bits and an int32_t 64 bits, for
instance. Or an int32_t might have to be aligned at 8 byte boundaries
but an int16_t only need to be aligned at 1 byte boundaries. Even
setting a pointer to the type might result in a trap.


Nope.

# The typedef name intN_t designates a signed integer type with width N,
# no padding bits, and a two’s complement representation. Thus, int8_t
# denotes a signed integer type with a width of exactly 8 bits.

(7.18.1.1#1)

You're thinking of int_leastN_t.

Richard
Nov 14 '05 #9
Eric Sosman wrote:
su****@katamail.com wrote:
As an example, look at the widely used cast (struct sockaddr_in *) to (struct sockaddr *). Is that safe?
Seems to me that, as in the case of (void **), that cast assumes that the size and representation of a (struct sockaddr *) are the same as those of a (struct sockaddr_in *). Should this be done using an intermediate (void *)?


Ah, yes, well, you've encountered a special case. If
you've got two struct types that begin with the same sequence
of elements, then you can use a pointer to either type to get
at those initial elements. A `struct sockaddr' (presumably)
starts with a few elements that let the called function
decide whether it's been given a `struct sockaddr_in' or a
`struct sockaddr_something_else', and thereafter the called
function can convert the pointer to the proper struct type
so as to access the remaining elements.


To add another question to this special case:
the standard guarantees in 6.2.5#26 that (struct sockaddr *) and
(struct sockaddr_in *) are of the same representation and alignment,
which would allow us to assume that (struct sockaddr **) and (struct
sockaddr_in **) are portably convertible into each other (and usable
afterwards) as they are pointing to mutually correctly aligned
locations. Is this reasoning correct?

Mark

Nov 14 '05 #10


Mark Piffer wrote:
Eric Sosman wrote:
su****@katamail.com wrote:
As an example, look at the widely used cast (struct sockaddr_in *) to
(struct sockaddr *). Is that safe?
Seems to me that, as in the case of (void **), that cast assumes that
the size and representation of a (struct sockaddr *) are the same as those of a
(struct sockaddr_in *). Should this be done using an intermediate (void
*)?


Ah, yes, well, you've encountered a special case. If
you've got two struct types that begin with the same sequence
of elements, then you can use a pointer to either type to get
at those initial elements. A `struct sockaddr' (presumably)
starts with a few elements that let the called function
decide whether it's been given a `struct sockaddr_in' or a
`struct sockaddr_something_else', and thereafter the called
function can convert the pointer to the proper struct type
so as to access the remaining elements.

To add another question to this special case:
the standard guarantees in 6.2.5#26 that (struct sockaddr *) and
(struct sockaddr_in *) are of the same representation and alignment,
which would allow us to assume that (struct sockaddr **) and (struct
sockaddr_in **) are portably convertible into each other (and usable
afterwards) as they are pointing to mutually correctly aligned
locations. Is this reasoning correct?


The conclusion is correct (they are interconvertible),
but I think the reasoning is faulty. It's not that the target
structs have the same alignment requirement -- they needn't --
but that the representations of the two types of struct pointer
are identical. At "the bare bits level," the conversion from
one type to the other is therefore a no-op, hence no information
is lost and when the pointer is converted back again it still
compares equal to the original and properly points to the same
target.

There are a bunch of special cases about pointers, mostly
(I think) to protect pre-Standard code -- a Standard that broke
a significant fraction of that large amount of code would have
had difficulty gaining acceptance! Off the top of my head:

- Qualified and unqualified versions of pointers to the
same type have the same representation. Thus, `int*'
and `const int*' and `volatile int*' ... look alike.

- `void*' and `char*' and `unsigned char*' and `signed char*'
share the same representation.

- Pointers to all kinds of structs look alike.

- Pointers to all kinds of unions look alike.

- A pointer to a struct can be converted to and from a
pointer to the struct's first element (if it isn't a
bit field) safely.

- A pointer to a union can be converted to and from a
pointer to any element of the union (except bit fields)
safely.

- There's the above-mentioned rule about structs with
identical initial sequences of elements, although (as
Christian Bau mentioned) there are a few additional
conditions on this one.

- ... and there may be a few others I can't think of at
the moment.

Still and all, honesty remains the best policy whenever you
can get away with it. Point to T objects with T* pointers and
you're sure to be safe. Always.
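
One of the special cases listed above, converting between a struct pointer and a pointer to its first element, could be sketched like this. The types shape and circle and both helper functions are hypothetical, chosen only to illustrate the rule.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "base" and "derived" types for illustration. */
struct shape  { int kind; };
struct circle { struct shape base; double radius; };   /* base comes first */

enum { KIND_CIRCLE = 1 };

/* A pointer to a struct, suitably converted, points to its first member. */
struct shape *as_shape(struct circle *c)
{
    return &c->base;            /* no cast needed in this direction */
}

/* Going back is valid only if the shape really is the first member
   of a struct circle; the tag check guards against simple mistakes. */
struct circle *as_circle(struct shape *s)
{
    assert(s != NULL && s->kind == KIND_CIRCLE);
    return (struct circle *)s;
}
```

The cast in as_circle is the fragile part: the compiler cannot verify that the shape is embedded in a circle, which is why the sketch checks a tag first.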

--
Er*********@sun.com

Nov 14 '05 #11
On Mon, 18 Apr 2005 15:05:14 -0700, sunglo wrote:
Some time ago, in a discussion here in comp.lang.c, I learnt that it's
better not to use a (sometype **) where a (void **) is expected (using
a cast). Part of the discussion boiled down to the rule: if I cast a
(sometype **) to a (void **) I am making a number of assumptions about
the implementation's (void **) representation and length.
That's not the real problem (or at least only half of the problem).
Casting something ** to void ** gives the compiler the chance to change
the representation of the result to something appropriate for a void **
value. There's no guarantee that void ** can represent the result, so
there is a possibility of failure, but this can work even if something **
and void ** have different representations.

The bigger problem is what you do with the void ** value once you have it.
You probably want to dereference it to get a void * value (trying to use
void ** as some sort of "generic" pointer to pointer). This is where
things get very nasty. Your sometype ** pointer is presumably pointing at
a sometype * pointer. If you dereference the void ** value you are trying
to reinterpret the something * pointer as if it is a void * pointer. This
would be a direct reinterpretation of the object representation which
would fail if the representations are different.
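
A minimal sketch of how to avoid that reinterpretation, assuming a generic routine that wants a void **. Both store_ptr and retarget are hypothetical names: the idea is to hand the routine the address of a genuine void * object and convert the value afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generic routine that stores an address through a void **. */
void store_ptr(void **slot, void *value)
{
    assert(slot != NULL);
    *slot = value;
}

/* Portable way to let store_ptr() update an int *: hand it the address of
   a genuine void * object, then convert the stored value back.
   The non-portable alternative would be store_ptr((void **)&p, ...). */
int *retarget(int *current, int *replacement)
{
    void *tmp = current;             /* int * -> void *, value converted */
    store_ptr(&tmp, replacement);    /* &tmp really has type void ** */
    return tmp;                      /* void * -> int *, value converted */
}
```

Each conversion here acts on a pointer value, so the implementation is free to adjust the representation; nothing is read through an lvalue of the wrong pointer type.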
Specifically,
if I do the above cast I'm assuming that a (sometype **) and a (void
**) have the same size and representation, and this might not always be
true. Ok, all clear up to this point.
They don't have to, but something * and void * probably do.
But now my question is: does the above rule generalize to *every*
possible cast (especially those where pointers are involved)? Is every
explicit cast unsafe? (Here I'm not talking about conversions to/from
void *, which I know are safe if performed implicitly by the language.
I also know that casts are better avoided unless strictly needed. But
I'm curious to know how things work).
Conversion to and from pointers to character types is as safe as void *.
As an example, look at the widely used cast (struct sockaddr_in *) to
(struct sockaddr *). Is that safe?
As others have pointed out it would be difficult for an implementation to
make conversion between pointers to structure types fail. But that's not
really the correct way of looking at this. If we assume that the types you
mention are part of an implementation provided library that adheres to a
specification that requires such conversions to work (i.e. as an extension
to C) then those implementations will have to make sure that it works
irrespective of what C itself guarantees.

So, yes, this is safe. But not because C says that it is safe.

From the Unix world you just have to look at dlsym() for something far
more evil (mixing function and object pointers), but the same
considerations apply.
Seems to me that, as in the case of (void **), that cast assumes that
the size and representation of a (struct sockaddr *) are the same as those of a
(struct sockaddr_in *). Should this be done using an intermediate (void
*)?


Again, when you convert between two types those types don't have to have
the same representation, the conversion can make any necessary
adjustments, except that this doesn't guarantee that the value is
representable in the target type.

Lawrence


Nov 14 '05 #12
On Tue, 19 Apr 2005 15:14:52 GMT, Richard Bos
<rl*@hoekstra-uitgeverij.nl> wrote:
Chris Croughton <ch***@keristor.net> wrote:
On 19 Apr 2005 04:31:03 -0700, Daniel Vallstrom
<da**************@gmail.com> wrote:
> This, I believe, should be safe --- even though it's rather pointless:
>
> int16_t * a = malloc( sizeof *a * 10 ); ...
> int32_t * b = (int32_t*)(a+2);
> int16_t * c = (int16_t*)b - 2;
> assert( a == c );


No, because the intX_t types are guaranteed only to be of a type with at
least X bits. An int16_t could have 16 bits and an int32_t 64 bits, for
instance. Or an int32_t might have to be aligned at 8 byte boundaries
but an int16_t only need to be aligned at 1 byte boundaries. Even
setting a pointer to the type might result in a trap.


Nope.

# The typedef name intN_t designates a signed integer type with width N,
# no padding bits, and a two’s complement representation. Thus, int8_t
# denotes a signed integer type with a width of exactly 8 bits.

(7.18.1.1#1)

You're thinking of int_leastN_t.


So I am, the intX_t ones are the ones which are not useful because they
might not exist at all. However, the alignment issues still exist. An
int32_t might only have 32 bits usable but take up 8 bytes because of
alignment requirements.

Chris C
Nov 14 '05 #13
"Daniel Vallstrom" <da**************@gmail.com> writes:
[...]
This, I believe, should be safe --- even though it's rather pointless:

int16_t * a = malloc( sizeof *a * 10 ); ...
int32_t * b = (int32_t*)(a+2);
int16_t * c = (int16_t*)b - 2;
assert( a == c );


I suspect it would be safe under any non-perverse implementation, but
I don't think the standard guarantees it. As Chris Croughton points
out, the standard guarantees that you can convert from int16_t* to
int32_t* and back to int16_t* (assuming the intermediate pointer is
properly aligned), but performing arithmetic on the intermediate
pointer before converting it back voids the warranty.

Note that if a were a pointer to a declared object rather than to a
chunk of memory allocated by malloc(), there could be alignment
problems.

I'm sure this breaks on the mythical DS9K; it would be interesting to
see how.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 14 '05 #14
Chris Croughton <ch***@keristor.net> writes:
[...]
So I am, the intX_t ones are the ones which are not useful because they
might not exist at all. However, the alignment issues still exist. An
int32_t might only have 32 bits usable but take up 8 bytes because of
alignment requirements.


I don't think that's possible. If an integer type with a width of 32
bits requires 64 bits of storage, then it has 32 padding bits and
isn't eligible to be called int32_t.

It's still likely that int16_t and int32_t have different alignment
requirements, and even possible (but not likely) that int16_t has
stricter alignment requirements than int32_t.
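
The size argument above can be checked mechanically. This is only a sketch: it assumes the implementation provides int16_t and int32_t at all (both are optional), and check_exact_width_sizes is a made-up name.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Checks what "width N, no padding bits" implies for storage size:
   every bit of the object representation is a value or sign bit. */
void check_exact_width_sizes(void)
{
    assert(sizeof(int16_t) * CHAR_BIT == 16);
    assert(sizeof(int32_t) * CHAR_BIT == 32);
    assert(sizeof(int32_t) == 2 * sizeof(int16_t));
}
```

None of this says anything about alignment, which is a separate property from size.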

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 14 '05 #15
In article <lf********************@comcast.com>,
Eric Sosman <es*****@acm-dot-org.invalid> wrote:
Christian Bau wrote:
In article <d4**********@news1brm.Central.Sun.COM>,
Eric Sosman <er*********@sun.com> wrote:
Ah, yes, well, you've encountered a special case. If
you've got two struct types that begin with the same sequence
of elements, then you can use a pointer to either type to get
at those initial elements. A `struct sockaddr' (presumably)
starts with a few elements that let the called function
decide whether it's been given a `struct sockaddr_in' or a
`struct sockaddr_something_else', and thereafter the called
function can convert the pointer to the proper struct type
so as to access the remaining elements.


Strictly speaking, this is only true if you define a union containing a
"struct sockaddr_in" and a "struct sockaddr_something_else", and the
compiler must have seen the declaration of that union before your code
handles "struct sockaddr_in" and "struct sockaddr_something_else"
interchangeably.


Yeah, I thought about the "in a union" thing when composing
my post, but decided to ignore it. As far as I can tell, the
compiler could only behave perversely if it could somehow prove
that the two structs could never possibly appear as members
of the same union, even in translation units the compiler has not
yet seen that might or might not be linked into the same final
program. I believe such a proof is beyond the capabilities of
current compilers, and is likely to remain so until I die and no
longer care about it ...


Well, the compiler is "in practice" forced to use the same layout for
all structs starting with members of the same type (for example, for all
structs starting with a short and an int, offsetof gives the same result
for the second struct member).

However, the compiler is free to assume that two pointers to different
struct types cannot access the same memory without undefined behavior.
So a program that contains only

#include <stdio.h>
#include <stdlib.h>

struct s1 { short x; int y; };
struct s2 { short a; int b; double c; };

void f (struct s1* p1, struct s2* p2) {
    p1->y = 0;
    p2->b = 1;

    // Here the compiler can assume that because the types
    // of *p1 and *p2 are different, p1 and p2 cannot point
    // to the same memory without undefined behavior.
    if (p1->y == 1) printf ("It worked!\n");
    if (p1->y == 0) printf ("It didn't work!\n");
}

int main (void) {
    void* p = malloc (sizeof (struct s1) + sizeof (struct s2));
    if (p) f (p, p);
    free (p);
    return 0;
}

could print either "It worked!" or "It didn't work!". The compiler can
assume that the assignment to p2->b cannot change p1->y (unless there is
undefined behavior).

You need a union containing both structs to avoid the undefined behavior.
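
A sketch of the union variant, as I read the common-initial-sequence rule. The union name s1_or_s2 and the function store_then_inspect are made up for illustration; the two struct types are the ones from the example above.

```c
#include <assert.h>
#include <stddef.h>

struct s1 { short x; int y; };
struct s2 { short a; int b; double c; };

/* Hypothetical union making the overlap visible to the compiler. */
union s1_or_s2 {
    struct s1 first;
    struct s2 second;
};

/* Both accesses go through the union, so the compiler has to assume the
   store to second.b can change first.y (same position in the common
   initial sequence of the two structs). */
int store_then_inspect(union s1_or_s2 *u)
{
    assert(u != NULL);
    u->first.y = 0;
    u->second.b = 1;
    return u->first.y;
}
```

With the accesses written this way, the function should return 1 rather than leaving the outcome to the optimizer's aliasing assumptions.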
Nov 14 '05 #16
Keith Thompson wrote:
"Daniel Vallstrom" writes:

int16_t * a = malloc( sizeof *a * 10 ); ...
int32_t * b = (int32_t*)(a+2);
int16_t * c = (int16_t*)b - 2;
assert( a == c );


I suspect it would be safe under any non-perverse implementation, but
I don't think the standard guarantees it.

I'm sure this breaks on the mythical DS9K; it would be interesting to
see how.


Let's try to figure it out.
'a' is guaranteed to be correctly aligned for int32_t, since it's the
result of a malloc.
It seems to be guaranteed that sizeof(int32_t) == 2 * sizeof(int16_t).
So (a+2) must be the same as (int32_t *)a + 1, which must also be
correctly aligned for int32_t.
Looks good to me so far...

Is it guaranteed that you can dereference any of these pointers,
without triggering a trap representation or something?

Nov 14 '05 #17
In article <sl******************@ccserver.keris.net>,
Chris Croughton <ch***@keristor.net> wrote:
On Tue, 19 Apr 2005 15:14:52 GMT, Richard Bos
<rl*@hoekstra-uitgeverij.nl> wrote:
Chris Croughton <ch***@keristor.net> wrote:
On 19 Apr 2005 04:31:03 -0700, Daniel Vallstrom
<da**************@gmail.com> wrote:

> This, I believe, should be safe --- even though it's rather pointless:
>
> int16_t * a = malloc( sizeof *a * 10 ); ...
> int32_t * b = (int32_t*)(a+2);
> int16_t * c = (int16_t*)b - 2;
> assert( a == c );

No, because the intX_t types are guaranteed only to be of a type with at
least X bits. An int16_t could have 16 bits and an int32_t 64 bits, for
instance. Or an int32_t might have to be aligned at 8 byte boundaries
but an int16_t only need to be aligned at 1 byte boundaries. Even
setting a pointer to the type might result in a trap.


Nope.

# The typedef name intN_t designates a signed integer type with width N,
# no padding bits, and a two’s complement representation. Thus, int8_t
# denotes a signed integer type with a width of exactly 8 bits.

(7.18.1.1#1)

You're thinking of int_leastN_t.


So I am, the intX_t ones are the ones which are not useful because they
might not exist at all. However, the alignment issues still exist. An
int32_t might only have 32 bits usable but take up 8 bytes because of
alignment requirements.


A type with 32 bits usable but taking up 8 bytes would have at least 32
padding bits. int32_t doesn't have any padding bits, so its size cannot
be 8 bytes.
Nov 14 '05 #18
In article <ln************@nuthaus.mib.org>,
Keith Thompson <ks***@mib.org> wrote:
"Daniel Vallstrom" <da**************@gmail.com> writes:
[...]
This, I believe, should be safe --- even though it's rather pointless:

int16_t * a = malloc( sizeof *a * 10 ); ...
int32_t * b = (int32_t*)(a+2);
int16_t * c = (int16_t*)b - 2;
assert( a == c );
I suspect it would be safe under any non-perverse implementation, but
I don't think the standard guarantees it. As Chris Croughton points
out, the standard guarantees that you can convert from int16_t* to
int32_t* and back to int16_t* (assuming the intermediate pointer is
properly aligned), but performing arithmetic on the intermediate
pointer before converting it back voids the warranty.


I think this code is safe (but needs to be handled very carefully),
because sizeof (int32_t) must be twice the sizeof (int16_t), so (int32_t
*) (a+2) and ((int32_t *) a) + 1 must be the same pointer, and therefore
(int32_t *) (a+2) must be properly aligned.
Note that if a were a pointer to a declared object rather than to a
chunk of memory allocated by malloc(), there could be alignment
problems.

I'm sure this breaks on the mythical DS9K; it would be interesting to
see how.

Nov 14 '05 #19
In article <11*********************@z14g2000cwz.googlegroups.com>,
"Old Wolf" <ol*****@inspire.net.nz> wrote:
Keith Thompson wrote:
"Daniel Vallstrom" writes:

int16_t * a = malloc( sizeof *a * 10 ); ...
int32_t * b = (int32_t*)(a+2);
int16_t * c = (int16_t*)b - 2;
assert( a == c );


I suspect it would be safe under any non-perverse implementation, but
I don't think the standard guarantees it.

I'm sure this breaks on the mythical DS9K; it would be interesting to
see how.


Let's try to figure it out.
'a' is guaranteed to be correctly aligned for int32_t, since it's the
result of a malloc.
It seems to be guaranteed that sizeof(int32_t) == 2 * sizeof(int16_t).
So (a+2) must be the same as (int32_t *)a + 1, which must also be
correctly aligned for int32_t.
Looks good to me so far...

Is it guaranteed that you can dereference any of these pointers,
without triggering a trap representation or something?


No. We know that a[2] and a[3] occupy the same space as b [0]. But if
you wrote

a [2] = 100;
a [3] = 200;

... b [0] ...

you'll have undefined behavior because you accessed the memory using an
lvalue that has type int32_t, and you have to use either the type of the
stored data (int16_t) or a char type. And int32_t cannot be a char type
if int16_t exists!
Nov 14 '05 #20
Keith Thompson wrote:
"Daniel Vallstrom" <da**************@gmail.com> writes:
[...]
This, I believe, should be safe --- even though it's rather pointless:
int16_t * a = malloc( sizeof *a * 10 ); ...
int32_t * b = (int32_t*)(a+2);
int16_t * c = (int16_t*)b - 2;
assert( a == c );
I suspect it would be safe under any non-perverse implementation, but
I don't think the standard guarantees it.


How does that follow? As Wolf and Bau point out, a+2 is properly
aligned for int32_t* (since there is no padding etc. and because a
points to a valid "int32_t* address" and therefore "a+32bits" points
to a valid "int32_t* address"). Hence converting to int32_t* and back
must yield the same pointer.
As Chris Croughton points
out, the standard guarantees that you can convert from int16_t* to
int32_t* and back to int16_t* (assuming the intermediate pointer is
properly aligned),
Which it must be in this case.
but performing arithmetic on the intermediate
pointer before converting it back voids the warranty.
There is no arithmetic on the intermediate pointer here.
Note that if a were a pointer to a declared object rather than to a
chunk of memory allocated by malloc(), there could be alignment
problems.
Which is why malloc was used.
I'm sure this breaks on the mythical DS9K; it would be interesting to
see how.


Are you saying that DS9K isn't a conforming implementation?
Daniel Vallstrom

Nov 14 '05 #21
Eric Sosman wrote:
Mark Piffer wrote:
To add another question to this special case:
the standard guarantees in 6.2.5#26 that (struct sockaddr *) and
(struct sockaddr_in *) are of the same representation and alignment,
which would allow us to assume that (struct sockaddr **) and (struct
sockaddr_in **) are portably convertible into each other (and usable
afterwards) as they are pointing to mutually correctly aligned
locations. Is this reasoning correct?
The conclusion is correct (they are interconvertible),
but I think the reasoning is faulty. It's not that the target
structs have the same alignment requirement -- they needn't --


Please note that I didn't refer to the target objects (I wrote (struct
sockaddr *) meaning it to name the pointer-to type) and I said that
(struct sockaddr **) points to aligned objects, hence referring to
(struct sockaddr *) and not (struct sockaddr). I even think it was you
who corrected my faulty understanding (that I indeed had) towards
alignment of pointers-to-structs a few weeks ago.
but that the representations of the two types of struct pointer
are identical. At "the bare bits level," the conversion from
one type to the other is therefore a no-op, hence no information
is lost and when the pointer is converted back again it still
compares equal to the original and properly points to the same
target.

There are a bunch of special cases about pointers, mostly
(I think) to protect pre-Standard code -- a Standard that broke
a significant fraction of that large amount of code would have
had difficulty gaining acceptance! Off the top of my head:

- Qualified and unqualified versions of pointers to the
same type have the same representation. Thus, `int*'
and `const int*' and `volatile int*' ... look alike.

- `void*' and `char*' and `unsigned char*' and `signed char*'
share the same representation.

Another question which I didn't see (or missed) being sufficiently
answered: is it possible for a pointer to an object (other than the
char's above) to be converted to void* and then NOT pointing to the
first byte of the object? i.e.

struct { int bar; }foo;
void *vp = &foo;
unsigned char *ucp = (unsigned char*)&foo; // guaranteed to point to
1st byte
vp!=ucp; // possible to evaluate to 1???

regards,
Mark

Nov 14 '05 #22
"Daniel Vallstrom" <da**************@gmail.com> writes:
Keith Thompson wrote:
"Daniel Vallstrom" <da**************@gmail.com> writes:
[...]
> This, I believe, should be safe --- even though it's rather pointless:
> int16_t * a = malloc( sizeof *a * 10 ); ...
> int32_t * b = (int32_t*)(a+2);
> int16_t * c = (int16_t*)b - 2;
> assert( a == c );


I suspect it would be safe under any non-perverse implementation, but
I don't think the standard guarantees it.


How does that follow? As Wolf and Bau point out, a+2 is properly
aligned for int32_t* (since there is no padding etc. and because a
points to a valid "int32_t* address" and therefore "a+32bits" points
to a valid "int32_t* address"). Hence converting to int32_t* and back
must yield the same pointer.


I think you're right.
As Chris Croughton points
out, the standard guarantees that you can convert from int16_t* to
int32_t* and back to int16_t* (assuming the intermediate pointer is
properly aligned),


Which it must be in this case.
but performing arithmetic on the intermediate
pointer before converting it back voids the warranty.


There is no arithmetic on the intermediate pointer here.


Right again. I mis-read the code the first time through.

[...]
I'm sure this breaks on the mythical DS9K; it would be interesting to
see how.


Are you saying that DS9K isn't a conforming implementation?


Of course not; the DS9K is conforming by definition. I was saying
that the code invokes undefined behavior -- but I was probably
mistaken.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 14 '05 #23
On Tue, 19 Apr 2005 22:45:31 GMT, Keith Thompson
<ks***@mib.org> wrote:
Chris Croughton <ch***@keristor.net> writes:
[...]
So I am, the intX_t ones are the ones which are not useful because they
might not exist at all. However, the alignment issues still exist. An
int32_t might only have 32 bits usable but take up 8 bytes because of
alignment requirements.
I don't think that's possible. If an integer type with a width of 32
bits requires 64 bits of storage, then it has 32 padding bits and
isn't eligible to be called int32_t.


Are they necessarily padding bits? They might be required for the
hardware, and non-accessible for arithmetic types. I remember a machine
(Honeywell?) where the extra bits coded whether the value was a pointer
or an arithmetic value, so although it had 48 bits total only some of
them were usable for arithmetic types. Assuming 32 bits of actual
arithmetic data, would that not qualify as an int32_t? Or is it
guaranteed that CHAR_BIT * sizeof intX_t == X?
It's still likely that int16_t and int32_t have different alignment
requirements, and even possible (but not likely) that int16_t has
stricter alignment requirements than int32_t.


If int32_t were emulated using chars but int16_t was a native type it
could happen. Probably more likely with int64_t though.

Chris C
Nov 14 '05 #24
Chris Croughton <ch***@keristor.net> writes:
On Tue, 19 Apr 2005 22:45:31 GMT, Keith Thompson
<ks***@mib.org> wrote:
Chris Croughton <ch***@keristor.net> writes:
[...]
So I am, the intX_t ones are the ones which are not useful because they
might not exist at all. However, the alignment issues still exist. An
int32_t might only have 32 bits usable but take up 8 bytes because of
alignment requirements.


I don't think that's possible. If an integer type with a width of 32
bits requires 64 bits of storage, then it has 32 padding bits and
isn't eligible to be called int32_t.


Are they necessarily padding bits?


Yes, I believe so. For a signed integer type, the bits of the object
representation are divided into three groups: value bits, padding bits
(there may be none), and the sign bit. If those extra 32 bits aren't
value or sign bits, they must be padding bits.

[snip]

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 14 '05 #25
Old Wolf wrote:
> int16_t * a = malloc( sizeof *a * 10 ); ...
> int32_t * b = (int32_t*)(a+2);
> int16_t * c = (int16_t*)b - 2;
> assert( a == c );

[ ... ]
'a' is guaranteed to be correctly aligned for int32_t, since it's the
result of a malloc.


Why does a malloc() for type int16_t guarantee correct alignment for type
int32_t?
Christian
Nov 14 '05 #26
Christian Kandeler <ch****************@hob.de_invalid> writes:
Old Wolf wrote:
> int16_t * a = malloc( sizeof *a * 10 ); ...
> int32_t * b = (int32_t*)(a+2);
> int16_t * c = (int16_t*)b - 2;
> assert( a == c );


[ ... ]
'a' is guaranteed to be correctly aligned for int32_t, since it's the
result of a malloc.


Why does a malloc() for type int16_t guarantee correct alignment for type
int32_t?


Because it's not a malloc() for type int16_t. malloc() just gets an
argument specifying the number of bytes to allocate; it has no idea
what's going to be stored in the allocated memory. The pointer returned
by malloc (if non-null) is suitably aligned for any object type.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 14 '05 #27
Keith Thompson wrote:
Why does a malloc() for type int16_t guarantee correct alignment for type
int32_t?


Because it's not a malloc() for type int16_t. malloc() just gets an
argument specifying the number of bytes to allocate; it has no idea
what's going to be stored in the allocated memory.


Ah yes, of course. Silly me. A look at the prototype of malloc() probably
would have helped...
Christian
Nov 14 '05 #28
In article <11*********************@g14g2000cwa.googlegroups.com>
Mark Piffer <so************@yahoo.com> wrote:
Another question which I didn't see (or missed) being sufficiently
answered: is it possible for a pointer to an object (other than the
char's above) to be converted to void* and then NOT pointing to the
first byte of the object? i.e.

struct { int bar; }foo;
void *vp = &foo;
unsigned char *ucp = (unsigned char*)&foo; // guaranteed to point to
1st byte
vp!=ucp; // possible to evaluate to 1???


I think this is not possible.

One pitfall I think *is* possible (but certainly not common; indeed,
I have never encountered this in anything other than Lisp systems'
internals) is to have pointers to other types have low-order bits
set, that (in C) are removed by the process of converting to either
"void *" or "char *". For instance:

void *vp;
int (*a)[4];
...
a = vp; /* might compile to "add 3,vp,a" ~= a <- (int)vp+3 */
...
vp = a; /* might compile to "sub 3,vp,a" ~= vp <- (int)a-3 */

Lisp systems sometimes use this kind of address arithmetic (in
assembly as produced by the Lisp-to-machine-code compiler) to
implement type-tagging, especially if the machine has fast trap
handling.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 14 '05 #29
Christian Kandeler <ch****************@hob.de_invalid> wrote:
Old Wolf wrote:
> int16_t * a = malloc( sizeof *a * 10 ); ...
> int32_t * b = (int32_t*)(a+2);
> int16_t * c = (int16_t*)b - 2;
> assert( a == c );

'a' is guaranteed to be correctly aligned for int32_t, since it's the
result of a malloc.


Why does a malloc() for type int16_t guarantee correct alignment for type
int32_t?


There is no such thing as a malloc() "for type <type>". A malloc() is a
malloc() is a malloc(): it returns a void * which is correctly aligned for
all object types.

Richard
Nov 14 '05 #30
In article <11*********************@f14g2000cwb.googlegroups.com>,
"Daniel Vallstrom" <da**************@gmail.com> wrote:
Are you saying that DS9K isn't a conforming implementation?


The DS 9000 is a conforming implementation.

The DS 9001 is so cleverly designed that experts can't agree whether it
is conforming or not!
Nov 14 '05 #31
Christian Bau <ch***********@cbau.freeserve.co.uk> writes:
In article <11*********************@f14g2000cwb.googlegroups.com>,
"Daniel Vallstrom" <da**************@gmail.com> wrote:
Are you saying that DS9K isn't a conforming implementation?


The DS 9000 is a conforming implementation.

The DS 9001 is so cleverly designed that experts can't agree whether it
is conforming or not!


And the DS 9002 is deliberately designed so that determining whether
it's a conforming implementation is equivalent to solving the halting
problem.

(Hmm, I wonder if that's true for most real-world implementations.)

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 14 '05 #32
Keith Thompson <ks***@mib.org> wrote:
Christian Bau <ch***********@cbau.freeserve.co.uk> writes:
In article <11*********************@f14g2000cwb.googlegroups.com>,
"Daniel Vallstrom" <da**************@gmail.com> wrote:
Are you saying that DS9K isn't a conforming implementation?
The DS 9000 is a conforming implementation.

The DS 9001 is so cleverly designed that experts can't agree whether it
is conforming or not!


And the DS 9002 is deliberately designed so that determining whether
it's a conforming implementation is equivalent to solving the halting
problem.


I thought the DS 9002 was the one that does absolutely nothing useful,
but does document to perfection _how_ it does nothing useful.

And shouldn't it be called the DS 9001:2000 now?
(Hmm, I wonder if that's true for most real-world implementations.)


Probably.

Richard
Nov 14 '05 #33
Eric Sosman <er*********@sun.com> writes:
Off the top of my head:

- Qualified and unqualified versions of pointers to the
same type have the same representation. Thus, `int*'
and `const int*' and `volatile int*' ... look alike.

- `void*' and `char*' and `unsigned char*' and `signed char*'
share the same representation.

- Pointers to all kinds of structs look alike.

- Pointers to all kinds of unions look alike.

- A pointer to a struct can be converted to and from a
pointer to the struct's first element (if it isn't a
bit field) safely.

- A pointer to a union can be converted to and from a
pointer to any element of the union (except bit fields)
safely.

- There's the above-mentioned rule about structs with
identical initial sequences of elements, although (as
Christian Bau mentioned) there are a few additional
conditions on this one.

- ... and there may be a few others I can't think of at
the moment.


A nice list. Also:

- Pointers to arrays with the same element type but with
differing lengths have the same representation.
Nov 14 '05 #34
Eric Sosman <es*****@acm-dot-org.invalid> writes:
Christian Bau wrote:
In article <d4**********@news1brm.Central.Sun.COM>,
Eric Sosman <er*********@sun.com> wrote:
Ah, yes, well, you've encountered a special case. If
you've got two struct types that begin with the same sequence
of elements, then you can use a pointer to either type to get
at those initial elements. A `struct sockaddr' (presumably)
starts with a few elements that let the called function
decide whether it's been given a `struct sockaddr_in' or a
`struct sockaddr_something_else', and thereafter the called
function can convert the pointer to the proper struct type
so as to access the remaining elements.


Strictly speaking, this is only true if you define a union containing a
"struct sockaddr_in" and a "struct sockaddr_something_else", and the
compiler must have seen the declaration of that union before your code
handles "struct sockaddr_in" and "struct sockaddr_something_else"
interchangeably.


Yeah, I thought about the "in a union" thing when composing
my post, but decided to ignore it. As far as I can tell, the
compiler could only behave perversely if it could somehow prove
that the two structs could never possibly appear as members
of the same union, even in translation units the compiler has not
yet seen that might or might not be linked into the same final
program. I believe such a proof is beyond the capabilities of
current compilers, and is likely to remain so until I die and no
longer care about it ...


Micro nit: ITYM "... could behave perversely only if ...".

Surely it's always possible that identically declared structs *might*
be put in a union in another translation unit. Together with 6.2.7 p1
this would imply that structs that share a common initial sequence
must have the same representation for the common initial sequence.
Right?
Nov 14 '05 #35
Tim Rentsch <tx*@alumnus.caltech.edu> wrote:
Eric Sosman <er*********@sun.com> writes:
Off the top of my head:

- Qualified and unqualified versions of pointers to the
same type have the same representation. Thus, `int*'
and `const int*' and `volatile int*' ... look alike.
[ And so forth. ]
- ... and there may be a few others I can't think of at
the moment.


A nice list. Also:

- Pointers to arrays with the same element type but with
differing lengths have the same representation.


Surely not? A pointer-to-int that happens to point at the first element
of an array of 5 ints must have the same representation as one that
points at the first element of an array of 13 ints, yes, but
pointer-to-array[5]-of-int is a completely different type from
pointer-to-array[13]-of-int, and need not have the same representation
(or alignment) at all.

Richard
Nov 14 '05 #36
rl*@hoekstra-uitgeverij.nl (Richard Bos) writes:
Tim Rentsch <tx*@alumnus.caltech.edu> wrote:
Eric Sosman <er*********@sun.com> writes:
Off the top of my head:

- Qualified and unqualified versions of pointers to the
same type have the same representation. Thus, `int*'
and `const int*' and `volatile int*' ... look alike.
[ And so forth. ]
- ... and there may be a few others I can't think of at
the moment.


A nice list. Also:

- Pointers to arrays with the same element type but with
differing lengths have the same representation.


Surely not? A pointer-to-int that happens to point at the first element
of an array of 5 ints must have the same representation as one that
points at the first element of an array of 13 ints, yes, but
pointer-to-array[5]-of-int is a completely different type from
pointer-to-array[13]-of-int, and need not have the same representation
(or alignment) at all.


Perhaps surprising but true. 6.2.5 p26, pointers to compatible types
shall have the same representation and alignment requirements. Both
'int[5]' and 'int[13]' are compatible with 'int[]', hence 'int (*)[5]'
and 'int (*)[13]' both have the same representation and alignment
requirements as 'int (*)[]', and hence as each other. So for example,

int a[5];
int (*pa5)[5] = &a;
int (*pa)[] = pa5;
int (*pa13)[13] = pa;

All the initializing assignments take place between compatible pointer
types (pointers to compatible types are compatible types); the value
and the representation hence stays the same (6.3 p2).

The initializing assignment to 'pa13' may evoke undefined behavior, if
the alignment requirements for an 'int[13]' array aren't satisfied by
the value '&a'. But the representation and alignment requirements of
the pointer variables themselves must be the same.
(Minor note: Technically, it's the implicit conversion of 'pa' rather
than the initializing assignment that can produce undefined behavior.)
Nov 14 '05 #37
In article <kf*************@alumnus.caltech.edu>,
Tim Rentsch <tx*@alumnus.caltech.edu> wrote:
Surely it's always possible that identically declared structs *might*
be put in a union in another translation unit. Together with 6.2.7 p1
this would imply that structs that share a common initial sequence
must have the same representation for the common initial sequence.
Right?


Not really.

The compiler must make things "work" _if_ two structs _are_ in fact
members of the same union and _if_ this is known when the code is
compiled, and in no other case. If you give the compiler a source file
containing main () and some other functions, and the compiler can deduce
that no functions outside that file are actually called, then it is
clearly allowed to use different offsets for both structs.

And having the same layout doesn't guarantee defined behavior, as the
compiler can often assume that pointers to different structs don't point
to the same memory.
Nov 14 '05 #38
Christian Bau <ch***********@cbau.freeserve.co.uk> writes:
In article <kf*************@alumnus.caltech.edu>,
Tim Rentsch <tx*@alumnus.caltech.edu> wrote:
Surely it's always possible that identically declared structs *might*
be put in a union in another translation unit. Together with 6.2.7 p1
this would imply that structs that share a common initial sequence
must have the same representation for the common initial sequence.
Right?
Not really.

The compiler must make things "work" _if_ two structs _are_ in fact
members of the same union and _if_ this is known when the code is
compiled, and in no other case. If you give the compiler a source file
containing main () and some other functions, and the compiler can deduce
that no functions outside that file are actually called, then it is
clearly allowed to use different offsets for both structs.


I believe that whether or not the compiler can deduce that no function
outside the single-translation-unit program is called is irrelevant.
First, 6.2.7 p1 doesn't say translation units in the same executable;
it just says translation units. Second, there are lots of ways a
structure value might be transmitted between different executables,
eg,

- writing bytes out to a file and another executable reading
them in;

- storing a value in shared memory;

- transmitting bytes over a pipe or a socket;

- just leaving a value in memory somewhere and expecting the
next program to pick it up;

- for a more subtle example - the value of an 'offsetof'
macro call might be stored by one executable and used
by another.

The C standard makes (some) guarantees about program execution, but it
seems silly to think it makes guarantees *only* about program
execution. There also are guarantees about what representations are
used in (some) data types. I think most people reading sections
6.2.5, 6.2.6, 6.2.7 and 6.3 would conclude that guarantees are made
about the representations of compatible types (of structs) in
different translation units, regardless of whether they were ever
bound into a single executable. In response to that, do you have
anything to offer other than just an assertion to the contrary?
Saying "... it is clearly allowed ..." without giving any supporting
statements isn't very convincing.

And having the same layout doesn't guarantee defined behavior, as the
compiler can often assume that pointers to different structs don't point
to the same memory.


Certainly having the same layout doesn't guarantee defined behavior in
all cases. But I think it *does* guarantee defined behavior in *some*
cases that otherwise would be undefined.

I acknowledge the point about the compiler being allowed to make
assumptions for pointers pointing to different types. I saw that
point in an earlier posting of yours, and it's a good one. What
I'm asking about, however, is something different.
Nov 14 '05 #39
Christian Bau wrote:

Well, the compiler is "in practice" forced to use the same layout for
all structs starting with members of the same type (for example, for
all structs starting with a short and an int, offsetof gives the same
result for the second struct member).
Why?
However, the compiler is free to assume that two pointers to different
struct types cannot access the same memory without undefined behavior.

Again, why?
So if you only have

#include <stdio.h>
#include <stdlib.h>

struct s1 { short x; int y; };
struct s2 { short a; int b; double c; };

void f (struct s1* p1, struct s2* p2) {
p1->y = 0;
p2->b = 1;
These assignments are made through int lvalues. The struct types are
incidental.
// Here the compiler can assume that because the types
// of *p1 and *p2 are different, p1 and p2 cannot point
// to the same memory without undefined behavior.
if (p1->y == 1) printf ("It worked!\n");
if (p1->y == 0) printf ("It didn't work!\n");
}
Consider the following implementation...

short: 2 bytes, 2 byte alignment
int: 4 bytes, 4 byte alignment
double: 16 bytes, 16 byte alignment

....and consider the layouts (where . is padding)...

struct s1: |xx..yyyy|
struct s2: |aa......bbbb....cccccccccccccccc|

Can you cite chapter and verse where an implementation cannot adopt
this choice of layouts?

Can you cite why this cannot be done "in practice"?
int main (void) {
void* p = malloc (sizeof (struct s1) + sizeof (struct s2));
if (p) f (p, p);
return 0;
}

could print either "It worked!" or "It didn't work!". The compiler can
assume that the assignment to p2->b cannot change p1->y (unless there
is undefined behavior).


I can't see a distinction between this and the usual pointer aliasing
problems. If you want the compiler to make the assumption that p1
and p2 point to different locations, then you have to make them
restrict qualified. [That's what restrict is for, after all.]

--
Peter

Nov 14 '05 #40
In article <kf*************@alumnus.caltech.edu>,
Tim Rentsch <tx*@alumnus.caltech.edu> wrote:
I believe that whether or not the compiler can deduce that no function
outside the single-translation-unit program is called is irrelevant.
First, 6.2.7 p1 doesn't say translation units in the same executable;
it just says translation units. Second, there are lots of ways a
structure value might be transmitted between different executables,
eg,

- writing bytes out to a file and another executable reading
them in;

- storing a value in shared memory;

- transmitting bytes over a pipe or a socket;

- just leaving a value in memory somewhere and expecting the
next program to pick it up;

- for a more subtle example - the value of an 'offsetof'
macro call might be stored by one executable and used
by another.

The C standard makes (some) guarantees about program execution, but it
seems silly to think it makes guarantees *only* about program
execution. There also are guarantees about what representations are
used in (some) data types. I think most people reading sections
6.2.5, 6.2.6, 6.2.7 and 6.3 would conclude that guarantees are made
about the representations of compatible types (of structs) in
different translation units, regardless of whether they were ever
bound into a single executable. In response to that, do you have
anything to offer other than just an assertion to the contrary?
Saying "... it is clearly allowed ..." without giving any supporting
statements isn't very convincing.


The C Standard gives a guarantee that under certain circumstances
modifying an object through a pointer to some struct modifies another
object in a defined way. Example:

typedef struct { int x; double y; char z; } s1;
typedef struct { int a; double b; short c; } s2;

typedef union { s1 x; s2 y; } u;

int main (void) {
s1 x;
((s2 *) &x) -> b = 3.7;
return 0;
}

The assignment is guaranteed to set x.y to 3.7. How the compiler does
this is none of anyone's business. A simple strategy to achieve this is
for the compiler to use the same layout for all initial sequences of all
structs. That means the offset of any one struct member depends only on
the type of that member and all preceding members, but not on the type
of any following members.

In my tiny example, it is clear that the compiler can achieve what is
guaranteed by the standard without using the same layout for s1 and s2.
All it needs to do is replace the left hand side of the assignment ((s2
*) &x)->b with x.y. That will achieve exactly what is guaranteed by the
C Standard. Yes, the simplest implementation will use the same offsetof
for s1.y and s2.b, but _that_ is not guaranteed by the C Standard.

So the situation is: The C Standard guarantees X. A simple method to
achieve this in an implementation is to do Y. A compiler would in fact
have to work hard to achieve X without doing Y. This does _not_
guarantee Y in any way, it guarantees X.

If the C Standard had wished to guarantee same layout for structures, it
would have been easy to add this.
Nov 14 '05 #41
Peter Nilsson <ai***@acay.com.au> wrote:
Christian Bau wrote:
[snip]
However, the compiler is free to assume that two pointers to different
struct types cannot access the same memory without undefined behavior.

Again, why?
The wording is a little colloquial, but I think he is right.
So if you only have

#include <stdio.h>
#include <stdlib.h>

struct s1 { short x; int y; };
struct s2 { short a; int b; double c; };

void f (struct s1* p1, struct s2* p2) {
p1->y = 0;
p2->b = 1;
These assignments are made through int lvalues. The struct types are
incidental.
Not quite incidental:

# [#4] A postfix expression followed by the -> operator and an
# identifier designates a member of a structure or union
# object. The value is that of the named member of the object
# to which the first expression points, and is an lvalue.

If `p1' points to object of type `struct s1', and p2 points at
the same location, then the expression `p2->b' raises UB, because
the object simply doesn't have a `b' member (I think it's undefined,
because the Std doesn't specify it).

If `p1' points to an object without an effective type (allocated),
then I believe the first assignment impresses the new type on that
object.

There's no way here you can access the same struct object through
both "->y" and "->b" without UB.
[snip]
Consider the following implementation...

short: 2 bytes, 2 byte alignment
int: 4 bytes, 4 byte alignment
double: 16 bytes, 16 byte alignment

...and consider the layouts (where . is padding)...

struct s1: |xx..yyyy|
struct s2: |aa......bbbb....cccccccccccccccc|

Can you cite chapter and verse where an implementation cannot adopt
this choice of layouts?
It can.
Can you cite why this cannot be done "in practice"?
"In practice" means binary compatibility.
int main (void) {
void* p = malloc (sizeof (struct s1) + sizeof (struct s2));
if (p) f (p, p);
return 0;
}

[snip]
I can't see a distinction between this and the usual pointer aliasing
problems. If you want the compiler to make the assumption that p1
and p2 point to different locations, then you have to make them
restrict qualified. [That's what restrict is for, after all.]


It's not that they cannot point to the same object (they can), it's
that you cannot legally access the object members by using member-access
from another struct.

Consider this:

void f(short *ps, double *pd)
{
*ps = 0;
*pd = 1; /* could this change `*ps'? - no */
*ps;
}
typedef struct { int i; short s; } Sshort;
typedef struct { int i; double d; } Sdouble;

void f(Sshort *ps, Sdouble *pd)
{
ps->i = 0;
pd->i = 1; /* could this change `*ps'? - no */
ps->i;
/* what counts here is "ps->" and "pd->"; also note that "->i" means
two different things, ie. members in two different name spaces */
}

union U { Sshort s; Sdouble d; };

void g(Sshort *ps, Sdouble *pd)
{
ps->i = 0;
pd->i = 1; /* could this change `*ps'? - yes! ps and pd *can* point
to the same object and the Std provides exception
for such access */
ps->i; /* must re-read the object */
}

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`
Nov 14 '05 #42
Christian Bau <ch***********@cbau.freeserve.co.uk> wrote:
The C Standard gives a guarantee that under certain circumstances
modifying an object through a pointer to some struct modifies another
object in a defined way. Example:

typedef struct { int x; double y; char z; } s1;
typedef struct { int a; double b; short c; } s2;

typedef union { s1 x; s2 y; } u;

int main (void) {
s1 x;
((s2 *) &x) -> b = 3.7;
return 0;
}

The assignment is guaranteed to set x.y to 3.7.
Basically I agree with what you say here and earlier, but in this
example we have a specific situation and the compiler may produce
specific code.

# One special guarantee is made in
# order to simplify the use of unions: If a union contains
# several structures that share a common initial sequence (see
# below), and if the union object currently contains one of
# these structures, it is permitted to inspect the common
# initial part of any of them anywhere that a declaration of
# the completed type of the union is visible.

The exception is made for struct objects that are part of a union.
In your example the compiler may notice that the object `x' is not part
of a union, so it is not bound by this exception.
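The case that the quoted guarantee does cover can be sketched as follows. This is only an illustration: the types are modelled on the s1/s2 example above, and the helper name `write_b_read_y` is made up for this sketch. The struct objects live inside a union object, and the completed union type is visible where the common initial part is inspected.

```c
#include <assert.h>
#include <stddef.h>

/* Types modelled on the example quoted above. */
typedef struct { int x; double y; char z; } s1;
typedef struct { int a; double b; short c; } s2;

/* The completed union type is visible at the point of access. */
typedef union { s1 x; s2 y; } u;

/* Hypothetical helper: store through the s2 view, then inspect the
   common initial part (int, double) through the s1 view. */
double write_b_read_y(u *p, double value)
{
    assert(p != NULL);
    p->y.a = 1;
    p->y.b = value;   /* the union object now contains an s2 */
    return p->x.y;    /* reads the common initial sequence via s1 */
}
```

Because the access goes through the union, this does not depend on the compiler tracing where a bare `s2 *` came from.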

Otherwise, when the compiler processes separate translation units,
and when it can't trace the origin of lvalues (passed through pointers),
then yes, it must produce general code which is aware of
the special guarantee.

How the compiler does
this is none of anyone's business. A simple strategy to achieve this is
for the compiler to use the same layout for all initial sequences of all
structs. That means the offset of any one struct member depends only on
the type of that member and all preceding members, but not on the type
of any following members.
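Whether a given implementation follows that simple strategy can be observed with `offsetof`. The sketch below uses the `struct s1`/`struct s2` pair from earlier in the thread; the function name `initial_offsets_match` is invented for illustration. A result of 1 says something about the implementation at hand only, not about what the Standard requires.

```c
#include <assert.h>
#include <stddef.h>

struct s1 { short x; int y; };
struct s2 { short a; int b; double c; };

/* Returns 1 if the members of the common initial sequence
   (short, int) sit at the same offsets in both structs. */
int initial_offsets_match(void)
{
    /* The first member of a struct is always at offset 0. */
    assert(offsetof(struct s1, x) == 0);
    assert(offsetof(struct s2, a) == 0);

    return offsetof(struct s1, y) == offsetof(struct s2, b);
}
```

On mainstream ABIs this is expected to return 1, which is the "in practice" half of the argument; the hypothetical `|aa......bbbb....|` layout discussed above is the case where it would return 0.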


"In practice" yes. But now I'm not sure if it can be actually
strictly proved. Suppose an implementation ("Made in Hell ;-)")
in which every struct object is "observed" and before each
member access a special code is generated that synchronizes
the resulting value of the expression with the previous write
access - then I think there would be no need to have the same
layout. But I don't really know, there are lots of other issues
to consider.

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`
Nov 14 '05 #43
Christian Bau <ch***********@cbau.freeserve.co.uk> writes:
In article <kf*************@alumnus.caltech.edu>,
Tim Rentsch <tx*@alumnus.caltech.edu> wrote:
I believe that whether or not the compiler can deduce that no function
outside the single-translation-unit program is called is irrelevant.
First, 6.2.7 p1 doesn't say translation units in the same executable;
it just says translation units. Second, there are lots of ways a
structure value might be transmitted between different executables,
eg,

- writing bytes out to a file and another executable reading
them in;

- storing a value in shared memory;

- transmitting bytes over a pipe or a socket;

- just leaving a value in memory somewhere and expecting the
next program to pick it up;

- for a more subtle example - the value of an 'offsetof'
macro call might be stored by one executable and used
by another.

The C standard makes (some) guarantees about program execution, but it
seems silly to think it makes guarantees *only* about program
execution. There also are guarantees about what representations are
used in (some) data types. I think most people reading sections
6.2.5, 6.2.6, 6.2.7 and 6.3 would conclude that guarantees are made
about the representations of compatible types (of structs) in
different translation units, regardless of whether they were ever
bound into a single executable. In response to that, do you have
anything to offer other than just an assertion to the contrary?
Saying "... it is clearly allowed ..." without giving any supporting
statements isn't very convincing.
The C Standard gives a guarantee that under certain circumstances
modifying an object through a pointer to some struct modifies another
object in a defined way. Example:

typedef struct { int x; double y; char z; } s1;
typedef struct { int a; double b; short c; } s2;

typedef union { s1 x; s2 y; } u;

int main (void) {
s1 x;
((s2 *) &x) -> b = 3.7;
return 0;
}

The assignment is guaranteed to set x.y to 3.7. How the compiler does
this is none of anyone's business. A simple strategy to achieve this is
for the compiler to use the same layout for all initial sequences of all
structs. That means the offset of any one struct member depends only on
the type of that member and all preceding members, but not on the type
of any following members.

In my tiny example, it is clear that the compiler can achieve what is
guaranteed by the standard without using the same layout for s1 and s2.
All it needs to do is replace the left hand side of the assignment ((s2
*) &x)->b with x.y. That will achieve exactly what is guaranteed by the
C Standard. Yes, the simplest implementation will use the same offsetof
for s1.y and s2.b, but _that_ is not guaranteed by the C Standard.


The argument given tacitly assumes that the guarantees of 6.5.2.3 p5
are the only guarantees that affect what the compiler can do in this
case. In other words it assumes the very thing it purports to show.
Circular reasoning.

So the situation is: The C Standard guarantees X. A simple method to
achieve this in an implementation is to do Y. A compiler would in fact
have to work hard to achieve X without doing Y. This does _not_
guarantee Y in any way, it guarantees X.


A more accurate statement is: The C Standard guarantees X; it also
guarantees X', X'', X''', .... Now the question is, Does the union of
the guarantees X, X', X'', ..., imply Y?

What was given was an argument that "not (X implies Y)". That may be
true, but it's irrelevant to the question. Any reasoning that tries
to answer that question (and give the answer "No") needs to take into
account all the guarantees that are present in the C Standard, not
just those in 6.5.2.3.
Nov 14 '05 #44
S.Tobias wrote:
Peter Nilsson <ai***@acay.com.au> wrote:
Christian Bau wrote:
[snip]
However, the compiler is free to assume that two pointers to
different struct types cannot access the same memory without
undefined behavior.
Again, why?


The wording is a little colloquial, but I think he is right.


Apart from 'compiler', for which you can simply substitute the
term 'implementation', I can't see a colloquialism being used.
So if you only have

#include <stdio.h>
#include <stdlib.h>

struct s1 { short x; int y; };
struct s2 { short a; int b; double c; };

void f (struct s1* p1, struct s2* p2) {
p1->y = 0;
p2->b = 1;
These assignments are made through int lvalues. The struct types are incidental.


Not quite incidental:

# [#4] A postfix expression followed by the -> operator and an
# identifier designates a member of a structure or union
# object. The value is that of the named member of the object
# to which the first expression points, and is an lvalue.69)

If `p1' points to an object of type `struct s1', and p2 points at
the same location, then the expression `p2->b' raises UB, because
the object simply doesn't have a `b' member (I think it's
undefined, because the Std doesn't specify it).


But there is a member b in struct s2, to which p2 points, so I
have no idea why you think that section adds anything relevant.
If `p1' points to an object without an effective type (allocated),
then I believe the first assignment impresses the new type on that
object.
The only objects being modified are p1->y and p2->b, both have the
same type and potentially the same address. The type of any enclosing
object is irrelevant.
There's no way here you can access the same struct object through
both "->y" and "->b" without UB.
Chapter and verse please. I can't see how your cited section adds
anything relevant.
[snip]
Consider the following implementation...
short: 2 bytes, 2 byte alignment
int: 4 bytes, 4 byte alignment
double: 16 bytes, 16 byte alignment

...and consider the layouts (where . is padding)...

struct s1: |xx..yyyy|
struct s2: |aa......bbbb....cccccccccccccccc|

Can you cite chapter and verse where an implementation cannot adopt
this choice of layouts?


It can.
Can you cite why this cannot be done "in practice"?


"In practice" means binary compatibility.


Are you saying an int is not binary compatible (whatever that
means) with an int?
int main (void) {
void* p = malloc (sizeof (struct s1) + sizeof (struct s2));
if (p) f (p, p);
return 0;
}

[snip]
I can't see a distinction between this and the usual pointer
aliasing problems. If you want the compiler to make the assumption
that p1 and p2 point to different locations, then you have to make them
restrict qualified. [That's what restrict is for, after all.]


It's not that they cannot point to the same object (they can), it's
that you cannot legally access the object members by using
member-access from another struct.


Where does the standard state this?

How is the situation any different to...

struct s { int i; };

void foo(struct s *sp, int *ip)
{
sp->i = 0;
*ip = 1;
}

...
struct s s = { 42 };
foo(&s, &s.i);

Consider this:

void f(short *ps, double *pd)
{
*ps = 0;
*pd = 1; /* could this change `*ps'? - no */
It *could* change the representation.
*ps;
Now you invoke UB because *ps could be a trap representation (if
*ps and *pd overlap memory.) But if *ps and *pd were to point
to objects of the same type, then there is no UB.
}
typedef struct { int i; short s; } Sshort;
typedef struct { int i; double d; } Sdouble;

void f(Sshort *ps, Sdouble *pd)
{
ps->i = 0;
pd->i = 1; /* could this change `*ps'? - no */
Yes it can.
ps->i;
/* what counts here is "ps->" and "pd->"; ...


Why does it count? Your quoted section _only_ states that 'i' must
be a member of the struct pointed to by the given pointer expression
to the left of ->, which it is in both cases.

--
Peter

Nov 14 '05 #45
Peter Nilsson <ai***@acay.com.au> wrote:
S.Tobias wrote:
Peter Nilsson <ai***@acay.com.au> wrote:
Christian Bau wrote:
> So if you only have
>
> #include <stdio.h>
> #include <stdlib.h>
>
> struct s1 { short x; int y; };
> struct s2 { short a; int b; double c; };
>
> void f (struct s1* p1, struct s2* p2) {
> p1->y = 0;
> p2->b = 1;
These assignments are made through int lvalues. The struct types are incidental.


Not quite incidental:

# [#4] A postfix expression followed by the -> operator and an
# identifier designates a member of a structure or union
# object. The value is that of the named member of the object
                                         ^^^^^^^^^^^^^^======
# to which the first expression points, and is an lvalue.69)

If `p1' points to an object of type `struct s1', and p2 points at
the same location, then the expression `p2->b' raises UB, because
the object simply doesn't have a `b' member (I think it's
undefined, because the Std doesn't specify it).

But there is a member b in struct s2, to which p2 points, so I
have no idea why you think that section adds anything relevant.
I was thinking of a special situation here (sure, you can't read my
mind...), where the call is:
struct s1 s;
f(&s, (struct s2*)&s);

Of course `struct s2' (and lvalue `*p2') has a member `b',
but not the *object* that p2 points to.

The Standard does not define what happens when you apply `->' operator
to an lvalue which designates an object that doesn't have the specified
member.
If `p1' points to an object without an effective type (allocated),
then I believe the first assignment impresses the new type on that
object.

The only objects being modified are p1->y and p2->b, both have the
same type and potentially the same address.
I have to correct myself again: the first assignment impresses `struct s1'
type, and the second impresses `struct s2'. The UB arises when you
try to access it with `p1->y' again (below).
The type of any enclosing
object is irrelevant.
It is, for the "p1->" part: object `*p1' *must* have the specified member,
or else there's UB. (IOW: `p1', to which `->' is applied must point
to an object that has that member.)
There's no way here you can access the same struct object through
both "->y" and "->b" without UB.

Chapter and verse please. I can't see how your cited section adds
anything relevant.
I can't give you a c&v, because it's not specified that way. Show
me that you can, and then I'll give you c&v that you can't.
[snip]
Consider the following implementation...

short: 2 bytes, 2 byte alignment
int: 4 bytes, 4 byte alignment
double: 16 bytes, 16 byte alignment

...and consider the layouts (where . is padding)...

struct s1: |xx..yyyy|
struct s2: |aa......bbbb....cccccccccccccccc|

Can you cite chapter and verse where an implementation cannot adopt
this choice of layouts?


It can.
Can you cite why this cannot be done "in practice"?


"In practice" means binary compatibility.

Are you saying an int is not binary compatible (whatever that
means) with an int?
What I meant to say was that a "practical" compiler must make sure
that "yyyy" and "bbbb" are aligned with each other (at the same
offset). I don't see any other
way to satisfy all requirements if both structs are to be
defined in different translation units, and the linker is ignorant
of C. Christian has given enough argument for that.
> int main (void) {
> void* p = malloc (sizeof (struct s1) + sizeof (struct s2));
> if (p) f (p, p);
> return 0;
> }

[snip]
I can't see a distinction between this and the usual pointer
aliasing problems. If you want the compiler to make the assumption
that p1 and p2 point to different locations, then you have to make them
restrict qualified. [That's what restrict is for, after all.]


It's not that they cannot point to the same object (they can), it's
that you cannot legally access the object members by using
member-access from another struct.

Where does the standard state this?

How is the situation any different to...

struct s { int i; };

void foo(struct s *sp, int *ip)
{
sp->i = 0;
*ip = 1;
}
This is okay, `*ip' may alias `sp->i'.
...
struct s s = { 42 };
foo(&s, &s.i);

Consider this:

void f(short *ps, double *pd)
{
*ps = 0;
*pd = 1; /* could this change `*ps'? - no */
It *could* change the representation.
It could and it would raise UB either here (object with declared type) or ...
*ps;
... here (object without a declared type, where the effective type
is impressed by the latest write access, or lvalue type, in that order).
This would break aliasing rules, even before accessing trap representation.

The conclusion is that the compiler may assume that *ps and *pd don't
alias the same object.

See 6.5, and the Rationale has many explanations.
Now you invoke UB because *ps could be a trap representation (if
*ps and *pd overlap memory.)
UB arises even before that, and even if there're no trap representations.
But if *ps and *pd were to point
to objects of the same type, then there is no UB.
Yes, it means they may alias the same object.
}
typedef struct { int i; short s; } Sshort;
typedef struct { int i; double d; } Sdouble;

void f(Sshort *ps, Sdouble *pd)
{
ps->i = 0;
pd->i = 1; /* could this change `*ps'? - no */
Yes it can.
Yes it can, leading to UB at some place, etc.

It's the same argument as above. What only differs is this:
The conclusion is that the expression `ps->i' doesn't alias `pd->i'
(subject to the exception that Christian talked about).

(If there was a third argument `int *pi', then of course `*pi' may
alias either ps->i or pd->i).
ps->i;
/* what counts here is "ps->" and "pd->"; ...

Why does it count? Your quoted section _only_ states that 'i' must
be a member of the struct pointed to by the given pointer expression
to the left of ->, which it is in both cases.


No! As I said above: when you apply `->', the *object* must have
the member `i'. If the lvalue doesn't have it - it's a constraint violation.
If the object doesn't have it - it's UB.
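The uncontested case from earlier in this post, where an `int *` may alias a struct member of type int, can be checked with a small sketch. It reuses the `foo` example from the quoted text, changed to return the member so the effect is observable; that return value is an addition made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

struct s { int i; };

/* *ip is an int lvalue, so it may legitimately alias sp->i.
   The final read of sp->i therefore has to see the store
   made through ip when both refer to the same int. */
int foo(struct s *sp, int *ip)
{
    assert(sp != NULL && ip != NULL);
    sp->i = 0;
    *ip = 1;
    return sp->i;
}
```

Called as `foo(&s, &s.i)` this returns 1; called with a pointer to an unrelated int it returns 0. The disputed case in the thread is different: there both accesses go through `->` on pointers to two different struct types.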

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`
Nov 14 '05 #46
S.Tobias wrote:
Peter Nilsson <ai***@acay.com.au> wrote:
S.Tobias wrote:
Peter Nilsson <ai***@acay.com.au> wrote:
> Christian Bau wrote:
> > So if you only have
> >
> > #include <stdio.h>
> > #include <stdlib.h>
> >
> > struct s1 { short x; int y; };
> > struct s2 { short a; int b; double c; };
> >
> > void f (struct s1* p1, struct s2* p2) {
> > p1->y = 0;
> > p2->b = 1;
>
> These assignments are made through int lvalues. The struct types
> are incidental.

Not quite incidental:

# [#4] A postfix expression followed by the -> operator and an
# identifier designates a member of a structure or union
# object. The value is that of the named member of the object
                                         ^^^^^^^^^^^^^^======
# to which the first expression points, and is an lvalue.69)

If `p1' points to an object of type `struct s1', and p2 points at
the same location, then the expression `p2->b' raises UB, because
the object simply doesn't have a `b' member (I think it's
undefined, because the Std doesn't specify it).


But there is a member b in struct s2, to which p2 points, so I
have no idea why you think that section adds anything relevant.


I was thinking of a special situation here (sure, you can't read my
mind...), where the call is:
struct s1 s;
f(&s, (struct s2*)&s);

Of course `struct s2' (and lvalue `*p2') has a member `b',
but not the *object* that p2 points to.

The Standard does not define what happens when you apply `->'
operator to an lvalue which designates an object that doesn't have
the specified member.


Then the following code, which is in very common practice, is
undefined...

struct x { int a; double b; };
struct y { struct x x; int c; };

struct y y = { 0 };
struct x *xp = (struct x *) xp;
xp->a = 42;
xp->b = 4 * atan(1);

Your rule states that the object to which xp points to must have
a and b members. But the object being pointed to only has x and c
members.

But let's instead suppose I do something like...

struct x *xp = malloc(sizeof (struct y));
struct y *yp = (struct y *) xp;
xp->a = 42;
yp->c = 42;

Is the second assignment undefined behaviour?

I think I understand your arguments, but I'm not convinced that the
interpretation is what the committee intended.

Note that 6.5.2.3p3 (the one before the one you quote) uses different
wording...

A postfix expression followed by the . operator and an identifier
designates a member of a structure or union object. The value is
that of the named member, and is an lvalue if the first expression
is an lvalue. If the first expression has qualified type, the
result has the so-qualified version of the type of the designated
member.

So, in the original code by Christian, 'p2->b = 1;' is UB, but it
seems that '(*p2).b = 1;' is okay!

Like I said, I don't think that's the committee's intent.

--
Peter

Nov 14 '05 #47
Peter Nilsson <ai***@acay.com.au> wrote:
S.Tobias wrote:
I was thinking of a special situation here (sure, you can't read my
mind...), where the call is:
struct s1 s;
f(&s, (struct s2*)&s);

Of course `struct s2' (and lvalue `*p2') has a member `b',
but not the *object* that p2 points to.

The Standard does not define what happens when you apply `->'
operator to an lvalue which designates an object that doesn't have
the specified member.

Then the following code, which is in very common practice, is
undefined...

struct x { int a; double b; };
struct y { struct x x; int c; };

struct y y = { 0 };
struct x *xp = (struct x *) xp;
ITYM: (struct x *) &y;
xp->a = 42;
xp->b = 4 * atan(1);

Your rule states that the object to which xp points to must have
a and b members. But the object being pointed to only has x and c
members.
Everything is all right: `y' contains a subobject `x' which contains
members `a' and `b', and the pointer `xp' points to it by virtue of
6.7.2.1#12 (n869.txt, "suitably converted").
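A minimal sketch of that rule, using the `struct x`/`struct y` pair from the quoted code with the corrected `&y`-style conversion. The function name `set_through_first_member` is hypothetical, and 3.14 stands in for `4 * atan(1)` to keep the sketch free of `<math.h>`.

```c
#include <assert.h>
#include <stddef.h>

struct x { int a; double b; };
struct y { struct x x; int c; };

/* A pointer to a struct, suitably converted, points to its
   first member, so xp designates the subobject yp->x. */
int set_through_first_member(struct y *yp)
{
    struct x *xp = (struct x *) yp;

    assert(yp != NULL);
    assert((void *) xp == (void *) &yp->x);

    xp->a = 42;
    xp->b = 3.14;
    return yp->x.a;   /* reads back what was stored through xp */
}
```

Here the object that `xp` points to really is a `struct x` (the first member of the `struct y`), which is why this case is not affected by the "object must have the member" argument.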
But let's instead suppose I do something like...

struct x *xp = malloc(sizeof (struct y));
struct y *yp = (struct y *) xp;
xp->a = 42;
yp->c = 42;

Is the second assignment undefined behaviour?
I don't know, but I think it's not UB. The question here is when
and how an allocated object becomes a struct.

I believe (I can't give you any references to support this right now) that
with allocated objects (a) the object has an (effective) type such
as you (the programmer) want it to have (by the rules of 6.5#6); and
(b) the access semantics are meant to be the same as those for
declared objects, subject to (a).

I think, when you assign a member, the allocated object (or its part)
becomes that structure (with that member), and remaining members
become indeterminate.

I believe that the second assignment in your example augments the
structure, but I don't really know. Perhaps a more tricky example
is worth looking at:

struct y2 { struct x x; int d; } *y2p = (struct y2 *)xp;
xp->a = 42;
yp->c = 42; //augments?
y2p->d = 54; //changes type
xp->a; y2p->x.a; //UB?
I think I understand your arguments, but I'm not convinced that the
interpretation is what the committee intended.
Then the simplest thing to do is to ask them in csc; I won't do it
now, because I'm not prepared for the discussion yet. In fact,
I was hoping that someone of the Elders of the C Tribe would add
their comments in this discussion, so that I could become more
convinced (either way), too.

+++

I think that the issues here are similar to those in the
discussion "contiguity of arrays" last year (about "int a[2][2];
a[0][2];"). The problem is not what is where, but what the
compiler believes is where. Answering to my post (Message-ID:
<2r*************@uni-berlin.de>) Douglas Gwyn informally agreed to
my supposition that in order to access a subobject, you have to
explicitly give the full "path" to it, referring to it, and its
containing objects (see the post, I might be bending the interpretation
here); values of pointers that you get by some miracle
"out of nowhere" will cause UB at some point or another. I feel
that this intention might equally apply to accessing members in a
structure or union (ie. you can't access a member without referring
to its containing struct at some place), but I wouldn't vouch for its truth.

+++
Note that 6.5.2.3p3 (the one before the one you quote) uses different
wording...

A postfix expression followed by the . operator and an identifier
designates a member of a structure or union object. The value is
that of the named member, and is an lvalue if the first expression
is an lvalue. If the first expression has qualified type, the
result has the so-qualified version of the type of the designated
member.

So, in the original code by Christian, 'p2->b = 1;' is UB, but it
seems that '(*p2).b = 1;' is okay!


I'm sorry, but you'll have to give me more of a clue as to what the difference is.
Both specifications refer to an "object". The only substantial
difference I see is that "->" expression is always lvalue (pointers
cannot point to temporaries), whereas "." expression is either
lvalue or rvalue. I think both forms above are equivalent.
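For an ordinary object that really is a `struct s2`, the equivalence of the two spellings can be sketched like this. The function names are made up for illustration, and the sketch deliberately says nothing about the disputed case where the pointed-to object has a different struct type.

```c
#include <assert.h>
#include <stddef.h>

struct s2 { short a; int b; double c; };

/* Both spellings designate the same member lvalue. */
int same_member(struct s2 *p)
{
    assert(p != NULL);
    return &p->b == &(*p).b;
}

/* A store through one spelling is visible through the other. */
int store_both_ways(struct s2 *p)
{
    assert(p != NULL);
    p->b = 1;
    (*p).b += 1;
    return p->b;
}
```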

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`
Nov 14 '05 #48
