MeJohn wrote:
Hello,
I have a question about the layout of the fields in a structure. I have read
in a paper that if two structures contain an initial sequence of fields,
all of which have compatible types, then the offsets of the corresponding
fields in the initial sequence are guaranteed to be the same. For example,
with
typedef struct {
    int a;
    int b;
    char c;
} s1;
typedef struct {
    int a;
    int b;
    int d[2];
} s2;
s1 t1;
s2 t2;
then the fields t1.b and t2.b have the same offset. I know that, according
to the ISO specification (ISO/IEC 9899:1999, §6.5.2.3), this happens when
we consider two elements of type s1 and s2 in a union, for example:
union {
    s1 t1;
    s2 t2;
} a;
Is it also true when t1 and t2 are not fields of a union?
The special guarantee of 6.5.2.3 only applies to structs
that are members of the same union. However, a compiler would
need to be unbelievably perverse to let union membership (sings:
"Look for the union label ...") change the struct layout. Even
in the absence of your `union ... a', it would be hard for the
compiler to prove to itself that s1 and s2 don't appear as union
members in another source file, perhaps in a source file that
hasn't even been written yet.
Even if the layout of the initial struct members is the same,
though, you're not out of the woods. Let's take a different pair
of struct types to make the matter clearer:
typedef struct {
    char a;
    char b;
    char c;
} s3;
typedef struct {
    char a;
    char b;
    double d;
} s4;
Now, we know from the language definition that offsetof(s3,a)
and offsetof(s4,a) are both zero. We're also going to suppose
that offsetof(s3,b) == offsetof(s4,b), even though this isn't
truly guaranteed. However, this does not mean that we can
use an s4* to access the a and b elements of an s3! The two
struct types may well have different alignment requirements,
and the compiler is entitled to assume that a valid s4* points
to a location that meets the alignment requirement for an
actual s4 instance. It might then generate different code to
access the more stringently-aligned data: "Instead of fetching
the two bytes individually, I'll use this clever longword-
aligned instruction to fetch them both at one blow and then
use fast register-to-register instructions to split 'em apart."
If instead you point your s4* at what is actually an s3 instance
that might not be s4-aligned, the s4-dependent generated code
may malfunction.
(By the way, the above is not a merely theoretical concern:
I once tracked down a bug that stemmed from exactly this cause.)
Advice: Avoid the practice if you can reasonably do so.
You usually can, perhaps at the expense of introducing another
level of struct, as in
typedef struct { char a, b; } preamble;
typedef struct {
    preamble p;
    char c;
} s5;
typedef struct {
    preamble p;
    double d;
} s6;
You can now take an s5* or s6*, convert it to a preamble*, and
use the latter to access the a and b members with complete
safety. (You must actually do the conversion, though: for the
alignment reasons mentioned above, it would not be safe to use
the s6* itself to access the p.a and p.b members of an s5, or
vice versa.)
--
Eric Sosman
es*****@acm-dot-org.invalid