Accessing alternate union members

Barry Schwarz

Given a union of the form
union {
T1 m1;
T2 m2;}obj;
where T1 and T2 are different scalar (non-aggregate) types.

The C99 standard states that
obj.m1 = value;
if (obj.m2 ...
invokes undefined behavior because my reference to the union is via a
member different than the last one stored into.

My question is, what about the following?
memcpy(&obj, &data, sizeof data);
if (obj.m1 ...

Ignoring the pathological cases such as sizeof data > sizeof obj or
sizeof data < sizeof (T1), is this valid?

If so and if I replace m1 with m2 above (thereby accessing something
other than the first member), is it still valid?
<<Remove the del for email>>

Nov 14 '05 #1

Chris Torek

In article <news:bu**********@216.39.143.103>
Barry Schwarz <sc******@deloz.net> writes:

> Given a union of the form
> union {
> T1 m1;
> T2 m2;}obj;
> where T1 and T2 are different scalar (non-aggregate) types.
>
> The C99 standard states that
> obj.m1 = value;
> if (obj.m2 ...
> invokes undefined behavior because my reference to the union is via a
> member different than the last one stored into.
Right. Note that on "real world" systems (as opposed to Deathstations
or some such :-) ) the problem is most likely to occur when T1 is
some sort of integral type and T2 is some sort of floating-point
type, and you have managed to store a reserved or signalling-NaN
bit pattern into the bytes that will be examined for obj.m2. For
instance, it is easy enough to come up with bit patterns that result
in "floating point exception" crashes on Intel CPUs (provided
signalling NaNs are not being ignored) when T1 is int and T2 is
float, or when T1 is long long and T2 is double.
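
To make the int/float case concrete, here is a rough sketch that checks whether the bytes behind the float member would be a signalling NaN, without ever evaluating them as a float. It assumes a 32-bit int, IEEE 754 binary32 floats, and the x86 convention that a set top mantissa bit means "quiet"; the union tag and the function name are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in union for the sketch: T1 is int, T2 is float. */
union IF { int m1; float m2; };

/* Hypothetical helper: nonzero if the bytes that obj->m2 would read form a
   signalling NaN (assumed IEEE 754 binary32, x86 quiet-bit convention).
   The bytes are copied into an integer, so no floating-point load occurs. */
int m2_would_be_snan(const union IF *obj)
{
    uint32_t bits;

    assert(sizeof obj->m2 == sizeof bits);
    memcpy(&bits, obj, sizeof bits);

    uint32_t exponent = (bits >> 23) & 0xFFu;  /* 8 exponent bits */
    uint32_t mantissa = bits & 0x7FFFFFu;      /* 23 mantissa bits */
    uint32_t quiet    = bits & 0x400000u;      /* top mantissa bit */

    return exponent == 0xFFu && mantissa != 0 && quiet == 0;
}
```

Storing 0x7FA00000 through m1 gives such a pattern under these assumptions, while 0x7FC00000 is a quiet NaN and is not flagged.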
> My question is, what about the following?
> memcpy(&obj, &data, sizeof data);
> if (obj.m1 ...
>
> Ignoring the pathological cases such as sizeof data > sizeof obj or
> sizeof data < sizeof (T1), is this valid?
Since this copies bytes (what C99 calls "object representations")
from "data" to "obj", it is valid if and only if those bytes are
those resulting from storing a valid value to an obj.m1 or equivalent.
One obvious problem here is that "obj" has an unnamed union type,
so that it is impossible for "data" to have the same type unless
"data" is declared and defined in a separate translation unit --
but in that separate translation unit it is at least difficult, if
not impossible, to declare "obj" correctly.

If we give the union type a name so that we can consistently refer
to it:

union U { T1 m1; T2 m2; };
union U obj;
union U data;

then we can be sure about what is in "data" if, e.g., we do this:

obj.m1 = value;
memcpy(&data, &obj, sizeof data);

Now "data" is a copy of "obj", so that data.m1 is valid because
obj.m1 is valid. A subsequent memcpy() back to &obj leaves obj.m1
valid again.
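
Spelled out as a small function, the round trip looks something like the sketch below. T1 and T2 are picked arbitrarily (int and double) just so it compiles, and the function name is invented for the example.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for the sketch; the discussion leaves T1 and T2 open. */
typedef int    T1;
typedef double T2;

union U { T1 m1; T2 m2; };

/* Store through m1, copy the whole union out and back, then read m1.
   The bytes read at the end are exactly those produced by the store. */
T1 roundtrip_m1(T1 value)
{
    union U obj;
    union U data;

    obj.m1 = value;
    memcpy(&data, &obj, sizeof data);  /* data is now a byte copy of obj */
    memset(&obj, 0, sizeof obj);       /* overwrite obj so the copy back matters */
    memcpy(&obj, &data, sizeof obj);   /* restore obj from data */

    assert(obj.m1 == data.m1);
    return obj.m1;
}
```

The memset in the middle is only there to show that the value really comes back from "data".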
> If so and if I replace m1 with m2 above (thereby accessing something
> other than the first member), is it still valid?

The conditions for whether obj.m2 is valid are basically the same as
those for whether obj.m1 is valid -- the bytes copied from &data to
&obj must be those making up a valid "object representation".

A somewhat trickier question (and the one I suspect you are really
asking) is: suppose we have union U as above, but we then do
something like this:

union U obj;
T1 data;
...
data = some_valid_value_of_type_t1;
memcpy(&obj, &data, sizeof data);
... now refer to obj.m1 ...

I think it is safe to say that most real-world C implementations
will have no problem with this; but without careful scrutiny of
the C99 standard to prove otherwise, I would assume that
Deathstation-like "evil" C implementations would be allowed to fail
if "unused" bytes of the union were not properly set. For instance,
suppose T1 is int and T2 is double, and sizeof(int) is 4 while
sizeof(double) is 8. Suppose further that the Evil Implementation
handles the union by storing a checksummed copy of the four bytes
making up the "int" in a fifth byte in the space that would otherwise
be occupied by the double. If the checksum fails to match, the
implementation delivers a runtime exception. As far as I can tell
(without careful study of the C99 wording) this is allowed.

In other words, unless you want to depend on the friendliness of
your implementation, Don't Do That. :-)
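
For what it is worth, here are the two approaches side by side as a sketch. T1 and T2 are again assumed to be int and double, and both function names are invented. The first function is the pattern under discussion; the second stores through the member and so does not depend on what the implementation keeps in the remaining bytes.

```c
#include <assert.h>
#include <string.h>

typedef int    T1;   /* assumed for the sketch */
typedef double T2;   /* assumed for the sketch */

union U { T1 m1; T2 m2; };

/* The pattern in question: only the leading bytes of obj are written. */
T1 via_memcpy(T1 data)
{
    union U obj;

    assert(sizeof data <= sizeof obj);  /* rules out the pathological case */
    memcpy(&obj, &data, sizeof data);
    return obj.m1;
}

/* The conservative alternative: let the implementation do the store. */
T1 via_member(T1 data)
{
    union U obj;

    obj.m1 = data;
    return obj.m1;
}
```

On ordinary implementations the two give the same answer; the difference only shows up on the hypothetical checksumming one.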
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

Nov 14 '05 #2

Barry Schwarz

On 18 Jan 2004 00:35:12 GMT, Chris Torek <no****@torek.net> wrote:

> In article <news:bu**********@216.39.143.103>
> Barry Schwarz <sc******@deloz.net> writes:
>> Given a union of the form
>> union {
>> T1 m1;
>> T2 m2;}obj;
>> where T1 and T2 are different scalar (non-aggregate) types.
>>
>> The C99 standard states that
>> obj.m1 = value;
>> if (obj.m2 ...
>> invokes undefined behavior because my reference to the union is via a
>> member different than the last one stored into.
> Right. Note that on "real world" systems (as opposed to Deathstations
> or some such :-) ) the problem is most likely to occur when T1 is
> some sort of integral type and T2 is some sort of floating-point
> type, and you have managed to store a reserved or signalling-NaN
> bit pattern into the bytes that will be examined for obj.m2. For
> instance, it is easy enough to come up with bit patterns that result
> in "floating point exception" crashes on Intel CPUs (provided
> signalling NaNs are not being ignored) when T1 is int and T2 is
> float, or when T1 is long long and T2 is double.

All true but only tangentially related to my question. The situation
you describe can be produced just as easily with code of the form
int i = ...
float f;
memcpy(&f, &i, sizeof i);
if (f ...
yet the language does not *require* this to be undefined as it does my
first sample.
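
A compilable version of that fragment, as a sketch: it uses uint32_t instead of int so the width is fixed, assumes float is also 32 bits, and the function name is invented. Whether the returned value can safely be evaluated depends entirely on the bits passed in.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy an integer's bytes into a float, with no union involved.
   The copy itself is plain byte copying; any trouble comes later,
   if the bytes do not represent a valid float. */
float float_from_bits(uint32_t bits)
{
    float f;

    assert(sizeof f == sizeof bits);  /* assumed: 32-bit float */
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

With IEEE 754 floats, float_from_bits(0x3F800000u) is 1.0f, which is a harmless pattern to experiment with.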

>> My question is, what about the following?
>> memcpy(&obj, &data, sizeof data);
>> if (obj.m1 ...
>>
>> Ignoring the pathological cases such as sizeof data > sizeof obj or
>> sizeof data < sizeof (T1), is this valid?
> Since this copies bytes (what C99 calls "object representations")
> from "data" to "obj", it is valid if and only if those bytes are
> those resulting from storing a valid value to an obj.m1 or equivalent.
> One obvious problem here is that "obj" has an unnamed union type,
> so that it is impossible for "data" to have the same type unless
> "data" is declared and defined in a separate translation unit --
> but in that separate translation unit it is at least difficult, if
> not impossible, to declare "obj" correctly.

I realize that copying an invalid bit pattern to an object and then
attempting to evaluate the object is a no-no, but it is basically a
run time problem. If we make T2 in my question unsigned char, then no
matter what value is stored in m1, m2 can never have any invalid or
trap representation. However, code of the form
obj.m1 = value;
if (obj.m2 ...
still invokes undefined behavior simply because the standard says so,
not for any practical reason.
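
For comparison, the same bytes can be inspected through a pointer to unsigned char, which C permits for any object, so the union member m2 is not needed at all. A sketch, with an invented union tag and function name and T1 taken to be int:

```c
#include <assert.h>
#include <stddef.h>

/* The union from the example with T2 set to unsigned char. */
union V { int m1; unsigned char m2; };

/* Return byte `index` of the representation of `value` as stored in m1,
   reading it through an unsigned char pointer rather than through m2. */
unsigned char byte_of(int value, size_t index)
{
    union V obj;
    const unsigned char *bytes = (const unsigned char *)&obj;

    assert(index < sizeof obj.m1);
    obj.m1 = value;
    return bytes[index];
}
```

Here byte_of(value, 0) reads the same byte that obj.m2 occupies.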

So my real question is, ignoring pathological cases (to also include
invalid bit patterns) and considering that I do not store into a
member of the union, does my second example involve a priori undefined
behavior the way my first does?

> If we give the union type a name so that we can consistently refer
> to it:
>
> union U { T1 m1; T2 m2; };
> union U obj;
> union U data;
>
> then we can be sure about what is in "data" if, e.g., we do this:
>
> obj.m1 = value;
> memcpy(&data, &obj, sizeof data);
>
> Now "data" is a copy of "obj", so that data.m1 is valid because
> obj.m1 is valid. A subsequent memcpy() back to &obj leaves obj.m1
> valid again.
>> If so and if I replace m1 with m2 above (thereby accessing something
>> other than the first member), is it still valid?
> The conditions for whether obj.m2 is valid are basically the same as
> those for whether obj.m1 is valid -- the bytes copied from &data to
> &obj must be those making up a valid "object representation".
>
> A somewhat trickier question (and the one I suspect you are really
> asking) is: suppose we have union U as above, but we then do
> something like this:
>
> union U obj;
> T1 data;
> ...
> data = some_valid_value_of_type_t1;
> memcpy(&obj, &data, sizeof data);
> ... now refer to obj.m1 ...
>
> I think it is safe to say that most real-world C implementations
> will have no problem with this; but without careful scrutiny of
> the C99 standard to prove otherwise, I would assume that
> Deathstation-like "evil" C implementations would be allowed to fail
> if "unused" bytes of the union were not properly set. For instance,
> suppose T1 is int and T2 is double, and sizeof(int) is 4 while
> sizeof(double) is 8. Suppose further that the Evil Implementation
> handles the union by storing a checksummed copy of the four bytes
> making up the "int" in a fifth byte in the space that would otherwise
> be occupied by the double. If the checksum fails to match, the
> implementation delivers a runtime exception. As far as I can tell
> (without careful study of the C99 wording) this is allowed.

Again no disagreement. And I like your example of why it should be
undefined. However, if T1 is the same type as T2, my first example
still invokes undefined behavior by definition (or is it by
specification) while the problem you describe cannot occur in my
second example.

> In other words, unless you want to depend on the friendliness of
> your implementation, Don't Do That. :-)

Maybe if I phrased the question as: "A really clever lint program
would be correct to generate a diagnostic that my first example must
invoke undefined behavior. Would it be correct to do so, according to
the standard, for my second example?"
<<Remove the del for email>>

Nov 14 '05 #3
