Safe use of unions

The quoted text below is from comp.std.c which originated
from a discussion on comp.lang.c. I've edited out the parts
that do not apply to my question.

Robert Gamble wrote:
> Dann Corbit wrote:
> > #include <stdio.h>
> >
> > int main(void)
> > {
> >     typedef union foo_u {
> >         struct a {
> >             unsigned char carr[sizeof(unsigned int)];
> >         } aa;
> >         struct b {
> >             unsigned int ui;
> >         } bb;
> >     } foo;
> >
> >     foo bar;
> >     bar.bb.ui = 1;
> >     printf("%u\n", (unsigned)bar.aa.carr[0]);
> >     return 0;
> > }
> >
> > #include <stdio.h>
> >
> > int main(void)
> > {
> >     typedef union foo_u {
> >         unsigned char carr[sizeof(unsigned int)];
> >         unsigned int ui;
> >     } foo;
> >
> >     foo bar;
> >     bar.ui = 1;
> >     printf("%u\n", (unsigned)bar.carr[0]);
> >     return 0;
> > }
> >
> > Is the first sample safe but the second not safe?
>
> Neither are safe.


Why is either example unsafe? I understand the output of
the printf calls is unspecified. But I do not see anything
that would be cause for concern other than that.

Jun 30 '06 #1
On 30 Jun 2006 07:55:27 -0700, di*************@aol.com wrote in
comp.lang.c:
> Why is either example unsafe? I understand the output of
> the printf calls is unspecified. But I do not see anything
> that would be cause for concern other than that.


I disagree with Robert's assessment. They are both perfectly safe.
Any area of memory at all that a program has a right to access
(static, automatic, or allocated) may be read as an array of unsigned
char.
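
A minimal sketch of that point, not taken from the thread: the helper name copy_object_bytes and its parameters are made up for illustration. It reads the object representation of an arbitrary object one unsigned char at a time and copies it into a caller-supplied buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Copy up to out_cap bytes of an object's representation into out,
 * reading the object only through unsigned char lvalues.
 * Returns the number of bytes actually copied. */
size_t copy_object_bytes(const void *obj, size_t size,
                         unsigned char *out, size_t out_cap)
{
    const unsigned char *p = obj;   /* void * converts implicitly in C */
    size_t i;

    assert(obj != NULL && out != NULL);
    if (size > out_cap)
        size = out_cap;             /* never write past the buffer */
    for (i = 0; i < size; i++)
        out[i] = p[i];              /* each read is an unsigned char access */
    return size;
}
```

Calling copy_object_bytes(&x, sizeof x, buf, sizeof buf) works for an x of any object type. Which byte holds which part of the value remains up to the implementation.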

The standard still uses the phrase "character type" in several places,
which is an anachronism from the C89/C90 days. Only unsigned char is
truly safe now, since C99 specifically allows signed char, and
therefore plain char if signed, to have padding bits and trap
representations.

It is also perfectly safe to write to any such memory via an lvalue of
any character type, not just unsigned char, provided that the memory
is not accessed with an lvalue of another type until it has first been
modified by an lvalue of that other type.
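
The write case might be sketched like this. The names fill_bytes and scrub_then_reuse are hypothetical, and the 0xFF fill value is an arbitrary choice. The object is overwritten byte by byte, then given a fresh value through its own type before it is read through that type again.

```c
#include <assert.h>
#include <stddef.h>

/* Overwrite every byte of an object through an unsigned char lvalue. */
void fill_bytes(void *obj, size_t size, unsigned char value)
{
    unsigned char *p = obj;
    size_t i;

    assert(obj != NULL);
    for (i = 0; i < size; i++)
        p[i] = value;
}

/* Scribble over a double, then store a new value through a double
 * lvalue before reading it back as a double. */
double scrub_then_reuse(double fresh)
{
    double d;

    fill_bytes(&d, sizeof d, 0xFF); /* bytes written as unsigned char */
    d = fresh;                      /* modified via its own type first */
    return d;                       /* so this read is unproblematic */
}
```

The order in scrub_then_reuse is the point: the read of d as a double comes only after the assignment through a double lvalue.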

For example, paragraph 5 of 6.2.6.1 General (under 6.2.6
Representations of types):

"Certain object representations need not represent a value of the
object type. If the stored value of an object has such a
representation and is read by an lvalue expression that does not have
character type, the behavior is undefined. If such a representation is
produced by a side effect that modifies all or any part of the object
by an lvalue expression that does not have character type, the
behavior is undefined. Such a representation is called
a trap representation."

Also, paragraph 7 of 6.5 Expressions:

"An object shall have its stored value accessed only by an lvalue
expression that has one of the following types:73)

— a type compatible with the effective type of the object,

— a qualified version of a type compatible with the effective type of
the object,

— a type that is the signed or unsigned type corresponding to the
effective type of the object,

— a type that is the signed or unsigned type corresponding to a
qualified version of the effective type of the object,

— an aggregate or union type that includes one of the aforementioned
types among its members (including, recursively, a member of a
subaggregate or contained union), or

— a character type."
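
Two of the bullets above, sketched as code. The function names are invented for illustration; this is one reading of the quoted paragraph, not an authoritative interpretation.

```c
#include <assert.h>
#include <stddef.h>

/* Last bullet: a character-type lvalue may access any object, so
 * reading the first byte of a float this way is permitted. */
unsigned char first_byte_of_float(const float *f)
{
    assert(f != NULL);
    return *(const unsigned char *)f;
}

/* Third bullet: unsigned int is the unsigned type corresponding to
 * int, so an int object may be read through an unsigned int lvalue. */
unsigned int int_as_unsigned(const int *i)
{
    assert(i != NULL);
    return *(const unsigned int *)i;
}

/* By contrast, reading a float object through an unsigned int lvalue,
 * as in *(const unsigned int *)f, matches none of the bullets. */
```

For non-negative int values, int_as_unsigned returns the same value, since the two types share a representation over that range.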

Recognition of this special dispensation for unsigned char actually
caused a change in the definition of the term "undefined behavior"
between C90 and the C99 draft N869 on the one hand, and the final C99
standard on the other.

C90: "3.16 undefined behavior: Behavior, upon use of a nonportable or
erroneous program construct, of erroneous data, or of indeterminately
valued objects, for which this International Standard imposes no
requirements"

N869: "3.18
1 undefined behavior
behavior, upon use of a nonportable or erroneous program construct, of
erroneous data, or of indeterminately valued objects, for which this
International Standard imposes no requirements"

ISO 9899:1999: "3.4.3
1 undefined behavior
behavior, upon use of a nonportable or erroneous program construct or
of erroneous data, for which this International Standard imposes no
requirements"

The phrase "or of indeterminately valued objects" was specifically
removed because accessing any object as a suitably sized array of
unsigned char is not undefined, as unsigned char has no trap
representations.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://c-faq.com/
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Jun 30 '06 #2
Jack Klein wrote:
> I disagree with Robert's assessment. They are both perfectly safe.


I agree with you about the second example, I was wrong. The C89
Standard had the following wording:

"With one exception, if a member of a union object is accessed after
a value has been stored in a different member of the object, the
behavior is implementation-defined."

It was widely accepted that the behavior was actually meant to be
undefined.

Section J.1 - "Unspecified Behavior" in the C99 Standard states:

"The value of a union member other than the last one stored into
(6.2.6.1)."

However, the associated verbiage in the C99 Standard has been removed;
I did not realize this at the time I wrote my original response. The
validity of such a construct is now determined by the aliasing rules,
under which the specific example is well-defined. (Note to self: stop
accepting statements made in Section J without further research).
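
A sketch along the lines of the second sample from the original post. The union tag uint_bytes and the function low_byte_index are invented names. It stores through the unsigned int member and then reads the unsigned char array member to find which byte is nonzero.

```c
#include <assert.h>
#include <stddef.h>

union uint_bytes {
    unsigned char carr[sizeof(unsigned int)];
    unsigned int ui;
};

/* Store 1 through the unsigned int member, then scan the unsigned
 * char member for the first nonzero byte. The index found depends
 * on the implementation's byte order. */
size_t low_byte_index(void)
{
    union uint_bytes u;
    size_t i = 0;

    u.ui = 1;
    while (i < sizeof u.carr && u.carr[i] == 0)
        i++;
    /* The value 1 must set at least one bit somewhere in the object. */
    assert(i < sizeof u.carr);
    return i;
}
```

On a typical little-endian implementation the result would be 0, and on a typical big-endian one sizeof(unsigned int) - 1. The standard does not fix the particular value.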

I still am not convinced about the first example though.
> Any area of memory at all that a program has a right to access
> (static, automatic, or allocated) may be read as an array of unsigned
> char.
>
> [snip]
>
> Also, paragraph 7 of 6.5 Expressions:
>
> "An object shall have its stored value accessed only by an lvalue
> expression that has one of the following types:73)
>
> - a type compatible with the effective type of the object,
>
> - a qualified version of a type compatible with the effective type of
> the object,
>
> - a type that is the signed or unsigned type corresponding to the
> effective type of the object,
>
> - a type that is the signed or unsigned type corresponding to a
> qualified version of the effective type of the object,
>
> - an aggregate or union type that includes one of the aforementioned
> types among its members (including, recursively, a member of a
> subaggregate or contained union), or
>
> - a character type."


The fact that the "a character type" appears *after* the statement
about aggregate or union types makes me skeptical as to whether this
section validates the first example.

Robert Gamble

Jul 1 '06 #3
"Robert Gamble" <rg*******@gmail.com> writes:
> I agree with you about the second example, I was wrong. The C89
> Standard had the following wording:
>
> "With one exception, if a member of a union object is accessed after
> a value has been stored in a different member of the object, the
> behavior is implementation-defined."
>
> It was widely accepted that the behavior was actually meant to be
> undefined.


How does this affect the oft seen habit of "field overlaying" where, for
example, a struct of 4 chars overlays an int in a union in order to give
byte access to certain bit groups of the int value? (see the union
bits32_tag example in Expert C Programming) Or am I misunderstanding or
reading out of context? Or is this habit non-standard and non portable?

--
Lint early. Lint often.
Jul 1 '06 #4
Richard G. Riley wrote:
> How does this affect the oft seen habit of "field overlaying" where, for
> example, a struct of 4 chars overlays an int in a union in order to give
> byte access to certain bit groups of the int value? (see the union
> bits32_tag example in Expert C Programming) Or am I misunderstanding or
> reading out of context? Or is this habit non-standard and non portable?

The technique you describe (and the example you cite) is very
unportable. Size assumptions aside, a structure, unlike an array,
carries no guarantee that there is no padding between its members.
The technique is undefined in C90 for the reasons cited above, which
is why it is often advised to use a pointer to unsigned char to
examine the contents instead. If the struct contained unsigned chars
instead of chars, and it were guaranteed that there was no padding
between the members, then (ignoring size assumptions again) this
technique might be safe in C99. But it doesn't, and there is no such
guarantee, so it is better to use either the pointer to unsigned char
technique, which works equally well in C90 and C99, or a union that
maps an array of sizeof(type) unsigned chars onto type, which is safe
in C99 but undefined in C90.
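
The two recommended techniques, side by side, plus a padding check for the struct overlay. All names here (byte_via_pointer, uint_overlay, byte_via_union, four_bytes, four_bytes_is_unpadded) are hypothetical. The struct is only a stand-in for the kind of overlay asked about, not the example from the book.

```c
#include <assert.h>
#include <stddef.h>

/* Pointer technique: examine an unsigned int through a pointer to
 * unsigned char. */
unsigned char byte_via_pointer(const unsigned int *value, size_t index)
{
    const unsigned char *p = (const unsigned char *)value;

    assert(value != NULL && index < sizeof *value);
    return p[index];
}

/* Union technique: an array of sizeof(type) unsigned chars mapped
 * onto the type. */
union uint_overlay {
    unsigned int value;
    unsigned char bytes[sizeof(unsigned int)];
};

unsigned char byte_via_union(unsigned int value, size_t index)
{
    union uint_overlay u;

    assert(index < sizeof u.bytes);
    u.value = value;
    return u.bytes[index];
}

/* A struct overlay has no layout guarantee, so check it explicitly. */
struct four_bytes {
    unsigned char b0, b1, b2, b3;
};

/* Returns 1 if this implementation lays the struct out as exactly
 * four consecutive bytes with no padding, 0 otherwise. */
int four_bytes_is_unpadded(void)
{
    return sizeof(struct four_bytes) == 4
        && offsetof(struct four_bytes, b3) == 3;
}
```

Both accessors should report the same byte at every index. A struct overlay only lines up with them where four_bytes_is_unpadded() returns 1 and unsigned int happens to be four bytes wide.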

Robert Gamble

Jul 9 '06 #5
