Unions Redux

Ok, we've had two long and haphazard threads about unions recently,
and I still don't feel any closer to certainty about what is permitted
and what isn't. The other thread topics were "Real Life Unions"
and "union {unsigned char u[10]; ...} ".

Here's a concrete example:

#include <stdio.h>

int main(void)
{
union { int s; unsigned int us; } u;

u.us = 50;
printf("%d\n", u.s);
return 0;
}

Is this program well-defined (printing 50), implementation-defined, or
UB ?

Note that the aliasing rules in C99 6.5 are not violated here -- it is
not forbidden under that section to access an object of some type T
with an lvalue expression whose type is the signed or unsigned
version of T.

In other words, is there anything other than the aliasing rules that
restrict 'free' use of unions?

Mar 14 '07 #1
Old Wolf <ol*****@inspire.net.nz> wrote:
> [ snip ]
> union { int s; unsigned int us; } u;
> u.us = 50;
> printf("%d\n", u.s);

I was looking at N1124. Annex J lists among the unspecified
behaviors,

-- The value of a union member other than the last one stored
into (6.2.6.1).

Annex J is informative, not normative, but it still makes sense to
look at section 6.2.6.1 to see why the behavior is unspecified.
There we see that u.s and u.us have object representations as
sequences of unsigned char (paragraphs 2,4), but that some object
representations may be trap representations leading to undefined
behavior when you access u.s (par. 5). Further down, 6.2.6.2(1-2)
leave open the possibility that int has more trap representations
than unsigned int, for example if there are padding bits or if ints
are sign-magnitude and M<N-1 in 6.2.6.2(2).

So it seems that your code has undefined (not just unspecified)
behavior, by 6.2.6.1(5).

If you want to read your 50 as a signed int, you need to convert
the bits, e.g. by assignment, u.s= u.us; on a really wicked machine,
that assignment need not be a no-op !
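
A minimal sketch of the distinction between converting the value and reinterpreting the stored bytes. The helper names `to_signed_by_value` and `to_signed_by_union` are hypothetical, introduced only for illustration; whether the union read is safe for every input is exactly what this thread disputes.

```c
/* Hypothetical helper: convert by value. Well-defined whenever
   us <= INT_MAX; the compiler emits whatever conversion is needed. */
int to_signed_by_value(unsigned int us)
{
    return (int)us;
}

/* Hypothetical helper: reinterpret the stored bytes through a union.
   The result depends on the object representation of int. */
int to_signed_by_union(unsigned int us)
{
    union { int s; unsigned int us; } u;
    u.us = us;
    return u.s;
}
```

For a small non-negative value such as 50, both helpers should agree on mainstream implementations; they can only differ where the representations of int and unsigned int differ.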

My argument doesn't work the other way. 6.2.6.2(5) says:

A valid (non-trap) object representation of a signed integer
type where the sign bit is zero is a valid object representation
of the corresponding unsigned type, and shall represent
the same value.

so I haven't ruled out
u.s = 50;
printf("%u\n", u.us);

I can't see anything else in 6.2.6.1 that could rule it out, either.
Paragraph 6.2.6.1(7) applies, but I'm pretty sure that sizeof(int)==
sizeof(unsigned int) by 6.2.6.2 so there are no leftover bytes
for (7) to ... bite (D'oh).

> Note that the aliasing rules in C99 6.5 are not violated here -- it is
> not forbidden under that section to access an object of some type T
> with an lvalue expression whose type is the signed or unsigned
> version of T.

I don't think that's relevant. It says that the compiler must
assume, for optimization purposes, that u.s and u.us are potentially
aliased (well, duh). For example,

memcpy(buf, &u.us, sizeof(u.us)); /* unsigned char buf[BIG] */
do_something((const unsigned char *)buf);
u.s= 50;
memcpy(buf, &u.us, sizeof(u.us)); /* can't optimize out */

> In other words, is there anything other than the aliasing rules that
> restrict 'free' use of unions?

As I said, I don't think the aliasing rules matter. Your example
amounts to a C++ reinterpret_cast<int> and can hit a trap
representation.
--
pa at panix dot com
Mar 15 '07 #2
On 14 Mar 2007 15:10:44 -0700, "Old Wolf" <ol*****@inspire.net.nz>
wrote in comp.lang.c:
> Ok, we've had two long and haphazard threads about unions recently,
> and I still don't feel any closer to certainty about what is permitted
> and what isn't. The other thread topics were "Real Life Unions"
> and "union {unsigned char u[10]; ...} ".

Most of the rambling was caused by the original OP, I think, rather
than the material. I am not criticizing, just observing.

> Here's a concrete example:
>
> #include <stdio.h>
>
> int main(void)
> {
> union { int s; unsigned int us; } u;
>
> u.us = 50;
> printf("%d\n", u.s);
> return 0;
> }
>
> Is this program well-defined (printing 50), implementation-defined, or
> UB ?

The program is well-defined, I'll elaborate further down.

> Note that the aliasing rules in C99 6.5 are not violated here -- it is
> not forbidden under that section to access an object of some type T
> with an lvalue expression whose type is the signed or unsigned
> version of T.

As you pointed out, it does not violate the alias rules, but there are
other rules to consider. In this particular case, you are accessing
an object of type unsigned int with an lvalue expression of type
signed int. The entire question of the validity of the operation
depends on the object representation compatibility, and has nothing at
all to do with the fact that there is a union involved.

In this particular case, the operation is well-defined because of the
standard's guarantees about corresponding signed and unsigned integer
types. For a positive value within the range of both types, the bit
representation is identical for both.

However, if you had assigned INT_MAX + 1 to u.us, and your
implementation is one of the universal ones where UINT_MAX > INT_MAX,
the behavior would be implementation-defined because, at least
theoretically, (unsigned)INT_MAX + 1 could contain a bit pattern that
is a trap representation for signed int.
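
A rough sketch of that boundary case. The function name `read_int_max_plus_one` is hypothetical, and the code assumes an implementation where int has no padding bits and the resulting bit pattern is not a trap representation; on such a two's complement machine the read typically yields INT_MIN, but nothing here should be taken as guaranteed by the Standard.

```c
#include <limits.h>

/* Reads back, as int, the bytes of an unsigned int holding INT_MAX + 1.
   Assumption: int has no padding bits and this bit pattern is not a trap
   representation; otherwise the read is not safe. */
int read_int_max_plus_one(void)
{
    union { int s; unsigned int us; } u;
    u.us = (unsigned int)INT_MAX + 1u;
    return u.s;
}
```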
> In other words, is there anything other than the aliasing rules that
> restrict 'free' use of unions?

Personally, I think people get too wound up in the mystical and magical
properties of unions.

They are good for two things:

1. Space saving, such as a struct containing a data type specifier
and a union of all the possible data types. This is a frequent
feature in message passing systems. Generally, type punning is not
used here.

2. Another way to do type punning.
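
For the first use, a minimal sketch of a tagged union of the kind a message-passing system might carry. The names `msg_kind`, `message`, and `message_as_double` are hypothetical, chosen only for illustration; the point is that the tag records which member was last stored, so no type punning occurs.

```c
/* Hypothetical message type: the tag records which member is live. */
enum msg_kind { MSG_INT, MSG_DOUBLE, MSG_TEXT };

struct message {
    enum msg_kind kind;
    union {
        int    i;
        double d;
        char   text[16];
    } body;
};

/* Reads only the member that the tag says was last stored. */
double message_as_double(const struct message *m)
{
    switch (m->kind) {
    case MSG_INT:    return (double)m->body.i;
    case MSG_DOUBLE: return m->body.d;
    default:         return 0.0;   /* text carries no numeric value here */
    }
}
```

The struct is only as large as its tag plus its largest member (plus any padding), which is the space saving in question.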

Consider:

int test_lone(long l)
{
int *ip = (int *)&l;
int i = *ip;
return i==l;
}

Is this code undefined, implementation-defined, or unspecified?

Technically it is undefined, but on an implementation like today's
typical desktop, where int and long have the same representation and
alignment, the result will be that the function returns 1. On an
implementation where int and long are different sizes, who knows.

Now consider:

int test_long(long l)
{
union { long ll; int ii; } li;
li.ll = l;
return li.ll==li.ii;
}

Is this code undefined? Well, yes, but it is no different in
functionality than the first function. If int and long have the same
representation, it will return 1.

There is no difference in aliasing in a union than there is via
pointer casting.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://c-faq.com/
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Mar 15 '07 #3
On Mar 14, 6:10 pm, "Old Wolf" <oldw...@inspire.net.nz> wrote:
> Ok, we've had two long and haphazard threads about unions recently,
> and I still don't feel any closer to certainty about what is permitted
> and what isn't. The other thread topics were "Real Life Unions"
> and "union {unsigned char u[10]; ...} ".
>
> Here's a concrete example:
>
> #include <stdio.h>
>
> int main(void)
> {
> union { int s; unsigned int us; } u;
>
> u.us = 50;
> printf("%d\n", u.s);
> return 0;
> }
>
> Is this program well-defined (printing 50), implementation-defined, or
> UB ?
>
> Note that the aliasing rules in C99 6.5 are not violated here -- it is
> not forbidden under that section to access an object of some type T
> with an lvalue expression whose type is the signed or unsigned
> version of T.
>
> In other words, is there anything other than the aliasing rules that
> restrict 'free' use of unions?

C89: undefined.
It is undefined because the member of the union accessed is not the
member last stored, this is explicitly stated in the Standard.

C99: well-defined.
In C99 the aliasing rules and representation requirements determine
the legitimacy in this case. As you noted, your example does not
violate the aliasing rules so you are good there. The Standard
explicitly states that the set of non-negative values common to any
given signed integer type and its corresponding unsigned type have the
same representations in both types so you are good there as well.
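
A small sketch of how one might check the representation claim on a given implementation. `same_representation` is a hypothetical helper, and the byte comparison assumes int has no padding bits whose contents could differ between the two objects.

```c
#include <string.h>

/* Returns 1 if the int and unsigned int objects holding the same
   non-negative value have byte-for-byte identical representations. */
int same_representation(int value)
{
    int s = value;
    unsigned int us = (unsigned int)value;
    return memcmp(&s, &us, sizeof s) == 0;
}
```

Calling it with a non-negative argument such as 50 is the case the cited text covers; a negative argument falls outside that guarantee.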

Robert Gamble

Mar 15 '07 #4
On Mar 15, 4:05 pm, Jack Klein <jackkl...@spamcop.net> wrote:
> There is no difference in aliasing in a union than there is via
> pointer casting.

Thanks for finally putting the issue to rest! I'm glad it
really is that simple.

Mar 15 '07 #5
Robert Gamble wrote:
> On Mar 14, 6:10 pm, "Old Wolf" <oldw...@inspire.net.nz> wrote:
>> Ok, we've had two long and haphazard threads about unions recently,
>> and I still don't feel any closer to certainty about what is permitted
>> and what isn't. The other thread topics were "Real Life Unions"
>> and "union {unsigned char u[10]; ...} ".
>>
>> Here's a concrete example:
>>
>> #include <stdio.h>
>>
>> int main(void)
>> {
>> union { int s; unsigned int us; } u;
>>
>> u.us = 50;
>> printf("%d\n", u.s);
>> return 0;
>> }
>>
>> Is this program well-defined (printing 50), implementation-defined, or
>> UB ?
>>
>> Note that the aliasing rules in C99 6.5 are not violated here -- it is
>> not forbidden under that section to access an object of some type T
>> with an lvalue expression whose type is the signed or unsigned
>> version of T.
>>
>> In other words, is there anything other than the aliasing rules that
>> restrict 'free' use of unions?
>
> C89: undefined.
> It is undefined because the member of the union accessed is not the
> member last stored, this is explicitly stated in the Standard.

Where? 6.3.2.3 says it's implementation-defined (ANSI standard if
it matters).

Yevgen
Mar 15 '07 #6
On Mar 14, 10:32 pm, p...@see.signature.invalid (Pierre Asselin)
wrote:
> Old Wolf <oldw...@inspire.net.nz> wrote:
>> [ snip ]
>> union { int s; unsigned int us; } u;
>> u.us = 50;
>> printf("%d\n", u.s);
>
> I was looking at N1124. Annex J lists among the unspecified
> behaviors,
>
> -- The value of a union member other than the last one stored
> into (6.2.6.1).
>
> Annex J is informative, not normative,

Indeed it is not and in this case it is simply erroneous. The
sentence you cited is leftover from a previous version of the
Standard, there is no supporting text in the Standard proper.

> but it still makes sense to
> look at section 6.2.6.1 to see why the behavior is unspecified.

You are making the false assumption that 6.2.6.1 actually contains
verbiage to this effect, it doesn't.

> There we see that u.s and u.us have object representations as
> sequences of unsigned char (paragraphs 2,4), but that some object
> representations may be trap representations leading to undefined
> behavior when you access u.s (par. 5). Further down, 6.2.6.2(1-2)
> leave open the possibility that int has more trap representations
> than unsigned int, for example if there are padding bits or if ints
> are sign-magnitude and M<N-1 in 6.2.6.2(2).
>
> So it seems that your code has undefined (not just unspecified)
> behavior, by 6.2.6.1(5).

I have no idea how you were able to come to that conclusion from
anything you have cited so far. You appear to have come to a
premature conclusion and then tried, unconvincingly, to make the
evidence fit that conclusion.

> If you want to read your 50 as a signed int, you need to convert
> the bits, e.g. by assignment, u.s= u.us; on a really wicked machine,
> that assignment need not be a no-op !
>
> My argument doesn't work the other way. 6.2.6.2(5) says:
>
> A valid (non-trap) object representation of a signed integer
> type where the sign bit is zero is a valid object representation
> of the corresponding unsigned type, and shall represent
> the same value.

The opposite is also true, 6.2.5p9:

"The range of nonnegative values of a signed integer type is a
subrange of the corresponding unsigned integer type, and the
representation of the same value in each type is the same."

> so I haven't ruled out
> u.s = 50;
> printf("%u\n", u.us);
>
> I can't see anything else in 6.2.6.1 that could rule it out, either.
> Paragraph 6.2.6.1(7) applies, but I'm pretty sure that sizeof(int)==
> sizeof(unsigned int) by 6.2.6.2 so there are no leftover bytes
> for (7) to ... bite (D'oh).
>
>> Note that the aliasing rules in C99 6.5 are not violated here -- it is
>> not forbidden under that section to access an object of some type T
>> with an lvalue expression whose type is the signed or unsigned
>> version of T.
>
> I don't think that's relevant. It says that the compiler must
> assume, for optimization purposes, that u.s and u.us are potentially
> aliased (well, duh). For example,
>
> memcpy(buf, &u.us, sizeof(u.us)); /* unsigned char buf[BIG] */
> do_something((const unsigned char *)buf);
> u.s= 50;
> memcpy(buf, &u.us, sizeof(u.us)); /* can't optimize out */
>
>> In other words, is there anything other than the aliasing rules that
>> restrict 'free' use of unions?
>
> As I said, I don't think the aliasing rules matter. Your example
> amounts to a C++ reinterpret_cast<int> and can hit a trap
> representation.

I don't know what C++ has to do with this.

Robert Gamble

Mar 15 '07 #7
Jack Klein wrote:
> On 14 Mar 2007 15:10:44 -0700, "Old Wolf" <ol*****@inspire.net.nz>
> wrote in comp.lang.c:
>> Ok, we've had two long and haphazard threads about unions recently,
>> and I still don't feel any closer to certainty about what is permitted
>> and what isn't. The other thread topics were "Real Life Unions"
>> and "union {unsigned char u[10]; ...} ".
>
> Most of the rambling was caused by the original OP, I think, rather
> than the material. I am not criticizing, just observing.

[snip]

> There is no difference in aliasing in a union than there is via
> pointer casting.

Sorry if it's something obvious or stupid, but please consider this
(no pointers involved).
Suppose double is eight bytes big, int is four bytes, there are
no padding bits in int.

/* (1) get bits from a double and see what happens */
double a = 3.14; unsigned int b;
memcpy(&b, &a, sizeof b);
printf("%u", b);

/* (2) do same thing using a union */
union U {double a; unsigned int b;} u;
u.a = 3.14;
printf("%u", u.b);

/* (3) initialize union with memcpy and access its member */
double d;
union U {double a; unsigned int b;} u;
d = 3.14;
memcpy(&u, &d, sizeof d);
printf("%u", u.b);

Which of the three are valid? I think (1) is; (3) maybe; (2) maybe, if
(3) is valid and the aliasing rules don't apply here. If the aliasing
rules do apply to (2), then how is the first assignment in (2) different
from the memcpy() in (3)?
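
To make the comparison concrete, here is a sketch of variants (1) and (2) as functions. The names `bits_via_memcpy` and `bits_via_union` are hypothetical, and the code assumes, as stated above, that unsigned int has no padding bits and is no larger than double. It does not settle which variant the Standard blesses; it only lets you compare what an implementation does.

```c
#include <string.h>

/* Variant (1): copy the leading bytes of a double into an unsigned int. */
unsigned int bits_via_memcpy(double a)
{
    unsigned int b;
    memcpy(&b, &a, sizeof b);   /* assumes sizeof b <= sizeof a */
    return b;
}

/* Variant (2): store the double member, read the unsigned int member. */
unsigned int bits_via_union(double a)
{
    union { double a; unsigned int b; } u;
    u.a = a;
    return u.b;
}
```

On implementations where both members start at the same address and the union read is supported, the two functions would be expected to return the same value for the same argument.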

It's in fact the same question as in the other post by OP, about
aliasing. Perhaps all such magic is allowed and that's it. Perhaps
I'm just stupid that I can't understand these simple things.

Yevgen
Mar 15 '07 #8
On Mar 15, 1:20 am, Yevgen Muntyan <muntyan.removet...@tamu.edu>
wrote:
> Robert Gamble wrote:
>> On Mar 14, 6:10 pm, "Old Wolf" <oldw...@inspire.net.nz> wrote:
>>> Ok, we've had two long and haphazard threads about unions recently,
>>> and I still don't feel any closer to certainty about what is permitted
>>> and what isn't. The other thread topics were "Real Life Unions"
>>> and "union {unsigned char u[10]; ...} ".
>>> Here's a concrete example:
>>> #include <stdio.h>
>>> int main(void)
>>> {
>>> union { int s; unsigned int us; } u;
>>> u.us = 50;
>>> printf("%d\n", u.s);
>>> return 0;
>>> }
>>> Is this program well-defined (printing 50), implementation-defined, or
>>> UB ?
>>> Note that the aliasing rules in C99 6.5 are not violated here -- it is
>>> not forbidden under that section to access an object of some type T
>>> with an lvalue expression whose type is the signed or unsigned
>>> version of T.
>>> In other words, is there anything other than the aliasing rules that
>>> restrict 'free' use of unions?
>> C89: undefined.
>> It is undefined because the member of the union accessed is not the
>> member last stored, this is explicitly stated in the Standard.
>
> Where? 6.3.2.3 says it's implementation-defined (ANSI standard if
> it matters).

It's a mistake in the Standard, it is supposed to be undefined.

Robert Gamble

Mar 15 '07 #9
Robert Gamble wrote:
> On Mar 15, 1:20 am, Yevgen Muntyan <muntyan.removet...@tamu.edu>
> wrote:
>> Robert Gamble wrote:
>>> On Mar 14, 6:10 pm, "Old Wolf" <oldw...@inspire.net.nz> wrote:
>>>> Ok, we've had two long and haphazard threads about unions recently,
>>>> and I still don't feel any closer to certainty about what is permitted
>>>> and what isn't. The other thread topics were "Real Life Unions"
>>>> and "union {unsigned char u[10]; ...} ".
>>>> Here's a concrete example:
>>>> #include <stdio.h>
>>>> int main(void)
>>>> {
>>>> union { int s; unsigned int us; } u;
>>>> u.us = 50;
>>>> printf("%d\n", u.s);
>>>> return 0;
>>>> }
>>>> Is this program well-defined (printing 50), implementation-defined, or
>>>> UB ?
>>>> Note that the aliasing rules in C99 6.5 are not violated here -- it is
>>>> not forbidden under that section to access an object of some type T
>>>> with an lvalue expression whose type is the signed or unsigned
>>>> version of T.
>>>> In other words, is there anything other than the aliasing rules that
>>>> restrict 'free' use of unions?
>>> C89: undefined.
>>> It is undefined because the member of the union accessed is not the
>>> member last stored, this is explicitly stated in the Standard.
>> Where? 6.3.2.3 says it's implementation-defined (ANSI standard if
>> it matters).
>
> It's a mistake in the Standard, it is supposed to be undefined.

So an infamous compiler that doesn't strive for C99 compliance (or a
Death Station version from 1991, for that matter) could make the
following explode?

union U {int a; unsigned char u[25];};
int main (void)
{
unsigned char c;
union U u;
u.a = 8;
c = u.u[0];
return 0;
}

Perhaps "do not touch a union member which wasn't previously set" is
not just a product of my paranoid mind?

DRs 257, 283, and 236, and the end of the thread named "Union arrangement"
in comp.lang.c, are interesting reading, by the way.
The C FAQ is more tolerant of this:

http://www.c-faq.com/struct/union.html
A union is essentially a structure in which all of the fields overlay
each other; you can only use one field at a time. (You can also cheat by
writing to one field and reading from another, to inspect a type's
bit patterns or interpret them differently, but that's obviously pretty
machine-dependent.)
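
A minimal sketch of the "inspect a type's bit patterns" use the FAQ mentions, reading an int through an unsigned char array member. The name `int_bytes` is hypothetical; the byte order you see depends on the machine, which is the FAQ's caveat.

```c
#include <stddef.h>

/* Copies the bytes of an int, read through the unsigned char member
   of a union, into out (which must hold at least sizeof(int) bytes). */
void int_bytes(int value, unsigned char *out)
{
    union { int a; unsigned char bytes[sizeof(int)]; } u;
    size_t i;
    u.a = value;
    for (i = 0; i < sizeof(int); i++)
        out[i] = u.bytes[i];
}
```

Because the bytes are read as unsigned char, there is no trap representation to worry about on the reading side; only the interpretation of the bytes is implementation-specific.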

Yevgen
Mar 15 '07 #10
