union {unsigned char u[10]; ...}

Hey,

Why is it legal to do

union U {unsigned char u[8]; int a;};
union U u;
u.a = 1;
u.u[0];

I tried to find it in the standard, but I only found that the
value of u.u here is unspecified. The standard implies that
if u.u above is legal, then u.u[0] will be the bits of the first byte
of u.a, and so on, so here we can treat u.u in the same way as
if we did

int a = 8;
unsigned char u[sizeof a];
memcpy(u, &a, sizeof a);

But I can't find the place which says that u.u in the first
example is indeed legal and that the value of u.u is the same as
the value of the union (which in turn is the bytes of u.a's
value, and so on).

Regards,
Yevgen
Mar 13 '07 #1
Yevgen Muntyan <mu****************@tamu.edu> writes:
> Why is it legal to do
>
> union U {unsigned char u[8]; int a;};
> union U u;
> u.a = 1;
> u.u[0];

See C99 section 6.5 "Expressions":

An object shall have its stored value accessed only by an
lvalue expression that has one of the following types:73)
[...]
- a character type.
--
"Some programming practices beg for errors;
this one is like calling an 800 number
and having errors delivered to your door."
--Steve McConnell
Mar 13 '07 #2
Ben Pfaff wrote:
> Yevgen Muntyan <mu****************@tamu.edu> writes:
>> Why is it legal to do
>>
>> union U {unsigned char u[8]; int a;};
>> union U u;
>> u.a = 1;
>> u.u[0];
>
> See C99 section 6.5 "Expressions":
>
> An object shall have its stored value accessed only by an
> lvalue expression that has one of the following types:73)
> [...]
> - a character type.

But character type is not a union. Moreover, it actually says

- an aggregate or union type that includes one of the aforementioned
types among its members (including, recursively, a member of a
subaggregate or contained union), or
- a character type.

i.e. character type is not in the list of types mentioned in
the above paragraph, about unions. Do I miss something here?
("aforementioned" means "listed above", right?)

Yevgen
Mar 13 '07 #3
Yevgen Muntyan wrote:
> Ben Pfaff wrote:
>> Yevgen Muntyan <mu****************@tamu.edu> writes:
>>> Why is it legal to do
>>>
>>> union U {unsigned char u[8]; int a;};
>>> union U u;
>>> u.a = 1;
>>> u.u[0];
>>
>> See C99 section 6.5 "Expressions":
>>
>> An object shall have its stored value accessed only by an
>> lvalue expression that has one of the following types:73)
>> [...]
>> - a character type.
>
> But character type is not a union. Moreover, it actually says
>
> - an aggregate or union type that includes one of the aforementioned
> types among its members (including, recursively, a member of a
> subaggregate or contained union), or
> - a character type.
>
> i.e. character type is not in the list of types mentioned in
> the above paragraph, about unions. Do I miss something here?
> ("aforementioned" means "listed above", right?)

Moreover, this paragraph is actually irrelevant. It says
you can do something like

int func (int *i);
union U {int a; double b;};
union U u;
u.a = 2;
func ((int*)&u);

but it doesn't let you do

union U u;
u.b = 2;
func ((int*)&u);

Same thing for a character type, even if it were in the list:

you can have a character array in the union, and *if* you
set this member to the representation of some double,
then you can pass the union around as if it were a double. But
it's not clear at all whether you can set the double member and
then use the union as if you had set the character array member.

Yevgen
Mar 13 '07 #4
Yevgen Muntyan <mu****************@tamu.edu> writes:
> Ben Pfaff wrote:
>> Yevgen Muntyan <mu****************@tamu.edu> writes:
>>> Why is it legal to do
>>>
>>> union U {unsigned char u[8]; int a;};
>>> union U u;
>>> u.a = 1;
>>> u.u[0];
>>
>> See C99 section 6.5 "Expressions":
>>
>> An object shall have its stored value accessed only by an
>> lvalue expression that has one of the following types:73)
>> [...]
>> - a character type.
>
> But character type is not a union.

You're accessing an object of type "int" through an object of
character type. The fact that the "int" is inside a union is
immaterial.

> Moreover, it actually says
>
> - an aggregate or union type that includes one of the aforementioned
> types among its members (including, recursively, a member of a
> subaggregate or contained union), or

This gives permission for a different class of accesses, one that
in this case we're not interested in.
--
char a[]="\n .CJacehknorstu";int putchar(int);int main(void){unsigned long b[]
={0x67dffdff,0x9aa9aa6a,0xa77ffda9,0x7da6aa6a,0xa67f6aaa,0xaa9aa9f6,0x11f6},*p
=b,i=24;for(;p+=!*p;*p/=4)switch(0[p]&3)case 0:{return 0;for(p--;i--;i--)case+
2:{i++;if(i)break;else default:continue;if(0)case 1:putchar(a[i&15]);break;}}}
Mar 13 '07 #5
Yevgen Muntyan <mu****************@tamu.edu> writes:
> Ben Pfaff wrote:
>> Yevgen Muntyan <mu****************@tamu.edu> writes:
>>> Why is it legal to do
>>>
>>> union U {unsigned char u[8]; int a;};
>>> union U u;
>>> u.a = 1;
>>> u.u[0];
>>
>> See C99 section 6.5 "Expressions":
>> An object shall have its stored value accessed only by an
>> lvalue expression that has one of the following types:73)
>> [...]
>> - a character type.
>
> But character type is not a union.

[snip]

u.a is of type int. u.u[0] is of type unsigned char, a character type. The
code above accesses the stored value of the object u.a using an lvalue
expression, u.u[0], which is of character type, which satisfies 6.5.

Intuitively, the rules for unions imply that u.u[0] actually does
access the first byte of u.a. Rigorously proving this from the
wording of the standard may be trickier, and it's a larger task than
I'm willing to undertake at the moment.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Mar 13 '07 #6
Keith Thompson wrote:
> Yevgen Muntyan <mu****************@tamu.edu> writes:
>> Ben Pfaff wrote:
>>> Yevgen Muntyan <mu****************@tamu.edu> writes:
>>>> Why is it legal to do
>>>>
>>>> union U {unsigned char u[8]; int a;};
>>>> union U u;
>>>> u.a = 1;
>>>> u.u[0];
>>>
>>> See C99 section 6.5 "Expressions":
>>> An object shall have its stored value accessed only by an
>>> lvalue expression that has one of the following types:73)
>>> [...]
>>> - a character type.
>>
>> But character type is not a union.
>
> [snip]
>
> u.a is of type int. u.u[0] is of type unsigned char, a character type. The
> code above accesses the stored value of the object u.a using an lvalue
> expression, u.u[0], which is of character type, which satisfies 6.5.

I am not convinced. Consider this:

int a;
int b;
int *p = &b;
*(p - 1);

It accesses value of a using an lvalue of type int. The problem is
of course that *(p-1) is illegal. Same thing with that union: why
is u.u access allowed, and why is the value of u.u the same as if you
actually set it, using u.u[0] = 8?

> Intuitively, the rules for unions imply that u.u[0] actually does
> access the first byte of u.a. Rigorously proving this from the
> wording of the standard may be trickier, and it's a larger task than
> I'm willing to undertake at the moment.

No, it's easy once you know that bytes of u.u value are the same as
bytes of value of u.

Yevgen
Mar 14 '07 #7
Ben Pfaff wrote:
> Yevgen Muntyan <mu****************@tamu.edu> writes:
>> Ben Pfaff wrote:
>>> Yevgen Muntyan <mu****************@tamu.edu> writes:
>>>> Why is it legal to do
>>>>
>>>> union U {unsigned char u[8]; int a;};
>>>> union U u;
>>>> u.a = 1;
>>>> u.u[0];
>>>
>>> See C99 section 6.5 "Expressions":
>>>
>>> An object shall have its stored value accessed only by an
>>> lvalue expression that has one of the following types:73)
>>> [...]
>>> - a character type.
>>
>> But character type is not a union.
>
> You're accessing an object of type "int" through an object of
> character type.

Only when I have this object of character type. The very question
is why "u.u" yields an object of character type after u.a = 1;
(i.e. why it's not UB or constraint violation or something), and
if u.u is indeed allowed then why bytes in u.u value will be the
same as in u.a (first bytes, of course, ignoring sizes and padding
and whatnot).

I guess my question is actually this:

union U {int a; float b;};
u.a = something;

Is 'u.b' allowed here given that the bit representation of
u.a is a bit representation of a float object, and is u.b
value the same as if we did

int a = something;
float b;
memcpy (&b, &a, 4);

assuming 4 bytes int and float.

Best regards,
Yevgen
Mar 14 '07 #8
Yevgen Muntyan <mu****************@tamu.edu> writes:
> Consider this:
>
> int a;
> int b;
> int *p = &b;
> *(p - 1);
>
> It accesses value of a using an lvalue of type int.

No, it doesn't. It yields undefined behavior. And real
compilers are likely to put "a" into a register here, defeating
this idea in practice.
--
"I've been on the wagon now for more than a decade. Not a single goto
in all that time. I just don't need them any more. I don't even use
break or continue now, except on social occasions of course. And I
don't get carried away." --Richard Heathfield
Mar 14 '07 #9
Yevgen Muntyan <mu****************@tamu.edu> writes:
> Ben Pfaff wrote:
>> Yevgen Muntyan <mu****************@tamu.edu> writes:
>>> Ben Pfaff wrote:
>>>> Yevgen Muntyan <mu****************@tamu.edu> writes:
>>>>> Why is it legal to do
>>>>>
>>>>> union U {unsigned char u[8]; int a;};
>>>>> union U u;
>>>>> u.a = 1;
>>>>> u.u[0];
>>>>
>>>> See C99 section 6.5 "Expressions":
>>>>
>>>> An object shall have its stored value accessed only by an
>>>> lvalue expression that has one of the following types:73)
>>>> [...]
>>>> - a character type.
>>>
>>> But character type is not a union.
>>
>> You're accessing an object of type "int" through an object of
>> character type.
>
> Only when I have this object of character type. The very question
> is why "u.u" yields an object of character type after u.a = 1;
> (i.e. why it's not UB or constraint violation or something), and
> if u.u is indeed allowed then why bytes in u.u value will be the
> same as in u.a (first bytes, of course, ignoring sizes and padding
> and whatnot).

I don't really understand this question. The standard has
wording that says that u.a and u.u are at the same address, and
it has wording that says that any object may be accessed through
an lvalue of character type[*]. Put the two together, and it's
allowed.

[*] It's best to use an unsigned character type: signed character
types can have trap representations.

> I guess my question is actually this:
>
> union U {int a; float b;};
> u.a = something;
>
> Is 'u.b' allowed here given that the bit representation of
> u.a is a bit representation of a float object,

No. There's a special dispensation in C99 6.5 (which we've
discussed) which allows accessing any object as an array of
characters. There's no such dispensation for aliasing an int and
a float.

> and is u.b value the same as if we did
>
> int a = something;
> float b;
> memcpy (&b, &a, 4);
>
> assuming 4 bytes int and float.

No, that's a different situation: memcpy accesses objects as
arrays of characters. Thus, you can use it to do this sort of
thing and then access "b" as a float, given some additional
provisos (e.g. the bits in "a" are not a trap representation when
interpreted as float, "float" is 4 bytes long, "int" is at least
4 bytes long, ...)
--
int main(void){char p[]="ABCDEFGHIJKLM NOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz.\
\n",*q="kl BIcNBFr.NKEzjwCIxNJC";int i=sizeof p/2;char *strchr();int putchar(\
);while(*q){i+=strchr(p,*q++)-p;if(i>=(int)sizeof p)i-=sizeof p-1;putchar(p[i]\
);}return 0;}
Mar 14 '07 #10
