byte alignment in structures and unions

Hi!

I want to assign the number 129 (binary 10000001) to the MSB (most
significant byte) of a 4-byte long and leave the other lower bytes
intact!

-working on normal pentium (...endian)

I want to do it with code that does NOT use shifts (<<) , bit-
operations (| &) !!
So the compiler will have to do the work and I'll introduce
appropriate structs and unions.
#include <stdio.h>

struct each_of_four {
unsigned char byte0;
unsigned char byte1;
unsigned char byte2;
unsigned char byte3;
}
/*__attribute__ ((packed))*/
;

union align_long_and_each_of_four {
long dummy; /* 4 bytes */
struct each_of_four four;
}
/*__attribute__ ((packed))*/
;
int main(void)
{
long val; // 4 bytes

/****************** TEST A: COMPILER ERROR - WHY?
*********************/
((union align_long_and_each_of_four) val).four.byte3 = (unsigned
char) 129;


#define FUNNY_NUMBER ((union align_long_and_each_of_four) \
(const long) ((129<<24) | (val & 16777215))).four.byte3
// 16777215 = 2^24-1

printf("test FUNNY_NUMBER: %d\n", FUNNY_NUMBER);

/****************** TEST B: COMPILER ERROR - WHY?
*********************/
((union align_long_and_each_of_four) val).four.byte3 = FUNNY_NUMBER;

return 0;
}
Compiler error report--->
test_align.c:25: error: invalid lvalue in assignment
test_align.c:39: error: invalid lvalue in assignment

How can this be fixed??

Thanks
anon.asdf

Aug 9 '07 #1
17 Replies


/****************** TEST A: COMPILER ERROR - WHY?
*********************/
((union align_long_and_each_of_four) val).four.byte3 = (unsigned
char) 129;

The really interesting thing here is that the following code DOES work!

{
union align_long_and_each_of_four tmp;

tmp.four.byte3 = (unsigned char) 129;
}

But still - how can the compiler error in TEST A be fixed??

Aug 9 '07 #2
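
The observation above fits the language rules: a named union object is an lvalue, while the result of a cast is not. A minimal sketch of that approach follows. It assumes a 32-bit value, so `uint32_t` stands in for the thread's 4-byte `long` (on many 64-bit systems `long` is 8 bytes). The names `word_bytes` and `set_byte3_via_union` are hypothetical and do not appear in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Overlay of a 32-bit word and its four bytes, in address order. */
union word_bytes {
    uint32_t word;
    unsigned char byte[4];
};

/* Copy val into a named union object, overwrite the byte at the
   highest address, and read the word back. Which byte of the value
   that is depends on the machine's byte order. */
uint32_t set_byte3_via_union(uint32_t val, unsigned char b)
{
    union word_bytes tmp;

    /* The sketch assumes the two members overlay exactly. */
    assert(sizeof tmp.word == sizeof tmp.byte);

    tmp.word = val;    /* tmp is a real object, so its members are lvalues */
    tmp.byte[3] = b;
    return tmp.word;
}
```

On a little-endian machine, `set_byte3_via_union(0x12345678u, 129)` gives `0x81345678`; on a big-endian machine the same call changes the least significant byte instead.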

an*******@gmail.com wrote On 08/09/07 13:38,:
> /****************** TEST A: COMPILER ERROR - WHY?
> *********************/
> ((union align_long_and_each_of_four) val).four.byte3 = (unsigned
> char) 129;
>
>
> #define FUNNY_NUMBER ((union align_long_and_each_of_four) \
> (const long) ((129<<24) | (val & 16777215))).four.byte3
> // 16777215 = 2^24-1
>
> printf("test FUNNY_NUMBER: %d\n", FUNNY_NUMBER);
>
> /****************** TEST B: COMPILER ERROR - WHY?
> *********************/
> ((union align_long_and_each_of_four) val).four.byte3 = FUNNY_NUMBER;

Because you cannot cast to or from a union (or struct)
type: They are not "scalar types" (6.5.4p2). Keep in mind
that a cast is an operator that converts a value, not a
magical "let's pretend" construct. And in any case, the
value produced by a cast operator has the same status as
a value produced by (for example) a unary minus operator:
You cannot write `-x = 42', either.

> return 0;
> }
> Compiler error report--->
> test_align.c:25: error: invalid lvalue in assignment
> test_align.c:39: error: invalid lvalue in assignment
>
> How can this be fixed??

One way is

((unsigned char*)&val)[3] = 129;

Of course, this fails miserably if `val' is not four bytes
long with the MSB in the fourth position. A better way is

val = (val & 0xffffffUL) | (129UL << 24);

(Yes, I know you said you didn't want to use shifts or
bitwise operators. Tough: It's a better way anyhow.)

A final thought: *Every* solution has the problem that
it makes non-portable assumptions about what happens to
the value of `val' when you reach in and hammer one of its
bytes. When you do so, you have left the guarantees of the
C language behind, and will need to make your way in
uncharted territory without their protection. Things would
be somewhat better with `unsigned long', but ...

--
Er*********@sun.com
Aug 9 '07 #3
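
The two approaches suggested in this reply can be put side by side. This is only a sketch under stated assumptions: it uses `uint32_t` in place of the 4-byte `long` (following the hint that an unsigned type behaves better), and the function names `set_msb_by_mask` and `set_byte_at` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Shift-and-mask: clear bits 24..31, then OR in the new byte.
   This works on the value, so byte order does not matter. */
uint32_t set_msb_by_mask(uint32_t val, unsigned char b)
{
    return (val & 0x00FFFFFFu) | ((uint32_t)b << 24);
}

/* Byte access: overwrite the byte at a given address offset.
   Which part of the value changes depends on byte order. */
uint32_t set_byte_at(uint32_t val, unsigned index, unsigned char b)
{
    assert(index < sizeof val);    /* stay inside the object */
    ((unsigned char *)&val)[index] = b;
    return val;
}
```

`set_msb_by_mask(0x12345678u, 129)` is `0x81345678` on any machine, whereas `set_byte_at(0x12345678u, 3, 129)` matches it only where the byte at offset 3 is the most significant one.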

On Thu, 09 Aug 2007 17:38:22 +0000, anon.asdf wrote:
> Hi!
>
> I want to assign the number 129 (binary 10000001) to the MSB (most
> significant byte) of a 4-byte long and leave the other lower bytes
> intact!
>
> -working on normal pentium (...endian)
>
> I want to do it with code that does NOT use shifts (<<) , bit-
> operations (| &) !!

l %= 0x01000000;
l += 129 * 0x01000000;
This works regardless of endianness.

> So the compiler will have to do the work and I'll introduce
> appropriate structs and unions.
> #include <stdio.h>
>
> struct each_of_four {
> unsigned char byte0;
> unsigned char byte1;
> unsigned char byte2;
> unsigned char byte3;
> }
> /*__attribute__ ((packed))*/

What was wrong with unsigned char bytes[4], which causes the same
thing that the stuff you commented out would do somewhere, but in
standard C?

> ;
> union align_long_and_each_of_four {
> long dummy; /* 4 bytes */
> struct each_of_four four;
> }
> /*__attribute__ ((packed))*/
> ;

What was wrong with { long dummy; unsigned char four[4]; }?

> int main(void)
> {
> long val; // 4 bytes
>
> /****************** TEST A: COMPILER ERROR - WHY?
> *********************/
> ((union align_long_and_each_of_four) val).four.byte3 = (unsigned
> char) 129;

Because the result of a cast isn't a lvalue.
Try
((unsigned char *)&val)[3] = 129;
No unions and no struct needed.

> #define FUNNY_NUMBER ((union align_long_and_each_of_four) \
> (const long) ((129<<24) | (val & 16777215))).four.byte3

Didn't you say you didn't want to use bitwise operations?
--
Army1987 (Replace "NOSPAM" with "email")
No-one ever won a game by resigning. -- S. Tartakower

Aug 9 '07 #4

On Thu, 09 Aug 2007 17:49:01 +0000, anon.asdf wrote:
> /****************** TEST A: COMPILER ERROR - WHY?
> *********************/
> ((union align_long_and_each_of_four) val).four.byte3 = (unsigned
> char) 129;
[snip]
> But still - how can the compiler error in TEST A be fixed??

If you *really* want to do that, try
(union align_long_and_each_of_four *)val->four.byte3 = 129;
But there are better ways to do that, see my other reply.
--
Army1987 (Replace "NOSPAM" with "email")
No-one ever won a game by resigning. -- S. Tartakower

Aug 9 '07 #5

On Aug 9, 8:40 pm, Eric Sosman <Eric.Sos...@sun.com> wrote:
> One way is
>
> ((unsigned char*)&val)[3] = 129;

Thank you for the insights!

((unsigned char*)&val)[3] = 129;
is elegant.

I wonder if the compiler resolves it (above) to the same shifts as
val = (val & 0xffffffUL) | (129UL << 24);
or utilizes some tighter optimization, if the architecture allows it.
??

> Things would
> be somewhat better with `unsigned long', but ...

How does `unsigned long' change the situation?

-anon.asdf

Aug 9 '07 #6

On Aug 9, 9:07 pm, Army1987 <army1...@NOSPAM.it> wrote:
> l %= 0x01000000;
> l += 129 * 0x01000000;
> This works regardless of endianness.

Unfortunately this does not work! Try

{
long val = (129<<24) + 1;
val %= 0x01000000;
val += 129 * 0x01000000;
printf("%ld\n", val); // get -2147483647, but should be -2130706431
}

> What was wrong with { long dummy; unsigned char four[4]; }?

Nothing. I could use:
((unsigned char *)&dummy)[3] = four[3];

> Try
> ((unsigned char *)&val)[3] = 129;
> No unions and no struct needed.

Yes - that's perfect!

>> #define FUNNY_NUMBER ((union align_long_and_each_of_four) \
>> (const long) ((129<<24) | (val & 16777215))).four.byte3
>
> Didn't you say you didn't want to use bitwise operations?

True.
But I'm hoping the compiler will resolve it to a constant, so the
shifts are only in the c-code, but not in the machine code.

Thanks for your comments!

-anon.asdf

Aug 9 '07 #7
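
The result reported in this post can be traced to how `%` treats a negative left operand: since C99 the remainder takes the sign of the dividend. The sketch below isolates that step. It assumes a 32-bit two's complement value, modelled with `int32_t` and `uint32_t` instead of the thread's `long`, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Signed remainder modulo 2^24. Since C99 the quotient truncates
   toward zero, so a negative val gives a negative (or zero) result. */
int32_t low_part_signed(int32_t val)
{
    return val % 0x01000000;
}

/* The same step on the unsigned bit pattern keeps exactly the low
   three bytes. */
uint32_t low_part_unsigned(uint32_t bits)
{
    uint32_t low = bits % 0x01000000u;

    assert(low <= 0x00FFFFFFu);    /* always within 24 bits */
    return low;
}
```

For the value in the post, `low_part_signed(-2130706431)` is `-16777215` rather than `1`, while `low_part_unsigned(0x81000001u)` is `1`. Adding a product that presumably wrapped to `-2130706432` on the poster's machine (signed overflow is undefined in C) would then give the reported `-2147483647`.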

On Aug 9, 9:10 pm, Army1987 <army1...@NOSPAM.it> wrote:
> If you *really* want to do that, try
> (union align_long_and_each_of_four *)val->four.byte3 = 129;

Thanks! That's good - what I was looking for!
- but you forgot the & and parentheses:

((union align_long_and_each_of_four *)&val)->four.byte3 = 129;

Regards,
anon.asdf

Aug 9 '07 #8

>>> #define FUNNY_NUMBER ((union align_long_and_each_of_four) \
>>> (const long) ((129<<24) | (val & 16777215))).four.byte3
>> Didn't you say you didn't want to use bitwise operations?
>
> True.
> But I'm hoping the compiler will resolve it to a constant, so the
> shifts are only in the c-code, but not in the machine code.
>
> Thanks for your comments!
>
> -anon.asdf

My comment here is incorrect! It can never be a constant, since it
includes the variable val .
-anon.asdf

Aug 9 '07 #9

In article <11*********************@z24g2000prh.googlegroups.com>
<an*******@gmail.com> wrote:
>... ((unsigned char*)&val)[3] = 129; is elegant.

Elegant, but not terribly portable, and on some machines, a lot
slower than the shift-and-mask method:

>I wonder if the compiler resolves it (above) to the same shifts as
> val = (val & 0xffffffUL) | (129UL << 24);
>or utilizes some tighter optimization, if the architecture allows it.

This depends on the architecture *and* the optimizer.

Taking the address of variables defeats some optimizers entirely.
In such cases, the compiler may "throw up its hands in defeat" as
it were, and compile code like:

store reg, mem | put "val" into RAM so it can be modified piece-wise
movi #129, t0 | tempreg = constant
store t0, mem+3 | set mem[3]
load reg, mem | pull "val" back out of RAM

which, on register-oriented machines where RAM is slow compared to
the CPU, may take a dozen or more clock cycles. (Clever caches may
manage to shrink this to just 3 clock cycles in the best case: one
for the first store, one for the second store done "in parallel" with
the move-immediate, and one for the load.)

The shift-and-mask version might instead compile to:

movih #0xff00, t0 | tempreg = 0xff00 << 16
andn reg, t0, reg | val &= ~tempreg
movih #0x8100, t0 | tempreg = 0x8100 << 16 (ie 129UL << 24)
or reg, t0, reg

which, although it is still four instructions, executes in two
clock cycles (two instructions per clock), regardless of cache
activity and RAM and so on.

Other optimizers are a bit (or even a lot) more clever, and can
indeed turn the one sequence into the other.

The main disadvantage to the "access individual bytes of variable"
method is that it not only depends on the size of bytes -- which
tends to be exactly 8 bits across a wide variety of machines today,
so that you are relatively safe there -- but also on the "endian-ness"
of the CPU, which tends to vary. The shift-and-mask version,
although it is more verbose in source form, is a lot easier for
most optimizers.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (4039.22'N, 11150.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Aug 9 '07 #10
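
The endianness dependence described in this reply can be made visible with a small probe. The following is a sketch only, assuming 8-bit bytes and a `uint32_t` in place of the 4-byte `long`; `msb_offset` and `set_msb_bytewise` are hypothetical names. It finds at run time which address offset holds the most significant byte and stores through that offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the address offset of the most significant byte of a
   uint32_t on the machine running the code: 3 on little-endian,
   0 on big-endian. */
size_t msb_offset(void)
{
    const uint32_t probe = 0x01020304u;    /* MSB is the byte 0x01 */
    const unsigned char *p = (const unsigned char *)&probe;
    size_t i;

    for (i = 0; i < sizeof probe; i++) {
        if (p[i] == 0x01)
            return i;
    }
    assert(!"unexpected byte layout");     /* e.g. bytes wider than 8 bits */
    return 0;
}

/* Byte-wise store that targets the MSB whatever the byte order. */
uint32_t set_msb_bytewise(uint32_t val, unsigned char b)
{
    ((unsigned char *)&val)[msb_offset()] = b;
    return val;
}
```

The extra probe is the price of byte-wise access: the shift-and-mask form needs no such lookup because it never looks at the object's layout in memory.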

Army1987 wrote:
>
> On Thu, 09 Aug 2007 17:38:22 +0000, anon.asdf wrote:
>> Hi!
>>
>> I want to assign the number 129 (binary 10000001)
>> to the MSB (most significant byte) of a 4-byte long
>> and leave the other lower bytes intact!
>
> Try
> ((unsigned char *)&val)[3] = 129;
> No unions and no struct needed.

You can assign the value of 129
to the highest addressed byte of any object,
this way:

((unsigned char *)&val)[sizeof val - 1] = 129;

--
pete
Aug 9 '07 #11

On Thu, 09 Aug 2007 21:07:47 +0200, Army1987 wrote:
> On Thu, 09 Aug 2007 17:38:22 +0000, anon.asdf wrote:
>> Hi!
>>
>> I want to assign the number 129 (binary 10000001) to the MSB (most
>> significant byte) of a 4-byte long and leave the other lower bytes
>> intact!

That'd be the representation of a negative integer.

> l %= 0x01000000;
> l += 129 * 0x01000000;
> This works regardless of endianness.

It would work if l were unsigned long, and either 129 * 0x01000000
fitted in a signed long, or I wrote l += 129U * 0x01000000; (or
l += 0x81000000; of course).
--
Army1987 (Replace "NOSPAM" with "email")
No-one ever won a game by resigning. -- S. Tartakower

Aug 10 '07 #12
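
The corrected arithmetic described in this reply can be sketched with an unsigned type throughout. The assumptions are a 32-bit value (`uint32_t` rather than `unsigned long`) and a hypothetical function name, `set_msb_arith`.

```c
#include <assert.h>
#include <stdint.h>

/* Replace the top byte using only %, * and + on an unsigned type.
   For an 8-bit b none of the steps exceeds 32 bits, and the result
   does not depend on byte order. */
uint32_t set_msb_arith(uint32_t val, unsigned char b)
{
    uint32_t low = val % 0x01000000u;             /* keep low three bytes */
    uint32_t high = (uint32_t)b * 0x01000000u;    /* move b into top byte */

    assert(low < 0x01000000u);
    return low + high;
}
```

`set_msb_arith(0x81000001u, 129)` leaves the value at `0x81000001`, which is the case that went wrong with the signed version earlier in the thread.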

On Thu, 09 Aug 2007 13:06:50 -0700, anon.asdf wrote:
> On Aug 9, 9:10 pm, Army1987 <army1...@NOSPAM.it> wrote:
>> If you *really* want to do that, try
>> (union align_long_and_each_of_four *)val->four.byte3 = 129;
>
> Thanks! That's good - what I was looking for!
> - but you forgot the & and parentheses:
>
> ((union align_long_and_each_of_four *)&val)->four.byte3 = 129;

Yeah...
--
Army1987 (Replace "NOSPAM" with "email")
No-one ever won a game by resigning. -- S. Tartakower

Aug 10 '07 #13

In article <46***********@mindspring.com>,
pete <pf*****@mindspring.com> wrote:
>Army1987 wrote:
>>On Thu, 09 Aug 2007 17:38:22 +0000, anon.asdf wrote:
>>> I want to assign the number 129 (binary 10000001)
>>> to the MSB (most significant byte) of a 4-byte long
>>> and leave the other lower bytes intact!
>>Try
>> ((unsigned char *)&val)[3] = 129;
>>No unions and no struct needed.
>You can assign the value of 129
>to the highest addressed byte of any object,
>this way:
> ((unsigned char *)&val)[sizeof val - 1] = 129;

Yes, that should indeed assign into the highest addressed byte.
Unfortunately the highest addressed byte might not be the MSB
(most significant byte). On big-endian machines, it would
often be the lowest addressed byte that is the MSB.
--
There are some ideas so wrong that only a very intelligent person
could believe in them. -- George Orwell
Aug 10 '07 #14

pete <pf*****@mindspring.com> writes:
> Army1987 wrote:
>>On Thu, 09 Aug 2007 17:38:22 +0000, anon.asdf wrote:
>>> I want to assign the number 129 (binary 10000001)
>>> to the MSB (most significant byte) of a 4-byte long
>>> and leave the other lower bytes intact!
>>Try
>> ((unsigned char *)&val)[3] = 129;
>>No unions and no struct needed.
>
> You can assign the value of 129
> to the highest addressed byte of any object,
> this way:
>
> ((unsigned char *)&val)[sizeof val - 1] = 129;

But the question was how to assign a value to the most significant
byte, not the highest addressed byte.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Aug 10 '07 #15
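
The distinction drawn in the last two replies, highest-addressed byte versus most significant byte, can be illustrated with a short sketch. It generalises the `sizeof val - 1` idiom into a helper; the names `set_last_byte` and `with_last_byte` are hypothetical, and `uint32_t` again stands in for the 4-byte `long`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Overwrite the highest-addressed byte of any object. */
void set_last_byte(void *obj, size_t size, unsigned char b)
{
    assert(obj != NULL && size > 0);
    ((unsigned char *)obj)[size - 1] = b;
}

/* Convenience wrapper for a 32-bit value. */
uint32_t with_last_byte(uint32_t val, unsigned char b)
{
    set_last_byte(&val, sizeof val, b);
    return val;
}
```

On a little-endian machine `with_last_byte(0x12345678u, 129)` gives `0x81345678`, so the highest-addressed byte happens to be the MSB; on a big-endian machine the same call gives `0x12345681`.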

In article <pa****************************@NOSPAM.it>,
Army1987 <ar******@NOSPAM.it> wrote:
>On Thu, 09 Aug 2007 17:38:22 +0000, anon.asdf wrote:
>>I want to assign the number 129 (binary 10000001) to the MSB (most
>>significant byte) of a 4-byte long and leave the other lower bytes
>>intact!
>That'd be the representation of a negative integer.

Not necessarily.

A) We don't know how big a byte is on the target machine. It
might not be overflow.
B) If you are working in signed mode on an 8 bit byte,
then it is overflow and so not defined;
C) If you are working unsigned, it is not overflow, but if the
machine is a separated-sign machine, the correspondence between
sign bit and arithmetic values is unspecified (but other
representation constraints pretty much imply the separated-sign
would have to be the most significant bit.)
--
Programming is what happens while you're busy making other plans.
Aug 10 '07 #16

On Aug 10, 2:01 am, Army1987 <army1...@NOSPAM.it> wrote:
> On Thu, 09 Aug 2007 21:07:47 +0200, Army1987 wrote:
>> On Thu, 09 Aug 2007 17:38:22 +0000, anon.asdf wrote:
>>> Hi!
>>> I want to assign the number 129 (binary 10000001) to the MSB (most
>>> significant byte) of a 4-byte long and leave the other lower bytes
>>> intact!
>
> That'd be the representation of a negative integer.

{
long val = (129<<24) + 1;
val %= 0x01000000;
val += 129 * 0x01000000;
printf("%ld\n", val); // get -2147483647, but should be -2130706431

}

referring to the above
> It would work if val were unsigned long, and either 129 * 0x01000000
> fitted in a signed long, or I wrote val += 129U * 0x01000000; (or
> val += 0x81000000; of course).

It can also work if val is signed long - as follows:
{
long val /* = 0 */;
val %= 0x01000000U;
val += 129U * 0x01000000;
printf("%ld\n", val);
}

-anon.asdf

Aug 10 '07 #17

On Fri, 10 Aug 2007 02:13:09 +0000, Walter Roberson wrote:
> In article <pa****************************@NOSPAM.it>,
> Army1987 <ar******@NOSPAM.it> wrote:
>>On Thu, 09 Aug 2007 17:38:22 +0000, anon.asdf wrote:
>>>I want to assign the number 129 (binary 10000001) to the MSB (most
>>>significant byte) of a 4-byte long and leave the other lower bytes
>>>intact!
>>That'd be the representation of a negative integer.
>
> Not necessarily.
>
> A) We don't know how big a byte is on the target machine. It
> might not be overflow.

Speak for yourself. I do know how big a byte is on the OP's
machine.

> B) If you are working in signed mode on an 8 bit byte,
> then it is overflow and so not defined;

Well, do you think anybody will speak of single bytes in a larger
object in terms of a signed char?

> C) If you are working unsigned, it is not overflow, but if the
> machine is a separated-sign machine, the correspondence between
> sign bit and arithmetic values is unspecified (but other
> representation constraints pretty much imply the separated-sign
> would have to be the most significant bit.)

When the sign bit is set, the value is negative (provided it isn't
a trap), period. This is true in any of the three allowed
representations. And I happen to know that the OP has two's
complement and no trap representation.
--
Army1987 (Replace "NOSPAM" with "email")
No-one ever won a game by resigning. -- S. Tartakower

Aug 10 '07 #18
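
For the two's complement case mentioned in this reply, the effect of a set top bit can be checked directly. The sketch below is an illustration under assumptions: it uses `int32_t`, which the C standard requires to be two's complement without padding bits, and the hypothetical name `as_signed`. It says nothing about the other representations discussed above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a signed value by copying its bytes.
   int32_t is two's complement with no padding bits, so the result is
   well defined wherever the type exists. */
int32_t as_signed(uint32_t bits)
{
    int32_t out;

    assert(sizeof out == sizeof bits);
    memcpy(&out, &bits, sizeof out);
    return out;
}
```

`as_signed(0x81000001u)` is `-2130706431`, the value the original poster expected earlier in the thread, while a pattern with the top bit clear, such as `0x01000001`, stays positive (`16777217`).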
