Bytes IT Community

bits and stuff

P: n/a
#include <stdio.h>

int main()
{
    unsigned long num;
    unsigned int a = 10, b = 20, c = 30, d = 40;
    /* num = 0; */

    num |= a << 24;
    num |= b << 16;
    num |= c << 8;
    num |= d << 0;

    printf("a = %02.2x\nb = %02.2x\nc = %02.2x\nd = %02.2x\n", a, b, c, d);
    printf("num = %08.8x\n", num);

    return 0;
}

What I'm trying to do here is pack a, b, c, and d into num.

It works if I set num = 0, but how come it's not working if I leave it out
(like above)? I know that when num is declared, its memory contents is full
of garbage, but I thought that by packing the ints into it would've
overwritten all the garbage? (hope that makes sense)

Thanks,
Joe


Nov 14 '05 #1
19 Replies


P: n/a
Joe Laughlin wrote:
> #include <stdio.h>
>
> int main()
> {
>     unsigned long num;
>     unsigned int a = 10, b = 20, c = 30, d = 40;
>     /* num = 0; */
>
>     num |= a << 24;

This is the same as writing:

    num = num | a << 24;

i.e. you're doing a bitwise `or' of whatever was in `num' and `a'
left-shifted 24 bits. See what the problem is?

>     num |= b << 16;
>     num |= c << 8;
>     num |= d << 0;
>
>     printf("a = %02.2x\nb = %02.2x\nc = %02.2x\nd = %02.2x\n", a, b, c, d);
>     printf("num = %08.8x\n", num);
>
>     return 0;
> }
>
> What I'm trying to do here is pack a, b, c, and d into num.
>
> It works if I set num = 0, but how come it's not working if I leave it out
> (like above)? I know that when num is declared, its memory contents is full
> of garbage, but I thought that by packing the ints into it would've
> overwritten all the garbage? (hope that makes sense)

See above.

HTH,
--ag
--
Artie Gold -- Austin, Texas

"What they accuse you of -- is what they have planned."
Nov 14 '05 #2

P: n/a
On Fri, 11 Jun 2004 23:01:49 GMT, in comp.lang.c , "Joe Laughlin"
<Jo***************@boeing.com> wrote:
> #include <stdio.h>
>
> int main()
> {
>     unsigned long num;

this is an uninitialised variable. It contains garbage.

>     num |= a << 24;

here you OR the bits of a with garbage.
GIGO.

> What I'm trying to do here is pack a, b, c, and d into num.

This is not guaranteed to work at all - you're assuming that
sizeof(long)==4 which need not be true.

> I thought that by packing the ints into it would've
> overwritten all the garbage? (hope that makes sense)

You're ORing it with the garbage.

--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.angelfire.com/ms3/bchambless0/welcome_to_clc.html>
----== Posted via Newsfeed.Com - Unlimited-Uncensored-Secure Usenet News==----
http://www.newsfeed.com The #1 Newsgroup Service in the World! >100,000 Newsgroups
---= 19 East/West-Coast Specialized Servers - Total Privacy via Encryption =---
Nov 14 '05 #3

P: n/a
On Sat, 12 Jun 2004 00:27:04 +0100, Mark McIntyre
<ma**********@spamcop.net> wrote in comp.lang.c:
> On Fri, 11 Jun 2004 23:01:49 GMT, in comp.lang.c , "Joe Laughlin"
> <Jo***************@boeing.com> wrote:
>> #include <stdio.h>
>>
>> int main()
>> {
>>     unsigned long num;
>
> this is an uninitialised variable. It contains garbage.
>
>>     num |= a << 24;
>
> here you OR the bits of a with garbage.
> GIGO.
>
>> What I'm trying to do here is pack a, b, c, and d into num.
>
> This is not guaranteed to work at all - you're assuming that
> sizeof(long)==4 which need not be true.

No he's not. He's assuming that unsigned long contains at least 32
value bits, and that is guaranteed by the standard.

On the compiler I used at work today, unsigned long has exactly 32
value bits. And sizeof(unsigned long) is 2, not 4.

>> I thought that by packing the ints into it would've
>> overwritten all the garbage? (hope that makes sense)
>
> You're ORing it with the garbage.

Well, that's true.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Nov 14 '05 #4

P: n/a
On Fri, 11 Jun 2004 23:01:49 GMT, "Joe Laughlin"
<Jo***************@boeing.com> wrote in comp.lang.c:

In addition to what others have said, you're risking undefined
behavior in ways they haven't pointed out.
> #include <stdio.h>
>
> int main()
> {
>     unsigned long num;
>     unsigned int a = 10, b = 20, c = 30, d = 40;
>     /* num = 0; */
>
>     num |= a << 24;

The C standard requires that signed and unsigned ints be able to
represent a range of values that requires them to have at least 16
bits. There are indeed implementations where an unsigned int has 16
bits and no more, although this is no longer common in popular
desktop systems. On such a system, shifting an unsigned int left by 24,
or by 16 as below, generates undefined behavior.

This should be written as:

    num = (unsigned long)a << 24;

Only this first statement should use "=" rather than "|=".

>     num |= b << 16;

This one also needs the cast to (unsigned long). The other two do
not.

>     num |= c << 8;
>     num |= d << 0;
>
>     printf("a = %02.2x\nb = %02.2x\nc = %02.2x\nd = %02.2x\n", a, b, c, d);
>     printf("num = %08.8x\n", num);
>
>     return 0;
> }
>
> What I'm trying to do here is pack a, b, c, and d into num.
>
> It works if I set num = 0, but how come it's not working if I leave it out
> (like above)? I know that when num is declared, its memory contents is full
> of garbage, but I thought that by packing the ints into it would've
> overwritten all the garbage? (hope that makes sense)
>
> Thanks,
> Joe


--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Nov 14 '05 #5

P: n/a
Mark McIntyre wrote:
> <Jo***************@boeing.com> wrote:
>> #include <stdio.h>
>>
>> int main()
>> {
>>     unsigned long num;
>
> this is an uninitialised variable. It contains garbage.
>
>>     num |= a << 24;
>
> here you OR the bits of a with garbage.
> GIGO.
>
>> What I'm trying to do here is pack a, b, c, and d into num.
>
> This is not guaranteed to work at all - you're assuming that
> sizeof(long)==4 which need not be true.
>
>> I thought that by packing the ints into it would've
>> overwritten all the garbage? (hope that makes sense)
>
> You're ORing it with the garbage.

In addition, the "a << 24" expression above is an unsigned int
expression. It is only converted to unsigned long for the OR into
num, which is too late. If int happens to be 16 bits you have
undefined behaviour. Thus that should be written as:

    ((unsigned long)a << 24)

and similarly for the b value. Next you should worry about
CHAR_BIT being larger than 8, and a,b,c,d potentially containing
values larger than 255.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!
Nov 14 '05 #6

P: n/a
Joe Laughlin wrote:

> #include <stdio.h>
>
> int main()
> {
>     unsigned long num;
>     unsigned int a = 10, b = 20, c = 30, d = 40;
>     /* num = 0; */
>
>     num |= a << 24;
>     num |= b << 16;
>     num |= c << 8;
>     num |= d << 0;
>
>     printf("a = %02.2x\nb = %02.2x\nc = %02.2x\nd = %02.2x\n", a, b, c, d);
>     printf("num = %08.8x\n", num);
>
>     return 0;
> }
>
> What I'm trying to do here is pack a, b, c, and d into num.
>
> It works if I set num = 0, but how come it's not working if I leave it out
> (like above)? I know that when num is declared, its memory contents is full
> of garbage, but I thought that by packing the ints into it would've
> overwritten all the garbage? (hope that makes sense)
>
> Thanks,
> Joe


Other posters pointed out the issues with
the above.

Have you thought about using a union (let the
compiler do the work for you)?

#include <stdio.h>

int
main(void)
{
    union {
        unsigned long n;
        unsigned char piece[ sizeof (unsigned long) ];
    } num;
    unsigned int a = 10, b = 20, c = 30, d = 40;

    num.n = 0;  /* clear any bytes beyond piece[ 3 ] first */
    num.piece[ 0 ] = a;
    num.piece[ 1 ] = b;
    num.piece[ 2 ] = c;
    num.piece[ 3 ] = d;

    printf("a = %2.2x\nb = %2.2x\nc = %2.2x\nd = %2.2x\n", a, b, c, d);
    printf("num = %8.8lx\n", num.n);

    return (0);
}

Of course, if your unsigned ints overflow what will fit
in an unsigned char, you'll get unknown results...
HTH,

Stephen
Nov 14 '05 #7

P: n/a
>Joe Laughlin wrote:
[code that uses left-shift and bitwise-OR to construct a 32-bit value
from four eight-bit values, with a slight flaw]

In article <news:40***************@cost-com.net>
Stephen L. <sd*********@cost-com.net> writes:
> Other posters pointed out the issues with
> the above.
>
> Have you thought about using a union (let the
> compiler do the work for you)?

This method has advantages and disadvantages. Often the
disadvantages outweigh the advantages:

> union {
>     unsigned long n;
>     unsigned char piece[ sizeof (unsigned long) ];
> } num;
>
> num.piece[ 0 ] = a; [and so on]
> Of course, if your unsigned ints overflow what will fit
> in an unsigned char, you'll get unknown results...


The disadvantage of using the union trick -- or, equivalently,
using an "unsigned char *" to point to the individual C-bytes that
make up an "unsigned long" -- is that you expose yourself to the
implementation's representation. In particular, on common
implementations today, you now have to worry about:

- sizeof(unsigned long) changing from 4 to 8
- endianness

The advantage of using the union trick is the same as the disadvantage:
you expose yourself to the implementation's representation. If
that is what you *want* to do, go ahead and do it. On the other
hand, if you just want to compose a predictable 32-bit value from
four eight-bit values, the shift-and-bitwise-OR method will always
work. The common concerns above (sizeof(unsigned long) and
endianness) become entirely irrelevant.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 14 '05 #8

P: n/a
"Stephen L." wrote:
.... snip ...
> Have you thought about using a union (let the
> compiler do the work for you)?
>
> int
> main()
> {
>     union {
>         unsigned long n;
>         unsigned char piece[ sizeof (unsigned long) ];
>     } num;
>     unsigned int a = 10, b = 20, c = 30, d = 40;


That approach is inherently unsafe, both for misuse of the union,
and for dependence on byte sex.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!

Nov 14 '05 #9

P: n/a
Hiho,
[union-for-bytewise-access]

> That approach is inherently unsafe, both for misuse of the union,
> and for dependence on byte sex.


Mmmm, I know about the former but could you or someone else please
expand your answer concerning the latter? I am not even sure what
you mean by byte sex...
I quickly scanned the question list of the faq and did not find
one dealing with this. For accessing double bits/bytes/words/whatever,
I would rather look at the memory representation with the
help of pointers but I am not sure whether this is the only, let
alone the best way.
Cheers,
Michael

Nov 14 '05 #10

P: n/a
On Fri, 11 Jun 2004 21:26:13 -0500, in comp.lang.c , Jack Klein
<ja*******@spamcop.net> wrote:
> On Sat, 12 Jun 2004 00:27:04 +0100, Mark McIntyre
> <ma**********@spamcop.net> wrote in comp.lang.c:
>> On Fri, 11 Jun 2004 23:01:49 GMT, in comp.lang.c , "Joe Laughlin"
>> <Jo***************@boeing.com> wrote:
>>
>>> What I'm trying to do here is pack a, b, c, and d into num.
>>
>> This is not guaranteed to work at all - you're assuming that
>> sizeof(long)==4 which need not be true.
>
> No he's not. He's assuming that unsigned long contains at least 32
> value bits,

And that each of his unsigned ints has at most 8 relevant value bits. Is
that certain, on all implementations?

> and that is guaranteed by the standard.

You're right in that.

> On the compiler I used at work today, unsigned long has exactly 32
> value bits. And sizeof(unsigned long) is 2, not 4.

I'd be interested to know if it has a conforming hosted implementation tho
:-)
--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.angelfire.com/ms3/bchambless0/welcome_to_clc.html>
Nov 14 '05 #11

P: n/a
On Sat, 12 Jun 2004 07:30:16 -0400, in comp.lang.c , "Stephen L."
<sd*********@cost-com.net> wrote:
> Joe Laughlin wrote:
>
> Have you thought about using a union (let the
> compiler do the work for you)?

snip example of packing a union and then unpacking it differently.

> Of course, if your unsigned ints overflow what will fit
> in an unsigned char, you'll get unknown results...

But reading from a union by accessing a member other than that which you
wrote last is UB anyway. It's a common extension to place meaning on the
behaviour of course, but you can't rely on it.

--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.angelfire.com/ms3/bchambless0/welcome_to_clc.html>
Nov 14 '05 #12

P: n/a
On Sun, 13 Jun 2004 22:23:09 +0200, in comp.lang.c , Michael Mair
<ma********************@ians.uni-stuttgart.de> wrote:
> Hiho,
> [union-for-bytewise-access]
>
>> That approach is inherently unsafe, both for misuse of the union,
>> and for dependence on byte sex.
>
> Mmmm, I know about the former but could you or someone else please
> expand your answer concerning the latter? I am not even sure what
> you mean by byte sex...


I think CBF means Endianness. See Chris Torek's post.

--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.angelfire.com/ms3/bchambless0/welcome_to_clc.html>
Nov 14 '05 #13

P: n/a
in comp.lang.c i read:
> [union-for-bytewise-access]
>> That approach is inherently unsafe, both for misuse of the union,
>> and for dependence on byte sex.
>
> Mmmm, I know about the former but could you or someone else please
> expand your answer concerning the latter? I am not even sure what
> you mean by byte sex...


which end of the array of bytes is the least significant, i.e., is the
least significant byte [0] or [sizeof object - 1]? and those are not the
only possibilities if sizeof object is greater than 2, where [1] is another
though it's not common to see these days.

signed long value = 1;
unsigned char bytes[sizeof value];
memcpy(bytes, &value, sizeof value);
/* is bytes[0], bytes[1] or bytes[sizeof bytes - 1], or some other, the 1? */

today one tends to see 01 00 00 00 (little endian) or 00 00 00 01 (big
endian), which correspond to b[0] or b[sizeof b - 1] having the 1.

and that ignores padding, which is, again, unlikely these days but it is
allowed, and for all anyone knows it will reappear or your program will
have to work on a dinosaur. if there is padding then it may be that none
of the bytes in my example will be a 1, or there may be more than one with
a non-zero value.

oh, and what if sizeof value is 1? are you thinking `how can such a thing
be'? in c it is possible if CHAR_BIT is 32 or larger, which is seen on
today's dsps. in that case you aren't accessing octets, which is often
what people want to do with the sort of tricks discussed, rather you are
accessing the one, 32 bit, byte, hence [0] is all there is, but how you
serialize its octets remains an issue.

all this makes working with internal representations a difficult and
tedious, though not insurmountable thing.

--
a signature
Nov 14 '05 #14

P: n/a
Michael Mair wrote:

> [union-for-bytewise-access]
>
>> That approach is inherently unsafe, both for misuse of the union,
>> and for dependence on byte sex.
>
> Mmmm, I know about the former but could you or someone else please
> expand your answer concerning the latter? I am not even sure what
> you mean by byte sex...

Also known as endianness. The order of octets within the
representation of an integer, or other item. I know of at least 3
fairly popular versions for 32 bit integers. The shift, mask, and
add method is independent of this.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!
Nov 14 '05 #15

P: n/a
Thanks for enlightening me :-)

Nov 14 '05 #16

P: n/a
Thanks for enlightening me :-)

Nov 14 '05 #17

P: n/a
Mark McIntyre <ma**********@spamcop.net> wrote:
> But reading from a union by accessing a member other than that which you
> wrote last is UB anyway. It's a common extension to place meaning on the
> behaviour of course, but you can't rely on it.


FWIW, doesn't this make unions worse than useless in portable code?
If there were two (or more) variables that never needed to maintain
their values concurrently, I would expect a compiler to assign them
the same bit of memory anyway.
Nov 14 '05 #18

P: n/a
ol*****@inspire.net.nz (Old Wolf) wrote:
> Mark McIntyre <ma**********@spamcop.net> wrote:
>> But reading from a union by accessing a member other than that which you
>> wrote last is UB anyway. It's a common extension to place meaning on the
>> behaviour of course, but you can't rely on it.
>
> FWIW, doesn't this make unions worse than useless in portable code?
> If there were two (or more) variables that never needed to maintain
> their values concurrently, I would expect a compiler to assign them
> the same bit of memory anyway.


It cannot always tell. Besides, which is the more useful declaration:

int foodvalue(union animal animal);

or

int foodvalue(struct common_animal animal,
struct fish fish, struct bird bird, struct mammal mammal,
enum taxon which_taxon);

I'd prefer the former.

Richard
Nov 14 '05 #19

P: n/a
On 15 Jun 2004 17:31:03 -0700, in comp.lang.c , ol*****@inspire.net.nz (Old
Wolf) wrote:
> Mark McIntyre <ma**********@spamcop.net> wrote:
>> But reading from a union by accessing a member other than that which you
>> wrote last is UB anyway. It's a common extension to place meaning on the
>> behaviour of course, but you can't rely on it.
>
> FWIW, doesn't this make unions worse than useless in portable code?


Not at all. Take a look at how the original MS Excel toolkit passed cell
contents to and from functions:

struct cell    /* tag and member name "data" are illustrative */
{
    char datatype;
    union
    {
        int intdata;
        long longdata;
        float floatdata;
        double doubledata;
        // etc
    } data;
};

--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.angelfire.com/ms3/bchambless0/welcome_to_clc.html>
Nov 14 '05 #20
