# bits and stuff

    #include <stdio.h>

    int main()
    {
        unsigned long num;
        unsigned int a = 10, b = 20, c = 30, d = 40;

        /* num = 0; */

        num |= a << 24;
        num |= b << 16;
        num |= c << 8;
        num |= d << 0;

        printf("a = %02.2x\nb = %02.2x\nc = %02.2x\nd = %02.2x\n", a, b, c, d);
        printf("num = %08.8x\n", num);

        return 0;
    }

What I'm trying to do here is pack a, b, c, and d into num. It works if I set num = 0, but how come it's not working if I leave it out (like above)?

I know that when num is declared, its memory contents is full of garbage, but I thought that by packing the ints into it would've overwritten all the garbage? (hope that makes sense)

Thanks,
Joe

Nov 14 '05 #1
Joe Laughlin wrote:

> #include <stdio.h>
>
> int main()
> {
>     unsigned long num;
>     unsigned int a = 10, b = 20, c = 30, d = 40;
>
>     /* num = 0; */
>
>     num |= a << 24;

This is the same as writing:

    num = num | a << 24;

i.e. you're doing a bitwise `or' of whatever was in `num' and `a' left-shifted 24 bits. See what the problem is?

>     num |= b << 16;
>     num |= c << 8;
>     num |= d << 0;
>
>     printf("a = %02.2x\nb = %02.2x\nc = %02.2x\nd = %02.2x\n", a, b, c, d);
>     printf("num = %08.8x\n", num);
>
>     return 0;
> }
>
> What I'm trying to do here is pack a, b, c, and d into num. It works if
> I set num = 0, but how come it's not working if I leave it out (like above)?
>
> I know that when num is declared, its memory contents is full of garbage,
> but I thought that by packing the ints into it would've overwritten all
> the garbage? (hope that makes sense)

See above.

HTH,
--ag

--
Artie Gold -- Austin, Texas
"What they accuse you of -- is what they have planned."

Nov 14 '05 #2

On Fri, 11 Jun 2004 23:01:49 GMT, in comp.lang.c, "Joe Laughlin" wrote:

> #include <stdio.h>
> int main(){
> unsigned long num;

this is an uninitialised variable. It contains garbage.

> num |= a << 24;

here you OR the bits of a with garbage. GIGO.

> What I'm trying to do here is pack a, b, c, and d into num.

This is not guaranteed to work at all - you're assuming that sizeof(long)==4 which need not be true.

> I thought that by packing the ints into it would've
> overwritten all the garbage? (hope that makes sense)

You're ORing it with the garbage.

--
Mark McIntyre
CLC FAQ
CLC readme:

----== Posted via Newsfeed.Com - Unlimited-Uncensored-Secure Usenet News==----
http://www.newsfeed.com The #1 Newsgroup Service in the World! >100,000 Newsgroups
---= 19 East/West-Coast Specialized Servers - Total Privacy via Encryption =---

Nov 14 '05 #3

On Sat, 12 Jun 2004 00:27:04 +0100, Mark McIntyre wrote in comp.lang.c:

> On Fri, 11 Jun 2004 23:01:49 GMT, in comp.lang.c, "Joe Laughlin" wrote:
>
> > #include <stdio.h>
> > int main(){
> > unsigned long num;
>
> this is an uninitialised variable. It contains garbage.
>
> > num |= a << 24;
>
> here you OR the bits of a with garbage. GIGO.
>
> > What I'm trying to do here is pack a, b, c, and d into num.
>
> This is not guaranteed to work at all - you're assuming that
> sizeof(long)==4 which need not be true.

No he's not. He's assuming that unsigned long contains at least 32 value bits, and that is guaranteed by the standard.

On the compiler I used at work today, unsigned long has exactly 32 value bits. And sizeof(unsigned long) is 2, not 4.

> > I thought that by packing the ints into it would've
> > overwritten all the garbage? (hope that makes sense)
>
> You're ORing it with the garbage.

Well, that's true.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html

Nov 14 '05 #4

On Fri, 11 Jun 2004 23:01:49 GMT, "Joe Laughlin" wrote in comp.lang.c:

In addition to what others have said, you're risking undefined behavior in ways they haven't pointed out.

> #include <stdio.h>
>
> int main()
> {
>     unsigned long num;
>     unsigned int a = 10, b = 20, c = 30, d = 40;
>
>     /* num = 0; */
>
>     num |= a << 24;

The C standard requires that signed and unsigned ints be able to represent a range of values that requires them to have at least 16 bits. There are indeed implementations where an unsigned int has 16 bits and no more, although this is no longer common in popular desk top systems.

On such a system, shifting an unsigned int left by 24, or by 16 as below, generates undefined behavior.

This should be written as:

    num = (unsigned long)a << 24;

The first one only should be "=" rather than "|=".

>     num |= b << 16;

This one also needs the cast to (unsigned long). The other two do not.

>     num |= c << 8;
>     num |= d << 0;
>
>     printf("a = %02.2x\nb = %02.2x\nc = %02.2x\nd = %02.2x\n", a, b, c, d);
>     printf("num = %08.8x\n", num);
>
>     return 0;
> }
>
> What I'm trying to do here is pack a, b, c, and d into num. It works if
> I set num = 0, but how come it's not working if I leave it out (like above)?
>
> I know that when num is declared, its memory contents is full of garbage,
> but I thought that by packing the ints into it would've overwritten all
> the garbage? (hope that makes sense)
>
> Thanks,
> Joe

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html

Nov 14 '05 #5

Mark McIntyre wrote:

> wrote:
>
> > #include <stdio.h>
> > int main() {
> > unsigned long num;
>
> this is an uninitialised variable. It contains garbage.
>
> > num |= a << 24;
>
> here you OR the bits of a with garbage. GIGO.
>
> > What I'm trying to do here is pack a, b, c, and d into num.
>
> This is not guaranteed to work at all - you're assuming that
> sizeof(long)==4 which need not be true.
>
> > I thought that by packing the ints into it would've
> > overwritten all the garbage? (hope that makes sense)
>
> You're ORing it with the garbage.

In addition, the "a << 24" expression above is an unsigned int expression. It is only converted to unsigned long for the OR into num, which is too late. If int happens to be 16 bits you have undefined behaviour. Thus that should be written as:

    ((unsigned long)a << 24)

and similarly for the b value. Next you should worry about CHAR_BIT being larger than 8, and a,b,c,d potentially containing values larger than 255.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
USE worldnet address!

Nov 14 '05 #6

Joe Laughlin wrote:

> #include <stdio.h>
>
> int main()
> {
>     unsigned long num;
>     unsigned int a = 10, b = 20, c = 30, d = 40;
>
>     /* num = 0; */
>
>     num |= a << 24;
>     num |= b << 16;
>     num |= c << 8;
>     num |= d << 0;
>
>     printf("a = %02.2x\nb = %02.2x\nc = %02.2x\nd = %02.2x\n", a, b, c, d);
>     printf("num = %08.8x\n", num);
>
>     return 0;
> }
>
> What I'm trying to do here is pack a, b, c, and d into num. It works if
> I set num = 0, but how come it's not working if I leave it out (like above)?
>
> I know that when num is declared, its memory contents is full of garbage,
> but I thought that by packing the ints into it would've overwritten all
> the garbage? (hope that makes sense)
>
> Thanks,
> Joe

Other posters pointed out the issues with the above.

Have you thought about using a union (let the compiler do the work for you)?

    #include <stdio.h>

    int main()
    {
        union {
            unsigned long n;
            unsigned char piece[ sizeof (unsigned long) ];
        } num;
        unsigned int a = 10, b = 20, c = 30, d = 40;

        num.piece[ 0 ] = a;
        num.piece[ 1 ] = b;
        num.piece[ 2 ] = c;
        num.piece[ 3 ] = d;

        printf("a = %2.2x\nb = %2.2x\nc = %2.2x\nd = %2.2x\n", a, b, c, d);
        printf("num = %8.8lx\n", num.n);

        return (0);
    }

Of course, if your unsigned ints overflow what will fit in an unsigned char, you'll get unknown results...

HTH,
Stephen

Nov 14 '05 #7

"Stephen L." wrote:

.... snip ...

> Have you thought about using a union (let the compiler do
> the work for you)?
>
> int main()
> {
>     union {
>         unsigned long n;
>         unsigned char piece[ sizeof (unsigned long) ];
>     } num;
>     unsigned int a = 10, b = 20, c = 30, d = 40;

That approach is inherently unsafe, both for misuse of the union, and for dependence on byte sex.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
USE worldnet address!

Nov 14 '05 #9

Hiho,

[union-for-bytewise-access]

> That approach is inherently unsafe, both for misuse of the
> union, and for dependence on byte sex.

Mmmm, I know about the former but could you or someone else please expand your answer concerning the latter? I am not even sure what you mean by byte sex...

I quickly scanned the question list of the faq and did not find one dealing with this.

For accessing double bits/bytes/words/whatever, I would rather look at the memory representation with the help of pointers but I am not sure whether this is the only, let alone the best way.

Cheers,
Michael

Nov 14 '05 #10

On Fri, 11 Jun 2004 21:26:13 -0500, in comp.lang.c, Jack Klein wrote:

> On Sat, 12 Jun 2004 00:27:04 +0100, Mark McIntyre wrote in comp.lang.c:
>
> > On Fri, 11 Jun 2004 23:01:49 GMT, in comp.lang.c, "Joe Laughlin" wrote:
> >
> > > What I'm trying to do here is pack a, b, c, and d into num.
> >
> > This is not guaranteed to work at all - you're assuming that
> > sizeof(long)==4 which need not be true.
>
> No he's not. He's assuming that unsigned long contains at least 32
> value bits,

And that each of his unsigned ints has at most 8 relevant value bits. Is that certain, on all implementations?

> and that is guaranteed by the standard.

You're right in that.

> On the compiler I used at work today, unsigned long has exactly 32
> value bits. And sizeof(unsigned long) is 2, not 4.

I'd be interested to know if it has a conforming hosted implementation tho :-)

--
Mark McIntyre
CLC FAQ
CLC readme:

Nov 14 '05 #11

On Sat, 12 Jun 2004 07:30:16 -0400, in comp.lang.c, "Stephen L." wrote:

> Joe Laughlin wrote:
>
> Have you thought about using a union (let the
> compiler do the work for you)?

snip example of packing a union and then unpacking it differently.

> Of course, if your unsigned ints overflow what will fit
> in an unsigned char, you'll get unknown results...

But reading from a union by accessing a member other than that which you wrote last is UB anyway. It's a common extension to place meaning on the behaviour of course, but you can't rely on it.

--
Mark McIntyre
CLC FAQ
CLC readme:

Nov 14 '05 #12

On Sun, 13 Jun 2004 22:23:09 +0200, in comp.lang.c, Michael Mair wrote:

> Hiho,
>
> [union-for-bytewise-access]
>
> > That approach is inherently unsafe, both for misuse of the
> > union, and for dependence on byte sex.
>
> Mmmm, I know about the former but could you or someone else please
> expand your answer concerning the latter? I am not even sure what
> you mean by byte sex...

I think CBF means Endianness. See Chris Torek's post.

--
Mark McIntyre
CLC FAQ
CLC readme:

Nov 14 '05 #13

in comp.lang.c i read:

[union-for-bytewise-access]

> > That approach is inherently unsafe, both for misuse of the
> > union, and for dependence on byte sex.
>
> Mmmm, I know about the former but could you or someone else please
> expand your answer concerning the latter? I am not even sure what
> you mean by byte sex...

which end of the array of bytes is the least significant, i.e., is the least significant byte [0] or [sizeof object - 1]? and those are not the only possibilities if sizeof object is greater than 2, where [1] is another though it's not common to see these days.

    signed long value = 1;
    unsigned char bytes[sizeof value];
    memcpy(bytes, &value, sizeof value);
    /* is bytes[0], bytes[1] or bytes[sizeof bytes - 1], or some other, the 1? */

today one tends to see 01 00 00 00 (little endian) or 00 00 00 01 (big endian), which correspond to b[0] or b[sizeof b - 1] having the 1.

and that ignores padding, which is, again, unlikely these days but it is allowed, and for all anyone knows it will reappear or your program will have to work on a dinosaur. if there is padding then it may be that none of the bytes in my example will be a 1, or there may be more than one with a non-zero value.

oh, and what if sizeof value is 1? are you thinking `how can such a thing be'? in c it is possible if CHAR_BIT is 32 or larger, which is seen on today's dsp's. in that case you aren't accessing octets, which is often what people want to do with the sort of tricks discussed, rather you are accessing the one, 32 bit, byte, hence [0] is all there is, but how you serialize its octets remains an issue.

all this makes working with internal representations a difficult and tedious, though not insurmountable thing.

--
a signature

Nov 14 '05 #14

Michael Mair wrote:

> [union-for-bytewise-access]
>
> > That approach is inherently unsafe, both for misuse of the
> > union, and for dependence on byte sex.
>
> Mmmm, I know about the former but could you or someone else
> please expand your answer concerning the latter? I am not even
> sure what you mean by byte sex...

Also known as endianness. The order of octets within the representation of an integer, or other item. I know of at least 3 fairly popular versions for 32 bit integers. The shift, mask, and add method is independent of this.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
USE worldnet address!

Nov 14 '05 #15

Thanks for enlightening me :-)

Nov 14 '05 #16

Mark McIntyre wrote:

> But reading from a union by accessing a member other than that which you
> wrote last is UB anyway. Its a common extension to place meaning on the
> behaviour of course, but you can't rely on it.

FWIW, doesn't this make unions worse than useless in portable code? If there were two (or more) variables that never needed to maintain their values concurrently, I would expect a compiler to assign them the same bit of memory anyway.

Nov 14 '05 #18

ol*****@inspire.net.nz (Old Wolf) wrote:

> Mark McIntyre wrote:
>
> > But reading from a union by accessing a member other than that which you
> > wrote last is UB anyway. Its a common extension to place meaning on the
> > behaviour of course, but you can't rely on it.
>
> FWIW, doesn't this make unions worse than useless in portable code?
> If there were two (or more) variables that never needed to maintain
> their values concurrently, I would expect a compiler to assign them
> the same bit of memory anyway.

It cannot always tell. Besides, which is the more useful declaration:

    int foodvalue(union animal animal);

or

    int foodvalue(struct common_animal animal,
                  struct fish fish, struct bird bird, struct mammal mammal,
                  enum taxon which_taxon);

I'd prefer the former.

Richard

Nov 14 '05 #19

On 15 Jun 2004 17:31:03 -0700, in comp.lang.c, ol*****@inspire.net.nz (Old Wolf) wrote:

> Mark McIntyre wrote:
>
> > But reading from a union by accessing a member other than that which you
> > wrote last is UB anyway. Its a common extension to place meaning on the
> > behaviour of course, but you can't rely on it.
>
> FWIW, doesn't this make unions worse than useless in portable code?

Not at all. Take a look at how the original MS Excel toolkit passed cell contents to and from functions:

    struct {
        char datatype;
        union {
            int intdata;
            long longdata;
            float floatdata;
            double doubledata;
            // etc
        };
    };

--
Mark McIntyre
CLC FAQ
CLC readme:

Nov 14 '05 #20

