bits and stuff

#include <stdio.h>

int main()
{
unsigned long num;
unsigned int a = 10, b = 20, c = 30, d = 40;
/* num = 0; */

num |= a << 24;
num |= b << 16;
num |= c << 8;
num |= d << 0;

printf("a = %02.2x\nb = %02.2x\nc = %02.2x\nd = %02.2x\n", a, b, c, d);
printf("num = %08.8x\n", num);

return 0;
}

What I'm trying to do here is pack a, b, c, and d into num.

It works if I set num = 0, but how come it's not working if I leave it out
(like above)? I know that when num is declared, its memory contents is full
of garbage, but I thought that by packing the ints into it would've
overwritten all the garbage? (hope that makes sense)

Thanks,
Joe


Nov 14 '05 #1
Joe Laughlin wrote:
> #include <stdio.h>
>
> int main()
> {
> unsigned long num;
> unsigned int a = 10, b = 20, c = 30, d = 40;
> /* num = 0; */
>
> num |= a << 24;

This is the same as writing:

num = num | a << 24;

i.e. you're doing a bitwise `or' of whatever was in `num' and `a'
left-shifted 24 bits. See what the problem is?

> num |= b << 16;
> num |= c << 8;
> num |= d << 0;
>
> printf("a = %02.2x\nb = %02.2x\nc = %02.2x\nd = %02.2x\n", a, b, c, d);
> printf("num = %08.8x\n", num);
>
> return 0;
> }
>
> What I'm trying to do here is pack a, b, c, and d into num.
>
> It works if I set num = 0, but how come it's not working if I leave it out
> (like above)? I know that when num is declared, its memory contents is full
> of garbage, but I thought that by packing the ints into it would've
> overwritten all the garbage? (hope that makes sense)

See above.

HTH,
--ag
--
Artie Gold -- Austin, Texas

"What they accuse you of -- is what they have planned."
Nov 14 '05 #2
On Fri, 11 Jun 2004 23:01:49 GMT, in comp.lang.c , "Joe Laughlin"
<Jo***************@boeing.com> wrote:
> #include <stdio.h>
>
> int main()
> {
> unsigned long num;

this is an uninitialised variable. It contains garbage.

> num |= a << 24;

here you OR the bits of a with garbage.
GIGO.

> What I'm trying to do here is pack a, b, c, and d into num.

This is not guaranteed to work at all - you're assuming that
sizeof(long)==4 which need not be true.

> I thought that by packing the ints into it would've
> overwritten all the garbage? (hope that makes sense)

You're ORing it with the garbage.

--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.angelfire.com/ms3/bchambless0/welcome_to_clc.html>
----== Posted via Newsfeed.Com - Unlimited-Uncensored-Secure Usenet News==----
http://www.newsfeed.com The #1 Newsgroup Service in the World! >100,000 Newsgroups
---= 19 East/West-Coast Specialized Servers - Total Privacy via Encryption =---
Nov 14 '05 #3
On Sat, 12 Jun 2004 00:27:04 +0100, Mark McIntyre
<ma**********@spamcop.net> wrote in comp.lang.c:
> On Fri, 11 Jun 2004 23:01:49 GMT, in comp.lang.c , "Joe Laughlin"
> <Jo***************@boeing.com> wrote:
>> #include <stdio.h>
>>
>> int main()
>> {
>> unsigned long num;
>
> this is an uninitialised variable. It contains garbage.
>
>> num |= a << 24;
>
> here you OR the bits of a with garbage.
> GIGO.
>
>> What I'm trying to do here is pack a, b, c, and d into num.
>
> This is not guaranteed to work at all - you're assuming that
> sizeof(long)==4 which need not be true.

No he's not. He's assuming that unsigned long contains at least 32
value bits, and that is guaranteed by the standard.

On the compiler I used at work today, unsigned long has exactly 32
value bits. And sizeof(unsigned long) is 2, not 4.

>> I thought that by packing the ints into it would've
>> overwritten all the garbage? (hope that makes sense)
>
> You're ORing it with the garbage.



Well, that's true.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Nov 14 '05 #4
On Fri, 11 Jun 2004 23:01:49 GMT, "Joe Laughlin"
<Jo***************@boeing.com> wrote in comp.lang.c:

In addition to what others have said, you're risking undefined
behavior in ways they haven't pointed out.
#include <stdio.h>

int main()
{
unsigned long num;
unsigned int a = 10, b = 20, c = 30, d = 40;
/* num = 0; */

num |= a << 24;
The C standard requires that signed and unsigned ints be able to
represent a range of values that requires them to have at least 16
bits. There are indeed implementations where an unsigned int has 16
bits and no more, although this is no longer common in popular
desktop systems. On such a system, shifting an unsigned int left by 24,
or by 16 as below, generates undefined behavior.

This should be written as:

num = (unsigned long)a << 24;

The first one only should be "=" rather than "|=".
num |= b << 16;
This one also needs the cast to (unsigned long). The other two do
not.
num |= c << 8;
num |= d << 0;

printf("a = %02.2x\nb = %02.2x\nc = %02.2x\nd = %02.2x\n", a, b, c, d);
printf("num = %08.8x\n", num);

return 0;
}

What I'm trying to do here is pack a, b, c, and d into num.

It works if I set num = 0, but how come it's not working if I leave it out
(like above)? I know that when num is declared, its memory contents is full
of garbage, but I thought that by packing the ints into it would've
overwritten all the garbage? (hope that makes sense)

Thanks,
Joe


--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Nov 14 '05 #5
Mark McIntyre wrote:
<Jo***************@boeing.com> wrote:
#include <stdio.h>

int main()
{
unsigned long num;


this is an uninitialised variable. It contains garbage.
num |= a << 24;


here you OR the bits of a with garbage.
GIGO.
What I'm trying to do here is pack a, b, c, and d into num.


This is not guaranteed to work at all - you're assuming that
sizeof(long)==4 which need not be true.
I thought that by packing the ints into it would've
overwritten all the garbage? (hope that makes sense)


You're ORing it with the garbage.


In addition, the "a << 24" expression above is an unsigned int
expression. It is only converted to unsigned long for the OR into num,
which is too late. If int happens to be 16 bits you have undefined
behaviour.
Thus that should be written as:

((unsigned long)a << 24)

and similarly for the b value. Next you should worry about
CHAR_BIT being larger than 8, and a,b,c,d potentially containing
values larger than 255.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!
Nov 14 '05 #6
Joe Laughlin wrote:

#include <stdio.h>

int main()
{
unsigned long num;
unsigned int a = 10, b = 20, c = 30, d = 40;
/* num = 0; */

num |= a << 24;
num |= b << 16;
num |= c << 8;
num |= d << 0;

printf("a = %02.2x\nb = %02.2x\nc = %02.2x\nd = %02.2x\n", a, b, c, d);
printf("num = %08.8x\n", num);

return 0;
}

What I'm trying to do here is pack a, b, c, and d into num.

It works if I set num = 0, but how come it's not working if I leave it out
(like above)? I know that when num is declared, its memory contents is full
of garbage, but I thought that by packing the ints into it would've
overwritten all the garbage? (hope that makes sense)

Thanks,
Joe


Other posters pointed out the issues with
the above.

Have you thought about using a union (let the
compiler do the work for you)?

#include <stdio.h>

int
main()
{
union {
unsigned long n;
unsigned char piece[ sizeof (unsigned long) ];
} num;
unsigned int a = 10, b = 20, c = 30, d = 40;

num.n = 0; /* clear any bytes beyond piece[ 3 ] */
num.piece[ 0 ] = a;
num.piece[ 1 ] = b;
num.piece[ 2 ] = c;
num.piece[ 3 ] = d;

printf("a = %2.2x\nb = %2.2x\nc = %2.2x\nd = %2.2x\n", a, b, c, d);
printf("num = %8.8lx\n", num.n);

return (0);
}

Of course, if your unsigned ints overflow what will fit
in an unsigned char, you'll get unknown results...
HTH,

Stephen
Nov 14 '05 #7
>Joe Laughlin wrote:
[code that uses left-shift and bitwise-OR to construct a 32-bit value
from four eight-bit values, with a slight flaw]

In article <news:40***************@cost-com.net>
Stephen L. <sd*********@cost-com.net> writes:
Other posters pointed out the issues with
the above.

Have you thought about using a union (let the
compiler do the work for you)?
This method has advantages and disadvantages. Often the
disadvantages outweigh the advantages:
union {
unsigned long n;
unsigned char piece[ sizeof (unsigned long) ];
} num;

num.piece[ 0 ] = a; [and so on]
Of course, if your unsigned ints overflow what will fit
in an unsigned char, you'll get unknown results...


The disadvantage of using the union trick -- or, equivalently,
using an "unsigned char *" to point to the individual C-bytes that
make up an "unsigned long" -- is that you expose yourself to the
implementation's representation. In particular, on common
implementations today, you now have to worry about:

- sizeof(unsigned long) changing from 4 to 8
- endianness

The advantage of using the union trick is the same as the disadvantage:
you expose yourself to the implementation's representation. If
that is what you *want* to do, go ahead and do it. On the other
hand, if you just want to compose a predictable 32-bit value from
four eight-bit values, the shift-and-bitwise-OR method will always
work. The common concerns above (sizeof(unsigned long) and
endianness) become entirely irrelevant.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 14 '05 #8
"Stephen L." wrote:
.... snip ...
Have you thought about using a union (let the
compiler do the work for you)?

int
main()
{
union {
unsigned long n;
unsigned char piece[ sizeof (unsigned long) ];
} num;
unsigned int a = 10, b = 20, c = 30, d = 40;


That approach is inherently unsafe, both for misuse of the union,
and for dependence on byte sex.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!

Nov 14 '05 #9
Hiho,
[union-for-bytewise-access]

That approach is inherently unsafe, both for misuse of the union,
and for dependence on byte sex.


Mmmm, I know about the former but could you or someone else please
expand your answer concerning the latter? I am not even sure what
you mean by byte sex...
I quickly scanned the question list of the faq and did not find
one dealing with this. For accessing double bits/bytes/words/whatever,
I would rather look at the memory representation with the
help of pointers but I am not sure whether this is the only, let
alone the best way.
Cheers,
Michael

Nov 14 '05 #10
On Fri, 11 Jun 2004 21:26:13 -0500, in comp.lang.c , Jack Klein
<ja*******@spamcop.net> wrote:
On Sat, 12 Jun 2004 00:27:04 +0100, Mark McIntyre
<ma**********@spamcop.net> wrote in comp.lang.c:
On Fri, 11 Jun 2004 23:01:49 GMT, in comp.lang.c , "Joe Laughlin"
<Jo***************@boeing.com> wrote:

>What I'm trying to do here is pack a, b, c, and d into num.
This is not guaranteed to work at all - you're assuming that
sizeof(long)==4 which need not be true.


No he's not. He's assuming that unsigned long contains at least 32
value bits,


And that each of his unsigned ints has at most 8 relevant value bits. Is
that certain, on all implementations?
and that is guaranteed by the standard.
You're right in that.
On the compiler I used at work today, unsigned long has exactly 32
value bits. And sizeof(unsigned long) is 2, not 4.


I'd be interested to know if it has a conforming hosted implementation,
though :-)
--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.angelfire.com/ms3/bchambless0/welcome_to_clc.html>
Nov 14 '05 #11
On Sat, 12 Jun 2004 07:30:16 -0400, in comp.lang.c , "Stephen L."
<sd*********@cost-com.net> wrote:
Joe Laughlin wrote:
Have you thought about using a union (let the
compiler do the work for you)?


snip example of packing a union and then unpacking it differently.
Of course, if your unsigned ints overflow what will fit
in an unsigned char, you'll get unknown results...


But reading from a union by accessing a member other than that which you
wrote last is UB anyway. It's a common extension to place meaning on the
behaviour of course, but you can't rely on it.

--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.angelfire.com/ms3/bchambless0/welcome_to_clc.html>
Nov 14 '05 #12
On Sun, 13 Jun 2004 22:23:09 +0200, in comp.lang.c , Michael Mair
<ma********************@ians.uni-stuttgart.de> wrote:
Hiho,
[union-for-bytewise-access]

That approach is inherently unsafe, both for misuse of the union,
and for dependence on byte sex.


Mmmm, I know about the former but could you or someone else please
expand your answer concerning the latter? I am not even sure what
you mean by byte sex...


I think CBF means Endianness. See Chris Torek's post.

--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.angelfire.com/ms3/bchambless0/welcome_to_clc.html>
Nov 14 '05 #13
in comp.lang.c i read:
[union-for-bytewise-access]
That approach is inherently unsafe, both for misuse of the union,
and for dependence on byte sex.


Mmmm, I know about the former but could you or someone else please
expand your answer concerning the latter? I am not even sure what
you mean by byte sex...


which end of the array of bytes is the least significant, i.e., is the
least significant byte [0] or [sizeof object - 1]? and those are not the
only possibilities if sizeof object is greater than 2, where [1] is another
though it's not common to see these days.

signed long value = 1;
unsigned char bytes[sizeof value];
memcpy(bytes, &value, sizeof value);
/* is bytes[0], bytes[1] or bytes[sizeof bytes - 1], or some other, the 1? */

today one tends to see 01 00 00 00 (little endian) or 00 00 00 01 (big
endian), which correspond to b[0] or b[sizeof b - 1] having the 1.

and that ignores padding, which is, again, unlikely these days but it is
allowed, and for all anyone knows it will reappear or your program will
have to work on a dinosaur. if there is padding then it may be that none
of the bytes in my example will be a 1, or there may be more than one with
a non-zero value.

oh, and what if sizeof value is 1? are you thinking `how can such a thing
be'? in c it is possible if CHAR_BIT is 32 or larger, which is seen on
today's dsp's. in that case you aren't accessing octets, which is often
what people want to do with the sort of tricks discussed, rather you are
accessing the one, 32 bit, byte, hence [0] is all there is, but how you
serialize its octets remains an issue.

all this makes working with internal representations a difficult and
tedious, though not insurmountable thing.

--
a signature
Nov 14 '05 #14
Michael Mair wrote:

[union-for-bytewise-access]

That approach is inherently unsafe, both for misuse of the union,
and for dependence on byte sex.


Mmmm, I know about the former but could you or someone else please
expand your answer concerning the latter? I am not even sure what
you mean by byte sex...


Also known as endianness. The order of octets within the
representation of an integer, or other item. I know of at least 3
fairly popular versions for 32 bit integers. The shift, mask, and
add method is independent of this.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!
Nov 14 '05 #15
Thanks for enlightening me :-)

Nov 14 '05 #16
Mark McIntyre <ma**********@spamcop.net> wrote:
But reading from a union by accessing a member other than that which you
wrote last is UB anyway. It's a common extension to place meaning on the
behaviour of course, but you can't rely on it.


FWIW, doesn't this make unions worse than useless in portable code?
If there were two (or more) variables that never needed to maintain
their values concurrently, I would expect a compiler to assign them
the same bit of memory anyway.
Nov 14 '05 #18
ol*****@inspire.net.nz (Old Wolf) wrote:
Mark McIntyre <ma**********@spamcop.net> wrote:
But reading from a union by accessing a member other than that which you
wrote last is UB anyway. It's a common extension to place meaning on the
behaviour of course, but you can't rely on it.


FWIW, doesn't this make unions worse than useless in portable code?
If there were two (or more) variables that never needed to maintain
their values concurrently, I would expect a compiler to assign them
the same bit of memory anyway.


It cannot always tell. Besides, which is the more useful declaration:

int foodvalue(union animal animal);

or

int foodvalue(struct common_animal animal,
struct fish fish, struct bird bird, struct mammal mammal,
enum taxon which_taxon);

I'd prefer the former.

Richard
Nov 14 '05 #19
On 15 Jun 2004 17:31:03 -0700, in comp.lang.c , ol*****@inspire.net.nz (Old
Wolf) wrote:
Mark McIntyre <ma**********@spamcop.net> wrote:
But reading from a union by accessing a member other than that which you
wrote last is UB anyway. It's a common extension to place meaning on the
behaviour of course, but you can't rely on it.


FWIW, doesn't this make unions worse than useless in portable code?


Not at all. Take a look at how the original MS Excel toolkit passed cell
contents to and from functions:

struct
{
char datatype; /* says which member of the union is in use */
union
{
int intdata;
long longdata;
float floatdata;
double doubledata;
// etc
};
};

--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.angelfire.com/ms3/bchambless0/welcome_to_clc.html>
Nov 14 '05 #20
