A byte can be greater than 8 bits?

As I read it, C99 states that a byte is an:

"addressable unit of data storage large enough to hold any member of
the basic character set of the execution environment" (3.6)

and that a byte must be at least 8 bits:

"The values given below shall be replaced by constant expressions
suitable for use in #if preprocessing directives. Moreover, except for
CHAR_BIT and MB_LEN_MAX, the following shall be replaced by expressions
that have the same type as would an expression that is an object of the
corresponding type converted according to the integer promotions. Their
implementation-defined values shall be equal or greater in magnitude
(absolute value) to those shown, with the same sign."

number of bits for smallest object that is not a bit-field (byte)
CHAR_BIT 8 (5.2.4.2.1)

Does this mean that a byte can be larger than 8 bits (ie CHAR_BIT >
8)? I have gotten the impression that a byte, or unsigned char, was
always 8 bits, but perhaps I was wrong. If I am not, is there
somewhere in the standard that defines a byte as always being 8 bits?

Regards,
B.

Oct 1 '07 #1
bo*******@gmail.com wrote:
> As I read it, C99 states that a byte is an:
> "addressable unit of data storage large enough to hold any member of
> the basic character set of the execution environment" (3.6)
>
> and that a byte must be at least 8 bits:
> ...
> Does this mean that a byte can be larger than 8 bits (ie CHAR_BIT >
> 8)? I have gotten the impression that a byte, or unsigned char, was
> always 8 bits, but perhaps I was wrong.
Correct, a byte can be larger than 8 bits. For example, the Texas
Instruments C compiler for the DSP 2000 processor family defines
CHAR_BIT == 16, and sizeof(char) == sizeof(short) == sizeof(int) == 1.
--
Roberto Waltman

[ Please reply to the group,
return address is invalid ]
Oct 1 '07 #2

I would be very interested for posters here to list the current systems
they use which would cause problems by breaking the rules. It might be a
good addition to the FAQ for people to go out and find real systems so as
to understand better why they need to be careful:

Martin Wells <wa****@eircom.net> writes:
>> Does this mean that a byte can be larger than 8 bits (ie CHAR_BIT >
>> 8)?
>
> Yes, a byte can have more than eight bits. Welcome to the world of
> portable C programming :D Here's a few other things to look out for in
> portable programming:

A) Systems where a byte is more than 8 bits :

> 1: Null pointers aren't necessarily represented as all-bits-zero, so
> think twice about using memset(array_of_pointers,0,sizeof
> array_of_pointers).

B) Systems where a NULL pointer can not be set by applying 0s.

> 2: Integer types other than unsigned char may contain padding bits, so
> stay away from memcmp(arr1,arr2,sizeof arr1).

C) Places where the padding bits are not consistently set between 2
arrays of the same types.

> 3: Number systems other than two's complement may be used, so be
> careful about doing things like using -1 to represent all-bits-one.

D) Where -1 is not an all bits on.

> 4: Function pointers might not fit inside ANY of the other types (e.g.
> such as void* or unsigned long), so don't do that.

E) Where a function pointer can not be stored in a VOID
pointer. (Regardless of style)

Oct 1 '07 #3
Richard wrote:
> I would be very interested for posters here to list the current systems
> they use which would cause problems by breaking the rules. It might be a
> good addition to the FAQ for people to go out and find real systems so as
> to understand better why they need to be careful:

I partly agree and partly disagree here. I agree because it shows to new
people that strictly following the standard /is/ important because these
common assumptions aren't necessarily true. But I disagree because it
may encourage the view that if a system cannot be found which differs
from the norm, then it doesn't matter that we're making assumptions.

> Martin Wells <wa****@eircom.net> writes:
>>> Does this mean that a byte can be larger than 8 bits (ie CHAR_BIT >
>>> 8)?
>>
>> Yes, a byte can have more than eight bits. Welcome to the world of
>> portable C programming :D Here's a few other things to look out for in
>> portable programming:
>
> A) Systems where a byte is more than 8 bits :

DSPs are a common example here.

>> 1: Null pointers aren't necessarily represented as all-bits-zero, so
>> think twice about using memset(array_of_pointers,0,sizeof
>> array_of_pointers).
>
> B) Systems where a NULL pointer can not be set by applying 0s.

What do you mean by "applying 0s"? Null pointers can always be set by:
    foo *x = (foo *)0;
(I'm not sure if the cast is necessary) even if the null pointer
representation is not all-bits-zero.

I personally don't know of any non-all-bits-zero null pointer systems,
but I've never seen reason to worry about it. I know /my/ code will work
everywhere regardless. As Kernighan and Ritchie wisely point out, ``if
you don't know how they are done on various machines, that innocence may
help to protect you.''

>> 2: Integer types other than unsigned char may contain padding bits, so
>> stay away from memcmp(arr1,arr2,sizeof arr1).
>
> C) Places where the padding bits are not consistently set between 2
> arrays of the same types.

That is user-program dependent. Any system with padding bits can have
them set by the user program:

    #include <string.h>

    int main(void) {
        unsigned int arr1[2], arr2[2];
        arr1[0] = arr1[1] = (unsigned int) -1;
        /* memset takes (dest, byte value, byte count) */
        memset(arr2, (unsigned char)-1, sizeof arr2);
        /* At this point, arr1's values equal arr2's values but
           the padding bits may differ. */
        return 0;
    }
>> 3: Number systems other than two's complement may be used, so be
>> careful about doing things like using -1 to represent all-bits-one.
>
> D) Where -1 is not an all bits on.

This is complicated again. For unsigned types, -1 is always all-bits-one.
I can't think why you'd want to set a signed type to all-bits-one.

>> 4: Function pointers might not fit inside ANY of the other types (e.g.
>> such as void* or unsigned long), so don't do that.
>
> E) Where a function pointer can not be stored in a VOID
> pointer. (Regardless of style)

I can't think why you'd want to do this.

--
Philip Potter pgp <at> doc.ic.ac.uk
Oct 1 '07 #4
Philip Potter <pg*@see.sig.invalid> writes:
> Richard wrote:
>> I would be very interested for posters here to list the current systems
>> they use which would cause problems by breaking the rules. It might be a
>> good addition to the FAQ for people to go out and find real systems so as
>> to understand better why they need to be careful:
>
> I partly agree and partly disagree here. I agree because it shows to
> new people that strictly following the standard /is/ important because
> these common assumptions aren't necessarily true. But I disagree
> because it may encourage the view that if a system cannot be found
> which differs from the norm, then it doesn't matter that we're making
> assumptions.

No. I think it's important to show that these systems really
exist. Otherwise a lot of people will think it's all a lot of hot
air. It does no harm to demonstrate WHERE the standard benefits the
programmer. Not some airy fairy "there might be a system with a 13 bit
char" for example.

>> Martin Wells <wa****@eircom.net> writes:
>>>> Does this mean that a byte can be larger than 8 bits (ie CHAR_BIT >
>>>> 8)?
>>>
>>> Yes, a byte can have more than eight bits. Welcome to the world of
>>> portable C programming :D Here's a few other things to look out for in
>>> portable programming:
>>
>> A) Systems where a byte is more than 8 bits :
>
> DSPs are a common example here.
>
>>> 1: Null pointers aren't necessarily represented as all-bits-zero, so
>>> think twice about using memset(array_of_pointers,0,sizeof
>>> array_of_pointers).
>>
>> B) Systems where a NULL pointer can not be set by applying 0s.
>
> What do you mean by "applying 0s"? Null pointers can always be set by:
>     foo *x = (foo *)0;

See above where it clearly says using memset.

> (I'm not sure if the cast is necessary) even if the null pointer
> representation is not all-bits-zero.
>
> I personally don't know of any non-all-bits-zero null pointer systems,
> but I've never seen reason to worry about it. I know /my/ code will
> work everywhere regardless. As Kernighan and Ritchie wisely point out,
> ``if you don't know how they are done on various machines, that
> innocence may help to protect you.''

Yes, but this has nothing to do with the question. The question being
posed is to demonstrate where this might be an issue if you do NOT stick
to the rules.

>>> 2: Integer types other than unsigned char may contain padding bits, so
>>> stay away from memcmp(arr1,arr2,sizeof arr1).
>>
>> C) Places where the padding bits are not consistently set between 2
>> arrays of the same types.
>
> That is user-program dependent. Any system with padding bits can have
> them set by the user program:
>     int main(void) {
>         unsigned int arr1[2], arr2[2];
>         arr1[0] = arr1[1] = (unsigned int) -1;
>         memset(arr2, (unsigned char)-1, sizeof arr2);
>         /* At this point, arr1's values equal arr2's values but
>            the padding bits may differ. */
>     }

This is a forced issue and nothing to do with the question about
compiler or platform specifics. We are talking about two arrays of the
same objects being compared without using memset. In the real world, not
the hypothetical world.

>>> 3: Number systems other than two's complement may be used, so be
>>> careful about doing things like using -1 to represent all-bits-one.
>>
>> D) Where -1 is not an all bits on.
>
> This is complicated again. For unsigned types, -1 is always
> all-bits-one. I can't think why you'd want to set a signed type to
> all-bits-one.

This is not the question. The question is on which real platforms -1 is
not represented as all bits on. This comes up all the time.

>>> 4: Function pointers might not fit inside ANY of the other types (e.g.
>>> such as void* or unsigned long), so don't do that.
>>
>> E) Where a function pointer can not be stored in a VOID
>> pointer. (Regardless of style)
>
> I can't think why you'd want to do this.

This is not the question. The question is where it would not work.
Oct 1 '07 #5
Richard wrote:
> Philip Potter <pg*@see.sig.invalid> writes:
>> Richard wrote:
>>> I would be very interested for posters here to list the current systems
>>> they use which would cause problems by breaking the rules. It might be a
>>> good addition to the FAQ for people to go out and find real systems so as
>>> to understand better why they need to be careful:
>>
>> I partly agree and partly disagree here. I agree because it shows to
>> new people that strictly following the standard /is/ important because
>> these common assumptions aren't necessarily true. But I disagree
>> because it may encourage the view that if a system cannot be found
>> which differs from the norm, then it doesn't matter that we're making
>> assumptions.
>
> No. I think it's important to show that these systems really
> exist. Otherwise a lot of people will think it's all a lot of hot
> air. It does no harm to demonstrate WHERE the standard benefits the
> programmer. Not some airy fairy "there might be a system with a 13 bit
> char" for example.

But again, if you don't know how things are done on systems, you will be
protected from it. I know that NULL is not guaranteed to be
all-bits-zero, and I don't know where it is and where it isn't. I'm not
even sure whether or not NULL is all-bits-zero on x86, and I'm happy to
stay that way.

It's good to have a couple of examples of why certain common assumptions
shouldn't be relied on, but compiling a comprehensive list is asking for
people to rely on that list instead.

>>> Martin Wells <wa****@eircom.net> writes:
>>>>> Does this mean that a byte can be larger than 8 bits (ie CHAR_BIT >
>>>>> 8)?
>>>>
>>>> Yes, a byte can have more than eight bits. Welcome to the world of
>>>> portable C programming :D Here's a few other things to look out for in
>>>> portable programming:
>>>
>>> A) Systems where a byte is more than 8 bits :
>>
>> DSPs are a common example here.
>>
>>>> 1: Null pointers aren't necessarily represented as all-bits-zero, so
>>>> think twice about using memset(array_of_pointers,0,sizeof
>>>> array_of_pointers).
>>>
>>> B) Systems where a NULL pointer can not be set by applying 0s.
>>
>> What do you mean by "applying 0s"? Null pointers can always be set by:
>>     foo *x = (foo *)0;
>
> See above where it clearly says using memset.
>
>> (I'm not sure if the cast is necessary) even if the null pointer
>> representation is not all-bits-zero.
>>
>> I personally don't know of any non-all-bits-zero null pointer systems,
>> but I've never seen reason to worry about it. I know /my/ code will
>> work everywhere regardless. As Kernighan and Ritchie wisely point out,
>> ``if you don't know how they are done on various machines, that
>> innocence may help to protect you.''
>
> Yes, but this has nothing to do with the question. The question being
> posed is to demonstrate where this might be an issue if you do NOT stick
> to the rules.

This is why writing standard C is the /easy/ way: because you don't need
to compile such compatibility lists.

>>>> 2: Integer types other than unsigned char may contain padding bits, so
>>>> stay away from memcmp(arr1,arr2,sizeof arr1).
>>>
>>> C) Places where the padding bits are not consistently set between 2
>>> arrays of the same types.
>>
>> That is user-program dependent. Any system with padding bits can have
>> them set by the user program:
>>     int main(void) {
>>         unsigned int arr1[2], arr2[2];
>>         arr1[0] = arr1[1] = (unsigned int) -1;
>>         memset(arr2, (unsigned char)-1, sizeof arr2);
>>         /* At this point, arr1's values equal arr2's values but
>>            the padding bits may differ. */
>>     }
>
> This is a forced issue and nothing to do with the question about
> compiler or platform specifics. We are talking about two arrays of the
> same objects being compared without using memset. In the real world, not
> the hypothetical world.

Why does your requirement exclude use of memset? Someone who uses
memcmp() will also use memset(). Someone who uses memset() may not do so
for every array he uses.

>>>> 3: Number systems other than two's complement may be used, so be
>>>> careful about doing things like using -1 to represent all-bits-one.
>>>
>>> D) Where -1 is not an all bits on.
>>
>> This is complicated again. For unsigned types, -1 is always
>> all-bits-one. I can't think why you'd want to set a signed type to
>> all-bits-one.
>
> This is not the question. The question is on which real platforms -1 is
> not represented as all bits on. This comes up all the time.

Give me an example of code which relies on -1 being all-bits-one.

>>>> 4: Function pointers might not fit inside ANY of the other types (e.g.
>>>> such as void* or unsigned long), so don't do that.
>>>
>>> E) Where a function pointer can not be stored in a VOID
>>> pointer. (Regardless of style)
>>
>> I can't think why you'd want to do this.
>
> This is not the question. The question is where it would not work.

Why is that question relevant if no one tries to do it? I don't see you
asking for a list of platforms where NULL is 0xdeadbeef, because no code
depends on that.

--
Philip Potter pgp <at> doc.ic.ac.uk
Oct 1 '07 #6
Philip Potter wrote:
> Give me an example of code which relies on -1 being all-bits-one.

#include <stdio.h>
int maskNLowerBits(unsigned d, int n)
{
    return d & (-1 << n);
}

int main(void)
{
    unsigned i;
    unsigned u = (unsigned)-1;

    for (i = 0; i < 32; i++)
        printf("Masking lower %d bits of %x is: %x\n",
               i, u, maskNLowerBits(u, i));
}

Output:
Masking lower 0 bits of ffffffff is: ffffffff
Masking lower 1 bits of ffffffff is: fffffffe
Masking lower 2 bits of ffffffff is: fffffffc
Masking lower 3 bits of ffffffff is: fffffff8
Masking lower 4 bits of ffffffff is: fffffff0
Masking lower 5 bits of ffffffff is: ffffffe0
Masking lower 6 bits of ffffffff is: ffffffc0
Masking lower 7 bits of ffffffff is: ffffff80
Masking lower 8 bits of ffffffff is: ffffff00
Masking lower 9 bits of ffffffff is: fffffe00
Masking lower 10 bits of ffffffff is: fffffc00
Masking lower 11 bits of ffffffff is: fffff800
Masking lower 12 bits of ffffffff is: fffff000
Masking lower 13 bits of ffffffff is: ffffe000
Masking lower 14 bits of ffffffff is: ffffc000
Masking lower 15 bits of ffffffff is: ffff8000
Masking lower 16 bits of ffffffff is: ffff0000
Masking lower 17 bits of ffffffff is: fffe0000
Masking lower 18 bits of ffffffff is: fffc0000
Masking lower 19 bits of ffffffff is: fff80000
Masking lower 20 bits of ffffffff is: fff00000
Masking lower 21 bits of ffffffff is: ffe00000
Masking lower 22 bits of ffffffff is: ffc00000
Masking lower 23 bits of ffffffff is: ff800000
Masking lower 24 bits of ffffffff is: ff000000
Masking lower 25 bits of ffffffff is: fe000000
Masking lower 26 bits of ffffffff is: fc000000
Masking lower 27 bits of ffffffff is: f8000000
Masking lower 28 bits of ffffffff is: f0000000
Masking lower 29 bits of ffffffff is: e0000000
Masking lower 30 bits of ffffffff is: c0000000
Masking lower 31 bits of ffffffff is: 80000000

OK?

And you haven't given ANY example as Richard asked!

--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Oct 1 '07 #7
jacob navia wrote:
> Philip Potter wrote:
>> Give me an example of code which relies on -1 being all-bits-one.
>
> #include <stdio.h>
> int maskNLowerBits(unsigned d, int n)
> {
>     return d & (-1 << n);
> }

Maybe Philip should have specified "code that doesn't
invoke undefined behavior." After the fix s/1/1u/ the
corrected code no longer relies on -1 being all-bits-one.

--
Eric Sosman
es*****@ieee-dot-org.invalid
Oct 1 '07 #8
Eric Sosman wrote:
> jacob navia wrote:
>> Philip Potter wrote:
>>> Give me an example of code which relies on -1 being all-bits-one.
>>
>> #include <stdio.h>
>> int maskNLowerBits(unsigned d, int n)
>> {
>>     return d & (-1 << n);
>> }
>
> Maybe Philip should have specified "code that doesn't
> invoke undefined behavior." After the fix s/1/1u/ the
> corrected code no longer relies on -1 being all-bits-one.

??
Is it legal to make -1u??
Unary minus applied to unsigned operand?
--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Oct 1 '07 #9
Philip Potter wrote:

[snip]

The fact that Philip could not bring a single example
probably means that there aren't any machines where
all those terrible conditions apply!
--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Oct 1 '07 #10
