
Implementations with CHAR_BIT=32

The standard allows an implementation where CHAR_BIT = 32 and sizeof
char = sizeof short = sizeof int = sizeof long = 1, right?

Just out of curiosity, does anyone know actual implementations that have
this?

S.
Nov 15 '05 #1
Skarmander <in*****@dontmailme.com> writes:
> The standard allows an implementation where CHAR_BIT = 32 and sizeof
> char = sizeof short = sizeof int = sizeof long = 1, right?
Yes. A lot of otherwise well-written code would malfunction on
such a (hosted) implementation, because EOF is now in the range
of a signed char. Another problem is that declaring an array of
UCHAR_MAX elements is probably not possible; UCHAR_MAX + 1
elements is a constraint violation. I'm sure that other common
practices would fail as well.
> Just out of curiosity, does anyone know actual implementations that have
> this?


I have heard that some DSPs use this model, but not hosted
implementations.
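
To make the array problem concrete, here is a sketch of the common
byte-frequency-table idiom (ordinary standard C; the comments mark where
the hypothetical CHAR_BIT == 32, sizeof(int) == 1 implementation would
reject it):

    #include <limits.h>
    #include <stdio.h>

    /* A byte-frequency table sized by UCHAR_MAX -- a common idiom.
     * With CHAR_BIT == 8 this is a modest 256-entry array. On the
     * hypothetical implementation, UCHAR_MAX == UINT_MAX, so the
     * expression UCHAR_MAX + 1 is evaluated in unsigned arithmetic,
     * wraps to 0, and the declaration becomes a constraint violation
     * (an array size must be greater than zero). */
    static unsigned long freq[UCHAR_MAX + 1];

    int main(void)
    {
        int c;
        while ((c = getchar()) != EOF)
            freq[(unsigned char)c]++;
        printf("'A' occurred %lu times\n", freq[(unsigned char)'A']);
        return 0;
    }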
--
"...Almost makes you wonder why Heisenberg didn't include postinc/dec operators
in the uncertainty principle. Which of course makes the above equivalent to
Schrodinger's pointer..."
--Anthony McDonald
Nov 15 '05 #2
Ben Pfaff wrote:
> Skarmander <in*****@dontmailme.com> writes:
>
> > The standard allows an implementation where CHAR_BIT = 32 and sizeof
> > char = sizeof short = sizeof int = sizeof long = 1, right?
>
> Yes. A lot of otherwise well-written code would malfunction on
> such a (hosted) implementation, because EOF is now in the range
> of a signed char. Another problem is that declaring an array of
> UCHAR_MAX elements is probably not possible; UCHAR_MAX + 1
> elements is a constraint violation. I'm sure that other common
> practices would fail as well.

<snip>
I'd imagine that declaring an array of UCHAR_MAX elements is most
commonly done under the assumption that `char' is not significantly
larger than necessary to hold the characters in the character set (which
fails), and that the character set is "small" for some reasonable value
of "small", which does not include "32 bits" (this will probably still
hold).

Either that or the application really wants 8-bit bytes, but is using
UCHAR_MAX because it looks neater (which could be considered a bug, not
just an assumption).

I don't quite see the EOF problem, though. It's probably just my lack of
imagination, but could you give a code snippet that fails?

S.
Nov 15 '05 #3
Skarmander <in*****@dontmailme.com> writes:
> I don't quite see the EOF problem, though. It's probably just my lack of
> imagination, but could you give a code snippet that fails?


Something like this is often used to detect end-of-file or error:
    if (getc(file) == EOF) {
        /* ...handle error or end of file... */
    }
If "int" and "char" have the same range, then a return value of
EOF doesn't necessarily mean that an error or end-of-file was
encountered.
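
The usual defence is to consult feof()/ferror() whenever EOF comes back;
a sketch (read_byte is a hypothetical helper, not a standard function):

    #include <stdio.h>

    /* Read one byte, distinguishing a genuine end-of-file or error from
     * a data byte whose value happens to compare equal to EOF. On common
     * implementations the feof()/ferror() check is redundant; when
     * sizeof(int) == 1 it is the only way to tell the cases apart. */
    int read_byte(FILE *file, int *out)
    {
        int c = getc(file);
        if (c == EOF && (feof(file) || ferror(file)))
            return 0;   /* real end of file, or a read error */
        *out = c;       /* a data byte, possibly one equal to EOF */
        return 1;
    }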
--
"To get the best out of this book, I strongly recommend that you read it."
--Richard Heathfield
Nov 15 '05 #4
Skarmander wrote:
> Ben Pfaff wrote:
> > Skarmander <in*****@dontmailme.com> writes:
> >
> > > The standard allows an implementation where CHAR_BIT = 32 and sizeof
> > > char = sizeof short = sizeof int = sizeof long = 1, right?
> >
> > Yes. A lot of otherwise well-written code would malfunction on
> > such a (hosted) implementation, because EOF is now in the range
> > of a signed char. Another problem is that declaring an array of
> > UCHAR_MAX elements is probably not possible; UCHAR_MAX + 1
> > elements is a constraint violation. I'm sure that other common
> > practices would fail as well.
>
> <snip>
> I'd imagine that declaring an array of UCHAR_MAX elements is most
> commonly done under the assumption that `char' is not significantly
> larger than necessary to hold the characters in the character set (which
> fails), and that the character set is "small" for some reasonable value
> of "small", which does not include "32 bits" (this will probably still
> hold).
>
> Either that or the application really wants 8-bit bytes, but is using
> UCHAR_MAX because it looks neater (which could be considered a bug, not
> just an assumption).
>
> I don't quite see the EOF problem, though. It's probably just my lack of
> imagination, but could you give a code snippet that fails?


- Functions from <ctype.h> take an int argument whose value must either
be representable as unsigned char or equal the value of the macro
EOF (which is a negative integer constant expression).

- fgetc()/getc() return either the next character, as an unsigned char
converted to int, or EOF; with fputc(), the int you pass is converted
to unsigned char before being written.

If the value range of unsigned char is not contained in int, that
conversion to int can no longer preserve the value (the result is
implementation-defined). Even if we assume, for one moment, that it
simply "wraps around" just as in the unsigned integer case, we still
have the problem that we cannot discern whether a returned EOF is
intended to be EOF or (int)((unsigned char) EOF).

So, character-based I/O and <ctype.h> give us some trouble.
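
As an illustration of the <ctype.h> half of this, here is the
conventional portable idiom (count_digits is just a hypothetical example
function), with a note on why even it breaks down when sizeof(int) == 1:

    #include <ctype.h>

    /* The conventional portable idiom: cast through unsigned char so
     * the argument passed to a <ctype.h> function is either EOF or a
     * value representable as unsigned char, as the standard requires. */
    int count_digits(const char *s)
    {
        int n = 0;
        while (*s != '\0')
            if (isdigit((unsigned char)*s++))
                n++;
        return n;
    }
    /* With sizeof(int) == 1, (unsigned char)*s can exceed INT_MAX; its
     * implicit conversion back to int is implementation-defined and can
     * even compare equal to EOF, so the idiom's usual guarantee is gone. */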
Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.
Nov 15 '05 #5
On 2005-11-07, Skarmander <in*****@dontmailme.com> wrote:
> Ben Pfaff wrote:
> > Skarmander <in*****@dontmailme.com> writes:
> >
> > > The standard allows an implementation where CHAR_BIT = 32 and sizeof
> > > char = sizeof short = sizeof int = sizeof long = 1, right?
> >
> > Yes. A lot of otherwise well-written code would malfunction on
> > such a (hosted) implementation, because EOF is now in the range
> > of a signed char. Another problem is that declaring an array of
> > UCHAR_MAX elements is probably not possible; UCHAR_MAX + 1
> > elements is a constraint violation. I'm sure that other common
> > practices would fail as well.
>
> <snip>
> I'd imagine that declaring an array of UCHAR_MAX elements is most
> commonly done under the assumption that `char' is not significantly
> larger than necessary to hold the characters in the character set
> (which fails), and that the character set is "small" for some
> reasonable value of "small", which does not include "32 bits" (this
> will probably still hold).


eh - 20 bits [actually 20.0875-ish bits] is still pretty large, and
those will most likely need to be stored in 32 bits.
> Either that or the application really wants 8-bit bytes, but is
> using UCHAR_MAX because it looks neater (which could be considered a
> bug, not just an assumption).
>
> I don't quite see the EOF problem, though. It's probably just my
> lack of imagination, but could you give a code snippet that fails?


It would be possible if there actually were a 32-bit character set, or
if getchar() read four bytes at a time from the file.
Nov 15 '05 #6
Jordan Abel <jm****@purdue.edu> writes:
> On 2005-11-07, Skarmander <in*****@dontmailme.com> wrote:
>
> [...]
> > I don't quite see the EOF problem, though. It's probably just my
> > lack of imagination, but could you give a code snippet that fails?
>
> It would be possible if there actually were a 32-bit character set, or
> if getchar() read four bytes at a time from the file.


getchar() by definition reads one byte at a time from the file.
A byte may be larger than 8 bits.
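
A quick probe of what "byte" means on a given implementation (standard
C99; a typical desktop prints 8 and 1/2/4, while the hypothetical machine
in this thread would print 32 and all ones):

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        /* sizeof counts C bytes, i.e. units of CHAR_BIT bits, so
         * sizeof(char) is 1 by definition no matter how wide char is. */
        printf("CHAR_BIT      = %d\n", CHAR_BIT);
        printf("sizeof(char)  = %zu\n", sizeof(char));
        printf("sizeof(short) = %zu\n", sizeof(short));
        printf("sizeof(int)   = %zu\n", sizeof(int));
        printf("sizeof(long)  = %zu\n", sizeof(long));
        return 0;
    }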

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 15 '05 #7
Skarmander wrote:

> Ben Pfaff wrote:
> > Skarmander <in*****@dontmailme.com> writes:
> >
> > > The standard allows an implementation where CHAR_BIT = 32 and sizeof
> > > char = sizeof short = sizeof int = sizeof long = 1, right?
> >
> > Yes. A lot of otherwise well-written code would malfunction on
> > such a (hosted) implementation, because EOF is now in the range
> > of a signed char. Another problem is that declaring an array of
> > UCHAR_MAX elements is probably not possible; UCHAR_MAX + 1
> > elements is a constraint violation. I'm sure that other common
> > practices would fail as well.
>
> <snip>
> I'd imagine that declaring an array of UCHAR_MAX elements is most
> commonly done under the assumption that `char' is not significantly
> larger than necessary to hold the characters in the character set (which
> fails), and that the character set is "small" for some reasonable value
> of "small", which does not include "32 bits" (this will probably still
> hold).
>
> Either that or the application really wants 8-bit bytes, but is using
> UCHAR_MAX because it looks neater (which could be considered a bug, not
> just an assumption).
>
> I don't quite see the EOF problem, though. It's probably just my lack of
> imagination, but could you give a code snippet that fails?


int putchar(int c);

putchar returns either ((int)(unsigned char)c) or EOF.

If sizeof(int) equals one, and c is negative,
then (unsigned char)c is greater than INT_MAX,
and that means that ((int)(unsigned char)c)
would be implementation-defined
and possibly negative, upon success.
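
A sketch of what that does to the usual error check (on today's common
implementations this program just writes one 0xFF byte and the check
never fires; on the hypothetical one, the check could fire on a
successful write):

    #include <stdio.h>

    int main(void)
    {
        /* putchar converts its argument to unsigned char, writes it,
         * and returns that value converted back to int. With
         * sizeof(int) == 1 the round trip through unsigned char can
         * exceed INT_MAX, so the return value of a *successful* write
         * is implementation-defined and may even compare equal to EOF. */
        signed char ch = -1;
        if (putchar(ch) == EOF)
            fputs("write failed -- or did it?\n", stderr);
        return 0;
    }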

--
pete
Nov 15 '05 #8
Skarmander wrote:
> Ben Pfaff wrote:
> > Skarmander <in*****@dontmailme.com> writes:
> >
> > > The standard allows an implementation where CHAR_BIT = 32 and sizeof
> > > char = sizeof short = sizeof int = sizeof long = 1, right?
> >
> > Yes. A lot of otherwise well-written code would malfunction on
> > such a (hosted) implementation, because EOF is now in the range
> > of a signed char. Another problem is that declaring an array of
> > UCHAR_MAX elements is probably not possible; UCHAR_MAX + 1
> > elements is a constraint violation. I'm sure that other common
> > practices would fail as well.
>
> <snip>
> I'd imagine that declaring an array of UCHAR_MAX elements is most
> commonly done under the assumption that `char' is not significantly
> larger than necessary to hold the characters in the character set (which
> fails), and that the character set is "small" for some reasonable value
> of "small", which does not include "32 bits" (this will probably still
> hold).
>
> Either that or the application really wants 8-bit bytes, but is using
> UCHAR_MAX because it looks neater (which could be considered a bug, not
> just an assumption).
>
> I don't quite see the EOF problem, though. It's probably just my lack of
> imagination, but could you give a code snippet that fails?
>
> S.


See http://www.homebrewcpu.com/projects.htm
and scroll down to his last project, "LCC retargeting for D16/M homebrew
computer".
The D16/M is a hobbyist-designed, homebrew CPU which is fully 16 bits.
It cannot even handle 8-bit quantities. So the architecture has 16-bit
chars, 16-bit shorts, 16-bit ints and 16-bit pointers.
And the compiler worked quite well...
If you are never going to process text files generated on other
systems, there's no reason for chars to be 8 bits.

Nov 15 '05 #9
On Mon, 07 Nov 2005 22:29:29 +0100, Skarmander
<in*****@dontmailme.com> wrote in comp.lang.c:
> The standard allows an implementation where CHAR_BIT = 32 and sizeof
> char = sizeof short = sizeof int = sizeof long = 1, right?
>
> Just out of curiosity, does anyone know actual implementations that have
> this?
>
> S.


I used an Analog Devices 32-bit SHARC DSP a few years ago (I forget
the exact model) where CHAR_BIT was 32 and all the integer types (this
was before long long) were 32 bits.

I currently do a lot of work with a Texas Instruments DSP in the
TMS320F28xx family where CHAR_BIT is 16 and the char, short, and int
types all share the same representation and size.

I imagine other DSPs from these and other manufacturers are similar,
although Freescale (formerly Motorola) has a 16-bit DSP that they say
supports CHAR_BIT 8, though I haven't used it.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Nov 15 '05 #10
Ben Pfaff <bl*@cs.stanford.edu> writes:
> Skarmander <in*****@dontmailme.com> writes:
>
> > The standard allows an implementation where CHAR_BIT = 32 and sizeof
> > char = sizeof short = sizeof int = sizeof long = 1, right?
>
> Yes. A lot of otherwise well-written code would malfunction on
> such a (hosted) implementation, because EOF is now in the range
> of a signed char. [...]


Just a reminder that CHAR_BIT == 32 and sizeof(int) == 1 both being
true doesn't automatically imply either that INT_MAX == CHAR_MAX or
that INT_MIN == CHAR_MIN. In particular,

    INT_MAX  2147483647    CHAR_MAX  1073741823
    INT_MIN -2147483648    CHAR_MIN -1073741824

are allowed, or even

    INT_MAX  2147483647    CHAR_MAX  127
    INT_MIN -2147483648    CHAR_MIN -128

are allowed.

What I would expect in a hosted implementation with CHAR_BIT == 32
and sizeof(int) == 1 is

    INT_MAX  2147483647    CHAR_MAX  2147483647
    INT_MIN -2147483648    CHAR_MIN -2147483647

So EOF could be -2147483648 and there would be no conflict with any
character value. Of course, on such a system, outputting binary data
would most likely be done with unsigned char rather than char.
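
A small probe of these relationships on whatever implementation is at
hand (standard C; nothing here is hypothetical):

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        printf("CHAR_BIT = %d\n", CHAR_BIT);
        printf("CHAR_MIN = %d, CHAR_MAX = %d\n", CHAR_MIN, CHAR_MAX);
        printf("INT_MIN  = %d, INT_MAX  = %d\n", INT_MIN, INT_MAX);
        /* EOF is only required to be a negative int constant expression;
         * it need not be -1, and where INT_MIN < CHAR_MIN it could be
         * INT_MIN without colliding with any character value. */
        printf("EOF      = %d\n", EOF);
        return 0;
    }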
Nov 15 '05 #11
Tim Rentsch <tx*@alumnus.caltech.edu> wrote:
> Ben Pfaff <bl*@cs.stanford.edu> writes:
> > Skarmander <in*****@dontmailme.com> writes:
> >
> > > The standard allows an implementation where CHAR_BIT = 32 and sizeof
> > > char = sizeof short = sizeof int = sizeof long = 1, right?
> >
> > Yes. A lot of otherwise well-written code would malfunction on
> > such a (hosted) implementation, because EOF is now in the range
> > of a signed char. [...]
>
> Just a reminder that CHAR_BIT == 32 and sizeof(int) == 1 both being
> true doesn't automatically imply either that INT_MAX == CHAR_MAX or
> that INT_MIN == CHAR_MIN. In particular,
>
>     INT_MAX  2147483647    CHAR_MAX  1073741823
>     INT_MIN -2147483648    CHAR_MIN -1073741824
>
> are allowed, or even
>
>     INT_MAX  2147483647    CHAR_MAX  127
>     INT_MIN -2147483648    CHAR_MIN -128
>
> are allowed.


Not for char, it isn't. Other types can have padding bits; char
(unsigned in any case, and AFAIK since a TC some time ago the other
kinds, too) can not.

Richard
Nov 15 '05 #12
rl*@hoekstra-uitgeverij.nl (Richard Bos) writes:
> Tim Rentsch <tx*@alumnus.caltech.edu> wrote:
> > Ben Pfaff <bl*@cs.stanford.edu> writes:
> > > Skarmander <in*****@dontmailme.com> writes:
> > >
> > > > The standard allows an implementation where CHAR_BIT = 32 and sizeof
> > > > char = sizeof short = sizeof int = sizeof long = 1, right?
> > >
> > > Yes. A lot of otherwise well-written code would malfunction on
> > > such a (hosted) implementation, because EOF is now in the range
> > > of a signed char. [...]
> >
> > Just a reminder that CHAR_BIT == 32 and sizeof(int) == 1 both being
> > true doesn't automatically imply either that INT_MAX == CHAR_MAX or
> > that INT_MIN == CHAR_MIN. In particular,
> >
> >     INT_MAX  2147483647    CHAR_MAX  1073741823
> >     INT_MIN -2147483648    CHAR_MIN -1073741824
> >
> > are allowed, or even
> >
> >     INT_MAX  2147483647    CHAR_MAX  127
> >     INT_MIN -2147483648    CHAR_MIN -128
> >
> > are allowed.
>
> Not for char, it isn't. Other types can have padding bits; char
> (unsigned in any case, and AFAIK since a TC some time ago the other
> kinds, too) can not.


I need to ask for a reference. It's true that unsigned char can't
have padding bits, but I can't find any evidence that signed char
can't have padding bits. Apparently it _was_ true in C89/C90 that the
committee didn't expect that signed chars would ever have padding
bits; however, it seems like this unstated assumption was cleared up
in a DR (DR 069, I believe). The language in 6.2.6.2 p1,p2 seems to
say fairly plainly that padding bits aren't allowed for unsigned char
but are allowed for all other integer types.

There was a posting from Doug Gwyn in comp.std.c on July 12 of this
year saying that signed char's could have padding bits. A search in
Google Groups

"signed char" "padding bits" numeration

should turn that up. The message was:
Doug Gwyn in comp.std.c:

Keith Thompson wrote:
> If it doesn't produce undefined behavior, are we to infer that it
> produces defined behavior? If so, where is the behavior defined?

Certainly the value is well defined for unsigned char
(no padding bits). For signed char there can be
padding bits, but they don't affect the value, AND
(apparently according to the spec) the object can be
accessed regardless of the contents of the padding
bits. The only conformance issues would thus appear
to be: which of the three allowed binary numeration
schemes is used, and how many value bits are present?


Also: even if signed chars have no padding bits, it's still
true that INT_MIN < SCHAR_MIN can hold, which lets INT_MIN
serve as a value for EOF.
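
One way to see whether a type carries padding bits at all is to count
its value bits directly; a sketch for unsigned int (standard C; for a
signed type, the same loop over its MAX value counts the value bits
excluding the sign bit):

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        /* Count the value bits of unsigned int by shifting an all-ones
         * value right until it reaches zero. Padding bits take no part
         * in the value, so the count can come out smaller than
         * CHAR_BIT * sizeof(unsigned int). */
        unsigned int v = UINT_MAX;
        int value_bits = 0;
        while (v != 0) {
            v >>= 1;
            value_bits++;
        }
        printf("unsigned int: %d value bits out of %d storage bits\n",
               value_bits, (int)(sizeof(unsigned int) * CHAR_BIT));
        return 0;
    }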
Nov 15 '05 #13
