Implementations with CHAR_BIT=32

The standard allows an implementation where CHAR_BIT = 32 and sizeof
char = sizeof short = sizeof int = sizeof long = 1, right?

Just out of curiosity, does anyone know actual implementations that have
this?

S.
Nov 15 '05 #1
Skarmander <in*****@dontmailme.com> writes:
> The standard allows an implementation where CHAR_BIT = 32 and sizeof
> char = sizeof short = sizeof int = sizeof long = 1, right?

Yes. A lot of otherwise well-written code would malfunction on
such a (hosted) implementation, because EOF is now in the range
of a signed char. Another problem is that declaring an array of
UCHAR_MAX elements is probably not possible; UCHAR_MAX + 1
elements is a constraint violation. I'm sure that other common
practices would fail as well.
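
For illustration, here is a minimal sketch of the array idiom in question, assuming a simple per-character lookup table. The names `vowel_table`, `init_vowel_table` and `is_vowel` are hypothetical, and the 65535 cut-off in the guard is an arbitrary choice, not something from the standard.

```c
#include <limits.h>

/* Refuse to build where a per-character table would be enormous.
   The 65535 cut-off is an arbitrary, illustrative limit. */
#if UCHAR_MAX > 65535
#error "char is too wide for a per-character lookup table"
#endif

/* One slot per possible unsigned char value. */
static unsigned char vowel_table[UCHAR_MAX + 1];

/* Mark the lowercase vowels; call once before is_vowel(). */
static void init_vowel_table(void)
{
    const char *p;

    for (p = "aeiou"; *p != '\0'; p++)
        vowel_table[(unsigned char)*p] = 1;
}

/* Nonzero if c, converted to unsigned char, is a marked character. */
static int is_vowel(int c)
{
    return vowel_table[(unsigned char)c];
}
```

On an implementation with CHAR_BIT = 32 the guard stops the build early instead of leaving the array declaration to fail in a less obvious way.
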
> Just out of curiosity, does anyone know actual implementations that have
> this?

I have heard that some DSPs use this model, but not hosted
implementations.
--
"...Almost makes you wonder why Heisenberg didn't include postinc/dec operators
in the uncertainty principle. Which of course makes the above equivalent to
Schrodinger's pointer..."
--Anthony McDonald
Nov 15 '05 #2
Ben Pfaff wrote:
> Skarmander <in*****@dontmailme.com> writes:
>
>> The standard allows an implementation where CHAR_BIT = 32 and sizeof
>> char = sizeof short = sizeof int = sizeof long = 1, right?
>
> Yes. A lot of otherwise well-written code would malfunction on
> such a (hosted) implementation, because EOF is now in the range
> of a signed char. Another problem is that declaring an array of
> UCHAR_MAX elements is probably not possible; UCHAR_MAX + 1
> elements is a constraint violation. I'm sure that other common
> practices would fail as well.

<snip>
I'd imagine that declaring an array of UCHAR_MAX elements is most
commonly done under the assumption that `char' is not significantly
larger than necessary to hold the characters in the character set (which
fails), and that the character set is "small" for some reasonable value
of "small", which does not include "32 bits" (this will probably still
hold).

Either that or the application really wants 8-bit bytes, but is using
UCHAR_MAX because it looks neater (which could be considered a bug, not
just an assumption).

I don't quite see the EOF problem, though. It's probably just my lack of
imagination, but could you give a code snippet that fails?

S.
Nov 15 '05 #3
Skarmander <in*****@dontmailme.com> writes:
> I don't quite see the EOF problem, though. It's probably just my lack of
> imagination, but could you give a code snippet that fails?

Something like this is often used to detect end-of-file or error:
    if (getc(file) == EOF) {
        /* ...handle error or end of file... */
    }
If "int" and "char" have the same range, then a return value of
EOF doesn't necessarily mean that an error or end-of-file was
encountered.
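
A hedged sketch of one way to make such a loop robust: keep the result of getc() in an int and, when it compares equal to EOF, consult feof() and ferror() before concluding anything. The function name `count_chars` is hypothetical, and the sketch assumes the stream's error indicator is clear when the function is entered.

```c
#include <stdio.h>

/* Count the characters remaining in fp.  Returns -1 on a read error. */
static long count_chars(FILE *fp)
{
    long n = 0;
    int c;                       /* int, not char: must hold EOF too */

    for (;;) {
        c = getc(fp);
        if (c == EOF) {
            /* Where int and char have the same range, EOF may also be
               a valid character; only the stream indicators settle it. */
            if (feof(fp))
                break;           /* genuine end of file */
            if (ferror(fp))
                return -1;       /* genuine read error */
            /* otherwise: a real character that compares equal to EOF */
        }
        n++;
    }
    return n;
}
```

On ordinary 8-bit-char implementations the extra checks cost little; on an implementation where int and char share a range they are what keeps a legitimate character from being mistaken for end-of-file.
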
--
"To get the best out of this book, I strongly recommend that you read it."
--Richard Heathfield
Nov 15 '05 #4
Skarmander wrote:
> Ben Pfaff wrote:
>> Skarmander <in*****@dontmailme.com> writes:
>>
>>> The standard allows an implementation where CHAR_BIT = 32 and sizeof
>>> char = sizeof short = sizeof int = sizeof long = 1, right?
>>
>> Yes. A lot of otherwise well-written code would malfunction on
>> such a (hosted) implementation, because EOF is now in the range
>> of a signed char. Another problem is that declaring an array of
>> UCHAR_MAX elements is probably not possible; UCHAR_MAX + 1
>> elements is a constraint violation. I'm sure that other common
>> practices would fail as well.
>
> <snip>
> I'd imagine that declaring an array of UCHAR_MAX elements is most
> commonly done under the assumption that `char' is not significantly
> larger than necessary to hold the characters in the character set (which
> fails), and that the character set is "small" for some reasonable value
> of "small", which does not include "32 bits" (this will probably still
> hold).
>
> Either that or the application really wants 8-bit bytes, but is using
> UCHAR_MAX because it looks neater (which could be considered a bug, not
> just an assumption).
>
> I don't quite see the EOF problem, though. It's probably just my lack of
> imagination, but could you give a code snippet that fails?

- Functions from <ctype.h> have an int parameter which is either
representable by unsigned char or equals the value of the macro
EOF (which is a negative integral constant expression).

- fgetc()/getc returns either the next character as unsigned char
converted to an int or EOF; with fputc(), you write an int which
is a char converted to unsigned char.

If the value range of unsigned char is not contained in int we
have signed integer overflow. If we, for one moment, assume this
overflow is well-defined and "wraps around" just as in the unsigned
integer case, then we still have the problem that we cannot
discern whether EOF is intended to be EOF or (int)((unsigned char) EOF).

So, character-based I/O and <ctype.h> give us some trouble.
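
As a small illustration of the <ctype.h> argument rule, here is a sketch that converts each char to unsigned char before classifying it. The function name `count_digits` is hypothetical. Note that this only handles the usual signed-char pitfall; it does not help on an implementation where unsigned char values can exceed INT_MAX, which is the case described above.

```c
#include <ctype.h>
#include <stddef.h>

/* Count the decimal digits in a string. */
static size_t count_digits(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        /* Plain char may be signed; convert to unsigned char first so
           the argument is in the range <ctype.h> accepts. */
        if (isdigit((unsigned char)*s))
            n++;
    }
    return n;
}
```
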
Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.
Nov 15 '05 #5
On 2005-11-07, Skarmander <in*****@dontmailme.com> wrote:
> Ben Pfaff wrote:
>> Skarmander <in*****@dontmailme.com> writes:
>>
>>> The standard allows an implementation where CHAR_BIT = 32 and sizeof
>>> char = sizeof short = sizeof int = sizeof long = 1, right?
>>
>> Yes. A lot of otherwise well-written code would malfunction on
>> such a (hosted) implementation, because EOF is now in the range
>> of a signed char. Another problem is that declaring an array of
>> UCHAR_MAX elements is probably not possible; UCHAR_MAX + 1
>> elements is a constraint violation. I'm sure that other common
>> practices would fail as well.
>
> <snip>
> I'd imagine that declaring an array of UCHAR_MAX elements is most
> commonly done under the assumption that `char' is not significantly
> larger than necessary to hold the characters in the character set
> (which fails), and that the character set is "small" for some
> reasonable value of "small", which does not include "32 bits" (this
> will probably still hold).

eh - 20 bits [actually 20.0875-ish] is still pretty large, and
those will most likely need to be stored in 32 bits.

> Either that or the application really wants 8-bit bytes, but is
> using UCHAR_MAX because it looks neater (which could be considered a
> bug, not just an assumption).
>
> I don't quite see the EOF problem, though. It's probably just my
> lack of imagination, but could you give a code snippet that fails?

It would be possible if there actually were a 32-bit character set, or
if getchar() read four bytes at a time from the file.
Nov 15 '05 #6
Jordan Abel <jm****@purdue.edu> writes:
> On 2005-11-07, Skarmander <in*****@dontmailme.com> wrote:
>
> [...]
>> I don't quite see the EOF problem, though. It's probably just my
>> lack of imagination, but could you give a code snippet that fails?
>
> It would be possible if there actually were a 32-bit character set, or
> if getchar() read four bytes at a time from the file.

getchar() by definition reads one byte at a time from the file.
A byte may be larger than 8 bits.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 15 '05 #7
Skarmander wrote:

> Ben Pfaff wrote:
>> Skarmander <in*****@dontmailme.com> writes:
>>
>>> The standard allows an implementation where CHAR_BIT = 32 and sizeof
>>> char = sizeof short = sizeof int = sizeof long = 1, right?
>>
>> Yes. A lot of otherwise well-written code would malfunction on
>> such a (hosted) implementation, because EOF is now in the range
>> of a signed char. Another problem is that declaring an array of
>> UCHAR_MAX elements is probably not possible; UCHAR_MAX + 1
>> elements is a constraint violation. I'm sure that other common
>> practices would fail as well.
>
> <snip>
> I'd imagine that declaring an array of UCHAR_MAX elements is most
> commonly done under the assumption that `char' is not significantly
> larger than necessary to hold the characters in the character set (which
> fails), and that the character set is "small" for some reasonable value
> of "small", which does not include "32 bits" (this will probably still
> hold).
>
> Either that or the application really wants 8-bit bytes, but is using
> UCHAR_MAX because it looks neater (which could be considered a bug, not
> just an assumption).
>
> I don't quite see the EOF problem, though.
> It's probably just my lack of
> imagination, but could you give a code snippet that fails?

int putchar(int c);

putchar returns either ((int)(unsigned char)c) or EOF.

If sizeof(int) equals one, and c is negative,
then (unsigned char)c is greater than INT_MAX,
and that means that ((int)(unsigned char)c)
would be implementation-defined
and possibly negative, upon success.
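
A minimal sketch of a write helper that does not rely on the return value alone: when putc() returns something equal to EOF, it asks ferror() whether the write actually failed. The name `put_checked` is hypothetical, and the sketch assumes the stream's error indicator was clear before the call.

```c
#include <stdio.h>

/* Write c to fp.  Returns 0 on success, -1 on a write error. */
static int put_checked(int c, FILE *fp)
{
    int r = putc(c, fp);

    /* On success putc returns (int)(unsigned char)c.  Where that value
       could coincide with EOF, ask the stream whether an error occurred. */
    if (r == EOF && ferror(fp))
        return -1;
    return 0;
}
```
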

--
pete
Nov 15 '05 #8
Skarmander wrote:
> Ben Pfaff wrote:
>> Skarmander <in*****@dontmailme.com> writes:
>>
>>> The standard allows an implementation where CHAR_BIT = 32 and sizeof
>>> char = sizeof short = sizeof int = sizeof long = 1, right?
>>
>> Yes. A lot of otherwise well-written code would malfunction on
>> such a (hosted) implementation, because EOF is now in the range
>> of a signed char. Another problem is that declaring an array of
>> UCHAR_MAX elements is probably not possible; UCHAR_MAX + 1
>> elements is a constraint violation. I'm sure that other common
>> practices would fail as well.
>
> <snip>
> I'd imagine that declaring an array of UCHAR_MAX elements is most
> commonly done under the assumption that `char' is not significantly
> larger than necessary to hold the characters in the character set (which
> fails), and that the character set is "small" for some reasonable value
> of "small", which does not include "32 bits" (this will probably still
> hold).
>
> Either that or the application really wants 8-bit bytes, but is using
> UCHAR_MAX because it looks neater (which could be considered a bug, not
> just an assumption).
>
> I don't quite see the EOF problem, though. It's probably just my lack of
> imagination, but could you give a code snippet that fails?
>
> S.

See http://www.homebrewcpu.com/projects.htm
and scroll down to his last project "LCC retargeting for D16/M homebrew
computer".
The D16/M is a hobbyist-designed, homebrew CPU which is fully 16 bits.
It cannot even handle 8 bits. So the architecture has 16bit chars,
16bit shorts, 16bit ints and 16bit pointers.
And the compiler worked quite well...
If you are never going to process text files generated on other
systems, there's no reason for chars to be 8 bits.
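
When data does have to be exchanged with other systems, one common approach is to serialize values as 8-bit octets explicitly rather than relying on the width of char. The following is only a sketch under that assumption; the names `put_u32_be` and `get_u32_be` are hypothetical.

```c
/* Store the low 32 bits of v into out[0..3] as four octets, most
   significant first.  Each element holds a value in 0..255 however
   wide unsigned char is on this implementation. */
static void put_u32_be(unsigned long v, unsigned char out[4])
{
    out[0] = (unsigned char)((v >> 24) & 0xFFu);
    out[1] = (unsigned char)((v >> 16) & 0xFFu);
    out[2] = (unsigned char)((v >> 8) & 0xFFu);
    out[3] = (unsigned char)(v & 0xFFu);
}

/* Rebuild the 32-bit value from four octets, most significant first.
   Bits above the low eight of each element are ignored. */
static unsigned long get_u32_be(const unsigned char in[4])
{
    return ((unsigned long)(in[0] & 0xFFu) << 24)
         | ((unsigned long)(in[1] & 0xFFu) << 16)
         | ((unsigned long)(in[2] & 0xFFu) << 8)
         |  (unsigned long)(in[3] & 0xFFu);
}
```

How those octets are then packed into the machine's wider bytes for actual I/O is a separate, implementation-specific question that this sketch does not address.
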

Nov 15 '05 #9
On Mon, 07 Nov 2005 22:29:29 +0100, Skarmander
<in*****@dontmailme.com> wrote in comp.lang.c:
> The standard allows an implementation where CHAR_BIT = 32 and sizeof
> char = sizeof short = sizeof int = sizeof long = 1, right?
>
> Just out of curiosity, does anyone know actual implementations that have
> this?
>
> S.

I used an Analog Devices 32-bit SHARC DSP a few years ago, I forget
the exact model, where CHAR_BIT was 32 and all the integer types (this
was before long long) were 32 bits.

I currently do a lot of work with a Texas Instruments DSP in the
TMS320F28xx family where CHAR_BIT is 16 and the char, short, and int
types all share the same representation and size.

I imagine other DSPs from these and other manufacturers are similar,
although Freescale (was Motorola) has a 16 bit DSP that they say
supports CHAR_BIT 8, although I haven't used it.
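
To see which kind of implementation one is dealing with, a short sketch along these lines can report the byte width and the sizes of the basic integer types. The names `all_one_byte` and `report_sizes` are hypothetical, and the printing part assumes <stdio.h> output is available, which a freestanding DSP toolchain may not provide.

```c
#include <limits.h>
#include <stdio.h>

/* Nonzero if short, int and long all occupy exactly one byte
   (sizeof(char) is 1 by definition). */
static int all_one_byte(void)
{
    return sizeof(short) == 1 && sizeof(int) == 1 && sizeof(long) == 1;
}

/* Print the byte width and the sizes of the basic integer types. */
static void report_sizes(void)
{
    printf("CHAR_BIT      = %d\n", CHAR_BIT);
    printf("sizeof(short) = %lu\n", (unsigned long)sizeof(short));
    printf("sizeof(int)   = %lu\n", (unsigned long)sizeof(int));
    printf("sizeof(long)  = %lu\n", (unsigned long)sizeof(long));
}
```

Since long must have at least 32 bits, `all_one_byte()` can only be true when CHAR_BIT is 32 or more.
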

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Nov 15 '05 #10
Ben Pfaff <bl*@cs.stanford.edu> writes:
> Skarmander <in*****@dontmailme.com> writes:
>> The standard allows an implementation where CHAR_BIT = 32 and sizeof
>> char = sizeof short = sizeof int = sizeof long = 1, right?
>
> Yes. A lot of otherwise well-written code would malfunction on
> such a (hosted) implementation, because EOF is now in the range
> of a signed char. [...]

Just a reminder that CHAR_BIT == 32 and sizeof(int) == 1 both being
true doesn't automatically imply either that INT_MAX == CHAR_MAX or
that INT_MIN == CHAR_MIN. In particular,

INT_MAX 2147483647 CHAR_MAX 1073741823
INT_MIN -2147483648 CHAR_MIN -1073741824

are allowed, or even

INT_MAX 2147483647 CHAR_MAX 127
INT_MIN -2147483648 CHAR_MIN -128

are allowed.

What I would expect in a hosted implementation with CHAR_BIT == 32
and sizeof(int) == 1 is

INT_MAX 2147483647 CHAR_MAX 2147483647
INT_MIN -2147483648 CHAR_MIN -2147483647

So EOF could be -2147483648 and there would be no conflict with any
character value. Of course, on such a system, outputting binary data
would most likely be done with unsigned char rather than char.
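
The two conditions in play here can be written down as a rough sketch (function names hypothetical). The first is the one argued above: EOF lies below the range of plain char. The second is stricter and concerns getc(), which returns unsigned char values converted to int, as noted earlier in the thread.

```c
#include <limits.h>
#include <stdio.h>

/* Nonzero if EOF lies below the range of plain char, so no value
   stored in a char can compare equal to EOF. */
static int eof_outside_char_range(void)
{
    return EOF < CHAR_MIN;
}

/* Nonzero if every unsigned char value fits in int, so getc() can never
   return a character that compares equal to the (negative) EOF. */
static int getc_result_is_unambiguous(void)
{
    return UCHAR_MAX <= INT_MAX;
}
```

On a typical 8-bit-char implementation the second check holds while the first usually does not when plain char is signed (EOF is commonly -1); with CHAR_BIT == 32 and sizeof(int) == 1 the second check fails whatever CHAR_MIN is.
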
Nov 15 '05 #11
Tim Rentsch <tx*@alumnus.caltech.edu> wrote:
> Ben Pfaff <bl*@cs.stanford.edu> writes:
>> Skarmander <in*****@dontmailme.com> writes:
>>> The standard allows an implementation where CHAR_BIT = 32 and sizeof
>>> char = sizeof short = sizeof int = sizeof long = 1, right?
>>
>> Yes. A lot of otherwise well-written code would malfunction on
>> such a (hosted) implementation, because EOF is now in the range
>> of a signed char. [...]
>
> Just a reminder that CHAR_BIT == 32 and sizeof(int) == 1 both being
> true doesn't automatically imply either that INT_MAX == CHAR_MAX or
> that INT_MIN == CHAR_MIN. In particular,
>
> INT_MAX 2147483647 CHAR_MAX 1073741823
> INT_MIN -2147483648 CHAR_MIN -1073741824
>
> are allowed, or even
>
> INT_MAX 2147483647 CHAR_MAX 127
> INT_MIN -2147483648 CHAR_MIN -128
>
> are allowed.

Not for char, it isn't. Other types can have padding bits; char
(unsigned in any case, and AFAIK since a TC some time ago the other
kinds, too) can not.

Richard
Nov 15 '05 #12
rl*@hoekstra-uitgeverij.nl (Richard Bos) writes:
> Tim Rentsch <tx*@alumnus.caltech.edu> wrote:
>> Ben Pfaff <bl*@cs.stanford.edu> writes:
>>> Skarmander <in*****@dontmailme.com> writes:
>>>
>>>> The standard allows an implementation where CHAR_BIT = 32 and sizeof
>>>> char = sizeof short = sizeof int = sizeof long = 1, right?
>>>
>>> Yes. A lot of otherwise well-written code would malfunction on
>>> such a (hosted) implementation, because EOF is now in the range
>>> of a signed char. [...]
>>
>> Just a reminder that CHAR_BIT == 32 and sizeof(int) == 1 both being
>> true doesn't automatically imply either that INT_MAX == CHAR_MAX or
>> that INT_MIN == CHAR_MIN. In particular,
>>
>> INT_MAX 2147483647 CHAR_MAX 1073741823
>> INT_MIN -2147483648 CHAR_MIN -1073741824
>>
>> are allowed, or even
>>
>> INT_MAX 2147483647 CHAR_MAX 127
>> INT_MIN -2147483648 CHAR_MIN -128
>>
>> are allowed.
>
> Not for char, it isn't. Other types can have padding bits; char
> (unsigned in any case, and AFAIK since a TC some time ago the other
> kinds, too) can not.

I need to ask for a reference. It's true that unsigned char can't
have padding bits, but I can't find any evidence that signed char
can't have padding bits. Apparently it _was_ true in C89/C90 that the
committee didn't expect that signed char's would ever have padding
bits; however, it seems like this unstated assumption was cleared up
in a DR (DR 069, I believe). The language in 6.2.6.2 p1,p2 seems to
say fairly plainly that padding bits aren't allowed for unsigned char
but are allowed for all other integer types.

There was a posting from Doug Gwyn in comp.std.c on July 12 of this
year saying that signed char's could have padding bits. A search in
Google Groups

"signed char" "padding bits" numeration

should turn that up. The message was:
Doug Gwyn in comp.std.c:

Keith Thompson wrote:
> If it doesn't produce undefined behavior, are we to infer that it
> produces defined behavior? If so, where is the behavior defined?

Certainly the value is well defined for unsigned char
(no padding bits). For signed char there can be
padding bits, but they don't affect the value, AND
(apparently according to the spec) the object can be
accessed regardless of the contents of the padding
bits. The only conformance issues would thus appear
to be: which of the three allowed binary numeration
schemes is used, and how many value bits are present?


Also: even if signed char's have no padding bits, it's still
true that INT_MIN < SCHAR_MIN can hold, which lets INT_MIN
serve as a value for EOF.
Nov 15 '05 #13
