
sizeof C integral types

ark
At the risk of invoking flames from one Tom St Denis of Ottawa :)

Is there any guarantee that, say,
sizeof(int) == sizeof(unsigned int)
sizeof(long) > sizeof(char) ?

Thanks,
Ark
Nov 14 '05 #1
ark wrote:

Is there any guarantee that, say,
sizeof(int) == sizeof(unsigned int)
sizeof(long) > sizeof(char) ?


I don't believe the standard guarantees that either is true.

--
Russell Hanneken
rg********@pobox.com
Remove the 'g' from my address to send me mail.

Nov 14 '05 #2
ark wrote:
Is there any guarantee that, say,
sizeof(int) == sizeof(unsigned int)
Yes.
sizeof(long) > sizeof(char) ?


No. BTW, sizeof(char) == 1 by definition.

However, sizeof(long) == 1, which I think implies sizeof(int) == 1,
would break several very common idioms, e.g.

int ch;
while ((ch = getchar()) != EOF) { ... }

because EOF is supposed to be a value which is different from all
'unsigned char' values. That is only possible when 'int' is wider
than 'unsigned char'.

Personally I've never seen a program which worried about this
possibility, though I suppose such programs exist. It might
be different with freestanding implementations (implementations
which do not use the C library, so getchar() is no problem).
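
For illustration, a minimal sketch (assuming a hosted implementation) of
what such a worried program might look like; the file name is made up, and
the feof()/ferror() check is what distinguishes a genuine end-of-file from
a stored byte that happens to compare equal to EOF:

/* BEGIN careful_read.c (hypothetical) */

#include <stdio.h>

int main(void)
{
    int ch;

    while ((ch = getchar()) != EOF) {
        /* process ch, which holds an unsigned char value */
    }
    if (!feof(stdin) && !ferror(stdin)) {
        /* Neither end-of-file nor an error was recorded: on a system
           where int cannot hold every unsigned char value, a real byte
           compared equal to EOF and ended the loop early. */
    }
    return 0;
}

/* END careful_read.c */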

--
Hallvard
Nov 14 '05 #3
"ark" <ar****@comcast.net> writes:
Is there any guarantee that, say,
sizeof(int) == sizeof(unsigned int)
Yes.
sizeof(long) > sizeof(char) ?


No.
--
"Your correction is 100% correct and 0% helpful. Well done!"
--Richard Heathfield
Nov 14 '05 #4
Hallvard B Furuseth <h.b.furuseth(nospam)@usit.uio(nospam).no> writes:
However, sizeof(long) == 1, which I think implies sizeof(int) == 1,


Actually there's no such implication. The range of int is a
subrange of the range of long, but there's no such guarantee on
the size in bytes of these types.

However, it would be a strange system for which sizeof(long) <
sizeof(int). I don't know of any.
--
"...Almost makes you wonder why Heisenberg didn't include postinc/dec operators
in the uncertainty principle. Which of course makes the above equivalent to
Schrodinger's pointer..."
--Anthony McDonald
Nov 14 '05 #5
Russell Hanneken <rg********@pobox.com> writes:
ark wrote:
Is there any guarantee that, say,
sizeof(int) == sizeof(unsigned int)
sizeof(long) > sizeof(char) ?


I don't believe the standard guarantees that either is true.


The former, but not the latter, is guaranteed.
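
For readers who want the compiler to confirm the guaranteed half, a tiny
compile-time check will do (just a sketch; the typedef name is arbitrary):

/* Fails to translate only if sizeof(int) != sizeof(unsigned int),
   which a conforming implementation cannot allow. */
typedef char int_sizes_match[sizeof(int) == sizeof(unsigned int) ? 1 : -1];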
--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
Nov 14 '05 #6
ark

"Hallvard B Furuseth" <h.b.furuseth(nospam)@usit.uio(nospam).no> wrote in
message news:HB**************@bombur.uio.no...

However, sizeof(long) == 1, which I think implies sizeof(int) == 1,
would break several very common idioms, e.g.

int ch;
while ((ch = getchar()) != EOF) { ... }

because EOF is supposed to be a value which is different from all
'unsigned char' values. That is only possible when 'int' is wider
than 'unsigned char'.

Personally I've never seen a program which worried about this
possibility, though I suppose such programs exist. It might
be different with freestanding implementations (implementations
which do not use the C library, so getchar() is no problem).

--
Hallvard


I believe that a 16-bit DSP with a 16-bit byte would have
sizeof(int)==sizeof(short) (and ==1).
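
A quick probe one might run on such a target to see what the
implementation actually chose (the output is, of course,
implementation-specific; the file name is made up):

/* BEGIN sizes.c (hypothetical probe) */

#include <limits.h>
#include <stdio.h>

int main(void)
{
    printf("CHAR_BIT      = %d\n", CHAR_BIT);
    printf("sizeof(short) = %u\n", (unsigned)sizeof(short));
    printf("sizeof(int)   = %u\n", (unsigned)sizeof(int));
    printf("sizeof(long)  = %u\n", (unsigned)sizeof(long));
    return 0;
}

/* END sizes.c */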
- Ark
Nov 14 '05 #7
ark wrote:
At the risk of invoking flames from one Tom St Denis of Ottawa :)

Is there any guarantee that, say,
sizeof(int) == sizeof(unsigned int)
sizeof(long) > sizeof(char)?


Are you aware of the type definitions in <stdint.h>?

Nov 14 '05 #8
"E. Robert Tisdale" wrote:
ark wrote:
At the risk of invoking flames from one Tom St Denis of Ottawa :)

Is there any guarantee that, say,
sizeof(int) == sizeof(unsigned int)
sizeof(long) > sizeof(char)?


Are you aware of the type definitions in <stdint.h>?


Only implied misinformation from Trollsdale this time. <stdint.h>
is a C99 artifact, and the types defined therein are only defined
when the implementation has suitable types. So you could do
something like:

#if defined(sometype)
#define mytype sometype
#else
#define mytype whatever
#endif

with suitable guards for a C99 system.
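
A more concrete sketch of that idea, for a program that wants a 32-bit-or-wider
unsigned type; the typedef name my_uint32 is made up for illustration, and the
test relies on UINT32_MAX being defined by <stdint.h> exactly when uint32_t exists:

#include <limits.h>
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#include <stdint.h>
#endif

#if defined(UINT32_MAX)
typedef uint32_t my_uint32;       /* exact-width type from <stdint.h> */
#elif UINT_MAX >= 0xFFFFFFFFUL
typedef unsigned int my_uint32;   /* at least 32 bits, possibly wider */
#else
typedef unsigned long my_uint32;  /* unsigned long is always at least 32 bits */
#endif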

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!
Nov 14 '05 #9
In article <40**************@jpl.nasa.gov>,
"E. Robert Tisdale" <E.**************@jpl.nasa.gov> wrote:
ark wrote:
At the risk of invoking flames from one Tom St Denis of Ottawa :)

Is there any guarantee that, say,
sizeof(int) == sizeof(unsigned int)
sizeof(long) > sizeof(char)?


Are you aware of the type definitions in <stdint.h>?


Quite possibly he is aware of them and knows that they have nothing to
do with the question asked.
Nov 14 '05 #10
On Thu, 15 Jan 2004 01:08:41 GMT, CBFalconer <cb********@yahoo.com>
wrote in comp.lang.c:
"E. Robert Tisdale" wrote:
ark wrote:
At the risk of invoking flames from one Tom St Denis of Ottawa :)

Is there any guarantee that, say,
sizeof(int) == sizeof(unsigned int)
sizeof(long) > sizeof(char)?


Are you aware of the type definitions in <stdint.h>?


Only implied misinformation from Trollsdale this time. <stdint.h>
is a C99 artifact, and the types defined therein are only defined
when the implementation has suitable types. So you could do
something like:

#if defined(sometype)
#define mytype sometype
#else
#define mytype whatever
#endif

with suitable guards for a C99 system.


It is actually quite possible, and very useful, to build a subset of
<stdint.h> for any compiler. Interestingly enough, the (complete, not
subset) <stdint.h> that comes with ARM's ADS compiles and works
perfectly with Visual C++ 6.
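
For a typical ILP32 target like that, the core of such a subset is short
enough to write by hand; a sketch only, since the width assumptions below
must be checked against the actual compiler:

/* Hand-rolled <stdint.h> subset, assuming an ILP32 ABI (e.g. VC++ 6). */
typedef signed char      int8_t;
typedef unsigned char    uint8_t;
typedef short            int16_t;
typedef unsigned short   uint16_t;
typedef int              int32_t;
typedef unsigned int     uint32_t;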

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Nov 14 '05 #11
Jack Klein wrote:
.... snip ...
It is actually quite possible, and very useful, to build a subset
of <stdint.h> for any compiler. Interestingly enough, the
(complete, not subset) <stdint.h> that comes with ARM's ADS
compiles and works perfectly with Visual C++ 6.


Actually I would expect that to be possible with any system where
things are built on 1, 2, 4, etc. octet-sized objects. I think
the availability is customized by simply omitting the appropriate
definitions from stdint.h.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!
Nov 14 '05 #12
ark wrote:
I believe that a 16-bit DSP with a 16-bit byte
What do you mean by byte?
Did you mean to say "machine word" or "data path"?
Or did you really mean 16-bit characters?
would have sizeof(int)==sizeof(short) (and ==1).


Take a look at
The Vector, Signal and Image Processing Library (VSIPL):

http://www.vsipl.org/

It defines types that are supposed to be portable
to a wide variety of DSP target platforms.

Nov 14 '05 #13
Dan Pop wrote:

sizeof(long) > sizeof(char) ?


Implicitly guaranteed for hosted implementations, because the library
specification relies on INT_MAX >= UCHAR_MAX and this would be
impossible if sizeof(int) == 1.


I don't recall that ever being stated so plainly
on this newsgroup before.

--
pete
Nov 14 '05 #14
"Dan Pop" <Da*****@cern.ch> wrote in message
news:bu**********@sunnews.cern.ch...
sizeof(long) > sizeof(char) ?


Implicitly guaranteed for hosted implementations, because the
library specification relies on INT_MAX >= UCHAR_MAX [...]


By which I presume you mean an 'int' must be able to hold all possible
values of an 'unsigned char', required in (for example) getchar()?

Alex
Nov 14 '05 #15

On Thu, 15 Jan 2004, pete wrote:

Dan Pop wrote:
sizeof(long) > sizeof(char) ?


Implicitly guaranteed for hosted implementations, because the library
specification relies on INT_MAX >= UCHAR_MAX and this would be
impossible if sizeof(int) == 1.


I don't recall that ever being stated so plainly
on this newsgroup before.


Nor do I. And even though I at first thought it was technically
wrong because of padding bits, I now think that while it still may be
wrong, it's less wrong than I thought.

a) Plain char is unsigned. INT_MAX must be at least UCHAR_MAX so that
getchar() can return any plain char value, and INT_MIN must be less than
or equal to -32767. So the total number of values of 'int' must be at
least UCHAR_MAX+32768, which requires more bits than CHAR_BIT. Q.E.D.

b) Plain char is signed. The range of char, i.e., of signed char, must
be a subrange of the range of int. But is it possible we might have

#define CHAR_BIT 16
#define UCHAR_MAX 65535
#define SCHAR_MIN -32767 /* !!! */
#define SCHAR_MAX 32767
#define INT_MIN -32768
#define INT_MAX 32767
#define EOF -32768

Is anything wrong, from the C standpoint, with these definitions?

-Arthur

Nov 14 '05 #16
nrk
Arthur J. O'Dwyer wrote:

On Thu, 15 Jan 2004, pete wrote:

Dan Pop wrote:
>
> > sizeof(long) > sizeof(char) ?
>
> Implicitly guaranteed for hosted implementations, because the library
> specification relies on INT_MAX >= UCHAR_MAX and this would be
> impossible if sizeof(int) == 1.
I don't recall that ever being stated so plainly
on this newsgroup before.


Nor do I. And even though I at first thought it was technically
wrong because of padding bits, I now think that while it still may be
wrong, it's less wrong than I thought.

a) Plain char is unsigned. INT_MAX must be at least UCHAR_MAX so that
getchar() can return any plain char value, and INT_MIN must be less than
or equal to -32767. So the total number of values of 'int' must be at
least UCHAR_MAX+32768, which requires more bits than CHAR_BIT. Q.E.D.

b) Plain char is signed. The range of char, i.e., of signed char, must
be a subrange of the range of int. But is it possible we might have

#define CHAR_BIT 16
#define UCHAR_MAX 65535
#define SCHAR_MIN -32767 /* !!! */
#define SCHAR_MAX 32767
#define INT_MIN -32768
#define INT_MAX 32767
#define EOF -32768

Is anything wrong, from the C standpoint, with these definitions?


Yes, something is wrong. If CHAR_BIT is 16, SCHAR_MIN *has* to be -32768.
This follows from the specification that states that value bits in signed
types have the same meaning as corresponding value bits in the unsigned
types and the stipulation that an unsigned integer type with n bits must be
able to represent values in the range [0, 2^(n-1)].

-nrk.
-Arthur


--
Remove devnull for email
Nov 14 '05 #17
nrk
nrk wrote:

<snip>
be able to represent values in the range [0, 2^(n-1)].


Crap. That should read [0, 2^n - 1] of course.

-nrk.

<snip>

--
Remove devnull for email
Nov 14 '05 #18
nrk wrote:
This follows from the specification that states that
value bits in signed types have the same meaning as corresponding
value bits in the unsigned types
I know that's not supposed to mandate sign and magnitude
representation of negative integers, but it seems like it does.
and the stipulation
that an unsigned integer type with n bits must be
able to represent values in the range [0, 2^(n-1)].


--
pete
Nov 14 '05 #19
nrk
pete wrote:
nrk wrote:
This follows from the specification that states that
value bits in signed types have the same meaning as corresponding
value bits in the unsigned types


I know that's not supposed to mandate sign and magnitude
representation of negative integers, but it seems like it does.


Sorry, that's my mistake for not reading further on. Just a little further
down the standard stipulates how a set sign bit will modify the value
represented in the value bits and gives three choices:

sign and magnitude
sign bit has value -(2^n)      (2's complement)
sign bit has value -(2^n - 1)  (1's complement)

However, my earlier conclusion that SCHAR_MIN must be -32768 still stands,
as Arthur was trying to mix both 1's complement (SCHAR_MIN) and 2's
complement (INT_MIN) representations, which is not allowed. While the
interpretation of the sign bit is implementation defined, the
interpretation needs to be consistent across all the signed integer types.

-nrk.
and the stipulation
that an unsigned integer type with n bits must be
able to represent values in the range [0, 2^(n-1)].


--
Remove devnull for email
Nov 14 '05 #20
nrk wrote:

pete wrote:
nrk wrote:
This follows from the specification that states that
value bits in signed types have the same meaning as corresponding
value bits in the unsigned types


I know that's not supposed to mandate sign and magnitude
representation of negative integers, but it seems like it does.


Sorry, that's my mistake for not reading further on.
Just a little further
down the standard stipulates how a set sign bit will modify the value
represented in the value bits and gives three choices:

sign and magnitude
sign bit has value -(2^n)      (2's complement)
sign bit has value -(2^n - 1)  (1's complement)

However, my earlier conclusion that SCHAR_MIN must
be -32768 still stands,
as Arthur was trying to mix both 1's complement (SCHAR_MIN) and 2's
complement (INT_MIN) representations, which is not allowed.


I believe that he may have conceived the whole thing in 2's
complement and that -32767 is a valid limit for 2's complement.

--
pete
Nov 14 '05 #21
In <Pi**********************************@unix46.andrew.cmu.edu> "Arthur J. O'Dwyer" <aj*@nospam.andrew.cmu.edu> writes:

On Thu, 15 Jan 2004, pete wrote:

Dan Pop wrote:
>
> > sizeof(long) > sizeof(char) ?
>
> Implicitly guaranteed for hosted implementations, because the library
> specification relies on INT_MAX >= UCHAR_MAX and this would be
> impossible if sizeof(int) == 1.
I don't recall that ever being stated so plainly
on this newsgroup before.


Nor do I. And even though I at first thought it was technically
wrong because of padding bits, I now think that while it still may be
wrong, it's less wrong than I thought.

a) Plain char is unsigned. INT_MAX must be at least UCHAR_MAX so that
getchar() can return any plain char value, and INT_MIN must be less than
or equal to -32767. So the total number of values of 'int' must be at
least UCHAR_MAX+32768, which requires more bits than CHAR_BIT. Q.E.D.

b) Plain char is signed. The range of char, i.e., of signed char, must
be a subrange of the range of int. But is it possible we might have


The properties of plain char don't matter, it is unsigned char that
matters.
#define CHAR_BIT 16
#define UCHAR_MAX 65535
#define SCHAR_MIN -32767 /* !!! */
#define SCHAR_MAX 32767
#define INT_MIN -32768
#define INT_MAX 32767
#define EOF -32768

Is anything wrong, from the C standpoint, with these definitions?


Yes, for a hosted implementation: int cannot represent the whole range
of unsigned char.

Furthermore, INT_MIN and EOF, as defined above, do not have type int.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Nov 14 '05 #22
In <bu************@ID-149533.news.uni-berlin.de> "Alex" <me@privacy.net> writes:
"Dan Pop" <Da*****@cern.ch> wrote in message
news:bu**********@sunnews.cern.ch...
> sizeof(long) > sizeof(char) ?


Implicitly guaranteed for hosted implementations, because the
library specification relies on INT_MAX >= UCHAR_MAX [...]


By which I presume you mean an 'int' must be able to hold all possible
values of an 'unsigned char', required in (for example) getchar()?


Yes. Both <stdio.h> and <ctype.h> rely on this property of int.
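
The <ctype.h> side of this is the familiar cast-to-unsigned-char idiom;
a minimal sketch (the file name and the string are made up):

/* BEGIN upcase.c (hypothetical) */

#include <ctype.h>
#include <stdio.h>

int main(void)
{
    const char *s = "sizeof C integral types";

    while (*s != '\0') {
        /* toupper() expects an unsigned char value (or EOF) passed as int,
           so plain char is converted first to avoid a negative argument. */
        putchar(toupper((unsigned char)*s));
        s++;
    }
    putchar('\n');
    return 0;
}

/* END upcase.c */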

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Nov 14 '05 #23
In <Nb******************@nwrddc01.gnilink.net> nrk <ra*********@devnull.verizon.net> writes:
Yes, something is wrong. If CHAR_BIT is 16, SCHAR_MIN *has* to be -32768.


Has it? How would you represent -32768 using one's complement or sign
magnitude? Furthermore, even for implementations using two's complement,
the representation with the sign bit set and all the value bits zero is
allowed to be a trap representation.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Nov 14 '05 #24
In <qQ*******************@nwrddc01.gnilink.net> nrk <ra*********@devnull.verizon.net> writes:
pete wrote:
nrk wrote:
This follows from the specification that states that
value bits in signed types have the same meaning as corresponding
value bits in the unsigned types
I know that's not supposed to mandate sign and magnitude
representation of negative integers, but it seems like it does.


Sorry, that's my mistake for not reading further on. Just a little further
down the standard stipulates how a set sign bit will modify the value
represented in the value bits and gives three choices:

sign and magnitude
sign bit has value -(2^n)      (2's complement)
sign bit has value -(2^n - 1)  (1's complement)


And immediately after that, it says:

Which of these applies is implementation-defined, as is
whether the value with sign bit 1 and all value bits zero
(for the first two), or with sign bit and all value bits 1 (for
one's complement), is a trap representation or a normal value.

However, my earlier conclusion that SCHAR_MIN must be -32768 still stands,

Does it?

as Arthur was trying to mix both 1's complement (SCHAR_MIN) and 2's
complement (INT_MIN) representations, which is not allowed.

Where did you get this idea from?
While the
interpretation of the sign bit is implementation defined, the
interpretation needs to be consistent across all the signed integer types.


Chapter and verse, please.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Nov 14 '05 #25
nrk
Dan Pop wrote:
In <qQ*******************@nwrddc01.gnilink.net> nrk
<ra*********@devnull.verizon.net> writes:
pete wrote:
nrk wrote:

This follows from the specification that states that
value bits in signed types have the same meaning as corresponding
value bits in the unsigned types

I know that's not supposed to mandate sign and magnitude
representation of negative integers, but it seems like it does.

Sorry, that's my mistake for not reading further on. Just a little
further down the standard stipulates how a set sign bit will modify the
value represented in the value bits and gives three choices:

sign and magnitude
sign bit has value -(2^n)      (2's complement)
sign bit has value -(2^n - 1)  (1's complement)


And immediately after that, it says:

Which of these applies is implementation-defined, as is
whether the value with sign bit 1 and all value bits zero
(for the first two), or with sign bit and all value bits 1 (for
one's complement), is a trap representation or a normal value.


Yes.
However, my earlier conclusion that SCHAR_MIN must be -32768 still stands,


Does it?


No.
as Arthur was trying to mix both 1's complement (SCHAR_MIN) and 2's
complement (INT_MIN) representations, which is not allowed.

Where did you get this idea from?


From the fact that his SCHAR_MIN is -32767 and INT_MIN is -32768 (however,
as you've shown my conclusions were wrong).
While the
interpretation of the sign bit is implementation defined, the
interpretation needs to be consistent across all the signed integer types.


Chapter and verse, please.


6.2.6.2, I interpreted this from the part that talks about signed integer
types. Specifically:
Which of these applies is implementation-defined, as is
which I assumed to mean that the definition is implementation-defined and
that definition applies to all signed integer types (since the section
talks about all signed integer types).

What you're saying is that the implementation can pick and choose how to
interpret the sign bit and whether the particular combination mentioned is
a trap representation in each signed integer type (Out of curiosity, are
there any real-world examples of this?). Since I am no expert at
interpreting the standard, I am convinced that your interpretation is
correct.

As you've show elsethread, the problem with Arthur's implementation is that
int does not have the same range as unsigned char.

-nrk.
Dan


--
Remove devnull for email
Nov 14 '05 #26

On Thu, 15 Jan 2004, Dan Pop wrote:

"Arthur J. O'Dwyer" <aj*@nospam.andrew.cmu.edu> writes:
On Thu, 15 Jan 2004, pete wrote:
Dan Pop wrote:
>
> > sizeof(long) > sizeof(char) ?
>
> Implicitly guaranteed for hosted implementations, because the library
> specification relies on INT_MAX >= UCHAR_MAX and this would be
> impossible if sizeof(int) == 1.

<snip>
b) Plain char is signed. The range of char, i.e., of signed char, must
be a subrange of the range of int. But is it possible we might have
#define CHAR_BIT 16
#define UCHAR_MAX 65535
#define SCHAR_MIN -32767 /* !!! */
#define SCHAR_MAX 32767
#define INT_MIN -32768
#define INT_MAX 32767
#define EOF -32768

Is anything wrong, from the C standpoint, with these definitions?
Yes, for a hosted implementation: int cannot represent the whole range
of unsigned char.


Okay, then I just don't see why you say that. I thought you were
talking about the EOF-and-getchar issue, but you're not. What *are*
you referring to? Chapter and verse would probably help.
Furthermore, INT_MIN and EOF, as defined above, do not have type int.


Sorry. But you know what I meant. :)

-Arthur
Nov 14 '05 #27
Hallvard B Furuseth wrote:
(snip)
No. BTW, sizeof(char) == 1 by definition.

However, sizeof(long) == 1, which I think implies sizeof(int) == 1,
would break several very common idioms, e.g.

int ch;
while ((ch = getchar()) != EOF) { ... }

because EOF is supposed to be a value which is different from all
'unsigned char' values. That is only possible when 'int' is wider
than 'unsigned char'.


While technically true, if long, int, and char were all 32 bits
or more I wouldn't worry about it.

Moore's law doesn't tend to apply to alphabets or character
sets, so we should be safe for many years to come.

-- glen

Nov 14 '05 #28
In <Gr******************@nwrddc02.gnilink.net> nrk <ra*********@devnull.verizon.net> writes:
Dan Pop wrote:
In <qQ*******************@nwrddc01.gnilink.net> nrk
<ra*********@devnull.verizon.net> writes:
While the
interpretation of the sign bit is implementation defined, the
interpretation needs to be consistent across all the signed integer types.
Chapter and verse, please.


6.2.6.2, I interpreted this from the part that talks about signed integer
types. Specifically:
Which of these applies is implementation-defined, as is


which I assumed to mean that the definition is implementation-defined and
that definition applies to all signed integer types (since the section
talks about all signed integer types).


By your logic, all signed types would have the same number of value bits
and so on...
What you're saying is that the implementation can pick and choose how to
interpret the sign bit and whether the particular combination mentioned is
a trap representation in each signed integer type
If there is no wording prohibiting this, an implementor is free to do it.
And I can find no such wording.
(Out of curiosity, are there any real-world examples of this?).


I sincerely hope there aren't. But this doesn't affect the discussion
about hypothetical conforming implementations...

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Nov 14 '05 #29
In <Pi**********************************@unix50.andrew.cmu.edu> "Arthur J. O'Dwyer" <aj*@nospam.andrew.cmu.edu> writes:

On Thu, 15 Jan 2004, Dan Pop wrote:

"Arthur J. O'Dwyer" <aj*@nospam.andrew.cmu.edu> writes:
>On Thu, 15 Jan 2004, pete wrote:
>> Dan Pop wrote:
>> >
>> > > sizeof(long) > sizeof(char) ?
>> >
>> > Implicitly guaranteed for hosted implementations, because the library
>> > specification relies on INT_MAX >= UCHAR_MAX and this would be
>> > impossible if sizeof(int) == 1.
<snip>
> b) Plain char is signed. The range of char, i.e., of signed char, must
>be a subrange of the range of int. But is it possible we might have
>#define CHAR_BIT 16
>#define UCHAR_MAX 65535
>#define SCHAR_MIN -32767 /* !!! */
>#define SCHAR_MAX 32767
>#define INT_MIN -32768
>#define INT_MAX 32767
>#define EOF -32768
>
>Is anything wrong, from the C standpoint, with these definitions?


Yes, for a hosted implementation: int cannot represent the whole range
of unsigned char.


Okay, then I just don't see why you say that. I thought you were
talking about the EOF-and-getchar issue, but you're not. What *are*
you referring to? Chapter and verse would probably help.


Yes, I was talking about the EOF-and-getchar issue. Why would you think
otherwise?

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Nov 14 '05 #30
In <blDNb.60620$Rc4.217309@attbi_s54> glen herrmannsfeldt <ga*@ugcs.caltech.edu> writes:
Hallvard B Furuseth wrote:
(snip)
No. BTW, sizeof(char) == 1 by definition.

However, sizeof(long) == 1, which I think implies sizeof(int) == 1,
would break several very common idioms, e.g.

int ch;
while ((ch = getchar()) != EOF) { ... }

because EOF is supposed to be a value which is different from all
'unsigned char' values. That is only possible when 'int' is wider
than 'unsigned char'.


While technically true, if long, int, and char were all 32 bits
or more I wouldn't worry about it.

Moore's law doesn't tend to apply to alphabets or character
sets, so we should be safe for many years to come.


You seem to be blissfully ignoring the binary files, which have nothing
to do with alphabets or character sets.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Nov 14 '05 #31
Dan Pop wrote:
In <blDNb.60620$Rc4.217309@attbi_s54> glen herrmannsfeldt <ga*@ugcs.caltech.edu> writes:
Hallvard B Furuseth wrote:
(snip)
However, sizeof(long) == 1, which I think implies sizeof(int) == 1,
would break several very common idioms, e.g.

int ch;
while ((ch = getchar()) != EOF) { ... }

because EOF is supposed to be a value which is different from all
'unsigned char' values. That is only possible when 'int' is wider
than 'unsigned char'.


While technically true, if long, int, and char were all 32 bits
or more I wouldn't worry about it.

Moore's law doesn't tend to apply to alphabets or character
sets, so we should be safe for many years to come.


You seem to be blissfully ignoring the binary files, which have nothing
to do with alphabets or character sets.


Also, there are people out there who deliberately send data to a program
which will cause it to misbehave. You may have heard of computer
viruses, for example...

--
Hallvard
Nov 14 '05 #32
Dan Pop wrote:

In <Pi**********************************@unix50.andrew.cmu.edu> "Arthur J. O'Dwyer" <aj*@nospam.andrew.cmu.edu> writes:
On Thu, 15 Jan 2004, Dan Pop wrote:

"Arthur J. O'Dwyer" <aj*@nospam.andrew.cmu.edu> writes:
>On Thu, 15 Jan 2004, pete wrote:
>> Dan Pop wrote:
>> >
>> > > sizeof(long) > sizeof(char) ?
>> >
>> > Implicitly guaranteed for hosted implementations,
>> > because the library
>> > specification relies on INT_MAX >= UCHAR_MAX and this would be
>> > impossible if sizeof(int) == 1.


<snip>
> b) Plain char is signed. The range of char, i.e.,
> of signed char, must be a subrange of the range of int.
> But is it possible we might have

>#define CHAR_BIT 16
>#define UCHAR_MAX 65535
>#define SCHAR_MIN -32767 /* !!! */
>#define SCHAR_MAX 32767
>#define INT_MIN -32768
>#define INT_MAX 32767
>#define EOF -32768
>
>Is anything wrong, from the C standpoint, with these definitions?

Yes, for a hosted implementation:
int cannot represent the whole range of unsigned char.


Okay, then I just don't see why you say that. I thought you were
talking about the EOF-and-getchar issue, but you're not. What *are*
you referring to? Chapter and verse would probably help.


Yes, I was talking about the EOF-and-getchar issue.
Why would you think otherwise?


Arthur J. O'Dwyer may be under the impression that
"int cannot represent the whole range of unsigned char."
is an incorrect aphorism about C, being used to criticize the code,
rather than a direct criticism of the code.

--
pete
Nov 14 '05 #33
nrk
Dan Pop wrote:
In <Gr******************@nwrddc02.gnilink.net> nrk
<ra*********@devnull.verizon.net> writes:
Dan Pop wrote:
In <qQ*******************@nwrddc01.gnilink.net> nrk
<ra*********@devnull.verizon.net> writes:

While the
interpretation of the sign bit is implementation defined, the
interpretation needs to be consistent across all the signed integer
types.

Chapter and verse, please.
6.2.6.2, I interpreted this from the part that talks about signed integer
types. Specifically:
Which of these applies is implementation-defined, as is
^^^^^


which I assumed to mean that the definition is implementation-defined and
that definition applies to all signed integer types (since the section
talks about all signed integer types).


By your logic, all signed types would have the same number of value bits
and so on...


No, because the section discussing limits.h will tell me otherwise. But I
see your point (which I've already conceded).
What you're saying is that the implementation can pick and choose how to
interpret the sign bit and whether the particular combination mentioned is
a trap representation in each signed integer type


If there is no wording prohibiting this, an implementor is free to do it.
And I can find no such wording.


Precisely the point that I missed. This is a good lesson for me to learn as
far as interpreting the standard goes.
(Out of curiosity, are there any real-world examples of this?).


I sincerely hope there aren't. But this doesn't affect the discussion
about hypothetical conforming implementations...


Yes, of course. I didn't ask that question as a challenge to your
interpretation, only to see if any weird systems out there exploit this
leeway in the standard (so I can refuse to work on such a system :-).

-nrk.
Dan


--
Remove devnull for email
Nov 14 '05 #34
Dan Pop wrote:
In <WWhNb.53056$5V2.66126@attbi_s53> "ark" <ar****@comcast.net> writes:
Is there any guarantee that, say,
(...)
sizeof(long) > sizeof(char) ?


Implicitly guaranteed for hosted implementations, because the library
specification relies on INT_MAX >= UCHAR_MAX and this would be impossible
if sizeof(int) == 1.


I seem to remember this is an issue where some committee members on
comp.std.c sort of admit that the standard is buggy, but that they
couldn't agree on a fix. Unless I'm thinking of problems when 'char' is
not two's complement and/or int->char conversion overflow doesn't simply
silently strip the top bits.

--
Hallvard
Nov 14 '05 #35

On Fri, 16 Jan 2004, pete wrote:

Dan Pop wrote:
Arthur J. O'Dwyer <aj*@nospam.andrew.cmu.edu> writes:
On Thu, 15 Jan 2004, Dan Pop wrote:
> "Arthur J. O'Dwyer" <aj*@nospam.andrew.cmu.edu> writes:
> >On Thu, 15 Jan 2004, pete wrote:
> >> Dan Pop wrote:
> >> >
> >> > because the library
> >> > specification relies on INT_MAX >= UCHAR_MAX and this would be
> >> > impossible if sizeof(int) == 1.
> >But is it possible we might have [letting plain char be signed]
> >#define CHAR_BIT 16
> >#define UCHAR_MAX 65535
> >#define SCHAR_MIN -32767 /* !!! */
> >#define SCHAR_MAX 32767
> >#define INT_MIN -32768
> >#define INT_MAX 32767
> >#define EOF -32768
> >
> >Is anything wrong, from the C standpoint, with these definitions?
>
> Yes, for a hosted implementation:
> int cannot represent the whole range of unsigned char.

Okay, then I just don't see why you say that. I thought you were
talking about the EOF-and-getchar issue, but you're not. What *are*
you referring to? Chapter and verse would probably help.
Yes, I was talking about the EOF-and-getchar issue.
Why would you think otherwise?


Arthur J. O'Dwyer may be under the impression that


I hereby give you permission to call me by my first name only. ;-)
"int cannot represent the whole range of unsigned char."
is an incorrect aphorism about C, being used to criticize the code,
rather than a direct criticism of the code.


And I have no idea what you mean by that, so I'll leave it for the
moment. However, re: Dan's reply: getchar() returns either a 'char'
value, cast to 'int', or it returns EOF, which is a negative 'int'
value unequal to any 'char' value. Right?
My #definitions above provide exactly enough numbers to do this
job: the range of 'char', which is signed, goes from -32767 to 32767,
and EOF is the 'int' value -32768. So if you were talking only about
the "EOF-and-getchar issue," you were wrong, AFAICT.

However, since I posted that message I noticed a post elsethread
talking about the <ctype.h> functions, which expect to be passed an
'unsigned char' value, cast to 'int'. That complicates things, or
so I thought... but now I'm not so sure about that, either. I think
I'm really going to need the C&V here, or you're going to have to
show me a piece of code that pokes large holes in my #definitions.

-Arthur
Nov 14 '05 #36
ark

"Dan Pop" <Da*****@cern.ch> wrote in message
news:bu**********@sunnews.cern.ch...
<snip>
sizeof(long) > sizeof(char) ?


Implicitly guaranteed for hosted implementations, because the library
specification relies on INT_MAX >= UCHAR_MAX and this would be impossible
if sizeof(int) == 1. Since LONG_MAX cannot be lower than INT_MAX,
sizeof(long) cannot be 1, either, on a hosted implementation.

<snip>

Would anything be wrong with intentionally under-using the potential range
#define CHAR_BIT 1024
#define UCHAR_MAX 255
#define SCHAR_MIN -127
.... etc ...
#define <under-used limits for int, long ...>

Thanks again,
Ark

Nov 14 '05 #37

On Fri, 16 Jan 2004, ark wrote:

"Dan Pop" <Da*****@cern.ch> wrote...
sizeof(long) > sizeof(char) ?


Implicitly guaranteed for hosted implementations, because the library
specification relies on INT_MAX >= UCHAR_MAX and this would be impossible
if sizeof(int) == 1. Since LONG_MAX cannot be lower than INT_MAX,
sizeof(long) cannot be 1, either, on a hosted implementation.


Would anything be wrong with intentionally under-using the potential range
#define CHAR_BIT 1024
#define UCHAR_MAX 255


Yes. 'unsigned char' must use a pure binary representation, and
may not contain any padding bits.
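
A small probe that illustrates the consequence: UCHAR_MAX always comes out
as exactly 2**CHAR_BIT - 1, so CHAR_BIT 1024 with UCHAR_MAX 255 is not an
option (sketch only; the file name is made up):

/* BEGIN ucharmax.c (hypothetical probe) */

#include <limits.h>
#include <stdio.h>

int main(void)
{
    /* Build 2**CHAR_BIT - 1 one bit at a time instead of shifting
       by the full width in a single step. */
    unsigned long expected = 0;
    int i;

    for (i = 0; i < CHAR_BIT; i++)
        expected = expected * 2 + 1;

    printf("CHAR_BIT = %d, UCHAR_MAX = %lu, 2**CHAR_BIT - 1 = %lu\n",
           CHAR_BIT, (unsigned long)UCHAR_MAX, expected);
    return 0;
}

/* END ucharmax.c */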

-Arthur
Nov 14 '05 #38
Arthur J. O'Dwyer wrote:

On Fri, 16 Jan 2004, pete wrote: And I have no idea what you mean by that,


Me neither.
I'll consider the matter more carefully.

--
pete
Nov 14 '05 #39
Arthur J. O'Dwyer wrote:
On Fri, 16 Jan 2004, pete wrote:
Dan Pop wrote:
Arthur J. O'Dwyer <aj*@nospam.andrew.cmu.edu> writes:
>On Thu, 15 Jan 2004, Dan Pop wrote:
>> "Arthur J. O'Dwyer" <aj*@nospam.andrew.cmu.edu> writes: >> >#define CHAR_BIT 16
>> >#define UCHAR_MAX 65535
>> >#define SCHAR_MIN -32767 /* !!! */
>> >#define SCHAR_MAX 32767
>> >#define INT_MIN -32768
>> >#define INT_MAX 32767
>> >#define EOF -32768

However, re: Dan's reply: getchar() returns either a 'char'
value, cast to 'int', or it returns EOF, which is a negative 'int'
value unequal to any 'char' value. Right?
My #definitions above provide exactly enough numbers to do this
job: the range of 'char', which is signed, goes from -32767 to 32767,
and EOF is the 'int' value -32768. So if you were talking only about
the "EOF-and-getchar issue," you were wrong, AFAICT.
Similar issue.
On my system the return value of putchar(EOF) is greater than -1.
I think it's supposed to be greater than -1,
and I don't think that it can be, on your system.
I think
I'm really going to need the C&V here, or you're going to have to
show me a piece of code that pokes large holes in my #definitions.


/* BEGIN new.c */

#include <stdio.h>

int main(void)
{
int r;

r = putchar(EOF);
if (r > -1) {
puts("putchar(EOF) is greater than -1 on my system.");
}
return 0;
}

/* END new.c */
--
pete
Nov 14 '05 #40

On Sat, 17 Jan 2004, pete wrote:

Arthur J. O'Dwyer wrote:
> >> >#define CHAR_BIT 16
> >> >#define UCHAR_MAX 65535
> >> >#define SCHAR_MIN -32767 /* !!! */
> >> >#define SCHAR_MAX 32767
> >> >#define INT_MIN -32768
> >> >#define INT_MAX 32767
> >> >#define EOF -32768
On my system the return value of putchar(EOF) is greater than -1.
I think it's supposed to be greater than -1,
and I don't think that it can be, on your system.
[compressed pete's code] #include <stdio.h>
int main(void) {
int r = putchar(EOF);
if (r > -1)
puts("putchar(EOF) is greater than -1 on my system.");
return 0;
}


Hmm... yes... on my hypothetical system, this would replace EOF
by -32768, convert -32768 to uchar 32768, print that value out,
and then try to return 32768 as an 'int'. That would not be
possible, hence my system would not be conforming. So I do
believe you've poked holes in it.

So, from the getchar() spec, 'int' must range over all 'char'
values plus EOF, and from the putc() spec, 'int' must range over
all 'unsigned char' values plus EOF. So I guess Dan Pop is right
after all... surprise surprise. ;-)

-Arthur
Nov 14 '05 #41
"Dan Pop" <Da*****@cern.ch> wrote in message
news:bu**********@sunnews.cern.ch...
In <Pi**********************************@unix46.andrew.cmu.edu> "Arthur J. O'Dwyer" <aj*@nospam.andrew.cmu.edu> writes:
On Thu, 15 Jan 2004, pete wrote:

Dan Pop wrote:
>
> > sizeof(long) > sizeof(char) ?
>
> Implicitly guaranteed for hosted implementations, because the library
> specification relies on INT_MAX >= UCHAR_MAX and this would be
> impossible if sizeof(int) == 1.

I don't recall that ever being stated so plainly
on this newsgroup before.


Nor do I. And even though I at first thought it was technically
wrong because of padding bits, I now think that while it still may be
wrong, it's less wrong than I thought.

a) Plain char is unsigned. INT_MAX must be at least UCHAR_MAX so that
getchar() can return any plain char value, and INT_MIN must be less than
or equal to -32767. So the total number of values of 'int' must be at
least UCHAR_MAX+32768, which requires more bits than CHAR_BIT. Q.E.D.

b) Plain char is signed. The range of char, i.e., of signed char, must
be a subrange of the range of int. But is it possible we might have


The properties of plain char don't matter, it is unsigned char that
matters.
#define CHAR_BIT 16
#define UCHAR_MAX 65535
#define SCHAR_MIN -32767 /* !!! */
#define SCHAR_MAX 32767
#define INT_MIN (-32767-1) [corrected]
#define INT_MAX 32767
#define EOF (-32767-1) [corrected]

Is anything wrong, from the C standpoint, with these definitions?


Yes, for a hosted implementation: int cannot represent the whole range
of unsigned char.


Assuming int representation is 2's complement and the conversion of unsigned char to int
is the obvious (usual) one, where's the problem? (From an implementation
conformity point of view.)

--
Peter
Nov 14 '05 #42
In <Pi**********************************@unix48.andrew.cmu.edu> "Arthur J. O'Dwyer" <aj*@nospam.andrew.cmu.edu> writes:

However, re: Dan's reply: getchar() returns either a 'char'
value, cast to 'int', or it returns EOF, which is a negative 'int'
value unequal to any 'char' value. Right?


Wrong! getchar() returns either an unsigned char cast to int, or EOF,
which has a negative value. The chapter and verse actually come from
the fgetc specification:

2 If the end-of-file indicator for the input stream pointed to by
stream is not set and a next character is present, the fgetc
function obtains that character as an unsigned char converted
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
to an int and advances the associated file position indicator
^^^^^^^^^
for the stream (if defined).

The conversion between unsigned char and int has a well defined
result only if the type int can represent the value being converted.
I hope this clarifies the issue.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Nov 14 '05 #43
In <d9XNb.81102$xy6.138533@attbi_s02> "ark" <ar****@comcast.net> writes:

"Dan Pop" <Da*****@cern.ch> wrote in message
news:bu**********@sunnews.cern.ch...
<snip>
> sizeof(long) > sizeof(char) ?


Implicitly guaranteed for hosted implementations, because the library
specification relies on INT_MAX >= UCHAR_MAX and this would be impossible
if sizeof(int) == 1. Since LONG_MAX cannot be lower than INT_MAX,
sizeof(long) cannot be 1, either, on a hosted implementation.

<snip>

Would anything be wrong with intentionally under-using the potential range
#define CHAR_BIT 1024
#define UCHAR_MAX 255
#define SCHAR_MIN -127
... etc ...
#define <under-used limits for int, long ...>


Underusing is allowed for any type *except* unsigned char (and plain char,
if behaving as unsigned char). UCHAR_MAX *must* be 2 ** CHAR_BIT - 1 for
*any* conforming implementation. The reason is simple: unsigned char is
supposed to be suitable for accessing the implementation's byte, so if
the byte has CHAR_BIT bits, unsigned char must be able to access each
of them.

It is still possible to underuse the hardware byte, but, in this
case, the definition of CHAR_BIT must still reflect the size of the
implementation byte and not the size of the hardware byte. In your
example, CHAR_BIT would have to be defined as 8, even if the hardware
byte had 1024 bits. And wider types could only use the same bits
that unsigned char can access, because unsigned char is supposed to
be suitable for inspecting/altering their representation.
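
That suitability is what makes the usual representation-dump idiom legal;
a minimal sketch (the file name is made up, and the bytes printed are of
course implementation-specific):

/* BEGIN dumprep.c (hypothetical) */

#include <stdio.h>

int main(void)
{
    long x = 1L;
    const unsigned char *p = (const unsigned char *)&x;
    size_t i;

    /* Every bit of the object is visible through unsigned char,
       because unsigned char has no padding bits. */
    for (i = 0; i < sizeof x; i++)
        printf("byte %lu: %02X\n", (unsigned long)i, (unsigned)p[i]);
    return 0;
}

/* END dumprep.c */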

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Nov 14 '05 #44
