May fgetc() and friends return 163? Or UCHAR_MAX?

In a thread from substantially earlier this week,

Harald van Dijk <tr*****@gmail.com> wrote:
> getchar does not work with plain chars, it works with unsigned chars. 163
> fits just fine in an unsigned char, so getchar is allowed to return 163.

Being rather pedantic, I decided to try to verify whether this was
true. I would appreciate knowing whether my reading of the Standard
is correct.

7.19.7.1 (as we all know) states that fgetc() (and thus its friends)
"obtains [a] character as an unsigned char converted to an int".
There is nothing in the Standard (that I was able to find) which
states that sizeof(int) may not be 1, so it occurred to me to ask, "Is
163 always representable as a signed int if sizeof(int) is 1?"
5.2.4.2.1 states that INT_MAX may not be less than 32767, so the
answer to that question appears to be "yes".

On the other hand, I do not see anything in 5.2.4.2.1 which requires
that UCHAR_MAX not be greater than INT_MAX - which indeed it must be,
if sizeof(int) == 1, correct? In such a case, fgetc() may return
UCHAR_MAX (right?), and so either fgetc() must work behind-the-scenes
magic to return a signed integer representing UCHAR_MAX, or invoke UB
by overflowing the signed type int. Both of these alternatives seem
ridiculous to me, so what am I missing?

--
C. Benson Manica | I *should* know what I'm talking about - if I
cbmanica(at)gmail.com | don't, I need to know. Flames welcome.
Jun 7 '07 #1
Christopher Benson-Manica wrote:
> In a thread from substantially earlier this week,
>
> Harald van Dijk <tr*****@gmail.com> wrote:
>> getchar does not work with plain chars, it works with unsigned chars. 163
>> fits just fine in an unsigned char, so getchar is allowed to return 163.
>
> Being rather pedantic, I decided to try to verify whether this was
> true. I would appreciate knowing whether my reading of the Standard
> is correct.
>
> 7.19.7.1 (as we all know) states that fgetc() (and thus its friends)
> "obtains [a] character as an unsigned char converted to an int".
> There is nothing in the Standard (that I was able to find) which
> states that sizeof(int) may not be 1, so it occurred to me to ask, "Is
> 163 always representable as a signed int if sizeof(int) is 1?"
> 5.2.4.2.1 states that INT_MAX may not be less than 32767, so the
> answer to that question appears to be "yes".

Right.

> On the other hand, I do not see anything in 5.2.4.2.1 which requires
> that UCHAR_MAX not be greater than INT_MAX - which indeed it must be,
> if sizeof(int) == 1, correct?

Correct. signed int has at least INT_MAX - INT_MIN + 1 distinct
representations, and if sizeof(int) == 1, that means unsigned char must be
capable of storing at least that many values. However, it is allowed to be
capable of storing even more.
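
A minimal sketch of how one might inspect this relationship on a given implementation, using only the macros from <limits.h>. The helper names `uchar_fits_in_int` and `print_limits` are made up for illustration and are not part of any standard interface.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if every unsigned char value is
   representable as an int on this implementation, 0 otherwise. */
int uchar_fits_in_int(void)
{
    return UCHAR_MAX <= INT_MAX;
}

/* Hypothetical helper: prints the limits discussed in this thread. */
void print_limits(FILE *out)
{
    fprintf(out, "CHAR_BIT    = %d\n", CHAR_BIT);
    fprintf(out, "sizeof(int) = %lu\n", (unsigned long)sizeof(int));
    fprintf(out, "UCHAR_MAX   = %lu\n", (unsigned long)UCHAR_MAX);
    fprintf(out, "INT_MAX     = %d\n", INT_MAX);
    fprintf(out, "all unsigned char values fit in int: %s\n",
            uchar_fits_in_int() ? "yes" : "no");
}
```

Calling `print_limits(stdout)` on an ordinary hosted system should report that all unsigned char values fit in int; the case under discussion is the one where it would not.
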

> In such a case, fgetc() may return
> UCHAR_MAX (right?), and so either fgetc() must work behind-the-scenes
> magic to return a signed integer representing UCHAR_MAX, or invoke UB
> by overflowing the signed type int. Both of these alternatives seem
> ridiculous to me, so what am I missing?

The behaviour is not undefined for integer conversions of out-of-range
values, not even for the signed types. Either the result is
implementation-defined, or an implementation-defined signal is raised, see
6.3.1.3p3. The result is the same: fgetc need not or cannot be meaningful.
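
Since unsigned char does fit in int on ordinary systems, the same rule can be illustrated at a smaller scale with signed char. This is only a sketch: the helper name `round_trip` is hypothetical, and the result of the first conversion is implementation-defined. Common two's complement compilers wrap the value so that the round trip is lossless, but the Standard does not promise that.

```c
#include <limits.h>

/* Hypothetical helper: converts an unsigned char to signed char and back.
   The first conversion is implementation-defined (6.3.1.3p3) when the
   value exceeds SCHAR_MAX, and may even raise an implementation-defined
   signal. The second conversion is fully defined: the value is reduced
   modulo UCHAR_MAX + 1. */
unsigned char round_trip(unsigned char uc)
{
    signed char sc = (signed char)uc;   /* out of range for uc > SCHAR_MAX */
    return (unsigned char)sc;           /* well-defined modular conversion */
}
```

If `round_trip(163)` does not give back 163 on some implementation, that implementation's out-of-range conversion cannot be reverted, which is exactly the situation the rest of this post is concerned with.
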

However, 7.19.2p3 states that

"A binary stream is an ordered sequence of characters that can transparently
record internal data. Data read in from a binary stream shall compare equal
to the data that were earlier written out to that stream, under the same
implementation. Such a stream may, however, have an implementation-defined
number of null characters appended to the end of the stream."

This requirement cannot be met by an implementation where the conversion of
out-of-range values results in a signal, or where the conversion of
out-of-range values cannot be reverted. So by my reading, only freestanding
implementations that do not provide the standard I/O functions at all are
allowed to define unsigned char and int in such ways.
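
The round-trip requirement can be checked empirically along these lines. This is a sketch under stated assumptions: it assumes UCHAR_MAX <= INT_MAX (as on ordinary hosted implementations), that tmpfile() succeeds, and the helper name `count_round_trip_mismatches` is hypothetical.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: writes every unsigned char value to a temporary
   binary stream with fputc, reads the values back with fgetc, and returns
   the number of mismatches, or -1 if the stream could not be created.
   Assumes UCHAR_MAX <= INT_MAX, so the casts to int below are in range. */
int count_round_trip_mismatches(void)
{
    FILE *fp = tmpfile();   /* tmpfile() opens a binary stream ("wb+") */
    int mismatches = 0;
    unsigned int v;

    if (fp == NULL)
        return -1;

    for (v = 0; v <= UCHAR_MAX; v++)
        fputc((int)v, fp);
    rewind(fp);

    for (v = 0; v <= UCHAR_MAX; v++) {
        int c = fgetc(fp);
        if (c != (int)v)    /* also counts an unexpected EOF as a mismatch */
            mismatches++;
    }

    fclose(fp);
    return mismatches;
}
```

A conforming hosted implementation should return 0 here. An implementation where the conversion to int signals or loses information could not.
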
Jun 7 '07 #2
Harald van Dijk <tr*****@gmail.com> wrote:
> The behaviour is not undefined for integer conversions of out-of-range
> values, not even for the signed types. Either the result is
> implementation-defined, or an implementation-defined signal is raised, see
> 6.3.1.3p3. The result is the same: fgetc need not or cannot be meaningful.

The language in n869 does not mention signals, but I assume that is a
difference between the draft and the actual standards.

> However, 7.19.2p3 states that
> "A binary stream is an ordered sequence of characters that can transparently
> record internal data. Data read in from a binary stream shall compare equal
> to the data that were earlier written out to that stream, under the same
> implementation. Such a stream may, however, have an implementation-defined
> number of null characters appended to the end of the stream."
> This requirement cannot be met by an implementation where the conversion of
> out-of-range values results in a signal, or where the conversion of
> out-of-range values cannot be reverted.

Yes, that makes sense, although on further reading, it seems that an
implementation could work internal magic to establish a one-to-one
relationship between all unsigned char values from 0 to UCHAR_MAX and all
signed int values from INT_MIN to INT_MAX. That would mean that an
implementation would have to ensure that there were at least as many
valid signed int values as unsigned char values, with an extra signed
int value representing EOF. It does sound like a tall order for an
implementation where sizeof(int) == 1, but possible on a DS9K level.

--
C. Benson Manica | I *should* know what I'm talking about - if I
cbmanica(at)gmail.com | don't, I need to know. Flames welcome.
Jun 7 '07 #3
Christopher Benson-Manica wrote:
> Harald van Dijk <tr*****@gmail.com> wrote:
>> The behaviour is not undefined for integer conversions of out-of-range
>> values, not even for the signed types. Either the result is
>> implementation-defined, or an implementation-defined signal is raised,
>> see 6.3.1.3p3. The result is the same: fgetc need not or cannot be
>> meaningful.
>
> The language in n869 does not mention signals, but I assume that is a
> difference between the draft and the actual standards.

I don't remember when it was added. I believe it's part of C99, but I may be
misremembering. I'm reading from n1124.

>> However, 7.19.2p3 states that
>> "A binary stream is an ordered sequence of characters that can
>> transparently record internal data. Data read in from a binary stream
>> shall compare equal to the data that were earlier written out to that
>> stream, under the same implementation. Such a stream may, however, have
>> an implementation-defined number of null characters appended to the end
>> of the stream."
>> This requirement cannot be met by an implementation where the conversion
>> of out-of-range values results in a signal, or where the conversion of
>> out-of-range values cannot be reverted.
>
> Yes, that makes sense, although on further reading, it seems that an
> implementation could work internal magic to establish a one-to-one
> relationship between all unsigned char values from 0 to UCHAR_MAX and all
> signed int values from INT_MIN to INT_MAX. That would mean that an
> implementation would have to ensure that there were at least as many
> valid signed int values as unsigned char values, with an extra signed
> int value representing EOF. It does sound like a tall order for an
> implementation where sizeof(int) == 1, but possible on a DS9K level.

EOF need not be distinct from any valid character converted to int. Strictly
speaking, although most code does not do so, after fgetc() returns EOF you
should call feof() and ferror() to check whether more characters can be read.
Note that this is necessary even on non-DS9K systems when using fgetwc() and
friends.
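
A minimal sketch of such a read loop, assuming nothing beyond <stdio.h>. The helper name `count_chars` is hypothetical. It treats a return value of EOF as the end of input only when feof() or ferror() confirms it.

```c
#include <stdio.h>

/* Hypothetical helper: counts the characters remaining in a stream.
   A return value equal to EOF ends the loop only if feof() or ferror()
   reports end-of-file or an error; otherwise it is counted as data,
   which matters on an implementation where a valid character converted
   to int can compare equal to EOF. */
long count_chars(FILE *fp)
{
    long n = 0;

    for (;;) {
        int c = fgetc(fp);
        if (c == EOF && (feof(fp) || ferror(fp)))
            break;
        n++;
    }
    return n;
}
```

On ordinary systems the extra check never changes the result, since no unsigned char value converted to int equals EOF there; it only costs two function calls at the end of the stream.
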
Jun 7 '07 #4

"Christopher Benson-Manica" <at***@faeroes.freeshell.org> wrote in message
news:f4**********@chessie.cirr.com...
> In a thread from substantially earlier this week,
>
> Harald van Dijk <tr*****@gmail.com> wrote:
>> getchar does not work with plain chars, it works with unsigned chars. 163
>> fits just fine in an unsigned char, so getchar is allowed to return 163.
>
> Being rather pedantic, I decided to try to verify whether this was
> true. I would appreciate knowing whether my reading of the Standard
> is correct.
>
> 7.19.7.1 (as we all know) states that fgetc() (and thus its friends)
> "obtains [a] character as an unsigned char converted to an int".
> There is nothing in the Standard (that I was able to find) which
> states that sizeof(int) may not be 1, so it occurred to me to ask, "Is
> 163 always representable as a signed int if sizeof(int) is 1?"
> 5.2.4.2.1 states that INT_MAX may not be less than 32767, so the
> answer to that question appears to be "yes".
>
> On the other hand, I do not see anything in 5.2.4.2.1 which requires
> that UCHAR_MAX not be greater than INT_MAX - which indeed it must be,
> if sizeof(int) == 1, correct? In such a case, fgetc() may return
> UCHAR_MAX (right?), and so either fgetc() must work behind-the-scenes
> magic to return a signed integer representing UCHAR_MAX, or invoke UB
> by overflowing the signed type int. Both of these alternatives seem
> ridiculous to me, so what am I missing?

Yes. That's a known glitch. A system that makes sizeof(int) == 1 has no way
of returning EOF and distinguishing it from a legal value.
In practice, files are probably read as octets, which is OK but breaks
fputc(), though only in binary mode.

Jun 17 '07 #5
