getc and "large" bytes

Assuming all the values of int are in the range of unsigned char, what
happens if getc returns EOF?
Is it possible that EOF was the value of the byte read?
Does that mean that code aiming for maximum portability needs to check
both feof() and ferror()?
(For example, if both feof() and ferror() return 0 for the stream when
getc() returned EOF, consider EOF a valid byte read.)
To me, that seems to be the case, but maybe the standard says this is
incorrect.

As always, all replies appreciated.
Jun 27 '08 #1
vi******@gmail.com writes:
> Assuming all the values of int are in the range of unsigned char, what
> happens if getc returns EOF?
Your assumption is false.
--
Ben Pfaff
http://benpfaff.org
Jun 27 '08 #2
On May 23, 6:35 pm, Ben Pfaff <b...@cs.stanford.edu> wrote:
> vipps...@gmail.com writes:
>> Assuming all the values of int are in the range of unsigned char, what
>> happens if getc returns EOF?

> Your assumption is false.
Would you please elaborate?
Jun 27 '08 #3
vi******@gmail.com said:
> On May 23, 6:35 pm, Ben Pfaff <b...@cs.stanford.edu> wrote:
>> vipps...@gmail.com writes:
>>> Assuming all the values of int are in the range of unsigned char, what
>>> happens if getc returns EOF?

>> Your assumption is false.
> Would you please elaborate?
The int type must be able to represent values in the range INT_MIN to -1,
none of which values are in the range of unsigned char (which, lacking a
sign bit, cannot represent negative values).
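
The same point can be checked against <limits.h>. A minimal sketch; the helper name every_int_fits_in_uchar is made up for illustration:

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: true only if every int value is also a valid
   unsigned char value, i.e. 0 <= INT_MIN and INT_MAX <= UCHAR_MAX. */
static bool every_int_fits_in_uchar(void)
{
    /* The standard requires INT_MIN to be at most -32767, so the first
       comparison already fails on any conforming implementation. */
    return INT_MIN >= 0 && INT_MAX <= UCHAR_MAX;
}
```

The function returns false regardless of how wide unsigned char is, because int always has negative values.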

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Jun 27 '08 #4
On May 23, 6:42 pm, Richard Heathfield <r...@see.sig.invalid> wrote:
> vipps...@gmail.com said:
>> On May 23, 6:35 pm, Ben Pfaff <b...@cs.stanford.edu> wrote:
>>> vipps...@gmail.com writes:
>>>> Assuming all the values of int are in the range of unsigned char, what
>>>> happens if getc returns EOF?
>>> Your assumption is false.
>> Would you please elaborate?

> The int type must be able to represent values in the range INT_MIN to -1,
> none of which values are in the range of unsigned char (which, lacking a
> sign bit, cannot represent negative values).
I'm talking about the case where both int and unsigned char are 16
bits, and to be honest I'm still not convinced that this is false.
Jun 27 '08 #5
In article <c9**********************************@e39g2000hsf.googlegroups.com>,
<vi******@gmail.com> wrote:
> Assuming all the values of int are in the range of unsigned char, what
> happens if getc returns EOF?
If int and char are the same size, and all possible unsigned char
values can be read, then it is possible that getc() will attempt to
convert to an int a value which cannot be represented as one. This is
implementation-defined. Assuming it works in the usual way, it may
return a negative integer equal to EOF.
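
To make the conversion concrete, here is a small model of such a machine, using uint16_t and int16_t as stand-ins for a 16-bit unsigned char and int. The function name is hypothetical, and the result for values above INT16_MAX is implementation-defined; on the usual implementations that wrap modulo 2^16, 0xFFFF comes out as -1, the typical value of EOF.

```c
#include <stdint.h>

/* Models the unsigned char -> int conversion on a hypothetical machine
   where both types are 16 bits wide. uint16_t and int16_t are stand-ins
   for the real types. For inputs above INT16_MAX the result is
   implementation-defined. */
static int16_t model_uchar_to_int(uint16_t byte)
{
    return (int16_t)byte;
}
```

Values up to 0x7FFF pass through unchanged; only the upper half of the range is affected.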
> Does that mean that code aiming for maximum portability needs to check
> for both feof() and ferror()?
Yes, but it seems to me that undefined behaviour is involved.

For maximum portability, don't use machines like that :-)

-- Richard
--
In the selection of the two characters immediately succeeding the numeral 9,
consideration shall be given to their replacement by the graphics 10 and 11 to
facilitate the adoption of the code in the sterling monetary area. (X3.4-1963)
Jun 27 '08 #6
vi******@gmail.com wrote:
> On May 23, 6:42 pm, Richard Heathfield <r...@see.sig.invalid> wrote:
>> vipps...@gmail.com said:
>>> On May 23, 6:35 pm, Ben Pfaff <b...@cs.stanford.edu> wrote:
>>>> vipps...@gmail.com writes:
>>>>> Assuming all the values of int are in the range of unsigned char, what
>>>>> happens if getc returns EOF?
>>>> Your assumption is false.
>>> Would you please elaborate?
>> The int type must be able to represent values in the range INT_MIN to -1,
>> none of which values are in the range of unsigned char (which, lacking a
>> sign bit, cannot represent negative values).
> I'm talking about the case that both int and unsigned char are 16
> bits, and to be honest I'm still not convinced that this is false.
Then you didn't express quite what you meant.

It seems to me that the behavior required of getc() places
far-reaching requirements on implementations where `int' and
`char' have the same width. Here are a few:

1) Since `unsigned char' can represent 2**N distinct values
and all of these must be distinguishable when converted to `int',
it follows that `int' must also have 2**N distinct values. Thus,
signed-magnitude and ones' complement representations are ruled
out, and INT_MIN must have its most negative possible value
(that is, INT_MIN == -INT_MAX - 1; the pattern with the sign bit set
and all value bits zero cannot be a trap representation).

1a) "Must be distinguishable when converted" follows from
7.19.2p3's promise that data read from a binary stream must
compare equal to the data written. If two different characters
mapped to the same `int', this promise couldn't be kept.

2) Converting a too-large `unsigned char' to `int' must
not raise a signal. (At least, it must not do so when getc()
performs the conversion; it's possible that an "open-code"
conversion would behave differently.)
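
Point (1) can be probed on a given implementation with a round-trip test. This is only a sketch with a made-up function name: it converts every unsigned char value to int and back and reports whether each value survives, which is what the distinctness requirement boils down to.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical check: returns true if every unsigned char value is
   recovered after being converted to int and back. If two different
   characters mapped to the same int, one of them would fail here. */
static bool uchar_int_round_trip_ok(void)
{
    unsigned int v = 0;
    for (;;) {
        unsigned char c = (unsigned char)v;
        int as_int = (int)c;          /* may be implementation-defined */
        if ((unsigned char)as_int != c)
            return false;
        if (v == UCHAR_MAX)           /* stop before v can wrap */
            break;
        v++;
    }
    return true;
}
```

On ordinary systems where int is wider than char the test passes trivially; it only becomes interesting when the two types have the same width.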

An implication of (1) for the programmer is that yes, there
will be a legitimate `unsigned char' value that maps to EOF
when converted to `int'. Hence, a maximally portable program
cannot assume that the value EOF indicates a getc() failure;
it must go on to check both feof() and ferror():

    int ch = getc(stream);
    if (ch == EOF) {
        if (feof(stream))
            return ALL_DONE;
        if (ferror(stream))
            return ALL_FU;
    }
    return ALLS_WELL;  /* even if ch == EOF */
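
Wrapped into a complete function, the idea might look like this. It is a sketch, not a definitive implementation: read_byte and the READ_* status names are hypothetical, and it assumes the stream's end-of-file and error indicators were clear before the call.

```c
#include <stdio.h>

/* Hypothetical status codes; the names are illustrative only. */
enum read_status { READ_OK, READ_EOF, READ_ERROR };

/* Reads one byte from stream into *out. An EOF return from getc() is
   treated as a failure only if feof() or ferror() confirms it. */
static enum read_status read_byte(FILE *stream, int *out)
{
    int ch = getc(stream);
    if (ch == EOF) {
        if (feof(stream))
            return READ_EOF;
        if (ferror(stream))
            return READ_ERROR;
        /* Neither indicator is set: ch is a real byte whose int value
           happens to equal EOF. */
    }
    *out = ch;
    return READ_OK;
}
```

A caller would loop while read_byte() returns READ_OK. Because the indicators are sticky, a caller that wants to continue after an error would call clearerr() first.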

--
Er*********@sun.com
Jun 27 '08 #7
vi******@gmail.com writes:
> On May 23, 6:35 pm, Ben Pfaff <b...@cs.stanford.edu> wrote:
>> vipps...@gmail.com writes:
>>> Assuming all the values of int are in the range of unsigned char, what
>>> happens if getc returns EOF?

>> Your assumption is false.
> Would you please elaborate?
-1 is in the range of int.
-1 is not in the range of unsigned char.
Therefore it is not true that all the values of int are in the
range of unsigned char.
--
char a[]="\n .CJacehknorstu";int putchar(int);int main(void){unsigned long b[]
={0x67dffdff,0x9aa9aa6a,0xa77ffda9,0x7da6aa6a,0xa67f6aaa,0xaa9aa9f6,0x11f6},*p
=b,i=24;for(;p+=!*p;*p/=4)switch(0[p]&3)case 0:{return 0;for(p--;i--;i--)case+
2:{i++;if(i)break;else default:continue;if(0)case 1:putchar(a[i&15]);break;}}}
Jun 27 '08 #8

"Ben Pfaff" <bl*@cs.stanford.edu> wrote in message
news:87************@blp.benpfaff.org...
> vi******@gmail.com writes:
>> On May 23, 6:35 pm, Ben Pfaff <b...@cs.stanford.edu> wrote:
>>> vipps...@gmail.com writes:
>>>> Assuming all the values of int are in the range of unsigned char, what
>>>> happens if getc returns EOF?

>>> Your assumption is false.
>> Would you please elaborate?

> -1 is in the range of int.
> -1 is not in the range of unsigned char.
> Therefore it is not true that all the values of int are in the
> range of unsigned char.
The OP mentioned an example where both might be 16 bits. So -1 in one could
be 0xFFFF in the other, causing ambiguity in the (I think unlikely) event of
reading a 16-bit character 0xFFFF from a file with 16-bit encoding.

(How would such a character size read standard 8-bit files? By
zero-extending to 16?)

--
Bartc

Jun 27 '08 #9
vi******@gmail.com writes:
> On May 23, 6:42 pm, Richard Heathfield <r...@see.sig.invalid> wrote:
>> vipps...@gmail.com said:
>>> On May 23, 6:35 pm, Ben Pfaff <b...@cs.stanford.edu> wrote:
>>>> vipps...@gmail.com writes:
>>>>> Assuming all the values of int are in the range of unsigned char, what
>>>>> happens if getc returns EOF?
>>>> Your assumption is false.
>>> Would you please elaborate?

>> The int type must be able to represent values in the range INT_MIN to -1,
>> none of which values are in the range of unsigned char (which, lacking a
>> sign bit, cannot represent negative values).
> I'm talking about the case that both int and unsigned char are 16
> bits, and to be honest I'm still not convinced that this is false.
Your underlying point is right; you just stated it incorrectly. The
problem occurs when not all values of unsigned char are in the range
of int.

The value returned by getc() is either the next character from the
input stream, interpreted as an unsigned char and converted to int, or
the value EOF (which must be negative and is typically -1).

On most systems, all values of type unsigned char can be converted to
int without changing their numeric value.
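
A small illustration of that return convention, assuming an ordinary system with 8-bit char and a wider int. The helper name getc_value_of is made up; it writes one byte to a temporary file and returns whatever getc() reports for it.

```c
#include <stdio.h>

/* Hypothetical helper: writes one byte to a temporary file and returns
   the value getc() reports when reading it back, or EOF if the
   temporary file could not be created or written. */
static int getc_value_of(unsigned char byte)
{
    FILE *f = tmpfile();
    int result = EOF;

    if (f == NULL)
        return EOF;
    fputc(byte, f);
    if (!ferror(f)) {
        rewind(f);
        result = getc(f);   /* byte as unsigned char, converted to int */
    }
    fclose(f);
    return result;
}
```

On such a system getc_value_of(0xFF) yields 255, which cannot be confused with a negative EOF.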

If both int and unsigned char are 16 bits, then (a) the conversion
from unsigned char to int is implementation-defined for values
numerically greater than INT_MAX, and (b) some valid unsigned char
value might be converted to the value EOF.

You can work around (b) by checking feof() and ferror() after getc()
returns EOF. If both are false, then you can assume that you read a
legitimate character (say, 0xFFFF) that happened to be converted to EOF
(or that there's a bug in the implementation's feof() or ferror()
function, which might be almost as likely). Most programmers don't
bother to worry about this possibility. As a result, some code will
likely break if ported to such a system (most likely a DSP, which
probably has a freestanding implementation anyway and thus needn't
support <stdio.h> at all) *if* it happens to read such a character.

(a), the implementation-definedness of the conversion, could be a more
serious problem. Given this problem, I can't think of a way to write
*really* portable code to read from a file.

fread() is likely to copy the input directly into an array of
characters, and thus probably won't run into the same problem -- but
fread() is defined to work by calling fgetc(), so the standard doesn't
guarantee that you won't run into exactly the same problem.
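
For comparison, a sketch of block reading with fread(). The function name read_block and its failed out-parameter are hypothetical. fread() reports progress as a count rather than through an in-band EOF value, so a short count is checked with ferror() instead of comparing bytes against EOF. This does not remove the conversion issue on exotic hardware; it only avoids the in-band sentinel.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to cap bytes from stream into buf and
   returns the number of bytes actually stored. *failed is set to 1 if
   a short read was caused by an error, and to 0 otherwise. */
static size_t read_block(FILE *stream, unsigned char *buf, size_t cap,
                         int *failed)
{
    size_t got = fread(buf, 1, cap, stream);
    *failed = (got < cap && ferror(stream)) ? 1 : 0;
    return got;
}
```

A short count with *failed equal to 0 means the end of the file was reached.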

In my opinion, it would be reasonable for the standard to require
INT_MAX >= UCHAR_MAX for all hosted implementations.
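
Until something like that is required, a program can at least state the assumption at compile time. A minimal sketch using only <limits.h>; the wording of the error message is illustrative:

```c
#include <limits.h>

/* Refuse to build on implementations where some unsigned char value
   cannot be represented as an int, because a getc() result equal to
   EOF would then be ambiguous. */
#if UCHAR_MAX > INT_MAX
#error "unsigned char does not fit in int; getc() == EOF is ambiguous"
#endif
```

Code that includes this check can keep the usual "ch == EOF means failure" idiom, at the cost of not compiling on the unusual systems discussed here.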

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Jun 27 '08 #10
