Bytes IT Community

EOF for binary files

P: n/a
I've read in a book:

<quote>
With a binary-mode stream, you can't detect the end-of-file by looking
for EOF, because a byte of data from a binary stream could have that
value, which would result in premature end of input. Instead, you can
use the library function feof(), which can be used for both binary- and
text-mode files:

int feof(FILE *fp);
</quote>

Isn't it true that testing for EOF is valid for both text- and
binary-mode files?

Also, the FAQ recommends not to use feof():
<quote>In virtually all cases, there's no need to use feof at all.
</quote>

Nov 11 '06 #1
12 Replies


P: n/a
Registered User said:
> I've read in a book:
>
> <quote>
> With a binary-mode stream, you can't detect the end-of-file by looking
> for EOF, because a byte of data from a binary stream could have that
> value, which would result in premature end of input.

Ditch the book. It doesn't understand EOF.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: normal service will be restored as soon as possible. Please do not
adjust your email clients.
Nov 11 '06 #2

P: n/a
Richard Heathfield wrote:
> Registered User said:
>> I've read in a book:
>>
>> <quote>
>> With a binary-mode stream, you can't detect the end-of-file by looking
>> for EOF, because a byte of data from a binary stream could have that
>> value, which would result in premature end of input.
>
> Ditch the book. It doesn't understand EOF.

Oh, thanks Richard!! That part of the book really got me confused.

Nov 11 '06 #3

P: n/a
Registered User said:
> Richard Heathfield wrote:
>> Registered User said:
>>> I've read in a book:
>>>
>>> <quote>
>>> With a binary-mode stream, you can't detect the end-of-file by looking
>>> for EOF, because a byte of data from a binary stream could have that
>>> value, which would result in premature end of input.
>>
>> Ditch the book. It doesn't understand EOF.
>
> Oh, thanks Richard!! That part of the book really got me confused.

The mistake the author makes is that he appears to believe EOF is a
character. It isn't. It's a message from your I/O library which, freely
translated, means "you asked me for more data, squire, but there ain't
none. The pot's empty. Sorry, I'd love to help and all that...".
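
To make that concrete, here is a minimal sketch, assuming a hosted implementation where int is wider than char. The helper name roundtrip_all_byte_values is made up for illustration. It writes every byte value from 0 to 255 to a temporary binary file and reads them back with getc(); the 0xFF byte comes back as the int 255, not as EOF, so the loop stops only when the data really runs out.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes each byte value 0..255 once to a temporary
   binary file, reads the file back with getc(), and returns the number of
   bytes read before getc() reported EOF. Returns -1 if the file could not
   be created or an I/O error occurred. */
long roundtrip_all_byte_values(void)
{
    FILE *fp = tmpfile();   /* binary file, opened as if with mode "wb+" */
    long count = 0;
    int ch;                 /* int, not char: it must be able to hold EOF */

    if (fp == NULL)
        return -1;

    for (int value = 0; value <= 255; value++) {
        if (putc(value, fp) == EOF) {
            fclose(fp);
            return -1;
        }
    }
    rewind(fp);

    while ((ch = getc(fp)) != EOF) {
        assert(ch >= 0 && ch <= 255);   /* data bytes are never negative */
        count++;
    }

    if (ferror(fp))
        count = -1;
    fclose(fp);
    return count;
}
```

On such an implementation the function should return 256: no data byte is mistaken for the "pot's empty" message.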

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: normal service will be restored as soon as possible. Please do not
adjust your email clients.
Nov 11 '06 #4

P: n/a
Registered User wrote:
>
> I've read in a book:
>
> <quote>
> With a binary-mode stream, you can't detect the end-of-file by
> looking for EOF, because a byte of data from a binary stream could
> have that value, which would result in premature end of input.
> Instead, you can use the library function feof(), which can be
> used for both binary- and text-mode files:
>
> int feof(FILE *fp);
> </quote>
>
> Isn't it true that testing for EOF is valid for both text- and
> binary-mode files?

Yes. The only possible exception occurs when (sizeof(int) == 1).
A stream is a stream of bytes, and the routines to read them return
ints formed from the unsigned char value involved. Thus the value
of EOF is always distinct.

> Also, the FAQ recommends not to use feof():
> <quote>In virtually all cases, there's no need to use feof at all.
> </quote>

feof is primarily useful to distinguish between I/O errors and
actual end-of-file, either of which usually causes the read routine
to return EOF.

    if (EOF == (ch = getc(f))) {
        if (feof(f)) {
            /* actual file eof encountered */
        }
        else {
            /* use ferror etc. to determine the cause */
        }
    }
    else {
        /* use the value of ch, which is a valid unsigned char */
    }

Note that ch must have been declared as an int.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Nov 11 '06 #5

P: n/a
In article <11**********************@m73g2000cwd.googlegroups.com>,
Registered User <in*******************@gmail.com> wrote:
> With a binary-mode stream, you can't detect the end-of-file by looking
> for EOF, because a byte of data from a binary stream could have that
> value, which would result in premature end of input.

It would certainly be a mistake to compare a byte against EOF if the
byte is a char, because EOF is an int value and a char converted to
an int might have the same value as EOF. But getc() doesn't return
a char; it returns an unsigned char converted to an int, so there
is no possibility of a real byte appearing to be equal to EOF, because
EOF is guaranteed to be negative.

So you can perfectly well compare against EOF provided you don't
convert the value to a char first.
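
A small sketch of that pitfall, under two assumptions that hold on common platforms but are not guaranteed by the Standard: EOF is -1, and converting 255 to signed char yields -1. The function names and the three-byte sample file are invented for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Correct: the result of getc() is kept in an int. */
long count_with_int(FILE *fp)
{
    long n = 0;
    int ch;
    assert(fp != NULL);
    while ((ch = getc(fp)) != EOF)
        n++;
    return n;
}

/* Broken on purpose: the result is narrowed to signed char before the
   comparison, so a data byte such as 0xFF may compare equal to EOF.
   (The narrowing conversion is implementation-defined.) */
long count_with_signed_char(FILE *fp)
{
    long n = 0;
    signed char c;
    assert(fp != NULL);
    while ((c = (signed char)getc(fp)) != EOF)
        n++;
    return n;
}

/* Hypothetical sample data: a temporary file holding 'A', 0xFF, 'B'.
   The literal is split so that 'B' is not parsed as a hex digit. */
FILE *make_sample(void)
{
    FILE *fp = tmpfile();
    if (fp != NULL) {
        fputs("A\xFF" "B", fp);
        rewind(fp);
    }
    return fp;
}
```

Under those assumptions count_with_int() sees all three bytes, while count_with_signed_char() stops at the 0xFF byte and reports only one.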

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.
Nov 11 '06 #6

P: n/a
Registered User wrote:
> I've read in a book:
>
> <quote>
> With a binary-mode stream, you can't detect the end-of-file by looking
> for EOF, because a byte of data from a binary stream could have that
> value, which would result in premature end of input. Instead, you can
> use the library function feof(), which can be used for both binary- and
> text-mode files:
>
> int feof(FILE *fp);
> </quote>
>
> Isn't it true that testing for EOF is valid for both text- and
> binary-mode files?

The book is right in the sense that it is possible for a
byte read from a stream (text or binary) to have the value
EOF, but only on "exotic" machines where bytes and ints have
the same size. That is, the book is right if it's trying to
be "fully general" -- but if it's writing about "mainstream"
C implementations it's wrong.

The Standard defines all input operations as if they used
the fgetc() function as many times as necessary (the actual
implementation might do something more intricate, but the end
result must be the same). The fgetc() function returns an int
value: either EOF to indicate failure, or an actual input byte
represented as unsigned char converted to int. If int is
wider than char, converting an unsigned char to an int yields
a non-negative value, and since the EOF macro expands to a
negative number there can be no confusion.

On those exotic architectures, though, things get sticky.
If sizeof(int) == 1, there must be unsigned char values that
are too large for int: for example, on a system with sixteen-bit
chars and sixteen-bit ints, INT_MAX will be 32767 but UCHAR_MAX
will be 65535. Since fgetc() must be able to read back any
character values fputc() might have written (subject to some
restrictions that don't matter here), on this system it must
be able to return 65536 distinguishable int values. Half of
those will necessarily be negative, and one of them will have
the same value as EOF. So on exotic architectures, it is
possible for fgetc() to return EOF when reading "real" data,
and the only way to tell whether the EOF is actual data or an
indication of input failure is to call both feof() and ferror().
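
The arithmetic can be checked from <limits.h>. In the sixteen-bit example, UCHAR_MAX (65535) exceeds INT_MAX (32767), so some byte values must map to negative ints; on a typical machine with 8-bit chars and 32-bit ints, UCHAR_MAX is 255 and fits easily. A minimal sketch, with a made-up function name, that reports which situation the current implementation is in:

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 when every unsigned char value fits in a
   non-negative int, so that no data byte returned by fgetc() can compare
   equal to EOF; returns 0 on the "exotic" implementations described above. */
int eof_is_unambiguous(void)
{
    assert(EOF < 0);                /* guaranteed by the Standard */
    return UCHAR_MAX <= INT_MAX;
}
```

When this returns 0, a plain comparison against EOF is not enough and feof() and ferror() both have to be consulted.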

> Also, the FAQ recommends not to use feof():
> <quote>In virtually all cases, there's no need to use feof at all.
> </quote>

I'm not the FAQ author, but I'd read "in virtually all cases"
to mean "whenever int is wider than char," or "on virtually all
`mainstream' machines." It would be nice, IMHO, if the FAQ were
more explicit about this, but it's not a big failing.

The FAQ is right in implying that feof() is seldom used,
because after receiving an EOF return value (on a "mainstream"
system) your immediate concern should be "End-of-input, or error?"
and it seems more natural to use ferror() for that question:

    int ch;
    while ( (ch = fgetc(stream)) != EOF ) {
        /* process the character just read */
    }
    /* "Why did we get EOF?" */
    if (ferror(stream)) {
        /* do something about the I/O error */
    }
    else {
        /* normal end-of-input */
    }

This code assumes that EOF can only appear as the result of
end-of-input or I/O error, so if there's no I/O error the stream
must have reached its end. Of course, the same reasoning would
hold for using feof(stream) and swapping the bodies of the two
if statements, but "ferror?" seems a more direct inquiry.
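
As a complete function, that pattern might look like the sketch below. It assumes a "mainstream" implementation where int is wider than char; the name copy_stream and its return codes are invented for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copies the rest of `in` to `out` one byte at a time.
   Returns 0 on normal end-of-input, -1 on a read error, -2 on a write error. */
int copy_stream(FILE *in, FILE *out)
{
    int ch;     /* int, so that EOF stays distinct from every data byte */

    assert(in != NULL && out != NULL);
    while ((ch = getc(in)) != EOF) {
        if (putc(ch, out) == EOF)
            return -2;
    }
    /* "Why did we get EOF?" */
    return ferror(in) ? -1 : 0;
}
```

The caller learns whether the loop ended because the input ran out or because something went wrong, without calling feof() at all.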

On "exotic" architectures the either/or reasoning breaks down
because there's a third possibility: an EOF return might be actual
input data. If you're writing with such a system in mind you need
to use both feof() and ferror() to distinguish the three outcomes,
and the loop might look something like

    int ch;
    while ( (ch = fgetc(stream)) ,  /* comma operator */
            (!feof(stream) && !ferror(stream)) ) {
        /* process the character just read */
    }
    /* "Was it error or end-of-input?" */
    if (ferror(stream)) {
        /* do something about the I/O error */
    }
    else {
        /* normal end-of-input */
    }

Of course, this can be written in many other rearrangements. One
likely change would be to call feof() and ferror() only when an EOF
shows up instead of every single time, by changing the while clause
to something like

    while ( (ch = fgetc(stream)) != EOF
            || (!feof(stream) && !ferror(stream)) )

Since most I/O devices are pathetically slow compared to most CPUs,
this "optimization" probably doesn't save noticeable time -- but
it is in the tradition of C to worry about tiny efficiencies while
ignoring gross waste. ;-) (That same tradition, by the way, calls
for using getc() instead of fgetc() wherever possible.)

--
Eric Sosman
es*****@acm-dot-org.invalid
Nov 11 '06 #7

P: n/a
On 11 Nov 2006 14:34:44 GMT, Richard Tobin wrote:
> In article <11**********************@m73g2000cwd.googlegroups.com>,
> Registered User <in*******************@gmail.com> wrote:
>> With a binary-mode stream, you can't detect the end-of-file by looking
>> for EOF, because a byte of data from a binary stream could have that
>> value, which would result in premature end of input.
>
> It would certainly be a mistake to compare a byte against EOF if the
> byte is a char, because EOF is an int value and a char converted to
> an int might have the same value as EOF. But getc() doesn't return
> a char; it returns an unsigned char converted to an int, so there
> is no possibility of a real byte appearing to be equal to EOF, because
> EOF is guaranteed to be negative.

getc returns an int, not a char, be it signed or unsigned.
#include <stdio.h>
int getc(FILE *FP);
And yes, if no EOF condition is reached, the int may be regarded as char.
EOF does not fit in a char so it well may be some negative number.

> So you can perfectly well compare against EOF provided you don't
> convert the value to a char first.

Yes.
--
Coos
Nov 11 '06 #8

P: n/a
Coos Haak wrote:
> On 11 Nov 2006 14:34:44 GMT, Richard Tobin wrote:
>> In article <11**********************@m73g2000cwd.googlegroups.com>,
>> Registered User <in*******************@gmail.com> wrote:
>>> With a binary-mode stream, you can't detect the end-of-file by looking
>>> for EOF, because a byte of data from a binary stream could have that
>>> value, which would result in premature end of input.
>>
>> It would certainly be a mistake to compare a byte against EOF if the
>> byte is a char, because EOF is an int value and a char converted to
>> an int might have the same value as EOF. But getc() doesn't return
>> a char; it returns an unsigned char converted to an int, so there
>> is no possibility of a real byte appearing to be equal to EOF, because
>> EOF is guaranteed to be negative.
>
> getc returns an int, not a char, be it signed or unsigned.

Richard said that.

> #include <stdio.h>
> int getc(FILE *FP);
> And yes, if no EOF condition is reached, the int may be regarded as char.

By *definition*, if EOF is not returned the value is that of an
*unsigned* char as, again, Richard said.

> EOF does not fit in a char so it well may be some negative number.

EOF is *defined* as being a negative number, so there is no "may well
be" about it.

>> So you can perfectly well compare against EOF provided you don't
>> convert the value to a char first.
>
> Yes.

Everything Richard said in that post is correct, not just that last
sentence.
--
Flash Gordon
Nov 11 '06 #9

P: n/a
On Sat, 11 Nov 2006 17:21:14 +0000, Flash Gordon wrote:
> Coos Haak wrote:
>> On 11 Nov 2006 14:34:44 GMT, Richard Tobin wrote:
>>> In article <11**********************@m73g2000cwd.googlegroups.com>,
>>> Registered User <in*******************@gmail.com> wrote:
>>>
>>>> With a binary-mode stream, you can't detect the end-of-file by looking
>>>> for EOF, because a byte of data from a binary stream could have that
>>>> value, which would result in premature end of input.
>>>
>>> It would certainly be a mistake to compare a byte against EOF if the
>>> byte is a char, because EOF is an int value and a char converted to
>>> an int might have the same value as EOF. But getc() doesn't return

My mistake, I overlooked this -------
Sorry for reading and replying too hastily ;-(
--
Coos
Nov 11 '06 #10

P: n/a
"Registered User" <in*******************@gmail.com> writes:
> I've read in a book:
>
> <quote>
> With a binary-mode stream, you can't detect the end-of-file by looking
> for EOF, because a byte of data from a binary stream could have that
> value, which would result in premature end of input. Instead, you can
> use the library function feof(), which can be used for both binary- and
> text-mode files:
>
> int feof(FILE *fp);
> </quote>

Who is the author? If it's Schildt, we already know about him (and
warn people away from his books whenever possible). If it's someone
else, we may have another name for The List.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 11 '06 #11

P: n/a
"Registered User" <in*******************@gmail.com> wrote:
# I've read in a book:
#
# <quote>
# With a binary-mode stream, you can't detect the end-of-file by looking
# for EOF, because a byte of data from a binary stream could have that
# value, which would result in premature end of input. Instead, you can
# use the library function feof(), which can be used for both binary- and
# text-mode files:

It's referring to getw(fp) which can return the same value as EOF
without actually being at the end of file.

--
SM Ryan http://www.rawbw.com/~wyrmwif/
The little stoner's got a point.
Nov 11 '06 #12

P: n/a
SM Ryan <wy*****@tango-sierra-oscar-foxtrot-tango.fake.org> writes:
> "Registered User" <in*******************@gmail.com> wrote:
>> I've read in a book:
>>
>> <quote>
>> With a binary-mode stream, you can't detect the end-of-file by looking
>> for EOF, because a byte of data from a binary stream could have that
>> value, which would result in premature end of input. Instead, you can
>> use the library function feof(), which can be used for both binary- and
>> text-mode files:
>
> It's referring to getw(fp) which can return the same value as EOF
> without actually being at the end of file.

What makes you think it's referring to getw()? There is no such function
in standard C.

<OT>
There is a non-standard function getw() that reads a word (defined as
an int) from a stream. It's not even POSIX; it's defined by SVID, and
one man page recommends using fread() instead. The text quoted from
the book doesn't even make sense in terms of getw(), since it talks
about a *byte* of data having the value EOF.
</OT>

It's far more likely that the author of the book just doesn't know
what he's talking about.
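
For the fread() route mentioned in the aside above, a minimal sketch follows. The helper name read_int and its return codes are assumptions made for illustration, and the ints are stored in the host's native representation. Because success is reported through fread()'s return count and not through the value read, an int that happens to equal EOF is not ambiguous.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads one int in the host's native representation.
   Returns 1 on success, 0 if end-of-file was reached before a whole int
   could be read, and -1 on an I/O error. */
int read_int(FILE *fp, int *out)
{
    assert(fp != NULL && out != NULL);
    if (fread(out, sizeof *out, 1, fp) == 1)
        return 1;
    return ferror(fp) ? -1 : 0;
}
```

A caller can loop while read_int() returns 1 and then inspect the final return value to tell end-of-file from an error.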

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 11 '06 #13
