<posted & mailed>
C has no notion of an "ASCII" file. It makes only a somewhat artificial
distinction between "text" and "binary": a "text" file is a special case of
a binary file, where the operating system (or C library) might do something
to the data as it is written to the disk to make it compatible with
applications that operate on text.
Since there's no definition of what that magic might be, there's likewise no
way to distinguish a "text" file from a "binary" file. All text files are
binary files. The only way to recognize a text file would be to check if
the file matches the local environment's criteria for a "text" file (and
most environments don't have the concept of a "text" file at all).
The canonical example is CP/M (and Microsoft's products, which harken back
to it). There, if you open a file for writing as a "text" file, every "\n"
that is written becomes "\r\n" on disk, and when you close the file, "\032"
is appended to the end of the file. When you read from the text file, the
reverse operations occur. Windows still does the newline translation. The
only way you could differentiate between a text file and a binary file
would be to be
armed with this information, then open the target file in binary mode and
check that every byte in the file returns true for isprint() or isspace()
except the last byte in the file, which must equal '\032'. If so, you know
the file is a text file. You don't need to test if the file is a binary
file, since all files are.
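
For what it's worth, here is a minimal sketch of that check, assuming the
CP/M-style convention described above and the default "C" locale. The
function names looks_like_cpm_text() and file_looks_like_cpm_text() are
made up for illustration; nothing like them exists in the standard library.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if buf[0..len-1] follows the CP/M-style
   "text" convention (every byte printable or whitespace, except a final
   '\032'), and 0 otherwise. An empty buffer is not considered text. */
int looks_like_cpm_text(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    if (len == 0 || buf[len - 1] != '\032')   /* '\032' is Ctrl-Z */
        return 0;

    for (size_t i = 0; i + 1 < len; i++) {
        if (!isprint(buf[i]) && !isspace(buf[i]))
            return 0;
    }
    return 1;
}

/* Hypothetical helper: applies the same test to a file, which is opened
   in binary mode so that no translation hides the bytes on disk.
   Returns 1 for "text", 0 for "not text", -1 on an I/O error. */
int file_looks_like_cpm_text(const char *path)
{
    FILE *fp = fopen(path, "rb");
    int c, prev = EOF;

    if (fp == NULL)
        return -1;

    while ((c = getc(fp)) != EOF) {
        /* prev is known not to be the last byte, so it must be text. */
        if (prev != EOF && !isprint(prev) && !isspace(prev)) {
            fclose(fp);
            return 0;
        }
        prev = c;
    }

    if (ferror(fp)) {
        fclose(fp);
        return -1;
    }
    fclose(fp);

    return prev == '\032';   /* the last byte must be the Ctrl-Z marker */
}
```

You would call file_looks_like_cpm_text("somefile") and treat a result of
1 as "text by this convention". It is only a heuristic: a binary file that
happens to contain nothing but printable bytes followed by '\032' passes
too.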
It gets more complicated these days, when multiple character sets and
various encodings are used for text. In that case, the encoding needs to
be indicated within the file somehow, and that frequently presumes
multibyte character sets, which already precludes treating such files as
simple text files in the first place.
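
One common way an encoding gets indicated within the file is a Unicode
byte-order mark (BOM) at the very start. The sketch below is my own
illustration, not something the C library provides: bom_encoding() is a
hypothetical helper, and it recognizes only the UTF-8 and UTF-16 marks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the name of the encoding suggested by a
   leading byte-order mark, or NULL if none of the known marks is present.
   Limitation: a UTF-32LE BOM (FF FE 00 00) begins with the UTF-16LE mark
   and is reported as "UTF-16LE" here. */
const char *bom_encoding(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    if (len >= 3 && memcmp(buf, "\xEF\xBB\xBF", 3) == 0)
        return "UTF-8";
    if (len >= 2 && memcmp(buf, "\xFF\xFE", 2) == 0)
        return "UTF-16LE";
    if (len >= 2 && memcmp(buf, "\xFE\xFF", 2) == 0)
        return "UTF-16BE";
    return NULL;
}
```

Read the first few bytes of the file in binary mode and pass them in. A
NULL result tells you nothing either way, since plenty of text files carry
no BOM at all.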
Sunner Sun wrote:
> Hi, all
> Since the OS look both ASCII and binary file as a sequence of bytes, is
> there any way to determine the file type except to judge the extension?
> Thank you!
--
remove .spam from address to reply by e-mail.