How do I know whether the file is text or binary?

How do I know whether the file is text or binary?

http://www.alvas.net - Audio tools for C# and VB.Net developers + Christmas
discount
Dec 13 '07 #1
On Dec 13, 11:17 am, "Alexander Vasilevsky" <m...@alvas.net> wrote:
> How do I know whether the file is text or binary?
>
> http://www.alvas.net - Audio tools for C# and VB.Net developers + Christmas
> discount

The "brute force" method would be to open the file with a
BinaryReader, read every byte in the file and if any byte is outside
the range of standard ASCII characters, then it is a binary file. You
may have to take into account possible international characters.

I suppose it could be possible that a true binary file may not contain
any non-standard ASCII characters, but that, IMHO, would be rare.
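
For reference, here is a minimal C# sketch of the brute-force scan described above. The names AsciiScan and LooksLikeAscii are made up for illustration, and counting tab, CR and LF as text is an assumption. Any international character fails this check, so treat it as a rough test only.

```csharp
using System;
using System.IO;

static class AsciiScan
{
    // Returns true if every byte is printable 7-bit ASCII (0x20..0x7E)
    // or one of tab, CR, LF. Anything else counts as "binary".
    public static bool LooksLikeAscii(Stream stream)
    {
        using (var reader = new BinaryReader(stream))
        {
            byte[] chunk;
            // ReadBytes returns an empty array at end of stream.
            while ((chunk = reader.ReadBytes(4096)).Length > 0)
            {
                foreach (byte b in chunk)
                {
                    bool printable = b >= 0x20 && b <= 0x7E;
                    bool whitespace = b == '\t' || b == '\n' || b == '\r';
                    if (!printable && !whitespace)
                        return false;
                }
            }
        }
        return true;
    }

    // Convenience overload for a file on disk.
    public static bool LooksLikeAscii(string path)
    {
        return LooksLikeAscii(File.OpenRead(path));
    }
}
```

Note that an empty file passes the check, since it contains no offending byte.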

Dec 13 '07 #2
On Dec 13, 4:35 pm, za...@construction-imaging.com wrote:
> The "brute force" method would be to open the file with a
> BinaryReader, read every byte in the file and if any byte is outside
> the range of standard ASCII characters, then it is a binary file. You
> may have to take into account possible international characters.
>
> I suppose it could be possible that a true binary file may not contain
> any non-standard ASCII characters, but that, IMHO, would be rare.

More to the point, it's possible for a text file to contain non-ASCII
characters. If it's encoded in UTF-8, for instance, it may well
contain a BOM at the start which is non-ASCII, as well as non-ASCII
encoded characters.

For the OP: there's no such thing as a "text file" or a "binary file"
really - it's all a question of interpretation. A file is (at least in
the simple case - there may be alternate streams etc) just a sequence
of bytes. Any particular sequence of bytes can be treated as binary,
or perhaps treated as text depending on which encoding is chosen.
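
To make the "depends on which encoding is chosen" point concrete, here is a small C# sketch that checks a buffer for the UTF-8 BOM (EF BB BF) and tests whether the bytes decode as well-formed UTF-8. The names EncodingProbe, HasUtf8Bom and IsValidUtf8 are hypothetical, and passing the check only means the bytes *can* be read as UTF-8, not that the file was meant as text.

```csharp
using System;
using System.Text;

static class EncodingProbe
{
    // True if the buffer starts with the UTF-8 byte order mark EF BB BF.
    public static bool HasUtf8Bom(byte[] data)
    {
        return data.Length >= 3
            && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF;
    }

    // True if the bytes decode as well-formed UTF-8.
    public static bool IsValidUtf8(byte[] data)
    {
        // throwOnInvalidBytes: true makes decoding throw on malformed input
        // instead of silently substituting U+FFFD.
        var strict = new UTF8Encoding(false, true);
        try
        {
            strict.GetString(data);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }
}
```

For example, the single byte 0xE9 is "é" when read as Latin-1 but is not valid UTF-8 on its own, while the pair C3 A9 is the same character in UTF-8.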

Jon
Dec 13 '07 #3
"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:4c**********************************@d4g2000prg.googlegroups.com...
> For the OP: there's no such thing as a "text file" or a "binary file"
> really - it's all a question of interpretation. A file is (at least in
> the simple case - there may be alternate streams etc) just a sequence
> of bytes. Any particular sequence of bytes can be treated as binary,
> or perhaps treated as text depending on which encoding is chosen.

Well said. I recall several mainframe operating systems that make the
distinction, but in Windows, a file does not have a type. Every file is a
stream of bytes, and some streams of bytes can be interpreted as text.

Of course files have names, and the names end with extensions, and *by
convention* many pieces of software assume that the extension gives
information about the file. But the OS does not keep track of file types.
Dec 13 '07 #4

"Michael A. Covington" <lo**@ai.uga.edu.for.address> wrote in message
news:%2***************@TK2MSFTNGP04.phx.gbl...
> "Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
> news:4c**********************************@d4g2000prg.googlegroups.com...
>> For the OP: there's no such thing as a "text file" or a "binary file"
>> really - it's all a question of interpretation. A file is (at least in
>> the simple case - there may be alternate streams etc) just a sequence
>> of bytes. Any particular sequence of bytes can be treated as binary,
>> or perhaps treated as text depending on which encoding is chosen.
>
> Well said. I recall several mainframe operating systems that make the
> distinction, but in Windows, a file does not have a type. Every file is a
> stream of bytes, and some streams of bytes can be interpreted as text.
>
> Of course files have names, and the names end with extensions, and *by
> convention* many pieces of software assume that the extension gives
> information about the file. But the OS does not keep track of file types.

The OS doesn't define any mandatory file type metadata. But if an
application adds metadata to the file, the OS most certainly keeps track of
it. Of course, it would still not be meaningful as a file type to the OS,
it would be just as much a convention as the file extension.
Dec 13 '07 #5
Alexander Vasilevsky wrote:
> How do I know whether the file is text or binary?

As many others have stated, there is really no way to tell.

If you implement the following logic:

    read the first 1000 bytes of the file
    if number of <LF> > 5 and
       number of <NUL> == 0 and
       frequency of ' '..'~' > 90%
    then
        return text
    else
        return binary
    end if

You will be correct in 95+% of cases for western language text.
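
A possible C# rendering of that logic, as a sketch only. It reads the thresholds as more than 5 line feeds, no NUL bytes, and more than 90% of bytes in the range ' '..'~', which is an interpretation of the pseudocode. The names TextHeuristic and LooksLikeText are invented for illustration.

```csharp
using System;
using System.IO;

static class TextHeuristic
{
    // Applies the three tests to a sample that is already in memory.
    public static bool LooksLikeText(byte[] sample)
    {
        if (sample.Length == 0)
            return false; // nothing to judge

        int lineFeeds = 0, nuls = 0, printable = 0;
        foreach (byte b in sample)
        {
            if (b == 0x0A) lineFeeds++;
            else if (b == 0x00) nuls++;
            else if (b >= 0x20 && b <= 0x7E) printable++;
        }

        double printableShare = (double)printable / sample.Length;
        return lineFeeds > 5 && nuls == 0 && printableShare > 0.90;
    }

    // Reads at most the first 1000 bytes of the file and classifies them.
    public static bool LooksLikeText(string path)
    {
        byte[] buffer = new byte[1000];
        int total = 0;
        using (FileStream fs = File.OpenRead(path))
        {
            int read;
            // Stream.Read may return fewer bytes than requested, so loop.
            while (total < buffer.Length &&
                   (read = fs.Read(buffer, total, buffer.Length - total)) > 0)
            {
                total += read;
            }
        }

        byte[] sample = new byte[total];
        Array.Copy(buffer, sample, total);
        return LooksLikeText(sample);
    }
}
```

One consequence of the line-feed test is that a very short text file (five lines or fewer in the sample) is reported as binary.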

Arne
Dec 13 '07 #6
Tom
I'm just now transitioning from C/C++ structured to C# and am working
with various file types in the learning process. My work in C involved
a lot of binary file I/O as well as a lot of text data file input and
conversion. Your question is one I have worked on recently too but do
not have fully resolved as yet ... but here are some comments:

As others have pointed out ... the files are just streams of bytes and
it is how you are able to interpret them that is the key.

If the file is binary, you'd have to have the specific algorithm to
utilize it ... or be a very good code breaker. WinHex is a good
first-look tool for seeing whether you have any interest in that area.
It takes a little digging to get proficient with it ... but you can
certainly see every bit and various translations as you explore the
file.

If the file is text, the cultural and encoding attributes need to be
dealt with. If you are only working in English ... that simplifies
things greatly. My approach is to open the file in a RichTextBox and
view the first 2000 bytes. In a blink you'll know if it is readable or
gibberish.
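
A rough C# sketch of the "look at the first 2000 bytes" idea, without the UI part. FilePreview and ReadHead are hypothetical names. Showing non-printable bytes as '.' is an assumption borrowed from hex viewers, and only plain English (ASCII) text is handled.

```csharp
using System;
using System.IO;
using System.Text;

static class FilePreview
{
    // Reads up to maxBytes from the stream and returns a displayable string.
    // Printable ASCII, tab, CR and LF are kept; every other byte becomes '.'.
    public static string ReadHead(Stream stream, int maxBytes)
    {
        byte[] buffer = new byte[maxBytes];
        int total = 0, read;
        // Stream.Read may return fewer bytes than requested, so loop.
        while (total < maxBytes &&
               (read = stream.Read(buffer, total, maxBytes - total)) > 0)
        {
            total += read;
        }

        var preview = new StringBuilder(total);
        for (int i = 0; i < total; i++)
        {
            byte b = buffer[i];
            bool keep = (b >= 0x20 && b <= 0x7E)
                        || b == '\t' || b == '\n' || b == '\r';
            preview.Append(keep ? (char)b : '.');
        }
        return preview.ToString();
    }

    // Convenience overload: first 2000 bytes of a file on disk.
    public static string ReadHead(string path)
    {
        using (FileStream fs = File.OpenRead(path))
            return ReadHead(fs, 2000);
    }
}
```

The returned string can be assigned to the Text property of a read-only RichTextBox. Only the first 2000 bytes are read, so the preview stays fast even for very large files.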

As a learning project I am still polishing an enhanced FilePicker
that began with Petzold's Directory TreeView and File ListView. I have
added a RichTextBox in a splitter panel below the tree and list
panels. I set it up as read only to assure no accidental editing
occurs. Select a file in the list and on the same form you get a fast
preview. Easy to recognize as text and also provides enough to usually
determine if it is the file I am targeting.

I'm still trying to get a working understanding of binding and
DataGridView to replace the list with a more feature-packed class;
however, all that comes at the price of speed and footprint. I'm not sure
if it is worth it ... but it sure provides a good focus to use for
learning. Also, I am working on a buffering algorithm to allow fast
viewing of huge data files without loading the entire file. Endless
enhancements are possible and you might find such a project fun.

If you decide to explore WinHex ... make a binary file with some known
values of doubles and various sized ints. Then when examining it you
will know what you are looking for and it becomes a lot easier. Also
open a short little Notepad .txt file and take a look. It's really
pretty interesting.
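
One way to produce such a test file from C#, as a sketch. The class name KnownValues and the particular values are arbitrary choices for illustration. BinaryWriter writes multi-byte values in little-endian order, so the low byte comes first in the hex view.

```csharp
using System;
using System.IO;

static class KnownValues
{
    // Builds 31 bytes: byte, short, int, long and two doubles, in that order.
    public static byte[] Build()
    {
        using (var ms = new MemoryStream())
        {
            using (var writer = new BinaryWriter(ms))
            {
                writer.Write((byte)0x7F);          // offset 0:  7F
                writer.Write((short)0x1234);       // offset 1:  34 12
                writer.Write(0x12345678);          // offset 3:  78 56 34 12
                writer.Write(0x0123456789ABCDEFL); // offset 7:  EF CD AB 89 67 45 23 01
                writer.Write(1.0);                 // offset 15: 00 00 00 00 00 00 F0 3F
                writer.Write(Math.PI);             // offset 23: 8 bytes, IEEE 754 double
            }
            // ToArray still works after the stream has been closed.
            return ms.ToArray();
        }
    }

    // Writes the same bytes to a file for inspection in a hex editor.
    public static void WriteFile(string path)
    {
        File.WriteAllBytes(path, Build());
    }
}
```

Opening the resulting file in a hex editor and finding these patterns at the listed offsets is a quick way to get used to how ints and doubles are laid out.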

Best of Luck.

-- Tom

On Thu, 13 Dec 2007 18:17:17 +0200, "Alexander Vasilevsky"
<ma**@alvas.net> wrote:
> How do I know whether the file is text or binary?
>
> http://www.alvas.net - Audio tools for C# and VB.Net developers + Christmas
> discount
Dec 14 '07 #7
