ascii or binary

Hello NG,

I am making a small tool which reads files on the hard disk and saves a lot
of information about them in a db. While reading a file, I want to figure
out what type it is, ASCII or binary. I could look at the extension, but
there are millions of them and they are not very reliable. Also, there could
be embedded binary information in the file (e.g. pdf) that might make that
file a mixed type.

I want to know if there is a way, maybe some kind of signature, to find out
what kind of file it is in C# (or any other language that can be used with cs).

--
Thx
po
Nov 16 '05 #1
Hi Pohihihi,

As far as I know there is no good way to tell text and binary files apart.
You could check every byte, and if none are above 127 chances are it's
an ASCII text file, but even if there are bytes above 127 it could still
be an extended ASCII text file.
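
For illustration, the byte scan described above could look roughly like this. It is only a sketch: the class and method names (AsciiCheck, IsSevenBitAscii) are made up for this example, and a "true" result only says that every byte fits in 7 bits, not that the content is meaningful text.

```csharp
using System.IO;

static class AsciiCheck
{
    // Hypothetical helper: returns true when every byte in the stream
    // is in the 7-bit ASCII range (0..127).
    public static bool IsSevenBitAscii(Stream stream)
    {
        var buffer = new byte[4096];
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                if (buffer[i] > 127)
                    return false; // high bit set: not plain ASCII
            }
        }
        return true; // note: an empty stream also ends up here
    }
}
```

You would call it with something like File.OpenRead(path) inside a using block. A file that fails the check may still be text in an 8-bit code page or in UTF-8, so this can only rule files in, not out.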

The best bet would probably be to keep a list of known text file
extensions like .bat .inf .reg .txt, but even then you can't be sure
if some program decides to make a binary .txt

Maybe you could tap into the registry and get a descriptive line about
a registered file extension instead.
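
A rough sketch of that registry lookup, assuming Windows and the usual layout where the default value under HKEY_CLASSES_ROOT\.ext names a ProgID whose own default value holds the description. ExtensionInfo and GetDescription are names invented for this example, and what comes back depends entirely on what is registered on the machine.

```csharp
using Microsoft.Win32;

static class ExtensionInfo
{
    // Hypothetical helper, Windows only. Returns the registered description
    // for an extension such as ".txt", or null if nothing is registered.
    public static string GetDescription(string extension)
    {
        using (RegistryKey extKey = Registry.ClassesRoot.OpenSubKey(extension))
        {
            if (extKey == null) return null;

            // The default (unnamed) value normally holds the ProgID.
            string progId = extKey.GetValue(null) as string;
            if (string.IsNullOrEmpty(progId)) return null;

            using (RegistryKey progKey = Registry.ClassesRoot.OpenSubKey(progId))
            {
                // The ProgID key's default value is the descriptive line.
                return progKey == null ? null : progKey.GetValue(null) as string;
            }
        }
    }
}
```

This only tells you what the extension claims to be, so it has the same weakness as any extension list: it says nothing about the actual bytes in the file.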
--
Happy Coding!
Morten Wennevik [C# MVP]
Nov 16 '05 #2
A fairly reliable method I have used in the past is to open the file for
binary access, and if any of the bytes are zero, then it ISN'T a text file.
If it contains NO zeros in the whole file, it's probably a text file.
Exceptions to this are *very* rare.

So just have a method IsTextFile, and start to read it using a FileStream in
a using block, and if any of the bytes are zero, return false immediately,
otherwise return true if it gets to the end of the loop.
Nov 16 '05 #3
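
The IsTextFile method described in post #3 might be sketched like this. The method name comes from the post; the class name TextFileProbe and the exact shape of the loop are assumptions for illustration, not the poster's actual code.

```csharp
using System.IO;

static class TextFileProbe
{
    // Returns false as soon as a zero byte is found, true if the end of
    // the file is reached without seeing one.
    public static bool IsTextFile(string path)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            int b;
            // ReadByte returns -1 at end of file; FileStream buffers
            // internally, so this does not hit the disk once per byte.
            while ((b = stream.ReadByte()) != -1)
            {
                if (b == 0)
                    return false;
            }
        }
        return true;
    }
}
```

Note that an empty file is reported as text, and a large text file is read to the end before the method returns true. Capping the scan at the first few kilobytes would be a possible trade-off if speed matters more than certainty.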
Bonj wrote:
A fairly reliable method I have used in the past is to read it for
binary access, and if any of the bytes are zero, then it ISN'T a text
file. If it contains NO zeros in the whole file, it's probable that
it's a text file. Exceptions to this are *very* rare.

So just have a method IsTextFile, and start to read it using a
FileStream in a using block, and if any of the bytes are zero, return
false immediately, otherwise return true if it gets to the end of the
loop.

But this will identify only ASCII files as text; Unicode files will probably
be detected as binary! When characters are stored as two bytes (as in UTF-16),
it is very likely that one of them is zero.
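
One possible way around this, not something suggested in the thread itself, is to look for a byte order mark (BOM) at the start of the file before running the zero-byte test. A minimal sketch, with BomProbe and DetectBom as made-up names:

```csharp
static class BomProbe
{
    // Hypothetical helper: returns a label for the byte order mark found at
    // the start of the data, or null when there is none.
    public static string DetectBom(byte[] head)
    {
        if (head.Length >= 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF)
            return "UTF-8";
        // UTF-32 LE starts with FF FE as well, so test it before UTF-16 LE.
        if (head.Length >= 4 && head[0] == 0xFF && head[1] == 0xFE && head[2] == 0 && head[3] == 0)
            return "UTF-32 LE";
        if (head.Length >= 4 && head[0] == 0 && head[1] == 0 && head[2] == 0xFE && head[3] == 0xFF)
            return "UTF-32 BE";
        if (head.Length >= 2 && head[0] == 0xFF && head[1] == 0xFE)
            return "UTF-16 LE";
        if (head.Length >= 2 && head[0] == 0xFE && head[1] == 0xFF)
            return "UTF-16 BE";
        return null;
    }
}
```

Pass it the first four bytes of the file (fewer if the file is shorter). If a BOM is found the file can be treated as Unicode text and the zero-byte test skipped. Unicode files saved without a BOM are not caught by this, so it narrows the problem rather than solving it.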

Hans Kesting
Nov 16 '05 #4
