Bytes | Software Development & Data Engineering Community

Question about file formats

Hi,

I have a DOS program that creates data files which are used in another
program written by the same company. I am trying to figure out how I
can read the data from those files.

When I open the files in Notepad they look like crap, so I'm assuming they
are written in binary?

Now, assuming they are written in binary, are there any methods I can use to
try to determine the format?

I opened one in a hex editor and I see 4 rows of numbers (like 01 00 20 02
00 20 03 etc.) on the left and a bunch of dots in the view on the right. From
my limited knowledge I'm guessing that I'm looking at the file in
hexadecimal on the left, and that the right side shows non-printable or
non-text characters, which show up as dots.

So I'm curious how people go about determining file formats. Is it mostly
guesswork or is there a more strategic approach I can use?

Thanks a lot!!

Btw please recommend a group I can ask this in if it doesn't apply here.

Jul 22 '05 #1
James wrote:

> Hi,
>
> I have a DOS program that creates data files which are used in another
> program written by the same company. I am trying to figure out how I
> can read the data from those files.
>
> When I open the files in Notepad they look like crap, so I'm assuming they
> are written in binary?

Reasonable assumption.

> Now, assuming they are written in binary, are there any methods I can use to
> try to determine the format?
>
> I opened one in a hex editor and I see 4 rows of numbers (like 01 00 20 02
> 00 20 03 etc.) on the left and a bunch of dots in the view on the right. From
> my limited knowledge I'm guessing that I'm looking at the file in
> hexadecimal on the left, and that the right side shows non-printable or
> non-text characters, which show up as dots.

Right. Most hex editors present the data that way.
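
For what it's worth, that two-column view is easy to reproduce yourself. The following is only a minimal Python sketch (the name hex_dump and the sample bytes are made up for illustration, not taken from any particular hex editor): it prints an offset, the bytes in hex, and the same bytes as ASCII, with a dot for anything outside the printable range 0x20 to 0x7E.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Format bytes roughly like a hex editor: offset, hex column, ASCII column."""
    pad = width * 3 - 1  # width of a full hex column ("xx " per byte, minus last space)
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Printable ASCII is 0x20..0x7E; every other byte is shown as a dot.
        text_part = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{pad}}  {text_part}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative bytes only: the values quoted in the question plus some text.
    sample = bytes([0x01, 0x00, 0x20, 0x02, 0x00, 0x20, 0x03]) + b"James"
    print(hex_dump(sample))
```

Note that 0x20 is the ASCII space, so in the bytes quoted above it would show up as a blank on the right-hand side rather than as a dot.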

> So I'm curious how people go about determining file formats. Is it mostly
> guesswork or is there a more strategic approach I can use?

Ask the company for documentation of the file format.
If they won't give you that information, then it is ... guesswork.

Usually you start like this:

  1. Let the program create a data file with minimal data (no user data
     at all if possible). Name that file 'Empty'.
  2. Let the program create a second data file with a little more user
     data. Compare that file with 'Empty' and look for the user data
     (the things that change). If your user data contains some text, you
     will most likely find that text somewhere in the file.
  3. Other parts of the file may have changed as well. They could be
     organizational entries, such as where in the file the text section
     starts or how many entries there are (e.g. a byte that changes
     from 0 to 1). Try to make sense of those.
  4. Try various other data files, but start with small ones. There is
     no sense in analyzing a multi-MB data file; you will never figure
     out how all those bytes are connected.
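
The comparison against 'Empty' can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names diff_ranges and find_all, the second file name 'OneRecord', and the sample bytes are all hypothetical. It also compares byte by byte at the same offsets, so if the program inserts bytes rather than overwriting them, everything after the insertion point will be reported as changed.

```python
from pathlib import Path


def diff_ranges(a: bytes, b: bytes):
    """Return (start, end) offset pairs (end exclusive) where a and b differ."""
    ranges = []
    start = None
    common = min(len(a), len(b))
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i  # a run of differing bytes begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended just before offset i
            start = None
    if start is not None:
        ranges.append((start, common))
    # Extra bytes at the end of the longer input count as one more range.
    if len(a) != len(b):
        ranges.append((common, max(len(a), len(b))))
    return ranges


def find_all(data: bytes, needle: bytes):
    """Return every offset at which needle occurs in data."""
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets


if __name__ == "__main__":
    # Made-up stand-ins for the two files; with real files you would use
    # Path("Empty").read_bytes() and Path("OneRecord").read_bytes().
    empty = b"\x00\x00" + b"    "
    one_record = b"\x01\x00" + b"Jim "
    print(diff_ranges(empty, one_record))  # offsets that changed
    print(find_all(one_record, b"Jim"))    # where the entered text landed
```

In this made-up sample the first byte goes from 0 to 1 (perhaps an entry count) and the text entered by the user appears at offset 2, which is the kind of pattern the steps above tell you to look for.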

Good luck. It can take days or weeks to analyze a binary data format.

--
Karl Heinz Buchegger
kb******@gascad.at
Jul 22 '05 #2
"James" <j@j.net> wrote:
> Hi,
>
> I have a DOS program that creates data files which are used in another
> program written by the same company. I am trying to figure out how I
> can read the data from those files.
>
> When I open the files in Notepad they look like crap, so I'm assuming they
> are written in binary?

Something other than ASCII. Anything other than plain ASCII text can be
called binary.
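
If you want to check that rather than eyeball it in Notepad, a rough test is whether every byte is a printable ASCII character or common whitespace. A minimal Python sketch follows; the function name and the choice of allowed control characters (tab, line feed, carriage return) are my own assumptions, not a formal definition of "text".

```python
def looks_like_ascii_text(data: bytes) -> bool:
    """True if every byte is printable ASCII (0x20..0x7E) or tab/LF/CR."""
    allowed = set(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}
    return all(b in allowed for b in data)


if __name__ == "__main__":
    print(looks_like_ascii_text(b"Hello, world\r\n"))
    print(looks_like_ascii_text(bytes([0x01, 0x00, 0x20, 0x02])))
```

A file that fails this test is not necessarily an unknown format (it could be text in another encoding), but it is a quick first filter.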

> Now, assuming they are written in binary, are there any methods I can use to
> try to determine the format?

No. The only foolproof way is to examine the source code of the
program that wrote it, or to examine documentation written by somebody
who knew that code.

> I opened one in a hex editor and I see 4 rows of numbers (like 01 00 20 02
> 00 20 03 etc.) on the left and a bunch of dots in the view on the right. From
> my limited knowledge I'm guessing that I'm looking at the file in
> hexadecimal on the left, and that the right side shows non-printable or
> non-text characters, which show up as dots.

That's right, that's how reasonable hex editors work: a hex display
side by side with an ASCII display. Dots are usually displayed on the
ASCII side for unprintable bytes.

> So I'm curious how people go about determining file formats. Is it mostly
> guesswork or is there a more strategic approach I can use?

As stated above, guesswork, unless you can find source code or
documentation.

--
Tim Slattery
Sl********@bls.gov
Jul 22 '05 #3
