Tanuki wrote:

Hi All:

I ran into a programming problem recently. I need to read a binary
file and translate the binary data into useful information. I
have the format at hand, like 1st byte = ID, next 4 bytes (int) =
serial number, etc.

The first problem is the Big Endian / Little Endian problem. I can work
out whether the format is big or little endian, but I get confused as to
how to decipher the data.

E.g. if I know the data is little endian, and I have an integer whose
binary representation is
20 03 00 00, then what is the equivalent decimal?

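Assuming that by "binary" you actually mean "hexadecimal,"
this value is decimal 800. How did I get there? Little-Endian
means the first byte is the least significant, so each later byte
is worth 0x100 times as much as the one before it: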

0x20 + 0x03*0x100 + 0x00*0x10000 + 0x00*0x1000000

or equivalently

((0x00 * 0x100 + 0x00) * 0x100 + 0x03) * 0x100 + 0x20

or equivalently

0x20 + (0x03<<8) + (0x00<<16) + (0x00<<24)

or equivalently

(((((0x00 << 8) + 0x00) << 8) + 0x03) << 8) + 0x20
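
The shift-based forms carry over to C almost unchanged. Here is a minimal sketch, assuming the four bytes have already been read into an unsigned char buffer and that each holds an eight-bit value; the names read_u32_le and read_u32_be are made up for illustration, not taken from any library:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble an unsigned 32-bit value from four bytes stored
   least significant byte first (Little-Endian). */
uint32_t read_u32_le(const unsigned char *buf)
{
    assert(buf != NULL);
    return  (uint32_t)buf[0]
         | ((uint32_t)buf[1] << 8)
         | ((uint32_t)buf[2] << 16)
         | ((uint32_t)buf[3] << 24);
}

/* The same four bytes taken most significant byte first
   (Big-Endian), for comparison. */
uint32_t read_u32_be(const unsigned char *buf)
{
    assert(buf != NULL);
    return ((uint32_t)buf[0] << 24)
         | ((uint32_t)buf[1] << 16)
         | ((uint32_t)buf[2] << 8)
         |  (uint32_t)buf[3];
}
```

Given the bytes 20 03 00 00, read_u32_le yields 800, while read_u32_be yields 0x20030000. Because the value is built arithmetically from individual bytes, the result does not depend on the byte order of the machine running the code.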

The next problem is that there is also floating point data. How can I
infer the floating point data from a binary representation, like what
are the numbers before the decimal point and those after the decimal
point?

Without more knowledge of the representation, you're stuck.
In the integer case you already knew a good deal about what the
representation looked like: you knew it consisted of four eight-bit
bytes arranged in Little-Endian order. (Some questions still
remain, of course: for example, what do negative integers look
like?) But in the floating-point case, all you've told us is
that the numbers are floating-point, and if you know nothing
more about the representation, there's no way to decode it.
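
On the negative-integer question: if the format turns out to use two's complement (an assumption on my part; only the format description can confirm it), the signed value can be recovered from the same four bytes. A sketch, with read_i32_le being a made-up name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode four Little-Endian bytes as a signed 32-bit integer,
   ASSUMING the file stores negatives in two's complement. */
int32_t read_i32_le(const unsigned char *buf)
{
    uint32_t u;

    assert(buf != NULL);
    u =  (uint32_t)buf[0]
      | ((uint32_t)buf[1] << 8)
      | ((uint32_t)buf[2] << 16)
      | ((uint32_t)buf[3] << 24);

    if (u <= (uint32_t)INT32_MAX)
        return (int32_t)u;          /* sign bit clear: value as-is */

    /* Sign bit set: the value is u - 2^32, computed here without
       relying on implementation-defined conversions. */
    return -(int32_t)(UINT32_MAX - u) - 1;
}
```

With this decoding, FF FF FF FF comes out as -1. A file written with sign-magnitude or some other scheme would need different handling.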

The best thing to do is consult the documentation for the
system that wrote the file, and see whether it tells you how
floating-point numbers are stored.

Failing that, you could try inspecting the data in the file
and seeing whether it "looks like" a well-known floating-point
format. The commonest such formats are surely the IEEE single-
and double-precision binary floating-point formats; try interpreting
the bits according to those formats and see whether the values
you get "make sense" for the application at hand. If the data doesn't
look like IEEE, you could also try the various VAX floating-point
formats, or the IBM S/360 base-16 formats.
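
To try the IEEE single-precision guess, you can pick the bits apart by hand: one sign bit, eight exponent bits biased by 127, and 23 fraction bits. The sketch below assumes the four bytes are stored Little-Endian, which is only a guess about your file; decode_ieee_single_le and scale_by_pow2 are made-up names:

```c
#include <assert.h>
#include <math.h>    /* only for the NAN and INFINITY macros */
#include <stddef.h>
#include <stdint.h>

/* Multiply x by 2 raised to the power e, using plain arithmetic. */
static double scale_by_pow2(double x, int e)
{
    for (; e > 0; e--) x *= 2.0;
    for (; e < 0; e++) x /= 2.0;
    return x;
}

/* Interpret four Little-Endian bytes as an IEEE 754
   single-precision value, without assuming anything about
   the host machine's own float format. */
double decode_ieee_single_le(const unsigned char *buf)
{
    uint32_t bits, sign, exponent, fraction;
    double magnitude;

    assert(buf != NULL);
    bits =  (uint32_t)buf[0]
         | ((uint32_t)buf[1] << 8)
         | ((uint32_t)buf[2] << 16)
         | ((uint32_t)buf[3] << 24);

    sign     = bits >> 31;
    exponent = (bits >> 23) & 0xFFu;
    fraction = bits & 0x7FFFFFu;

    if (exponent == 0xFFu)          /* infinity or NaN */
        magnitude = fraction ? NAN : INFINITY;
    else if (exponent == 0)         /* zero or subnormal: no hidden 1 bit */
        magnitude = scale_by_pow2((double)fraction, -149);
    else                            /* normal: restore the hidden 1 bit */
        magnitude = scale_by_pow2((double)(fraction | 0x800000u),
                                  (int)exponent - 150);

    return sign ? -magnitude : magnitude;
}
```

The bytes 00 00 48 44 decode to 800.0 this way. The bytes 20 03 00 00 from the integer example decode to a subnormal value of roughly 1.1e-42, which is the kind of result that probably does not "make sense" and suggests the field is not an IEEE single after all.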

Of course, if the numbers were written out on the same system
that's reading them back again, they'll be in some native format
supported by that system. If you can figure out which one, you
needn't sweat the details: just read the bits into an object of
the proper type, and you're done.
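
In that same-system case the reading code is about as short as it gets. A sketch, assuming the field was written as a C float and the file was opened in binary mode; read_native_float is a made-up name:

```c
#include <assert.h>
#include <stdio.h>

/* Read one float stored in the host's own representation.
   Returns 1 on success, 0 on end-of-file or read error. */
int read_native_float(FILE *fp, float *out)
{
    assert(fp != NULL && out != NULL);
    return fread(out, sizeof *out, 1, fp) == 1;
}
```

This works only when writer and reader agree on the type's size, byte order, and format. If the file came from a different kind of machine, decode the bytes explicitly instead.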

--

Er*********@sun.com