Tanuki wrote:
Hi All:
I encountered a programming problem recently. I need to read a binary
file and translate the binary data into useful information. I
have the format at hand, e.g. 1st byte = ID, next 4 bytes (int) =
serial number, etc.
The first problem is the Big-Endian/Little-Endian problem. I can tell
whether the format is big- or little-endian, but I got confused as to
how to decode the data.
E.g., if I know I am on little-endian, and I have an integer whose binary
representation is
20 03 00 00, then what is the equivalent decimal?
Assuming that by "binary" you actually mean "hexadecimal,"
this value is decimal 800 (0x320). How did I get there? In
Little-Endian order the first byte is the least significant, so
the value is
0x20 + 0x03*0x100 + 0x00*0x10000 + 0x00*0x1000000
or equivalently
((0x00 * 0x100 + 0x00) * 0x100 + 0x03) * 0x100 + 0x20
or equivalently
0x20 + (0x03<<8) + (0x00<<16) + (0x00<<24)
or equivalently
(((((0x00 << 8) + 0x00) << 8) + 0x03) << 8) + 0x20
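In C, the shift form above might be sketched as follows. The function name le32_to_u32 is made up for this illustration, and the sketch assumes the field is an unsigned 32-bit quantity stored as four 8-bit bytes, least significant first.

```c
#include <stdint.h>

/* Assemble a 32-bit unsigned value from four bytes stored
 * least-significant byte first (Little-Endian).
 * Hypothetical helper name, chosen for this example only. */
uint32_t le32_to_u32(const unsigned char *p)
{
    /* Widen each byte before shifting so no bits are lost. */
    return  (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Called on the bytes 20 03 00 00 this yields 800. Because it works on individual bytes rather than reading an int directly, it gives the same answer whatever the byte order of the machine running it.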
The next problem is that there is also floating-point data. How can I
infer the floating-point value from a binary representation, i.e. what
are the digits before the decimal point and those after the decimal
point?
Without more knowledge of the representation, you're stuck.
In the integer case you already knew a good deal about what the
representation looked like: you knew it consisted of four eight-
bit bytes arranged in Little-Endian order. (Some questions still
remain, of course: for example, what do negative integers look
like?) But in the floating-point case, all you've told us is
that you know the numbers are floating-point -- but if you know
nothing about the representation, there's no way to decode it.
The best thing to do is consult the documentation for the
system that wrote the file, and see whether it tells you how
floating-point numbers are stored.
Failing that, you could try inspecting the data in the file
and seeing whether it "looks like" a well-known floating-point
format. The commonest such formats are surely the IEEE 754 single-
and double-precision binary formats (32 and 64 bits wide); try
interpreting the bits according to those formats and see whether
the values you get "make sense" for the application at hand. If
the data doesn't look like IEEE, you could also try the various VAX
floating-point formats, or the IBM S/360 base-16 formats.
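To try the IEEE single-precision guess by hand, something like the sketch below may help. It assumes you have already assembled the four bytes into a 32-bit unsigned value in the file's byte order, and it assumes the layout is 1 sign bit, 8 exponent bits (bias 127) and 23 fraction bits. The name ieee754_single_decode is invented for this example.

```c
#include <stdint.h>

/* Interpret a 32-bit pattern as IEEE 754 single precision.
 * Stores the value in *out and returns 0 for finite numbers;
 * returns -1 for infinities and NaNs (exponent field all ones).
 * Hypothetical helper, not part of any library. */
int ieee754_single_decode(uint32_t bits, double *out)
{
    int      sign     = (int)((bits >> 31) & 1u);
    int      exponent = (int)((bits >> 23) & 0xFFu);
    uint32_t fraction = bits & 0x7FFFFFu;
    double   value;
    int      e, i;

    if (exponent == 0xFF)
        return -1;                  /* infinity or NaN */

    if (exponent == 0) {            /* zero or subnormal: no hidden 1 bit */
        value = (double)fraction;
        e = -126 - 23;
    } else {                        /* normal: restore the hidden 1 bit */
        value = (double)(fraction | 0x800000u);
        e = exponent - 127 - 23;
    }

    /* Scale the integer significand by 2^e, one power of two at a time. */
    for (i = 0; i < e; i++)
        value *= 2.0;
    for (i = 0; i > e; i--)
        value /= 2.0;

    *out = sign ? -value : value;
    return 0;
}
```

For instance, the pattern 0x44480000 decodes to 800.0. If a field you believe holds a temperature or a price decodes to something like 1e-38 or 3e+27, the format guess (or the byte order) is probably wrong.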
Of course, if the numbers were written out on the same system
that's reading them back again, they'll be in some native format
supported by that system. If you can figure out which one, you
needn't sweat the details: just read the bits into an object of
the proper type, and you're done.
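For that native-format case, a minimal sketch might look like this. It assumes the field is a C float written by the same kind of system, and that you already have its bytes in a buffer; the function name native_float_from_bytes is hypothetical.

```c
#include <string.h>

/* Copy the raw bytes of a float, stored in this machine's own
 * format and byte order, into a float object.
 * Hypothetical helper name, for illustration only. */
float native_float_from_bytes(const unsigned char *p)
{
    float f;
    memcpy(&f, p, sizeof f);   /* no conversion, just a byte copy */
    return f;
}
```

If the value sits at a known position in the file, calling fread(&f, sizeof f, 1, fp) on a float f does the same job without the intermediate buffer. Either way this only works while the writer's format really does match the reader's.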
--
Er*********@sun.com