Julián Albo wrote:
Quote:
The best way to read binary files is to use an unsigned char buffer and
convert from this buffer to the structure you use in the program for that
data. You make the conversion as complex as your portability goals require,
considering endianness, the type of sign encoding used...
A bit more code to write at first, but avoids the need to worry about
padding and many other issues.
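As a minimal sketch of the buffer-to-structure conversion described above: the record layout here (a 16-bit id followed by a 32-bit two's complement balance, little-endian, no padding) and every name in the code are assumptions made up for illustration, not anything from the original post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* In-memory form used by the program (hypothetical). */
struct record {
    uint16_t id;
    int32_t  balance;
};

/* Assemble a little-endian 16-bit value from two bytes. */
static uint16_t get_u16_le(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Assemble a little-endian 32-bit value from four bytes. */
static uint32_t get_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Interpret a 32-bit pattern as two's complement without relying on
   the implementation-defined unsigned-to-signed conversion. */
static int32_t s32_from_bits(uint32_t u)
{
    if (u <= 0x7FFFFFFFu)
        return (int32_t)u;
    return (int32_t)(-(int64_t)(0xFFFFFFFFu - u) - 1);
}

/* Convert 6 raw bytes (assumed layout: id at 0, balance at 2). */
void decode_record(const unsigned char *buf, struct record *out)
{
    assert(buf != NULL && out != NULL);
    out->id      = get_u16_le(buf);
    out->balance = s32_from_bits(get_u32_le(buf + 2));
}
```

Nothing here depends on the host's byte order or on how the compiler pads struct record, which is the point of going through the unsigned char buffer.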
To clarify, the converting code still needs to worry about padding in
the byte stream whenever the source wrote entire structs.
I suggest making it look like a stream filter reading chars from an
underlying stream so you won't ever deal with the buffer and boundary
conditions. Each function to read a particular type needs to a) skip
padding bytes that the source would have inserted to align that type;
b) read and assemble the bytes of the object; c) perhaps do something
really hard for floating-point data using a different representation,
or for bitfield data; d) pick up the value as the correct type and
return it. Sometimes you'll find shortcuts, as when 32-bit data needs
only 16-bit alignment and so can be fetched with two calls to the 16-bit
fetcher.
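Here is a rough sketch of such a filter, covering steps a), b) and d) for unsigned integers only. I am assuming a big-endian source that aligns both 16-bit and 32-bit data on 2-byte boundaries; struct foreign_in and the read_* names are hypothetical, and the float and bitfield cases of step c) are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Filter state: the underlying stream, the number of bytes consumed
   so far (needed to compute padding), and a sticky error flag. */
struct foreign_in {
    FILE  *fp;
    size_t offset;
    int    failed;
};

/* Fetch one byte from the underlying stream; returns 0 after EOF. */
static int next_byte(struct foreign_in *in)
{
    int c = getc(in->fp);
    if (c == EOF) {
        in->failed = 1;
        return 0;
    }
    in->offset++;
    return c;
}

/* (a) Skip the padding the source inserted to align the next object. */
static void skip_to_alignment(struct foreign_in *in, size_t align)
{
    assert(align != 0);
    while (in->offset % align != 0 && !in->failed)
        (void)next_byte(in);
}

uint8_t read_u8(struct foreign_in *in)
{
    return (uint8_t)next_byte(in);
}

/* (b) and (d): assemble two big-endian bytes, return as uint16_t. */
uint16_t read_u16(struct foreign_in *in)
{
    unsigned hi, lo;
    skip_to_alignment(in, 2);
    hi = (unsigned)next_byte(in);
    lo = (unsigned)next_byte(in);
    return (uint16_t)((hi << 8) | lo);
}

/* Shortcut: the assumed source aligns 32-bit data on 2 bytes only,
   so two calls to the 16-bit fetcher do the whole job. */
uint32_t read_u32(struct foreign_in *in)
{
    uint32_t hi = read_u16(in);
    uint32_t lo = read_u16(in);
    return (hi << 16) | lo;
}
```

The caller never sees a buffer: it asks for values in the order the source wrote them and checks the failed flag once at the end.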
I would add separate functions to mark the beginning and end of each
struct as there is additional padding there not related to the type of
the next member. This will require you to analyze the struct so you
can pass in the alignment the source machine will have assumed for the
struct as a whole. At least you won't have to make every single pad
explicit.
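A small sketch of the begin/end markers, reading from a memory buffer to keep it short. The source struct (a 32-bit id plus an 8-bit tag, little-endian, aligned to 4 as a whole) and all the names are assumptions for illustration; the reader returns 0 past the end of the data instead of reporting an error.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reader state over an in-memory copy of the foreign data. */
struct foreign_in {
    const unsigned char *data;
    size_t size;
    size_t offset;
};

/* Advance the offset to the next multiple of align. */
static void align_to(struct foreign_in *in, size_t align)
{
    size_t rem;
    assert(align != 0);
    rem = in->offset % align;
    if (rem != 0)
        in->offset += align - rem;
}

/* Fetch one byte, or 0 when the data is exhausted. */
static unsigned get_byte(struct foreign_in *in)
{
    return in->offset < in->size ? in->data[in->offset++] : 0u;
}

/* Leading padding: the struct starts on its own alignment. */
void begin_struct(struct foreign_in *in, size_t struct_align)
{
    align_to(in, struct_align);
}

/* Trailing padding: the struct's size is a multiple of its alignment. */
void end_struct(struct foreign_in *in, size_t struct_align)
{
    align_to(in, struct_align);
}

uint8_t read_u8(struct foreign_in *in)
{
    return (uint8_t)get_byte(in);
}

uint32_t read_u32_le(struct foreign_in *in)
{
    uint32_t v;
    align_to(in, 4);
    v  = (uint32_t)get_byte(in);
    v |= (uint32_t)get_byte(in) << 8;
    v |= (uint32_t)get_byte(in) << 16;
    v |= (uint32_t)get_byte(in) << 24;
    return v;
}

/* Host-side form of the hypothetical foreign struct. */
struct item {
    uint32_t id;
    uint8_t  tag;
};

struct item read_item(struct foreign_in *in)
{
    struct item it;
    begin_struct(in, 4);   /* alignment worked out by analyzing the struct */
    it.id  = read_u32_le(in);
    it.tag = read_u8(in);
    end_struct(in, 4);     /* skips the 3 trailing pad bytes */
    return it;
}
```

Without end_struct, a char that follows the struct would be read from the struct's trailing padding, because the per-type alignment of a char never forces a skip.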
Once, when faced with too much foreign data, I wrote functions to take
a dense character string description of a struct like "ssslccl" and
convert to and from the foreign form, knowing the padding requirements
of both forms.
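I no longer have that code, so what follows is only a guess at the layout half of the idea: given a description string and the size and alignment rules of one machine, compute each member's offset and the total struct size. The letter codes (c = char, s = short, l = long) and all the names are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Size and alignment of one basic type on a given machine. */
struct type_rule {
    size_t size;
    size_t align;
};

/* Rules for the three type codes used in a description string. */
struct layout_rules {
    struct type_rule c, s, l;
};

static const struct type_rule *rule_for(const struct layout_rules *r,
                                        char code)
{
    switch (code) {
    case 'c': return &r->c;
    case 's': return &r->s;
    case 'l': return &r->l;
    default:  return NULL;
    }
}

/* Round n up to the next multiple of align. */
static size_t round_up(size_t n, size_t align)
{
    assert(align != 0);
    return (n + align - 1) / align * align;
}

/* Fill offsets[i] with the offset of member i and return the total
   size including trailing padding, or 0 for an unknown type code.
   offsets must have room for one entry per character of desc. */
size_t layout_struct(const char *desc, const struct layout_rules *r,
                     size_t *offsets)
{
    size_t pos = 0, max_align = 1, i;

    for (i = 0; desc[i] != '\0'; i++) {
        const struct type_rule *t = rule_for(r, desc[i]);
        if (t == NULL)
            return 0;
        pos = round_up(pos, t->align);   /* padding before the member */
        offsets[i] = pos;
        pos += t->size;
        if (t->align > max_align)
            max_align = t->align;
    }
    return round_up(pos, max_align);     /* padding at the end */
}
```

Running it once with the foreign rules and once with the native rules gives the two offset tables, and a converter can then copy member by member between them. With 4-byte longs aligned to 4, "ssslccl" comes out at 20 bytes; with longs aligned to 2, it comes out at 16.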
I consider this a defect in the language. I should be able to declare
the interface properties of the struct (padding, byte order, FP format)
in a standard way and let the compiler choose to implement it or reject
it or maybe half-implement it so special functions could be applied to
the members that can't be accessed normally. We do it anyway for device
drivers with memory-mapped I/O and for MMU structures, but we end up
fighting the compiler every step of the way.