reading binary text

hello,
i was trying to use the fread function on SunOS and ran into some
trouble.
i made a simple test as follows:

i'm trying to read in a binary file (generated from a fortran code)
that contains the following three floating-point numbers:

1.0 2.0 3.0

in my c code, i first declare an array:

float array[10];

and my fread line looks like this:

fread(array, sizeof(float), 3, input_file);

now when i print out the contents of "array" on screen:

i=3;

for (j=0; j<i; j++) {
printf("%f\n", array[j]);
}

i get

0.0
1.0
2.0

why is the first number 0.0 instead of 1.0?

any input is greatly appreciated!
thanks!
yeow
Nov 13 '05 #1
"Yeow" <ys***@mtu.edu> wrote in message
news:64*************************@posting.google.com...
Re: reading binary text
That's an oxymoron. Streams can be set to one
of two 'modes': text or binary. A text stream
will cause the OS to do any necessary translations
from the stored representation to the native representation
(e.g. CR/LF to/from '\n'). A binary stream reads and
stores bytes 'as is'.
hello,
i was trying to use the fread function on SunOS
If you're using a compliant implementation, the host
OS is irrelevant.
and ran into some
trouble.
i made a simple test as follows:

i'm trying to read in a binary file (generated from a fortran code)
that contains the following three floating-point numbers:

1.0 2.0 3.0
Stop right there. There's no requirement that the binary
representation of floating point values be the same for
C and FORTRAN. They might be, they might not.

Is the above just your way of expressing those values
in this message, or are those characters actually stored
in the file? If the latter, it's not a 'binary file',
in which case you need to read with formatted input
functions such as 'fscanf()'.
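
If the file really does hold the characters themselves, a formatted read along these lines should do. This is only a sketch: `read_text_floats` is a made-up helper name, and it assumes the values are separated by whitespace.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to `max` whitespace-separated numbers
 * from a text stream into `dst`. Returns how many were converted. */
size_t read_text_floats(FILE *fp, float *dst, size_t max)
{
    size_t n = 0;

    /* fscanf returns the number of items converted, so 1 means success. */
    while (n < max && fscanf(fp, "%f", &dst[n]) == 1)
        n++;
    return n;
}
```

A return value smaller than expected means the stream ended early or contained something that is not a number.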

in my c code, i first declare an array:

float array[10];

and my fread line looks like this:

fread(array, sizeof(float), 3, input_file);

now when i print out the contents of "array" on screen:

i=3;

for (j=0; j<i; j++) {
printf("%f\n", array[j]);
}

i get

0.0
1.0
2.0

why is the first number 0.0 instead of 1.0?


Either that's really the value in the file (if the
floating point format is the same for your C and FORTRAN
implementations), or the representations are not the
same, or the file is really text, or your program has a bug
-- which we can't find without seeing the code.

Issues like this are why it's strongly advised to transport data
between applications and systems using text instead
of a native binary representation.

-Mike
Nov 13 '05 #2
"Mike Wahler" <mk******@mkwahler.net> wrote in message
news:eH*****************@newsread3.news.pas.earthlink.net...
"Yeow" <ys***@mtu.edu> wrote in message

in my c code, i first declare an array:

float array[10];

and my fread line looks like this:

fread(array, sizeof(float), 3, input_file);


Are you checking the return value from 'fread()'
(as any good program should do)? This value could
possibly shed some light on the problem.
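
A minimal sketch of such a check, assuming the stream was opened in binary mode. The helper name `read_floats` is invented for illustration:

```c
#include <stdio.h>

/* Hypothetical helper: reads up to `count` floats from `fp` into `dst`.
 * Returns the number of complete floats read and reports a short read. */
size_t read_floats(FILE *fp, float *dst, size_t count)
{
    size_t got = fread(dst, sizeof *dst, count, fp);

    if (got < count) {
        /* fread does not say why it stopped; ferror/feof do. */
        if (ferror(fp))
            fprintf(stderr, "read error after %lu of %lu items\n",
                    (unsigned long)got, (unsigned long)count);
        else if (feof(fp))
            fprintf(stderr, "end of file after %lu of %lu items\n",
                    (unsigned long)got, (unsigned long)count);
    }
    return got;
}
```

Only the first `got` elements of the array hold data from the file; the rest are untouched.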

-Mike
Nov 13 '05 #3
ys***@mtu.edu (Yeow) wrote:
hello,
i was trying to use the fread function on SunOS and ran into some
trouble.
i made a simple test as follows:

i'm trying to read in a binary file (generated from a fortran code)
that contains the following three floating-point numbers:

1.0 2.0 3.0

in my c code, i first declare an array:

float array[10];

and my fread line looks like this:

fread(array, sizeof(float), 3, input_file);
Who or what guarantees that the representations of floating point
numbers in the binary file and in your C implementation match?

Data exchange via binary files is highly system and implementation
dependent and therefore almost unportable. Do you have the possibility
to switch to text file data exchange?

now when i print out the contents of "array" on screen:

i=3;

for (j=0; j<i; j++) {
printf("%f\n", array[j]);
}

i get

0.0
1.0
2.0

why is the first number 0.0 instead of 1.0?

<shrug> If you had presented the code without any sample output,
I would've said that it could print almost anything.

Regards

Irrwahn
--
ERROR 103: Dead mouse in hard drive.
Nov 13 '05 #4
"Mike Wahler" <mk******@mkwahler.net> writes:
Issues like this are why it's strongly advised to transport data
between applications and systems using text instead
of a native binary representation.


(Or to use an agreed-upon, stable binary representation)
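
One possible convention of that kind, sketched for a 32-bit unsigned integer: always store the most significant byte first, whatever the host does internally. The function names are invented for illustration, and floating-point values would need their own agreed format, which is not shown here.

```c
#include <stdint.h>

/* Stores v as four bytes, most significant first, independent of the
 * byte order of the machine running the code. */
void put_u32_be(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)((v >> 24) & 0xFF);
    out[1] = (unsigned char)((v >> 16) & 0xFF);
    out[2] = (unsigned char)((v >> 8) & 0xFF);
    out[3] = (unsigned char)(v & 0xFF);
}

/* Rebuilds the value from four bytes written by put_u32_be. */
uint32_t get_u32_be(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8) | (uint32_t)in[3];
}
```

The writer and the reader then only have to agree on the byte layout, not on a compiler or platform.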

-Micah
Nov 13 '05 #5
binary text? is that anything like ASCII EBCDIC or defined undefined
behavior?
i'm trying to read in a binary file (generated from a fortran code)
that contains the following three floating-point numbers:

1.0 2.0 3.0


How many bytes are in the file? If it's other than 3 * sizeof(float),
you need to specify more details about the file format.

FORTRAN has a tendency to deal in "records". It is not that uncommon
to have a record length field (probably an integer type, but not
necessarily a *C* integer type) at the beginning of a "record".
When FORTRAN reads what FORTRAN wrote, you don't see it. (And I
didn't guarantee what units that's in: it might not be in C bytes.)
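
A quick way to answer the size question from C itself is to count the bytes the C library actually delivers. A sketch, with `count_bytes` as a hypothetical helper and the stream assumed to be opened with "rb":

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes from the current position of
 * `fp` to end of file. Returns -1 if a read error occurs. */
long count_bytes(FILE *fp)
{
    long n = 0;

    while (getc(fp) != EOF)
        n++;
    return ferror(fp) ? -1 : n;
}
```

If the result is larger than 3 * sizeof(float), the surplus hints at extra bookkeeping such as the record length field described above.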

Gordon L. Burditt
Nov 13 '05 #6
Thank you for your replies!

the file i'm trying to read is actually a huge, unformatted file from
a fortran code. this data file contains both strings and real numbers
(double precision). the write commands in the fortran code typically
look like this:

write(10) nmax, string1, string2
write(10) (a(n),b(n),c(n),n=1,nmax)

and so on. "10" is the file unit number. nmax is an integer, string1
and 2 are text strings, and a,b,c arrays contain double-precision
floating-point data.

there's no flexibility in the output format from this fortran code. it
has to be unformatted.

this means if i try to open this data file using vi or cat, it'll
print garbage.

i'm trying to write a c code to read in this data file, manipulate the
data, then write it out in a different format (ASCII, formatted).

so the test i did (and showed you) was a much simplified version of my
actual task.
i'm trying to read in a binary file (generated from a fortran code)
that contains the following three floating-point numbers:

1.0 2.0 3.0


Stop right there. There's no requirement that the binary
representation of floating point values be the same for
C and FORTRAN. They might be, they might not.


how can i check/make sure?

the "record header" output from fortran may have been the reason why
the first value printed out from my c code was 0.0 instead of 1.0 in
my little test.

thanks so much again!
yeow
Nov 13 '05 #7
> >i'm trying to read in a binary file (generated from a fortran code)
that contains the following three floating-point numbers:

1.0 2.0 3.0


How many bytes are in the file? If it's other than 3 * sizeof(float),
you need to specify more details about the file format.

FORTRAN has a tendency to deal in "records". It is not that uncommon
to have a record length field (probably an integer type, but not
necessarily a *C* integer type) at the beginning of a "record".
When FORTRAN reads what FORTRAN wrote, you don't see it. (And I
didn't guarantee what units that's in: it might not be in C bytes.)

Gordon L. Burditt


Most likely that's the case. And sometimes there is more than one item
in the field (Fortran supports BACKSPACE even on binary files).
Most systems will have a "C"-like buffer-out subroutine, or a mode on
the OPEN statement that will put you in "C"-compatible mode.
This is system dependent, but compatible with "C" on the same system.
Maybe you can even call the "C" fopen and fwrite directly from
your Fortran.
Nov 13 '05 #8
In <64**************************@posting.google.com> ys***@mtu.edu (Yeow) writes:
the file i'm trying to read is actually a huge, unformatted file from
a fortran code. this data file contains both strings and real numbers
(double precision). the write commands in the fortran code typically
look like this:

write(10) nmax, string1, string2
write(10) (a(n),b(n),c(n),n=1,nmax)

and so on. "10" is the file unit number. nmax is an integer, string1
and 2 are text strings, and a,b,c arrays contain double-precision
floating-point data.

there's no flexibility in the output format from this fortran code. it
has to be unformatted.

this means if i try to open this data file using vi or cat, it'll
print garbage.

i'm trying to write a c code to read in this data file, manipulate the
data, then write it out in a different format (ASCII, formatted).

so the test i did (and showed you) was a much simplified version of my
actual task.
> i'm trying to read in a binary file (generated from a fortran code)
> that contains the following three floating-point numbers:
>
> 1.0 2.0 3.0
Stop right there. There's no requirement that the binary
representation of floating point values be the same for
C and FORTRAN. They might be, they might not.


how can i check/make sure?


Don't bother. Real life implementors do everything they can to allow
mixed language programming, therefore it's simply an issue of figuring
out which C type corresponds to which Fortran type. For floating point,
it's easy: float <-> REAL, double <-> DOUBLE PRECISION. Chances are
that int <-> INTEGER and that short <-> INTEGER*2 (which is not a standard
Fortran type, but is widely supported by implementations for byte-oriented
machines).
the "record header" output from fortran may have been the reason why
the first value printed out from my c code was 0.0 instead of 1.0 in
my little test.


Yes, that's one issue. Note that most implementations also use a
"record trailer" with the same contents, for the benefit of the BACKSPACE
statement.

If you have access to the Fortran code generating the file, it is
possible to "decode" its contents with a C program, but first you have
to make some experiments, with simple Fortran programs generating
binary output, so that you can figure out the *exact* structure of a
Fortran-generated binary file. Note that Fortran strings may be space
padded but not null-terminated, so you need to know their exact size,
in order to be able to properly decode them.
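
Once the field width is known, turning such a field into an ordinary C string takes only a few lines. A sketch, with `fortran_to_cstring` as an invented helper name:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies a fixed-width, space-padded Fortran
 * character field of `len` bytes into `dst` (which must have room for
 * len + 1 bytes), drops the trailing spaces and adds the null byte.
 * Returns the length of the resulting C string. */
size_t fortran_to_cstring(char *dst, const char *src, size_t len)
{
    memcpy(dst, src, len);
    while (len > 0 && dst[len - 1] == ' ')
        len--;
    dst[len] = '\0';
    return len;
}
```

Trailing spaces that were genuinely part of the data are lost this way, so skip the trimming loop if they matter.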

Example (on a little endian platform):

fangorn:~ 296> cat test.f
      character*10 string
      integer a, b, c

      string = 'foo'
      a = 1
      b = 2
      c = 3
      write(10) string, a, b, c
      end
fangorn:~ 297> ls -l fort.10
-rw-r--r-- 1 danpop sysprog 30 Sep 30 16:38 fort.10
fangorn:~ 298> od -b -c fort.10
0000000 026 000 000 000 146 157 157 040 040 040 040 040 040 040 001 000
        026  \0  \0  \0   f   o   o                             001  \0
0000020 000 000 002 000 000 000 003 000 000 000 026 000 000 000
         \0  \0 002  \0  \0  \0 003  \0  \0  \0 026  \0  \0  \0

The first 4 bytes, which are identical to the last 4 bytes, are obviously
the record length. We can easily confirm this with a bit of arithmetic:
026 is 22 and the length of the file is 30. If we subtract the length
of the "metadata", i.e. the record header and the record trailer (each
of them having 4 bytes) from the total size we obtain the value 22 for
the size of the actual data contained in the record. Since we know that
the record consists of a 10-byte string and 3 4-byte integers, this is the
expected record size.

Now, we have to look at the bytes of the actual record: the string takes
10 bytes, exactly the size declared in the Fortran code, with no null
byte terminator. Since the initialiser contained fewer than 10 characters,
the string was padded with spaces up to its declared length. The
integers are exactly the way we expected them to be on a 32-bit, little
endian platform.

Having this information, we can write the C code to decode this binary
file. Error checking deliberately omitted, for simplicity:

fangorn:~ 309> cat decode.c
#include <stdio.h>

int main(void)
{
    char sdata[10], buff[4];    /* 10-byte string field; 4-byte record marker */
    int idata[3];               /* the three 4-byte integers a, b, c */
    FILE *fp = fopen("fort.10", "rb");

    fread(buff, sizeof buff, 1, fp);    /* record header (length) */
    fread(sdata, sizeof sdata, 1, fp);  /* space-padded string, no null byte */
    fread(idata, sizeof idata, 1, fp);
    fread(buff, sizeof buff, 1, fp);    /* record trailer; not really needed */
    fclose(fp);

    printf("string = '%.10s'\n", sdata);
    printf("a = %d, b = %d, c = %d\n", idata[0], idata[1], idata[2]);
    return 0;
}
fangorn:~ 310> cc decode.c
fangorn:~ 311> ./a.out
string = 'foo       '
a = 1, b = 2, c = 3

So, by having access to the source code of the Fortran program and
exploiting our recently acquired knowledge about how this Fortran compiler
generates binary records, it was possible to write a C program that
retrieves all the information stored in the Fortran output file.
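
The decoder above hard-codes the record layout and skips error checking. A more defensive sketch for a single record follows. It assumes what the dump showed on this particular platform: a 4-byte length before and after the payload, in the host's own byte order. The name `read_record` is invented for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: reads one record framed as
 *   4-byte length, payload, 4-byte length
 * into `buf`, which can hold `cap` bytes.
 * Returns the payload length, or -1 at end of file, on a read error,
 * if the payload does not fit, or if header and trailer disagree. */
long read_record(FILE *fp, void *buf, size_t cap)
{
    uint32_t head, tail;

    if (fread(&head, sizeof head, 1, fp) != 1)
        return -1;
    if (head > cap)
        return -1;
    if (fread(buf, 1, head, fp) != head)
        return -1;
    if (fread(&tail, sizeof tail, 1, fp) != 1 || tail != head)
        return -1;
    return (long)head;
}
```

Calling it in a loop until it returns -1 walks through a file one record at a time; the fields inside each payload still have to be picked apart by hand, as in the program above.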

Keep in mind the following issues:

1. The format of a binary record may be different on your Fortran
implementation. You have to discover it, using a Fortran program
similar to the one above and a byte dumping utility. Or you can
read the documentation of the Fortran compiler :-)

2. sdata in my C code does NOT contain a C string. If you want a genuine
C string, you have to allocate an extra byte and do something like
this:

char sdata[10 + 1] = { 0 };
...
fread(sdata, sizeof sdata - 1, 1, fp);

3. The C program must be used on the same platform on which the Fortran file
generated. This will guarantee that integers and reals have the same
size and representation in the two programs. If not sure about what
C type to use for a given Fortran type, examine the output of a simple
Fortran program that writes a record consisting of a single value of
the given type.

4. On certain platforms, using record oriented file systems, the record
header and trailer may be invisible to the C program. For this reason,
it is better to use your own byte dumping utility, written in C, rather
than something provided by the OS.

5. NEVER extrapolate from one platform to another: repeat the
"discovery" process on each new platform where you have to perform
this kind of data conversion.
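
For point 4, a home-made byte dumper can be very small. The sketch below imitates the `od -b` layout used earlier; `dump_octal` is an invented name and the input stream is assumed to be opened with "rb".

```c
#include <stdio.h>

/* Hypothetical helper: writes an od -b style dump of `in` to `out`,
 * i.e. a 7-digit octal offset followed by up to 16 bytes per line,
 * each printed as three octal digits.
 * Returns the number of bytes dumped. */
long dump_octal(FILE *in, FILE *out)
{
    long offset = 0;
    int c;

    while ((c = getc(in)) != EOF) {
        if (offset % 16 == 0)   /* start a new line with the offset */
            fprintf(out, "%s%07lo", offset ? "\n" : "",
                    (unsigned long)offset);
        fprintf(out, " %03o", (unsigned)c);
        offset++;
    }
    if (offset > 0)
        putc('\n', out);
    return offset;
}
```

Because it reads through the C library, it shows exactly the bytes a C decoder will see, including or excluding the record header and trailer as the platform dictates.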

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Nov 13 '05 #9
