reading Java floats from C

Hi,

I am trying to read Java-floats (IEEE 754 encoding) stored in a binary
file from C (gcc on linux/i386, more specifically). Unfortunately, C
seems to expect floats to be stored somewhat differently than Java
does. I suspected an endianness problem and tried out ntohl/htonl but it
doesn't help.

Any clues?

Thanks,
Sören

Nov 14 '05 #1
On Mon, 20 Jun 2005 12:52:31 -0700, sbalko wrote:
> Hi,
>
> I am trying to read Java-floats (IEEE 754 encoding) stored in a binary
> file

The question there would be how Java stores floats in a file, which would
depend on the code used to store them. From the information given there's
no reason to assume that the Java code is storing them in the same binary
format used internally. I think you'll need to discuss this in a Java
related newsgroup.

> from C (gcc on linux/i386, more specifically). Unfortunately, C
> seems to expect floats to be stored somewhat differently than Java
> does. I suspected an endianness problem and tried out ntohl/htonl but it
> doesn't help.

C doesn't specify the representation used by floating point types,
although IEEE 754 is typical. If you give some information about the
file format the Java code is using, and the C code you are using to read
the values, we should be able to help you.

Lawrence
Nov 14 '05 #2
Lawrence Kirby wrote:
>> I am trying to read Java-floats (IEEE 754 encoding) stored in a binary
>> file
>
> The question there would be how Java stores floats in a file, which would
> depend on the code used to store them. From the information given there's
> no reason to assume that the Java code is storing them in the same binary
> format used internally. I think you'll need to discuss this in a Java
> related newsgroup.

On the Java side, I am using DataOutputStream's writeFloat method, which
explicitly uses IEEE 754 to encode a float into 4 bytes.

>> from C (gcc on linux/i386, more specifically). Unfortunately, C
>> seems to expect floats to be stored somewhat differently than Java
>> does. I suspected an endianness problem and tried out ntohl/htonl but it
>> doesn't help.
>
> C doesn't specify the representation used by floating point types,
> although IEEE 754 is typical. If you give some information about the
> file format the Java code is using, and the C code you are using to read
> the values, we should be able to help you.

Actually the Java file is a plain format with intermixed ASCII and
subsequently stored floats. On the C side, things are a bit more
complex. I am using mmap to map the file to a main memory address
(cast to a char * pointer). Then I memcpy 4 bytes from certain offsets
in the buffer to a float variable. I also tried to apply ntohl to that
float, but that doesn't solve my problem either.

Nov 14 '05 #3

<sb****@gmail.com> wrote
> Actually the Java file is a plain format with intermixed ASCII and
> subsequently stored floats. On the C side, things are a bit more
> complex. I am using mmap to map the file to a main memory address
> (cast to a char * pointer). Then I memcpy 4 bytes from certain offsets
> in the buffer to a float variable. I also tried to apply ntohl to that
> float, but that doesn't solve my problem either.

Do you know how floating point numbers are generally constructed?

By doing some experiments you ought to be able to work out what
representations your Java platform and C compiler use, and to convert
between them. Watch out for special cases like NaN, infinity, and very
small (denormalized) numbers.
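
To make those special cases concrete, here is a minimal sketch that classifies a raw 32-bit pattern by its exponent and fraction fields. It assumes the IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 fraction bits); the enum and the name classify_bits are invented for this example.

```c
#include <stdint.h>

/* Hypothetical helper: categories of an IEEE 754 single-precision value. */
enum float_kind {
    KIND_ZERO, KIND_SUBNORMAL, KIND_NORMAL, KIND_INFINITY, KIND_NAN
};

/* Classify a raw 32-bit pattern, assuming the IEEE 754 binary32 layout. */
enum float_kind classify_bits(uint32_t bits)
{
    uint32_t exponent = (bits >> 23) & 0xFFu;  /* 8 exponent bits */
    uint32_t fraction = bits & 0x7FFFFFu;      /* 23 fraction bits */

    if (exponent == 0xFFu)                     /* all ones: infinity or NaN */
        return fraction ? KIND_NAN : KIND_INFINITY;
    if (exponent == 0)                         /* all zeros: zero or subnormal */
        return fraction ? KIND_SUBNORMAL : KIND_ZERO;
    return KIND_NORMAL;
}
```

The sign bit is ignored on purpose, so +0.0 and -0.0 (and both infinities) fall into the same category.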
Nov 14 '05 #4
On Mon, 20 Jun 2005 15:34:53 -0700, sbalko wrote:

....
> Actually the Java file is a plain format with intermixed ASCII and
> subsequently stored floats.

I suggest you log the representation of the data you've read in. You
access the representation of an object by treating it as an array of
unsigned char, e.g.

    TYPE var = value;
    const unsigned char *ptr = (const unsigned char *)&var;
    size_t i;

    /* print each byte of var in memory order */
    for (i = 0; i < sizeof var; i++)
        printf(" %02x", ptr[i]);

Also do this for the same values set in the C environment. You should then
be able to see

a) whether you've read the data in correctly

b) how the Java and C representations correspond.
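
A self-contained version of that byte dump might look like the sketch below. The function name hex_bytes and the choice to format into a caller-supplied buffer (rather than printing directly) are assumptions made for this example.

```c
#include <stdio.h>
#include <stddef.h>

/* Format the object representation of *obj (size bytes, in memory order)
   as " xx xx ..." into out, which must hold 3 * size + 1 characters.
   Returns out so the call can be passed straight to puts() or printf(). */
char *hex_bytes(const void *obj, size_t size, char *out)
{
    const unsigned char *ptr = (const unsigned char *)obj;
    size_t i;

    out[0] = '\0';
    for (i = 0; i < size; i++)
        sprintf(out + 3 * i, " %02x", ptr[i]);  /* 3 chars per byte */
    return out;
}
```

For instance, with `float f = 1.0f; char buf[3 * sizeof f + 1];`, calling `puts(hex_bytes(&f, sizeof f, buf));` on a little-endian host with IEEE 754 floats prints ` 00 00 80 3f`, while the four bytes that writeFloat(1.0f) puts in the file are 3f 80 00 00.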
> On the C side, things are a bit more
> complex. I am using mmap to map the file to a main memory address
> (cast to a char * pointer). Then I memcpy 4 bytes from certain offsets
> in the buffer to a float variable. I also tried to apply ntohl to that
> float, but that doesn't solve my problem either.

ntohl isn't a standard C library function. Given its common socket-related
definition, you can't apply it directly to a float: you would be
converting the value to a long, swapping bytes, and converting back again,
which will give a completely wrong result.
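
If ntohl is available, it can still be used safely on the bit pattern rather than on the float value. The following is only a sketch: it assumes a POSIX system (for <arpa/inet.h>), a 32-bit IEEE 754 float, and a host where floats and 32-bit integers share the same byte order, as on i386. The function name is made up.

```c
#include <arpa/inet.h>  /* ntohl(); POSIX, not standard C */
#include <stdint.h>
#include <string.h>

/* Decode 4 bytes stored most significant byte first into a float.
   Assumes float is 32 bits wide and uses the IEEE 754 encoding. */
float float_from_network_bytes(const unsigned char *data)
{
    uint32_t bits;
    float result;

    memcpy(&bits, data, sizeof bits);       /* raw bytes, still in file order */
    bits = ntohl(bits);                     /* swap the integer, not the float */
    memcpy(&result, &bits, sizeof result);  /* copy bits, no value conversion */
    return result;
}
```

The two memcpy calls are what avoid the value conversion described above: the float is never passed through an integer type as a number, only as a sequence of bytes.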

Instead of using memcpy() to copy into the float try code that copies the
bytes to the float object in reverse order. E.g.

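#include <stddef.h>  /* size_t */

/* Copy sizeof(float) bytes from data into *fl in reverse order. */
void unmarshall_float(float *fl, const unsigned char *data)
{
    unsigned char *flrep = (unsigned char *)fl;
    size_t i;

    for (i = 0; i < sizeof(float); i++)
        flrep[i] = data[sizeof(float)-1-i];
}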
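
The routine above always reverses the bytes, which is right on a little-endian host such as i386 but not on a big-endian one. A variant that does not depend on the host's integer byte order could be sketched as follows; the name unmarshall_float_be is hypothetical, and the code assumes a 32-bit IEEE 754 float whose byte order in memory matches that of a 32-bit integer.

```c
#include <stdint.h>
#include <string.h>

/* Rebuild the 32-bit pattern arithmetically from big-endian input
   (most significant byte first), then copy the bits into the float. */
void unmarshall_float_be(float *fl, const unsigned char *data)
{
    uint32_t bits = ((uint32_t)data[0] << 24) |
                    ((uint32_t)data[1] << 16) |
                    ((uint32_t)data[2] << 8)  |
                     (uint32_t)data[3];

    memcpy(fl, &bits, sizeof *fl);  /* assumes sizeof(float) == 4 */
}
```

Because the shifts operate on values rather than on memory, the same source works unchanged whichever way the host stores its integers.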

Lawrence

Nov 14 '05 #5
Lawrence Kirby wrote:
> On Mon, 20 Jun 2005 15:34:53 -0700, sbalko wrote:
>> Actually the Java file is a plain format with intermixed ASCII and
>> subsequently stored floats.
>
> [...]
>
> Instead of using memcpy() to copy into the float try code that copies the
> bytes to the float object in reverse order.
>
> [...]

I would, if possible, coerce Java to write text like '1.23456789e2' for
floats. Convert them with strtod() on the C side.
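
On the C side, the text approach could look roughly like this sketch. The function name parse_text_float and its return convention are made up for illustration, and range checking (errno/ERANGE) is left out for brevity.

```c
#include <stdlib.h>

/* Parse one float written as text, e.g. "1.23456789e2".
   Returns 1 on success and stores the value in *out; returns 0 if no
   number could be parsed. If end is non-NULL, *end is set to the first
   character after the parsed text. */
int parse_text_float(const char *text, float *out, const char **end)
{
    char *stop;
    double value = strtod(text, &stop);

    if (stop == text)
        return 0;          /* no conversion was performed */
    *out = (float)value;   /* narrow to float, as in the binary format */
    if (end)
        *end = stop;
    return 1;
}
```

The end pointer makes it possible to step through a file that mixes numbers with other ASCII content, as the original poster's format does.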

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---
Nov 14 '05 #6
(I've cross-posted this to comp.programming where it's more relevant.
Also I've blacked out the specific names of programming languages
because that's irrelevant to my general answer.)
From: sb****@gmail.com
> I am trying to read ###-floats (IEEE ??? encoding) stored in a binary
> file from %%% (??? on ???, more specifically). Unfortunately, %%% seems
> to expect floats to be stored somewhat differently than ### does. I
> suspected an endianness problem and tried out ntohl/htonl but it
> doesn't help. Any clues?


If you can't find such an answer from online documents, why didn't you
just do some experiments? For example, try this to see how ### writes
floats in binary mode: Write a test program that writes out exactly
five values of exactly 0.0, then write out these values in sequence:
9.0 0.0 10.0 0.0 11.0 0.0 12.0 0.0 13.0 0.0 14.0 0.0 15.0 0.0, and then
examine the resultant file to see if you can find:
- The same exact pattern repeating exactly five times before it's
broken by other patterns not the same, to show you what the 0.0 looks
like in binary file format.
- Alternating original pattern and other patterns the same length, to
make sure you haven't accidentally used different precision for the
non-zero values generated from the index variable in your loop and the
zero values generated by literals.
- Among those non-zero groups of bytes, see if you can find a bit
pattern that goes somewhat like this:
1001
1010
1011
1100
1101
1110
1111
The '1' might be missing if it's in a notation where the 1 is assumed
rather than explicit, but the other bits should follow that pattern.

At that point you have a good idea where the mantissa is located. Now
to find where the exponent is located, generate this sequence:
0.0 1.0 0.0 2.0 0.0 4.0 0.0 8.0 0.0 16.0 0.0 32.0 0.0 64.0 0.0
You should see a similar pattern in the bits.

Finally you need to know how negative numbers are expressed.
I leave that as an exercise for the reader.

Once you know all that for ###, do the same for %%%.
Write that test data from the program that will be doing the reading.
(Unless it's totally broken, it should write data in the same layout
that it expects to read it in.)

Now compare what you learned about ### and %%%, and see whether the
sequence of bytes is the only difference, or whether there's a more
complicated difference in representation.
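
For the C side of that experiment, a small helper that pulls a float apart into its fields can stand in for reading the bit patterns by eye. This is a sketch only: it assumes float is an IEEE 754 single-precision value (1 sign bit, 8 exponent bits, 23 fraction bits), and the struct and function names are invented for the example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: the three fields of an IEEE 754 binary32. */
struct float_fields {
    unsigned sign;      /* 1 bit */
    unsigned exponent;  /* 8 bits, biased by 127 */
    uint32_t fraction;  /* 23 bits; the leading 1 is implied, not stored */
};

/* Split a float into its fields. Assumes float is IEEE 754 binary32. */
struct float_fields split_float(float value)
{
    struct float_fields f;
    uint32_t bits;

    memcpy(&bits, &value, sizeof bits);  /* copy the bits, not the value */
    f.sign = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.fraction = bits & 0x7FFFFFu;
    return f;
}
```

Under that assumption, 9.0 through 15.0 all have exponent 130 and the top three fraction bits run from 001 to 111, which is the 1001..1111 pattern with the leading 1 left implicit. The values 1.0, 2.0, 4.0 have a zero fraction and exponents 127, 128, 129.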
Nov 15 '05 #7
