Help with FileInputStream and DataInputStream - porting c++ fread function into Java

Hello all!
I am porting an application from C++ to Java and have run into a
problem using the DataInputStream reader object. The file I am trying
to read in is anywhere from 20 to 60 MB and has a short (25 lines or
so) ASCII text "header". The file structure is a double dimensioned
array of objects. The ASCII header defines how many "columns" (the
first array index) there will be in the file. After the ASCII header,
the first value is an integer that contains the number of objects in
the first column. You are intended to read this many objects in, and
then the next number will be an integer containing the number of
objects in the next column. And so on and so forth. Each "object"
has, basically, three doubles, a long integer, and a 4 character
array.
My problem comes when reading the first binary number, an integer
containing the number of objects in the first column. It reads
without throwing an exception, but if I print this number to the
console it ends up being 10 million something, when I know that it
should be no more than 1000. My code is basically as follows

File theFile = new File(filename);
if (theFile.canRead()) {
    FileInputStream fis = new FileInputStream(theFile);
    BufferedReader fileReader =
        new BufferedReader(new InputStreamReader(fis));

    //use BufferedReader fileReader object to read in ASCII header (snipped)
    //this part is working swimmingly

    DataInputStream dataReader = new DataInputStream(fis);
    //I assume that this dataReader is "pointing" to the same place in the
    //file that the BufferedReader ended on, not, say, at the beginning of
    //the file or something like that. If it is based on the same stream,
    //can't a stream just have one location? Maybe I am too used to C++

    //This is the first read, mentioned above. I can't figure out why
    //it's reading in 10156179 when it should be getting around 900-1000!
    try {
        numPoints[i] = dataReader.readInt();
        //Isn't this the same as:
        //fread(&numPoints[i], sizeof(int), 1, fp);
        //in C++, where we are reading 1 integer sized binary section
        //of the file, and storing it into the integer array numPoints?

        System.out.println(numPoints[i]);
    } catch (EOFException e) {
        System.out.println("Fewer columns than expected (V2) ( + " + i +
            " < " + mParams.numColumns + ")");
        mParams.numColumns = i;
        break;
    }

    //Since I do this later:
    data[i] = new Rtpi[numPoints[i]];
    //and am trying to allocate 10 million of these objects, I eventually
    //run out of memory/Java VM heap space, and it throws a
    //java.lang.OutOfMemoryError. Not too surprising I guess

    //end of non-working code

So my main problem/misunderstanding is on how to use the
DataInputStream reader object. I have read through the Java API for
this class, but don't really get it too much. Any and all help would
be much appreciated. I desperately need this code to work for my
M.Sc. Dissertation.
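
One assumption in the snippet above may be worth testing on its own. BufferedReader (and the InputStreamReader beneath it) read ahead in blocks, so once the header lines have been consumed the underlying FileInputStream is usually positioned well past the end of the header, and a DataInputStream created on it afterwards starts from wherever that read-ahead stopped. Below is a minimal sketch of one way around this, assuming an ASCII header whose lines end in '\n': read the header byte by byte from the same stream that later serves the binary reads. The names HeaderThenBinary and readAsciiLine are made up for illustration, and the in-memory byte array only stands in for the real file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not from the thread: header and binary data share ONE stream.
final class HeaderThenBinary {
    private HeaderThenBinary() {}

    /** Reads bytes up to '\n' (dropping a trailing '\r'); returns null at end of stream. */
    static String readAsciiLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            line.write(b);
        }
        if (b == -1 && line.size() == 0) {
            return null; // nothing left to read
        }
        String s = new String(line.toByteArray(), StandardCharsets.US_ASCII);
        return s.endsWith("\r") ? s.substring(0, s.length() - 1) : s;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the file: one header line, then the int 1000 in big-endian order.
        byte[] file = {'c', 'o', 'l', 's', '=', '1', '\n', 0x00, 0x00, 0x03, (byte) 0xE8};
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(file));
        System.out.println(readAsciiLine(in)); // prints "cols=1"
        System.out.println(in.readInt());      // prints 1000: the read starts right after the header
    }
}
```

For a real file, the DataInputStream could wrap a BufferedInputStream around the FileInputStream, so that buffering happens in exactly one place and no bytes are lost between the text and binary parts.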

TIA,

-Patrick

Please send any responses to me directly as well as to the newsgroup.
Jul 17 '05 #1
6 Replies


[ invalid group comp.lang.java.developer removed ]

On 12 Jul 2004 17:00:53 -0700, Patrick wrote:
My problem comes when reading the first binary number, an integer
containing the number of objects in the first column. It reads
without throwing an exception, but if I print this number to the
console it ends up being 10 million something, when I know that it
should be no more than 1000.


When you read numbers in binary format, the reader and writer need to
agree on the endianness of the representation (i.e. which byte comes
first).

The Java standard streams assume network byte order (big endian), and
so should your C program. It should be using macros like htonl() and
htons() to write in network byte order.

If your C application writes values using a mechanism like this:

foo_t foo;
write(fd, &foo, sizeof(foo));

and you run it on a little-endian platform, then you will see exactly
the problem you've described. Note that such an application will fail
to read its own data if it runs on a platform with a different byte
order.

If the C program is beyond your control, there are third party Java
classes for reading in little endian, or you can roll your own by
reading one byte at a time, then shifting and adding to recreate the
original values.
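
The "roll your own" approach described above can be sketched roughly as follows. This is only an illustration under the assumption that the file stores 4-byte ints least significant byte first; LittleEndianReader and readIntLE are hypothetical names, not part of the Java API or of any library mentioned in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: rebuilds a little-endian int one byte at a time.
final class LittleEndianReader {
    private LittleEndianReader() {}

    /** Reads a 4-byte little-endian int; throws EOFException if the stream ends early. */
    static int readIntLE(InputStream in) throws IOException {
        int value = 0;
        for (int shift = 0; shift < 32; shift += 8) {
            int b = in.read(); // 0..255, or -1 at end of stream
            if (b < 0) {
                throw new EOFException("stream ended inside a 4-byte int");
            }
            value |= b << shift; // the first byte on disk is the least significant
        }
        return value;
    }

    public static void main(String[] args) throws IOException {
        // 1000 == 0x000003E8, stored least significant byte first.
        byte[] onDisk = {(byte) 0xE8, 0x03, 0x00, 0x00};
        System.out.println(readIntLE(new ByteArrayInputStream(onDisk))); // prints 1000
    }
}
```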

/gordon

--
[ do not email me copies of your followups ]
g o r d o n + n e w s @ b a l d e r 1 3 . s e
Jul 17 '05 #2

On 13 Jul 2004 08:03:05 +0200, Gordon Beaton <no*@for.email> wrote or
quoted :
and you run it on a little-endian platform, then you will see exactly
the problem you've described. Note that such an application will fail
to read its own data if it runs on a platform with a different byte
order.

If the C program is beyond your control, there are third party Java
classes for reading in little endian, or you can roll your own by
reading one byte at a time, then shifting and adding to recreate the
original values.


see http://mindprod.com/jgloss/ledatastream.html
http://mindprod.com/jgloss/endian.html

you can read more than one byte at a time, but you do have to
rearrange a byte at a time.
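
As a rough sketch of reading a whole value and then rearranging its bytes: DataInputStream can fetch four or eight bytes in big-endian order, and Integer.reverseBytes / Long.reverseBytes (available since Java 5) swap them afterwards. The class name SwapAfterRead and its methods are hypothetical, and the sketch assumes the file holds little-endian 4-byte ints and IEEE 754 8-byte doubles.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper: read big-endian with DataInputStream, then swap the bytes.
final class SwapAfterRead {
    private SwapAfterRead() {}

    static int readIntLE(DataInputStream in) throws IOException {
        return Integer.reverseBytes(in.readInt());
    }

    static double readDoubleLE(DataInputStream in) throws IOException {
        // Swap the raw 64 bits first, then reinterpret them as a double.
        return Double.longBitsToDouble(Long.reverseBytes(in.readLong()));
    }

    public static void main(String[] args) throws IOException {
        byte[] onDisk = {(byte) 0xE8, 0x03, 0x00, 0x00}; // 1000, little-endian

        DataInputStream plain = new DataInputStream(new ByteArrayInputStream(onDisk));
        System.out.println(plain.readInt());     // prints -402456576 (bytes taken as 0xE8030000)

        DataInputStream swapped = new DataInputStream(new ByteArrayInputStream(onDisk));
        System.out.println(readIntLE(swapped));  // prints 1000
    }
}
```

Note that a count between 900 and 1000 read with the wrong byte order comes out negative, because its low byte (0x84 to 0xE8) lands in the sign position. A positive value such as 10156179 may therefore hint at an additional problem, for example the position in the stream, though that is only a guess from the numbers quoted in the question.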

--
Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming.
See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.
Jul 17 '05 #3

On Tue, 13 Jul 2004 07:09:07 +0000, Roedy Green wrote:
On 13 Jul 2004 08:03:05 +0200, Gordon Beaton <no*@for.email> wrote or
quoted :
and you run it on a little-endian platform, then you will see exactly
the problem you've described. Note that such an application will fail
to read its own data if it runs on a platform with a different byte
order.

If the C program is beyond your control, there are third party Java
classes for reading in little endian, or you can roll your own by
reading one byte at a time, then shifting and adding to recreate the
original values.


see http://mindprod.com/jgloss/ledatastream.html
http://mindprod.com/jgloss/endian.html

you can read more than one byte at a time, but you do have to
rearrange a byte at a time.


You can use ByteBuffer. Load the data, set the byte order of
the buffer and then read from it. It has methods to read all the primitive
types.
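
A minimal sketch of the ByteBuffer approach, assuming the bytes have already been loaded into an array (for a real file they could come from a FileChannel read into the buffer). ByteBufferDemo and firstIntLE are hypothetical names; ByteBuffer.wrap, order and getInt are the standard java.nio calls.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class: decode little-endian data by setting the buffer's byte order.
final class ByteBufferDemo {
    private ByteBufferDemo() {}

    /** Interprets the first four bytes of raw as a little-endian int. */
    static int firstIntLE(byte[] raw) {
        return ByteBuffer.wrap(raw)
                .order(ByteOrder.LITTLE_ENDIAN) // new buffers default to BIG_ENDIAN
                .getInt();
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0xE8, 0x03, 0x00, 0x00};
        System.out.println(firstIntLE(raw)); // prints 1000
    }
}
```

Once the order is set, getDouble, getLong and the other typed getters follow it as well, so a whole record can be decoded from one buffer without manual shifting.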
--
Nigel Wade, System Administrator, Space Plasma Physics Group,
University of Leicester, Leicester, LE1 7RH, UK
E-mail : nm*@ion.le.ac.uk
Phone : +44 (0)116 2523548, Fax : +44 (0)116 2523555

Jul 17 '05 #4

Patrick wrote:
Hello all!
I am porting an application from C++ to Java and have run into a
problem using the DataInputStream reader object. The file I am trying
to read in is anywhere from 20 to 60 MB and has a short (25 lines or
so) ASCII text "header". The file structure is a double dimensioned
array of objects. The ASCII header defines how many "columns" (the
first array index) there will be in the file. After the ASCII header,
the first value is an integer that contains the number of objects in
the first column. You are intended to read this many objects in, and
then the next number will be an integer containing the number of
objects in the next column. And so on and so forth. Each "object"
has, basically, three doubles, a long integer, and a 4 character
array.


Patrick,

You received many excellent responses concerning the endianness of the
data. In addition to that, you should make sure that the sizes of the
data types agree as well. E.g., Java uses 32-bit ints and 64-bit longs
on every platform; does that agree with your C program?
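
To make the size question concrete, here is a sketch that decodes one record under an explicitly ASSUMED layout: three 8-byte doubles, a 4-byte C long, and four chars, little-endian, with no padding (32 bytes in total). None of these sizes are confirmed by the thread, and the class and field names are invented for illustration; they would need to be checked against the struct in the C++ source.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical record type; field names and layout are assumptions, not from the thread.
final class RecordLayoutSketch {
    // ASSUMED on-disk layout: 3 x 8-byte double, 4-byte C long, 4 chars, no padding.
    static final int RECORD_BYTES = 3 * Double.BYTES + Integer.BYTES + 4;

    final double a, b, c;
    final int id;
    final String tag;

    RecordLayoutSketch(double a, double b, double c, int id, String tag) {
        this.a = a;
        this.b = b;
        this.c = c;
        this.id = id;
        this.tag = tag;
    }

    /** Decodes one record at the buffer's current position, using the buffer's byte order. */
    static RecordLayoutSketch decode(ByteBuffer buf) {
        double a = buf.getDouble();
        double b = buf.getDouble();
        double c = buf.getDouble();
        int id = buf.getInt(); // a 4-byte C long maps to a Java int, not a Java long
        byte[] tag = new byte[4];
        buf.get(tag);
        return new RecordLayoutSketch(a, b, c, id, new String(tag, StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(RECORD_BYTES).order(ByteOrder.LITTLE_ENDIAN);
        buf.putDouble(1.0).putDouble(2.0).putDouble(3.0).putInt(42).put(new byte[] {'A', 'B', 'C', 'D'});
        buf.flip();
        RecordLayoutSketch r = decode(buf);
        System.out.println(RECORD_BYTES + " bytes, id=" + r.id + ", tag=" + r.tag); // prints "32 bytes, id=42, tag=ABCD"
    }
}
```

If the C long turns out to be 8 bytes, the field would be read with getLong instead, and the compiler may well have padded the struct, so the record size should be taken from sizeof in the original program rather than guessed.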

HTH,
Ray

--
XML is the programmer's duct tape.
Jul 17 '05 #5

Hello again,
I think I figured my problem out! Although the Java API states
that DataInputStream "lets an application read primitive Java data
types from an underlying input stream in a machine-independent way",
it doesn't mean that it can actually do it. Rather, it messes with
your mind for a while until you figure out that if you're reading in
files from a machine built on a little-endian architecture (virtually
all PCs - Intel and all compatibles) that weren't written by a Java
DataOutputStream object, then you'll be pretty much screwed - there is
no convenient method by which to do this in the Java API. They could
definitely stand to clear this up in the API, and also include the
following classes: LEDataInputStream, LEDataOutputStream. They are
available here and are my life savers right now:

http://mindprod.com/jgloss/endian.html

Thanks to Roedy Green!
-Patrick
Jul 17 '05 #6

On 13 Jul 2004 08:31:43 -0700, mc**********@hotmail.com (Patrick)
wrote or quoted :
They are
available here and are my life savers right now:

http://mindprod.com/jgloss/endian.html


They are also now built into nio somewhere.

--
Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming.
See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.
Jul 17 '05 #7
