reading binary file

Hello,

I have a binary file (image file) and am reading 4-bytes at a time. The
File size is 63,480,320 bytes. My assumption is that if I loop through
this file reading 4 bytes at a time, I should loop 15,870,080 times.

The code is:

newprogram.cpp
=============
#include <iostream>
#include <fstream>
using namespace std;

int main ()
{
    int counter=0;
    char * memblock;
    memblock = new char [4];

    ifstream file ("179060_mar_05_00_L7.024", ios::in|ios::binary);

    file.seekg (1,ios::beg);
    while (!file.eof())
    {
        file.read(memblock, 4);
        counter++;
    }
    cout << "Number of loops: " << counter << "\n";
    delete[] memblock;
    file.close();

    return 0;
}

$g++ newprogram.cpp -o deletelater
$./deletelater
Number of loops: 15870080

(a) Notice the file.seekg (1,ios::beg); statement. Is this correct?
(b) If I were to use file.seekg (0,ios::beg); statement, the number of
loops would end up 15870081. Is file.seekg(0,ios::beg) correct? If so,
could you please help me understand why the loop goes 15870081 times?

If I were to use a variable: int tempdata;
and in the loop right after file.read, were to insert: tempdata = (int)
(*memblock), I would get different results with (a)
file.seekg(1,ios::beg) and (b) file.seekg(0,ios::beg). Which one would
be correct?

Your suggestions will be very helpful. Thank you.

Use*n*x

Dec 8 '06 #1

Use*n*x wrote:
<snipped>
Your loop appears to be written under a common misconception.

file.eof() does not return whether or not you are at the end of a file;
it returns whether or not you've attempted to read past the end of the
file. Additionally, an istream does not even know whether it's reached
the end-of-file until it tries to read past the end of the file.

Without a seek, or with a seek to 0, what happens is that you have
15870080 successful reads. After those, file.eof() still returns false;
not because it's not at the end of the file (it is), but because it
hasn't yet tried to read past the end of the file. The very next
(15870081st) read will fail (read zero bytes), and /then/ the eof bit
will be set; but counter will still be incremented.

The reason why seeking to 1 appears to give the right number of reads,
is that you are skipping the first byte (in position 0). Then follows
15870079 successful reads, followed by the 15870080th read that only
reads the final 3 bytes. It tries to read the fourth byte, and at that
point encounters the end-of-file, so it sets the bit, and the test
condition terminates the loop. But you have missed the first byte, and
the final byte you /think/ you read (at memblock[3]) actually is just a
duplicate from the read just before the last one.

Another problem with your loop is that if there were a read /failure/,
your loop would continue indefinitely, as the eof bit would never get
set, and the read calls would just keep failing undetected.

The solution? Make the loop condition simply "while (file)" or "while
(file.good())", and check the result of file.read() (it returns the
stream, which tests false after a short or failed read) before
assuming that it filled your array completely (or at all);
file.gcount() tells you how many bytes the last read actually
delivered. Only increment the counter if the read was successful (I
have no clue how you might want to handle a partial read, but you
should take note of them if they occur).

Dec 8 '06 #2

Use*n*x wrote:
<snipped>

I was testing a little more and found this method to be more reliable
than using file.eof(). Suggestions and comments are more than welcome.

#include <iostream>
#include <fstream>
using namespace std;

int main ()
{
    int counter=0;
    char * memblock;
    memblock = new char [4];

    long begin,end,filesize,i;
    //ifstream file ("179060_mar_05_00_L7.024", ios::in|ios::binary);
    ifstream file ("test", ios::in|ios::binary);
    ofstream dump ("dump", ios::binary);

    // find file size
    begin = file.tellg();
    file.seekg(0,ios::end);
    end = file.tellg();
    filesize = end - begin;

    // reposition
    file.seekg(0,ios::beg);

    // loop
    for (i=0; i<filesize; i=i+4)
    {
        file.read(memblock,4);
        // not quite needed
        // cout<< memblock << ".." << file.tellg() << endl;
        counter++;
    }

    /*file.seekg (0,ios::beg);
    while (!file.eof())
    {
    file.read(memblock, 4);
    //dump << memblock;
    cout << memblock << ".." << file.tellg() << endl;
    counter++;
    }*/
    cout << "Number of loops: " << counter << "\n";
    delete[] memblock;
    file.close();
    dump.close();

    return 0;
}

Dec 8 '06 #3

Micah Cowan wrote:
<snipped>
Your explanation makes good sense. Thank you.

Dec 8 '06 #4
Use*n*x wrote:
<snip>
while (!file.eof())
{
file.read(memblock, 4);
counter++;
}
<snip>
This is awfully inefficient.

Try this:

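#include <iostream>
#include <fstream>

using namespace std;

// Placeholder for whatever per-value work is needed
// (stands in for PROCESS_THIS_THING).
void process_this_thing( int value )
{
    (void) value;
}

int main ()
{
    ifstream file ("179060_mar_05_00_L7.024", ios::in|ios::binary);

    streambuf * pbuf = file.rdbuf();
    int l_blocks[1024];

    streamsize i;

    // sgetn() returns the number of bytes actually read; 0 means no more data
    while (
        ( i = pbuf->sgetn(
            reinterpret_cast<char*>(l_blocks), sizeof(l_blocks) ) ) > 0
    )
    {
        // a trailing partial int (i % sizeof(int) bytes) is ignored here
        streamsize num_read = i / static_cast<streamsize>(sizeof(int));

        for ( streamsize x = 0; x < num_read; ++ x )
        {
            process_this_thing( l_blocks[x] );
        }
    }

    return 0;
}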

Come to think of it, I have not checked the performance of the C++
stream library lately, so I could be wrong. However, I have found that
frequent small read calls can significantly slow down an application,
especially when you're reading large amounts of data.

If you don't care about performance, then you can stick with what you
have. I would lose the new/delete, though. Just declare a small array on
the stack and make sure you never read more bytes than you have allocated.
Dec 8 '06 #5

Use*n*x wrote:
long begin,end,filesize,i;
<snip>
// find file size
begin = file.tellg();
file.seekg(0,ios::end);
end = file.tellg();
filesize = end - begin;
<snipped the rest>

You're much safer using std::streampos from <iosfwd> to store the
result of file.tellg(), as it could well have a greater width than a
long, and you might not detect a potential overflow.

Other than that, you're still a lot better off checking for eof(): if a
read failure occurs, your code above still won't catch it, and if some
outside program were to truncate the file before you were through
reading it, you don't detect that condition either. Also, it is
possible for the current location to not be able to fit into a
streampos, or for the call to otherwise fail, in which case tellg()
will return streampos(streamoff(-1)), and your code won't work as you
expect.

Also: I snipped a section where you use /* ... */ to comment out a
block of code. While that works in your specific case, it's a habit to
be avoided in general, as what if that block of code had a /* */
comment of its own? Those comments don't nest, and you'd have a syntax
error. It's easier in the long run just to get in the habit of using
#if 0 instead.
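
To make the #if 0 suggestion concrete, here is a small sketch. The function answer() is invented for the example; the point is only that the disabled block may carry a /* */ comment of its own without causing trouble.

```cpp
// Hypothetical function used only to demonstrate disabling a block.
int answer()
{
    int value = 42;
#if 0
    /* This disabled block has a comment of its own. Wrapping it in an
       outer comment would end that comment at this closing marker. */
    value = 0;
#endif
    return value;  // the assignment above is never compiled
}
```

Changing the 0 to 1 re-enables the block, which is handy while debugging.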

Dec 8 '06 #6

Gianni Mariani wrote:
<snipped>
Good to know your thoughts. It helped. Thank you.

Dec 8 '06 #7

Micah Cowan wrote:
> You're much safer using std::streampos from <iosfwd> to store the
> result of file.tellg(), as it could well have a greater width than a
> long, and you might not detect a potential overflow.

I started off using streamsize/streampos, but was not quite confident
of what I was doing. So I switched back to something that was similar to
some sample code I had on hand.

> Other than that, you're still a lot better off checking for eof(): if a
> read failure occurs, your code above still won't catch it, and if some
> outside program were to truncate the file before you were through
> reading it, you don't detect that condition either. Also, it is
> possible for the current location to not be able to fit into a
> streampos, or for the call to otherwise fail, in which case tellg()
> will return streampos(streamoff(-1)), and your code won't work as you
> expect.

Yes, that is what I should do - keep a tab on eof() to handle failure
in IO.

> Also: I snipped a section where you use /* ... */ to comment out a
> block of code. While that works in your specific case, it's a habit to
> be avoided in general, as what if that block of code had a /* */
> comment of its own? Those comments don't nest, and you'd have a syntax
> error. It's easier in the long run just to get in the habit of using
> #if 0 instead.

Oh yes, I didn't even realize that. Thank you for your valuable input.
I have a long way to go in C++.

Use*n*x

Dec 8 '06 #8

Use*n*x wrote:
> Yes, that is what I should do - keep a tab on eof() to handle failure
> in IO.

Actually, eof() won't report I/O failure; you should check bool(file),
which covers the fail and bad states, or file.good(), which covers all
of eof, fail, and bad.

It is sometimes useful to call file.exceptions(<some
std::ios_base::iostate values>), to cause the stream to throw an
exception upon failure or corruption.

Dec 8 '06 #9

Use*n*x wrote in message ...
> #include <iostream>
> #include <fstream>
> <snipped the rest>
>
> Yes, that is what I should do - keep a tab on eof() to handle failure
> in IO.
> Use*n*x

Huh?

#include <iostream>
#include <fstream>

int main (){
    // using namespace std;
    int counter=0;
    char *memblock( new char[4] );
    std::ifstream file( "test", std::ios::in | std::ios::binary );

    // the body only runs when all 4 bytes were read
    while( file.read( memblock, 4 ) ){
        counter++;
    }

    std::cout << "Number of loops: " << counter << "\n";
    delete[] memblock;

    std::cout << std::boolalpha; // print the state checks as true/false
    if( not file ){
        // note: flags() returns the formatting flags, not the error
        // state; file.rdstate() would give the error bits
        std::cout<<" file error="<<file.flags()<<std::endl;
        std::cout<<" ios::good="<<file.good()<<std::endl;
        std::cout<<" ios::bad="<<file.bad()<<std::endl;
        std::cout<<" ios::eof="<<file.eof()<<std::endl;
        std::cout<<" ios::fail="<<file.fail()<<std::endl;
    }

    file.clear(); // reset eofbit/failbit so the seek below can work
    file.seekg( 0, std::ios::end );
    long long end = file.tellg();
    long long filesize = end / 4;
    std::cout << "long long end = file.tellg(): "<<end<< "\n";
    std::cout << "long long filesize = end / 4: "<<filesize<< "\n";

    file.close();
    return 0;
}

// - output -
// Number of loops: 3
// file error=4098
// ios::good=false
// ios::bad=false
// ios::eof=true
// ios::fail=true
// long long end = file.tellg(): 14
// long long filesize = end / 4: 3

--
Bob R
POVrookie
Dec 8 '06 #10
