fstream, getline() and failbit

I am reading a binary file and I want to search it for a string. The
only problem is that failbit gets set after only a few calls to
getline(), so it never reaches the end of the file where the string is
contained. From reading through posts to this list it seems that
failbit gets set if there is a format error whilst reading. Is it bad
form to read binary data into a char[] array? Is this why my
function below doesn't work?

void ReadBinData()
{
    int reads=0;
    string data;
    char str[1024];
    fstream myFile ("test.exe", ios::in | ios::binary);
    if ( (myFile.rdstate() & ifstream::failbit ) != 0 )
        cout << "error";

    while(myFile.getline(str, 1024 ))
    {
        data = str;
        if(data.find("roryrory", 0)!=string::npos)
            cout << "found it";
        reads++;
    }

    cout << "\nno of times getline was called = " << reads << endl;

    if ( (myFile.rdstate() & ifstream::failbit ) != 0 )
        cout << "\nerror, failbit set....";

    myFile.close();
}

Rory.
Jan 23 '08 #1
rory wrote:
> while(myFile.getline(str, 1024 ))

Try read (str, 1024) instead of getline (can't use "while (read (...))"
then though) - why would you want to read "lines" from a binary (.exe)
file anyways?

However, that is a separate issue from the failbit. What COULD happen
is that your search expression will get split over 2 different
"getlines" in your case, when 1023 bytes have been read without finding
a newline delimiter. str will then for example contain "roryr\0" in the
last 6 bytes, and on the next call will be filled with "ory" in the
first 3 bytes. You'd never find your search expression.
And the failbit might well be set after "only a few" calls to getline:
getline sets it as soon as it has stored 1023 characters without
reaching a newline, which is likely in an executable that contains
only a few newlines. So long story short: don't use getline!
Then see if your problem still occurs.
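
A minimal sketch of the read-based loop suggested above, for illustration only. CountBytes is a hypothetical helper name, not something from the original post. It shows why "while (read (...))" is awkward: the last, shorter block sets failbit, so the loop tests gcount() instead of the stream state.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (not from the thread): reads a file in 1024-byte
// blocks and returns the total number of bytes seen, or -1 if the file
// cannot be opened.
long long CountBytes(const std::string& path)
{
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in)
        return -1;

    char block[1024];
    long long total = 0;
    // read() sets failbit on the final short block, so test gcount()
    // rather than the stream state to avoid dropping the tail of the file.
    while (in.read(block, sizeof block), in.gcount() > 0)
        total += in.gcount();
    return total;
}
```

Inside the loop, block holds in.gcount() valid bytes on each pass; a search over those bytes would still miss a string that straddles two blocks.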

Best Regards,

Lars
Jan 23 '08 #2
Thanks Lars, using read it seems to read the entire file; the file size
is 3.830 MB and read is called 3829 times. My next problem is one you
alluded to, reading blocks of data means the string could get chopped
up which is the last thing I want. The idea is that I append a unique
string identifier to a binary file, then I append some text after it.
I then want to search that file for the unique string identifier and
then retrieve the text that follows it. Before writing the unique
string I first write a newline char, that's why I thought I could just
use getline() as it runs until a new line. Valid point however that it
might not always get to a new line. Have you any suggestions for me on
how I might do this? Thanks for the reply,

Rory.
Jan 23 '08 #3
My guess is that the failure is due to some value in the file being
interpreted as signaling the end of the file when it's treated as text.
Unix generally treats control-D this way; for Windows it's control-Z.
Regardless, you need to tell your stream not to interpret control
characters that way, by opening it as a binary stream:

std::ifstream file(your_file_name, std::ios::binary);

std::stringstream temp;

// copy the file into a string
temp << file.rdbuf();

// marker for the beginning of your data:
std::string sentinel("\nroryrory");

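// find your data; find() returns std::string::npos if the marker doesn't
// exist, so test for that before skipping past the marker
const std::string contents = temp.str();
std::string::size_type data_pos = contents.find(sentinel);
if (data_pos != std::string::npos)
    data_pos += sentinel.length();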

--
Later,
Jerry.

The universe is a figment of its own imagination.
Jan 23 '08 #4
On 2008-01-23 10:25:24 -0500, Jerry Coffin <jc*****@taeus.com> said:

> My guess is that the failure is due to some value in the file being
> interpreted as signaling the end of the file when it's treated as text.
> Unix generally treats control-D this way; for Windows it's control-Z.
> Regardless, you need to tell your stream not to interpret control
> characters that way, by opening it as a binary stream:

Yes, that's the right way to read a binary file. But having done that,
the runtime library also won't translate the character sequence that
represents a newline into the character '\n'. It's binary data all the
way...

> // marker for the beginning of your data:
> std::string sentinel("\nroryrory");

That '\n' at the beginning may or may not match some sequence of bytes
that was written to the file. Search for "roryrory" instead.

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference"
(www.petebecker.com/tr1book)

Jan 23 '08 #5
I think this takes care of the '\n' problem also

#include <cstring>
#include <iostream>
#include <fstream>
using namespace std;

int main()
{
    ifstream::pos_type size;
    char * memblock;

    ifstream iFile;
    iFile.open("asdf.txt",ios::in|ios::binary|ios::ate );
    if(iFile.is_open())
    {
        size = iFile.tellg(); //get the size of file
        memblock = new char[size]; //memblock needs size bytes to hold all data
        iFile.seekg(0,ios::beg); //go back to beginning of file
        iFile.read(memblock,size); //read whole file in memblock
        iFile.close(); //close the file
    }
    char * pos;
    pos = strchr(&memblock[0],'r'); //find the position of the first 'r'
    while( strncmp(pos,"roryrory",8) ) //compare 8 char size
    {
        pos = strchr(pos+1,'a'); // keep looking to the next char
    }
    return 0;
}
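
A note on the loop above: strchr and strncmp treat memblock as a null-terminated C string, but the buffer holds raw binary data with no terminator and, most likely, embedded zero bytes. strchr returns a null pointer when it finds no match before a zero byte, and the next strncmp call then dereferences that pointer; the loop also looks for 'a' rather than 'r'. A minimal sketch of a length-aware alternative, assuming the whole file fits in memory (FindInFile is a hypothetical name, not something from the thread):

```cpp
#include <algorithm>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: returns the byte offset of the first occurrence of
// `marker` in the file, or -1 if the file cannot be opened or the marker
// is absent.
long long FindInFile(const std::string& path, const std::string& marker)
{
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in)
        return -1;

    // Copy every byte, including zero bytes, into a vector.
    const std::vector<char> bytes((std::istreambuf_iterator<char>(in)),
                                  std::istreambuf_iterator<char>());

    // std::search works on iterator ranges, so it never relies on a
    // terminating '\0' the way strchr and strncmp do.
    const std::vector<char>::const_iterator hit =
        std::search(bytes.begin(), bytes.end(), marker.begin(), marker.end());
    if (hit == bytes.end())
        return -1;
    return hit - bytes.begin();
}
```

A call such as FindInFile("asdf.txt", "roryrory") would then give the position of the marker; the appended text starts 8 bytes later.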
Jan 23 '08 #6
That code causes my program to crash. It seems to die at the while
loop, if I place a cout in there it never gets printed and I get a
'program has encountered a problem and needs to close' notice. My
previous version, the one using stringstream was finding the correct
string and returning the right position but when I tried reading from
that position on I get a strange string. I don't know why? Thanks for
replying, I am getting closer to finding the problem but will have to
leave it till tomorrow, it's bedtime!

Rory.

Jan 24 '08 #7
This code worked for me.

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

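int main()
{
    std::ifstream ifstr("new", std::ios::binary);
    std::stringstream temp;
    temp << ifstr.rdbuf();
    const std::string contents(temp.str());
    const std::string sentinel("roryrory");
    const std::string::size_type marker_pos(contents.find(sentinel, 0));
    if (marker_pos == std::string::npos)
        return 1; // marker not found
    // print up to 50 characters that follow the marker
    const std::string myText(contents.substr(marker_pos + sentinel.length(), 50));
    std::cout << myText << '\n';
}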
Thanks,
Balaji.
Jan 24 '08 #8
Your code works here too. The only difference I can spot is that you
used const std::strings. I'm delighted that it now works for me but
can someone explain *why* it didn't work using plain old std::strings?
Thanks to everyone who's replied, I can now move forward with my
project.

Rory.

Jan 24 '08 #9
On Jan 23, 4:25 pm, Jerry Coffin <jcof...@taeus.com> wrote:
> [ ... ]
> My guess is that the failure is due to some value in the file being
> interpreted as signaling the end of the file when it's treated as text.
That's one possibility. Another is simply that his buffer isn't
big enough to hold the longest "line". getline() will set the
failbit if it encounters the end of the buffer before it sees a
'\n' character.
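
A small sketch of that behaviour, using an in-memory stream and a deliberately tiny buffer so it is easy to try. CountShortLines and its parameters are hypothetical names chosen for this illustration.

```cpp
#include <sstream>
#include <string>

// Returns the number of getline() calls that succeed with an 8-char buffer.
// `overflowed` is set to true when the loop ended because a "line" did not
// fit in the buffer, as opposed to reaching the end of the input.
int CountShortLines(const std::string& input, bool& overflowed)
{
    std::istringstream in(input);
    char buf[8];
    int calls = 0;
    while (in.getline(buf, sizeof buf))
        ++calls;
    // failbit without eofbit: 7 characters were stored and no '\n' was seen.
    overflowed = in.fail() && !in.eof();
    return calls;
}
```

With the input "ab\n0123456789\ncd\n" the loop stops after one successful call, which matches the symptom in the original post: the stream fails long before the end of the data.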
> Unix generally treats control-D this way; for Windows it's control-Z.
Unix never treats control-D this way in a file. Under Unix,
there is absolutely no difference between binary files and text
files.
> Regardless, you need to tell your stream not to interpret
> control characters that way, by opening it as a binary stream:
> std::ifstream file(your_file_name, std::ios::binary);
And of course, he'll also have to write the file in binary mode;
otherwise, some of the output data might be modified.
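
For the writing side, a minimal sketch under the assumptions stated in this thread: the marker is "roryrory" and the text is appended to an existing file. AppendTaggedText is a hypothetical helper name.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: appends the marker "roryrory" followed by `text`
// to the end of the file. Returns false if the file cannot be opened or
// the write fails.
bool AppendTaggedText(const std::string& path, const std::string& text)
{
    // ios::binary keeps the bytes exactly as given; ios::app appends.
    std::ofstream out(path.c_str(),
                      std::ios::out | std::ios::binary | std::ios::app);
    if (!out)
        return false;

    const std::string sentinel("roryrory");
    out.write(sentinel.data(), static_cast<std::streamsize>(sentinel.size()));
    out.write(text.data(), static_cast<std::streamsize>(text.size()));
    out.flush();
    return out.good();
}
```

Because both the writer and the reader use binary mode, the reader sees exactly the bytes written here, on any platform.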
> std::stringstream temp;
> // copy the file into a string
> temp << file.rdbuf();
> // marker for the beginning of your data:
> std::string sentinel("\nroryrory");
> // find your data (std::string::npos if it doesn't exist)
> int data_pos = temp.str().find(sentinel)+sentinel.length();
Depending on the implementation, that might not be such a good
idea; some implementations of stringstream grow the string very
inefficiently (and it will be 3.8 MB). If he can determine the
size of the file beforehand, resizing an std::vector<char> and
reading the entire file into it, then using std::search might be
a good option. (If portability isn't a concern, mmap'ing the
file is likely to be the fastest solution.) Otherwise, a KMP
search is pretty straightforward, and since it never requires
backing up, it avoids the problem of the sequence being split
across two successive buffers. Or if he needs an even faster
algorithm (BM, for example), he can save a block the size of the
sentinel at the start of the buffer, copy the end of the
preceding buffer into it before each read, and start his search
from there.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jan 24 '08 #10
On Jan 23, 5:12 pm, Pete Becker <p...@versatilecoding.com> wrote:
> On 2008-01-23 10:25:24 -0500, Jerry Coffin <jcof...@taeus.com> said:
> Yes, that's the right way to read a binary file. But having
> done that, the runtime library also won't translate the
> character sequence that represents a newline into the
> character '\n'. It's binary data all the way...
>> // marker for the beginning of your data:
>> std::string sentinel("\nroryrory");
> That '\n' at the beginning may or may not match some sequence
> of bytes that was written to the file. Search for "roryrory"
> instead.
If it's binary data, he'd better have used binary mode when he
wrote it as well. In which case, reading it in binary mode
will return exactly the same bytes he wrote.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jan 24 '08 #11
Is there a particular reason why you can't use a standard algorithm?

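void ReadBinData()
{
    ifstream myFile("test.exe", ios::in | ios::binary);
    if ( !myFile )
    {
        cout << "error\n";
        return;
    }
    // search() needs forward iterators, and istream_iterator<char> would
    // also skip whitespace, so copy the raw bytes into a string first.
    const string data( (istreambuf_iterator<char>( myFile )),
                       istreambuf_iterator<char>() );
    const char* rory = "roryrory";
    if ( search( data.begin(), data.end(),
                 rory, rory + strlen( rory ) ) != data.end() )
    {
        cout << "found it\n";
    }
    else
    {
        cout << "not found\n";
    }
}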
Jan 24 '08 #12
