I/O getline and small buffer


What is wrong with small_buffer in the program below?
I/O getline (istream::getline) doesn't read data from the file into a buffer that is small relative to the line length.

====== foo.cpp ======
#include <cassert>
#include <iostream>
#include <fstream>
using namespace std;

void read_using_io_getline ()
{
  char small_buffer[4];
  char big_buffer[512];

  ifstream infile ("foo.in");
  assert (infile.is_open());

  cout << infile.rdbuf() << endl;

  // --------------------------
  infile.clear ();
  infile.seekg (0, ios::beg);

  cout << endl;
  cout << "[small_buffer] Start: sizeof(small_buffer) = " << sizeof(small_buffer) << endl;
  cout << "[small_buffer] Start: rdstate = " << infile.rdstate() << endl;
  while (infile.getline (small_buffer, sizeof(small_buffer)))
  {
    cout << "[small_buffer] Read: " << small_buffer << '\n';
  }
  cout << "[small_buffer] Finish: rdstate = " << infile.rdstate() << endl;

  // --------------------------
  infile.clear ();
  infile.seekg (0, ios::beg);

  cout << endl;
  cout << "[big_buffer] Start: sizeof(big_buffer) = " << sizeof(big_buffer) << endl;
  cout << "[big_buffer] Start: rdstate = " << infile.rdstate() << endl;
  while (infile.getline (big_buffer, sizeof(big_buffer)))
  {
    cout << "[big_buffer] Read: " << big_buffer << '\n';
  }
  cout << "[big_buffer] Finish: rdstate = " << infile.rdstate() << endl;
}

int main()
{
  read_using_io_getline ();
  return 0;
}
=====================

====== Compilation & Run ======

// gpp.exe (GCC) 3.4.1

$ gpp foo.cpp
// No errors/warnings

$ ./a

1234567890
ABCDEGHIJKL
XYZUWT
[small_buffer] Start: sizeof(small_buffer) = 4
[small_buffer] Start: rdstate = 0
[small_buffer] Finish: rdstate = 4

[big_buffer] Start: sizeof(big_buffer) = 512
[big_buffer] Start: rdstate = 0
[big_buffer] Read: 1234567890
[big_buffer] Read: ABCDEGHIJKL
[big_buffer] Read: XYZUWT
[big_buffer] Finish: rdstate = 6

================================

--
Alex Vinokur
email: alex DOT vinokur AT gmail DOT com
http://mathforum.org/library/view/10978.html
http://sourceforge.net/users/alexvn

Jul 23 '05 #1
Alex Vinokur wrote:
What is wrong with small_buffer in program below?
I/O getline doesn't read data from file into small (relative to file line size) buffer.


It does. If you RTFM carefully on 'getline', you'll probably see that it
sets the error condition in the stream if the buffer is too small to read
the entire line. That's the only indication it has for you to know that
there is still some stuff in the same line to be read. I guess you need
to improve your code to work on that error condition. Essentially, if the
stream is not good after 'getline' it doesn't necessarily mean error in
reading *in general*. It could be just a flag that tells you "hey, you
asked for a line, your buffer isn't big enough for the whole line, just to
let you know".

V
Jul 23 '05 #2

"Victor Bazarov" <v.********@comAcast.net> wrote in message news:wg*******************@newsread1.mlpsca01.us.t o.verio.net...
Alex Vinokur wrote:
What is wrong with small_buffer in program below?
I/O getline doesn't read data from file into small (relative to file line size) buffer.


It does. If you RTFM carefully on 'getline', you'll probably see that it
sets the error condition in the stream if the buffer is too small to read
the entire line. That's the only indication it has for you to know that
there is still some stuff in the same line to be read. I guess you need
to improve your code to work on that error condition. Essentially, if the
stream is not good after 'getline' it doesn't necessarily mean error in
reading *in general*. It could be just a flag that tells you "hey, you
asked for a line, your buffer isn't big enough for the whole line, just to
let you know".

V


OK.

So, must we use a big enough buffer when using I/O getline?

What should we do if we don't know the size of the longest line in the input file?
--
Alex Vinokur

Jul 23 '05 #3
Alex Vinokur wrote:
"Victor Bazarov" <v.********@comAcast.net> wrote in message news:wg*******************@newsread1.mlpsca01.us.t o.verio.net...
Alex Vinokur wrote:
What is wrong with small_buffer in program below?
I/O getline doesn't read data from file into small (relative to file line size) buffer.


It does. If you RTFM carefully on 'getline', you'll probably see that it
sets the error condition in the stream if the buffer is too small to read
the entire line. That's the only indication it has for you to know that
there is still some stuff in the same line to be read. I guess you need
to improve your code to work on that error condition. Essentially, if the
stream is not good after 'getline' it doesn't necessarily mean error in
reading *in general*. It could be just a flag that tells you "hey, you
asked for a line, your buffer isn't big enough for the whole line, just to
let you know".

V

OK.

So, must we use a big enough buffer when using I/O getline?

What should we do if we don't know the size of the longest line in the input file?


No, we don't have to use a large enough buffer. Just be a bit more specific
when checking the status of the stream after 'getline'. If 'failbit' is
set, clear it and try reading again. You can probably find out how many
chars were actually read by interrogating the stream buffer about its
current position before and after the read operation.

V
Jul 23 '05 #4
Victor Bazarov wrote:
No, we don't have to use a large enough buffer. Just be a bit more specific
when checking the status of the stream after 'getline'. If 'failbit' is
set, clear it and try reading again. You can probably find out how many
chars were actually read by interrogating the stream buffer about its
current position before and after the read operation.


You can use 'std::istream's member 'gcount()' to find out how many
'char's were read by the last unformatted input operation (and
the fixes in TC1 make it pretty specific what unformatted input
is). However, it is much simpler to use an 'std::string' for
reading lines:

/**/ for (std::string line; std::getline(in, line); )
/**/ ...;
--
<mailto:di***********@yahoo.com> <http://www.dietmar-kuehl.de/>
<http://www.contendix.com> - Software Development & Consulting

Jul 23 '05 #5

"Alex Vinokur" <al****@big-foot.com> wrote in message
news:36*************@individual.net...
So, must we use a big enough buffer when using I/O getline?

What should we do if we don't know the size of the longest line in the input file?

Use std::string.

Something like (probably need more error checking but...):

std::ifstream infile("spoo.txt");
if(!infile.is_open()) { std::cerr << "open failed" << std::endl; return ERROR; }
std::string buffer;
while(std::getline(infile,buffer,'\n')) {
  // use the buffer...
}
// The loop also ends on a real I/O error, so check for that afterwards.
if(infile.bad()) { std::cerr << "read failed" << std::endl; return ERROR; }

Jul 23 '05 #6

"Dietmar Kuehl" <di***********@yahoo.com> wrote in message news:11**********************@z14g2000cwz.googlegr oups.com...
Victor Bazarov wrote:
No, we don't have to use a large enough buffer. Just be a bit more specific
when checking the status of the stream after 'getline'. If 'failbit' is
set, clear it and try reading again. You can probably find out how many
chars were actually read by interrogating the stream buffer about its
current position before and after the read operation.


You can use 'std::istream's member 'gcount()' to find out how many
'char's were read by the last unformatted input operation (and
the fixes in TC1 make it pretty specific what unformatted input is).


Something like:

====== foo.cpp ======
#include <cassert>
#include <iostream>
#include <fstream>
using namespace std;

void read_using_io_getline (ifstream& infile_io, int buf_size)
{
  char buffer[buf_size];

  // --------------------------
  infile_io.clear ();
  infile_io.seekg (0, ios::beg);

  cout << endl;
  cout << "Start: sizeof(buffer) = " << sizeof(buffer) << endl;
  cout << "Start: rdstate = " << infile_io.rdstate() << endl;

  while (infile_io.getline (buffer, sizeof(buffer)).gcount())
  {
    assert (!infile_io.bad());
    cout << buffer;

    if (infile_io.fail())
    {
      infile_io.clear (~(ios_base::failbit | ~infile_io.rdstate ()));
    }
    else
    {
      cout << '\n';
    }
  }
  cout << "Finish: rdstate = " << infile_io.rdstate() << endl;
}

int main()
{
  ifstream infile ("foo.in");
  assert (infile.is_open());

  cout << infile.rdbuf() << endl;

  read_using_io_getline (infile, 4);
  read_using_io_getline (infile, 512);

  infile.close();
  assert (!infile.is_open());

  return 0;
}
=====================
====== Run ======

9 8 7 6 5 4
1234567890
ABCDEGHIJ
XYZUWTPR
MNKSQVU
abcdeg
xyzuw
mnks
opr
st
k

goodbye
Start: sizeof(buffer) = 4
Start: rdstate = 0
9 8 7 6 5 4
1234567890
ABCDEGHIJ
XYZUWTPR
MNKSQVU
abcdeg
xyzuw
mnks
opr
st
k

goodbye
Finish: rdstate = 6

Start: sizeof(buffer) = 512
Start: rdstate = 0
9 8 7 6 5 4
1234567890
ABCDEGHIJ
XYZUWTPR
MNKSQVU
abcdeg
xyzuw
mnks
opr
st
k

goodbye
Finish: rdstate = 6

=================

However, it is much simpler to use an 'std::string' for
reading lines:

/**/ for (std::string line; std::getline(in, line); )
/**/ ...;


Of course.
But I am comparing various methods of copying files.
So I need both the getline() function (std::getline) and the I/O getline method (istream::getline).
--
Alex Vinokur

Jul 23 '05 #7
Alex Vinokur wrote:
[..]
Something like:

====== foo.cpp ======
#include <cassert>
#include <iostream>
#include <fstream>
using namespace std;

void read_using_io_getline (ifstream& infile_io, int buf_size)
{
char buffer[buf_size];
^^^^^^^^^^^^^^^^^^^^^^
This is not C++.
[...]


V
Jul 23 '05 #8

"Victor Bazarov" <v.********@comAcast.net> wrote in message news:Af*******************@newsread1.mlpsca01.us.t o.verio.net...
Alex Vinokur wrote:
[..]
Something like:

====== foo.cpp ======
#include <cassert>
#include <iostream>
#include <fstream>
using namespace std;

void read_using_io_getline (ifstream& infile_io, int buf_size)
{
char buffer[buf_size];
^^^^^^^^^^^^^^^^^^^^^^
This is not C++.


You are right.
I compiled that with g++.
[...]


V

--
Alex Vinokur

Jul 23 '05 #9

"Alex Vinokur" <al****@big-foot.com> wrote in message news:36*************@individual.net...
[snip]
"Dietmar Kuehl" <di***********@yahoo.com> wrote in message news:11**********************@z14g2000cwz.googlegr oups.com...
You can use 'std::istream's member 'gcount()' to find out how many
'char's were read by the last unformatted input operation (and
the fixes in TC1 make it pretty specific what unformatted input is).


Something like:

[snip]

Here is an updated version that handles input whether or not it ends with '\n'.

------ read_using_io_getline ------
#include <cassert>
#include <iostream>
#include <sstream>
#include <fstream>
using namespace std;

string read_using_io_getline (ifstream& infile_io, int buf_size)
{
  char* buffer = new (nothrow) char [buf_size];
  assert (!(buffer == NULL));

  const ios::iostate prev_state (infile_io.rdstate());
  const ios::pos_type prev_pos (infile_io.tellg());

  // --------------------------
  infile_io.clear ();
  infile_io.seekg (0, ios::beg);

  ostringstream oss;
  while (infile_io.getline (buffer, buf_size).gcount())
  {
    assert (!infile_io.bad());
    oss << buffer;

    if (infile_io.fail()) infile_io.clear (~(ios_base::failbit | ~infile_io.rdstate ()));
    else oss << '\n';
  }

  string ret_str(oss.str());
  if (ret_str.size() > 1)
  {
    infile_io.rdbuf()->sungetc ();
    if (infile_io.rdbuf()->sgetc() != '\n') ret_str.erase(ret_str.size() - 1);
  }

  // ---------------------------
  infile_io.clear(prev_state);
  infile_io.seekg(prev_pos, ios::beg);

  assert (prev_state == infile_io.rdstate());
  assert (prev_pos == infile_io.tellg());
  // ---------------------------

  return ret_str;
}

int main(int argc, char** argv)
{
  cout << "YOUR COMMAND LINE: ";
  for (int i = 0; i < argc; i++)
  {
    cout << argv[i] << " ";
  }
  cout << endl;
  cout << endl;

  for (int i = 1; i < argc; i++)
  {
    cout << endl;
    cout << "--- File-" << i << " : " << argv[i] << " ---" << endl;
    cout << endl;
    ifstream infile (argv[i]);
    assert (infile.is_open());

    cout << "Source data file: " << endl;
    cout << "<" << infile.rdbuf() << ">" << endl;
    cout << endl;

    int cur_buf_size;

    cur_buf_size = 4;
    cout << endl;
    cout << "Start : buf_size = " << cur_buf_size << endl;
    cout << "<" << read_using_io_getline (infile, cur_buf_size) << ">" << endl;
    cout << "Finish: buf_size = " << cur_buf_size << endl << endl;

    cur_buf_size = 256;
    cout << endl;
    cout << "Start : buf_size = " << cur_buf_size << endl;
    cout << "<" << read_using_io_getline (infile, cur_buf_size) << ">" << endl;
    cout << "Finish: buf_size = " << cur_buf_size << endl << endl;

    infile.close();
    assert (!infile.is_open());

    cout << endl;
  }

  return 0;
}
-----------------------------------
--
Alex Vinokur

Jul 23 '05 #10

"Alex Vinokur" <al****@big-foot.com> wrote in message news:36*************@individual.net...
[snip]
Here is an updated version that handles input whether or not it ends with '\n'.

------ read_using_io_getline ------
[snip]
string ret_str(oss.str());
-------------------------------
if (ret_str.size() > 1) // Should be
if (!ret_str.empty())
-------------------------------
{
infile_io.rdbuf()->sungetc ();
if (infile_io.rdbuf()->sgetc() != '\n') ret_str.erase(ret_str.size() - 1);
}
[snip]
delete[] buffer;
return ret_str;


[snip]
--
Alex Vinokur

Jul 23 '05 #11
