
What's wrong with std::ifstream::read()?

I know of at least one person who believes std::ifstream::read() and
std::ofstream::write() are "mistakes". They seem to do the job I want
done. What's wrong with them? This is the code I currently have as a test
for using std::ifstream::read(). Is there anything wrong with the way I'm
getting the file?

#include <vector>
#include <string>
#include <iomanip>
#include <fstream>
#include <iostream>

// Print each byte in [start, stop) in a field of width 2, followed by a space.
template<typename Iterator>
std::ostream& printHexLine(Iterator start, Iterator stop, std::ostream& out)
{
  while (start < stop)
    out << std::setw(2)
        << static_cast<unsigned int>(static_cast<unsigned char>(*start++))
        << " ";
  return out;
}

// Dump a random-access container, 16 bytes per line.
template<typename Container>
std::ostream& print(const Container& data, std::ostream& out) {
  typedef typename Container::const_iterator c_itr;

  // A second stream on the same buffer, so out's own format flags are untouched.
  std::ostream hexout(out.rdbuf());
  hexout.setf(std::ios::hex, std::ios::basefield);
  hexout.fill('0');

  c_itr from(data.begin());
  c_itr dataEnd(from + data.size());
  c_itr end(dataEnd - (data.size() % 16));

  for (c_itr start = from; start < end; start += 16)
    printHexLine(start, start + 16, hexout) << "\n";

  printHexLine(end, dataEnd, hexout) << "\n";
  return out;
}

int main (int argc, char* argv[]) {
  std::string filename("fileio");
  std::ifstream file(filename.c_str(),
                     std::ios::in | std::ios::binary | std::ios::ate);
  if (!file) {
    std::cerr << "cannot open " << filename << "\n";
    return 1;
  }
  std::vector<char> vbuf(file.tellg());
  file.seekg(0, std::ios::beg);
  // &vbuf[0] is only valid for a non-empty vector.
  if (!vbuf.empty()) file.read(&vbuf[0], vbuf.size());
  print(vbuf, std::cout);
  return 0;
}

--
If our hypothesis is about anything and not about some one or more
particular things, then our deductions constitute mathematics. Thus
mathematics may be defined as the subject in which we never know what we
are talking about, nor whether what we are saying is true.-Bertrand Russell
Aug 4 '05 #1
12 Replies



Steven T. Hatton wrote:
I know of at least one person who believes std::ifstream::read() and
std::ofstream::write() are "mistakes". They seem to do the job I want
done. What's wrong with them? This is the code I currently have as a test
for using std::ifstream::read(). Is there anything wrong with the way I'm
getting the file?
[]
std::vector<char>vbuf(file.tellg());
file.seekg(0, std::ios::beg);
file.read(&vbuf[0], vbuf.size());


You don't need read here. A simple:

std::vector<char> vbuf((std::istreambuf_iterator<char>(file)),
                       (std::istreambuf_iterator<char>()));

Would suffice. Although you might argue that your code does not involve
vector reallocations.
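
The iterator-based construction suggested above can be wrapped in a small function. This is only a sketch: the helper name slurpWithIterators is hypothetical and does not come from the thread, and an unopenable file simply yields an empty vector here.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <vector>

// Hypothetical helper (name not from the thread): read a whole file into a
// vector<char> using only stream-buffer iterators, with no call to read().
std::vector<char> slurpWithIterators(const char* path) {
    assert(path != 0);
    std::ifstream file(path, std::ios::in | std::ios::binary);
    // The extra parentheses around the first argument stop the compiler from
    // parsing this line as a function declaration.
    std::vector<char> data((std::istreambuf_iterator<char>(file)),
                           std::istreambuf_iterator<char>());
    return data;  // empty if the file could not be opened
}
```

A call such as `std::vector<char> vbuf = slurpWithIterators("fileio");` would stand in for the three quoted lines. Because the vector cannot know the final size in advance, it may reallocate while it grows, which is the caveat mentioned above.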

Aug 4 '05 #2

Maxim Yegorushkin wrote:

Steven T. Hatton wrote:
I know of at least one person who believes std::ifstream::read() and
std::ofstream::write() are "mistakes". They seem to do the job I want
done. What's wrong with them? This is the code I currently have as a test
for using std::ifstream::read(). Is there anything wrong with the way I'm
getting the file?


[]
std::vector<char>vbuf(file.tellg());
file.seekg(0, std::ios::beg);
file.read(&vbuf[0], vbuf.size());


You don't need read here. A simple:

std::vector<char> vbuf((std::istreambuf_iterator<char>(file)),
                       (std::istreambuf_iterator<char>()));

Would suffice. Although you might argue that your code does not involve
vector reallocations.


I can reserve space in the vector, and still use the iterator. I don't know
what the exact implications of opening with std::ios_base::ate are. Does
that force the OS to try loading the entire file into memory? I know the
language doesn't specify, and it may well be OS dependent. It seemed to me
the iterator is probably doing a lot of work that really didn't need to be
done. What I really want to do is steal the buffer from the ifstream
rather than copy it.
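
Reserving space first and then copying through the iterators might look roughly like the following. It is a sketch with a hypothetical helper name (slurpWithReserve). As far as the standard describes it, std::ios::ate only seeks to the end of the stream right after opening, so by itself it should not cause the file to be loaded into memory.

```cpp
#include <algorithm>
#include <cassert>
#include <fstream>
#include <iterator>
#include <vector>

// Hypothetical helper (name not from the thread): size the vector once with
// reserve(), then fill it through istreambuf_iterator.
std::vector<char> slurpWithReserve(const char* path) {
    assert(path != 0);
    std::vector<char> data;
    std::ifstream file(path, std::ios::in | std::ios::binary | std::ios::ate);
    if (!file) return data;

    // Opened with ate, so the current position is the end of the file.
    const std::streamoff size = file.tellg();
    file.seekg(0, std::ios::beg);
    if (size > 0)
        data.reserve(static_cast<std::vector<char>::size_type>(size));

    // With capacity reserved, push_back through back_inserter should not
    // need to reallocate.
    std::copy(std::istreambuf_iterator<char>(file),
              std::istreambuf_iterator<char>(),
              std::back_inserter(data));
    return data;
}
```

This still copies every byte out of the stream buffer one at a time. It avoids the reallocations but does not "steal" the buffer.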

I don't know of any performance evaluations comparing the different
techniques for reading files, but I do know on one job I did, we had the
largest personnel record system in the world, all in the form of scanned
images. Wasted copying was not a great idea in that context.

Aug 4 '05 #3


Steven T. Hatton wrote:

[]
I can reserve space in the vector, and still use the iterator. I don't know
what the exact implications of opening with std::ios_base::ate are. Does
that force the OS to try loading the entire file into memory? I know the
language doesn't specify, and it may well be OS dependent. It seemed to me
the iterator is probably doing a lot of work that really didn't need to be
done. What I really want to do is steal the buffer from the ifstream
rather than copy it.


I use memory mapped files for that. A POSIX/win32 implementation is
trivial.
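
For the POSIX side, a read-only mapping could be sketched as below. The class name MappedFile and its interface are hypothetical. The post does not show its implementation, so this is one possible shape rather than the author's code, and a win32 version would need different calls.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical RAII wrapper around a read-only POSIX memory mapping.
class MappedFile {
public:
    explicit MappedFile(const char* path) : data_(0), size_(0) {
        assert(path != 0);
        const int fd = ::open(path, O_RDONLY);
        if (fd == -1) return;
        struct stat st;
        // An empty file cannot be mapped, so it is left as size 0.
        if (::fstat(fd, &st) == 0 && st.st_size > 0) {
            void* p = ::mmap(0, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
            if (p != MAP_FAILED) {
                data_ = static_cast<const char*>(p);
                size_ = static_cast<std::size_t>(st.st_size);
            }
        }
        ::close(fd);  // the mapping remains valid after the descriptor is closed
    }
    ~MappedFile() {
        if (data_) ::munmap(const_cast<char*>(data_), size_);
    }
    const char* begin() const { return data_; }
    const char* end() const { return data_ + size_; }
    std::size_t size() const { return size_; }

private:
    MappedFile(const MappedFile&);             // non-copyable
    MappedFile& operator=(const MappedFile&);
    const char* data_;
    std::size_t size_;
};
```

The begin()/end() pair can be handed to any function that takes a pair of random-access iterators. No bytes are copied into a user-side buffer. The pages are supplied by the operating system on access.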

Aug 4 '05 #4


"Steven T. Hatton" <ch********@germania.sup> wrote in message news:6O********************@speakeasy.net...
[snip]
I don't know of any performance evaluations comparing the different
techniques for reading files

[snip]

Look at

"Comparative Performance Measurement: Reading file into string"
http://groups.google.com/group/perfo...0fae8e5e065030
"Comparative Performance Measurement: Copying files"
http://groups.google.com/group/perfo...74465da4c4e9bb
--
Alex Vinokur
email: alex DOT vinokur AT gmail DOT com
http://mathforum.org/library/view/10978.html
http://sourceforge.net/users/alexvn

Aug 4 '05 #5

Alex Vinokur wrote:

"Steven T. Hatton" <ch********@germania.sup> wrote in message
news:6O********************@speakeasy.net...
[snip]
I don't know of any performance evaluations comparing the different
techniques for reading files

[snip]

Look at

"Comparative Performance Measurement: Reading file into string"
http://groups.google.com/group/perfo...0fae8e5e065030
"Comparative Performance Measurement: Copying files"
http://groups.google.com/group/perfo...74465da4c4e9bb

I don't see where you ran this one:

### CPP-23: std::vector and istream::read()
------------------------------------------------
vector<char> v (no_of_file_bytes);
ifs.read(&v[0], no_of_file_bytes);
ret_str = (v.empty() ? string() : string (v.begin(), v.end()));
------------------------------------------------

but it's probably safe to assume it would very closely match:

### CPP-24: std::string and istream::read()
------------------------------------------------
string tmp (no_of_file_bytes, '0');
ifs.read(&tmp[0], no_of_file_bytes);
ret_str = tmp;
------------------------------------------------

Which outperformed streambuf iterators by between 10 and 30 times, and was
between 400 and 500 times faster than using stream iterators. That seems
to confirm what I suspected in both cases.
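
A comparison of this kind can be reproduced locally with a small harness along these lines. It is a rough sketch and not the Perfometer code: the function names are hypothetical, timing uses std::clock, and the ratios will depend on compiler, library, optimization level and file size.

```cpp
#include <cassert>
#include <ctime>
#include <fstream>
#include <iterator>
#include <string>

// Variant modelled on CPP-24: size the string first, then one read() call.
std::string viaRead(const char* path) {
    std::ifstream ifs(path, std::ios::in | std::ios::binary | std::ios::ate);
    if (!ifs) return std::string();
    const std::streamoff n = ifs.tellg();
    if (n <= 0) return std::string();
    ifs.seekg(0, std::ios::beg);
    std::string tmp(static_cast<std::string::size_type>(n), '\0');
    ifs.read(&tmp[0], n);
    // Keep only what was actually read, in case the read came up short.
    tmp.resize(static_cast<std::string::size_type>(ifs.gcount()));
    return tmp;
}

// Variant using stream-buffer iterators.
std::string viaStreambufIterator(const char* path) {
    std::ifstream ifs(path, std::ios::in | std::ios::binary);
    return std::string((std::istreambuf_iterator<char>(ifs)),
                       std::istreambuf_iterator<char>());
}

// CPU seconds spent calling f(path) reps times.
double secondsFor(std::string (*f)(const char*), const char* path, int reps) {
    const std::clock_t start = std::clock();
    std::string::size_type total = 0;
    for (int i = 0; i < reps; ++i) total += f(path).size();
    const std::clock_t stop = std::clock();
    // If the file did not change, every run returned the same number of bytes.
    assert(reps <= 0 || total % reps == 0);
    return static_cast<double>(stop - start) / CLOCKS_PER_SEC;
}
```

Calling `secondsFor(viaRead, "fileio", 1000)` and `secondsFor(viaStreambufIterator, "fileio", 1000)` on the same file gives two numbers to compare. It is worth checking first that both variants return identical strings.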

Aug 4 '05 #6


"Steven T. Hatton" <ch********@germania.sup> wrote in message
news:ce********************@speakeasy.net...
Alex Vinokur wrote:

"Steven T. Hatton" <ch********@germania.sup> wrote in message
news:6O********************@speakeasy.net...
[snip]
I don't know of any performance evaluations comparing the different
techniques for reading files

[snip]

Look at

"Comparative Performance Measurement: Reading file into string"
http://groups.google.com/group/perfo...0fae8e5e065030
"Comparative Performance Measurement: Copying files"
http://groups.google.com/group/perfo...74465da4c4e9bb

I don't see where you ran this one:

### CPP-23: std::vector and istream::read()
------------------------------------------------
vector<char> v (no_of_file_bytes);
ifs.read(&v[0], no_of_file_bytes);
ret_str = (v.empty() ? string() : string (v.begin(), v.end()));
------------------------------------------------

but it's probably safe to assume it would very closely match:

### CPP-24: std::string and istream::read()
------------------------------------------------
string tmp (no_of_file_bytes, '0');
ifs.read(&tmp[0], no_of_file_bytes);
ret_str = tmp;
------------------------------------------------

[snip]

File file2str-1-0.cpp from
http://groups-beta.google.com/group/...4798865afae595

------------- file2str-1-0.cpp : Fragment -------------
Line#

2143
2144 MEASURE_WITH_NO_ARG (CPP_23_txt__vector__cpp_read);
2145 CHECK_TXT_RETURNED_STRING;
2146 MEASURE_WITH_NO_ARG (CPP_23_bin__vector__cpp_read);
2147 CHECK_BIN_RETURNED_STRING;
2148
2149 MEASURE_WITH_NO_ARG (CPP_24_txt__string__cpp_read);
2150 CHECK_TXT_RETURNED_STRING;
2151 MEASURE_WITH_NO_ARG (CPP_24_bin__string__cpp_read);
2152 CHECK_BIN_RETURNED_STRING;
2153

-------------------------------------------------------
--
Alex Vinokur
email: alex DOT vinokur AT gmail DOT com
http://mathforum.org/library/view/10978.html
http://sourceforge.net/users/alexvn

Aug 5 '05 #7

Alex Vinokur wrote:

[...]
File file2str-1-0.cpp from
http://groups-beta.google.com/group/...4798865afae595

------------- file2str-1-0.cpp : Fragment -------------
Line#

2143
2144 MEASURE_WITH_NO_ARG (CPP_23_txt__vector__cpp_read);
2145 CHECK_TXT_RETURNED_STRING;
2146 MEASURE_WITH_NO_ARG (CPP_23_bin__vector__cpp_read);
2147 CHECK_BIN_RETURNED_STRING;
2148
2149 MEASURE_WITH_NO_ARG (CPP_24_txt__string__cpp_read);
2150 CHECK_TXT_RETURNED_STRING;
2151 MEASURE_WITH_NO_ARG (CPP_24_bin__string__cpp_read);
2152 CHECK_BIN_RETURNED_STRING;
2153

-------------------------------------------------------


I'm getting this on SuSE 9.3:

$ g++ -otext file2str-1-0.cpp
file2str-1-0.cpp: In function `size_t get_filesize_via_lseek(const char*, bool)':
file2str-1-0.cpp:294: error: `O_BINARY' undeclared (first use this function)
file2str-1-0.cpp:294: error: (Each undeclared identifier is reported only once for each function it appears in.)

I tried hacking around it by replacing some of the code, but it looked like
the hole was getting deeper. I'm not sure if, or where the macro is
currently defined. From googling around, it looks like it used to come
from <fcntl.h>.
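
POSIX itself does not specify an O_BINARY flag, since it makes no distinction between text and binary files. A common workaround is to define the flag as 0 where the platform's headers lack it. The sketch below shows that idiom with a hypothetical helper, fileSizeViaLseek. It is not the function from file2str-1-0.cpp, whose body is not shown in the thread.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Where the headers do not provide O_BINARY, OR-ing in 0 is a no-op.
#ifndef O_BINARY
#define O_BINARY 0
#endif

// Hypothetical helper: file size in bytes via lseek(), or -1 on failure.
long fileSizeViaLseek(const char* path) {
    assert(path != 0);
    const int fd = ::open(path, O_RDONLY | O_BINARY);
    if (fd == -1) return -1;
    const off_t size = ::lseek(fd, 0, SEEK_END);
    ::close(fd);
    return static_cast<long>(size);
}
```

On platforms that do define O_BINARY, the #ifndef leaves the real value in place, so the same source compiles in both environments.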
Aug 5 '05 #8


Steven T. Hatton wrote:
Alex Vinokur wrote:

[...]
File file2str-1-0.cpp from
http://groups-beta.google.com/group/...4798865afae595

I'm getting this on SuSE 9.3:
[snip]

$ g++ -otext file2str-1-0.cpp
file2str-1-0.cpp: In function `size_t get_filesize_via_lseek(const char*, bool)':
file2str-1-0.cpp:294: error: `O_BINARY' undeclared (first use this function)
file2str-1-0.cpp:294: error: (Each undeclared identifier is reported only once for each function it appears in.)

I tried hacking around it by replacing some of the code, but it looked like
the hole was getting deeper. I'm not sure if, or where the macro is
currently defined. From googling around, it looks like it used to come
from <fcntl.h>.

[snip]

O_BINARY is in fcntl.h (on UNIX).

Alex Vinokur
email: alex DOT vinokur AT gmail DOT com
http://mathforum.org/library/view/10978.html
http://sourceforge.net/users/alexvn

Aug 6 '05 #9

Alex Vinokur wrote:

Steven T. Hatton wrote:
Alex Vinokur wrote:

[...]
> File file2str-1-0.cpp from
> http://groups-beta.google.com/group/...4798865afae595


I'm getting this on SuSE 9.3:


[snip]

$ g++ -otext file2str-1-0.cpp
file2str-1-0.cpp: In function `size_t get_filesize_via_lseek(const char*, bool)':
file2str-1-0.cpp:294: error: `O_BINARY' undeclared (first use this function)
file2str-1-0.cpp:294: error: (Each undeclared identifier is reported only once for each function it appears in.)

I tried hacking around it by replacing some of the code, but it looked
like
the hole was getting deeper. I'm not sure if, or where the macro is
currently defined. From googling around, it looks like it used to come
from <fcntl.h>.

[snip]

O_BINARY is in fcntl.h (on UNIX).


I'm not sure that is a current requirement for UNIX. SuSE's pretty good at
getting the standards right. It's not there on my box, but there is a
definition of O_BINARY in <kpathsea/c-fopen.h>. When I add that file to
your program, it compiles, but when I run it, I get the following error:

================================================
Simple C/C++ Perfometer : Reading file to string
Version F2S-1.0
================================================
-------------
GNU gcc 3.3.5
-------------

YOUR COMMAND LINE : test 1024 1 1

### File size : 1024
### Number of runs : 1
### Number of tests : 1
### Number of repetitions : 1
### CLOCKS_PER_SEC : 1000000

Run-1 of 1 : Started
User defined file size = 1024
Txt input file size = 1030
Via fseek&ftell file size = 1024
test: file2str-1-0.cpp:1966: void measure(long unsigned int): Assertion
`infile_size2_txt == get_filesize_via_fseek_ftell ("z-txt.in", true)'
failed.
aborted
Aug 6 '05 #10

Steven T. Hatton wrote:
Alex Vinokur wrote:

Steven T. Hatton wrote:
Alex Vinokur wrote:

[...]
> File file2str-1-0.cpp from
> http://groups-beta.google.com/group/...4798865afae595


I'm getting this on SuSE 9.3:

[snip]

Hi Steven,

I think our discussion is drifting off topic for comp.lang.c++,
and it is worth continuing it in comp.lang.c++.perfometer.
So my reply has been sent to comp.lang.c++.perfometer and can be seen at
http://groups.google.com/group/perfo...8aa965fe5be816

-----
Alex Vinokur
email: alex DOT vinokur AT gmail DOT com
http://mathforum.org/library/view/10978.html
http://sourceforge.net/users/alexvn

Aug 7 '05 #11

Alex Vinokur wrote:
Steven T. Hatton wrote:
Alex Vinokur wrote:
>
> Steven T. Hatton wrote:
>> Alex Vinokur wrote:
>>
>> [...]
>> > File file2str-1-0.cpp from
>> > http://groups-beta.google.com/group/...4798865afae595
>
>>
>> I'm getting this on SuSE 9.3:

[snip]

Hi Steven,

I think our discussion is drifting off topic for comp.lang.c++,
and it is worth continuing it in comp.lang.c++.perfometer.
So my reply has been sent to comp.lang.c++.perfometer and can be seen at
http://groups.google.com/group/perfo...8aa965fe5be816

-----
Alex Vinokur
email: alex DOT vinokur AT gmail DOT com
http://mathforum.org/library/view/10978.html
http://sourceforge.net/users/alexvn


For some reason that newsgroup is not on my server.

FWIW:

--- get_filesize_via_fseek_ftell
Created in TXT mode, read in TXT mode: 1
Created in BIN mode, read in BIN mode: 1
Created in TXT mode, read in BIN mode: 1
Created in BIN mode, read in TXT mode: 1

--- get_filesize_via_lseek
Created in TXT mode, read in TXT mode: 1
Created in BIN mode, read in BIN mode: 1
Created in TXT mode, read in BIN mode: 1
Created in BIN mode, read in TXT mode: 1

--- get_filesize_via_fstat
Created in TXT mode, read in TXT mode: 1
Created in BIN mode, read in BIN mode: 1
Created in TXT mode, read in BIN mode: 1
Created in BIN mode, read in TXT mode: 1

--- get_filesize_via_stat
Created in TXT mode : 1
Created in BIN mode : 1

--- get_filesize_via_seekg_tellg
Created in TXT mode, read in TXT mode: 1
Created in BIN mode, read in BIN mode: 1
Created in TXT mode, read in BIN mode: 1
Created in BIN mode, read in TXT mode: 1

--- get_filesize_via_distance
Created in TXT mode, read in TXT mode: 1
Created in BIN mode, read in BIN mode: 1
Created in TXT mode, read in BIN mode: 1
Created in BIN mode, read in TXT mode: 1

--- get_filesize_via_rdbuf_pubseekoff
Created in TXT mode, read in TXT mode: 1
Created in BIN mode, read in BIN mode: 1
Created in TXT mode, read in BIN mode: 1
Created in BIN mode, read in TXT mode: 1
Aug 7 '05 #12


Steven T. Hatton wrote:
Alex Vinokur wrote:
Steven T. Hatton wrote:
Alex Vinokur wrote:

>
> Steven T. Hatton wrote:
>> Alex Vinokur wrote:
>>
>> [...]
>> > File file2str-1-0.cpp from
>> > http://groups-beta.google.com/group/...4798865afae595
>
>>
>> I'm getting this on SuSE 9.3: [snip]

Hi Steven,

I think our discussion is drifting off topic for comp.lang.c++,
and it is worth continuing it in comp.lang.c++.perfometer.
So my reply has been sent to comp.lang.c++.perfometer and can be seen at
http://groups.google.com/group/perfo...8aa965fe5be816

[snip]
> For some reason that newsgroup is not on my server.

comp.lang.c++.perfometer is not carried on NNTP servers.
It is accessed through the web interface:
http://groups-beta.google.com/group/perfo

FWIW:

--- get_filesize_via_fseek_ftell
Created in TXT mode, read in TXT mode: 1
Created in BIN mode, read in BIN mode: 1
Created in TXT mode, read in BIN mode: 1
Created in BIN mode, read in TXT mode: 1

--- get_filesize_via_lseek
Created in TXT mode, read in TXT mode: 1
Created in BIN mode, read in BIN mode: 1
Created in TXT mode, read in BIN mode: 1
Created in BIN mode, read in TXT mode: 1

--- get_filesize_via_fstat
Created in TXT mode, read in TXT mode: 1
Created in BIN mode, read in BIN mode: 1
Created in TXT mode, read in BIN mode: 1
Created in BIN mode, read in TXT mode: 1

--- get_filesize_via_stat
Created in TXT mode : 1
Created in BIN mode : 1

--- get_filesize_via_seekg_tellg
Created in TXT mode, read in TXT mode: 1
Created in BIN mode, read in BIN mode: 1
Created in TXT mode, read in BIN mode: 1
Created in BIN mode, read in TXT mode: 1

--- get_filesize_via_distance
Created in TXT mode, read in TXT mode: 1
Created in BIN mode, read in BIN mode: 1
Created in TXT mode, read in BIN mode: 1
Created in BIN mode, read in TXT mode: 1

--- get_filesize_via_rdbuf_pubseekoff
Created in TXT mode, read in TXT mode: 1
Created in BIN mode, read in BIN mode: 1
Created in TXT mode, read in BIN mode: 1
Created in BIN mode, read in TXT mode: 1


Here is the output of the same program ("Getting file size" from
http://groups.google.com/group/alt.s...464ce8b75f8417 ),
built with g++ 3.3.3 on Cygwin and Windows 2000:

--- get_filesize_via_fseek_ftell
Created in TXT mode, read in TXT mode: 2
Created in BIN mode, read in BIN mode: 1
Created in TXT mode, read in BIN mode: 2
Created in BIN mode, read in TXT mode: 1

--- get_filesize_via_lseek
Created in TXT mode, read in TXT mode: 2
Created in BIN mode, read in BIN mode: 1
Created in TXT mode, read in BIN mode: 2
Created in BIN mode, read in TXT mode: 1

--- get_filesize_via_fstat
Created in TXT mode, read in TXT mode: 2
Created in BIN mode, read in BIN mode: 1
Created in TXT mode, read in BIN mode: 2
Created in BIN mode, read in TXT mode: 1

--- get_filesize_via_stat
Created in TXT mode : 2
Created in BIN mode : 1

--- get_filesize_via_seekg_tellg
Created in TXT mode, read in TXT mode: 2
Created in BIN mode, read in BIN mode: 1
Created in TXT mode, read in BIN mode: 2
Created in BIN mode, read in TXT mode: 1

--- get_filesize_via_distance
Created in TXT mode, read in TXT mode: 1
Created in BIN mode, read in BIN mode: 1
Created in TXT mode, read in BIN mode: 2
Created in BIN mode, read in TXT mode: 1

--- get_filesize_via_rdbuf_pubseekoff
Created in TXT mode, read in TXT mode: 2
Created in BIN mode, read in BIN mode: 1
Created in TXT mode, read in BIN mode: 2
Created in BIN mode, read in TXT mode: 1

So we can see that different operating systems and hardware produce
different file sizes for text mode.
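
The text-mode difference can be reproduced with a few lines. The sketch below assumes a test file holding a single newline, which would be consistent with the 1 and 2 in the listings but is not stated in the thread. The helper names are hypothetical.

```cpp
#include <cassert>
#include <fstream>

// Hypothetical helper: on-disk size in bytes, measured in binary mode.
long sizeOnDisk(const char* path) {
    assert(path != 0);
    std::ifstream in(path, std::ios::in | std::ios::binary | std::ios::ate);
    if (!in) return -1;
    return static_cast<long>(static_cast<std::streamoff>(in.tellg()));
}

// Write exactly one '\n' in text or binary mode and report the size on disk.
long sizeOfSingleNewline(const char* path, bool binary) {
    std::ios::openmode mode = std::ios::out | std::ios::trunc;
    if (binary) mode |= std::ios::binary;
    {
        std::ofstream out(path, mode);
        out.put('\n');
    }  // the stream is flushed and closed here
    return sizeOnDisk(path);
}
```

In binary mode the result should be 1 on any platform. In text mode it depends on how the implementation translates the newline: a library that writes "\r\n" puts two bytes on disk, which would match the Cygwin column, while a typical Linux system writes one.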

I have updated "Simple C/C++ Perfometer: Reading file to string
(Versions 1.x)".
The latest version (F2S-1.0.6) is at
http://groups-beta.google.com/group/...a9b6f91239c909

Alex Vinokur
email: alex DOT vinokur AT gmail DOT com
http://mathforum.org/library/view/10978.html
http://sourceforge.net/users/alexvn

Aug 8 '05 #13
