q: fast file IO

Hi. I am trying to read in a large number of floats from an ASCII
file.

I am currently using i/o streams as such:

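vector<float> values;
float val;
int width, height;
ifstream in(fn);        // fn: path to the ASCII data file
in >> width >> height;  // same order as the file header: Width Height

for (int i = 0; i < height; i++) {
    for (int j = 0; j < width; j++) {
        in >> val;
        values.push_back(val);
    }
}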

where the data is:

Width Height
value value value value value value...

However, this turns out to be really slow when I read about a
million points. Does anyone know how I could read this data in much
faster? I know it is possible because several applications already
do it. Maybe using C FILEs is faster than streams? I'm not sure.

Thanks!

Oliver

Jul 23 '05 #1
You could try this:

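#include <algorithm>
#include <fstream>
#include <iterator>
#include <vector>

using namespace std;

int main() {
    const char* filename = "float_data";  // placeholder path
    ifstream file( filename );
    int width = 0, height = 0;

    file >> width >> height;

    // reserve() rather than sizing the vector: back_inserter appends,
    // so a pre-sized vector would end up with width * height leading zeros.
    vector<float> values;
    values.reserve( width * height );

    copy( istream_iterator<float>( file ), istream_iterator<float>(),
          back_inserter( values ) );

    return 0;
}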

This was created with a little help from
http://www.sgi.com/tech/stl/istream_iterator.html.

Mike
Jul 23 '05 #2

"Mike Austin" <no@spam.com> wrote in message
news:zi********************@bgtnsc04-news.ops.worldnet.att.net...
You could try this:

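#include <algorithm>
#include <fstream>
#include <iterator>
#include <vector>

using namespace std;

int main() {
    const char* filename = "float_data";  // placeholder path
    ifstream file( filename );
    int width = 0, height = 0;

    file >> width >> height;

    // reserve() rather than sizing the vector: back_inserter appends.
    vector<float> values;
    values.reserve( width * height );

    copy( istream_iterator<float>( file ), istream_iterator<float>(),
          back_inserter( values ) );

    return 0;
}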

This is unlikely to improve performance, since 'istream_iterator'
uses operator>>(), which 'laniik' was already using. 'laniik's idea
of using a FILE* instead of an ifstream might or might not be
faster; the only way to know is to measure. Another possibility
is to use some nonstandard platform-specific method which might
exploit a more intimate connection with the OS/file system.
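
A minimal sketch of what "measure" could look like, assuming the file layout from the question (a "width height" header followed by the values). The names read_with_stdio and time_ms are hypothetical, and a single wall-clock run is only a rough indication; repeat the runs and compare against the ifstream version on your own data before drawing conclusions.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper: reads "width height" and then width*height floats
// through a C FILE* with fscanf instead of an ifstream.
std::vector<float> read_with_stdio(const char* path) {
    std::vector<float> values;
    std::FILE* f = std::fopen(path, "r");
    if (!f) return values;

    int width = 0, height = 0;
    if (std::fscanf(f, "%d %d", &width, &height) == 2 && width > 0 && height > 0) {
        const std::size_t expected =
            static_cast<std::size_t>(width) * static_cast<std::size_t>(height);
        values.reserve(expected);
        float val;
        while (values.size() < expected && std::fscanf(f, "%f", &val) == 1) {
            values.push_back(val);
        }
    }
    std::fclose(f);
    return values;
}

// Times one call of any reader that takes a path and returns the values.
// Returns elapsed wall-clock milliseconds and reports how many values were read.
template <typename Reader>
double time_ms(Reader reader, const char* path, std::size_t* count) {
    assert(count != nullptr);
    const auto start = std::chrono::steady_clock::now();
    const std::vector<float> values = reader(path);
    const auto stop = std::chrono::steady_clock::now();
    *count = values.size();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling time_ms(read_with_stdio, "float_data", &n) and the same for a stream-based reader gives two numbers to compare; the file is likely cached by the OS after the first run, so the first timing should be discarded.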

-Mike
Jul 23 '05 #3
It is likely that the slowness is mainly caused by the repeated
reallocations of your vector object, rather than file operations. You
may try reserve() to pre-allocate the amount of memory expected.
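
A minimal sketch of the reserve() idea, assuming the header gives the width and height first as described in the question. The function name read_grid is hypothetical. How much this helps is worth measuring: vector growth is amortized, so the text-to-float conversion may still dominate.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads "width height" followed by width*height floats.
// Returns an empty vector if the header is missing or not positive.
std::vector<float> read_grid(const std::string& path) {
    std::vector<float> values;
    std::ifstream in(path.c_str());
    int width = 0, height = 0;
    if (!(in >> width >> height) || width <= 0 || height <= 0) {
        return values;
    }

    // One allocation up front instead of repeated growth during push_back.
    const std::size_t expected =
        static_cast<std::size_t>(width) * static_cast<std::size_t>(height);
    values.reserve(expected);

    float val;
    while (values.size() < expected && (in >> val)) {
        values.push_back(val);
    }
    assert(values.size() <= expected);
    return values;
}
```

Because the result is still a std::vector, random access by index (values[row * width + col]) keeps working unchanged.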

Jul 23 '05 #4
Zenith wrote:
It is likely that the slowness is mainly caused by the repeated
reallocations of your vector object, rather than file operations. You
may try reserve() to pre-allocate the amount of memory expected.


You can also try a std::list. It won't double in size as a vector tends
to do.

Alvin

--
Why must I click 'Start' in order to turn off my computer?
Jul 23 '05 #5
Mike Wahler wrote:
This is unlikely to improve performance, since 'istream_iterator'
uses operator>>(), which 'laniik' was already using. 'laniik's idea
of using a FILE* instead of an ifstream might or might not be
faster; the only way to know is to measure. Another possibility
is to use some nonstandard platform-specific method which might
exploit a more intimate connection with the OS/file system.

-Mike


Let's try it another way then. If each float string is a specific
length, we can take advantage of that:

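#include <cstdlib>
#include <fstream>
#include <iostream>
#include <iterator>
#include <vector>

using namespace std;

// get() extracts at most readLength - 1 characters and stops at a newline,
// so this assumes every value sits in a field of exactly readLength - 1
// characters (separator included), all on the line after the header.
const int readLength = 9;

int main( int argc, char* argv[] ) {
    ifstream file( "float_data" );
    int width = 0, height = 0, size;
    char buffer[256];

    file >> width >> height;
    file.ignore( 256, '\n' );  // skip the rest of the header line
    size = width * height;
    vector<float> values( size );

    for( int i = 0; i < size && file.good(); ++i ) {
        file.get( buffer, readLength );
        values[i] = atof( buffer );
    }

    // Print the values back out as a quick check.
    vector<float>::iterator i;
    for( i = values.begin(); i != values.end(); ++i ) {
        cout << *i << " " << flush;
    }

    return 0;
}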

Mike
Jul 23 '05 #6
I don't know this for sure, but I have heard that the atof function is
fairly slow, and that I could speed up reading by writing my own
atof function. So maybe using .get() with my own atof will be faster
than >>.

Also, does anyone know how >> turns what it reads in into a float? I
wonder how fast that part is.
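
For context, operator>> for a float goes through the stream's locale (the std::num_get facet) before the digits are converted, which may be part of its cost. One way to sidestep that, sketched below under the assumption that the whole file fits in memory, is to load the file into a buffer and convert with strtof directly. The name read_via_buffer is hypothetical, and whether this is actually faster on a given platform needs to be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: loads the whole file into memory, then parses the
// "width height" header and the values with strtol/strtof.
std::vector<float> read_via_buffer(const char* path) {
    std::vector<float> values;
    std::ifstream in(path, std::ios::binary);
    if (!in) return values;

    // Slurp the file; c_str() guarantees a terminating '\0' for strtof.
    const std::string text((std::istreambuf_iterator<char>(in)),
                           std::istreambuf_iterator<char>());

    const char* p = text.c_str();
    char* end = 0;
    const long width = std::strtol(p, &end, 10);
    if (end == p) return values;
    p = end;
    const long height = std::strtol(p, &end, 10);
    if (end == p || width <= 0 || height <= 0) return values;
    p = end;

    const std::size_t expected =
        static_cast<std::size_t>(width) * static_cast<std::size_t>(height);
    values.reserve(expected);
    while (values.size() < expected) {
        const float v = std::strtof(p, &end);  // skips leading whitespace itself
        if (end == p) break;                   // nothing more to parse
        values.push_back(v);
        p = end;
    }
    assert(values.size() <= expected);
    return values;
}
```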

I'll try the memory allocation idea; that makes sense.

The list idea might speed things up, but I need random access to
this array.

thanks everyone!

Jul 23 '05 #7
laniik wrote:
I don't know this for sure, but I have heard that the atof function is
fairly slow, and that I could speed up reading by writing my own
atof function. So maybe using .get() with my own atof will be faster
than >>.

Also, does anyone know how >> turns what it reads in into a float? I
wonder how fast that part is.

I'll try the memory allocation idea; that makes sense.

The list idea might speed things up, but I need random access to
this array.

thanks everyone!


I've heard things like this often. The next question would be, why
provide library functions if they are useless and slow? One factor is
of course that the library routines are very general. Here's an
implementation I found with Google:

http://www.jbox.dk/sanos/source/lib/strtod.c.html

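The scaling loop could be eliminated at some cost in accuracy:

double scale = 1.;  // double, so long fractions cannot overflow an int

[...]

if (*p == '.') {
    p++;
    while (isdigit(*p)) {
        // keep accumulating digits and track the power of ten to divide by
        number = number * 10. + (*p - '0');
        p++;
        scale *= 10;
    }
}

[...]

return number / scale;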
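
For a complete picture, here is a minimal sketch of a simplified parser built around that fragment. It is not the linked strtod implementation: the name parse_simple_float is hypothetical, and it assumes plain decimal input with an optional sign and fraction, with no exponent, infinity, NaN, or locale handling. Dividing once by the accumulated scale is fast but can be slightly less accurate than a full strtod.

```cpp
#include <cassert>
#include <cctype>

// Hypothetical simplified parser for: [whitespace][sign]digits[.digits]
// On return, *end (if non-null) points just past the last character consumed,
// so the caller can continue with the next value in the same buffer.
double parse_simple_float(const char* p, const char** end) {
    assert(p != nullptr);
    while (std::isspace(static_cast<unsigned char>(*p))) ++p;

    bool negative = false;
    if (*p == '-' || *p == '+') {
        negative = (*p == '-');
        ++p;
    }

    // Integer part.
    double number = 0.0;
    while (std::isdigit(static_cast<unsigned char>(*p))) {
        number = number * 10.0 + (*p - '0');
        ++p;
    }

    // Fraction: keep accumulating digits and remember the power of ten.
    double scale = 1.0;
    if (*p == '.') {
        ++p;
        while (std::isdigit(static_cast<unsigned char>(*p))) {
            number = number * 10.0 + (*p - '0');
            scale *= 10.0;
            ++p;
        }
    }

    if (end) *end = p;
    number /= scale;
    return negative ? -number : number;
}
```

Feeding the returned end pointer back in as the next start position walks through a whole buffer of space-separated values without any stream calls.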
Regards,
Mike
Jul 23 '05 #8
If you have the option of changing the input file format, try storing
binary data. Use something like this:

union { float num; char data[sizeof(float)]; };

That way the big killer (converting floats to and from strings) gets
killed!
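
A minimal sketch of the binary approach, using stream write()/read() on the raw bytes rather than a union. The file layout here (two ints for width and height, then the raw floats) and the names write_binary and read_binary are assumptions for illustration. The resulting file is tied to the writing machine's byte order and float format, so it is not portable between different architectures.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <vector>

// Hypothetical layout: int width, int height, then width*height raw floats,
// all in the native byte order of the machine that writes the file.
bool write_binary(const char* path, int width, int height,
                  const std::vector<float>& values) {
    assert(values.size() ==
           static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(&width), sizeof width);
    out.write(reinterpret_cast<const char*>(&height), sizeof height);
    if (!values.empty()) {
        out.write(reinterpret_cast<const char*>(&values[0]),
                  static_cast<std::streamsize>(values.size() * sizeof(float)));
    }
    return out.good();
}

bool read_binary(const char* path, int& width, int& height,
                 std::vector<float>& values) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    in.read(reinterpret_cast<char*>(&width), sizeof width);
    in.read(reinterpret_cast<char*>(&height), sizeof height);
    if (!in || width <= 0 || height <= 0) return false;

    values.resize(static_cast<std::size_t>(width) *
                  static_cast<std::size_t>(height));
    // One bulk read; no text-to-float conversion takes place.
    in.read(reinterpret_cast<char*>(&values[0]),
            static_cast<std::streamsize>(values.size() * sizeof(float)));
    return !in.fail();
}
```

A one-time converter that reads the ASCII file and calls write_binary would let later runs load the data with a single read.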

Samee

Jul 23 '05 #9
