q: fast file IO

Hi. I am trying to read in a large number of floats from an ASCII
file.

I am currently using i/o streams as such:

vector<float> values;
float val;
int width, height;
ifstream in(fn);
in >> width;   // the header is "Width Height", so read the width first
in >> height;

for (int i = 0; i < height; i++) {
    for (int j = 0; j < width; j++) {
        in >> val;
        values.push_back(val);
    }
}

where the data is:

Width Height
value value value value value value...

However, this turns out to be really slow when I read about a
million points. Does anyone know how I could read this data in much
faster? I know it is possible because several applications already do
it. Maybe using C FILEs is faster than streams? I'm not sure.

Thanks!

Oliver

Jul 23 '05 #1
You could try this:

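#include <algorithm>
#include <fstream>
#include <iterator>
#include <vector>

using namespace std;

int main() {
    const char* filename = "float_data";  // placeholder: substitute your own path
    ifstream file( filename );
    int width = 0, height = 0;

    file >> width >> height;

    // reserve() rather than sizing the vector: back_inserter appends, so a
    // pre-sized vector would end up with width * height zeros in front.
    vector<float> values;
    values.reserve( width * height );

    copy( istream_iterator<float>( file ), istream_iterator<float>(),
          back_inserter( values ) );

    return 0;
}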

This was created with a little help from
http://www.sgi.com/tech/stl/istream_iterator.html.

Mike
Jul 23 '05 #2

The istream_iterator version is unlikely to improve performance, since
'istream_iterator' uses operator>>(), which 'laniik' was already using.
'laniik's idea of using a FILE* instead of an ifstream might or might
not be faster; the only way to know is to measure. Another possibility
is to use some nonstandard, platform-specific method which might
exploit a more intimate connection with the OS/file system.
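
The "measure it" advice can be made concrete with a small timing harness. This is only a sketch and not code from the thread: the names read_with_ifstream, read_with_fscanf and time_ms are made up for illustration, it assumes the file layout from the question (width, height, then width * height values), and it uses std::chrono, so it needs C++11.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <vector>

// Hypothetical helper: read "width height v v v ..." with operator>>.
std::vector<float> read_with_ifstream(const char* path) {
    std::vector<float> values;
    std::ifstream in(path);
    int width = 0, height = 0;
    if (!(in >> width >> height) || width <= 0 || height <= 0) return values;
    const std::size_t total = static_cast<std::size_t>(width) * height;
    values.reserve(total);
    float v;
    while (values.size() < total && in >> v) values.push_back(v);
    return values;
}

// Hypothetical helper: the same read using a C FILE* and fscanf.
std::vector<float> read_with_fscanf(const char* path) {
    std::vector<float> values;
    std::FILE* f = std::fopen(path, "r");
    if (!f) return values;
    int width = 0, height = 0;
    if (std::fscanf(f, "%d %d", &width, &height) == 2 && width > 0 && height > 0) {
        const std::size_t total = static_cast<std::size_t>(width) * height;
        values.reserve(total);
        float v;
        while (values.size() < total && std::fscanf(f, "%f", &v) == 1) values.push_back(v);
    }
    std::fclose(f);
    return values;
}

// Run one reader on one file and return the elapsed wall time in milliseconds.
template <typename Reader>
double time_ms(Reader reader, const char* path, std::vector<float>& out) {
    const auto start = std::chrono::steady_clock::now();
    out = reader(path);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling time_ms(read_with_ifstream, path, a) and time_ms(read_with_fscanf, path, b) on the real data file, several times each and with optimization turned on, should show which one wins on a given compiler and library. The first run also pays for pulling the file into the OS cache, so it is probably best to discard it.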

-Mike
Jul 23 '05 #3
It is likely that the slowness is mainly caused by the repeated
reallocations of your vector object, rather than by the file
operations. You may try reserve() to pre-allocate the amount of memory
expected.

Jul 23 '05 #4
You can also try a std::list. It won't double in size as a vector tends
to do.

Alvin

--
Why must I click 'Start' in order to turn off my computer?
Jul 23 '05 #5
Let's try it another way then. If each float string is a specific
length, we can take advantage of that:

#include <cstdlib>
#include <fstream>
#include <iostream>
#include <vector>

using namespace std;

// Assumed width of each value field in characters, including its separator.
const int readLength = 9;

int main() {
    ifstream file( "float_data" );
    int width = 0, height = 0;
    char buffer[readLength + 1];

    file >> width >> height >> ws;  // ws skips the line break after the header
    const int size = width * height;
    vector<float> values( size );

    for( int i = 0; i < size; ++i ) {
        file.read( buffer, readLength );
        const streamsize got = file.gcount();  // the last field may be short
        if( got == 0 ) break;
        buffer[got] = '\0';
        values[i] = static_cast<float>( atof( buffer ) );
    }

    vector<float>::iterator it;
    for( it = values.begin(); it != values.end(); ++it ) {
        cout << *it << " ";
    }
    cout << endl;

    return 0;
}

Mike
Jul 23 '05 #6
I don't know this for sure, but I have heard that the atof function is
fairly slow, and that I could speed up reading time by writing my own
atof function. So maybe using .get() with my own atof will be faster
than >>.

Also, does anyone know how >> turns what it reads in into a float? I
wonder how fast that part is.
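
On how >> produces a float: the extraction goes through the stream's locale machinery (the std::num_get facet), which collects the characters of the number and then converts them; since C++11 the standard describes that conversion in terms of strtof/strtod. One way to cut the per-value stream overhead, sketched here under assumptions and not taken from the thread, is to load the whole file into memory and call std::strtof directly. The names parse_floats and read_all_floats are made up for illustration, and std::strtof needs C++11.

```cpp
#include <cstdlib>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Parse every whitespace-separated number in a NUL-terminated buffer.
std::vector<float> parse_floats(const char* text) {
    std::vector<float> values;
    const char* p = text;
    char* end = 0;
    for (;;) {
        const float v = std::strtof(p, &end);  // skips leading whitespace itself
        if (end == p) break;                   // nothing converted: stop
        values.push_back(v);
        p = end;
    }
    return values;
}

// Hypothetical helper: load a whole file into memory, then parse it.
std::vector<float> read_all_floats(const char* path) {
    std::ifstream in(path, std::ios::binary);
    const std::string text((std::istreambuf_iterator<char>(in)),
                           std::istreambuf_iterator<char>());
    return parse_floats(text.c_str());
}
```

For the file format in this thread, the first two elements of the result are the width and height header, so the caller would peel those off before using the data. Whether this beats >> on a particular library is something to time, not to assume.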

I'll try the memory allocation idea; that makes sense.

The list idea might speed things up, but I need random access to this
array.

Thanks, everyone!

Jul 23 '05 #7
I've heard things like this often. The next question would be: why
provide library functions if they are useless and slow? One factor is,
of course, that the library routines are very general. Here's an
implementation I found with Google:

http://www.jbox.dk/sanos/source/lib/strtod.c.html

The scaling loop could be eliminated at some cost in accuracy:

int scale = 1;

[...]

if (*p == '.') {
p++;
while (isdigit(*p)) {
number = number * 10. + (*p - '0');
p++;
scale *= 10;
}
}

[...]

return number / scale;
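
Filled out into a complete function, the fragment above might look like the sketch below. It is an illustration and not the linked strtod source: simple_atof is a made-up name, it handles only an optional sign, digits and an optional fraction (no exponent, no inf/nan, no locale), and it keeps the scale in a double where the fragment uses an int, so that long fractions cannot overflow it.

```cpp
#include <cctype>

// Simplified decimal parser: optional sign, digits, optional fraction.
// Rounding may differ slightly from strtod on long inputs.
double simple_atof(const char* p) {
    while (std::isspace(static_cast<unsigned char>(*p))) ++p;

    bool negative = false;
    if (*p == '-' || *p == '+') {
        negative = (*p == '-');
        ++p;
    }

    // Accumulate all digits into one number, ignoring the decimal point.
    double number = 0.0;
    while (std::isdigit(static_cast<unsigned char>(*p))) {
        number = number * 10.0 + (*p - '0');
        ++p;
    }

    // Count the fractional digits as a power of ten to divide out at the end.
    double scale = 1.0;
    if (*p == '.') {
        ++p;
        while (std::isdigit(static_cast<unsigned char>(*p))) {
            number = number * 10.0 + (*p - '0');
            scale *= 10.0;
            ++p;
        }
    }

    number /= scale;
    return negative ? -number : number;
}
```

A single division at the end replaces the per-digit scaling loop, which is where the speed would come from. Inputs with an exponent such as "1e5" are read only up to the 'e', so this is only safe when the data file is known to contain plain decimals.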
Regards,
Mike
Jul 23 '05 #8
If you have the option of changing the input file format, try storing
binary data. Use something like this:

union FloatBytes { float num; char data[sizeof(float)]; };

That way the big killer (converting floats to and from strings) gets
killed!
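
A minimal sketch of the binary route, written with stream read()/write() on the vector's storage in place of a union. The names write_binary and read_binary are made up, and the on-disk layout (a 64-bit count followed by raw floats) is an assumption for illustration. The file is only readable on machines with the same float representation and byte order as the one that wrote it.

```cpp
#include <cstdint>
#include <fstream>
#include <ios>
#include <vector>

// Write a count followed by the raw bytes of the floats.
bool write_binary(const char* path, const std::vector<float>& values) {
    std::ofstream out(path, std::ios::binary);
    const std::uint64_t count = values.size();
    out.write(reinterpret_cast<const char*>(&count), sizeof count);
    if (count > 0) {
        out.write(reinterpret_cast<const char*>(&values[0]),
                  static_cast<std::streamsize>(count * sizeof(float)));
    }
    out.flush();
    return out.good();
}

// Read the count, size the vector once, then read all floats in one call.
std::vector<float> read_binary(const char* path) {
    std::vector<float> values;
    std::ifstream in(path, std::ios::binary);
    std::uint64_t count = 0;
    if (!in.read(reinterpret_cast<char*>(&count), sizeof count)) return values;
    values.resize(static_cast<std::vector<float>::size_type>(count));
    if (count > 0 &&
        !in.read(reinterpret_cast<char*>(&values[0]),
                 static_cast<std::streamsize>(count * sizeof(float)))) {
        values.clear();  // truncated file: report nothing
    }
    return values;
}
```

The text file would be converted once with the existing reader plus write_binary, and every later run would call read_binary, which involves no string-to-float conversion at all.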

Samee

Jul 23 '05 #9
