
Reading a very large textfile into an array

Markus Hofmann
#1: Jul 19 '05
Hi @ all

hope someone can help me as my PC is ready to go through the
window...here it is:

I have a text file with approximately 150,000 lines of the following
format:

0007391027,000049-0458-1556-09141999,0023924296,
0007391028,001217-0671-1610-09141999,0023924302,
0007391029,004581-0671-1630-09141999,0023924313,
0007391030,001110-0433-1636-09141999,0023924317,
0007391031,007651-0665-1648-09141999,0023924320,


How can I read the whole file into memory, e.g. into one array?

The aim is to compare various strings across the entire file. I have
another version working, but it has to re-read each line of the text
file after comparing it with the strings of the other lines. This takes
ages, and as I have approximately 700 of these files it is not an
option....

Ivan Vecerina
#2: Jul 19 '05

re: Reading a very large textfile into an array


"Markus Hofmann" <hofmannmarkus@hotmail.com> wrote in message
news:a2909dd5.0308110306.29ded349@posting.google.com...
| I have a text file with approximately 150,000 lines of the following
| format:
|
| 0007391027,000049-0458-1556-09141999,0023924296,
| 0007391028,001217-0671-1610-09141999,0023924302,
| 0007391029,004581-0671-1630-09141999,0023924313,
| 0007391030,001110-0433-1636-09141999,0023924317,
| 0007391031,007651-0665-1648-09141999,0023924320,
|
| How can I read the file into the memory/one array.

The easy way to do so in C++ is:
#include <vector>
#include <string>
#include <fstream>
using namespace std;

....
ifstream srcFile( "data.txt" ); // placeholder file name
string str;
vector<string> buf;
// optionally call buf.reserve() first, or use a std::deque
while( getline( srcFile,str ) )
    buf.push_back(str); // one element per line, newline stripped

| The aim is that I can compare various strings throughout the entire
| file. I have another version working but that has to re-read each line
| of the text file after comparing it with the various strings of other
| lines. This takes for ages and as I have approximately 700 of these
| files it is not an option....

To do things efficiently, you'll probably want to decode each line
as you read it (for example into a struct with integer fields).
A lot can be done to improve performance, but the best approach
depends on the type of processing needed...
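
As a rough sketch of the "decode each line" idea, assuming every line
follows the sample layout exactly: the struct and its field names below
are made up for illustration, since the thread does not say what the
columns mean. Note that storing the fields as integers drops the leading
zeros, so this only fits if the zeros carry no meaning.

```cpp
#include <cassert>
#include <cstdio>
#include <istream>
#include <string>
#include <vector>

// Hypothetical record layout; the field names are assumptions.
struct Record {
    unsigned long id;       // first column, e.g. 0007391027
    unsigned long part[4];  // the four dash-separated groups of column two
    unsigned long ref;      // third column, e.g. 0023924296
};

// Decodes one line; returns false if it does not match the layout.
bool parseLine(const std::string& line, Record& out)
{
    int n = std::sscanf(line.c_str(), "%lu,%lu-%lu-%lu-%lu,%lu",
                        &out.id, &out.part[0], &out.part[1],
                        &out.part[2], &out.part[3], &out.ref);
    return n == 6;  // all six numbers must have been converted
}

// Reads every well-formed line of the stream into a vector of records.
std::vector<Record> readRecords(std::istream& in)
{
    std::vector<Record> records;
    std::string line;
    while (std::getline(in, line)) {
        Record r;
        if (parseLine(line, r))
            records.push_back(r);  // malformed lines are skipped silently
    }
    return records;
}
```

Comparing integer fields is then a plain == or <, which is usually
cheaper than comparing substrings over and over.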

hth,
--
Ivan Vecerina <> http://www.post1.com/~ivec
Brainbench MVP for C++ <> http://www.brainbench.com


Thomas Matthews
#3: Jul 19 '05

re: Reading a very large textfile into an array


Markus Hofmann wrote:
> Hi @ all
>
> hope someone can help me as my PC is ready to go through the
> window...here it is:
>
> I have a text file with approximately 150,000 lines of the following
> format:
>
> 0007391027,000049-0458-1556-09141999,0023924296,
> 0007391028,001217-0671-1610-09141999,0023924302,
> 0007391029,004581-0671-1630-09141999,0023924313,
> 0007391030,001110-0433-1636-09141999,0023924317,
> 0007391031,007651-0665-1648-09141999,0023924320,
>
>
> How can I read the file into the memory/one array.
>
> The aim is that I can compare various strings throughout the entire
> file. I have another version working but that has to re-read each line
> of the text file after comparing it with the various strings of other
> lines. This takes for ages and as I have approximately 700 of these
> files it is not an option....

Here is a suggestion:
Split the line into fields.
1. Open the file in binary mode.
2. Record the current file position (i.e. the beginning of the line).
3. Read in the line and extract the field you want.
4. Store the file position into a map, using the key as the index:
index[/* key */] = file_position; // where index is a std::map
5. Repeat for the entire file.

The above builds an index table. Use the index table when
searching. It will return the file position, and you can
seek to that position and then read the line.
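
A minimal sketch of the steps above, under a couple of assumptions: the
key is taken to be the first comma-separated field, and the names
LineIndex, buildIndex and fetchLine are invented for this example. The
functions take any std::istream; for a real file you would pass a
std::ifstream opened with std::ios::binary, as in step 1.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Maps a key to the position where its line starts.
typedef std::map<std::string, std::streampos> LineIndex;

// Steps 2 to 5: walk the stream once and remember where each line begins.
LineIndex buildIndex(std::istream& in)
{
    LineIndex index;
    std::string line;
    for (;;) {
        std::streampos pos = in.tellg();   // beginning of the next line
        if (!std::getline(in, line))
            break;                         // end of input
        // Assumption: the key is everything before the first comma.
        std::string key = line.substr(0, line.find(','));
        index[key] = pos;                  // a repeated key keeps the last position
    }
    return index;
}

// Looks a key up, seeks to its line and reads it; false if the key is unknown.
bool fetchLine(std::istream& in, const LineIndex& index,
               const std::string& key, std::string& line)
{
    LineIndex::const_iterator it = index.find(key);
    if (it == index.end())
        return false;
    in.clear();             // reset the eof/fail state left by buildIndex
    in.seekg(it->second);
    return static_cast<bool>(std::getline(in, line));
}
```

On files with Windows line endings, binary mode leaves a trailing '\r'
on each line read this way, so you may want to strip it before comparing.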

Other than indices, you would have to reorganize your data
to get a faster search.

Search (or even post) to a generic database newsgroup.
They may have some solutions. The folks in news:comp.programming
may also have some suggestions.

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.raos.demon.uk/acllc-c++/faq.html
Other sites:
http://www.josuttis.com -- C++ STL Library book

Jerry Coffin
#4: Jul 19 '05

re: Reading a very large textfile into an array


In article <a2909dd5.0308110306.29ded349@posting.google.com>,
hofmannmarkus@hotmail.com says...
> Hi @ all
>
> hope someone can help me as my PC is ready to go through the
> window...here it is:
>
> I have a text file with approximately 150,000 lines of the following
> format:
>
> 0007391027,000049-0458-1556-09141999,0023924296,
> 0007391028,001217-0671-1610-09141999,0023924302,
> 0007391029,004581-0671-1630-09141999,0023924313,
> 0007391030,001110-0433-1636-09141999,0023924317,
> 0007391031,007651-0665-1648-09141999,0023924320,
>
>
> How can I read the file into the memory/one array.

A memory mapped file is what you probably want, but that's not portable.

In portable code, if you want the entire file as a single string, you
can use something like:

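#include <fstream>
#include <sstream>
#include <string>

std::ifstream infile("whatever name");
std::stringstream temp;

temp << infile.rdbuf();            // copy the whole file into the stream
std::string contents = temp.str(); // str() returns a copy; store it by value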

Now 'contents' is the entire contents of the file. This method is
typically faster than many of the obvious alternatives like reading the
file one line at a time into a vector of strings.
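
Once the file is in one string you still need a way to get at individual
lines. One possible sketch, not the only way to do it: record the offset
where each line starts, so that a line can be compared in place without
copying it out. The helper names slurp and lineStarts are invented for
this example.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads everything left in the stream into a single string.
std::string slurp(std::istream& in)
{
    std::ostringstream temp;
    temp << in.rdbuf();
    return temp.str();
}

// Returns the offset at which each line of 'contents' begins.
std::vector<std::string::size_type> lineStarts(const std::string& contents)
{
    std::vector<std::string::size_type> starts;
    std::string::size_type pos = 0;
    while (pos < contents.size()) {
        starts.push_back(pos);
        std::string::size_type nl = contents.find('\n', pos);
        if (nl == std::string::npos)
            break;          // last line has no trailing newline
        pos = nl + 1;       // next line begins after the newline
    }
    return starts;
}
```

With the offsets in hand, contents.compare(starts[i], len, other) lets
you compare part of line i against another string without building a
temporary.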

--
Later,
Jerry.

The universe is a figment of its own imagination.