Bytes IT Community

Reading a very large textfile into an array

Hi @ all

hope someone can help me as my PC is ready to go through the
window...here it is:

I have a text file with approximately 150,000 lines of the following
format:

0007391027,000049-0458-1556-09141999,0023924296,
0007391028,001217-0671-1610-09141999,0023924302,
0007391029,004581-0671-1630-09141999,0023924313,
0007391030,001110-0433-1636-09141999,0023924317,
0007391031,007651-0665-1648-09141999,0023924320,

How can I read the file into memory, into one array?

The aim is to compare various strings throughout the entire file. I
have another version working, but it has to re-read each line of the
text file after comparing it with the various strings of other lines.
This takes ages, and as I have approximately 700 of these files it is
not an option....
Jul 19 '05 #1
3 Replies


"Markus Hofmann" <ho***********@hotmail.com> wrote in message
news:a2**************************@posting.google.com...
| I have a text file with approximately 150,000 lines of the following
| format:
|
| 0007391027,000049-0458-1556-09141999,0023924296,
| 0007391028,001217-0671-1610-09141999,0023924302,
| 0007391029,004581-0671-1630-09141999,0023924313,
| 0007391030,001110-0433-1636-09141999,0023924317,
| 0007391031,007651-0665-1648-09141999,0023924320,
|
| How can I read the file into the memory/one array.

The easy way to do so in C++ is:
#include <vector>
#include <string>
#include <fstream>
using namespace std;

....
// srcFile is an ifstream already opened on the input file
string str;
vector<string> buf;
// optionally call buf.reserve() first, or use a std::deque
while( getline( srcFile,str ) )
    buf.push_back(str);

| The aim is that I can compare various strings throughout the entire
| file. I have another version working but that has to re-read each line
| of the text file after comparing it with the various strings of other
| lines. This takes for ages and as I have approximately 700 of these
| files it is not an option....

To do things efficiently, you'll probably want to decode each line
as you read it (for example into a struct with integer fields).
A lot can be done to improve performance, but the best approach
depends on the type of processing needed...

hth,
--
Ivan Vecerina <> http://www.post1.com/~ivec
Brainbench MVP for C++ <> http://www.brainbench.com
Jul 19 '05 #2

Markus Hofmann wrote:
| Hi @ all
|
| hope someone can help me as my PC is ready to go through the
| window...here it is:
|
| I have a text file with approximately 150,000 lines of the following
| format:
|
| 0007391027,000049-0458-1556-09141999,0023924296,
| 0007391028,001217-0671-1610-09141999,0023924302,
| 0007391029,004581-0671-1630-09141999,0023924313,
| 0007391030,001110-0433-1636-09141999,0023924317,
| 0007391031,007651-0665-1648-09141999,0023924320,
| How can I read the file into the memory/one array.
|
| The aim is that I can compare various strings throughout the entire
| file. I have another version working but that has to re-read each line
| of the text file after comparing it with the various strings of other
| lines. This takes for ages and as I have approximately 700 of these
| files it is not an option....


Here is a suggestion:
Split each line into fields.
1. Open the file in binary mode.
2. Record the current file position (i.e. the beginning of the line).
3. Read in the line and extract the field you want.
4. Store the file position in a std::map, using the key as the index:
index[/* key */] = file_position; // index is the std::map object
5. Repeat for the entire file.

The above builds an index table. Use the index table when
searching: it returns the file position, so you can seek to
that position in the file and then read the line.

Other than indices, you would have to reorganize your data
to get a faster search.

Search (or even post to) a generic database newsgroup.
They may have some solutions. The folks in news:comp.programming
may also have some suggestions.

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.raos.demon.uk/acllc-c++/faq.html
Other sites:
http://www.josuttis.com -- C++ STL Library book

Jul 19 '05 #3

In article <a2**************************@posting.google.com>,
ho***********@hotmail.com says...
| Hi @ all
|
| hope someone can help me as my PC is ready to go through the
| window...here it is:
|
| I have a text file with approximately 150,000 lines of the following
| format:
|
| 0007391027,000049-0458-1556-09141999,0023924296,
| 0007391028,001217-0671-1610-09141999,0023924302,
| 0007391029,004581-0671-1630-09141999,0023924313,
| 0007391030,001110-0433-1636-09141999,0023924317,
| 0007391031,007651-0665-1648-09141999,0023924320,
| How can I read the file into the memory/one array.


A memory mapped file is what you probably want, but that's not portable.

In portable code, if you want the entire file as a single string, you
can use something like:

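#include <fstream>
#include <sstream>
#include <string>

std::ifstream infile("whatever name");
std::stringstream temp;

temp << infile.rdbuf();             // copy the whole file into temp
std::string contents = temp.str();  // str() returns a copy, so store by value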

Now 'contents' is the entire contents of the file. This method is
typically faster than many of the obvious alternatives like reading the
file one line at a time into a vector of strings.
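
Since the goal is to compare lines, here is one possible way to work
with the single string afterwards. This is a sketch only: read_all and
line_spans are made-up names, and lines are assumed to end with '\n'.
Instead of copying each line out, it records where each line starts and
how long it is.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Read everything that is left in the stream into one string.
std::string read_all(std::istream& in) {
    std::ostringstream temp;
    temp << in.rdbuf();
    return temp.str();
}

// (offset, length) of a line inside the big string, newline excluded.
typedef std::pair<std::size_t, std::size_t> Span;

// Find every line in text without copying any characters.
std::vector<Span> line_spans(const std::string& text) {
    std::vector<Span> spans;
    std::size_t start = 0;
    while (start < text.size()) {
        std::size_t end = text.find('\n', start);
        if (end == std::string::npos)
            end = text.size();          // last line has no trailing newline
        spans.push_back(Span(start, end - start));
        start = end + 1;
    }
    return spans;
}
```

Two lines can then be compared in place with
contents.compare(a.first, a.second, contents, b.first, b.second),
where a and b are entries of the vector returned by line_spans.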

--
Later,
Jerry.

The universe is a figment of its own imagination.
Jul 19 '05 #4
