Reading a very large textfile into an array

Hi @ all

hope someone can help me as my PC is ready to go through the
window...here it is:

I have a text file with approximately 150,000 lines of the following
format:

0007391027,000049-0458-1556-09141999,0023924296,
0007391028,001217-0671-1610-09141999,0023924302,
0007391029,004581-0671-1630-09141999,0023924313,
0007391030,001110-0433-1636-09141999,0023924317,
0007391031,007651-0665-1648-09141999,0023924320,
How can I read the file into memory, i.e. into one array?

The aim is to compare various strings across the entire file. I have
another version working, but it has to re-read each line of the text
file after comparing it with the strings of other lines. This takes
ages, and as I have approximately 700 of these files, it is not an
option...
Jul 19 '05 #1
"Markus Hofmann" <ho***********@hotmail.com> wrote in message
news:a2**************************@posting.google.com...
| I have a text file with approximately 150,000 lines of the following
| format:
|
| 0007391027,000049-0458-1556-09141999,0023924296,
| 0007391028,001217-0671-1610-09141999,0023924302,
| 0007391029,004581-0671-1630-09141999,0023924313,
| 0007391030,001110-0433-1636-09141999,0023924317,
| 0007391031,007651-0665-1648-09141999,0023924320,
|
| How can I read the file into the memory/one array.

The easy way to do so in C++ is:
#include <vector>
#include <string>
#include <fstream>
using namespace std;

....
// srcFile is assumed to be an already-opened ifstream
string str;
vector<string> buf;
// optionally call buf.reserve() first, or use a std::deque
while( getline( srcFile,str ) )
    buf.push_back(str);

| The aim is that I can compare various strings throughout the entire
| file. I have another version working but that has to re-read each line
| of the text file after comparing it with the various strings of other
| lines. This takes for ages and as I have approximately 700 of these
| files it is not an option....

To do things efficiently, you'll probably want to decode each line
as you read it (for example into a struct with integer fields).
A lot can be done to improve performance, but the best approach
depends on the type of processing needed...
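
As a rough illustration of decoding each line into a struct, here is a minimal sketch. The field names (id, a, b, c, date, ref) are hypothetical, since the thread never says what the columns mean, and the reading of the last dash-separated part as a date is only a guess. Storing the fields as integers drops the leading zeros, which does not matter for comparisons.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical field names: the thread does not say what the columns mean.
struct Record {
    unsigned long id;   // first column, e.g. 0007391027
    unsigned a, b, c;   // first three dash-separated parts of the second column
    unsigned date;      // last part of the second column, e.g. 09141999
    unsigned long ref;  // third column, e.g. 0023924296
};

// Decode one line; returns false if the line does not match the format.
bool parseLine(const std::string& line, Record& out) {
    return std::sscanf(line.c_str(), "%lu,%u-%u-%u-%u,%lu",
                       &out.id, &out.a, &out.b, &out.c,
                       &out.date, &out.ref) == 6;
}

// Read a whole file into a vector of decoded records, skipping bad lines.
std::vector<Record> loadRecords(const char* path) {
    assert(path != nullptr);
    std::vector<Record> records;
    std::ifstream in(path);
    std::string line;
    Record r;
    while (std::getline(in, line))
        if (parseLine(line, r))
            records.push_back(r);
    return records;
}
```

Once the records are in a vector, comparisons between lines become integer comparisons in memory, and the vector can be sorted on whichever field the processing needs.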

hth,
--
Ivan Vecerina <> http://www.post1.com/~ivec
Brainbench MVP for C++ <> http://www.brainbench.com
Jul 19 '05 #2
Here is a suggestion:
Split each line into fields.
1. Open the file in binary mode.
2. Record the current file position (i.e. the beginning of the line).
3. Read in the line and extract the field you want.
4. Store the file position in a std::map, using the field as the key:
index[/* key */] = file_position;
5. Repeat for the entire file.

The above builds an index table. Use the index table when
searching. It will return the file position and you can
position the file, then read the information line.
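
The index table described above might look roughly like the following sketch. Using the first comma-separated field as the key is an assumption, and the names LineIndex, buildIndex and fetchLine are made up for illustration. The stream should be opened with std::ios::binary, as suggested in step 1, so that the recorded positions can be trusted.

```cpp
#include <cassert>
#include <fstream>
#include <map>
#include <string>

// Maps a key taken from each line to the file offset where that line starts.
typedef std::map<std::string, std::streampos> LineIndex;

// Build the index in one pass over the file.
// Assumption: the key is the first comma-separated field of the line.
LineIndex buildIndex(std::ifstream& file) {
    assert(file.is_open());
    LineIndex index;
    std::string line;
    std::streampos pos = file.tellg();   // start of the current line
    while (std::getline(file, line)) {
        index[line.substr(0, line.find(','))] = pos;
        pos = file.tellg();              // start of the next line
    }
    file.clear();                        // reset EOF so the stream can seek again
    return index;
}

// Look up a key and re-read its line from disk; returns false if not found.
bool fetchLine(std::ifstream& file, const LineIndex& index,
               const std::string& key, std::string& out) {
    LineIndex::const_iterator it = index.find(key);
    if (it == index.end())
        return false;
    file.clear();
    file.seekg(it->second);
    return !std::getline(file, out).fail();
}
```

If several lines share the same key, this sketch keeps only the last position seen; a std::multimap would keep all of them.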

Other than indices, you would have to reorganize your data
to get a faster search.

Search (or even post) to a generic database newsgroup.
They may have some solutions. The folks in news:comp.programming
may also have some suggestions.

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.raos.demon.uk/acllc-c++/faq.html
Other sites:
http://www.josuttis.com -- C++ STL Library book

Jul 19 '05 #3
A memory mapped file is what you probably want, but that's not portable.

In portable code, if you want the entire file as a single string, you
can use something like:

// needs <fstream>, <sstream> and <string>
std::ifstream infile("whatever name");
std::stringstream temp;

temp << infile.rdbuf();
std::string contents = temp.str();

Now 'contents' is the entire contents of the file. This method is
typically faster than many of the obvious alternatives like reading the
file one line at a time into a vector of strings.
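
With the whole file held in one string, individual lines can still be compared without copying each one out. The sketch below is one possible way to do that, not something taken from the thread: it records an (offset, length) pair per line and compares lines in place. The names readFile, Span, lineSpans and sameLine are illustrative only.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Read a whole file into one string via the stream buffer.
std::string readFile(const char* path) {
    std::ifstream in(path, std::ios::binary);
    std::stringstream buf;
    buf << in.rdbuf();
    return buf.str();                    // empty if the file could not be read
}

// (offset, length) of one line inside the big string.
typedef std::pair<std::size_t, std::size_t> Span;

// Locate every line in `text`; no per-line strings are allocated.
std::vector<Span> lineSpans(const std::string& text) {
    std::vector<Span> spans;
    std::size_t start = 0;
    while (start < text.size()) {
        std::size_t end = text.find('\n', start);
        if (end == std::string::npos)
            end = text.size();           // last line without a trailing newline
        spans.push_back(Span(start, end - start));
        start = end + 1;
    }
    return spans;
}

// Compare two lines in place using their spans.
bool sameLine(const std::string& text, const Span& a, const Span& b) {
    assert(a.first + a.second <= text.size());
    assert(b.first + b.second <= text.size());
    return text.compare(a.first, a.second, text, b.first, b.second) == 0;
}
```

The spans stay valid only as long as the string they were built from is not modified.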

--
Later,
Jerry.

The universe is a figment of its own imagination.
Jul 19 '05 #4
