Reading a very large textfile into an array

Hi @ all

hope someone can help me as my PC is ready to go through the
window...here it is:

I have a text file with approximately 150,000 lines of the following
format:

0007391027,000049-0458-1556-09141999,0023924296,
0007391028,001217-0671-1610-09141999,0023924302,
0007391029,004581-0671-1630-09141999,0023924313,
0007391030,001110-0433-1636-09141999,0023924317,
0007391031,007651-0665-1648-09141999,0023924320,
How can I read the file into memory as one array?

The aim is that I can compare various strings throughout the entire
file. I have another version working but that has to re-read each line
of the text file after comparing it with the various strings of other
lines. This takes ages, and as I have approximately 700 of these
files it is not an option....
Jul 19 '05 #1
"Markus Hofmann" <ho***********@hotmail.com> wrote in message
news:a2**************************@posting.google.com...
| I have a text file with approximately 150,000 lines of the following
| format:
|
| 0007391027,000049-0458-1556-09141999,0023924296,
| 0007391028,001217-0671-1610-09141999,0023924302,
| 0007391029,004581-0671-1630-09141999,0023924313,
| 0007391030,001110-0433-1636-09141999,0023924317,
| 0007391031,007651-0665-1648-09141999,0023924320,
|
| How can I read the file into the memory/one array.

The easy way to do so in C++ is:
#include <vector>
#include <string>
#include <fstream>
using namespace std;

.... // srcFile is an ifstream that has already been opened
string str;
vector<string> buf;
// optionally call buf.reserve() first, or use a std::deque
while( getline( srcFile, str ) )
    buf.push_back( str );

| The aim is that I can compare various strings throughout the entire
| file. I have another version working but that has to re-read each line
| of the text file after comparing it with the various strings of other
| lines. This takes for ages and as I have approximately 700 of these
| files it is not an option....

To do things efficiently, you'll probably want to decode each line
as you read it (for example into a struct with integer fields).
A lot can be done to improve performance, but the best approach
depends on the type of processing needed...
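
As a rough sketch of what decoding each line as it is read could look like: the code below assumes every line holds three comma-separated fields followed by a trailing comma, with no embedded spaces. The names Record, id, code, serial, parseNumber, parseLine and loadRecords are made up for illustration, and the second field is kept as text because it contains dashes whose meaning is not given in the thread.

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record layout: the thread does not say what the fields mean,
// so these names are placeholders.
struct Record {
    std::uint64_t id;      // first comma-separated field
    std::string   code;    // second field, kept as text because it contains dashes
    std::uint64_t serial;  // third field
};

// Converts a string of decimal digits to an integer; returns false for anything else.
// Overflow is not checked in this sketch.
bool parseNumber(const std::string& text, std::uint64_t& out) {
    if (text.empty()) return false;
    std::uint64_t value = 0;
    for (char c : text) {
        if (c < '0' || c > '9') return false;
        value = value * 10 + static_cast<std::uint64_t>(c - '0');
    }
    out = value;
    return true;
}

// Splits one line at the commas and fills a Record; returns false on a malformed line.
bool parseLine(const std::string& line, Record& rec) {
    std::istringstream fields(line);
    std::string id, code, serial;
    if (!std::getline(fields, id, ',') ||
        !std::getline(fields, code, ',') ||
        !std::getline(fields, serial, ','))
        return false;
    rec.code = code;
    return parseNumber(id, rec.id) && parseNumber(serial, rec.serial);
}

// Reads every line of the stream once and keeps the decoded records in memory.
// Lines that do not parse are skipped.
std::vector<Record> loadRecords(std::istream& in) {
    std::vector<Record> records;
    std::string line;
    Record rec;
    while (std::getline(in, line))
        if (parseLine(line, rec))
            records.push_back(rec);
    return records;
}
```

Passing an opened std::ifstream to loadRecords gives a vector that can be sorted or searched on the integer fields without touching the file again. Whether that pays off depends, as said, on the comparisons that are actually needed.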

hth,
--
Ivan Vecerina <> http://www.post1.com/~ivec
Brainbench MVP for C++ <> http://www.brainbench.com
Jul 19 '05 #2
Markus Hofmann wrote:
> Hi @ all
>
> hope someone can help me as my PC is ready to go through the
> window...here it is:
>
> I have a text file with approximately 150,000 lines of the following
> format:
>
> 0007391027,000049-0458-1556-09141999,0023924296,
> 0007391028,001217-0671-1610-09141999,0023924302,
> 0007391029,004581-0671-1630-09141999,0023924313,
> 0007391030,001110-0433-1636-09141999,0023924317,
> 0007391031,007651-0665-1648-09141999,0023924320,
> How can I read the file into the memory/one array.
>
> The aim is that I can compare various strings throughout the entire
> file. I have another version working but that has to re-read each line
> of the text file after comparing it with the various strings of other
> lines. This takes for ages and as I have approximately 700 of these
> files it is not an option....


Here is a suggestion:
Split each line into fields.
1. Open the file in binary mode.
2. Record the current file position (i.e. the beginning of the line).
3. Read in the line and extract the field you want.
4. Store the file position in a std::map, using the extracted field as the key:
       index[/* key */] = file_position;   // index is a std::map
5. Repeat for the entire file.

The above builds an index table. Use the index table when
searching: it returns the file position, so you can seek to
that position and then read just that line.

Other than indices, you would have to reorganize your data
to get a faster search.

Search (or even post) to a generic database newsgroup.
They may have some solutions. The folks in news:comp.programming
may also have some suggestions.

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.raos.demon.uk/acllc-c++/faq.html
Other sites:
http://www.josuttis.com -- C++ STL Library book

Jul 19 '05 #3
In article <a2**************************@posting.google.com>,
ho***********@hotmail.com says...
> Hi @ all
>
> hope someone can help me as my PC is ready to go through the
> window...here it is:
>
> I have a text file with approximately 150,000 lines of the following
> format:
>
> 0007391027,000049-0458-1556-09141999,0023924296,
> 0007391028,001217-0671-1610-09141999,0023924302,
> 0007391029,004581-0671-1630-09141999,0023924313,
> 0007391030,001110-0433-1636-09141999,0023924317,
> 0007391031,007651-0665-1648-09141999,0023924320,
> How can I read the file into the memory/one array.


A memory mapped file is what you probably want, but that's not portable.

In portable code, if you want the entire file as a single string, you
can use something like:

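#include <fstream>
#include <sstream>
#include <string>

std::ifstream infile("whatever name");
std::stringstream temp;

temp << infile.rdbuf();             // copy the whole file into the string stream
std::string contents = temp.str();  // str() returns a copy, so store it by value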

Now 'contents' is the entire contents of the file. This method is
typically faster than many of the obvious alternatives like reading the
file one line at a time into a vector of strings.
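
Once the file is in a single string, the individual lines can still be reached without reading anything again by remembering where each one starts. The following is just a sketch of that idea; readAll, lineStarts and lineAt are names made up for this example.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Copies the whole stream into one string through its stream buffer.
std::string readAll(std::istream& in) {
    std::ostringstream temp;
    temp << in.rdbuf();
    return temp.str();
}

// Returns the offset at which each line of 'contents' begins.
std::vector<std::size_t> lineStarts(const std::string& contents) {
    std::vector<std::size_t> starts;
    std::size_t pos = 0;
    while (pos < contents.size()) {
        starts.push_back(pos);
        std::size_t nl = contents.find('\n', pos);
        if (nl == std::string::npos) break;  // last line has no newline
        pos = nl + 1;
    }
    return starts;
}

// Extracts line number i (0-based) without its terminating newline.
std::string lineAt(const std::string& contents,
                   const std::vector<std::size_t>& starts, std::size_t i) {
    std::size_t end = contents.find('\n', starts[i]);
    return contents.substr(starts[i],
                           end == std::string::npos ? end : end - starts[i]);
}
```

For 150,000 lines the offset table is small compared with the text itself, and comparisons can use contents.compare(pos, len, ...) directly on the offsets so that no per-line string has to be copied out at all.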

--
Later,
Jerry.

The universe is a figment of its own imagination.
Jul 19 '05 #4
