I have a comma-delimited file to parse, and I plan to go through it line by line. Efficiency is the top priority here, so I am choosing between the old-school C approach and std::string. The way I see it, I could do this (assuming infile is a valid std::ifstream):

    // Needs <fstream>, <cstring> (strtok) and <cctype> (isspace)
    char buffer [4096];

    while (!infile.eof () && !infile.bad ())
    {
        // Get the next line in the file
        infile.getline (buffer, sizeof buffer);

        // If fail bit is set it means we did not read the entire
        // line in. So we need to skip it, but the rest of the line is
        // still on the stream. So clear the error and continue reading
        // until the end of the line, skipping it all. Stop at end of
        // file or on a hard error, otherwise this loop never exits.
        if (infile.fail ())
        {
            while (infile.fail () && !infile.eof () && !infile.bad ())
            {
                infile.clear ();
                infile.getline (buffer, sizeof buffer);
            }
            continue;
        }

        // If the string is empty or all whitespace skip it.
        char *str = buffer;
        while (*str != '\0' && isspace (static_cast<unsigned char> (*str)))
            str++;

        if (*str == '\0')
            continue;

        // Note: strtok treats consecutive commas as one delimiter,
        // so empty fields are skipped.
        char *tokstr = strtok (str, ",");
        while (tokstr)
        {
            // Do something with the data
            tokstr = strtok (NULL, ",");
        }
    }

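or

    // Needs <fstream>, <string> and <cctype>; assumes using namespace std
    string buffer;

    // Don't worry about fail bit for long lines, string is dynamic and can
    // read all lines. getline returns the stream, so the loop ends as soon
    // as a read fails (end of file or error).
    while (getline (infile, buffer))
    {
        // Skip leading whitespace without running past the end
        string::iterator stritr = buffer.begin ();
        while (stritr != buffer.end () && isspace (static_cast<unsigned char> (*stritr)))
            ++stritr;

        if (stritr == buffer.end ())
            continue;
        if (stritr != buffer.begin ())
            buffer.erase (buffer.begin (), stritr);

        string::size_type startpos = 0;
        string::size_type endpos = buffer.find (',');
        for (;;)
        {
            // substr takes (position, length); a length that runs past
            // the end is clamped, so this also handles the last field.
            string valuestr = buffer.substr (startpos, endpos - startpos);
            // Do something with the data

            if (endpos == string::npos)
                break;
            startpos = endpos + 1;
            endpos = buffer.find (',', startpos);
        }
    }
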
The string version is a little nicer on memory, since there is no real cap on the line size, whereas the first approach caps lines at 4095 characters and needs a bit more logic to deal with longer lines (not to mention that it does not import them). However, I have a strong feeling the first one is much faster. Any thoughts on which is faster? Any optimizations you see?
The STL templates are faster than the C functions.
P J Plauger, who wrote the STL templates, says they were optimized for speed and that if you think you can do this faster, then think three times.
Operations on STL containers with millions of strings are much faster than C code.
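Which approach wins depends on the compiler, the library, and the data, so it is worth measuring on real input before committing to either one. Below is a minimal timing sketch, assuming a C++11 compiler; the names count_fields_strtok, count_fields_find, and time_us are hypothetical and not from this thread, and each splitter only counts fields rather than doing real work.

```cpp
#include <chrono>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: counts fields with strtok. strtok modifies its input,
// so it works on a private copy (that copy is part of what gets timed).
std::size_t count_fields_strtok(const std::string& line)
{
    std::vector<char> copy(line.begin(), line.end());
    copy.push_back('\0');
    std::size_t n = 0;
    for (char* tok = std::strtok(copy.data(), ","); tok != nullptr;
         tok = std::strtok(nullptr, ","))
        ++n;
    return n;
}

// Hypothetical helper: counts fields with std::string::find, copying nothing.
std::size_t count_fields_find(const std::string& line)
{
    std::size_t n = 1;
    for (std::string::size_type pos = line.find(',');
         pos != std::string::npos; pos = line.find(',', pos + 1))
        ++n;
    return n;
}

// Runs fn(line) `repeats` times and returns the elapsed microseconds.
// The results are added to `sink` so the calls cannot be optimized away.
template <typename Fn>
long long time_us(Fn fn, const std::string& line, int repeats, std::size_t& sink)
{
    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i)
        sink += fn(line);
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
}
```

Calling time_us once with each splitter on a representative line gives a rough comparison. Note that the two splitters disagree on lines with empty fields: strtok skips them, while the find-based loop counts them.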
Cool. My concern about speed was the buffer.substr () call that fills valuestr, simply because it allocates and deallocates temporary memory each time. But the other speed enhancements probably outweigh it.
Perhaps you can avoid the copy; it depends on what "do something with the data" involves.
Let's suppose you want to convert it to an integer:
    // replace comma with null (the last field has no comma to replace)
    if (endpos != string::npos)
        buffer[ endpos ] = '\0';
    // convert, reading straight out of the line buffer
    int i = atoi( buffer.c_str() + startpos );
All very tricky, of course, and of dubious value. Are you SURE this is the bottleneck? Have you run a profile?
James
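The in-place idea above can go one step further: std::strtol stops at the first character that is not part of a number, so the comma does not need to be overwritten at all. Here is a minimal sketch, assuming every field of interest is a decimal integer; parse_ints is a hypothetical name, and fields with no digits are simply skipped.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: extracts the integer fields of a comma-separated line
// without copying any field. Empty or non-numeric fields are skipped.
std::vector<long> parse_ints(const std::string& line)
{
    std::vector<long> values;
    const char* p = line.c_str();
    while (*p != '\0')
    {
        char* end = 0;
        long v = std::strtol(p, &end, 10);   // skips leading whitespace itself
        if (end != p)                        // at least one digit was consumed
            values.push_back(v);
        p = end;
        while (*p != '\0' && *p != ',')      // skip anything left in this field
            ++p;
        if (*p == ',')                       // step over the delimiter
            ++p;
    }
    return values;
}
```

Unlike atoi, strtol reports where parsing stopped, which is what lets the loop detect fields that contain no number.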