Efficiency of parsing a string in C vs C++

92 New Member
I have a comma-delimited file to parse, so I figured I would go through it line by line and split each line into fields. Efficiency is the top priority here. I can use either the old-school C approach or std::string. The way I see it, I could do this (assuming infile is a valid std::ifstream):

    // Needs <fstream>, <cctype> and <cstring>
    char buffer [4096];
    while (infile.good ())
        {
        // Get the next line in the file
        infile.getline (buffer, sizeof buffer);

        // If the fail bit is set, either nothing was read (end of file)
        // or the line did not fit in the buffer. In the second case the
        // rest of the line is still on the stream, so clear the error and
        // continue reading until the end of the line, skipping it all.
        // The eof/bad checks keep this from looping forever at end of file.
        if (infile.fail ())
            {
            while (infile.fail () && !infile.eof () && !infile.bad ())
                {
                infile.clear ();
                infile.getline (buffer, sizeof buffer);
                }
            continue;
            }

        // If the string is empty or all whitespace skip it.
        // The cast avoids undefined behavior in isspace for negative chars.
        char *str = buffer;
        while (*str != '\0' && isspace ((unsigned char) *str))
            str++;

        if (*str == '\0')
            continue;

        // Note: strtok skips empty fields, so "a,,b" yields two tokens
        char *tokstr = strtok (str, ",");
        while (tokstr)
            {
            // Do something with the data
            tokstr = strtok (NULL, ",");
            }
        }

or

    // Needs <fstream>, <string> and <cctype>
    string buffer;
    while (getline (infile, buffer))
        {
        // Don't worry about fail bit for long lines, string is dynamic and can read all lines

        // Stop at end () before dereferencing, so empty lines are safe
        string::iterator stritr = buffer.begin ();
        while (stritr != buffer.end () && isspace ((unsigned char) *stritr))
            stritr++;

        if (stritr == buffer.end ())
            continue;
        if (stritr != buffer.begin ())
            buffer.erase (buffer.begin (), stritr);

        string::size_type startpos = 0;
        while (startpos != string::npos)
            {
            string::size_type endpos = buffer.find_first_of (",", startpos);

            // substr takes (position, length), not (start, end);
            // a length of npos means "up to the end of the string"
            string valuestr = buffer.substr (startpos,
                endpos == string::npos ? string::npos : endpos - startpos);
            // Do something with the data

            startpos = (endpos == string::npos) ? string::npos : endpos + 1;
            }
        }

The string version is a little nicer on memory, since there is no real cap on the line size, whereas the first approach caps each line at 4095 characters and needs a bit more logic to deal with longer lines (not to mention that those lines are never imported). However, I have a strong feeling the first one is much faster. Any thoughts on which is faster? Any optimizations you see?
Jul 18 '07 #1
weaknessforcats
9,208 Recognized Expert Moderator Expert
The STL templates are faster than the C functions.

P. J. Plauger, who wrote a widely used implementation of the STL templates, says they were optimized for speed and that if you think you can do this faster, then think three times.

Operations on STL containers with millions of strings are much faster than C code.
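
Which approach is faster on a given compiler and data set is easiest to settle by measuring. The following is only a rough sketch, assuming a C++11 compiler; the names count_with_strtok, count_with_find and time_ms are made up for illustration and come from nobody in this thread. It times the two tokenizing styles on a line already in memory, so file I/O is not part of the measurement. The strtok version has to copy the line first because strtok overwrites its input, which stands in for the per-line read into a char buffer.

```cpp
#include <chrono>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Count fields with strtok. strtok writes into its input, so work on a copy.
// Note: strtok skips empty fields ("a,,b" yields 2 tokens).
std::size_t count_with_strtok(const std::string& line)
{
    std::vector<char> buf(line.begin(), line.end());
    buf.push_back('\0');
    std::size_t n = 0;
    for (char* tok = std::strtok(buf.data(), ","); tok != nullptr;
         tok = std::strtok(nullptr, ","))
        ++n;
    return n;
}

// Count fields with std::string::find. Empty fields are counted
// ("a,,b" yields 3), so the two functions can disagree on such lines.
std::size_t count_with_find(const std::string& line)
{
    std::size_t n = 0;
    std::string::size_type start = 0;
    for (;;)
    {
        ++n;
        std::string::size_type end = line.find(',', start);
        if (end == std::string::npos)
            break;
        start = end + 1;
    }
    return n;
}

// Call fn(line) reps times and return the elapsed wall time in milliseconds.
// The results are accumulated into sink so the calls cannot be optimized away.
template <typename Fn>
double time_ms(Fn fn, const std::string& line, int reps, std::size_t& sink)
{
    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i)
        sink += fn(line);
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

Calling time_ms once with each function on a representative line, with optimization turned on and a repetition count in the millions, gives two numbers to compare. The result depends on the compiler, the library and the shape of the data, so it is worth repeating the run with real input.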
Jul 18 '07 #2
Benny the Guard
92 New Member
Cool. My concern about speed was with this line:

    valuestr = buffer.substr (startpos, endpos);

Simply because it allocates and deallocates temporary memory. But the other speed enhancements probably outweigh it.
Jul 18 '07 #3
ravenspoint
111 New Member
Perhaps you can avoid the copy - it depends on what "do something with the data" involves.

Let's suppose you want to convert it to an integer:

    // replace comma with null (only when endpos is not npos;
    // the last field on the line has no comma to replace)
    buffer[ endpos ] = '\0';

    // convert
    int i = atoi( &buffer[ startpos ] );

All very tricky, of course, and of dubious value. Are you SURE this is the bottleneck? Have you run a profile?
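
For the integer case, the conversion above can also be done without writing to the buffer at all, because strtol stops at the first character that is not part of a number and reports where it stopped. This is a sketch only, assuming a C++11 compiler and fields that are plain decimal integers; parse_ints is a made-up name, and overflow (which strtol reports through errno) is not checked.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Parse a line of comma-separated decimal integers without modifying it.
// Fields that do not start with a number are skipped.
std::vector<long> parse_ints(const std::string& line)
{
    std::vector<long> out;
    const char* p = line.c_str();
    while (*p != '\0')
    {
        char* end = nullptr;
        long value = std::strtol(p, &end, 10);  // skips leading whitespace
        if (end != p)                           // at least one digit was consumed
            out.push_back(value);
        p = end;

        // Move past the rest of the field and the comma that ends it
        while (*p != '\0' && *p != ',')
            ++p;
        if (*p == ',')
            ++p;
    }
    return out;
}
```

Unlike atoi, strtol makes it possible to tell a field that contains 0 from a field that contains no number at all, by comparing end with p.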

James
Jul 19 '07 #4
