String tokens/parsing

(if this is a FAQ, I apologize for not finding it)

I have a C-style string that I'd like to cleanly separate into tokens
(based on the '.' character) and then convert those tokens to unsigned
integers. What is the best standard(!) C++ way to accomplish this?

--
Christopher Benson-Manica | I *should* know what I'm talking about - if I
ataru(at)cyberspace.org | don't, I need to know. Flames welcome.
Jul 22 '05 #1
Christopher Benson-Manica wrote:
(if this is a FAQ, I apologize for not finding it)

I have a C-style string that I'd like to cleanly separate into tokens
(based on the '.' character) and then convert those tokens to unsigned
integers. What is the best standard(!) C++ way to accomplish this?


The strtok function will find tokens in the string and
modify your string.

Perhaps strchr to find the '.'.

Another function is sscanf. I've heard that you can set
the format descriptor string so that it parses correctly
(which may be a difficult task). I'm sure if you post
to news:comp.lang.c, Dan Pop will show the way.
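
For readers who want to see the sscanf route concretely, here is a minimal sketch. It assumes the input always has exactly three '.'-separated fields; the helper name parse_three is made up for illustration and does not come from the thread. Note that %u is lenient: it skips leading whitespace, and this version does not detect trailing characters after the third field.

```cpp
#include <cstdio>

// Hypothetical helper (not from the thread): parses exactly three
// '.'-separated unsigned fields, e.g. "10.20.30".
// Returns true only if all three fields were converted.
bool parse_three(const char *s, unsigned &a, unsigned &b, unsigned &c)
{
    // sscanf returns the number of fields it successfully assigned,
    // so anything other than 3 means the input did not match.
    return std::sscanf(s, "%u.%u.%u", &a, &b, &c) == 3;
}
```

The format string hard-codes the field count, so this only suits input whose shape is known in advance.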

As for C++, you may want to convert to a std::string
and use the "find" methods and maybe a stringstream
for converting to an int.
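
A rough sketch of what the std::string route might look like, using find() to locate each '.' and a stringstream for the conversion. The names to_unsigned and split_dotted are hypothetical, and the choice to reject empty or non-digit tokens is an assumption about what "cleanly" should mean here.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: converts a token made only of decimal digits.
// Returns false for an empty token, a non-digit character, or overflow.
bool to_unsigned(const std::string &token, unsigned &out)
{
    if (token.empty() ||
        token.find_first_not_of("0123456789") != std::string::npos)
        return false;
    std::istringstream in(token);
    in >> out;
    return !in.fail();
}

// Hypothetical helper: splits s at each '.' and appends the converted
// values to out. Returns false as soon as one token fails to convert.
bool split_dotted(const char *s, std::vector<unsigned> &out)
{
    const std::string str(s);
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type dot = str.find('.', start);
        const std::string token = (dot == std::string::npos)
            ? str.substr(start)                 // last token: rest of string
            : str.substr(start, dot - start);   // token up to the separator
        unsigned value = 0;
        if (!to_unsigned(token, value))
            return false;
        out.push_back(value);
        if (dot == std::string::npos)
            return true;
        start = dot + 1;                        // continue after the '.'
    }
}
```

Checking the token for digits up front keeps the stringstream from quietly accepting input such as "12abc" or a leading sign.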

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.raos.demon.uk/acllc-c++/faq.html
Other sites:
http://www.josuttis.com -- C++ STL Library book

Jul 22 '05 #2
Thomas Matthews <Th****************************@sbcglobal.net> spoke thus:
The strtok function will find tokens in the string and
modify your string. Perhaps strchr to find the '.'. Another function is
sscanf. I've heard that you can set the format descriptor string so that
it parses correctly (which may be a difficult task). I'm sure if you post
to news:comp.lang.c, Dan Pop will show the way.


Believe me, I'm perfectly capable of doing this with C, and am no
stranger to comp.lang.c. I posted here specifically because I'm
interested in improving on the C methods, if that is in fact possible.
The original C code (we're stuck in a "C-style-C++" paradigm,
unfortunately) strikes me as being distinctively hack-y.

--
Christopher Benson-Manica | I *should* know what I'm talking about - if I
ataru(at)cyberspace.org | don't, I need to know. Flames welcome.
Jul 22 '05 #3
Christopher Benson-Manica wrote:
Believe me, I'm perfectly capable of doing this with C, and am no
stranger to comp.lang.c. I posted here specifically because I'm
interested in improving on the C methods, if that is in fact possible.
The original C code (we're stuck in a "C-style-C++" paradigm,
unfortunately) strikes me as being distinctively hack-y.

Then you need to be more specific about what you already have, and what
your requirements are. Do you object to converting the C-style strings
to std::strings or std::stringstreams? What limitations does your
"C-style-C++" paradigm impose? To what extent can you deviate from the C
standard library?

You have a very vague question.


Brian Rodenborn
Jul 22 '05 #4
Default User <fi********@boeing.com.invalid> spoke thus:
Then you need to be more specific about what you already have,
The original code looked like the following unholy mess (which I did
not write):

// unsigned int typedef'ed as uint
// assume appropriate #includes

char sTemp[64];
uint uDeptTime, uArrvTime;
if( argc>=2 ) {
    if( sameas(argv[1], "") ) { // sameas ~ strcmp() with flavor
        // error
    }
    strncpy( sTemp, argv[1], sizeof(sTemp) );
    cp=strchr(sTemp, '.');
    if( cp == NULL ) {
        // error
    }
    uint const lene=strlen(cp);
    uint const lenb=strlen(sTemp);
    uint const lenr=lenb-lene;
    sTemp[lenr]='\0';
    uDeptTime=(uint)atoi(sTemp);
    if( cp+1 ) {
        uArrvTime=(uint)atoi(cp+1);
    }
}

I wrote the following as a first approximation to a decent solution:

char *cp;
vector<uint> v;
char sTemp[64];
uint uDeptTime, uArrvTime, uMaxGT; // new variable
for( cp=argv[1] ; (cp=strchr(cp,'.')) != NULL ; ) {
    v.push_back( atoi(cp++) ); // atoi() wraps the "standard" atoi, but
                               // I don't know the details
}
if( v.size() < 3 ) {
    // error
}
uDeptTime=v[0];
uArrvTime=v[1];
uMaxGT=v[2];

and what your requirements are.

Straightline code only. (no additional class declarations)

Do you object to converting the C-style strings to std::strings or
std::stringstreams?

I'd love to use std::strings and/or std::stringstreams if they offer a
cleaner (not necessarily more "efficient") solution.

What limitations does your "C-style-C++" paradigm impose?

The STL is never used in our code, and vectors in particular seem to
be frowned upon. std::stringstreams might be pushing the envelope.

You have a very vague question.


Better?

--
Christopher Benson-Manica | I *should* know what I'm talking about - if I
ataru(at)cyberspace.org | don't, I need to know. Flames welcome.
Jul 22 '05 #5
Chris,

Christopher Benson-Manica wrote:
Default User <fi********@boeing.com.invalid> spoke thus:

Then you need to be more specific about what you already have,


[snip]
<sigh>

Sometimes those responding to messages in this group can be a little...
well... pedantic. If you are looking for a decent C++ implementation of
a tokenizer, have a look at the boost.org site where they have a decent
tokenizer.

The Boost web site contains many useful template libraries that
complement the STL. As it turns out, many of the STL authors contribute
to this site. The way they put it, many of their submissions didn't make
it into the standard, but are nonetheless useful and worthy of use.

Jul 22 '05 #6
Evan Carew <te*******@pobox.com> spoke thus:
Sometimes those responding to messages in this group can be a little...
well... pedantic.
And I wouldn't want it any other way :)
If you are looking for a decent C++ implementation of
a tokenizer, have a look at the boost.org site where they have a decent
tokenizer.


Unfortunately, boost is out of the question here. I'm working at a
company where any code not written in-house (i.e., by my boss) is
considered suspect, so in effect I'm trying to sneak some "real" C++
in the code here and there below the radar. There are times where
std::strings can really make life simple, so I toss them in
occasionally, but for the most part C-style strings rule the day.

--
Christopher Benson-Manica | I *should* know what I'm talking about - if I
ataru(at)cyberspace.org | don't, I need to know. Flames welcome.
Jul 22 '05 #7
Christopher Benson-Manica <at***@nospam.cyberspace.org> spoke thus:
I wrote the following as a first approximation to a decent solution:
And then (now) realized that it doesn't work...
for( cp=argv[1] ; (cp=strchr(cp,'.')) != NULL ; ) {
    v.push_back( atoi(cp++) ); // atoi() wraps the "standard" atoi, but
                               // I don't know the details
}


*sigh*

for( cp=argv[1] ; cp && cp++ ; cp=strchr(cp,'.') ) {
    v.push_back( atoi(cp) );
}
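
As posted, the loop condition increments cp before the first conversion, so the first character of argv[1] appears to be skipped (for "12.34" the first value would come out as 2). One possible way around that, sketched under a few assumptions: it uses std::strtoul instead of the in-house atoi wrapper, writes into a caller-supplied array rather than a vector, and does not check for overflow. The name parse_dotted is hypothetical.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: reads up to max '.'-separated unsigned values
// from s into out[] and returns how many were stored. Parsing stops at
// the first field that does not begin with a digit.
std::size_t parse_dotted(const char *s, unsigned out[], std::size_t max)
{
    std::size_t count = 0;
    const char *cp = s;
    while (count < max && *cp >= '0' && *cp <= '9') {
        char *end = 0;
        // strtoul leaves end pointing at the first unconverted character.
        out[count++] = static_cast<unsigned>(std::strtoul(cp, &end, 10));
        if (*end != '.')
            break;      // end of string or an unexpected character
        cp = end + 1;   // step over the separator
    }
    return count;
}
```

The caller can then compare the return value against the number of fields it expects, much like the v.size() < 3 check earlier in the thread.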

--
Christopher Benson-Manica | I *should* know what I'm talking about - if I
ataru(at)cyberspace.org | don't, I need to know. Flames welcome.
Jul 22 '05 #8
Christopher Benson-Manica wrote:

Default User <fi********@boeing.com.invalid> spoke thus:
Then you need to be more specific about what you already have,


and what your requirements are.


Straightline code only. (no additional class declarations)
Do you object to converting the C-style strings to std::strings or
std::stringstreams?


I'd love to use std::strings and/or std::stringstreams if they offer a
cleaner (not necessarily more "efficient") solution.


Ok, Donovan Rebbechi previously posted this:

The simplest way would be to use the getline() function and set the
optional field separator argument to '.'

std::istringstream in(mystring);

while (std::getline(in, mystring, '.'))
{
    stringlist.push_back(mystring);
}

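
To tie the getline() idea back to the original question, here is a hedged sketch that also converts each token to an unsigned value. The function name tokens_to_uints and the validation rules are assumptions, and it uses a vector, which may not fit the constraints described above.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits s on '.' with std::getline and converts
// each token with std::strtoul. Returns false on an empty or non-numeric
// token, or if no token was found at all. Overflow is not checked.
bool tokens_to_uints(const char *s, std::vector<unsigned> &out)
{
    const std::string str(s);
    std::istringstream in(str);
    std::string token;
    while (std::getline(in, token, '.')) {
        // Require a leading digit so signs and whitespace are rejected.
        if (token.empty() || token[0] < '0' || token[0] > '9')
            return false;
        char *end = 0;
        const unsigned long value = std::strtoul(token.c_str(), &end, 10);
        if (*end != '\0')
            return false;   // trailing junk inside the token
        out.push_back(static_cast<unsigned>(value));
    }
    return !out.empty();
}
```

One quirk worth knowing: getline() does not produce a final empty token for a trailing separator, so an input such as "1.2." is treated the same as "1.2" here.
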
What limitations does your "C-style-C++" paradigm impose?


The STL is never used in our code, and vectors in particular seem to
be frowned upon. std::stringstreams might be pushing the envelope.


My usual tool for this sort of thing is the Explode function I wrote for
string parsing. Unfortunately, it returns a vector of strings. I'll
present it anyway, you may be able to get some value from it. Or not.

#include <vector>
#include <string>

// breaks apart a string into substrings separated by a character string
// does not use a strtok() style list of separator characters
// returns a vector of std::strings

std::vector<std::string> Explode (const std::string &inString,
                                  const std::string &separator)
{
    std::vector<std::string> returnVector;
    std::string::size_type start = 0;
    std::string::size_type end = 0;

    while ((end=inString.find (separator, start)) != std::string::npos)
    {
        returnVector.push_back (inString.substr (start, end-start));
        start = end+separator.size();
    }

    returnVector.push_back (inString.substr (start));

    return returnVector;
}

You have a very vague question.


Better?


Much.

Brian Rodenborn
Jul 22 '05 #9
Evan Carew wrote:
Sometimes those responding to messages in this group can be a little...
well... pedantic.
I'm not sure if my questions were pedantic. Had he presented the problem
cleanly, then one could try to answer. As he had some not-well-defined
limits, I thought it prudent to ask before presenting solutions that may
not suit him. For instance, in light of his followup, something like my
Explode() function I trot out now and then wouldn't do, because it
returns a vector of strings.
If you are looking for a decent C++ implementation of
a tokenizer, have a look at the boost.org site where they have a decent
tokenizer.


Considering that he specified a "C-style C++ paradigm" I doubt Boost
will be in his solution set. Which is exactly why I asked.

Brian Rodenborn
Jul 22 '05 #10
