The function in question follows:
vector<string>& tokenize(
const string& s,
vector<string>& v,
char delimiter=',' )
{
int delim_idx, begin_idx=0, len=s.length();
for( delim_idx=s.find_first_of(delimiter,begin_idx) ;
delim_idx >=0 && begin_idx < len ;
delim_idx=s.find_first_of(delimiter,begin_idx) ) {
v.push_back( s.substr(begin_idx,delim_idx-begin_idx) );
begin_idx=delim_idx+1;
}
if( begin_idx < len ) {
v.push_back( s.substr(begin_idx,len-begin_idx) );
}
return( v );
}
It seems to work well, but I'd appreciate suggestions regarding style,
technique, or subtle bugs I've missed. Thanks.
--
Christopher Benson-Manica | I *should* know what I'm talking about - if I
ataru(at)cyberspace.org | don't, I need to know. Flames welcome.
Christopher Benson-Manica <at***@nospam.cyberspace.org> wrote in
news:c7**********@chessie.cirr.com:
> The function in question follows:
> vector<string>& tokenize(
> const string& s,
> vector<string>& v,
> char delimiter=',' )
> {
> int delim_idx, begin_idx=0, len=s.length();
These are not of a strictly "proper" type... they should be of type
std::string::size_type.
> for( delim_idx=s.find_first_of(delimiter,begin_idx) ;
> delim_idx >=0 && begin_idx < len ;
> delim_idx=s.find_first_of(delimiter,begin_idx) ) {
Why aren't you checking against std::string::npos, which is the return
value of find_first_of if the thing you're looking for isn't found?
> v.push_back( s.substr(begin_idx,delim_idx-begin_idx) );
> begin_idx=delim_idx+1;
> }
> if( begin_idx < len ) {
> v.push_back( s.substr(begin_idx,len-begin_idx) );
> }
> return( v );
> }
> It seems to work well, but I'd appreciate suggestions regarding style,
> technique, or subtle bugs I've missed. Thanks.
I'd probably use a while loop instead of a for loop. IMHO it documents
the intent better: to me a for loop implies that you're going to do
something a predetermined number of times, whereas a while loop is just
"do something until some condition is met". So I'd probably do something
like (pseudocode):
begin_idx = 0;
delim_idx = find;
while (delim_idx != npos)
{
push string fragment onto vector
advance the begin_idx
delim_idx = find;
}
push last fragment onto vector
return
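For reference, the pseudocode above could be fleshed out roughly as follows. This is only a sketch: it keeps the signature from the original post, uses std::string::size_type and std::string::npos as suggested, and pushes the last fragment unconditionally (as the pseudocode does), so a trailing delimiter or an empty input yields an empty final token. Whether that is wanted is not settled in this thread.

```cpp
#include <string>
#include <vector>

// Sketch of the while-loop version; same signature as the original post.
std::vector<std::string>& tokenize(const std::string& s,
                                   std::vector<std::string>& v,
                                   char delimiter = ',')
{
    std::string::size_type begin_idx = 0;
    std::string::size_type delim_idx = s.find(delimiter, begin_idx);

    while (delim_idx != std::string::npos) {
        // Fragment between the previous delimiter and this one.
        v.push_back(s.substr(begin_idx, delim_idx - begin_idx));
        begin_idx = delim_idx + 1;
        delim_idx = s.find(delimiter, begin_idx);
    }

    // Last fragment: everything after the final delimiter (may be empty).
    v.push_back(s.substr(begin_idx));
    return v;
}
```

With this version, tokenizing "a,,b" gives three tokens, the middle one empty.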
Hi,
This is my own version:
#include <string>
#include <vector>
#include "tokenize.h"
void Tokenize(const string& buffer,
vector<string>& tokens,
const char delimiter)
{
string::size_type pos = 0, pos_ant = 0; // find() returns size_type, not int
pos = buffer.find(delimiter, pos_ant);
while (pos != string::npos)
{
string token = buffer.substr(pos_ant, pos-pos_ant);
tokens.push_back(token);
pos_ant = pos+1;
pos = buffer.find(delimiter, pos_ant);
}
if (!buffer.empty())
{
tokens.push_back(buffer.substr(pos_ant)); // rest of the buffer
}
}
Regards,
Jose Luis
Christopher Benson-Manica <at***@nospam.cyberspace.org> wrote in message
news:<c7**********@chessie.cirr.com>...
> The function in question follows:
> vector<string>& tokenize(
> const string& s,
> vector<string>& v,
> char delimiter=',' )
> {
> int delim_idx, begin_idx=0, len=s.length();
> for( delim_idx=s.find_first_of(delimiter,begin_idx) ;
> delim_idx >=0 && begin_idx < len ;
> delim_idx=s.find_first_of(delimiter,begin_idx) ) {
> v.push_back( s.substr(begin_idx,delim_idx-begin_idx) );
> begin_idx=delim_idx+1;
> }
> if( begin_idx < len ) {
> v.push_back( s.substr(begin_idx,len-begin_idx) );
> }
> return( v );
> }
> It seems to work well, but I'd appreciate suggestions regarding style,
> technique, or subtle bugs I've missed. Thanks.
In addition to what Andre Kostur wrote, I'd suggest two extra
parameters. One is a template parameter; the function would
work just as well on any basic_string<CH>. The second is a
boolean parameter, whether empty strings are included in
the output. When the separator is ',', you probably want
to split "a,,b" into three strings. When it is ' ' and you are
splitting "a  b", you probably want just two strings.
Regards,
Michiel Salters
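A rough sketch of what those two extra parameters might look like. The name tokenize_generic and the parameter order are made up for illustration; only the character-type template parameter and the keep-empty flag come from the suggestion above.

```cpp
#include <string>
#include <vector>

// Hypothetical name; only the two extra parameters (character type and
// keep_empty flag) come from the suggestion above.
template <typename CharT>
std::vector<std::basic_string<CharT> >&
tokenize_generic(const std::basic_string<CharT>& s,
                 std::vector<std::basic_string<CharT> >& v,
                 CharT delimiter,
                 bool keep_empty)
{
    typedef std::basic_string<CharT> string_type;
    typedef typename string_type::size_type size_type;

    size_type begin_idx = 0;
    for (;;) {
        const size_type delim_idx = s.find(delimiter, begin_idx);
        // Treat "no more delimiters" as a delimiter at the end of the string.
        const size_type end_idx =
            (delim_idx == string_type::npos) ? s.size() : delim_idx;

        // Empty fragments are stored only when the caller asks for them.
        if (keep_empty || end_idx > begin_idx)
            v.push_back(s.substr(begin_idx, end_idx - begin_idx));

        if (delim_idx == string_type::npos)
            break;
        begin_idx = delim_idx + 1;
    }
    return v;
}
```

Note that the first argument has to be an actual std::string or std::wstring object; a bare string literal will not let the compiler deduce CharT.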
jose luis fernandez diaz wrote:
> Hi,
> This is my own version:
> #include <string>
> #include <vector>
> #include "tokenize.h"
> void Tokenize(const string& buffer,
> vector<string>& tokens,
> const char delimiter)
> {
> string::size_type pos = 0, pos_ant = 0; // find() returns size_type, not int
> pos = buffer.find(delimiter, pos_ant);
> while (pos != string::npos)
> {
> string token = buffer.substr(pos_ant, pos-pos_ant);
> tokens.push_back(token);
> pos_ant = pos+1;
> pos = buffer.find(delimiter, pos_ant);
> }
> if (!buffer.empty())
> {
> tokens.push_back(buffer.substr(pos_ant)); // rest of the buffer
> }
> }
One better?
#include <string>
template <typename InsertIter>
void
tokenize(const std::string& buf,
const std::string& delim,
InsertIter ii)
{
std::string::size_type sp = 0; /* start position */
std::string::size_type ep = -1; /* end position */
do{
sp = buf.find_first_not_of(delim, ep+1);
ep = buf.find_first_of(delim, sp);
if(sp != ep){
if(ep == buf.npos)
ep = buf.length();
*ii++ = buf.substr(sp, ep-sp);
}
}while(sp != buf.npos);
}
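A possible way to call it, as a sketch: the template is repeated here only so the example compiles on its own, and split_words is a hypothetical helper name. Note that delim is treated as a set of delimiter characters and that empty tokens are skipped.

```cpp
#include <iterator>
#include <string>
#include <vector>

// Copy of the template above, repeated only to keep this example self-contained.
template <typename InsertIter>
void tokenize(const std::string& buf, const std::string& delim, InsertIter ii)
{
    std::string::size_type sp = 0;                 /* start position */
    std::string::size_type ep = std::string::npos; /* end position; npos+1 wraps to 0 */
    do {
        sp = buf.find_first_not_of(delim, ep + 1);
        ep = buf.find_first_of(delim, sp);
        if (sp != ep) {
            if (ep == std::string::npos)
                ep = buf.length();
            *ii++ = buf.substr(sp, ep - sp);
        }
    } while (sp != std::string::npos);
}

// Hypothetical helper: split on commas and spaces into a vector.
std::vector<std::string> split_words(const std::string& line)
{
    std::vector<std::string> out;
    // std::back_inserter appends each token to the container.
    tokenize(line, ", ", std::back_inserter(out));
    return out;
}
```

Any output iterator should work in place of std::back_inserter, for example one that appends to a std::list.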
/david
--
"As a scientist, Throckmorton knew that if he were ever to break wind in
the echo chamber, he would never hear the end of it."
Michiel Salters <Mi*************@logicacmg.com> spoke thus:
> In addition to what Andre Kostur wrote, I'd suggest two extra
> parameters. One is a template parameter; the function would
> work just as well on any basic_string<CH>. The second is a
> boolean parameter, whether empty strings are included in
> the output. When the separator is ',', you probably want
> to split "a,,b" into three strings. When it is ' ' and you are
> splitting "a  b", you probably want just two strings.
The empty strings argument is a great idea - thanks!
--
Christopher Benson-Manica | I *should* know what I'm talking about - if I
ataru(at)cyberspace.org | don't, I need to know. Flames welcome.