
The function in question follows:

    vector<string>& tokenize( const string& s, vector<string>& v, char delimiter=',' )
    {
        int delim_idx, begin_idx=0, len=s.length();

        for( delim_idx=s.find_first_of(delimiter,begin_idx) ;
             delim_idx >=0 && begin_idx < len ;
             delim_idx=s.find_first_of(delimiter,begin_idx) ) {
            v.push_back( s.substr(begin_idx,delim_idx-begin_idx) );
            begin_idx=delim_idx+1;
        }
        if( begin_idx < len ) {
            v.push_back( s.substr(begin_idx,len-begin_idx) );
        }
        return( v );
    }

It seems to work well, but I'd appreciate suggestions regarding style, technique, or subtle bugs I've missed. Thanks.

-- 
Christopher Benson-Manica | I *should* know what I'm talking about - if I
ataru(at)cyberspace.org   | don't, I need to know. Flames welcome.

Jul 22 '05 #1
5 Replies

Christopher Benson-Manica wrote in news:c7**********@chessie.cirr.com:

> The function in question follows:
>
>     vector<string>& tokenize( const string& s, vector<string>& v, char delimiter=',' )
>     {
>         int delim_idx, begin_idx=0, len=s.length();

These are not of a strictly "proper" type... they should be of type std::string::size_type.

>         for( delim_idx=s.find_first_of(delimiter,begin_idx) ;
>              delim_idx >=0 && begin_idx < len ;
>              delim_idx=s.find_first_of(delimiter,begin_idx) ) {

Why aren't you checking against std::string::npos, which is the return value of find_first_of if the thing you're looking for isn't found?

>             v.push_back( s.substr(begin_idx,delim_idx-begin_idx) );
>             begin_idx=delim_idx+1;
>         }
>         if( begin_idx < len ) {
>             v.push_back( s.substr(begin_idx,len-begin_idx) );
>         }
>         return( v );
>     }
>
> It seems to work well, but I'd appreciate suggestions regarding style,
> technique, or subtle bugs I've missed. Thanks.

I'd probably use a while loop instead of a for loop. IMHO it's a better documentation style. To me, a for loop implies that you're going to do something a predetermined number of times, whereas a while loop is just "do something until some condition is met". So I'd probably do something like this (pseudocode):

    begin_idx = 0;
    delim_idx = find;
    while (delim_idx != npos)
    {
        push string fragment onto vector
        advance the begin_idx
        delim_idx = find;
    }
    push last fragment onto vector
    return

Jul 22 '05 #2
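For reference, the while-loop outline in this reply might be fleshed out roughly as follows. This is only a sketch: the name `tokenize_while` is made up for illustration, and it assumes the last fragment is pushed unconditionally, as the pseudocode suggests. It uses `std::string::size_type` and `std::string::npos`, as the reply recommends.

```cpp
#include <string>
#include <vector>

// Hypothetical name; a sketch of the while-loop outline, not the original poster's code.
std::vector<std::string>& tokenize_while(const std::string& s,
                                         std::vector<std::string>& v,
                                         char delimiter = ',')
{
    // size_type is the unsigned type that find() returns and substr() expects.
    std::string::size_type begin_idx = 0;
    std::string::size_type delim_idx = s.find(delimiter, begin_idx);

    // npos is what find() returns when no further delimiter exists.
    while (delim_idx != std::string::npos) {
        v.push_back(s.substr(begin_idx, delim_idx - begin_idx));
        begin_idx = delim_idx + 1;                 // step past the delimiter
        delim_idx = s.find(delimiter, begin_idx);
    }

    // Assumption: the last fragment is always pushed, even when it is empty.
    // begin_idx never exceeds s.size(), so substr() cannot throw here.
    v.push_back(s.substr(begin_idx));
    return v;
}
```

Called on "a,b,", this version yields three tokens, the last one empty, and an empty input yields a single empty token. The original function's `begin_idx < len` check drops both, so which behaviour is wanted depends on the caller.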

Hi,

This is my own version:

    #include <string>
    #include <vector>
    #include "tokenize.h"

    void Tokenize(const string& buffer, vector<string>& tokens, const char delimiter)
    {
        int pos = 0, pos_ant = 0;

        pos = buffer.find(delimiter, pos_ant);
        while (pos != string::npos)
        {
            string token = buffer.substr(pos_ant, pos-pos_ant);
            tokens.push_back(token);
            pos_ant = pos+1;
            pos = buffer.find(delimiter, pos_ant);
        }
        if (!buffer.empty())
        {
            tokens.push_back(buffer.substr(pos_ant, buffer.size()-1));
        }
    }

Regards,
Jose Luis

Christopher Benson-Manica wrote in message news:...

> The function in question follows:
>
>     vector<string>& tokenize( const string& s, vector<string>& v, char delimiter=',' )
>     {
>         int delim_idx, begin_idx=0, len=s.length();
>
>         for( delim_idx=s.find_first_of(delimiter,begin_idx) ;
>              delim_idx >=0 && begin_idx < len ;
>              delim_idx=s.find_first_of(delimiter,begin_idx) ) {
>             v.push_back( s.substr(begin_idx,delim_idx-begin_idx) );
>             begin_idx=delim_idx+1;
>         }
>         if( begin_idx < len ) {
>             v.push_back( s.substr(begin_idx,len-begin_idx) );
>         }
>         return( v );
>     }
>
> It seems to work well, but I'd appreciate suggestions regarding style,
> technique, or subtle bugs I've missed. Thanks.

Jul 22 '05 #3

Christopher Benson-Manica wrote in message news:...

> The function in question follows:
>
>     vector<string>& tokenize( const string& s, vector<string>& v, char delimiter=',' )
>     {
>         int delim_idx, begin_idx=0, len=s.length();
>
>         for( delim_idx=s.find_first_of(delimiter,begin_idx) ;
>              delim_idx >=0 && begin_idx < len ;
>              delim_idx=s.find_first_of(delimiter,begin_idx) ) {
>             v.push_back( s.substr(begin_idx,delim_idx-begin_idx) );
>             begin_idx=delim_idx+1;
>         }
>         if( begin_idx < len ) {
>             v.push_back( s.substr(begin_idx,len-begin_idx) );
>         }
>         return( v );
>     }
>
> It seems to work well, but I'd appreciate suggestions regarding style,
> technique, or subtle bugs I've missed. Thanks.

In addition to what Andre Kostur wrote, I'd suggest two extra parameters. One is a template parameter; the function would work just as well on any basic_string. The second is a boolean parameter that says whether empty strings are included in the output. When the separator is ',', you probably want to split "a,,b" into three strings. When it is ' ' and you are splitting "a  b", you probably want just two strings.

Regards,
Michiel Salters

Jul 22 '05 #4
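The two suggested parameters could be combined along these lines. This is a sketch under assumptions, not code from the thread: the name `tokenize_basic`, the parameter name `keep_empty`, and its default of `true` are all invented for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical sketch: works on any std::basic_string specialisation and lets
// the caller decide whether empty tokens are kept.
template <typename CharT, typename Traits, typename Alloc>
std::vector<std::basic_string<CharT, Traits, Alloc> >&
tokenize_basic(const std::basic_string<CharT, Traits, Alloc>& s,
               std::vector<std::basic_string<CharT, Traits, Alloc> >& v,
               CharT delimiter,
               bool keep_empty = true)   // assumed default
{
    typedef std::basic_string<CharT, Traits, Alloc> string_type;

    typename string_type::size_type begin_idx = 0;
    typename string_type::size_type delim_idx = s.find(delimiter, begin_idx);

    while (delim_idx != string_type::npos) {
        // A token is empty when the delimiter sits exactly at begin_idx.
        if (keep_empty || delim_idx > begin_idx) {
            v.push_back(s.substr(begin_idx, delim_idx - begin_idx));
        }
        begin_idx = delim_idx + 1;
        delim_idx = s.find(delimiter, begin_idx);
    }

    // Trailing fragment: empty when the input is empty or ends with a delimiter.
    if (keep_empty || begin_idx < s.length()) {
        v.push_back(s.substr(begin_idx));
    }
    return v;
}
```

Because the string type is deduced, the first argument has to be an actual `std::string` or `std::wstring` object rather than a string literal, and the delimiter must have the matching character type (for example `L';'` with `std::wstring`).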

jose luis fernandez diaz wrote:

> Hi,
>
> This is my own version:
>
>     #include <string>
>     #include <vector>
>     #include "tokenize.h"
>
>     void Tokenize(const string& buffer, vector<string>& tokens, const char delimiter)
>     {
>         int pos = 0, pos_ant = 0;
>
>         pos = buffer.find(delimiter, pos_ant);
>         while (pos != string::npos)
>         {
>             string token = buffer.substr(pos_ant, pos-pos_ant);
>             tokens.push_back(token);
>             pos_ant = pos+1;
>             pos = buffer.find(delimiter, pos_ant);
>         }
>         if (!buffer.empty())
>         {
>             tokens.push_back(buffer.substr(pos_ant, buffer.size()-1));
>         }
>     }

One better?

    #include <string>

    template <class InsertIter>
    void tokenize(const std::string& buf, const std::string& delim, InsertIter ii)
    {
        std::string::size_type sp = 0;  /* start position */
        std::string::size_type ep = -1; /* end position */

        do {
            sp = buf.find_first_not_of(delim, ep+1);
            ep = buf.find_first_of(delim, sp);
            if (sp != ep) {
                if (ep == buf.npos)
                    ep = buf.length();
                *ii++ = buf.substr(sp, ep-sp);
            }
        } while (sp != buf.npos);
    }

/david

-- 
"As a scientist, Throckmorton knew that if he were ever to break wind in
the echo chamber, he would never hear the end of it."

Jul 22 '05 #5
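A possible way to call the iterator-based version is with `std::back_inserter`. The sketch below repeats the function so that it compiles on its own; the `#include` line and the `template <class InsertIter>` header are assumed reconstructions, and the `split` wrapper is a hypothetical helper that does not appear in the thread.

```cpp
#include <iterator>
#include <string>
#include <vector>

// Function from the post above; include and template header are assumed.
template <class InsertIter>
void tokenize(const std::string& buf, const std::string& delim, InsertIter ii)
{
    std::string::size_type sp = 0;                  // start position
    std::string::size_type ep = std::string::npos;  // same value as -1; ep + 1 wraps to 0

    do {
        sp = buf.find_first_not_of(delim, ep + 1);  // skip a run of delimiters
        ep = buf.find_first_of(delim, sp);          // find the end of the token
        if (sp != ep) {                             // both are npos once the input is exhausted
            if (ep == std::string::npos)
                ep = buf.length();
            *ii++ = buf.substr(sp, ep - sp);
        }
    } while (sp != std::string::npos);
}

// Hypothetical convenience wrapper: collect the tokens into a vector.
std::vector<std::string> split(const std::string& buf, const std::string& delim)
{
    std::vector<std::string> out;
    tokenize(buf, delim, std::back_inserter(out));
    return out;
}
```

Here `delim` is treated as a set of delimiter characters, and runs of delimiters are collapsed, so this version never produces empty tokens: `split("a,,b; c", ",; ")` gives the three tokens "a", "b" and "c".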

Michiel Salters spoke thus:

> In addition to what Andre Kostur wrote, I'd suggest two extra parameters.
> One is a template parameter; the function would work just as well on
> any basic_string. The second is a boolean parameter, whether empty strings
> are included in the output. When the separator is ',', you probably want
> to split "a,,b" in three strings. When ' ', and splitting "a  b", you
> probably want just two strings.

The empty strings argument is a great idea - thanks!

-- 
Christopher Benson-Manica | I *should* know what I'm talking about - if I
ataru(at)cyberspace.org   | don't, I need to know. Flames welcome.

Jul 22 '05 #6
