C++ Tokenize vectors strings, no values between delimiters

m6s
1. After hours of research, I ended up using these snippets:
void Object::TokenizeLines(const string& str, vector<string>& tokens, const string& delimiters)
{
    // Skip delimiters at the beginning.
    string::size_type lastPos = str.find_first_not_of(delimiters, 0);
    // Find the first delimiter after that position.
    string::size_type pos     = str.find_first_of(delimiters, lastPos);

    while (string::npos != pos || string::npos != lastPos) {
        // Found a token, add it to the vector.
        tokens.push_back( CheckWord( str, lastPos, pos - lastPos));
        // Skip delimiters.  Note the "not_of".
        lastPos = str.find_first_not_of(delimiters, pos);
        // Find the next delimiter.
        pos = str.find_first_of(delimiters, lastPos);
    }
}

I use this to break large input into lines, with a function call like this: string.Tokenize(str, tokens, "\n");
Because each line contains things like abc,de,,,f,g,,h, the function above gave faulty results when I used it on the lines themselves. So I found this one:
// Needs <algorithm> (find, min), <string> and <vector>.
void TokenizeWithComma(const string& str, vector<string>& tokens){
    const char* first = str.c_str();
    const char* last = first + str.size();
    while (first != last) {
        // Find the next comma, or the end of the string.
        const char* next = find(first, last, ',');
        // Store the field; it is empty when two commas are adjacent.
        tokens.push_back(string(first, next - first));
        // Step past the comma without running past the end.
        first = min(next + 1, last);
    }
}

I call it after TokenizeLines to split each line into words, passing it a string and a vector. Both worked.
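
A minimal sketch of what the first function does with a line such as abc,de,,,f,g,,h. The name split_skip_empty is made up for illustration, and substr stands in for CheckWord, which is not shown in the post. Because find_first_not_of jumps over a whole run of delimiters at once, the empty fields between consecutive commas never reach the vector.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: same loop shape as TokenizeLines, with substr
// standing in for CheckWord (an assumption, CheckWord is not shown).
std::vector<std::string> split_skip_empty(const std::string& str,
                                          const std::string& delimiters) {
    std::vector<std::string> tokens;
    // Start of the first token (skips any leading delimiters).
    std::string::size_type lastPos = str.find_first_not_of(delimiters, 0);
    // First delimiter after that token.
    std::string::size_type pos = str.find_first_of(delimiters, lastPos);
    while (pos != std::string::npos || lastPos != std::string::npos) {
        tokens.push_back(str.substr(lastPos, pos - lastPos));
        // Skips ALL adjacent delimiters, so ",,," counts as one separator.
        lastPos = str.find_first_not_of(delimiters, pos);
        pos = str.find_first_of(delimiters, lastPos);
    }
    return tokens;
}
```

With "," as the delimiter, the line a,bc,,d comes back as three tokens (a, bc, d) rather than four, which is the behaviour described as faulty above.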

After spending too many hours on this today, I have my homework done, but I am not sure what I actually did here...
Can someone (not a novice like me) give me a more detailed view?

Also :
2. Why doesn't the first function work with strings like a,bc,,d ?
3. I also tried this for each word (the words are in a vector):
for (w_iter = token_lines.begin(); w_iter != token_lines.end(); w_iter++) {
    string ff = (*w_iter);
    string::size_type loc = ff.find( "abc", 0 );
    if( loc != string::npos ) { cout << "Found Omega at " << loc << endl;}
    else {cout << "Didn't find Omega" << endl;}
}

But while this code works with a normal string ff("abc,ccc,cc"), it does not seem to work with the declaration I had. Is it the iterator's fault?
This drove me nuts. To find patterns, I ended up dividing the string with substr, with extra cases to avoid passing out-of-range values to substr (which is again a member of string).
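
A small sketch of searching every element of a vector through an iterator. The helper name find_in_each is hypothetical. It only illustrates that dereferencing the iterator yields an ordinary std::string, so find() behaves the same as on a string declared directly; it does not diagnose what went wrong in the original program.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: for each line, record where `needle` first occurs,
// or std::string::npos when it does not occur at all.
std::vector<std::string::size_type> find_in_each(
        const std::vector<std::string>& lines, const std::string& needle) {
    std::vector<std::string::size_type> positions;
    for (std::vector<std::string>::const_iterator it = lines.begin();
         it != lines.end(); ++it) {
        // *it is a plain std::string; it->find is the usual member find.
        positions.push_back(it->find(needle));
    }
    return positions;
}
```

If the search succeeds on a directly declared string but not on the vector elements, it may be worth printing each element first to check what the vector really holds at that point.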

I know it might be a boring topic for most, but I appreciate your help.
Thank you
Feb 9 '08 #1
6 Replies
weaknessforcats
Expert Mod 8TB
Quote (m6s): "But the code while is a working with a normal string ff("abc,ccc,cc"),"

string ff("abc,ccc,cc") creates a string object.

Your function has a string& argument, so you can use that string object as an argument.

This ("abc,ccc,cc") is a C-string. A C-string cannot be bound to a non-const string& because it is not a string object; it can only be passed to a const string& parameter, where a temporary string is created from it.

Finally, I like your last solution. You should never use the string::c_str() method unless your function absolutely requires a C-string. Considering that std::string supersedes most uses of the C string library in C++, there should be little call for this.
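
To make the reference distinction concrete, here is a short sketch. The function count_commas is hypothetical; it only shares its parameter type, const std::string&, with the tokenizers in the question. A string literal can be passed to it because a temporary std::string is constructed for the call, whereas a non-const std::string& parameter would reject the literal at compile time.

```cpp
#include <string>

// Hypothetical function taking a reference to const std::string,
// like the tokenizers in the question.
std::string::size_type count_commas(const std::string& str) {
    std::string::size_type n = 0;
    for (std::string::size_type i = 0; i < str.size(); ++i) {
        if (str[i] == ',') ++n;
    }
    return n;
}

// A non-const reference parameter, by contrast, needs a named string:
//   void modify(std::string& s);
//   modify("abc,ccc,cc");   // does not compile: a temporary cannot
//                           // bind to a non-const reference
```

Both count_commas(ff), with string ff("abc,ccc,cc"), and count_commas("abc,ccc,cc") are accepted and give the same count.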
Feb 9 '08 #2
m6s
Thank you for your answer, and for the kind words :-)
I even tried string temp = *iter, but that didn't work either. I assume that this should load a string object that can then be tokenized, right? temp was then passed to the function, but that didn't work either.
Finally, how could I write TokenizeWithComma without c_str()?
Should I build the whole string as char arrays? Or is there another choice that sticks as closely as possible to C++?
Feb 11 '08 #3
weaknessforcats
Expert Mod 8TB
Why can't you use find() to locate the next comma??
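
One possible reading of this suggestion, as a sketch only: walk the string with std::string::find and substr, so no c_str() is needed. The name split_keep_empty is invented for illustration and is not code from the thread.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split `str` on a single delimiter character,
// keeping empty fields such as the ones between consecutive commas.
std::vector<std::string> split_keep_empty(const std::string& str, char delim) {
    std::vector<std::string> tokens;
    std::string::size_type start = 0;
    std::string::size_type pos = str.find(delim, start);
    while (pos != std::string::npos) {
        // Field between the previous delimiter and this one (may be empty).
        tokens.push_back(str.substr(start, pos - start));
        start = pos + 1;                  // step past the delimiter
        pos = str.find(delim, start);
    }
    tokens.push_back(str.substr(start));  // last field, possibly empty
    return tokens;
}
```

For a,bc,,d this yields four tokens, the third one empty. Unlike TokenizeWithComma, this sketch also keeps a trailing empty field (for example after a final comma) and returns one empty token for an empty string; whether that is wanted depends on the assignment.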
Feb 11 '08 #4
m6s
:-) I don't understand it myself either!
I have a 64-bit machine; can that be the problem?
Try it yourself if you like and find it interesting...
I know it did not work for me.

In other words, can you build a vector of lines (from a file, even better) that look like a,b,c,,,d,f,g and so on, and iterate over it? (Did you notice the three commas, meaning two empty fields?)

Then tokenize it with pure C++ code rather than C, that is, with the first Tokenize function.
What is your result? Oh, and assign each dereferenced iterator to a string before tokenizing it.
For me it was a disaster...
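
The experiment described here could be sketched as follows, under some assumptions: the input is read from any std::istream (a std::ifstream would do for a file), and std::getline with ',' as the delimiter replaces the first Tokenize function, since that one drops empty fields. The name read_csv_lines is hypothetical.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: read `in` line by line and split each line on commas.
std::vector<std::vector<std::string> > read_csv_lines(std::istream& in) {
    std::vector<std::vector<std::string> > rows;
    std::string line;
    while (std::getline(in, line)) {
        std::vector<std::string> fields;
        std::istringstream fieldStream(line);
        std::string field;
        // getline returns an empty field for two adjacent commas.
        while (std::getline(fieldStream, field, ',')) {
            fields.push_back(field);
        }
        rows.push_back(fields);
    }
    return rows;
}
```

A line a,b,c,,,d,f,g produces eight fields, two of them empty. One caveat: an empty field after a trailing comma at the end of a line is not reported by this loop.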
Feb 11 '08 #5
weaknessforcats
Expert Mod 8TB
Quote (m6s): "In other words, can you make an iterator from words ( if from a file even better) which will be like a,b,c,,,d,f,g ....so on... (did you notice the three commas-->2 spaces?"

It's not the string that's your problem. It's your parsing logic.

There is an article in the C/C++ HowTos on the State Design Pattern, and inside that article is how to construct a tokenizer to break a string into individual words, complete with code. You might read that article.
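
The article's code is not reproduced in this thread, so the following is only a rough sketch of the general idea of a state-driven tokenizer. It uses a plain enum and a switch rather than the class-per-state structure of the State design pattern, and the name tokenize_words is made up.

```cpp
#include <string>
#include <vector>

// Hypothetical two-state tokenizer: the scanner is either between words
// or inside a word, and each character may trigger a state change.
std::vector<std::string> tokenize_words(const std::string& text,
                                        const std::string& delimiters) {
    enum State { BetweenWords, InWord };
    State state = BetweenWords;
    std::vector<std::string> words;
    std::string current;
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        const bool isDelimiter =
            delimiters.find(text[i]) != std::string::npos;
        switch (state) {
        case BetweenWords:
            if (!isDelimiter) {           // a word starts here
                current.assign(1, text[i]);
                state = InWord;
            }
            break;
        case InWord:
            if (isDelimiter) {            // the word just ended
                words.push_back(current);
                state = BetweenWords;
            } else {
                current += text[i];
            }
            break;
        }
    }
    if (state == InWord) words.push_back(current);  // flush the last word
    return words;
}
```

Like the first function in the question, this version treats a run of delimiters as one separator, so it suits splitting text into words but not comma-separated data with empty fields.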
Feb 12 '08 #6
m6s
Ok, thank you for your close support on this, I am going to check this article too...
Feb 13 '08 #7
