C++ Tokenize vectors strings, no values between delimiters

m6s
55
1. After hours of research, I used these snippets:
void Object::TokenizeLines(const string& str, vector<string>& tokens, const string& delimiters)
{
    // Skip delimiters at the beginning.
    string::size_type lastPos = str.find_first_not_of(delimiters, 0);
    // Find the first delimiter after that.
    string::size_type pos     = str.find_first_of(delimiters, lastPos);

    while (string::npos != pos || string::npos != lastPos) {
        // Found a token, add it to the vector.
        tokens.push_back( CheckWord( str, lastPos, pos - lastPos));
        // Skip delimiters.  Note the "not_of": a run of delimiters is skipped as one.
        lastPos = str.find_first_not_of(delimiters, pos);
        // Find the next delimiter.
        pos = str.find_first_of(delimiters, lastPos);
    }
}

to split a large block of text into lines,
with a function call like this: string.Tokenize(str, tokens, "\n");
Because each line contained fields like abc,de,,,f,g,,h
the function above gave wrong results on them. So I found this one:
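void TokenizeWithComma(const string& str, vector<string>& tokens){
    // Walk the raw character buffer from its first character to its terminating '\0'.
    const char* first = str.c_str();
    const char* last = str.c_str() + strlen(str.c_str());
    while (first != last) {
        // find() from <algorithm> returns last when no comma remains.
        const char* next = find(first, last, ',');
        // Two adjacent commas give next == first, so an empty token is stored.
        tokens.push_back(string(first, next - first));
        // Step past the comma, but never beyond the end.
        first = min(next + 1, last);
    }
}
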
I use it after TokenizeLines to split each line into words, passing it a string and a vector. Both worked.

After spending too many hours on this today, I have my homework done, but I am not sure what I did here...
Can someone (not a novice like me) give me a more detailed view?

Also:
2. Why doesn't the first function work with strings like a,bc,,d?
3. I also tried this for each word (the words are in a vector):
for (w_iter = token_lines.begin(); w_iter != token_lines.end(); w_iter++) {
    string ff = (*w_iter);
    string::size_type loc = ff.find( "abc", 0 );
    if( loc != string::npos ) { cout << "Found Omega at " << loc << endl;}
    else {cout << "Didn't find Omega" << endl;}
}

But while the code works with a normal string ff("abc,ccc,cc"), it does not seem to work with the declaration I had. Is it the iterator's fault?
This drove me nuts. To find patterns, I ended up dividing the string with substr, with extra cases to avoid passing out-of-range values to substr (which is again a string member function).

I know it might be a boring topic for most, but I appreciate your help...
Thank you
Feb 9 '08 #1
weaknessforcats
9,208 Expert Mod 8TB
But while the code works with a normal string ff("abc,ccc,cc"),
string ff("abc,ccc,cc") creates a string object.

Your function has a string& argument, so you can use that string object as an argument.

This ("abc,ccc,cc") is a C-string. A C-string is not a string object, so it cannot be passed as a non-const string&. (It can be passed as a const string&, because a temporary string object is created from it.)

Finally, I like your last solution. You should not use the string::c_str() method unless your function absolutely requires a C-string. Since the C++ string class does most of what the C string library does, there should be little call for this.
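
A minimal sketch of the point about argument types, assuming a function that takes const string& as the tokenizers in this thread do. ContainsAbc and CountAbc are hypothetical names made up for this illustration. The sketch shows that the same function accepts a string literal, a named string object, and an element read through a vector iterator, so the iterator by itself should not change what find() returns.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: reports whether "abc" occurs in the argument.
// It takes a const reference, like the tokenizer functions in this thread.
bool ContainsAbc(const std::string& s) {
    return s.find("abc") != std::string::npos;
}

// Hypothetical helper: counts the vector elements that contain "abc",
// reading each element through an iterator.
std::size_t CountAbc(const std::vector<std::string>& lines) {
    std::size_t count = 0;
    for (std::vector<std::string>::const_iterator it = lines.begin();
         it != lines.end(); ++it) {
        if (ContainsAbc(*it)) {  // *it is an ordinary std::string
            ++count;
        }
    }
    return count;
}
```

If a search like this fails on vector elements but succeeds on a hand-written string, the likelier cause is that the elements do not hold the text you expect, which points back at the code that filled the vector.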
Feb 9 '08 #2
m6s
55
Thank you for your answer, and the kind words :-)
I had even tried string temp = *iter, but that didn't work either. I assume that this way it should load a string object and tokenize it, right?
temp was then passed to the function, but that didn't work either.
Finally, how could I write TokenizeWithComma without c_str()?
Would I build the whole string as char arrays? Or is there another choice that sticks as closely as possible to C++?
Feb 11 '08 #3
weaknessforcats
9,208 Expert Mod 8TB
Why can't you use find() to locate the next comma??
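
One way the find() suggestion could look, as a sketch rather than a definitive version. SplitKeepEmpty is a hypothetical name, and the default delimiter argument is an assumption added for convenience. It uses only std::string members (find and substr), so no c_str() is needed, and it stores an empty token whenever two delimiters are adjacent.

```cpp
#include <string>
#include <vector>

// Hypothetical function: splits str on a single delimiter character
// and keeps empty fields, using only std::string member functions.
void SplitKeepEmpty(const std::string& str, std::vector<std::string>& tokens,
                    char delimiter = ',') {
    std::string::size_type start = 0;
    std::string::size_type pos = str.find(delimiter, start);
    while (pos != std::string::npos) {
        // Everything between the previous delimiter and this one is a token,
        // even when that range is empty.
        tokens.push_back(str.substr(start, pos - start));
        start = pos + 1;
        pos = str.find(delimiter, start);
    }
    // The remainder after the last delimiter (possibly empty) is the final token.
    tokens.push_back(str.substr(start));
}
```

For abc,de,,,f,g,,h this gives eight tokens, three of them empty. Note one difference from TokenizeWithComma: this sketch also keeps a trailing empty field, so "a," yields two tokens, and an empty input yields one empty token.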
Feb 11 '08 #4
m6s
55
:-) I don't understand it myself either!!!
I have a 64-bit machine, can that be the problem?
Just go through it yourself, if you like and find it interesting...
I know that it didn't work for me.

In other words, can you make an iterator over words (from a file, even better) that look like a,b,c,,,d,f,g and so on? (Did you notice the three commas, meaning two empty fields?)

Then use just C++ code, not C, by which I mean the first Tokenize function.
What's your result? Oh, and assign each iterator's value to a string and then tokenize it.
For me it was a disaster...
Feb 11 '08 #5
weaknessforcats
9,208 Expert Mod 8TB
In other words, can you make an iterator over words (from a file, even better) that look like a,b,c,,,d,f,g and so on? (Did you notice the three commas, meaning two empty fields?)
It's not the string that's your problem. It's your parsing logic.
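
To make the parsing-logic point concrete, here is a sketch of the first function's loop with one substitution: str.substr stands in for the CheckWord helper, whose definition is not shown in this thread. TokenizeCollapsing is a hypothetical name. Because find_first_not_of jumps over every delimiter in a run, a pair of adjacent commas never produces an empty token.

```cpp
#include <string>
#include <vector>

// Hypothetical copy of the first tokenizer's logic, with substr assumed
// in place of CheckWord. Runs of delimiters are treated as one separator.
void TokenizeCollapsing(const std::string& str,
                        std::vector<std::string>& tokens,
                        const std::string& delimiters) {
    // First character that is not a delimiter.
    std::string::size_type lastPos = str.find_first_not_of(delimiters, 0);
    // First delimiter after it.
    std::string::size_type pos = str.find_first_of(delimiters, lastPos);

    while (std::string::npos != pos || std::string::npos != lastPos) {
        tokens.push_back(str.substr(lastPos, pos - lastPos));
        // This call skips ALL consecutive delimiters, so ",," is never
        // seen as an empty field.
        lastPos = str.find_first_not_of(delimiters, pos);
        pos = str.find_first_of(delimiters, lastPos);
    }
}
```

With the input a,bc,,d and "," as the delimiter, the loop stores a, bc and d: three tokens, where a field-preserving split would store four.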

There is an article in the C/C++ HowTos on the State Design Pattern, and inside that article is how to construct a tokenizer to break a string into individual words, complete with code. You might read that article.
Feb 12 '08 #6
m6s
55
Ok, thank you for your close support on this, I am going to check this article too...
Feb 13 '08 #7
