Tokenize strings

Hi,

what is the best approach to tokenize a "const char*"?

The strings look like "sometext 12345".
I need to read the number. The first sequence of characters
(here "sometext") is not important. Also the number of
white spaces is not known.

Any hints?

Regards,
Chris
Dec 1 '06 #1
On Dec 1, 4:27 pm, Christian Christmann <plfr...@yahoo.de> wrote:
> Hi,
>
> what is the best approach to tokenize a "const char*"?
>
> The strings look like "sometext 12345".
> I need to read the number. The first sequence of characters
> (here "sometext") is not important. Also the number of
> white spaces is not known.

Iterate through the string and check the value of each char, if the
value is between 48 and 57 (inclusive) then it's a number. Next use the
atoi-function (from <cstdlib>) and pass the pointer to the first number
and out you get an int.

--
Erik Wikström

Dec 1 '06 #2

Christian Christmann wrote:
> Hi,
>
> what is the best approach to tokenize a "const char*"?
>
> The strings look like "sometext 12345".
> I need to read the number. The first sequence of characters
> (here "sometext") is not important. Also the number of
> white spaces is not known.
>
> Any hints?
>
> Regards,
> Chris

#include <iostream>
#include <string>
#include <sstream>

int main( )
{
    const char* p = "sometext 12345";
    std::istringstream iss(p);

    // operator>> skips leading whitespace, so any number of
    // spaces between the word and the number is fine.
    std::string text;
    iss >> text;
    std::cout << text << std::endl;

    int number;
    iss >> number;
    std::cout << number << std::endl;
}

/*
sometext
12345 <- is no longer a string
*/

Note: if you plan to read bigger numbers, consider switching to
something other than an integer.
If you need to see the upper range for an integer:

#include <limits>
std::numeric_limits< int >::max();
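
As a hedged sketch of the note above, the same stream approach can read into a wider type and report failure instead of silently using a bad value. The helper name read_second_field and its bool/out-parameter shape are assumptions made for illustration; they are not from the original post. It relies on the C++11 rule that a failed or out-of-range extraction sets failbit on the stream.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper (not from the thread): skips the first
// whitespace-delimited word and reads the number that follows it.
// Returns false if there is no number or it does not fit in long long.
bool read_second_field(const char* s, long long& out)
{
    assert(s != nullptr);            // precondition: s is a valid C string
    std::istringstream iss(s);

    std::string text;                // the leading word, discarded
    long long number = 0;
    if (!(iss >> text >> number))    // failbit set: missing or out of range
        return false;

    out = number;                    // only written on success
    return true;
}
```

Called as read_second_field("sometext    12345", n), this should store 12345 in n and return true; for input such as "sometext abc" it returns false and leaves n untouched.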

Dec 1 '06 #3

er****@student.chalmers.se wrote:
> On Dec 1, 4:27 pm, Christian Christmann <plfr...@yahoo.de> wrote:
>> Hi,
>>
>> what is the best approach to tokenize a "const char*"?
>>
>> The strings look like "sometext 12345".
>> I need to read the number. The first sequence of characters
>> (here "sometext") is not important. Also the number of
>> white spaces is not known.
>
> Iterate through the string and check the value of each char, if the
> value is between 48 and 57 (inclusive) then it's a number. Next use the
> atoi-function (from <cstdlib>) and pass the pointer to the first number
> and out you get an int.

There is no guarantee that the characters '0' to '9' have integer
values 48 to 57. What is guaranteed, though, is that whatever integer
values those characters do have are contiguous. So if a char is >=
'0' and <= '9', then that char represents a decimal digit. The standard
function isdigit in <ctype.h> (or std::isdigit in <cctype>) will also
tell you the same thing.

Once you have a char c that you know contains a decimal digit character
then c - '0' is guaranteed to give you the integer in the range 0 to 9
that you need.
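
A minimal sketch of the two points above, assuming the goal is to find the first digit and build the value by hand. The helper names find_first_digit and digits_to_number are hypothetical. The cast to unsigned char is there because passing a negative char value to std::isdigit is undefined. Overflow is deliberately not detected in this sketch; unsigned arithmetic simply wraps.

```cpp
#include <cassert>
#include <cctype>

// Hypothetical helper: returns a pointer to the first decimal digit in s,
// or nullptr if s contains no digit at all.
const char* find_first_digit(const char* s)
{
    assert(s != nullptr);
    for (; *s != '\0'; ++s) {
        if (std::isdigit(static_cast<unsigned char>(*s)))
            return s;
    }
    return nullptr;
}

// Hypothetical helper: converts the run of digits starting at p.
// Each digit's value is c - '0', which is guaranteed to be 0..9.
// Overflow is NOT checked here; the unsigned result wraps around.
unsigned long digits_to_number(const char* p)
{
    assert(p != nullptr);
    unsigned long value = 0;
    for (; std::isdigit(static_cast<unsigned char>(*p)); ++p)
        value = value * 10 + static_cast<unsigned long>(*p - '0');
    return value;
}
```

Note that this picks up the first digit anywhere in the string, so a leading word that itself contains digits (for example "some2text 123") would give a different result than skipping the first word.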

Gavin Deane

Dec 1 '06 #4
On 1 Dec 2006 07:43:45 -0800, "er****@student.chalmers.se"
<er****@student.chalmers.se> wrote in comp.lang.c++:
> On Dec 1, 4:27 pm, Christian Christmann <plfr...@yahoo.de> wrote:
>> Hi,
>>
>> what is the best approach to tokenize a "const char*"?
>>
>> The strings look like "sometext 12345".
>> I need to read the number. The first sequence of characters
>> (here "sometext") is not important. Also the number of
>> white spaces is not known.
>
> Iterate through the string and check the value of each char, if the
> value is between 48 and 57 (inclusive) then it's a number. Next use the
> atoi-function (from <cstdlib>) and pass the pointer to the first number
> and out you get an int.

Don't ever recommend the use of atoi(), atol(), or atof() to anyone.
These are extremely unsafe functions, in that they produce undefined
behavior if the text string represents a value outside the range of
the type.

That's exactly why the C standard added strtol(), strtoul(), and
strtod() way back in 1989. They have defined behavior with any input.
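
One way the strtol() route might look for the strings in this thread, offered as a sketch rather than a definitive implementation. The helper name parse_number_after_word and its bool/out-parameter interface are assumptions for illustration. It skips the first word by hand, lets strtol() skip the whitespace, and then checks the two failure signals strtol() provides: an unchanged end pointer (no digits converted) and errno set to ERANGE (value out of range for long).

```cpp
#include <cassert>
#include <cctype>
#include <cerrno>
#include <cstdlib>

// Hypothetical helper: parses the number that follows the first word of s.
// Returns false if no number is found or it is out of range for long.
bool parse_number_after_word(const char* s, long& out)
{
    assert(s != nullptr);

    // Skip the first word; strtol itself skips the whitespace after it.
    while (*s != '\0' && !std::isspace(static_cast<unsigned char>(*s)))
        ++s;

    char* end = nullptr;
    errno = 0;                         // must be cleared before the call
    const long value = std::strtol(s, &end, 10);

    if (end == s)                      // no characters were converted
        return false;
    if (errno == ERANGE)               // result was clamped to LONG_MIN/LONG_MAX
        return false;

    out = value;                       // only written on success
    return true;
}
```

The sketch assumes the string starts with the word, as in "sometext 12345"; leading whitespace before the word is not handled.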

And of course Gavin already mentioned the folly of using decimal
literals to specify digit characters.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://c-faq.com/
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Dec 2 '06 #5
