By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
457,923 Members | 1,279 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 457,923 IT Pros & Developers. It's quick & easy.

get number at end of string

P: n/a
Here is a function I have to get a number at the end of a string. I'm
posting this in case it proves helpful to someone. Comments are
welcome.

int getnum(char *str)
{
char buffer[BUFSIZE];
char *buf = buffer;
int i;

for (i = 0; i < BUFSIZE-1; i++) {
if (str[i] == '\0') break;
if (!isdigit(str[i])) continue;
*(buf++) = str[i];
}
*buf = '\0';

if (buffer[0] != '\0') {
return atoi(buffer);
} else {
return -1;
}
}

Nov 14 '05 #1
Share this Question
Share on Google+
17 Replies


P: n/a
jake1138 wrote:
Here is a function I have to get a number at the end of a string. I'm
posting this in case it proves helpful to someone. Comments are
welcome.

int getnum(char *str)
{
char buffer[BUFSIZE];
char *buf = buffer;
int i;

for (i = 0; i < BUFSIZE-1; i++) {
if (str[i] == '\0') break;
if (!isdigit(str[i])) continue;
*(buf++) = str[i];
}
*buf = '\0';

if (buffer[0] != '\0') {
return atoi(buffer);
} else {
return -1;
}
}


Firstly, I don't see what it has to do with "getting a number at the end
of a string". This function is not targeted specifically at the "end of
a string" in any way.

Secondly, this function inspects at most 'BUFSIZE' first character of
the input sequence, which is more than strange. What it the input string
(it is supposed to be a string, right?) is longer than 'BUFSIZE'?

--
Best regards,
Andrey Tarasevich
Nov 14 '05 #2

P: n/a
jake1138 wrote:
Here is a function I have to get a number at the end of a string. I'm
posting this in case it proves helpful to someone. Comments are
welcome.

int getnum(char *str)
{
char buffer[BUFSIZE];
char *buf = buffer;
int i;

for (i = 0; i < BUFSIZE-1; i++) {
if (str[i] == '\0') break;
if (!isdigit(str[i])) continue;
*(buf++) = str[i];
}
*buf = '\0';

if (buffer[0] != '\0') {
return atoi(buffer);
} else {
return -1;
}
}


Comments (you asked for 'em):
- Undocumented. This really needs its semantics documented, in particular
the valid types of inputvalues and the behaviour on parsing failure.
- It doesn't modify 'str', use const.
- I'd put the 'buf++' from the loop's body into the header, just as the
check 'if (str[i] == '\0')'. This is my personal preference though.
- Prone to failure due to size of 'buffer'. You check that you don't
overflow it, but if the number doesn't fit, you return faulty results.
- Too complicated. You (try to) copy the number at the end of 'str' to a
temporary buffer and then feed that buffer to atoi(). Instead, search the
beginning of that number and feed that pointer to atoi() directly.
- Your copying is faulty, it concats all digits and then parses the result
as number.

You need to do testcases according to how you want the function to work
(i.e. according to the yet undocumented semantics):

assert( getnum(NULL)==-1);
assert( getnum("")==-1);
assert( getnum("aeu123htn456)==456);
assert( getnum("x:-1")==-1);
....

Uli

Nov 14 '05 #3

P: n/a
jake1138 <co******@gmail.com> wrote:
Here is a function I have to get a number at the end of a string. I'm
posting this in case it proves helpful to someone. Comments are
welcome.
A bit of nitpicking:
int getnum(char *str)
Sine you don't modify what 'str' is pointing to you might make that

int getnum( const char *str )
{
char buffer[BUFSIZE];
char *buf = buffer;
int i;
Some people think that it is a good habit to check the arguments of
a function as far as possible, and you could test is 'str' is a NULL
pointer and report an error in this case.
for (i = 0; i < BUFSIZE-1; i++) {
if (str[i] == '\0') break;
if (!isdigit(str[i])) continue;
What's wrong with negative numbers? Your function will return a positive
number for them... And, contrary to the description you gave, your
function will return the first number it finds in the string, which
is not necessarily the one at the end of the string, think of e.g.
"blabla12blabla42blabla".
*(buf++) = str[i];
You can leave off the parentheses around the 'buf++'. But why do you
think you need to copy the string at all? You can easily use 'str'
itself as a pointer to the current position in the string and later
pass that to atoi() or whatever you're going to use.
}
*buf = '\0'; if (buffer[0] != '\0') {
return atoi(buffer);
While you can safely assume that what's in 'buffer' starts with a digit,
you still don't know if the number there isn't too large to be stored
in an integer - and atoi() gives you no chance to figure that out. For
that reason it is usually more prudent to use strtol().
} else {
return -1;
If you modify the function to deal with negative numbers also you
will have to use a different way to report errors...
}
}


Here's a version of your function that takes care of a few of the
issues mentioned (but which still does not handle negative values
correctly!):

#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
#include <errno.h>
#include <limits.h>

int getnum( const char *s )
{
long val;

if ( s == NULL )
return -1;

while ( *s && ! isdigit( *s ) )
s++;
if ( ! *s )
return -1 ;

val = strtol( s, NULL, 10 );
if ( errno == ERANGE || val > INT_MAX )
return -1;
return val;
}
Regards, Jens
--
\ Jens Thoms Toerring ___ Je***********@physik.fu-berlin.de
\__________________________ http://www.toerring.de
Nov 14 '05 #4

P: n/a


jake1138 wrote:
Here is a function I have to get a number at the end of a string. I'm
posting this in case it proves helpful to someone. Comments are
welcome.

int getnum(char *str)
`const char *str' would be a little better, as it
indicates your intention not to modify the string.
{
char buffer[BUFSIZE];
char *buf = buffer;
I can't see any reason to make an extra copy of the
digits; you could just as well convert them directly
from the `str' string (but see below). Why bother with
the copy, especially since it places an artificial limit
on the length of the string you can handle?
int i;

for (i = 0; i < BUFSIZE-1; i++) {
if (str[i] == '\0') break;
if (!isdigit(str[i])) continue;
Make that `isdigit((unsigned char)str[i])' to defend
against negative-valued character codes in the string.
*(buf++) = str[i];
Harmless but unnecessary and funny-looking parentheses.
}
*buf = '\0';
This loop doesn't quite match your "number at the end
of a string" description. For example, consider what it
does with a string like "abc123xyz456". If you actually
intend to extract "123456" from such an input, then the
copy into buffer[] may be justified -- but if you intend
to convert just the "456" the copy is unnecessary and you
need to change the way this loop works.
if (buffer[0] != '\0') {
return atoi(buffer);
Despite its name, atoi() is not a good choice of a
conversion function. The biggest drawback is that it has
no way to report errors in the input (in fact, erroneous
input provokes undefined behavior). And yes: a string
consisting entirely of digits can be erroneous; consider
the string consisting of one million '9' digits ... Try
strtol() instead, so you can check for errors after the
conversion. Alternatively, do the conversion yourself,
one digit at a time -- this may actually be easier.
} else {
return -1;
I was also going to remark on the fact that you've
ignored the possibility of a minus sign before the digit
string, but this `return' makes me think you probably don't
intend to process a sign: "abc-123" is thought of as "abc-"
plus "123", not as "abc" plus "-123". That's fine, but you
ought to say so explicitly in your description of what the
function is supposed to do.
}
}


Putting all this together leads to something like
(untested):

#include <stdlib.h>
#include <ctype.h>
#include <limits.h>
#include <errno.h>

int getnum(const char *str) {
const char *ptr;
char *end;
int dseen;
long value;

/* Advance `str' to start of all-digits suffix.
* (Alternating strspn() and strcspn() calls might
* be slicker, but this loop has the virtue of
* simplicity.)
*/
dseen = 0;
for (ptr = str; *ptr != '\0'; ++ptr) {
if (isdigit((unsigned char)*ptr)) {
if (dseen == 0) {
/* Digit string starts here */
str = ptr;
dseen = 1;
}
}
else {
/* Non-digit: suffix not found yet */
dseen = 0;
}
}

/* Can't proceed if no such suffix found */
if (dseen == 0)
return -1;

/* Try to convert the suffix */
errno = 0;
value = strtol(str, &end, 10);

/* Fail if unable to convert or if result is
* too large for an `int'
*/
if (errno == ERANGE || end == str || value > INT_MAX)
return -1;

/* Huzzah! */
return value;
}

--
Er*********@sun.com

Nov 14 '05 #5

P: n/a
jake1138 wrote:
Here is a function I have to get a number at the end of a string.

Your function attempts to take all the numeric digits in the string
from left to right, not just the number the end of the string.
"number" as you say, implies both negative and positive variations.

I'm
posting this in case it proves helpful to someone. Comments are
welcome.

int getnum(char *str)

getnum's signature implies that it returns a number (positive or
negative) and takes a pointer to a modifiable stirng (although leaving
out 'const' is not a big deal).

{
char buffer[BUFSIZE];

You are hiding some peculiar "limits" to your function inside the
function. Where did/do you define BUFSIZE? What if the string is too
long, do you signal an error?

char *buf = buffer;
int i;

for (i = 0; i < BUFSIZE-1; i++) {
^^^^^
You are testing i, which is used to index into str not buffer, to
compare against BUFSIZE. This means, buf could still be pointing at
the beginning of your buffer (your buffer is empty) yet the limits of
the for loop can still be breached. You are limitting your function
greatly without need. Although you don't and shouldn't really use an
automatic char array in this situation, if for some reason you had to,
use some other index variable, like j, to point into buffer, and use
that to test against BUFSIZE (that way you know when exactly buffer is
filling up).

if (str[i] == '\0') break;
if (!isdigit(str[i])) continue;
*(buf++) = str[i];
}
*buf = '\0';

There is no support for negative numbers or signs in your fuction. It
grabs all the numeric digits from left to right and plops it into a
limitted size array, which is actually not even limitted by it's own
use (see above) but rather by str's use.


if (buffer[0] != '\0') {
return atoi(buffer);

Where is atoi declared?

atoi has it's limitations, there are better functions for conversion
out there (strtol). Why don't you scan your string from right to left
with a pointer, wait until a non numeric value is reached, check the
next left char, if it's a - or +, move the pointer another unit left,
otherwise don't. Then you can pass this pointer into strtol and let it
do the conversion. Once that done, you should check for all types of
failure (associated with strtol/errno.h).

} else {
return -1;

If and when you do add support for negative numbers, this return value
will add confusion. Better let the caller supply the address of some
variable, and use it to signal errors (or some other technique).

}
}


Take care

Nov 14 '05 #6

P: n/a
Je***********@physik.fu-berlin.de wrote:
...
Here's a version of your function that takes care of a few of the
issues mentioned (but which still does not handle negative values
correctly!):
But this is not the same. The OP's function was collecting (possibly
non-adjacent) digits from the input string. You function looks for (and
converts) the first continuous sequence of digits in the input string.

Of course, I can only guess what was the OP's real intent.
#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
#include <errno.h>
#include <limits.h>

int getnum( const char *s )
{
long val;

if ( s == NULL )
return -1;

while ( *s && ! isdigit( *s ) )
s++;
if ( ! *s )
return -1 ;

val = strtol( s, NULL, 10 );
if ( errno == ERANGE || val > INT_MAX )
return -1;
return val;
}
Regards, Jens

--
Best regards,
Andrey Tarasevich
Nov 14 '05 #7

P: n/a
Andrey Tarasevich <an**************@hotmail.com> wrote:
Je***********@physik.fu-berlin.de wrote:
...
Here's a version of your function that takes care of a few of the
issues mentioned (but which still does not handle negative values
correctly!):
But this is not the same. The OP's function was collecting (possibly
non-adjacent) digits from the input string. You function looks for (and
converts) the first continuous sequence of digits in the input string.
Nicely spotted, I did overlook that completely! Perhaps that explains
the copying of the string to a new buffer.
Of course, I can only guess what was the OP's real intent.


I have a hard time coming up with an example where it would make sense
to do that (except perhaps in cases where one wants to skip separators
between digits like in "total population: 1,812,587", but then I would
expect a test for the correct separator character)...

Regards, Jens
--
\ Jens Thoms Toerring ___ Je***********@physik.fu-berlin.de
\__________________________ http://www.toerring.de
Nov 14 '05 #8

P: n/a
Je***********@physik.fu-berlin.de wrote:
...


BTW, unless I'm missing something, the entire body of your version of
'getnum' function can be replaced with a single 'sscanf' call that will
do essentially the same thing :)

sscanf(s, "%*[^0123456789]%ld", &val);
--
Best regards,
Andrey Tarasevich
Nov 14 '05 #9

P: n/a
In article <11*************@news.supernews.com>
Andrey Tarasevich <an**************@hotmail.com> wrote:
BTW, unless I'm missing something, the entire body of your version of
'getnum' function can be replaced with a single 'sscanf' call that will
do essentially the same thing :)

sscanf(s, "%*[^0123456789]%ld", &val);


No, this will fail when the string consists entirely of digits
(because %[ directives must match at least one character). You
could use:

if (sscanf(s, "%*[^0123456789]%ld", &val) != 1 &&
sscanf(s, "%ld", &val) != 1) ... do something about no-digits-at-all

Note that when scanning something like "-123", the first sscanf()
call does succeed and return 1, so the second call never occurs.
(The negated, assignment-suppressed %[ directive matches the '-'
character and therefore succeeds.)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (4039.22'N, 11150.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 14 '05 #10

P: n/a
Andrey Tarasevich <an**************@hotmail.com> wrote:
Je***********@physik.fu-berlin.de wrote:
...
BTW, unless I'm missing something, the entire body of your version of
'getnum' function can be replaced with a single 'sscanf' call that will
do essentially the same thing :) sscanf(s, "%*[^0123456789]%ld", &val);


Beside the problem Chris Torek pointed out I can't see a way to check
for cases where the number in the string is too large to be stored in
an int. And for an all-purpose routine that seems to me to be a rather
important aspect.
Regards, Jens
--
\ Jens Thoms Toerring ___ Je***********@physik.fu-berlin.de
\__________________________ http://www.toerring.de
Nov 14 '05 #11

P: n/a
Je***********@physik.fu-berlin.de wrote:
Andrey Tarasevich <an**************@hotmail.com> wrote:
...

BTW, unless I'm missing something, the entire body of your version
of 'getnum' function can be replaced with a single 'sscanf' call
that will do essentially the same thing :)

sscanf(s, "%*[^0123456789]%ld", &val);


Beside the problem Chris Torek pointed out I can't see a way to
check for cases where the number in the string is too large to be
stored in an int. And for an all-purpose routine that seems to me
to be a rather important aspect.


Starting at the rear of the string, skip blanks. While char is a
digit, accumulate an integer, checking for overflow. If terminal
char is a sign act accordingly and back up one more. If terminal
char is '.' keep count of digits to form fractional part, and
continue digit scan into integer part. Also watch out for 'e' or
'E'. This should develop a suitable syntax for a backward scan
that yields a number of some form. It may be convenient to place
some form of sentinel at the start of the string.

I don't see any real use for it, however.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!

Nov 14 '05 #12

P: n/a
CBFalconer <cb********@yahoo.com> wrote:
Je***********@physik.fu-berlin.de wrote:
Andrey Tarasevich <an**************@hotmail.com> wrote:
...
BTW, unless I'm missing something, the entire body of your version
of 'getnum' function can be replaced with a single 'sscanf' call
that will do essentially the same thing :)

sscanf(s, "%*[^0123456789]%ld", &val);


Beside the problem Chris Torek pointed out I can't see a way to
check for cases where the number in the string is too large to be
stored in an int. And for an all-purpose routine that seems to me
to be a rather important aspect.

Starting at the rear of the string, skip blanks. While char is a
digit, accumulate an integer, checking for overflow. If terminal
char is a sign act accordingly and back up one more. If terminal
char is '.' keep count of digits to form fractional part, and
continue digit scan into integer part. Also watch out for 'e' or
'E'. This should develop a suitable syntax for a backward scan
that yields a number of some form. It may be convenient to place
some form of sentinel at the start of the string.


Sorry, I meant checking for overflow while using sscanf() with some
construct like the one Andrey proposed.

Regards, Jens
--
\ Jens Thoms Toerring ___ Je***********@physik.fu-berlin.de
\__________________________ http://www.toerring.de
Nov 14 '05 #13

P: n/a
Je***********@physik.fu-berlin.de wrote:
Andrey Tarasevich <an**************@hotmail.com> wrote:
Je***********@physik.fu-berlin.de wrote:
...

BTW, unless I'm missing something, the entire body of your version of
'getnum' function can be replaced with a single 'sscanf' call that will
do essentially the same thing :)

sscanf(s, "%*[^0123456789]%ld", &val);


Beside the problem Chris Torek pointed out I can't see a way to check
for cases where the number in the string is too large to be stored in
an int. And for an all-purpose routine that seems to me to be a rather
important aspect.


Yes, my suggestion indeed suffers miserably from these two problems.
Trying to salvage at least some of it :) I can suggest reducing it to a
mere replacement of the 'while (!is_digit())' cycle. I.e. instead of doing

while ( *s && ! isdigit( *s ) )
s++;

one can use

int d = 0;
sscanf(s, "%*[^0123456789]%n", &d);
s += d;

although I myself would probably stick with the 'while' version (at
least for such a compact test as 'is_digit').

--
Best regards,
Andrey Tarasevich
Nov 14 '05 #14

P: n/a
> Of course, I can only guess what was the OP's real intent.

Thanks for all the responses. I should have posted how my function was
being used. You all made some very good points.

The current purpose of this function is to get a positive integer at
the end of a string, where the expected string would be something like
"WAN1", "WAN2", "WAN3" ... "WANn" (where n <= 20).

I wasn't sure how to "search backwards" from the end of the string to
do this, so I came up with the function I posted. What would be nice,
for full validation, would be to search forward, specifically looking
for 'W', 'A', 'N', then taking the rest as a number, but I guess I
hadn't thought too much on how to do that.

I could use strstr() to see if "WAN" is in the string, but it doesn't
tell me it's at the beginning of the string and even if it does find
it, it only gives me a pointer to the beginning (but then I could count
up 3 from there). Hmmm, I guess if the return pointer of strstr() is
equal to the string pointer, then it was found at the beginning. I'll
have to think about this a little more.

Nov 14 '05 #15

P: n/a
jake1138 wrote:
I should have posted how my function was being used. ...
The current purpose of this function is to get a positive
integer at the end of a string, where the expected string
would be something like "WAN1", "WAN2", "WAN3" ... "WANn"
(where n <= 20).


char ignore;
int n, r = sscanf(blah, "WAN%2d%c", &n, &ignore);
if (r == 1 && 1 <= n && n <= 20)
... WAN n ...
else
...bad input...

--
Peter

Nov 14 '05 #16

P: n/a
jake1138 wrote:
Of course, I can only guess what was the OP's real intent.

Thanks for all the responses. I should have posted how my function was
being used. You all made some very good points.

The current purpose of this function is to get a positive integer at
the end of a string, where the expected string would be something like
"WAN1", "WAN2", "WAN3" ... "WANn" (where n <= 20).

I wasn't sure how to "search backwards" from the end of the string to
do this, so I came up with the function I posted. What would be nice,
for full validation, would be to search forward, specifically looking
for 'W', 'A', 'N', then taking the rest as a number, but I guess I
hadn't thought too much on how to do that.

I could use strstr() to see if "WAN" is in the string, but it doesn't
tell me it's at the beginning of the string and even if it does find
it, it only gives me a pointer to the beginning (but then I could count
up 3 from there). Hmmm, I guess if the return pointer of strstr() is
equal to the string pointer, then it was found at the beginning. I'll
have to think about this a little more.


If you'd revealed what you were trying to do in the
first place, this thread would have been a lot shorter.
One word: sscanf().

--
Eric Sosman
es*****@acm-dot-org.invalid

Nov 14 '05 #17

P: n/a
Peter Nilsson wrote:
jake1138 wrote:
I should have posted how my function was being used. ...
The current purpose of this function is to get a positive
integer at the end of a string, where the expected string
would be something like "WAN1", "WAN2", "WAN3" ... "WANn"
(where n <= 20).


char ignore;
int n, r = sscanf(blah, "WAN%2d%c", &n, &ignore);
if (r == 1 && 1 <= n && n <= 20)
... WAN n ...
else
...bad input...

--
Peter


Well, that's quite simple. I guess it helps to have a good
understanding of the C standard library. I hadn't used sscanf before.

Nov 14 '05 #18

This discussion thread is closed

Replies have been disabled for this discussion.