Bytes | Software Development & Data Engineering Community

String parsing program

Hi, I have a string input and I have to parse it in such a way that
there can be only white space till a digit is reached and, once a digit
is reached, there can be only digits or white space till the string
ends. Am I doing this correctly?

Code:

#include <stdio.h>
#include <string.h>

int main(void)
{
    char s[50];
    int i = 0;

    gets(s);

    while (isspace(s[i]))
        i++;
    while (isdigit(s[i]))
        i++;
    while (isspace(s[i]))
        i++;
    if (s[i] != '\0')
        printf("\nIncorrect string\n");

    return (0);
}

I actually want to convert a string to unsigned long. So this kind of
algorithm should be carried out prior to the strtoul function, to remove
some of the weaknesses from which strtoul suffers, such as converting
123aaaaa to 123, or -123 to some unsigned value. This will also ensure
that when you have a string like:

1234 78

1234 is not returned but an error message is printed, because a
string should contain only one number in my program.
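
A minimal sketch of the check described above, written as a function so it can be called before strtoul. The name is_single_number is hypothetical. Two details are assumptions about what is wanted rather than part of the posted code: at least one digit is required, and each character is converted to unsigned char before being passed to the <ctype.h> functions.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical helper: returns true if s consists of optional white
   space, at least one digit, then optional white space and nothing
   else. */
bool is_single_number(const char *s)
{
    /* The is* functions need a value representable as unsigned char. */
    const unsigned char *p = (const unsigned char *)s;

    while (isspace(*p))
        p++;
    if (!isdigit(*p))        /* assumption: an empty number is an error */
        return false;
    while (isdigit(*p))
        p++;
    while (isspace(*p))
        p++;
    return *p == '\0';
}
```

With this, " 1234 " is accepted, while "1234 78", "123aaaaa" and "-123" are all rejected. It says nothing about range, so strtoul can still report ERANGE afterwards.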
Jul 3 '08
On Jul 4, 2:14 am, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
> <snip>
> The simplest way to scan for a number whilst reporting bad input is to
> use the signed strtol function. You check that errno has not been set
> to ERANGE and that the end-pointer is not the string you passed in.
> If you like, you can now check that nothing but white space is left in
> the string. Finally, you confirm the input is in the range your program
> expects. The signed version lets you detect input like -123. The
> downside is that you lose half the range of possible inputs. If that
> matters, you can (probably) go up to strtoll.
> <snip>

My input is of the following format:

45 5666 16000

^^ All of that is just a single string. I need to read the three
numbers into 3 different size_t variables. There can be white space
amongst them but no letters or any other characters. It is possible
to check whether the *endp character is nothing but white space (this
must be done when errno != ERANGE and s != endp, i.e. when one usually
expects correct output); any other character means the data is
erroneous. With unsigned long, you can check whether the first
non-white-space character is a '-'. This should solve the problem of
negative numbers as well and prevent their conversion to some unsigned
value beforehand.
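
The approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name parse_sizes and its interface are hypothetical, and instead of testing only for '-' it requires each number to start with a digit, which also rejects a leading '+'.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: read exactly `count` non-negative integers,
   separated by white space, from s into out[]. Returns false on any
   other character, on overflow, or if the count does not match. */
bool parse_sizes(const char *s, size_t *out, size_t count)
{
    for (size_t n = 0; n < count; n++) {
        char *endp;
        unsigned long v;

        while (isspace((unsigned char)*s))
            s++;
        /* Assumption: a number must start with a digit. This rejects
           '-', '+', letters and a premature end of string. */
        if (!isdigit((unsigned char)*s))
            return false;

        errno = 0;
        v = strtoul(s, &endp, 10);
        if (errno == ERANGE || v > SIZE_MAX)
            return false;
        /* Only white space or the end of the string may follow. */
        if (*endp != '\0' && !isspace((unsigned char)*endp))
            return false;

        out[n] = (size_t)v;
        s = endp;
    }

    /* Nothing but white space may remain after the last number. */
    while (isspace((unsigned char)*s))
        s++;
    return *s == '\0';
}
```

Called as parse_sizes("45 5666 16000", v, 3) with size_t v[3], it fills v with the three values; inputs such as "45 -5666 16000" or "45 56a6 16000" make it return false.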
Jul 3 '08 #11
santosh wrote:
> Your code exhibits undefined behaviour because you have failed to
> include ctype.h where the declarations for the is* functions are.

Not really. The default declarations will do for those. It's not good
practice, of course.


Brian


Jul 3 '08 #12
pereges <Br*****@gmail.com> writes:
> On Jul 4, 2:14 am, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>> <snip>
>> The simplest way to scan for a number whilst reporting bad input is to
>> use the signed strtol function. You check that errno has not been set
>> to ERANGE and that the end-pointer is not the string you passed in.
>> If you like, you can now check that nothing but white space is left in
>> the string. Finally, you confirm the input is in the range your program
>> expects. The signed version lets you detect input like -123. The
>> downside is that you lose half the range of possible inputs. If that
>> matters, you can (probably) go up to strtoll.
>> <snip>
>
> My input is of the following format:
>
> 45 5666 16000
>
> ^^ All of that is just a single string. I need to read the three
> numbers into 3 different size_t variables. There can be white space
> amongst them but no letters or any other characters. It is possible
> to check whether the *endp character is nothing but white space (this
> must be done when errno != ERANGE and s != endp, i.e. when one usually
> expects correct output); any other character means the data is
> erroneous. With unsigned long, you can check whether the first
> non-white-space character is a '-'. This should solve the problem of
> negative numbers as well and prevent their conversion to some unsigned
> value beforehand.

Here is one way based on using the widest signed type for input. It
is not ideal, but then without very detailed specs, what could be? If
you need to accept input right up to SIZE_MAX and you have an
implementation where intmax_t can't hold that value, then you will
need to use strtoumax and check for the - manually, so to speak. That
would not be a big change to parse_size.

#include <stdio.h>
#include <stdbool.h>
#include <inttypes.h>
#include <stdint.h>
#include <errno.h>
#include <ctype.h>

size_t parse_size(const char *num, const char **endp, bool *error)
{
    char *ep;
    errno = 0;
    intmax_t imax = strtoimax(num, &ep, 10);
    if (errno == ERANGE || imax < 0 || (uintmax_t)imax > SIZE_MAX) {
        while (isspace((unsigned char)*num))
            num++;
        /* The precision argument of %.*s must be an int. */
        fprintf(stderr, "Input \"%.*s\" out of range.\n",
                (int)(ep - num), num);
        if (error)
            *error = true;
    }
    else if (ep == num) {
        /* Skip to the next space-delimited portion of the string. */
        while (isspace((unsigned char)*ep))
            ep++;
        num = ep;
        while (*ep != '\0' && !isspace((unsigned char)*ep))
            ep++;
        fprintf(stderr, "Input \"%.*s\" could not be converted.\n",
                (int)(ep - num), num);
        if (error)
            *error = true;
    }
    if (endp)
        *endp = ep;
    return (size_t)imax;
}

void parse_three_sizes(const char *input)
{
    bool errors = false;
    const char *ep;
    size_t s1 = parse_size(input, &ep, &errors);
    size_t s2 = parse_size(ep, &ep, &errors);
    size_t s3 = parse_size(ep, &ep, &errors);

    if (!errors) {
        /* Check that everything parsed. */
        const char *save_ep = ep;
        while (isspace((unsigned char)*ep))
            ep++;
        if (*ep != '\0')
            fprintf(stderr, "Superfluous input found: \"%s\"\n", save_ep);
        printf("Got: %zu %zu %zu\n", s1, s2, s3);
    }
}

int main(int argc, char **argv)
{
    if (argc > 1)
        parse_three_sizes(argv[1]);
    return 0;
}
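
The strtoumax variant mentioned before the code might look something like the sketch below. It is not the posted parse_size: the name parse_size_umax is hypothetical, it reports failure through its return value instead of printing, and it stores the result through an extra out parameter. Those interface changes are assumptions made to keep the example short.

```c
#include <ctype.h>
#include <errno.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical variant of parse_size built on strtoumax. On success it
   stores the value in *out and returns true. *endp (if not NULL) is
   set to where parsing stopped. */
bool parse_size_umax(const char *num, const char **endp, size_t *out)
{
    char *ep;
    uintmax_t umax;

    while (isspace((unsigned char)*num))
        num++;

    /* strtoumax accepts a leading '-' and negates the result in the
       unsigned type, so the sign has to be rejected by hand. */
    if (*num == '-') {
        if (endp)
            *endp = num;
        return false;
    }

    errno = 0;
    umax = strtoumax(num, &ep, 10);
    if (endp)
        *endp = ep;

    if (ep == num)                      /* no digits were found */
        return false;
    if (errno == ERANGE || umax > SIZE_MAX)
        return false;

    *out = (size_t)umax;
    return true;
}
```

Because the sign is tested before conversion, the whole range up to SIZE_MAX stays available, at the cost of one extra check per number.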

--
Ben.
Jul 3 '08 #13
"Default User" <de***********@yahoo.com> writes:
> santosh wrote:
>> Your code exhibits undefined behaviour because you have failed to
>> include ctype.h where the declarations for the is* functions are.
>
> Not really. The default declarations will do for those. It's not good
> practice, of course.

That works only in C90. Even though there are few full C99 compilers,
it doesn't hurt to write good C90 code that's also valid C99 code.

And the is* and to* functions are very likely to be implemented as
macros, with better performance than the function calls. By not
including the header, you miss out on the macro definitions.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Jul 4 '08 #14
Keith Thompson wrote:
> "Default User" <de***********@yahoo.com> writes:
>> santosh wrote:
>>> Your code exhibits undefined behaviour because you have failed to
>>> include ctype.h where the declarations for the is* functions are.
>> Not really. The default declarations will do for those. It's not
>> good practice, of course.
>
> That works only in C90. Even though there are few full C99 compilers,
> it doesn't hurt to write good C90 code that's also valid C99 code.

So? In that case there's a required diagnostic for the missing
declaration.

> And the is* and to* functions are very likely to be implemented as
> macros, with better performance than the function calls. By not
> including the header, you miss out on the macro definitions.

But not undefined behavior.


Brian
Jul 4 '08 #15
pereges wrote:
> Hi, I have a string input and I have to parse it in such a way that
> there can be only white space till a digit is reached and, once a digit
> is reached, there can be only digits or white space till the string
> ends. Am I doing this correctly?
>
> Code:
>
> #include <stdio.h>
> #include <string.h>
>
> int main(void)
> {
> char s[50];
> int i = 0;
>
> gets(s);

Real bad example.

> while (isspace(s[i]))

while (isspace((unsigned char) s[i]))

> i++;
> while (isdigit(s[i]))
> i++;

This does not _require_ a digit to be present.

> while (isspace(s[i]))
> i++;
> if (s[i] != '\0')
> printf("\nIncorrect string\n");
> return (0);
> }
>
> I want to actually convert a string to unsigned long. So this
> kind of algorithm should be carried out prior to strtoul
> function to ensure that some of the weakness from which
> the strtoul function suffers like converting 123aaaaa to 123

That is not a weakness but a strength.

x = strtoul(str, &endp, 0);

if (endp != str)
    while (isspace((unsigned char) *endp))
        endp++;

if (endp != str && *endp == 0)
    /* all good */;

> for eg or -123 to some unsigned value is removed.

if (endp != str && *endp == 0 && strchr(str,'-') == 0)
    /* all good */;

--
Peter
Jul 4 '08 #16
Default User wrote:
> Keith Thompson wrote:
>> "Default User" <de***********@yahoo.com> writes:
>>> santosh wrote:
>>>> Your code exhibits undefined behaviour because you have failed to
>>>> include ctype.h where the declarations for the is* functions are.
>>> Not really. The default declarations will do for those. It's not
>>> good practice, of course.
>> That works only in C90. Even though there are few full C99 compilers,
>> it doesn't hurt to write good C90 code that's also valid C99 code.
>
> So? In that case there's a required diagnostic for the missing
> declaration.
>
>> And the is* and to* functions are very likely to be implemented as
>> macros, with better performance than the function calls. By not
>> including the header, you miss out on the macro definitions.
>
> But not undefined behavior.

The common subset of C90 and C99
is like C90 with less allowable sloppy style.

It's a better way to write C90 code,
even if you don't intend it to be compiled as C99 code.

--
pete
Jul 4 '08 #17
"Default User" <de***********@yahoo.com> writes:
> Keith Thompson wrote:
>> "Default User" <de***********@yahoo.com> writes:
>>> santosh wrote:
>>>> Your code exhibits undefined behaviour because you have failed to
>>>> include ctype.h where the declarations for the is* functions are.
>>>
>>> Not really. The default declarations will do for those. It's not
>>> good practice, of course.
>>
>> That works only in C90. Even though there are few full C99 compilers,
>> it doesn't hurt to write good C90 code that's also valid C99 code.
>
> So? In that case there's a required diagnostic for the missing
> declaration.
>
>> And the is* and to* functions are very likely to be implemented as
>> macros, with better performance than the function calls. By not
>> including the header, you miss out on the macro definitions.
>
> But not undefined behavior.

You're right, there's no undefined behavior in either C90 or C99.

(Well, there's a constraint violation in C99; if the implementation
accepts the program in spite of that, after issuing the required
diagnostic, then the behavior is undefined. But that's stretching the
point.)

Failing to include <ctype.h> when using the is* or to* functions is
still a bad idea, of course.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Jul 4 '08 #18
pereges said:
> Hi, I have a string input and I have to parse it in such a way that
> there can be only white space till a digit is reached and, once a digit
> is reached, there can be only digits or white space till the string
> ends. Am I doing this correctly?

No.

> gets(s);

Until you have learned why you must never call this function, there is
little point trying to learn anything else about C. This is your next
step. Trying to jump over it is most unwise.
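
For reference, gets cannot be told how large the buffer is, so any line longer than the array overruns it. A common replacement is fgets, which takes the buffer size. The sketch below is one possible wrapper, not the only way to do it: the name read_line and its policy of rejecting and discarding over-long lines are assumptions, as is the premise that size fits in an int.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from fp into buf (size bytes) and
   strip the trailing newline. Returns false on end of file, on a read
   error, or if the line did not fit; in the last case the rest of the
   line is discarded. Assumes size fits in an int. */
bool read_line(FILE *fp, char *buf, size_t size)
{
    size_t len;
    int c;

    if (size == 0 || fgets(buf, (int)size, fp) == NULL)
        return false;

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';            /* complete line */
        return true;
    }
    if (len + 1 < size)
        return true;                    /* last line, no newline */

    /* The buffer is full: see whether the line ended exactly here. */
    c = getc(fp);
    if (c == '\n' || c == EOF)
        return true;
    while (c != '\n' && c != EOF)       /* discard the overflow */
        c = getc(fp);
    return false;
}
```

In the original program, the call gets(s) would become something like read_line(stdin, s, sizeof s), with the return value checked before s is examined.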

<snip>

--
Richard Heathfield <http://www.cpax.org.uk >
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Jul 4 '08 #19
pete wrote:
> Default User wrote:
>> But not undefined behavior.
> It's a better way to write C90 code,
> even if you don't intend it to be compiled as C99 code,

What does this have to do with what I said?

Brian
Jul 4 '08 #20
