so**********@gmail.com wrote:
so**********@gmail.com wrote:
Hi.
I'm trying to find some tutorial-type information on using scanf's
regular-expression subset for parsing - to no avail so far.
I've looked through quite a few posts to clc, but none that have helped
me understand what's really allowed, also, the c-faq - when searched
for scanf regular (using google site-search) didn't help.
Does anyone know of some source - preferably web-based with examples -
that could help me understand what's allowed (e.g., subset of normal
regular-expression syntax is valid)?
Ok, so I've tried the following:
char s1[10] = { '\0' }, s2[10] = { '\0' };
scanf("%9[^abc]%9s", s1, s2);
printf("s1 = %s s2 = %s\n", s1, s2);
My understanding of this is that the scanf line works like this -
%9[^abc] tells scanf to put characters NOT matching a or b or c into s1
- 9 specifies the max number of chars.
The following %9s uses s2 to 'catch' other characters.
If I input the following:
defabc
the output is s1 = def s2 = abc
However, if I input
abcdef
the output is s1 = s2 =
I thought/expected s1 to contain def and s2 to contain nothing.
Help appreciated!
The first thing you need to understand is that the scanf functions do
not support or use regular expressions. Although the bracket syntax in
the format string operates in a similar fashion to the bracketed
character class of a regular expression, they are not the same thing. The
format string "%9[^abc]%9s" means:
match one or more contiguous characters (up to 9) that are not a, b, or
c and store the matched characters as a string into the area provided,
then skip any whitespace (the %s format specifier skips leading
whitespace) and match up to 9 contiguous non-whitespace characters and
store them as a string into the second area provided.
Note that:
1. The "[" format specifier must match at least one character for it to be
successful.
2. The scanf functions process one character at a time. If at any time
the current character does not match the format specifier, the match
fails and the function returns.
Failure can occur due to a matching failure or an input failure (end of
input or a read error). The scanf functions return the number of input
items successfully assigned, or EOF if an input failure occurs before the
first conversion, so you can check this and act accordingly. In your
example, when you provide the input "abcdef", scanf reads the first
character, 'a', from the stream, sees that it does not match the first
format specifier, pushes the character back onto the stream and returns 0.
Remember that the "[" format specifier must match at least one character
to be successful and that once any conversion fails the function returns;
it doesn't try to match the input against the following format specifiers.
Robert Gamble