ikassaki@hotmail.com (IKAS) wrote in
news:cec61ad7.0307210326.6bbde12f@posting.google.com:
[color=blue]
> I am having some difficulty in using Regular expression in .NET
> with C#.
>
> How do I write a regular expression so that it will match both
> the following strings and also get me the relevant named groups?
>
> 1. select xcol, ycol from sampletable where zcol='val'
>
> 2. select xcol, ycol from sampletable
>
> If I try the following regular expression
>
> ^select (?<SelectClause>.*) from (?<FromClause>.*) (where
> (?<WhereClause>.*))*$
>
> and execute it with first string the "WhereClause" group is not
> matched.
> Even worse, the where clause part shows up in the "FromClause"
> group.
>
> Maybe there is a better way to achieve what I am trying to do ?[/color]
The <FromClause> group uses a greedy * quantifier. A greedy * will
match as much text as it possibly can. Since the <WhereClause> is
optional, the * in <FromClause> will "eat up" all of the text to the
end of the string.
To fix this, just put a ? after the * in <FromClause>:
^select (?<SelectClause>.*) from (?<FromClause>.*?)( where (?<WhereClause>.*))*$
That turns the * quantifier into a non-greedy quantifier, and it will
only match the minimum amount of text necessary to fulfill its part
of the pattern.
Also note the change I made by putting the space before the "where"
keyword, because the space is really part of the optional
<WhereClause>.
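To see the difference between the greedy and non-greedy forms, here is a minimal C# sketch. It compares the pattern with and without the ? after the <FromClause> quantifier, each written on one line. The class name GreedyVsLazyDemo and the helper Capture are made up for illustration. Regex.Match, Match.Groups and Group.Success are the standard System.Text.RegularExpressions members.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsLazyDemo
{
    // Identical patterns except for the ? after the <FromClause> quantifier.
    public const string Greedy =
        @"^select (?<SelectClause>.*) from (?<FromClause>.*)( where (?<WhereClause>.*))*$";
    public const string Lazy =
        @"^select (?<SelectClause>.*) from (?<FromClause>.*?)( where (?<WhereClause>.*))*$";

    // Hypothetical helper: returns the text captured by a named group,
    // or null if the pattern did not match or the group took no part in the match.
    public static string Capture(string pattern, string input, string groupName)
    {
        Match m = Regex.Match(input, pattern);
        if (!m.Success) return null;
        Group g = m.Groups[groupName];
        return g.Success ? g.Value : null;
    }

    public static void Main()
    {
        string[] inputs =
        {
            "select xcol, ycol from sampletable where zcol='val'",
            "select xcol, ycol from sampletable"
        };

        foreach (string input in inputs)
        {
            Console.WriteLine(input);
            Console.WriteLine("  greedy From:  " + Capture(Greedy, input, "FromClause"));
            Console.WriteLine("  lazy From:    " + Capture(Lazy, input, "FromClause"));
            Console.WriteLine("  lazy Where:   " + (Capture(Lazy, input, "WhereClause") ?? "(not matched)"));
        }
    }
}
```

With the first input, the greedy pattern leaves the where clause inside <FromClause>, while the non-greedy one yields "sampletable" and "zcol='val'". Because a where clause appears at most once, a ? on the optional group in place of the * should behave the same for these inputs. The sketch keeps the * so it matches the pattern as posted.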
Another change you might want to make is to use \s+ instead of
hard-coding the spaces to make the regex more flexible:
^select\s+(?<SelectClause>.*)\s+from\s+(?<FromClause>.*?)(\s+where\s+(?<WhereClause>.*))*$
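As a quick check of the \s+ version, the following C# sketch runs it against a query that uses tabs and repeated spaces. The class name WhitespaceTolerantDemo, the Parse helper and the sample query are illustrative assumptions, not part of the original question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceTolerantDemo
{
    // The \s+ pattern from the post, written on a single line.
    public const string Pattern =
        @"^select\s+(?<SelectClause>.*)\s+from\s+(?<FromClause>.*?)(\s+where\s+(?<WhereClause>.*))*$";

    // Hypothetical helper: returns the three clauses in order (select, from, where),
    // or null if the input does not match. The where entry is null when the clause is absent.
    public static string[] Parse(string input)
    {
        Match m = Regex.Match(input, Pattern);
        if (!m.Success) return null;
        Group where = m.Groups["WhereClause"];
        return new[]
        {
            m.Groups["SelectClause"].Value,
            m.Groups["FromClause"].Value,
            where.Success ? where.Value : null
        };
    }

    public static void Main()
    {
        // Made-up input mixing double spaces and a tab between keywords.
        string[] parts = Parse("select  xcol, ycol\tfrom   sampletable  where zcol='val'");
        Console.WriteLine("Select: " + parts[0]);
        Console.WriteLine("From:   " + parts[1]);
        Console.WriteLine("Where:  " + parts[2]);
    }
}
```

The hard-coded-space pattern would not match this input at all, since it requires exactly one space character around each keyword.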
Hope this helps.
Chris.
-------------
C.R. Timmons Consulting, Inc.
http://www.crtimmonsinc.com/