ik******@hotmail.com (IKAS) wrote in
news:ce**************************@posting.google.com:
I am having some difficulty using regular expressions in .NET
with C#.
How do I write a regular expression so that it will match both
the following strings and also get me the relevant named groups?
1. select xcol, ycol from sampletable where zcol='val'
2. select xcol, ycol from sampletable
If I try the following regular expression
^select (?<SelectClause>.*) from (?<FromClause>.*) (where (?<WhereClause>.*))*$
and run it against the first string, the "WhereClause" group is not
matched.
Even worse, the where clause part shows up in the "FromClause"
group.
Maybe there is a better way to achieve what I am trying to do?

The <FromClause> uses a greedy * quantifier. In this mode, * will
match as much text as it possibly can. Since the <WhereClause> is
optional, the <FromClause> * will "eat up" all of the text to the end
of the string.
To fix this, just put a ? after the * in <FromClause>:
^select (?<SelectClause>.*) from (?<FromClause>.*?)( where (?<WhereClause>.*))*$
That turns the * quantifier into a non-greedy quantifier, and it will
only match the minimum amount of text necessary to fulfill its part
of the pattern.
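
To see the difference, here is a minimal sketch, assuming a plain
console program. The class and method names (GreedyVsLazyDemo,
Describe) are made up for illustration, and the "greedy" pattern is
simply the fixed pattern with the ? taken back out, not something
from your post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsLazyDemo
{
    // Identical patterns except for the quantifier inside FromClause.
    public const string Greedy =
        @"^select (?<SelectClause>.*) from (?<FromClause>.*)( where (?<WhereClause>.*))*$";
    public const string Lazy =
        @"^select (?<SelectClause>.*) from (?<FromClause>.*?)( where (?<WhereClause>.*))*$";

    // Hypothetical helper: reports what FromClause and WhereClause captured.
    public static string Describe(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        if (!m.Success)
        {
            return "no match";
        }
        // An unmatched group has an empty Value.
        return "From=[" + m.Groups["FromClause"].Value + "] Where=["
             + m.Groups["WhereClause"].Value + "]";
    }

    public static void Main()
    {
        string[] inputs =
        {
            "select xcol, ycol from sampletable where zcol='val'",
            "select xcol, ycol from sampletable"
        };

        foreach (string input in inputs)
        {
            Console.WriteLine(input);
            Console.WriteLine("  greedy: " + Describe(Greedy, input));
            Console.WriteLine("  lazy:   " + Describe(Lazy, input));
        }
    }
}
```

For the first string, the greedy version should report the whole
"sampletable where zcol='val'" as the FromClause with an empty
WhereClause, while the lazy version splits it into "sampletable" and
"zcol='val'". For the second string both versions give "sampletable".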
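Also note that I moved the space in front of the "where" keyword into
the optional group, because that space is really part of the optional
<WhereClause>. With the space left outside the group, the pattern
demands a space after the <FromClause> even when there is no where
clause, so your second string could never match.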
Another change you might want to make is to use \s+ instead of
hard-coding the spaces to make the regex more flexible:
^select\s+(?<SelectClause>.*)\s+from\s+(?<FromClause>.*?)(\s+where\s+(?<WhereClause>.*))*$
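
A rough sketch of how the \s+ version might be used from C#. The
SelectParser class and its Parse method are names I made up for the
example. Group.Success is used to tell a missing where clause apart
from an empty one.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SelectParser
{
    // The \s+ pattern from above, on one line so no whitespace sneaks in.
    private static readonly Regex SelectRegex = new Regex(
        @"^select\s+(?<SelectClause>.*)\s+from\s+(?<FromClause>.*?)(\s+where\s+(?<WhereClause>.*))*$");

    // Hypothetical helper: returns a one-line summary of the named groups.
    public static string Parse(string sql)
    {
        Match m = SelectRegex.Match(sql);
        if (!m.Success)
        {
            return "no match";
        }

        Group where = m.Groups["WhereClause"];
        string whereText = where.Success ? where.Value : "(none)";

        return "select=[" + m.Groups["SelectClause"].Value + "] from=["
             + m.Groups["FromClause"].Value + "] where=[" + whereText + "]";
    }

    public static void Main()
    {
        // Extra spaces and a tab between the keywords are tolerated.
        Console.WriteLine(Parse("select  xcol, ycol\tfrom   sampletable   where zcol='val'"));
        Console.WriteLine(Parse("select xcol, ycol from sampletable"));
    }
}
```

Note that the pattern is still case-sensitive as written, so "SELECT"
would not match unless you also pass RegexOptions.IgnoreCase to the
Regex constructor.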
Hope this helps.
Chris.
-------------
C.R. Timmons Consulting, Inc.
http://www.crtimmonsinc.com/