In article <11**********************@z14g2000cwz.googlegroups.com>,
<aj******@gmail.com> wrote:
: I need some simple help with my regular expressions.
:
: I want to search my input text for all the boolean variables which do
: not start with bln. i.e I want to match "bool followed by 1 or more
: spaces followed by text other than bln" in my input text.
: Eg:
: 1. private bool dispose; should be highlighted
: 2. private bool blndispose; should not be highlighted
:
: I have written the following regular expression text for this:
: bool\s+[^bln] but it highlights both 1 and 2. Where could I be going
: wrong??
Remember that [] creates a character class, so [^bln] matches any
single character that's not b, l, or n. It doesn't mean, as you seem
to intend, any sequence other than "bln".
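
To see what the class really accepts, here is a minimal sketch, assuming
.NET's System.Text.RegularExpressions. The class name CharClassDemo, the
helper IsSingleNonBln, and the sample strings are made up for illustration.
The anchors ^ and $ are there only to force the class to account for the
whole string.

```csharp
using System;
using System.Text.RegularExpressions;

static class CharClassDemo
{
    // [^bln] consumes exactly one character that is not 'b', 'l', or 'n'.
    // (Hypothetical helper, not part of the original question.)
    public static bool IsSingleNonBln(string s)
    {
        return Regex.IsMatch(s, @"^[^bln]$");
    }

    static void Main()
    {
        foreach (string s in new[] { "d", "b", "l", "n", " ", "bln" })
        {
            Console.WriteLine("'" + s + "': " + IsSingleNonBln(s));
        }
    }
}
```

Note that a lone space passes the test, which is what makes the
backtracking problem possible.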
As another poster noted, you're forgetting that the matcher will
backtrack to try to find a match.
As we know, quantifiers (such as +) are greedy, so the first stab
at your case 2 is -- assuming you're using a monospaced font:
private bool  blndispose;
              ^--- look for [^bln]
The spot where it's looking (indicated by the caret) is b, which
fails to match against [^bln].
Being the demagogue that it is, the matcher accuses the plus
quantifier of being too greedy and calls in jackbooted thugs to
forcibly take away one of its hard-won spaces:
private bool  blndispose;
             ^--- look for [^bln]
A space is in fact neither b nor l nor n, so the match commissar is
happy on two counts: he scratched his collectivist itch (he did allow
the plus to match *some* spaces but made sure everyone else had enough)
and gets to report a match. Too bad it was a match we didn't want.
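
The unwanted match can be reproduced with a small sketch, again assuming
.NET's Regex class. BacktrackDemo and MatchedText are hypothetical names.
The plus can only give a space back if it holds at least two, so the
sketch tries one input with two spaces after bool and one with a single
space.

```csharp
using System;
using System.Text.RegularExpressions;

static class BacktrackDemo
{
    // Returns the text matched by the original pattern, or an empty
    // string if there is no match (the pattern can never match empty).
    public static string MatchedText(string input)
    {
        Match m = Regex.Match(input, @"bool\s+[^bln]");
        return m.Success ? m.Value : "";
    }

    static void Main()
    {
        // Two spaces: \s+ gives one back and [^bln] consumes it.
        Console.WriteLine("[" + MatchedText("private bool  blndispose;") + "]");
        // One space: \s+ has nothing to give back, so there is no match.
        Console.WriteLine("[" + MatchedText("private bool blndispose;") + "]");
    }
}
```

In the two-space case the reported match is "bool" plus both spaces, with
the second space standing in for the character that was supposed to start
the variable name.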
Below is one way to do it:
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main(string[] args)
    {
        string[] inputs = new string[]
        {
            "private bool dispose;",
            "private bool blndispose;",
        };

        // "bool" not followed by whitespace and then "bln"
        Regex unhungarian = new Regex(@"bool(?!\s+bln)");

        foreach (string input in inputs)
        {
            string verdict =
                unhungarian.IsMatch(input)
                    ? "highlight" : "don't";
            Console.WriteLine("'" + input + "': " + verdict);
        }
    }
}
This pattern says find me a bool that's not followed by a run of
spaces followed by bln.
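
The negative lookahead is not fooled by backtracking: it rejects the
position if \s+bln can match there in any way at all, so extra spaces or
a tab make no difference. A rough check of that claim follows, with the
hypothetical names LookaheadDemo and ShouldHighlight and some made-up
whitespace variations.

```csharp
using System;
using System.Text.RegularExpressions;

static class LookaheadDemo
{
    // Pattern from the post: "bool" not followed by whitespace and then "bln".
    static readonly Regex Unhungarian = new Regex(@"bool(?!\s+bln)");

    public static bool ShouldHighlight(string line)
    {
        return Unhungarian.IsMatch(line);
    }

    static void Main()
    {
        string[] lines =
        {
            "private bool dispose;",       // one space, no bln prefix
            "private bool blndispose;",    // one space, bln prefix
            "private bool   blndispose;",  // several spaces, bln prefix
            "private bool\tblndispose;",   // tab, bln prefix
        };
        foreach (string line in lines)
        {
            Console.WriteLine(ShouldHighlight(line) ? "highlight" : "don't");
        }
    }
}
```

Only the first line should come out as "highlight".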
Hope this helps,
Greg