om****@wp.pl wrote, On 23-5-2007 15:27:
Hi
I wonder if regular expressions are in general slower than using
classes like String and Char when used for validating/parsing text data?
I've done some simple tests (using the IsMatch() method) and the result was
that Regex is either as fast as, or two times slower than, a method which
used methods from the String and Char classes.
(I tried the static and the instance method; I also tried the Compiled option.)
Maybe Regex is slower only when the matching is relatively simple?
There was a discussion about this a couple of threads back. The outcome was
quite logical: you need to balance maintainability, performance and
suitability when choosing between the following three methods.
Simple patterns (StartsWith, EndsWith, Length, contains fixed text...)
- Use string-based functions. For shorter strings direct manipulation is
faster; for longer strings that are built up or modified repeatedly,
StringBuilder is better.
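As a rough sketch of the string-based approach, the check below uses only String and Char members. The rule itself ("ID-" followed by six digits) and the names SimpleChecks and IsOrderId are made up for the example.

```csharp
using System;

static class SimpleChecks
{
    // Hypothetical rule: "ID-" followed by exactly six digits (9 characters in total).
    public static bool IsOrderId(string text)
    {
        if (text == null || text.Length != 9) return false;
        if (!text.StartsWith("ID-", StringComparison.Ordinal)) return false;

        for (int i = 3; i < text.Length; i++)
        {
            // Note: Char.IsDigit accepts any Unicode decimal digit, not only '0'..'9'.
            if (!Char.IsDigit(text[i])) return false;
        }
        return true;
    }

    static void Main()
    {
        Console.WriteLine(IsOrderId("ID-123456")); // True
        Console.WriteLine(IsOrderId("ID-12A456")); // False
    }
}
```

A check like this is easy to read and has no pattern parsing overhead, which is probably why it holds up well against Regex in simple tests.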
More complex patterns (contains variable strings, matches a complex
pattern, extensive search and replace, large optional parts etc.)
- Use a regular expression. If the expression is going to be reused
multiple times, initialize the expression as a Regex object and assign
that to a static field. Use RegexOptions.Compiled to convert the
expression to IL. For ad hoc or variable expressions (maybe built from
user input) don't use RegexOptions.Compiled, but use the static methods
of the Regex class instead.
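Something along these lines shows both cases. The date pattern, the class name PatternUsage and the method names are only illustrations, not a recommendation for real date validation.

```csharp
using System;
using System.Text.RegularExpressions;

static class PatternUsage
{
    // Reused pattern: constructed once, kept in a static field, compiled to IL.
    private static readonly Regex DatePattern =
        new Regex(@"^\d{4}-\d{2}-\d{2}$", RegexOptions.Compiled);

    public static bool LooksLikeDate(string text)
    {
        return DatePattern.IsMatch(text);
    }

    // Ad hoc pattern built from input: static Regex method, no Compiled option.
    // Regex.Escape keeps characters in 'word' from being read as pattern syntax.
    public static bool ContainsWord(string text, string word)
    {
        string pattern = @"\b" + Regex.Escape(word) + @"\b";
        return Regex.IsMatch(text, pattern);
    }

    static void Main()
    {
        Console.WriteLine(LooksLikeDate("2007-05-23"));               // True
        Console.WriteLine(ContainsWord("a regex is fine", "regex"));  // True
        Console.WriteLine(ContainsWord("many regexes", "regex"));     // False
    }
}
```

Compiling costs time up front, so it only pays off when the same Regex object is used many times; for a pattern that changes on every call that cost would be paid again and again.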
Make use of the option to include whitespace and comments in your regex.
Set the RegexOptions.IgnorePatternWhitespace option. Then format your
expressions as follows:
@"
(
A (?# Documentation for alternative A)
| B (?# Documentation for alternative B)
| C (?# Documentation for alternative C)
)+
";
If you need to match a space character you'll have to put it in a
character class like this: "[ ]".
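A small sketch of such a commented pattern in use, including the "[ ]" trick. The name format being matched and the identifiers VerbosePattern and IsFullName are invented for the example. Keep closing parentheses out of the (?# ...) comments, since the first ")" ends the comment.

```csharp
using System;
using System.Text.RegularExpressions;

static class VerbosePattern
{
    // Hypothetical format: two capitalized words separated by one space.
    private static readonly Regex FullName = new Regex(@"
        ^
        [A-Z][a-z]+   (?# first name: a capital followed by lowercase letters)
        [ ]           (?# a literal space; bare whitespace in the pattern is ignored)
        [A-Z][a-z]+   (?# last name, same shape as the first name)
        $
        ", RegexOptions.IgnorePatternWhitespace);

    public static bool IsFullName(string text)
    {
        return FullName.IsMatch(text);
    }

    static void Main()
    {
        Console.WriteLine(IsFullName("Jesse Houwing")); // True
        Console.WriteLine(IsFullName("JesseHouwing"));  // False
    }
}
```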
Quite complex problems (resulting in unreadable regexes, hard to explain
problems etc.)
- Split the problem into sub-problems. Apply the above rules to choose the
appropriate method for each sub-problem, the same way you would break up
a method that has become too complex.
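To give an idea of what splitting up can look like: instead of one big expression for a whole "key=value;key=value" line, the sketch below cuts the line apart with string functions and only uses a small regex for the keys. The input format and the names SettingsParser and Parse are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class SettingsParser
{
    // Sub-problem 3: a key is an identifier (small, readable regex).
    private static readonly Regex KeyPattern =
        new Regex(@"^[A-Za-z_][A-Za-z0-9_]*$", RegexOptions.Compiled);

    // Returns null when any pair is malformed.
    public static Dictionary<string, string> Parse(string input)
    {
        Dictionary<string, string> result = new Dictionary<string, string>();

        // Sub-problem 1: cut the line into pairs (plain string function).
        foreach (string pair in input.Split(';'))
        {
            // Sub-problem 2: cut each pair at the first '='.
            int eq = pair.IndexOf('=');
            if (eq < 0) return null;

            string key = pair.Substring(0, eq).Trim();
            string value = pair.Substring(eq + 1).Trim();

            if (!KeyPattern.IsMatch(key)) return null;
            result[key] = value;
        }
        return result;
    }

    static void Main()
    {
        Dictionary<string, string> settings = Parse("host=example.org; port=8080");
        Console.WriteLine(settings["port"]); // 8080
    }
}
```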
Very complex problems (anything which results in hard to maintain code
with lots of subroutines, regular expressions and string manipulation)
- Use a scanner/parser generator. There's one you can get with the Visual
Studio 2005 SDK which generates C# code based on your template.
Performance-wise, always test your assumptions by measuring. If performance
isn't that much of a problem, err on the side of the more maintainable version.
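A quick way to measure is a Stopwatch loop like the one below. It is only a rough sketch: the "all digits" check, the iteration count and the names QuickTiming, AllDigitsRegex and AllDigitsLoop are arbitrary choices, and the numbers will differ per machine and runtime, so run it on your own data.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

static class QuickTiming
{
    // \z anchors at the very end, so both checks accept exactly the same input.
    private static readonly Regex Digits =
        new Regex(@"^[0-9]+\z", RegexOptions.Compiled);

    public static bool AllDigitsRegex(string s)
    {
        return Digits.IsMatch(s);
    }

    public static bool AllDigitsLoop(string s)
    {
        if (s.Length == 0) return false;
        foreach (char c in s)
        {
            if (c < '0' || c > '9') return false;
        }
        return true;
    }

    static void Main()
    {
        const int iterations = 1000000;
        string sample = "1234567890";
        int hits = 0;

        // Warm up so JIT and regex compilation are not part of the timing.
        AllDigitsRegex(sample);
        AllDigitsLoop(sample);

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) { if (AllDigitsRegex(sample)) hits++; }
        sw.Stop();
        Console.WriteLine("Regex: {0} ms", sw.ElapsedMilliseconds);

        sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) { if (AllDigitsLoop(sample)) hits++; }
        sw.Stop();
        Console.WriteLine("Loop:  {0} ms", sw.ElapsedMilliseconds);

        // Use the counter so the calls cannot be discarded as dead code.
        Console.WriteLine("Matches counted: {0}", hits);
    }
}
```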
Jesse Houwing