Mike,
Rather than literally using Chr(0), Chr(31), Chr(33), ..., I would recommend
the RegEx character escape sequences:
http://msdn.microsoft.com/library/de...terescapes.asp
Something like:
' With ASCII character escapes
Dim _dropContactGarbage As New Regex( _
"([\x00-\x1F]+)|" & _
"([\x21-\x26]+)|" & _
"([\x28-\x2C]+)|" & _
...
Of course you may have problems with Chr(128) & above, as Chr(128) is an
ANSI char code (what it means depends on the current code page), while Regex
expects ASCII and/or Unicode. As you know, ASCII is 7-bit (0 to 127) & a
Unicode escape in RegEx needs exactly 4 hex digits (\u0000).
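To see what Chr() actually hands to the Regex, it can help to print the
Unicode code point behind a few of the values. This is only a minimal sketch,
assuming .NET Framework on Windows; the module name ChrInspection is made up
for illustration:

```vbnet
Imports System

Module ChrInspection
    Sub Main()
        ' Chr(n) for n >= 128 is decoded through the current ANSI code page,
        ' whereas ChrW(n) is always the Unicode code point n.
        For Each n As Integer In New Integer() {128, 152, 155, 159}
            Console.WriteLine("Chr({0}) = U+{1:X4}   ChrW({0}) = U+{2:X4}", _
                              n, AscW(Chr(n)), AscW(ChrW(n)))
        Next
    End Sub
End Module
```

If your ANSI code page is Windows-1252, Chr(155) should come out as U+203A
and Chr(159) as U+0178, so the range Chr(155)-Chr(159) would run backwards,
and Regex rejects a reversed range with an ArgumentException. That is only a
guess at the cause, since it depends on the code page of the machine.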
' With Unicode character escapes
Dim _dropContactGarbage As New Regex( _
"([\u0000-\u001F]+)|" & _
"([\u0021-\u0026]+)|" & _
"([\u0028-\u002C]+)|" & _
...
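For reference, here is one way the whole list could look with \u escapes,
folded into a single character class. It is a sketch, not a drop-in
replacement: it assumes the decimal numbers in your original are meant as
Unicode code points (which is not what Chr() gives you for 128 & above), and
the names ContactCleanup and Clean are made up for the example:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module ContactCleanup
    ' The decimal ranges from the original post, converted to hex \u escapes.
    ' VB string literals do not treat "\" specially, so the escapes reach
    ' the Regex parser unchanged.
    Private ReadOnly _dropContactGarbage As New Regex( _
        "[\u0000-\u001F" & _
        "\u0021-\u0026" & _
        "\u0028-\u002C" & _
        "\u002F" & _
        "\u003A-\u0040" & _
        "\u005B-\u0060" & _
        "\u007B-\u007F" & _
        "\u0098" & _
        "\u009B-\u009F" & _
        "\u00A6-\u00E0" & _
        "\u00E2-\u00FF]+")

    ' Hypothetical helper: strips every run of listed characters.
    Function Clean(ByVal value As String) As String
        Return _dropContactGarbage.Replace(value, "")
    End Function

    Sub Main()
        Console.WriteLine(Clean("O'Brien, #42 <Bob>"))   ' prints: O'Brien 42 Bob
    End Sub
End Module
```

Because the pattern is plain printable text, you can Console.WriteLine() it
and actually read what the Regex constructor is being given.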
It might be "easier" if you used the predefined character classes (\s \S
\w \W ...) instead:
http://msdn.microsoft.com/library/de...terclasses.asp
Something like:
Dim _dropContactGarbage As New Regex("\W")
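Which says match any non-word character. Be aware that \W also matches
spaces, apostrophes, hyphens & periods, which your original ranges leave
alone.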
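If \W on its own is too greedy, a negated character class lets you list what
to keep instead. A rough sketch, assuming you want to keep word characters,
whitespace, apostrophes, periods & hyphens (that keep-list is my assumption,
as is the module name WordClassCleanup):

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module WordClassCleanup
    ' Negated class: remove anything that is NOT a word character,
    ' whitespace, apostrophe, period or hyphen.
    Private ReadOnly _dropContactGarbage As New Regex("[^\w\s'.\-]+")

    Sub Main()
        Dim value As String = "Jean-Luc O'Neil (Jr.)!"
        value = _dropContactGarbage.Replace(value, "")
        Console.WriteLine(value)   ' prints: Jean-Luc O'Neil Jr.
    End Sub
End Module
```

Note that \w in .NET covers Unicode letters, digits & the underscore, so
accented letters and "_" survive here, which is not the same set your
original ranges keep.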
Expresso & RegEx Workbench both have wizards of varying degrees to help you
build your expression, plus they allow you to test your expressions. The
analyzer/interpreter in each is also rather handy.
Expresso:
http://www.ultrapico.com/Expresso.htm
RegEx Workbench:
http://www.gotdotnet.com/Community/U...-4ee2729d7322A
A tutorial & reference on using regular expressions:
http://www.regular-expressions.info/
The MSDN's documentation on regular expressions:
http://msdn.microsoft.com/library/de...geElements.asp
Hope this helps
Jay
"Mike Labosh" <ml*****@hotmail.com> wrote in message
news:en**************@TK2MSFTNGP10.phx.gbl...
|I have the following System.Text.RegularExpressions.Regex that is supposed
| to remove this predefined list of garbage characters from contact names that
| come in on import files :
|
| Dim _dropContactGarbage As New Regex( _
| "([" & Chr(0) & "-" & Chr(31) & "]+)|" & _
| "([" & Chr(33) & "-" & Chr(38) & "]+)|" & _
| "([" & Chr(40) & "-" & Chr(44) & "]+)|" & _
| "([" & Chr(47) & "-" & Chr(47) & "]+)|" & _
| "([" & Chr(58) & "-" & Chr(64) & "]+)|" & _
| "([" & Chr(91) & "-" & Chr(96) & "]+)|" & _
| "([" & Chr(123) & "-" & Chr(127) & "]+)|" & _
| "([" & Chr(152) & "]+)|" & _
| "([" & Chr(155) & "-" & Chr(159) & "]+)|" & _
| "([" & Chr(166) & "-" & Chr(224) & "]+)|" & _
| "([" & Chr(226) & "-" & Chr(255) & "]+)")
|
| We use it like this:
|
| value = _dropContactGarbage.Replace(value, "")
|
| But the Regex constructor is throwing an ArgumentException whose Message
| property says only "Parse ([". There is no inner exception. Normally, if I
| have a string expression that's wrong, I would Console.WriteLine() it. But
| in this case, it doesn't WriteLine correctly, because some of the characters
| in the expression are control characters, so what it displays is not
| visually correct.
|
| I have slaved over this issue for hours and hours and I can only guess that
| one of the items must be escaped with a "\" or something, but I cannot
| figure it out. I have already been all over the MSDN help topics for the
| Regex Class.
|
| Help?
|
| --
| Peace & happy computing,
|
| Mike Labosh, MCSD
| "After very careful consideration, I have come
| to the conclusion that this new system SUCKS"
| -- General Barringer, from WARGAMES
|
|