This is what you are looking for:
^[^a-zA-Z]*|[^a-zA-Z]*$|[^a-zA-Z-. ]*([-. ])?[^a-zA-Z]*
myStringCleaned = Regex.Replace(myStringToClean, @"^[^a-zA-Z]*|[^a-zA-Z]*$|[^a-zA-Z-. ]*([-. ])?[^a-zA-Z]*", "$1");
Here are some explanations:
^[^a-zA-Z]* means that any run of non-letter characters at the start of the
input is replaced by $+, which is empty (there is no capturing group in this
part).
| means "or".
[^a-zA-Z]*$ means that any run of non-letter characters at the end of the
input is replaced by $+, which is empty too.
| means "or".
[^a-zA-Z-. ]* means: any sequence of characters that are not letters, dots,
hyphens or spaces.
([-. ])? means that you want to capture one of these characters
(dot, hyphen or space) the first time it appears; the whole matched sequence
is then replaced by $+, which is the character you have just captured.
[^a-zA-Z]* means: any sequence of non-letter characters.
$+ means that each matched sequence is replaced by the last captured group.
The pattern has only one group, so $+ is the same as the $1 used in the code
above. For the first and the second part, it is empty. For the last part, it
is empty if there is no dot, hyphen or space, or it contains the captured
character if there was one.
So,
___..&* los - .an#$geles. ^&...____ .
will be replaced
by los angeles
and
___..&* los- .an#$geles. ^&...____ .
will be replaced
by los-angeles
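To try the two examples above, here is a minimal console sketch. The class name CityCleaner and the method name Clean are only illustrative (they are not part of the original code); the pattern and the "$1" replacement are copied from the line at the top of this post. It should print the two cleaned names given above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper class, used only for this illustration.
public static class CityCleaner
{
    // Pattern copied unchanged from the post.
    public const string Pattern =
        @"^[^a-zA-Z]*|[^a-zA-Z]*$|[^a-zA-Z-. ]*([-. ])?[^a-zA-Z]*";

    // Replaces every match by the content of group 1 (empty when the
    // group did not take part in the match).
    public static string Clean(string raw)
    {
        return Regex.Replace(raw, Pattern, "$1");
    }

    public static void Main()
    {
        Console.WriteLine(Clean("___..&* los - .an#$geles. ^&...____ ."));
        Console.WriteLine(Clean("___..&* los- .an#$geles. ^&...____ ."));
    }
}
```

A name that is already clean, such as N.Westminster, should come back unchanged, because the only non-empty match is the single dot, which is replaced by itself.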
To understand, let's take the example ___..&* los- .an#$geles. ^&...____ .
___..&* (with the space that follows it) is matched by the first part of the
regex and replaced by nothing ($+=empty).
- . is matched by the third part and replaced by -
($+=hyphen)
#$ is matched by the third part and replaced by
nothing ($+=empty)
. ^&...____ . is matched by the second part of the regex and
replaced by nothing ($+=empty).
So you have 'empty' and 'los' and 'hyphen' and 'an' and 'geles' and 'empty',
that is 'los-angeles'.
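If you want to see this walk-through for yourself, the following sketch lists every non-empty match together with what group 1 captured. MatchTrace and Trace are hypothetical names chosen for the illustration; only Regex.Matches, Match.Value, Match.Length and Match.Groups come from the framework. Empty matches (the regex also matches the empty string in front of each letter) are skipped because replacing them changes nothing.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, used only for this illustration.
public static class MatchTrace
{
    // Pattern copied unchanged from the post.
    const string Pattern =
        @"^[^a-zA-Z]*|[^a-zA-Z]*$|[^a-zA-Z-. ]*([-. ])?[^a-zA-Z]*";

    // Returns one entry per non-empty match: "[matched text] -> [group 1]".
    public static List<string> Trace(string input)
    {
        var lines = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            if (m.Length == 0)
                continue; // empty match, nothing is removed or inserted
            lines.Add("[" + m.Value + "] -> [" + m.Groups[1].Value + "]");
        }
        return lines;
    }

    public static void Main()
    {
        foreach (string line in Trace("___..&* los- .an#$geles. ^&...____ ."))
            Console.WriteLine(line);
    }
}
```

Each printed line shows the removed text on the left and its replacement on the right, so only the "- ." sequence should show a non-empty replacement (the hyphen).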
Hope it helps,
Ludovic SOEUR
<al*****@hotmail.com> wrote in message
news:11**********************@o13g2000cwo.googlegroups.com...
I am trying to clean up a city name. Between the letters, only one of 3
characters (dot, space and hyphen) is allowed (1 max). For example:
Los-Angeles, Los Angeles and N.Westminster are ok.
Outside the letters, nothing is allowed.
So I need to do a replace and get this:
los angeles
from this:
___..&* los - .an#$geles. ^&...____ .
or this:
los-angeles
from this:
___..&* los- .an#$geles. ^&...____ .
Please help.