regex -- substitute chars outside quoted strings

What I want to do sounds simple, but it's defeating me. I want to
substitute all occurrences of a colon : character in a string with an @
character -- unless the : occurs within a single- or double-quoted
substring. Surely this can be done with regular expressions? Do any regex
gurus know how to do it?
Jul 21 '05 #1
5 Replies


ga**@mccull.org (Gary McCullough) writes:
What I want to do sounds simple, but it's defeating me. I want to
substitute all occurrences of a colon : character in a string with an @
character -- unless the : occurs within a single- or double-quoted
substring. Surely this can be done with regular expressions? Do any regex
gurus know how to do it?


Preprocess the string: split it into the parts with quotes and without.

A lexer would work well in this case.
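
As an illustration only, here is a minimal Python sketch of the split idea (Python rather than the .NET used elsewhere in this thread). The function name `replace_outside_quotes` is hypothetical, and the sketch assumes balanced quotes with no escaped quotes inside a literal.

```python
import re

# A capturing group makes re.split keep the quoted pieces in its result.
# Assumption: quotes are balanced and literals contain no escaped quotes.
QUOTED = re.compile(r'''("[^"]*"|'[^']*')''')

def replace_outside_quotes(text, old=":", new="@"):
    parts = QUOTED.split(text)
    # Even indexes hold the unquoted text; odd indexes hold the quoted pieces.
    for i in range(0, len(parts), 2):
        parts[i] = parts[i].replace(old, new)
    return "".join(parts)

print(replace_outside_quotes('Meeting tomorrow : 11AM "example: 2"'))
```

Only the unquoted pieces are touched, so nothing inside a literal can be changed by accident.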
Jul 21 '05 #2

To a human being this seems like a very simple problem, but getting a
computer to understand what you want is another story. Let's look at an
example of why the problem is more complex than it looks, using the following string:

Meeting today : 10AM
This is an "example: 1"
Meeting tomorrow : 11AM
This is another "example: 2"

If I understand the requirements, the desired output should be:

Meeting today @ 10AM
This is an "example: 1"
Meeting tomorrow @ 11AM
This is another "example: 2"

Unfortunately, a regular expression that simply skips any colon with a quote
somewhere before it and somewhere after it would leave the colon before 11AM
unchanged, because a quote both precedes and follows it.

I've had a similar problem before as well, and the best solution I could
think of was to extract all of the quoted strings and replace them with an
escape sequence, then do the replacement, then re-inflate the escape
sequences with the extracted values.

An example would look something like:

// Requires: using System.Text.RegularExpressions;
string s = @"
Meeting today : 10AM
This is an ""example: 1""
Meeting tomorrow : 11AM
This is another ""example: 2""
";

// Extract the quoted strings, working backwards so that the indexes
// of the earlier matches stay valid
MatchCollection matches = Regex.Matches(s, @"""[^""]*""");
for (int x = matches.Count - 1; x > -1; x--)
{
    Match match = matches[x];
    s = s.Remove(match.Index, match.Length);
    s = s.Insert(match.Index, "{" + x + "}");
}

// Replace the remaining : with @ (same length, so positions do not shift)
s = s.Replace(':', '@');

// Reinflate the escaped strings, first to last, so that each placeholder
// sits at its match's original index when it is restored
for (int x = 0; x < matches.Count; x++)
{
    Match match = matches[x];
    s = s.Remove(match.Index, x.ToString().Length + 2);
    s = s.Insert(match.Index, match.Value);
}
If anyone else has a better solution, I'd love to hear it.

Hope this helps.

--
Jason Whitted
Jul 21 '05 #3

Jason,

I was afraid you'd say that.

Your analysis of the problem is dead on. In reality I'm converting
parameterized SQL statements from SQL Server format to Oracle format and
vice versa (thus substituting :'s and @'s), but your example works just
as well. I'm surprised this is such a hard problem.

Since I can't figure out how to do it with a single regex, I'm just using
regexes to extract the literals and doing replaces on the other bits.
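
For comparison, here is a hedged Python sketch of a related single-pass idea: one pattern matches either a quoted literal or the parameter marker, and a callback puts literals back unchanged. The name `convert_markers` is hypothetical, and the sketch ignores SQL comments and assumes balanced quotes.

```python
import re

def convert_markers(sql, old=":", new="@"):
    # Match a quoted literal (group 1) or a bare marker character.
    # Assumption: balanced quotes; SQL comments are not handled.
    token = re.compile(r'''('[^']*'|"[^"]*")|''' + re.escape(old))
    # Literals are returned unchanged; a bare marker becomes `new`.
    return token.sub(lambda m: m.group(1) or new, sql)

oracle_sql = "SELECT * FROM t WHERE a = :a AND b = 'x:y'"
print(convert_markers(oracle_sql))            # Oracle-style to SQL Server-style
print(convert_markers("WHERE a = @a", "@", ":"))  # and the reverse direction
```

Because the literal alternative comes first and the scan runs left to right, a marker inside a literal is consumed as part of that literal and never replaced.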

----------------------
Gary McCullough
http://mccull.org/gary

Jul 21 '05 #4

On 9 Nov 2004 08:47:48 -0800, Gary McCullough wrote:
What I want to do sounds simple, but it's defeating me. I want to
substitute all occurrences of a colon : character in a string with an @
character -- unless the : occurs within a single- or double-quoted
substring. Surely this can be done with regular expressions? Do any regex
gurus know how to do it?


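This pattern will find all occurrences of a particular character except
where it occurs between quotation marks. It works by requiring that an even
number of quotation marks follow the character, so it assumes the quotes in
the string are balanced.

This example finds the occurrences of the colon character except where it
appears within double quotation marks. If you want single quotes instead,
change each \x22 to \x27.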

' Requires: Imports System.Text.RegularExpressions
Dim sPattern As String = ":(?=([^\x22]*\x22[^\x22]*\x22)*(?![^\x22]*\x22))"
Dim sInput As String = "Meeting tomorrow : 11AM ""Example: 1"""

Dim mc As MatchCollection = Regex.Matches(sInput, sPattern, RegexOptions.ExplicitCapture)

The match collection here should contain only one item: the colon after
the word "tomorrow".

If you use the replace function:

Dim sResult As String = Regex.Replace(sInput, sPattern, "@")
' sResult is now: Meeting tomorrow @ 11AM "Example: 1"
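
The same lookahead idea can be sketched in Python as a rough cross-check. It uses a non-capturing group in place of the ExplicitCapture option; the function name `replace_colons` is hypothetical, and only double quotes are handled.

```python
import re

# A colon counts as "outside" when an even number of double quotes follows it.
# Assumption: the double quotes in the input are balanced.
OUTSIDE_COLON = re.compile(r':(?=(?:[^\x22]*\x22[^\x22]*\x22)*(?![^\x22]*\x22))')

def replace_colons(text):
    return OUTSIDE_COLON.sub("@", text)

sample = (
    "Meeting today : 10AM\n"
    'This is an "example: 1"\n'
    "Meeting tomorrow : 11AM\n"
    'This is another "example: 2"\n'
)
print(replace_colons(sample))
```

Each candidate colon rescans the rest of the string, so on very long inputs an approach that splits the quoted parts out first may be cheaper.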

Perhaps this will help you.

--
Chris

dunawayc[AT]sbcglobal_lunchmeat_[DOT]net

To send me an E-mail, remove the "[", "]", underscores ,lunchmeat, and
replace certain words in my E-Mail address.
Jul 21 '05 #6
