New problem:
If I insert a new parameter before my commandText, the parse goes wrong again!
The new parse string is:
database=Application;commandText=insert into [Trace] (Text) values(@category
+ N': ' + @message);commandType=StoredProcedure; message=@message;
category=@category
It is all on a single line again, but in this instance the first "name" returned is
"database=Application;commandText" with a "value" of "insert into [Trace]
(Text) values(@category + N': ' + @message);". After this initial hiccup,
it catches up and reports the remaining pairs correctly.
Can I be pushy and ask for more help?
Cheers.
Martin.
"Greg Bacon" <gbacon@hiwaay.net> wrote in message
news:11b5saha7r6b516@corp.supernews.com...[color=blue]
> In article <O4Q9dJzcFHA.3912@TK2MSFTNGP15.phx.gbl>,
> Martin Robins <sorry@noaddress.spam> wrote:
>
> : The string I am trying to parse is as follows:
> : commandText=insert into [Trace] (Text) values (@message + N': ' +
> : @category);commandType=StoredProcedure; message=@message;
> : category=@category
> : [...]
> : The regular expression code is as follows:
> : Regex regex = new
> : Regex(@"(?<name>[^=]*)=(?<value>[^(?:;|$)]*)(?:;|$)",
> : RegexOptions.ExplicitCapture);
>
> Part of your problem is that most metacharacters lose their special
> meanings inside character classes. I doubt that you meant to say
> that a value is zero or more characters that aren't parentheses,
> question mark, colon, semicolon, pipe, and dollar sign.
>
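To illustrate the point about character classes, here is a minimal sketch. The class and helper names are made up for this example; only the bracketed class itself comes from the original pattern. It shows that the text between the brackets is read as a set of literal characters to exclude, not as a group:

```csharp
using System;
using System.Text.RegularExpressions;

static class CharClassDemo
{
    // Inside [...] the characters ( ? : ; | $ ) are plain literals, so this
    // class means "any one character except those seven".
    static readonly Regex Value = new Regex(@"^[^(?:;|$)]*");

    // Hypothetical helper: returns the leading run that the class accepts.
    public static string LeadingRun(string input) => Value.Match(input).Value;

    static void Main()
    {
        Console.WriteLine("[{0}]", LeadingRun("values (@message)")); // stops before '('
        Console.WriteLine("[{0}]", LeadingRun("N': ' + x"));         // stops before ':'
        Console.WriteLine("[{0}]", LeadingRun("a|b"));               // stops before '|'
    }
}
```
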
> The trickier part was figuring out why it matched the first *name*
> as "insert into...commandType". At first, I thought it might have
> been a longest-leftmost issue[*], but then I realized it was due to
> a combination of the character class misunderstanding and your trailing
> "anchor."
>
> [*] A POSIX thing -- see pg. 116 of Friedl's *Mastering Regular
> Expressions* or http://shurl.org/friedl-longest-leftmost
>
> When the matching engine tries the real first value ("insert...
> @category)"), it sees the left parenthesis before Text and
> says, 'Wait, a value can't have any parentheses because of the
> given character class.'
>
> It then tries to backtrack, but it can't match the trailing anchor,
> i.e., there's no semicolon or end-of-line to the left of that
> paren.
>
> 'Okay,' it thinks, 'I must've matched a bad substring for name,'
> but a name is zero or more characters that aren't equals signs.
> The next place that can start is "insert into...", and the greedy
> star quantifier sucks up everything up to "commandType". The
> rest of the pattern can match from there, and that explains the
> faulty match.
>
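The faulty match described above can be reproduced with a short sketch. It uses the original pattern unchanged and assumes the input is really one line, with the posting wraps removed; the class and member names are invented for the example:

```csharp
using System;
using System.Text.RegularExpressions;

static class FaultyMatchDemo
{
    // The original pattern from the question, unchanged.
    public static readonly Regex Original = new Regex(
        @"(?<name>[^=]*)=(?<value>[^(?:;|$)]*)(?:;|$)",
        RegexOptions.ExplicitCapture);

    // Assumption: the input is a single line (wraps from the post removed).
    public const string Input =
        "commandText=insert into [Trace] (Text) values (@message + N': ' + " +
        "@category);commandType=StoredProcedure; message=@message; " +
        "category=@category";

    // Name group of the first match the engine manages to complete.
    public static string FirstName() =>
        Original.Match(Input).Groups["name"].Value;

    static void Main()
    {
        // The first name printed starts at "insert into" and runs through
        // "commandType", because no match starting at "commandText" can finish.
        foreach (Match m in Original.Matches(Input))
        {
            Console.WriteLine("name=[{0}], value=[{1}]",
                m.Groups["name"].Value, m.Groups["value"].Value);
        }
    }
}
```
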
> Consider the following snippet:
>
> static void Main(string[] args)
> {
>     string str =
> @"commandText=insert into [Trace] (Text) values (@message + N': ' +
> @category);commandType=StoredProcedure ; message=@message;
> category=@category";
>
>     Regex nameval = new Regex(
>         @"(?<name>\S+)\s*=\s*(?<val>[^;]+?)\s*(;|$)",
>         RegexOptions.Singleline);
>
>     foreach (Match m in nameval.Matches(str))
>     {
>         Console.WriteLine(
>             "name=[{0}], val=[{1}]",
>             m.Groups["name"].ToString(),
>             m.Groups["val"].ToString());
>     }
> }
>
> Its output is
>
> name=[commandText], val=[insert into [Trace] (Text) values (@message +
> N': ' +
> @category)]
> name=[commandType], val=[StoredProcedure]
> name=[message], val=[@message]
> name=[category], val=[@category]
>
> Here we define a name as a run of non-whitespace characters (\S+). By
> matching optional whitespace (\s*) and excluding it from the capturing
> parentheses, we save the trim steps from your code.
>
> The val subpattern is similar: a val is a run of non-semicolon
> characters. One place to pay attention is the +? quantifier. Remember
> that * (zero or more of..) and + (one or more of..) are greedy: they
> grab as much text as they can. The ? versions (think of them as
> cautious or timid) are very anxious to turn control over to the next
> part of the expression.
>
> If the val subpattern had been [^;]+ instead of [^;]+?, any trailing
> whitespace would be consumed as part of val, and \s* would still happily
> match the empty string. (Remember that starred expressions *always*
> succeed, although perhaps by matching nothing.)
>
> This is mostly a polish issue. Using the non-greedy plus gives \s*
> a chance to throw away whitespace. Again, this saves the extra trim
> steps.
>
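The difference between the greedy and non-greedy plus can be seen in a small sketch. The two patterns are the quoted one and a variant with [^;]+ in place of [^;]+?; the class and helper names are hypothetical:

```csharp
using System;
using System.Text.RegularExpressions;

static class LazyQuantifierDemo
{
    // Same pattern shape as the quoted snippet; only the val quantifier differs.
    public const string Greedy = @"(?<name>\S+)\s*=\s*(?<val>[^;]+)\s*(;|$)";
    public const string Lazy   = @"(?<name>\S+)\s*=\s*(?<val>[^;]+?)\s*(;|$)";

    // Hypothetical helper: returns the val group of the first match.
    public static string FirstVal(string pattern, string input) =>
        Regex.Match(input, pattern).Groups["val"].Value;

    static void Main()
    {
        // Note the space between "StoredProcedure" and the semicolon.
        string input = "commandType=StoredProcedure ; message=@message";

        // Greedy keeps the trailing space inside val; lazy leaves it for \s*.
        Console.WriteLine("greedy: [{0}]", FirstVal(Greedy, input));
        Console.WriteLine("lazy:   [{0}]", FirstVal(Lazy, input));
    }
}
```
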
> One more important note: the final name-val pair may be terminated
> by end-of-string instead of a semicolon, hence the (;|$) at the end.
> In .NET, $ means end-of-string (or just before a final newline)
> unless RegexOptions.Multiline is set, so leave that option off.
> RegexOptions.Singleline only changes what . matches, and this pattern
> has no dot, so it is harmless here but not required. (I wasn't sure
> if the newlines in your example were an artifact of posting to Usenet
> or whether they might actually be there, so I took the conservative
> route.)
>
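As far as I can tell, in .NET it is RegexOptions.Multiline, not Singleline, that governs where $ can match. A quick way to check is to count matches of a $-anchored pattern against a two-line string under each option. The test string and class name below are invented for the sketch:

```csharp
using System;
using System.Text.RegularExpressions;

static class DollarAnchorDemo
{
    // Two pairs separated by a newline; the pattern must end at a $ position.
    public static int Count(RegexOptions options) =>
        Regex.Matches("a=1\nb=2", @"\w+=\d+$", options).Count;

    static void Main()
    {
        // Default and Singleline: $ only fits at the very end, so one match.
        Console.WriteLine("None:       {0}", Count(RegexOptions.None));
        Console.WriteLine("Singleline: {0}", Count(RegexOptions.Singleline));
        // Multiline: $ also fits before the embedded newline, so two matches.
        Console.WriteLine("Multiline:  {0}", Count(RegexOptions.Multiline));
    }
}
```
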
> I hope this helps.
>
> Greg[/color]
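
On the new problem at the top of this post: assuming the pattern in use is the one from the quoted snippet, a likely cause is that \S+ accepts any non-whitespace character, including '=' and ';', so the name can swallow "database=Application;commandText" in one go. A possible fix, sketched below with made-up class and member names, is to forbid '=' and ';' in the name as well. This is not the only way to do it, and it still assumes that no value contains a semicolon.

```csharp
using System;
using System.Text.RegularExpressions;

static class NameValueFixDemo
{
    // The pattern from the quoted snippet: name is any run of non-whitespace.
    public static readonly Regex Quoted = new Regex(
        @"(?<name>\S+)\s*=\s*(?<val>[^;]+?)\s*(;|$)");

    // Suggested variant: a name may not contain whitespace, '=' or ';'.
    public static readonly Regex Fixed = new Regex(
        @"(?<name>[^\s=;]+)\s*=\s*(?<val>[^;]+?)\s*(?:;|$)");

    // The new parse string, joined into one line.
    public const string Input =
        "database=Application;commandText=insert into [Trace] (Text) " +
        "values(@category + N': ' + @message);commandType=StoredProcedure; " +
        "message=@message; category=@category";

    static void Main()
    {
        // With the quoted pattern the first name runs across the first ';'.
        Console.WriteLine("quoted first name: [{0}]",
            Quoted.Match(Input).Groups["name"].Value);

        // With the variant, each pair is reported separately.
        foreach (Match m in Fixed.Matches(Input))
        {
            Console.WriteLine("name=[{0}], val=[{1}]",
                m.Groups["name"].Value, m.Groups["val"].Value);
        }
    }
}
```

With the variant, the first two pairs come back as database / Application and commandText / the insert statement, followed by the remaining three pairs.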