Need help with Regex

Danny Ni
#1: Jul 31 '08
Hi,

The following code snippet is causing the CPU to max out on my local machine
and production servers. It looks fine in Expresso though.

Regex rgxVideo = new Regex(
    @"<embed(\s+[a-z]+\s*=\s*(""[^""]*""|'[^']*'|[^\s]*))*\s+src=\s*(""|')?http://www.g4tv.com/i?sv3?/(?<videokey>\d+)(""|')?(\s+[a-z]+\s*=\s*(""[^""]*""|'[^']*'|[^\s]*))*\s*(/\s*>|>\s*</embed>)",
    RegexOptions.IgnoreCase);
string strBody =
    "<embed name=\"VideoPlayer\" src=\"http://localhost/lv3/26757\" width=\"480\" height=\"418\" " +
    "scale=\"ShowAll\" loop=\"loop\" menu=\"menu\" wmode=\"Window\" quality=\"1\" " +
    "type=\"application/x-shockwave-flash\"></embed>" +
    "<embed name=\"VideoPlayer\" src=\"http://localhost/lv3/19251\" width=\"480\" height=\"418\" " +
    "scale=\"ShowAll\" loop=\"loop\" menu=\"menu\" wmode=\"Window\" quality=\"1\" " +
    "type=\"application/x-shockwave-flash\"></embed>" +
    "<embed name=\"VideoPlayer\" src=\"http://localhost/lv3/20202\" width=\"480\" height=\"418\" " +
    "scale=\"ShowAll\" loop=\"loop\" menu=\"menu\" wmode=\"Window\" quality=\"1\" " +
    "type=\"application/x-shockwave-flash\"></embed>" +
    "<embed name=\"VideoPlayer\" src=\"http://localhost/lv3/16549\" width=\"480\" height=\"418\" " +
    "scale=\"ShowAll\" loop=\"loop\" menu=\"menu\" wmode=\"Window\" quality=\"1\" " +
    "type=\"application/x-shockwave-flash\"></embed>";
foreach (Match objMatch in rgxVideo.Matches(strBody)) // loops indefinitely here
{
}

TIA






The Colorado Kid
#2: Jul 31 '08


Hello Danny,

I've found Expresso doesn't work well enough for .NET regex. Use the regex
designer at http://www.radsoftware.com.au/; it's free. Check it with that.

Kottekoe
#4: Aug 1 '08


Danny,

I tried this in Expresso, and it predicts the same behavior you should see in
code: the execution time of your regex grows exponentially with the size of
the input string. I'm guessing that when you tested it in Expresso, you used
a shorter input string or one that easily found a match, so it terminated
quickly. The example in your code does not have a match (for example, "g4tv"
never appears in the input). When there is no match, the regex engine has to
backtrack through every possible way of applying your regex to the text
before it can give up. The number of ways grows exponentially with the size
of the string, so your application hangs while it keeps trying new ones.
There are dangerous things in your regex design that cause this. Be very
careful with nested quantifiers, especially when applied to wildcards, like
(.*)*. Things like this can cause the execution time to double every time a
single character is added to the input text. It may work fine for 100
characters, but add 10 more and the execution time goes up by a factor of
about 1,000 (2^10), or add 20 characters (a 20% increase in length) and the
time goes up by a factor of about one million (2^20).
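
A minimal sketch of the doubling effect described above, assuming a .NET
version whose Regex constructor accepts a match timeout (.NET Framework 4.5
or later, which postdates this thread). The pattern ^(a+)+$ is the textbook
nested-quantifier case and only stands in for the regex in the question;
BacktrackingDemo and FinishesInTime are names made up for this illustration.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

class BacktrackingDemo
{
    // Returns true if the match attempt completed, false if the timeout aborted it.
    public static bool FinishesInTime(string pattern, string input, TimeSpan timeout)
    {
        // The three-argument constructor sets an upper bound on each match attempt.
        var regex = new Regex(pattern, RegexOptions.None, timeout);
        try
        {
            regex.IsMatch(input);
            return true;
        }
        catch (RegexMatchTimeoutException)
        {
            return false;
        }
    }

    static void Main()
    {
        // Nested quantifier: (a+)+ can split a run of 'a' characters in
        // exponentially many ways, and a failed match has to try them all.
        const string pattern = @"^(a+)+$";

        foreach (int n in new[] { 10, 20, 30 })
        {
            // The trailing 'b' guarantees that no match exists.
            string input = new string('a', n) + "b";

            var sw = Stopwatch.StartNew();
            bool finished = FinishesInTime(pattern, input, TimeSpan.FromSeconds(1));
            sw.Stop();

            Console.WriteLine("n={0}: {1} after {2} ms",
                n, finished ? "finished" : "timed out", sw.ElapsedMilliseconds);
        }
    }
}
```

On the default backtracking engine the larger inputs would be expected to hit
the one-second limit, but exact timings depend on the runtime, so none are
shown here. The timeout only caps the damage; it does not fix the pattern.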

JWT

P.S. I don't know what the Colorado Kid is talking about. Expresso is
specifically designed to work with .NET regex.
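
One possible way to remove the "dangerous things" from the pattern in the
question, offered as a sketch rather than a definitive fix. In the original,
the unquoted-value alternative [^\s]* can also match a quoted value, and it
can run past ">" into the next tag, so every attribute can be consumed in
more than one way. The version below tightens that alternative to
[^\s"'>]+ (which, as a side effect, no longer accepts an empty unquoted
value), escapes the dots in the host name, and uses non-capturing groups.
EmbedRegexSketch, ExtractVideoKeys and the g4tv.com sample URL are
assumptions made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

class EmbedRegexSketch
{
    // Same overall shape as the pattern in the question; only the unquoted
    // attribute value is restricted so that it cannot overlap the quoted forms.
    const string Pattern =
        @"<embed(?:\s+[a-z]+\s*=\s*(?:""[^""]*""|'[^']*'|[^\s""'>]+))*" +
        @"\s+src=\s*(?:""|')?http://www\.g4tv\.com/i?sv3?/(?<videokey>\d+)(?:""|')?" +
        @"(?:\s+[a-z]+\s*=\s*(?:""[^""]*""|'[^']*'|[^\s""'>]+))*" +
        @"\s*(?:/\s*>|>\s*</embed>)";

    static readonly Regex RgxVideo = new Regex(Pattern, RegexOptions.IgnoreCase);

    // Collects the "videokey" group of every <embed> tag the pattern matches.
    public static List<string> ExtractVideoKeys(string body)
    {
        var keys = new List<string>();
        foreach (Match m in RgxVideo.Matches(body))
        {
            keys.Add(m.Groups["videokey"].Value);
        }
        return keys;
    }

    static void Main()
    {
        // Shaped like the input in the question: a localhost URL, so no match.
        string noMatch =
            "<embed name=\"VideoPlayer\" src=\"http://localhost/lv3/26757\" width=\"480\" " +
            "type=\"application/x-shockwave-flash\"></embed>";

        // Hypothetical input whose URL fits the pattern.
        string withMatch =
            "<embed name=\"VideoPlayer\" src=\"http://www.g4tv.com/sv3/26757\" width=\"480\" " +
            "type=\"application/x-shockwave-flash\"></embed>";

        Console.WriteLine(ExtractVideoKeys(noMatch).Count);              // 0
        Console.WriteLine(string.Join(",", ExtractVideoKeys(withMatch))); // 26757
    }
}
```

Note that the pattern requires "sv3" (or a close variant) in the path, so the
"lv3" URLs from the question would not match even on the g4tv.com host.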
The Colorado Kid
#5: Aug 1 '08


Kottekoe,

I used to use Expresso for regex testing in my .NET programs, but one day
something worked in Expresso and didn't in actual .NET, so I ditched it for
the Rad Regex Designer, which is a great tool. I liked Expresso, but I find
Rad's better.
