Regular expressions performance problem

I wanted to use regular expressions, but unfortunately they are too
slow. Should they be this slow, or am I doing something wrong? I am
reading bytes from a file, converting each one to a char, and building a
string from the individual characters. I then check whether the string is
in the correct format and take out the various parameters I need. It looked
nice and neat, so I am not happy that I may have to use another
method. Any alternative solutions?

*** Sent via Devdex http://www.devdex.com ***
Don't just participate in USENET...get rewarded for it!
Nov 16 '05 #1
Regex is usually a fast way to search in any language; its only real
problem is the complicated syntax. I think your performance issue is just
the result of a badly written regex. If you like, please post your regex and
the text you are searching, and we will try to help you with it.
--
Tamir Khason
You want dot.NET? Just ask:
"Please, www.dotnet.us "

Nov 16 '05 #2
"James Dean" <m_*******@yahoo.com> wrote in
news:ue**************@TK2MSFTNGP12.phx.gbl...
> I wanted to use regular expressions but unfortunately it is too
> slow.....Should they be so slow or am i doing something wrong.

Usually they're pretty fast. In my experience, they are faster than most
other available searching methods (like String.IndexOf).

> I am
> reading in bytes from a file then converting them to char then making a
> string out of each of the individual bytes.

Do you use a StringBuilder to create the string?
Why don't you use a StreamReader class?

> I check if its in the
> correct format... and take out the various parameters i need. It looked
> nice and neat so i am not happy that i may have to use another
> method.....any alternative solutions?.


As I said, regexes are usually quite fast, but that does of course depend on
the pattern you match for. So you could do the following:
1. Find out if the regex is really the bottleneck.
2. If it is, post your regex here, maybe with some sample data.
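
For step 1, a rough way to check where the time goes is to time the string building and the regex match separately. The following is only a sketch: the class name, method names and sample data are made up for illustration, and only the pattern is taken from the thread.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

public static class BottleneckCheck
{
    // Builds the string one character at a time with +=, as in the question.
    public static string BuildByConcatenation(byte[] data)
    {
        string result = "";
        foreach (byte b in data)
        {
            result += Convert.ToChar(b);
        }
        return result;
    }

    // Runs the validation pattern from the question over the text.
    public static bool MatchHeader(string text)
    {
        return Regex.Match(text, @"((&|;)\d+(&|;))+[A-Za-z]{1}").Success;
    }

    public static void Main()
    {
        // Hypothetical input: made-up header-like data, not from the thread.
        StringBuilder sample = new StringBuilder();
        for (int i = 0; i < 3000; i++)
        {
            sample.Append("&12;34;");
        }
        byte[] data = Encoding.ASCII.GetBytes(sample.ToString());

        // Time the two phases independently.
        Stopwatch buildTimer = Stopwatch.StartNew();
        string text = BuildByConcatenation(data);
        buildTimer.Stop();

        Stopwatch regexTimer = Stopwatch.StartNew();
        bool matched = MatchHeader(text);
        regexTimer.Stop();

        Console.WriteLine("Build: {0} ms", buildTimer.ElapsedMilliseconds);
        Console.WriteLine("Regex: {0} ms (matched: {1})", regexTimer.ElapsedMilliseconds, matched);
    }
}
```

The absolute numbers depend on the machine and the input size; what matters is which of the two phases dominates.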

Niki
Nov 16 '05 #3

    byte myValue = fileMemBufferIn.GetCurrentByte;

    char convertValue = Convert.ToChar(myValue);
    // Select is a string type
    Select += Convert.ToString(convertValue);

I use this to check whether the string is in the correct format:

    if (Regex.Match(Select, @"((&|;)\d+(&|;))+[A-Za-z]{1}").Success)

Then I get each of the parameters from the command:

    public bool GetCommand(string headerValue)
    {
        paramCount = 0;
        Regex SeperateParams = new Regex(@"(&|;)\d+((&|;) | (&[a-zA-Z]))*");
        foreach (Match myMatches in SeperateParams.Matches(headerValue))
        {
            // Strip the delimiters and the trailing letter, leaving only digits
            string values = myMatches.Value.ToString();
            values = Regex.Replace(values, @"&", "");
            values = Regex.Replace(values, @";", "");
            values = Regex.Replace(values, @"[A-Za-z]", "");
            HeaderParameters[paramCount] = int.Parse(values);
            paramCount++;
        }
        this.GetParameter1 = HeaderParameters[0];
        this.GetParameter2 = HeaderParameters[1];
        this.GetParameter3 = HeaderParameters[2];
        this.GetParameter4 = HeaderParameters[3];
        // Reset the buffer for the next command
        for (int t = 0; t < 4; t++)
        {
            HeaderParameters[t] = -1;
        }
        return true;

        return false;
    }

The commands I read in are in the format
"&Param1;Param2;Param3.....&a" (where "a" is some character from a-z).
Nov 16 '05 #4
The performance problem is most probably due to your string usage, not due
to the regex.

"James Dean" <m_*******@yahoo.com> wrote in
news:ud**************@TK2MSFTNGP09.phx.gbl...

> byte myValue = fileMemBufferIn.GetCurrentByte
>
> char convertValue = Convert.ToChar(myValue);
> //Select is a string type
> Select += Convert.ToString(convertValue);

Every time this line is hit, it creates a new string, copies the old one
into it, discards the old one, and continues with the new one. You actually
have an O(n^2) string-reading algorithm... Use the StreamReader class!
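
As a minimal sketch of the two alternatives mentioned in this thread (StringBuilder and StreamReader), assuming the file holds plain ASCII text: the class name, method names and sample bytes below are hypothetical, and a MemoryStream stands in for the real file.

```csharp
using System;
using System.IO;
using System.Text;

public static class HeaderReader
{
    // Appends each byte to a StringBuilder instead of using string +=,
    // so the text is not copied again on every character.
    public static string BytesToString(byte[] data)
    {
        StringBuilder builder = new StringBuilder(data.Length);
        foreach (byte b in data)
        {
            builder.Append((char)b);
        }
        return builder.ToString();
    }

    // Lets StreamReader do the byte-to-char decoding in one call.
    public static string ReadAll(Stream input, Encoding encoding)
    {
        using (StreamReader reader = new StreamReader(input, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Made-up sample bytes; a FileStream would take the MemoryStream's place.
        byte[] data = Encoding.ASCII.GetBytes("&1;2;3&a");
        Console.WriteLine(BytesToString(data));
        Console.WriteLine(ReadAll(new MemoryStream(data), Encoding.ASCII));
    }
}
```

Note that the cast in BytesToString and Encoding.ASCII only agree for byte values below 128; for other data the right Encoding has to be chosen.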

> i use this to check the string to see if its in the correct
> format
> if(Regex.Match(Select,@"((&|;)\d+(&|;))+[A-Za-z]{1}").Success)

You didn't supply any sample data. It does look OK, though.

> then i get each of the parameters from command:
> public bool GetCommand(string headerValue)
> {
> paramCount = 0;
> Regex SeperateParams = new Regex(@"(&|;)\d+((&|;) | (&[a-zA-Z]))*");
> foreach(Match myMatches in SeperateParams.Matches(headerValue))
> {
> string values = myMatches.Value.ToString();
> values = Regex.Replace(values,@"&","");
> values = Regex.Replace(values,@";","");
> values = Regex.Replace(values,@"[A-Za-z]","");

Ugh. You know that each of those "replace" operations again has to create a
new string (see above)? Use capturing parentheses to get data out of the
regex.
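
A capturing group could look roughly like the sketch below. It assumes the parameters are plain ASCII integers, each preceded by '&' or ';'; the class name, the pattern and the sample command are illustrative and not taken from the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ParameterExtractor
{
    // The leading '&' or ';' is matched but not captured;
    // group 1 holds only the digits of one parameter.
    private static readonly Regex ParameterPattern = new Regex(@"[&;]([0-9]+)");

    public static List<int> ExtractParameters(string header)
    {
        List<int> values = new List<int>();
        foreach (Match match in ParameterPattern.Matches(header))
        {
            // No Regex.Replace calls needed: the group already excludes delimiters.
            values.Add(int.Parse(match.Groups[1].Value));
        }
        return values;
    }

    public static void Main()
    {
        // Hypothetical command in the format described in the question.
        foreach (int value in ExtractParameters("&10;20;30;40&a"))
        {
            Console.WriteLine(value);
        }
    }
}
```

Keeping the Regex in a static field also avoids constructing it again on every call.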

> HeaderParameters[paramCount] = int.Parse(values);

Does HeaderParameters adjust its size automatically? Or do you set it to
the correct size at some point?

> paramCount++;
>
> }
> this.GetParameter1 = HeaderParameters[0];this.GetParameter2 =
> HeaderParameters[1];this.GetParameter3 =
> HeaderParameters[2];this.GetParameter4 = HeaderParameters[3];

Whatever this might be, it sure looks ugly. It looks as if it would be
better as a loop.

> for(int t = 0;t < 4;t++)
> {
> HeaderParameters[t] = -1;
> }

Using something like "HeaderParameters.Length" is probably better here, as
it would allow you to change the size of HeaderParameters some day without
breaking code like that.
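
A small sketch of the Length-based reset, assuming HeaderParameters is an int array (the thread does not show its declaration); the class name, method name and array size are hypothetical.

```csharp
using System;

public static class HeaderReset
{
    // Resets every slot to -1, whatever size the array happens to have.
    public static void Reset(int[] headerParameters)
    {
        for (int t = 0; t < headerParameters.Length; t++)
        {
            headerParameters[t] = -1;
        }
    }

    public static void Main()
    {
        int[] headerParameters = new int[6]; // hypothetical size
        Reset(headerParameters);
        foreach (int value in headerParameters)
        {
            Console.WriteLine(value);
        }
    }
}
```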

> return true;
>
> return false;
>
> }
>
> The commands i read in are in the format
> "&Param1;Param2;Param3.....&a(Some character from a-z)

Hardly. The regex you use to check your input for correctness won't eat
this.
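
One way to see what the validation pattern does with such a command is to print the text it actually matches. In the sketch below the command string is made up, and the anchored pattern is only an assumption about the intended format (numeric parameters separated by ';', then '&' and a single letter). With this input the original pattern matches only the tail of the command, because each repetition consumes both its leading and its trailing delimiter.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FormatCheck
{
    // Returns the text matched by the validation pattern from the question.
    public static string MatchedText(string command)
    {
        return Regex.Match(command, @"((&|;)\d+(&|;))+[A-Za-z]{1}").Value;
    }

    // Assumed format: '&', numbers separated by ';', then '&' and one letter.
    public static bool IsWholeCommand(string command)
    {
        return Regex.IsMatch(command, @"^&[0-9]+(;[0-9]+)*&[A-Za-z]$");
    }

    public static void Main()
    {
        string command = "&10;20;30&a"; // hypothetical sample
        Console.WriteLine(MatchedText(command));    // prints ";30&a"
        Console.WriteLine(IsWholeCommand(command)); // prints "True"
    }
}
```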

Niki
Nov 16 '05 #5
What do you want to parse?
Strings such as
&anything;anything;anything&anything;anything;anything;anything;anything;anything;anything;anything ?
--
Tamir Khason
Nov 16 '05 #6
