Regex Novice needs help

I'm writing an app which is going to rely extremely heavily on the
usage of regular expressions. I'm reading the docs but having trouble
wrapping my head around some of this since it's all fairly new to me.
I have two questions, I'm hoping I can get answers to at least one :)
Any help is better than no help:

1) In many cases I am checking whether a particular string matches
a particular regular expression. However, if the match happens
"inside" the string I don't consider it a match. I need the entire
string to constitute the match. How can I force this check on the
RegEx engine?

2) Performance is going to be a big factor for this particular app. I
have about 300 pre-determined hardcoded regular expressions, and in
peak scenarios I will be matching incoming strings at a rate of about
10-15 per second. Is there a list of "guidelines" somewhere for
writing performance-aware regular expressions?

Thanks
Zach

Apr 12 '06 #1
Zach <di***********@gmail.com> wrote:
> I'm writing an app which is going to rely extremely heavily on the
> usage of regular expressions. I'm reading the docs but having trouble
> wrapping my head around some of this since it's all fairly new to me.
> I have two questions, I'm hoping I can get answers to at least one :)
> Any help is better than no help:
>
> 1) In many cases I am checking whether a particular string matches
> a particular regular expression. However, if the match happens
> "inside" the string I don't consider it a match. I need the entire
> string to constitute the match. How can I force this check on the
> RegEx engine?

Use ^ and $ to specify the start and end of the string.
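
As a minimal sketch of that advice (the `FullMatchDemo` class, the `IsFullMatch` helper and the digits pattern are hypothetical names made up for illustration): in .NET, `$` also matches just before a final newline, so `\A` and `\z` are the stricter pair when the whole string really must match. Wrapping the pattern in a non-capturing group keeps an alternation such as `a|b` anchored as a whole.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FullMatchDemo
{
    // Hypothetical example pattern: one or more digits.
    private const string Digits = @"\d+";

    // Anchors an arbitrary pattern to the whole input.
    // (?: ... ) groups the pattern so both anchors apply to all of it.
    public static bool IsFullMatch(string pattern, string input)
    {
        return Regex.IsMatch(input, @"\A(?:" + pattern + @")\z");
    }

    public static void Main()
    {
        // Unanchored: a match "inside" the string counts.
        Console.WriteLine(Regex.IsMatch("abc123def", Digits));          // True
        // Anchored: the entire string must match.
        Console.WriteLine(IsFullMatch(Digits, "abc123def"));            // False
        Console.WriteLine(IsFullMatch(Digits, "123"));                  // True
        // ^...$ tolerates one trailing newline; \A...\z does not.
        Console.WriteLine(Regex.IsMatch("123\n", "^" + Digits + "$"));  // True
        Console.WriteLine(IsFullMatch(Digits, "123\n"));                // False
    }
}
```

If a trailing newline is acceptable in the input, plain `^` and `$` are enough.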

> 2) Performance is going to be a big factor for this particular app. I
> have about 300 pre-determined hardcoded regular expressions, and in
> peak scenarios I will be matching incoming strings at a rate of about
> 10-15 per second. Is there a list of "guidelines" somewhere for
> writing performance-aware regular expressions?

Do you mean you'd be running 300 regular expressions on each of 10-15
strings per second? I wouldn't like to say for *sure* without testing
it (with examples of the actual regular expressions and sample data)
but I wouldn't have thought that would be a problem.

One important thing is to make sure you build the regular expressions
ahead of time and re-use them rather than creating new ones each time.
Also, use RegexOptions.Compiled. I'm sure others will be able to help
further - but the best thing to do to start with is to work out your
regular expressions and create a good sample data set. Then measure,
measure, measure - whenever you change something, run the test data set
through again and record the change to performance. Make sure you keep
that record - don't just do it on a scrap of paper. If possible, keep
the test results in the same source control system as the source, so
you can work out *exactly* which set of test results came from which
version of the code.
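
A rough sketch of the "build once, re-use, then measure" approach described above. The `RegexCache` class, the pattern names and the sample strings are placeholders invented for illustration, not the poster's real expressions, and the loop is only a crude timing harness rather than a rigorous benchmark.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class RegexCache
{
    // Constructed once when the type is first used, then shared by every call.
    // RegexOptions.Compiled trades slower construction for faster matching.
    private static readonly Dictionary<string, Regex> Patterns =
        new Dictionary<string, Regex>
        {
            { "digits", new Regex(@"\A\d+\z", RegexOptions.Compiled) },
            { "word",   new Regex(@"\A[A-Za-z]+\z", RegexOptions.Compiled) },
        };

    // Looks up a prebuilt regex by name instead of creating a new one per input.
    public static bool Matches(string name, string input)
    {
        return Patterns[name].IsMatch(input);
    }

    public static void Main()
    {
        string[] samples = { "12345", "hello", "12ab", "" };
        const int rounds = 100000;

        // Time the whole sample set many times over so the figure is measurable.
        Stopwatch timer = Stopwatch.StartNew();
        int hits = 0;
        for (int i = 0; i < rounds; i++)
        {
            foreach (string sample in samples)
            {
                if (Matches("digits", sample)) hits++;
                if (Matches("word", sample)) hits++;
            }
        }
        timer.Stop();

        // Record these numbers alongside the code version they came from.
        Console.WriteLine("hits: " + hits);
        Console.WriteLine("elapsed ms: " + timer.ElapsedMilliseconds);
    }
}
```

Re-running the same harness after each change to a pattern or option gives comparable numbers to keep with the source.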

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Apr 12 '06 #2
What I meant regarding the 300 and the 10-15 numbers is that my entire
set of regular expressions consists of about 300ish. Sometimes I will
have around 10-15 input strings per second to check against these
regular expressions. However, each input string will never be checked
against more than 3-4 regular expressions out of those 300. So a true
worst case is like (10-15)*(3-4) = 30-60 -> 45ish matches per second or
so.

Apr 12 '06 #3
Zach <di***********@gmail.com> wrote:
> What I meant regarding the 300 and the 10-15 numbers is that my entire
> set of regular expressions consists of about 300ish. Sometimes I will
> have around 10-15 input strings per second to check against these
> regular expressions. However, each input string will never be checked
> against more than 3-4 regular expressions out of those 300. So a true
> worst case is like (10-15)*(3-4) = 30-60 -> 45ish matches per second or
> so.

Right - that shouldn't be a problem at all. As ever though, it's worth
measuring. Of course, if the regexes are incredibly complicated, it
could take a long time.
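
One way a complicated regex can take a long time is heavy backtracking on input that almost matches. As a hedged sketch: .NET 4.5 and later (newer than this thread) let a `Regex` be constructed with a match timeout, so a slow match throws `RegexMatchTimeoutException` instead of stalling. The `TimeoutDemo` class, the `TryMatch` helper, the nested-quantifier pattern and the 200 ms budget are all illustrative assumptions, and whether this particular input actually times out depends on the runtime version.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TimeoutDemo
{
    // Nested quantifiers such as (a+)+ are a classic source of heavy backtracking.
    // The third argument caps how long a single match attempt may run.
    private static readonly Regex Risky =
        new Regex(@"\A(a+)+\z", RegexOptions.None, TimeSpan.FromMilliseconds(200));

    // Returns true/false for a completed match, or null if the timeout was hit.
    public static bool? TryMatch(string input)
    {
        try
        {
            return Risky.IsMatch(input);
        }
        catch (RegexMatchTimeoutException)
        {
            return null; // gave up: report "unknown" rather than hang
        }
    }

    public static void Main()
    {
        // A well-formed input completes immediately.
        Console.WriteLine(TryMatch("aaaa"));

        // A near-miss input may force the engine to try many alternatives.
        string nearMiss = new string('a', 40) + "!";
        bool? result = TryMatch(nearMiss);
        Console.WriteLine(result.HasValue ? result.Value.ToString() : "timed out");
    }
}
```

Measuring with realistic worst-case inputs, as suggested above, is still the way to find out whether any of the real patterns behave like this.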

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Apr 12 '06 #4
