Regex at Index == 0

Hello:

I have an input string that is supposed to follow a particular syntax.
I am making a simple parser. In doing so, I need to write a lexical
analyzer. I have decided to implement it in terms of regular
expressions.

I want to take a list of regular expressions and iterate over them,
finding the first one that matches. Of course, it has to match AND the
match must start at index == 0.

The problem I have is that an unanchored regex searches the whole
string for a match. I want to limit the match to the first position.
Would prepending ^ to every regex work? I am just looking for a way to
short-circuit the search.

Thanks,
Travis
Sep 23 '08 #1
Yes. A leading ^ means the pattern can match only at the start of the
search string (unless the Multiline option is set, in which case ^ also
matches at the start of each line).

How many RegExps are you using? If just a few, have you considered using a
single RegExp:

^(?:(expr1)|(expr2)|...|(exprN))

Also are you tokenising? E.g., a form of command and value list syntax?
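
A minimal sketch of the single-RegExp idea in C#, in case it helps. The token kinds and sub-patterns here (Number, Identifier, Operator) are assumptions for illustration, not taken from the actual parser. Named groups are used instead of numbered ones so the caller can tell which alternative matched.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CombinedLexerRegex
{
    // One anchored alternation. Each alternative is a named group, so the
    // group that succeeded identifies the token kind.
    // The kinds and sub-patterns are illustrative assumptions.
    private static readonly Regex TokenAtStart = new Regex(
        @"^(?:(?<Number>[0-9]+)|(?<Identifier>[A-Za-z_]\w*)|(?<Operator><=|>=|<>|[=<>]))");

    private static readonly string[] Kinds = { "Number", "Identifier", "Operator" };

    // Returns the kind of the token found at index 0 and its text,
    // or null if no alternative matches at index 0.
    public static string KindAtStart(string input, out string value)
    {
        Match m = TokenAtStart.Match(input);
        if (m.Success)
        {
            foreach (string kind in Kinds)
            {
                if (m.Groups[kind].Success)
                {
                    value = m.Groups[kind].Value;
                    return kind;
                }
            }
        }
        value = null;
        return null;
    }
}
```

The order of the alternatives matters: the first alternative that matches wins, so more specific patterns should come before more general ones.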
--
Anthony Jones - MVP ASP/ASP.NET

Sep 23 '08 #2
If you want to ensure that your regular expression matches only from the
beginning of the string then you indeed need to prefix your regular
expression with ^.
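
A short sketch of the "list of regular expressions" approach with the prefix applied, assuming C# and hypothetical helper names (Anchor, IndexOfFirstMatch). One detail worth noting: each pattern is wrapped in a non-capturing group before ^ is added, because a bare prefix on a pattern such as AND|OR would anchor only the first alternative.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FirstAnchoredMatch
{
    // Prefixes each pattern with ^ and wraps it in (?:...) so that any
    // top-level alternation stays inside the anchor.
    public static List<Regex> Anchor(IEnumerable<string> patterns)
    {
        var result = new List<Regex>();
        foreach (string p in patterns)
        {
            result.Add(new Regex("^(?:" + p + ")"));
        }
        return result;
    }

    // Returns the position in the list of the first regex that matches
    // at index 0 of the input, or -1 if none does.
    public static int IndexOfFirstMatch(IList<Regex> anchored, string input)
    {
        for (int i = 0; i < anchored.Count; i++)
        {
            if (anchored[i].IsMatch(input))
            {
                return i;
            }
        }
        return -1;
    }
}
```

For comparison, an unanchored `new Regex(@"[0-9]+").Match("abc 42")` succeeds with `Index == 4`, which is exactly the scan-ahead behaviour the anchor rules out.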
--

Martin Honnen --- MVP XML
http://JavaScript.FAQTs.com/
Sep 23 '08 #3
Yes, I am tokenizing. It is a syntax similar to a SQL WHERE clause. It
seems simple enough. I have to know what type of token I have found. I
have that covered already with my token class. It tells me the token
type and the actual value. Hopefully this will work out smoothly.
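
For readers following along, here is a rough sketch of what such a tokenizer could look like in C#. It is not the actual code: the Token shape, the rule names and the patterns are all assumptions. It uses the \G anchor together with Regex.Match(input, startat), which pins each rule to the current position, so the lexer can advance through the string without taking substrings (with startat, ^ would still refer to index 0 of the whole string).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical token class: a type name plus the matched text.
public sealed class Token
{
    public string Type { get; }
    public string Value { get; }
    public Token(string type, string value) { Type = type; Value = value; }
}

public static class SketchLexer
{
    // Rules are tried in order; the first one that matches at the
    // current position wins. Names and patterns are illustrative.
    private static readonly KeyValuePair<string, Regex>[] Rules =
    {
        Rule("Whitespace", @"\s+"),
        Rule("Keyword", @"(?:AND|OR)\b"),
        Rule("Placeholder", @"\{[0-9]+\}"),
        Rule("Identifier", @"[A-Za-z_]\w*"),
        Rule("Operator", @"<=|>=|<>|[=<>]"),
        Rule("Paren", @"[()]"),
    };

    // \G requires the match to begin exactly at the startat position.
    private static KeyValuePair<string, Regex> Rule(string type, string pattern)
    {
        return new KeyValuePair<string, Regex>(type, new Regex(@"\G(?:" + pattern + ")"));
    }

    public static List<Token> Tokenize(string input)
    {
        var tokens = new List<Token>();
        int pos = 0;
        while (pos < input.Length)
        {
            Match match = null;
            string type = null;
            foreach (var rule in Rules)
            {
                match = rule.Value.Match(input, pos);
                if (match.Success) { type = rule.Key; break; }
            }
            if (type == null)
            {
                throw new FormatException("Unexpected character at index " + pos);
            }
            if (type != "Whitespace")
            {
                tokens.Add(new Token(type, match.Value)); // whitespace is skipped
            }
            pos += match.Length; // every rule consumes at least one character
        }
        return tokens;
    }
}
```

Calling `SketchLexer.Tokenize("Name = {0} AND State = {1}")` yields Identifier, Operator, Placeholder, Keyword, Identifier, Operator, Placeholder tokens in that order.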
Sep 23 '08 #4
Awesome! I got my parser working. The intent of the code was to allow
my database library to generate the WHERE clause of a SQL statement by
interpreting conditions on properties of my business objects.

So, say I have a Customer class with properties that map to a database
table. My library has features for mapping between database tables and
classes via XML or Attributes (similar to NHibernate or LINQ). I
wanted to allow the users of my library to get away from SQL
completely. So, they can now say something like this:

foreach (Customer customer in registry.CreateQuery<Customer>(
    "(Name = {0} AND State = {1}) OR Type = {2}",
    "Microsoft", "WA", "Preferred"))
{
}

My code will generate database-specific SQL with the actual table name
and column names used instead of the class and property names.

The cool thing now is that I don't need to write SQL anymore and so my
code becomes a little more database independent. I generate my classes
and attributes/mapping files using a tool I wrote too, so I don't have
to worry about writing that either.

DB development just became a whole lot easier for my users. Yay!

Thanks for everyone's input.

~Travis
Sep 23 '08 #5


Hmm... make sure you are protecting against SQL injection attacks.
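
One way to do that, sketched in C# under some assumptions: rewrite each {n} placeholder to a named parameter and hand the values to the database command as parameters, so they never become part of the SQL text. The class name, the Bind method and the @p naming scheme are made up for illustration; the parameter prefix in particular depends on the database provider.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class PlaceholderBinder
{
    private static readonly Regex Placeholder = new Regex(@"\{([0-9]+)\}");

    // Rewrites {n} to @pn and records the name/value pairs that should be
    // bound as command parameters. The values are never concatenated
    // into the returned SQL fragment.
    public static string Bind(string condition, object[] args,
                              IDictionary<string, object> parameters)
    {
        return Placeholder.Replace(condition, m =>
        {
            int index = int.Parse(m.Groups[1].Value, CultureInfo.InvariantCulture);
            if (index >= args.Length)
            {
                throw new FormatException("No argument supplied for " + m.Value);
            }
            string name = "@p" + index;
            parameters[name] = args[index];
            return name;
        });
    }
}
```

The property names in the condition deserve the same care: looking each one up in the class-to-table mapping and rejecting anything that is not found keeps arbitrary text out of the generated statement.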

Mythran
Sep 24 '08 #6
