Regular Expression help

Hello,

Please forgive if this is not the most appropriate newsgroup for this
question. Unfortunately I didn't find a newsgroup specific to regular
expressions.

I have the following regular expression.

^(.+?) uses (?!a spoon)\.$

I know something is wrong with this, but essentially what I want to do
is match all strings of the form

^(.+?) uses (.+?)\.$

but only if the text inside that last group is not the string "a
spoon".

When this is tested against the string

Jim uses a fork.

the test fails. It says "Jim uses a fork." does not match the regular
expression "^(.+?) uses (?!a spoon)\.$"

Am I missing something obvious here?

Thanks for any help.

May 27 '06 #1
Zach wrote:
When this is tested against the string

Jim uses a fork.

the test fails. It says "Jim uses a fork." does not match the regular
expression "^(.+?) uses (?!a spoon)\.$"

Am I missing something obvious here?


In your pattern you kind of say "don't match 'a spoon'", but you never tell it
to match anything else in its place; that's why "Jim uses a fork." fails.

I think this should work:

string pattern = @"^(.+?) uses (?!a spoon).*?\.$";

That says "don't match 'uses a spoon', but match 'uses anything else'".
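
For anyone who wants to check the difference quickly, here is a minimal sketch. It uses Python's `re` module rather than .NET, on the assumption that lookaheads and lazy quantifiers behave the same way for these inputs; the names `ORIGINAL`, `SUGGESTED` and `matches` are made up for the illustration.

```python
import re

# The pattern from the question and the suggested fix.
ORIGINAL = r"^(.+?) uses (?!a spoon)\.$"
SUGGESTED = r"^(.+?) uses (?!a spoon).*?\.$"


def matches(pattern, text):
    """Return True if the pattern matches anywhere in the text."""
    return re.search(pattern, text) is not None


# After " uses " the original expects a period immediately, so it fails.
print(matches(ORIGINAL, "Jim uses a fork."))    # False
# The fix lets ".*?" consume "a fork" before the final period.
print(matches(SUGGESTED, "Jim uses a fork."))   # True
# The negative lookahead still rejects "a spoon".
print(matches(SUGGESTED, "Jim uses a spoon."))  # False
```

Note that the lookahead rejects anything that merely starts with "a spoon", so "Jim uses a spoonful." would be rejected as well.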

hth,
Max
May 28 '06 #2
Ahh, thanks. After reading your response I played around with it a
little more. I guess now I know why they're called "zero-width"
assertions. I guess I had thought that saying (?!a spoon)\. would
consume everything from the current location to the next period
character, but only if that was not equal to "a spoon". But it appears
to only verify that the string does not follow, without moving the
current position forward (hence the zero-width). The expression you
suggested works, thanks.
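
A small sketch of the zero-width behaviour described above, written with Python's `re` module as a stand-in for .NET (an assumption; the lookahead semantics are the same for these cases). The variable names are only for illustration.

```python
import re

# A negative lookahead on its own succeeds without consuming anything:
# the match starts and ends at position 0.
only_lookahead = re.match(r"(?!a spoon)", "a fork.")
print(only_lookahead.span())  # (0, 0)

# When the forbidden text does follow, the assertion fails outright.
print(re.match(r"(?!a spoon)", "a spoon."))  # None

# Because the position has not moved, "\." is tried against the "a"
# of "a fork." and fails. This is the failure from the original pattern.
print(re.match(r"(?!a spoon)\.", "a fork."))  # None

# Consuming the text explicitly after the assertion makes it work.
consumed = re.match(r"(?!a spoon).*?\.", "a fork.")
print(consumed.group(0))  # a fork.
```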

May 28 '06 #3
Okay, you're playing a little fast and loose with your rules, so let me start
with my best guess as to what they are:

First, you state that you want to match "all strings of the form..." This
does not make it clear whether these "strings" have multiple lines in them.
But based upon your syntax, which employs the '^' (Start of string) and '$'
(end of string) characters, I'm going to assume that perhaps you will be
working with multi-lined text (otherwise, at least one of the start of
string and end of string characters would be irrelevant). Based upon that
assumption, my derived rule begins thus:

A match begins at the start of a line, and ends at the end of a line (not
only at the start and end of a string)

Next, your regular expression indicates that a run of non-line-break characters
is a match, *as long as* it is followed by [the string "uses " *not* followed
by the string "a spoon" *but* followed by some other characters], and the whole
ends with a period.

This breaks down into the following rule set:

1. A match begins at the beginning of a line.
2. It begins with at least one non-line-break character.
3. That sequence of character or characters *must* be followed by the
literal "uses ".
4. The literal "uses " *must not* be followed by the literal "a spoon".
5. The literal "uses " *must* be followed by at least one non-line-break
character.
6. The total sequence *must* end with a period.
7. The period *must* be at the end of a line.

Note that if any of these conditions fail, the entire match fails.

This evaluates to the following regular expression:

(?m)^.+(?=uses (?!a spoon)).+\.$

The first part ("(?m)") indicates that the characters '^' and '$' also match at
line breaks, rather than only at the beginning and end of a string. You may want
to change this if you're only evaluating a single string, rather than a
series of lines in a single string.
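
To see what the "(?m)" option changes, here is a rough sketch using Python's `re` module (an assumption on my part that it stands in well enough for .NET here; the inline "(?m)" flag is spelled the same way). The two-line sample text is made up.

```python
import re

# Two lines in a single string.
text = "He uses a fork.\nShe uses a spoon."

# Without (?m): '^' matches only at the start of the string and '$' only
# at the end, and '.' cannot cross the line break, so nothing matches.
without_flag = re.findall(r"^.+\.$", text)
print(without_flag)  # []

# With (?m): '^' and '$' also match at each line break, so every line
# is tested on its own.
with_flag = re.findall(r"(?m)^.+\.$", text)
print(with_flag)  # ['He uses a fork.', 'She uses a spoon.']
```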

This is followed by '^' (must begin at the beginning of a line or string).

This is followed by '.' with the quantifier '+' (one or more of any
non-line-break character)

This is followed by a positive look-ahead, which states that this must be
followed by the literal "uses ". In addition, the literal "uses " is
followed by a negative look-ahead which prohibits the match if "uses " is
followed by the literal "a spoon". Both conditions must be true in order to
match the first part of the regular expression (must be followed by "uses "
*not* followed by "a spoon").

This is followed by '.' with the quantifier '+' (one or more of any
non-line-break character), meaning that the word "uses " must be followed by
one or more non-line-break characters.

This is followed by "\." (a period, one time), followed by a line break or
end of string.
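
The same breakdown can be written as a commented pattern. This is only a sketch in Python's `re` module (assumed here to behave like .NET for these constructs), using its VERBOSE option so each piece can carry a comment; the names `LINE_PATTERN` and `COMPACT` are invented for the example. In verbose mode a literal space has to be written as "[ ]".

```python
import re

# The compact pattern from the post, for comparison.
COMPACT = re.compile(r"(?m)^.+(?=uses (?!a spoon)).+\.$")

# The same pattern, one piece per line.
LINE_PATTERN = re.compile(
    r"""
    ^                   # start of a line (MULTILINE is set below)
    .+                  # one or more non-line-break characters
    (?=                 # look ahead without consuming:
        uses[ ]         #   the literal "uses " must come next
        (?!a[ ]spoon)   #   and must not be followed by "a spoon"
    )
    .+                  # one or more characters, starting at "uses "
    \.                  # a literal period
    $                   # end of a line
    """,
    re.MULTILINE | re.VERBOSE,
)

for line in ["he uses a fork.", "he uses a spoon.", "uses a fork."]:
    same = bool(LINE_PATTERN.search(line)) == bool(COMPACT.search(line))
    print(line, bool(LINE_PATTERN.search(line)), same)
```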

I tested it with the following (the ones with numbers matched):

blah hoiuyy uses ouhsd7)) u. [1]
d;fj uses a fork uses spoon. [2]
lkjhlkjh uses a spoon.
usese a soon or a fork and is the glliiig.
spen duh a spoon
he uses a spoon.
he uses a fork. [3]
popiu hig spoon uses something. [4]
She uses.
uses a fork.
use a spoon.
He uses a spoon, but also uses a fork. [5]
She uses a spoon, not a fork.
He uses forks, knives, and a spoon. [6]
He uses a fork
or a spoon.

These matched because "uses " is not the first word in the line, is followed
by a sequence of characters that is not "a spoon", and the line ends with a
period.

Some other notable results:

"She uses." The word "uses" is not followed by any characters before the
period.
"uses a fork." The word "uses" is the first word on the line (not preceded
by any characters).
Number 5 succeeds because of the phrase "uses a fork". The phrase "uses a
spoon" is considered as part of the first sequence of "any non-line-break
character". Since this is followed by "uses a fork", that phrase causes the
whole to match.
The last 2 lines do not match because of the line break.
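
The test above can be reproduced with a short script. This sketch uses Python's `re` module instead of .NET (an assumption; the constructs involved behave the same way for this input), and the names `PATTERN`, `SAMPLE_LINES` and `matching_lines` are made up for the illustration.

```python
import re

PATTERN = re.compile(r"(?m)^.+(?=uses (?!a spoon)).+\.$")

# The test lines from the post, without the [n] markers.
SAMPLE_LINES = [
    "blah hoiuyy uses ouhsd7)) u.",
    "d;fj uses a fork uses spoon.",
    "lkjhlkjh uses a spoon.",
    "usese a soon or a fork and is the glliiig.",
    "spen duh a spoon",
    "he uses a spoon.",
    "he uses a fork.",
    "popiu hig spoon uses something.",
    "She uses.",
    "uses a fork.",
    "use a spoon.",
    "He uses a spoon, but also uses a fork.",
    "She uses a spoon, not a fork.",
    "He uses forks, knives, and a spoon.",
    "He uses a fork",
    "or a spoon.",
]


def matching_lines(lines):
    """Join the lines into one multi-line string and return every match."""
    text = "\n".join(lines)
    return [m.group(0) for m in PATTERN.finditer(text)]


for number, line in enumerate(matching_lines(SAMPLE_LINES), start=1):
    print(f"[{number}] {line}")

# Number 5: capture the leading greedy run to see where it stops.
prefix = re.match(
    r"^(.+)(?=uses (?!a spoon)).+\.$",
    "He uses a spoon, but also uses a fork.",
).group(1)
print(repr(prefix))  # 'He uses a spoon, but also '
```

The captured prefix shows that "uses a spoon" is swallowed by the first ".+", and the lookahead is satisfied by the later "uses a fork".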

--
HTH,

Kevin Spencer
Microsoft MVP
Professional Numbskull

This is, by definition, not that.

May 28 '06 #4
