RegExp question

Michael McGarry

Hi,

I would like to form a regular expression to find a few different
tokens (and, or, xor) followed by some variable number of whitespace
(i.e., tabs and spaces) followed by a hash mark (i.e., #). What would
be the regular expression for this?

Thanks for any help,

Michael

Apr 11 '06 #1

Tim Chase

> I would like to form a regular expression to find a few
> different tokens (and, or, xor) followed by some variable
> number of whitespace (i.e., tabs and spaces) followed by
> a hash mark (i.e., #). What would be the regular
> expression for this?

(and|or|xor)\s*#

Unless "variable number of whitespace" means "at least *some*
whitespace", in which case you'd want to use

(and|or|xor)\s+#

Both are beautiful and precise.
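
The difference between the two patterns can be checked with a few lines of Python. This is only a small sketch: the sample strings and the names optional_ws and required_ws are made up for illustration and are not part of the original question.

```python
import re

# Tim's two patterns: \s* allows zero or more whitespace characters,
# \s+ requires at least one.
optional_ws = re.compile(r"(and|or|xor)\s*#")
required_ws = re.compile(r"(and|or|xor)\s+#")

# Illustrative inputs only.
samples = ["a and #x", "a and#x", "a xor \t #x", "a and b #x"]

for text in samples:
    # Show whether each pattern finds a match anywhere in the string.
    print(repr(text),
          bool(optional_ws.search(text)),
          bool(required_ws.search(text)))
```

Only "a and#x" separates the two: the \s* form matches it and the \s+ form does not. The last sample matches neither, because "b" sits between the token and the hash mark.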

-tim

Apr 11 '06 #2

Michael McGarry

Tim,

for some reason that does not seem to do the trick.

I am testing it with grep. (i.e., grep -e '(and|or|xor)\s*#' myfile)

Michael

Apr 11 '06 #3

Paul McGuire

"Michael McGarry" <mi*************@gmail.com> wrote in message
news:11**********************@t31g2000cwb.googlegroups.com...

> Hi,
>
> I would like to form a regular expression to find a few different
> tokens (and, or, xor) followed by some variable number of whitespace
> (i.e., tabs and spaces) followed by a hash mark (i.e., #). What would
> be the regular expression for this?
>
> Thanks for any help,
>
> Michael

Using pyparsing, whitespace is implicitly ignored. Your expression would
look like:

oneOf("and or xor") + Literal("#")

Here's a complete example:

from pyparsing import *

pattern = oneOf("and or xor") + Literal("#")

testString = """
z = (a and b) and #XVAL;
q = z xor #YVAL;
"""

# use scanString to locate matches
for tokens, start, end in pattern.scanString(testString):
    print(tokens[0], tokens.asList())
    print(line(start, testString))
    print((" " * (col(start, testString) - 1)) + "^")
    print()
print()

# use transformString to locate matches and substitute values
subs = {
    'XVAL': 0,
    'YVAL': True,
}

def replaceSubs(st, loc, toks):
    # toks is [token, '#', name]; look the name up in subs
    try:
        return toks[0] + " " + str(subs[toks[2]])
    except KeyError:
        # unknown name: fall through and return None,
        # leaving the tokens as parsed
        pass

pattern2 = (pattern + Word(alphanums)).setParseAction(replaceSubs)
print(pattern2.transformString(testString))

-----------------
Prints:
and ['and', '#']
z = (a and b) and #XVAL;
^

xor ['xor', '#']
q = z xor #YVAL;
^
z = (a and b) and 0;
q = z xor True;
Download pyparsing at http://pyparsing.sourceforge.net.

-- Paul

Apr 11 '06 #4

Heiko Wundram

On Tuesday, 11 April 2006 at 21:16, Michael McGarry wrote:

> I am testing it with grep. (i.e., grep -e '(and|or|xor)\s*#' myfile)

Test it with Python's re module, then. \s for matching whitespace is specific
to Python (AFAIK). And as you've asked in a Python newsgroup, you'll get
Python answers here.
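
For anyone who wants grep-like behaviour while staying inside Python's re module, a minimal sketch could look like the following. The helper name grep_lines and the sample lines are hypothetical; nothing here comes from the original posts beyond the pattern itself.

```python
import re

def grep_lines(pattern, lines):
    """Return (line_number, line) pairs for every line the pattern matches."""
    regex = re.compile(pattern)
    return [(number, line.rstrip("\n"))
            for number, line in enumerate(lines, start=1)
            if regex.search(line)]

# Illustrative input; any iterable of lines works.
sample = [
    "z = (a and b) and #XVAL;\n",
    "q = z xor #YVAL;\n",
    "r = a or b;\n",
]

for number, line in grep_lines(r"(and|or|xor)\s*#", sample):
    print(number, line)
```

Because an open file is an iterable of lines, the same helper could be called on one, e.g. inside "with open('myfile') as f:", assuming such a file exists.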

--- Heiko.

Apr 11 '06 #5

RunLevelZero

In my opinion you would be best to use a tool like Kiki.
http://project5.freezope.org/kiki/index.html/#

This will allow you to paste in the actual text you want to search and
then play with different RE's and set flags with a simple mouse click
so you can find just what you want. Remember what re.DOTALL does: it
makes "." match newline characters too, so a pattern can follow line
breaks; without it, "." stops at the end of a line. It's a good idea to have a grasp
of regular expressions or when you come back to your code months /
weeks later, you will be just as lost, and always comment them very
well :).
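
As a rough illustration of both points, the sketch below shows what re.DOTALL changes and one way to comment a pattern with re.VERBOSE. The sample text and the name token_before_hash are made up for the example.

```python
import re

# Illustrative text with a line break between the token and the hash mark.
text = "x = a and\n#continued"

# Without re.DOTALL, '.' does not match a newline, so this finds nothing.
print(re.search(r"and.#", text))

# With re.DOTALL, '.' also matches the newline between "and" and "#".
print(re.search(r"and.#", text, re.DOTALL))

# re.VERBOSE lets the pattern carry its own comments.
token_before_hash = re.compile(r"""
    (and|or|xor)   # one of the three tokens
    \s*            # optional whitespace, which already includes newlines
    \#             # a literal hash mark, escaped because of verbose mode
    """, re.VERBOSE)

print(token_before_hash.search(text).group(1))
```

Note that \s matches newlines regardless of re.DOTALL; the flag only affects ".".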

Just my 2¢

Apr 11 '06 #6

Ben C

On 2006-04-11, Michael McGarry <mi*************@gmail.com> wrote:

> Hi,
>
> I would like to form a regular expression to find a few different
> tokens (and, or, xor) followed by some variable number of whitespace
> (i.e., tabs and spaces) followed by a hash mark (i.e., #). What would
> be the regular expression for this?

re.compile(r'(?:and|or|xor)\s*#')
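
The (?:...) form groups the alternatives without capturing them. A short sketch of the practical difference, using an invented sample string and the illustrative names capturing and non_capturing:

```python
import re

capturing = re.compile(r"(and|or|xor)\s*#")
non_capturing = re.compile(r"(?:and|or|xor)\s*#")

# Illustrative input only.
text = "q = z xor  #YVAL;"

m1 = capturing.search(text)
m2 = non_capturing.search(text)

# Both match the same span of text...
print(repr(m1.group(0)), repr(m2.group(0)))

# ...but only the capturing version remembers which token was found.
print(m1.groups(), m2.groups())
```

If the matched token is needed afterwards, the capturing form is the convenient one; if only the presence of a match matters, either form will do.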

Apr 11 '06 #7

Tim Chase

> I am testing it with grep. (i.e., grep -e '(and|or|xor)\s*#' myfile)

Well, you asked for the python regexp...different
environments use different regexp parsing engines. Your
response is akin to saying "the example snippet of python
code you gave me doesn't work in my Pascal program".

For grep:

grep '\(and\|or\|xor\)[[:space:]]*#' myfile

For Vim:

:g/\(and\|or\|xor\)\s*#/

The one I gave originally is a python regexp, and thus
should be tested within python, not grep or vim or emacs or
sed or whatever.

It's always best to test in the real
environment...otherwise, you'll get flakey results.

-tkc

Apr 11 '06 #8

John Machin

(-:
Sorry about Tim. He's not very imaginative. He presumed that because
you asked on comp.lang.python that you would be testing it with Python.
You should have either (a) asked your question on
comp.toolswithfunnynames.grep or (b) not presumed that grep's re syntax
is the same as Python's.
:-)

My grep appears to need something fugly like this:

grep -e "\(and\|or\|xor\)[ \t]*#" grepre.txt

but my grep is a Windows port which identifies itself as "grep (GNU
grep) 2.5.1" so it's definitely not The One True Grep ...

Now that you're here, why don't you try Python? It's not hard, e.g.

#>>> import re
#>>> rs = re.compile(r"(and|or|xor)\s*#").search
#>>> rs("if foo and #continued")
#<_sre.SRE_Match object at 0x00AE66E0>
#>>> rs("if foo and#continued")
#<_sre.SRE_Match object at 0x00AE6620>
#>>> rs("if foo and bar #continued")
#>>> rs("if foo xor # continued")
#<_sre.SRE_Match object at 0x00AE66E0>
#>>>

HTH,
John

Apr 11 '06 #9

Ben C

On 2006-04-11, Michael McGarry <mi*************@gmail.com> wrote:

> Tim,
>
> for some reason that does not seem to do the trick.
>
> I am testing it with grep. (i.e., grep -e '(and|or|xor)\s*#' myfile)

Try with grep -P, which means use perl-compatible regexes as opposed to
POSIX ones. I only know for sure that -P exists for GNU grep.

I assumed it was a Python question! Unless you're testing your Python
regex with grep, not realizing they're different.

Perl and Python regexes are (mostly?) the same.

I usually grep -P because I know Python regexes better than any other
ones.

Apr 11 '06 #10

John Machin

Precise? The OP asked for "tokens".

#>>> re.search(r"(and|or|xor)\s*#", "a = the_operand # gotcha!")
#<_sre.SRE_Match object at 0x00AE6620>

Try this:

#>>> re.search(r"\b(and|or|xor)\s*#", "a = the_operand # should fail")
#>>> re.search(r"\b(and|or|xor)\s*#", "and # OK")
#<_sre.SRE_Match object at 0x00AE6E60>
#>>> re.search(r"\b(and|or|xor)\s*#", "blah blah and # OK")
#<_sre.SRE_Match object at 0x00AE66E0>
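
The same comparison can be run over several inputs at once. This is a sketch only: the names loose and strict and the extra sample strings (such as "vendor#1") are invented for illustration.

```python
import re

loose = re.compile(r"(and|or|xor)\s*#")       # no word boundary
strict = re.compile(r"\b(and|or|xor)\s*#")    # token must start at a word boundary

# Illustrative inputs only.
samples = [
    "a = the_operand # gotcha",
    "vendor#1",
    "if foo and # ok",
    "x or\t#y",
]

for text in samples:
    # Print whether each pattern finds a match in the string.
    print(repr(text), bool(loose.search(text)), bool(strict.search(text)))
```

Only the left side needs an explicit \b here: the token is always followed by whitespace or "#", neither of which is a word character, so its right edge is already a boundary.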

Apr 11 '06 #11
