Regular Expressions in Python

fossil_blue

Dear Gurus,

I am trying to find out how to write an effective regular expression
in Python for the following scenario:

"any number of leading spaces at the beginning of a line", "followed
by a string", "there may be a string that starts with *"

for example:

END *This is a comment

but I don't want to match this:

END e * This is a line with an error (e)

thanks,
Noel

Jul 18 '05 #1


Jeff Epler

import re

opt_spaces = " *"
identifier = "[A-Za-z_][A-Za-z0-9_]+"
comment = r"\*.*"
opt_comment = "(%s)?" % comment

# leading spaces, keyword, optional spaces, optional comment, end of line
pat = re.compile(opt_spaces + identifier + opt_spaces + opt_comment + "$")

for test in (
        " END *This is a comment",
        " END e * This is a line with an error (e)"):
    print(test, pat.match(test))
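
One detail of the pattern above is worth checking: because the identifier ends in +, it needs at least two characters, so a one-letter keyword is rejected. The thread does not say whether one-letter keywords should be accepted. The sketch below assumes they should and swaps + for *; the names pat_plus and pat_star are illustrative only.

```python
import re

opt_spaces = " *"
opt_comment = r"(\*.*)?"

# Identifier as posted: first character plus ONE OR MORE further characters.
pat_plus = re.compile(
    opt_spaces + "[A-Za-z_][A-Za-z0-9_]+" + opt_spaces + opt_comment + "$")
# Assumed variant: first character plus ZERO OR MORE further characters.
pat_star = re.compile(
    opt_spaces + "[A-Za-z_][A-Za-z0-9_]*" + opt_spaces + opt_comment + "$")

for test in (
        " X *one-letter keyword",
        " END *This is a comment",
        " END e * This is a line with an error (e)"):
    # Show whether each variant accepts the line.
    print(repr(test), bool(pat_plus.match(test)), bool(pat_star.match(test)))
```

Both variants still reject the "END e" line, since nothing but an optional comment may follow the keyword.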

Jeff

Jul 18 '05 #2

Paul McGuire

"Jeff Epler" <je****@unpythonic.net> wrote in message
news:ma*************************************@python.org...

Assuming you're more interested in the identifier than in the comment,
change identifier to "([A-Za-z_][A-Za-z0-9_]+)" so that the keyword is
captured as a group and can be read from pat.match(test).groups().
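
A minimal sketch of that change, reusing the pattern from the earlier post. The variable names m, keyword and comment_text are illustrative and not from the thread.

```python
import re

opt_spaces = " *"
identifier = "([A-Za-z_][A-Za-z0-9_]+)"   # parentheses make the keyword group 1
comment = r"\*.*"
opt_comment = "(%s)?" % comment           # the optional comment is group 2

pat = re.compile(opt_spaces + identifier + opt_spaces + opt_comment + "$")

m = pat.match(" END *This is a comment")
if m:
    # groups() returns a tuple with one entry per parenthesised group
    keyword, comment_text = m.groups()
    print(keyword)
    print(comment_text)
```

When the line has no comment, the second group did not take part in the match, so its entry in the tuple is None.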

-- Paul

Jul 18 '05 #3

Paul McGuire

"fossil_blue" <no********@excite.com> wrote in message
news:c7**************************@posting.google.com...

Here's sample code using both the re module and pyparsing. Note that
the single .ignore() call takes care of skipping comments in all contained
grammar constructs, and non-significant whitespace is skipped implicitly (so
there is no need to litter your matching expressions with lots of
opt_spaces-type content).

-- Paul
========================
import re

from pyparsing import Word, alphas, alphanums, restOfLine, LineEnd, ParseException

testdata = """
END *This is a comment
 END*This is a comment (but the next line has no comment)
 END
 END e * This is a line with an error (e)"""
enquote = lambda st: '"%s"' % st

print("test with pyparsing")
# a keyword followed by end of line; comments are skipped wherever they occur
grammar = Word(alphas, alphanums).setName("keyword") + LineEnd()
comment = "*" + restOfLine
grammar.ignore(comment)

for test in testdata.split("\n"):
    try:
        print(enquote(test), "\n->", end=" ")
        print(grammar.parseString(test))
    except ParseException as pe:
        print(pe)

print()

print("test with re")
opt_spaces = " *"
# identifier = "[A-Za-z_][A-Za-z0-9_]+" - I'm guessing this regexp should
# have ()'s for accessing content as a group
identifier = "([A-Za-z_][A-Za-z0-9_]+)"
comment = r"\*.*"
opt_comment = "(%s)?" % comment

pat = re.compile(opt_spaces + identifier + opt_spaces + opt_comment + "$")

for test in testdata.split("\n"):
    print(enquote(test), "\n->", end=" ")
    match = pat.match(test)
    if match:
        print(match.groups())
    else:
        print("Bad text")


========================
Gives this output:

test with pyparsing
""
-> Expected keyword (0), (1,1)
"END *This is a comment"
-> ['END']
" END*This is a comment (but the next line has no comment)"
-> ['END']
" END"
-> ['END']
" END e * This is a line with an error (e)"
-> Expected end of line (8), (1,9)

test with re
""
-> Bad text
"END *This is a comment"
-> ('END', '*This is a comment')
" END*This is a comment (but the next line has no comment)"
-> ('END', '*This is a comment (but the next line has no comment)')
" END"
-> ('END', None)
" END e * This is a line with an error (e)"
-> Bad text
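
The re half of the example can also be written as one commented pattern. This is a sketch only: it uses re.VERBOSE and named groups instead of string concatenation, and the names LINE_RE, parse_line, keyword and comment are hypothetical, not from the code above.

```python
import re

# Same pattern as above, spelled out with re.VERBOSE so each part can carry
# a comment. Literal spaces must be written as [ ] in verbose mode.
LINE_RE = re.compile(r"""
    ^[ ]*                                 # any number of leading spaces
    (?P<keyword>[A-Za-z_][A-Za-z0-9_]+)   # the keyword
    [ ]*                                  # optional spaces after the keyword
    (?P<comment>\*.*)?                    # optional comment, begins with an asterisk
    $                                     # nothing else may follow
""", re.VERBOSE)


def parse_line(line):
    """Return (keyword, comment) for a well-formed line, or None otherwise."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return m.group("keyword"), m.group("comment")


for line in (
        "END *This is a comment",
        " END",
        " END e * This is a line with an error (e)"):
    print(repr(line), "->", parse_line(line))
```

Returning None for a bad line plays the same role as the "Bad text" branch in the loop above.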

Jul 18 '05 #4
