Hi,

I would like to form a regular expression to find a few different
tokens (and, or, xor) followed by a variable amount of whitespace
(i.e., tabs and spaces) followed by a hash mark (i.e., #). What would
be the regular expression for this?

Thanks for any help,

Michael

With pyparsing, whitespace between tokens is skipped implicitly. Your
expression would look like:

oneOf("and or xor") + Literal("#")

Here's a complete example:

from pyparsing import Literal, Word, alphanums, col, line, oneOf

pattern = oneOf("and or xor") + Literal("#")

testString = """
z = (a and b) and #XVAL;
q = z xor #YVAL;
"""

# use scanString to locate matches
for tokens, start, end in pattern.scanString(testString):
    print(tokens[0], tokens.asList())
    print(line(start, testString))
    print(" " * (col(start, testString) - 1) + "^")
    print()
print()

# use transformString to locate matches and substitute values
subs = {
    'XVAL': 0,
    'YVAL': True,
}

def replaceSubs(st, loc, toks):
    # toks is [keyword, '#', name]; swap "#name" for its value in subs
    try:
        return toks[0] + " " + str(subs[toks[2]])
    except KeyError:
        # unknown name: return None so pyparsing keeps the original tokens
        pass

pattern2 = (pattern + Word(alphanums)).setParseAction(replaceSubs)
print(pattern2.transformString(testString))

-----------------

Prints:

and ['and', '#']
z = (a and b) and #XVAL;
              ^

xor ['xor', '#']
q = z xor #YVAL;
      ^



z = (a and b) and 0;
q = z xor True;
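
Since the original question asked for a regular expression, here is a rough equivalent using the standard re module, offered as a sketch rather than a drop-in replacement for the pyparsing version. It assumes zero or more spaces or tabs are allowed before the hash mark (change * to + to require at least one) and that the keywords must start on a word boundary. The names TOKEN_HASH and test_string are made up for illustration.

```python
import re

# Assumed pattern: one of the keywords starting on a word boundary,
# then any run of spaces/tabs (possibly empty), then a hash mark.
TOKEN_HASH = re.compile(r"\b(and|or|xor)[ \t]*#")

test_string = """
z = (a and b) and #XVAL;
q = z xor #YVAL;
"""

# Collect (keyword, offset) pairs for every match in the test string.
matches = [(m.group(1), m.start()) for m in TOKEN_HASH.finditer(test_string)]

for keyword, offset in matches:
    print(keyword, offset)
```

The first "and" in the test string is skipped because it is followed by "b" rather than "#", which mirrors what the pyparsing pattern does.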

Download pyparsing at http://pyparsing.sourceforge.net.

-- Paul