Error checking using regex?

I have the code below which parses an expression string and creates tokens.

Can anyone suggest the best way of doing error checking for things like:

A valid variable has the form obj.attribute only (whitespace is allowed)

test( "ff*2/dd.r..ss r") #additional ..ss -invalid variable.
test( "ff*$24..55/ddr") #double .. and $ -invalid number
test( "ff*2/dd.r.ss r") #variable with double . -invalid variable

I can't see an efficient way of doing this so any suggestions appreciated.

TIA,

Guy

code:

import re
import time

re_par = '[\(\)]'
re_num = '[0-9]*\.?[0-9]+\E?[0-9]*'
re_opr = '[\*\/\+\-\^]'
re_cns = 'PI'
re_trg = 'SIN|COS|TAN|ASIN|ACOS|ATAN|SGN'
re_var = '[a-z_0-9\s]*\.?[a-z_0-9\s]*'

recom = re.compile( '(?P<token>%s|%s|%s|%s|%s|%s)'
%(re_par,re_num,re_opr,re_cns,re_trg,re_var) ,re.VERBOSE|re.IGNORECASE)

def test(str):
    output = []
    try:
        r = recom.split(str)
        for rr in r:
            rr = rr.strip()
            #test for blank string
            if rr =='':
                pass
            else:
                output.append(rr)
        print output

    except:
        print 'error of some kind'

class stopwatch:

    def __init__(self):
        pass

    def start(self):
        self.t = time.time()
        return 'starting timer'

    def stop(self):
        rstr = 'stopped at %f seconds' %(time.time() -self.t)
        self.t = 0
        return rstr

e = stopwatch()
print e.start()
test( "9" )
test( "9 + 3 + 6" )
test( "9 + 3 / 11" )
test( "( 9 + 3)" )
test( "(9+3) / 11" )
test( "9 - 12 - 6" )
test( "-9 - (12 - 6)" )
test( "2*3.14159" )
test( "3.1415926535*3.1415926535 / 10" )
test( "PI * PI / 10" )
test( "PI*PI/10" )
test( "PI^2" )
test( "6.02E23 * 8.048" )
test( "sin(PI/2)" )
test( "2^3^2" )
test( "2^9" )
test( "sgn(-2)" )
test( "sgn(0)" )
test( "sgn(0.1)" )
test( "ff*2" )
test( "ff*g g/2" )
test( "ff*2/dd.r r")
test( "5*4+300/(5-2)*(6+4)+4" )
test( "((5*4+300)/(5-2))*(6+4)+4" )
test( "(320/3)*10+4" )

#now test error expressions

test( "ff*2/dd.r..ss r") #additional ..ss and whitespace -invalid variable
test( "ff*$24..55/ddr") #double .. -invalid number
test( "ff*2/dd.r.ss r") #variable with double . -invalid variable
#test( "ff*((w.w+3)-2") #no closing parentheses-to be tested when evaluating expression

print e.stop()
Jul 18 '05 #1
On Tuesday, 8 June 2004 13:26, Guy Robinson wrote:
I have the code below which parses an expression string and creates tokens.


You cannot parse expressions using regular expressions, nor check them for
errors, because the class of languages described by regular expressions is
not powerful enough to match nested parentheses (see any introductory text
on formal languages and automata: you need a machine with a stack, such as a
deterministic pushdown automaton, to check that parentheses match).
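
To make the point about needing a stack concrete, here is a minimal sketch (written for Python 3) of a parenthesis check that a plain regular expression cannot express. The function name check_parens and the wording of its messages are illustrative assumptions, not part of the code posted in this thread.

```python
def check_parens(expr):
    """Return None if parentheses balance, otherwise an error message."""
    stack = []  # positions of the '(' characters that are still open
    for pos, ch in enumerate(expr):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                return "unmatched ')' at position %d" % pos
            stack.pop()
    if stack:
        return "unclosed '(' at position %d" % stack[-1]
    return None


# The commented-out test case from the original post:
print(check_parens("ff*((w.w+3)-2"))
print(check_parens("((5*4+300)/(5-2))*(6+4)+4"))
```

The list of open positions is the "stack" in question; keeping positions rather than a bare counter makes it possible to say where the unclosed parenthesis is.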

Your best bet for checking an expression, and for parsing it, is to write a
context-free grammar for your syntax, try to parse the string you're
evaluating, and, if parsing fails, report that the expression is invalid.
If you're parsing Python expressions, your best bet is to call the built-in
compile() function (which creates a code object from a Python expression;
the code object can then be run using eval or exec).
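
As a rough illustration of the compile() suggestion, the sketch below (Python 3) lets Python's own parser decide whether a string is a valid expression. The helper name check_expression and its return convention are assumptions made for this example. Note that it only applies when the expression syntax really is Python's: for instance, ^ means XOR in Python, not exponentiation.

```python
def check_expression(source, names=None):
    """Compile source as a Python expression.

    Returns (True, value) on success or (False, message) on a syntax error.
    """
    try:
        code = compile(source, "<expression>", "eval")
    except SyntaxError as exc:
        return False, "syntax error at column %s: %s" % (exc.offset, exc.msg)
    # eval() executes arbitrary code, so only use this on trusted input.
    return True, eval(code, {}, dict(names or {}))


print(check_expression("(9+3) * 2"))
print(check_expression("ff*2", {"ff": 3}))
print(check_expression("ff*((w+3)-2")[0])
print(check_expression("ff*$24..55/ddr")[0])
```

Names that are syntactically valid but missing from the names mapping still raise NameError at evaluation time, so a caller would need to handle that separately.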

HTH!

Heiko.

Jul 18 '05 #2
Guy -

Well, I recognize the test cases from an example that I include with
pyparsing. Are you trying to add support for variables to that example? If
so, here is the example, modified to support assignments to variables.

-- Paul

============================
# minimath.py (formerly fourfn.py)
#
# Demonstration of the parsing module, implementing a simple 4-function
# expression parser, with support for scientific notation, and symbols
# for e and pi.
# Extended to add exponentiation and simple built-in functions.
# Extended to add variable assignment, storage, and evaluation, and
# Python-like comments.
#
# Copyright 2003,2004 by Paul McGuire
#
from pyparsing import Literal,CaselessLiteral,Word,Combine,Group,Optional,\
    ZeroOrMore,OneOrMore,Forward,nums,alphas,restOfLine,delimitedList
import math

variables = {}
exprStack = []

def pushFirst( str, loc, toks ):
    global exprStack
    if toks:
        exprStack.append( toks[0] )
    return toks

def assignVar( str, loc, toks ):
    global exprStack
    global variables
    variables[ toks[0] ] = evaluateStack( exprStack )
    pushFirst(str,loc,toks)

bnf = None
def BNF():
    global bnf
    if not bnf:
        point = Literal( "." )
        e = CaselessLiteral( "E" )
        fnumber = Combine( Word( "+-"+nums, nums ) +
                           Optional( point + Optional( Word( nums ) ) ) +
                           Optional( e + Word( "+-"+nums, nums ) ) )
        ident = Word(alphas, alphas+nums+"_$")
        varident = delimitedList(ident,".",combine=True)

        plus = Literal( "+" )
        minus = Literal( "-" )
        mult = Literal( "*" )
        div = Literal( "/" )
        lpar = Literal( "(" ).suppress()
        rpar = Literal( ")" ).suppress()
        addop = plus | minus
        multop = mult | div
        expop = Literal( "^" )
        pi = CaselessLiteral( "PI" )

        expr = Forward()
        atom = ( pi | e | fnumber | ident + lpar + expr + rpar |
                 varident ).setParseAction( pushFirst ) | ( lpar + expr.suppress() + rpar )
        factor = atom + ZeroOrMore( ( expop + expr ).setParseAction( pushFirst ) )
        term = factor + ZeroOrMore( ( multop + factor ).setParseAction( pushFirst ) )
        expr << term + ZeroOrMore( ( addop + term ).setParseAction( pushFirst ) )
        assignment = (varident + "=" + expr).setParseAction( assignVar )

        bnf = Optional( assignment | expr )

        comment = "#" + restOfLine
        bnf.ignore(comment)

    return bnf

# map operator symbols to corresponding arithmetic operations
opn = { "+" : ( lambda a,b: a + b ),
        "-" : ( lambda a,b: a - b ),
        "*" : ( lambda a,b: a * b ),
        "/" : ( lambda a,b: a / b ),
        "^" : ( lambda a,b: a ** b ) }
fn = { "sin" : math.sin,
       "cos" : math.cos,
       "tan" : math.tan,
       "abs" : abs,
       "trunc" : ( lambda a: int(a) ),
       "round" : ( lambda a: int(a+0.5) ),
       "sgn" : ( lambda a: ( (a<0 and -1) or (a>0 and 1) or 0 ) ) }

def evaluateStack( s ):
    global variables
    if not s: return 0.0
    op = s.pop()
    if op in "+-*/^":
        op2 = evaluateStack( s )
        op1 = evaluateStack( s )
        return opn[op]( op1, op2 )
    elif op == "PI":
        return 3.1415926535
    elif op == "E":
        return 2.718281828
    elif op[0].isalpha():
        if op in variables:
            return variables[op]
        fnarg = evaluateStack( s )
        return (fn[op])( fnarg )
    else:
        return float( op )

if __name__ == "__main__":

    def test( str ):
        global exprStack
        exprStack = []
        results = BNF().parseString( str )
        print str, "->", results, "=>", exprStack, "=", evaluateStack( exprStack )

    test( "9" )
    test( "9 + 3 + 6" )
    test( "9 + 3 / 11" )
    test( "(9 + 3)" )
    test( "(9+3) / 11" )
    test( "9 - 12 - 6" )
    test( "9 - (12 - 6)" )
    test( "2*3.14159" )
    test( "3.1415926535*3.1415926535 / 10" )
    test( "PI * PI / 10" )
    test( "PI*PI/10" )
    test( "PI^2" )
    test( "6.02E23 * 8.048" )
    test( "e / 3" )
    test( "sin(PI/2)" )
    test( "trunc(E)" )
    test( "E^PI" )
    test( "2^3^2" )
    test( "2^9" )
    test( "sgn(-2)" )
    test( "sgn(0)" )
    test( "sgn(0.1)" )
    test( "5*4+300/(5-2)*(6+4)+4" )
    test( "((5*4+301)/(5-2))*(6+4)+4" )
    test( "(321/3)*10+4" )
    test( "# nothing but comments" )
    test( "a = 2^10" )
    test( "a^0.1 # same as 10th root of 1024" )
    test( "c = a" )
    test( "b=a" )
    test( "b-c" )
Jul 18 '05 #3
Hi Paul,

Yep, your examples :-) I'm using this as a learning experience and have
looked at your code, but I have specific requirements for integration
into another application.

I'm using the regex to create a list of tokens to be processed into a
postfix string. This is then handed off to another class that
evaluates the string for each database row.
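
For the token-list step, one possible way to get error checking out of the regex is to match strictly from the current position and raise as soon as nothing matches, instead of using split(). The sketch below (Python 3) is only an assumption about what the rules should be: a variable is name or obj.attribute with inner whitespace allowed (as in the "dd.r r" test), and a number or variable may not be followed directly by another dot or word character. TOKEN_RE, TokenError and tokenize are hypothetical names.

```python
import re

# One part of a variable name; inner whitespace is allowed, as in "r r".
PART = r"[A-Za-z_]\w*(?:[ \t]+\w+)*"

TOKEN_RE = re.compile(r"""
    (?P<NUMBER> (?: \d+ \.? \d* | \. \d+ ) (?: E [+-]? \d+ )? (?![\w.]) )
  | (?P<NAME>   %(part)s (?: \. %(part)s )? (?![\w.]) )
  | (?P<OP>     [-+*/^] )
  | (?P<PAREN>  [()] )
""" % {"part": PART}, re.VERBOSE | re.IGNORECASE)


class TokenError(ValueError):
    """Raised when no token pattern matches at the current position."""


def tokenize(text):
    """Return a list of (kind, text) pairs or raise TokenError."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise TokenError("invalid token at position %d: %r"
                             % (pos, text[pos:]))
        tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


print(tokenize("ff*2/dd.r r"))
for bad in ("ff*2/dd.r..ss r", "ff*$24..55/ddr", "ff*2/dd.r.ss r"):
    try:
        tokenize(bad)
    except TokenError as exc:
        print(exc)
```

The negative lookahead (?![\w.]) is what rejects "dd.r.ss" and "24..55": the whole token fails to match, so the error points at the start of the bad variable or number. Function names such as SIN come out as NAME tokens here and would be classified in a later step.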

The speed of generating the postfix string isn't important, but the speed
of processing it for each database row is.
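
A sketch of that split between a one-off conversion and a fast per-row step might look like the following (Python 3). It assumes the tokens arrive as a plain list of strings, ignores functions and unary minus, and represents a row as a dict keyed by variable name; to_postfix and eval_postfix are hypothetical names, not the class described above.

```python
import operator

PREC = {"+": 1, "-": 1, "*": 2, "/": 2, "^": 3}
RIGHT_ASSOC = {"^"}  # 2^3^2 is read as 2^(3^2)
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       "/": operator.truediv, "^": operator.pow}


def to_postfix(tokens):
    """Shunting-yard conversion; done once per expression."""
    output, stack = [], []
    for tok in tokens:
        if tok in PREC:
            while stack and stack[-1] in PREC and (
                    PREC[stack[-1]] > PREC[tok] or
                    (PREC[stack[-1]] == PREC[tok]
                     and tok not in RIGHT_ASSOC)):
                output.append(stack.pop())
            stack.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise ValueError("unmatched ')'")
            stack.pop()  # discard the '('
        else:
            # Operand: convert numbers here so the per-row loop need not.
            try:
                output.append(float(tok))
            except ValueError:
                output.append(tok)  # variable name, looked up per row
    while stack:
        top = stack.pop()
        if top == "(":
            raise ValueError("unclosed '('")
        output.append(top)
    return output


def eval_postfix(postfix, row):
    """Evaluate a prepared postfix list against one row (a dict)."""
    stack = []
    for item in postfix:
        if isinstance(item, float):
            stack.append(item)
        elif item in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[item](left, right))
        else:
            stack.append(row[item])
    return stack.pop()


postfix = to_postfix(["ff", "*", "2", "/", "dd.r"])
rows = [{"ff": 3.0, "dd.r": 4.0}, {"ff": 10.0, "dd.r": 5.0}]
print(postfix)
print([eval_postfix(postfix, row) for row in rows])
```

All parsing, number conversion and parenthesis checking happens in to_postfix, so the per-row loop is reduced to list appends, pops and dictionary lookups.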

Guy
Jul 18 '05 #4
