
simpleparse parsing problem

Does anyone out there use simpleparse? If so, I have a problem that I can't
seem to solve... I need to be able to parse this line:

"""Cen2 = Cen(OUT, "Cep", "ies", wh, 544, (wh/ht));"""

with this grammar:

grammar = r'''
declaration := ws, line, (ws, line)*, ws
line := (statement / assignment), ';', ws
assignment := identifier, ws, '=', ws, statement
statement := identifier, '(', arglist?, ')', chars?
identifier := ([a-zA-Z0-9_.:])+
arglist := arg, (',', ws, arg)*
arg := expr / statement / identifier / num / str /
       curve / spline / union / conditional / definition
definition := typedef?, ws, identifier, ws, '=', ws, arg
typedef := ([a-zA-Z0-9_])+
expr := termlist, ( operator, termlist )+
termlist := ( '(', expr, ')' ) / term
term := call / identifier / num
call := identifier, '(', arglist?, ')'
union := '{{', ws, (arg, ws, ';', ws)*, arg, ws, '}}'
operator := ( '+' / '-' / '/' / '*' /
              '==' / '>=' / '<=' / '>' / '<' )
conditional := termlist, ws, '?', ws, termlist, ws, ':', ws, termlist
curve := (list / num), '@', num
spline := (cv, ',')*, cv
cv := identifier, '@', num
list := '[', arg, (',', ws, arg)*, ']'
str := '"', ([;] / chars)*, '"'
num := ( scinot / float / int )
<chars := ('-' / '/' / '?' / [a-zA-Z0-9_.!@#$%^&\*\+=<:])+
<int := ([-+]?, [0-9]+)
<float := ([-+]?, [0-9\.]+)
<scinot := (float, 'e', int)
<ws := [ \t\n]*
'''

But it fails. The problem is with how arglist/arg/expr are defined,
which makes it unable to handle the parenthesized expression at the end
of the line:

(wh/ht)

But everything I've tried to correct that problem fails. In the end, it
needs to be able to parse that line with those parentheses around wh/ht,
or without them.
Recursive parsing of expressions just seems hard to do in simpleparse,
and is beyond my parsing knowledge.

Here's the code to get the parser going:

from simpleparse.parser import Parser
p = Parser(grammar, 'line')
import pprint
bad_line = """Cen2 = Cen(OUT, "Cep", "ies", wh, 544, (wh/ht));"""

pprint.pprint(p.parse(bad_line))
Any help greatly appreciated, thanks,
-Dave
--
Presenting:
mediocre nebula.

Sep 2 '06 #1


"David Hirschfield" <da****@ilm.com> wrote in message
news:ma****************************************@python.org...
<snip>
David -

I converted your simpleparse grammar to pyparsing, which I could then
troubleshoot. Here is a working pyparsing grammar; perhaps you can convert
it back to simpleparse form and see if you make any better progress.

-- Paul
test = """Cen2 = Cen(OUT, "Cep", "ies", wh, 544, (wh/ht));"""
from pyparsing import *

# recursive items need forward decl - assign contents later using the '<<' operator
arg = Forward()
expr = Forward()
statement = Forward()

float_ = Regex (r"[-+]?[0-9]+\.[0-9]*")
int_ = Regex (r"[-+]?[0-9]+")
scinot = Combine(float_ + oneOf(list("eE")) + int_)
num = scinot | float_ | int_
str_ = dblQuotedString
list_ = "[" + delimitedList(arg) + "]"
identifier = Word(alphas, srange("[a-zA-Z0-9_.:]"))
cv = identifier + "@" + num
spline = delimitedList(cv)
curve = (list_ | num) + "@" + num
conditional = expr + "?" + expr + ":" + expr
operator = oneOf( ('+', '-', '/', '*', '==', '>=', '<=', '>', '<') )
union = "{{" + delimitedList( arg, delim=";" ) + "}}"
call = identifier + "(" + delimitedList(arg) + ")"
term = call | identifier | num | Group( "(" + expr + ")" )
expr << (term + ZeroOrMore( operator+term ) )
typedef = Word( alphas, alphanums )
definition = ( (typedef + identifier) | identifier ) + "=" + arg
arg << (expr | statement | identifier | num | str_ | "." |
curve | spline | union | conditional | definition)
assignment = identifier + "=" + statement
statement << ( call | assignment )
line_ = statement + ';'
declaration = OneOrMore(line_)

print(declaration.parseString(test))

Prints:
['Cen2', '=', 'Cen', '(', 'OUT', '"Cep"', '"ies"', 'wh', '544', ['(', 'wh',
'/', 'ht', ')'], ')', ';']
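
For what it's worth, two changes seem to matter in the pyparsing version: the parenthesized group moved into term, and the operator tail of expr became "zero or more" instead of "one or more". In the original grammar, once termlist has matched (wh/ht), expr still demands an operator after it, so expr fails, and none of the remaining arg alternatives accepts a bare parenthesized group. Below is a rough sketch of that same recursive shape in plain Python, with no parsing library at all, in case it helps when translating back to simpleparse. The function names (tokenize, parse_expr, parse_term, parse) and the nested-list output are my own choices for illustration, and it only handles names, unsigned numbers, calls and operators, not strings, curves, or the rest of the grammar.

```python
import re

# Tokens: two-character operators first, then single characters, names, numbers.
TOKEN = re.compile(
    r"\s*(==|>=|<=|[-+/*<>(),]|[A-Za-z_][A-Za-z0-9_.:]*|[0-9]+(?:\.[0-9]+)?)"
)
OPERATORS = {"+", "-", "/", "*", "==", ">=", "<=", ">", "<"}


def tokenize(text):
    """Split the input into a flat list of token strings."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError("unexpected character at position %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def peek(tokens, i):
    """Return the token at index i, or None past the end."""
    return tokens[i] if i < len(tokens) else None


def expect(tokens, i, wanted):
    """Check that token i is the wanted one and step past it."""
    if peek(tokens, i) != wanted:
        raise ValueError("expected %r at token %d" % (wanted, i))
    return i + 1


def parse_expr(tokens, i):
    """expr := term (operator term)*   (zero or more, so a lone term is an expr)"""
    first, i = parse_term(tokens, i)
    parts = [first]
    while peek(tokens, i) in OPERATORS:
        right, j = parse_term(tokens, i + 1)
        parts += [tokens[i], right]
        i = j
    return (first if len(parts) == 1 else parts), i


def parse_term(tokens, i):
    """term := '(' expr ')' | call | identifier | num"""
    tok = peek(tokens, i)
    if tok is None or tok in OPERATORS or tok in (")", ","):
        raise ValueError("expected a term at token %d" % i)
    if tok == "(":
        # The parenthesized group lives here, so it recurses back into expr.
        inner, i = parse_expr(tokens, i + 1)
        i = expect(tokens, i, ")")
        return ["(", inner, ")"], i
    if peek(tokens, i + 1) == "(":
        # call := identifier '(' [expr (',' expr)*] ')'
        args, i = [], i + 2
        if peek(tokens, i) != ")":
            arg, i = parse_expr(tokens, i)
            args.append(arg)
            while peek(tokens, i) == ",":
                arg, i = parse_expr(tokens, i + 1)
                args.append(arg)
        i = expect(tokens, i, ")")
        return [tok, args], i
    return tok, i + 1


def parse(text):
    """Parse a whole expression and insist that every token is consumed."""
    tokens = tokenize(text)
    tree, i = parse_expr(tokens, 0)
    if i != len(tokens):
        raise ValueError("unexpected token %r" % tokens[i])
    return tree


if __name__ == "__main__":
    print(parse("wh/ht"))
    print(parse("(wh/ht)"))
    print(parse("Cen(OUT, wh, 544, (wh/ht))"))
```

The point of the sketch is only the mutual recursion: parse_expr calls parse_term, and parse_term calls parse_expr again when it sees an opening parenthesis, so the argument parses with or without the parentheses around wh/ht.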

Sep 2 '06 #2
