simpleparse parsing problem

Anyone out there use simpleparse? If so, I have a problem that I can't
seem to solve. I need to be able to parse this line:

"""Cen2 = Cen(OUT, "Cep", "ies", wh, 544, (wh/ht));"""

with this grammar:

grammar = r'''
declaration := ws, line, (ws, line)*, ws
line := (statement / assignment), ';', ws
assignment := identifier, ws, '=', ws, statement
statement := identifier, '(', arglist?, ')', chars?
identifier := ([a-zA-Z0-9_.:])+
arglist := arg, (',', ws, arg)*
arg := expr/ statement / identifier / num / str /
curve / spline / union / conditional / definition
definition := typedef?, ws, identifier, ws, '=', ws, arg
typedef := ([a-zA-Z0-9_])+
expr := termlist, ( operator, termlist )+
termlist := ( '(', expr, ')' ) / term
term := call / identifier / num
call := identifier, '(', arglist?, ')'
union := '{{', ws, (arg, ws, ';', ws)*, arg, ws, '}}'
operator := ( '+' / '-' / '/' / '*' /
'==' / '>=' / '<=' / '>' / '<' )
conditional := termlist, ws, '?', ws, termlist, ws, ':', ws, termlist
curve := (list / num), '@', num
spline := (cv, ',')*, cv
cv := identifier, '@', num
list := '[', arg, (',', ws, arg)*, ']'
str := '"', ([;] / chars)*, '"'
num := ( scinot / float / int )
<chars := ('-' / '/' / '?' / [a-zA-Z0-9_.!@#$%^&\*\+=<:])+
<int := ([-+]?, [0-9]+)
<float := ([-+]?, [0-9\.]+)
<scinot := (float, 'e', int)
<ws := [ \t\n]*
'''

But it fails. The problem seems to be in how arglist/arg/expr are defined:
the parser can't handle the parenthesized expression at the end
of the line:

(wh/ht)

But everything I've tried to correct that problem fails. In the end, it
needs to be able to parse that line with or without the parentheses
around wh/ht.

Recursive parsing of expressions just seems hard to do in simpleparse,
and is beyond my parsing knowledge.
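
For readers following along, here is a small pure-Python sketch of what the posted rules for expr and termlist appear to do. It is not simpleparse; the tokenizer, the function names and the min_ops parameter are made up for illustration, and only a few operators are handled. With min_ops=1 it mimics `expr := termlist, ( operator, termlist )+`, where a lone parenthesized group is a termlist but not an expr; with min_ops=0 it mimics a `*` version of the same rule.

```python
import re

# Hypothetical tokenizer: identifiers, integers, a few operators, parentheses.
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\d+|[-+*/()])")
OPERATORS = {"+", "-", "*", "/"}
PARENS = {"(", ")"}


def tokenize(text):
    """Split text into tokens, raising ValueError on anything unexpected."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError("bad character at %d" % pos)
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_termlist(tokens, i, min_ops):
    """termlist := '(' expr ')' / term. Returns the next index, or None."""
    if i < len(tokens) and tokens[i] == "(":
        j = parse_expr(tokens, i + 1, min_ops)
        if j is not None and j < len(tokens) and tokens[j] == ")":
            return j + 1
        return None
    if i < len(tokens) and tokens[i] not in OPERATORS and tokens[i] not in PARENS:
        return i + 1  # a plain term (identifier or number)
    return None


def parse_expr(tokens, i, min_ops):
    """expr := termlist followed by at least min_ops (operator, termlist) pairs."""
    j = parse_termlist(tokens, i, min_ops)
    if j is None:
        return None
    ops = 0
    while j < len(tokens) and tokens[j] in OPERATORS:
        k = parse_termlist(tokens, j + 1, min_ops)
        if k is None:
            break
        j, ops = k, ops + 1
    return j if ops >= min_ops else None


def matches_expr(text, min_ops):
    """True if the whole text parses as an expr under the given rule variant."""
    tokens = tokenize(text)
    return parse_expr(tokens, 0, min_ops) == len(tokens)


print(matches_expr("wh/ht", 1))    # True
print(matches_expr("(wh/ht)", 1))  # False: the group is not followed by an operator
print(matches_expr("(wh/ht)", 0))  # True
```

If the real grammar behaves the same way, that would explain the symptom: arg has an identifier alternative to fall back on for a bare `wh`, but no alternative that accepts a bare `( expr )`.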

Here's the code to get the parser going:

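import pprint
from simpleparse.parser import Parser

# Build a parser that uses 'line' as its root production
p = Parser(grammar, 'line')
bad_line = """Cen2 = Cen(OUT, "Cep", "ies", wh, 544, (wh/ht));"""

# Dump whatever parse() returns for the failing line
pprint.pprint(p.parse(bad_line))

Any help greatly appreciated, thanks,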
-Dave
--
Presenting:
mediocre nebula.

Sep 2 '06 #1
"David Hirschfield" <da****@ilm.com> wrote in message
news:ma****************************************@python.org...
<snip>
David -

I converted your simpleparse grammar to pyparsing, which I could then
troubleshoot. Here is a working pyparsing grammar; perhaps you can convert
it back to simpleparse form and see if you make any better progress.

-- Paul
test = """Cen2 = Cen(OUT, "Cep", "ies", wh, 544, (wh/ht));"""
from pyparsing import *

# recursive items need forward decl - assign contents later using '<<' operator
arg = Forward()
expr = Forward()
statement = Forward()

float_ = Regex (r"[-+]?[0-9]+\.[0-9]*")
int_ = Regex (r"[-+]?[0-9]+")
scinot = Combine(float_ + oneOf(list("eE")) + int_)
num = scinot | float_ | int_
str_ = dblQuotedString
list_ = "[" + delimitedList(arg) + "]"
identifier = Word(alphas, srange("[a-zA-Z0-9_.:]"))
cv = identifier + "@" + num
spline = delimitedList(cv)
curve = (list_ | num) + "@" + num
conditional = expr + "?" + expr + ":" + expr
operator = oneOf( ('+', '-', '/', '*', '==', '>=', '<=', '>', '<') )
union = "{{" + delimitedList( arg, delim=";" ) + "}}"
call = identifier + "(" + delimitedList(arg) + ")"
term = call | identifier | num | Group( "(" + expr + ")" )
expr << (term + ZeroOrMore( operator+term ) )
typedef = Word( alphas, alphanums )
definition = ( (typedef + identifier) | identifier ) + "=" + arg
arg << (expr | statement | identifier | num | str_ | "." |
curve | spline | union | conditional | definition)
assignment = identifier + "=" + statement
statement << ( call | assignment )
line_ = statement + ';'
declaration = OneOrMore(line_)

print(declaration.parseString(test))

Prints:
['Cen2', '=', 'Cen', '(', 'OUT', '"Cep"', '"ies"', 'wh', '544', ['(', 'wh',
'/', 'ht', ')'], ')', ';']
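
One difference between the pyparsing grammar above and the original is that expr here uses ZeroOrMore, while the simpleparse rule uses `+`. If that is what trips up the bare parenthesized argument, one possible back-port is sketched below. This is an untested assumption, not something run against simpleparse, and patch_grammar is a hypothetical helper name.

```python
# Rule as posted: at least one "operator, termlist" pair is required.
ORIGINAL_EXPR = "expr := termlist, ( operator, termlist )+"

# Hypothetical patch (untested against simpleparse): allow zero pairs,
# mirroring ZeroOrMore in the pyparsing version.
PATCHED_EXPR = "expr := termlist, ( operator, termlist )*"


def patch_grammar(grammar):
    """Return a copy of the grammar text with the expr rule relaxed."""
    return grammar.replace(ORIGINAL_EXPR, PATCHED_EXPR)


sample = "call := identifier, '(', arglist?, ')'\n" + ORIGINAL_EXPR + "\n"
print(patch_grammar(sample))
```

A caveat worth checking: with `*`, expr also matches a lone identifier or number, so it may shadow the later alternatives in arg (definition, curve, conditional) when the first match wins.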

Sep 2 '06 #2
