Parsing library for Python?

Hi,

I need to create a parser for a Python project, and I'd like to use a process
kind of like lex/yacc. I've looked at various parsing packages online, but
didn't find anything useful for me:

- PyLR seems promising but is for Python 1.5
- Yappy seems promising, but I couldn't get it to work. It doesn't even
compile the main example in its documentation
- mxTextTools is way too complicated. I'd like something that I can give a BNF
grammar to handle.

Is there a good parsing module for Python that I missed?

TIA,
Viktor
Jul 18 '05 #1
14 Replies


Viktor Rosenfeld wrote:
> [snip]
> Is there a good parsing module for python that I missed?

Yapps: http://theory.stanford.edu/~amitp/Yapps/

and all those listed at http://www.python.org/sigs/parser-sig/

Jul 18 '05 #2

On Fri, 20 Feb 2004 18:17:48 +0100, Viktor Rosenfeld wrote:
> [snip]
> Is there a good parsing module for python that I missed?

Have you seen this page:
http://www.python.org/sigs/parser-sig/

Christophe.
Jul 18 '05 #3

"Viktor Rosenfeld" <ro******@informatik.hu-berlin.de> wrote in message
news:c1*************@ID-212335.news.uni-berlin.de...
[snip]
Is there a good parsing module for python that I missed?


You could look into SPARK:

http://pages.cpsc.ucalgary.ca/~aycock/spark/

Jul 18 '05 #4

> You could look into SPARK:
>
> http://pages.cpsc.ucalgary.ca/~aycock/spark/

Yes, I'd recommend that, too. It's an Earley parser implementation, which is
very powerful and allows, for example, left-recursive rules. However, you can't
feed it EBNF directly; instead you rewrite rules like this (-> is EBNF, ::= is
SPARK):

rule -> term?

becomes

rule ::=
rule ::= term

rule -> (term)*

becomes

rule ::= rule term
rule ::=
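
The two rewrites above are mechanical, so they can be sketched in a few lines of Python. This is only an illustration: `desugar` is a hypothetical helper, not part of SPARK, and it assumes the right-hand side is a single symbol followed by `?` or `*`.

```python
def desugar(lhs, symbol, op):
    """Rewrite one EBNF operator into plain BNF rules in '::=' notation.

    Hypothetical helper for illustration only; it is not a SPARK API.
    """
    if op == "?":
        # Optional: either nothing at all, or exactly one symbol.
        return [f"{lhs} ::=", f"{lhs} ::= {symbol}"]
    if op == "*":
        # Repetition: a left-recursive rule plus an empty alternative.
        return [f"{lhs} ::= {lhs} {symbol}", f"{lhs} ::="]
    raise ValueError(f"unsupported EBNF operator: {op!r}")


for line in desugar("rule", "term", "?") + desugar("rule", "term", "*"):
    print(line)
```

The left-recursive form for `*` is usable here only because an Earley parser accepts left recursion; a plain recursive-descent parser would loop on it.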
--
Regards,

Diez B. Roggisch
Jul 18 '05 #5

Viktor Rosenfeld <ro******@informatik.hu-berlin.de> writes:
> [snip]
> Is there a good parsing module for python that I missed?

http://www.python.org/sigs/parser-sig/

I used PLY for a project a while ago. It felt comfortable.
http://systems.cs.uchicago.edu/ply/
--
ha************@boeing.com
6-6M21 BCA CompArch Design Engineering
Phone: (425) 342-0007
Jul 18 '05 #6

Viktor Rosenfeld wrote:
> [snip]
> Is there a good parsing module for python that I missed?

When looking for a parser generator, I think it is important that full
grammars be provided for at least C and Python and preferably for C++,
Java, and FORTRAN.
Jul 18 '05 #7

Viktor Rosenfeld wrote:
> [snip]
> - mxTexttools is way complicated. I'd like something that I can give a BNF
> grammar to handle.

SimpleParse is based on mxTextTools, but is EBNF-driven. You can find
it here:

http://simpleparse.sourceforge.net/

Have fun,
Mike

_______________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder
http://members.rogers.com/mcfletch/

Jul 18 '05 #8

"Viktor Rosenfeld" <ro******@informatik.hu-berlin.de> wrote in message
news:c1*************@ID-212335.news.uni-berlin.de...
> [snip]
> I've looked at various parsing packages online, but
> didn't find anything useful for me:

You might also try

http://pyparsing.sourceforge.net



Jul 18 '05 #9

Harry George wrote:
> Viktor Rosenfeld <ro******@informatik.hu-berlin.de> writes:
>> [snip]
>> Is there a good parsing module for python that I missed?

You should have a look at SimpleParse which converts BNF to
the tag tables used by mxTextTools:

http://simpleparse.sourceforge.net/

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Feb 23 2004)
Python/Zope Consulting and Support ... http://www.egenix.com/
mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/

Jul 18 '05 #10

"Edward C. Jones" <ed******@erols.com> wrote:

> When looking for a parser generator, I think it is important that full
> grammars be provided for at least C and Python and preferably for C++,
> Java, and FORTRAN.


Are you kidding with this? I can't tell.

C, C++, and Fortran are parsing nightmares, where end-of-line and spacing
are important sometimes and ignored at other times, and so on.

I expect to find the canonical desk calculator example, and perhaps a
Pascal-based language, but any more than that is asking a bit much from all
but the most mature parser generators.
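
For anyone who hasn't met it, the "desk calculator" is the small arithmetic-expression grammar that parser toolkits traditionally ship as their first example. Here is a hand-written recursive-descent sketch of it in Python. It is not taken from any of the packages named in this thread, the name `evaluate` is made up for illustration, and the tokenizer simply skips characters it doesn't recognize.

```python
import re


def evaluate(text):
    """Evaluate +, -, *, / and parentheses with the usual precedence."""
    # Numbers and single-character operators; anything else is skipped.
    tokens = re.findall(r"\d+(?:\.\d+)?|[-+*/()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expr():
        # expr ::= term (('+' | '-') term)*
        value = term()
        while peek() in ("+", "-"):
            if take() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():
        # term ::= factor (('*' | '/') factor)*
        value = factor()
        while peek() in ("*", "/"):
            if take() == "*":
                value *= factor()
            else:
                value /= factor()
        return value

    def factor():
        # factor ::= NUMBER | '(' expr ')' | '-' factor
        tok = take()
        if tok == "(":
            value = expr()
            if take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok == "-":
            return -factor()
        if tok is None or tok in "+*/)":
            raise SyntaxError(f"unexpected token: {tok!r}")
        return float(tok)

    result = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token: {peek()!r}")
    return result


print(evaluate("1 + 2 * 3"))
print(evaluate("(1 + 2) * 3"))
```

Each grammar rule maps onto one nested function, which is why this example fits on a page; the languages discussed here need far more than that.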
--
- Tim Roberts, ti**@probo.com
Providenza & Boekelheide, Inc.
Jul 18 '05 #11

[Tim Roberts]
> C, C++, and Fortran are parsing nightmares, where end-of-line and
> spacing are important sometimes and ignored at other times, and so on.


End-of-line processing does not look too difficult for these languages.
But spaces in FORTRAN always looked difficult to parse, at least in
the original FORTRAN where they might appear anywhere, even inside an
identifier, while not even being required between "words".

One routine which was popular at one place I worked was named INGMTR,
because people always called it this way:

CALLING MTR(... ARGUMENTS ...)

One traditional amusement was writing obscure programs, like:

DO 50 I = 3

that had nothing to do with DO loops: with blanks ignored and no comma
present, it assigns 3 to a variable named DO50I. I wonder how FORTRAN
parsers worked to sort out such things. Did later FORTRAN use stricter
(or at least more usual) rules on white space?
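
To make the ambiguity concrete, here is a rough Python sketch of how a statement classifier might cope once blanks are thrown away. It rests on simplifying assumptions: blanks are insignificant everywhere, character data and Hollerith constants are ignored, the six-character identifier limit is not enforced, and `classify` is a made-up name, not how any real FORTRAN compiler works.

```python
import re


def classify(stmt):
    """Guess the statement kind after deleting every blank."""
    compact = stmt.replace(" ", "").upper()

    # DO <label> <var> = <start>, <end>: the top-level comma is what
    # distinguishes a loop header from an assignment.
    m = re.fullmatch(r"DO(\d+)([A-Z][A-Z0-9]*)=([^,()]+),(.+)", compact)
    if m:
        return f"DO loop: label {m.group(1)}, variable {m.group(2)}"

    # <var> = <expression>
    m = re.fullmatch(r"([A-Z][A-Z0-9]*)=(.+)", compact)
    if m:
        return f"assignment to {m.group(1)}"

    # CALL <name>(<arguments>)
    m = re.fullmatch(r"CALL([A-Z][A-Z0-9]*)\((.*)\)", compact)
    if m:
        return f"call to {m.group(1)}"

    return "unknown"


print(classify("DO 50 I = 1, 3"))
print(classify("DO 50 I = 3"))
print(classify("CALLING MTR(X, Y)"))
```

The statement kind cannot be decided from the first "word"; the classifier has to look at the whole line, here at whether a comma follows the equals sign.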

--
François Pinard http://www.iro.umontreal.ca/~pinard

Jul 18 '05 #12

Tim Roberts wrote:
> [snip]
> Are you kidding with this? I can't tell.

Not kidding. Nothing can be parsed without a grammar. I think parsing
the standard computer languages is a common need. I am sporadically
developing software to automatically generate Pyrex code for wrapping C
libraries in Python. I use ANTLR because it comes with a good C grammar.

And then there is HTML. I wonder how Mozilla parses all the ill-formed
HTML that is on the web.
Jul 18 '05 #13

"Edward C. Jones" <ed******@erols.com> writes:
> [snip]
> Not kidding. Nothing can be parsed without a grammar.
> [snip]
> And then there is HTML. I wonder how Mozilla parses all the ill-formed
> html that is on the web.

Yes, things can be parsed without a grammar, or at least without a
conventional CFG. Ad hoc parsers are so messy, of course, that we try
to avoid that in modern languages. But I've parsed textual documents
at times with context-sensitive RR(2) approaches and other oddities.

The point is that FORTRAN predates a clear understanding of
line-independent lexing and context-free grammars (CFGs). It uses
constructs which are not handled by the classic
scanner/lexer/parser/AST tools. I don't know how the pros handle
this, but when I run into a non-standard grammar, I preprocess the
input to tag it with additional tokens, and then run it through a
standard lexer/parser. Basically a tree re-writer approach.
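
As an illustration of that kind of preprocessing, here is a small Python sketch. The input format is an assumption chosen for the example (indentation-sensitive text, which a context-free grammar cannot describe directly), and `tag_indentation` is a hypothetical name; the post does not say which inputs or tags were actually used.

```python
def tag_indentation(source):
    """Insert explicit INDENT/DEDENT markers ahead of a standard parser.

    Sketch only: it does not detect a dedent to a width that was never
    opened, and it counts spaces only (no tabs).
    """
    out = []
    stack = [0]  # indentation widths of the currently open blocks
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines carry no structure
        width = len(line) - len(line.lstrip(" "))
        if width > stack[-1]:
            stack.append(width)
            out.append("INDENT")
        while width < stack[-1]:
            stack.pop()
            out.append("DEDENT")
        out.append(line.strip())
    # Close any blocks still open at end of input.
    while len(stack) > 1:
        stack.pop()
        out.append("DEDENT")
    return out


print(tag_indentation("if x:\n    y\n    z\nw"))
```

Once the layout has been turned into ordinary tokens, the remaining grammar is context-free and a generated lexer/parser can take over.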

C++ is (I think) classically lexable, but the semantics are so complex
that parsing (or understanding what to do with the parse) is a pain.
I wasn't in that business, but I understand C compiler vendors bombed
out trying to just upgrade C compilers and had to start fresh with a
much richer type model. SWIG also ran into this.

For parsing of "bad html", see "tidy". Its lexer/parser is ad hoc
(not generated by parser toolkits).
--
ha************@boeing.com
6-6M21 BCA CompArch Design Engineering
Phone: (425) 342-0007
Jul 18 '05 #14

"Edward C. Jones" <ed******@erols.com> wrote in message
news:40**********************@news.rcn.com...
> [snip]
> I use ANTLR because it comes with a good C grammar.

I'm looking for the C grammar in ANTLR. Do you mean the tinyC example?
That leaves out a *lot*. (There are grammars for Java and Pascal included,
and they look pretty complete.)

-- Paul
Jul 18 '05 #15
