By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
440,091 Members | 1,546 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 440,091 IT Pros & Developers. It's quick & easy.

Recursive descent algorithm able to parse Python?

P: n/a
I'm a compiler newbie and curious if Python grammar is able to
be parsed by a recursive descent parser or if it requires
a more powerful algorithm.

Chris

Sep 28 '06 #1
Share this Question
Share on Google+
9 Replies


P: n/a
se******@spawar.navy.mil wrote:
I'm a compiler newbie and curious if Python grammar is able to
be parsed by a recursive descent parser or if it requires
a more powerful algorithm.
Python is relatively simple to parse using a recursive descent parser. If
you're rolling your own as a learning exercise, the grammar for python is
included in one of the top level directories of the source distribution for
python.

The one area that is slightly different from usual is the emitting of INDENT
and DEDENT tokens by the lexer due to handling of whitespace. But that's
not really that complex either.

A couple of years ago I decided to see what it would be like to try to parse
a python-like language with no keywords and to see if you could do it test
first (for fun :). If you do this, you discover the grammar is very short
but you need terminator tokens to close if..elif...elif...else... like
statements (eg if..elif...elif...else...end,
try...except...except...except...endtry).

I put that code up here: http://cerenity.org/SWP/ if you're curious.
That and the python grammar file should probably help you roll your own
parser. (It emits an AST that assumes everything is a function. I was
pondering giving it a lisp backend or transforming to lisp but never
got a round tuit)

If however you're doing this because you're not aware of the compiler
module, it's worth knowing that compiler.parse is a pretty useful
function :-)
Michael.
Sep 28 '06 #2

P: n/a
se******@spawar.navy.mil schrieb:
I'm a compiler newbie and curious if Python grammar is able to
be parsed by a recursive descent parser or if it requires
a more powerful algorithm.
I might be mistaken, but isn't recursive descent one of the more
powerful parsing techniques - for the price of non-linearity and even
potentially non-termination?

Under that assumption: certainly!

Diez
Sep 28 '06 #3

P: n/a
In message <4o************@uni-berlin.de>, Diez B. Roggisch wrote:
se******@spawar.navy.mil schrieb:
>I'm a compiler newbie and curious if Python grammar is able to
be parsed by a recursive descent parser or if it requires
a more powerful algorithm.

I might be mistaken, but isn't recursive descent one of the more
powerful parsing techniques - for the price of non-linearity and even
potentially non-termination?
No, you're thinking of LR(k) bottom-up parsers. Recursive descent is a
top-down parser--might be the same thing as LL(1), I'm not sure. It's easy
to implement and easy to understand, to the point where there is strong
pressure on programming-language designers to make sure their languages can
be parsed with recursive descent.
Sep 28 '06 #4

P: n/a
At Thursday 28/9/2006 17:00, Michael wrote:
>se******@spawar.navy.mil wrote:
I'm a compiler newbie and curious if Python grammar is able to
be parsed by a recursive descent parser or if it requires
a more powerful algorithm.

Python is relatively simple to parse using a recursive descent parser. If
you're rolling your own as a learning exercise, the grammar for python is
included in one of the top level directories of the source distribution for
python.
And it will stay so. From PEP 3099: "The parser won't be more complex
than LL(1). Simple is better than complex. This idea extends to the
parser. Restricting Python's grammar to an LL(1) parser is a
blessing, not a curse. It puts us in handcuffs that prevent us from
going overboard and ending up with funky grammar rules like some
other dynamic languages that will go unnamed, like Perl."

Gabriel Genellina
Softlab SRL

__________________________________________________
Preguntá. Respondé. Descubrí.
Todo lo que querías saber, y lo que ni imaginabas,
está en Yahoo! Respuestas (Beta).
¡Probalo ya!
http://www.yahoo.com.ar/respuestas

Sep 29 '06 #5

P: n/a
Lawrence D'Oliveiro wrote:
In message <4o************@uni-berlin.de>, Diez B. Roggisch wrote:
>se******@spawar.navy.mil schrieb:
>>I'm a compiler newbie and curious if Python grammar is able to
be parsed by a recursive descent parser or if it requires
a more powerful algorithm.

I might be mistaken, but isn't recursive descent one of the more
powerful parsing techniques - for the price of non-linearity and even
potentially non-termination?

No, you're thinking of LR(k) bottom-up parsers. Recursive descent is a
No, I'm not.
top-down parser--might be the same thing as LL(1), I'm not sure. It's easy
to implement and easy to understand, to the point where there is strong
pressure on programming-language designers to make sure their languages
can be parsed with recursive descent.
http://en.wikipedia.org/wiki/Recursive_descent_parser

"""
Recursive descent with backup is a technique that determines which
production to use by trying each production in turn. Recursive descent with
backup is not limited to LL(k) grammars, but is not guaranteed to terminate
unless the grammar is LL(k). Even when they terminate, parsers that use
recursive descent with backup may require exponential time.
"""

I have to admit that I have difficulties to compare LR(k) to recursive
descent, but the fact that the latter contains backtracking makes it at
least more powerful than LL(k)

Diez
Sep 29 '06 #6

P: n/a
In message <4o************@uni-berlin.de>, Diez B. Roggisch wrote:
I have to admit that I have difficulties to compare LR(k) to recursive
descent, but the fact that the latter contains backtracking makes it at
least more powerful than LL(k)
LR(k) is more powerful than LL(k).
Oct 6 '06 #7

P: n/a
>>>>"Diez B. Roggisch" <de***@nospam.web.de(DBR) wrote:
>DBRhttp://en.wikipedia.org/wiki/Recursive_descent_parser
>DBR"""
DBRRecursive descent with backup is a technique that determines which
DBRproduction to use by trying each production in turn. Recursive
DBRdescent with backup is not limited to LL(k) grammars, but is not
DBRguaranteed to terminate unless the grammar is LL(k). Even when they
DBRterminate, parsers that use recursive descent with backup may require
DBRexponential time.
DBR"""
This is the first time I see this called `backup'. I have always used and
seen the term `backtracking'.
--
Piet van Oostrum <pi**@cs.uu.nl>
URL: http://www.cs.uu.nl/~piet [PGP 8DAE142BE17999C4]
Private email: pi**@vanoostrum.org
Oct 6 '06 #8

P: n/a
Lawrence D'Oliveiro schrieb:
In message <4o************@uni-berlin.de>, Diez B. Roggisch wrote:
>I have to admit that I have difficulties to compare LR(k) to recursive
descent, but the fact that the latter contains backtracking makes it at
least more powerful than LL(k)

LR(k) is more powerful than LL(k).
I know _that_, the point was that recursive descent is more power full
than LL(k)

Diez
Oct 6 '06 #9

P: n/a
On 10/6/06, Diez B. Roggisch <de***@nospam.web.dewrote:
Lawrence D'Oliveiro schrieb:
In message <4o************@uni-berlin.de>, Diez B. Roggisch wrote:
I have to admit that I have difficulties to compare LR(k) to recursive
descent, but the fact that the latter contains backtracking makes it at
least more powerful than LL(k)
LR(k) is more powerful than LL(k).

I know _that_, the point was that recursive descent is more power full
than LL(k)
Correct me if I'm wrong, but recursive descent parsers are top-down, yes?

Parse::RecDescent in Perl is a top-down parser. It can even do some
context-sensitive grammars and can almost certainly handle Python.

Is Python context-free? Not sure. Does left parenthesis opening either
a function call (x) or an empty tuple () count as context-sensitive,
for instance? Or can () be considered a terminal symbol unto itself?

I am new to most of this :)

-- Theerasak
Oct 7 '06 #10

This discussion thread is closed

Replies have been disabled for this discussion.