parsing


I would like to use Python to parse a *python-like* data description
language. That is, it would have its own keywords, but would have a
syntax like Python. For instance:

Ob1 ('A'):
    Ob2 ('B'):
        Ob3 ('D')
        Ob3 ('E')
    Ob2 ('C')

I'm looking for the ':' and indentation to provide nested execution so I
can use a description like the one above to construct an object tree.

In looking at the parser and tokenize sections of the Python Language
Services (http://docs.python.org/lib/language.html), it looks as though
this will only parse Python keywords. Is there a way to tap into Python
parsing at a lower level so that I can use it to parse my own keywords?

Thanks,
Todd Moyer


Jul 18 '05 #1
Perhaps the following, copied from an article in another thread, will
help.
From: "Bram Stolk" <br**@nospam.sara.nl>
Subject: Re: Parsing C Preprocessor files
(I have not checked his code or results; this is just a copy and paste.)
================
I would like to thank the people who responded to my question about
preprocessor parsing. However, I think I will just roll my own, as I
found out that it takes a mere 16 lines of code to create a #ifdef tree.

I simply used a combination of lists and tuples. A tuple denotes a #if
block (startline,body,endline). A body is a list of lines/tuples.

This will parse the following text:

Top level line
#if foo
on foo level
#if bar
on bar level
#endif
#endif
#ifdef bla
on bla level
#ifdef q
q
#endif
#if r
r
#endif
#endif

into:

['Top level line\n',
 ('#if foo\n',
  ['on foo level\n',
   ('#if bar\n', ['on bar level\n'], '#endif\n')],
  '#endif\n'),
 ('#ifdef bla\n',
  ['on bla level\n',
   ('#ifdef q\n', ['q\n'], '#endif\n'),
   ('#if r\n', ['r\n'], '#endif\n')],
  '#endif\n')]

Which is very suitable for me.

Code is:

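def parse_block(lines):
    """Consume lines and return a list of plain lines and
    (startline, body, endline) tuples, one tuple per #if block."""
    retval = []
    while lines:
        line = lines.pop(0)
        if "#if" in line:                # also matches #ifdef and #ifndef
            headline = line
            body = parse_block(lines)    # recurse until the matching #endif
            endline = lines.pop(0)       # the #endif that ended the recursion
            retval.append((headline, body, endline))
        elif "#endif" in line:
            lines.insert(0, line)        # push back so the caller can consume it
            return retval
        else:
            retval.append(line)
    return retval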

And pretty printing with indentation is easy:

def traverse_block(block, indent=0):
    for item in block:                   # iterate without emptying the list
        if isinstance(item, tuple):      # an #if block
            print(indent * "\t" + item[0], end="")
            traverse_block(item[1], indent + 1)
            print(indent * "\t" + item[2], end="")
        else:                            # an ordinary line
            print(indent * "\t" + item, end="")
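
The same stack-of-blocks idea carries over to the indentation-based language in the original question. What follows is only a minimal sketch, assuming Python 3 and the standard library's tokenize module: it emits INDENT and DEDENT tokens for changes in nesting and reports words such as Ob1 as plain NAME tokens, whether or not they are Python keywords. The parse_tree function, the SOURCE sample, and the (keyword, argument, children) tuple layout are hypothetical names chosen for illustration, and the sketch assumes each line holds one keyword and one quoted argument.

```python
import ast
import io
import tokenize

# Sample input, indented the way the question describes (assumed layout).
SOURCE = """\
Ob1 ('A'):
    Ob2 ('B'):
        Ob3 ('D')
        Ob3 ('E')
    Ob2 ('C')
"""


def parse_tree(text):
    """Return a nested list of (keyword, argument, children) tuples."""
    root = []
    stack = [root]      # stack[-1] is the child list currently being filled
    name = arg = None
    for tok in tokenize.generate_tokens(io.StringIO(text).readline):
        if tok.type == tokenize.NAME and name is None:
            name = tok.string                     # first word on the line
        elif tok.type == tokenize.STRING:
            arg = ast.literal_eval(tok.string)    # strip the quotes safely
        elif tok.type == tokenize.NEWLINE:
            if name is not None:                  # end of a logical line
                stack[-1].append((name, arg, []))
            name = arg = None
        elif tok.type == tokenize.INDENT:
            # Deeper level: new nodes become children of the last node added.
            stack.append(stack[-1][-1][2])
        elif tok.type == tokenize.DEDENT:
            stack.pop()                           # back out one level
    return root


if __name__ == "__main__":
    print(parse_tree(SOURCE))
```

Because tokenize only splits the text into tokens and never checks Python's grammar, the ':' arrives as an ordinary OP token and can be ignored or validated as needed.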
Jul 18 '05 #2

Try YAML

http://yaml.org/

Here is an example YAML file from their site. Loading it takes two lines:

    import yaml
    data = yaml.safe_load(text)   # text holds the document below

Et voilà, it's done.

Read the site, it's really worth a look. The format is great for data
serialization and, unlike XML, very human-readable.

invoice: 34843
date   : 2001-01-23
bill-to: &id001
    given  : Chris
    family : Dumars
    address:
        lines: |
            458 Walkman Dr.
            Suite #292
        city    : Royal Oak
        state   : MI
        postal  : 48046
ship-to: *id001
product:
    - sku         : BL394D
      quantity    : 4
      description : Basketball
      price       : 450.00
    - sku         : BL4438H
      quantity    : 1
      description : Super Hoop
      price       : 2392.00
tax  : 251.42
total: 4443.52
comments: >
    Late afternoon is best.
    Backup contact is Nancy
    Billsmer @ 338-4338.
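
For the object tree in the question, a YAML version might look roughly like the sketch below. It assumes the third-party PyYAML package is installed and uses its yaml.safe_load function; the key names type, arg and children, the DOC sample, and the walk helper are hypothetical choices made for illustration, not anything YAML prescribes.

```python
import yaml  # third-party PyYAML package (assumed to be installed)

# Hypothetical YAML rendering of the Ob1/Ob2/Ob3 tree from the question.
DOC = """\
- type: Ob1
  arg: A
  children:
    - type: Ob2
      arg: B
      children:
        - {type: Ob3, arg: D}
        - {type: Ob3, arg: E}
    - type: Ob2
      arg: C
"""

# safe_load builds plain lists and dicts without running arbitrary constructors.
tree = yaml.safe_load(DOC)


def walk(nodes, depth=0):
    """Yield (depth, type, arg) for every node, parents before children."""
    for node in nodes:
        yield depth, node["type"], node["arg"]
        yield from walk(node.get("children", []), depth + 1)


if __name__ == "__main__":
    for depth, kind, arg in walk(tree):
        print("    " * depth + f"{kind} ({arg!r})")
```

The trade-off is that the nesting comes for free from the YAML loader, but the file is more verbose than the keyword-and-colon syntax asked about.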

Jul 18 '05 #3
