Parsing C Preprocessor files

Hi there,

What could I use to parse CPP macros in Python?
I tried the Parnassus Vaults, and python lib docs, but could not
find a suitable module.

Thanks,

Bram
--
------------------------------------------------------------------------------
Bram Stolk, VR Engineer.
SARA Academic Computing Services Amsterdam, PO Box 94613, 1090 GP AMSTERDAM
email: br**@nospam.sara.nl Phone +31-20-5923059 Fax +31-20-6683167

"Software is math. Math is not patentable."
OR
"Software is literature. Literature is not patentable." -- slashdot comment
------------------------------------------------------------------------------
Jul 18 '05 #1
Bram Stolk wrote:
> What could I use to parse CPP macros in Python?
> I tried the Parnassus Vaults, and python lib docs, but could not
> find a suitable module.

Does it really need to be in Python? There are probably
dozens of free and adequate macro preprocessors out there
already.

(You might also want to clarify what you mean by "parse"
in this case... do you mean actually running the whole
preprocessor over an input file and expanding all macros,
or do you mean something else?)

-Peter
Jul 18 '05 #2
On Wed, 23 Jun 2004 08:32:08 -0400
Peter Hansen <pe***@engcorp.com> wrote:
> Bram Stolk wrote:
>> What could I use to parse CPP macros in Python?
>> I tried the Parnassus Vaults, and python lib docs, but could not
>> find a suitable module.
>
> Does it really need to be in Python? There are probably
> dozens of free and adequate macro preprocessors out there
> already.

I want to trigger Python actions for certain nodes or states in the
parse tree. I want to traverse this tree and be able to take
intelligent actions. For this, I want to use Python.

> (You might also want to clarify what you mean by "parse"
> in this case... do you mean actually running the whole
> preprocessor over an input file and expanding all macros,
> or do you mean something else?)

Roughly speaking, I want to be able to identify sections that are
guarded with #ifdef FOO.
Because conditionals can be nested, you would have to count the
#ifs/#endifs; additionally, the conditional values may depend on other
preprocessor commands, e.g. values may have been defined in included
files.

If I can traverse the #if/#endif tree in Python, a preprocessor file
becomes much more manageable.
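
A minimal sketch of the counting approach described above. It assumes that only #if/#ifdef/#ifndef open a block, that #endif closes it, and that directives are written without a space after the '#'. The name guarded_ranges is made up for illustration; included files and macro values are ignored.

```python
def guarded_ranges(lines, macro):
    """Return (start, end) line numbers (1-based, inclusive) of every block
    opened by '#ifdef <macro>'. Nesting is handled with a stack of the
    blocks that are currently open."""
    open_blocks = []   # one (start_line, is_match) entry per open #if
    ranges = []
    for number, line in enumerate(lines, 1):
        words = line.split()
        if not words:
            continue
        if words[0] in ("#if", "#ifdef", "#ifndef"):
            is_match = words[0] == "#ifdef" and words[1:2] == [macro]
            open_blocks.append((number, is_match))
        elif words[0] == "#endif":
            if not open_blocks:
                raise ValueError("unbalanced #endif on line %d" % number)
            start, is_match = open_blocks.pop()
            if is_match:
                ranges.append((start, number))
    if open_blocks:
        raise ValueError("missing #endif for line %d" % open_blocks[-1][0])
    return ranges


source = """#ifdef FOO
a
#ifdef BAR
b
#endif
#endif
#ifdef FOO
c
#endif""".splitlines()

print(guarded_ranges(source, "FOO"))
```

The stack is what makes the count robust: an #endif always closes the most recently opened conditional, so an inner #ifdef BAR cannot end the outer FOO section early.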

Bram
Jul 18 '05 #3
"Bram Stolk" <br**@nospam.sara.nl> wrote in message
news:20*********************@pistache.sara.nl...
> Hi there,
>
> What could I use to parse CPP macros in Python?
> I tried the Parnassus Vaults, and python lib docs, but could not
> find a suitable module.
>
> Thanks,
>
> Bram

Try pyparsing, at http://pyparsing.sourceforge.net . The examples include a
file scanExamples.py, that does some simple C macro parsing. This should be
pretty straightforward to adapt to matching #ifdef's and #endif's.

-- Paul
(I'm sure pyparsing is listed in Vaults of Parnassus. Why did you think it
would not be applicable?)
Jul 18 '05 #4
Bram Stolk wrote:
> Hi there,
>
> What could I use to parse CPP macros in Python?
> I tried the Parnassus Vaults, and python lib docs, but could not
> find a suitable module.

I wrote a program called SeeGramWrap. It uses Java and ANTLR to parse
C files. See

http://members.tripod.com/~edcjones/...4.03.03.tar.gz
Jul 18 '05 #5
On Wed, 23 Jun 2004 13:58:04 GMT
"Paul McGuire" <pt***@austin.rr._bogus_.com> wrote:
> (I'm sure pyparsing is listed in Vaults of Parnassus. Why did you think it
> would not be applicable?)

Because I searched for "parser", "macro", "preprocessor", "cpp", and none
of those searches came up with "pyparsing". I should have searched for
"parsing", I guess.

Bram
Jul 18 '05 #6
pyHi(),

I would like to thank the people who responded to my question about
preprocessor parsing. However, I think I will just roll my own, as I
found out that it takes a mere 16 lines of code to create an #ifdef tree.

I simply used a combination of lists and tuples. A tuple denotes a #if
block (startline,body,endline). A body is a list of lines/tuples.

This will parse the following text:

Top level line
#if foo
on foo level
#if bar
on bar level
#endif
#endif
#ifdef bla
on bla level
#ifdef q
q
#endif
#if r
r
#endif
#endif

into:

['Top level line\n', ('#if foo\n', ['on foo level\n', ('#if bar\n', ['on bar level\n'], '#endif\n')], '#endif\n'), ('#ifdef bla\n', ['on bla level\n', ('#ifdef q\n', ['q\n'], '#endif\n'), ('#if r\n', ['r\n'], '#endif\n')], '#endif\n')]

Which is very suitable for me.

Code is:

    def parse_block(lines):
        retval = []
        while lines:
            line = lines.pop(0)
            if line.find("#if") != -1:
                # start of a conditional: parse its body recursively
                headline = line
                b = parse_block(lines)
                endline = lines.pop(0)
                retval.append((headline, b, endline))
            else:
                if line.find("#endif") != -1:
                    # push the #endif back so the caller can pick it up
                    lines.insert(0, line)
                    return retval
                else:
                    retval.append(line)
        return retval

And pretty printing with indentation is easy:

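    import sys

    def traverse_block(block, indent):
        # Note: pop(0) consumes the tree while printing it.
        while block:
            i = block.pop(0)
            if isinstance(i, tuple):
                # the lines still carry their "\n", so write them as-is
                sys.stdout.write(indent*"\t" + i[0])
                traverse_block(i[1], indent+1)
                sys.stdout.write(indent*"\t" + i[2])
            else:
                sys.stdout.write(indent*"\t" + i)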
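
Since the goal is to trigger Python actions on nodes of the tree, a walker that leaves the tree intact may be handier than one that pops items off. The following is only a sketch, assuming the (headline, body, endline) tuple layout described above; the generator name walk is made up for illustration.

```python
def walk(block, depth=0):
    """Yield (depth, line) for every line in a parse_block-style tree.
    Nothing is popped, so the tree can be walked as often as needed."""
    for item in block:
        if isinstance(item, tuple):
            headline, body, endline = item
            yield depth, headline
            yield from walk(body, depth + 1)
            yield depth, endline
        else:
            yield depth, item


tree = ['Top level line\n',
        ('#if foo\n',
         ['on foo level\n', ('#if bar\n', ['on bar level\n'], '#endif\n')],
         '#endif\n')]

# Example action: collect every condition together with its nesting depth.
conditions = [(d, l.strip()) for d, l in walk(tree) if l.startswith("#if")]
print(conditions)
```

Any per-node action (printing, filtering, collecting conditions) can then be written as a plain loop over walk(tree).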

I think extending it with '#else' is trivial. Handling includes and
expressions is much harder of course, but not immediately required for me.
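
One possible shape for the '#else' extension, as a sketch rather than a finished design: it assumes each block has at most one #else, that the input is balanced, and it widens the tuple to (headline, body, elseline, elsebody, endline), a layout chosen here purely for illustration.

```python
def parse_block(lines):
    """Parse a list of lines into a nested list. An #if block becomes a tuple
    (headline, body, elseline, elsebody, endline); elseline and elsebody are
    None when the block has no #else. The input list is consumed."""
    retval = []
    while lines:
        line = lines.pop(0)
        word = line.split()[0] if line.strip() else ""
        if word in ("#if", "#ifdef", "#ifndef"):
            body = parse_block(lines)
            elseline = elsebody = None
            # parse_block stops in front of the #else or #endif it ran into
            if lines and lines[0].split()[0] == "#else":
                elseline = lines.pop(0)
                elsebody = parse_block(lines)
            endline = lines.pop(0)   # IndexError here means a missing #endif
            retval.append((line, body, elseline, elsebody, endline))
        elif word in ("#else", "#endif"):
            lines.insert(0, line)    # hand the terminator back to the caller
            return retval
        else:
            retval.append(line)
    return retval


src = ["#ifdef q\n", "q\n", "#else\n", "not q\n", "#endif\n", "tail\n"]
tree = parse_block(list(src))
print(tree)
```

Keeping the tuple length fixed (with None for a missing #else) means code that unpacks a block does not have to check how many fields it got.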

Bram

Jul 18 '05 #7


Nice and simple algorithm, but you should use an iterator to iterate over
your lines, or else shifting your big array of lines with pop() is gonna
be very slow.

Instead of :
line = lines.pop(0)


Try :

lines = iter( some line array )

Or just pass the file handle ; python will split the lines for you.

You can replace your "while lines" with a "for" on this iterator. You'll
need to avoid pushing data in the array (think about it)...

Also, "#if" in line is prettier than line.find("#if") != -1.

Another way to do it is without recursion : have an array which is your
stack, advance one level when you get a #if, go back one level at #endif ;
no more recursion.
Have fun !
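
A sketch of the iterator idea applied to the recursive parser from the previous post. Since nothing can be pushed back into an iterator, this version returns the closing #endif to the caller instead; that return convention is an assumption made for this example, not something from the original code.

```python
def parse_block(lines):
    """Parse from an iterable of lines. Returns (block, endline), where
    endline is the '#endif' that closed the block, or None at end of input."""
    lines = iter(lines)   # an iterator is returned unchanged, so position is kept
    block = []
    for line in lines:
        if "#endif" in line:
            return block, line                   # no push-back needed
        if "#if" in line:
            body, endline = parse_block(lines)   # keeps reading the same iterator
            block.append((line, body, endline))
        else:
            block.append(line)
    return block, None


text = """Top level line
#if foo
on foo level
#endif
""".splitlines(True)

tree, leftover = parse_block(text)
print(tree)
```

Each line is read exactly once, so the cost no longer grows with every pop(0) on a long list, and an open file can be passed in place of the list.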
Jul 18 '05 #8

I thought about it and...

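Here's a version without recursion (it keeps its own explicit stack), with
#include and #else handling. 20 minutes in the making...
You'll need a pen and paper to figure out how the stack works though :) but
it's fun.
It relies on references: the lists on the stack are the same objects that
sit inside the result.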
file1 = """Top level line
#if foo
on foo level
#if bar
on bar level
#endif
re foo level
#include file2
#else
not foo
#endif
top level
#ifdef bla
on bla level
#ifdef q
q
#else
not q
#endif
check
#if r
r
#endif
#endif"""

file2 = """included file:
#ifdef stuff
stuff level
#endif
"""

    # simple class to process included files
    class myreader(object):
        def __init__(self):
            self.queue = []  # iterators to be played; the last one is read first

        def __iter__(self):
            return self

        # insert an iterable into the current flow
        def insert(self, iterator):
            self.queue.append(iterator)

        def __next__(self):
            while self.queue:
                try:
                    return next(self.queue[-1])
                except StopIteration:
                    self.queue.pop()  # this iterable is finished, throw it away
            raise StopIteration

        next = __next__  # Python 2 spelling of the same method

    reader = myreader()
    reader.insert(iter(file1.split("\n")))

    # parser without recursion: 'stack' holds the bodies that are still open
    result = []
    stack = [result]
    stacktop = stack[-1]

    for line in reader:
        ls = line.strip()
        if ls.startswith("#"):  # factor all # cases for speed
            keyword = ls.split(None, 1)[0]  # None splits on any whitespace
            if keyword in ("#if", "#ifdef", "#ifndef"):
                body = []
                stacktop.append([line, body])
                stack.append(body)
                stacktop = body
            elif keyword == "#else":
                stack.pop()
                stack[-1][-1].append(line)
                body = []
                stack[-1][-1].append(body)
                stack.append(body)
                stacktop = body
            elif keyword == "#endif":
                stack.pop()
                stacktop = stack[-1]  # go back to the enclosing body
                stacktop[-1] = tuple(stacktop[-1] + [line])
            elif keyword == "#include":
                # I don't parse the filename... replace the iter() below by
                # something like open(filename)
                reader.insert(iter(file2.split("\n")))
            else:
                stacktop.append(line)  # keep other directives as plain lines
        else:
            stacktop.append(line)

    def printblock(block, indent=0):
        ind = "\t" * indent
        for elem in block:
            if isinstance(elem, list):
                # a body: one level deeper
                printblock(elem, indent + 1)
            elif isinstance(elem, tuple):
                # a whole #if ... #endif block: same level as its parent
                printblock(elem, indent)
            else:
                print(ind + elem)

    print(result)
    printblock(result)
Jul 18 '05 #9
