Choosing the right parser for parsing C headers

Hi,

I need to parse a subset of C (a header file), and generate some unit
tests for the functions listed in it. I thus need to parse the code,
then rewrite function calls with wrong parameters. What I call "shaking
the broken tree" :)
I chose to make my UT-generator in Python 2.4. However, I am now
encountering problems in choosing the right parser for the job. I
struggle in choosing between the inappropriate, the out-of-date, the
alpha, or the too-big-for-the task...
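
To make the goal concrete, here is a minimal sketch of the "parse prototypes, then emit calls with wrong parameters" idea, using only the standard library. It is not a real C parser: the regular expression is an assumption that only covers simple one-line prototypes (no macros, function pointers, or arrays). The names `parse_prototypes`, `bad_calls`, and the `valid_N` placeholders are hypothetical, and it is written for current Python 3 rather than 2.4.

```python
import re

# Assumption: prototypes look like "int add(int a, int b);" on one line.
PROTO_RE = re.compile(
    r"^\s*([\w\s\*]+?)\s*(\w+)\s*\(([^()]*)\)\s*;", re.MULTILINE
)


def parse_prototypes(header_text):
    """Return a list of (return_type, name, [param_decl, ...]) tuples."""
    protos = []
    for ret, name, params in PROTO_RE.findall(header_text):
        params = params.strip()
        if params in ("", "void"):
            decls = []
        else:
            decls = [p.strip() for p in params.split(",")]
        protos.append((ret.strip(), name, decls))
    return protos


def bad_calls(name, param_decls):
    """Yield C call statements in which exactly one argument is suspicious.

    valid_0, valid_1, ... are hypothetical placeholders for known-good
    arguments that the generated test file would have to define.
    """
    good = ["valid_%d" % i for i in range(len(param_decls))]
    for i, decl in enumerate(param_decls):
        args = list(good)
        # Pointers get NULL; everything else gets an extreme integer.
        args[i] = "NULL" if "*" in decl else "INT_MAX"
        yield "%s(%s);" % (name, ", ".join(args))


if __name__ == "__main__":
    header = "int add(int a, int b);\nvoid log_msg(const char *msg);\n"
    for ret, name, decls in parse_prototypes(header):
        for call in bad_calls(name, decls):
            print(call)
```

A regex like this breaks quickly on real headers, which is why a proper parser is needed for anything beyond a toy subset.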

So far I've identified 9(!) potential candidates (mostly taken from
the http://www.python.org/moin/LanguageParsing page):

- Plex:
Only a lexical analyser as far as I understand. Kinda RE++, no syntax
processing
- ply:
Lex / Yacc for python! Tackle the Beast! Syntax processing looks
complex..
- Pyggy:
Lex / Yacc -styled too. More recent, but will a 0.4 version be good
enough?
- PyLR:
fast parser with core functions in C... hasn't moved since '97
- Pyparsing:
quick and easy parser... but I don't think it does more than lexical
analysis
- spark:
Here's some wood. Now build your house.
- yapps2 :
yapps2+ (I hesitate to call it yapps3):
chosen by http://www.python.org/sigs/parser-si...-standard.html.
Is the choice up-to-date?
But will it do for parsing C?
- TPG (Toy Parser Generator):
looks cool
- ANTLR (latest version from Jan 28 produces Python code) :
Seems powerful and has a lot of support, but I don't want to have to
use an exterior Java tool. Furthermore, does it let me control what
happens at each stage easily, or does it just make me a compiler?

I've omitted these: shlex, kwparsing (webpage?), PyBison, Trap
(webpage?), DParser, and SimpleParse (I don't want the extra
dependency).

I was hoping for a quick and easy choice, but got caught in the tar pit
of Too Much Information. Parsing is a large and complex field. As an
added handicap, I'm new to the dark minefield of parsers... I've had
some experience with Lex/Yacc, and have some knowledge of parser
theory, through a course on compilers. I am thus used to EBNF-style
grammar.
I was disappointed to see that Parser-SIG has died out.
Would you have any ideas on which parser is best suited for the task?

John

Jul 18 '05 #1
"Jean de Largentaye" <jl*********@gm ail.com> writes:
Hi,

I need to parse a subset of C (a header file), and generate some unit
tests for the functions listed in it. I thus need to parse the code,
then rewrite function calls with wrong parameters. What I call "shaking
the broken tree" :)


IMO, for parsing 'real-world' C header files, nothing can beat gccxml.

Thomas
Jul 18 '05 #2
Jean de Largentaye wrote:
I need to parse a subset of C (a header file), and generate some unit
tests for the functions listed in it. I thus need to parse the code,
then rewrite function calls with wrong parameters. What I call "shaking
the broken tree" :)

I chose to make my UT-generator in Python 2.4. However, I am now
encountering problems in choosing the right parser for the job. I
struggle in choosing between the inappropriate, the out-of-date, the
alpha, or the too-big-for-the task...


why not use a real compiler?

http://www.boost.org/libs/python/pyste/
http://www.gccxml.org/HTML/Index.html

</F>

Jul 18 '05 #3
Hello Jean,
- ply:
Lex / Yacc for python! Tackle the Beast! Syntax processing looks

mini_c is a C compiler written using ply. You can just use it as is.
http://people.cs.uchicago.edu/~varmaa/mini_c/

HTH.
--
------------------------------------------------------------------------
Miki Tebeka <mi*********@zo ran.com>
http://tebeka.bizhat.com
The only difference between children and adults is the price of the toys
Jul 18 '05 #4
Thomas Heller wrote:
IMO, for parsing 'real-world' C header files, nothing can beat gccxml.


no free tool, at least. if a budget is involved, I'd recommend checking
out the Edison Design Group stuff.

</F>

Jul 18 '05 #5
GCC-XML looks like a very interesting alternative, as Python includes
tools to parse XML.
The mini-C compiler looks like a step in the right direction for me.
I'm going to look into that.
I'm not comfortable with C++ yet, and am not sure how I'd use Pyste.
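
As a rough illustration of the GCC-XML route, the sketch below reads function signatures out of an XML dump with the standard library's `xml.etree.ElementTree`. The embedded sample is hand-written to imitate GCC-XML output. Its element and attribute names (`Function`, `Argument`, `FundamentalType`, `PointerType`, `returns`, `type`) are an assumption to check against what the tool really emits, and the helper names are hypothetical. It targets current Python 3.

```python
import xml.etree.ElementTree as ET

# Hand-written sample imitating GCC-XML output (an assumption, not real output).
SAMPLE_XML = """<GCC_XML>
  <FundamentalType id="_1" name="int"/>
  <FundamentalType id="_2" name="char"/>
  <PointerType id="_3" type="_2"/>
  <Function id="_4" name="add" returns="_1">
    <Argument name="a" type="_1"/>
    <Argument name="b" type="_1"/>
  </Function>
  <Function id="_5" name="length" returns="_1">
    <Argument name="s" type="_3"/>
  </Function>
</GCC_XML>"""


def type_name(nodes_by_id, type_id):
    """Resolve a type id to a readable C type string."""
    node = nodes_by_id[type_id]
    if node.tag == "PointerType":
        # Follow the pointer to the type it points at.
        return type_name(nodes_by_id, node.get("type")) + " *"
    return node.get("name")


def list_functions(xml_text):
    """Return [(function_name, return_type, [(arg_name, arg_type), ...])]."""
    root = ET.fromstring(xml_text)
    nodes_by_id = {n.get("id"): n for n in root if n.get("id")}
    functions = []
    for fn in root.findall("Function"):
        args = [
            (arg.get("name"), type_name(nodes_by_id, arg.get("type")))
            for arg in fn.findall("Argument")
        ]
        ret = type_name(nodes_by_id, fn.get("returns"))
        functions.append((fn.get("name"), ret, args))
    return functions


if __name__ == "__main__":
    for name, ret, args in list_functions(SAMPLE_XML):
        print(ret, name, args)
```

The appeal of this approach is that the compiler front end does the hard parsing, and the Python side only walks a flat list of elements linked by ids.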

Thanks for the information guys, you've been quite helpful!

John

Jul 18 '05 #6
Jean de Largentaye wrote:
GCC-XML looks like a very interesting alternative, as Python includes
tools to parse XML.
The mini-C compiler looks like a step in the right direction for me.
I'm going to look into that.
I'm not comfortable with C++ yet, and am not sure how I'd use Pyste.


to clarify, Pyste is a Python tool that uses GCCXML to generate bindings; it might
not be something that you can use out of the box for your project, but it's definitely
something you should study, and perhaps borrow implementation ideas from.

</F>

Jul 18 '05 #7
try http://sourceforge.net/projects/pygccxml
There are a few examples and nice (for me) documentation.

Roman

On Tue, 8 Feb 2005 13:35:57 +0100, Fredrik Lundh <fr*****@python ware.com> wrote:
Jean de Largentaye wrote:
GCC-XML looks like a very interesting alternative, as Python includes
tools to parse XML.
The mini-C compiler looks like a step in the right direction for me.
I'm going to look into that.
I'm not comfortable with C++ yet, and am not sure how I'd use Pyste.


to clarify, Pyste is a Python tool that uses GCCXML to generate bindings; it might
not be something that you can use out of the box for your project, but it's definitely
something you should study, and perhaps borrow implementation ideas from.

</F>

Jul 18 '05 #8
That looks cool, Roman. However, I'm behind a corporate firewall; is
there any chance you could send me a CVS snapshot?

John

Jul 18 '05 #9
Jean de Largentaye wrote:
Hi,

I need to parse a subset of C (a header file), and generate some unit
tests for the functions listed in it. I thus need to parse the code,
then rewrite function calls with wrong parameters. What I call "shaking
the broken tree" :)
I chose to make my UT-generator in Python 2.4. However, I am now
encountering problems in choosing the right parser for the job. I
struggle in choosing between the inappropriate, the out-of-date, the
alpha, or the too-big-for-the task...


Why not see if the output from a tags file generator such as ctags or
etags will do what you want?

I often find that simpler tools do 95% of the work, and it is easier
to treat the other five percent as broken input.
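
A tags-file approach could look roughly like the sketch below, which reads tab-separated tag lines and collects the kind and signature fields. The sample text is hand-written in the style of an Exuberant Ctags file. As far as I recall, prototypes and signatures are enabled with options along the lines of `--c-kinds=+p --fields=+S`, but treat that and the exact field layout as assumptions to verify against your ctags version. `parse_tags` is a hypothetical helper name.

```python
# Hand-written sample in the style of an extended-format tags file.
SAMPLE_TAGS = (
    "!_TAG_FILE_FORMAT\t2\t/extended format/\n"
    'add\tmath.h\t/^int add(int a, int b);$/;"\tp\tsignature:(int a, int b)\n'
    'length\tstr.h\t/^int length(const char *s);$/;"\tp\t'
    "signature:(const char *s)\n"
)


def parse_tags(text):
    """Return one dict per tag line: name, file, kind, plus extension fields."""
    entries = []
    for line in text.splitlines():
        if not line or line.startswith("!_TAG_"):
            continue  # skip blank lines and pseudo-tag header lines
        # Extension fields follow the ;" marker that ends the search pattern.
        head, marker, ext = line.partition(';"\t')
        if not marker:
            continue  # plain-format line without extension fields
        name, filename, _pattern = head.split("\t", 2)
        entry = {"name": name, "file": filename}
        for item in ext.split("\t"):
            key, colon, value = item.partition(":")
            if colon:
                entry[key] = value
            else:
                entry["kind"] = item  # a bare field is the short kind letter
        entries.append(entry)
    return entries


if __name__ == "__main__":
    for entry in parse_tags(SAMPLE_TAGS):
        print(entry["name"], entry.get("signature"))
```

Lines that do not fit the expected shape are simply skipped, which matches the idea of treating the remaining few percent as broken input.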

try http://ctags.sourceforge.net/
- Paddy.
Jul 18 '05 #10
