473,397 Members | 2,033 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,397 software developers and data experts.

How to organize the data structure in a parser?

Hi, all

I am trying to design a parser for C program using C++. Currently what
I did for syntax tree is to design a class for each nontermials in the
grammar, and use inherentance to link them. For example, as for the
expression part, the classes are something like the follows:

.....
class MultiplicativeExpr : public AdditiveExpr
{
MultiplicativeExpr * multi_expr;
int op_type;
PmExpr * pm_expr;
};

class AdditiveExpr : public ShiftExpr
{
int op_type;
AdditiveExpr * additive_expr;
MultiplicativeExpr * multi_expr;
};

class ShiftExpr : public RelationalExpr
{
int op_type;
ShiftExpr * shift_expr;
AdditiveExpr * additive_expr;
};

class RelationalExpr : public EqualityExpr
{
int op_type;
RelationalExpr * relational_expr;
ShiftExpr * shift_expr;
};

.....

I noticed that the bad thing about this solution is that using
inhereitance, the children in the leaf of inheritance tree will be of
large size because it comprised of all data fields of its ancestors.
How can I organize the data structure more efficienty?
Thanks a lot!

Andy

Jul 22 '05 #1
8 2419

Andy wrote:
Hi, all

I am trying to design a parser for C program using C++. Currently what I did for syntax tree is to design a class for each nontermials in the grammar, and use inherentance to link them. For example, as for the
expression part, the classes are something like the follows:

....
class MultiplicativeExpr : public AdditiveExpr
{
MultiplicativeExpr * multi_expr;
int op_type;
PmExpr * pm_expr;
};

class AdditiveExpr : public ShiftExpr
{
int op_type;
AdditiveExpr * additive_expr;
MultiplicativeExpr * multi_expr;
};

[snip]


You're going about this the wrong way. Why is 'MultiplicativeExpr' an
'AdditiveExpr'. I guess in your grammar you have something like this,
but it is not the same when designing C++ classes. You express it like
this in your grammar in order to enforce *operator precedence*, not to
enforce *relationships* (which is what OO inheritance expresses). A
more OO approach might be something like:

class Expr {
virtual double evaluate() = 0;
}

class AddExpr : public Expr {
Expr *left;
Expr *right;
virtual double evaluate() { return left->evaluate() +
right.evaluate(); }
}

class MultiplyExpr : public Expr {
Expr *left;
Expr *right;
virtual double evaluate() { return left->evaluate() *
right.evaluate(); }
}

.... and so on ...

Here, an 'AddExpr' IS AN Expr. Likewise a 'MultiplyExpr' IS AN Expr.
However, a 'MultiplyExpr' is NOT an 'AddExpr'.

-shez-

Jul 22 '05 #2
Yes, your method is one way to solve the problem. But it will
complicate the parsing process if we design the data structure in this
way. I think a hierarchical way for expressions and declarations will
be more suitable for parser, especially we want to do it in
recursive-descent way.

For example, the following gramar, can you find a proper way to deal
with it?

postfix-expression:
primary-expression
postfix-expression [ expression ]
postfix-expression ( expression-list_opt )
simple-type-specfier ( expression-list_opt )
typename ::_opt nested-name-specifier identifier (
expresion-list_opt )
postfix-expression . template_opt id-expression
postfix-expresion ->template_opt id-expression
.....
dynamic_cast <type-id> ( expression )
static_cast <type-id> ( expression )

Thanks!

Andy

You're going about this the wrong way. Why is 'MultiplicativeExpr' an 'AdditiveExpr'. I guess in your grammar you have something like this, but it is not the same when designing C++ classes. You express it like this in your grammar in order to enforce *operator precedence*, not to enforce *relationships* (which is what OO inheritance expresses). A
more OO approach might be something like:

class Expr {
virtual double evaluate() = 0;
}

class AddExpr : public Expr {
Expr *left;
Expr *right;
virtual double evaluate() { return left->evaluate() +
right.evaluate(); }
}

class MultiplyExpr : public Expr {
Expr *left;
Expr *right;
virtual double evaluate() { return left->evaluate() *
right.evaluate(); }
}

... and so on ...

Here, an 'AddExpr' IS AN Expr. Likewise a 'MultiplyExpr' IS AN Expr.
However, a 'MultiplyExpr' is NOT an 'AddExpr'.

-shez-


Jul 22 '05 #3
Andy wrote:
Hi, all

I am trying to design a parser for C program using C++. Currently what
I did for syntax tree is to design a class for each nontermials in the
grammar, and use inherentance to link them. For example, as for the
expression part, the classes are something like the follows:


Have you looked at the boost::spirit parser framework at www.boost.org?

Jeff
Jul 22 '05 #4

Andy wrote:
Yes, your method is one way to solve the problem. But it will
complicate the parsing process if we design the data structure in this way. I think a hierarchical way for expressions and declarations will
be more suitable for parser, especially we want to do it in
recursive-descent way.

For example, the following gramar, can you find a proper way to deal
with it?

postfix-expression:
primary-expression
postfix-expression [ expression ]
postfix-expression ( expression-list_opt )
simple-type-specfier ( expression-list_opt )
typename ::_opt nested-name-specifier identifier (
expresion-list_opt )
postfix-expression . template_opt id-expression
postfix-expresion ->template_opt id-expression
....
dynamic_cast <type-id> ( expression )
static_cast <type-id> ( expression )

Thanks!

Andy


You can have a class for each of these expressions, all of which derive
from the Expr class. Even if you write a recursive-descent parser,
your data structures can be independent from your grammar. This lets
you generate code that is not tied too closely with the syntax of the
source language. Try to make each expression a class with a virtual
function that can be used to evaluate the expression. This seems like
the easiest & most flexible approach to me.

-shez-

Jul 22 '05 #5

shez wrote:
Andy wrote:
Hi, all

I am trying to design a parser for C program using C++. Currently what
I did for syntax tree is to design a class for each nontermials in

the
grammar, and use inherentance to link them. For example, as for the
expression part, the classes are something like the follows:

....
class MultiplicativeExpr : public AdditiveExpr
{
MultiplicativeExpr * multi_expr;
int op_type;
PmExpr * pm_expr;
};

class AdditiveExpr : public ShiftExpr
{
int op_type;
AdditiveExpr * additive_expr;
MultiplicativeExpr * multi_expr;
};

[snip]


You're going about this the wrong way. Why is 'MultiplicativeExpr'

an 'AdditiveExpr'. I guess in your grammar you have something like this, but it is not the same when designing C++ classes. You express it like this in your grammar in order to enforce *operator precedence*, not to enforce *relationships* (which is what OO inheritance expresses). A
more OO approach might be something like:

class Expr {
virtual double evaluate() = 0;
}

class AddExpr : public Expr {
Expr *left;
Expr *right;
virtual double evaluate() { return left->evaluate() +
right.evaluate(); }
}

class MultiplyExpr : public Expr {
Expr *left;
Expr *right;
virtual double evaluate() { return left->evaluate() *
right.evaluate(); }
}

... and so on ...

Here, an 'AddExpr' IS AN Expr. Likewise a 'MultiplyExpr' IS AN Expr.
However, a 'MultiplyExpr' is NOT an 'AddExpr'.

-shez-

Shezan is that you -- from Columbia????

Jul 22 '05 #6
how can we deal with the following grammar:
.....
multiplicative-expr : pm-expr
| multiplicative-expr * pm-expr
| multiplicative-expr / pm-expr
| multiplicative-expr % pm-expr

additive-expr : multiplicative-expr
| additive-expr + multiplicative-expr
| additive-expr - multiplicative-expr

shift-expr : additive-expr
| shift-expr << additive-expr
| shift-expr >> additive-expr
......

If it happend to be
shift-expr ==> additive-expr ==> multiplicative-expr => pm-expr =>
.....
The syntax tree will be unnecessary high. I was thinking that with
inheritance, we can avoid the unnecessary heigth in sytax tree. If we
design a class for each nontermial and use Expr base class and virtual
functions, can we avoid it?

Thanks!

Andy

Jul 22 '05 #7

Andy wrote:
how can we deal with the following grammar:
....
multiplicative-expr : pm-expr
| multiplicative-expr * pm-expr
| multiplicative-expr / pm-expr
| multiplicative-expr % pm-expr

additive-expr : multiplicative-expr
| additive-expr + multiplicative-expr
| additive-expr - multiplicative-expr

shift-expr : additive-expr
| shift-expr << additive-expr
| shift-expr >> additive-expr
.....

If it happend to be
shift-expr ==> additive-expr ==> multiplicative-expr => pm-expr =>
....
The syntax tree will be unnecessary high. I was thinking that with
inheritance, we can avoid the unnecessary heigth in sytax tree. If we
design a class for each nontermial and use Expr base class and virtual functions, can we avoid it?

Thanks!

Andy


To me, a class for each _operation_ seems most useful.

-shez-

Jul 22 '05 #8
yes, you are right. The strucutre of the syntax tree can be independent
of the grammar.

Thanks!

andy

Jul 22 '05 #9

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

0
by: Fabian Kr?ger | last post by:
Hello, I got a weird problem and need your help and ideas... I´ve written an php application which imports data in XML format and writes this data to a MySQL database to have a faster access....
10
by: TokiDoki | last post by:
Hello there, I have been programming python for a little while, now. But as I am beginning to do more complex stuff, I am running into small organization problems. It is possible that what I...
20
by: Luke Matuszewski | last post by:
Welcome As suggested i looked into JSON project and was amazed but... What about cyclical data structures - anybody was faced it in some project ? Is there any satisactional recomendation... ...
0
by: sonu | last post by:
I have following client side code which i have used in my asp.net project SummaryFeatured Resources from the IBM Business Values Solution Center WHITEPAPER : CRM Done Right Improve the...
5
by: Kent Boogaart | last post by:
Hi, I have some hierarchical data (FAQs) that I would like to bind to. The basic structure is: FAQ Category + Categories + FAQs So an FAQ category has any number of sub-categories and any...
9
by: Alex Buell | last post by:
I have a small text file which consist of the following data: ]] And the code I've written is as follows: ]] The trouble is, I can't work out why it goes into an infinite loop reading the...
3
by: aurora | last post by:
This is an entry I just added to ASPN. It is a somewhat novel technique I have employed quite successfully in my code. I repost it here for more explosure and discussions. ...
0
by: Christoph Haas | last post by:
Hi, list... I have written an application in Perl some time ago (I was young and needed the money) that parses multiple large text files containing nested data structures and allows the user to...
12
by: Jchick | last post by:
Boy, this should be a simple bit of code but I can't figure out how to make it happen. I have a CSV file shows up in a directory that has 4 fields that need to be printed on labels. Each line of...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.