associativity and precedence

junky_fellow

Hi,

I have a very basic doubt about associativity and precedence of
operators. I am not a computer science person and may find it quite
weird.

Consider an expression

4+5*2

As per the tutorials that I read earlier, there could be two
different parse trees possible for this.

+ and *
/ \ / \
4 * + 3
/ \ / \
5 2 4 5

So, to avoid such ambiguities, the language specify the associativity
and precedence rules so that the compiler always generate a unique
parse tree for an expression.

But, what I feel that the lexical analyzer always starts reading the
tokens from left. If this is the case it will always generate the
second parse tree and the expression is equivalent to
(4+5)*2.

and there will be no ambiguity. If we start from left we will always
have only one parse tree and there is no need for any associativity and
precedence.

Consider other example,
4+5+8
The articles say that it may be treated as (4+5)+8 Or 4+(5+8) and to
avoid this associativity rule is required that says "+" operator is
left associative.
I feel that if the lexical analyzer starts from left it should always
be equivalent to (4+5)+8 and therefore no associativity rule is
required.

I know there is something terribly wrong in my understanding. So,
please let me know what is that I am missing ? Any pointers to some
good links or books that would be useful for a beginner like me.

thanks a lot for any help ....

Oct 13 '06 #1

Subscribe Post Reply

2514

Tak-Shing Chan

On Fri, 13 Oct 2006, ju**********@yahoo.co.in wrote:

Hi,

I have a very basic doubt about associativity and precedence of
operators. I am not a computer science person and may find it quite
weird.

Consider an expression

4+5*2

As per the tutorials that I read earlier, there could be two
different parse trees possible for this.

+ and *
/ \ / \
4 * + 3
/ \ / \
5 2 4 5

So, to avoid such ambiguities, the language specify the associativity
and precedence rules so that the compiler always generate a unique
parse tree for an expression.

But, what I feel that the lexical analyzer always starts reading the
tokens from left. If this is the case it will always generate the
second parse tree and the expression is equivalent to
(4+5)*2.

and there will be no ambiguity. If we start from left we will always
have only one parse tree and there is no need for any associativity and
precedence.

Consider other example,
4+5+8
The articles say that it may be treated as (4+5)+8 Or 4+(5+8) and to
avoid this associativity rule is required that says "+" operator is
left associative.
I feel that if the lexical analyzer starts from left it should always
be equivalent to (4+5)+8 and therefore no associativity rule is
required.

I know there is something terribly wrong in my understanding. So,
please let me know what is that I am missing ? Any pointers to some
good links or books that would be useful for a beginner like me.

thanks a lot for any help ....

The C standard resolves this issue by specifying an
unambiguous grammar for * and +:

multiplicative-expression:
cast-expression
multiplicative-expression * cast-expression
multiplicative-expression / cast-expression
multiplicative-expression % cast-expression

additive-expression:
multiplicative-expression
additive-expression + multiplicative-expression
additive-expression - multiplicative-expression

Given this, there is only one way to parse 4+5*2, which is
your parse tree on the left.

Tak-Shing

Oct 13 '06 #2

Chris Dollin

ju**********@yahoo.co.in wrote:

Hi,

I have a very basic doubt about associativity and precedence of
operators. I am not a computer science person and may find it quite
weird.

This isn't really topical for comp.lang.c, but since precedence and
associativity (not to mention their half-brother, evaluation order)
seem to cause problems to C beginners, I'll give it a go.

Consider an expression

4+5*2

As per the tutorials that I read earlier,

(which ones?)

there could be two different parse trees possible for this.

No parse tree without a grammar. What the parse tree is depends
on the grammar that's specified. For an expression grammar,
likely productions are

exp ::= number | exp + exp | exp * exp ...

This grammar is ambiguous, as you say, since:

+ and *
/ \ / \
4 * + 3
/ \ / \
5 2 4 5

would both be valid parse trees.

So, to avoid such ambiguities, the language specify the associativity
and precedence rules so that the compiler always generate a unique
parse tree for an expression.

No. To avoid such ambiguities, an unambiguous grammar is used. /One
way/ to disambiguate is to apply precedence and associativity rules.
Another is to write the grammar so that they're unnecessary:

exp ::= additive
additive ::= additive | additive + multiplicative
multiplicative ::= operand | operand * multiplicative
operand ::= integer

Here we can say "+ associates to the left", because we can't
parse "1 + 2 + 3" as "1 + additive" because the right operand
of `+` has to be multiplicative and hence (in this grammar) can't
include a `+` operator.

Similarly we can say that `* has precedence over +` because
"1 + 2 * 3" can't be parsed as "operand * multiplicative" because
`operand` can't include `+` in this grammar.

But these "associative" and "precedence" rules are just shorthands
for talking about the grammar. The C grammar of the (either) Standard
/does not use them/; it has explicit productions in the grammar
instead.

It is, I think, a matter of opinion/style (and parser-generation tool)
which is the "better" way of describing how operators and operands
group in a language. [I lean toward defining it with a full grammar
but ensuring that the shorthand terms work clearly with it.]

But, what I feel that the lexical analyzer always starts reading the
tokens from left. If this is the case it will always generate the
second parse tree and the expression is equivalent to
(4+5)*2.

Well ... no.

First, a piece of terminology: it's not the lexical analyser that parses
the expression, it's the parser. The parser eats the lexical tokens
the lexical analyser provides.

and there will be no ambiguity. If we start from left we will always
have only one parse tree and there is no need for any associativity and
precedence.

If the parser starts from the left and eats as few tokens at a time as
possible, it has /imposed/ (left) associativity and (no) precedence on
the grammar.

Here's an expression parser. (Incomplete pseudo-code, alas, for brevity.)

parseExpression():
var L = parseOperand()
if next token is operator then
var op = parseOperator()
var R = parseOperand()
return infix( L, op, R )
else
return L
endif

This does /right/-association, without precedence. If you find an
operator, you have to /decide/ whether you want to eat the
minimum (left association) or maximum (right association) or
some intermediate (precedence) size of operand.

Consider other example,
4+5+8
The articles say that it may be treated as (4+5)+8 Or 4+(5+8) and to
avoid this associativity rule is required that says "+" operator is
left associative.

You have to disambiguate /somehow/, yes: you have to answer the
question "what should this mean?", or, at least, "what structure
should this have?"

I feel that if the lexical analyzer starts from left it should always
be equivalent to (4+5)+8 and therefore no associativity rule is
required.

The choice doesn't go away if you go left-to-right, but it's not so
easy to /see/ that it's a choice.

I made a mistake like this once [1]. I was designing and implementing
a (Pure-)functional programming language, in which I had the usual
expressions (leaving out a whole host of details):

Exp ::=
let Dec in Exp [Dec ::= Id = Exp, if it matters]
| Exp Exp
| Id

Writing the parser is dead easy:

parseExp()
var F = parseOperand()
while current token can start an Exp do
F := apply( F, parseExp() )
endwhile
return F

parseOperand()
if current token is `let` then
skip token, var D = parseDec()
skip `in`, var E = parseExp()
return let( D, E )
else
return parseId()
endif

Brill, no problem, this ensures that the Exp of a let-expression is as
long as possible, as tradition required.

What I /hadn't/ noticed was that this grammar is ambiguous, because
not only does

let D in F X

have two parses /according to the grammar/, viz

let D in (F X)
(let D in F) X

but there's a whole other issue of

F let D in X Y

which I hadn't even noticed. It took exposure to an LL(1) parser
generator to show me that although my /code/ was one-token-lookahead,
no backtracking, potter along left-to-right, my /grammar/ was not LL(1)
and Decisions Had To Be Made.

I made them by requiring that `let`s had to appear as "full expressions",
ie not embedded in other expressions unless bracketed. Since this covered
about 99.17254% of the actual code written, it was not a problem. (I
paid off the other 0.82746% of the code with a promise of a quick
assignment with the garbage collector.)
[1] OK, once /that I noticed/.

--
Chris "Essen -6 and counting" Dollin
A rock is not a fact. A rock is a rock.

Oct 13 '06 #3

Dave Thompson

On Fri, 13 Oct 2006 11:00:23 +0100, Chris Dollin <ch**********@hp.com>
wrote:

No. To avoid such ambiguities, an unambiguous grammar is used. /One
way/ to disambiguate is to apply precedence and associativity rules.
Another is to write the grammar so that they're unnecessary:

If the parser starts from the left and eats as few tokens at a time as
possible, it has /imposed/ (left) associativity and (no) precedence on
the grammar.

'eats' here is informal and ambiguous(!) to me. If you mean the parser
always matches fewest and most-left tokens to a rule (aka production)
RHS (right hand side), concur; in the conventional LR terminology this
is choosing reduce over shift.

Here's an expression parser. (Incomplete pseudo-code, alas, for brevity.)

parseExpression():
var L = parseOperand()
if next token is operator then
var op = parseOperator()
var R = parseOperand()
return infix( L, op, R )
else
return L
endif

This does /right/-association, without precedence. If you find an
operator, you have to /decide/ whether you want to eat the
minimum (left association) or maximum (right association) or
some intermediate (precedence) size of operand.

Assuming parseOperand() never 'eats' an (unnested) operator token,
that is either stops at or backs up over such, which is the only way
this works, this code doesn't associate at all. You need to make the
var R = parseExpression() to do right association in this technique,
which is conventionally called recursive descent. If you want left
association with the usual unidirectional stream model of input in RD,
you need backtracking. (Which is possible, but not easy to show in a
pedagogical example.)

- David.Thompson1 at worldnet.att.net

Nov 6 '06 #4

Chris Dollin

Dave Thompson wrote:

On Fri, 13 Oct 2006 11:00:23 +0100, Chris Dollin <ch**********@hp.com>
wrote:

>If the parser starts from the left and eats as few tokens at a time as
possible, it has /imposed/ (left) associativity and (no) precedence on
the grammar.

'eats' here is informal and ambiguous(!) to me.

It's meant to be informal. If that ambiguated it, that's a shame.

If you mean the parser
always matches fewest and most-left tokens to a rule (aka production)
RHS (right hand side), concur; in the conventional LR terminology this
is choosing reduce over shift.

No, I didn't mean that, although it's a possible formalisation.

>
>Here's an expression parser. (Incomplete pseudo-code, alas, for brevity.)

parseExpression():
var L = parseOperand()
if next token is operator then
var op = parseOperator()
var R = parseOperand()
return infix( L, op, R )
else
return L
endif

This does /right/-association, without precedence.

GRR GRR GRR re-reading this I see I messed up completely. That's
what comes of writing code on the Friday afternoon before
taking a week's holiday (and possibly missing corrective
posts). Apologies to anyone who was mislead by the kidneys.

As Dave observed:

You need to make the
var R = parseExpression() to do right association in this technique

Yes.

which is conventionally called recursive descent.

Yes.

If you want left
association with the usual unidirectional stream model of input in RD,
you need backtracking. (Which is possible, but not easy to show in a
pedagogical example.)

You don't need backtracking:

parseExpression():
var L = parseOperand()
while next token is operator then
var op = parseOperator()
var R = parseOperand()
L := infix( L, op, R )
endwhile
return L

Unless, of course, the dingoes have been at my brain again.

--
Chris "unhashedup hashed up hashing" Dollin
"Never ask that question!" Ambassador Kosh, /Babylon 5/

Nov 6 '06 #5

raxitsheth2000

ju**********@yahoo.co.in wrote:

Hi,

I have a very basic doubt about associativity and precedence of
operators. I am not a computer science person and may find it quite
weird.

Consider an expression

4+5*2

As per the tutorials that I read earlier, there could be two
different parse trees possible for this.

+ and *
/ \ / \
4 * + 3
/ \ / \
5 2 4 5

So, to avoid such ambiguities, the language specify the associativity
and precedence rules so that the compiler always generate a unique
parse tree for an expression.

>
But, what I feel that the lexical analyzer always starts reading the
tokens from left. If this is the case it will always generate the
second parse tree and the expression is equivalent to
(4+5)*2.

Reading and Evaluating are Not Same.
Lexical Ana. just separate out token (ideally) and mostly left to
right.
Parser ask LA to give next token

>
and there will be no ambiguity. If we start from left we will always
have only one parse tree and there is no need for any associativity and
precedence.

I think you are telling like 4+5*8 should be evaluated as (4+5)*8 by
ignoring the precedence.
as a language designer it is not good.

continuing with above rule what if someone want to do evaluation like
4+(5*8) one need to enter the expression like 5*8+4. In General if
precedence is not explicitly defined by language, then PROGRAMMER has
to manually enter the expression in left-to-right (or language defined)
order and in language defined precedence rule.

Homework, how would u write manually the expression so the equivalent
evaluation is same as of 2+(3*5)+(4*6) <--there are really no bracket
in expression.

3*5+4*6+2<----This is not Correct it is doing
i think this will help you to make clear need of operator precedence in
good language.

>
Consider other example,
4+5+8
The articles say that it may be treated as (4+5)+8 Or 4+(5+8) and to
avoid this associativity rule is required that says "+" operator is
left associative.
I feel that if the lexical analyzer starts from left it should always
be equivalent to (4+5)+8 and therefore no associativity rule is
required.

but precedence required so that wrong evaluation not carried out.

>
I know there is something terribly wrong in my understanding. So,
please let me know what is that I am missing ? Any pointers to some
good links or books that would be useful for a beginner like me.

>If we start from left we will always
have only one parse tree and there is no need for any associativity and
precedence.

if compiler alwas start from left (evaluation ) then programmer is in
trouble
Think

--raxit

Nov 6 '06 #6

Similar topics

associativity

by: Mantorok Redgormor | last post by:

Which section(s) in the standard describe associativity rules? e.g., *ptr++ * and ++ are the same precedence wise and as such will be evaluated left-to-right? --

C / C++

question regarding precedence and associativity of ++ and *

by: maadhuu | last post by:

hi, i am a bit confused as to how *i++ (i is a pointer) works.....the postfix + has a higher precedence than prefix ++ and i think precedence is from left to right whereas ,prefix ++ and * have...

C / C++

Associativity of ++ and --

by: Chad | last post by:

The following question stems from p.132 in the book "The C Programming Language", second edition by K & R. The have struct { int len; char *str; } *p;

C / C++

( ) -> operators associativity confused

by: bochengnever | last post by:

( ) and -are left to right in the same order . eg: struct foo { int a ; void * p; } main() { struct foo* A= malloc(sizeof(struct foo));

C / C++

Logical operator precedence & associativity

by: marko | last post by:

/* code start */ int a = 0; /* expected evaluation and excution order with precedence in mind /* False(3) , True(1), False(2) */ if ( (a=1) == 0 || 0 != 1 && (a =2) == 1) putchar('T');...

C / C++

Associativity of unary C Operators

by: dspfun | last post by:

I'm trying to get a good understanding of how unary operators work and have some questions about the following test snippets. int *p; ~!&*++p--; It doesn't compile, why? The problem seems to be...

C / C++

associativity question

by: fdmfdmfdm | last post by:

Associativity in C takes two forms: left to right and right to left. I think the K&R book lacks something... For example, *p++, since the associativity is from right to left, do this expression...

C / C++

associativity of operators

by: subramanian100in | last post by:

What does "associativity of operators" mean ? I am unable to undersatand this from K & R 2nd edition. Kindly explain with an example. Thanks

C / C++

2**2**2**2**2 wrong? Bug?

by: Jim Langston | last post by:

In Python 2.5 on intel, the statement 2**2**2**2**2 evaluates to 20035299304068464649790723515602557504478254755697514192650169737108940595563114...

Python

Wordpress or something else?

by: Faith0G | last post by:

I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...

Content Management Systems

Access Europe: Command bars, the Access Shortcut Tool and a simple Audit Log - Wed 3 April

by: isladogs | last post by:

The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...

General

One-click Importing Excel Data into a*Database

by: ryjfgjl | last post by:

In our work, we often need to import Excel data into databases (such as MySQL, SQL Server, Oracle) for data analysis and processing. Usually, we use database tools like Navicat or the Excel import...

Microsoft Excel

Basic Javascript concepts

by: aa123db | last post by:

Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...

Javascript

Merging data from multiple Excel files

by: ryjfgjl | last post by:

In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...

Data Management

Migrating Website to Cloud - Emmanuel Katto

by: emmanuelkatto | last post by:

Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel

General

Navigating the Data Structures and Algorithms (DSA)

by: BarryA | last post by:

What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...

Algorithms / Advanced Math

Is that possible of reading the .csv file in column wise and the column have different lengths ?

by: Sonnysonu | last post by:

This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

C / C++

How to build RAID in BIOS?

by: Hystou | last post by:

There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

Computer Hardware