
Bottom-Up Syntax Analysis

Mooly Sagiv

http://www.cs.tau.ac.il/~msagiv/courses/wcc11-12.html

Textbook: Modern Compiler Design

Chapter 2.2.5 (modified)


Efficient Parsers

  • Pushdown automata
  • Deterministic
  • Report an error as soon as the input is not a prefix of a valid program
  • Not usable for all context-free grammars

[Diagram: a context-free grammar is given to CUP, which either reports "ambiguity errors" or generates a parser; the parser reads tokens and produces a parse tree.]


Kinds of Parsers

  • Top-Down (Predictive Parsing) LL
    • Construct parse tree in a top-down manner
    • Find the leftmost derivation
    • For every non-terminal and token, predict the next production
  • Bottom-Up LR
    • Construct parse tree in a bottom-up manner
    • Find the rightmost derivation in reverse order
    • For every potential right-hand side and token, decide when a production is found


Bottom-Up Syntax Analysis

  • Input
    • A context-free grammar
    • A stream of tokens
  • Output
    • A syntax tree or error
  • Method
    • Construct parse tree in a bottom-up manner
    • Find the rightmost derivation (in reverse order)
    • For every potential right-hand side and token, decide when a production is found
    • Report an error as soon as the input is not a prefix of a valid program


Plan

  • Pushdown automata
  • Bottom-up parsing (informal)
  • Non-deterministic bottom-up parsing
  • Deterministic bottom-up parsing
  • Interesting non-LR grammars


Pushdown Automaton

[Diagram: a control unit, driven by the parser table, reads the input (u t w $) and manipulates a stack (V ... $).]


Informal Example

Grammar: S → E$   E → T | E + T   T → i | ( E )

Each row shows the configuration before the action is applied. The stack is written with its top on the left; the partial parse trees drawn on the slides are omitted.

Step  Stack   Input     Action
(1)   $       i + i $   shift
(2)   i$      + i $     reduce T → i
(3)   T$      + i $     reduce E → T
(4)   E$      + i $     shift
(5)   +E$     i $       shift
(6)   i+E$    $         reduce T → i
(7)   T+E$    $         reduce E → E+T
(8)   E$      $         shift
(9)   $E$               reduce S → E$


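Informal Example

Reductions used, listed from the root of the parse tree down; the parser performs them in the opposite order:

  • reduce S → E$
  • reduce E → E+T
  • reduce T → i
  • reduce E → T
  • reduce T → i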
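As a worked illustration (not part of the original slides), applying the five productions in the order listed gives the rightmost derivation of i + i $. The bottom-up parser finds exactly these steps, but from the last line to the first:

```latex
\begin{align*}
S &\Rightarrow E\,\$       && (S \to E\,\$) \\
  &\Rightarrow E + T\,\$   && (E \to E + T) \\
  &\Rightarrow E + i\,\$   && (T \to i) \\
  &\Rightarrow T + i\,\$   && (E \to T) \\
  &\Rightarrow i + i\,\$   && (T \to i)
\end{align*}
```

In every step the rightmost nonterminal is the one that is rewritten, which is why the reversed sequence matches the left-to-right order in which the parser reads the input.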


The Problem

  • Deciding between shift and reduce

Informal Example (7)

S → E$   E → T | E+T   T → i | ( E )

Stack   Input   Action           Resulting stack
T+E$    $       reduce E → E+T   E$

Informal Example (7')

S → E$   E → T | E+T   T → i | ( E )

Stack   Input   Action           Resulting stack
T+E$    $       reduce E → T     E+E$

Both reductions are possible in the same configuration. The second one is a dead end, because E+E is not the right-hand side of any production.


Bottom-Up LR(0) Items

[Diagram: an LR(0) item is a production with a dot (·) in its right-hand side. The symbols before the dot have already been matched against the input; the symbols after the dot still need to be matched.]


S → E$
E → T
E → E + T
T → i
T → ( E )


Formal Example

Grammar: S → E$   E → T | E+T   T → i | ( E )

The stack holds numbered LR(0) items and is written with its top on the left:

1: S → ·E$     2: S → E·$     3: S → E$·
4: E → ·T      5: E → T·
6: E → ·E+T    7: E → E·+T    8: E → E+·T    9: E → E+T·
10: T → ·i     11: T → i·

Each row shows the configuration before the action is applied.

Step  Stack             Input     Action
(1)   1                 i + i $   ε-move 6
(2)   6 1               i + i $   ε-move 4
(3)   4 6 1             i + i $   ε-move 10
(4)   10 4 6 1          i + i $   shift 11
(5)   11 10 4 6 1       + i $     reduce T → i
(6)   5 4 6 1           + i $     reduce E → T
(7)   7 6 1             + i $     shift 8
(8)   8 7 6 1           i $       ε-move 10
(9)   10 8 7 6 1        i $       shift 11
(10)  11 10 8 7 6 1     $         reduce T → i
(11)  9 8 7 6 1         $         reduce E → E+T
(12)  2 1               $         shift 3
(13)  3 2 1                       reduce S → E$


But how can this be done efficiently?

Deterministic Pushdown Automaton


Handles

  • Identify the leftmost node (nonterminal) that has not yet been constructed but all of whose children have been constructed

[Diagram: a partially built parse tree over the input t1 t2 t4 t5 t6 t7 t8, with candidate nodes numbered 1, 2, 3.]


Identifying Handles

  • Create a deterministic finite-state automaton over grammar symbols
    • Sets of LR(0) items
    • Accepting states identify handles
  • Use automaton to build parser tables
    • reduce: for items A → α· on every token
    • shift: for items A → α·tβ on token t
  • When conflicts occur the grammar is not LR(0)
  • When no conflicts occur use a DPDA which pushes states on the stack
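The construction outlined above can be sketched in Java. This is a minimal illustration under stated assumptions, not the course's implementation: the class name `Lr0Automaton`, the rule syntax ("E -> E + T" with space-separated symbols), and the encoding of an item as `production * 16 + dot` are all hypothetical choices made for brevity. The first rule is taken to be the start production.

```java
import java.util.*;

public class Lr0Automaton {
    public final List<String[]> prods = new ArrayList<>();      // p[0] is the left-hand side, the rest the right-hand side
    public final Set<String> nonterminals = new HashSet<>();
    public final List<Set<Integer>> states = new ArrayList<>(); // each state is a set of encoded items

    public Lr0Automaton(String... rules) {
        for (String r : rules) {
            String[] p = r.replace("->", " ").trim().split("\\s+");
            prods.add(p);
            nonterminals.add(p[0]);
        }
        build();
    }

    // Hypothetical encoding: assumes every right-hand side has fewer than 16 symbols.
    static int item(int prod, int dot) { return prod * 16 + dot; }

    // Symbol right after the dot, or null for a complete item A -> alpha .
    String next(int it) {
        String[] p = prods.get(it / 16);
        int dot = it % 16;
        return dot + 1 < p.length ? p[dot + 1] : null;
    }

    // Add B -> . gamma for every item that has the dot in front of a nonterminal B.
    Set<Integer> closure(Set<Integer> kernel) {
        Set<Integer> result = new TreeSet<>(kernel);
        Deque<Integer> work = new ArrayDeque<>(kernel);
        while (!work.isEmpty()) {
            String x = next(work.pop());
            if (x == null || !nonterminals.contains(x)) continue;
            for (int p = 0; p < prods.size(); p++)
                if (prods.get(p)[0].equals(x) && result.add(item(p, 0))) work.push(item(p, 0));
        }
        return result;
    }

    // Move the dot over symbol x in every item that allows it, then close the result.
    Set<Integer> goTo(Set<Integer> state, String x) {
        Set<Integer> kernel = new TreeSet<>();
        for (int it : state) if (x.equals(next(it))) kernel.add(it + 1);
        return closure(kernel);
    }

    void build() {
        Set<Integer> start = new TreeSet<>();
        start.add(item(0, 0));
        states.add(closure(start));
        for (int i = 0; i < states.size(); i++) {
            Set<String> symbols = new TreeSet<>();
            for (int it : states.get(i)) if (next(it) != null) symbols.add(next(it));
            for (String x : symbols) {
                Set<Integer> target = goTo(states.get(i), x);
                if (!states.contains(target)) states.add(target);
            }
        }
    }

    // A state conflicts if a complete item coexists with a shift item
    // (shift/reduce) or with another complete item (reduce/reduce).
    public List<Integer> conflictStates() {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < states.size(); i++) {
            int complete = 0, shifts = 0;
            for (int it : states.get(i)) {
                String x = next(it);
                if (x == null) complete++;
                else if (!nonterminals.contains(x)) shifts++;
            }
            if (complete > 1 || (complete == 1 && shifts > 0)) bad.add(i);
        }
        return bad;
    }

    public static void main(String[] args) {
        Lr0Automaton a = new Lr0Automaton("S -> E $", "E -> T", "E -> E + T", "T -> i", "T -> ( E )");
        System.out.println(a.states.size() + " states, conflicts in " + a.conflictStates());
        Lr0Automaton b = new Lr0Automaton("S -> E $", "E -> E + E", "E -> i");
        System.out.println(b.states.size() + " states, conflicts in " + b.conflictStates());
    }
}
```

For the expression grammar used throughout these slides the sketch finds 10 states and no conflict, so that grammar is LR(0). For the ambiguous grammar S → E$, E → E+E | i it finds 6 states, one of which holds a complete item next to a shift item. The state numbers produced by the sketch depend on its exploration order and need not match the numbering on the slides.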


A Trivial Example

  • S → A B $
  • A → a
  • B → a


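S → E$
E → T
E → E + T
T → i
T → ( E )

LR(0) automaton for this grammar (each state is a set of numbered items):

State 0: 1: S → ·E$   4: E → ·T   6: E → ·E+T   10: T → ·i   12: T → ·(E)
         transitions: E to 1, T to 6, i to 5, ( to 7
State 1: 2: S → E·$   7: E → E·+T
         transitions: $ to 2, + to 3
State 2: 3: S → E$·
State 3: 8: E → E+·T   10: T → ·i   12: T → ·(E)
         transitions: T to 4, i to 5, ( to 7
State 4: 9: E → E+T·
State 5: 11: T → i·
State 6: 5: E → T·
State 7: 13: T → (·E)   4: E → ·T   6: E → ·E+T   10: T → ·i   12: T → ·(E)
         transitions: E to 8, T to 6, i to 5, ( to 7
State 8: 14: T → (E·)   7: E → E·+T
         transitions: ) to 9, + to 3
State 9: 15: T → (E)·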


Example Control Table


Parsing i + i $ with the control table. The stack is written with its top on the left; each entry is a state followed by the symbol it was entered on.

Stack                     Input     Action
0($)                      i + i $   shift 5
5(i) 0($)                 + i $     reduce T → i
6(T) 0($)                 + i $     reduce E → T
1(E) 0($)                 + i $     shift 3
3(+) 1(E) 0($)            i $       shift 5
5(i) 3(+) 1(E) 0($)       $         reduce T → i
4(T) 3(+) 1(E) 0($)       $         reduce E → E+T
1(E) 0($)                 $         shift 2
2($) 1(E) 0($)                      accept


Parsing the erroneous input ( ( i ) $ with the same table:

Stack                           Input      Action
0($)                            ( ( i ) $  shift 7
7(() 0($)                       ( i ) $    shift 7
7(() 7(() 0($)                  i ) $      shift 5
5(i) 7(() 7(() 0($)             ) $        reduce T → i
6(T) 7(() 7(() 0($)             ) $        reduce E → T
8(E) 7(() 7(() 0($)             ) $        shift 9
9()) 8(E) 7(() 7(() 0($)        $          reduce T → (E)
6(T) 7(() 0($)                  $          reduce E → T
8(E) 7(() 0($)                  $          err
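The two traces above can be reproduced with a small table-driven parser. The sketch below is an illustration only: the class name `Lr0Parser` and the string encoding of the table are hypothetical, and since the control table itself did not survive in this transcript, the transitions are an assumption read off the state numbers that appear in the traces. Tokens are single characters written without spaces.

```java
import java.util.*;

public class Lr0Parser {
    // Productions that can be reduced: left-hand side, right-hand-side length, printable text.
    static final String[] LHS  = {"E", "E", "T", "T"};
    static final int[]    LEN  = {1, 3, 1, 3};
    static final String[] TEXT = {"E -> T", "E -> E+T", "T -> i", "T -> (E)"};

    // Shift and goto transitions, keyed by state followed by symbol (assumed from the traces).
    static final Map<String, Integer> NEXT = new HashMap<>();
    // States that hold a complete item, mapped to the production they reduce.
    static final Map<Integer, Integer> REDUCE = new HashMap<>();
    static final int ACCEPT = 2;

    static {
        String[] edges = {"0 i 5", "0 ( 7", "0 T 6", "0 E 1", "1 + 3", "1 $ 2",
                          "3 i 5", "3 ( 7", "3 T 4", "7 i 5", "7 ( 7", "7 T 6",
                          "7 E 8", "8 + 3", "8 ) 9"};
        for (String e : edges) {
            String[] p = e.split(" ");
            NEXT.put(p[0] + p[1], Integer.parseInt(p[2]));
        }
        REDUCE.put(6, 0);
        REDUCE.put(4, 1);
        REDUCE.put(5, 2);
        REDUCE.put(9, 3);
    }

    // Returns the actions taken; the last entry is either "accept" or "err".
    public static List<String> parse(String input) {
        List<String> actions = new ArrayList<>();
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(0);
        int pos = 0;
        while (true) {
            int state = stack.peek();
            if (state == ACCEPT) {
                actions.add("accept");
                return actions;
            }
            Integer prod = REDUCE.get(state);
            if (prod != null) {
                // LR(0): reduce without looking at the next token.
                for (int k = 0; k < LEN[prod]; k++) stack.pop();
                stack.push(NEXT.get(stack.peek() + LHS[prod]));   // goto on the left-hand side
                actions.add("reduce " + TEXT[prod]);
                continue;
            }
            Integer target = pos < input.length() ? NEXT.get(state + "" + input.charAt(pos)) : null;
            if (target == null) {
                actions.add("err");
                return actions;
            }
            stack.push(target);
            pos++;
            actions.add("shift " + target);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("i+i$"));
        System.out.println(parse("((i)$"));
    }
}
```

Only state numbers are kept on the stack; the grammar symbols shown in parentheses in the traces are implied by the state and do not need to be stored.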




Constructing LR(0) parsing table

  • Add a production S' → S$
  • Construct a deterministic finite automaton accepting "valid stack symbols"
  • States are sets of items A → α·β
    • The states of the automaton become the states of the parsing table
    • Determine shift operations
    • Determine goto operations
    • Determine reduce operations


Filling Parsing Table

  • A state si
  • reduce A → α
    • A → α· ∈ si
  • Shift on t
    • A → α·tβ ∈ si
  • Goto(si, X) = sj
    • A → α·Xβ ∈ si
    • δ(si, X) = sj
  • When conflicts occur the grammar is not LR(0)




Example Non LR(0) Grammar

S → E$
E → E+E
E → i


Example Non LR(0) DFA

S → E$   E → E+E | i

State 0: S → ·E$   E → ·E+E   E → ·i        transitions: E to 2, i to 1
State 1: E → i·
State 2: S → E·$   E → E·+E                 transitions: $ to 3, + to 4
State 3: S → E$·
State 4: E → E+·E   E → ·E+E   E → ·i       transitions: E to 5, i to 1
State 5: E → E+E·   E → E·+E                transitions: + to 4

State 5 contains both a complete item (reduce E → E+E) and an item that shifts on +, so the grammar is not LR(0).


Dangling Else

S → if cond s else s
  | if cond s
  | assign


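Non-Ambiguous Non LR(0) Grammar

S → E$
E → E+T | T
T → T*F | F
F → i

Fragment of the LR(0) automaton:

State 0: S → ·E$   E → ·E+T   E → ·T   T → ·T*F   T → ·F   F → ·i      transitions: T to 1
State 1: E → T·   T → T·*F                                              transitions: * to 2
State 2: T → T*·F   F → ·i

State 1 contains both the complete item E → T· (reduce) and the item T → T·*F (shift on *). This is a shift/reduce conflict, so the grammar is not LR(0) although it is unambiguous.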


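Non-Ambiguous SLR(1) Grammar

S → E$
E → E+T | T
T → T*F | F
F → i

Fragment of the LR(0) automaton:

State 0: S → ·E$   E → ·E+T   E → ·T   T → ·T*F   T → ·F   F → ·i      transitions: T to 1
State 1: E → T·   T → T·*F                                              transitions: * to 2
State 2: T → T*·F   F → ·i

In SLR(1) the reduction E → T in state 1 is performed only on tokens that can follow E, namely + and $. The token * is not among them, so on * the parser shifts and the conflict disappears.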
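The SLR(1) decision for this grammar rests on the FOLLOW sets of its nonterminals. The following Java sketch computes them by fixed-point iteration. It is a simplified illustration, not the course's code: the class name `FollowSets` is hypothetical, and the computation assumes that no production is empty, which holds for this grammar but not in general.

```java
import java.util.*;

public class FollowSets {
    // Each production is written as {left-hand side, right-hand-side symbols...}.
    static final String[][] PRODS = {
        {"S", "E", "$"}, {"E", "E", "+", "T"}, {"E", "T"},
        {"T", "T", "*", "F"}, {"T", "F"}, {"F", "i"}
    };

    static boolean isNonterminal(String s) {
        for (String[] p : PRODS) if (p[0].equals(s)) return true;
        return false;
    }

    // FIRST sets; looking only at the first right-hand-side symbol is valid
    // because this grammar has no empty productions (an assumption of the sketch).
    static Map<String, Set<String>> first() {
        Map<String, Set<String>> first = new TreeMap<>();
        for (String[] p : PRODS) first.put(p[0], new TreeSet<String>());
        boolean changed = true;
        while (changed) {
            changed = false;
            for (String[] p : PRODS) {
                Set<String> add = isNonterminal(p[1]) ? first.get(p[1]) : Collections.singleton(p[1]);
                changed |= first.get(p[0]).addAll(new ArrayList<String>(add));
            }
        }
        return first;
    }

    public static Map<String, Set<String>> follow() {
        Map<String, Set<String>> firstSets = first();
        Map<String, Set<String>> follow = new TreeMap<>();
        for (String[] p : PRODS) follow.put(p[0], new TreeSet<String>());
        boolean changed = true;
        while (changed) {
            changed = false;
            for (String[] p : PRODS) {
                for (int k = 1; k < p.length; k++) {
                    if (!isNonterminal(p[k])) continue;
                    Set<String> add;
                    if (k + 1 == p.length) add = follow.get(p[0]);            // A -> alpha B: FOLLOW(A) flows into FOLLOW(B)
                    else if (isNonterminal(p[k + 1])) add = firstSets.get(p[k + 1]);
                    else add = Collections.singleton(p[k + 1]);               // B is followed by a token
                    changed |= follow.get(p[k]).addAll(new ArrayList<String>(add));
                }
            }
        }
        return follow;
    }

    public static void main(String[] args) {
        System.out.println(follow());
    }
}
```

With these sets, FOLLOW(E) contains + and $ but not *, whereas FOLLOW(T) also contains *. An SLR(1) table therefore places the reduction E → T only in the + and $ columns of the conflicting state and the shift in the * column.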


LR(1) Parser

  • LR(1) Items A → α·β, t
    • α is at the top of the stack and we are expecting βt
  • LR(1) State
    • Sets of items
  • LALR(1) State
    • Merge LR(1) states whose items are the same apart from the look-ahead


Grammar Hierarchy

[Diagram: nested grammar classes. LR(0) ⊂ SLR(1) ⊂ LALR(1) ⊂ CLR(1) ⊂ non-ambiguous CFG; LL(1) also lies inside CLR(1) but is not contained in LALR(1).]


Interesting Non LR(1) Grammars

  • Ambiguous
    • Arithmetic expressions
    • Dangling-else
  • Common derived prefix
    • A → B1 a b | B2 a c
    • B1 → ε
    • B2 → ε
  • Optional non-terminals
    • St → OptLab Ass
    • OptLab → id : | ε
    • Ass → id := Exp


A motivating example

  • Create a desk calculator

  • Challenges

    • Non trivial syntax

    • Recursive expressions (semantics)

      • Operator precedence


Solution (lexical analysis)

import java_cup.runtime.*;
%%
%cup
%eofval{
  return new Symbol(sym.EOF);
%eofval}
NUMBER=[0-9]+
%%
"+" { return new Symbol(sym.PLUS); }
"-" { return new Symbol(sym.MINUS); }
"*" { return new Symbol(sym.MULT); }
"/" { return new Symbol(sym.DIV); }
"(" { return new Symbol(sym.LPAREN); }
")" { return new Symbol(sym.RPAREN); }
{NUMBER} { return new Symbol(sym.NUMBER, Integer.valueOf(yytext())); }
\n { /* ignore line breaks */ }
. { /* ignore any other character */ }

  • Parser gets terminals from the Lexer


/* CUP specification for the desk calculator */
terminal Integer NUMBER;
terminal PLUS, MINUS, MULT, DIV;
terminal LPAREN, RPAREN;
terminal UMINUS;
nonterminal Integer expr;

/* Precedence declarations, lowest first */
precedence left PLUS, MINUS;
precedence left DIV, MULT;
precedence left UMINUS;

expr ::= expr:e1 PLUS expr:e2
         {: RESULT = Integer.valueOf(e1.intValue() + e2.intValue()); :}
       | expr:e1 MINUS expr:e2
         {: RESULT = Integer.valueOf(e1.intValue() - e2.intValue()); :}
       | expr:e1 MULT expr:e2
         {: RESULT = Integer.valueOf(e1.intValue() * e2.intValue()); :}
       | expr:e1 DIV expr:e2
         {: RESULT = Integer.valueOf(e1.intValue() / e2.intValue()); :}
       | MINUS expr:e1 %prec UMINUS
         {: RESULT = Integer.valueOf(0 - e1.intValue()); :}
       | LPAREN expr:e1 RPAREN
         {: RESULT = e1; :}
       | NUMBER:n
         {: RESULT = n; :}
       ;


Summary

  • LR is a powerful technique

  • Generates efficient parsers

  • Generation tools exist for LALR(1)

    • Bison, yacc, CUP

  • But some grammars need to be tuned

    • Shift/Reduce conflicts

    • Reduce/Reduce conflicts

    • Efficiency of the generated parser

  • There exist more general methods

    • GLR

    • Arbitrary grammars in O(n^3)

      • Earley parsers

      • CYK algorithms

