Compiler construction in4020 lecture 4


Compiler construction in4020 – lecture 4. Koen Langendoen, Delft University of Technology, The Netherlands.


Presentation Transcript


Compiler construction in4020 – lecture 4

Koen Langendoen

Delft University of Technology

The Netherlands


Summary of lecture 3

[Figure: compiler pipeline. program text → lexical analysis → tokens → syntax analysis → AST → context handling → annotated AST; a parser generator produces the parser from the language grammar]

  • syntax analysis: tokens → AST

  • top-down parsing

    • recursive descent

    • push-down automaton

    • making grammars LL(1)



Quiz

2.31 Can you create a top-down parser for the following grammars?

(a) S → ‘(’ S ‘)’ | ‘)’

(b) S → ‘(’ S ‘)’ | ε

(c) S → ‘(’ S ‘)’ | ‘)’ | ε


Overview

[Figure: compiler pipeline from program text to annotated AST, as in the summary of lecture 3]

  • syntax analysis: tokens → AST

  • bottom-up parsing

    • push-down automaton

    • ACTION/GOTO tables

    • LR(0), SLR(1), LR(1), LALR(1)


Bottom-up (LR) parsing

  • Left-to-right parse, Rightmost derivation (constructed in reverse)

  • create a node when all its children are present

  • handle: the nodes representing the right-hand side of a production

[Figure: partial parse tree for the input aap + ( noot + mies ), with IDENT leaves below term, expression and rest_expression nodes]


LR(0) parsing

  • running example: expression grammar

    input → expression EOF
    expression → expression ‘+’ term | term
    term → IDENTIFIER | ‘(’ expression ‘)’

  • short-hand notation

    Z → E $
    E → E ‘+’ T | T
    T → i | ‘(’ E ‘)’


LR(0) parsing

  • short-hand notation, one production per line

    Z → E $
    E → E ‘+’ T
    E → T
    T → i
    T → ‘(’ E ‘)’


LR(0) parsing

  • keep track of progress inside potential handles when consuming input tokens

  • LR items: N → α • β

  • initial set: Z → • E $

  • ε-closure: expand dots in front of non-terminals

    S0: Z → • E $
        E → • E ‘+’ T
        E → • T
        T → • i
        T → • ‘(’ E ‘)’
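The ε-closure step can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the item encoding (production index, dot position) and the names `GRAMMAR`, `closure` and `show` are assumptions made for this example.

```python
# Grammar in the lecture's short-hand; each production is (lhs, rhs tuple).
GRAMMAR = [
    ("Z", ("E", "$")),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("i",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Return the epsilon-closure of a set of LR(0) items.

    An item is a pair (production index, dot position).
    """
    result = set(items)
    worklist = list(items)
    while worklist:
        prod, dot = worklist.pop()
        rhs = GRAMMAR[prod][1]
        # A dot in front of a non-terminal pulls in all its productions.
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for index, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (index, 0) not in result:
                    result.add((index, 0))
                    worklist.append((index, 0))
    return frozenset(result)


def show(item):
    """Render an item with '.' marking the dot position."""
    prod, dot = item
    lhs, rhs = GRAMMAR[prod]
    return f"{lhs} -> {' '.join(rhs[:dot] + ('.',) + rhs[dot:])}"


S0 = closure({(0, 0)})  # start from the initial item Z -> . E $
for item in sorted(S0):
    print(show(item))
```

Starting from the single initial item, the closure contains one item per production, each with the dot at the front, which is the set S0 shown above.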


LR(0) parsing

    Z → E $
    E → E ‘+’ T
    E → T
    T → i
    T → ‘(’ E ‘)’

  • shift: move the next input token onto the stack and compute the new state

  • reduce: replace the handle on top of the stack by its left-hand side and compute the new state

    stack                    input      action
    S0                       i + i $    shift i
    S0 i S1                  + i $      reduce T → i
    S0 T S2                  + i $      reduce E → T
    S0 E S3                  + i $      shift ‘+’
    S0 E S3 + S4             i $        shift i
    S0 E S3 + S4 i S1        $          reduce T → i
    S0 E S3 + S4 T S5        $          reduce E → E ‘+’ T
    S0 E S3                  $          shift $
    S0 E S3 $ S6                        reduce Z → E $
    S0 Z                                accept!


Transition diagram

    S0: Z → • E $
        E → • E ‘+’ T
        E → • T
        T → • i
        T → • ‘(’ E ‘)’
    S1: T → i •
    S2: E → T •
    S3: Z → E • $
        E → E • ‘+’ T
    S4: E → E ‘+’ • T
        T → • i
        T → • ‘(’ E ‘)’
    S5: E → E ‘+’ T •
    S6: Z → E $ •

    transitions: S0 –i→ S1, S0 –T→ S2, S0 –E→ S3, S3 –‘+’→ S4, S3 –$→ S6, S4 –i→ S1, S4 –T→ S5


Exercise (8 min.)

  • complete the transition diagram for the LR(0) automaton

  • can you think of a single input expression that causes all states to be used? If yes, give an example. If no, explain.


Answers (fig 2.89)

    S0: Z → • E $
        E → • E ‘+’ T
        E → • T
        T → • i
        T → • ‘(’ E ‘)’
    S1: T → i •
    S2: E → T •
    S3: Z → E • $
        E → E • ‘+’ T
    S4: E → E ‘+’ • T
        T → • i
        T → • ‘(’ E ‘)’
    S5: E → E ‘+’ T •
    S6: Z → E $ •
    S7: T → ‘(’ • E ‘)’
        E → • E ‘+’ T
        E → • T
        T → • i
        T → • ‘(’ E ‘)’
    S8: T → ‘(’ E • ‘)’
        E → E • ‘+’ T
    S9: T → ‘(’ E ‘)’ •

    transitions: S0 –i→ S1, S0 –T→ S2, S0 –E→ S3, S0 –‘(’→ S7,
                 S3 –‘+’→ S4, S3 –$→ S6,
                 S4 –i→ S1, S4 –T→ S5, S4 –‘(’→ S7,
                 S7 –i→ S1, S7 –T→ S2, S7 –E→ S8, S7 –‘(’→ S7,
                 S8 –‘+’→ S4, S8 –‘)’→ S9


Answers

The following expression exercises all states

( i ) + i
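The completed diagram can be cross-checked mechanically. The sketch below builds the LR(0) automaton for the expression grammar by repeatedly applying the closure and a goto step. The function names and the item encoding are assumptions of this example, and the state numbers it assigns follow discovery order, so they do not match the S0..S9 labels of the diagram.

```python
GRAMMAR = [
    ("Z", ("E", "$")),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("i",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Epsilon-closure of a set of (production index, dot position) items."""
    result, worklist = set(items), list(items)
    while worklist:
        prod, dot = worklist.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for index, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (index, 0) not in result:
                    result.add((index, 0))
                    worklist.append((index, 0))
    return frozenset(result)


def goto(state, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(p, d + 1) for p, d in state
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)


def build_automaton():
    states = [closure({(0, 0)})]
    transitions = {}
    for number, state in enumerate(states):  # the list grows while we iterate
        symbols = {GRAMMAR[p][1][d] for p, d in state if d < len(GRAMMAR[p][1])}
        for symbol in sorted(symbols):
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
            transitions[number, symbol] = states.index(target)
    return states, transitions


states, transitions = build_automaton()
print(len(states), "states,", len(transitions), "transitions")
# prints "10 states, 15 transitions"
```

The counts agree with the ten states and fifteen arrows of the answer diagram.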


The LR tables

[Table not preserved in this transcript]

LR(0) parsing: concise notation

[Table not preserved in this transcript]


The LR push-down automaton

SWITCH action_table[top of stack]:

    CASE “shift”:

        see book;

    CASE (“reduce”, N → α):

        POP the symbols of α FROM the stack;

        SET state TO top of stack;

        PUSH N ON the stack;

        SET new state TO goto_table[state, N];

        PUSH new state ON the stack;

    CASE empty:

        ERROR;
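The pseudo-code above can be turned into a small runnable driver. The following is a sketch under stated assumptions: the `ACTION` and `GOTO` tables were transcribed by hand for this example from the completed transition diagram of the expression grammar (states S0..S9), the shift case is filled in the obvious way rather than taken from the book, and a missing `GOTO` entry plays the role of the error case.

```python
# rule number -> (left-hand side, length of right-hand side)
RULES = {1: ("Z", 2), 2: ("E", 3), 3: ("E", 1), 4: ("T", 1), 5: ("T", 3)}

# LR(0): the action depends on the state only.
ACTION = {
    0: "shift", 1: ("reduce", 4), 2: ("reduce", 3), 3: "shift", 4: "shift",
    5: ("reduce", 2), 6: ("reduce", 1), 7: "shift", 8: "shift", 9: ("reduce", 5),
}

# Transitions on tokens and on non-terminals, indexed by (state, symbol).
GOTO = {
    (0, "i"): 1, (0, "T"): 2, (0, "E"): 3, (0, "("): 7,
    (3, "+"): 4, (3, "$"): 6,
    (4, "i"): 1, (4, "T"): 5, (4, "("): 7,
    (7, "i"): 1, (7, "T"): 2, (7, "E"): 8, (7, "("): 7,
    (8, "+"): 4, (8, ")"): 9,
}


def parse(tokens):
    """Return the list of rule numbers used, in order of reduction."""
    stack = [0]          # alternating states and symbols, a state on top
    rest = list(tokens)
    reductions = []
    while True:
        state = stack[-1]
        action = ACTION[state]
        if action == "shift":
            if not rest or (state, rest[0]) not in GOTO:
                raise SyntaxError(f"unexpected input in state {state}: {rest}")
            token = rest.pop(0)
            stack += [token, GOTO[state, token]]
        else:
            _, rule = action
            lhs, length = RULES[rule]
            del stack[len(stack) - 2 * length:]   # pop the handle
            reductions.append(rule)
            if lhs == "Z":                         # reduced to the start symbol
                return reductions
            state = stack[-1]
            stack += [lhs, GOTO[state, lhs]]


print(parse("i + i $".split()))       # [4, 3, 4, 2, 1]
```

The printed sequence of reductions is the same as the one in the step-by-step trace for i + i $.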



Break


LR(0) conflicts

  • shift-reduce conflict

    • array indexing: T → i [ E ]

      T → i • [ E ]    (shift)
      T → i •    (reduce)

    • ε-rule: RestExpr → ε

      Expr → Term • RestExpr    (shift)
      RestExpr → •    (reduce)


LR(0) conflicts

  • reduce-reduce conflict

    • assignment statement: Z → V := E $

      V → i •    (reduce)
      T → i •    (reduce)

  • a typical LR(0) table contains many conflicts
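Both kinds of conflict can be detected by inspecting the items of a single state. The sketch below is only an illustration: the item encoding (left-hand side, right-hand side, dot position) and the function name `conflicts` are assumptions of this example. Following the labelling on the slides, an item whose dot is not at the end counts as a shift item.

```python
def conflicts(state):
    """Classify the LR(0) conflicts in one state.

    `state` is a list of items (lhs, rhs tuple, dot position).
    """
    shift_items = [item for item in state if item[2] < len(item[1])]
    reduce_items = [item for item in state if item[2] == len(item[1])]
    found = []
    if shift_items and reduce_items:
        found.append("shift-reduce")
    if len(reduce_items) > 1:
        found.append("reduce-reduce")
    return found


array_indexing = [("T", ("i", "[", "E", "]"), 1), ("T", ("i",), 1)]
epsilon_rule = [("Expr", ("Term", "RestExpr"), 1), ("RestExpr", (), 0)]
assignment = [("V", ("i",), 1), ("T", ("i",), 1)]

print(conflicts(array_indexing))  # ['shift-reduce']
print(conflicts(epsilon_rule))    # ['shift-reduce']
print(conflicts(assignment))      # ['reduce-reduce']
```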


Handling LR(0) conflicts

  • solution: use a one-token look-ahead, giving a two-dimensional ACTION table [state, token]

  • different construction of ACTION table

    • SLR(1) – Simple LR

    • LR(1)

    • LALR(1) – Look-Ahead LR


SLR(1) parsing

  • solves (some) shift-reduce conflicts

  • reduce N → α iff token ∈ FOLLOW(N)

    FOLLOW(T) = { ‘+’, ‘)’, $ }

    FOLLOW(E) = { ‘+’, ‘)’, $ }

    FOLLOW(Z) = { $ }
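The FOLLOW sets listed above can be reproduced with a small fixed-point computation. This is a sketch under two assumptions that hold for the expression grammar but not in general: the grammar has no ε-rules, and FOLLOW of the start symbol Z is seeded with the end marker $ by convention. All names are chosen for this example.

```python
GRAMMAR = [
    ("Z", ("E", "$")),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("i",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_sets():
    """FIRST per non-terminal; assumes there are no epsilon-rules."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            head = rhs[0]
            new = first[head] if head in NONTERMINALS else {head}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


def follow_sets(start="Z", end_marker="$"):
    first = first_sets()
    follow = {n: set() for n in NONTERMINALS}
    follow[start].add(end_marker)          # convention, see lead-in
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for pos, symbol in enumerate(rhs):
                if symbol not in NONTERMINALS:
                    continue
                if pos + 1 < len(rhs):     # something follows inside the rule
                    nxt = rhs[pos + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:                      # at the end: inherit FOLLOW(lhs)
                    new = follow[lhs]
                if not new <= follow[symbol]:
                    follow[symbol] |= new
                    changed = True
    return follow


FOLLOW = follow_sets()
for name in ("T", "E", "Z"):
    print(name, sorted(FOLLOW[name]))
```

In an SLR(1) table, a reduce item N → α • then contributes a reduce entry only in the columns of the tokens in `FOLLOW[N]`.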


SLR(1) ACTION/GOTO table

    1: Z → E $
    2: E → E ‘+’ T
    3: E → T
    4: T → i
    5: T → ‘(’ E ‘)’

    sn: shift to state n
    rn: reduce by rule n

[Table not preserved in this transcript]


SLR(1) ACTION/GOTO table, with array indexing added

    1: Z → E $
    2: E → E ‘+’ T
    3: E → T
    4: T → i
    5: T → ‘(’ E ‘)’
    6: T → i ‘[’ E ‘]’

    sn: shift to state n
    rn: reduce by rule n

[Table not preserved in this transcript]




Unfortunately ...

  • SLR(1) leaves many shift-reduce conflicts unsolved

  • problem: the FOLLOW(N) set is the union over all contexts in which N may occur

  • example

    S → A | x b
    A → a A b | x


LR(0) automaton

    1: S → A
    2: S → x b
    3: A → a A b
    4: A → x

    S0: S → • A
        S → • x b
        A → • a A b
        A → • x
    S1: S → x • b
        A → x •
    S2: S → x b •
    S3: S → A •
    S4: A → a • A b
        A → • a A b
        A → • x
    S5: A → x •
    S6: A → a A • b
    S7: A → a A b •

    transitions: S0 –A→ S3, S0 –x→ S1, S0 –a→ S4, S1 –b→ S2,
                 S4 –a→ S4, S4 –x→ S5, S4 –A→ S6, S6 –b→ S7


Exercise (6 min.)

  • derive the SLR(1) ACTION/GOTO table (with shift-reduce conflict) for the following grammar:

    S → A | x b
    A → a A b | x


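Answers

    1: S → A
    2: S → x b
    3: A → a A b
    4: A → x

    FOLLOW(S) = {$}
    FOLLOW(A) = {$, b}

[Table not preserved in this transcript]

The shift-reduce conflict is in state S1 on look-ahead b: S → x • b asks for a shift, and A → x • asks for a reduce because b ∈ FOLLOW(A).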


LR(1) parsing

  • maintain a follow set per item

    LR(1) item: N → α • β {σ}

  • ε-closure for LR(1) item sets:

    if set S contains an item P → α • N β {σ} then
    foreach production rule N → γ
    S must contain the item N → • γ {τ}
    where τ = FIRST( β {σ} )
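The LR(1) closure rule can be sketched as follows for the grammar S → A | x b, A → a A b | x. This is an illustration rather than the lecture's code: items carry a single look-ahead token each (a look-ahead set is represented by several items), the grammar is assumed to have no ε-rules, and the names `first_of` and `closure` are made up for this example.

```python
GRAMMAR = [
    ("S", ("A",)),
    ("S", ("x", "b")),
    ("A", ("a", "A", "b")),
    ("A", ("x",)),
]
NONTERMINALS = {"S", "A"}


def first_of(symbols, lookahead):
    """FIRST(symbols {lookahead}); assumes there are no epsilon-rules."""
    if not symbols:
        return {lookahead}
    if symbols[0] not in NONTERMINALS:
        return {symbols[0]}
    result, seen, todo = set(), set(), [symbols[0]]
    while todo:
        name = todo.pop()
        if name in seen:
            continue
        seen.add(name)
        for lhs, rhs in GRAMMAR:
            if lhs == name:
                if rhs[0] in NONTERMINALS:
                    todo.append(rhs[0])
                else:
                    result.add(rhs[0])
    return result


def closure(items):
    """Closure of a set of LR(1) items (production index, dot, look-ahead)."""
    result, worklist = set(items), list(items)
    while worklist:
        prod, dot, look = worklist.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            # tau = FIRST(beta {sigma}) for the item P -> alpha . N beta {sigma}
            for token in first_of(rhs[dot + 1:], look):
                for index, (lhs, _) in enumerate(GRAMMAR):
                    item = (index, 0, token)
                    if lhs == rhs[dot] and item not in result:
                        result.add(item)
                        worklist.append(item)
    return frozenset(result)


S0 = closure({(0, 0, "$"), (1, 0, "$")})
S4 = closure({(2, 1, "$")})      # A -> a . A b {$}
print(sorted(S0))
print(sorted(S4))
```

In `S4` the items added by the closure get look-ahead b, not $, because the dot in A → a • A b is followed by b.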


LR(1) automaton

    S0:  S → • A {$}
         S → • x b {$}
         A → • a A b {$}
         A → • x {$}
    S1:  S → x • b {$}
         A → x • {$}
    S2:  S → x b • {$}
    S3:  S → A • {$}
    S4:  A → a • A b {$}
         A → • a A b {b}
         A → • x {b}
    S5:  A → x • {b}
    S6:  A → a A • b {$}
    S7:  A → a A b • {$}
    S8:  A → a • A b {b}
         A → • a A b {b}
         A → • x {b}
    S9:  A → a A • b {b}
    S10: A → a A b • {b}

    transitions: S0 –A→ S3, S0 –x→ S1, S0 –a→ S4, S1 –b→ S2,
                 S4 –a→ S8, S4 –x→ S5, S4 –A→ S6, S6 –b→ S7,
                 S8 –a→ S8, S8 –x→ S5, S8 –A→ S9, S9 –b→ S10


LALR(1) parsing

  • LR(1) tables are big

  • combine “equal” sets (sets that differ only in their look-ahead sets) by merging the look-ahead sets
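The merging step can be illustrated in isolation. In this sketch a state is a dictionary from LR(0) item (rule number, dot position) to look-ahead set, and states with the same keys are merged by taking the union of the look-ahead sets per item. The encoding and the name `merge_lalr` are assumptions of this example. The input is a hand-written excerpt of the LR(1) automaton for S → A | x b, A → a A b | x, using the lecture's rule numbers 3 (A → a A b) and 4 (A → x).

```python
def merge_lalr(lr1_states):
    """Merge LR(1) states that are equal when look-ahead sets are ignored."""
    merged = {}   # core -> (names of the merged states, item -> look-ahead set)
    for name, state in lr1_states.items():
        core = frozenset(state)            # the LR(0) items only
        names, items = merged.setdefault(core, ([], {}))
        names.append(name)
        for item, lookahead in state.items():
            items.setdefault(item, set()).update(lookahead)
    return {"+".join(names): items for names, items in merged.values()}


LR1_STATES = {
    "S4": {(3, 1): {"$"}, (3, 0): {"b"}, (4, 0): {"b"}},
    "S5": {(4, 1): {"b"}},
    "S6": {(3, 2): {"$"}},
    "S7": {(3, 3): {"$"}},
    "S8": {(3, 1): {"b"}, (3, 0): {"b"}, (4, 0): {"b"}},
    "S9": {(3, 2): {"b"}},
    "S10": {(3, 3): {"b"}},
}

for name, items in merge_lalr(LR1_STATES).items():
    print(name, {item: sorted(look) for item, look in items.items()})
```

Seven LR(1) states collapse into four: S4 with S8, S6 with S9, S7 with S10, while S5 has no partner and stays as it is.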




LALR(1) automaton

    S0: S → • A {$}
        S → • x b {$}
        A → • a A b {$}
        A → • x {$}
    S1: S → x • b {$}
        A → x • {$}
    S2: S → x b • {$}
    S3: S → A • {$}
    S4: A → a • A b {b,$}
        A → • a A b {b}
        A → • x {b}
    S5: A → x • {b}
    S6: A → a A • b {b,$}
    S7: A → a A b • {b,$}

    transitions: S0 –A→ S3, S0 –x→ S1, S0 –a→ S4, S1 –b→ S2,
                 S4 –a→ S4, S4 –x→ S5, S4 –A→ S6, S6 –b→ S7

The LR(1) states S4 and S8, S6 and S9, and S7 and S10 have been merged.


LALR(1) ACTION/GOTO table

    1: S → A
    2: S → x b
    3: A → a A b
    4: A → x

[Table not preserved in this transcript]


Making grammars LR(1) – or not?

  • grammars are often ambiguous

    E → E ‘+’ E | E ‘*’ E | i

  • handle shift-reduce conflicts

    • (default) longest match: shift

    • precedence directives

      input: i * i + i

      E → E • ‘+’ E    (shift)
      E → E ‘*’ E •    (reduce)

[Figure: the two possible parse trees for i * i + i, one with ‘+’ at the root and one with ‘*’ at the root]
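How a precedence directive settles such a conflict can be sketched as a comparison between the operator of the rule that could be reduced and the look-ahead operator. The precedence values, the choice of left-associativity for equal precedence, and the function name `resolve` are assumptions made for this example, in the style of yacc-like parser generators.

```python
# Higher number binds tighter; the concrete values are assumptions.
PRECEDENCE = {"+": 1, "*": 2}


def resolve(rule_operator, lookahead):
    """Choose between reducing E -> E op E and shifting the look-ahead."""
    if PRECEDENCE[rule_operator] > PRECEDENCE[lookahead]:
        return "reduce"
    if PRECEDENCE[rule_operator] < PRECEDENCE[lookahead]:
        return "shift"
    return "reduce"   # equal precedence: treat operators as left-associative


# After reading "i * i" with "+" as look-ahead: finish the multiplication.
print(resolve("*", "+"))   # reduce
# After reading "i + i" with "*" as look-ahead: the multiplication goes first.
print(resolve("+", "*"))   # shift
```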


Summary

  • syntax analysis: tokens → AST

  • bottom-up parsing

    • push-down automaton

    • ACTION/GOTO tables

      • LR(0): NO look-ahead

      • SLR(1): one-token look-ahead, FOLLOW sets to solve shift-reduce conflicts

      • LR(1): SLR(1), but FOLLOW set per item

      • LALR(1): LR(1), but “equal” states are merged


Homework

  • study sections:

    • 1.10 closure algorithm

    • 2.2.5.8 error handling in LR(1) parsers

  • print handout for next week [blackboard]

  • find a partner for the “practicum”

  • register your group

    • send e-mail to [email protected]

