Compiler construction in4020 – lecture 4

Koen Langendoen

Delft University of Technology

The Netherlands


Summary of lecture 3

[Figure: compiler front-end pipeline: program text → lexical analysis → tokens → syntax analysis → AST → context handling → annotated AST; a parser generator derives the syntax analyser from the language grammar.]

  • syntax analysis: tokens → AST

  • top-down parsing

    • recursive descent

    • push-down automaton

    • making grammars LL(1)


Quiz

2.31 Can you create a top-down parser for the following grammars?

(a) S → ‘(’ S ‘)’ | ‘)’

(b) S → ‘(’ S ‘)’ | ε

(c) S → ‘(’ S ‘)’ | ‘)’ | ε
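One way to check an answer to the quiz mechanically is to test each grammar for LL(1) conflicts. The sketch below assumes that "top-down parser" means an LL(1) parser with one token of look-ahead; the names `is_ll1`, `first_of` and `QUIZ` are illustrative and not taken from the lecture.

```python
EPS, EOF = "eps", "$"

def first_of(seq, first):
    """FIRST set of a symbol sequence, given FIRST sets of the non-terminals."""
    out = set()
    for sym in seq:
        f = first.get(sym, {sym})          # a terminal is its own FIRST set
        out |= f - {EPS}
        if EPS not in f:
            return out
    return out | {EPS}                     # every symbol can produce epsilon

def is_ll1(grammar, start):
    """True if no two alternatives of a non-terminal share a look-ahead token."""
    first = {n: set() for n in grammar}
    follow = {n: set() for n in grammar}
    follow[start].add(EOF)
    changed = True
    while changed:                         # fixed-point iteration
        changed = False
        for n, alts in grammar.items():
            for alt in alts:
                f = first_of(alt, first)
                if not f <= first[n]:
                    first[n] |= f
                    changed = True
                for k, sym in enumerate(alt):
                    if sym not in grammar:
                        continue
                    rest = first_of(alt[k + 1:], first)
                    add = (rest - {EPS}) | (follow[n] if EPS in rest else set())
                    if not add <= follow[sym]:
                        follow[sym] |= add
                        changed = True
    for n, alts in grammar.items():
        seen = set()
        for alt in alts:
            f = first_of(alt, first)
            predict = (f - {EPS}) | (follow[n] if EPS in f else set())
            if predict & seen:
                return False               # two alternatives compete for a token
            seen |= predict
    return True

# The three quiz grammars; () is the empty alternative.
QUIZ = {
    "a": {"S": [("(", "S", ")"), (")",)]},
    "b": {"S": [("(", "S", ")"), ()]},
    "c": {"S": [("(", "S", ")"), (")",), ()]},
}
for name, grammar in QUIZ.items():
    print(name, is_ll1(grammar, "S"))
```

Under these assumptions, grammars (a) and (b) pass the check, while in (c) the token ‘)’ predicts both the second alternative and the empty one.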


Overview

[Figure: the compiler front-end pipeline, as in the summary of lecture 3.]

  • syntax analysis: tokens → AST

  • bottom-up parsing

    • push-down automaton

    • ACTION/GOTO tables

    • LR(0), SLR(1), LR(1), LALR(1)


Bottom-up (LR) parsing

  • Left-to-right parse, Rightmost derivation (constructed in reverse)

  • create a node when all children are present

  • handle: nodes representing the right-hand side of a production

[Figure: partially built parse tree over the input aap + ( noot + mies ), with IDENT leaves below term, expression and rest_expression nodes.]


LR(0) parsing

  • running example: expression grammar

    input → expression EOF
    expression → expression ‘+’ term | term
    term → IDENTIFIER | ‘(’ expression ‘)’

  • short-hand notation

    Z → E $
    E → E ‘+’ T | T
    T → i | ‘(’ E ‘)’

  • the same grammar with one production per line

    Z → E $
    E → E ‘+’ T
    E → T
    T → i
    T → ‘(’ E ‘)’


LR(0) parsing

  • keep track of progress inside potential handles when consuming input tokens

  • LR items: N → α • β

  • initial set S0: Z → • E $

  • ε-closure: expand dots in front of non-terminals

    Z → E $
    E → E ‘+’ T
    E → T
    T → i
    T → ‘(’ E ‘)’
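The ε-closure step can be sketched in a few lines. This is a minimal illustration, assuming an item is encoded as a pair (rule index, dot position); the encoding and the helper names `closure` and `show` are choices made here, not part of the lecture.

```python
# Grammar in the short-hand notation: (left-hand side, right-hand side).
GRAMMAR = [
    ("Z", ("E", "$")),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("i",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def closure(items):
    """Expand every dot that stands in front of a non-terminal."""
    result = set(items)
    work = list(items)
    while work:
        rule, dot = work.pop()
        rhs = GRAMMAR[rule][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for k, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (k, 0) not in result:
                    result.add((k, 0))      # new item N -> . gamma
                    work.append((k, 0))
    return frozenset(result)

def show(item):
    """Render an item with '.' marking the dot position."""
    rule, dot = item
    lhs, rhs = GRAMMAR[rule]
    return f"{lhs} -> {' '.join(rhs[:dot] + ('.',) + rhs[dot:])}"

# The initial set holds only Z -> . E $; its closure is state S0.
S0 = closure({(0, 0)})
for item in sorted(S0):
    print(show(item))
```

Starting from the single item Z → • E $, the closure pulls in one dotted item for each production of E and T, which is the content of state S0.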


LR(0) parsing

  • shift input token (i) onto the stack

  • compute new state

    stack: S0
    input: i + i $

  • reduce handle on top of the stack

  • compute new state

    stack: S0 i S1
    input: + i $

  • reduce handle on top of the stack

  • compute new state

    stack: S0 T S2
    input: + i $
    tree: T(i)

  • shift input token on top of the stack

  • compute new state

    stack: S0 E S3
    input: + i $
    tree: E(T(i))

  • shift input token on top of the stack

  • compute new state

    stack: S0 E S3 + S4
    input: i $
    tree: E(T(i))

  • reduce handle on top of the stack

  • compute new state

    stack: S0 E S3 + S4 i S1
    input: $
    tree: E(T(i))

  • reduce handle on top of the stack

  • compute new state

    stack: S0 E S3 + S4 T S5
    input: $
    tree: E(T(i)), T(i)

  • shift input token on top of the stack

  • compute new state

    stack: S0 E S3
    input: $
    tree: E(E(T(i)) + T(i))

  • reduce handle on top of the stack

  • compute new state

    stack: S0 E S3 $ S6
    input: (empty)
    tree: E(E(T(i)) + T(i))

  • accept!

    stack: S0 Z
    input: (empty)
    tree: Z(E(E(T(i)) + T(i)) $)

Transition diagram

S0: Z → • E $
    E → • E ‘+’ T
    E → • T
    T → • i
    T → • ‘(’ E ‘)’
S1: T → i •
S2: E → T •
S3: Z → E • $
    E → E • ‘+’ T
S4: E → E ‘+’ • T
    T → • i
    T → • ‘(’ E ‘)’
S5: E → E ‘+’ T •
S6: Z → E $ •

Transitions: S0 –i→ S1, S0 –T→ S2, S0 –E→ S3, S3 –‘+’→ S4, S3 –$→ S6, S4 –i→ S1, S4 –T→ S5

Exercise (8 min.)

  • complete the transition diagram for the LR(0) automaton

  • can you think of a single input expression that causes all states to be used? If yes, give an example. If no, explain.


Answers (fig 2.89)

S0: Z → • E $
    E → • E ‘+’ T
    E → • T
    T → • i
    T → • ‘(’ E ‘)’
S1: T → i •
S2: E → T •
S3: Z → E • $
    E → E • ‘+’ T
S4: E → E ‘+’ • T
    T → • i
    T → • ‘(’ E ‘)’
S5: E → E ‘+’ T •
S6: Z → E $ •
S7: T → ‘(’ • E ‘)’
    E → • E ‘+’ T
    E → • T
    T → • i
    T → • ‘(’ E ‘)’
S8: T → ‘(’ E • ‘)’
    E → E • ‘+’ T
S9: T → ‘(’ E ‘)’ •

Transitions: S0 –i→ S1, S0 –T→ S2, S0 –E→ S3, S0 –‘(’→ S7, S3 –‘+’→ S4, S3 –$→ S6, S4 –i→ S1, S4 –‘(’→ S7, S4 –T→ S5, S7 –i→ S1, S7 –‘(’→ S7, S7 –T→ S2, S7 –E→ S8, S8 –‘+’→ S4, S8 –‘)’→ S9


Answers

The following expression exercises all states:

( i ) + i
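The complete LR(0) automaton can also be generated rather than drawn by hand. The sketch below assumes the (rule index, dot position) item encoding and a breadth-first construction; it numbers the states in discovery order, so the numbers need not coincide with S0 to S9 in fig 2.89.

```python
GRAMMAR = [
    ("Z", ("E", "$")),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("i",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def closure(items):
    """Add N -> . gamma for every non-terminal N that follows a dot."""
    result, work = set(items), list(items)
    while work:
        rule, dot = work.pop()
        rhs = GRAMMAR[rule][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for k, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (k, 0) not in result:
                    result.add((k, 0))
                    work.append((k, 0))
    return frozenset(result)

def build_automaton():
    """Breadth-first construction of all item sets and their transitions."""
    start = closure({(0, 0)})
    states, transitions, work = [start], {}, [start]
    while work:
        state = work.pop(0)
        symbols = {GRAMMAR[r][1][d] for r, d in state if d < len(GRAMMAR[r][1])}
        for symbol in sorted(symbols):
            # Move the dot over `symbol` in every item that allows it.
            moved = {(r, d + 1) for r, d in state
                     if d < len(GRAMMAR[r][1]) and GRAMMAR[r][1][d] == symbol}
            target = closure(moved)
            if target not in states:
                states.append(target)
                work.append(target)
            transitions[(states.index(state), symbol)] = states.index(target)
    return states, transitions

states, transitions = build_automaton()
print(len(states), "states and", len(transitions), "transitions")
```

For the expression grammar this yields ten item sets and fifteen transitions, the same counts as the completed diagram.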


The LR tables

LR(0) parsing: concise notation


The LR push-down automaton

SWITCH action_table[top of stack]:
    CASE “shift”:
        see book;
    CASE (“reduce”, N → α):
        POP the symbols of α FROM the stack;
        SET state TO top of stack;
        PUSH N ON the stack;
        SET new state TO goto_table[state, N];
        PUSH new state ON the stack;
    CASE empty:
        ERROR;
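The pseudocode can be turned into a small runnable driver. The sketch below makes several assumptions: the tables are written out by hand from the ten-state automaton of fig 2.89, the "shift" case (left to the book on the slide) is taken to pop one input token and push it together with the state found in the GOTO table, and reducing rule 1 (Z → E $) is treated as accept. The function `parse` and its return value are illustrative.

```python
# rule number -> (left-hand side, length of right-hand side)
RULES = {1: ("Z", 2), 2: ("E", 3), 3: ("E", 1), 4: ("T", 1), 5: ("T", 3)}

# LR(0) ACTION table: one entry per state, no look-ahead.
ACTION = {0: "shift", 1: 4, 2: 3, 3: "shift", 4: "shift",
          5: 2, 6: 1, 7: "shift", 8: "shift", 9: 5}

GOTO = {(0, "i"): 1, (0, "("): 7, (0, "T"): 2, (0, "E"): 3,
        (3, "+"): 4, (3, "$"): 6,
        (4, "i"): 1, (4, "("): 7, (4, "T"): 5,
        (7, "i"): 1, (7, "("): 7, (7, "T"): 2, (7, "E"): 8,
        (8, "+"): 4, (8, ")"): 9}

def parse(tokens):
    """Return the reductions performed (rule numbers) and the states visited."""
    stack = [0]                      # alternating: state, symbol, state, ...
    tokens = list(tokens)
    reductions, visited = [], {0}
    while True:
        state = stack[-1]
        action = ACTION[state]
        if action == "shift":
            if not tokens or (state, tokens[0]) not in GOTO:
                raise SyntaxError(f"unexpected input in state S{state}")
            token = tokens.pop(0)
            stack += [token, GOTO[(state, token)]]
        else:
            lhs, length = RULES[action]
            del stack[len(stack) - 2 * length:]   # pop the handle and its states
            reductions.append(action)
            if lhs == "Z":
                return reductions, visited        # accept
            stack += [lhs, GOTO[(stack[-1], lhs)]]
        visited.add(stack[-1])

print(parse("i + i $".split())[0])
print(sorted(parse("( i ) + i $".split())[1]))
```

Running it on i + i $ replays the reductions of the worked example (T → i, E → T, T → i, E → E ‘+’ T, Z → E $), and on ( i ) + i $ the set of visited states covers all ten states.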


Break


LR(0) conflicts

  • shift-reduce conflict

    • array indexing: T → i [ E ]

      T → i • [ E ]   (shift)
      T → i •   (reduce)

    • ε-rule: RestExpr → ε

      Expr → Term • RestExpr   (shift)
      RestExpr → •   (reduce)

  • reduce-reduce conflict

    • assignment statement: Z → V := E $

      V → i •   (reduce)
      T → i •   (reduce)

  • typical LR(0) table contains many conflicts
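Both kinds of conflict can be detected by inspecting a single item set. The following is a minimal sketch, assuming an item is a triple (left-hand side, right-hand side, dot position) and that a state wants to shift when some item has its dot in front of a terminal; the function name `lr0_conflicts` is illustrative.

```python
def lr0_conflicts(items, nonterminals):
    """Classify an LR(0) item set; an item is (lhs, rhs tuple, dot position)."""
    # Completed items (dot at the end) ask for a reduce.
    reduces = [(lhs, rhs) for lhs, rhs, dot in items if dot == len(rhs)]
    # Items with the dot in front of a terminal ask for a shift.
    shifts = [(lhs, rhs) for lhs, rhs, dot in items
              if dot < len(rhs) and rhs[dot] not in nonterminals]
    found = []
    if shifts and reduces:
        found.append("shift-reduce")
    if len(reduces) > 1:
        found.append("reduce-reduce")
    return found

NONTERMINALS = {"Z", "V", "E", "T"}

# Array indexing: T -> i . [ E ] competes with T -> i .
array_state = [("T", ("i", "[", "E", "]"), 1), ("T", ("i",), 1)]
# Assignment statement: V -> i . competes with T -> i .
assign_state = [("V", ("i",), 1), ("T", ("i",), 1)]

print(lr0_conflicts(array_state, NONTERMINALS))
print(lr0_conflicts(assign_state, NONTERMINALS))
```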


Handling LR(0) conflicts

  • solution: use a one-token look-ahead

    two-dimensional ACTION table [state,token]

  • different construction of ACTION table

    • SLR(1) – Simple LR

    • LR(1)

    • LALR(1) – Look-Ahead LR


SLR(1) parsing

  • solves (some) shift-reduce conflicts

  • reduce N → α iff token ∈ FOLLOW(N)

    FOLLOW(T) = { ‘+’, ‘)’, $ }
    FOLLOW(E) = { ‘+’, ‘)’, $ }
    FOLLOW(Z) = { $ }
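The FOLLOW sets listed above can be recomputed with a fixed-point iteration. This sketch assumes a grammar without ε-rules, which holds for the expression grammar, and it seeds FOLLOW(Z) with $ by convention so that the result matches the listing; the function names are illustrative.

```python
GRAMMAR = [
    ("Z", ("E", "$")),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("i",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def first_sets():
    """FIRST sets; without epsilon-rules only the first symbol matters."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            head = rhs[0]
            new = first[head] if head in NONTERMINALS else {head}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first

def follow_sets():
    first = first_sets()
    follow = {n: set() for n in NONTERMINALS}
    follow["Z"].add("$")                 # assumption: start symbol is followed by $
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for k, sym in enumerate(rhs):
                if sym not in NONTERMINALS:
                    continue
                if k + 1 < len(rhs):     # something follows sym in this rule
                    nxt = rhs[k + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:                    # sym ends the rule: inherit FOLLOW(lhs)
                    new = follow[lhs]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow

for name, tokens in sorted(follow_sets().items()):
    print(f"FOLLOW({name}) = {sorted(tokens)}")
```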


SLR(1) ACTION/GOTO table

    1: Z → E $
    2: E → E ‘+’ T
    3: E → T
    4: T → i
    5: T → ‘(’ E ‘)’

    sn – shift to state n
    rn – reduce rule n
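An SLR(1) table in this sn/rn notation can be derived from the LR(0) automaton and the FOLLOW sets. The sketch below is an assumption-laden reconstruction: the transitions and reduce states are copied by hand from fig 2.89, the FOLLOW sets are those of the SLR(1) parsing slide, and the printed layout is chosen here and need not match the table shown in the lecture.

```python
TERMINALS = ["i", "+", "(", ")", "$"]
FOLLOW = {"Z": {"$"}, "E": {"+", ")", "$"}, "T": {"+", ")", "$"}}
RULES = {1: "Z", 2: "E", 3: "E", 4: "T", 5: "T"}     # rule number -> left-hand side
REDUCE_STATES = {1: 4, 2: 3, 5: 2, 6: 1, 9: 5}       # state -> completed rule
TRANSITIONS = {(0, "i"): 1, (0, "("): 7, (0, "T"): 2, (0, "E"): 3,
               (3, "+"): 4, (3, "$"): 6,
               (4, "i"): 1, (4, "("): 7, (4, "T"): 5,
               (7, "i"): 1, (7, "("): 7, (7, "T"): 2, (7, "E"): 8,
               (8, "+"): 4, (8, ")"): 9}

def slr1_table():
    """Build ACTION[state, token] and GOTO[state, non-terminal]."""
    action, goto = {}, {}
    for (state, symbol), target in TRANSITIONS.items():
        if symbol in TERMINALS:
            action[(state, symbol)] = f"s{target}"
        else:
            goto[(state, symbol)] = target
    for state, rule in REDUCE_STATES.items():
        # Reduce only when the look-ahead token may follow the left-hand side.
        for token in FOLLOW[RULES[rule]]:
            if (state, token) in action:
                raise ValueError(f"conflict in state {state} on {token}")
            action[(state, token)] = f"r{rule}"
    return action, goto

action, goto = slr1_table()
print("state " + " ".join(f"{t:>3}" for t in TERMINALS))
for state in range(10):
    print(f"{state:>5} " + " ".join(f"{action.get((state, t), ''):>3}" for t in TERMINALS))
```

Empty cells are error entries: in state 1, for example, the reduce r4 is entered only under ‘+’, ‘)’ and $.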


SLR(1) ACTION/GOTO table

    1: Z → E $
    2: E → E ‘+’ T
    3: E → T
    4: T → i
    5: T → ‘(’ E ‘)’
    6: T → i ‘[’ E ‘]’

    sn – shift to state n
    rn – reduce rule n


Unfortunately ...

  • SLR(1) leaves many shift-reduce conflicts unsolved

  • problem: FOLLOW(N) set is a union of all contexts in which N may occur

  • example

    S → A | x b
    A → a A b | x


LR(0) automaton

    1: S → A
    2: S → x b
    3: A → a A b
    4: A → x

S0: S → • A
    S → • x b
    A → • a A b
    A → • x
S1: S → x • b
    A → x •
S2: S → x b •
S3: S → A •
S4: A → a • A b
    A → • a A b
    A → • x
S5: A → x •
S6: A → a A • b
S7: A → a A b •

Transitions: S0 –A→ S3, S0 –x→ S1, S0 –a→ S4, S1 –b→ S2, S4 –a→ S4, S4 –x→ S5, S4 –A→ S6, S6 –b→ S7


Exercise (6 min.)

  • derive the SLR(1) ACTION/GOTO table (with shift-reduce conflict) for the following grammar:

    S → A | x b
    A → a A b | x


Answers

    1: S → A
    2: S → x b
    3: A → a A b
    4: A → x

FOLLOW(S) = {$}

FOLLOW(A) = {$,b}
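The shift-reduce conflict can be located by filling the SLR(1) ACTION table and looking for cells with two entries. The sketch below assumes the eight-state LR(0) automaton for this grammar (S0 to S7) and the FOLLOW sets given above; the table encoding is illustrative.

```python
TERMINALS = {"a", "b", "x", "$"}
FOLLOW = {"S": {"$"}, "A": {"$", "b"}}
RULES = {1: "S", 2: "S", 3: "A", 4: "A"}            # rule number -> left-hand side
# state -> rule completed in that state
REDUCE_STATES = {1: 4, 2: 2, 3: 1, 5: 4, 7: 3}
TRANSITIONS = {(0, "A"): 3, (0, "x"): 1, (0, "a"): 4, (1, "b"): 2,
               (4, "a"): 4, (4, "x"): 5, (4, "A"): 6, (6, "b"): 7}

def slr1_actions():
    """ACTION[state, token] as a list, so that conflicts stay visible."""
    action = {}
    for (state, symbol), target in TRANSITIONS.items():
        if symbol in TERMINALS:
            action.setdefault((state, symbol), []).append(f"s{target}")
    for state, rule in REDUCE_STATES.items():
        for token in sorted(FOLLOW[RULES[rule]]):
            action.setdefault((state, token), []).append(f"r{rule}")
    return action

conflicts = {cell: entries for cell, entries in slr1_actions().items()
             if len(entries) > 1}
print(conflicts)
```

The only doubly filled cell is state S1 under b: the item S → x • b asks for a shift to S2, while A → x • asks for a reduce by rule 4 because b is in FOLLOW(A).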


LR(1) parsing

  • maintain follow set per item

    LR(1) item: N → α • β {σ}

  • ε-closure for LR(1) item sets:

    if set S contains an item P → α • N β {σ} then
      foreach production rule N → γ
        S must contain the item N → • γ {τ}
        where τ = FIRST(β {σ})
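The LR(1) closure rule can be sketched for the grammar S → A | x b, A → a A b | x. Assumptions made here: an item is a triple (rule index, dot position, look-ahead set), the grammar has no ε-rules so FIRST(β {σ}) only needs the first symbol of β, and items that differ only in their look-ahead set are kept as separate items rather than unioned.

```python
GRAMMAR = [
    ("S", ("A",)),
    ("S", ("x", "b")),
    ("A", ("a", "A", "b")),
    ("A", ("x",)),
]
NONTERMINALS = {"S", "A"}

def first_of(symbols, lookahead):
    """FIRST(beta {sigma}) for a grammar without epsilon-rules."""
    if not symbols:
        return set(lookahead)            # beta is empty: sigma shines through
    head = symbols[0]
    if head not in NONTERMINALS:
        return {head}
    result, seen, todo = set(), set(), [head]
    while todo:                          # collect leading terminals of `head`
        n = todo.pop()
        if n in seen:
            continue
        seen.add(n)
        for lhs, rhs in GRAMMAR:
            if lhs == n:
                if rhs[0] in NONTERMINALS:
                    todo.append(rhs[0])
                else:
                    result.add(rhs[0])
    return result

def closure(items):
    """items: set of (rule index, dot position, frozenset look-ahead)."""
    result, work = set(items), list(items)
    while work:
        rule, dot, sigma = work.pop()
        rhs = GRAMMAR[rule][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            tau = frozenset(first_of(rhs[dot + 1:], sigma))
            for k, (lhs, _) in enumerate(GRAMMAR):
                item = (k, 0, tau)
                if lhs == rhs[dot] and item not in result:
                    result.add(item)
                    work.append(item)
    return result

DOLLAR, B = frozenset({"$"}), frozenset({"b"})
S0 = closure({(0, 0, DOLLAR), (1, 0, DOLLAR)})
S4 = closure({(2, 1, DOLLAR)})           # kernel: A -> a . A b {$}
print(sorted((r, d, sorted(s)) for r, d, s in S4))
```

In the second state the items added for A carry the look-ahead {b} rather than {$}, because b follows A in A → a • A b.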


LR(1) automaton

S0: S → • A {$}
    S → • x b {$}
    A → • a A b {$}
    A → • x {$}
S1: S → x • b {$}
    A → x • {$}
S2: S → x b • {$}
S3: S → A • {$}
S4: A → a • A b {$}
    A → • a A b {b}
    A → • x {b}
S5: A → x • {b}
S6: A → a A • b {$}
S7: A → a A b • {$}
S8: A → a • A b {b}
    A → • a A b {b}
    A → • x {b}
S9: A → a A • b {b}
S10: A → a A b • {b}

Transitions: S0 –A→ S3, S0 –x→ S1, S0 –a→ S4, S1 –b→ S2, S4 –a→ S8, S4 –x→ S5, S4 –A→ S6, S6 –b→ S7, S8 –a→ S8, S8 –x→ S5, S8 –A→ S9, S9 –b→ S10


LALR(1) parsing

  • LR tables are big

  • combine “equal” sets by merging look-ahead sets
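Merging "equal" sets can be sketched as grouping LR(1) states by their core, that is, by their items with the look-ahead sets stripped, and taking the union of the look-ahead sets per item. The state contents below are written out by hand from the LR(1) automaton for S → A | x b, A → a A b | x; the string encoding of items and the function name `merge_equal_cores` are assumptions.

```python
from collections import defaultdict

D, B = frozenset({"$"}), frozenset({"b"})

# LR(1) states: list of (item without look-ahead, look-ahead set).
LR1_STATES = {
    "S0": [("S -> . A", D), ("S -> . x b", D), ("A -> . a A b", D), ("A -> . x", D)],
    "S1": [("S -> x . b", D), ("A -> x .", D)],
    "S2": [("S -> x b .", D)],
    "S3": [("S -> A .", D)],
    "S4": [("A -> a . A b", D), ("A -> . a A b", B), ("A -> . x", B)],
    "S5": [("A -> x .", B)],
    "S6": [("A -> a A . b", D)],
    "S7": [("A -> a A b .", D)],
    "S8": [("A -> a . A b", B), ("A -> . a A b", B), ("A -> . x", B)],
    "S9": [("A -> a A . b", B)],
    "S10": [("A -> a A b .", B)],
}

def merge_equal_cores(states):
    """Group states with identical cores and union their look-ahead sets."""
    groups = defaultdict(list)
    for name, items in states.items():
        core = frozenset(item for item, _ in items)
        groups[core].append(name)
    merged = {}
    for names in groups.values():
        lookaheads = defaultdict(set)
        for name in names:
            for item, sigma in states[name]:
                lookaheads[item] |= sigma
        merged["+".join(names)] = dict(lookaheads)
    return merged

merged = merge_equal_cores(LR1_STATES)
print(len(LR1_STATES), "LR(1) states ->", len(merged), "LALR(1) states")
print(sorted(merged))
```

In this example three pairs share a core (S4/S8, S6/S9 and S7/S10), so eleven LR(1) states shrink to eight.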


LALR(1) automaton

[Figure: the LR(1) automaton with states S0 to S10, repeated as the starting point for merging.]


LALR(1) automaton

[Figure: the automaton after merging, with eight states S0 to S7. The LR(1) states S8, S9 and S10 are folded into S4, S6 and S7, whose kernel items now carry the combined look-ahead set {b,$}.]

S4: A → a • A b {b,$}
S6: A → a A • b {b,$}
S7: A → a A b • {b,$}

States S0, S1, S2, S3 and S5 have no LR(1) state with an equal core and keep their items.


LALR(1) ACTION/GOTO table

    1: S → A
    2: S → x b
    3: A → a A b
    4: A → x


Making grammars LR(1) – or not?

  • grammars are often ambiguous

    E → E ‘+’ E | E ‘*’ E | i

  • handle shift-reduce conflicts

    • (default) longest match: shift

    • precedence directives

      input: i * i + i

      E → E • ‘+’ E   (shift)
      E → E ‘*’ E •   (reduce)

[Figure: the two parse trees for i * i + i, one with ‘+’ at the root and one with ‘*’ at the root.]
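How a precedence directive settles such a conflict can be sketched as a comparison between the operator of the completed rule and the look-ahead operator. The table values, the associativity set and the function `resolve` are assumptions modelled on the way yacc-style generators commonly treat precedence; they are not taken from the lecture.

```python
PRECEDENCE = {"+": 1, "*": 2}            # higher number binds tighter
LEFT_ASSOCIATIVE = {"+", "*"}

def resolve(rule_operator, lookahead):
    """Choose between reducing E -> E op E and shifting the look-ahead operator."""
    if PRECEDENCE[rule_operator] > PRECEDENCE[lookahead]:
        return "reduce"                  # the completed rule binds tighter
    if PRECEDENCE[rule_operator] < PRECEDENCE[lookahead]:
        return "shift"                   # the upcoming operator binds tighter
    # Equal precedence: associativity decides.
    return "reduce" if rule_operator in LEFT_ASSOCIATIVE else "shift"

# Input i * i + i: E -> E '*' E is complete and '+' is the look-ahead.
print(resolve("*", "+"))
# Input i + i * i: E -> E '+' E is complete and '*' is the look-ahead.
print(resolve("+", "*"))
```

With these settings the parser reduces E ‘*’ E before shifting ‘+’, which gives the tree with ‘+’ at the root, whereas the default "longest match" rule would shift.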


Summary

  • syntax analysis: tokens → AST

  • bottom-up parsing

    • push-down automaton

    • ACTION/GOTO tables

      • LR(0): no look-ahead

      • SLR(1): one-token look-ahead, FOLLOW sets to solve shift-reduce conflicts

      • LR(1): SLR(1), but FOLLOW set per item

      • LALR(1): LR(1), but “equal” states are merged


Homework

  • study sections:

    • 1.10 closure algorithm

    • 2.2.5.8 error handling in LR(1) parsers

  • print handout for next week [blackboard]

  • find a partner for the “practicum”

  • register your group

    • send e-mail to koen@pds.twi.tudelft.nl

