Compiler construction in4020 lecture 4
Download
1 / 50

Compiler construction in4020 – lecture 4 - PowerPoint PPT Presentation


  • 106 Views
  • Uploaded on

Compiler construction in4020 – lecture 4. Koen Langendoen Delft University of Technology The Netherlands. program text. lexical analysis. tokens. parser. language. syntax analysis. generator. grammar. AST. context handling. annotated AST. Summary of lecture 3.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about ' Compiler construction in4020 – lecture 4' - vince


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Compiler construction in4020 lecture 4

Compiler constructionin4020 – lecture 4

Koen Langendoen

Delft University of Technology

The Netherlands


Summary of lecture 3

program text

lexical analysis

tokens

parser

language

syntax analysis

generator

grammar

AST

context handling

annotated AST

Summary of lecture 3

  • syntax analysis: tokens  AST

  • top-down parsing

    • recursive descent

    • push-down automaton

    • making grammars LL(1)


Quiz

2.31Can you create a top-down parser for the following grammars?

(a) S ‘(‘ S ‘)’ | ‘)’

(b) S ‘(‘ S ‘)’ | 

(c) S ‘(‘ S ‘)’ | ‘)’ | 


Overview

program text

lexical analysis

tokens

parser

language

syntax analysis

generator

grammar

AST

context handling

annotated AST

Overview

  • syntax analysis: tokens  AST

  • bottom-up parsing

    • push-down automaton

    • ACTION/GOTO tables

    • LR(0), SLR(1), LR(1), LALR(1)


Bottom up lr parsing

rest_expression

expression

term

rest_expr

IDENT

IDENT

IDENT

aap + ( noot + mies )

Bottom-up (LR) parsing

  • Left-to-right parse, Rightmost-derivation

  • create a node when all

    children are present

  • handle: nodes representing

    the right-hand side of a

    production


Lr 0 parsing
LR(0) parsing

  • running example: expression grammar

    input expression EOF

    expression  expression ‘+’ term | term

    term  IDENTIFIER | ‘(’ expression ‘)’

  • short-hand notation

    Z  E $

    E  E ‘+’ T | T

    T  i | ‘(’ E ‘)


Lr 0 parsing1
LR(0) parsing

  • running example: expression grammar

    input expression EOF

    expression  expression ‘+’ term | term

    term  IDENTIFIER | ‘(’ expression ‘)’

  • short-hand notation

    Z  E $

    E  E ‘+’ T

    E  T

    T  i

    T  ‘(’ E ‘)’


Lr 0 parsing2

Z   E $

LR(0) parsing

  • keep track of progress inside potential

    handles when consuming input tokens

  • LR items: N  

  • initial set

  • -closure: expand dots in front of non-terminals

Z  E $

E  E ‘+’ T

E  T

T  i

T  ‘(’ E ‘)’

S0


Lr 0 parsing3

input

i + i $

stack

S0

LR(0) parsing

Z  E $

E  E ‘+’ T

E  T

T  i

T  ‘(’ E ‘)’

  • shift input token (i) onto the stack

  • compute new state


Lr 0 parsing4
LR(0) parsing

Z  E $

E  E ‘+’ T

E  T

T  i

T  ‘(’ E ‘)’

stack

input

S0 i S1

+ i $

  • reduce handle on top of the stack

  • compute new state


Lr 0 parsing5
LR(0) parsing

  • reduce handle on top of the stack

  • compute new state

Z  E $

E  E ‘+’ T

E  T

T  i

T  ‘(’ E ‘)’

stack

input

S0 T S2

+ i $

i


Lr 0 parsing6
LR(0) parsing

  • shift input token on top of the stack

  • compute new state

Z  E $

E  E ‘+’ T

E  T

T  i

T  ‘(’ E ‘)’

stack

input

S0 E S3

+ i $

T

i


Lr 0 parsing7
LR(0) parsing

  • shift input token on top of the stack

  • compute new state

Z  E $

E  E ‘+’ T

E  T

T  i

T  ‘(’ E ‘)’

stack

input

S0 E S3 + S4

i $

T

i


Lr 0 parsing8
LR(0) parsing

  • reduce handle on top of the stack

  • compute new state

Z  E $

E  E ‘+’ T

E  T

T  i

T  ‘(’ E ‘)’

stack

input

S0 E S3 + S4 i S1

$

T

i


Lr 0 parsing9
LR(0) parsing

  • reduce handle on top of the stack

  • compute new state

Z  E $

E  E ‘+’ T

E  T

T  i

T  ‘(’ E ‘)’

stack

input

S0 E S3 + S4 T S5

$

T

i

i


Lr 0 parsing10
LR(0) parsing

  • shift input token on top of the stack

  • compute new state

Z  E $

E  E ‘+’ T

E  T

T  i

T  ‘(’ E ‘)’

stack

input

S0 E S3

$

E

+

T

T

i

i


Lr 0 parsing11
LR(0) parsing

  • reduce handle on top of the stack

  • compute new state

Z  E $

E  E ‘+’ T

E  T

T  i

T  ‘(’ E ‘)’

stack

input

S0 E S3 $ S6

E

+

T

T

i

i


Lr 0 parsing12
LR(0) parsing

  • accept!

Z  E $

E  E ‘+’ T

E  T

T  i

T  ‘(’ E ‘)’

stack

input

S0 Z

E

$

E

+

T

T

i

i


Transition diagram

S0

Z   E $

E   E ‘+’ T

E   T

T   i

T   ‘(’ E ‘)’

S3

Z  E  $

E  E  ‘+’ T

S5

E  E + T 

Transition diagram

S2

T

E  T 

i

S1

T  i 

E

i

S4

E  E ‘+’  T

T   i

T   ‘(’ E ‘)’

‘+’

$

T

S6

Z  E $ 


Exercise 8 min
Exercise (8 min.)

  • complete the transition diagram for the LR(0) automaton

  • can you think of a single input expression that causes all states to be used? If yes, give an example. If no, explain.



Answers fig 2 89

S0

Z   E $

E   E ‘+’ T

E   T

T   i

T   ‘(’ E ‘)’

S3

Z  E  $

E  E  ‘+’ T

S5

E  E + T 

Answers (fig 2.89)

S2

S7

T  ‘(’  E ‘)’

E   E ‘+’ T

E   T

T   i

T   ‘(’ E ‘)’

T

T

E  T 

‘(’

i

i

S1

T  i 

E

‘(’

E

‘(’

i

S4

S8

E  E ‘+’  T

T   i

T   ‘(’ E ‘)’

‘+’

‘+’

T  ‘(‘ E  ‘)’

E  E  ‘+’ T

$

‘)’

T

S6

S9

Z  E $ 

T  ‘(‘ E ‘)’ 


Answers1
Answers

The following expression exercises all states

( i ) + i



Lr 0 parsing concise notation
LR(0) parsingconcise notation


The lr push down automaton
The LR push-down automaton

SWITCH action_table[top of stack]:

CASE “shift”:

see book;

CASE (“reduce”, N ):

POP the symbols of  FROM the stack;

SET state TO top of stack;

PUSHNON the stack;

SET new state TO goto_table[state,N];

PUSH new state ON the stack;

CASE empty:

ERROR;



Lr 0 conflicts
LR(0) conflicts

  • shift-reduce conflict

    • array indexing: T  i [ E ]

      T  i [ E ](shift)

      T  i(reduce)

    • -rule: RestExpr 

      Expr  Term  RestExpr (shift)

      RestExpr  (reduce)


Lr 0 conflicts1
LR(0) conflicts

  • reduce-reduce conflict

    • assignment statement: Z  V := E $

      V  i (reduce)

      T  i (reduce)

  • typical LR(0) table contains many conflicts


Handling lr 0 conflicts
Handling LR(0) conflicts

  • solution: use a one-token look-ahead

    two-dimensional ACTION table [state,token]

  • different construction of ACTION table

    • SLR(1) – Simple LR

    • LR(1)

    • LALR(1) – Look-Ahead LR


Slr 1 parsing
SLR(1) parsing

  • solves (some) shift-reduce conflicts

  • reduce N  iff token FOLLOW(N)

    FOLLOW(T) = { ‘+’, ‘)’, $ }

    FOLLOW(E) = { ‘+’, ‘)’, $ }

    FOLLOW(Z) = { $ }


Slr 1 action table
SLR(1) ACTION table


Slr 1 action goto table
SLR(1) ACTION/GOTO table

1: Z  E $

2: E  E ‘+’ T

3: E  T

4: T  i

5: T  ‘(’ E ‘)’

sn – shift to

state n

rn – reduce

rule n


Slr 1 action goto table1
SLR(1) ACTION/GOTO table

1: Z  E $

2: E  E ‘+’ T

3: E  T

4: T  i

5: T  ‘(’ E ‘)’

6: T  i ‘[‘ E ‘]’

sn – shift to

state n

rn – reduce

rule n


Slr 1 action goto table2
SLR(1) ACTION/GOTO table

1: Z  E $

2: E  E ‘+’ T

3: E  T

4: T  i

5: T  ‘(’ E ‘)’

6: T  i ‘[‘ E ‘]’

sn – shift to

state n

rn – reduce

rule n


Slr 1 action goto table3
SLR(1) ACTION/GOTO table

1: Z  E $

2: E  E ‘+’ T

3: E  T

4: T  i

5: T  ‘(’ E ‘)’

6: T  i ‘[‘ E ‘]’

sn – shift to

state n

rn – reduce

rule n


Unfortunately
Unfortunately ...

  • SLR(1) leaves many shift-reduce conflicts unsolved

  • problem: FOLLOW(N) set is a union of all contexts in which N may occur

  • example

    S  A | x b

    A  a A b | x


Lr 0 automaton
LR(0) automaton

S3

S5

S  A 

A  x 

A

x

S0

S   A

S   x b

A   a A b

A   x

S4

A  a  A b

A   a A b

A   x

a

a

x

A

S1

S6

S  x  b

A  x 

1: S  A

2: S  x b

3: A  a A b

4: A  x

A  a A  b

b

b

S2

S7

S  x b 

A  a A b 


Exercise 6 min
Exercise (6 min.)

  • derive the SLR(1) ACTION/GOTO table (with shift-reduce conflict) for the following grammar:

    S  A | x b

    A  a A b | x



Answers3
Answers

1: S  A

2: S  x b

3: A  a A b

4: A  x

FOLLOW(S) = {$}

FOLLOW(A) = {$,b}


Lr 1 parsing
LR(1) parsing

  • maintain follow set per item

    LR(1) item: N   {}

  •  - closure for LR(1) item sets:

    if set S contains an item P  N  {} then

    foreach production rule N  

    S must contain the item N   {}

    where  = FIRST(  {} )


Lr 1 automaton
LR(1) automaton

S3

S5

S  A  {$}

A  x  {b}

A

x

x

S0

S   A {$}

S   x b {$}

A   a A b {$}

A   x {$}

S4

S8

A  a  A b {$}

A   a A b {b}

A   x {b}

A  a  A b {b}

A   a A b {b}

A   x {b}

a

a

x

A

A

A

S1

S6

S9

S  x  b {$}

A  x  {$}

A  a A  b {$}

A  a A  b {b}

b

b

b

S2

S7

S10

S  x b  {$}

A  a A b  {$}

A  a A b  {b}


Lalr 1 parsing
LALR(1) parsing

  • LR tables are big

  • combine “equal” sets by merging look-ahead sets


La lr 1 automaton
LALR(1) automaton

S3

S5

S  A  {$}

A  x  {b}

A

x

x

S0

S   A {$}

S   x b {$}

A   a A b {$}

A   x {$}

S4

S8

A  a  A b {$}

A   a A b {b}

A   x {b}

A  a  A b {b}

A   a A b {b}

A   x {b}

a

a

x

A

A

A

S1

S6

S9

S  x  b {$}

A  x  {$}

A  a A  b {$}

A  a A  b {b}

b

b

b

S2

S7

S10

S  x b  {$}

A  a A b  {$}

A  a A b  {b}


Lalr 1 automaton
LALR(1) automaton

S3

S5

S  A  {$}

A  x  {b}

A

x

S0

S   A {$}

S   x b {$}

A   a A b {$}

A   x {$}

S   A {$}

S   x b {$}

A   a A b {b,$}

A   x {b,$}

S4

A  a  A b {b,$}

A   a A b {b,$}

A   x {b,$}

A  a  A b {b,$}

A   a A b {b,$}

A   x {b,$}

a

a

x

A

A

S1

S6

S  x  b {$}

A  x  {$}

S  x  b {$}

A  x  {b,$}

A  a A  b {b,$}

A  a A  b {b,$}

b

b

S2

S7

A  a A b  {b,$}

S  x b  {$}

A  a A b  {b,$}


Lalr 1 action goto table
LALR(1) ACTION/GOTO table

1: S  A

2: S  x b

3: A  a A b

4: A  x


Making grammars lr 1 or not

+

*

*

E

E

+

E

E

E

E

Making grammars LR(1) – or not?

  • grammars are often ambiguous

    E  E ‘+’ E | E ‘*’ E | i

  • handle shift-reduce conflicts

    • (default) longest match: shift

    • precedence directives

      input: i * i + i

      E  E ‘+’ E

      E  E ‘*’ E 

reduce

shift


Summary
Summary

  • syntax analysis: tokens  AST

  • bottom-up parsing

    • push-down automaton

    • ACTION/GOTO tables

      • LR(0) NO look-ahead

      • SLR(1) one-token look-ahead, FOLLOW sets to solve shift-reduce conflicts

      • LR(1) SLR(1), but FOLLOW set per item

      • LALR(1) LR(1), but “equal” states are merged


Homework
Homework

  • study sections:

    • 1.10 closure algorithm

    • 2.2.5.8 error handling in LR(1) parsers

  • print handout for next week [blackboard]

  • find a partner for the “practicum”

  • register your group

    • send e-mail to [email protected]


ad