
COS 320 Compilers

David Walker

The Front End
  • Lexical Analysis: Create sequence of tokens from characters (Chap 2)
  • Parsing: Create abstract syntax tree from sequence of tokens (Chap 3)
  • Type Checking: Check program for well-formedness constraints

stream of characters → Lexer → stream of tokens → Parser → abstract syntax → Type Checker

Parsing with CFGs
  • Context-free grammars are (often) given by BNF expressions (Backus-Naur Form)
    • Appel Chap 3.1
  • More powerful than regular expressions
    • Matching parens
      • wait, we could do nested comments with ML-LEX!
      • still can’t describe PL structure effectively in ML-LEX
  • CFGs are good for describing the overall syntactic structure of programs.
Context-Free Grammars
  • Context-free grammars consist of:
    • Set of symbols:
      • terminals that denote token types
      • non-terminals that denote sets of strings
    • Start symbol
    • Rules:
      • left-hand side: non-terminal
      • right-hand side: terminals and/or non-terminals
      • rules explain how to rewrite non-terminals (beginning with start symbol) into terminals

symbol ::= symbol symbol ... symbol

Context-Free Grammars

A string is in the language of the CFG if and only if it is possible to derive that string using the following non-deterministic procedure:

  • begin with the start symbol
  • while any non-terminals exist, pick a non-terminal and rewrite it using a rule <== could be many choices here
  • stop when all you have left are terminals (and check that you arrived at the string you were hoping for)

Parsing is the process of checking that a string is in the CFG for your programming language. It is usually coupled with creating an abstract syntax tree.
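The rewriting procedure above can be sketched in Python. This is an illustration, not from the slides: the tiny grammar, the rule indices, and the always-rewrite-the-leftmost-non-terminal policy are my assumptions.

```python
# Illustrative grammar (not the slides'): each non-terminal maps to a list
# of right-hand sides; anything not in the dict is a terminal.
GRAMMAR = {
    "S": [["ID", ":=", "E"], ["PRINT", "(", "E", ")"]],
    "E": [["ID"], ["NUM"]],
}

def derive(start, choices):
    """Apply `choices` (rule indices) to the leftmost non-terminal,
    one rewrite per choice, and return the resulting sentential form."""
    sentential = [start]
    for rule_index in choices:
        # find the leftmost non-terminal
        pos = next(i for i, sym in enumerate(sentential) if sym in GRAMMAR)
        # replace it with the chosen rule's right-hand side
        sentential[pos:pos + 1] = GRAMMAR[sentential[pos]][rule_index]
    return sentential

# S -> ID := E -> ID := NUM
print(derive("S", [0, 1]))
```

A string is in the language exactly when some sequence of choices ends with only terminals matching it; the non-determinism is in picking the choices.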

An example grammar:

non-terminals: S, E, Elist

terminals: ID, NUM, PRINT, +, :=, (, ), ;

rules:

1. S ::= S ; S
2. S ::= ID := E
3. S ::= PRINT ( Elist )
4. E ::= ID
5. E ::= NUM
6. E ::= E + E
7. E ::= ( S , Elist )
8. Elist ::= E
9. Elist ::= Elist , E

Derive me!

ID := NUM ; PRINT ( NUM )

First attempt:

S
ID := E          (rule 2)

oops, can’t make progress: E cannot derive NUM ; PRINT ( NUM )

Start over:

S
S ; S                              (rule 1)
ID := E ; S                        (rule 2)
ID := NUM ; S                      (rule 5)
ID := NUM ; PRINT ( Elist )        (rule 3)
ID := NUM ; PRINT ( E )            (rule 8)
ID := NUM ; PRINT ( NUM )          (rule 5)

This is a left-most derivation: the left-most non-terminal is rewritten at each step.

Another way to derive the same string, a right-most derivation:

S
S ; S                              (rule 1)
S ; PRINT ( Elist )                (rule 3)
S ; PRINT ( E )                    (rule 8)
S ; PRINT ( NUM )                  (rule 5)
ID := E ; PRINT ( NUM )            (rule 2)
ID := NUM ; PRINT ( NUM )          (rule 5)

Parse Trees
  • Representing derivations as trees
    • useful in compilers: Parse trees correspond quite closely (but not exactly) with abstract syntax trees we’re trying to generate
      • difference: abstract syntax vs concrete (parse) syntax
  • each internal node is labeled with a non-terminal
  • each leaf node is labeled with a terminal
  • each use of a rule in a derivation explains how to generate children in the parse tree from the parents
Parse Trees
  • Example:

S
S ; S
ID := E ; S
ID := NUM ; S
ID := NUM ; PRINT ( Elist )
ID := NUM ; PRINT ( E )
ID := NUM ; PRINT ( NUM )

and the corresponding parse tree (L abbreviates Elist):

S
├── S
│   ├── ID
│   ├── :=
│   └── E
│       └── NUM
├── ;
└── S
    ├── PRINT
    ├── (
    ├── L
    │   └── E
    │       └── NUM
    └── )

Parse Trees
  • Example: 2 derivations, but 1 tree

left-most derivation:              right-most derivation:

S                                  S
S ; S                              S ; S
ID := E ; S                        S ; PRINT ( Elist )
ID := NUM ; S                      S ; PRINT ( E )
ID := NUM ; PRINT ( Elist )        S ; PRINT ( NUM )
ID := NUM ; PRINT ( E )            ID := E ; PRINT ( NUM )
ID := NUM ; PRINT ( NUM )          ID := NUM ; PRINT ( NUM )

Both derivations produce the same parse tree shown on the previous slide.

Parse Trees
  • parse trees have meaning.
    • order of children, nesting of subtrees is significant

[The slide shows two parse trees for S ; S: one with the assignment subtree on the left and the PRINT subtree on the right, and one with the two subtrees swapped. They are different trees with different meanings.]

Ambiguous Grammars
  • a grammar is ambiguous if the same sequence of tokens can give rise to two or more parse trees
Ambiguous Grammars

characters: 4 + 5 * 6
tokens: NUM(4) PLUS NUM(5) MULT NUM(6)

non-terminals: E
terminals: ID, NUM, PLUS, MULT

E ::= ID
    | NUM
    | E + E
    | E * E

(I like using this notation where I avoid repeating E ::= )

One parse tree for the token stream:

E
├── E
│   └── NUM(4)
├── +
└── E
    ├── E
    │   └── NUM(5)
    ├── *
    └── E
        └── NUM(6)

Ambiguous Grammars

characters: 4 + 5 * 6
tokens: NUM(4) PLUS NUM(5) MULT NUM(6)

The same grammar gives a second parse tree for the same token stream:

E
├── E
│   ├── E
│   │   └── NUM(4)
│   ├── +
│   └── E
│       └── NUM(5)
├── *
└── E
    └── NUM(6)

Ambiguous Grammars
  • problem: compilers use parse trees to interpret the meaning of parsed expressions
    • different parse trees have different meanings
    • e.g.: (4 + 5) * 6 is not 4 + (5 * 6)
    • languages with ambiguous grammars are DISASTROUS; the meaning of programs isn’t well-defined! You can’t tell what your program might do!
  • solution: rewrite grammar to eliminate ambiguity
    • fold precedence rules into grammar to disambiguate
    • fold associativity rules into grammar to disambiguate
    • other tricks as well
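For the expression grammar above, a standard way to fold in precedence and associativity (a sketch, not shown on the slide) is to stratify the grammar, one non-terminal per precedence level; the names T and F are mine:

```
E ::= E + T        + is left-associative, lowest precedence
    | T
T ::= T * F        * binds tighter than +
    | F
F ::= ID | NUM
```

With this grammar, 4 + 5 * 6 has exactly one parse tree, the one meaning 4 + (5 * 6).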
Building Parsers
  • In theory classes, you might have learned about general mechanisms for parsing all CFGs
    • algorithms for parsing all CFGs are expensive
      • actually, with computers getting faster and bigger year over year, researchers are beginning to dispute this claim.
    • for 1/10/100 million-line applications, compilers must be fast.
      • even for 20 thousand-line apps, speed is nice
    • sometimes 1/3 of compilation time is spent in parsing
  • compiler writers have developed specialized algorithms for parsing the kinds of CFGs that you need to build effective programming languages
    • LL(k), LR(k) grammars can be parsed.
Recursive Descent Parsing
  • Recursive Descent Parsing (Appel Chap 3.2):
    • aka: predictive parsing; top-down parsing
    • simple, efficient
    • can be coded by hand in ML quickly
    • parses many, but not all CFGs
      • parses LL(1) grammars
        • Left-to-right parse; Leftmost-derivation; 1 symbol lookahead
    • key ideas:
      • one recursive function for each non terminal
      • each production becomes one clause in the function

non-terminals: S, E, L

terminals: NUM, IF, THEN, ELSE, BEGIN, END, PRINT, ;, =

rules:

1. S ::= IF E THEN S ELSE S

2. | BEGIN S L

3. | PRINT E

4. L ::= END

5. | ; S L

6. E ::= NUM = NUM

Step 1: Represent the tokens

datatype token = NUM | IF | THEN | ELSE | BEGIN | END
               | PRINT | SEMI | EQ

Step 2: build infrastructure for reading tokens from the lexing stream

val tok = ref (getToken ())          (* getToken supplied by lexer *)
fun advance () = tok := getToken ()
fun eat t = if (!tok = t) then advance () else error ()

Step 3: write parser => one function per non-terminal; one clause per rule

fun S () = case !tok of
    IF    => (eat IF; E (); eat THEN; S (); eat ELSE; S ())
  | BEGIN => (eat BEGIN; S (); L ())
  | PRINT => (eat PRINT; E ())

and L () = case !tok of
    END  => eat END
  | SEMI => (eat SEMI; S (); L ())

and E () = (eat NUM; eat EQ; eat NUM)
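As a sketch, the same predictive parser can be transcribed into Python. The token encoding (a list of strings), the EOF sentinel, and the ParseError type are my assumptions; the slides use ML with a mutable token ref.

```python
class ParseError(Exception):
    pass

def parse(tokens):
    """Predictive parser for: S ::= IF E THEN S ELSE S | BEGIN S L | PRINT E
                              L ::= END | ; S L
                              E ::= NUM = NUM"""
    toks = tokens + ["EOF"]
    pos = 0

    def tok():
        return toks[pos]

    def eat(t):
        nonlocal pos
        if tok() == t:
            pos += 1
        else:
            raise ParseError(f"expected {t}, saw {tok()}")

    def S():
        if tok() == "IF":
            eat("IF"); E(); eat("THEN"); S(); eat("ELSE"); S()
        elif tok() == "BEGIN":
            eat("BEGIN"); S(); L()
        elif tok() == "PRINT":
            eat("PRINT"); E()
        else:
            raise ParseError(f"unexpected {tok()}")

    def L():
        if tok() == "END":
            eat("END")
        elif tok() == "SEMI":
            eat("SEMI"); S(); L()
        else:
            raise ParseError(f"unexpected {tok()}")

    def E():
        eat("NUM"); eat("EQ"); eat("NUM")

    S()
    eat("EOF")

# BEGIN PRINT NUM = NUM ; PRINT NUM = NUM END  -- parses successfully
parse(["BEGIN", "PRINT", "NUM", "EQ", "NUM", "SEMI",
       "PRINT", "NUM", "EQ", "NUM", "END"])
```

Each non-terminal is one function, each production one branch, exactly as on the slide; one token of lookahead (`tok()`) picks the branch.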

A new example:

non-terminals: S, A, E, L

rules:

1. S ::= A EOF
2. A ::= ID := E
3.   | PRINT ( L )
4. E ::= ID
5.   | NUM
6. L ::= E
7.   | L , E

fun S () = (A (); eat EOF)

and A () = case !tok of
    ID    => (eat ID; eat ASSIGN; E ())
  | PRINT => (eat PRINT; eat LPAREN; L (); eat RPAREN)

and E () = case !tok of
    ID  => eat ID
  | NUM => eat NUM

and L () = case !tok of
    ID  => ???
  | NUM => ???

problem
  • predictive parsing only works for grammars where the first terminal symbol in the input provides enough information to choose which production to use
    • LL(1)
  • when parsing L, if !tok = ID, the parser cannot determine which production to use:

6. L ::= E (E could be ID)

7. | L , E (L could be E could be ID)

solution
  • eliminate left-recursion
  • rewrite the grammar so it parses the same language but the rules are different:

before:

S ::= A EOF
A ::= ID := E
    | PRINT ( L )
E ::= ID
    | NUM
L ::= E
    | L , E

after:

S ::= A EOF
A ::= ID := E
    | PRINT ( L )
E ::= ID
    | NUM
L ::= E M
M ::= , E M
    |

eliminating single left-recursion
  • Original grammar form:

X ::= base
    | X repeat

Strings: base repeat repeat ...

  • Transformed grammar:

X ::= base Xnew
Xnew ::= repeat Xnew
       |

Strings: base repeat repeat ...

Think about: what if you have mutually left-recursive variables X,Y,Z?

What’s the most general pattern of left recursion? How to eliminate it?
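The transformation above can be sketched as a Python function over a grammar represented as a dict mapping each non-terminal to a list of right-hand sides. The representation and the primed-name convention are mine, not the slides'; this handles only immediate (single) left recursion, not the mutually recursive case posed in the question.

```python
def eliminate_left_recursion(rules, x):
    """Rewrite  X ::= base | X repeat  into  X ::= base X'; X' ::= repeat X' | ε.
    `rules` maps non-terminal -> list of right-hand sides (lists of symbols)."""
    recursive = [rhs[1:] for rhs in rules[x] if rhs and rhs[0] == x]   # the "repeat" parts
    bases = [rhs for rhs in rules[x] if not rhs or rhs[0] != x]        # the "base" parts
    if not recursive:
        return rules                     # nothing to do
    new = x + "'"                        # fresh non-terminal ("Xnew")
    out = dict(rules)
    out[x] = [rhs + [new] for rhs in bases]
    out[new] = [rhs + [new] for rhs in recursive] + [[]]   # [] is the ε rule
    return out

# L ::= E | L , E   becomes   L ::= E L'; L' ::= , E L' | ε
g = {"L": [["E"], ["L", ",", "E"]]}
print(eliminate_left_recursion(g, "L"))
```

Both grammars generate the same strings: one E followed by zero or more ", E" repeats.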

Recursive Descent Parsing
  • Unfortunately, can’t always eliminate left recursion
  • Questions:
    • how do we know when we can parse grammars using recursive descent?
    • Is there an algorithm for generating such parsers automatically?
Constructing RD Parsers
  • To construct an RD parser, we need to know what rule to apply when
    • we are trying to parse a non-terminal X
    • we see the next terminal a in input
  • We apply rule X ::= s when
    • a is the first symbol that can be generated by string s, OR
    • s reduces to the empty string (is nullable) and a is the first symbol in any string that can follow X
Constructing Predictive Parsers

1. Y ::=
2.   | bb
3. X ::= c
4.   | Y Z
5. Z ::= d

The slides fill in a table with one row per non-terminal and one column per next terminal seen; each entry is the rule to apply. For this grammar the completed table is:

                 b          c          d
      X       rule 4     rule 3     rule 4
      Y       rule 2                rule 1
      Z                             rule 5

(X uses rule 4 on b because Y can begin with bb, and on d because Y is nullable and Z can begin with d; Y uses rule 1 on d because d can follow Y.)

Constructing Predictive Parsers
  • in general, must compute:
    • for each production X ::= s, must determine if s can derive the empty string.
      • if yes, X ∈ Nullable
    • for each production X ::= s, must determine the set of all first terminals Q derivable from s
      • Q ⊆ First(X)
    • for each non-terminal X, determine all terminal symbols Q that immediately follow X
      • Q ⊆ Follow(X)
Iterative Analysis
  • Many compiler algorithms use iterative techniques.
  • Iterative analysis applies when:
    • must compute a set of objects with some property P
    • P is defined inductively. ie, there are:
      • base cases: objects o1, o2 “obviously” have property P
      • inductive cases: if certain objects (o3, o4) have property P, this implies other objects (f o3, f o4) have property P
    • The number of objects in the set is finite
      • or we can represent infinite collections using some finite notation & we can find effective termination conditions
Iterative Analysis
  • general form:
    • initialize set S with base cases
    • apply inductive rules over and over until you reach a fixed point
  • a fixed point is a set that does not change when you apply an inductive rule (function)
    • Nullable, First and Follow sets can be determined through iteration
    • many program analyses & optimizations use iteration
    • worst-case complexity is bad
    • average-case complexity can be good: iteration “usually” terminates in a couple of rounds
Iterative Analysis (pictorially, over several slides)
  • Base cases “obviously” have the property.
  • Apply a function (rule) to things that have the property to produce new things that have the property.
  • Now the base cases + things you get by applying rules to base cases have the property.
  • Apply rules again. Apply rules again.
  • Finally, you reach a fixed point.
Iterative Analysis
  • Example:
    • axioms are “obviously true”/taken for granted
    • rules of logic take basic axioms and prove new things are true

axioms:

“Dave teaches cos 441”
“Dave teaches cos 320”
“Dave is a great teacher”

rule r: X is a great teacher /\ X teaches Y => Y is a great class

applying r yields: “cos 441 is a great class”, “cos 320 is a great class”

applying r again adds nothing new: Fixed Point Reached!
Nullable Sets
  • Non-terminal X is Nullable only if the following constraints are satisfied
    • base case:
      • if (X ::= ε) then X is Nullable
    • inductive case:
      • if (X ::= A B C ...) and A, B, C, ... are all Nullable then X is Nullable
Computing Nullable Sets
  • Compute the Nullable set by iteration:
    • Initialization:
      • Nullable := { }
      • if (X ::= ε) then Nullable := Nullable U {X}
    • While Nullable differs from the last iteration do:
      • for all X,
        • if (X ::= A B C ...) and A, B, C, ... are all Nullable then Nullable := Nullable U {X}
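The Nullable iteration above can be sketched in Python; the grammar dict below is illustrative (Y ::= c | ε, X ::= a | b Y e, Z ::= X Y Z | d), and the representation is mine.

```python
GRAMMAR = {
    "Y": [["c"], []],               # Y ::= c | ε  ([] is the ε rule)
    "X": [["a"], ["b", "Y", "e"]],  # X ::= a | b Y e
    "Z": [["X", "Y", "Z"], ["d"]],  # Z ::= X Y Z | d
}

def nullable(grammar):
    ns = set()
    changed = True
    while changed:                  # iterate to a fixed point
        changed = False
        for x, rhss in grammar.items():
            if x in ns:
                continue
            # X joins Nullable if some rule's symbols are all Nullable
            # (an ε rule has no symbols, so it qualifies vacuously)
            if any(all(sym in ns for sym in rhs) for rhs in rhss):
                ns.add(x)
                changed = True
    return ns

print(nullable(GRAMMAR))
```

Here only Y is Nullable: X and Z have no rule whose symbols are all Nullable.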
First Sets
  • First(X) is specified like this:
    • base case:
      • if T is a terminal symbol then First (T) = {T}
    • inductive case:
      • if X is a non-terminal and (X ::= A B C ...) then
        • First (X) = First (A B C ...)

where First(A B C ...) = F1 U F2 U F3 U ... and

          • F1 = First (A)
          • F2 = First (B), if A is Nullable; emptyset otherwise
          • F3 = First (C), if A is Nullable & B is Nullable; emptyset otherwise
          • ...
Computing First Sets
  • Compute First(X):
    • initialize:
      • if T is a terminal symbol then First (T) = {T}
      • if T is a non-terminal then First (T) = { }
    • while First(X) changes (for any X) do
      • for all X and all rules (X ::= A B C ...) do
        • First (X) := First(X) U First (A B C ...)

where First(A B C ...) := F1 U F2 U F3 U ... and

          • F1 := First (A)
          • F2 := First (B), if A is Nullable; emptyset otherwise
          • F3 := First (C), if A is Nullable & B is Nullable; emptyset otherwise
          • ...
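The First-set iteration can be sketched in Python, using the illustrative grammar Y ::= c | ε, X ::= a | b Y e, Z ::= X Y Z | d, with Nullable = {Y} taken as already computed. The dict representation is mine.

```python
GRAMMAR = {
    "Y": [["c"], []],
    "X": [["a"], ["b", "Y", "e"]],
    "Z": [["X", "Y", "Z"], ["d"]],
}
NULLABLE = {"Y"}

def first_of_string(symbols, first):
    """First(A B C ...) = F1 U F2 U ..., stopping at the
    first symbol that is not Nullable."""
    out = set()
    for sym in symbols:
        out |= first[sym] if sym in first else {sym}  # a terminal's First is itself
        if sym not in NULLABLE:
            break
    return out

def compute_first(grammar):
    first = {x: set() for x in grammar}
    changed = True
    while changed:                  # iterate to a fixed point
        changed = False
        for x, rhss in grammar.items():
            for rhs in rhss:
                f = first_of_string(rhs, first)
                if not f <= first[x]:
                    first[x] |= f
                    changed = True
    return first

print(compute_first(GRAMMAR))
```

For this grammar the fixed point is First(Y) = {c}, First(X) = {a, b}, First(Z) = {a, b, d}: X is not Nullable, so First(X Y Z) stops at X.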
Computing Follow Sets
  • Follow(X) is computed iteratively
    • base case:
      • initially, we assume nothing in particular follows X
        • (when computing, Follow (X) is initially { })
    • inductive case:
      • if (Y ::= s1 X s2) for any strings s1, s2 then
        • Follow (X) includes First (s2)
      • if (Y ::= s1 X s2) for any strings s1, s2 and s2 is Nullable then
        • Follow (X) includes Follow (Y)
building a predictive parser

Y ::= c
Y ::=
X ::= a
X ::= b Y e
Z ::= X Y Z
Z ::= d

(The slides step through the iterations in tables that are not reproduced here; the computed sets are summarized below.)

Nullable:
  base case: Y ∈ Nullable (from the rule Y ::= ε)
  after one round of induction, we realize we have reached a fixed point:
  Nullable = { Y }

First:
  base case: each terminal’s First set is itself
  rounds 1 and 2 propagate new symbols through the rules
  after three rounds of iteration, no more changes ==> fixed point:
  First(Y) = { c }   First(X) = { a, b }   First(Z) = { a, b, d }

Follow:
  base case: Follow(X) = Follow(Y) = Follow(Z) = { }
  after two rounds of induction, fixed point:
  Follow(X) = { a, b, c, d }   Follow(Y) = { a, b, d, e }   Follow(Z) = { }

(but notice, computing Follow(X) before Follow(Y) would have required a 3rd round)
Grammar:

Z ::= X Y Z
Z ::= d
Y ::= c
Y ::=
X ::= a
X ::= b Y e

Build parsing table where row X, col T tells parser which clause to execute in function X with next-token T:

  • if T ∈ First(s) then enter (X ::= s) in row X, col T
  • if s is Nullable and T ∈ Follow(X) then enter (X ::= s) in row X, col T

Resulting table (reconstructed from the computed sets):

            a              b              c           d           e
  X     X ::= a       X ::= b Y e
  Y     Y ::=         Y ::=          Y ::= c      Y ::=       Y ::=
  Z     Z ::= X Y Z   Z ::= X Y Z                 Z ::= d
What are the blanks? --> syntax errors: if function X sees a next-token T with no entry in row X, col T, the input is not in the language.

Is it possible to put 2 grammar rules in the same box? Yes: add the rule Z ::= d e to the grammar:

Z ::= X Y Z
Z ::= d
Z ::= d e
Y ::= c
Y ::=
X ::= a
X ::= b Y e

Now both (Z ::= d) and (Z ::= d e) are entered in row Z, col d: a duplicate entry.

predictive parsing tables
  • if a predictive parsing table constructed this way contains no duplicate entries, the grammar is called LL(1)
    • Left-to-right parse, Left-most derivation, 1 symbol lookahead
  • if not, the grammar is not LL(1)
  • in an LL(k) parsing table, columns include every k-length sequence of terminals
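Table construction with duplicate detection can be sketched in Python for the grammar just shown. The First/Follow/Nullable sets are the ones computed for that grammar; the dict representation is mine.

```python
GRAMMAR = {
    "Z": [["X", "Y", "Z"], ["d"]],
    "Y": [["c"], []],
    "X": [["a"], ["b", "Y", "e"]],
}
NULLABLE = {"Y"}
FIRST = {"X": {"a", "b"}, "Y": {"c"}, "Z": {"a", "b", "d"}}
FOLLOW = {"X": {"a", "b", "c", "d"}, "Y": {"a", "b", "d", "e"}, "Z": set()}

def first_of_string(symbols):
    out = set()
    for sym in symbols:
        out |= FIRST.get(sym, {sym})   # a terminal's First is itself
        if sym not in NULLABLE:
            break
    return out

def build_table(grammar):
    """Entry (X, T) -> right-hand side; duplicates mean the grammar is not LL(1)."""
    table, conflicts = {}, []
    for x, rhss in grammar.items():
        for rhs in rhss:
            preds = first_of_string(rhs)               # T in First(s)
            if all(sym in NULLABLE for sym in rhs):    # s Nullable: add Follow(X)
                preds |= FOLLOW[x]
            for t in preds:
                if (x, t) in table:
                    conflicts.append((x, t))
                table[(x, t)] = rhs
    return table, conflicts

table, conflicts = build_table(GRAMMAR)
assert conflicts == []                 # this grammar is LL(1)

GRAMMAR["Z"].append(["d", "e"])        # add the problematic rule
_, conflicts = build_table(GRAMMAR)    # now row Z, col d has a duplicate
```

The duplicate at (Z, d) is exactly the "2 grammar rules in the same box" case from the slides.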
another trick
  • Previously, we saw that grammars with left-recursion were problematic, but could be transformed into LL(1) in some cases
  • the example non-LL(1) grammar we just saw:

Z ::= X Y Z
Z ::= d
Z ::= d e
Y ::= c
Y ::=
X ::= a
X ::= b Y e

  • how do we fix it? the solution here is left-factoring:

Z ::= X Y Z
Z ::= d W
W ::=
W ::= e
Y ::= c
Y ::=
X ::= a
X ::= b Y e

summary
  • CFGs are good at specifying programming language structure
  • parsing general CFGs is expensive so we define parsers for simple classes of CFG
    • LL(k), LR(k)
  • we can build a recursive descent parser for LL(k) grammars by:
    • computing nullable, first and follow sets
    • constructing a parse table from the sets
    • checking for duplicate entries, which indicates failure
    • creating an ML program from the parse table
  • if parser construction fails we can
    • rewrite the grammar (left factoring, eliminating left recursion) and try again
    • try to build a parser using some other method