
Week 3


- Questions / Concerns
- What’s due:
- Lab1b due Friday at midnight
- Lab1b check-off next week (schedule will be announced on Monday)
- Homework #2 due next Monday (Draw a parse tree)
- Homework #3 due next Wednesday (Define grammar for your language)
- Homework #4 due next Thursday (Grammar modifications)

- Top down parser
- Grammar modifications

[Figure: compiler pipeline. Skeletal source program -> preprocessor -> modified source program -> lexical analyzer (scanner) -> tokens -> syntax analysis (parser) -> syntactic structure -> semantic analysis -> intermediate representation -> optimizer -> code generator -> target machine code. The symbol table sits alongside the phases.]

- Choose a type of parser
- Top-Down parser
- Bottom-Up parser

- Choose a parsing technique
- Recursive Descent
- Table driven parser (LL(1) or LR(1))

- Generate a grammar for your language
- Modify the grammar to fit the particular parsing technique
- Remove lambda productions
- Remove unit productions
- Remove left recursion
- Left factor the grammar

- The parser is just a matching tool
- It matches a list of tokens against the grammar rules to determine whether they form legal constructs/statements.
- Yes/No machine
- Context-Free
- It doesn’t care about context (types); it only cares about syntax.
- If it looks like an assignment statement, then it is an assignment statement.

int x;

x = “Hello”;
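A minimal sketch of this point, assuming a hypothetical scanner that emits (kind, text) pairs. The function name `is_assignment` and the token kinds are illustrative, not part of the course material. The check looks only at token kinds, so the type mismatch in the statement above goes unnoticed by the parser.

```python
def is_assignment(tokens):
    """Return True if the tokens match: id '=' (id | num | string) ';'."""
    kinds = [kind for kind, _text in tokens]
    return (
        len(kinds) == 4
        and kinds[0] == "id"
        and kinds[1] == "="
        and kinds[2] in ("id", "num", "string")
        and kinds[3] == ";"
    )


# Tokens a scanner might produce for:  x = "Hello";
tokens = [("id", "x"), ("=", "="), ("string", "Hello"), (";", ";")]

# The parser answers "yes": the statement is syntactically legal even though
# x was declared as int. Catching that is left to semantic analysis.
print(is_assignment(tokens))
```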

S -> aaSc | B

B -> bbbB | λ

Generate a parse tree for the input string

aaaabbbcc
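One way to check a hand-drawn parse tree is to let a small recursive-descent recognizer list the rules it applies. The sketch below assumes the empty alternative of B is a lambda production; the function names are illustrative. The rules are listed in the order a top-down parser expands them, which is the tree read from the root down.

```python
def parse(text):
    """Recognize `text` with S -> aaSc | B, B -> bbbB | lambda.

    Returns the list of rules applied, or raises SyntaxError.
    """
    pos = 0
    steps = []

    def expect(ch):
        nonlocal pos
        if pos < len(text) and text[pos] == ch:
            pos += 1
        else:
            raise SyntaxError(f"expected {ch!r} at position {pos}")

    def S():
        # One token of lookahead picks the rule: 'a' means S -> aaSc.
        if pos < len(text) and text[pos] == "a":
            steps.append("S -> aaSc")
            expect("a"); expect("a"); S(); expect("c")
        else:
            steps.append("S -> B")
            B()

    def B():
        if pos < len(text) and text[pos] == "b":
            steps.append("B -> bbbB")
            expect("b"); expect("b"); expect("b"); B()
        else:
            steps.append("B -> lambda")

    S()
    if pos != len(text):
        raise SyntaxError(f"unexpected input at position {pos}")
    return steps


for rule in parse("aaaabbbcc"):
    print(rule)
```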

S -> E

E -> E + E

E -> E * E

E -> a |b | c

Generate a parse tree for the input string

a + b * c
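This grammar allows more than one parse tree for the same input, which is worth checking before drawing. The sketch below counts the distinct E-trees for a token list by trying every operator as the root; `count_trees` is an illustrative helper, not part of the course material.

```python
from functools import lru_cache


def count_trees(tokens):
    """Count parse trees for E -> E + E | E * E | a | b | c."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of E-trees that span tokens[i:j].
        if j - i == 1:
            return 1 if tokens[i] in ("a", "b", "c") else 0
        total = 0
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):
                # tokens[k] is the root operator; combine left and right subtrees.
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_trees(["a", "+", "b", "*", "c"]))
```

For `a + b * c` the count is 2: one tree has `+` at the root and one has `*` at the root, so the grammar as written is ambiguous.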

- Lua Grammar

- Two formats
- Context-Free Grammar
- Extended Backus-Naur Form
Lua Example

laststat ::= return [explist] | break

Laststat -> return LaststatOptional | break

LaststatOptional -> Explist | λ

varlist ::= var {`,´ var}

Varlist -> Var Varlist2

Varlist2 -> `,´ Var Varlist2 | λ

Mini C example

Program = Definition { Definition }

Program -> Definition MoreDefinitions

MoreDefinitions -> Definition MoreDefinitions | λ

Definition = Data_definition | Function_definition

Definition -> Data_definition | Function_definition

Function_definition = ['int'] Function_header Function_body

Function_definition -> OptionalType Function_header Function_body

OptionalType -> ‘int’ | λ
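The EBNF-to-CFG rewrites above follow a mechanical pattern: an optional part `[x]` becomes a new non-terminal with the rules `x` and lambda, and a repetition `{x}` becomes a new right-recursive non-terminal. The sketch below assumes a hand-built representation in which optional and repeated parts are tagged tuples; the representation and the generated names (`Varlist1` and so on) are illustrative choices, not the names used on the slides.

```python
def ebnf_to_cfg(head, alternatives):
    """Convert one EBNF rule into plain CFG rules.

    alternatives: list of sequences. Each item is a symbol (str) or a tuple
    ("opt", [items]) for [ ... ] or ("rep", [items]) for { ... }.
    Returns {non_terminal: [right-hand sides]}; [] stands for lambda.
    """
    rules = {}
    counter = 0

    def fresh():
        nonlocal counter
        counter += 1
        return f"{head}{counter}"

    def convert(seq):
        out = []
        for item in seq:
            if isinstance(item, str):
                out.append(item)
                continue
            kind, body = item
            name = fresh()
            inner = convert(body)
            if kind == "opt":
                rules[name] = [inner, []]            # X -> body | lambda
            else:
                rules[name] = [inner + [name], []]   # X -> body X | lambda
            out.append(name)
        return out

    rules[head] = [convert(alt) for alt in alternatives]
    return rules


# varlist ::= var {`,´ var}
print(ebnf_to_cfg("Varlist", [["Var", ("rep", [",", "Var"])]]))
# laststat ::= return [explist] | break
print(ebnf_to_cfg("Laststat", [["return", ("opt", ["Explist"])], ["break"]]))
```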

- Start with start symbol of the grammar.
- Grab an input token and select a production rule.
- Use a stack to store the production rule.

- Try to parse that rule by matching input tokens.
- Keep going until all of the input tokens have been processed.
- If the rule is not the right one, put all the tokens back and try a different rule. (backtracking)
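The steps above can be sketched as a small backtracking recognizer. This is a simplified illustration, not the course's parser: Python's call stack plays the role of the explicit stack, `Name` and `Funcbody` are treated as terminal tokens, and `LocalOptional` is assumed to have only a lambda rule. It would loop forever on a left-recursive grammar.

```python
def derive(grammar, symbols, tokens, pos):
    """Yield every input position reachable after matching `symbols` from `pos`."""
    if not symbols:
        yield pos
        return
    first, rest = symbols[0], symbols[1:]
    if first in grammar:
        # Non-terminal: try each rule in order. A rule that fails yields
        # nothing, and the next rule restarts from the same `pos`, which is
        # the "put the tokens back" step.
        for rhs in grammar[first]:
            for mid in derive(grammar, rhs, tokens, pos):
                yield from derive(grammar, rest, tokens, mid)
    elif pos < len(tokens) and tokens[pos] == first:
        # Terminal: match one input token.
        yield from derive(grammar, rest, tokens, pos + 1)


def accepts(grammar, start, tokens):
    """True if some derivation from `start` consumes all of `tokens`."""
    return any(end == len(tokens) for end in derive(grammar, [start], tokens, 0))


GRAMMAR = {
    "Stat": [["local", "function", "Name", "Funcbody"],
             ["local", "Namelist", "LocalOptional"]],
    "Namelist": [["Name"]],
    "LocalOptional": [[]],  # lambda only, for this sketch
}

# The first Stat rule matches "local" and then fails on "function",
# so the recognizer backs up and succeeds with the second rule.
print(accepts(GRAMMAR, "Stat", ["local", "Name"]))
```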

- Ideal grammar:
- Unique rule for each type of token.
- One-token look ahead


Stat -> local function Name Funcbody

| local Namelist LocalOptional

- Based on the single token “local”, we should be able to pick one unique rule so we don’t have to backtrack.
- If we combine these two rules into one by factoring out the common part, the need for backtracking goes away.


- Left factor the grammar:

Stat -> local Morelocal

Morelocal -> function Name Funcbody

| Namelist LocalOptional
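Left factoring can be sketched as a single pass over the alternatives of one non-terminal: group them by first symbol, pull out the longest common prefix of each group, and move the remainders into a new non-terminal. The function names and the generated names (`Stat_1` here, playing the role of `Morelocal`) are illustrative. The sketch makes one pass only; the new rules may need factoring again.

```python
def common_prefix(seqs):
    """Longest list of symbols that every sequence in `seqs` starts with."""
    prefix = []
    for column in zip(*seqs):
        if all(sym == column[0] for sym in column):
            prefix.append(column[0])
        else:
            break
    return prefix


def left_factor(head, alternatives):
    """Left factor the alternatives of `head` once. [] stands for lambda."""
    groups = {}
    for alt in alternatives:
        key = alt[0] if alt else None
        groups.setdefault(key, []).append(alt)

    rules = {head: []}
    counter = 0
    for key, group in groups.items():
        if key is None or len(group) == 1:
            rules[head].extend(group)  # nothing to factor
            continue
        counter += 1
        new_name = f"{head}_{counter}"
        prefix = common_prefix(group)
        rules[head].append(prefix + [new_name])
        rules[new_name] = [alt[len(prefix):] for alt in group]
    return rules


print(left_factor("Stat", [
    ["local", "function", "Name", "Funcbody"],
    ["local", "Namelist", "LocalOptional"],
]))
```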

- Ideal grammar:
- Unique rule for each type of token.
- One-token look ahead

- Minimize unit productions
- A unit production doesn’t consume any tokens itself; it requires another production.
- It’s hard to tell which tokens match a unit production, so there are more chances for backtracking.


S -> aaSc

S -> B

B -> bbbB

B -> λ

[Figure: partial parse tree in which S expands to B and B expands to b b b B.]

Exp -> nil | false | true | Number | String | `...´ | Functioncall | Prefixexp | Tableconstructor | Exp Binop Exp | Unop Exp

S -> aaSc

S -> B

B -> bbbB

B -> λ

Removing the unit production S -> B gives:

S -> aaSc

S -> bbbB

S -> λ

B -> bbbB

B -> λ
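The replacement shown above can be done mechanically: for each non-terminal, collect every non-terminal reachable through unit productions alone, then copy over their non-unit rules. A minimal sketch, assuming a grammar stored as a dict of right-hand-side lists with `[]` for lambda (the representation and function name are illustrative):

```python
def remove_unit_productions(grammar):
    """Return an equivalent grammar with no rules of the form A -> B."""

    def is_unit(rhs):
        return len(rhs) == 1 and rhs[0] in grammar

    result = {}
    for head in grammar:
        # Non-terminals reachable from `head` using unit productions only.
        reachable, todo = [head], [head]
        while todo:
            current = todo.pop()
            for rhs in grammar[current]:
                if is_unit(rhs) and rhs[0] not in reachable:
                    reachable.append(rhs[0])
                    todo.append(rhs[0])
        # Keep every non-unit rule of every reachable non-terminal.
        rules = []
        for nt in reachable:
            for rhs in grammar[nt]:
                if not is_unit(rhs) and rhs not in rules:
                    rules.append(rhs)
        result[head] = rules
    return result


GRAMMAR = {
    "S": [["a", "a", "S", "c"], ["B"]],
    "B": [["b", "b", "b", "B"], []],
}
print(remove_unit_productions(GRAMMAR))
```

Note that S gains one rule for each rule of B, which is why removing unit productions tends to grow the grammar.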

Exp -> nil | false | true | Number | String | `...´ | Functioncall | Prefixexp | Tableconstructor | Exp Binop Exp | Unop Exp

Replacing the unit production Tableconstructor with its right-hand side:

Exp -> nil | false | true | Number | String | `...´ | Functioncall | Prefixexp | { Fieldlistoptional } | Exp Binop Exp | Unop Exp

Replacing the unit production Functioncall as well:

Exp -> nil | false | true | Number | String | `...´ | Prefixexp Args | Prefixexp `:´ Name Args | Prefixexp | { Fieldlistoptional } | Exp Binop Exp | Unop Exp

More left factoring is needed: three alternatives now start with Prefixexp.


- Lambda productions are okay but we have to process them accordingly.
- Removing lambdas always adds more rules.
- It’s not possible to remove all lambda productions and still yield unique token-rule matching.

- Remove left recursion in the grammar.


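Right Recursion

A -> aA

A -> λ

Left Recursion

A -> Aa

A -> λ

Same grammar?

[Figure: parse trees for a string of a's under each grammar. With A -> Aa the tree grows down its left edge; with A -> aA it grows down its right edge. Callout: the only non-recursive rule is A -> λ.]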

Which one works for top down?

A -> Aa

A -> λ

A -> aA

A -> λ

[Figure: the same two parse trees, left-recursive and right-recursive.]
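A quick way to see which form suits a top-down parser is to code both rules directly as recursive functions. In the sketch below (function names are illustrative), the left-recursive version must parse A before it has consumed any input, so it calls itself with the same position until Python gives up with a `RecursionError`.

```python
def parse_right(text, pos=0):
    """A -> aA | lambda. Returns the position after the match."""
    if pos < len(text) and text[pos] == "a":
        return parse_right(text, pos + 1)  # a token is consumed before recursing
    return pos


def parse_left(text, pos=0):
    """A -> Aa | lambda, trying the recursive rule first."""
    end = parse_left(text, pos)  # recurses on the same position: no progress
    if end < len(text) and text[end] == "a":
        return end + 1
    return end


def terminates(parser, text):
    """Run `parser` and report whether it finished without runaway recursion."""
    try:
        parser(text)
        return True
    except RecursionError:
        return False


print("right recursion terminates:", terminates(parse_right, "aaa"))
print("left recursion terminates:", terminates(parse_left, "aaa"))
```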

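Same grammar?

A -> Aa

A -> b

A -> aA

A -> b

[Figure: parse trees under each grammar, one growing down the left edge and one down the right edge, each ending in b. Callout: non-recursive rules are not only lambda productions; here the non-recursive rule is A -> b.]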
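To answer "same grammar?" concretely, one can list the short strings each grammar generates. The sketch below expands the leftmost non-terminal exhaustively and prunes forms with too many terminals; it assumes every recursive rule adds at least one terminal (true for the grammars on these slides), otherwise it would not terminate. `language` is an illustrative helper.

```python
def language(grammar, start, max_len):
    """All terminal strings of length <= max_len derivable from `start`."""
    found, seen = set(), set()
    todo = [(start,)]
    while todo:
        form = todo.pop()
        if form in seen:
            continue
        seen.add(form)
        # Prune sentential forms that already hold too many terminals.
        if sum(1 for sym in form if sym not in grammar) > max_len:
            continue
        idx = next((i for i, sym in enumerate(form) if sym in grammar), None)
        if idx is None:
            found.add("".join(form))  # no non-terminals left: a sentence
            continue
        for rhs in grammar[form[idx]]:
            todo.append(form[:idx] + tuple(rhs) + form[idx + 1:])
    return found


LEFT = {"A": [["A", "a"], ["b"]]}    # A -> Aa | b
RIGHT = {"A": [["a", "A"], ["b"]]}   # A -> aA | b

print(sorted(language(LEFT, "A", 3)))
print(sorted(language(RIGHT, "A", 3)))
```

With the lambda rule the two forms generate the same strings of a's, but with A -> b they do not: the left-recursive grammar puts the b first (b, ba, baa) and the right-recursive one puts it last (b, ab, aab). Simply flipping a left-recursive rule therefore changes the language.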

- Example:

A -> Aa

A -> b

(A rule of the form A -> A… is a left-recursive rule.)

- Step 1: Make every left-recursive rule right recursive, under a new non-terminal.

A -> Aa becomes X -> aX

- Step 2: Add a lambda production for the new non-terminal.

X -> λ

- Step 3: Identify all non-recursive rules.

A -> b

- Step 4: Append the new non-terminal to the end of every non-recursive rule.

A -> bX
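The four steps can be sketched as one function over the alternatives of a single non-terminal. It handles only immediate left recursion (rules of the form A -> A…), and the caller supplies the new non-terminal's name; both the function name and the list-of-lists representation are illustrative assumptions.

```python
def remove_left_recursion(head, alternatives, new_name):
    """Remove immediate left recursion from the rules of `head`.

    alternatives: list of right-hand sides (lists of symbols); [] is lambda.
    """
    # Left-recursive rules, with the leading `head` stripped off.
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == head]
    # Step 3: the non-recursive rules.
    non_recursive = [alt for alt in alternatives if not alt or alt[0] != head]
    if not recursive:
        return {head: alternatives}
    return {
        # Step 4: append the new non-terminal to each non-recursive rule.
        head: [alt + [new_name] for alt in non_recursive],
        # Steps 1 and 2: right-recursive rules plus a lambda production.
        new_name: [tail + [new_name] for tail in recursive] + [[]],
    }


print(remove_left_recursion("A", [["A", "a"], ["b"]], "X"))
print(remove_left_recursion("S", [["S", "a", "b"], ["c"], ["d"]], "X"))
```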

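Same grammar?

A -> bX

X -> aX | λ

A -> Aa

A -> b

[Figure: parse trees for the same input under both grammars. The original tree grows down its left edge through A; the rewritten tree starts with b and grows down its right edge through X. Callout: non-recursive rules are not only lambda productions.]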

S -> Sab

S -> c

S -> d

becomes

X -> abX

X -> λ

S -> cX

S -> dX

PARAMLIST -> IDLIST : TYPE

| PARAMLIST ; IDLIST : TYPE

becomes

PARAMLIST -> IDLIST : TYPE PARAMLIST2

PARAMLIST2 -> ; IDLIST : TYPE PARAMLIST2

PARAMLIST2 -> λ

S -> abSc

S -> A

S -> AB

A -> aA

A -> λ

B -> bbB

B -> λ

Removing the unit production S -> A gives:

S -> abSc

S -> aA

S -> λ

S -> AB

A -> aA

A -> λ

B -> bbB

B -> λ

TERM -> FACTOR

FACTOR -> id

| id ( EXPR_LIST )

| num

| ( EXPRESSION )

| not FACTOR

Removing the unit production TERM -> FACTOR gives:

TERM -> id

| id ( EXPR_LIST )

| num

| ( EXPRESSION )

| not FACTOR

FACTOR -> id

| id ( EXPR_LIST )

| num

| ( EXPRESSION )

| not FACTOR

S -> abS

S -> aaA

S -> a

A -> bA

A -> λ

Left factoring the common a gives:

S -> aX

X -> bS

X -> aA

X -> λ

A -> bA

A -> λ

EXPRESSION -> SIMPLE_EXPR

| SIMPLE_EXPR relop SIMPLE_EXPR

becomes

EXPRESSION -> SIMPLE_EXPR RestOfExp

RestOfExp -> λ

| relop SIMPLE_EXPR

- Remove Unit Production

S -> abS | bSa | A | d

A -> c | dA

- Left Factor this grammar
FACTOR -> id

| id ( EXPR_LIST )

| num

| ( EXPRESSION )

| not FACTOR

- Remove Left recursion:
SIMPLE_EXPR -> TERM

| SIGN TERM

| SIMPLE_EXPR addop TERM