
Chapter 4: Syntax Analysis Part 1: Grammar Concepts

Prof. Steven A. Demurjian

Computer Science & Engineering Department

The University of Connecticut

371 Fairfield Way, Unit 2155

Storrs, CT 06269-3155

[email protected]

http://www.engr.uconn.edu/~steve

(860) 486 - 4818

Material for course thanks to:

Laurent Michel

Aggelos Kiayias

Robert LeBarre



Syntax Analysis - Parsing

  • An overview of parsing : Functions & Responsibilities

  • Context Free Grammars: Concepts & Terminology

  • Writing and Designing Grammars

  • Resolving Grammar Problems / Difficulties

  • Top-Down Parsing

    • Recursive Descent & Predictive LL

  • Bottom-Up Parsing

    • LR & LALR

  • Key Issues in Error Handling

  • Concluding Remarks/Looking Ahead



An Overview of Parsing

  • Why are grammars that formally describe languages important ?

    • Precise, easy-to-understand representations

    • Compiler-writing tools can take grammar and generate a compiler

    • Allow the language to evolve (new statements, changes to statements, etc.); languages are not static, but are constantly upgraded to add new features or fix “old” ones

    • ADA → ADA9x; C++ adds: templates, exceptions, …

How do grammars relate to the parsing process ?



Parsing During Compilation

[Diagram: the source program feeds the lexical analyzer (built from regular expressions); the parser calls “get next token”, uses the symbol table, reports errors, and builds a parse tree that is handed to the rest of the front end, which produces the intermediate representation]

  • also technically part of parsing

  • includes augmenting info on tokens in source, type checking, semantic analysis

  • uses a grammar to check structure of tokens

  • produces a parse tree

  • syntactic errors and recovery

  • recognize correct syntax

  • report errors



Parsing Responsibilities

  • Syntax Error Identification / Handling

  • Recall typical error types:

    • Lexical : Misspellings

    • Syntactic : Omission, wrong order of tokens

    • Semantic : Incompatible types

    • Logical : Infinite loop / recursive call

  • Majority of error processing occurs during syntax analysis

  • NOTE: Not all errors are identifiable !! Which ones?



Key issues in Error Handling

  • Detection

    • Actual Detection

    • Finding position at which they occur

  • Reporting

    • Clear / accurate presentation

  • Recovery

    • How to skip over the erroneous input to continue and find later errors

    • Must not impact compilation of correct programs



What are some Typical Errors ?

#include <stdio.h>

int f1(int v)
{ int i, j = 0;
  for (i = 1; i < 5; i++)
  { j = v + f2(i) }
  return j; }

int f2(int u)
{ int j;
  j = u + f1(u*u)
  return j; }

int main()
{ int i, j = 0;
  for (i = 1; i < 10; i++)
  { j = j + i*I; printf("%d\n", i);
  printf("%d\n", f1(j));
  return 0;
}

As reported by MS VC++

'f2' undefined; assuming extern returning int
syntax error : missing ';' before '}'
syntax error : missing ';' before 'return'
fatal error : unexpected end of file found

Which are “easy” to recover from? Which are “hard” ?



Error Recovery Strategies

Panic Mode – Discard tokens until a “synchro” token is found (end, “;”, “}”, etc.)

-- Choice of synchronizing tokens is a decision of the designer

-- Problems: skipping input may miss a declaration – causing more errors; errors in the skipped material are never reported

-- Advantages: simple; well suited to one error per statement

Phase Level – Local correction on the input

-- e.g., “,” vs. “;” – delete the “,”, insert “;”

-- Also a decision of the designer

-- Not suited to all situations

-- Used in conjunction with panic mode to allow less input to be skipped
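As a rough sketch (not from the slides), a hand-written parser could implement panic mode along these lines; the token names and the canned token stream are assumptions made only for the illustration:

#include <stdio.h>

/* Hypothetical token codes and a canned token stream for the sketch. */
enum token { TOK_ID, TOK_PLUS, TOK_SEMI, TOK_RBRACE, TOK_EOF };

static enum token stream[] = { TOK_ID, TOK_PLUS, TOK_PLUS, TOK_ID, TOK_SEMI, TOK_ID, TOK_EOF };
static int pos = 0;

static enum token next_token(void) { return stream[pos++]; }

/* Panic mode: after reporting an error, discard tokens until a "synchro"
   token (here ';', '}' or end of file) appears, then let the parser resume. */
static enum token panic_mode_recover(void)
{
    enum token t;
    do { t = next_token(); } while (t != TOK_SEMI && t != TOK_RBRACE && t != TOK_EOF);
    return t;
}

int main(void)
{
    printf("error detected at token index %d; skipping to a synchro token\n", pos);
    panic_mode_recover();                     /* discards everything up to the ';' */
    printf("resuming the parse at token index %d\n", pos);
    return 0;
}

The entire policy lives in the while condition: a larger synchronizing set skips less input but risks resuming in the wrong context.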



Error Recovery Strategies – (2)

Error Productions:

-- Augment the grammar with rules for common errors

-- The augmented grammar is used for parser construction / generation

-- e.g., add a rule for “:=” in C assignment statements: report the error but continue the compile

-- Self correction + diagnostic messages

-- What are other rules ? (supported by yacc)

Global Correction:

-- Adding / deleting / replacing symbols is chancy – may do many changes !

-- Algorithms are available to minimize changes, but they are costly – a key issue

What do you think of each approach? What approach do compilers you’ve used take ?



Motivating Grammars

[Diagram: Regular Languages ⊂ CFLs]

  • Regular Expressions

  •  Basis of lexical analysis

  •  Represent regular languages

  • Context Free Grammars

  •  Basis of parsing

  •  Represent language constructs

  •  Characterize context free languages

EXAMPLE: aⁿbⁿ, n ≥ 1 : Is it regular ?



Context Free Languages

[Diagram: Regular Languages ⊂ Context Free Languages]

Reg. Languages: token structure, scanner generation

Context Free Languages: sentence structure, parser generation



Context Free Grammars : Concepts & Terminology

Definition: A Context Free Grammar, CFG, is described by T, NT, S, PR, where:

T: Terminals / tokens of the language

NT: Non-terminals to denote sets of strings generatable by the grammar & in the language

S: Start symbol, S ∈ NT, which defines all strings of the language

PR: Production rules to indicate how T and NT are combined to generate valid strings of the language.

PR: NT → (T | NT)*

Like a Regular Expression / DFA / NFA, a Context Free Grammar is a mathematical model !



Context Free Grammars : A First Look

assign_stmt id := expr ;

expr  term operator term

term id

term real

term integer

operator +

operator -

What do “BLUE” symbols represent?

What do “BLACK” symbols represent?

Derivation: A sequence of grammar rule applications and substitutions that transform a starting non-term into a collection of terminals / tokens.

Simply stated: Grammars / production rules allow us to “rewrite” and “identify” correct syntax.



How is Grammar Used ?

Given the rules on the previous slide, suppose

id := real + int;

is input. Is it syntactically correct? How do we know?

expr is represented as: expr → term operator term

Is this accurate / complete?

expr → expr operator term

expr → term

How does this affect the derivations which are possible?



Derivation and Parse Tree

Let’s derive: id := id + real – integer ;

What’s its parse tree ?



Derivation Example

  • Let’s derive: id := id + real – integer ;

assign_stmt                                            assign_stmt → id := expr ;
→ id := expr ;                                         expr → expr operator term
→ id := expr operator term ;                           expr → expr operator term
→ id := expr operator term operator term ;             expr → term
→ id := term operator term operator term ;             term → id
→ id := id operator term operator term ;               operator → +
→ id := id + term operator term ;                      term → real
→ id := id + real operator term ;                      operator → -
→ id := id + real - term ;                             term → integer
→ id := id + real - integer ;

assign_stmt → id := expr ;

expr → expr operator term

expr → term

term → id

term → real

term → integer

operator → +

operator → -



Example Grammar

expr → expr op expr

expr → ( expr )

expr → - expr

expr → id

op → +

op → -

op → *

op → /

op → ↑

Black : NT

Blue : T

expr : S

9 Production rules

To simplify / standardize notation, we offer a synopsis of terminology.



Example Grammar - Terminology

Terminals: a,b,c,+,-,punc,0,1,…,9, blue strings

Non Terminals: A,B,C,S, black strings

T or NT: X,Y,Z

Strings of Terminals: u,v,…,z in T*

Strings of T / NT: α, β, γ in (T ∪ NT)*

Alternatives of production rules:

A → α₁; A → α₂; …; A → αₖ   ≡   A → α₁ | α₂ | … | αₖ

First NT on LHS of 1st production rule is designated as start symbol !

E → E A E | ( E ) | -E | id

A → + | - | * | / | ↑



Grammar Concepts

A step in a derivation is zero or one action that replaces a NT with the RHS of a production rule.

EXAMPLE: E ⇒ -E (the ⇒ means “derives” in one step) using the production rule E → -E

EXAMPLE: E ⇒ E A E ⇒ E * E ⇒ E * ( E )

DEFINITION:

⇒    derives in one step

⇒⁺   derives in ≥ one step

⇒*   derives in ≥ zero steps

EXAMPLES:  αAβ ⇒ αγβ  if A → γ is a production rule

α₁ ⇒ α₂ ⇒ … ⇒ αₙ  implies  α₁ ⇒* αₙ ;   α ⇒* α for all α

If α ⇒* β and β ⇒ γ then α ⇒* γ



How does this relate to Languages?

Let G be a CFG with start symbol S. Then S ⇒⁺ W (where W has no non-terminals) represents the language generated by G, denoted L(G). So W ∈ L(G) ⇔ S ⇒⁺ W.

W is a sentence of G

When S ⇒* α (and α may have NTs), α is called a sentential form of G.

EXAMPLE: id * id is a sentence

Here’s the derivation:

E ⇒ E A E ⇒ E * E ⇒ id * E ⇒ id * id     (the intermediate strings are sentential forms)

E ⇒* id * id



Leftmost and Rightmost Derivations

Leftmost: Replace the leftmost non-terminal symbol

E ⇒lm E A E ⇒lm id A E ⇒lm id * E ⇒lm id * id

Rightmost: Replace the rightmost non-terminal symbol

E ⇒rm E A E ⇒rm E A id ⇒rm E * id ⇒rm id * id

Important Notes: consider a derivation step αAβ ⇒ αγβ

If αAβ ⇒lm αγβ, what’s true about α ?

If αAβ ⇒rm αγβ, what’s true about β ?

Derivations: Actions to parse input can be represented pictorially in a parse tree.



Examples of LM / RM Derivations

E → E A E | ( E ) | -E | id

A → + | - | * | / | ↑

A leftmost derivation of : id + id * id

A rightmost derivation of : id + id * id
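As a worked illustration (the slide poses these as exercises), one possible derivation of each kind is:

Leftmost:   E ⇒lm E A E ⇒lm id A E ⇒lm id + E ⇒lm id + E A E ⇒lm id + id A E ⇒lm id + id * E ⇒lm id + id * id

Rightmost:  E ⇒rm E A E ⇒rm E A id ⇒rm E * id ⇒rm E A E * id ⇒rm E A id * id ⇒rm E + id * id ⇒rm id + id * id

Because this grammar is ambiguous, other leftmost and rightmost derivations of the same sentence exist – the two shown here even correspond to different parse trees – which is exactly the issue the following slides explore.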



Derivations & Parse Tree

[Figure: the parse tree for id * id, drawn step by step as the derivation below proceeds]

E ⇒ E A E

  ⇒ E * E

  ⇒ id * E

  ⇒ id * id



Parse Trees and Derivations

Consider the expression grammar:

E → E+E | E*E | (E) | -E | id

Leftmost derivation of id + id * id, shown step by step with its growing parse tree in the figure:

E ⇒ E + E

E + E ⇒ id + E

id + E ⇒ id + E * E



Parse Tree & Derivations - continued

id + E * E ⇒ id + id * E

id + id * E ⇒ id + id * id

[Figure: the completed parse tree for id + id * id from this derivation, with + at the root]



Alternative Parse Tree & Derivation

[Figure: an alternative parse tree for id + id * id, with * at the root]

E ⇒ E * E

  ⇒ E + E * E

  ⇒ id + E * E

  ⇒ id + id * E

  ⇒ id + id * id

WHAT’S THE ISSUE HERE ?



Example Pascal Grammar in Yacc

%term

T_AND T_ARRAY T_BEGIN T_CASE

T_CONST T_DIV T_DO T_RANGE

T_TO T_ELSE T_END T_FILE

T_FOR T_FUNCTION /* T_GOTO */ T_IDENTIFIER

T_IF T_IN T_INTEGER T_LABEL

T_MOD T_NOT T_REAL T_OF

T_OR /* T_PACKED */ T_NIL T_PROCEDURE

T_PROGRAM T_RECORD T_REPEAT T_SET

T_STRING T_THEN T_DOWNTO T_TYPE

T_UNTIL T_VAR T_WHILE /* T_WITH */

T_COMMA T_PERIOD T_ASSIGN T_COLON

T_LPAREN T_RPAREN T_LBRACK T_RBRACK

T_UPARROW T_SEMI

%binary T_LT T_EQ T_GT T_GE T_LE T_NE T_IN

%left T_PLUS T_MINUS T_OR

%left UNARYSIGN

%left T_MULT T_RDIV T_DIV T_MOD T_AND

%left T_NOT

%%



Example Pascal Grammar in Yacc

Start : Program_Header Declarations Block T_PERIOD

Program_Header : T_PROGRAM T_IDENTIFIER T_LPAREN Id_List T_RPAREN T_SEMI

Block : T_BEGIN Stt_List T_END

Declarations : Const_Def_Part

Type_Def_Part

Var_Decl_Part

Routine_Decl_Part

Const_Def_Part : T_CONST Const_Def_List

| /* epsilon */ ;

Const_Def_List : Const_Def

| Const_Def_List Const_Def ;

Const_Def : T_IDENTIFIER T_EQ Constant T_SEMI

Type_Def_Part : T_TYPE Type_Def_List

| /* epsilon */

Type_Def_List : Type_Def

| Type_Def_List Type_Def



Example Pascal Grammar in Yacc

Type_Def : T_IDENTIFIER T_EQ Type T_SEMI

Var_Decl_Part : T_VAR Var_Decl_List

| /* epsilon */

Var_Decl_List : Var_Decl_List Var_Decl

| Var_Decl

Var_Decl : Id_List T_COLON Type T_SEMI

Routine_Decl_Part : Routine_Decl_Part Routine_Decl

| /* epsilon */

Routine_Decl : Routine_Head T_IDENTIFIER T_SEMI

| Routine_Head Declarations Block T_SEMI

Routine_Head : T_PROCEDURE T_IDENTIFIER Params T_SEMI

| T_FUNCTION T_IDENTIFIER Params Function_Type T_SEMI

Params : T_LPAREN Params_List T_RPAREN

| /* epsilon */

Param_Group : Id_List T_COLON Type

| T_VAR Id_List T_COLON Type



Example Pascal Grammar in Yacc

Function_Type : T_COLON Type

| /* epsilon */

Params_List : Param_Group

| Params_List T_SEMI Param_Group

Constant : T_STRING | Number | T_PLUS Number | T_MINUS Number

Number : T_IDENTIFIER | T_INTEGER | T_REAL

Const_List : Constant | Const_List T_COMMA Constant

Type : Simple_Type | T_UPARROW T_IDENTIFIER | Structured_Type

Simple_Type : T_IDENTIFIER

| T_LPAREN Id_List T_RPAREN

| Constant T_RANGE Constant

Structured_Type : T_ARRAY T_LBRACK Simple_Type_List T_RBRACK T_OF Type

| T_SET T_OF Simple_Type

| T_RECORD Field_List T_END

Simple_Type_List : Simple_Type

| Simple_Type_List T_COMMA Simple_Type



Example Pascal Grammar in Yacc

Field_List : Fixed_Part /* Variant_part */

Fixed_Part : Field | Fixed_Part T_SEMI Field

Field : /* epsilon */ | Id_List T_COLON Type

Variant_part : /* epsilon */

| T_CASE T_IDENTIFIER T_OF Variant_List

| T_CASE T_IDENTIFIER T_COLON T_IDENTIFIER T_OF Variant_List

Variant_List : Variant | Variant_List T_SEMI Variant

Variant : /* epsilon */

| Const_List T_COLON T_LPAREN Field_List T_RPAREN

Stt_List : Statement | Stt_List T_SEMI Statement

Case_Stt_List : Case_Statement | Case_Stt_List T_SEMI Case_Statement

Case_Statement : Const_List T_COLON Statement | /* epsilon */



Example Pascal Grammar in Yacc

Statement : /* epsilon */

| T_IDENTIFIER | T_IDENTIFIER T_LPAREN Expr_List T_RPAREN

| Variable T_ASSIGN Expression | T_BEGIN Stt_List T_END

| T_CASE Expression T_OF Case_Stt_List T_END

| T_WHILE Expression T_DO Statement

| T_REPEAT Stt_List T_UNTIL Expression

| T_FOR Variable T_ASSIGN Expression T_TO Expression T_DO Statement

| T_FOR Variable T_ASSIGN Expression T_DOWNTO Expression T_DO Statement

| T_IF Expression T_THEN Statement

| T_IF Expression T_THEN Statement T_ELSE Statement

Expression : Expression relop Expression %prec T_LT

| T_PLUS Expression %prec UNARYSIGN

| T_MINUS Expression %prec UNARYSIGN

| Expression addop Expression %prec T_PLUS

| Expression divop Expression %prec T_MULT

| T_NIL | T_STRING | T_INTEGER | T_REAL

| Variable

| T_IDENTIFIER T_LPAREN Expr_List T_RPAREN

| T_LPAREN Expression T_RPAREN

| negop Expression %prec T_NOT

| T_LBRACK Element_List T_RBRACK

| T_LBRACK T_RBRACK



Example Pascal Grammar in Yacc

Element_List : Element | Element_List T_COMMA Element

Element : Expression | Expression T_RANGE Expression

Variable : T_IDENTIFIER

| Variable T_LBRACK Expr_List T_RBRACK

| Variable T_PERIOD T_IDENTIFIER

| Variable T_UPARROW

Expr_List : Expression | Expr_List T_COMMA Expression

relop : T_EQ | T_LT | T_GT | T_NE | T_LE | T_GE | T_IN

addop : T_PLUS | T_MINUS | T_OR

divop : T_MULT | T_RDIV | T_DIV | T_MOD | T_AND ;

negop : T_NOT



Resolving Grammar Problems/Difficulties

[Diagram: Regular Languages ⊂ CFLs]

Regular Expressions : Basis of Lexical Analysis

Reg. Expr. → generate / represent regular languages

Reg. Languages → the smallest, most well-defined class of languages

Context Free Grammars: Basis of Parsing

CFGs → represent context free languages

CFLs → contain more powerful languages

aⁿbⁿ – a CFL that’s not regular!

Try to write DFA/NFA for it !



Resolving Problems/Difficulties – (2)

[Figure: NFA for (a | b)*abb – state 0 (start) loops to itself on a and b, then 0 –a→ 1, 1 –b→ 2, 2 –b→ 3, with state 3 accepting]

Since Reg. Lang. ⊂ Context Free Lang., it is possible to go from reg. expr. to CFGs via NFA.

Recall: (a | b)*abb



Resolving Problems/Difficulties – (3)

[Figure: a transition from state i to state j on input a, and an ε-transition from state i to state j]

Construct CFG as follows:

  • Each state i has a non-terminal Ai : A0, A1, A2, A3

  • If state i goes to state j on input a, then add Ai → a Aj

  • If state i goes to state j on ε, then add Ai → Aj

  • If i is an accepting state, add Ai → ε : here A3 → ε

  • If i is the starting state, Ai is the start symbol : here A0

For the transitions of this NFA: A0 → aA0, A0 → aA1

A0 → bA0, A1 → bA2

A2 → bA3

T = {a, b}, NT = {A0, A1, A2, A3}, S = A0

PR = { A0 → aA0 | aA1 | bA0 ;

A1 → bA2 ;

A2 → bA3 ;

A3 → ε }



How Does This CFG Derive Strings ?

[Figure: the NFA for (a | b)*abb, as on the previous slides]

vs.

A0 → aA0, A0 → aA1

A0 → bA0, A1 → bA2

A2 → bA3, A3 → ε

How is abaabb derived in each ?
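As a hedged sketch (not from the slides), the right-linear CFG above maps directly onto mutually recursive C functions, one per non-terminal; each function reports whether the rest of the input can be completely derived from that non-terminal:

#include <stdio.h>

/* One function per non-terminal of the CFG built from the NFA:
   A0 -> aA0 | aA1 | bA0     A1 -> bA2     A2 -> bA3     A3 -> epsilon
   Each returns 1 if the suffix starting at s is fully derivable from it. */
static int A0(const char *s);
static int A1(const char *s);
static int A2(const char *s);
static int A3(const char *s);

static int A3(const char *s) { return *s == '\0'; }             /* A3 -> epsilon     */
static int A2(const char *s) { return *s == 'b' && A3(s + 1); } /* A2 -> b A3        */
static int A1(const char *s) { return *s == 'b' && A2(s + 1); } /* A1 -> b A2        */
static int A0(const char *s)                                    /* A0 -> aA0|aA1|bA0 */
{
    if (*s == 'a') return A0(s + 1) || A1(s + 1);   /* try both a-alternatives */
    if (*s == 'b') return A0(s + 1);
    return 0;
}

int main(void)
{
    printf("abaabb: %s\n", A0("abaabb") ? "derivable" : "not derivable");
    printf("abab:   %s\n", A0("abab")   ? "derivable" : "not derivable");
    return 0;
}

The two alternatives tried on an ‘a’ mirror the nondeterministic choice between A0 → aA0 and A0 → aA1; the NFA resolves the same choice by being in both states at once.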



Regular Expressions vs. CFGs

  • Regular expressions for lexical syntax

    • CFGs are overkill, lexical rules are quite simple and straightforward

    • REs – concise / easy to understand

    • More efficient lexical analyzer can be constructed

    • REs for lexical analysis and CFGs for parsing promote modularity, low coupling & high cohesion. Why?

CFGs : Match tokens “(“ “)”, begin / end, if-then-else, whiles, proc/func calls, …

Intended for structural associations between tokens !

Are tokens in correct order ?



Resolving Grammar Difficulties : Motivation

  • ambiguity

  • ε-moves

  • cycles

  • left recursion

  • left factoring

  • Humans write / develop grammars

  • Different parsing approaches have different needs

LL(k)

Recursive Descent

LR(k)

LALR(k)

Top-Down vs. Bottom-Up

  • For 1 → remove the “errors”

  • For 2 → rework / redesign the grammar

Grammar Problems



Resolving Problems: Ambiguous Grammars

Consider the following grammar segment:

stmt if exprthen stmt

| if exprthen stmtelse stmt

| other (any other statement)

What’s problem here ?

Consider the Program:

if e1

then if e2

then s1

else s2

Else must match to previous then.

Structure indicates parse subtree for expression.



Resulting Parse Tree

  • Easy case

    • Else must match to previous then.

if e1

then s1

else if e2

then s2

else s3



Example : What Happens with this string?

if E1 then if E2 then S1 else S2

How is this parsed ?

if E1 then
    if E2 then
        S1
    else
        S2

vs.

if E1 then
    if E2 then
        S1
else
    S2

What’s the issue here ?



Parse Trees for Example

if e1
then if e2
then s1
else s2

Form 1: [parse tree with the else attached to the inner if]

Form 2: [parse tree with the else attached to the outer if]

What’s the issue here ?



Removing Ambiguity

Take Original Grammar:

stmt if exprthen stmt

| if exprthen stmtelse stmt

| other (any other statement)

Revise to remove ambiguity:

stmt  matched_stmt | unmatched_stmt

matched_stmt if exprthen matched_stmt else matched_stmt | other

unmatched_stmt if exprthen stmt

| if exprthen matched_stmt else unmatched_stmt

How does this grammar work ?
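As a worked example (not spelled out on the slide), the revised grammar derives if e1 then if e2 then s1 else s2 only with the else attached to the inner if:

stmt ⇒ unmatched_stmt
     ⇒ if expr then stmt                                       (the outer if has no matching else)
     ⇒ if expr then matched_stmt
     ⇒ if expr then if expr then matched_stmt else matched_stmt
     ⇒* if e1 then if e2 then s1 else s2

Attaching the else to the outer if would require deriving the else-less inner “if e2 then s1” from matched_stmt, which is impossible, so only one parse tree remains.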



Resolving Difficulties : Left Recursion

A left recursive grammar has rules that support the derivation: A ⇒⁺ Aα, for some α.

Top-Down parsing can’t reconcile this type of grammar, since it could consistently make a choice that never allows termination:

A ⇒ Aα ⇒ Aαα ⇒ Aααα ⇒ … etc., for A → Aα | β

Take the left recursive grammar:

A → Aα | β

and transform it to the following:

A → βA’

A’ → αA’ | ε



Why is Left Recursion a Problem ?

Derive : id + id + id

E ⇒ E + T ⇒ …

Consider:

E → E + T | T

T → T * F | F

F → ( E ) | id

How can left recursion be removed ?

E → E + T | T      What does this generate?

E ⇒ E + T ⇒ T + T

E ⇒ E + T ⇒ E + T + T ⇒ T + T + T

How does this build strings ?

What does each string have to start with ?



Resolving Difficulties : Left Recursion (2)

For our example:

E → E + T | T

T → T * F | F

F → ( E ) | id

E → TE’      E’ → + TE’ | ε

T → FT’      T’ → * FT’ | ε

F → ( E ) | id

Informal Discussion:

Take all productions for A and order as:

A → Aα₁ | Aα₂ | … | Aαₘ | β₁ | β₂ | … | βₙ

where no βᵢ begins with A.

Now apply the concepts of the previous slide:

A → β₁A’ | β₂A’ | … | βₙA’

A’ → α₁A’ | α₂A’ | … | αₘA’ | ε
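To see why the transformed grammar suits top-down parsing, here is a minimal recursive-descent sketch (my own, not from the slides); single characters stand in for tokens, with ‘i’ denoting id:

#include <stdio.h>

/* Recognizer for   E -> T E'    E' -> + T E' | epsilon
                    T -> F T'    T' -> * F T' | epsilon
                    F -> ( E ) | id                        */
static const char *tok;                 /* current input position */

static int E(void);
static int Eprime(void);
static int T(void);
static int Tprime(void);
static int F(void);

static int E(void)      { return T() && Eprime(); }
static int Eprime(void) { if (*tok == '+') { tok++; return T() && Eprime(); } return 1; }  /* epsilon */
static int T(void)      { return F() && Tprime(); }
static int Tprime(void) { if (*tok == '*') { tok++; return F() && Tprime(); } return 1; }  /* epsilon */
static int F(void)
{
    if (*tok == '(') {
        tok++;
        if (!E() || *tok != ')') return 0;
        tok++;
        return 1;
    }
    if (*tok == 'i') { tok++; return 1; }          /* 'i' plays the id token */
    return 0;
}

int main(void)
{
    tok = "i+i*i";                                 /* id + id * id */
    printf("%s\n", E() && *tok == '\0' ? "accepted" : "rejected");
    return 0;
}

Each ε-alternative simply returns success without consuming input; eliminating the left recursion is what lets a single lookahead token pick the production, so the parser always terminates.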



Resolving Difficulties : Left Recursion (3)

S → Aa | b

A → Ac | Sd | ε

S ⇒ Aa ⇒ Sda

Problem: If left recursion is two or more levels deep, this isn’t enough

Algorithm:

  • Input: Grammar G with no cycles or ε-productions

  • Output: An equivalent grammar with no left recursion

  • Arrange the non-terminals in some order A1, A2, …, An

  • for i := 1 to n do begin

  •   for j := 1 to i – 1 do begin

  •     replace each production of the form Ai → Aj γ

  •     by the productions Ai → δ₁γ | δ₂γ | … | δₖγ

  •     where Aj → δ₁ | δ₂ | … | δₖ are all current Aj productions;

  •   end

  •   eliminate the immediate left recursion among the Ai productions

  • end



Using the Algorithm

Apply the algorithm to:    A1 → A2a | b

A2 → A2c | A1d | ε

i = 1

For A1 there is no left recursion

i = 2

for j = 1 to 1 do

Take the productions A2 → A1γ and replace with

A2 → δ₁γ | δ₂γ | … | δₖγ

where A1 → δ₁ | δ₂ | … | δₖ are the A1 productions

in our case A2 → A1d becomes A2 → A2ad | bd

What’s left:    A1 → A2a | b

A2 → A2c | A2ad | bd | ε

Are we done ?



Using the Algorithm (2)

No ! We must still remove A2 left recursion !

A1 A2a | b

A2  A2 c | A2ad | bd | 

Recall:

A  A1 | A2 | … | Am | 1 | 2 | … | n

A  1A’ | 2A’ | … | nA’

A’  1A’ | 2A’| … | m A’ | 

Apply to above case. What do you get ?



Removing Difficulties : ε-Moves

Very Simple: A → ε and B → uAv implies add B → uv to the grammar G.

Why does this work ?

Examples:

E → TE’      E’ → + TE’ | ε

T → FT’      T’ → * FT’ | ε

F → ( E ) | id

A1 → A2a | b

A2 → bd A2’ | A2’

A2’ → c A2’ | ad A2’ | ε



Removing Difficulties : Cycles

How would cycles be removed ?

S → SS | ( S ) | ε

Has: S ⇒ SS ⇒ S    (a cycle, via S → ε)

and S ⇒ ε
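One way to answer the question (my own worked sketch, not from the slide): first remove the ε-production, which turns S → SS | ( S ) | ε into S → SS | S | ( S ) | ( ), and then drop the unit production S → S that causes the cycle, leaving S → SS | ( S ) | ( ) – the same nonempty balanced-parenthesis language with no cycles.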



Removing Difficulties : Left Factoring

Problem : Uncertain which of 2 rules to choose:

stmt if exprthen stmt else stmt

| if exprthen stmt

When do you know which one is valid ?

What’s the general form of stmt ?

A  1 | 2  : if exprthen stmt

1: else stmt 2 : 

Transform to:

A   A’

A’  1 | 2

EXAMPLE:

stmt if exprthen stmt rest

rest else stmt | 



Resolving Grammar Problems : Another Look

Leads to:

  • Forcing Matching Within Grammar

    • if-then-else: else must go with most recent if

  • stmt → if expr then stmt | if expr then stmt else stmt | other

stmt → matched_stmt | unmatched_stmt

matched_stmt → if expr then matched_stmt else matched_stmt | other

unmatched_stmt → if expr then stmt

| if expr then matched_stmt else unmatched_stmt



Resolving Grammar Problems : Another Look (2)

  • Left Factoring - Know correct choice in advance !

  • where_queries → where are declarations of symbol

  • | where does “search_option” appear

Rules as given in the Assignment, or left factored:

where_queries → where rest_where

rest_where → are declarations of symbol

| does “search_option” appear



Resolving Grammar Problems : Another Look (3)

E → E + T | T

T → T * F | F

F → ( E ) | id

E → TE’      E’ → + TE’ | ε

T → FT’      T’ → * FT’ | ε

F → ( E ) | id

  • Left Recursion - Avoid infinite loop when choosing rules

A → Aα | β    is transformed to:

A → βA’

A’ → αA’ | ε

EXAMPLE:

If we use the original production rules on input id + id (with a leftmost derivation):

E ⇒ E + T ⇒ E + T + T ⇒ E + T + T + T ⇒* T + T + T + … + T      termination is not assured

If we use the transformed production rules on input id + id (with a leftmost derivation):

E ⇒ TE’ ⇒ T + TE’ ⇒* id + TE’ ⇒* id + id E’ ⇒ id + id



Resolving Grammar Problems : Another Look (4)

  • Removing ε-Moves

E → TE’      E’ → + TE’ | ε      becomes:

E → T

E → TE’

E’ → + TE’

E’ → + T

Is there still a problem ?



Resolving Grammar Problems

[Diagram: Regular Languages ⊂ CFLs ⊂ CSLs]

  • Note: Not all aspects of a programming language can be represented by context free grammars / languages.

  • Examples:

    • 1. Declaring ID before its use

    • 2. Valid typing within expressions (Type Rules)

    • 3. Parameters in definition vs. in call

    • 4. Method Overloading/hiding

  • These features are called context-sensitive and define yet another language class, CSLs.



    Context-Sensitive Languages - Examples

    • Examples:

      • L1 = { wcw | w is in (a | b)* } : Declare before use

      • L2 = { aⁿ bᵐ cⁿ dᵐ | n ≥ 1, m ≥ 1 }

        • aⁿ bᵐ : formal parameter

        • cⁿ dᵐ : actual parameter

    What does it mean when L is context-sensitive ?



    How do you show a Language is a CFL?

    L3 = { w c wᴿ | w is in (a | b)* }

    L4 = { aⁿ bᵐ cᵐ dⁿ | n ≥ 1, m ≥ 1 }

    L5 = { aⁿ bⁿ cᵐ dᵐ | n ≥ 1, m ≥ 1 }

    L6 = { aⁿ bⁿ | n ≥ 1 }



    Solutions

    L3 = { w c wᴿ | w is in (a | b)* } :      S → a S a | b S b | c

    L4 = { aⁿ bᵐ cᵐ dⁿ | n ≥ 1, m ≥ 1 } :     S → a S d | a A d ;   A → b A c | bc

    L5 = { aⁿ bⁿ cᵐ dᵐ | n ≥ 1, m ≥ 1 } :     S → XY ;   X → a X b | ab ;   Y → c Y d | cd

    L6 = { aⁿ bⁿ | n ≥ 1 } :                  S → a S b | ab

