Re-enter Chomsky

More about grammars

Consider L = { a^m b^n | m, n > 0 }

(one/more a’s followed by one/more b’s)

S → A B

A → aA | a

B → bB | b

Consider the string “aaabb”, which is a valid string in L and can be derived from the grammar.

Left-most derivation:

S ⇒ A B ⇒ a A B ⇒ aa A B ⇒ aaa B ⇒ aaa bB ⇒ aaa bb

Right-most derivation:

S ⇒ A B ⇒ A bB ⇒ A bb ⇒ aA bb ⇒ aaA bb ⇒ aaa bb

Parse trees

A Parse Tree (for the same string)

[Figure: parse tree for “aaabb” with root S, whose children are A and B; A derives aaa and B derives bb.]

The leaves of the tree form the input string.

This is a “common” representation where the order of derivation is not explicit: there is no such thing as a “left parse tree” or a “right parse tree”!
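The left-most and right-most derivations of “aaabb” can be replayed mechanically. The following is a minimal sketch, assuming the grammar S → AB, A → aA | a, B → bB | b with one character per symbol; the names `RULES` and `derive` are illustrative and do not come from the slides.

```python
# Productions of the grammar; every symbol is a single character and
# the keys of RULES (S, A, B) are the non-terminals.
RULES = {"S": ["AB"], "A": ["aA", "a"], "B": ["bB", "b"]}


def derive(choices, leftmost=True, start="S"):
    """Apply the right-hand sides in `choices` one by one.

    Each step rewrites the left-most (or right-most) non-terminal of the
    current sentential form. Returns the list of all sentential forms.
    """
    form = start
    forms = [form]
    for rhs in choices:
        positions = [i for i, ch in enumerate(form) if ch in RULES]
        if not positions:
            raise ValueError("no non-terminal left to rewrite")
        i = positions[0] if leftmost else positions[-1]
        if rhs not in RULES[form[i]]:
            raise ValueError(f"{form[i]} -> {rhs} is not a production")
        form = form[:i] + rhs + form[i + 1:]
        forms.append(form)
    return forms


left = derive(["AB", "aA", "aA", "a", "bB", "b"])
right = derive(["AB", "bB", "b", "aA", "aA", "a"], leftmost=False)
print(" => ".join(left))   # S => AB => aAB => aaAB => aaaB => aaabB => aaabb
print(" => ".join(right))  # S => AB => AbB => Abb => aAbb => aaAbb => aaabb
```

Both runs use the same productions the same number of times and end in the same string; only the order of application differs, which is why they share a single parse tree.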


Parse trees

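Consider L = { w ∈ {a,b}* | w contains a ‘b’ }

(any combination of a’s & b’s that contains at least one b somewhere)

S → L b L

L → aL | bL | ε

Consider the string “abba”, which is a valid string in L and can be derived from (generated by) the grammar.

[Figure: Parse Tree 1 and Parse Tree 2, two different parse trees for “abba”, each with root S. The trees differ in which b of “abba” is matched by the b in S → L b L.]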

The grammar can generate the input string in two different ways. In other words, there are two different parse trees for the string. Because it is unclear which of the two ways the grammar should use to generate the string, the grammar is said to be ambiguous*. Note that the grammar on the previous slide is not ambiguous.

* This example is based on an observation by Mr. Hui Zhang, a COMPSCI 220 student.


Parse trees

An unambiguous grammar for L = { w ∈ {a,b}* | w contains a ‘b’ }

S → X b Y

X → a X | ε

Y → a Y | b Y | ε

X: has only zero/more a’s

b: first occurrence of b

Y: zero/more a’s and b’s

Consider the string “abba”, which is a valid string in L and can be derived from (generated by) the grammar.

[Figure: the parse tree for “abba” with root S: X derives a, the b in S → X b Y is the first b of the string, and Y derives ba.]

There is only one way in which you can “group” the input string this time!

A grammar is said to be ambiguous if it generates some string w ∈ Σ* in more than one way, i.e. if the string has more than one parse tree.
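The definition can be checked on small inputs by counting parse trees. The following is a minimal sketch rather than a general ambiguity test: `count_parse_trees`, `AMBIGUOUS` and `UNAMBIGUOUS` are hypothetical names, terminals are assumed to be single characters, and the recursion is only guaranteed to terminate for grammars like the two “contains a b” grammars, where every recursive production starts with a terminal.

```python
from functools import lru_cache


def count_parse_trees(grammar, start, word):
    """Count the parse trees of `word` rooted at `start`.

    `grammar` maps each non-terminal to a list of right-hand sides, each a
    tuple of symbols; the empty tuple stands for epsilon. Any symbol that is
    not a key of `grammar` is treated as a terminal.
    """
    def terminals_in(symbols):
        return sum(1 for s in symbols if s not in grammar)

    @lru_cache(maxsize=None)
    def trees(symbol, i, j):
        # Number of trees rooted at `symbol` whose leaves spell word[i:j].
        if symbol not in grammar:
            return 1 if word[i:j] == symbol else 0
        return sum(sequence(rhs, i, j) for rhs in grammar[symbol])

    @lru_cache(maxsize=None)
    def sequence(symbols, i, j):
        # Number of ways the symbol sequence can derive word[i:j].
        if not symbols:
            return 1 if i == j else 0
        head, rest = symbols[0], symbols[1:]
        total = 0
        # Leave at least one character for every terminal still to come.
        for k in range(i, j - terminals_in(rest) + 1):
            head_trees = trees(head, i, k)
            if head_trees:  # skip the rest when the head cannot match
                total += head_trees * sequence(rest, k, j)
        return total

    return trees(start, 0, len(word))


# S -> L b L, L -> aL | bL | epsilon
AMBIGUOUS = {"S": [("L", "b", "L")],
             "L": [("a", "L"), ("b", "L"), ()]}
# S -> X b Y, X -> aX | epsilon, Y -> aY | bY | epsilon
UNAMBIGUOUS = {"S": [("X", "b", "Y")],
               "X": [("a", "X"), ()],
               "Y": [("a", "Y"), ("b", "Y"), ()]}

print(count_parse_trees(AMBIGUOUS, "S", "abba"))    # 2
print(count_parse_trees(UNAMBIGUOUS, "S", "abba"))  # 1
```

A count above 1 for some string shows that the grammar is ambiguous; a count of 1 for one string does not by itself prove that a grammar is unambiguous.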

Ambiguous grammars can be undesirable, for instance, in Compiler Design*, where the code generated by the compiler might depend on the particular way in which the input string (a statement in a programming language) is generated. This will be demonstrated in the examples that follow.

* Grammars are used to describe the syntax of statements in a programming language.

IF-ELSE statement

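(i) IF (condition) Statement 1;

Statement 2;

[Flowchart: “Is condition true?” If yes, Statement 1 is executed; Statement 2 is executed in either case.]

(ii) IF (condition) Statement 1;

ELSE Statement 2;

Statement 3;

[Flowchart: “Is condition true?” If yes, Statement 1 is executed; if no, Statement 2 is executed; Statement 3 is executed in either case.]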

Statement → IF_statement | …

IF_statement → if (Cond) Statement

IF_statement → if (Cond) Statement else Statement

Parse tree for the statement

Consider the statement:

IF ( C1 ) S1;

ELSE

IF ( C2 ) S2;

ELSE

S3;

[Figure: parse tree for the statement. The root Statement derives an IF_Statement with condition C1, statement S1 and an else branch; the else branch is itself an IF_Statement with condition C2, statement S2 and else statement S3.]

Grammar for IF-ELSE statement

Consider the statement:

IF ( C1 )

IF ( C2 ) S2;

ELSE

S3;

Parse tree 1 and Parse tree 2

[Figure: two parse trees for the statement, each with root Statement and two nested IF_Statement nodes. In one tree the ELSE S3 is attached to the inner IF ( C2 ); in the other it is attached to the outer IF ( C1 ).]

AMBIGUITY!

(the same statement can be generated in a different way)
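The two parse trees of IF ( C1 ) IF ( C2 ) S2; ELSE S3; lead to different behaviour. The following sketch writes out both groupings in Python, whose indentation makes each grouping explicit; the function names are hypothetical, and S2 and S3 are modelled simply as labels recorded when they would run.

```python
def else_with_inner_if(c1, c2):
    """ELSE S3 is paired with the inner IF ( C2 )."""
    executed = []
    if c1:
        if c2:
            executed.append("S2")
        else:
            executed.append("S3")
    return executed


def else_with_outer_if(c1, c2):
    """ELSE S3 is paired with the outer IF ( C1 )."""
    executed = []
    if c1:
        if c2:
            executed.append("S2")
    else:
        executed.append("S3")
    return executed


# The same conditions give different results under the two groupings.
print(else_with_inner_if(False, True), else_with_outer_if(False, True))  # [] ['S3']
print(else_with_inner_if(True, False), else_with_outer_if(True, False))  # ['S3'] []
```

A compiler working from the ambiguous grammar would have to pick one of these two behaviours, which is why the ambiguity matters.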

Grammar for arithmetic expressions

VERSION I

Consider arithmetic expressions with only one or two variables (that use +, -, * only).

e.g. a, a + b, a - b, a * b

E → Var

E → Var + Var

E → Var - Var

E → Var * Var

Grammar for arithmetic expressions

VERSION II

Consider arithmetic expressions with any number of variables (more realistic!).

e.g. a + b - c * d

E → Var

E → Var + E

E → Var - E

E → Var * E

Try generating the expression: a * b + c

[Figure: parse tree for a * b + c. The root uses E → Var * E and the inner E uses E → Var + E, so the expression is grouped as a * (b + c): the addition is evaluated first (1) and the multiplication second (2).]

Wrong grouping!

(wrong order of precedence)

Grammar for arithmetic expressions

VERSION III

Try to generate arithmetic expressions, preserving the order of precedence.

E → Var

E → E + E

E → E - E

E → E * E

Try generating the same expression again: a * b + c

[Figure: two parse trees for a * b + c. One has + at the root with E * E as its left subtree, grouping the expression as (a * b) + c: looks OK! The other has * at the root with E + E as its right subtree, grouping the expression as a * (b + c).]

AMBIGUITY!

(the same expression can be generated in a different way)

Note: Each parse tree conveys a different “meaning”; each of them corresponds to different code (and therefore possibly different results) generated by the compiler.

Grammar for arithmetic expressions

VERSION IV

Try to generate arithmetic expressions, preserving the order of precedence and also avoiding ambiguity.

E → E + Term

E → E - Term

E → Term

Term → Term * Var

Term → Var

Try generating the same expression again: a * b + c

[Figure: the only parse tree for a * b + c. The root uses E → E + Term; the left E derives a Term that uses Term → Term * Var to produce a * b, and the right Term derives Var, i.e. c.]
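The grouping produced by the Version IV rules (E → E + Term | E - Term | Term and Term → Term * Var | Var) can be reproduced with a small hand-written parser. This is a sketch under stated assumptions, not the method used in the slides: variables are assumed to be single letters, the left-recursive rules are realised as loops (a standard substitution that yields the same left-nested grouping), and `parse_expression` is a hypothetical name.

```python
def parse_expression(text):
    """Parse `text` and return a nested tuple (operator, left, right)."""
    tokens = [ch for ch in text if not ch.isspace()]
    pos = 0

    def parse_var():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isalpha():
            raise SyntaxError(f"expected a variable at token {pos}")
        name = tokens[pos]
        pos += 1
        return name

    def parse_term():
        # Term -> Term * Var | Var, written as a loop
        nonlocal pos
        node = parse_var()
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            node = ("*", node, parse_var())
        return node

    def parse_e():
        # E -> E + Term | E - Term | Term, written as a loop
        nonlocal pos
        node = parse_term()
        while pos < len(tokens) and tokens[pos] in ("+", "-"):
            op = tokens[pos]
            pos += 1
            node = (op, node, parse_term())
        return node

    tree = parse_e()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree


print(parse_expression("a * b + c"))  # ('+', ('*', 'a', 'b'), 'c')
print(parse_expression("a + b * c"))  # ('+', 'a', ('*', 'b', 'c'))
```

In both cases the multiplication ends up deeper in the tree than the addition, so it is evaluated first, and each input has exactly one result.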

What Context Free Grammars (CFGs) can’t express

Examples of languages that can’t be generated by CFGs:

{ a^n b^n c^n | n > 0 }

{ a^n b^m c^n d^m | n, m > 0 }

{ w c w | w ∈ Σ* }

Type 3 (Regular grammar)

Right side:

(i) a single terminal symbol OR

(ii) a single terminal followed by a single non-terminal

Left side:

a single non-terminal symbol

A → a

A → a B

Type 2 (Context-free grammar)

Right side:

no restriction (any string of terminals and non-terminals).

Left side:

a single non-terminal symbol

A → α

Type 1 (Context-sensitive grammar)

Right, left sides:

no restriction except that length( α ) <= length( β )

α → β

Type 0 (Phrase-structure grammar)

Right, left sides:

no restriction at all!

α → β

A language is said to be type i (i = 0, 1, 2, 3) if it can be specified by a type i grammar and cannot be specified by a type (i + 1) grammar.
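The restrictions on the left and right sides can be turned into a simple check. The following is a minimal sketch that classifies a set of productions (a grammar, not a language) by the most restrictive type whose rules every production satisfies, exactly as the four types are stated above; `production_type` and `grammar_type` are hypothetical names, and every symbol is assumed to be a single character.

```python
def production_type(lhs, rhs, nonterminals):
    """Return the most restrictive type (3, 2, 1 or 0) that lhs -> rhs satisfies."""
    if len(lhs) == 1 and lhs in nonterminals:
        terminal_only = len(rhs) == 1 and rhs not in nonterminals
        terminal_then_nonterminal = (
            len(rhs) == 2
            and rhs[0] not in nonterminals
            and rhs[1] in nonterminals
        )
        if terminal_only or terminal_then_nonterminal:
            return 3  # A -> a  or  A -> aB
        return 2      # A -> any string of terminals and non-terminals
    if len(lhs) <= len(rhs):
        return 1      # length(alpha) <= length(beta)
    return 0          # no restriction at all


def grammar_type(productions, nonterminals):
    """A grammar is only as restricted as its least restricted production."""
    return min(production_type(lhs, rhs, nonterminals) for lhs, rhs in productions)


regular = [("A", "a"), ("A", "aB"), ("B", "b")]
context_free = [("S", "AB"), ("A", "aA"), ("A", "a"), ("B", "bB"), ("B", "b")]
print(grammar_type(regular, {"A", "B"}))            # 3
print(grammar_type(context_free, {"S", "A", "B"}))  # 2
```

Note that the type of a language depends on the most restrictive grammar that can specify it, which a check of one particular grammar cannot decide.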

Converting FSA into equivalent grammar

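[Figure: FSA with start state i and final state j. State i loops on b and moves to state j on a; state j loops on both a and b.]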

L = {strings of a’s and b’s with at least one ‘a’}

Any given FSA can be mechanically converted into grammar rules that generate exactly the same language as the one recognized by the FSA.

Rules:

(i) For an a-transition from state i to state j, generate the production rule: A_i → a A_j

(ii) For the final state f, generate the production rule: A_f → ε

Grammar rules that generate L

A_i → a A_j

A_i → b A_i

A_j → a A_j

A_j → b A_j

A_j → ε
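The two conversion rules are mechanical enough to sketch in a few lines. This is an illustrative sketch, assuming a deterministic FSA given as a dictionary from (state, symbol) to the next state; `fsa_to_grammar` and the `A_<state>` naming of non-terminals are assumptions made for the example.

```python
def fsa_to_grammar(transitions, final_states):
    """Return the production rules as (left side, right side) pairs."""
    rules = []
    # Rule (i): an a-transition from state i to state j gives A_i -> a A_j.
    for (state, symbol), target in transitions.items():
        rules.append((f"A_{state}", f"{symbol} A_{target}"))
    # Rule (ii): a final state f gives A_f -> epsilon.
    for state in final_states:
        rules.append((f"A_{state}", "ε"))
    return rules


# The FSA for strings of a's and b's with at least one 'a'.
transitions = {
    ("i", "a"): "j",
    ("i", "b"): "i",
    ("j", "a"): "j",
    ("j", "b"): "j",
}
rules = fsa_to_grammar(transitions, ["j"])
for left, right in rules:
    print(f"{left} → {right}")
```

Run on the FSA for L, the sketch prints the same five rules as listed above, in the same order.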