context free languages n.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
Context Free Languages PowerPoint Presentation
Download Presentation
Context Free Languages

Loading in 2 Seconds...

play fullscreen
1 / 34

Context Free Languages - PowerPoint PPT Presentation


  • 583 Views
  • Uploaded on

Context Free Languages. October 2, 2001 . Announcement. HW 3 due now. Agenda. Context Free Grammars Derivations Parse-trees CFGParse Ambiguity. Motivating Example. pal = { x  {0,1}* | x = x R } = { e , 0,1, 00,11, 000,010, 101,111, 0000,0110,1001,1111, … }

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Context Free Languages' - Faraday


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
context free languages

Context Free Languages

October 2, 2001

announcement
Announcement
  • HW 3 due now
agenda
Agenda
  • Context Free Grammars
    • Derivations
    • Parse-trees
    • CFGParse
    • Ambiguity
motivating example
Motivating Example

pal = { x{0,1}* | x =x R }

= { e,

0,1,

00,11,

000,010, 101,111,

0000,0110,1001,1111, … }

Saw last time that pal non-regular.

But there is a pattern.

Q: If you have one palindrome, how can you generate another?

motivating example1
Motivating Example

pal = { x{0,1}* | x =x R }

A: Can generate palrecursively as follows.

BASE CASE: e, 0 and 1 are palindromes

RECURSE: if x is a palindrome, then so are 0x 0 and 1x 1.

motivating example2
Motivating Example

pal = { x{0,1}* | x =x R }

SUMMARY: In palany x can be replaced by any one of {e, 0, 1, 0x 0, 1x 1}.

NOTATION: x  e|0|1|0x 0|1x 1

Each pipe “|” is an or, just ad in UNIX regexp’s.

In fact, all elements of pal can be generated from e by using these rules.

Q: How would you generate 11011011 starting from the variable x ?

motivating example3
Motivating Example

A: Generate the string from outside-in 11011011 = 1(1(0(1()1)0)1)1

so that:

x  1x1  11x11  110x 011

1101x1011  1101e1011 =

11011011

context free grammars definition
Context Free Grammars.Definition

DEF: A context free grammar consists of

(V, S, R, S ) with:

  • V –a finite set of variables (or symbols, or non-terminals)
  • S–a finite set set of terminals (or the alphabet )
  • R –a finite set of rules (or productions) of the form v  w with vV, and w(SV )* (read “v yieldsw ” or “v producesw ” )
  • S V –the start symbol.

Q: What are (V, S, R, S ) for pal ?

context free grammars definition1
Context Free Grammars.Definition

A: V = {x}

S = {0,1},

R = {xe, x0, x1, x0x 0, x1x 1}

S = x

Standardize pal, reset:

V = {S }, S = S

R = {Se, S0, S1, S0S 0, S1S 1}

derivations
Derivations

DEF: The derivation symbol “”, read “1-step derives” or “1-step produces” is a relation between strings in (SV )*. We write x y if x and y can be broken up as x = svt and y = swt with v w a production in R.

Q: What possible y satisfy (in pal)

S 0000S y ?

derivations1
Derivations

A: S 0000S y :

Any one of:

0000S, 00000S, 10000S, 0S00000S, 1S10000S, S0000, S00000, S00001, S00000S0, S00001S1

derivations2
Derivations

DEF: The derivation symbol “*”, read “derives” or “produces” or “yields” is a relation between strings in (SV )*. We write x *y if there is a sequence of 1-step productions from x to y. I.e., there are strings xiwith i ranging from 0 to n such that x = x0,y = xnand

x0 x1,x1 x2,x2 x3, … , xn-1 xn

Q: Which of LHS’s yield which of RHS’s in pal?

01S, SS ? 01, 0, 01S, 01110111, 0100111

derivations3
Derivations

A: Not all answers are unique.

  • 01S * 01
  • SS * 0
  • 01S * 01S
  • SS * 01110111
  • Nothing yields 0100111
language generated by a cfg
Language Generated by a CFG

DEF: Let G be a context free grammar. The language generated by G is the set of all terminal strings which are derivable from the start symbol. Symbolically:

L(G ) = {w  S* | S * w}

example infix expressions
Example. Infix Expressions

Infix expressions involving {+, , a, b, c, (, )}

E stands for an expression (most general)

F stands for factor (a multiplicative part)

T stands for term (a product of factors)

V stands for a variable, I.e. a, b, or c

Grammar is given by:

E T | E+T

T  F | T F

F  V | (E )

V  a | b | c

CONVENTION: Start variable is the first one (E)

example infix expressions1
Example. Infix Expressions

EG: Consider the string u given by

a  b + (c + (a + c ) )

This is a valid infix expression. Should be able to generated it from E.

  • A sum of two expressions, so first production must be E  E+T
  • Sub-expression a b is a product, so a term so generated by sequence E+T T+T T F+T * a b+T
  • Second sub-expression is a factor only because a parenthesized sum. a b+T  ab+F  ab+(E )  ab+(E+T )
example infix expressions2
Example. Infix Expressions

Continuing on in this fashion and summarizing:

E  E +T T +T T F +T  F F +T  V F+T  aF+T  aV+T  a b+T  a b+F  a b+ (E) a b+(E+T)  ab+(T+T)  ab+(F+T) ab+(V+T)  a b+(c+T)  a b+(c+F)  a b+(c+ (E) )  a b+(c+(E +T ))  ab+(c+(T+T) )  a b+(c+ (F +T) )  ab+(c+(a+T) )  a b+(c+(a+ F) )  ab+(c+(a+V) )  a b+(c + (a+c) )

This is a so-called left-most derivation.

right most derivation
Right-most derivation

In a right-most derivation, the variable most to the right is replaced.

E  E +T  E + F E + (E) E + (E +T) E + (E +F) etc.

There is a lot of ambiguity involved in how a string is derived. However, if decide that derivations are left-most, or right-most, each derivation is not unique.

Another way to describe a derivation in a unique way is using derivation trees.

derivation trees
Derivation Trees

In a derivation tree (or parse tree) each node is symbol. Each parent is a variable whose children spell out the production from left to right. For, example v  abcdefg:

The root is the start variable. The leaves spell out the derived string from left to right.

v

a

b

c

d

e

f

g

derivation trees1
Derivation Trees

EG, a derivation tree for

a  b + (c + (a + c ) )

Advantage. Derivation trees also help understanding semantics! You can tell how expression should be evaluated from the tree.

cfgparse
CFGParse

CFGParse is a tool that I wrote that helps you play with context free grammars and create derivation trees. Yoav Hirsch (TA) also added some functionality that allows you to convert CFG’s into various forms that we’ll learn about next week.

USAGE:

java CFGParse <grammar-file> <input-string>

pdflatex <LaTeX-tree-filename>

Second command only works on CUNIX, unless install LaTeX plus some libraries (ecltree).

ambiguity
Ambiguity

<sentence>  <action>|<action>with<subject>

<action>  <subject><activity>

<subject>  <noun>| <noun>and<subject>

<activity>  <verb>| <verb><object>

<noun> Hannibal | Clarice | rice | onions

<verb>  ate | played

<prep>  with | and | or

<object>  <noun>|<noun><prep><object>

  • Clarice played with Hannibal
  • Clarice ate rice with onions
  • Hannibal ate rice with Clarice

Q: Are there any suspect sentences?

ambiguity1
Ambiguity

A: Consider “Hannibal ate rice with Clarice”.

Could either mean

  • Hannibal and Clarice ate rice together.
  • Hannibal ate rice and ate Clarice.

And this is not absurd, given what we know about Hannibal!

This ambiguity arises from the fact that the sentence has two different parse-trees, and therefore two different interpretations:

hannibal and clarice ate
Hannibal and Clarice Ate

sentence

action

subject

w

i

t

h

subject

activity

noun

noun

verb

object

C

l

a

r

i

c

e

H

a

n

n

i

b

a

l

a

t

e

r

i

c

e

hannibal the cannibal
Hannibal the Cannibal

sentence

action

subject

activity

noun

verb

object

noun

prep

object

H

a

n

n

i

b

a

l

a

t

e

noun

r

i

c

e

w

i

t

h

C

l

a

r

i

c

e

ambiguity definition
Ambiguity.Definition

DEF: A string x is said to be ambiguous relative the grammar G if there are two essentially different ways to derive x in G. I.e. x admits two (or more) different parse-trees (equivalently, x admits different left-most [resp. right-most] derivations). A grammar G is said to be ambiguous if there is some string x in L(G ) which is ambiguous.

Q: Is the grammar S  ab | ba | aSb | bSa |SS ambiguous? What language is generated?

ambiguity2
Ambiguity

A: L(G ) = the language with equal no. of a’ s and b’ s

Yes, the language is ambiguous:

a

a

S

S

b

b

S

S

S

S

S

a

a

S

S

b

b

cfg s proving correctness
CFG’sProving Correctness

The recursive nature of CFG’s means that they are especially amenable to correctness proofs.

For example let’s consider the grammar

G = ( S   | ab | ba | aSb | bSa | SS )

with purported generated language

L(G ) = { x  {a,b}* | na(x) = nb(x) }.

Here na(x) is the number of a’s in x, and nb(x) is the number of b’s.

cfg s proving correctness1
CFG’sProving Correctness

Proof. There are two parts. Let L be the purported language generated by G. We want to show that L = L(G ). The usual way to prove that sets are equal is to show both inclusions:

  • L L(G ). Every string with purported can be generated by G.
  • L L(G ). G only generate strings of the purported pattern.
proving correctness l l g
Proving CorrectnessL L(G )

I. L L(G ): Show that every string x with the same number of a’s as b’s is generated by G. Prove by induction on the length n = |x|.

Base case: n = 0. So x is empty so derived by the production S  .

Inductive hypothesis: Assume n > 0. Let u be the smallest non-empty prefix of x which is also in L. There are two case:

  • u = x
  • u is a proper prefix
proving correctness l l g1
Proving CorrectnessL L(G )

I.1)u = x : Notice that u can’t start and end in the same letter. E.g., if it started and ended with a then write u = ava. This means that v must have 2 more b’s than a’s. So somewhere in v the b’sof u catch up to the a’s which means that there’s a smaller prefix in L, contradicting the definition of u as the smallest prefix in L. Thus for some string v in L we have

u = avb OR u = bva

The length of v is shorter than u so by induction we can assume that a derivation S * v exists. Therefore u can be derived by one of:

S aSb * avb = x OR S aSb * avb = x

proving correctness l l g2
Proving CorrectnessL L(G )

I.2) uis a proper prefix. So can write

x = uv with both u and v in L

As each of u, v are shorter than x, can apply inductive hypothesis to assume that S *u and S *v. To derive x just use:

S SS * uS * uv = x

proving correctness l l g3
Proving CorrectnessL L(G )
  • Everything derivable from L conforms to the pattern. We prove something more general: Any string in {S,a,b}* derivable from S contains the same number of a’s as b’s. Again induction is used, but this time on the number of steps n in the derivation.

Base case n = 0: The only thing derivable in 0 steps is S which has 0 a’s and 0 b’s. OK!

Intuctive step: Assume that u is any string derivable from S in n steps. I.e., S *u. Any further step would have to utilize on of S   | ab | ba | aSb | bSa | SS. But each of these productions preserves the relative number of a’s vs. b’s. Thus any string derivable from S in n+1steps must also have the same number of a’s as b’s. //QED

cfg exercises
CFG Exercises

On black-board