CSCI 4325 / 6339 Theory of Computation. Zhixiang Chen, Department of Computer Science, University of Texas-Pan American. Chapter Two: Context-free Languages.
A Short Overview
• We know that L = {a^n b^n : n ≥ 0} is not regular.
• Can we design a grammar to generate L?
• Answer: S → aSb, S → e (e denotes the empty string)
• The grammar is G = (V, Σ, R, S), where
• V = {a, b, S}
• Σ = {a, b}
• S is the start symbol
• R: S → aSb, S → e
• The above grammar is context-free.
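The grammar above can be exercised with a minimal sketch. The function name `derive_anbn` is hypothetical, and the empty string e is represented by Python's `""`.

```python
def derive_anbn(n):
    """Apply S -> aSb n times, then S -> e; return every sentential form."""
    form = "S"
    steps = [form]
    for _ in range(n):
        form = form.replace("S", "aSb")  # rule S -> aSb
        steps.append(form)
    form = form.replace("S", "")         # rule S -> e
    steps.append(form)
    return steps

# Prints: S => aSb => aaSbb => aabb
print(" => ".join(s or "e" for s in derive_anbn(2)))
```

Each call produces one derivation of a^n b^n, so every string of L has a derivation in the grammar.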
Context-free Grammars
• Definition. A context-free grammar G is a quadruple (V, Σ, R, S), where
• V is an alphabet
• Σ ⊆ V is the set of terminals
• S ∈ V − Σ is the start symbol
• R is the set of production rules, a finite set with R ⊆ (V − Σ) × V*
• V − Σ is the set of nonterminals
Production Rules
• Let (A, u) ∈ R be a production rule. We can rewrite it as A → u.
• A → u means that from the nonterminal symbol A we derive the string u. We also say that A implies u, A derives u, or A generates u.
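Understand the Derivation Relation
• Let G = (V, Σ, R, S) be a context-free grammar. For strings x, y ∈ V*, x ⇒_G y if and only if there are u, v, z ∈ V* and A ∈ V − Σ such that x = uAv, y = uzv, and A → z is a rule in R.
• The relation ⇒_G has a reflexive and transitive closure, denoted by ⇒*_G.
• Understand ⇒_G and ⇒*_G.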
Context-free Languages
• The language generated by a CF grammar G = (V, Σ, R, S):
• A string w ∈ Σ* is generated by G if S ⇒* w.
• The language generated by G is L(G) = {w ∈ Σ* : S ⇒* w}.
• A language is context-free (CF) if it is generated by some CF grammar.
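To make the definition of L(G) concrete, here is a rough sketch that enumerates the short strings of L(G) by searching over sentential forms. It rests on assumptions that are not part of the slides: uppercase letters are nonterminals, every other character is a terminal, and forms longer than `max_len + 2` are cut off (a heuristic that suits small grammars such as S → aSb | e, not every grammar). The name `language_up_to` is hypothetical.

```python
from collections import deque

def language_up_to(rules, start, max_len):
    """Strings w in L(G) with |w| <= max_len, found by leftmost rewriting."""
    seen, found = {start}, set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # position of the leftmost nonterminal, if any
        idx = next((i for i, ch in enumerate(form) if ch.isupper()), None)
        if idx is None:
            found.add(form)              # only terminals left: form is in L(G)
            continue
        for rhs in rules[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            terminals = sum(not ch.isupper() for ch in new)
            # heuristic cutoff (assumption) so that the search terminates
            if terminals <= max_len and len(new) <= max_len + 2 and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(found, key=lambda w: (len(w), w))

short_strings = language_up_to({"S": ["aSb", ""]}, "S", 4)
print(short_strings)
```

For the grammar S → aSb | e and `max_len = 4` the search finds the empty string, ab, and aabb.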
Arithmetic Expressions
• Ex: The language of arithmetic expressions is CF.
• This language is generated by the CF grammar G = (V, Σ, R, E), where
• V = {E, T, F, (, ), +, *, -, /, id}
• Σ = {(, ), +, *, -, /, id}
• E ∈ V − Σ is the start symbol
• R:
  E → E + T | E - T | T | T * F | T / F
  T → T * F | T / F | F
  F → (E) | id
• Examples of derivations?
CF Language Examples • Ex: • is context-free. • show this is true in class • Ex: • is context-free. • show this is true in class.
Theorem. Every regular language is context-free.
• Proof: Let L = L(M) be a regular language recognized by a FA M = (K, Σ, δ, s, F).
• Construct a context-free grammar to simulate M.
• Idea of construction
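The slide leaves the idea of the construction for class. One standard way to do it, given here as a sketch rather than as the lecture's own presentation, is to use the states of M as nonterminals, add a rule q → a p whenever δ(q, a) = p, and add q → e for every final state q; the start symbol is the start state s. The example automaton (an even number of a's over {a, b}) and all names in the code are hypothetical.

```python
def dfa_to_cfg(states, alphabet, delta, finals):
    """Return rules as (lhs, rhs) pairs; rhs is a tuple of symbols, () stands for e."""
    rules = []
    for q in states:
        for a in alphabet:
            rules.append((q, (a, delta[(q, a)])))  # q -> a p whenever delta(q, a) = p
        if q in finals:
            rules.append((q, ()))                  # q -> e for every final state q
    return rules

# Hypothetical DFA: accepts strings with an even number of a's.
delta = {("q0", "a"): "q1", ("q0", "b"): "q0",
         ("q1", "a"): "q0", ("q1", "b"): "q1"}
rules = dfa_to_cfg(["q0", "q1"], ["a", "b"], delta, {"q0"})
for lhs, rhs in rules:
    print(lhs, "->", " ".join(rhs) or "e")
```

A derivation in the resulting grammar spells out the input one symbol at a time while the single nonterminal tracks the current state of M.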
Example of Construction
• Construct the CF grammar for the following FA:
[Figure: state diagram of the FA, with edges labeled a, b, b, a, b, a]
Parse Trees
• Given a CF grammar G = (V, Σ, R, S) and a string w ∈ L(G), the derivation S ⇒* w can be described by a tree. We call such a tree a parse tree of w.
• Importance of parse trees: analysis of the syntax of w.
Parse Tree Examples
• Consider arithmetic expressions generated by G = (V, Σ, R, E), where
• V = {E, T, F, (, ), +, *, -, /, id}
• Σ = {(, ), +, *, -, /, id}
• R:
  E → E + T | E - T | T | T * F | T / F
  T → T * F | T / F | F
  F → (E) | id
• Construct a parse tree for id*(id+id)
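One possible parse tree for id*(id+id) can be written down and checked mechanically. The sketch below assumes a tree encoding that is not in the slides: an internal node is a pair (label, children) and a leaf is a terminal string. The helper names are hypothetical.

```python
RULES = {
    "E": [["E", "+", "T"], ["E", "-", "T"], ["T"], ["T", "*", "F"], ["T", "/", "F"]],
    "T": [["T", "*", "F"], ["T", "/", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}

def tree_yield(node):
    """Concatenate the leaves from left to right."""
    if isinstance(node, str):
        return [node]
    return [tok for child in node[1] for tok in tree_yield(child)]

def is_valid(node):
    """Check that every internal node applies a rule of R."""
    if isinstance(node, str):
        return node not in RULES            # leaves must be terminals
    label, children = node
    rhs = [c if isinstance(c, str) else c[0] for c in children]
    return rhs in RULES[label] and all(is_valid(c) for c in children)

ID = ("F", ["id"])
inner = ("E", [("E", [("T", [ID])]), "+", ("T", [ID])])            # id + id
tree = ("E", [("T", [("T", [ID]), "*", ("F", ["(", inner, ")"])])])

print(" ".join(tree_yield(tree)))  # id * ( id + id )
```

The root uses E → T, the T node uses T → T * F, and the parenthesized factor contains the subtree for id + id.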
Rightmost Derivations
• Given a context-free grammar G = (V, Σ, R, S) and a string w ∈ Σ*, a rightmost derivation of w is a derivation in which, at each step, the rightmost nonterminal is the one that is replaced.
Examples of Rightmost Derivations
• Ex: G = (V, Σ, R, E), where
• V = {E, T, (, ), id, +, *, -, /}
• Σ = {(, ), id, +, *, -, /}
• R:
  E → E + T | E - T | E * T | E / T | T
  T → (E) | id
• (The rule E → T is needed so that a derivation from E can terminate.)
• Find the rightmost derivation for ( id + id ) * ( id - id * id )
Leftmost Derivations
• Similar to rightmost derivations, except that at each step the leftmost nonterminal is the one that is replaced.
• Ex: Find the leftmost derivation for ( id + id ) * ( id - id * id )
Theorem 3.2.1. Let G = (V, Σ, R, S) be a context-free grammar, and let A ∈ V − Σ and w ∈ Σ*. Then the following statements are equivalent:
• (a) A ⇒* w
• (b) There is a parse tree with root A and yield w.
• (c) There is a leftmost derivation A ⇒*_L w.
• (d) There is a rightmost derivation A ⇒*_R w.
• Proof: by induction on the length of w.
• Prove the implications among (a), (b), (c), and (d).
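The step from a parse tree to a leftmost or a rightmost derivation can be sketched in code: expand the leftmost (or rightmost) unexpanded node of the tree until only leaves remain. The tree encoding (pairs of label and children, strings for terminals) and the function names are assumptions made for this illustration.

```python
def render(form):
    """Show a sentential form: subtrees by their root label, leaves as they are."""
    return " ".join(n if isinstance(n, str) else n[0] for n in form)

def derivation(tree, leftmost=True):
    """Read the leftmost (or rightmost) derivation off a parse tree."""
    form = [tree]                      # subtrees still to expand, or terminal strings
    steps = [render(form)]
    while True:
        pending = [i for i, n in enumerate(form) if not isinstance(n, str)]
        if not pending:
            return steps
        i = pending[0] if leftmost else pending[-1]
        form[i:i + 1] = form[i][1]     # replace the node by its children
        steps.append(render(form))

# Parse tree of id + id using E -> E + T, E -> T, T -> F, F -> id.
ID = ("T", [("F", ["id"])])
tree = ("E", [("E", [ID]), "+", ID])

left = derivation(tree, leftmost=True)
right = derivation(tree, leftmost=False)
print(" => ".join(left))
print(" => ".join(right))
```

Both derivations come from the same tree and end in the same yield; they differ only in the order in which the nonterminals are expanded.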
Ambiguity
• A context-free grammar G = (V, Σ, R, S) is ambiguous if there is a string w ∈ Σ* such that w has two distinct parse trees.
• That is, there are different meanings or interpretations for w, or
• the semantics of w is ambiguous.
Ambiguity Examples
• Ex. E → E + E | E * E | (E) | id
• Note: Can you see different meanings of id+id*id?
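The two readings of id+id*id can be counted mechanically. The sketch below counts the parse trees of a token list under the grammar E → E + E | E * E | (E) | id by trying every rule on every span; the function name and the token-list input format are assumptions.

```python
from functools import lru_cache

def count_parse_trees(tokens):
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        """Number of parse trees that derive tokens[i:j] from E."""
        total = 0
        if j - i == 1 and tokens[i] == "id":
            total += 1                                   # E -> id
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            total += count(i + 1, j - 1)                 # E -> (E)
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):                  # E -> E + E or E -> E * E
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))

print(count_parse_trees(["id", "+", "id", "*", "id"]))  # 2
```

The two trees correspond to (id+id)*id and id+(id*id); adding parentheses to the input leaves a single tree.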
Ambiguous Languages
• A language is inherently ambiguous if every context-free grammar generating it is ambiguous.
• Why is ambiguity not good?
Definition of PA
• A pushdown automaton is a sextuple M = (K, Σ, Γ, Δ, s, F), where
• K is a finite set of states.
• Σ is the input alphabet.
• Γ is the stack alphabet.
• s ∈ K is the initial state.
• F ⊆ K is the set of final states.
• Δ is the transition relation, a finite subset of (K × (Σ ∪ {e}) × Γ*) × (K × Γ*).
Understand the Transition Relation
• Understand Δ(p, a, β) = (q, γ), i.e. ((p, a, β), (q, γ)) ∈ Δ:
• p: the current state
• a: the current input symbol (or e)
• β: the string on top of the current stack
• q: the new state
• γ: the string that replaces β on top of the current stack
Configurations of PA
• Configurations of a pushdown automaton are tuples in K × Σ* × Γ*.
• Given a configuration (p, w, u), understand it:
• p: the current state
• w: the remaining tape content
• u: the current stack content
Yield Relations of PA
• Yield relation ⊢: Given two configurations (p, x, α) and (q, y, ζ), (p, x, α) ⊢ (q, y, ζ) if x = ay, α = βη, ζ = γη for some η ∈ Γ*, and Δ(p, a, β) = (q, γ).
• Define ⊢* as the reflexive transitive closure of ⊢.
• Understand ⊢ and ⊢*.
The Language Accepted by a PA
• Given w ∈ Σ*, a PA M accepts w if and only if (s, w, e) ⊢* (p, e, e) for some p ∈ F.
• The language accepted by M is L(M) = {w ∈ Σ* : (s, w, e) ⊢* (p, e, e) for some p ∈ F}.
PA Examples
• Ex. Design a pushdown automaton accepting L = {w c w^R : w ∈ {a, b}*}
• M = (K, Σ, Γ, Δ, s, F)
• K = {s, f}, Σ = {a, b, c}
• Γ = {a, b}, F = {f}
• Δ:
  (s, a, e) → (s, a)
  (s, b, e) → (s, b)
  (s, c, e) → (f, e)
  (f, a, a) → (f, e)
  (f, b, b) → (f, e)
PA Examples
• Ex. Design a pushdown automaton accepting L = {w w^R : w ∈ {a, b}*}
• Δ:
  (s, a, e) → (s, a)
  (s, b, e) → (s, b)
  (s, e, e) → (f, e)
  (f, a, a) → (f, e)
  (f, b, b) → (f, e)
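Both example machines can be tried out with a small simulator that searches the configurations (state, unread input, stack) breadth first. This is a sketch under assumptions of its own: `""` plays the role of e, the stack is a string whose first character is the top, and the stack length is capped by `max_stack` (default: the input length) so that the search always terminates. The cap is enough for these two machines but is not part of the formal definition.

```python
from collections import deque

def accepts(delta, start, finals, w, max_stack=None):
    """True if (start, w, e) |-* (p, e, e) for some final state p."""
    cap = len(w) if max_stack is None else max_stack
    first = (start, w, "")
    seen, queue = {first}, deque([first])
    while queue:
        p, x, stack = queue.popleft()
        if p in finals and x == "" and stack == "":
            return True
        for (p1, a, beta), (q, gamma) in delta:
            if p1 == p and x.startswith(a) and stack.startswith(beta):
                nxt = (q, x[len(a):], gamma + stack[len(beta):])
                if nxt not in seen and len(nxt[2]) <= cap:
                    seen.add(nxt)
                    queue.append(nxt)
    return False

push_pop = [(("s", "a", ""), ("s", "a")), (("s", "b", ""), ("s", "b")),
            (("f", "a", "a"), ("f", "")), (("f", "b", "b"), ("f", ""))]
wcwr = push_pop + [(("s", "c", ""), ("f", ""))]   # marker c switches to popping
wwr = push_pop + [(("s", "", ""), ("f", ""))]     # guess the middle with an e-move

print(accepts(wcwr, "s", {"f"}, "abcba"), accepts(wwr, "s", {"f"}, "abba"))
```

The only difference between the two machines is the transition that moves from s to f: reading c is deterministic, while the e-move makes the second machine guess where the middle of the input is.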
PA vs. CF Languages • Theorem 3.4.1: the class of languages accepted by pushdown automata is exactly the class of context-free languages.
Proof.
• Part 1. Each CF language is accepted by some PA.
• Let G = (V, Σ, R, S) be a CF grammar.
• We want to construct a PA M such that L(G) = L(M).
• The idea of the construction of M:
• Push the start symbol S of G onto the stack.
• Simulate a derivation on the stack.
• Match terminal symbols on top of the stack against the current input symbols.
Construction of the PA for a CF grammar G
• M = ({p, q}, Σ, V, Δ, p, {q})
• Δ:
  (type 1) (p, e, e) → (q, S)
  (type 2) (q, e, A) → (q, x), for each rule A → x in R
  (type 3) (q, a, a) → (q, e), for each a ∈ Σ
Example
• Ex. Construct a PA M for G = (V, Σ, R, S), where
• V = {S, a, b, c}
• Σ = {a, b, c}
• R: S → aSa, S → bSb, S → c
The PA M is
• M = ({p, q}, Σ, V, Δ, p, {q})
• Δ:
  (p, e, e) → (q, S)
  (q, e, S) → (q, aSa)
  (q, e, S) → (q, bSb)
  (q, e, S) → (q, c)
  (q, a, a) → (q, e)
  (q, b, b) → (q, e)
  (q, c, c) → (q, e)
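The transition list above follows mechanically from the grammar. A minimal sketch of the construction, assuming single-character symbols, `""` for e, and a hypothetical function name `cfg_to_pda`:

```python
def cfg_to_pda(terminals, rules, start):
    """rules: list of (A, x) pairs for A -> x; returns the transition list of M."""
    delta = [(("p", "", ""), ("q", start))]                  # push the start symbol
    delta += [(("q", "", A), ("q", x)) for A, x in rules]    # replace A by x on the stack
    delta += [(("q", a, a), ("q", "")) for a in terminals]   # match a terminal with the input
    return delta

delta = cfg_to_pda(["a", "b", "c"], [("S", "aSa"), ("S", "bSb"), ("S", "c")], "S")
for (state, symbol, top), (new_state, pushed) in delta:
    print((state, symbol or "e", top or "e"), "->", (new_state, pushed or "e"))
```

The machine always has the two states p and q; all of the grammar's structure ends up in the stack operations.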
Operation on abbcbba

  State   Unread Input   Stack
  p       abbcbba        e
  q       abbcbba        S
  q       abbcbba        aSa
  q       bbcbba         Sa
  q       bbcbba         bSba
  q       bcbba          Sba
  q       bcbba          bSbba
  q       cbba           Sbba
  q       cbba           cbba
  q       bba            bba
  q       ba             ba
  q       a              a
  q       e              e
Now we need to prove L(M) = L(G).
• Claim. Let w ∈ Σ* and α ∈ (V − Σ)V* ∪ {e}. Then S ⇒*_L wα if and only if (q, w, S) ⊢* (q, e, α). Here ⇒*_L denotes a leftmost derivation.
• Proof of Claim.
• (only-if part) Suppose S ⇒*_L wα, where w ∈ Σ* and α ∈ (V − Σ)V* ∪ {e}. We prove (q, w, S) ⊢* (q, e, α) by induction on the length of the leftmost derivation.
• Basis step. The length is 0, i.e., w = e and α = S.
• Induction hypothesis. Assume that if S ⇒*_L wα by a derivation of length n or less, n ≥ 0, then (q, w, S) ⊢* (q, e, α).
• Induction step. Let S = u_0 ⇒_L u_1 ⇒_L ... ⇒_L u_n ⇒_L u_{n+1} = wα be a leftmost derivation of wα from S. Let A be the leftmost nonterminal symbol of u_n; then u_n = xAβ and u_{n+1} = xγβ, where x ∈ Σ*, β, γ ∈ V*, and A → γ ∈ R.
• (if part) Suppose (q, w, S) ⊢* (q, e, α) with w ∈ Σ* and α ∈ (V − Σ)V* ∪ {e}. We show S ⇒*_L wα by induction on the number of transitions of type 2 in the computation by M.
Part 2. If a language is accepted by a pushdown automaton, then it is a context-free language.
• We consider simple pushdown automata:
• Whenever (q, a, β) → (p, γ) is a transition and q is not the start state, then β ∈ Γ and |γ| ≤ 2.
• Note: Any pushdown automaton can be simulated by a simple pushdown automaton.
Construction of the context-free grammar G = (V, Σ, R, S)
• Σ is the same as the input alphabet of M.
• S is a new start symbol.
• V consists of S, the symbols of Σ, and all symbols of the form <q, A, p>, where q, p ∈ K and A ∈ Γ ∪ {e, Z}.
Explain < q , A , p > • < q , A , p > represents any portion of the input string that might be read between a point in time when M is in state q with A on the top of its stack, and a point in time when M removes A from the stack and enters state p.
R:
• (1) S → <s, Z, f'>, where s is the start state of the original PA M and f' is the new final state.
• (2) For each transition (q, a, B) → (r, C), where q, r ∈ K, a ∈ Σ ∪ {e}, B, C ∈ Γ ∪ {e}, and for each p ∈ K, add the rule
  <q, B, p> → a <r, C, p>
• (3) For each transition (q, a, B) → (r, C1 C2), where q, r ∈ K, a ∈ Σ ∪ {e}, B ∈ Γ ∪ {e}, and C1, C2 ∈ Γ, and for each p, p' ∈ K, add the rule
  <q, B, p> → a <r, C1, p'> <p', C2, p>
• (4) For each q ∈ K, add the rule
  <q, e, q> → e
Claim. For any q, p ∈ K, A ∈ Γ ∪ {e}, and x ∈ Σ*,
• <q, A, p> ⇒* x if and only if
• (q, x, A) ⊢* (p, e, e)
Closure Properties
• Theorem 3.5.1. CF languages are closed under union, concatenation, and Kleene star.
• Proof. Given
• G1 = (V1, Σ1, R1, S1)
• G2 = (V2, Σ2, R2, S2)
• (assume that the nonterminals of G1 and G2 are disjoint and that S is a new symbol)
• Union: Want G = (V, Σ, R, S) such that L(G) = L(G1) ∪ L(G2)
• Construction of G
• V = V1 ∪ V2 ∪ {S}
• Σ = Σ1 ∪ Σ2
• R = R1 ∪ R2 ∪ {S → S1, S → S2}
Closure Properties
• Concatenation: Want G = (V, Σ, R, S) such that L(G) = L(G1) L(G2)
• Construction of G
• V = V1 ∪ V2 ∪ {S}
• R = R1 ∪ R2 ∪ {S → S1 S2}
Closure Properties
• Kleene star: Want G = (V, Σ, R, S) such that L(G) = L(G1)*, where G1 = (V1, Σ1, R1, S1)
• Construction of G
• V = V1 ∪ {S}
• R = R1 ∪ {S → e, S → S S1}
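The three closure constructions (union, concatenation, Kleene star) each add a new start symbol and one or two rules. A compact sketch, assuming a grammar is stored as a triple (nonterminals, rules, start), that rules are (lhs, rhs-tuple) pairs with `()` for e, that the nonterminals of the two grammars are disjoint, and that the new symbol "S" is unused; all names are hypothetical.

```python
def union(g1, g2, s="S"):
    (n1, r1, s1), (n2, r2, s2) = g1, g2
    return (n1 | n2 | {s}, r1 + r2 + [(s, (s1,)), (s, (s2,))], s)   # S -> S1, S -> S2

def concat(g1, g2, s="S"):
    (n1, r1, s1), (n2, r2, s2) = g1, g2
    return (n1 | n2 | {s}, r1 + r2 + [(s, (s1, s2))], s)            # S -> S1 S2

def star(g1, s="S"):
    n1, r1, s1 = g1
    return (n1 | {s}, r1 + [(s, ()), (s, (s, s1))], s)              # S -> e, S -> S S1

# Hypothetical inputs: G1 generates a^n b^n, G2 generates c, cc, ccc, ...
g1 = ({"A"}, [("A", ("a", "A", "b")), ("A", ())], "A")
g2 = ({"C"}, [("C", ("c", "C")), ("C", ("c",))], "C")

for name, g in [("union", union(g1, g2)), ("concat", concat(g1, g2)), ("star", star(g1))]:
    print(name, g[1][-2:])
```

The original rules are carried over unchanged, which is why the disjointness assumption matters: shared nonterminals would let derivations of one grammar use rules of the other.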
Intersection with Regular Languages
• Theorem 3.5.2. The intersection of a CF language with a regular language is CF.
• Proof: Given L1 = L(M1), L2 = L(M2)
• M1 is a pushdown automaton, M1 = (K1, Σ, Γ1, Δ1, s1, F1)
• M2 is a deterministic finite automaton, M2 = (K2, Σ, δ, s2, F2)
• Want a pushdown automaton M = (K, Σ, Γ, Δ, s, F) such that L(M) = L(M1) ∩ L(M2).
Proof (continued).
• Idea: Use M to simulate M1 and M2 in parallel.
• Construction:
• K = K1 × K2
• Γ = Γ1
• s = (s1, s2)
• F = F1 × F2
• Δ:
• If (q1, a, β) → (p1, γ) is in Δ1 with a ∈ Σ, then for each q2 ∈ K2 define ((q1, q2), a, β) → ((p1, δ(q2, a)), γ).
• If (q1, e, β) → (p1, γ) is in Δ1, then for each q2 ∈ K2 define ((q1, q2), e, β) → ((p1, q2), γ).
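The product construction can be sketched directly from the two cases above. The encoding is an assumption: PDA transitions are ((q1, a, β), (p1, γ)) tuples with `""` for e, and the DFA transition function is a dictionary that is total on K2 × Σ. The example machines (the PDA for w c w^R and a two-state DFA tracking the parity of a's) and all names are hypothetical.

```python
def intersect(delta1, s1, finals1, states2, dfa_delta, s2, finals2):
    """Build the transitions, start state, and final states of the product PDA."""
    delta = []
    for (q1, a, beta), (p1, gamma) in delta1:
        for q2 in states2:
            # On an input symbol the DFA component moves; on an e-move it stays put.
            p2 = dfa_delta[(q2, a)] if a else q2
            delta.append((((q1, q2), a, beta), ((p1, p2), gamma)))
    finals = {(f1, f2) for f1 in finals1 for f2 in finals2}
    return delta, (s1, s2), finals

pda = [(("s", "a", ""), ("s", "a")), (("s", "b", ""), ("s", "b")),
       (("s", "c", ""), ("f", "")),
       (("f", "a", "a"), ("f", "")), (("f", "b", "b"), ("f", ""))]
dfa = {("even", "a"): "odd", ("odd", "a"): "even",
       ("even", "b"): "even", ("odd", "b"): "odd",
       ("even", "c"): "even", ("odd", "c"): "odd"}

delta, start, finals = intersect(pda, "s", {"f"}, ["even", "odd"], dfa, "even", {"even"})
print(len(delta), start, finals)
```

Each transition of M1 is copied once per DFA state, so M has |Δ1| · |K2| transitions and the same stack alphabet as M1.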
A Technical Lemma
• Let G = (V, Σ, R, S) be a CF grammar. Let φ(G) denote the largest number of symbols on the right-hand side of any rule in R.
• φ(G) is the largest number of children a node in a parse tree of G may have.
• Lemma 3.5.1. The yield of any parse tree of G of height h has length at most φ(G)^h.
• Proof. Estimate the tree size.
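The bound of the lemma can be checked on a concrete tree. The sketch below uses the grammar S → aSb | e and the parse tree of aabb; the tree encoding (pairs of label and children, strings for terminals, an empty child list for S → e) and the convention that a single leaf has height 0 are assumptions of this illustration.

```python
RULES = {"S": ["aSb", ""]}
phi = max(len(rhs) for rhss in RULES.values() for rhs in rhss)   # longest right-hand side

def height(node):
    """Length of the longest path from this node down to a leaf."""
    if isinstance(node, str):
        return 0
    return 1 + max((height(c) for c in node[1]), default=0)

def yield_length(node):
    """Number of terminal leaves below this node."""
    if isinstance(node, str):
        return 1
    return sum(yield_length(c) for c in node[1])

# Parse tree of aabb: S -> aSb, S -> aSb, S -> e.
tree = ("S", ["a", ("S", ["a", ("S", []), "b"]), "b"])

h, n = height(tree), yield_length(tree)
print(phi, h, n, n <= phi ** h)
```

Here φ(G) = 3 and the tree has height 3, so the lemma allows a yield of up to 27 symbols, while the actual yield has length 4.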
The Pumping Theorem
• Theorem 3.5.3. Let G = (V, Σ, R, S) be a CF grammar. Then any string w ∈ L(G) of length greater than φ(G)^|V − Σ| can be written as w = uvxyz such that either v or y is nonempty and u v^n x y^n z ∈ L(G) for every n ≥ 0. Furthermore,