Automata, Grammars and Languages

dwayne

Presentation Transcript


  1. Automata, Grammars and Languages • Discourse 04: Context-Free Grammars and Pushdown Automata • C SC 473 Automata, Grammars & Languages

  2. Backus-Naur Form Grammars (CFGs) • Algol 60, Algol 68: first “block-structured” languages • Ex: CF grammar
       <program> ::= <block>
       <statement> ::= s | <block>
       <block> ::= begin <list> end
       <list> ::= <statement> ; <list> | <statement>
     • Sample string: begin s ; begin s ; s ; s end ; s end • Figure annotations: start variable “S” (here <program>), terminals (begin, end, s, ;), nonterminals = variables, rules = productions

  3. Grammars are “Generators” • $\Rightarrow$ is read “yields” or “derives in one step” • Each step applies one production to one variable in the string • The process is nondeterministic

  4. A Particular Derivation • One possible derivation; the variable being rewritten at each stage is underscored • Two choices at each derivation step: which variable (nonterminal) is to be rewritten, and which rule with that variable as LHS is to be applied • All possible terminal strings obtainable in this way make up L(G)

  5. Why CFGs? • Most natural or artificial (e.g. programming) languages are not regular • We know that the latter language is not regular, so … • Ex: C programs

  6. Derivation (Parse) Tree • [Figure: parse tree; its yield = frontier = terminal string]

  7. Derivation (Parse) Tree

  8. Derivation (Parse) Tree (cont’d) • [Figure: parse tree with nodes numbered 1 to 15]

  9. Context-Free Grammar • Defn 2.2: A context-free grammar G is a 4-tuple $G = (V, \Sigma, R, S)$ where • $V$ is a finite set, the variables (nonterminals) • $\Sigma$ is a finite set disjoint from $V$, the terminals • $R$ is a finite set of rules, each of the form $A \to w$ with $A \in V$ and $w \in (V \cup \Sigma)^*$ (technically, an ordered pair $(A, w)$) • $S \in V$ is the start variable • Ex: strings with balanced parentheses, given formally as a 4-tuple • Ex: informally, as a list of rules, with the convention variables = upper case, terminals = lower case

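  10. Yields & Derives Relations • Defn: The relation yields (derives in 1 step), written $\Rightarrow$, is defined as follows: if $A \to w$ is a rule in $R$, then $uAv \Rightarrow uwv$ for all $u, v \in (V \cup \Sigma)^*$ • Defn: derives in k steps: $u \Rightarrow^k v$ iff there are strings $u_0, \dots, u_k$ with $u = u_0 \Rightarrow u_1 \Rightarrow \cdots \Rightarrow u_k = v$ • Defn: derives: $u \Rightarrow^* v$ iff $u \Rightarrow^k v$ for some $k \ge 0$ • In other words: $\Rightarrow^*$ is the reflexive, transitive closure of $\Rightarrow$ • Defn: A derivation (of n steps) from $u_0$ is any sequence of strings $u_0, u_1, \dots, u_n$ satisfying $u_{i-1} \Rightarrow u_i$ for $1 \le i \le n$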

  11. Language Generated • Defn: The language generated by G is the set of all terminal strings derived from S: $L(G) = \{ w \in \Sigma^* : S \Rightarrow^* w \}$ • A partial derivation is one that starts with S and ends in a string that still contains variables from V • A terminal (terminated) derivation is one that starts with S and ends in a string of terminals only
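
The yields relation and the definition of L(G) can be tried out on a small grammar. The sketch below is only an illustration: the grammar S -> (S)S | ε for balanced parentheses is an assumed stand-in (the slide's own rules are not preserved), and the names `RULES`, `yields` and `language_up_to` are hypothetical. The search stops growing a sentential form once it holds more than `max_len` terminals, which terminates for this grammar because its only growing rule adds two terminals per variable added.

```python
from collections import deque

# Assumed example grammar for balanced parentheses: S -> (S)S | ε.
# Variables are the keys of RULES; every other character is a terminal.
RULES = {"S": ["(S)S", ""]}
START = "S"


def yields(form):
    """All strings reachable from `form` in exactly one derivation step."""
    results = set()
    for i, symbol in enumerate(form):
        for rhs in RULES.get(symbol, []):
            results.add(form[:i] + rhs + form[i + 1:])
    return results


def language_up_to(max_len):
    """Terminal strings of length <= max_len that are derivable from START."""
    seen = {START}
    queue = deque([START])
    words = set()
    while queue:
        form = queue.popleft()
        if not any(c in RULES for c in form):
            words.add(form)          # terminated derivation
            continue
        for nxt in yields(form):     # partial derivation: keep rewriting
            terminals = sum(1 for c in nxt if c not in RULES)
            if terminals <= max_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return words
```

For example, `language_up_to(4)` collects the balanced strings of length at most 4.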

  12. Derivations and Parse Trees • Ex: [Figure: two derivations and the parse tree they build] • Notice: the completed (terminated) parse tree is the same for both derivations, though the tree “grows” in a different sequence

  13. Derivation $\leftrightarrow$ Parse Tree • Proposition 1: For every (terminated or partial) derivation $D: S \Rightarrow^* \alpha$ there is a unique parse tree $T$ with frontier $\alpha$, constructible from $D$. • Proposition 2: For every parse tree T in G and any traversal order that is top-down (visits parents before children), there is a unique derivation of the frontier of T from S, and it is constructible from T. • Corollary 3: For every parse tree T in G there is a unique leftmost derivation constructible from T. Pf: Pre-order traverse T, expanding variables as their nodes are visited.
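
The proof of Corollary 3 (pre-order traversal, expanding variables as they are visited) can be sketched as follows. The tree encoding is an assumption made for illustration: an internal node is a tuple `(variable, children)`, a leaf is a terminal string, and a node with no children stands for a rule `variable -> ε`. The names `leftmost_derivation` and `render` are hypothetical, and the sample tree uses the assumed grammar S -> (S)S | ε.

```python
def render(form):
    """Write a list of tree nodes and terminals as a sentential form."""
    return "".join(x[0] if isinstance(x, tuple) else x for x in form)


def leftmost_derivation(tree):
    """Sentential forms of the leftmost derivation encoded by `tree`."""
    form = [tree]                       # frontier of the partial parse tree
    steps = [render(form)]
    while True:
        # The leftmost unexpanded variable is the next node in pre-order.
        idx = next((i for i, x in enumerate(form) if isinstance(x, tuple)), None)
        if idx is None:
            return steps                # only terminals left: terminated
        _, children = form[idx]
        form[idx:idx + 1] = children    # apply the rule recorded at that node
        steps.append(render(form))


# Parse tree of "()()" under the assumed grammar S -> (S)S | ε.
EMPTY = ("S", [])
TREE = ("S", ["(", EMPTY, ")", ("S", ["(", EMPTY, ")", EMPTY])])
```

Calling `leftmost_derivation(TREE)` lists the forms S, (S)S, ()S, ()(S)S, ()()S, ()() in that order.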

  14. Ex: Leftmost Derivation

  15. Ex: Leftmost Derivation • [Figure: parse tree with nodes numbered in preorder traversal order]

  16. Syntactic Ambiguity • 2 distinct parse trees for the same terminal string • 2 distinct leftmost derivations for the same terminal string • Leftmost derivation $\leftrightarrow$ parse tree is 1-to-1 • A CFG is unambiguous $\iff$ every $w \in L(G)$ has a unique parse tree (equivalently, a unique leftmost derivation)
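
Ambiguity can be checked on small inputs by counting parse trees. The sketch below is illustrative only: the grammar E -> E+E | a is an assumed example (not taken from the slides), `count_trees` is a hypothetical name, and the counting relies on the grammar having no ε-rules and no unit rules, so that every symbol covers at least one input character.

```python
from functools import lru_cache

# Assumed ambiguous grammar: E -> E+E | a (single-character symbols,
# no ε-rules, no unit rules).
RULES = {"E": ("E+E", "a")}


def count_trees(word, start="E"):
    """Number of distinct parse trees of `word` rooted at `start`."""

    @lru_cache(maxsize=None)
    def trees(symbol, i, j):
        # Parse trees with root `symbol` and frontier word[i:j].
        if symbol not in RULES:
            return 1 if word[i:j] == symbol else 0
        return sum(split(rhs, i, j) for rhs in RULES[symbol])

    @lru_cache(maxsize=None)
    def split(rhs, i, j):
        # Ways to cover word[i:j] with the symbols of `rhs`, in order.
        if len(rhs) == 1:
            return trees(rhs, i, j)
        head, rest = rhs[0], rhs[1:]
        # Leave at least one character for each remaining symbol.
        return sum(trees(head, i, k) * split(rest, k, j)
                   for k in range(i + 1, j - len(rest) + 1))

    return trees(start, 0, len(word))
```

A count above 1 for some word shows that the grammar is ambiguous; here `count_trees("a+a+a")` is 2, one tree for each way of grouping the two `+` signs.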

  17. Ex: Ambiguous Grammar (English)
       <Sent> → <NP> <VP>
       <NP> → <N> | <Adj> <N>
       <VP> → <V> <Obj> | <V> <AdvP>
       <AdvP> → <Adv> | <AdvP>
       <AdvP> → <Prep> <Obj>
       <Obj> → <Adj> <N>
       <N> → fruit | flies | …
       …

  18. Ex: “Fruit flies like a banana” has two parse trees under the grammar of slide 17 • Tree 1: [<Sent> [<NP> [<Adj> fruit] [<N> flies]] [<VP> [<V> like] [<Obj> [<Adj> a] [<N> banana]]]] • Tree 2: [<Sent> [<NP> [<N> fruit]] [<VP> [<V> flies] [<AdvP> [<Prep> like] [<Obj> [<Adj> a] [<N> banana]]]]]

  19. Right-Linear Grammars & Regular Languages • Defn: A CFG is right-linear iff each rule is of one of the forms $A \to wB$ or $A \to w$, where A, B are variables and $w \in \Sigma^*$ • Chomsky (1958) called these “Type 3” • Thm: L is a regular language iff L = L(G) for some right-linear grammar G. There are algorithms for converting from finite automata to right-linear grammars, and conversely. • [Figure: conversion algorithms linking DFA M, NFA N, regular expression E, and right-linear grammar G]

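  20. Right-Linear & Regular (cont’d) • Pf: ($\Rightarrow$) Assume $L = L(M)$ where $M = (Q, \Sigma, \delta, q_0, F)$ is a DFA. Construct $G = (Q, \Sigma, R, q_0)$ with $R$ having rule $q \to a q'$ if $\delta(q, a) = q'$ in $\delta$, and rule $q \to \varepsilon$ if $q$ is a final state. Claim: $q_0 \Rightarrow^n w q$ iff $|w| = n$ and $\hat\delta(q_0, w) = q$, where $\hat\delta$ is the extension of $\delta$ to strings. Pf: easy induction on $n$. The ($\Rightarrow$) direction follows since $w \in L(M)$ iff $\hat\delta(q_0, w) \in F$ iff $q_0 \Rightarrow^* w q \Rightarrow w$ for some $q \in F$ iff $w \in L(G)$. • Pf: ($\Leftarrow$) Assume $L = L(G)$ where $G = (V, \Sigma, R, S)$ is right-linear. Construct NFA $N = (V \cup \{f\}, \Sigma, \delta, S, \{f\})$ where $f$ is a new symbol. $\delta$ has the transition $A \xrightarrow{w} B$ if $A \to wB$ in $R$ and the transition $A \xrightarrow{w} f$ if $A \to w$ in $R$ (a transition labeled by a string $w$ abbreviates a chain of single-symbol or $\varepsilon$ transitions through new intermediate states).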

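  21. Right-Linear & Regular (cont’d) • Claim: $S \Rightarrow^n wA$ iff the NFA $N$ constructed from $G$ has a path of $n$ transitions from $S$ to $A$ whose labels concatenate to $w$. Pf: easy induction on $n$. The ($\Leftarrow$) direction follows since $w \in L(G)$ iff $S \Rightarrow^* uA \Rightarrow uv = w$ for some rule $A \to v$ in $R$ iff $N$ has a path labeled $w$ from $S$ to its new final state iff $w \in L(N)$.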
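
The DFA-to-grammar direction of the theorem can be sketched in a few lines, assuming the standard construction: one rule q -> a q' per DFA transition δ(q, a) = q', and one rule q -> ε per final state q. The DFA below (even number of a's over {a, b}) and all names are hypothetical. `derives` follows right-linear derivations by tracking which variable can sit at the right end of the sentential form, which is the NFA view of the grammar.

```python
def dfa_to_right_linear(delta, finals):
    """Rules q -> a q' for each transition, and q -> ε for each final state.

    A rule is stored as a pair (terminal, next_variable); ("", None) is q -> ε.
    """
    rules = {}
    for (q, a), q2 in delta.items():
        rules.setdefault(q, []).append((a, q2))
    for q in finals:
        rules.setdefault(q, []).append(("", None))
    return rules


def derives(rules, start, word):
    """Does the right-linear grammar derive `word` from `start`?"""
    current = {start}        # variables that can end the sentential form
    for ch in word:
        current = {b for q in current for (a, b) in rules.get(q, []) if a == ch}
    return any(("", None) in rules.get(q, []) for q in current)


def dfa_accepts(delta, start, finals, word):
    """Run the DFA directly, for comparison with the grammar."""
    q = start
    for ch in word:
        q = delta[(q, ch)]
    return q in finals


# Assumed example DFA: state "E" = even number of a's so far, "O" = odd.
DELTA = {("E", "a"): "O", ("E", "b"): "E", ("O", "a"): "E", ("O", "b"): "O"}
FINALS = {"E"}
GRAMMAR = dfa_to_right_linear(DELTA, FINALS)
```

Here `GRAMMAR` holds the rules E -> aO | bE | ε and O -> aE | bO, and `derives(GRAMMAR, "E", w)` agrees with `dfa_accepts(DELTA, "E", FINALS, w)` on every input w.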

  22. Ex: Right-Linear $\leftrightarrow$ FA • [Figures: two worked conversions; the added final state is f] • “Useless” rules can be eliminated

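  23. Pushdown Automaton • Defn 2.12: A pushdown automaton M is a 6-tuple $M = (Q, \Sigma, \Gamma, \delta, q_0, F)$ where • $Q$ is a finite set, the states • $\Sigma$ is a finite set, the input alphabet • $\Gamma$ is a finite set, the stack alphabet • $\delta : Q \times \Sigma_\varepsilon \times \Gamma_\varepsilon \to \mathcal{P}(Q \times \Gamma_\varepsilon)$ is the transition function • $q_0 \in Q$ is the start state • $F \subseteq Q$ is the set of accept (final) states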

  24. Pushdown Automaton • [Figure: a finite control reading an input tape in $\Sigma^*$ (seen input, current input symbol, input to come) and a stack in $\Gamma^*$ with top and bottom marked; no end-marker is supplied] • Configuration: (state, rest of input, stack)

  25. PDA (cont’d) • Initially: the finite control is in the start state $q_0$, the whole input $w$ is unread, and the stack is empty • Initial configuration: $(q_0, w, \varepsilon)$

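  26. PDA (cont’d) • Transition: if $(q', \gamma) \in \delta(q, a, X)$ with $a \in \Sigma_\varepsilon$ and $X \in \Gamma_\varepsilon$, the finite control moves from $q$ to $q'$, reads $a$, and replaces $X$ on top of the stack by $\gamma$ • Configurations (stack written with its top at the left): $(q, aw, X\beta) \vdash (q', w, \gamma\beta)$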

  27. PDA (cont’d) • A transition can be • an $\varepsilon$-move: consume no input • a pop move: erase the top stack symbol • a push-only move: ignore the stack • Any combination is possible

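  28. PDA (cont’d) • Finally: configuration $(q, \varepsilon, \gamma)$, with all input consumed • Defn: $M$ recognizes $w$ iff $(q_0, w, \varepsilon) \vdash^* (q, \varepsilon, \gamma)$ for some $q \in F$ and some $\gamma \in \Gamma^*$ • Defn: $L(M) = \{ w \in \Sigma^* : M \text{ recognizes } w \}$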
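
The configuration-based definition of acceptance can be sketched as a breadth-first search over configurations (state, input position, stack). Everything below is an illustrative assumption rather than the slides' own machine: the transition table encodes the usual textbook PDA for $\{0^n 1^n : n \ge 0\}$, `""` plays the role of ε, stacks are strings with the top at index 0, and `max_stack` is a pragmatic bound that keeps the search finite for this example (it is not part of the definition).

```python
from collections import deque


def accepts(delta, start, finals, word, max_stack=None):
    """True iff some computation consumes `word` and ends in a final state."""
    if max_stack is None:
        max_stack = len(word) + 2          # assumed bound, enough for DELTA below
    first = (start, 0, "")                 # initial configuration, empty stack
    seen, queue = {first}, deque([first])
    while queue:
        state, pos, stack = queue.popleft()
        if pos == len(word) and state in finals:
            return True
        inputs = [""] + ([word[pos]] if pos < len(word) else [])
        tops = [""] + ([stack[0]] if stack else [])
        for a in inputs:                   # a == "": ε-move
            for x in tops:                 # x == "": do not pop
                for nstate, push in delta.get((state, a, x), ()):
                    cfg = (nstate, pos + len(a), push + stack[len(x):])
                    if len(cfg[2]) <= max_stack and cfg not in seen:
                        seen.add(cfg)
                        queue.append(cfg)
    return False


# Assumed example PDA for {0^n 1^n : n >= 0}; "$" marks the stack bottom.
DELTA = {
    ("q1", "", ""): {("q2", "$")},
    ("q2", "0", ""): {("q2", "0")},
    ("q2", "1", "0"): {("q3", "")},
    ("q3", "1", "0"): {("q3", "")},
    ("q3", "", "$"): {("q4", "")},
}
FINALS = {"q1", "q4"}
```

With this table, `accepts(DELTA, "q1", FINALS, "0011")` succeeds, while on `"001"` every computation blocks before reaching a final state with the input consumed.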

  29. Example: PDA • Recognizer for palindromes with a center-mark • [Figure: state diagram, with one sample input that is accepted and one that is not accepted (the PDA blocks)]

  30. Example: PDA w/ nondeterminism • Last example (palindromes with center-mark) was a deterministic PDA (DPDA) • This example is an NPDA: it makes a nondeterministic “guess” • [Figure: state diagram, with a sample input that is not accepted (the PDA blocks)]

  31. Example: PDA • Recall well-nested parentheses, e.g. (()) and (()()) • The recognizer is a DPDA! • [Figure: state diagram]

  32. Example: PDA • The PDA “guesses” which pattern applies • It “checks” whether the guess is correct • It accepts iff there exists a correct guess that checks

  33. CFG $\Rightarrow$ PDA • Thm 2.20: A language is CF $\iff$ a PDA recognizes it. • There are algorithms for converting a grammar to an equivalent automaton, and conversely. • Lemma 2.21: There is an algorithm for constructing, from any CFG G, a PDA M such that L(G) = L(M). Pf: In constructing a PDA, we can permit, without losing generality, “multi-push” moves such as $(r, u) \in \delta(q, a, s)$ where $u = u_1 \cdots u_l \in \Gamma^*$. For $l \ge 2$ we may break a multi-push into a sequence of single-push moves by introducing new states $q_1, \dots, q_{l-1}$: $(q_1, u_l) \in \delta(q, a, s)$, $\delta(q_1, \varepsilon, \varepsilon) = \{(q_2, u_{l-1})\}$, …, $\delta(q_{l-1}, \varepsilon, \varepsilon) = \{(r, u_1)\}$. Henceforth we will allow multi-push moves in our PDAs.
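
Breaking a multi-push into single pushes is mechanical. The sketch below assumes a transition table that maps (state, input or "", popped symbol or "") to a set of (state, pushed string) pairs, with pushed strings written top-first; the function name and the fresh state names `_m0`, `_m1`, … are hypothetical and are assumed not to clash with existing states.

```python
def split_multi_push(delta):
    """Equivalent table in which every move pushes at most one symbol."""
    result = {}
    fresh = 0
    for (q, a, s), moves in delta.items():
        for (r, u) in sorted(moves):
            if len(u) <= 1:
                result.setdefault((q, a, s), set()).add((r, u))
                continue
            src, read, pop = q, a, s
            # Push the last symbol of u first, so that u[0] ends up on top.
            for symbol in reversed(u[1:]):
                new_state = f"_m{fresh}"
                fresh += 1
                result.setdefault((src, read, pop), set()).add((new_state, symbol))
                src, read, pop = new_state, "", ""   # later steps are ε-moves
            result.setdefault((src, read, pop), set()).add((r, u[0]))
    return result
```

For instance, a single move that reads `a`, pops `X` and pushes `ABC` becomes three moves through two new states, pushing `C`, then `B`, then `A`.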

  34. CFG $\Rightarrow$ PDA • Idea: use nondeterminism. Given G, construct PDA P to • Load S on the stack & simulate a leftmost derivation on the stack: • When a variable symbol A comes to the stack top, “guess” a grammar rule $A \to \alpha$, pop A and push $\alpha$ • When a terminal character comes to the stack top, compare it to the next input symbol. • If they match, pop the top and advance the input (“check off”) • If they fail to match, jam (not an accepting computation) • If the input holds a word in L(G) and P guesses the correct leftmost derivation (rules to apply), then all the input characters will be checked off against those at the top of the stack and the stack will empty as the last input is checked off. Otherwise at some point the PDA will jam

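  35. CFG $\Rightarrow$ PDA (cont’d) • Given $G = (V, \Sigma, R, S)$ construct $P = (Q, \Sigma, \Gamma, \delta, q_{\text{start}}, F)$ • States: $Q = \{q_{\text{start}}, q_{\text{loop}}, q_{\text{accept}}\}$ • Input alphabet: $\Sigma$ • Stack alphabet: $\Gamma = V \cup \Sigma \cup \{\$\}$ • Start state: $q_{\text{start}}$ • Accept states: $F = \{q_{\text{accept}}\}$ • Transition function: • Initialize stack: $\delta(q_{\text{start}}, \varepsilon, \varepsilon) = \{(q_{\text{loop}}, S\$)\}$ • Simulate rules: $\delta(q_{\text{loop}}, \varepsilon, A) = \{(q_{\text{loop}}, w) : A \to w \text{ in } R\}$ • Check off terminals: $\delta(q_{\text{loop}}, a, a) = \{(q_{\text{loop}}, \varepsilon)\}$ • Detect null stack & accept: $\delta(q_{\text{loop}}, \varepsilon, \$) = \{(q_{\text{accept}}, \varepsilon)\}$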
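
A minimal sketch of this construction, under stated assumptions: the three states are named `start`, `loop` and `accept`, `"$"` is an assumed bottom-of-stack marker that the grammar does not use, multi-push moves are allowed, and stacks are strings with the top at index 0. The grammar S -> aSb | ε and the function names are hypothetical; `max_stack` only keeps the breadth-first search finite and is not part of the construction.

```python
from collections import deque


def cfg_to_pda(rules, start_var):
    """Transition table of the three-state PDA built from a CFG.

    `rules` maps each variable to a list of right-hand-side strings.
    """
    terminals = {c for rhss in rules.values() for rhs in rhss for c in rhs
                 if c not in rules}
    delta = {("start", "", ""): {("loop", start_var + "$")}}   # initialize stack
    for var, rhss in rules.items():                            # simulate rules
        delta[("loop", "", var)] = {("loop", rhs) for rhs in rhss}
    for a in terminals:                                        # check off terminals
        delta[("loop", a, a)] = {("loop", "")}
    delta[("loop", "", "$")] = {("accept", "")}                # null stack: accept
    return delta


def accepts(delta, word, max_stack):
    """Breadth-first search over configurations (state, position, stack)."""
    first = ("start", 0, "")
    seen, queue = {first}, deque([first])
    while queue:
        state, pos, stack = queue.popleft()
        if state == "accept" and pos == len(word):
            return True
        inputs = [""] + ([word[pos]] if pos < len(word) else [])
        tops = [""] + ([stack[0]] if stack else [])
        for a in inputs:
            for x in tops:
                for nstate, push in delta.get((state, a, x), ()):
                    cfg = (nstate, pos + len(a), push + stack[len(x):])
                    if len(cfg[2]) <= max_stack and cfg not in seen:
                        seen.add(cfg)
                        queue.append(cfg)
    return False


# Assumed example grammar: S -> aSb | ε.
PDA = cfg_to_pda({"S": ["aSb", ""]}, "S")
```

On input `aabb` the accepting computation mirrors the leftmost derivation S ⇒ aSb ⇒ aaSbb ⇒ aabb: each expansion is an ε-move in `loop`, and each terminal is checked off against the input.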

  36. CFG $\Rightarrow$ PDA (cont’d) • Ex:

  37. CFG $\Rightarrow$ PDA (cont’d) • [Figure: grammar G and the PDA P constructed from it]

  38. CFG $\Rightarrow$ PDA (cont’d) • [Figure: a leftmost derivation in the CFG G alongside the corresponding computation of the PDA P]

  39. PDA $\Rightarrow$ CFG • Lemma 2.27: There is an algorithm for constructing, from any PDA P, a CFG G such that L(G) = L(P). • Pf: Given a PDA, we can convert it into a PDA with the following simplified structure: • it has only one accept state: add $\varepsilon$-transitions from the multiple accept states to a single new one • it empties its stack just before entering the accept state: loop on a state that just pops • each PDA transition is either a “pure push” or a “pure pop”: introduce new intermediate states

  40. PDA $\Rightarrow$ CFG (cont’d) • A transition that both pops and pushes becomes a pop followed by a push, through a new intermediate state • A transition that neither pops nor pushes becomes a push of a dummy symbol followed by a pop of that symbol • Idea of proof: construct $G$ with variables $A_{pq}$ for each $p$ and $q$ in the set of states $Q$. Arrange that if $A_{pq}$ generates terminal string $x$, then PDA $P$ started in state $p$ with an empty stack on input string $x$ has a computation that reaches state $q$ with an empty stack. And conversely, if $P$ started in state $p$ with an empty stack has a computation on input string $x$ that reaches state $q$ with an empty stack, then $A_{pq} \Rightarrow^* x$. • How does $P$, when started on an empty stack in state $p$, operate on an input string $x$, ending with an empty stack in state $q$? • First move must be a push • Last move must be a pop

  41. PDA $\Rightarrow$ CFG (cont’d) • Trace the computation of P on x starting in state p with empty stack, and ending in state q with empty stack: (1) stack never empties in between • [Fig. 1: stack height plotted against input position]

  42. PDA $\Rightarrow$ CFG (cont’d) • Trace the computation of P on x starting in state p with empty stack, and ending in state q with empty stack: (2) stack empties somewhere in between • [Fig. 2: stack height plotted against input position]

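  43. PDA $\Rightarrow$ CFG (cont’d) • Construction. Given PDA $P = (Q, \Sigma, \Gamma, \delta, q_0, \{q_{\text{accept}}\})$ construct $G = (V, \Sigma, R, A_{q_0 q_{\text{accept}}})$ with $V = \{A_{pq} : p, q \in Q\}$ and the following rules in R: • If $(r, X) \in \delta(p, a, \varepsilon)$ and $(q, \varepsilon) \in \delta(s, b, X)$, then $A_{pq} \to a A_{rs} b$ (1st kind) • $A_{pq} \to A_{pr} A_{rq}$ for all $p, q, r \in Q$ (2nd kind) • $A_{pp} \to \varepsilon$ for all $p \in Q$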
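
The rule construction can be sketched directly, assuming the standard three rule families for a PDA in simplified form: $A_{pq} \to a A_{rs} b$ for each matching push/pop pair, $A_{pq} \to A_{pr} A_{rq}$ for all states, and $A_{pp} \to \varepsilon$. In this illustration a variable $A_{pq}$ is the pair `(p, q)`, a rule is a pair `(lhs, rhs_tuple)`, and the three-state PDA below (states s, q, f, recognizing well-nested strings with `a` as open and `b` as close) is a hypothetical example, not the one from the slides.

```python
def pda_to_cfg_rules(states, delta):
    """Grammar rules for a PDA whose moves are all pure pushes or pure pops.

    `delta` maps (state, input or "", popped or "") to a set of
    (state, pushed or "") pairs.
    """
    rules = set()
    pushes = [(p, a, r, x)
              for (p, a, popped), moves in delta.items() if popped == ""
              for (r, x) in moves if x != ""]
    pops = [(s, b, x, q)
            for (s, b, x), moves in delta.items() if x != ""
            for (q, pushed) in moves if pushed == ""]
    for (p, a, r, x) in pushes:              # 1st kind: A_pq -> a A_rs b
        for (s, b, y, q) in pops:
            if x == y:
                rhs = tuple(sym for sym in (a, (r, s), b) if sym != "")
                rules.add(((p, q), rhs))
    for p in states:                         # 2nd kind: A_pq -> A_pr A_rq
        for q in states:
            for r in states:
                rules.add(((p, q), ((p, r), (r, q))))
    for p in states:                         # A_pp -> ε
        rules.add(((p, p), ()))
    return rules


# Assumed example PDA in simplified form; "$" is pushed first and popped last.
STATES = ["s", "q", "f"]
DELTA = {
    ("s", "", ""): {("q", "$")},
    ("q", "a", ""): {("q", "X")},
    ("q", "b", "X"): {("q", "")},
    ("q", "", "$"): {("f", "")},
}
RULES = pda_to_cfg_rules(STATES, DELTA)
```

For this PDA the first family yields $A_{sf} \to A_{qq}$ and $A_{qq} \to a A_{qq} b$; the second family always contributes $|Q|^3$ rules (27 for three states) before useless ones are removed.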

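  44. PDA $\Rightarrow$ CFG (cont’d) • Claim 2.30: If $A_{pq} \Rightarrow^* x$ then $(p, x, \varepsilon) \vdash^* (q, \varepsilon, \varepsilon)$. Pf: by induction on the length k of a derivation in G. Base: k = 1. The only derivations of length 1 are $A_{pp} \Rightarrow \varepsilon$, and we have $(p, \varepsilon, \varepsilon) \vdash^* (p, \varepsilon, \varepsilon)$. Step: Assume (IH) the Claim is true for derivations of $\le k$ steps. Want the Claim true for derivations of k+1 steps. Suppose that $A_{pq} \Rightarrow^{k+1} x$. The first derivation step is either of the form $A_{pq} \Rightarrow a A_{rs} b$ or $A_{pq} \Rightarrow A_{pr} A_{rq}$. Case $A_{pq} \Rightarrow a A_{rs} b$. Then $x = ayb$ with $A_{rs} \Rightarrow^{k} y$. So IH $\implies (r, y, \varepsilon) \vdash^* (s, \varepsilon, \varepsilon)$. By construction, since $A_{pq} \to a A_{rs} b$ is a rule of G, there is a stack symbol $X$ with $(r, X) \in \delta(p, a, \varepsilon)$ and $(q, \varepsilon) \in \delta(s, b, X)$, so $(p, ayb, \varepsilon) \vdash (r, yb, X) \vdash^* (s, b, X) \vdash (q, \varepsilon, \varepsilon)$.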

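  45. PDA $\Rightarrow$ CFG (cont’d) • Case $A_{pq} \Rightarrow A_{pr} A_{rq}$. Then $x = yz$ with $A_{pr} \Rightarrow^* y$ and $A_{rq} \Rightarrow^* z$, each in $\le k$ steps. So IH $\implies (p, y, \varepsilon) \vdash^* (r, \varepsilon, \varepsilon)$ and $(r, z, \varepsilon) \vdash^* (q, \varepsilon, \varepsilon)$. Putting these together: $(p, yz, \varepsilon) \vdash^* (r, z, \varepsilon) \vdash^* (q, \varepsilon, \varepsilon)$.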

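  46. PDA $\Rightarrow$ CFG (cont’d) • Claim 2.31: If $(p, x, \varepsilon) \vdash^* (q, \varepsilon, \varepsilon)$ then $A_{pq} \Rightarrow^* x$. Pf: by induction on the length k of a computation in P. Base: k = 0. The only computations of length 0 are $(p, \varepsilon, \varepsilon) \vdash^0 (p, \varepsilon, \varepsilon)$, where $x = \varepsilon$. By construction $A_{pp} \to \varepsilon$ is a rule of G, so $A_{pp} \Rightarrow^* \varepsilon$. Step: Assume (IH) the Claim is true for computations of $\le k$ steps. Want the Claim true for computations of k+1 steps. Suppose that $(p, x, \varepsilon) \vdash^{k+1} (q, \varepsilon, \varepsilon)$. Two cases: either the stack does not empty in the midst of this computation (Fig. 1) or it becomes empty during the computation (Fig. 2). Call these Case 1 and Case 2.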

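  47. PDA $\Rightarrow$ CFG (cont’d) • Case 1: See Fig. 1. The symbol X pushed in the 1st move is the same as that popped in the last move. Let the 1st and last moves be governed by the push/pop transitions $(r, X) \in \delta(p, a, \varepsilon)$ and $(q, \varepsilon) \in \delta(s, b, X)$. By construction, there is a rule $A_{pq} \to a A_{rs} b$ in G. Let $x = ayb$. Since $(r, yb, X) \vdash^* (s, b, X)$ without ever popping that $X$, we must have $(r, y, \varepsilon) \vdash^* (s, \varepsilon, \varepsilon)$. By IH, $A_{rs} \Rightarrow^* y$. Then, using $A_{pq} \Rightarrow a A_{rs} b$, we conclude $A_{pq} \Rightarrow^* ayb = x$.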

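  48. PDA $\Rightarrow$ CFG (cont’d) • Case 2: See Fig. 2. Let r be the intermediate state where the stack becomes empty. Then $(p, yz, \varepsilon) \vdash^* (r, z, \varepsilon) \vdash^* (q, \varepsilon, \varepsilon)$, where $x = yz$. By the IH, $A_{pr} \Rightarrow^* y$ and $A_{rq} \Rightarrow^* z$. Since by construction there is a rule in G of the form $A_{pq} \to A_{pr} A_{rq}$, then $A_{pq} \Rightarrow A_{pr} A_{rq} \Rightarrow^* yz = x$.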

  49. PDA $\Rightarrow$ CFG (cont’d) • Ex: [Figure: example PDA] • Rules of G: (1) push-pop pairs (1st kind):

  50. PDA $\Rightarrow$ CFG (cont’d) • Note: if $p'$ is unreachable from $p$, then $A_{pp'}$ derives no terminal string. Such variables are useless; all rules involving them on left or right sides can be eliminated as useless productions. • (2) Rules of the 2nd kind for this grammar (with useless rules removed; only 10 of 27 survive), in the order s, q, f:
