This guide explores the fundamentals of Context-Free Languages (CFL), including their definitions, characteristics, and forms. We discuss Chomsky Normal Form (CNF) and Greibach Normal Form (GNF), methods for removing null and unit productions, and the significance of parse trees. Additionally, we introduce pushdown automata (PDAs) and their role in recognizing CFLs, alongside the Pumping Lemma which outlines essential facts about CFLs. By examining these topics, readers will gain a comprehensive overview of context-free languages and their applications in computational theory.
CONTEXT FREE LANGUAGE by: Er. Sukhwinder kaur
Topics to be discussed… • Context Free Language • Parse Tree • Chomsky Normal Form (CNF) • Greibach Normal Form (GNF) • Removing null productions • Unit Productions • Pushdown Automata: a preview • Pumping Lemma
Context Free Language
Facts:
1. Each nonterminal symbol can derive many different strings.
2. Every string in a derivation is called a sentential form.
3. Every sentential form containing no nonterminal symbols is called a sentence.
4. The language L(G) generated by a CFG G is the set of sentences derivable from a distinguished nonterminal called the start symbol of G (e.g. <stmt>).
5. A language is said to be context free (a context-free language, CFL) if it can be generated by a CFG.
A sentence may have many different derivations. A grammar is called unambiguous if every sentence has exactly one parse tree, or equivalently exactly one leftmost derivation (e.g. the previous grammar is unambiguous).
CFGs: a formal definition
A CFG is a quadruple G = (N, Σ, P, S) where
• N is a finite set (of nonterminal symbols).
• Σ is a finite set (of terminal symbols) disjoint from N.
• S ∈ N is the start symbol.
• P is a finite subset of N × (N ∪ Σ)* (the productions).
Conventions:
• Nonterminals: A, B, C, …
• Terminals: a, b, c, …
• Strings in (N ∪ Σ)*: α, β, γ, …
Each (A, α) ∈ P is called a production rule and is usually written as A → α.
A set of rules with the same LHS, A → α1, A → α2, A → α3, can be abbreviated as A → α1 | α2 | α3.
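The quadruple definition can be written down directly as a data structure. The following is a minimal sketch, not a standard library API: the class name CFG, its field names, and the example grammar S → SS | (S) | ε are assumptions chosen for illustration. The check mirrors the three conditions in the definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    """Hypothetical container for G = (N, Sigma, P, S)."""
    nonterminals: frozenset   # N
    terminals: frozenset      # Sigma
    productions: frozenset    # P: pairs (lhs, rhs), rhs is a tuple of symbols
    start: str                # S

    def is_well_formed(self):
        symbols = self.nonterminals | self.terminals
        return (
            # N and Sigma must be disjoint
            self.nonterminals.isdisjoint(self.terminals)
            # S must be a nonterminal
            and self.start in self.nonterminals
            # P must be a subset of N x (N u Sigma)*
            and all(
                lhs in self.nonterminals and all(s in symbols for s in rhs)
                for lhs, rhs in self.productions
            )
        )


# Assumed example grammar: S -> S S | ( S ) | epsilon (empty tuple)
balanced = CFG(
    nonterminals=frozenset({"S"}),
    terminals=frozenset({"(", ")"}),
    productions=frozenset({
        ("S", ("S", "S")),
        ("S", ("(", "S", ")")),
        ("S", ()),
    }),
    start="S",
)
print(balanced.is_well_formed())
```

Representing each right-hand side as a tuple keeps ε (the empty tuple) distinct from any symbol.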
Parse Tree
[Figure: parse tree for the string ( ) ( ). The root S has two S children; each expands to ( S ), and each inner S derives ε.]
Features of the parse tree:
1. The root node is labeled by the start symbol: S.
2. The left-to-right traversal of all leaves corresponds to the input string: ( ) ( ).
3. If X is an internal node and Y1 Y2 … Yk is a left-to-right listing of all its children in the tree, then X → Y1 Y2 … Yk is a rule of G.
4. Every step of a derivation corresponds to one-level growth of an internal node.
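Features 2 and 3 can be checked mechanically. The sketch below assumes a tree encoded as nested (label, children) pairs and the grammar S → SS | (S) | ε; the function names and the use of the string "ε" as a leaf label are illustrative choices, not part of the slides.

```python
def tree_yield(node):
    """Concatenate the leaves from left to right (feature 2)."""
    label, children = node
    if not children:
        return "" if label == "ε" else label
    return "".join(tree_yield(child) for child in children)


def uses_only_rules(node, rules):
    """Check that every internal node X with children Y1..Yk
    corresponds to a rule X -> Y1..Yk (feature 3)."""
    label, children = node
    if not children:
        return True
    rhs = tuple(child[0] for child in children)
    return (label, rhs) in rules and all(
        uses_only_rules(child, rules) for child in children
    )


def leaf(symbol):
    return (symbol, [])


# Assumed grammar: S -> S S | ( S ) | ε
RULES = {("S", ("S", "S")), ("S", ("(", "S", ")")), ("S", ("ε",))}

# One S -> ( S ) subtree whose inner S derives ε
pair = ("S", [leaf("("), ("S", [leaf("ε")]), leaf(")")])
tree = ("S", [pair, pair])  # root S -> S S

print(tree_yield(tree))
print(uses_only_rules(tree, RULES))
```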
Leftmost, Rightmost Derivations
Definition. A leftmost derivation of a sentential form is one in which, at every step, a rule is applied to the leftmost nonterminal.
Definition. A rightmost derivation of a sentential form is one in which, at every step, a rule is applied to the rightmost nonterminal.
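The two definitions differ only in which nonterminal is rewritten at each step. As a small illustration, assuming the grammar S → SS | (S) | ε and single-character symbols, the hypothetical helper derive below applies a given sequence of rules either leftmost or rightmost and records every sentential form.

```python
def derive(start, rules, nonterminals, leftmost=True):
    """Apply rules in order, always rewriting the leftmost
    (or rightmost) nonterminal. Returns all sentential forms."""
    form = (start,)
    forms = [form]
    for lhs, rhs in rules:
        positions = [i for i, s in enumerate(form) if s in nonterminals]
        if not positions:
            raise ValueError("no nonterminal left to rewrite")
        i = positions[0] if leftmost else positions[-1]
        if form[i] != lhs:
            raise ValueError(f"rule for {lhs} does not apply to {form[i]}")
        form = form[:i] + tuple(rhs) + form[i + 1:]
        forms.append(form)
    return forms


# Rule sequence for the string ()() ; "" stands for epsilon
steps = [("S", "SS"), ("S", "(S)"), ("S", ""), ("S", "(S)"), ("S", "")]

left = ["".join(f) for f in derive("S", steps, {"S"}, leftmost=True)]
right = ["".join(f) for f in derive("S", steps, {"S"}, leftmost=False)]
print(left)   # S, SS, (S)S, ()S, ()(S), ()()
print(right)  # S, SS, S(S), S(), (S)(), ()()
```

Both derivations end in the same sentence and correspond to the same parse tree; only the order of rewriting differs.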
Trading Left- & Right-Recursion
Left recursion: A → A α
Right recursion: A → α A
Most parsing algorithms have trouble with one of the two. In recursive descent, avoid left recursion: a procedure for A that starts by calling itself never consumes any input.
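One common way to trade left recursion for right recursion is the textbook rewrite of A → A α | β into A → β A' and A' → α A' | ε. The slides do not spell this out, so the sketch below is an added illustration; the function name and the primed nonterminal A' are assumptions, and only immediate left recursion with non-empty α is handled.

```python
def eliminate_immediate_left_recursion(nt, alternatives):
    """Rewrite A -> A a1 | ... | b1 | ... as
    A -> b1 A' | ...  and  A' -> a1 A' | ... | epsilon.
    Each alternative is a tuple of symbols; () stands for epsilon."""
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nt]
    others = [alt for alt in alternatives if not alt or alt[0] != nt]
    if not recursive:
        return {nt: list(alternatives)}  # nothing to do
    new_nt = nt + "'"  # hypothetical fresh nonterminal
    return {
        nt: [beta + (new_nt,) for beta in others],
        new_nt: [alpha + (new_nt,) for alpha in recursive] + [()],
    }


# E -> E + T | T   becomes   E -> T E'   and   E' -> + T E' | epsilon
result = eliminate_immediate_left_recursion("E", [("E", "+", "T"), ("T",)])
print(result)
```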
Chomsky Normal Form (CNF)
Let G be a CFG for some L − {ε}.
Definition: G is said to be in Chomsky Normal Form if all its productions are in one of the following two forms:
• A → BC, where A, B, C are variables, or
• A → a, where a is a terminal.
In addition:
• G has no useless symbols
• G has no unit productions
• G has no ε-productions
CNF checklist
G1:
• E → E+T | T*F | (E) | Ia | Ib | I0 | I1
• T → T*F | (E) | Ia | Ib | I0 | I1
• F → (E) | Ia | Ib | I0 | I1
• I → a | b | Ia | Ib | I0 | I1
Is this grammar in CNF?
Checklist:
• G has no ε-productions
• G has no unit productions
• G has no useless symbols
But…
• the normal form for productions is violated
So, the grammar is not in CNF.
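The last checklist item, the shape of each production, is easy to test by program. The following sketch assumes single-character symbols and a grammar stored as a dictionary from nonterminal to right-hand-side strings; the function name cnf_violations is an illustrative choice. It reports every production of G1 that is neither A → BC nor A → a.

```python
def cnf_violations(grammar):
    """Return the productions (lhs, rhs) that are neither
    A -> BC (two nonterminals) nor A -> a (one terminal)."""
    nonterminals = set(grammar)
    violations = []
    for lhs, alternatives in grammar.items():
        for rhs in alternatives:
            is_pair = len(rhs) == 2 and all(s in nonterminals for s in rhs)
            is_terminal = len(rhs) == 1 and rhs not in nonterminals
            if not (is_pair or is_terminal):
                violations.append((lhs, rhs))
    return violations


G1 = {
    "E": ["E+T", "T*F", "(E)", "Ia", "Ib", "I0", "I1"],
    "T": ["T*F", "(E)", "Ia", "Ib", "I0", "I1"],
    "F": ["(E)", "Ia", "Ib", "I0", "I1"],
    "I": ["a", "b", "Ia", "Ib", "I0", "I1"],
}

violations = cnf_violations(G1)
print(len(violations) > 0)  # True: G1 is not in CNF
```

Only I → a and I → b already have an allowed shape; rules such as I → Ia mix a variable with a terminal, and rules such as E → E+T are too long.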
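Greibach Normal Form (GNF)
A CFG is in Greibach normal form if each rule has one of these forms:
• A → a A1 A2 … An
• A → a
• S → ε
where a ∈ Σ is a terminal and Ai ∈ V − {S} for i = 1, 2, …, n (V is the set of variables, so each Ai is a nonterminal other than the start symbol).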
Removing ε-Productions
Remove all ε-productions:
(1) If there is a rule P → αQβ and Q is nullable (Q can derive ε), then add the rule P → αβ.
(2) Delete all rules Q → ε.
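A possible implementation of the two steps is sketched below. It first computes the nullable nonterminals by a fixed-point iteration, then applies step (1) exhaustively by keeping or dropping each nullable occurrence in every right-hand side, and finally applies step (2) by discarding empty right-hand sides. The dictionary encoding, the single-character symbols, and the example grammar are assumptions for illustration.

```python
from itertools import product


def nullable_symbols(grammar):
    """Nonterminals that can derive the empty string ("" is epsilon)."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            if lhs not in nullable and any(
                all(s in nullable for s in rhs) for rhs in alternatives
            ):
                nullable.add(lhs)
                changed = True
    return nullable


def remove_epsilon_productions(grammar):
    nullable = nullable_symbols(grammar)
    result = {}
    for lhs, alternatives in grammar.items():
        new_alternatives = set()
        for rhs in alternatives:
            # Step (1): each nullable symbol may be kept or dropped
            options = [(s, "") if s in nullable else (s,) for s in rhs]
            for choice in product(*options):
                candidate = "".join(choice)
                if candidate:  # Step (2): no rule X -> epsilon survives
                    new_alternatives.add(candidate)
        result[lhs] = new_alternatives
    return result


# Assumed example: S -> AbB, A -> a | epsilon, B -> b | epsilon
grammar = {"S": ["AbB"], "A": ["a", ""], "B": ["b", ""]}
print(remove_epsilon_productions(grammar))
```

The resulting grammar generates L(G) − {ε}, which matches the assumption made for CNF.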
Unit Productions
A unit production is a rule whose right-hand side consists of a single nonterminal symbol.
Example:
S → X Y
X → A
A → B | a
B → b
Y → T
T → Y | c
Removing Unit Productions
removeUnits(G) =
1. Let G′ = G.
2. Until no unit productions remain in G′ do:
2.1 Choose some unit production X → Y.
2.2 Remove it from G′.
2.3 Consider only rules that still remain. For every rule Y → β, where β ∈ V*, do: add to G′ the rule X → β unless it is a rule that has already been removed once.
3. Return G′.
Example:
S → X Y
X → A
A → B | a
B → b
Y → T
T → Y | c
Result:
S → X Y
A → a | b
B → b
T → c
X → a | b
Y → c
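The procedure can be followed step by step in code. This is a sketch under simplifying assumptions: symbols are single characters, rules are (lhs, rhs) pairs with string right-hand sides, and "choose some unit production" is resolved by taking the first one in sorted order. The function name remove_units is illustrative.

```python
def remove_units(grammar, nonterminals):
    """Follow removeUnits(G): repeatedly remove a unit production X -> Y
    and add X -> beta for every remaining rule Y -> beta, unless that
    rule has already been removed once."""
    rules = {(lhs, rhs) for lhs, alts in grammar.items() for rhs in alts}
    removed = set()
    while True:
        units = sorted(
            r for r in rules if len(r[1]) == 1 and r[1] in nonterminals
        )
        if not units:
            return rules
        x, y = units[0]            # 2.1 choose some unit production
        rules.remove((x, y))       # 2.2 remove it
        removed.add((x, y))
        for lhs, beta in list(rules):  # 2.3 only rules that still remain
            if lhs == y and (x, beta) not in removed:
                rules.add((x, beta))


example = {
    "S": ["XY"],
    "X": ["A"],
    "A": ["B", "a"],
    "B": ["b"],
    "Y": ["T"],
    "T": ["Y", "c"],
}
result = remove_units(example, set(example))
print(sorted(result))
```

On the example grammar this yields S → XY, A → a | b, B → b, T → c, X → a | b, Y → c. The "already removed once" check is what keeps the cycle Y → T, T → Y from looping forever.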
Pushdown Automata: a preview
FAs recognize regular languages. What kinds of machines recognize CFLs? ===> Pushdown automata (PDAs)
PDA: like an FA, but with an additional stack as working memory.
Actions of a PDA:
1. Move right one tape cell (as in an FA)
2. Push a symbol onto the stack
3. Pop a symbol from the stack
Actions of a PDA depend on:
1. The current state
2. The currently scanned input symbol
3. The current top stack symbol
A string x is accepted by a PDA if the PDA can enter a final state (or clear all stack symbols) after scanning the entire input.
Further details are deferred to later chapters.
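To make the preview concrete, here is a small simulation of a deterministic, single-state PDA for balanced parentheses. It is only a sketch: the state name q, the bottom-of-stack marker Z, and the transition table DELTA are assumptions for this example, and general PDAs are nondeterministic. Each move depends on the state, the scanned input symbol, and the top stack symbol, as listed above.

```python
BOTTOM = "Z"  # assumed bottom-of-stack marker

# (state, input symbol, top of stack) -> (new state, symbols replacing the top)
# The first symbol of the replacement list ends up on top of the stack.
DELTA = {
    ("q", "(", BOTTOM): ("q", ["(", BOTTOM]),  # push "("
    ("q", "(", "("): ("q", ["(", "("]),        # push "("
    ("q", ")", "("): ("q", []),                # pop "("
}


def pda_accepts(text):
    state, stack = "q", [BOTTOM]  # top of stack is the end of the list
    for symbol in text:
        key = (state, symbol, stack[-1])
        if key not in DELTA:
            return False  # no move defined: reject
        state, replacement = DELTA[key]
        stack.pop()
        stack.extend(reversed(replacement))
    # Accept when only the bottom marker is left after the whole input
    return stack == [BOTTOM]


for word in ["()()", "(())", "(()", ")("]:
    print(word, pda_accepts(word))
```

Acceptance here corresponds to the "clear all stack symbols" condition, with the bottom marker standing in for an empty stack.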
Pumping Lemma
If a language L is accepted by a DFA M with m states, then any string x in L with |x| > m can be written as x = uvw such that
(1) v ≠ ε, and
(2) uv*w is a subset of L (i.e., for any n ≥ 0, uvⁿw is in L).
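Proof
• Consider the path associated with x (|x| > m).
• Since |x| > m, the number of nodes on the path is at least m+1. M has only m states, so some state appears twice on the path.
[Figure: the path for x splits into u (from the initial state to the first visit of the repeated state), v (the loop back to that state), and w (from that state to a final state).]
• v ≠ ε because M is a DFA: it has no ε-moves, so the loop reads at least one symbol.
• uw is in L because there is a path associated with uw from the initial state to a final state.
• uvⁿw is in L for the same reason.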
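The proof is constructive, so the split x = uvw can be computed by running the DFA and stopping at the first repeated state. The sketch below assumes a two-state DFA over {a, b} that accepts strings with an even number of a's; the names DELTA, FINALS, run, and pump_split are illustrative.

```python
# Assumed DFA: state 0 = even number of a's (initial and final), state 1 = odd
DELTA = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}
START, FINALS = 0, {0}


def run(delta, state, text):
    """Return the state reached from `state` after reading `text`."""
    for ch in text:
        state = delta[(state, ch)]
    return state


def pump_split(delta, start, x):
    """Split x into (u, v, w) at the first state that repeats on the path."""
    seen = {start: 0}  # state -> number of symbols read when first visited
    state = start
    for i, ch in enumerate(x, start=1):
        state = delta[(state, ch)]
        if state in seen:
            j = seen[state]
            return x[:j], x[j:i], x[i:]
        seen[state] = i
    return None  # only possible when |x| is smaller than the number of states


x = "abab"  # in L, and |x| > m = 2
u, v, w = pump_split(DELTA, START, x)
print((u, v, w))
print(all(run(DELTA, START, u + v * n + w) in FINALS for n in range(5)))
```

For x = abab the path repeats state 1 after reading "a" and again after "ab", so v is the loop "b" and every string a bⁿ a b stays in the language.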