1 / 16

Chapter 4

Chapter 4. Normal Forms for CFGs. 4.5 Chomsky Normal Form. Defn 4.4.1 A CFG G = ( V , , P , S ) is in chomsky normal form if each rule in G has one of the following forms: i) A  BC ii) A  a iii) S   where A , B , C  V and B , C  V - { S } and a  

ernst
Download Presentation

Chapter 4

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Chapter 4 Normal Forms for CFGs

  2. 4.5 Chomsky Normal Form • Defn 4.4.1 A CFG G = (V, , P, S) is in chomsky normal form if each rule in G has one of the following forms: i) A BC ii) A  a iii) S   where A, B, C  V and B, C  V - {S} and a   • A simplified normal form which restricts the length & composition of the R.H.S. of a rule in CFG • The derivation tree for a string generated by a CFG in chomsky normal form is a binary tree

  3. 4.5 Chomsky Normal Form • Theorem 4.4.2 Let G = (V, , P, S) be a CFG. There is an algorithm to construct a grammar G’ = (V’, ’, P’, S’) in chomsky normal form that is equivalent to G Proof (sketch): (i) For each rule A w, where |w| > 1, replace each terminal a in w by a distinct variable Y & create new rule Y  a (ii) For each modified rule X  w’, w is either a terminal or a string in V+. Rules in the latter form must be broken into a sequence of rules, each of whose R.H.S. consists of two variables. • Example 4.4.1 • One of the applications of using CFGs that are in Chomsky Normal Form • Constructing binary search trees to accomplish “optimal” time & space search complexity for parsing an input string

  4. * *    G G’ G’ 4.1 Grammar Transformations • Lemma 4.1.1 Let G = (V, , P, S) be a CFG. There is a CFG G’ = (V’, , P’, S’) that satisfies i) L(G) = L(G’) ii) Rules in P’ are of the form A  w where A  V’ and w  ((V – {S’})  )*. Proof. If S is a recursive variable, then construct G’ by creating a new start symbolS’ & adding S’  S to P’, i.e., G’ = (V {S’}, , P  {S’  S}, S’). If Su, then S’ Su, where u  *. Example. 4.1.1 Assume that P in G includes S  aS | AB | AC, then P’ in G’ should include S’  S, S  aS | AB | AC

  5. 4.2 Elimination of -rules • Nullablevariables are variables that can derive . • A grammar w/o nullable variables is called non- contractinggrammar since the application of a rule cannotdecrease the length of the sentenial form. • It is desirable to avoid the generation of (nullable) variables that are subsequently removed by -rules. • The removal of nullable variables in a grammar guarantees that during process of the deriving a terminal string, each variable generates terminal symbol(s). • The derivation of a terminal string in a grammar G is more cost-effective if G is noncontrating than if G is contracting.

  6. 4.2 Elimination of -rules • Algorithm 4.2.1 Construction of Sets of NullableVariables Input:A CFG G = (V, , P, S) Output: Set of Nullable Variables 1. NULL := { A | A   P} 2. Repeat 2.1. PREV := NULL 2.2. For each variable A  Vdo If  A  w, where w PREV* do NULL := NULL  { A } Until NULL = PREV. (Note: w  PREV* indicates that A ( w) produces entirely nullable variables)

  7. 4.2 Elimination of -rules Example 4.2.1 Given the following CFG S  ACA A  aAa | B | CB  bB | bC  cC |  Using Algorithm 4.1.2, the set of nullable variables can be computed 0 { C } 1 { A, C} { C } 2 { S, A, C} { A, C} 3 { S, A, C} { S, A, C}

  8. 4.2 Elimination of -rules • An essentially noncontractinggrammarG ●includes S  , if   L(G) ● excludes any -rules ● yields only noncontracting derivations, with the exception of S  . • Theorem 4.2.3 Let G = (V, , P, S) be a CFG. There is a CFG GL = (VL, , PL, SL) w/o -rules that satisfies ●L(GL) = L(G). ● SL is not a recursive variable. ● A    PL iff  L(G) & A = SL.

  9. 4.2 Elimination of -rules • Constructing GL from G. (i) SL is not a recursive variable. Any recursive start variable can be made nonrecursive according to Lemma 4.1.1. (ii)If   L(G), then SL   PL. (iii) Delete all the -rules in G, except SL , from PL as follows: If A  w  P, w = w1A1w2A2 … wkAkwk+1 , and A1, …, Ak are nullable variables, then create 2k - 1 new rules in PL, i.e., a set of 2k possible combinations with the inclusion and exclusion of A1, A2, … ,Ak. • Example 4.2.2 { S, A, C } are nullable variables of G, where G: S  ACA GL: S  ACA | AC | CA | AA | A | C |  A  aAa | B | CA  aAa| aa | B | C B  bB | b B  bB | b C  cC |  C  cC | c

  10. 4.2 Elimination of -rules • Example 4.2.3 Let G be the grammar S  ABC A  aA |  B  bB |  C  cC |  G generates a*b*c*. The nullable variables of G are S, A, B, and C. The equivalent grammar of G w/o -rules is GL, where S  ABC | AB | AC | BC | A | B | C |  A  aA | a B  bB | b C  cC | c

  11. 4.3 Elimination of Chain rules • Definition. A rule of the form A  B, where A, BV, which simply renames a variable in a derivation, is a chain rule. • The removal of chain rules • increase the number of rules in the grammar, but • reduce the length of derivations • Example. Consider the set of rules P A  aA | a | B B  bB | b | C Eliminating the chain rule A  B by (i) adding A  w, for every rule B  w, and(ii) delete A  B. Hence, P is modified as A  aA | a | bB| b | C B  bB | b | C However, another chain ruleA  C was created.

  12. 4.3 Elimination of Chain rules • Algorithm 4.3.1. Construction of the Set CHAIN(A) Input:A CFG G = (V, , P, S) and variable A Output: The set of chain rules of A 1. CHAIN(A) := { A } 2. PREV :=  3. Repeat 3.1. NEW := CHAIN(A) – PREV 3.2. PREV := CHAIN(A) 3.3. For each variable B  NEW do For each rule B  Cdo CHAIN(A) := CHAIN(A)  { C } UntilCHAIN(A) = PREV.

  13. * * * * * *         Gc G Gc G Gc G G G 4.3 Elimination of Chain rules • Theorem 4.3.3 Let G = (V, , P, S) be a CFG. There is a CFG GC = (V, , PC, S) that satisfies • L(GC) = L(G). • GC has no chain rules. Proof. Using P & CHAIN(A), we compute the A rules in GC. The rule A  wis in PC if  variable B & string w .. i) B  CHAIN(A) ii) B  w P iii) w  V (ensures that PC does not contain chain rules). Let w L(G) & AB be a maximal sequence of chain rules used to derive w, which can be generated by S uAvuBv upvw, and S uAv upvw where B  p  Pis not a chain rule.

  14. 4.3 Elimination of Chain rules Example 4.3.1. Given the following CFG S  ACA | CA | AA | AC | A | C |  A  aAa | aa | B | CB  bB | bC  cC | c Using Algorithm 4.3.1, we generate CHAIN(S) = { S, A, C, B } CHAIN(A) = { A, C, B } CHAIN(B) = { B } CHAIN(C) = { C } Using theseCHAIN sets, we generate GC, where PC GC is S  ACA | CA | AA | AC | aAa | aa | bB | b | cC | c |  A  aAa | aa | bB | b | cC | c B  bB | b C  cC | c

  15. 4.7 Removal of Direct Left Recursion • Recursion is necessary to generate strings of arbitrary length. Left recursion causes problems, not recursion in general • Directly left recursive rules (e.g. A  Aa) introduce the possibility of unending computations since • Repeated applications of them fail to generate a prefix that can terminate the parse. • To avoid the possibility of a non-terminating parse, directly left recursive rules must be removed. • Example. Direct left-recursive rulesNo direct left-recursive rules G1: A Aa | b G2: A Aa | Ab | b | c G3: A AB | BA | a G1’:A bZ | b , Z aZ | a G2’: A bZ | cZ | b | c Z aZ | bZ | a | b G3’:A BAZ | aZ | BA | a Z BZ | B B b | c B b | c

  16. 4.7 Removal of Direct Left Recursion • The removal of any direct left recursion requires the addition of a new variable V, whichintroduces a set of directly right recursive rules • To remove direct left recursion on variable A, A is divided into • directly left recursive rules: A Au1 | Au2 | … | Auj • non-directly left recursive rules: A v1 | v2 | … | vk, where A is not the first symbol in vi (1  i  k) • Strategy: build a string in a left-to-right manner by applying (i) non-recursive rules first, followed by (ii) constructing the remaining symbols by right recursion. A v1 | … | vk|v1Z | … | vkZ Z  u1 | … | uj|u1Z | … | ujZ • Example 4.5.1 Given the rules A Aa | Aab | bb | b The A rules are: A bb | b | bbZ | bZ The Z rules are: Z a | ab | aZ | abZ

More Related