The CYK Parsing Method

Chiyo Hotani, Tanya Petrova. CL2 Parsing Course, 28 November 2007.


  1. The CYK Parsing Method. Chiyo Hotani, Tanya Petrova. CL2 Parsing Course, 28 November 2007.

  2. Overview • CYK Recognition with CF grammar • Basic Algorithm • Problems: unit rules, ε-rules • Recognition with a grammar in CNF • CYK Parsing with CNF • Parsing with CNF • Recognition Table • Chart Parsing • Summary • Advantages and Disadvantages • Other remarks

  3. Basic Algorithm of CYK Recognition (1) • Example grammar: a grammar describing numbers in scientific notation • Input: 32.5e+1

  4. Basic Algorithm of CYK Recognition (2) • Digit -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 • Sign -> + | - • These rules give the derivations of substrings of length 1.

  5. Basic Algorithm of CYK Recognition (3) • NumberS -> Integer | Real • Integer -> Digit | Integer Digit • Digit -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 • These rules also contribute to the derivations of substrings of length 1. • Unit rules: rules of the form A -> B, where A and B are non-terminals. Chains of them can occur in a derivation.
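The effect of unit-rule chains on a length-1 substring can be sketched in a few lines of Python. This is only an illustration, not the presenters' code: the function name `unit_closure` and the dictionary layout are assumptions, and `Number` stands for the start symbol written NumberS on the slides.

```python
# Unit rules (A -> B) from the slides' example grammar.
# "Number" stands for the start symbol (written NumberS on the slides).
UNIT_RULES = {
    "Number": ["Integer", "Real"],
    "Integer": ["Digit"],
}


def unit_closure(nonterminals):
    """Return the given set plus every non-terminal that reaches it via unit rules."""
    result = set(nonterminals)
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        for lhs, targets in UNIT_RULES.items():
            if lhs not in result and any(t in result for t in targets):
                result.add(lhs)
                changed = True
    return result


# The digit "3" is derived by Digit, hence also by Integer and Number.
print(sorted(unit_closure({"Digit"})))  # ['Digit', 'Integer', 'Number']
```

The loop has to run until nothing changes because one unit rule can enable another, which is exactly the chain mentioned on the slide.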

  6. Basic Algorithm of CYK Recognition (4) • NumberS -> Integer | Real • Integer -> Digit | Integer Digit • Fraction -> . Integer • Scale -> e Sign Integer | Empty

  7. Basic Algorithm of CYK Recognition (5) • NumberS -> Integer | Real • Real -> Integer Fraction Scale • Number does indeed derive 32.5e+1.

  8. Basic Algorithm of CYK Recognition (6) • ε-rules

  9. Basic Algorithm of CYK Recognition (7) • $R_\varepsilon = \{\text{Empty}, \text{Scale}\}$ (the non-terminals that derive ε) • Sentence: $z = z_1 z_2 \ldots z_n$ • Substring of $z$ starting at position $i$, of length $l$: $s_{i,l} = z_i z_{i+1} \ldots z_{i+l-1}$ • $R_{s_{i,l}}$: the set of non-terminals deriving the substring $s_{i,l}$ • (Figure: a graphical presentation of substrings)
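As a rough sketch of the two definitions on this slide, the Python below computes the set of ε-deriving non-terminals and extracts a substring by start position and length. The rule `Empty -> ε` is assumed (the slides only imply it), `Number` stands for NumberS, and the names `GRAMMAR`, `nullable` and `substring` are illustrative.

```python
# Example grammar from the slides; each right-hand side is a list of symbols.
GRAMMAR = {
    "Number": [["Integer"], ["Real"]],
    "Integer": [["Digit"], ["Integer", "Digit"]],
    "Real": [["Integer", "Fraction", "Scale"]],
    "Fraction": [[".", "Integer"]],
    "Scale": [["e", "Sign", "Integer"], ["Empty"]],
    "Digit": [[d] for d in "0123456789"],
    "Sign": [["+"], ["-"]],
    "Empty": [[]],  # assumption: Empty -> ε, written as an empty right-hand side
}


def nullable(grammar):
    """Return the set of non-terminals that derive the empty string."""
    result = set()
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            if lhs in result:
                continue
            # A non-terminal is nullable if all symbols of some alternative are.
            if any(all(sym in result for sym in rhs) for rhs in alternatives):
                result.add(lhs)
                changed = True
    return result


def substring(z, i, l):
    """Return s_{i,l}: the substring of z starting at 1-based position i, of length l."""
    return z[i - 1:i - 1 + l]


print(sorted(nullable(GRAMMAR)))   # ['Empty', 'Scale']
print(substring("32.5e+1", 2, 4))  # 2.5e
```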

  10. CYK recognition with a grammar in CNF • Required restrictions: • Eliminate ε-rules and unit rules • Limit the maximum length of the right-hand side of a rule to 2 • CNF (Chomsky Normal Form): • No ε-rules and no unit rules • All rules have one of the following two forms: A -> a or A -> B C
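The two permitted rule forms can be checked mechanically. The following is a minimal sketch, assuming a grammar stored as a dictionary from non-terminal to a list of right-hand sides; the function name `is_cnf` and the two small test grammars are made up for illustration.

```python
def is_cnf(grammar):
    """Check that every rule has the form A -> a or A -> B C."""
    nonterminals = set(grammar)
    for alternatives in grammar.values():
        for rhs in alternatives:
            terminal_rule = len(rhs) == 1 and rhs[0] not in nonterminals
            binary_rule = len(rhs) == 2 and all(s in nonterminals for s in rhs)
            if not (terminal_rule or binary_rule):
                return False  # ε-rule, unit rule, mixed rule, or too long
    return True


# Fragment with a unit rule (Integer -> Digit): not in CNF.
with_unit_rule = {
    "Integer": [["Digit"], ["Integer", "Digit"]],
    "Digit": [["0"], ["1"]],
}

# Same fragment with the unit rule replaced by terminal rules: in CNF.
without_unit_rule = {
    "Integer": [["0"], ["1"], ["Integer", "Digit"]],
    "Digit": [["0"], ["1"]],
}

print(is_cnf(with_unit_rule))     # False
print(is_cnf(without_unit_rule))  # True
```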

  11. Our example grammar in CNF

  12. CYK Parsing with CNF • Building the recognition table • Input: our example grammar in CNF and the input sentence 32.5e+1

  13. CYK Parsing with CNF • Bottom row: read directly from the grammar (rules of the form A -> a)

  14. Two Ways to Compute $R_{s_{i,l}}$: • check each right-hand side • compute possible right-hand sides from the recognition table

  15. How this is done • Example: 2.5e ($= s_{2,4}$) • 1) Checking a right-hand side such as N1 Scale': N1 is not in $R_{s_{2,1}}$ or $R_{s_{2,2}}$; N1 is a member of $R_{s_{2,3}}$, but Scale' is not a member of $R_{s_{5,1}}$ • 2) $R_{s_{2,4}}$ is the set of non-terminals that have a right-hand side A B where either: A in $R_{s_{2,1}}$ and B in $R_{s_{3,3}}$, or A in $R_{s_{2,2}}$ and B in $R_{s_{4,2}}$, or A in $R_{s_{2,3}}$ and B in $R_{s_{5,1}}$ • Possible combinations: N1 T2 or Number T2 • Our grammar has no such right-hand side, so nothing is added to $R_{s_{2,4}}$.
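The second way (computing possible right-hand sides from the table) can be sketched as a small Python recognizer. The CNF grammar below is an assumption: the slide showing it did not survive in this transcript, so it is reconstructed around the non-terminal names the slides mention (Number, N1, T2, Scale'); the remaining names (T1, N2) and the function `cyk_table` are hypothetical.

```python
from itertools import product

# Assumed CNF version of the example grammar (not copied from the slide).
# Terminal rules A -> a, indexed by the terminal a.
TERMINAL_RULES = {d: {"Number", "Integer", "Digit"} for d in "0123456789"}
TERMINAL_RULES.update({".": {"T1"}, "e": {"T2"}, "+": {"Sign"}, "-": {"Sign"}})

# Binary rules A -> B C, indexed by the pair (B, C).
BINARY_RULES = {
    ("Integer", "Digit"): {"Number", "Integer"},
    ("Integer", "Fraction"): {"Number", "N1"},
    ("N1", "Scale'"): {"Number"},
    ("T1", "Integer"): {"Fraction"},
    ("N2", "Integer"): {"Scale'"},
    ("T2", "Sign"): {"N2"},
}


def cyk_table(z):
    """Build the recognition table: table[i, l] is R_{s_{i,l}} (i is 1-based)."""
    n = len(z)
    table = {}
    # Bottom row: substrings of length 1, read directly from the rules A -> a.
    for i in range(1, n + 1):
        table[i, 1] = set(TERMINAL_RULES.get(z[i - 1], ()))
    # Longer substrings: try every split into a left part of length k and a rest.
    for l in range(2, n + 1):
        for i in range(1, n - l + 2):
            cell = set()
            for k in range(1, l):
                for b, c in product(table[i, k], table[i + k, l - k]):
                    cell |= BINARY_RULES.get((b, c), set())
            table[i, l] = cell
    return table


table = cyk_table("32.5e+1")
print(sorted(table[2, 3]))       # ['N1', 'Number']  (substring 2.5)
print(table[2, 4])               # set()             (substring 2.5e)
print("Number" in table[1, 7])   # True: the whole input is recognized
```

Under this assumed grammar the cell for 2.5e stays empty, matching the slide: the only candidate pairs are N1 T2 and Number T2, and neither is a right-hand side.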

  16. Recognition table (figure; axes: substring length l and start position i)

  17. As a result we find out that: • This process is much less complicated than the one we saw before

  18. Reasons • We do not have to repeat the process again and again until no new non-terminals are added to $R_{s_{i,l}}$ (the substrings we are dealing with are proper substrings and cannot be equal to the string we start with) • We only have to find one place where the substring must be split into two: for a rule A -> B C, the point between the part derived by B and the part derived by C

  19. Chart Parsing A chart is just a recognition table.

  20. A short retrospective of CYK • First: recognition table using the original grammar. • Then: transforming grammar to CNF.

  21. A short retrospective of CYK cont. • CNF is useful for improving the efficiency, but it is actually a bit too restrictive • Disadvantage of CNF: • The resulting recognition table lacks the information we need to construct a derivation using the original grammar!

  22. A short retrospective of CYK cont. • In the transformation process, some non-terminals were thrown away (non-productive) • Missing information could be added.

  23. A short retrospective of CYK cont. • Result: almost the same recognition table. • Extra information on non-terminals • Obtained in a simpler and much more efficient way.

  24. Thank you for your attention!
