1. The CYK Parsing Method
Chiyo Hotani
Tanya Petrova
CL2 Parsing Course
28 November, 2007
2. Overview
CYK Recognition with CF grammar
Basic Algorithm
Problems: unit rules, ε-rules
Recognition with a grammar in CNF
CYK Parsing with CNF
Parsing with CNF
Recognition Table
Chart Parsing
Summary
Advantages and Disadvantages
Other remarks
3. Basic Algorithm of CYK Recognition (1)
Example Grammar:
A grammar describing numbers in scientific notation
Input: 32.5e+1
4. Basic Algorithm of CYK Recognition (2)
5. Basic Algorithm of CYK Recognition (3)
Number_S -> Integer | Real
Integer -> Digit | Integer Digit
Digit -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
6. Basic Algorithm of CYK Recognition (4)
Number_S -> Integer | Real
Integer -> Digit | Integer Digit
Fraction -> . Integer
Scale -> e Sign Integer | Empty
7. Basic Algorithm of CYK Recognition (5)
Number_S -> Integer | Real
Real -> Integer Fraction Scale
8. Basic Algorithm of CYK Recognition (6)
9. Basic Algorithm of CYK Recognition (7)
R_ε = { Empty, Scale }
Sentence: z = z_1 z_2 ... z_n
Substring of z starting at position i, of length l: s_{i,l} = z_i z_{i+1} ... z_{i+l-1}
R_{s_{i,l}}: the set of non-terminals deriving the substring s_{i,l}
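The set R_ε of non-terminals that derive the empty string can be computed by a simple fixed-point iteration. The following is a minimal sketch, not the presenters' code. The dictionary layout and the function name `nullable_nonterminals` are illustrative choices, the start symbol is written simply as `Number`, and the rules for `Sign` and `Empty` do not appear on the slides and are assumed here.

```python
# Example grammar as a dict: non-terminal -> list of right-hand sides.
# An empty tuple () stands for the empty string (epsilon).
GRAMMAR = {
    "Number": [("Integer",), ("Real",)],
    "Integer": [("Digit",), ("Integer", "Digit")],
    "Real": [("Integer", "Fraction", "Scale")],
    "Fraction": [(".", "Integer")],
    "Scale": [("e", "Sign", "Integer"), ("Empty",)],
    "Digit": [(d,) for d in "0123456789"],
    "Sign": [("+",), ("-",)],  # assumed, not shown on the slides
    "Empty": [()],             # assumed: Empty -> epsilon
}


def nullable_nonterminals(grammar):
    """Return the set of non-terminals that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:  # repeat until no new non-terminal is added
        changed = False
        for lhs, alternatives in grammar.items():
            if lhs in nullable:
                continue
            # lhs is nullable if some right-hand side consists only of
            # nullable symbols (trivially true for the empty right-hand side)
            if any(all(sym in nullable for sym in rhs) for rhs in alternatives):
                nullable.add(lhs)
                changed = True
    return nullable


print(sorted(nullable_nonterminals(GRAMMAR)))  # ['Empty', 'Scale']
```

Under these assumptions the result agrees with the set R_ε given on the slide.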
10. CYK recognition with a grammar in CNF
Required restrictions:
Eliminate ε-rules and unit rules
Limit the maximum length of the right-hand side of a rule to 2
CNF (Chomsky Normal Form):
No ε-rules and no unit rules
All rules have one of the following two forms: A -> a or A -> B C
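The two permitted rule forms are easy to check mechanically. The sketch below is only an illustration: the function name `is_cnf` and the dictionary representation are assumptions, and any symbol that never appears as a left-hand side is treated as a terminal.

```python
def is_cnf(grammar):
    """Return True if every rule has the form A -> a or A -> B C."""
    nonterminals = set(grammar)
    for alternatives in grammar.values():
        for rhs in alternatives:
            # Form A -> a: exactly one terminal
            if len(rhs) == 1 and rhs[0] not in nonterminals:
                continue
            # Form A -> B C: exactly two non-terminals
            if len(rhs) == 2 and all(sym in nonterminals for sym in rhs):
                continue
            return False  # epsilon-rule, unit rule, or right-hand side too long
    return True


DIGITS = [(d,) for d in "0123456789"]

# Fragment taken from the example grammar: contains the unit rule Integer -> Digit.
with_unit_rule = {
    "Integer": [("Digit",), ("Integer", "Digit")],
    "Digit": DIGITS,
}

# Hypothetical rewrite of the same fragment with the unit rule expanded.
without_unit_rule = {
    "Integer": DIGITS + [("Integer", "Digit")],
    "Digit": DIGITS,
}

print(is_cnf(with_unit_rule))     # False
print(is_cnf(without_unit_rule))  # True
```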
11. Our example grammar in CNF
12. CYK Parsing with CNF
Building the recognition table
Input:
Our example grammar in CNF
input sentence: 32.5 e + 1
13. CYK Parsing with the CNF
Bottom row: read directly from the grammar (rules of the form A -> a)
14. Two Ways to Compute R_{s_{i,l}}:
check each right-hand side
compute possible right-hand sides from the recognition table
15. How this is done
Example: 2.5e (= s_{2,4})
1) Checking each right-hand side, for example N1 Scale':
N1 is not in R_{s_{2,1}} or R_{s_{2,2}}
N1 is a member of R_{s_{2,3}}
But Scale' is not a member of R_{s_{5,1}}
2) R_{s_{2,4}} is the set of non-terminals that have a right-hand side A B where either:
A in R_{s_{2,1}} and B in R_{s_{3,3}}
A in R_{s_{2,2}} and B in R_{s_{4,2}}
A in R_{s_{2,3}} and B in R_{s_{5,1}}
Possible combinations: N1 T2 or Number T2
Our grammar has no rule with such a right-hand side, so nothing is added to R_{s_{2,4}}.
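The second way of computing a table entry (combining entries for the two parts of every split) can be sketched as a small recognizer. This is a hedged illustration, not the presenters' implementation. The CNF grammar below is an assumption: the slide that showed it did not survive, so the rules are reconstructed to be consistent with the non-terminals named above (Number, N1, T2, Scale'). The names `CNF_RULES` and `cyk_table` are hypothetical.

```python
DIGITS = "0123456789"

# Assumed CNF version of the example grammar, as (lhs, rhs) pairs.
CNF_RULES = [
    ("Number", ("Integer", "Digit")),
    ("Number", ("N1", "Scale'")),
    ("Number", ("Integer", "Fraction")),
    ("N1", ("Integer", "Fraction")),
    ("Integer", ("Integer", "Digit")),
    ("Fraction", ("T1", "Integer")),
    ("Scale'", ("N2", "Integer")),
    ("N2", ("T2", "Sign")),
    ("T1", (".",)),
    ("T2", ("e",)),
    ("Sign", ("+",)),
    ("Sign", ("-",)),
]
CNF_RULES += [(nt, (d,)) for nt in ("Number", "Integer", "Digit") for d in DIGITS]


def cyk_table(rules, sentence):
    """Build the recognition table: table[i, l] is the set of non-terminals
    deriving the substring of length l starting at position i (1-based)."""
    n = len(sentence)
    table = {}
    # Bottom row: read directly from the rules of the form A -> a.
    for i in range(1, n + 1):
        table[i, 1] = {lhs for lhs, rhs in rules if rhs == (sentence[i - 1],)}
    # Longer substrings: try every split into a left part of length k
    # and a right part of length l - k.
    for l in range(2, n + 1):
        for i in range(1, n - l + 2):
            cell = set()
            for k in range(1, l):
                left, right = table[i, k], table[i + k, l - k]
                for lhs, rhs in rules:
                    if len(rhs) == 2 and rhs[0] in left and rhs[1] in right:
                        cell.add(lhs)
            table[i, l] = cell
    return table


table = cyk_table(CNF_RULES, "32.5e+1")
print(sorted(table[2, 3]))       # ['N1', 'Number']
print(table[2, 4])               # set()
print("Number" in table[1, 7])   # True
```

With this assumed grammar, the entry for 2.5e is empty, as in the worked example, and the start symbol Number appears in the entry for the whole sentence, so the input is recognized.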
16. Recognition table
17. As a result we find out that:
This process is much less complicated than the one we saw before
18. Reasons
We do not have to repeat the process again and again until no new non-terminals are added to R_{s_{i,l}}
(the substrings we are dealing with are proper substrings and cannot be equal to the string we started with)
We only have to find one place where the substring must be split in two: A -> B C
19. Chart Parsing
20. A short retrospective of CYK
First: recognition table using the original grammar.
Then: transforming grammar to CNF.
21. A short retrospective of CYK cont.
CNF is useful for improving efficiency, but it is actually a bit too restrictive
Disadvantage of CNF:
Resulting recognition table lacks the information we need to construct a derivation using the original grammar!
22. A short retrospective of CYK cont.
In the transformation process, some non-terminals (the non-productive ones) were thrown away
Missing information could be added.
23. A short retrospective of CYK cont.
Result: almost the same recognition table.
Extra information on non-terminals
Obtained in a simpler and much more efficient way.
24. Thank you for your attention!