1 / 45

LINGUISTICA GENERALE E COMPUTAZIONALE

LINGUISTICA GENERALE E COMPUTAZIONALE. ANALISI SINTATTICA (PARSING). RECAP: CFG. Bird et al, ch . 8.3. PARSING . Parsing is the process of recognizing and assigning STRUCTURE Parsing a string with a CFG: Finding a derivation of the string consistent with the grammar

edita
Download Presentation

LINGUISTICA GENERALE E COMPUTAZIONALE

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. LINGUISTICA GENERALE E COMPUTAZIONALE ANALISI SINTATTICA (PARSING)

  2. RECAP: CFG • Bird et al, ch. 8.3

  3. PARSING • Parsing is the process of recognizing and assigning STRUCTURE • Parsing a string with a CFG: • Finding a derivation of the string consistent with the grammar • The derivation gives us a PARSE TREE

  4. EXAMPLE (CFR LAST WEEK)

  5. PARSING AS SEARCH • Just as in the case of non-deterministic regular expressions, the main problem with parsing is the existence of CHOICE POINTS • There is a need for a SEARCH STRATEGY determining the order in which alternatives are considered

  6. TOP-DOWN AND BOTTOM-UP SEARCH STRATEGIES • The search has to be guided by the INPUT and the GRAMMAR • TOP-DOWN search: the parse tree has to be rooted in the start symbol S • EXPECTATION-DRIVEN parsing • BOTTOM-UP search: the parse tree must be an analysis of the input • DATA-DRIVEN parsing

  7. AN EXAMPLE OF TOP-DOWN SEARCH(IN PARALLEL)

  8. RECURSIVE DESCENT IN NLTK • P. 303: • nltk.RecursiveDescentParser(grammar) • nltk.app.rdparser()

  9. NON-PARALLEL SEARCH • If it’s not possible to examine all alternatives in parallel, it’s necessary to make further decisions: • Which node in the current search space to expand first (breadth-first or depth-first) • Which of the applicable grammar rules to expand first • Which leaf node in a parse tree to expand next (e.g., leftmost)

  10. TOP-DOWN, DEPTH-FIRST, LEFT-TO-RIGHT

  11. TOP-DOWN, DEPTH-FIRST, LEFT-TO-RIGHT (II)

  12. TOP-DOWN, DEPTH-FIRST, LEFT-TO-RIGHT (III)

  13. TOP-DOWN, DEPTH-FIRST, LEFT-TO-RIGHT (IV)

  14. A T-D, D-F, L-R PARSER

  15. LEFT-RECURSION • A LEFT-RECURSIVE grammar may cause a T-D, D-F, L-R parser to never return • Examples of left-recursive rules: • NP  NP PP • S  S and S • But also: • NP  Det Nom • Det  NP’s

  16. THE PROBLEM WITH LEFT-RECURSION

  17. LEFT-RECURSION: POOR SOLUTIONS • Rewrite the grammar to a weakly equivalent one • Problem: may not get correct parse tree • Limit the depth during search • Problem: limit is arbitrary

  18. AN EXAMPLE OF BOTTOM-UP SEARCH

  19. SHIFT-REDUCE PARSING • P. 305 • nltk.app.srparser() • ShiftReduceParser(grammar)

  20. TOP-DOWN vs BOTTOM-UP • TOP-DOWN: • Only search among grammatical answers • BUT: suggests hypotheses that may not be consistent with data • Problem: left-recursion • BOTTOM-UP: • Only forms hypotheses consistent with data • BUT: may suggest hypotheses that make no sense globally

  21. LEFT-CORNER PARSING • A hybrid of top-down and bottom-up parsing • Strategy: don’t consider any expansion unless the current input can serve as the LEFT-CORNER of that expansion

  22. FURTHER PROBLEMS IN PARSING • Ambiguity • Church and Patel (1982): the number of attachment ambiguities grows like the Catalan numbers • C(2) = 2, C(3) = 5, C(4) = 14, C(5) = 132, C(6) = 469, C(7) = 1430, C(8) = 4867 • Avoiding reparsing

  23. COMMON STRUCTURAL AMBIGUITIES • COORDINATION ambiguity • OLD (MEN AND WOMEN) vs (OLD MEN) AND WOMEN • ATTACHMENT ambiguity: • Gerundive VP attachment ambiguity • I saw the Eiffel Tower flying to Paris • PP attachment ambiguity • I shot an elephant in my pajamas

  24. PP ATTACHMENT AMBIGUITY

  25. AMBIGUITY: SOLUTIONS • Use a PROBABILISTIC GRAMMAR (not covered in this module) • Use semantics

  26. AVOID RECOMPUTING INVARIANTS • Consider parsing with a top-down parser the NP: • A flight from Indianapolis to Houston on TWA • With the grammar rules: • NP  Det Nominal • NP  NP PP • NP  ProperNoun

  27. INVARIANTS AND TOP-DOWN PARSING

  28. THE EARLEY ALGORITHM

  29. DYNAMIC PROGRAMMING • A standard T-D parser would reanalyze A FLIGHT 4 times, always in the same way • A DYNAMIC PROGRAMMING algorithm uses a table (the CHART) to avoid repeating work • The Earley algorithm also • Does not suffer from the left-recursion problem • Solves an exponential problem in O(n3)

  30. THE CHART • The Earley algorithm uses a table (the CHART) of size N+1, where N is the length of the input • Table entries sit in the `gaps’ between words • Each entry in the chart is a list of • Completed constituents • In-progress constituents • Predicted constituents • All three types of objects are represented in the same way as STATES

  31. THE CHART: GRAPHICAL REPRESENTATION

  32. STATES • A state encodes two types of information: • How much of a certain rule has been encountered in the input • Which positions are covered • A  , [X,Y] • DOTTED RULES • VP  V NP  • NP  Det  Nominal • S   VP

  33. EXAMPLES

  34. SUCCESS • The parser has succeeded if entry N+1 of the chart contains the state • S   , [0,N]

  35. THE ALGORITHM • The algorithm loops through the input without backtracking, at each step performing three operations: • PREDICTOR: add predictions to the chart • COMPLETER: Move the dot to the right when looked-for constituent is found • SCANNER: read in the next input word

  36. THE ALGORITHM: CENTRAL LOOP

  37. EARLEY ALGORITHM: THE THREE OPERATORS

  38. EXAMPLE, AGAIN

  39. EXAMPLE: BOOK THAT FLIGHT

  40. EXAMPLE: BOOK THAT FLIGHT (II)

  41. EXAMPLE: BOOK THAT FLIGHT (III)

  42. EXAMPLE: BOOK THAT FLIGHT (IV)

  43. CHART PARSING IN NLTK • 8.4, p. 307 ff

  44. DEPENDENCY PARSING • 8.5

  45. READINGS • Bird et al, chapter 8

More Related