CSA3050: NLP Algorithms

CSA3050: NLP Algorithms. Parsing Algorithms 2: Problems with DFTD Parser; Earley Parsing Algorithm. Problems with DFTD Parser: Left Recursion, Ambiguity, Inefficiency.

isaura

Presentation Transcript


  1. CSA3050: NLP Algorithms Parsing Algorithms 2 Problems with DFTD Parser Earley Parsing Algorithm CSA3050: Parsing Algorithms 2

  2. Problems with DFTD Parser • Left Recursion • Ambiguity • Inefficiency

  3. Left Recursion • A grammar is left recursive if it contains at least one non-terminal A for which A ⇒* α A β and α ⇒* ε • Intuitive idea: derivation of that category includes itself along its leftmost branch. • NP → NP PP • NP → NP and NP • NP → DetP Nominal • DetP → NP 's

  4. Infinite Search

  5. Dealing with Left Recursion • Reformulate the grammar: rewrite A → A β | α as A → α A', A' → β A' | ε • Disadvantage: different (and probably unnatural) parse trees. • Use a different parse algorithm
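
The reformulation above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name and the tuple-based rule encoding are assumptions, and it handles only immediate left recursion (A → A β), not recursion that passes through other non-terminals.

```python
def remove_immediate_left_recursion(nonterminal, alternatives):
    """Rewrite A -> A b | a  as  A -> a A',  A' -> b A' | epsilon.

    `alternatives` is a list of right-hand sides, each a tuple of symbols.
    The empty tuple () stands for epsilon. (Encoding chosen for this sketch.)
    """
    # The "beta" parts: what follows the left-recursive occurrence of A.
    betas = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # The "alpha" parts: alternatives that do not start with A.
    alphas = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not betas:
        return {nonterminal: list(alternatives)}  # nothing to do
    new_nt = nonterminal + "'"
    return {
        nonterminal: [alpha + (new_nt,) for alpha in alphas],
        new_nt: [beta + (new_nt,) for beta in betas] + [()],
    }


rules = remove_immediate_left_recursion("NP", [("NP", "PP"), ("Det", "Nominal")])
print(rules)
# {'NP': [('Det', 'Nominal', "NP'")], "NP'": [('PP', "NP'"), ()]}
```

The output shows the disadvantage noted on the slide: the new NP' category has no linguistic motivation, so the resulting trees differ from the ones the original grammar would assign.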

  6. Ambiguity • Coordination Ambiguity: different scope of conjunction: Black cats and dogs like to play • Attachment Ambiguity: a constituent can be added to the parse tree in different places: I shot an elephant in my pyjamas • VP → VP PP • NP → NP PP

  7. Catalan Numbers The nth Catalan number counts the ways of dissecting a polygon with n+2 sides into triangles by drawing nonintersecting diagonals.
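
To give a feel for how quickly these numbers grow, here is a small Python sketch using the closed form C(n) = (2n choose n) / (n + 1). The function name is an assumption made for illustration. The same sequence counts the ways of fully bracketing n+1 items, which is why it appears in discussions of structural ambiguity.

```python
from math import comb  # available in Python 3.8+


def catalan(n):
    """Return the nth Catalan number via the binomial closed form."""
    return comb(2 * n, n) // (n + 1)


print([catalan(n) for n in range(8)])
# [1, 1, 2, 5, 14, 42, 132, 429]
```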

  8. Handling Disambiguation • Statistical disambiguation • Semantic knowledge.

  9. Repeated Parsing of Subtrees

  10. Earley Algorithm • Dynamic Programming: solution involves filling in table of solutions to subproblems. • Parallel Top Down Search • Worst case complexity = O(N³) in length N of sentence. • Table, called a chart, contains N+1 entries, one per position between words: ● book ● that ● flight ● (positions 0, 1, 2, 3)

  11. The Chart • Each table entry contains a list of states • Each state represents all partial parses that have been reached so far at that point in the sentence. • States are represented using dotted rules containing information about • Rule/subtree: which rule has been used • Progress: dot indicates how much of rule's RHS has been recognised. • Position: text segment to which this parse applies

  12. Examples of Dotted Rules • Initial S Rule: S → ● VP, [0,0] • Partially recognised NP: NP → Det ● Nominal, [1,2] • Fully recognised VP: VP → V NP ●, [0,3] • These states can also be represented graphically
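
A dotted rule can be represented as a small record holding the rule, the dot position, and the span. The sketch below is one possible encoding, not taken from the slides; the class name `State` and its field names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    lhs: str      # left-hand side of the rule
    rhs: tuple    # right-hand side symbols
    dot: int      # how many RHS symbols have been recognised
    start: int    # position where this constituent begins
    end: int      # position reached so far

    def is_complete(self):
        """True when nothing is left after the dot."""
        return self.dot == len(self.rhs)

    def next_symbol(self):
        """Category expected next, or None if the state is complete."""
        return None if self.is_complete() else self.rhs[self.dot]

    def __str__(self):
        symbols = list(self.rhs)
        symbols.insert(self.dot, "●")
        return f"{self.lhs} → {' '.join(symbols)}, [{self.start},{self.end}]"


print(State("S", ("VP",), 0, 0, 0))               # S → ● VP, [0,0]
print(State("NP", ("Det", "Nominal"), 1, 1, 2))   # NP → Det ● Nominal, [1,2]
```

Making the record frozen (immutable and hashable) is a design choice for this sketch: it lets states be compared and stored in sets, which is convenient when checking whether a state is already in the chart.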

  13. The Chart

  14. Earley Algorithm • Main Algorithm: proceeds through each text position, applying one of the three operators below. • Predictor: Creates "initial states" (i.e. states whose RHS is completely unparsed). • Scanner: checks current input when next category to be recognised is pre-terminal. • Completer: when a state is "complete" (nothing after dot), advance all states to the left that are looking for the associated category.
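
The three operators can be put together in a compact recogniser. The following Python sketch is an illustration under stated assumptions, not the pseudocode from the slides: the toy grammar, lexicon, the dummy start symbol `GAMMA` and all function names are invented for the example; the scanner advances the waiting state directly instead of adding a separate part-of-speech state; and empty (epsilon) productions are not supported.

```python
# Toy grammar and lexicon for "book that flight" (assumed for this sketch).
GRAMMAR = {
    "S": [("VP",)],
    "VP": [("Verb", "NP")],
    "NP": [("Det", "Nominal")],
    "Nominal": [("Noun",)],
}
LEXICON = {"book": {"Verb", "Noun"}, "that": {"Det"}, "flight": {"Noun"}}
POS = {"Verb", "Noun", "Det"}  # pre-terminal categories


def earley_recognise(words, grammar=GRAMMAR, lexicon=LEXICON, pos=POS, start="S"):
    """Return True if `words` can be derived from `start`."""
    # A state is (lhs, rhs, dot, origin); chart[i] holds states ending at i.
    chart = [[] for _ in range(len(words) + 1)]

    def enqueue(state, i):
        if state not in chart[i]:  # never add the same state twice
            chart[i].append(state)

    enqueue(("GAMMA", (start,), 0, 0), 0)
    for i in range(len(words) + 1):
        j = 0
        while j < len(chart[i]):  # chart[i] may grow while it is processed
            lhs, rhs, dot, origin = chart[i][j]
            j += 1
            if dot < len(rhs):
                nxt = rhs[dot]
                if nxt in pos:
                    # Scanner: check the current word against the category.
                    if i < len(words) and nxt in lexicon.get(words[i], ()):
                        enqueue((lhs, rhs, dot + 1, origin), i + 1)
                else:
                    # Predictor: add initial states for the expected category.
                    for expansion in grammar.get(nxt, ()):
                        enqueue((nxt, expansion, 0, i), i)
            else:
                # Completer: advance states that were waiting for `lhs`.
                for l2, r2, d2, o2 in list(chart[origin]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        enqueue((l2, r2, d2 + 1, o2), i)
    return ("GAMMA", (start,), 1, 0) in chart[len(words)]


print(earley_recognise(["book", "that", "flight"]))  # True
print(earley_recognise(["that", "book"]))            # False
```

The duplicate check in `enqueue` is what keeps prediction finite on left-recursive rules such as NP → NP PP: the predicted state is already in the chart, so it is not added again.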

  15. Earley Algorithm – Main Function

  16. Earley Algorithm – Sub Functions

  20. Retrieving Trees • To turn recogniser into a parser, representation of each state must also include information about completed states that generated its constituents.
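
As a rough illustration of the idea, the sketch below shows how a tree could be read off once each completed state carries pointers to the completed states that built it. The class name `Completed`, its fields, and the hand-built example are assumptions for this sketch; a real parser would fill in the pointers inside the completer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Completed:
    label: str            # category name, or the word itself for a leaf
    children: tuple = ()  # back pointers to the completed child states


def to_tree(state):
    """Follow the back pointers recursively to build a nested-tuple tree."""
    if not state.children:
        return state.label
    return (state.label,) + tuple(to_tree(child) for child in state.children)


# Hand-built pointers for "book that flight" (what a completer would record).
verb = Completed("Verb", (Completed("book"),))
det = Completed("Det", (Completed("that"),))
nominal = Completed("Nominal", (Completed("Noun", (Completed("flight"),)),))
np = Completed("NP", (det, nominal))
s = Completed("S", (Completed("VP", (verb, np)),))

print(to_tree(s))
# ('S', ('VP', ('Verb', 'book'), ('NP', ('Det', 'that'), ('Nominal', ('Noun', 'flight')))))
```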

  22. Chart[3] ↑ Extra Field
