Coarse-to-Fine Efficient Viterbi Parsing

Nathan Bodenstab. OGI RPE Presentation, May 8, 2006.

Presentation Transcript


  1. Coarse-to-Fine Efficient Viterbi Parsing Nathan Bodenstab OGI RPE Presentation May 8, 2006

  2. Outline • What is Natural Language Parsing? • Data Driven Parsing • Hypergraphs and Parsing Algorithms • High Accuracy Parsing • Coarse-to-Fine • Empirical Results

  3. What is Natural Language Parsing? • Provides a sentence with syntactic information by hierarchically clustering and labeling its constituents. • A constituent is a group of one or more words that function together as a unit.

  5. Why Parse Sentences? • Syntactic structure is useful in • Speech Recognition • Machine Translation • Language Understanding • Word Sense Disambiguation (e.g., “bottle”) • Question-Answering • Document Summarization

  6. Outline • What is Natural Language Parsing? • Data Driven Parsing • Hypergraphs and Parsing Algorithms • High Accuracy Parsing • Coarse-to-Fine • Empirical Results

  7. Data Driven Parsing • Parsing = Grammar + Algorithm • Probabilistic Context-Free Grammar P(children=[Determiner, Adjective, Noun] | parent=NounPhrase)
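
To make the rule probability on this slide concrete, here is a minimal sketch of how P(children | parent) could be estimated from observed rules. It assumes a relative-frequency (maximum likelihood) estimate, which the slide does not state, and the function name `estimate_pcfg` and the toy observations are hypothetical.

```python
from collections import Counter

def estimate_pcfg(rule_observations):
    """Relative-frequency estimate of P(children | parent).

    Assumption: each observation is a (parent, children) pair read off
    a treebank; the slides do not specify the estimator.
    """
    rule_counts = Counter(rule_observations)
    parent_counts = Counter(parent for parent, _ in rule_observations)
    return {
        (parent, children): count / parent_counts[parent]
        for (parent, children), count in rule_counts.items()
    }

# Hypothetical toy data: four NounPhrase expansions.
observations = [
    ("NounPhrase", ("Determiner", "Adjective", "Noun")),
    ("NounPhrase", ("Determiner", "Noun")),
    ("NounPhrase", ("Determiner", "Noun")),
    ("NounPhrase", ("Noun",)),
]
grammar = estimate_pcfg(observations)
print(grammar[("NounPhrase", ("Determiner", "Adjective", "Noun"))])  # 0.25
```

The probabilities of all expansions of one parent sum to one, which is what makes the grammar probabilistic.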

  8. Data Driven Parsing • Find the maximum likelihood parse tree from all grammatically valid candidates. • The probability of a parse tree is the product of all its grammar rule (constituent) probabilities. • The number of grammatically valid parse trees increases exponentially with the length of the sentence.
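
The two quantitative claims on this slide can be sketched in a few lines. The tree encoding (nested tuples), the toy grammar, and both function names are assumptions made for illustration. The count below covers unlabeled binary bracketings only (a Catalan number); labeled trees under a real grammar are at least as numerous whenever the grammar permits every bracketing.

```python
import math

def tree_log_prob(tree, grammar):
    """Log probability of a tree: the sum of the log rule probabilities."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        # Preterminal rule: label -> word
        return math.log(grammar[(label, (children[0],))])
    child_labels = tuple(child[0] for child in children)
    total = math.log(grammar[(label, child_labels)])
    for child in children:
        total += tree_log_prob(child, grammar)
    return total

def count_binary_bracketings(n_words):
    """Number of unlabeled binary trees over n_words leaves (Catalan number)."""
    k = n_words - 1
    return math.comb(2 * k, k) // (k + 1)

# Hypothetical toy grammar and tree.
toy_grammar = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("dogs",)): 0.5,
    ("VP", ("bark",)): 0.25,
}
toy_tree = ("S", ("NP", "dogs"), ("VP", "bark"))
tree_prob = math.exp(tree_log_prob(toy_tree, toy_grammar))  # 1.0 * 0.5 * 0.25
bracketings = [count_binary_bracketings(n) for n in (5, 10)]
```

Working in log space avoids numeric underflow when many small rule probabilities are multiplied.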

  9. Outline • What is Natural Language Parsing? • Data Driven Parsing • Hypergraphs and Parsing Algorithms • High Accuracy Parsing • Coarse-to-Fine • Empirical Results

  10. Hypergraphs • A directed hypergraph can facilitate dynamic programming (Klein and Manning, 2001). • A hyperedge connects a set of tail nodes to a set of head nodes. [Figure: Standard Edge vs. Hyperedge]

  11. Hypergraphs

  12. The CYK Algorithm • Separates the hypergraph into “levels” • Exhaustively traverses every hyperedge, level by level
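
As an illustration of the level-by-level traversal, the following is a minimal Viterbi CYK sketch. It assumes a grammar in Chomsky normal form and treats each span length as one "level"; the data layout, the function names, and the toy grammar are hypothetical and not taken from the presentation.

```python
from collections import defaultdict

def viterbi_cyk(words, lexical, binary):
    """Exhaustive bottom-up search; best[(i, j)][label] = (prob, backpointer)."""
    n = len(words)
    best = defaultdict(dict)
    for i, word in enumerate(words):            # level 1: single words
        for (label, w), p in lexical.items():
            if w == word:
                best[(i, i + 1)][label] = (p, word)
    for span in range(2, n + 1):                # higher levels: wider spans
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):           # every split point
                for (parent, left, right), p in binary.items():
                    if left in best[(i, k)] and right in best[(k, j)]:
                        score = p * best[(i, k)][left][0] * best[(k, j)][right][0]
                        if score > best[(i, j)].get(parent, (0.0, None))[0]:
                            best[(i, j)][parent] = (score, (k, left, right))
    return best

def build_tree(best, i, j, label):
    """Follow backpointers to recover the highest-probability tree."""
    _, back = best[(i, j)][label]
    if isinstance(back, str):
        return (label, back)
    k, left, right = back
    return (label, build_tree(best, i, k, left), build_tree(best, k, j, right))

# Hypothetical toy grammar.
lexical = {("Det", "the"): 1.0, ("N", "dog"): 0.5, ("V", "barks"): 1.0}
binary = {("S", "NP", "V"): 1.0, ("NP", "Det", "N"): 0.8}
words = ["the", "dog", "barks"]
chart = viterbi_cyk(words, lexical, binary)
cyk_prob = chart[(0, 3)]["S"][0]
cyk_tree = build_tree(chart, 0, 3, "S")
```

Every combination of span, split point, and rule is visited regardless of how promising it is, which is why the cost grows with the size of the grammar.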

  13. The A* Algorithm • Maintains a priority queue of traversable hyperedges • Traverses best-first until a complete parse tree is found [Figure: Priority Queue]
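
A best-first traversal with a priority queue might look like the sketch below. It is A* with a trivial heuristic of zero (uniform-cost search), not the specific heuristic used in the presentation, which the transcript does not describe. Each queue entry stands for a hyperedge whose tail nodes are already finished; all names and the toy grammar are hypothetical.

```python
import heapq
import math

def best_first_parse(words, lexical, binary, root="S"):
    """Pop the cheapest item first; stop once the root item over the full span is popped."""
    n = len(words)
    agenda = []                                  # min-heap of (cost, i, j, label)
    for i, word in enumerate(words):
        for (label, w), p in lexical.items():
            if w == word:
                heapq.heappush(agenda, (-math.log(p), i, i + 1, label))
    finished = {}                                # (i, j, label) -> best cost
    while agenda:
        cost, i, j, label = heapq.heappop(agenda)
        if (i, j, label) in finished:
            continue                             # a cheaper copy was already popped
        finished[(i, j, label)] = cost
        if (i, j, label) == (0, n, root):
            return math.exp(-cost), len(finished)
        for (parent, left, right), p in binary.items():
            rule_cost = -math.log(p)
            for (a, b, other), other_cost in list(finished.items()):
                if left == label and other == right and a == j:
                    heapq.heappush(agenda, (cost + rule_cost + other_cost, i, b, parent))
                if right == label and other == left and b == i:
                    heapq.heappush(agenda, (cost + rule_cost + other_cost, a, j, parent))
    return None

# Hypothetical toy grammar.
lexical = {("Det", "the"): 1.0, ("N", "dog"): 0.5, ("V", "barks"): 1.0}
binary = {("S", "NP", "V"): 1.0, ("NP", "Det", "N"): 0.8}
astar_prob, items_popped = best_first_parse(["the", "dog", "barks"], lexical, binary)
```

Because costs are negative log probabilities and therefore non-negative, the first time an item is popped its cost is already optimal, so the search can stop as soon as the root item appears.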

  14. Outline • What is Natural Language Parsing? • Data Driven Parsing • Hypergraphs and Parsing Algorithms • High Accuracy Parsing • Coarse-to-Fine • Empirical Results

  15. High(er) Accuracy Parsing • Modify the Grammar to include more context • (Grand) Parent Annotation (Johnson, 1998) P(children=[Determiner, Adjective, Noun]|parent=NounPhrase, grandParent=Sentence)
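
Parent annotation can be sketched as a tree transform that appends each node's parent label to its own label, so that rules read off the transformed trees are conditioned on the grandparent. The `^` separator, the nested-tuple tree encoding, and the choice to annotate preterminals as well are simplifying assumptions, not details taken from the slides.

```python
def annotate_parents(tree, parent_label=None):
    """Return a copy of the tree with every non-root label rewritten as label^parent."""
    label, *children = tree
    new_label = f"{label}^{parent_label}" if parent_label else label
    if len(children) == 1 and isinstance(children[0], str):
        return (new_label, children[0])          # preterminal over a word
    return (new_label, *(annotate_parents(child, label) for child in children))

# Hypothetical toy tree.
plain_tree = ("S", ("NP", ("Det", "the"), ("N", "dog")), ("V", "barks"))
annotated_tree = annotate_parents(plain_tree)
# The rule NP -> Det N becomes NP^S -> Det^NP N^NP.
```

Every original non-terminal is split into one symbol per parent context, which is the source of the larger search space.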

  16. Increased Search Space [Figure: Original Grammar vs. Parent Annotated Grammar]

  21. Grammar Comparison • Exact inference with the CYK algorithm becomes intractable. • Most algorithms using lexical models resort to greedy search strategies. • We want to find the globally optimal (Viterbi) parse tree for these high-accuracy models efficiently.

  22. Outline • What is Natural Language Parsing? • Data Driven Parsing • Hypergraphs and Parsing Algorithms • High Accuracy Parsing • Coarse-to-Fine • Empirical Results

  23. Coarse-to-Fine • Efficiently find the optimal parse tree of a large, context-enriched model (Fine) by following hyperedges suggested by solutions of a simpler model (Coarse). • To evaluate the feasibility of Coarse-to-Fine, we use • Coarse = WSJ • Fine = Parent

  24. Increased Search Space [Figure: Coarse Grammar vs. Fine Grammar]

  25. Coarse-to-Fine Build Coarse hypergraph

  26. Coarse-to-Fine Choose a Coarse hyperedge

  27. Coarse-to-Fine Replace the Coarse hyperedge with Fine hyperedge (modifies probability)

  28. Coarse-to-Fine Propagate probability difference

  29. Coarse-to-Fine Repeat until optimal parse tree has only Fine hyperedges
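
The loop described on slides 25 to 29 can be sketched in a heavily simplified form. The sketch assumes each Coarse score is an upper bound on the matching Fine score, replaces the hypergraph with an explicit list of candidate trees, and re-scores everything on each pass instead of propagating probability differences. All names and numbers are hypothetical.

```python
import math

def coarse_to_fine(candidate_trees, coarse, fine):
    """Refine hyperedges of the current best tree until that tree is all-Fine.

    candidate_trees: tuples of hyperedge ids (a stand-in for the hypergraph).
    coarse, fine: hyperedge id -> score, with coarse[e] >= fine[e] assumed.
    """
    refined = set()

    def score(tree):
        return math.prod(fine[e] if e in refined else coarse[e] for e in tree)

    while True:
        best = max(candidate_trees, key=score)
        pending = [e for e in best if e not in refined]
        if not pending:
            # Every other tree is scored optimistically, so none can beat this one.
            return best, score(best), refined
        refined.add(pending[0])                  # replace one Coarse hyperedge

# Hypothetical toy instance with three candidate trees over five hyperedges.
trees = [("a", "b"), ("a", "c"), ("d", "e")]
coarse_scores = {"a": 0.9, "b": 0.8, "c": 0.5, "d": 0.6, "e": 0.5}
fine_scores = {"a": 0.5, "b": 0.4, "c": 0.5, "d": 0.6, "e": 0.5}
best_tree, best_score, refined_edges = coarse_to_fine(trees, coarse_scores, fine_scores)
```

In this toy run the hyperedge `c` is never refined, which mirrors the intended saving: Fine scores are only computed for hyperedges that appear in a tree that was, at some point, the best candidate.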

  30. Upper-Bound Grammar • Replacing a Coarse hyperedge with a Fine hyperedge can increase or decrease its probability. • Once we have found a parse tree with only Fine hyperedges, how can we be sure it is optimal? • Modify the probability of Coarse grammar rules to be an upper-bound on the probability of Fine grammar rules. [Equation not preserved in this transcript; in it, N is the set of non-terminals and the remaining symbol, also not preserved, is a grammar rule.]
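
The formula on this slide did not survive in the transcript. One plausible reading, offered as an assumption and not as the author's definition, is that each Coarse rule receives the maximum probability among the Fine rules that project onto it once annotations are stripped. The `^` annotation convention and the function names below are hypothetical.

```python
def strip_annotation(symbol):
    """Map a Fine symbol such as 'NP^S' back to its Coarse symbol 'NP'."""
    return symbol.split("^")[0]

def upper_bound_grammar(fine_rules):
    """Coarse score = max over the Fine rules that project onto the same Coarse rule."""
    coarse = {}
    for (parent, children), p in fine_rules.items():
        key = (strip_annotation(parent), tuple(strip_annotation(c) for c in children))
        coarse[key] = max(coarse.get(key, 0.0), p)
    return coarse

# Hypothetical Fine (parent-annotated) rules.
fine_rules = {
    ("NP^S", ("Det", "N")): 0.6,
    ("NP^S", ("N",)): 0.4,
    ("NP^VP", ("Det", "N")): 0.3,
    ("NP^VP", ("N",)): 0.7,
}
coarse_rules = upper_bound_grammar(fine_rules)
# coarse_rules[("NP", ("Det", "N"))] is 0.6 and coarse_rules[("NP", ("N",))] is 0.7
```

Under this reading the Coarse scores for one parent can sum to more than one (1.3 here), so they are bounds and no longer a probability distribution. That is acceptable for the search, since only the guarantee that no Fine rule exceeds its Coarse score is needed.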

  31. Outline • What is Natural Language Parsing? • Data Driven Parsing • Hypergraphs and Parsing Algorithms • High Accuracy Parsing • Coarse-to-Fine • Empirical Results

  32. Results

  33. Summary & Future Research • Coarse-to-Fine is a new exact inference algorithm to efficiently traverse a large hypergraph space by using the solutions of simpler models. • Full probability propagation through the hypergraph hinders computational performance. • Full propagation is not necessary; lower-bound of log2(n) operations. • Over 95% reduction in search space compared to baseline CYK algorithm. • Should prune even more space with higher-accuracy (Lexical) models.

  34. Thanks

  35. Choosing a Coarse Hyperedge: Top-Down vs. Bottom-Up

  36. Top-Down vs. Bottom-Up • Top-Down • Traverses more hyperedges • Hyperedges are closer to the root • Requires less propagation (1/2) • Bottom-Up • Traverses fewer hyperedges • Hyperedges are near the leaves (words) and shared by many trees • True probability of trees isn’t known at the beginning of CTF

  37. Coarse-to-Fine Motivation [Figure: Optimal Fine Tree vs. Optimal Coarse Tree]
