Inducing Structure for Perception

Presentation Transcript


  1. Inducing Structure for Perception, a.k.a. Slav’s split&merge Hammer. Slav Petrov. Advisors: Dan Klein, Jitendra Malik. Collaborators: L. Barrett, R. Thibaux, A. Faria, A. Pauls, P. Liang, A. Berg

  2. The Main Idea. A complex underlying process produces an observation ("He was right."); candidate explanations of it: the true structure, a manually specified structure, the MLE structure.

  3. The Main Idea. The same setup: complex underlying process, observation ("He was right."), manually specified structure, but now automatically refined with EM.

  4. Why Structure? Without structure, the sentence is just an unordered bag of words ("the the the food cat dog ate and"), or worse, a bag of letters.

  5. Structure is important: "The dog ate the cat and the food." / "The dog and the cat ate the food." / "The cat ate the food and the dog." The same words, arranged differently, mean very different things.

  6. Syntactic Ambiguity Last night I shot an elephant in my pajamas.

  7. Visual Ambiguity Old or young?

  8. Three Peaks? Machine Learning, Natural Language Processing, Computer Vision.

  9. No, One Mountain! Machine Learning, Natural Language Processing, Computer Vision.

  10. Three Domains: Syntax, Scenes, Speech.

  11. Timeline (’07, ’08, ’09, now). [Figure: timeline of work across Syntax, Scenes, and Speech, covering learning, inference, synthesis, decoding, Bayesian learning, conditional training, syntactic MT, TrecVid, and a summer at ISI.]

  12. Overview. Split & Merge Learning and Coarse-to-Fine Inference, applied to Syntax, Scenes, and Speech; further topics: Non-parametric Bayesian Learning, Syntactic Machine Translation, Generative vs. Conditional Learning, Language Modeling.

  13. Learning Accurate, Compact, and Interpretable Tree Annotation. Slav Petrov, Leon Barrett, Romain Thibaux, Dan Klein

  14. Motivation (Syntax) • Task: parse sentences such as "He was right." • Why? • Information Extraction • Syntactic Machine Translation

  15. Treebank Parsing. Treebank Grammar: S → NP VP . [1.0], NP → PRP [0.5], NP → DT NN [0.5], …, PRP → She [1.0], DT → the [1.0], …
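To make the treebank-grammar idea concrete, here is a minimal Python sketch (my own illustration, not code from the talk; the tuple-based tree format is an assumption) that reads a PCFG off a toy bracketed tree by relative-frequency estimation:

```python
from collections import defaultdict

# A toy "treebank": trees are (label, children) tuples; a preterminal's single
# child is the word string. (Format assumed purely for illustration.)
treebank = [
    ("S", [("NP", [("PRP", ["She"])]),
           ("VP", [("VBD", ["heard"]),
                   ("NP", [("DT", ["the"]), ("NN", ["dog"])])]),
           (".", ["."])]),
]

rule_counts = defaultdict(float)
lhs_counts = defaultdict(float)

def count_rules(node):
    """Count every CFG production in the tree (for maximum-likelihood estimation)."""
    label, children = node
    if len(children) == 1 and isinstance(children[0], str):
        rhs = (children[0],)                         # lexical rule, e.g. DT -> the
    else:
        rhs = tuple(child[0] for child in children)  # phrasal rule, e.g. S -> NP VP .
        for child in children:
            count_rules(child)
    rule_counts[(label, rhs)] += 1
    lhs_counts[label] += 1

for tree in treebank:
    count_rules(tree)

# Treebank grammar: P(A -> beta) = count(A -> beta) / count(A)
for (lhs, rhs), count in sorted(rule_counts.items()):
    print(f"{lhs} -> {' '.join(rhs)}  {count / lhs_counts[lhs]:.1f}")
```

Even on this toy input the estimates match the rules shown on the slide, e.g. NP → PRP and NP → DT NN each receive probability 0.5.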

  16. Non-Independence. Independence assumptions are often too strong. [Figure: the expansion distributions of all NPs, of NPs under S, and of NPs under VP differ markedly.]

  17. The Game of Designing a Grammar • Annotation refines base treebank symbols to improve statistical fit of the grammar • Parent annotation [Johnson ’98]

  18. The Game of Designing a Grammar • Annotation refines base treebank symbols to improve statistical fit of the grammar • Parent annotation [Johnson ’98] • Head lexicalization [Collins ’99, Charniak ’00]

  19. The Game of Designing a Grammar • Annotation refines base treebank symbols to improve statistical fit of the grammar • Parent annotation [Johnson ’98] • Head lexicalization [Collins ’99, Charniak ’00] • Automatic clustering?
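As a small illustration of the first of these strategies, here is a hedged Python sketch of parent annotation [Johnson ’98] (same assumed tree format as in the earlier sketch; not the authors' code): every phrasal symbol is rewritten as SYMBOL^PARENT, so an NP under S becomes NP^S while an NP under VP becomes NP^VP.

```python
def parent_annotate(node, parent="ROOT"):
    """Return a copy of the tree with each phrasal label tagged by its parent's label."""
    label, children = node
    if len(children) == 1 and isinstance(children[0], str):
        return (label, children)       # leave preterminals/words unannotated here
    annotated = [parent_annotate(child, parent=label) for child in children]
    return (f"{label}^{parent}", annotated)

tree = ("S", [("NP", [("PRP", ["She"])]),
              ("VP", [("VBD", ["was"]), ("ADJP", [("JJ", ["right"])])])])
print(parent_annotate(tree))
# ('S^ROOT', [('NP^S', [('PRP', ['She'])]),
#             ('VP^S', [('VBD', ['was']), ('ADJP^VP', [('JJ', ['right'])])])])
# (output wrapped for readability)
```

Parent annotation grows the symbol inventory by a fixed, hand-chosen rule; the split&merge approach of the talk instead lets EM decide how each base symbol should be refined.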

  20. Learning Latent Annotations. EM algorithm: • Brackets are known • Base categories are known • Only induce subcategories. Just like Forward-Backward for HMMs. [Figure: the observed tree over "He was right." with latent subcategory variables X1 … X7 and forward/backward-style passes.]

  21. Inside/Outside Scores, for a refined rule Ax → By Cz • Inside: • Outside: (see the reconstruction below)
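The equations on this slide are images; the node-based recursions they stand for, in the standard latent-annotation form (a reconstruction, not a copy of the slide), are computed per observed tree node n whose children are labeled B and C:

```latex
\begin{align*}
\text{Inside:}\quad
  P_{\mathrm{IN}}(n, A_x) &= \sum_{y, z} P(A_x \rightarrow B_y\, C_z)\;
      P_{\mathrm{IN}}(n_\ell, B_y)\; P_{\mathrm{IN}}(n_r, C_z) \\[4pt]
\text{Outside:}\quad
  P_{\mathrm{OUT}}(n_\ell, B_y) &= \sum_{x, z} P(A_x \rightarrow B_y\, C_z)\;
      P_{\mathrm{OUT}}(n, A_x)\; P_{\mathrm{IN}}(n_r, C_z)
\end{align*}
```

Here n_ℓ and n_r denote the left and right children of node n; the outside score of the right child is defined symmetrically.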

  22. Learning Latent Annotations (Details), for rules Ax → By Cz • E-Step: accumulate expected counts of each refined rule from the inside/outside scores • M-Step: re-estimate rule probabilities as relative frequencies of those expected counts (see the sketch below)
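Again the slide's formulas are images; the standard EM updates they refer to (a reconstruction using the scores above) are:

```latex
\begin{align*}
\text{E-step:}\quad
  c(A_x \rightarrow B_y\, C_z) &= \sum_{n : A \rightarrow B\,C}
    \frac{P_{\mathrm{OUT}}(n, A_x)\, P(A_x \rightarrow B_y\, C_z)\,
          P_{\mathrm{IN}}(n_\ell, B_y)\, P_{\mathrm{IN}}(n_r, C_z)}{P(w, T)} \\[4pt]
\text{M-step:}\quad
  P(A_x \rightarrow B_y\, C_z) &=
    \frac{c(A_x \rightarrow B_y\, C_z)}{\sum_{\beta} c(A_x \rightarrow \beta)}
\end{align*}
```

The E-step sums over every node n of every training tree whose observed production is A → B C, and P(w, T) is the likelihood of that tree under the current refined grammar.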

  23. Overview: Hierarchical Training, Adaptive Splitting, Parameter Smoothing (working within the limit of computational resources).

  24. Refinement of the DT tag. [Figure: DT split into subcategories DT-1, DT-2, DT-3, DT-4.]

  25. Refinement of the DT tag.

  26. Hierarchical refinement of the DT tag.

  27. Hierarchical Estimation Results

  28. Refinement of the "," tag • Splitting all categories the same amount is wasteful.

  29. The DT tag revisited: oversplit?

  30. Adaptive Splitting • Want to split complex categories more • Idea: split everything, roll back splits which were least useful

  31. Adaptive Splitting • Want to split complex categories more • Idea: split everything, roll back splits which were least useful

  32. Adaptive Splitting • Evaluate the loss in likelihood from removing each split: loss = (data likelihood with the split reversed) / (data likelihood with the split) • No loss in accuracy when 50% of the splits are reversed.

  33. Adaptive Splitting (Details) • True data likelihood: • Approximate likelihood with the split at node n reversed: • Approximate loss in likelihood: (the three quantities are reconstructed below)
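The three expressions on this slide are images; the following is a reconstruction of the standard merge criterion from the split&merge method (read the notation as an approximation of the slide, not a copy). For a tree (w, T) and a node n labeled A whose subcategory was split into A_1 and A_2:

```latex
\begin{align*}
P(w, T) &= \sum_{x} P_{\mathrm{IN}}(n, A_x)\, P_{\mathrm{OUT}}(n, A_x)
  && \text{(true data likelihood, the same value at every node $n$)} \\[4pt]
P_n(w, T) &= \big[p_1 P_{\mathrm{IN}}(n, A_1) + p_2 P_{\mathrm{IN}}(n, A_2)\big]
             \big[P_{\mathrm{OUT}}(n, A_1) + P_{\mathrm{OUT}}(n, A_2)\big] \\
          &\qquad + \sum_{x \notin \{1,2\}} P_{\mathrm{IN}}(n, A_x)\, P_{\mathrm{OUT}}(n, A_x)
  && \text{(split at $n$ reversed)} \\[4pt]
\Delta_{A_1 A_2} &= \prod_{n} \frac{P_n(w, T)}{P(w, T)}
  && \text{(approximate loss in likelihood)}
\end{align*}
```

Here p_1 and p_2 are the relative frequencies of A_1 and A_2 in the training data; the splits whose Δ indicates the smallest loss are rolled back.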

  34. Adaptive Splitting Results

  35. Number of Phrasal Subcategories

  36. Number of Phrasal Subcategories. [Chart, highlighting NP, VP, and PP.]

  37. Number of Phrasal Subcategories. [Chart, highlighting NAC and X.]

  38. Number of Lexical Subcategories. [Chart, highlighting POS, TO, and the comma tag.]

  39. Number of Lexical Subcategories. [Chart, highlighting RB, VBx, IN, and DT.]

  40. Number of Lexical Subcategories. [Chart, highlighting NNP, JJ, NNS, and NN.]

  41. Smoothing • Heavy splitting can lead to overfitting • Idea: Smoothing allows us to pool statistics

  42. Linear Smoothing
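The smoothing equation on this slide is an image; the linear interpolation it refers to (reconstructed here, with α a small mixing weight) ties each refined rule probability back toward the mean over all K subcategories of the same base category, so rare subcategories can pool statistics:

```latex
\bar{p} \;=\; \frac{1}{K} \sum_{x=1}^{K} P(A_x \rightarrow B_y\, C_z),
\qquad
\hat{P}(A_x \rightarrow B_y\, C_z) \;=\; (1 - \alpha)\, P(A_x \rightarrow B_y\, C_z) \;+\; \alpha\, \bar{p}
```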

  43. Result Overview

  44. Linguistic Candy • Proper Nouns (NNP): • Personal pronouns (PRP):

  45. Linguistic Candy • Relative adverbs (RBR): • Cardinal Numbers (CD):

  46. Nonparametric PCFGs using Dirichlet Processes. Percy Liang, Slav Petrov, Dan Klein and Michael Jordan

  47. Improved Inference for Unlexicalized Parsing. Slav Petrov and Dan Klein

  48. 1621 min

  49. Coarse-to-Fine Parsing [Goodman ‘97, Charniak&Johnson ‘05]. [Figure: a coarse grammar (NP, VP, …) learned from the treebank is used to parse and prune; the surviving chart items are then parsed with a refined grammar, e.g. learned subcategories (NP-1, NP-12, NP-17, VP-6, VP-31, …) or lexical refinements (NP-apple, NP-dog, NP-cat, NP-eat, VP-run, …).]

  50. Prune? For each chart item X[i,j], compute its posterior probability from the coarse inside and outside scores, P(X[i,j] | w) = I(X,i,j) · O(X,i,j) / P(w), and prune the item if this falls below a threshold. E.g. consider the span 5 to 12: only the coarse labels whose posterior clears the threshold are passed on to the refined pass.
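A runnable toy sketch of this pruning test (all scores and the threshold below are made-up values for illustration, not from the talk): an item survives when its inside score times its outside score, normalized by the sentence probability, clears the threshold.

```python
# Toy coarse-pass inside/outside scores for chart items X[i, j] over the span 5..12
# (all numbers, and the threshold, are invented for illustration).
inside   = {("NP", 5, 12): 1e-6,  ("VP", 5, 12): 3e-9,  ("PP", 5, 12): 2e-12}
outside  = {("NP", 5, 12): 4e-4,  ("VP", 5, 12): 5e-5,  ("PP", 5, 12): 1e-7}
sentence_prob = 1e-9     # total inside probability of the whole sentence
threshold = 1e-4         # pruning threshold

def surviving_items(inside, outside, sentence_prob, threshold):
    """Keep the coarse items whose posterior I*O/P(w) clears the threshold;
    only those spans/labels are expanded with the refined grammar."""
    keep = set()
    for item, inside_score in inside.items():
        posterior = inside_score * outside[item] / sentence_prob
        if posterior >= threshold:
            keep.add(item)
    return keep

print(sorted(surviving_items(inside, outside, sentence_prob, threshold)))
# [('NP', 5, 12), ('VP', 5, 12)]   -- the PP item's posterior is ~2e-10, so it is pruned
```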
