
Global Inference via Linear Programming Formulation

Presenter: Natalia Prytkova. Tutor: Maximilian Dylla. 14.07.2011.


Presentation Transcript


  1. Global Inference via Linear Programming Formulation Presenter: Natalia Prytkova Tutor: Maximilian Dylla 14.07.2011

  2. Outline • Motivation • Naïve Algorithm • LP Formulation • Constraints • Objective Function • Applications of LP • Experiments • Discussion

  3. Inference with Classifiers Recognize entities Recognize relations Inference

  4. Example Book Author

  6. Properties of Extracted Items • Entity types: Author, Composer, Book, Ballet • Relations: BookWrittenBy (Book, Author); BalletWrittenBy (Ballet, Composer)

  7. Properties of Extracted Items • Entity types: Author, Composer, Book, Ballet, WritersUnion, Conservatory, Publisher, Theater • Relations: BookWrittenBy (Book, Author); BalletWrittenBy (Ballet, Composer); MemberOfUnion (Author, WritersUnion); GraduatedFrom (Composer, Conservatory); BookPublishedBy (Book, Publisher); ShownInTheater (Ballet, Theater)

  8. Example BalletWrittenBy Ballet Composer

  10. Properties of Extracted Items • Many relation types • Many entity types • Entities and relations are mutually dependent

  11. Outline • Motivation • Naïve Algorithm • ILP Formulation • Constraints • Objective Function • Applications of ILP • Experiments • Discussion

  12. Outline • Motivation • Naïve Algorithm • LP Formulation • Constraints • Objective Function • Applications of LP • Experiments • Discussion

  13. Key Idea Recognize relations Inference Recognize entities

  14. Naïve Algorithm

  15. Naïve Algorithm • P(Book, BalletWrittenBy, Composer) = 0.07 • P(Book, BalletWrittenBy, Author) = 0.07 • P(Book, BookWrittenBy, Composer) = 0.12 • P(Book, BookWrittenBy, Author) = 0.03 • P(Ballet, BalletWrittenBy, Composer) = 0.28 • P(Ballet, BalletWrittenBy, Author) = 0.28 • P(Ballet, BookWrittenBy, Composer) = 0.12 • P(Ballet, BookWrittenBy, Author) = 0.12 • …

  16. Naïve Algorithm • n entities: O(n^2) binary relations • l labels: l^(n^2) assignments • P(Book, BalletWrittenBy, Composer) = 0.07 • P(Book, BalletWrittenBy, Author) = 0.07 • P(Book, BookWrittenBy, Composer) = 0.12 • P(Book, BookWrittenBy, Author) = 0.03 • P(Ballet, BalletWrittenBy, Composer) = 0.28 • P(Ballet, BalletWrittenBy, Author) = 0.28 • P(Ballet, BookWrittenBy, Composer) = 0.12 • P(Ballet, BookWrittenBy, Author) = 0.12 • …

  18. Some Useful Properties • Relations impose restrictions on entities • Each entity or relation can be labeled only with one label • Relations can be directed (BookWrittenBy) or undirected (SpouseOf)
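
The naïve enumeration and the restrictions listed above can be sketched in a few lines of Python. This is only an illustrative sketch: the scores, the names `scores` and `allowed_args`, and the setup with two entities and one relation are assumptions made here, not values taken from the slides. It scores every joint assignment as a product of local scores, then shows how requiring the relation label to agree with its argument labels changes the winner.

```python
from itertools import product

# Hypothetical local classifier outputs (illustrative numbers only).
scores = {
    "E1": {"Book": 0.2, "Ballet": 0.8},
    "R12": {"BookWrittenBy": 0.3, "BalletWrittenBy": 0.7},
    "E2": {"Author": 0.6, "Composer": 0.4},
}

# Each relation label restricts the labels of its two arguments.
allowed_args = {
    "BookWrittenBy": ("Book", "Author"),
    "BalletWrittenBy": ("Ballet", "Composer"),
}


def is_consistent(e1, rel, e2):
    """True if the relation label agrees with both entity labels."""
    return allowed_args[rel] == (e1, e2)


def joint_score(e1, rel, e2):
    """Product of the local scores, treating the three decisions as independent."""
    return scores["E1"][e1] * scores["R12"][rel] * scores["E2"][e2]


def best_assignment(require_consistency):
    """Enumerate every joint assignment and keep the best-scoring one."""
    candidates = product(scores["E1"], scores["R12"], scores["E2"])
    if require_consistency:
        candidates = (c for c in candidates if is_consistent(*c))
    return max(candidates, key=lambda c: joint_score(*c))


local_best = best_assignment(require_consistency=False)
global_best = best_assignment(require_consistency=True)
print(local_best)   # inconsistent: the relation and second entity disagree
print(global_best)  # consistent assignment with the highest score
```

With these assumed numbers, the unconstrained winner pairs BalletWrittenBy with an Author, while the constrained search returns a consistent triple. The enumeration visits every combination, which is why it stops being practical as the number of entities grows.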

  19. Outline • Motivation • Naïve Algorithm • ILP Formulation • Constraints • Objective Function • Applications of ILP • Experiments • Discussion

  20. Key Idea • Obtain a set of possible labels for entities/relations • Optimize the global decision given a set of constraints

  21. Definitions • Sentence S: a linked list of words and entities; entity boundaries are given (e.g. "Piotr Ilyich Tchaikovsky" is one entity) • Entity ε: observed variables • Relation: binary relations between entities • Class: predefined sets of entity and relation labels

  22. Constraints Indicator variables

  23. Constraints

  24. Constraints • Each entity or relation can be labeled only with one label • Assignment to each entity or relation variable is consistent with the assignments to its neighboring variables

  25. Objective Function • Assignment cost: the cost of deviating from the assignments given by the classifiers • Constraint cost: the cost of breaking constraints between two neighboring entities
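
The two cost types and the two constraints can be combined into one integer linear program. The block below is a sketch in the spirit of the formulation in the Roth and Yih papers listed in the references. All symbol names are assumptions chosen here, not notation recovered from the slides: $x_{u,\ell}$ indicates that variable $u$ (an entity $E_i$ or a relation $R_{ij}$) takes label $\ell$, $y^{1}$ and $y^{2}$ indicate the joint label of a relation and its first or second argument, $c_u$ is the assignment cost, and $d^{1}, d^{2}$ are the constraint costs.

```latex
\begin{aligned}
\min \quad & \sum_{u \in \mathcal{E} \cup \mathcal{R}} \sum_{\ell \in L_u} c_u(\ell)\, x_{u,\ell}
  \;+\; \sum_{R_{ij} \in \mathcal{R}} \sum_{r \in L_{\mathcal{R}}} \sum_{e \in L_{\mathcal{E}}}
  \bigl[ d^{1}(r, e)\, y^{1}_{ij,r,e} + d^{2}(r, e)\, y^{2}_{ij,r,e} \bigr] \\
\text{s.t.} \quad
& \sum_{\ell \in L_u} x_{u,\ell} = 1 && \forall u \in \mathcal{E} \cup \mathcal{R} \\
& \sum_{e \in L_{\mathcal{E}}} y^{1}_{ij,r,e} = x_{R_{ij},r}, \quad
  \sum_{e \in L_{\mathcal{E}}} y^{2}_{ij,r,e} = x_{R_{ij},r} && \forall R_{ij},\ \forall r \in L_{\mathcal{R}} \\
& \sum_{r \in L_{\mathcal{R}}} y^{1}_{ij,r,e} = x_{E_i,e}, \quad
  \sum_{r \in L_{\mathcal{R}}} y^{2}_{ij,r,e} = x_{E_j,e} && \forall R_{ij},\ \forall e \in L_{\mathcal{E}} \\
& x_{u,\ell},\ y^{1}_{ij,r,e},\ y^{2}_{ij,r,e} \in \{0, 1\}
\end{aligned}
```

Here $L_u$ is $L_{\mathcal{E}}$ for entities and $L_{\mathcal{R}}$ for relations. The first constraint says that each variable receives exactly one label; the remaining equalities tie each pairwise indicator to the labels of the relation and of its argument, so an inconsistent pair is charged its constraint cost (or excluded outright if that cost is set very high).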

  26. Naïve Algorithm • n entities: O(n^2) binary relations • l labels: l^(n^2) assignments • P(Book, BalletWrittenBy, Composer) = 0.07 • P(Book, BalletWrittenBy, Author) = 0.07 • P(Book, BookWrittenBy, Composer) = 0.12 • P(Book, BookWrittenBy, Author) = 0.03 • P(Ballet, BalletWrittenBy, Composer) = 0.28 • P(Ballet, BalletWrittenBy, Author) = 0.28 • P(Ballet, BookWrittenBy, Composer) = 0.12 • P(Ballet, BookWrittenBy, Author) = 0.12 • …

  27. Useful Property • ILP is NP-hard in general, but some instances can be solved in polynomial time, for example when the LP relaxation already yields an integral solution.

  28. Outline • Motivation • Naïve Algorithm • ILP Formulation • Constraints • Objective Function • Applications of ILP • Experiments • Discussion

  29. Viterbi Shortest path

  30. Viterbi
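
The link between Viterbi decoding and shortest paths can be illustrated with a small sketch. If every edge in the trellis costs the negative log of its probability, the cheapest path is the most probable state sequence. The function name `viterbi_shortest_path`, the two states and the probability tables below are hypothetical and chosen only for illustration; the sketch also assumes all probabilities are strictly positive so that the logarithm is defined.

```python
import math


def viterbi_shortest_path(observations, states, start_p, trans_p, emit_p):
    """Most likely state sequence, computed as a shortest path in the trellis."""
    # Edge cost = -log(probability): minimizing summed cost maximizes the product.
    cost = {s: -math.log(start_p[s] * emit_p[s][observations[0]]) for s in states}
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for obs in observations[1:]:
        new_cost, pointers = {}, {}
        for s in states:
            prev = min(states, key=lambda p: cost[p] - math.log(trans_p[p][s]))
            new_cost[s] = cost[prev] - math.log(trans_p[prev][s] * emit_p[s][obs])
            pointers[s] = prev
        cost = new_cost
        back.append(pointers)
    # Follow the back-pointers from the cheapest final state.
    path = [min(states, key=lambda s: cost[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical two-state model.
states = ["A", "B"]
start_p = {"A": 0.6, "B": 0.4}
trans_p = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
emit_p = {
    "A": {"x": 0.5, "y": 0.4, "z": 0.1},
    "B": {"x": 0.1, "y": 0.3, "z": 0.6},
}
best_path = viterbi_shortest_path(["x", "y", "z"], states, start_p, trans_p, emit_p)
print(best_path)
```

A shortest-path problem of this kind can itself be written as a linear program, which is one way to read the connection between sequence decoding and the LP view taken in this talk.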

  31. Phrases Identification

  32. Phrases Identification

  33. Phrases Identification

  34. Outline • Motivation • Naïve Algorithm • ILP Formulation • Constraints • Objective Function • Applications of ILP • Experiments • Discussion

  35. Experiments • Compared approaches: Separate, E -> R, R -> E, E <-> R, Omniscient (diagrams with nodes labeled E, R and I not preserved)

  36. Experiments

  37. Experiments • 5 336 entities • 19 048 pairs of entities • 1 437 sentences • running time < 30 sec on Pentium III 800 MHz

  38. Outline • Motivation • Naïve Algorithm • ILP Formulation • Constraints • Objective Function • Applications of ILP • Experiments • Discussion

  39. Discussion • Guarantees optimality • Supports correct decisions by imposing constraints • LP solvers are available • Not scalable: CPLEX accepts at most 2^31 variables and constraints (~ 46 000 entities); the student edition accepts only 500 (~ 20 entities) • No feedback to extractors
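
The entity counts quoted for the solver limits are consistent with a simple back-of-the-envelope estimate. The sketch below assumes that the number of variables grows roughly as n^2 for n entities (about one per entity pair) and ignores the number of labels; the function name `max_entities` is hypothetical.

```python
import math


def max_entities(variable_limit):
    """Largest n with n * n <= variable_limit, assuming about n^2 variables."""
    return math.isqrt(variable_limit)


full_limit = max_entities(2**31)   # full solver limit quoted above
student_limit = max_entities(500)  # student edition limit quoted above
print(full_limit, student_limit)
```

Under that assumption the results land close to the rounded figures on the slide, about 46 000 and about 20 entities.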

  40. References • Dan Roth and Wen-tau Yih: A Linear Programming Formulation for Global Inference in Natural Language Tasks. CoNLL 2004 • Dan Roth and Wen-tau Yih: Global Inference for Entity and Relation Identification via a Linear Programming Formulation. In: Introduction to Statistical Relational Learning, 2007
