Open IE to KBP Relations in 3 Hours

Stephen Soderland, John Gilmer, Rob Bart, Oren Etzioni, Daniel S. Weld. Turing Center, University of Washington. Open IE: “Steve Jobs, the co-founder of Apple, died of cancer in his Palo Alto home.” Arg1, Rel, Arg2: (Steve Jobs, died of, cancer).

Presentation Transcript


  1. Open IE to KBP Relations in 3 Hours
     Stephen Soderland, John Gilmer, Rob Bart, Oren Etzioni, Daniel S. Weld
     Turing Center, University of Washington
     TAC-KBP Workshop

  9. Open IE
     “Steve Jobs, the co-founder of Apple, died of cancer in his Palo Alto home.”
         Arg1, Rel, Arg2
         (Steve Jobs, died of, cancer)
         (Steve Jobs, died in, his Palo Alto home)
         (Steve Jobs, is co-founder of, Apple)
     “Hamas denied responsibility for the attacks, which threaten to derail ongoing peace talks.”
         Arg1, Rel, Arg2
         (Hamas, denied responsibility for, the attacks)
         (the attacks, threatened to derail, ongoing peace talks)
     “Ribosomes, which are complexes made of ribosomal RNA and protein, are the cellular components that carry out protein synthesis.”
         Arg1, Rel, Arg2
         (Ribosomes, are complexes made of, ribosomal RNA and protein)
         (Ribosomes, are, the cellular components)
         (Ribosomes, carry out, protein synthesis)

  10. Advantages of Open IE
      • Robust
      • Massively scalable
      • Works out of the box
      • Finds whatever relations are expressed in the text
      • Not tied to an ontology of relations
      Disadvantages
      • Finds whatever relations are expressed in the text
      • Not tied to an ontology of relations
      Challenge
      • Map Open IE to an ontology of relations
      • Minimum of user effort
      github/knowitall/openie

  11. per:cause_of_death:
      (Steve Jobs, died of, cancer)
      (Steve Jobs, died from, cancer)
      (Steve Jobs, passed away from, cancer)
      (Steve Jobs, succumbed to, cancer)
      (cancer, killed, Steve Jobs)
      …
      (cancer, claimed the life of, Steve Jobs)
      (Steve Jobs, lost his battle to, cancer)
      (Steve Jobs, was a victim of, cancer)
      (Steve Jobs, could not beat, cancer)
      (Steve Jobs, could not have prevented, his death from cancer)
      (Steve Jobs, joins the ranks of, cancer fatalities)
      …
      Head: high frequency
      Long tail: low frequency

  12. Outline
      • Rules to map to target relations
          • Rule language
          • Semantic taggers
      • KBP system
          • Architecture
          • 3 hour rule set vs. 12 hour rule set
      • Results and discussion
      • Future work

  13. Desiderata for Target Relation Mapping
      • Works even with no annotated training data
      • User may have limited skill in NLP and ML
      • Rules are understandable to user
      • High precision and good generalization
      Approach:
      • Manually created rules based on Open IE tuples
      • Simple rule language
      • Rules combine lexical and semantic type constraints
      • Extensible semantic types based on keyword tagger

  14. Rule language
      (Smith, was appointed, Acting Director of Acme Corporation)
      entity: Smith; slotfill: Acme Corporation
      Terms in Rule       Example
      Target relation:    per:employee_or_member_of
      Query entity in:    Arg1
      Slotfill in:        Arg2
      Slotfill type:      Organization
      Arg1 terms:         -
      Relation terms:     appointed
      Arg2 terms:         <JobTitle> of
      Functional?         no

  15. Rule language
      (Smith, was appointed, Acting Director of Acme Corporation)
      => per:employee_or_member_of (Smith, Acme Corporation)
      Terms in Rule       Example
      Target relation:    per:employee_or_member_of
      Query entity in:    Arg1
      Slotfill in:        Arg2
      Slotfill type:      Organization
      Arg1 terms:         -
      Relation terms:     appointed
      Arg2 terms:         <JobTitle> of
      Functional?         no
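
The rule on this slide can be read as a small matcher over Open IE tuples. The Python below is a minimal sketch under stated assumptions, not the authors' rule applier: the names `OpenIETuple`, `Rule`, `apply_rule`, and the three-entry `JOB_TITLES` list are hypothetical, and the "Slotfill type: Organization" check is left out because it would need an NER tagger.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class OpenIETuple:
    """One (Arg1, Rel, Arg2) extraction."""
    arg1: str
    rel: str
    arg2: str


@dataclass(frozen=True)
class Rule:
    """Hypothetical encoding of the slide's rule fields."""
    target_relation: str
    relation_terms: Tuple[str, ...]   # words required in the Rel phrase
    arg2_type_terms: Tuple[str, ...]  # phrases standing in for <JobTitle>
    arg2_literal: str                 # literal word after the typed phrase


# Assumption: a tiny stand-in for a <JobTitle> keyword list.
JOB_TITLES = ("acting director", "director", "president")

RULE = Rule(
    target_relation="per:employee_or_member_of",
    relation_terms=("appointed",),
    arg2_type_terms=JOB_TITLES,
    arg2_literal="of",
)


def apply_rule(rule: Rule, tup: OpenIETuple) -> Optional[Tuple[str, str, str]]:
    """Return (relation, query entity, slotfill), or None if the rule does not match."""
    rel_words = tup.rel.lower().split()
    if not all(term in rel_words for term in rule.relation_terms):
        return None
    arg2_lower = tup.arg2.lower()
    # Try longer titles first so "acting director" wins over "director".
    for title in sorted(rule.arg2_type_terms, key=len, reverse=True):
        prefix = f"{title} {rule.arg2_literal} "
        if arg2_lower.startswith(prefix) and len(tup.arg2) > len(prefix):
            # Query entity is Arg1; the slotfill is what follows "<JobTitle> of".
            return (rule.target_relation, tup.arg1, tup.arg2[len(prefix):])
    return None


example = OpenIETuple("Smith", "was appointed", "Acting Director of Acme Corporation")
result = apply_rule(RULE, example)
```

With the slide's example tuple, `result` holds the target relation, the query entity from Arg1, and the remainder of Arg2 as the slotfill; a tuple whose relation phrase lacks "appointed" yields `None`.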

  16. Semantic Tagging
      • General types
          • Person, Organization, Location, Date
          • NER tagger
          • WordNet
      • User-specified types
          • Keyword tagger
          • User creates file of terms for the semantic type
          • Tagger takes file as input
          • Used lists from CMU’s NELL for KBP
      github/knowitall/taggers
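
A keyword tagger of the kind described on this slide could look roughly like the sketch below. It is an illustration only and does not reproduce the API of the knowitall taggers project: `build_keyword_tagger` and `tag_job_titles` are hypothetical names, and the term list is passed as an iterable of lines so that an open file of terms would work the same way.

```python
import re
from typing import Callable, Iterable, List, Tuple


def build_keyword_tagger(
    type_name: str, terms: Iterable[str]
) -> Callable[[str], List[Tuple[str, str]]]:
    """Build a tagger that labels every occurrence of a listed term with type_name."""
    # Longest terms first so multi-word terms win over their substrings.
    cleaned = sorted({t.strip().lower() for t in terms if t.strip()}, key=len, reverse=True)
    if not cleaned:
        raise ValueError("term list is empty")
    # Lookarounds keep matches from starting or ending inside a longer word.
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(t) for t in cleaned) + r")(?!\w)",
        re.IGNORECASE,
    )

    def tag(text: str) -> List[Tuple[str, str]]:
        return [(m.group(0), type_name) for m in pattern.finditer(text)]

    return tag


# Terms as they might appear one per line in a user-written file.
tag_job_titles = build_keyword_tagger(
    "JobTitle", ["acting chief director\n", "director\n", "vice-director\n"]
)
tags = tag_job_titles("She was named Acting Chief Director of the agency.")
```

Sorting the terms by length is a design choice of this sketch: it makes the regular expression prefer "acting chief director" over the shorter "director" when both could match at the same position.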

  17. Semantic Types from CMU’s NELL
      • 4K Job titles: academic coordinator … zonal underwriting manager
      • 182 Head job titles: acting chief director … vice-director
      • 47 Religions: Adventist … Zoroastrianism
      • 114 Nationalities: Akkadian … Zambian
      • 5K Cities: Aachen … Zwolle
      • 536 State-provinces: Ad Dali … Zlitan
      • 241 Countries: Afghanistan … Zimbabwe

  18. Outline
      • Rules to map to target relations
          • Rule language
          • Semantic taggers
      • KBP system
          • Architecture
          • 3 hour rule set vs. 12 hour rule set
          • Co-reference
      • Results and discussion
      • Future work

  19. KBP Architecture
      [Architecture diagram not preserved in the transcript; surviving label: 200M tuples]

  20. What We Did Not Handle
      • Entity disambiguation needed for KBP precision
          • Good extraction for “Paul Gray”, but wrong Paul Gray
      • Mostly ignored this in our system
          • Find any tuple that matched entity string
          • Detect ambiguous entities if linked to multiple KB entries
          • Discard all results for ambiguous entities
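
The "discard all results for ambiguous entities" step can be sketched as a simple filter. This is an assumption-laden illustration, not the system's code: `discard_ambiguous` is a hypothetical name, and the KB identifiers below are invented placeholders for whatever an entity linker would return.

```python
from typing import Dict, List, Set, Tuple

# (target relation, query entity string, slotfill)
Extraction = Tuple[str, str, str]


def discard_ambiguous(
    extractions: List[Extraction], kb_links: Dict[str, Set[str]]
) -> List[Extraction]:
    """Drop every extraction whose entity string links to more than one KB entry."""
    return [
        (relation, entity, slotfill)
        for relation, entity, slotfill in extractions
        if len(kb_links.get(entity, set())) <= 1
    ]


# Hypothetical linker output: "Paul Gray" maps to two different KB entries.
kb_links = {"Paul Gray": {"KB:0001", "KB:0002"}, "Smith": {"KB:0003"}}
extractions = [
    ("per:title", "Paul Gray", "bassist"),
    ("per:title", "Paul Gray", "president"),
    ("per:employee_or_member_of", "Smith", "Acme Corporation"),
]
kept = discard_ambiguous(extractions, kb_links)
```

The filter trades recall for precision: both "Paul Gray" results are dropped, including any that were correct, because string matching alone cannot tell the two people apart.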

  21. Creating Rule Sets
      • 3-hour rule set
          • Avg 3 rules per relation
          • Light editing of NELL keyword lists
          per:cause_of_death = “died of”, “died from”, “died as a result of”, “died due to”
      • 12-hour rule set (over a two-week period)
          • Avg 16 rules per relation
          • Refined rules, testing on 2012 KBP answer key
          • Further editing of NELL keyword lists
          per:cause_of_death = “die of”, “dies of”, “dying of”, … “succumbed to”, “succumbs to”, …
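
To make the difference between the two rule sets concrete, the sketch below checks which relation phrases each keyword list would cover. It uses only the phrases legible on the slide and leaves the "…" elisions unfilled. Treating the 12-hour list as a superset of the 3-hour list is an assumption, and `covered` is a hypothetical helper.

```python
from typing import FrozenSet, Iterable, List

# Phrases listed on the slide for the 3-hour rule set.
THREE_HOUR_PHRASES: FrozenSet[str] = frozenset(
    {"died of", "died from", "died as a result of", "died due to"}
)

# Only the 12-hour phrases shown on the slide; its elided entries are not guessed.
TWELVE_HOUR_LISTED: FrozenSet[str] = frozenset(
    {"die of", "dies of", "dying of", "succumbed to", "succumbs to"}
)

# Assumption: the 12-hour list extends the 3-hour list rather than replacing it.
TWELVE_HOUR_PHRASES = THREE_HOUR_PHRASES | TWELVE_HOUR_LISTED


def covered(relation_phrases: Iterable[str], keyword_list: FrozenSet[str]) -> List[str]:
    """Return the relation phrases that appear verbatim in the keyword list."""
    return [p for p in relation_phrases if p.lower().strip() in keyword_list]


# Relation phrases taken from the per:cause_of_death examples earlier in the deck.
observed = ["died of", "died from", "succumbed to", "lost his battle to"]
three_hour_hits = covered(observed, THREE_HOUR_PHRASES)
twelve_hour_hits = covered(observed, TWELVE_HOUR_PHRASES)
```

On these four phrases the shorter list covers the two high-frequency forms, the longer list additionally picks up "succumbed to", and the long-tail phrase "lost his battle to" is missed by both.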

  22. Outline
      • Rules to map to target relations
          • Rule language
          • Semantic taggers
      • KBP system
          • Architecture
          • 3 hour rule set vs. 12 hour rule set
          • Co-reference
      • Results and discussion
      • Future work

  23. KBP Results
      35% recall boost from 12 hours
      Extractor Precision: per:title(Paul Gray, bassist), per:title(Paul Gray, president)
      KBP Precision: per:title(Paul Gray, bassist), per:title(Paul Gray, president)

  24. Error Analysis
      • 31% “Looked right to me”
          “Tantawi was the grand sheik” => per:title(Tantawi, sheik)
          “ETA's political wing Batasuna” => org:subsidiary(ETA, Batasuna)
      • 23% Overgeneralized rules
          “Ginzburg was an outspoken critic” => per:title(Ginzburg, critic)
          “Meredith led the NFL in scoring” => per:employee_or_member_of(Meredith, NFL)
      • 19% Rules matched on non-head terms
          “Kahn’s younger sister married Shankar” => per:spouse(Kahn, Shankar)
      • 15% Open IE errors
      • 12% Coref errors

  25. Ceiling for Recall from Open IE
      • 42% Extracts all information for KBP relation
      • 16% Extractor truncates an argument
          Omits appositive or parenthetical
          “Sheikh Tantawi, the top Egyptian cleric who died on Wednesday…”
          (the top Egyptian cleric, died on, Wednesday)
      • 10% Extractor misses “relational noun”
          “Tantawi, the Grand Imam of Al-Azhar”
      • 10% No extraction of relevant part of sentence
          Syntactic complexity
      • 4% Extraction error
      • 18% Other
      68%

  26. Future Work
      • Increase recall of Open IE
      • Increase precision of rule applier
      • General method not tied to KBP task
          • Plug in any ontology of relations
          • Results not tied to query entity
      • Release as open-source software

  27. Conclusion
      • Novel approach for KBP Slot Filling
          • Run Open IE extractor on corpus
          • Semantic taggers based on user-written keyword lists
          • User-written rules to map target relations to Open IE
      • Results
          • High extraction precision 0.80
          • Moderate recall 0.10 (comparable to all but top sites)
      • Low human effort
          • Requires no NLP or ML experience
          • Only 3 hours effort gives high precision

  28. Thank you
      github/knowitall/openie
      github/knowitall/taggers
