YAGO-QA: Answering Questions by Structured Knowledge Queries

Peter Adolphs, Martin Theobald, Ulrich Schäfer, Hans Uszkoreit, Gerhard Weikum. ICSC, Stanford University, September 19, 2011.


Presentation Transcript


  1. YAGO-QA: Answering Questions by Structured Knowledge Queries
     Peter Adolphs, Martin Theobald, Ulrich Schäfer, Hans Uszkoreit, Gerhard Weikum
     ICSC, Stanford University, September 19, 2011

  2. Jeopardy!
     A big US city with two airports, one named after a World War II hero and one named after a World War II battlefield?

  3. Deep-QA in NL
     • William Wilkinson's "An Account of the Principalities of Wallachia and Moldavia" inspired this author's most famous novel
     • This town is known as "Sin City" & its downtown is "Glitter Gulch"
     • As of 2010, this is the only former Yugoslav republic in the EU
     • 99 cents got me a 4-pack of Ytterlig coasters from this Swedish chain
     Question classification & decomposition
     Knowledge backends
     YAGO
     D. Ferrucci et al.: Building Watson: An Overview of the DeepQA Project. AI Magazine, 2010.
     www.ibm.com/innovation/us/watson/index.htm

  4. Structured Knowledge Queries
     A big US city with two airports, one named after a World War II hero and one named after a World War II battlefield?

       SELECT DISTINCT ?c WHERE {
         ?c type City .
         ?c locatedIn USA .
         ?a1 type Airport .
         ?a2 type Airport .
         ?a1 locatedIn ?c .
         ?a2 locatedIn ?c .
         ?a1 namedAfter ?p .
         ?p type WarHero .
         ?a2 namedAfter ?b .
         ?b type BattleField .
       }

     In this work: focus on factoid and list questions
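To make the join semantics of such a query concrete, here is a minimal sketch of a conjunctive triple-pattern matcher over a tiny in-memory fact list. The facts, the identifier spellings (`OHare`, `Edward_OHare`, and so on) and the function names are illustrative assumptions, not YAGO data or the YAGO server's query engine.

```python
from typing import Dict, Iterator, List, Tuple

Triple = Tuple[str, str, str]

# Toy facts for illustration only (not taken from YAGO).
FACTS: List[Triple] = [
    ("Chicago", "type", "City"),
    ("Chicago", "locatedIn", "USA"),
    ("OHare", "type", "Airport"),
    ("Midway", "type", "Airport"),
    ("OHare", "locatedIn", "Chicago"),
    ("Midway", "locatedIn", "Chicago"),
    ("OHare", "namedAfter", "Edward_OHare"),
    ("Edward_OHare", "type", "WarHero"),
    ("Midway", "namedAfter", "Battle_of_Midway"),
    ("Battle_of_Midway", "type", "BattleField"),
    ("Springfield", "type", "City"),
    ("Springfield", "locatedIn", "USA"),
]

# The ten triple patterns of the query on the slide.
PATTERNS: List[Triple] = [
    ("?c", "type", "City"),
    ("?c", "locatedIn", "USA"),
    ("?a1", "type", "Airport"),
    ("?a2", "type", "Airport"),
    ("?a1", "locatedIn", "?c"),
    ("?a2", "locatedIn", "?c"),
    ("?a1", "namedAfter", "?p"),
    ("?p", "type", "WarHero"),
    ("?a2", "namedAfter", "?b"),
    ("?b", "type", "BattleField"),
]


def match(patterns: List[Triple], binding: Dict[str, str]) -> Iterator[Dict[str, str]]:
    """Yield every variable binding that satisfies all patterns (naive backtracking join)."""
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for fact in FACTS:
        new = dict(binding)
        ok = True
        for term, value in zip(first, fact):
            if term.startswith("?"):
                # Bind the variable, or check it against its earlier binding.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(rest, new)


def select_distinct(var: str, patterns: List[Triple]) -> List[str]:
    """SELECT DISTINCT over a single variable."""
    return sorted({b[var] for b in match(patterns, {})})


print(select_distinct("?c", PATTERNS))
```

A real engine would reorder patterns and use indexes; the sketch only shows that shared variables such as `?c` act as join conditions.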

  5. Agenda
     • YAGO Server & API
       • Wikipedia-based information extraction
       • Searching & ranking in large RDF graphs
     • Names, Surface Patterns & Paraphrases
       • Named entity disambiguation
       • Mapping surface patterns onto semantic relations
       • Crowdsourcing for question paraphrases
     • YAGO-QA Architecture
       • Template-based mapping of NL questions onto SPARQL
     • Conclusions & Future Work

  6. Information Extraction from Wikipedia

  7. YAGO Knowledge Base
     • Combine knowledge from WordNet & Wikipedia
     • Additional gazetteers (geonames.org)
     • Part of the Linked Data cloud

  8. YAGO-2 Numbers
     Estimated precision > 95% (for base relations excl. space, time & provenance)
     www.mpi-inf.mpg.de/yago-naga/

  9. Searching & Ranking RDF Graphs in NAGA
     Ranking based on confidence, compactness and relevance
     [Figure: three example query graphs]
     • Discovery queries (edge labels: bornIn, type, hasWon, diedOn, hasSon, >; nodes: Kiel, scientist, Nobel prize, $x, $y, $a, $b)
     • Connectedness queries (edge labels: type, *; nodes: German novelist, Thomas Mann, Goethe)
     • Queries with regular expressions (edge labels: hasFirstName | hasLastName, type, (coAuthor | advisor)*, worksFor, locatedIn*; nodes: Ling, scientist, $x, $y, Zhejiang, Beng Chin Ooi)
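An edge label such as `locatedIn*` matches a path of zero or more `locatedIn` edges. The following is a minimal sketch of that reflexive-transitive reachability using breadth-first search; the edge data and the function name `reachable_star` are illustrative assumptions and say nothing about how NAGA evaluates or ranks such queries.

```python
from collections import deque
from typing import Dict, List, Set

# Toy locatedIn edges for illustration (not taken from YAGO).
LOCATED_IN: Dict[str, List[str]] = {
    "Zhejiang_University": ["Hangzhou"],
    "Hangzhou": ["Zhejiang"],
    "Zhejiang": ["China"],
}


def reachable_star(start: str, edges: Dict[str, List[str]]) -> Set[str]:
    """Return all nodes reachable from start via zero or more edges (the '*' operator)."""
    seen = {start}  # zero steps: the start node itself matches
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Does "$y locatedIn* Zhejiang" hold for $y = Zhejiang_University?
print("Zhejiang" in reachable_star("Zhejiang_University", LOCATED_IN))
```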

  10. YAGO Server: UI & API

  11. YAGO Server: UI & API
      YAGO-UI
      • Interactive online demo
      • RDF with time, space & provenance annotations
      • SPARQL + keywords
      YAGO-API
      Two basic web services:
      • processQuery (String query)
      • getYagoEntitiesByNames (String[] names)
      • …
      www.mpi-inf.mpg.de/yago-naga/demo.html

  12. Names, Surface Patterns & Paraphrases
      Which chemist was born in London?
      • (I) Named entity disambiguation
        • chemist → wordnet_chemist, wordnet_pharmacist
        • born → Bertran_de_Born, Born_Identity_(Movie), Born_(Album)
        • London → London_UK, London_Arkansas, Antonio_London
      • (II) Mapping surface patterns onto semantic relations
        • <person> was_born_in <location> → bornIn(<person>, <location>)
        • <person> was_born_in <date> → bornOn(<person>, <date>)
      • (III) Paraphrases of questions
        • <person> [was] born in <location>
        • <location>-born <person>
        • NN VBD VBN IN NNP/LOC → bornIn(<person>, <location>)
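Step (II) hinges on the argument type: the same surface pattern `was_born_in` maps to `bornIn` for a location and to `bornOn` for a date. A minimal sketch of such a type-keyed lookup follows; the `RULES` table and `map_pattern` are hypothetical names introduced for illustration, and the type label is assumed to come from an earlier NER step.

```python
from typing import Dict, Optional, Tuple

# (surface pattern, type of the second argument) -> semantic relation.
# Only the two mappings shown on the slide are included.
RULES: Dict[Tuple[str, str], str] = {
    ("was_born_in", "location"): "bornIn",
    ("was_born_in", "date"): "bornOn",
}


def map_pattern(subject: str, pattern: str, obj: str, obj_type: str) -> Optional[str]:
    """Return the relation instance for a typed surface pattern, or None if unknown."""
    relation = RULES.get((pattern, obj_type))
    if relation is None:
        return None
    return f"{relation}({subject}, {obj})"


print(map_pattern("?x", "was_born_in", "London_UK", "location"))
print(map_pattern("?x", "was_born_in", "1879", "date"))
```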

  13. (I) Named Entity Disambiguation
      #inlinks with anchor “Paris”:
        Paris                              32,362
        Paris, France                         570
        Paris Masters                         134
        Paris (mythology)                     118
        University of Paris                    79
        Paris, Texas                           56
        Paris, Ontario                         45
        Paris (rapper)                         29
        Open Gaz de France                     26
        Paris, Kentucky                        20
        Paris (2008 film)                      19
        Gare Saint-Lazare                      18
        Paris, Tennessee                       17
        BNP Paribas Masters                    16
        Paris, Maine                           14
        Paris Hilton                           12
        Paris, Arkansas                        11
        Paris (Supertramp album)               10
        Gare du Nord                            9
        Paris (1979 TV series)                  8
        Count Paris                             7
        Palais Omnisports de Paris-Bercy        6
        Paris, Virginia                         5
        Paris 2012 Olympic bid                  4
        Paris (2003 film)                       3
      • Wikipedia link structure
        • 65,872,435 intra-wiki links
        • 2,782,297 disambiguation pages & 328,372 redirects
        • 2,886,027 distinct link anchor texts
      • → YAGO “means” relation
        • 18,470,099 mappings of names to entities
        • 6.2 distinct names per entity (on avg.)
      Individual name disambiguation vs. joint disambiguation
      AIDA tool for graph-based disambiguation in YAGO-2: “Robust Disambiguation of Named Entities in Text”, J. Hoffart et al. In EMNLP, Edinburgh, Scotland, 2011
      www.mpi-inf.mpg.de/yago-naga/aida/
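One simple way to use such inlink counts for individual name disambiguation is a popularity prior: the share of links with a given anchor text that point to each entity. The sketch below uses the top five counts from the table; the function names are hypothetical, and the prior is shown as a common baseline, not as the full graph-based method of AIDA.

```python
from typing import Dict

# Top five inlink counts for the anchor text "Paris", copied from the slide.
ANCHOR_COUNTS: Dict[str, int] = {
    "Paris": 32362,
    "Paris, France": 570,
    "Paris Masters": 134,
    "Paris (mythology)": 118,
    "University of Paris": 79,
}


def popularity_prior(counts: Dict[str, int]) -> Dict[str, float]:
    """Normalize inlink counts into a probability distribution over target entities."""
    total = sum(counts.values())
    return {entity: count / total for entity, count in counts.items()}


def most_likely(counts: Dict[str, int]) -> str:
    """Pick the entity with the most inlinks for this anchor text."""
    return max(counts, key=lambda entity: counts[entity])


prior = popularity_prior(ANCHOR_COUNTS)
print(most_likely(ANCHOR_COUNTS), round(prior["Paris"], 3))
```

Because the prior ignores context, it always picks the same entity for a name; joint disambiguation is what allows a mention of "Paris" next to "Texas" to resolve differently.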

  14. (II) From Patterns to Semantic Relations
      • PROSPERA: statistical pattern mining from free text
      • Domain-oriented extraction of patterns for known relations (POS-enhanced n-grams)
        X carried out his doctoral research in math under the supervision of Y
        → X { carried out PRP doctoral research [IN NP] [DET] supervision [IN] } Y
      • Confidence & support based on seeds & counter seeds
      • Pattern/fact duality & consistency reasoning
      10s to 100s of typed patterns per relation
        occurs(p,x,y) ∧ expresses(p,R) ⇒ R(x,y)        pattern-fact duality
        occurs(p,x,y) ∧ R(x,y) ⇒ expresses(p,R)
        Spouse ⊆ Person × Person                       type constraints
        capitalOfCountry ⊆ cityOfCountry               inclusion dependencies
        Spouse(x,y): x → y, y → x                      functional dependencies
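The slide does not spell out how confidence and support are computed. As a minimal sketch under a common assumption, support can be taken as the number of seed pairs a pattern occurs with, and confidence as the fraction of seed hits among all seed and counter-seed hits. The function name and the placeholder pairs below are hypothetical, and PROSPERA's exact definitions may differ.

```python
from typing import Iterable, Set, Tuple

Pair = Tuple[str, str]


def support_and_confidence(
    occurrences: Iterable[Pair], seeds: Set[Pair], counter_seeds: Set[Pair]
) -> Tuple[int, float]:
    """Score one pattern from the argument pairs it occurs with.

    support    = number of distinct seed pairs the pattern co-occurs with
    confidence = seed hits / (seed hits + counter-seed hits), 0.0 if no hits
    """
    occurred = set(occurrences)
    positive = len(occurred & seeds)
    negative = len(occurred & counter_seeds)
    hits = positive + negative
    confidence = positive / hits if hits else 0.0
    return positive, confidence


# Placeholder pairs for a hasAdvisor-style pattern (illustrative only).
occurrences = [("s1", "a1"), ("s2", "a2"), ("s3", "x3"), ("s4", "a4")]
seeds = {("s1", "a1"), ("s2", "a2")}
counter_seeds = {("s3", "x3")}
print(support_and_confidence(occurrences, seeds, counter_seeds))
```

Patterns would then be kept only if both values exceed chosen thresholds, in the spirit of the thresholds named on the architecture slide.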

  15. PROSPERA Architecture
      • Gathering: Enhanced Hearst patterns
        • POS-enhanced n-grams
        • Pattern-fact duality & constraints
      • Analysis: Refined pattern weights
        • Carefully chosen seeds and counter seeds
        • Thresholds for pattern confidence & support
      • Reasoning: Scalable extraction & consistency reasoning
        • MapReduce functions for pattern extraction & statistics gathering
        • Distributed MaxSat solver (MAP inference)

  16. (III) Crowdsourcing for Question Paraphrases
      • Pattern acquisition from the crowd
        • Annotators paraphrase natural-language seed questions
        • Seed questions are associated with their semantic arguments and functions
        • Gold resource for pattern acquisition and system evaluation
      • Preliminary results
        • 4,620 paraphrases for 254 seed questions with 7 annotators
        • Total annotation time: ~49 hours, ~1 work-day per annotator

  17. YAGO-QA Architecture
      • Input analysis
        • SProUT for tokenization, stemming & NER (http://sprout.dfki.de/)
        • NE gazetteer extended by YAGO entities
      • Input interpretation
        • Named-entity disambiguation based on YAGO statistics
        • Vague matching against the gathered question paraphrases
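The slides do not say how vague matching is implemented. As a stand-in, the sketch below scores a question (with entity slots already abstracted) against stored paraphrase patterns using character-level similarity from Python's `difflib`. The pattern table, the `diedIn` entry, the threshold of 0.6 and the function name are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional

# Paraphrase pattern -> relation. The diedIn entry is a hypothetical distractor.
PARAPHRASES: Dict[str, str] = {
    "which <person> was born in <location>": "bornIn",
    "<location>-born <person>": "bornIn",
    "which <person> died in <location>": "diedIn",
}


def best_match(question: str, threshold: float = 0.6) -> Optional[str]:
    """Return the relation of the most similar paraphrase, or None below the threshold."""
    scored = [
        (SequenceMatcher(None, question.lower(), pattern).ratio(), pattern)
        for pattern in PARAPHRASES
    ]
    score, pattern = max(scored)
    return PARAPHRASES[pattern] if score >= threshold else None


# An unseen wording still lands on the bornIn paraphrase.
print(best_match("What <person> was born in <location>"))
```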

  18. YAGO-QA Architecture (ct’d)
      • Input interpretation / Answer retrieval
        • An actor whose place of birth is Chicago.
        • Which actor was born in Chicago?
        • → Which <actor> was_born_in <Chicago>?
        • → ?x type ARG1 . ?x bornIn ARG2 .
      • Template-based answer generation
        • Who/what is/are <?x>?
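The template step can be sketched as a pattern with two slots whose fillers are substituted for ARG1 and ARG2 in a fixed query skeleton. The regular expression, the `SELECT DISTINCT` wrapper and the function name `to_query` below are assumptions for illustration; the sketch also skips entity disambiguation, so the slot fillers are used verbatim instead of being resolved to YAGO identifiers.

```python
import re
from typing import Optional

# One hand-written question template with two argument slots.
TEMPLATE = re.compile(
    r"^Which (?P<arg1>\w+) was born in (?P<arg2>[\w ]+?)\s*\?$", re.IGNORECASE
)

# Query skeleton from the slide: ?x type ARG1 . ?x bornIn ARG2 .
QUERY = "SELECT DISTINCT ?x WHERE {{ ?x type {arg1} . ?x bornIn {arg2} . }}"


def to_query(question: str) -> Optional[str]:
    """Fill the query skeleton from a matching question, or return None."""
    m = TEMPLATE.match(question.strip())
    if m is None:
        return None
    return QUERY.format(
        arg1=m.group("arg1"),
        arg2=m.group("arg2").replace(" ", "_"),  # multi-word names become one token
    )


print(to_query("Which actor was born in Chicago ?"))
```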

  19. YAGO-QA Example
      • Multiple named entity annotations: all names are annotated
      • Interpretation picks suitable NE readings
      • Vague matching against surface templates

  20. Conclusions & Future Work
      • QA based on structured knowledge queries (beyond IR-style retrieval of matching sentences/paragraphs)
      • Wikipedia as rich knowledge backend
      • Entities, semantic classes & typed relations
      • Large-scale statistics for entity disambiguation & surface patterns
      • Crowdsourcing for question paraphrases
      • Predefined question templates translated into join queries
      • Future work
        • “Open-QA” via open-domain information extraction
        • Dynamic learning of template structures from grammars
        • More modular template structures
