Presentation Transcript


  1. Representational and inferential foundations for possible large-scale information extraction and question-answering from the web Stuart Russell Computer Science Division UC Berkeley

  2. Goal • A system that knows everything on the Web* • Answer all questions • Discover patterns • Make predictions • Raw data → useful knowledge base (→ $$$) • Requires: NLP, vision, speech, learning, DBs, knowledge representation and reasoning • Berkeley: Klein, Malik, Morgan, Darrell, Jordan, Bartlett, Hellerstein, Franklin, Hearst++

  3. Past projects: PowerSet • “Building a natural language search engine that reads and understands every sentence on the Web.” • Parsing/extraction technology + crowdsourcing to generate collections of x R y triples • Example: • Manchester United beat Chelsea • Chelsea beat Manchester United • Bought by Microsoft in 2008, merged into Bing

  4. Current projects: UW Machine Reading • Initially based on bootstrapping text patterns • Birthplace(Elvis, Tupelo) => “Elvis was born in Tupelo” => “Obama was born in Hawaii” => “Obama’s birthplace was Hawaii” => … • [Google: Best guess for Elvis Presley Born is January 8, 1935] • Inaccurate, runs out of gas, learned content shallow, 99% of text ignored • Moving to incorporate probabilistic knowledge, inference using Markov logic
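A minimal sketch of this kind of pattern bootstrapping (my own toy illustration, not the UW system; the corpus sentences mirror the slide's examples, and the helper functions and seed set are invented for this sketch):

import re

# Toy corpus echoing the slide's examples.
corpus = [
    "Elvis was born in Tupelo",
    "Obama was born in Hawaii",
    "Obama's birthplace was Hawaii",
]

seed_facts = {("Elvis", "Tupelo")}   # seed: Birthplace(Elvis, Tupelo)

def learn_patterns(facts, sentences):
    """Turn each sentence that mentions a known (x, y) pair into a pattern with slots."""
    patterns = set()
    for x, y in facts:
        for s in sentences:
            if x in s and y in s:
                patterns.add(s.replace(x, "{X}").replace(y, "{Y}"))
    return patterns

def apply_patterns(patterns, sentences):
    """Match each pattern against the corpus to propose new (x, y) pairs."""
    new_facts = set()
    for p in patterns:
        regex = re.escape(p).replace(r"\{X\}", "(.+)").replace(r"\{Y\}", "(.+)")
        for s in sentences:
            m = re.fullmatch(regex, s)
            if m:
                new_facts.add((m.group(1), m.group(2)))
    return new_facts

# One bootstrapping round: patterns from seeds, then new candidate facts from patterns.
patterns = learn_patterns(seed_facts, corpus)            # {"{X} was born in {Y}"}
facts = seed_facts | apply_patterns(patterns, corpus)    # adds ("Obama", "Hawaii")
# A second round would also learn "{X}'s birthplace was {Y}" -- and, as the slide notes,
# errors compound once uncertain facts are accepted as new seeds.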

  5. Current Projects: NELL (CMU) • Bootstrapping approach to learning facts from the web using text patterns (642,797 so far) • Initial ontology of basic categories and typed relations • Examples: • the_chicken is a type of meat 100.0% • coventry_evening_telegraph is a blog 99.0% • state_university is a sports team also known as syracuse_university 93.8% • orac_values_for_mushrooms is a fungus 100.0% • Hank Paulson is the CEO of Goldman 100.0%

  6. Problems • Language (incl. speech act pragmatics) • … Jerry Brown, who has been called the first American in space • Uncertainty • Reference uncertainty is ubiquitous • Bootstrapping can converge or diverge; exacerbated by “accepting” uncertain facts, naïve probability models • Universal ontological framework (O(1) work) • Taxonomy, events, compositional structure, time… • Compositional structure of objects and events • Knowledge, belief, other agents • Semantic content below lexical level (must be learned) • E.g., buy = sell⁻¹, ownership, transfer, etc.

  7. Technical approach • Web is just evidence; compute P(World | web) ∝ P(web | World) P(World) • What is the domain of the World variable? • Complex sets of interrelated objects and events • How does it cause the Web variable? • Pragmatics/semantics/syntax (and copying!) • Uncertainty about • What objects exist • How they’re related • What phrases/images refer to what real objects • => Open-universe, first-order probabilistic language
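Spelled out slightly (a sketch in my own notation, not from the slides): writing W for the unobserved world and D for the observed web content,

    P(W = w \mid D) \;\propto\; P(D \mid W = w)\, P(W = w)
    P(Q \mid D) \;=\; \sum_{w \,\models\, Q} P(W = w \mid D)

so a query Q is answered by aggregating posterior mass over the worlds that satisfy it (an integral rather than a sum when worlds have continuous attributes).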

  8. Brief history of expressiveness [chart: logic reached the atomic level in the 5th C B.C. and the propositional and first-order/relational levels in the 19th C; probability reached the atomic level in the 17th C, the propositional level in the 20th C, and the first-order/relational level in the 21st C]

  9. Brief history of expressiveness [same chart, with the probability row annotated “(be patient!)”]

  10. Herbrand vs full first-order Given Father(Bill,William) and Father(Bill,Junior) How many children does Bill have?

  11. Herbrand vs full first-order Given Father(Bill,William) and Father(Bill,Junior) How many children does Bill have? Herbrand semantics: 2

  12. Herbrand vs full first-order Given Father(Bill,William) and Father(Bill,Junior) How many children does Bill have? Herbrand semantics: 2 First-order logical semantics: Between 1 and ∞ (William and Junior may denote the same person, and there may be children the given sentences do not mention)

  13. Possible worlds • Propositional

  14. Possible worlds • Propositional • First-order + unique names, domain closure [diagrams of possible worlds over the constant symbols A, B, C, D]

  15. Possible worlds • Propositional • First-order + unique names, domain closure • First-order open-universe [diagrams of possible worlds over the constant symbols A, B, C, D; in the open-universe case the symbols may co-refer and additional objects may exist]

  16. Open-universe models in BLOG • Construct worlds using two kinds of steps, proceeding in topological order: • Dependency statements: Set the value of a function or relation on a tuple of (quantified) arguments, conditioned on parent values

  17. Open-universe models in BLOG • Construct worlds using two kinds of steps, proceeding in topological order: • Dependency statements: Set the value of a function or relation on a tuple of (quantified) arguments, conditioned on parent values • Number statements: Add some objects to the world, conditioned on what objects and relations exist so far

  18. Technical basics Theorem: Every well-formed* BLOG model specifies a unique proper probability distribution over open-universe possible worlds; equivalent to an infinite contingent Bayes net Theorem: BLOG inference algorithms (rejection sampling, importance sampling, MCMC) converge to correct posteriors for any well-formed* model, for any first-order query
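A self-contained toy sketch in Python of the construction steps above and rejection-sampling inference for the citation domain of the next slides (my own illustration, not the BLOG engine; the priors, the “noisy citation grammar”, and the evidence strings are invented placeholders):

import random

# Placeholder vocabularies standing in for NamePrior and TitlePrior.
NAMES = ["Lashkari", "Metral", "Maes"]
TITLES = ["Collaborative Interface Agents", "Software Agents"]

def sample_world():
    """Build one possible world in topological order:
    number statements add objects, dependency statements set their attributes."""
    researchers = range(random.randint(1, 3))                  # number statement: #Researcher
    names = {r: random.choice(NAMES) for r in researchers}     # dependency: Name(r)
    papers = []
    for r in researchers:                                      # number statement, conditioned
        for _ in range(random.randint(1, 2)):                  # on the researchers that exist
            papers.append((r, random.choice(TITLES)))          # dependency: Title(p)
    return names, papers

def sample_citation(names, papers):
    """Dependency statement: noisy citation text for a uniformly chosen paper."""
    i = random.randrange(len(papers))                          # PubCited(c) ~ Uniform({Paper p})
    author, title = names[papers[i][0]], papers[i][1]
    if random.random() < 0.2:                                  # crude stand-in for the noisy grammar
        author = author.upper()
    return i, f"{author}. {title}."

def posterior_same_paper(evidence, n=200000):
    """Rejection sampling: P(the two citations cite the same paper | their observed texts)."""
    accepted = same = 0
    for _ in range(n):
        names, papers = sample_world()
        i1, t1 = sample_citation(names, papers)
        i2, t2 = sample_citation(names, papers)
        if (t1, t2) == evidence:                               # keep only worlds matching the evidence
            accepted += 1
            same += (i1 == i2)
    return same / accepted if accepted else None

obs = ("Lashkari. Collaborative Interface Agents.",
       "LASHKARI. Collaborative Interface Agents.")
print(posterior_same_paper(obs))

Rejection sampling like this is correct in the limit but wasteful; that is why the importance-sampling and MCMC algorithms mentioned above matter at scale.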

  19. Example: Citation Matching [Lashkari et al 94] Collaborative Interface Agents, Yezdi Lashkari, Max Metral, and Pattie Maes, Proceedings of the Twelfth National Conference on Articial Intelligence, MIT Press, Cambridge, MA, 1994. Metral M. Lashkari, Y. and P. Maes. Collaborative interface agents. In Conference of the American Association for Artificial Intelligence, Seattle, WA, August 1994. Are these descriptions of the same object? Core task in CiteSeer, Google Scholar, over 300 companies in the record linkage industry

  20. (Simplified) BLOG model
  #Researcher ~ NumResearchersPrior();
  Name(r) ~ NamePrior();
  #Paper(FirstAuthor = r) ~ NumPapersPrior(Position(r));
  Title(p) ~ TitlePrior();
  PubCited(c) ~ Uniform({Paper p});
  Text(c) ~ NoisyCitationGrammar(Name(FirstAuthor(PubCited(c))), Title(PubCited(c)));

  26. (Simplified) BLOG model (cont.) Evidence: lots of citation strings Query: who wrote what? Which paper is being cited in this string? Are these two people the same?

  27. Citation Matching Results Four data sets of ~300-500 citations, referring to ~150-300 papers

  28. Cross-Citation Disambiguation Wauchope, K. Eucalyptus: Integrating Natural Language Input with a Graphical User Interface. NRL Report NRL/FR/5510-94-9711 (1994). Is "Eucalyptus" part of the title, or is the author named K. Eucalyptus Wauchope? Kenneth Wauchope (1994). Eucalyptus: Integrating natural language input with a graphical user interface. NRL Report NRL/FR/5510-94-9711, Naval Research Laboratory, Washington, DC, 39pp. The second citation makes it clear how to parse the first one.

  29. Example: multitarget tracking
  #Aircraft(EntryTime = t) ~ NumAircraftPrior();
  Exits(a, t) if InFlight(a, t) then ~ Bernoulli(0.1);
  InFlight(a, t)
    if t < EntryTime(a) then = false
    elseif t = EntryTime(a) then = true
    else = (InFlight(a, t-1) & !Exits(a, t-1));
  State(a, t)
    if t = EntryTime(a) then ~ InitState()
    elseif InFlight(a, t) then ~ StateTransition(State(a, t-1));
  #Blip(Source = a, Time = t)
    if InFlight(a, t) then ~ NumDetectionsCPD(State(a, t));
  #Blip(Time = t) ~ NumFalseAlarmsPrior();
  ApparentPos(r)
    if (Source(r) = null) then ~ FalseAlarmDistrib()
    else ~ ObsCPD(State(Source(r), Time(r)));

  30. Example: cybersecurity sybil defence
  #Person ~ LogNormal[6.9, 2.3]();
  Honest(x) ~ Boolean[0.9]();
  #Login(Owner = x) ~ if Honest(x) then 1 else LogNormal[4.6, 2.3]();
  Transaction(x,y) ~
    if Owner(x) = Owner(y) then SibylPrior()
    else TransactionPrior(Honest(Owner(x)), Honest(Owner(y)));
  Recommends(x,y) ~
    if Transaction(x,y) then
      if Owner(x) = Owner(y) then Boolean[0.99]()
      else RecPrior(Honest(Owner(x)), Honest(Owner(y)));
  Evidence: lots of transactions and recommendations
  Query: Honest(x)

  31. Example: Global seismic monitoring • CTBT bans testing of nuclear weapons on Earth • Allows for outside inspection of 1000 km² • Need 9 more ratifications for “entry into force”, including US, China • US Senate refused to ratify in 1998 • “too hard to monitor”

  32. 254 monitoring stations

  33. Vertically Integrated Seismic Analysis • The problem is hard: • ~10000 “detections” per day, 90% false • CTBT system (SEL3) finds 69% of significant events plus about twice as many spurious (nonexistent) events • 16 human analysts find more events, correct existing ones, throw out spurious events, generate LEB (“ground truth”) • Unreliable below magnitude 4 (1kT)

  34. #SeismicEvents ~ Poisson[time_duration * event_rate];
  IsEarthQuake(e) ~ Bernoulli(.999);
  EventLocation(e) ~ If IsEarthQuake(e) then EarthQuakeDistribution() Else UniformEarthDistribution();
  Magnitude(e) ~ Exponential(log(10)) + min_magnitude;
  Distance(e,s) = GeographicalDistance(EventLocation(e), SiteLocation(s));
  IsDetected(e,p,s) ~ Logistic[site-coefficients(s,p)](Magnitude(e), Distance(e,s));
  #Arrivals(site = s) ~ Poisson[time_duration * false_rate(s)];
  #Arrivals(event = e, site = s) = If IsDetected(e,s) then 1 else 0;
  Time(a) ~ If (event(a) = null) then Uniform(0, time_duration)
    else IASPEI(EventLocation(event(a)), SiteLocation(site(a)), Phase(a)) + TimeRes(a);
  TimeRes(a) ~ Laplace(time_location(site(a)), time_scale(site(a)));
  Azimuth(a) ~ If (event(a) = null) then Uniform(0, 360)
    else GeoAzimuth(EventLocation(event(a)), SiteLocation(site(a))) + AzRes(a);
  AzRes(a) ~ Laplace(0, azimuth_scale(site(a)));
  Slow(a) ~ If (event(a) = null) then Uniform(0, 20)
    else IASPEI-slow(EventLocation(event(a)), SiteLocation(site(a))) + SlowRes(site(a));

  35. Fraction of LEB events missed

  36. Fraction of LEB events missed

  37. Event distribution: LEB vs SEL3
