Extending Bayesian Logic Programs for Plan Recognition and Machine Reading

Presentation Transcript


  1. Extending Bayesian Logic Programs for Plan Recognition and Machine Reading Sindhu V. Raghavan Advisor: Raymond Mooney PhD Proposal May 12th, 2011

  2. Plan Recognition in Intelligent User Interfaces $ cd my-dir $ cp test1.txt test-dir $ rm test1.txt What task is the user performing? move-file Which files and directories are involved? test1.txt and test-dir Can the task be performed more efficiently? Data is relational in nature - several files and directories and several relations between them

  3. Characteristics of Real World Data • Relational or structured data • Several entities in the domain • Several relations between entities • Do not always follow the i.i.d. assumption • Presence of noise or uncertainty • Uncertainty in the types of entities • Uncertainty in the relations Traditional approaches like first-order logic or probabilistic models can handle either structured data or uncertainty, but not both.

  4. Statistical Relational Learning (SRL) • Integrates first-order logic and probabilistic graphical models [Getoor and Taskar, 2007] • Combines strengths of both first-order logic and probabilistic models • SRL formalisms • Stochastic Logic Programs (SLPs) [Muggleton, 1996] • Probabilistic Relational Models (PRMs) [Friedman et al., 1999] • Bayesian Logic Programs (BLPs) [Kersting and De Raedt, 2001] • Markov Logic Networks (MLNs) [Richardson and Domingos, 2006]

  6. Bayesian Logic Programs (BLPs) • Integrate first-order logic and Bayesian networks • Why BLPs? • Efficient grounding mechanism that includes only those variables that are relevant to the query • Easy to extend by incorporating any type of logical inference to construct networks • Well suited for capturing causal relations in data

  7. Objectives BLPs for Plan Recognition BLPs for Machine Reading

  8. Objectives Plan recognition involves predicting the top-level plan of an agent based on its actions BLPs for Machine Reading

  9. Objectives BLPs for Plan Recognition Machine Reading involves automatic extraction of knowledge from natural language text

  10. Outline • Motivation • Background • First-order logic • Logical Abduction • Bayesian Logic Programs (BLPs) • Completed Work • Part 1 – Extending BLPs for Plan Recognition • Part 2 – Extending BLPs for Machine Reading • Proposed Work • Conclusions

  11. First-order Logic • Terms • Constants – individual entities like anna, bob • Variables – placeholders for objects like X, Y • Predicates • Relations over entities like worksFor, capitalOf • Literal – predicate or its negation applied to terms • Atom – Positive literal like worksFor(X,Y) • Ground literal – literal with no variables like worksFor(anna,bob) • Clause – disjunction of literals • Horn clause has at most one positive literal • Definite clause has exactly one positive literal

  12. First-order Logic • Quantifiers • Universal quantification - true for all objects in the domain • Existential quantification - true for some objects in the domain • Logical Inference • Forward Chaining – For every implication p → q, if p is true, then q is concluded to be true • Backward Chaining – For a query literal q, if an implication p → q is present and p is true, then q is concluded to be true; otherwise backward chaining tries to prove p
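The forward-chaining rule above can be sketched in a few lines of Python; the propositional facts and rules here are illustrative, not from the proposal.

```python
# Propositional forward chaining: for every implication p -> q whose body
# holds, conclude the head, repeating until no new atoms can be derived.
def forward_chain(facts, rules):
    """facts: set of atoms; rules: list of (body_atoms, head) pairs."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in derived and all(b in derived for b in body):
                derived.add(head)  # conclude q once p is known to be true
                changed = True
    return derived

rules = [(("p",), "q"), (("q", "r"), "s")]
print(forward_chain({"p", "r"}, rules))  # derives both 'q' and 's'
```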

  13. Logical Abduction • Abduction • Process of finding the best explanation for a set of observations • Given • Background knowledge, B, in the form of a set of (Horn) clauses in first-order logic • Observations, O, in the form of atomic facts in first-order logic • Find • A hypothesis, H, a set of assumptions (atomic facts) that logically entails the observations given the theory: B ∪ H ⊨ O • Best explanation is the one with the fewest assumptions
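The definition above can be turned into a brute-force sketch: search for the smallest hypothesis H of assumable atoms with B ∪ H ⊨ O. The rules and assumable atoms below are illustrative, propositional stand-ins.

```python
# Brute-force abduction: try hypotheses smallest-first, checking entailment
# by forward chaining, so the first hit has the fewest assumptions.
from itertools import combinations

def entails(rules, facts, goals):
    """Forward-chain from facts; check that all goals are derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in derived and all(b in derived for b in body):
                derived.add(head)
                changed = True
    return set(goals) <= derived

def best_explanation(rules, assumables, observations):
    """Smallest set H of assumed atoms that entails the observations."""
    for k in range(len(assumables) + 1):
        for H in combinations(sorted(assumables), k):
            if entails(rules, set(H), observations):
                return set(H)
    return None

B = [(("burglary",), "alarm"), (("tornado",), "alarm")]
print(best_explanation(B, {"burglary", "tornado"}, {"alarm"}))
# a single-assumption explanation
```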

  14. Bayesian Logic Programs (BLPs) [Kersting and De Raedt, 2001] • Set of Bayesian clauses a|a1,a2,....,an • Definite clauses that are universally quantified • Head of the clause - a • Body of the clause - a1, a2, …, an • Range-restricted, i.e., variables(head) ⊆ variables(body) • Associated conditional probability table (CPT) • P(head|body) • Bayesian predicates a, a1, a2, …, an have finite domains • Combining rule like noisy-or for mapping multiple CPTs into a single CPT.
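A combining rule such as noisy-or can be sketched directly: each clause with the same head contributes one causal probability, and the combined CPT entry is one minus the chance that every active cause fails. The probabilities below are illustrative, not learned parameters.

```python
# Noisy-or combining rule: map the contributions of several clauses with the
# same ground head into a single conditional probability.
def noisy_or(cause_probs):
    """cause_probs: P(head | clause i fires) for clauses whose bodies are true."""
    p_all_fail = 1.0
    for p in cause_probs:
        p_all_fail *= 1.0 - p  # this cause fails to produce the head
    return 1.0 - p_all_fail

print(noisy_or([0.9, 0.8]))  # 1 - 0.1*0.2, i.e. approximately 0.98
```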

  15. Logical Inference in BLPs SLD Resolution Proof alarm(james) BLP lives(james,yorkshire). lives(stefan,freiburg). neighborhood(james). tornado(yorkshire). burglary(X) | neighborhood(X). alarm(X) | burglary(X). alarm(X) | lives(X,Y), tornado(Y). Query alarm(james) Example from Ngo and Haddawy, 1997

  16. Logical Inference in BLPs SLD Resolution Proof alarm(james) BLP lives(james,yorkshire). lives(stefan,freiburg). neighborhood(james). tornado(yorkshire). burglary(X) | neighborhood(X). alarm(X) | burglary(X). alarm(X) | lives(X,Y), tornado(Y). burglary(james) Query alarm(james) Example from Ngo and Haddawy, 1997

  17. Logical Inference in BLPs SLD Resolution Proof alarm(james) BLP lives(james,yorkshire). lives(stefan,freiburg). neighborhood(james). tornado(yorkshire). burglary(X) | neighborhood(X). alarm(X) | burglary(X). alarm(X) | lives(X,Y), tornado(Y). burglary(james) Query alarm(james) neighborhood(james) Example from Ngo and Haddawy, 1997

  21. Logical Inference in BLPs SLD Resolution Proof alarm(james) BLP lives(james,yorkshire). lives(stefan,freiburg). neighborhood(james). tornado(yorkshire). burglary(X) | neighborhood(X). alarm(X) | burglary(X). alarm(X) | lives(X,Y), tornado(Y). burglary(james) lives(james,Y) tornado(Y) Query alarm(james) neighborhood(james) Example from Ngo and Haddawy, 1997

  22. Logical Inference in BLPs SLD Resolution Proof alarm(james) BLP lives(james,yorkshire). lives(stefan,freiburg). neighborhood(james). tornado(yorkshire). burglary(X) | neighborhood(X). alarm(X) | burglary(X). alarm(X) | lives(X,Y), tornado(Y). burglary(james) lives(james,Y) tornado(Y) Query alarm(james) lives(james,yorkshire) neighborhood(james) Example from Ngo and Haddawy, 1997

  23. Logical Inference in BLPs SLD Resolution Proof alarm(james) BLP lives(james,yorkshire). lives(stefan,freiburg). neighborhood(james). tornado(yorkshire). burglary(X) | neighborhood(X). alarm(X) | burglary(X). alarm(X) | lives(X,Y), tornado(Y). burglary(james) lives(james,Y) tornado(Y) Query alarm(james) lives(james,yorkshire) neighborhood(james) tornado(yorkshire) Example from Ngo and Haddawy, 1997

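The SLD resolution steps above can be sketched as a toy backchainer over the alarm example. Variables are uppercase strings, and there is no clause-variable renaming, which suffices for this example but not for a general Prolog engine.

```python
# Toy SLD backchaining for the alarm BLP from Ngo and Haddawy, 1997.
FACTS = [("lives", "james", "yorkshire"), ("lives", "stefan", "freiburg"),
         ("neighborhood", "james"), ("tornado", "yorkshire")]
RULES = [(("burglary", "X"), [("neighborhood", "X")]),
         (("alarm", "X"), [("burglary", "X")]),
         (("alarm", "X"), [("lives", "X", "Y"), ("tornado", "Y")])]

def subst(term, binding):
    """Apply a variable binding to one atom."""
    return tuple(binding.get(t, t) for t in term)

def unify(a, b, binding):
    """Unify two atoms under binding; return the extended binding or None."""
    if len(a) != len(b) or a[0] != b[0]:
        return None
    binding = dict(binding)
    for x, y in zip(a[1:], b[1:]):
        x, y = binding.get(x, x), binding.get(y, y)
        if x == y:
            continue
        if x[0].isupper():      # x is an unbound variable
            binding[x] = y
        elif y[0].isupper():    # y is an unbound variable
            binding[y] = x
        else:
            return None
    return binding

def prove(goals, binding):
    """Yield one binding per SLD proof of the goal list."""
    if not goals:
        yield binding
        return
    goal, rest = subst(goals[0], binding), goals[1:]
    for fact in FACTS:                       # resolve against facts
        b = unify(goal, fact, binding)
        if b is not None:
            yield from prove(rest, b)
    for head, body in RULES:                 # resolve against rule heads
        b = unify(goal, head, binding)
        if b is not None:
            yield from prove(list(body) + list(rest), b)

proofs = list(prove([("alarm", "james")], {}))
print(len(proofs))  # 2: one via burglary(james), one via tornado(yorkshire)
```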
  25. Bayesian Network Construction neighborhood(james) burglary(james) lives(james,yorkshire) tornado(yorkshire) alarm(james) Each ground atom becomes a node (random variable) in the Bayesian network Edges are added from ground atoms in the body of a clause to the ground atom in the head Specify probabilistic parameters using the CPTs associated with Bayesian clauses

  28. Bayesian Network Construction ✖ neighborhood(james) lives(stefan,freiburg) burglary(james) lives(james,yorkshire) tornado(yorkshire) alarm(james) Each ground atom becomes a node (random variable) in the Bayesian network Edges are added from ground atoms in the body of a clause to the ground atom in the head Specify probabilistic parameters using the CPTs associated with Bayesian clauses Use combining rule to combine multiple CPTs into a single CPT
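The three construction steps above can be sketched by collecting the ground clauses from the proofs, making each ground atom a node, and recording body-to-head edges; lives(stefan,freiburg) never appears in a proof, so it stays out of the network.

```python
# Bayesian network construction from the ground proofs of the alarm example.
ground_clauses = [
    ("burglary(james)", ["neighborhood(james)"]),
    ("alarm(james)", ["burglary(james)"]),
    ("alarm(james)", ["lives(james,yorkshire)", "tornado(yorkshire)"]),
]

nodes, parents = set(), {}
for head, body in ground_clauses:
    nodes.add(head)                 # each ground atom becomes a node
    nodes.update(body)
    # multiple clauses for the same head: parents are pooled here and their
    # CPTs merged later by the combining rule
    parents.setdefault(head, []).extend(body)

print(sorted(parents["alarm(james)"]))
```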

  29. Probabilistic Inference and Learning • Probabilistic inference • Marginal probability given evidence • Most Probable Explanation (MPE) given evidence • Learning [Kersting and De Raedt, 2008] • Parameters • Expectation Maximization • Gradient-ascent based learning • Structure • Hill climbing search through the space of possible structures

  30. Part 1: Extending BLPs for Plan Recognition [Raghavan and Mooney, 2010]

  31. Plan Recognition • Predict an agent’s top-level plans based on the observed actions • Abductive reasoning involving inference of cause from effect • Applications • Story Understanding • Recognize character’s motives or plans based on its actions to answer questions about the story • Strategic planning • Predict other agents’ plans so as to work co-operatively • Intelligent User Interfaces • Predict the task that the user is performing so as to provide valuable tips to perform the task more efficiently

  32. Related Work • First-order logic based approaches [Kautz and Allen, 1986; Ng and Mooney, 1992] • Knowledge base of plans and actions • Default reasoning or logical abduction to predict the best plan based on the observed actions • Unable to handle uncertainty in data or estimate likelihood of alternative plans • Probabilistic graphical models [Charniak and Goldman, 1989; Huber et al., 1994; Pynadath and Wellman, 2000; Bui, 2003; Blaylock and Allen, 2005] • Encode the domain knowledge using Bayesian networks, abstract hidden Markov models, or statistical n-gram models • Unable to handle relational/structured data

  33. Related Work • Markov Logic based approaches [Kate and Mooney, 2009; Singla and Mooney, 2011] • Logical inference in MLNs is deductive in nature • MLN-PC [Kate and Mooney, 2009] • Add reverse implications to handle abductive reasoning • Does not scale to large domains • MLN-HC [Singla and Mooney, 2011] • Improves over MLN-PC by adding hidden causes • Still does not scale to large domains • MLN-HCAM [Singla and Mooney, 2011] uses logical abduction to constrain grounding • Incorporates ideas from the BLP approach for plan recognition [Raghavan and Mooney, 2010] • MLN-HCAM performs better than MLN-PC and MLN-HC

  34. BLPs for Plan Recognition • Why BLPs? • Directed model captures causal relationships well • Efficient grounding process results in smaller networks, unlike in MLNs • SLD resolution is deductive inference, used for predicting observed actions from top-level plans • Plan recognition is abductive in nature and involves predicting the top-level plan from observed actions BLPs cannot be used as is for plan recognition

  35. Extending BLPs for Plan Recognition BLPs + Logical Abduction = BALPs BALPs – Bayesian Abductive Logic Programs

  36. Logical Abduction in BALPs • Given • A set of observation literals O = {O1, O2,….On} and a knowledge base KB • Compute a set of abductive proofs of O using Stickel's abduction algorithm [Stickel, 1988] • Backchain on each Oi until it is proved or assumed • A literal is said to be proved if it unifies with a fact or the head of some rule in KB; otherwise it is said to be assumed • Construct a Bayesian network using the resulting set of proofs as in BLPs.

  37. Example – Intelligent User Interfaces • Top-level plans predicates • copy-file, move-file, remove-file • Actions predicates • cp, rm • Knowledge Base (KB) • cp(filename,destdir) | copy-file(filename,destdir) • cp(filename,destdir) | move-file(filename,destdir) • rm(filename) | move-file(filename,destdir) • rm(filename) | remove-file(filename) • Observed actions • cp(Test1.txt, Mydir) • rm(Test1.txt)
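The abductive proofs that the next slides build step by step can be enumerated with a toy sketch: each observed action is explained by assuming some plan literal allowed by the KB, and matching assumptions are shared across observations.

```python
# Enumerate abductive explanations for the observed actions in the
# intelligent-user-interface example (KB transcribed from the slide).
KB = {  # observed action -> plan literals that could explain it
    "cp(Test1.txt,Mydir)": ["copy-file(Test1.txt,Mydir)",
                            "move-file(Test1.txt,Mydir)"],
    "rm(Test1.txt)": ["move-file(Test1.txt,Mydir)",
                      "remove-file(Test1.txt)"],
}

def abductive_proofs(observations):
    """Every set of assumed plan literals that explains all observations."""
    proofs = [frozenset()]
    for obs in observations:
        # extend each partial proof with one explaining plan; identical
        # assumptions merge, mirroring "match existing assumption"
        proofs = [p | {plan} for p in proofs for plan in KB[obs]]
    return set(proofs)

for proof in sorted(abductive_proofs(["cp(Test1.txt,Mydir)", "rm(Test1.txt)"]),
                    key=len):
    print(sorted(proof))
```

The smallest explanation assumes only move-file(Test1.txt,Mydir), since that single plan accounts for both the cp and the rm action.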

  38. Abductive Inference Assumed literal copy-file(Test1.txt,Mydir) cp(Test1.txt,Mydir) cp(filename,destdir) | copy-file(filename,destdir)

  39. Abductive Inference Assumed literal copy-file(Test1.txt,Mydir) move-file(Test1.txt,Mydir) cp(Test1.txt,Mydir) cp(filename,destdir) | move-file(filename,destdir)

  40. Abductive Inference Match existing assumption copy-file(Test1.txt,Mydir) move-file(Test1.txt,Mydir) rm(Test1.txt) cp(Test1.txt,Mydir) • rm(filename) | move-file(filename,destdir)

  41. Abductive Inference Assumed literal copy-file(Test1.txt,Mydir) move-file(Test1.txt,Mydir) remove-file(Test1.txt) rm(Test1.txt) cp(Test1.txt,Mydir) • rm(filename) | remove-file(filename)

  42. Structure of Bayesian network copy-file(Test1.txt,Mydir) move-file(Test1.txt,Mydir) remove-file(Test1.txt) rm(Test1.txt) cp(Test1.txt,Mydir)

  43. Probabilistic Inference • Specifying probabilistic parameters • Noisy-and • Specify the CPT for combining the evidence from conjuncts in the body of the clause • Noisy-or • Specify the CPT for combining the evidence from disjunctive contributions from different ground clauses with the same head • Models “explaining away” • Noisy-and and noisy-or models reduce the number of parameters learned from data

  44. Probabilistic Inference copy-file(Test1.txt,Mydir) move-file(Test1.txt,Mydir) remove-file(Test1.txt) rm(Test1.txt) cp(Test1.txt,Mydir) 4 parameters 4 parameters

  45. Probabilistic Inference copy-file(Test1.txt,Mydir) move-file(Test1.txt,Mydir) remove-file(Test1.txt) θ4 θ2 θ3 θ1 rm(Test1.txt) cp(Test1.txt,Mydir) 2 parameters 2 parameters Noisy models require parameters linear in the number of parents
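The parameter saving can be illustrated concretely: with a leak-free noisy-or, the CPT row for any configuration of parents is derived from one parameter per parent rather than stored per row. The θ values below are hypothetical, not the learned parameters from the proposal.

```python
# Noisy-or CPT for cp(Test1.txt,Mydir), whose parents are the two plans
# that can cause it; one parameter per parent instead of 2**n CPT rows.
theta = {"copy-file(Test1.txt,Mydir)": 0.7,
         "move-file(Test1.txt,Mydir)": 0.6}  # hypothetical θ1, θ2

def p_head_true(true_parents):
    """P(cp(Test1.txt,Mydir) = true | parents), leak-free noisy-or."""
    p_all_fail = 1.0
    for parent in true_parents:
        p_all_fail *= 1.0 - theta[parent]  # this cause fails
    return 1.0 - p_all_fail

print(p_head_true([]))           # 0.0 with no active cause
print(p_head_true(list(theta)))  # 1 - 0.3*0.4, approximately 0.88
```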

  46. Probabilistic Inference • Most Probable Explanation (MPE) • For multiple plans, compute the MPE, the most likely combination of truth values to all unknown literals given the evidence • Marginal Probability • For single top-level plan prediction, compute the marginal probability for all instances of the plan predicate and pick the instance with maximum probability • When exact inference is intractable, SampleSearch [Gogate and Dechter, 2007], an approximate inference algorithm for graphical models with deterministic constraints, is used
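The difference between the two query types can be shown by brute force over a made-up joint distribution for two plan variables; the numbers are illustrative only.

```python
# MPE vs. marginal queries over a toy joint P(copy-file, move-file).
joint = {  # hypothetical posterior given the observed actions
    (True, True): 0.05, (True, False): 0.25,
    (False, True): 0.60, (False, False): 0.10,
}

# MPE: the single most likely joint truth assignment
mpe = max(joint, key=joint.get)

# Marginals: sum out the other variable for each plan instance
p_copy = sum(p for (c, m), p in joint.items() if c)
p_move = sum(p for (c, m), p in joint.items() if m)

print(mpe)             # (False, True): move-file alone, as a joint assignment
print(p_copy, p_move)  # move-file also wins on marginal probability
```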

  47. Probabilistic Inference copy-file(Test1.txt,Mydir) move-file(Test1.txt,Mydir) remove-file(Test1.txt) Noisy-or Noisy-or rm(Test1.txt) cp(Test1.txt,Mydir)

  48. Probabilistic Inference copy-file(Test1.txt,Mydir) move-file(Test1.txt,Mydir) remove-file(Test1.txt) Noisy-or Noisy-or rm(Test1.txt) cp(Test1.txt,Mydir) Evidence

  49. Probabilistic Inference Query variables copy-file(Test1.txt,Mydir) move-file(Test1.txt,Mydir) remove-file(Test1.txt) Noisy-or Noisy-or rm(Test1.txt) cp(Test1.txt,Mydir) Evidence

  50. Probabilistic Inference MPE Query variables FALSE FALSE TRUE copy-file(Test1.txt,Mydir) move-file(Test1.txt,Mydir) remove-file(Test1.txt) Noisy-or Noisy-or rm(Test1.txt) cp(Test1.txt,Mydir) Evidence
