Abductive Plan Recognition By Extending Bayesian Logic Programs



  1. Abductive Plan Recognition By Extending Bayesian Logic Programs Sindhu V. Raghavan & Raymond J. Mooney The University of Texas at Austin

  2. Plan Recognition • Predict an agent’s top-level plans based on the observed actions • Abductive reasoning involving inference of cause from effect • Applications • Story Understanding • Strategic Planning • Intelligent User Interfaces

  3. Plan Recognition in Intelligent User Interfaces
$ cd test-dir
$ cp test1.txt my-dir
$ rm test1.txt
What task is the user performing? move-file
Which files and directories are involved? test1.txt and my-dir
Data is relational in nature: several files and directories, and several relations between them

  4. Related Work • First-order logic based approaches [Kautz and Allen, 1986; Ng and Mooney, 1992] • Knowledge base of plans and actions • Default reasoning or logical abduction to predict the best plan based on the observed actions • Unable to handle uncertainty in data or estimate likelihood of alternative plans • Probabilistic graphical models [Charniak and Goldman, 1989; Huber et al., 1994; Pynadath and Wellman, 2000; Bui, 2003; Blaylock and Allen, 2005] • Encode the domain knowledge using Bayesian networks, abstract hidden Markov models, or statistical n-gram models • Unable to handle relational/structured data • Statistical Relational Learning based approaches • Markov Logic Networks for plan recognition [Kate and Mooney, 2009; Singla and Mooney, 2011]

  5. Our Approach • Extend Bayesian Logic Programs (BLPs) [Kersting and De Raedt, 2001] for plan recognition • BLPs integrate first-order logic and Bayesian networks • Why BLPs? • Efficient grounding mechanism that includes only those variables that are relevant to the query • Easy to extend by incorporating any type of logical inference to construct networks • Well suited for capturing causal relations in data

  6. Outline • Motivation • Background • Logical Abduction • Bayesian Logic Programs (BLPs) • Extending BLPs for Plan Recognition • Experiments • Conclusions

  7. Logical Abduction • Abduction • Process of finding the best explanation for a set of observations • Given • Background knowledge, B, in the form of a set of (Horn) clauses in first-order logic • Observations, O, in the form of atomic facts in first-order logic • Find • A hypothesis, H, a set of assumptions (atomic facts) that, together with the theory, logically entails the observations: B ∪ H ⊨ O • Best explanation is the one with the fewest assumptions

  8. Bayesian Logic Programs (BLPs) [Kersting and De Raedt, 2001] • Set of Bayesian clauses a | a1, a2, ..., an • Definite clauses that are universally quantified • Range-restricted, i.e., variables(head) ⊆ variables(body) • Associated conditional probability table (CPT) • P(head | body) • Bayesian predicates a, a1, a2, …, an have finite domains • Combining rule like noisy-or for mapping multiple CPTs into a single CPT
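To make the representation concrete, here is a minimal Python sketch, not from the original slides: the class name, the example clause, and the probabilities are hypothetical, and all predicates are treated as Boolean strings.

```python
from dataclasses import dataclass, field

@dataclass
class BayesianClause:
    """A Bayesian clause a | a1, ..., an with its CPT P(head | body)."""
    head: str                  # e.g. "cp(Filename,Destdir)"
    body: list                 # e.g. ["copy-file(Filename,Destdir)"]
    cpt: dict = field(default_factory=dict)  # body truth values -> P(head=True)

# Example: the action cp is likely when its plan copy-file is active.
clause = BayesianClause(
    head="cp(Filename,Destdir)",
    body=["copy-file(Filename,Destdir)"],
    cpt={(True,): 0.9, (False,): 0.01},
)
```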

  9. Inference in BLPs [Kersting and De Raedt, 2001] • Logical inference • Given a BLP and a query, SLD resolution is used to construct proofs for the query • Bayesian network construction • Each ground atom is a random variable • Edges are added from the ground atoms in the body to the ground atom in the head • CPTs specified by the conditional probability distribution for the corresponding clause • P(X) = ∏i P(Xi | Pa(Xi)) • Probabilistic inference • Marginal probability given evidence • Most Probable Explanation (MPE) given evidence
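As a sketch of how a ground network defines this factored joint distribution (the two-node network, its structure, and its probabilities below are hypothetical, not from the slides):

```python
# A ground network: for each Boolean node, its parents and a CPT row
# giving P(node = True | parent assignment).
parents = {"a": [], "b": ["a"]}
cpt = {
    "a": {(): 0.2},                        # prior P(a = True)
    "b": {(True,): 0.9, (False,): 0.05},   # P(b = True | a)
}

def joint_probability(assignment):
    """P(X) = prod_i P(X_i | Pa(X_i)) for one complete truth assignment."""
    p = 1.0
    for node, value in assignment.items():
        row = tuple(assignment[q] for q in parents[node])
        p_true = cpt[node][row]
        p *= p_true if value else 1.0 - p_true
    return p

print(round(joint_probability({"a": True, "b": True}), 3))  # 0.2 * 0.9 = 0.18
```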

  10. BLPs for Plan Recognition • SLD resolution is deductive inference, used for predicting observations from top-level plans • Plan recognition is abductive in nature and involves predicting the top-level plan from observations • Hence, BLPs cannot be used as-is for plan recognition

  11. Extending BLPs for Plan Recognition: BLPs + Logical Abduction = BALPs (Bayesian Abductive Logic Programs)

  12. Logical Abduction in BALPs • Given • A set of observation literals O = {O1, O2, …, On} and a knowledge base KB • Compute a set of abductive proofs of O using Stickel’s abduction algorithm [Stickel, 1988] • Backchain on each Oi until it is proved or assumed • A literal is said to be proved if it unifies with a fact or the head of some rule in KB; otherwise it is said to be assumed • Construct a Bayesian network using the resulting set of proofs, as in BLPs

  13. Example – Intelligent User Interfaces • Top-level plan predicates • copy-file, move-file, remove-file • Action predicates • cp, rm • Knowledge Base (KB) • cp(Filename,Destdir) | copy-file(Filename,Destdir) • cp(Filename,Destdir) | move-file(Filename,Destdir) • rm(Filename) | move-file(Filename,Destdir) • rm(Filename) | remove-file(Filename) • Observed actions • cp(test1.txt, mydir) • rm(test1.txt)
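The following is a toy Python sketch of this backchaining step on the KB above. It is not the actual BALP implementation: it ignores facts and multi-literal bodies, and variable handling is simplified (identifiers starting with an uppercase letter are variables). Each observation is backchained through every matching clause, and the resulting body literal either reuses an existing assumption it unifies with or is added as a new assumption.

```python
# Literals are tuples: (predicate, arg1, arg2, ...).
KB = [
    (("cp", "Filename", "Destdir"), ("copy-file", "Filename", "Destdir")),
    (("cp", "Filename", "Destdir"), ("move-file", "Filename", "Destdir")),
    (("rm", "Filename"),            ("move-file", "Filename", "Destdir")),
    (("rm", "Filename"),            ("remove-file", "Filename")),
]
OBSERVATIONS = [("cp", "test1.txt", "mydir"), ("rm", "test1.txt")]

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def unify(a, b, subst):
    """Unify two literals under subst; return the extended subst or None."""
    if a[0] != b[0] or len(a) != len(b):
        return None
    subst = dict(subst)
    for x, y in zip(a[1:], b[1:]):
        x, y = subst.get(x, x), subst.get(y, y)
        if x == y:
            continue
        if is_var(x):
            subst[x] = y
        elif is_var(y):
            subst[y] = x
        else:
            return None
    return subst

def substitute(lit, subst):
    return (lit[0],) + tuple(subst.get(t, t) for t in lit[1:])

assumptions = []
for obs in OBSERVATIONS:
    for head, body in KB:
        s = unify(obs, head, {})
        if s is None:
            continue
        goal = substitute(body, s)
        # Reuse an existing assumption if the new goal unifies with one.
        if not any(unify(goal, a, {}) is not None for a in assumptions):
            assumptions.append(goal)

print(assumptions)
# [('copy-file', 'test1.txt', 'mydir'), ('move-file', 'test1.txt', 'mydir'),
#  ('remove-file', 'test1.txt')]
```

These three assumed literals are exactly the steps walked through on the next four slides.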

  14. Abductive Inference: backchaining on cp(test1.txt,mydir) via cp(Filename,Destdir) | copy-file(Filename,Destdir) adds copy-file(test1.txt,mydir) as an assumed literal

  15. Abductive Inference: backchaining on cp(test1.txt,mydir) via cp(Filename,Destdir) | move-file(Filename,Destdir) adds move-file(test1.txt,mydir) as an assumed literal

  16. Abductive Inference: backchaining on rm(test1.txt) via rm(Filename) | move-file(Filename,Destdir) yields move-file(test1.txt,Destdir), which matches the existing assumption move-file(test1.txt,mydir)

  17. Abductive Inference: backchaining on rm(test1.txt) via rm(Filename) | remove-file(Filename) adds remove-file(test1.txt) as an assumed literal

  18. Structure of Bayesian network: the plan nodes copy-file(test1.txt,mydir) and move-file(test1.txt,mydir) are parents of the action node cp(test1.txt,mydir); move-file(test1.txt,mydir) and remove-file(test1.txt) are parents of rm(test1.txt)

  19. Probabilistic Inference • Specifying probabilistic parameters • Noisy-and • Specify the CPT for combining the evidence from conjuncts in the body of the clause • Noisy-or • Specify the CPT for combining the evidence from disjunctive contributions from different ground clauses with the same head • Models “explaining away” • Noisy-and and noisy-or models reduce the number of parameters learned from data
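A minimal sketch of the noisy-or combination (the parameter values are hypothetical): each active cause independently "fails" to produce the effect with probability 1 - θi, so the effect fires unless every active cause fails.

```python
def noisy_or(active_params):
    """P(child = True) under noisy-or, given the parameters theta_i of
    the parent causes that are currently True."""
    p_all_fail = 1.0
    for theta in active_params:
        p_all_fail *= 1.0 - theta
    return 1.0 - p_all_fail

# Hypothetical strengths of each plan, on its own, causing the cp action.
theta_copy, theta_move = 0.9, 0.8
print(noisy_or([theta_copy, theta_move]))  # both plans active -> 0.98
print(noisy_or([theta_copy]))              # only copy-file    -> 0.9
print(noisy_or([]))                        # no cause active   -> 0.0
```

Because one θ per parent replaces a full exponential-size CPT, this is how the noisy-or and noisy-and models cut the number of parameters to be learned.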

  20. Probabilistic Inference: noisy-or nodes combine the contributions of copy-file(test1.txt,mydir) and move-file(test1.txt,mydir) into cp(test1.txt,mydir), and of move-file(test1.txt,mydir) and remove-file(test1.txt) into rm(test1.txt)

  21. Probabilistic Inference • Most Probable Explanation (MPE) • For multiple plans, compute the MPE, the most likely combination of truth values for all unknown literals given the evidence • Marginal Probability • For single top-level plan prediction, compute the marginal probability for all instances of the plan predicate and pick the instance with maximum probability • When exact inference is intractable, SampleSearch [Gogate and Dechter, 2007], an approximate inference algorithm for graphical models with deterministic constraints, is used
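Both queries can be illustrated by brute-force enumeration on the running example. This is a sketch only: the plan priors (0.3) and noisy-or parameters are hypothetical, and real BALP inference uses exact Bayesian network algorithms or SampleSearch rather than enumeration.

```python
from itertools import product

NODES = ["copy_file", "move_file", "remove_file", "cp", "rm"]

def joint(a):
    """Joint probability of one full assignment a (dict node -> bool)."""
    def noisy_or(pairs):  # pairs of (parent value, theta)
        p_fail = 1.0
        for on, theta in pairs:
            if on:
                p_fail *= 1.0 - theta
        return 1.0 - p_fail
    p = 1.0
    for plan in ("copy_file", "move_file", "remove_file"):
        p *= 0.3 if a[plan] else 0.7               # hypothetical plan priors
    p_cp = noisy_or([(a["copy_file"], 0.9), (a["move_file"], 0.8)])
    p_rm = noisy_or([(a["move_file"], 0.8), (a["remove_file"], 0.9)])
    p *= p_cp if a["cp"] else 1.0 - p_cp
    p *= p_rm if a["rm"] else 1.0 - p_rm
    return p

evidence = {"cp": True, "rm": True}                # the observed actions
hidden = [n for n in NODES if n not in evidence]

best, z = None, 0.0
marginals = {n: 0.0 for n in hidden}
for values in product([False, True], repeat=len(hidden)):
    a = dict(zip(hidden, values), **evidence)
    p = joint(a)
    z += p
    if best is None or p > best[1]:
        best = (a, p)
    for n in hidden:
        if a[n]:
            marginals[n] += p

print("MPE:", {n: best[0][n] for n in hidden})     # only move_file is True
print({n: round(marginals[n] / z, 3) for n in hidden})  # move_file highest
```

The MPE picks move-file alone because a single plan explaining both observed actions beats any combination of two plans, which is the "explaining away" behavior the noisy-or model captures.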

  22. Probabilistic Inference: the example Bayesian network with its noisy-or nodes, as on slide 20

  23. Probabilistic Inference: evidence — the observed action nodes cp(test1.txt,mydir) and rm(test1.txt) are set to TRUE

  24. Probabilistic Inference: query variables — the plan nodes copy-file(test1.txt,mydir), move-file(test1.txt,mydir), and remove-file(test1.txt)

  25. Probabilistic Inference: the MPE assigns TRUE to move-file(test1.txt,mydir) and FALSE to copy-file(test1.txt,mydir) and remove-file(test1.txt)

  26. Probabilistic Inference: move-file(test1.txt,mydir) is thus recognized as the top-level plan

  27. Parameter Learning • Learn noisy-or/noisy-and parameters using the EM algorithm adapted for BLPs [Kersting and De Raedt, 2008] • Partial observability • In the plan recognition domain, data is partially observable • Evidence is present only for observed actions and top-level plans; sub-goals, noisy-or, and noisy-and nodes are not observed • Simplify the learning problem • Learn noisy-or parameters only • Use logical-and instead of noisy-and to combine evidence from the conjuncts in the body of a clause
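As a toy illustration of the E/M pattern under partial observability (not the BLP-EM algorithm itself; the one-cause model, the data, and the leak-free noisy-or assumption are all hypothetical), consider learning the prior of one hidden plan node c and the noisy-or parameter of one observed action node e:

```python
# Model: P(c = 1) = pi; P(e = 1 | c = 1) = theta; P(e = 1 | c = 0) = 0.
data = [1, 0, 0, 1, 0, 0, 0, 1]   # hypothetical observations of e only
pi, theta = 0.5, 0.5              # initial guesses

for _ in range(50):
    # E-step: posterior P(c = 1 | e) for each example.
    post = []
    for e in data:
        if e == 1:
            post.append(1.0)             # e=1 is impossible unless c=1
        else:
            p_c1 = pi * (1.0 - theta)    # c=1 but the cause "failed"
            p_c0 = 1.0 - pi              # c=0 (then e=0 for sure)
            post.append(p_c1 / (p_c1 + p_c0))
    # M-step: re-estimate the parameters from expected counts.
    pi = sum(post) / len(post)           # expected fraction of c=1
    theta = sum(data) / sum(post)        # e=1 successes per active cause

print(round(pi, 3), round(theta, 3))
```

In BALPs the same expected-count pattern runs over the ground networks, with one parameter shared across all groundings of a clause [Kersting and De Raedt, 2008].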

  28. Experimental Evaluation • Monroe (strategic planning) • Linux (intelligent user interfaces) • Story Understanding (narrative text)

  29. Monroe and Linux [Blaylock and Allen, 2005] • Task • Monroe involves recognizing top-level plans in an emergency response domain (artificially generated using an HTN planner) • Linux involves recognizing top-level plans based on Linux commands • Single correct plan in each example • Data: dataset statistics table not preserved in this transcript

  30. Monroe and Linux • Methodology • Manually encoded the knowledge base • Learned noisy-or parameters using EM • Computed marginal probability for plan instances • Systems compared • BALPs • MLN-HCAM [Singla and Mooney, 2011] • MLN-PC and MLN-HC do not run on Monroe and Linux due to scaling issues • Blaylock and Allen’s system [Blaylock and Allen, 2005] • Performance metric • Convergence score - measures the fraction of examples for which the plan predicate was predicted correctly

  31. Results on Monroe: chart of convergence scores for BALPs, MLN-HCAM, and Blaylock & Allen (one legible value: 94.2*; * marks differences that are statistically significant with respect to BALPs)

  32. Results on Linux: chart of convergence scores for BALPs, MLN-HCAM, and Blaylock & Allen (one legible value: 36.1*; * marks differences that are statistically significant with respect to BALPs)

  33. Experiments with partial observability • Limitations of convergence score • Does not account for predicting the plan arguments correctly • Requires all the observations to be seen before plans can be predicted • Early plan recognition with a partial set of observations • Perform plan recognition after observing the first 25%, 50%, 75%, and 100% of the observations • Accuracy: assign partial credit for predicting the plan predicate and a subset of its arguments correctly (a possible scoring function is sketched below) • Systems compared • BALPs • MLN-HCAM [Singla and Mooney, 2011]
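One plausible reading of that partial-credit metric, as a sketch only: the exact weighting used in the paper may differ, and the function name and scoring scheme here are hypothetical.

```python
def partial_credit(pred, gold):
    """Credit 0 if the plan predicate is wrong; otherwise one share for
    the predicate plus one share per correctly predicted argument."""
    pred_name, pred_args = pred[0], pred[1:]
    gold_name, gold_args = gold[0], gold[1:]
    if pred_name != gold_name:
        return 0.0
    matched = sum(p == g for p, g in zip(pred_args, gold_args))
    return (1 + matched) / (1 + len(gold_args))

print(partial_credit(("move-file", "test1.txt", "mydir"),
                     ("move-file", "test1.txt", "mydir")))  # 1.0
print(partial_credit(("move-file", "test1.txt", "other"),
                     ("move-file", "test1.txt", "mydir")))  # ~0.667
```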

  34. Results on Monroe: chart of accuracy versus percent of observations seen

  35. Results on Linux: chart of accuracy versus percent of observations seen

  36. Story Understanding [Charniak and Goldman, 1991; Ng and Mooney, 1992] • Task • Recognize characters’ top-level plans based on actions described in narrative text • Multiple top-level plans in each example • Data • 25 examples in development set and 25 examples in test set • 12.6 observations per example on average • 8 top-level plan predicates

  37. Story Understanding • Methodology • Knowledge base was created for ACCEL [Ng and Mooney, 1992] • Parameters set manually • Insufficient number of examples in the development set to learn parameters • Computed MPE to get the best set of plans • Systems compared • BALPs • MLN-HCAM [Singla and Mooney, 2011] • Best performing MLN model • ACCEL-Simplicity [Ng and Mooney, 1992] • ACCEL-Coherence [Ng and Mooney, 1992] • Specific to Story Understanding

  38. Results on Story Understanding: chart comparing BALPs, MLN-HCAM, ACCEL-Simplicity, and ACCEL-Coherence (* marks differences that are statistically significant with respect to BALPs)

  39. Conclusion • BALPs: an extension of BLPs for plan recognition that employs logical abduction to construct Bayesian networks • Automatic learning of model parameters using EM • Empirical results on all benchmark datasets demonstrate advantages over existing methods

  40. Future Work • Learn the abductive knowledge base automatically from data • Compare BALPs with other probabilistic logics like ProbLog [De Raedt et al., 2007], PRISM [Sato, 1995], and Poole’s Horn Abduction [Poole, 1993] on plan recognition

  41. Questions

  42. Backup

  43. Completeness in First-order Logic • Completeness: if a sentence is entailed by a KB, then it is possible to find a proof of it • Entailment in first-order logic is semidecidable, i.e., if a sentence is not entailed, proof search may never terminate • Resolution is refutation-complete in first-order logic • If a set of sentences is unsatisfiable, then it is possible to derive a contradiction

  44. First-order Logic • Terms • Constants – individual entities like anna, bob • Variables – placeholders for objects like X, Y • Predicates • Relations over entities like worksFor, capitalOf • Literal – predicate or its negation applied to terms • Atom – Positive literal like worksFor(X,Y) • Ground literal – literal with no variables like worksFor(anna,bob) • Clause – disjunction of literals • Horn clause has at most one positive literal • Definite clause has exactly one positive literal

  45. First-order Logic • Quantifiers • Universal quantification: true for all objects in the domain • Existential quantification: true for some objects in the domain • Logical Inference • Forward Chaining: for every implication p ⇒ q, if p is true, then q is concluded to be true • Backward Chaining: for a query literal q, if an implication p ⇒ q is present and p is known to be true, then q is concluded to be true; otherwise backward chaining tries to prove p

  46. Forward chaining • For every implication p ⇒ q, if p is true, then q is concluded to be true • Results in addition of a new fact to KB • Efficient, but incomplete • Inference can explode and forward chaining may never terminate • Addition of new facts might result in rules being satisfied • It is data-driven, not goal-driven • Might result in irrelevant conclusions
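A minimal propositional sketch of the forward-chaining fixed-point loop (the rules and facts are hypothetical, not from the slides):

```python
# Definite clauses as (body set, head): repeatedly add the head of any
# rule whose body is fully contained in the known facts, until nothing
# new can be derived.
rules = [({"cp", "rm"}, "file_moved"),
         ({"file_moved"}, "task_done")]
facts = {"cp", "rm"}

changed = True
while changed:
    changed = False
    for body, head in rules:
        if body <= facts and head not in facts:
            facts.add(head)
            changed = True

print(facts)  # derives file_moved and then task_done
```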

  47. Backward chaining • For a query literal q, if an implication p ⇒ q is present and p is true, then q is concluded to be true; otherwise backward chaining tries to prove p • Efficient, but not complete • May never terminate; might get stuck in an infinite loop • Exponential in the worst case • Goal-driven

  48. Herbrand Model Semantics • Herbrand universe • All constants in the domain • Herbrand base • All ground atoms over the Herbrand universe • Herbrand interpretation • A set of ground atoms from the Herbrand base that are true • Herbrand model • Herbrand interpretation that satisfies all clauses in the knowledge base
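A small sketch of these definitions, reusing the constants and predicates from the earlier first-order logic slide (the arities assigned here are assumptions):

```python
from itertools import product

# Herbrand universe: the constants; plus the predicates with their arities.
constants = ["anna", "bob"]
predicates = {"worksFor": 2, "capitalOf": 2}

# Herbrand base: every ground atom formed over the universe.
herbrand_base = [
    f"{p}({','.join(args)})"
    for p, arity in predicates.items()
    for args in product(constants, repeat=arity)
]
print(herbrand_base)
# ['worksFor(anna,anna)', 'worksFor(anna,bob)', ..., 'capitalOf(bob,bob)']

# A Herbrand interpretation is any subset of the base taken to be true.
interpretation = {"worksFor(anna,bob)"}
```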

  49. Advantages of SRL models over vanilla probabilistic models • Compactly represent domain knowledge in first-order logic • Employ logical inference to construct ground networks • Enables parameter sharing

  50. Parameter sharing in SRL: the ground instances of the clause parent(X) | father(X) for john, mary, and alice each get a CPT parameter θ1, θ2, θ3, and parameter sharing ties them to a single clause-level value (θ1 = θ2 = θ3)
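A sketch of what that tying means in practice (the clause, the value of θ, and the false-body probability are hypothetical): every grounding's CPT is built from the one clause-level parameter, so evidence about any grounding updates the same θ.

```python
# One shared clause-level parameter for parent(X) | father(X).
theta = 0.85
people = ["john", "mary", "alice"]

# Each ground CPT row P(parent(x)=True | father(x)) uses the shared theta.
cpts = {f"parent({x})": {(True,): theta, (False,): 0.1} for x in people}

print(cpts["parent(john)"] == cpts["parent(mary)"])  # True: identical CPTs
```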
