
CS188: Computational Models of Human Behavior


Presentation Transcript


  1. CS188: Computational Models of Human Behavior Introduction to graphical models. Slide credits: Kevin Murphy, Mark Paskin, Zoubin Ghahramani, and Jeff Bilmes

  2. Reasoning under uncertainty • In many settings, we need to understand what is going on in a system when we have imperfect or incomplete information • For example, we might deploy a burglar alarm to detect intruders • But the sensor could be triggered by other events, e.g., an earthquake • Probabilities quantify the uncertainty regarding the occurrence of such events

  3. Probability spaces • A probability space represents our uncertainty regarding an experiment • It has two parts: • a sample space Ω, which is the set of possible outcomes • a probability measure P, which is a real-valued function on the subsets of Ω • A set of outcomes A ⊆ Ω is called an event. P(A) represents how likely it is that the experiment's actual outcome is a member of A

  4. An example • If our experiment is to deploy a burglar alarm and see if it works, then there are four possible outcomes: Ω = {(alarm, intruder), (no alarm, intruder), (alarm, no intruder), (no alarm, no intruder)} • Our choice of P has to obey these simple rules …

  5. The three axioms of probability theory • P(A) ≥ 0 for all events A • P(Ω) = 1 • P(A ∪ B) = P(A) + P(B) for disjoint events A and B

  6. Some consequences of the axioms
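Only the slide title survives in the transcript; the standard consequences such a slide derives from the three axioms include: P(∅) = 0; P(Aᶜ) = 1 − P(A); P(A) ≤ 1; if A ⊆ B then P(A) ≤ P(B); and P(A ∪ B) = P(A) + P(B) − P(A ∩ B).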

  7. Example • Let’s assign a probability to each outcome ω • These probabilities must be non-negative and sum to one
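As a concrete companion to this slide, here is a minimal sketch in Python; the specific outcome probabilities are invented for illustration, since the slide's table is not in the transcript:

```python
# Hypothetical probabilities for the four burglar-alarm outcomes.
# The specific numbers are illustrative, not from the slides.
P = {
    ("alarm", "intruder"):       0.019,
    ("no alarm", "intruder"):    0.001,
    ("alarm", "no intruder"):    0.030,
    ("no alarm", "no intruder"): 0.950,
}

# Check the axioms: non-negative and summing to one.
assert all(p >= 0 for p in P.values())
assert abs(sum(P.values()) - 1.0) < 1e-12

# The probability of an event is the sum over its outcomes,
# e.g. the event "the alarm goes off":
alarm = {o for o in P if o[0] == "alarm"}
print(sum(P[o] for o in alarm))  # P(alarm) = 0.049
```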

  8. Conditional Probability
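The defining formula (standard, stated here since only the slide title survives the transcript): for events A and B with P(B) > 0, P(A|B) = P(A ∩ B) / P(B). The product rule and Bayes' rule on the following slides both follow from this definition.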

  9. Marginal probability • The marginal probability P(A) is the unconditional probability of the event A; that is, the probability of A regardless of whether event B did or did not occur • For example, if there are two possible outcomes corresponding to events B and B′, then P(A) = P(A ∩ B) + P(A ∩ B′) • This is called marginalization
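In code, marginalization is just a sum over the variable being eliminated; a sketch reusing the same hypothetical numbers:

```python
# Illustrative outcome probabilities (same hypothetical numbers as above).
P = {
    ("alarm", "intruder"):       0.019,
    ("no alarm", "intruder"):    0.001,
    ("alarm", "no intruder"):    0.030,
    ("no alarm", "no intruder"): 0.950,
}

# P(intruder) = P(alarm, intruder) + P(no alarm, intruder):
# sum out the alarm variable.
p_intruder = sum(p for (a, b), p in P.items() if b == "intruder")
print(p_intruder)  # 0.020
```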

  10. Example • If P is defined by the table of outcome probabilities on the slide, then the conditional probability that there is an intruder given that the alarm went off is P({(intruder, alarm)} | {(intruder, alarm), (no intruder, alarm)})

  11. The product rule • The probability that A and B both happen is the probability that A happens, times the probability that B happens given that A has occurred: P(A ∩ B) = P(A) P(B|A)

  12. The chain rule • Applying the product rule repeatedly: P(A1, A2, …, Ak) = P(A1) P(A2|A1) P(A3|A2, A1) … P(Ak|Ak−1, …, A1) • where P(A3|A2, A1) means P(A3 | A2 ∩ A1)

  13. Bayes’ rule • Use the product rule both ways with P(A ∩ B): • P(A ∩ B) = P(A) P(B|A) • P(A ∩ B) = P(B) P(A|B) • Equating the two gives Bayes’ rule: P(A|B) = P(A) P(B|A) / P(B)
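A quick numeric check of the rule, again with the hypothetical alarm numbers:

```python
# Same hypothetical joint as before.
P = {
    ("alarm", "intruder"):       0.019,
    ("no alarm", "intruder"):    0.001,
    ("alarm", "no intruder"):    0.030,
    ("no alarm", "no intruder"): 0.950,
}

p_intruder = 0.019 + 0.001                   # P(A): marginal of intruder
p_alarm = 0.019 + 0.030                      # P(B): marginal of alarm
p_alarm_given_intruder = 0.019 / p_intruder  # P(B|A)

# Bayes' rule: P(intruder | alarm) = P(intruder) P(alarm | intruder) / P(alarm)
posterior = p_intruder * p_alarm_given_intruder / p_alarm
print(posterior)  # ≈ 0.388

# Sanity check against the direct definition P(A|B) = P(A ∩ B) / P(B):
assert abs(posterior - 0.019 / p_alarm) < 1e-12
```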

  14. Random variables and densities
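Only the title survives here; in the notation the next slide uses, a random variable X assigns a value X(ω) to every outcome ω ∈ Ω, and its density (a probability mass function, in the discrete case) is p_X(x) = P({ω : X(ω) = x}); conditional densities such as p_{I|A} are defined analogously via conditional probability.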

  15. Inference • One of the central problems of computational probability theory • Many problems can be formulated in these terms. Example: • the probability that there is an intruder given that the alarm went off is p_{I|A}(true, true) • Inference requires manipulating densities

  16. Probabilistic graphical models • Combination of graph theory and probability theory • The graph structure specifies which parts of the system are directly dependent • Local functions at each node specify how the different parts interact • Bayesian networks = probabilistic graphical models based on directed acyclic graphs • Markov networks = probabilistic graphical models based on undirected graphs

  17. Some broad questions

  18. Bayesian Networks • Nodes are random variables • Edges represent dependence (no directed cycles allowed) • P(X1:N) = P(X1) P(X2|X1) P(X3|X1, X2) … = ∏i P(Xi | X1:i−1) = ∏i P(Xi | Xπi), where πi denotes the parents of node i [Figure: example DAG over nodes x1, …, x7]

  19. Example • Water sprinkler Bayes net P(C,S,R,W) = P(C) P(S|C) P(R|C,S) P(W|C,S,R) (chain rule) = P(C) P(S|C) P(R|C) P(W|C,S,R) (since R ⊥ S | C) = P(C) P(S|C) P(R|C) P(W|S,R) (since W ⊥ C | R,S)
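A sketch of this network in Python. The CPT values below are the ones commonly used in Kevin Murphy's version of the sprinkler example; the slide's own tables are not in the transcript:

```python
import itertools

# CPTs for the water-sprinkler network (Cloudy, Sprinkler, Rain, WetGrass).
# Values follow Kevin Murphy's classic example; the slide's own tables
# did not survive the transcript.
p_c = {True: 0.5, False: 0.5}
p_s_given_c = {True: 0.1, False: 0.5}            # P(S=true | C)
p_r_given_c = {True: 0.8, False: 0.2}            # P(R=true | C)
p_w_given_sr = {(True, True): 0.99, (True, False): 0.9,
                (False, True): 0.9, (False, False): 0.0}  # P(W=true | S, R)

def bernoulli(p_true, value):
    return p_true if value else 1.0 - p_true

def joint(c, s, r, w):
    # P(C,S,R,W) = P(C) P(S|C) P(R|C) P(W|S,R), the factored form above
    return (p_c[c] * bernoulli(p_s_given_c[c], s)
            * bernoulli(p_r_given_c[c], r)
            * bernoulli(p_w_given_sr[(s, r)], w))

# Marginal P(W = true) by summing out C, S, R:
p_w = sum(joint(c, s, r, True)
          for c, s, r in itertools.product([True, False], repeat=3))
print(p_w)  # ≈ 0.6471
```

Summing over all assignments like this is exactly the naïve inference the later slides warn about; it is feasible here only because the network is tiny.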

  20. Inference

  21. Naïve inference

  22. Problems with a naïve representation of the joint probability • Problems with working directly with the joint: • Representation: a big table of numbers is hard to understand • Inference: computing a marginal P(Xi) takes O(2^N) time • Learning: there are O(2^N) parameters to estimate • Graphical models address these problems by providing a structured representation for the joint • Graphs encode conditional independence properties and represent families of probability distributions that satisfy those properties
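A quick way to see the savings in parameters (a sketch; the parent sets are those of the sprinkler network above):

```python
# Parameters needed for N binary variables.
# Full joint table: 2^N - 1 free parameters.
# Factored (Bayes net): each node X needs 2^{|parents(X)|} parameters,
# one P(X=true | parent setting) per parent configuration.

parents = {"C": [], "S": ["C"], "R": ["C"], "W": ["S", "R"]}  # sprinkler net

n = len(parents)
full_joint = 2 ** n - 1
factored = sum(2 ** len(pa) for pa in parents.values())
print(full_joint, factored)  # 15 vs 9 here; the gap grows exponentially in N
```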

  23. Bayesian networks provide a compact representation of the joint probability

  24. Conditional probabilities

  25. Another example: medical diagnosis (classification)

  26. Approach: build a Bayes’ net and use Bayes’ rule to get the class probability

  27. A very simple Bayes’ net: Naïve Bayes

  28. Naïve Bayes classifier for medical diagnosis
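A minimal Naïve Bayes sketch for a binary diagnosis task; the class prior, symptom set, and likelihoods are all invented for illustration:

```python
# Naive Bayes: P(class | symptoms) ∝ P(class) * ∏_i P(symptom_i | class).
# All numbers here are hypothetical, purely for illustration.

prior = {"disease": 0.01, "healthy": 0.99}

# P(symptom present | class) for each of three symptoms.
likelihood = {
    "disease": {"fever": 0.9, "cough": 0.8, "rash": 0.3},
    "healthy": {"fever": 0.1, "cough": 0.2, "rash": 0.05},
}

def posterior(symptoms):
    """symptoms: dict symptom -> bool (present/absent)."""
    scores = {}
    for c in prior:
        score = prior[c]
        for s, present in symptoms.items():
            p = likelihood[c][s]
            score *= p if present else (1.0 - p)
        scores[c] = score
    z = sum(scores.values())                  # normalize over classes
    return {c: v / z for c, v in scores.items()}

print(posterior({"fever": True, "cough": True, "rash": False}))
# -> disease ≈ 0.21, healthy ≈ 0.79 with these made-up numbers
```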

  29. Another commonly used Bayes’ net: Hidden Markov Model (HMM)
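Only the title survives here; as a concrete companion, a minimal forward-algorithm sketch for a discrete HMM, with made-up parameters. It computes the likelihood P(x_{1:T}) by recursively summing out the hidden states:

```python
# Forward algorithm: P(observations) for a discrete HMM.
# Model: P(x_{1:T}, z_{1:T}) = P(z_1) ∏_t P(z_t|z_{t-1}) ∏_t P(x_t|z_t).
# All parameters below are made up for illustration.

states = [0, 1]
pi = [0.6, 0.4]                      # initial distribution P(z_1)
A = [[0.7, 0.3],                     # transition matrix P(z_t | z_{t-1})
     [0.4, 0.6]]
B = [[0.9, 0.1],                     # emission P(x_t | z_t), 2 symbols
     [0.2, 0.8]]

def forward(obs):
    # alpha[j] = P(x_1..x_t, z_t = j), updated left to right
    alpha = [pi[j] * B[j][obs[0]] for j in states]
    for x in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in states) * B[j][x]
                 for j in states]
    return sum(alpha)                # P(x_1..x_T), summing out z_T

print(forward([0, 1, 0]))  # ≈ 0.109 with these parameters
```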

  30. Conditional independence properties of Bayesian networks: chains

  31. Conditional independence properties of Bayesian networks: common cause

  32. Conditional independence properties of Bayesian networks: explaining away
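Explaining away (a v-structure X → Y ← Z: X and Z are marginally independent but become dependent once Y is observed) can be checked numerically on the sprinkler network of slide 19, again with the commonly used CPT values: observing rain lowers the probability that the sprinkler is on, given wet grass.

```python
import itertools

# Sprinkler CPTs as before (the commonly cited values).
p_c = {True: 0.5, False: 0.5}
p_s_given_c = {True: 0.1, False: 0.5}
p_r_given_c = {True: 0.8, False: 0.2}
p_w_given_sr = {(True, True): 0.99, (True, False): 0.9,
                (False, True): 0.9, (False, False): 0.0}

def bern(p, v):
    return p if v else 1.0 - p

def joint(c, s, r, w):
    return (p_c[c] * bern(p_s_given_c[c], s) * bern(p_r_given_c[c], r)
            * bern(p_w_given_sr[(s, r)], w))

def prob(**evidence):
    # Sum the joint over all assignments consistent with the evidence.
    total = 0.0
    for c, s, r, w in itertools.product([True, False], repeat=4):
        assign = {"c": c, "s": s, "r": r, "w": w}
        if all(assign[k] == v for k, v in evidence.items()):
            total += joint(c, s, r, w)
    return total

# P(S=true | W=true) vs P(S=true | W=true, R=true):
print(prob(s=True, w=True) / prob(w=True))                   # ≈ 0.43
print(prob(s=True, w=True, r=True) / prob(w=True, r=True))   # ≈ 0.19
```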

  33. Global Markov properties of DAGs

  34. Bayes ball algorithm

  35. Example

  36. Undirected graphical models

  37. Parameterization

  38. Clique potentials

  39. Interpretation of clique potentials

  40. Examples

  41. Joint distribution of an undirected graphical model • Complexity scales exponentially as 2^n for n binary random variables if we use a naïve approach to computing the partition function
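A brute-force sketch of the partition function for a small binary pairwise model; the sum ranges over all 2^n joint assignments, which is exactly the exponential cost this slide mentions. The graph and potential values are invented for illustration:

```python
import itertools

# Undirected model: p(x) = (1/Z) ∏_{(i,j) in edges} psi(x_i, x_j).
# Toy 3-node chain with an illustrative pairwise potential
# favoring agreement between neighbors.
edges = [(0, 1), (1, 2)]
psi = {(0, 0): 2.0, (1, 1): 2.0, (0, 1): 1.0, (1, 0): 1.0}

n = 3
def unnormalized(x):
    score = 1.0
    for i, j in edges:
        score *= psi[(x[i], x[j])]
    return score

# Partition function: sum over all 2^n joint assignments (exponential!).
Z = sum(unnormalized(x) for x in itertools.product([0, 1], repeat=n))
print(Z)  # 18.0 for this toy model

# Normalized probability of one configuration:
print(unnormalized((0, 0, 0)) / Z)  # 4/18 ≈ 0.222
```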

  42. Max clique vs. sub-clique

  43. Log-linear models

  44. Log-linear models

  45. Log-linear models
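The formulas on these three slides did not survive the transcript; the standard log-linear form they refer to writes the joint as p(x) = (1/Z(θ)) exp(Σ_k θ_k f_k(x)), with Z(θ) = Σ_x exp(Σ_k θ_k f_k(x)). Choosing the features f_k to be indicators of clique configurations recovers the clique-potential parameterization, with ψ_C(x_C) = exp(Σ_k θ_k f_k(x_C)).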

  46. Summary

  47. Summary

  48. From directed to undirected graphs

  49. From directed to undirected graphs

  50. Example of moralization
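A sketch of moralization in code: "marry" each node's parents by connecting them pairwise, then drop all edge directions. The dict-based graph representation is my own choice for the example:

```python
from itertools import combinations

def moralize(parents):
    """parents: dict node -> list of parent nodes (a DAG).
    Returns the moral graph as dict node -> set of undirected neighbors."""
    nodes = set(parents) | {p for ps in parents.values() for p in ps}
    und = {v: set() for v in nodes}

    for child, ps in parents.items():
        # Drop directions: connect each node to its parents.
        for p in ps:
            und[child].add(p)
            und[p].add(child)
        # Marry the parents: connect every pair of co-parents.
        for a, b in combinations(ps, 2):
            und[a].add(b)
            und[b].add(a)
    return und

# The sprinkler DAG: S and R share the child W, so moralization adds S - R.
print(moralize({"C": [], "S": ["C"], "R": ["C"], "W": ["S", "R"]}))
```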
