
An introduction to Bayesian networks Stochastic Processes Course Hossein Amirkhani Spring 2011


Presentation Transcript


  1. An introduction to Bayesian networks Stochastic Processes Course Hossein Amirkhani Spring 2011

  2. Outline • Introduction, • Bayesian Networks, • Probabilistic Graphical Models, • Conditional Independence, • I-equivalence.

  3. Introduction • Our goal is to represent a joint distribution $P$ over some set of random variables $\mathcal{X} = \{X_1, \dots, X_n\}$. • Even in the simplest case where these variables are binary-valued, a joint distribution requires the specification of $2^n - 1$ numbers. • The explicit representation of the joint distribution is unmanageable from every perspective: computationally, cognitively, and statistically.
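As a rough illustration of why the explicit table is unmanageable: a joint table over n binary variables has 2^n - 1 free entries (the last one is fixed because the entries sum to one). The sketch below compares that count with an upper bound for a network in which every node has at most k parents. The function names and the bound k are assumptions made for this example, not part of the slides.

```python
def joint_table_params(n: int) -> int:
    """Free parameters in an explicit joint table over n binary variables."""
    return 2 ** n - 1


def bn_params_upper_bound(n: int, k: int) -> int:
    """Upper bound on free parameters when each of n binary nodes has at most k parents."""
    # Each node needs one number per joint assignment of its (at most k) parents.
    return n * 2 ** k


# Print n, the full-table count, and the bounded-parent count side by side.
for n in (5, 10, 20, 30):
    print(n, joint_table_params(n), bn_params_upper_bound(n, k=3))
```

The first count grows exponentially in n, while the second grows linearly in n for fixed k.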

  4. Bayesian Networks • Bayesian networks exploit conditional independence properties of the distribution in order to allow a compact and natural representation. • They are a specific type of probabilistic graphical model. • The graph underlying a BN is a directed acyclic graph (DAG).

  5. Probabilistic Graphical Models • Nodes are the random variables in our domain. • Edges correspond, intuitively, to direct influence of one node on another.

  6. Probabilistic Graphical Models • Graphs are an intuitive way of representing and visualising the relationships between many variables. • A graph allows us to abstract out the conditional independence relationships between the variables from the details of their parametric forms. • Thus we can answer questions like: “Is A dependent on B given that we know the value of C ?” just by looking at the graph. • Graphical models allow us to define general message-passing algorithms that implement probabilistic inference efficiently. Graphical models = statistics × graph theory × computer science.

  7. Bayesian Networks

  8. Bayesian Networks

  9. Conditional Independence: Example 1 tail-to-tail at c

  10. Conditional Independence: Example 1

  11. Conditional Independence: Example 1 (figure: nodes Smoking, Lung Cancer, Yellow Teeth)
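A small numeric sketch of the tail-to-tail case may help. It assumes that Smoking plays the role of the common parent c, with Lung Cancer as a and Yellow Teeth as b; the probabilities are made up for illustration and do not come from the slides.

```python
from itertools import product

# Illustrative (made-up) numbers; c = smoking, a = lung cancer, b = yellow teeth.
p_c = {1: 0.3, 0: 0.7}
p_a_given_c = {1: 0.2, 0: 0.01}   # P(a=1 | c)
p_b_given_c = {1: 0.6, 0: 0.1}    # P(b=1 | c)


def bern(p_one, value):
    """Probability of `value` for a binary variable with P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


# Joint from the tail-to-tail factorization p(a, b, c) = p(c) p(a|c) p(b|c).
joint = {
    (a, b, c): p_c[c] * bern(p_a_given_c[c], a) * bern(p_b_given_c[c], b)
    for a, b, c in product((0, 1), repeat=3)
}


def prob(**fixed):
    """Marginal probability of a partial assignment given as keywords a, b, c."""
    return sum(
        p for (a, b, c), p in joint.items()
        if all({"a": a, "b": b, "c": c}[k] == v for k, v in fixed.items())
    )


# Marginally, a and b are dependent: p(a, b) differs from p(a) p(b).
print(prob(a=1, b=1), prob(a=1) * prob(b=1))

# Given c, they are independent: p(a, b | c) equals p(a | c) p(b | c).
for c in (0, 1):
    lhs = prob(a=1, b=1, c=c) / prob(c=c)
    rhs = (prob(a=1, c=c) / prob(c=c)) * (prob(b=1, c=c) / prob(c=c))
    print(c, lhs, rhs)
```

Observing c blocks the path between a and b, which is why the two conditional quantities agree while the marginal ones do not.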

  12. Conditional Independence: Example 2 head-to-tail at c

  13. Conditional Independence: Example 2

  14. Conditional Independence: Example 2 (figure: nodes Type of Car, Speed, Amount of Speeding Fine)

  15. Conditional Independence: Example 3 head-to-head at c v-structure

  16. Conditional Independence: Example 3

  17. Conditional Independence: Example 3 (figure: nodes Ability of team A, Ability of team B, Outcome of A vs. B game)
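The head-to-head case behaves in the opposite way, and a numeric sketch shows it. The example assumes the game outcome is the collider c with the two team abilities a and b as its parents; all probabilities are made up for illustration.

```python
from itertools import product

# Illustrative (made-up) numbers: a = team A is strong, b = team B is strong,
# c = team A wins the game.
p_a = 0.5
p_b = 0.5
p_c_given_ab = {(1, 1): 0.5, (1, 0): 0.9, (0, 1): 0.1, (0, 0): 0.5}  # P(c=1 | a, b)


def bern(p_one, value):
    """Probability of `value` for a binary variable with P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


# Joint from the head-to-head factorization p(a, b, c) = p(a) p(b) p(c|a, b).
joint = {
    (a, b, c): bern(p_a, a) * bern(p_b, b) * bern(p_c_given_ab[(a, b)], c)
    for a, b, c in product((0, 1), repeat=3)
}


def prob(**fixed):
    """Marginal probability of a partial assignment given as keywords a, b, c."""
    return sum(
        p for (a, b, c), p in joint.items()
        if all({"a": a, "b": b, "c": c}[k] == v for k, v in fixed.items())
    )


# With c unobserved, the abilities are independent: p(a, b) equals p(a) p(b).
print(prob(a=1, b=1), prob(a=1) * prob(b=1))

# Once the outcome c is observed, they become dependent ("explaining away").
lhs = prob(a=1, b=1, c=1) / prob(c=1)
rhs = (prob(a=1, c=1) / prob(c=1)) * (prob(b=1, c=1) / prob(c=1))
print(lhs, rhs)
```

Here an unobserved collider blocks the path, and observing it (or, more generally, one of its descendants) unblocks it.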

  18. D-separation • A, B, and C are non-intersecting subsets of nodes in a directed acyclic graph. • A path from A to B is blocked if it contains a node such that either • the arrows on the path meet head-to-tail or tail-to-tail at the node, and the node is in the set C, or • the arrows meet head-to-head at the node, and neither the node nor any of its descendants is in the set C. • If all paths from A to B are blocked, A is said to be d-separated from B by C. • If A is d-separated from B by C, the joint distribution over all variables in the graph satisfies $A \perp B \mid C$.
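One way to test d-separation mechanically is sketched below. Rather than enumerating paths as in the definition, it uses the equivalent ancestral moral graph criterion: keep only A, B, C and their ancestors, moralize, and check whether C separates A from B in the resulting undirected graph. The representation (a dict mapping each node to its set of parents) and the function name are assumptions made for this example.

```python
from collections import deque


def d_separated(parents, A, B, C):
    """Return True if node sets A and B are d-separated by C in the DAG.

    `parents` maps each node to the set of its parents; A, B, C are disjoint.
    """
    A, B, C = set(A), set(B), set(C)

    # 1. Keep only A, B, C and their ancestors.
    keep, stack = set(), list(A | B | C)
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents.get(node, ()))

    # 2. Moralize: link each node to its parents and connect co-parents.
    nbrs = {v: set() for v in keep}
    for child in keep:
        ps = list(parents.get(child, ()))
        for p in ps:
            nbrs[child].add(p)
            nbrs[p].add(child)
        for i, p in enumerate(ps):
            for q in ps[i + 1:]:
                nbrs[p].add(q)
                nbrs[q].add(p)

    # 3. Breadth-first search from A that never enters C.
    seen, queue = set(A), deque(A)
    while queue:
        node = queue.popleft()
        if node in B:
            return False
        for nxt in nbrs[node] - C - seen:
            seen.add(nxt)
            queue.append(nxt)
    return True


# The three canonical three-node structures, plus a descendant d of the collider.
tail = {"a": {"c"}, "b": {"c"}, "c": set()}
chain = {"a": set(), "c": {"a"}, "b": {"c"}}
collider = {"a": set(), "b": set(), "c": {"a", "b"}, "d": {"c"}}

print(d_separated(tail, {"a"}, {"b"}, {"c"}))
print(d_separated(chain, {"a"}, {"b"}, set()))
print(d_separated(collider, {"a"}, {"b"}, {"d"}))
```

The last query conditions on a descendant of the head-to-head node, which is the case covered by the second blocking condition.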

  19. I-equivalence • Let $P$ be a distribution over $\mathcal{X}$. We define $I(P)$ to be the set of independence assertions of the form $(X \perp Y \mid Z)$ that hold in $P$. • Two graph structures $K_1$ and $K_2$ over $\mathcal{X}$ are I-equivalent if $I(K_1) = I(K_2)$. • The set of all graphs over $\mathcal{X}$ is partitioned into a set of mutually exclusive and exhaustive I-equivalence classes.

  20. The skeleton of a Bayesian network • The skeleton of a Bayesian network graph $G$ over $\mathcal{X}$ is an undirected graph over $\mathcal{X}$ that contains an edge $\{X, Y\}$ for every edge $(X, Y)$ in $G$.

  21. Immorality • A v-structure $X \rightarrow Z \leftarrow Y$ is an immorality if there is no direct edge between X and Y.

  22. Relationship between immorality, skeleton and I-equivalence • Let $G_1$ and $G_2$ be two graphs over $\mathcal{X}$. Then $G_1$ and $G_2$ have the same skeleton and the same set of immoralities if and only if they are I-equivalent. • We can use this theorem to determine whether two BNs are I-equivalent. • In addition, this theorem can be used for learning the structure of the Bayesian network related to a distribution. • We can construct the I-equivalence class for a distribution by determining its skeleton and its immoralities from the independence properties of the given distribution. • We then use both of these components to build a representation of the equivalence class.
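The theorem gives a purely graphical test for I-equivalence, sketched below. The DAG representation (a dict mapping each node to its set of parents, with both graphs defined over the same nodes) and the function names are assumptions made for this example.

```python
from itertools import combinations


def skeleton(parents):
    """Undirected edges {X, Y}, one for every directed edge in the DAG."""
    return {frozenset((p, child)) for child, ps in parents.items() for p in ps}


def immoralities(parents):
    """V-structures X -> Z <- Y whose endpoints X and Y are not adjacent."""
    skel = skeleton(parents)
    return {
        (frozenset((x, y)), z)
        for z, ps in parents.items()
        for x, y in combinations(sorted(ps), 2)
        if frozenset((x, y)) not in skel
    }


def i_equivalent(g1, g2):
    """Same skeleton and same immoralities, per the theorem on the slide."""
    return skeleton(g1) == skeleton(g2) and immoralities(g1) == immoralities(g2)


# Four graphs over X, Y, Z that share the skeleton X - Z - Y.
chain_fwd = {"X": set(), "Z": {"X"}, "Y": {"Z"}}      # X -> Z -> Y
chain_bwd = {"Y": set(), "Z": {"Y"}, "X": {"Z"}}      # X <- Z <- Y
fork = {"Z": set(), "X": {"Z"}, "Y": {"Z"}}           # X <- Z -> Y
collider = {"X": set(), "Y": set(), "Z": {"X", "Y"}}  # X -> Z <- Y

print(i_equivalent(chain_fwd, chain_bwd))
print(i_equivalent(chain_fwd, fork))
print(i_equivalent(chain_fwd, collider))
```

The two chains and the fork fall into one equivalence class, while the collider sits in a class of its own because it is the only one with an immorality.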

  23. Identifying the Undirected Skeleton • The basic idea is to use independence queries of the form $(X \perp Y \mid U)$ for different sets of variables $U$. • If $X$ and $Y$ are adjacent in $G$, we cannot separate them with any set of variables. • Conversely, if $X$ and $Y$ are not adjacent in $G$, we would hope to be able to find a set of variables that makes these two variables conditionally independent: we call this set a witness of their independence.

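  24. Identifying the Undirected Skeleton • Let $G$ be an I-map of a distribution $P$, and let $X$ and $Y$ be two variables that are not adjacent in $G$. Then either $P \models (X \perp Y \mid \mathrm{Pa}_X)$ or $P \models (X \perp Y \mid \mathrm{Pa}_Y)$, where $\mathrm{Pa}_X$ denotes the parents of $X$ in $G$. • Thus, if $X$ and $Y$ are not adjacent in $G$, then we can find a witness of bounded size. • In particular, if we assume that $G$ has bounded indegree, say less than or equal to d, then we do not need to consider witness sets larger than d.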
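The skeleton search can be sketched as a loop over variable pairs that tries every candidate witness set of size at most d. The sketch assumes access to an independence oracle `indep(x, y, U)`; in practice this would be a statistical test on data, which is outside the scope of the slides. The function name, the oracle, and the toy chain example are all hypothetical.

```python
from itertools import combinations


def build_skeleton(variables, indep, d):
    """Recover skeleton edges and witness sets from independence queries.

    indep(x, y, U) is an assumed oracle: True if x is independent of y given U.
    d is the assumed bound on the indegree, hence on the witness size.
    """
    edges, witnesses = set(), {}
    for x, y in combinations(variables, 2):
        others = [v for v in variables if v not in (x, y)]
        found = None
        # Try candidate witness sets in order of increasing size.
        for size in range(d + 1):
            for U in combinations(others, size):
                if indep(x, y, set(U)):
                    found = set(U)
                    break
            if found is not None:
                break
        if found is None:
            edges.add(frozenset((x, y)))          # no witness: keep the edge
        else:
            witnesses[frozenset((x, y))] = found  # remember why it was dropped
    return edges, witnesses


def chain_oracle(x, y, U):
    """Toy oracle for the chain X -> Z -> Y: only X and Y separate, given Z."""
    return {x, y} == {"X", "Y"} and "Z" in U


edges, witnesses = build_skeleton(["X", "Y", "Z"], chain_oracle, d=1)
print(edges)
print(witnesses)
```

The number of queries per pair is bounded by the number of subsets of size at most d, which is what makes the bounded-indegree assumption useful.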

  25. Identifying Immoralities • At this stage we have reconstructed the undirected skeleton. Now, we want to reconstruct edge direction. • Our goal is to consider potential immoralities in the skeleton and for each one determine whether it is indeed an immorality. • A triplet of variables X, Z, Y is a potential immorality if the skeleton contains $X - Z - Y$ but does not contain an edge between X and Y. • A potential immorality is an immorality if and only if Z is not in the witness set(s) for X and Y.
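The immorality test can be sketched as a pass over all triplets X - Z - Y in the skeleton. The sketch assumes the skeleton is given as a set of two-element frozensets and that one witness set has been recorded for every non-adjacent pair; the function name and data layout are hypothetical.

```python
from itertools import combinations


def mark_immoralities(edges, witnesses):
    """Return the directed edges (parent, child) forced by immoralities.

    edges: set of frozenset({X, Y}) skeleton edges.
    witnesses: dict mapping frozenset({X, Y}) of non-adjacent pairs to a witness set.
    """
    # Build adjacency lists from the undirected skeleton.
    nbrs = {}
    for e in edges:
        u, v = tuple(e)
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)

    directed = set()
    for z, ns in nbrs.items():
        for x, y in combinations(sorted(ns), 2):
            pair = frozenset((x, y))
            if pair in edges:
                continue  # X and Y are adjacent: not a potential immorality
            # Immorality iff Z is absent from the witness set for X and Y.
            if pair in witnesses and z not in witnesses[pair]:
                directed.update({(x, z), (y, z)})
    return directed


skel = {frozenset(("X", "Z")), frozenset(("Y", "Z"))}

# X and Y are independent given the empty set: Z must be a collider.
print(mark_immoralities(skel, {frozenset(("X", "Y")): set()}))

# X and Y are independent only given Z: no immorality, edges stay undirected.
print(mark_immoralities(skel, {frozenset(("X", "Y")): {"Z"}}))
```

Edges that this step leaves undirected may still be orientable by further rules, which this sketch does not attempt.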

  26. Representing Equivalence Classes • An acyclic graph containing both directed and undirected edges is called a partially directed acyclic graph, or PDAG.

  27. Representing Equivalence Classes • Let $G$ be a DAG. A chain graph $K$ is a class PDAG of the equivalence class of $G$ if $K$ shares the same skeleton as $G$, and contains a directed edge $X \rightarrow Y$ if and only if all $G'$ that are I-equivalent to $G$ contain the edge $X \rightarrow Y$. • If the edge is directed, then all the members of the equivalence class agree on the orientation of the edge. • If the edge is undirected, there are two DAGs in the equivalence class that disagree on the orientation of the edge.

  28. Representing Equivalence Classes • Is the output of Mark-Immoralities the class PDAG? • Clearly, edges involved in immoralities must be directed in K. • The obvious question is whether K can contain directed edges that are not involved in immoralities. • In other words, can there be additional edges whose direction is necessarily the same in every member of the equivalence class?

  29. Rules

  30. Example

  31. References • D. Koller and N. Friedman: Probabilistic Graphical Models. MIT Press, 2009. • C. M. Bishop: Pattern Recognition and Machine Learning. Springer, 2006.

  32. THANKS
