An introduction to Bayesian networks Stochastic Processes Course Hossein Amirkhani Spring 2011

Outline
• Introduction,
• Bayesian Networks,
• Probabilistic Graphical Models,
• Conditional Independence,
• I-equivalence.
Introduction
• Our goal is to represent a joint distribution over some set of random variables X = {X1, …, Xn}.
• Even in the simplest case where these variables are binary-valued, a joint distribution requires the specification of 2^n − 1 numbers.
• The explicit representation of the joint distribution is unmanageable from every perspective:
• Computationally, Cognitively, and Statistically.
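To make the blow-up concrete, here is a small counting sketch. The parameter counts assume binary variables, and the in-degree bound d used for the factorized representation is an illustrative assumption, not something fixed by the text:

```python
# Free parameters for an explicit joint over n binary variables:
# one probability per assignment, minus one for normalization.
def full_joint_params(n):
    return 2 ** n - 1

# A factorized (Bayesian-network-style) representation over binary variables
# with at most d parents per node stores one number per parent configuration.
def bn_params(n, d):
    return n * 2 ** d

print(full_joint_params(30))  # 1073741823
print(bn_params(30, 3))       # 240
```

With 30 binary variables the explicit table needs over a billion numbers, while a sparse factorization needs a few hundred.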
Bayesian Networks
• Bayesian networks exploit conditional independence properties of the distribution in order to allow a compact and natural representation.
• They are a specific type of probabilistic graphical model.
• BNs are directed acyclic graphs (DAGs).
Probabilistic Graphical Models
• Nodes are the random variables in our domain.
• Edges correspond, intuitively, to direct influence of one node on another.
Probabilistic Graphical Models
• Graphs are an intuitive way of representing and visualising the relationships between many variables.
• A graph allows us to abstract out the conditional independence relationships between the variables from the details of their parametric forms.
• Thus we can answer questions like: “Is A dependent on B given that we know the value of C?” just by looking at the graph.
• Graphical models allow us to define general message-passing algorithms that implement probabilistic inference efficiently.

Graphical models = statistics × graph theory × computer science.

Conditional Independence: Example 1
• Common cause: Smoking → Lung Cancer and Smoking → Yellow Teeth.
• Lung Cancer and Yellow Teeth are marginally dependent, but become independent once Smoking is observed.
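A numeric sketch of the common-cause pattern (the CPT values below are made up for illustration, not medical data):

```python
# Common-cause network: Smoking -> Lung Cancer, Smoking -> Yellow Teeth.
# All probabilities are illustrative.
p_s = 0.3                      # P(Smoking = 1)
p_c = {1: 0.4, 0: 0.02}        # P(Cancer = 1 | Smoking)
p_t = {1: 0.9, 0: 0.2}         # P(YellowTeeth = 1 | Smoking)

def joint(c, t):
    """P(Cancer = c, YellowTeeth = t), summing out Smoking."""
    total = 0.0
    for s in (0, 1):
        ps = p_s if s else 1 - p_s
        pc = p_c[s] if c else 1 - p_c[s]
        pt = p_t[s] if t else 1 - p_t[s]
        total += ps * pc * pt
    return total

pc1 = joint(1, 0) + joint(1, 1)   # marginal P(Cancer = 1)      = 0.134
pt1 = joint(0, 1) + joint(1, 1)   # marginal P(YellowTeeth = 1) = 0.41
print(joint(1, 1), pc1 * pt1)     # ~0.1108 vs ~0.0549: marginally dependent
# Conditioned on Smoking = 1 they factorize by construction:
print(p_c[1] * p_t[1])            # P(C=1, T=1 | S=1) = 0.36
```

Yellow teeth raise the probability of lung cancer only because both trace back to smoking; once Smoking is fixed, the coupling disappears (tail-to-tail blocking).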

Conditional Independence: Example 2
• Chain: Type of Car → Speed → Amount of Speeding Fine.
• The fine is independent of the type of car once the speed is observed.

Conditional Independence: Example 3
• v-structure: Ability of Team A → Outcome of A vs. B Game ← Ability of Team B.
• The two abilities are marginally independent, but become dependent once the outcome of the game is observed.
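The v-structure behaves in the opposite way to the other two examples, which a short computation makes vivid. The CPT below is hypothetical (abilities are binary, and a strong team beats a weak one 90% of the time):

```python
# v-structure: AbilityA -> Outcome <- AbilityB, abilities in {0, 1} (weak/strong).
# All probabilities are made up for illustration.
p_a = p_b = 0.5                      # prior: each team strong with prob 0.5
p_win = {(1, 0): 0.9, (0, 1): 0.1,   # P(A wins | abilityA, abilityB)
         (0, 0): 0.5, (1, 1): 0.5}

def p_joint(a, b, w):
    pa = p_a if a else 1 - p_a
    pb = p_b if b else 1 - p_b
    pw = p_win[(a, b)] if w else 1 - p_win[(a, b)]
    return pa * pb * pw

# Marginally the abilities are independent: P(A=1) = 0.5 whatever B is.
# After observing that A won, they become coupled:
p_a_given_win = sum(p_joint(1, b, 1) for b in (0, 1)) / \
                sum(p_joint(a, b, 1) for a in (0, 1) for b in (0, 1))
p_a_given_win_and_b = p_joint(1, 1, 1) / sum(p_joint(a, 1, 1) for a in (0, 1))
print(p_a_given_win)         # ~0.7
print(p_a_given_win_and_b)   # ~0.833: learning B was strong changed our belief
```

Given that A won, additionally learning that B was strong makes it even more likely that A is strong; with the outcome unobserved, B's ability would tell us nothing about A's.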

D-separation
• A, B, and C are non-intersecting subsets of nodes in a directed graph.
• A path from A to B is blocked if it contains a node such that either
• the arrows on the path meet either head-to-tail or tail-to-tail at the node, and the node is in the set C, or
• the arrows meet head-to-head at the node, and neither the node, nor any of its descendants, are in the set C.
• If all paths from A to B are blocked, A is said to be d-separated from B by C.
• If A is d-separated from B by C, the joint distribution over all variables in the graph satisfies (A ⊥ B | C).
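One standard way to test d-separation programmatically uses the moralized ancestral graph, an equivalent reformulation of the path-blocking definition above: A and B are d-separated by C iff they are disconnected, after removing C, in the moralized ancestral subgraph of A ∪ B ∪ C. A minimal sketch:

```python
from collections import defaultdict, deque

def d_separated(edges, A, B, C):
    """True iff node sets A and B are d-separated by C in the DAG given
    as a list of directed (parent, child) edges; A, B, C are disjoint sets."""
    parents = defaultdict(set)
    for u, v in edges:
        parents[v].add(u)
    # 1. Restrict to the ancestral set of A ∪ B ∪ C (including those nodes).
    anc, stack = set(), list(A | B | C)
    while stack:
        n = stack.pop()
        if n not in anc:
            anc.add(n)
            stack.extend(parents[n])
    # 2. Moralize: marry co-parents, then drop edge directions.
    und = defaultdict(set)
    for v in anc:
        ps = [p for p in parents[v] if p in anc]
        for p in ps:
            und[p].add(v); und[v].add(p)
        for i, p in enumerate(ps):
            for q in ps[i + 1:]:
                und[p].add(q); und[q].add(p)
    # 3. Remove C and test undirected reachability from A to B.
    seen, queue = set(A - C), deque(A - C)
    while queue:
        n = queue.popleft()
        if n in B:
            return False
        for m in und[n]:
            if m not in seen and m not in C:
                seen.add(m); queue.append(m)
    return True

chain = [("X", "Z"), ("Z", "Y")]
print(d_separated(chain, {"X"}, {"Y"}, {"Z"}))    # True: Z blocks head-to-tail
vstruct = [("A", "W"), ("B", "W")]
print(d_separated(vstruct, {"A"}, {"B"}, {"W"}))  # False: head-to-head, W observed
```

The two printed cases mirror Examples 2 and 3: observing the middle of a chain blocks the path, while observing the child of a v-structure activates it.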
I-equivalence
• Let P be a distribution over X. We define I(P) to be the set of independence assertions that hold in P.
• Two graph structures K1 and K2 over X are I-equivalent if I(K1) = I(K2).
• The set of all graphs over X is partitioned into a set of mutually exclusive and exhaustive I-equivalence classes.
The skeleton of a Bayesian network
• The skeleton of a Bayesian network graph G over X is an undirected graph over X that contains an edge {X, Y} for every edge (X, Y) in G.
Immorality
• A v-structure X → Z ← Y is an immorality if there is no direct edge between X and Y.
Relationship between immorality, skeleton and I-equivalence
• Let G1 and G2 be two graphs over X. Then G1 and G2 have the same skeleton and the same set of immoralities if and only if they are I-equivalent.
• We can use this theorem to recognize whether or not two BNs are I-equivalent.
• In addition, this theorem can be used for learning the structure of the Bayesian network corresponding to a distribution.
• We can construct the I-equivalence class for a distribution by determining its skeleton and its immoralities from the independence properties of the given distribution.
• We then use both of these components to build a representation of the equivalence class.
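The theorem above translates directly into a computational check. A small sketch, representing a DAG as a list of (parent, child) edges:

```python
from itertools import combinations

def skeleton(edges):
    """Undirected skeleton: one unordered pair per directed edge."""
    return {frozenset(e) for e in edges}

def immoralities(edges):
    """All triples (X, Y, Z) with X -> Z <- Y and no edge between X and Y."""
    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
    skel = skeleton(edges)
    found = set()
    for z, ps in parents.items():
        for x, y in combinations(sorted(ps), 2):
            if frozenset((x, y)) not in skel:
                found.add((x, y, z))
    return found

def i_equivalent(g1, g2):
    """By the theorem: same skeleton and same immoralities."""
    return skeleton(g1) == skeleton(g2) and immoralities(g1) == immoralities(g2)

print(i_equivalent([("X", "Z"), ("Z", "Y")], [("Y", "Z"), ("Z", "X")]))  # True
print(i_equivalent([("X", "Z"), ("Z", "Y")], [("X", "Z"), ("Y", "Z")]))  # False
```

The chain X → Z → Y and its reversal share skeleton and have no immoralities, so they encode the same independencies; the v-structure X → Z ← Y has the same skeleton but an immorality, so it does not.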
Identifying the Undirected Skeleton
• The basic idea is to use independence queries of the form (X ⊥ Y | U) for different sets of variables U.
• If X and Y are adjacent in G, we cannot separate them with any set of variables.
• Conversely, if X and Y are not adjacent in G, we would hope to be able to find a set of variables that makes these two variables conditionally independent: we call this set a witness of their independence.
Identifying the Undirected Skeleton
• Let G be an I-map of a distribution P, and let X and Y be two variables that are not adjacent in G. Then either (X ⊥ Y | Pa_X) or (X ⊥ Y | Pa_Y), where Pa_X denotes the parents of X in G.
• Thus, if X and Y are not adjacent in G, we can find a witness of bounded size.
• Thus, if we assume that G has bounded indegree, say less than or equal to d, then we do not need to consider witness sets larger than d.
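The skeleton-recovery loop can be sketched end to end. Here the independence oracle is implemented by brute-force marginalization of a small hand-built chain distribution (the CPT numbers are illustrative), rather than by statistical testing on data:

```python
from itertools import combinations, product

# Ground-truth chain X -> Z -> Y over binary variables (illustrative CPTs).
VARS = ["X", "Z", "Y"]
p_x = {0: 0.6, 1: 0.4}
p_z = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}   # P(Z | X)
p_y = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.3, 1: 0.7}}   # P(Y | Z)

JOINT = {(x, z, y): p_x[x] * p_z[x][z] * p_y[z][y]
         for x, z, y in product((0, 1), repeat=3)}

def prob(assign):
    """Marginal probability of a partial assignment {var: value}."""
    return sum(p for event, p in JOINT.items()
               if all(event[VARS.index(v)] == val for v, val in assign.items()))

def indep(a, b, s):
    """Oracle for the query (a ⊥ b | s), checked numerically in the joint."""
    for vals in product((0, 1), repeat=len(s) + 2):
        cond = dict(zip(s, vals[2:]))
        p_cond = prob(cond)
        if p_cond == 0:
            continue
        lhs = prob({a: vals[0], b: vals[1], **cond}) / p_cond
        rhs = (prob({a: vals[0], **cond}) / p_cond) * \
              (prob({b: vals[1], **cond}) / p_cond)
        if abs(lhs - rhs) > 1e-9:
            return False
    return True

def build_skeleton(variables, d):
    """Keep edge a-b unless some witness set of size <= d separates a and b."""
    edges = set()
    for a, b in combinations(variables, 2):
        others = [v for v in variables if v not in (a, b)]
        separated = any(indep(a, b, list(s))
                        for k in range(d + 1)
                        for s in combinations(others, k))
        if not separated:
            edges.add(frozenset((a, b)))
    return edges

print(build_skeleton(VARS, d=1))  # edges X-Z and Z-Y; {Z} is a witness for (X, Y)
```

Only X and Y get a witness ({Z}), so only that edge is dropped, recovering the true skeleton with witness sets bounded by d = 1.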
Identifying Immoralities
• At this stage we have reconstructed the undirected skeleton. Now, we want to reconstruct edge direction.
• Our goal is to consider potential immoralities in the skeleton and, for each one, determine whether it is indeed an immorality.
• A triplet of variables X, Z, Y is a potential immorality if the skeleton contains X–Z and Z–Y but does not contain an edge between X and Y.
• A potential immorality X–Z–Y is an immorality if and only if Z is not in the witness set(s) for X and Y.
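This criterion is easy to apply once the skeleton and the witness sets are in hand. A sketch under assumed representations (skeleton as a set of frozen pairs, witnesses as a dict from non-adjacent pairs to the separating set found for them):

```python
from itertools import combinations

def mark_immoralities(skeleton, witnesses):
    """Return directed edges X -> Z <- Y for every potential immorality
    X-Z-Y (no X-Y edge) whose witness set for (X, Y) excludes Z."""
    nodes = set().union(*skeleton)
    directed = set()
    for z in nodes:
        neighbors = sorted(n for n in nodes if frozenset((n, z)) in skeleton)
        for x, y in combinations(neighbors, 2):
            pair = frozenset((x, y))
            if pair in skeleton:
                continue  # X and Y adjacent: not a potential immorality
            if z not in witnesses.get(pair, set()):
                directed.add((x, z))
                directed.add((y, z))
    return directed

# Skeleton A - B - C; if {B} separated A and C, there is no immorality:
chain = {frozenset(("A", "B")), frozenset(("B", "C"))}
print(mark_immoralities(chain, {frozenset(("A", "C")): {"B"}}))  # empty set
# Same skeleton, but A and C separated by the empty set: orient A -> B <- C.
print(mark_immoralities(chain, {frozenset(("A", "C")): set()}))
```

The same skeleton yields different orientations depending on whether Z appeared in the witness, which is exactly the distinction between a chain/common-cause structure and a v-structure.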
Representing Equivalence Classes
• An acyclic graph containing both directed and undirected edges is called a partially directed acyclic graph, or PDAG.
Representing Equivalence Classes
• Let G be a DAG. A chain graph K is the class PDAG of the equivalence class of G if K shares the same skeleton as G, and contains a directed edge X → Y if and only if all graphs G' that are I-equivalent to G contain the edge X → Y.
• If the edge between X and Y is directed, then all the members of the equivalence class agree on the orientation of the edge.
• If the edge is undirected, there are two DAGs in the equivalence class that disagree on the orientation of the edge.
Representing Equivalence Classes
• Is the output of Mark-Immoralities the class PDAG?
• Clearly, edges involved in immoralities must be directed in K.
• The obvious question is whether K can contain directed edges that are not involved in immoralities.
• In other words, can there be additional edges whose direction is necessarily the same in every member of the equivalence class?
References
• D. Koller and N. Friedman: Probabilistic Graphical Models. MIT Press, 2009.
• C. M. Bishop: Pattern Recognition and Machine Learning. Springer, 2006.