## 11 - Markov Chains


Jim Vallandingham

### Outline

- Irreducible Markov chains
- Outline of the proof of convergence to the stationary distribution
- Convergence example
- Reversible Markov chains
- Monte Carlo methods
- Hastings-Metropolis algorithm
- Gibbs sampling
- Simulated annealing
- Absorbing Markov chains

### Stationary Distribution

- As $n \to \infty$, $P^n$ approaches a matrix in which each row is the stationary distribution $\varphi$.

### Stationary Distribution Example

- Long-term averages:
  - 24% of the time spent in state E1
  - 39% of the time spent in state E2
  - 21% of the time spent in state E3
  - 17% of the time spent in state E4

### Stationary Distribution

- Any finite, aperiodic, irreducible Markov chain will converge to a stationary distribution, regardless of the starting distribution.
- The outline of the proof requires linear algebra (Appendix B.19).

### Linear Algebra: Eigenvalues

- Let P be an $s \times s$ matrix.
- P has s eigenvalues, found as the s solutions of $\det(P - \lambda I) = 0$.
- Assume all eigenvalues of P are distinct.

### Linear Algebra: Left and Right Eigenvectors

- Corresponding to each eigenvalue $\lambda_k$ is a right (column) eigenvector $r_k$ and a left (row) eigenvector $l_k$, for which $P r_k = \lambda_k r_k$ and $l_k P = \lambda_k l_k$.
- Assume they are normalized so that $l_k r_k = 1$.

### Linear Algebra: Spectral Expansion

- P can be expressed in terms of its eigenvectors and eigenvalues: $P = \sum_{k=1}^{s} \lambda_k r_k l_k$.
- This is called a spectral expansion of P.
- If $\lambda_k$ is an eigenvalue of P with corresponding left and right eigenvectors $l_k$ and $r_k$, then $\lambda_k^n$ is an eigenvalue of $P^n$ with the same left and right eigenvectors.
- This implies that the spectral expansion of $P^n$ can be written as $P^n = \sum_{k=1}^{s} \lambda_k^n r_k l_k$.

### Outline of Proof

- Going back to the proof: let P be the transition matrix of a finite, aperiodic, irreducible Markov chain.
- P has one eigenvalue equal to 1; all other eigenvalues have absolute value less than 1.
- Choose the left and right eigenvectors $l_1$ and $r_1$ of the eigenvalue $\lambda_1 = 1$ so that $r_1$ is the column vector of 1's and $l_1$ is a probability vector (its entries sum to 1). These choices also satisfy the normalization $l_1 r_1 = 1$, and by the definition of a left eigenvector with eigenvalue 1, $l_1 P = l_1$.
- It can be shown that there is a unique solution of $l_1 P = l_1$ whose entries also sum to 1. This is the same equation satisfied by the stationary distribution, so $l_1 = \varphi$.
- $P^n$ gives the n-step transition probabilities, and its spectral expansion is $P^n = \sum_{k=1}^{s} \lambda_k^n r_k l_k$. Only one eigenvalue equals 1 and the rest have absolute value less than 1, so as n increases $P^n$ approaches $r_1 l_1$: the matrix in which every row is $\varphi$.

### Convergence Example

- The example transition matrix has eigenvalue $\lambda_1 = 1$, with all other eigenvalues less than 1 in absolute value.
- Its left and right eigenvectors satisfying $l_1 P = l_1$ and $P r_1 = r_1$ are the stationary distribution and the column vector of 1's.
- In the spectral expansion of $P^n$, the terms for the eigenvalues of absolute value less than 1 go to 0, leaving each row of the limit equal to the stationary distribution.

### Reversible Markov Chains

- We typically move forward in 'time' in a Markov chain: 1, 2, 3, …, t.
- What about moving backward in this chain?
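The convergence of $P^n$ described above can be checked numerically. Below is a minimal sketch in plain Python using a hypothetical 2-state transition matrix (not the matrix from the slides, which is not reproduced here); for this P, solving $\varphi P = \varphi$ by hand gives $\varphi = (2/3,\ 1/3)$.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(p, n):
    """Compute P^n by repeated matrix multiplication."""
    result = p
    for _ in range(n - 1):
        result = mat_mul(result, p)
    return result

# Hypothetical 2-state transition matrix (each row sums to 1).
P = [[0.9, 0.1],
     [0.2, 0.8]]

P50 = mat_pow(P, 50)

# Every row of P^50 should be (approximately) the stationary
# distribution (2/3, 1/3), regardless of the starting state.
for row in P50:
    print([round(x, 4) for x in row])
```

The second eigenvalue of this P is 0.7, so the error terms in the spectral expansion shrink like $0.7^n$; by n = 50 they are negligible and both rows agree to many decimal places.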
- t, t-1, t-2, …, 1

### Reversible Markov Chains

- Figure: a common ancestor, with time running backward along the lineage from species A to the ancestor, then forward along the lineage from the ancestor to species B.

### Reversible Markov Chains

- Take a finite, irreducible, aperiodic Markov chain with stationary distribution $\varphi$.
- During t transitions, the chain will move through the states $X_1, X_2, \ldots, X_t$.
- Define the reverse chain by $Y_i = X_{t-i+1}$.
- Then the reverse chain will move through the states $X_t, X_{t-1}, \ldots, X_1$.

### Reversible Markov Chains

- We want to show that the structure determining the reverse-chain sequence is also a Markov chain.
- A typical element $p^*_{jk}$ of the reverse transition matrix is found from the typical element of P, using $p^*_{jk} = \varphi_k p_{kj} / \varphi_j$.
- This is shown by using Bayes' rule to invert the conditional probability.
- Intuitively: the future is independent of the past, given the present, and the past is independent of the future, given the present.

### Reversible Markov Chains

- The stationary distribution of the reverse chain is still $\varphi$.
- This follows from the stationary-distribution property $\varphi P = \varphi$.

### Reversible Markov Chains

- A Markov chain is said to be reversible if $p^*_{jk} = p_{jk}$ for all j and k.
- This only holds if $\varphi_j p_{jk} = \varphi_k p_{kj}$ (the detailed balance condition).

### Markov Chain Monte Carlo

- A class of algorithms for sampling from probability distributions.
- They involve constructing a Markov chain whose stationary distribution $\varphi$ is the desired distribution.
- The state of the chain after a large number of steps is used as a sample from the desired distribution.
- We discuss two such algorithms: Gibbs sampling and simulated annealing.

### Basic Problem

- Find a transition matrix P whose stationary distribution is the target distribution.
- We know that a Markov chain will converge to its stationary distribution, regardless of the initial distribution.
- How can we find such a P?

### Basic Idea

- Construct a transition matrix Q, the "candidate-generating matrix", and modify it to have the correct stationary distribution.
- The modification inserts acceptance factors $a_{jk}$ so that $p_{jk} = q_{jk} a_{jk}$ for $j \ne k$.
- There are various ways of picking the a's.

### Hastings-Metropolis

- Goal: construct an aperiodic, irreducible Markov chain having a prescribed stationary distribution.
- It produces a correlated sequence of draws from the target density that may be difficult to sample
using a classical independence method.

### Hastings-Metropolis

Process:

- Choose a set of constants $a_{jk}$ such that $0 \le a_{jk} \le 1$ and $a_{jk} = \min\!\left(1,\ \dfrac{\varphi_k q_{kj}}{\varphi_j q_{jk}}\right)$.
- Define $p_{jk} = q_{jk} a_{jk}$ for $j \ne k$: with probability $a_{jk}$ the proposed state change is accepted, and otherwise it is rejected and the chain does not change value, so $p_{jj} = 1 - \sum_{k \ne j} p_{jk}$.

### Hastings-Metropolis Example

- Target distribution $\varphi = (0.4,\ 0.6)$ with a given $2 \times 2$ candidate matrix Q.
- Inserting the acceptance factors into Q yields P, and the powers $P^2, \ldots, P^{50}$ converge to a matrix in which each row is $\varphi = (0.4,\ 0.6)$.

### Algorithmic Description

- Start with state E1, then iterate:
  - Propose E′ from $q(E_t, E')$.
  - Calculate the ratio $a = \dfrac{\varphi(E')\, q(E', E_t)}{\varphi(E_t)\, q(E_t, E')}$.
  - If $a > 1$, accept: $E_{t+1} = E'$.
  - Else accept with probability a; if rejected, $E_{t+1} = E_t$.

### Gibbs Sampling

Definitions:

- Let $Y = (Y_1, \ldots, Y_n)$ be the random vector and let $\varphi(y)$ be the distribution of Y.
- Assume Y takes only finitely many values.
- We define a Markov chain whose states are the possible values of Y.

### Gibbs Sampling

Process:

- Enumerate the possible vectors in some order 1, 2, …, s, with vector j corresponding to the jth state in the chain.
- $p_{ij}$ is 0 if vectors i and j differ by more than one component; if they differ by at most one component, $p_{ij}$ is the conditional probability of the differing component's value in j, given the values of all the other components.

### Gibbs Sampling

- Assume a joint distribution p(X, Y), and suppose we are looking to sample k values of X.
- Begin with a value $y_0$.
- Sample $x_i$ using $p(X \mid Y = y_{i-1})$.
- Once $x_i$ is found, use it to sample $y_i$ from $p(Y \mid X = x_i)$.
- Repeat k times.

### Gibbs Sampling

- This allows us to deal with univariate conditional distributions instead of complex joint distributions.
- The chain has stationary distribution $\varphi$.

### Why is Gibbs Sampling a Case of Hastings-Metropolis?

- If we define the candidate-generating probabilities $q_{ij}$ to be the Gibbs conditional probabilities, we can see that for Gibbs sampling the Hastings-Metropolis acceptance probability a is always 1.

### Simulated Annealing

- Goal: find the (approximate) minimum of some positive function.
- The function is defined on an extremely large number of states, s, and we want to find the states where the function is minimized.
- The value of the function for state $E_j$ is $f(E_j)$.

### Simulated Annealing

Process:

- Construct a neighborhood of each state: a set of states "close" to that state.
- The variable in the Markov chain can move to a neighbor in one step; moves outside the neighborhood are not allowed.

### Simulated Annealing

- Requirements on the neighborhoods:
  - If $E_k$ is in the neighborhood of $E_j$, then $E_j$ is in the neighborhood of $E_k$.
  - The number of states in a neighborhood (N) is independent of that
state.
  - Neighborhoods are linked so that the chain can eventually make it from any $E_j$ to any $E_m$.
  - If in state $E_j$, the next move must be to a state in the neighborhood of $E_j$.
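The Hastings-Metropolis procedure above can be sketched in plain Python for the two-state example with $\varphi = (0.4,\ 0.6)$. The candidate matrix Q from the slides is not reproduced here, so for illustration the sketch assumes a symmetric proposal that always suggests the other state; the acceptance ratio then reduces to $\varphi_{k}/\varphi_{j}$.

```python
import random

# Target stationary distribution for the two-state example.
phi = [0.4, 0.6]

def step(state, rng):
    """One Hastings-Metropolis transition with a symmetric proposal."""
    proposed = 1 - state                        # Q proposes the other state
    a = min(1.0, phi[proposed] / phi[state])    # acceptance probability
    return proposed if rng.random() < a else state  # reject: chain stays put

rng = random.Random(0)        # fixed seed for reproducibility
state = 0
counts = [0, 0]
burn_in, draws = 1000, 20000

for i in range(burn_in + draws):
    state = step(state, rng)
    if i >= burn_in:          # discard early draws before convergence
        counts[state] += 1

freqs = [c / draws for c in counts]
print(freqs)                  # long-run frequencies close to [0.4, 0.6]
```

The draws are correlated, as noted above, but the long-run fraction of time spent in each state still converges to the stationary distribution, mirroring the $P^{50}$ computation in the example.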