
1. Андрей Андреевич Марков

  2. Markov Chains Never look behind you… Graduate Seminar in Applied Statistics Presented by Matthias Theubert

3. Preface The so-called Markov chains can be used to describe special stochastic processes over a longer period of time and to calculate the probabilities of a future state of the process without too much effort. => This makes the method very interesting for predictions. Named after the Russian mathematician Andrei Andreyevich Markov (1856-1922), who found a comparatively simple way to describe stochastic processes.

4. Basics A random process or stochastic process X is a collection of random variables { Xt : t ∈ T } indexed by some set T, which we will usually think of as time. If T = { 0, 1, 2, … } the process is called a discrete time process. If T = ℝ or T = [0, ∞), we call it a continuous time process. In this talk only discrete time processes will be considered.

5. Basics Let {X0, X1, …} be a sequence of random variables which take values in some countable set S, called the state space. • Thus, each Xt is a discrete random variable. We write Xn = i and say the process is in state i at time n.

6. Markov property Definition 1: The process X is said to be a Markov chain if it satisfies the Markov property: P(Xn+1 = j | X0 = i0, X1 = i1, …, Xn-1 = in-1, Xn = i) = P(Xn+1 = j | Xn = i) for all states i0, …, in-1, i, j and all n. Informally, the Markov property says that, given the past history of the process, future behaviour only depends on the current value.

  7. Markov Chains A Markov chain is specified by: • the state space S • the transition probabilities • and the initial distribution
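A minimal sketch of these three ingredients in code (Python with NumPy). Only the first row of the matrix below, the bug's moves out of cell 1, comes from the maze example on the next slide; the remaining rows are invented for illustration, since the full maze matrix is not reproduced in the transcript:

```python
import numpy as np

S = [1, 2, 3, 4]                         # state space (cells of the maze)
P = np.array([[0.0, 1/2, 1/2, 0.0],      # transition probabilities (row 1 from the slides,
              [1/2, 0.0, 0.0, 1/2],      # rows 2-4 hypothetical)
              [1/3, 0.0, 1/3, 1/3],
              [0.0, 1/3, 1/3, 1/3]])
d1 = np.array([1.0, 0.0, 0.0, 0.0])      # initial distribution: start in cell 1

rng = np.random.default_rng(0)

def simulate(n_steps):
    """Draw a trajectory X1, ..., Xn of the chain."""
    state = rng.choice(len(S), p=d1)            # X1 ~ d(1)
    path = [S[state]]
    for _ in range(n_steps - 1):
        state = rng.choice(len(S), p=P[state])  # next state depends only on the current one
        path.append(S[state])
    return path

print(simulate(10))
```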

8. The bug in the maze A bug moves between the four cells of a maze. Transition probabilities: p11 = 0, p12 = ½, p13 = ½, p14 = 0, etc. [Figure: the maze with cells labelled 1-4]

9. Transition probabilities pij = P(Xn+1 = j | Xn = i) is the probability that the system will be in state j at time n+1 if it is in state i at time n. These probabilities are called the (one step) transition probabilities of the chain. They can be represented as a transition matrix P.

10. Transition matrix for the example [Figure: the 4×4 transition matrix P of the maze example; its first row is (0, ½, ½, 0)]

11. Transition matrix The transition matrix P is a stochastic matrix. That is: • P has no negative entries: pij ≥ 0 for all i, j. • As each row i describes a probability function, the row sums equal one: Σj pij = 1.
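Both defining properties are easy to check numerically. A small sketch, reusing the hypothetical maze matrix from above:

```python
import numpy as np

# Hypothetical maze matrix (only row 1 is from the slides).
P = np.array([[0.0, 1/2, 1/2, 0.0],
              [1/2, 0.0, 0.0, 1/2],
              [1/3, 0.0, 1/3, 1/3],
              [0.0, 1/3, 1/3, 1/3]])

assert (P >= 0).all()                   # no negative entries
assert np.allclose(P.sum(axis=1), 1.0)  # each row sums to one
```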

12. Homogeneous Markov chain A Markov chain is called homogeneous (or a Markov chain with stationary transition probabilities) if the transition probabilities are independent of the time t: pij = P(Xt+1 = j | Xt = i) = P(Xt = j | Xt-1 = i) for all t.

13. n-step transition probabilities It can also be of interest how likely it is that the system is in state j after two, three, …, n steps, given that it is now in state i. For describing the long-term behaviour of a Markov chain one can define the n-step transition probabilities pij(n) = P(Xm+n = j | Xm = i). The n-step transition matrix P(n) = (pij(n)) is the matrix of the n-step transition probabilities.

14. n-step transition probabilities Example: For p14(2), summing over the intermediate state visited after one step, we have: p14(2) = p11 p14 + p12 p24 + p13 p34 + p14 p44 = p12 p24 + p13 p34 (since p11 = p14 = 0). In general: pik(2) = Σj pij pjk. [Figure: paths from state i to state k through the intermediate states 1, 2, …, N]

15. Chapman-Kolmogorov Equations Theorem: pij(m+n) = Σk pik(m) pkj(n) for all m, n ≥ 0. In matrix form: P(m+n) = P(m) P(n), and hence P(n) = P^n, the n-th power of the one-step transition matrix.
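In matrix form the theorem says that n-step transition matrices are just matrix powers, which is easy to verify numerically. A sketch with the hypothetical maze matrix used earlier:

```python
import numpy as np

P = np.array([[0.0, 1/2, 1/2, 0.0],
              [1/2, 0.0, 0.0, 1/2],
              [1/3, 0.0, 1/3, 1/3],
              [0.0, 1/3, 1/3, 1/3]])

P2 = np.linalg.matrix_power(P, 2)   # two-step transition matrix P(2)
P5 = np.linalg.matrix_power(P, 5)   # five-step transition matrix P(5)

# Chapman-Kolmogorov in matrix form: P(2+5) = P(2) P(5)
assert np.allclose(np.linalg.matrix_power(P, 7), P2 @ P5)

# p14(2) from slide 14 (0-based indices: state 1 -> row 0, state 4 -> column 3)
print(P2[0, 3])
```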

16. Notation • State i leads to state j (written i → j) if the chain may ever visit state j with positive probability, starting from state i (possibly in more than one step). • i and j communicate (written i ↔ j) if i → j and j → i. • It can be proven that i → j and j → k implies i → k.

17. Classes We can partition the state space S into equivalence classes w.r.t. ↔. These are called the communicating classes of the chain. If there is a single communicating class, the chain is said to be irreducible. A class C is said to be closed if i ∈ C and i → j imply j ∈ C. Once the Markov chain enters a closed class it will never leave the class.

18. Example Suppose S = { 1, 2, 3, 4, 5, 6 } and

P =
( 1/2 1/2 0 0 0 0 )
( 0 0 1 0 0 0 )
( 1/3 0 0 1/3 1/3 0 )
( 0 0 0 1/2 1/2 0 )
( 0 0 0 0 0 1 )
( 0 0 0 0 1 0 )

This can be drawn as a directed graph with vertices S and a directed edge from i to j if pij > 0. This gives:

19. Example The classes are {1, 2, 3}, {4} and {5, 6}, with {5, 6} being a closed class. [Figure: directed graph of the chain on states 1-6, edges labelled with the transition probabilities]
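Communicating classes can be computed as the strongly connected components of the transition graph. A sketch using SciPy, with the six-state matrix as reconstructed on the previous slide:

```python
import numpy as np
from scipy.sparse.csgraph import connected_components

P = np.array([[1/2, 1/2, 0,   0,   0,   0  ],
              [0,   0,   1,   0,   0,   0  ],
              [1/3, 0,   0,   1/3, 1/3, 0  ],
              [0,   0,   0,   1/2, 1/2, 0  ],
              [0,   0,   0,   0,   0,   1  ],
              [0,   0,   0,   0,   1,   0  ]])

# Edge i -> j whenever pij > 0; communicating classes = strong components.
n_classes, labels = connected_components(P > 0, directed=True, connection='strong')
print(n_classes, labels)   # 3 classes; equal labels mean the states communicate
```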

20. Absorbing state If the system reaches a state i that cannot be left, this state is called absorbing. It forms its own closed class { i }. Formally: pii = 1. A Markov chain is called absorbing if it has at least one absorbing state. [Figure: state i with a self-loop of probability 1]
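Since pii = 1 characterises absorbing states, they are easy to spot on the diagonal of the transition matrix. A small sketch with a hypothetical three-state matrix in which state 3 is absorbing:

```python
import numpy as np

P = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0]])   # state 3 (row index 2) cannot be left

absorbing = np.where(np.isclose(np.diag(P), 1.0))[0]
print(absorbing)   # -> [2]
```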

21. Recurrence and Transience For any states i and j, define fij(n) to be the probability that, starting in i, the first transition into j occurs at time n: fij(n) = P(Xn = j, Xk ≠ j for k = 1, …, n-1 | X0 = i).

22. Recurrence and Transience Let fij = Σn fij(n), the probability that the chain, starting in i, ever makes a transition into j. Definition: A state j is said to be recurrent if fjj = 1 and transient otherwise. Informally, a state j is recurrent if the chain is certain to return to j if it starts in j.
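The return probability fjj can be approximated by simulation: start the chain in j and count how often it comes back. A rough Monte Carlo sketch using the hypothetical maze matrix; the horizon is truncated, so this only estimates fjj from below:

```python
import numpy as np

P = np.array([[0.0, 1/2, 1/2, 0.0],
              [1/2, 0.0, 0.0, 1/2],
              [1/3, 0.0, 1/3, 1/3],
              [0.0, 1/3, 1/3, 1/3]])
rng = np.random.default_rng(1)

def returns_to(j, horizon=1000):
    """Simulate from state j; report whether the chain revisits j."""
    state = j
    for _ in range(horizon):
        state = rng.choice(len(P), p=P[state])
        if state == j:
            return True
    return False

trials = 2000
print(sum(returns_to(0) for _ in range(trials)) / trials)  # estimate of f11 (close to 1 here)
```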

23. Initial distribution The initial distribution d(1) := [d1(1), d2(1), ..., dm(1)] = [P(X1 = 1), P(X1 = 2), ..., P(X1 = m)] gives the probabilities that the Markov chain starts in each state j. For the maze example it may be: d(1) = (0.25, 0.25, 0.25, 0.25) if we have no better information, or d(1) = (1, 0, 0, 0) if we know the chain starts in state 1, for example. …

24. Marginal distribution The n-step transition probabilities are the conditional probabilities to be in state j at time m+n given that the system is in state i at time m. In general one is also interested in the marginal probability P(Xt = i) to be in state i at a given time t, without conditioning on the system being in a certain state at a certain time before. This question is answered by the marginal distribution.

25. Marginal distribution Given the initial probability distribution d(1) for the first state, the probability function for the state Xt can be computed as d(t) = (P(Xt = 1), …, P(Xt = m)) = d(1) P^(t-1).
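As a sketch, the marginal distribution is a single vector-times-matrix-power product (again with the hypothetical maze matrix):

```python
import numpy as np

P = np.array([[0.0, 1/2, 1/2, 0.0],
              [1/2, 0.0, 0.0, 1/2],
              [1/3, 0.0, 1/3, 1/3],
              [0.0, 1/3, 1/3, 1/3]])
d1 = np.array([0.25, 0.25, 0.25, 0.25])   # "no better information" start

def marginal(d1, P, t):
    """d(t) = d(1) P^(t-1), the distribution of X_t."""
    return d1 @ np.linalg.matrix_power(P, t - 1)

print(marginal(d1, P, 5))   # (P(X5 = 1), ..., P(X5 = 4))
```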

26. Long term behaviour / stationary distribution What happens to the chain for very large t? One can show that, in case the Markov chain is homogeneous, irreducible and aperiodic, d(t) converges to a fixed vector, say d, for large t. The vector d is called the stationary distribution of P, and the Markov chain started from d is said to be stationary.

27. Long term behaviour If one calculates the transition probabilities in the bug example for large t one gets: P(17) = P(18) = the matrix in which every row equals (0.207, 0.207, 0.276, 0.310). If we multiply it by any initial distribution we get the stationary distribution d = (0.207, 0.207, 0.276, 0.310).
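A sketch of two routes to the stationary distribution, using the hypothetical maze matrix rather than the original one (which is why the numbers differ from the slide's d):

```python
import numpy as np

P = np.array([[0.0, 1/2, 1/2, 0.0],
              [1/2, 0.0, 0.0, 1/2],
              [1/3, 0.0, 1/3, 1/3],
              [0.0, 1/3, 1/3, 1/3]])

# Route 1: raise P to a high power; every row converges to d.
print(np.linalg.matrix_power(P, 50)[0])

# Route 2: d is the left eigenvector of P for eigenvalue 1, normalised to sum to one.
eigvals, eigvecs = np.linalg.eig(P.T)
d = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
print(d / d.sum())
```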

28. Long term behaviour Convergence to equilibrium: the effect is that in the long run an irreducible and aperiodic Markov chain "forgets where it started". The probability pij(n) of being in state j at time n converges to dj, regardless of the initial state i. It is easy to see what can go wrong if the chain is not irreducible. If, for example, there are two closed classes, then the long-term behaviour will be different depending on which class the chain starts in. Similarly, if the chain is periodic, then the value of pii(n), for example, will be zero unless n is a multiple of the period of the chain.

  29. Thank you for your attention!
