
Markov Chains



  1. Markov Chains
  Algorithms in Computational Biology, Spring 2006
  Slides were edited by Itai Sharon from Dan Geiger and Ydo Wexler

  2. Dependencies Along Biological Sequences
  • So far we assumed every letter in a sequence is sampled independently from some distribution q(·)
  • This model may suffice for alignment scoring, but it does not hold in real genomes
  • The genome contains special subsequences in which dependencies between nucleotides exist
  • Example 1: TATA within the regulatory region, upstream of a gene
  • Example 2: CG pairs
  • We model such dependencies with Markov chains and hidden Markov models (HMMs)

  3. Markov Chains
  [Chain diagram: X1 → X2 → … → Xn-1 → Xn]
  • A chain of random variables in which the next one depends only on the current one
  • Given X = x1…xn: P(xi | x1…xi-1) = P(xi | xi-1)
  • The general case: a kth-order Markov process
  • Given X = x1…xn: P(xi | x1…xi-1) = P(xi | xi-k…xi-1)

  4. Markov Chains
  • An integer-time stochastic process, consisting of a domain D of m > 1 states {s1,…,sm} and
  • An m-dimensional initial distribution vector (p(s1),…, p(sm))
  • An m×m transition probability matrix M = (aij)
  • For example:
  • D can be the letters {A, C, G, T}
  • p(A) is the probability of A being the 1st letter in a sequence
  • aAG is the probability that G follows A in a sequence
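A minimal Python sketch of this definition. The alphabet matches the slide's example, but the initial vector and transition probabilities below are made-up illustrative numbers, not estimates from real sequence data:

```python
import random

# Illustrative (made-up) parameters over D = {A, C, G, T}
STATES = ["A", "C", "G", "T"]
INIT = [0.25, 0.25, 0.25, 0.25]          # initial distribution (p(s1), ..., p(sm))
M = {                                     # transition matrix: M[s][t] = a_st
    "A": {"A": 0.3, "C": 0.2, "G": 0.3, "T": 0.2},
    "C": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
    "G": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
    "T": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
}

def sample_chain(n, rng=random):
    """Draw a length-n sequence: x1 from INIT, each next letter from row M[x_{i-1}]."""
    seq = [rng.choices(STATES, weights=INIT)[0]]
    for _ in range(n - 1):
        prev = seq[-1]
        seq.append(rng.choices(STATES, weights=[M[prev][t] for t in STATES])[0])
    return "".join(seq)
```

Each row of M sums to 1, as required of transition probabilities out of a state.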

  5. Markov Chains
  • For each integer n, a Markov Chain assigns a probability to every sequence (x1…xn) over D (i.e., xi ∈ D) as follows:
  P(x1…xn) = p(x1) · ax1x2 · ax2x3 · … · axn-1xn
  • Similarly, (X1…Xi…) is a sequence of probability distributions over D. There is a rich theory that studies the properties of these sequences.
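The product formula translates directly into code. The two-state chain below uses made-up numbers, chosen only so the product is easy to check by hand:

```python
def chain_probability(seq, p_init, trans):
    """P(x1...xn) = p(x1) * product over i of a_{x_{i-1} x_i}."""
    prob = p_init[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        prob *= trans[prev][cur]
    return prob

# Toy two-state chain (illustrative numbers):
p_init = {"H": 0.5, "T": 0.5}
trans = {"H": {"H": 0.9, "T": 0.1}, "T": {"H": 0.2, "T": 0.8}}
chain_probability("HHT", p_init, trans)   # 0.5 * 0.9 * 0.1 = 0.045
```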

  6. Matrix Representation
  • The transition probability matrix M = (ast):

         A     B     C     D
    A   0.95  0     0.05  0
    B   0.2   0.5   0     0.3
    C   0     0.2   0     0.8
    D   0     0     1     0

  • M is a stochastic matrix: every row sums to 1, i.e. Σt ast = 1 for each state s
  • The initial distribution vector (U1…Um) defines the distribution of X1: P(X1 = si) = Ui
  • Then after one move, the distribution changes to X2 = X1M

  7. Matrix Representation

         A     B     C     D
    A   0.95  0     0.05  0
    B   0.2   0.5   0     0.3
    C   0     0.2   0     0.8
    D   0     0     1     0

  • Example:
  • If X1 = (0, 1, 0, 0), then X2 = (0.2, 0.5, 0, 0.3)
  • And if X1 = (0, 0, 0.5, 0.5), then X2 = (0, 0.1, 0.5, 0.4)
  • The ith distribution is Xi = X1M^(i-1)
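The update X2 = X1M is a row-vector-times-matrix product, sketched here with the matrix from the slide using plain Python lists:

```python
M = [
    [0.95, 0.0, 0.05, 0.0],   # row A
    [0.2,  0.5, 0.0,  0.3],   # row B
    [0.0,  0.2, 0.0,  0.8],   # row C
    [0.0,  0.0, 1.0,  0.0],   # row D
]

def step(x, m):
    """One step of the chain: the row vector x times the matrix m."""
    return [sum(x[i] * m[i][j] for i in range(len(m))) for j in range(len(m[0]))]

x2 = step([0.0, 1.0, 0.0, 0.0], M)    # -> (0.2, 0.5, 0, 0.3), as on the slide
y2 = step([0.0, 0.0, 0.5, 0.5], M)    # -> (0, 0.1, 0.5, 0.4)
```

Applying `step` i-1 times computes Xi = X1M^(i-1).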

  8. Representing a Markov Model as a Digraph
  [Digraph: nodes A, B, C, D; edges labeled with the nonzero transition probabilities of the matrix above, e.g. A→A: 0.95, A→C: 0.05, B→A: 0.2, B→B: 0.5, B→D: 0.3, C→B: 0.2, C→D: 0.8, D→C: 1]
  Each directed edge A→B is associated with the transition probability from A to B.

  9. Markov Chains – Weather Example
  • Weather forecast:
  • Raining today → 40% rain tomorrow, 60% no rain tomorrow
  • No rain today → 20% rain tomorrow, 80% no rain tomorrow
  • Stochastic FSM:
  [Two-state digraph: rain→rain 0.4, rain→no rain 0.6, no rain→rain 0.2, no rain→no rain 0.8]
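Iterating the one-step update on the slide's weather matrix shows the forecast distribution settling to a limit regardless of today's weather; the 30-day horizon below is an arbitrary choice, long enough for convergence:

```python
M = [[0.4, 0.6],   # rain today    -> (rain, no rain) tomorrow
     [0.2, 0.8]]   # no rain today -> (rain, no rain) tomorrow

def forecast(today, days):
    """Distribution over (rain, no rain) after `days` steps of x <- xM."""
    x = today
    for _ in range(days):
        x = [sum(x[i] * M[i][j] for i in range(2)) for j in range(2)]
    return x

forecast([1.0, 0.0], 30)   # approaches (0.25, 0.75) whatever today's weather is
```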

  10. Markov Chains – Gambler Example
  [Chain diagram: states 0, 1, 2, …, 99, 100; start at state 10 ($10); each right edge has probability p, each left edge 1−p]
  • Gambler starts with $10
  • At each play, one of the following happens:
  • Gambler wins $1 with probability p
  • Gambler loses $1 with probability 1−p
  • Game ends when the gambler goes broke, or gains a fortune of $100
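The gambler's-ruin chain has two absorbing states (0 and the goal), so a simulation always terminates in one of them; a minimal sketch:

```python
import random

def gamble(start=10, goal=100, p=0.5, rng=random):
    """Play $1 rounds until broke (0) or reaching the goal; return the final bankroll."""
    money = start
    while 0 < money < goal:
        money += 1 if rng.random() < p else -1
    return money
```

For the fair game p = 1/2, the probability of reaching the goal is the classical ratio start/goal, which a large batch of simulated games reproduces empirically.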

  11. Properties of Markov Chain States
  [Digraph: transient states A, B; recurrent states C, D]
  • States of Markov chains are classified by the digraph representation (omitting the actual probability values)
  • Recurrent states:
  • s is recurrent if it is accessible from all states that are accessible from s. C and D are recurrent states.
  • Transient states:
  • s is transient if it will be visited only a finite number of times as n → ∞. A and B are transient states.
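The recurrence definition can be checked mechanically from the digraph alone. The edge set below is a hypothetical graph chosen to match the slide's description (A, B transient; C, D recurrent), since the exact figure is not recoverable from the transcript:

```python
def reachable(graph, s):
    """All states reachable from s (including s) by following directed edges."""
    seen, stack = {s}, [s]
    while stack:
        for v in graph[stack.pop()]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def is_recurrent(graph, s):
    """s is recurrent iff s is accessible from every state accessible from s."""
    return all(s in reachable(graph, t) for t in reachable(graph, s))

# Hypothetical digraph: A and B leak into the closed class {C, D}
g = {"A": ["B", "C"], "B": ["A"], "C": ["D"], "D": ["C"]}
```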

  12. Irreducible Markov Chains
  [Two example digraphs over states A, B, C, D (and E)]
  • A Markov Chain is irreducible if the corresponding graph is strongly connected (and thus all its states are recurrent).
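Strong connectivity is easy to test: every state must be able to reach every other. The example digraph below is taken from the matrix-representation slide, with an edge wherever the transition probability is nonzero:

```python
def reachable(graph, s):
    """All states reachable from s by following directed edges."""
    seen, stack = {s}, [s]
    while stack:
        for v in graph[stack.pop()]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def is_irreducible(graph):
    """Strongly connected: every state can reach every other state."""
    nodes = set(graph)
    return all(reachable(graph, s) == nodes for s in nodes)

# Edges of the slide-6 matrix (nonzero entries only): this chain is irreducible
g = {"A": ["A", "C"], "B": ["A", "B", "D"], "C": ["B", "D"], "D": ["C"]}
```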

  13. Properties of Markov Chain States
  [Digraph over states A, B, C, D, E, F]
  • A state s has period k if k is the GCD of the lengths of all cycles that pass through s.
  • Periodic states:
  • A state is periodic if it has period k > 1. In the shown graph, the period of A is 2.
  • Aperiodic states:
  • A state is aperiodic if it has period k = 1. In the shown graph, the period of F is 1.
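The period can be computed by walking the graph: after n steps the frontier is the set of states reachable in exactly n moves, and every n at which s reappears is the length of a closed walk through s. The two toy graphs below are hypothetical, chosen only to reproduce the periods quoted on the slide:

```python
from math import gcd

def period(graph, s, max_len=50):
    """GCD of the lengths (up to max_len) of closed walks through s; 0 if none found."""
    frontier = {s}
    g = 0
    for n in range(1, max_len + 1):
        frontier = {v for u in frontier for v in graph[u]}
        if s in frontier:
            g = gcd(g, n)
    return g

period({"A": ["B"], "B": ["A"]}, "A")   # 2: every closed walk through A has even length
period({"F": ["F"]}, "F")               # 1: the self-loop is a cycle of length 1
```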

  14. Ergodic Markov Chains
  [Digraph over states A, B, C, D]
  • A Markov chain is ergodic if:
  • the corresponding graph is irreducible, and
  • it is not periodic
  • Ergodic Markov Chains are important because
  • they guarantee that the corresponding Markovian process converges to a unique distribution, in which all states have strictly positive probability.

  15. Stationary Distributions for Markov Chains
  • Let M be a Markov Chain with m states, and let V = (v1,…, vm) be a probability distribution over the m states
  • V = (v1,…, vm) is a stationary distribution for M if VM = V: one step of the process does not change the distribution
  • V is a stationary distribution ⟺ V is a left (row) eigenvector of M with eigenvalue 1
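A stationary distribution can be found by power iteration: apply V ← VM until the vector stops changing. This sketch assumes the chain is ergodic, so the iteration converges; it is run here on the weather matrix from slide 9:

```python
def stationary(m, tol=1e-12, max_iter=10_000):
    """Power iteration for V with VM = V, starting from the uniform distribution."""
    n = len(m)
    v = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(v[i] * m[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(v, nxt)) < tol:
            return nxt
        v = nxt
    return v

stationary([[0.4, 0.6], [0.2, 0.8]])   # the weather chain: (0.25, 0.75)
```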

  16. “Good” Markov Chains
  • A Markov Chain is good if the distributions Xi satisfy the following as i → ∞:
  • they converge to a unique distribution, independent of the initial distribution
  • in that unique distribution, each state has a positive probability
  • The Fundamental Theorem of Finite Markov Chains:
  • A Markov Chain is good ⟺ the corresponding graph is ergodic.

  17. “Bad” Markov Chains
  • A Markov chain is not “good” if either:
  • it does not converge to a unique distribution, or
  • it does converge to a unique distribution, but some states in this distribution have zero probability
  • For instance:
  • chains with periodic states
  • chains with transient states

  18. An Example: Searching the Genome for CpG Islands
  • In the human genome, the pair CG appears less often than expected:
  • the pair CG often transforms to (methyl-C)G, which in turn often transforms to TG
  • hence the pair CG appears less often than expected from the independent frequencies of C and G alone
  • For biological reasons, this process is sometimes suppressed in short stretches of genome,
  • such as in the start regions of many genes
  • These areas are called CpG islands (p denotes “pair”).

  19. CpG Islands
  • We consider two questions (and some variants):
  • Question 1: Given a short stretch of genomic data, does it come from a CpG island?
  • Question 2: Given a long piece of genomic data, does it contain CpG islands, and if so, where and of what length?
  • We “solve” the first question by modeling strings with and without CpG islands as Markov Chains:
  • the states are {A, C, G, T} in both models, but
  • the transition probabilities are different

  20. CpG Islands
  • The “+” model:
  • use transition matrix A+ = (a+st), where
  • a+st = the probability that t follows s inside a CpG island
  • The “−” model:
  • use transition matrix A− = (a−st), where
  • a−st = the probability that t follows s outside a CpG island

  21. CpG Islands
  • To answer Question 1 we must decide whether a given short sequence of letters is more likely to come from the “+” model or from the “−” model.
  • This is done by applying the Markov Chain definition, with the parameters of each model estimated from known (labeled) data, and then using a log odds-ratio test.
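One common way the parameters are determined from known data is a maximum-likelihood count of observed transitions. The two training strings below are toy data, not real CpG-island sequences; in practice pseudocounts are usually added so that unseen transitions do not get probability zero:

```python
def estimate_transitions(sequences, alphabet="ACGT"):
    """ML estimate: a_st = count(s followed by t) / count(s followed by anything)."""
    counts = {s: {t: 0 for t in alphabet} for s in alphabet}
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1
    probs = {}
    for s in alphabet:
        total = sum(counts[s].values())
        probs[s] = {t: (counts[s][t] / total if total else 0.0) for t in alphabet}
    return probs

plus = estimate_transitions(["ACGCGT", "CGCG"])   # toy training data
```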

  22. CpG Islands – the “+” Model
  • We need to specify p+(xi | xi-1), where + stands for CpG island. From Durbin et al. we have:
  [Transition probability table p+(xi | xi-1) over {A, C, G, T}; not preserved in the transcript]

  23. CpG Islands – the “−” Model
  • p−(xi | xi-1) for non-CpG-island sequence is given by:
  [Transition probability table p−(xi | xi-1) over {A, C, G, T}; not preserved in the transcript]

  24. CpG Islands
  [Chain diagram: X1 → X2 → … → XL-1 → XL]
  • Given a string X = (x1,…, xL), compute the ratio
  RATIO = P(X | “+” model) / P(X | “−” model) = ∏i=2..L ( a+xi-1xi / a−xi-1xi )
  • RATIO > 1 ⇒ a CpG island is more likely
  • RATIO < 1 ⇒ a non-CpG island is more likely
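The ratio is usually computed in log space, turning the decision into a sign test on the summed log odds. The two transition tables below are made-up illustrative numbers (CG enriched in "+", depleted in "−"), NOT the values from Durbin et al.:

```python
from math import log

def log_odds(seq, plus, minus):
    """Sum of log( a+_{x_{i-1} x_i} / a-_{x_{i-1} x_i} ); > 0 favours the "+" model."""
    return sum(log(plus[a][b] / minus[a][b]) for a, b in zip(seq, seq[1:]))

# Illustrative tables: uniform everywhere except the row for C
plus  = {s: {t: 0.25 for t in "ACGT"} for s in "ACGT"}
minus = {s: {t: 0.25 for t in "ACGT"} for s in "ACGT"}
plus["C"]["G"],  plus["C"]["A"]  = 0.40, 0.10   # CG enriched inside islands
minus["C"]["G"], minus["C"]["A"] = 0.05, 0.45   # CG depleted outside

log_odds("ACGCG", plus, minus)   # > 0: the "+" model fits this string better
```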
