
How to find foreign genes? Markov Models


kaycee

Presentation Transcript


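  1. How to find foreign genes? Markov Models. Building the model. The training set is tabulated by 3-mer context (AAA, AAC, AAG, AAT, ACA, ..., TTG, TTT). For example, after AAA the next base is A in 10% of cases (AAAA), C in 15% (AAAC), G in 40% (AAAG), and T in 35% (AAAT).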

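  2. How to find foreign genes? Markov Models. 3rd order Markov model. Candidate gene: AAAACAA… The 0.10 shown next to it is the table entry for AAA followed by A. Each row gives the probability of the next base given the preceding three bases:

            A     C     G     T
     AAA   0.10  0.15  0.40  0.35
     AAC   0.25  0.45  0.25  0.05
     AAG   0.25  0.20  0.30  0.25
     AAT   0.25  0.20  0.30  0.25
     ACA   0.15  0.20  0.25  0.40
     . . .
     TTG   0.20  0.50  0.05  0.25
     TTT   0.10  0.55  0.25  0.10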
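As a rough illustration of how the table on this slide could be used, the sketch below multiplies the transition probabilities along the candidate prefix AAAACAA. Only the rows printed on the slide are included; the elided rows are not guessed. The names `TABLE`, `transition_prob`, and `score_prefix` are hypothetical, and the probability of the initial three bases is left out because the slide does not give it.

```python
# Rows of the third-order table exactly as printed on the slide.
# Column order is A, C, G, T. Elided rows are intentionally absent.
BASES = "ACGT"
TABLE = {
    "AAA": (0.10, 0.15, 0.40, 0.35),
    "AAC": (0.25, 0.45, 0.25, 0.05),
    "AAG": (0.25, 0.20, 0.30, 0.25),
    "AAT": (0.25, 0.20, 0.30, 0.25),
    "ACA": (0.15, 0.20, 0.25, 0.40),
    "TTG": (0.20, 0.50, 0.05, 0.25),
    "TTT": (0.10, 0.55, 0.25, 0.10),
}


def transition_prob(context, base):
    """Probability of `base` given the three preceding bases `context`."""
    return TABLE[context][BASES.index(base)]


def score_prefix(seq, order=3):
    """Product of transition probabilities along `seq`.

    Assumption: the first `order` bases only serve as context, so their
    own probability is not part of the product.
    """
    prob = 1.0
    for i in range(order, len(seq)):
        prob *= transition_prob(seq[i - order:i], seq[i])
    return prob


# Transitions used: AAA->A, AAA->C, AAC->A, ACA->A
print(score_prefix("AAAACAA"))
```

A sequence that needs one of the elided rows raises a `KeyError` here, which seems preferable to inventing values the slide does not show.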

  3. Markov Chains. A traffic light considered as a sequence of states. This is a trivial Markov chain: the transition probability between the states is always 1 (Prg = 1 for red to green, Pgy = 1 for green to yellow, Pyr = 1 for yellow to red).

  4. Markov Chains. A traffic light considered as a sequence of states. If we watch our traffic light, it will emit a string of states. In a simple Markov model, the state labels (e.g. green, red, yellow) are the observable outputs of the process.

  5. Markov Chains. An occasionally malfunctioning traffic light! Transition probabilities: Pgy = 1, Prg = 0.85, Pry = 0.15, Pyr = 0.90, Pyg = 0.10. The Markov property is that the probability of observing a given next state depends only on the current state.
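The malfunctioning traffic light can be written down as a small transition table and sampled, which makes the "string of states" idea concrete. This is a minimal sketch: the probabilities are the ones on the slide, while the names `TRANSITIONS` and `simulate`, the starting state, and the random seed are arbitrary choices made for illustration.

```python
import random

# Transition probabilities from the slide: outer key is the current state,
# inner key is the next state.
TRANSITIONS = {
    "green": {"yellow": 1.0},
    "yellow": {"red": 0.90, "green": 0.10},
    "red": {"green": 0.85, "yellow": 0.15},
}


def simulate(start, steps, rng):
    """Emit a list of states; each step looks only at the current state."""
    states = [start]
    for _ in range(steps):
        options = TRANSITIONS[states[-1]]
        next_state = rng.choices(
            list(options), weights=list(options.values())
        )[0]
        states.append(next_state)
    return states


# Assumed starting state and seed, chosen only to make the run repeatable.
chain = simulate("red", 10, random.Random(0))
print(" -> ".join(chain))
```

Each row of `TRANSITIONS` sums to 1, so from any state the light always moves somewhere.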

  6. Markov Chains. The Markov Property: $a_{st} = P(x_i = t \mid x_{i-1} = s)$. English translation: the transition probability $a_{st}$ from state s to state t is equal to the probability that the ith state is t, given that the immediately preceding state ($x_{i-1}$) was s. This is a form of conditional probability.

  7. Markov Chains. An occasionally malfunctioning traffic light! Now we can consider the probability of an observed sequence.

  8. Markov Chains. What is the probability of a chain of events x? $P(x) = P(x_L, x_{L-1}, \ldots, x_1)$. English translation: the probability of observing the sequence of states x is equal to the probability that the Lth state took its observed value $x_L$ AND the (L-1)th state took its observed value $x_{L-1}$, and so on down to $x_1$. This is a form of joint probability.

  9. Markov Chains. What is the probability of a chain of events x? $P(x) = P(x_L, x_{L-1}, \ldots, x_1) = P(x_L \mid x_{L-1}, \ldots, x_1)\, P(x_{L-1} \mid x_{L-2}, \ldots, x_1) \cdots P(x_1)$. This is because $P(X, Y) = P(X \mid Y)\, P(Y)$. English translation: the probability of events X AND Y happening is equal to the probability of X happening given that Y has already happened, times the probability of event Y.

  10. Markov Chains. What is the probability of a chain of events x? $P(x) = P(x_L \mid x_{L-1}, \ldots, x_1)\, P(x_{L-1} \mid x_{L-2}, \ldots, x_1) \cdots P(x_1)$. But remember, the key property of a Markov chain is that the probability of symbol $x_i$ depends ONLY on the value of the preceding symbol $x_{i-1}$. Therefore: $P(x) = P(x_L \mid x_{L-1})\, P(x_{L-1} \mid x_{L-2}) \cdots P(x_2 \mid x_1)\, P(x_1) = P(x_1) \prod_{i=2}^{L} a_{x_{i-1} x_i}$.
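The product formula on this slide translates almost directly into code. The sketch below applies it to the malfunctioning traffic light from the earlier slide. The initial distribution `INITIAL` is an assumption made for illustration (the slides do not give $P(x_1)$), and `chain_probability` is a hypothetical helper name.

```python
# Transition probabilities a_st for the malfunctioning traffic light.
TRANSITIONS = {
    "green": {"yellow": 1.0},
    "yellow": {"red": 0.90, "green": 0.10},
    "red": {"green": 0.85, "yellow": 0.15},
}

# Assumption: we always start watching on red, so P(x_1 = red) = 1.
INITIAL = {"red": 1.0}


def chain_probability(states, initial, transitions):
    """P(x) = P(x_1) * product over i = 2..L of a_{x_{i-1} x_i}."""
    prob = initial.get(states[0], 0.0)
    for prev, curr in zip(states, states[1:]):
        # A transition missing from the table has probability 0.
        prob *= transitions[prev].get(curr, 0.0)
    return prob


# red -> green -> yellow -> red uses 0.85, then 1.0, then 0.90
print(chain_probability(["red", "green", "yellow", "red"], INITIAL, TRANSITIONS))
```

For long sequences the product of many small numbers underflows, so summing logarithms of the same terms is the usual alternative.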

  11. Markov Chains. How about nucleic acid sequences? The states are A, C, T, and G. There is no reason why nucleic acid sequences found in an organism cannot be modeled using Markov chains.

  12. Markov Model. What do we need to probabilistically model DNA sequences? States (A, C, T, G) and transition probabilities. The states are the same for all organisms, so the transition probabilities are the model parameters we need to estimate.

  13. Building the Markov Model. Parameter estimation. The training set is tabulated by 3-mer context (AAA, AAC, AAG, AAT, ACA, ..., TTG, TTT); for example AAAA: 10%, AAAC: 15%, AAAG: 40%, AAAT: 35%. This is a maximum likelihood approach to parameter estimation. Such procedures maximize the overall probability of the training set data.
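One way to read the parameter estimation step is as counting: for every 3-mer context in the training set, count which base follows it and divide by the total for that context. The sketch below does this on a toy training set that is made up for illustration; `estimate_transitions` is a hypothetical name, and details such as pseudocounts for unseen contexts are not covered by the slides and are left out.

```python
from collections import Counter, defaultdict


def estimate_transitions(sequences, order=3):
    """Estimate transition probabilities from relative frequencies.

    For each context of length `order`, the probability of the next base
    is its count divided by the total count for that context.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(order, len(seq)):
            counts[seq[i - order:i]][seq[i]] += 1

    model = {}
    for context, next_counts in counts.items():
        total = sum(next_counts.values())
        model[context] = {
            base: n / total for base, n in next_counts.items()
        }
    return model


# Toy training set (hypothetical, not from the slides).
training = ["AAAACAAAG", "AAAGAAAT"]
model = estimate_transitions(training)

# In this toy data AAA is followed by A once, C once, G twice, T once.
print(model["AAA"])
```

Contexts that never occur in the training data get no entry at all, so a real application would need to decide how to handle them.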

  14. Markov Model. Which model best explains a newly observed sequence? Organism A and Organism B each have their own chain over the states A, C, G, T. Each organism will have different transition probability parameters, so you can ask “was the sequence more likely to be generated by model A or model B?”

  15. Markov Model. Which model best explains a newly observed sequence? A commonly used metric for discrimination using Markov chains is the log-odds ratio: $S(x) = \log \frac{P(x \mid \text{model A})}{P(x \mid \text{model B})} = \sum_{i=1}^{L} \log \frac{a^{A}_{x_{i-1} x_i}}{a^{B}_{x_{i-1} x_i}}$.
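The log-odds score can be sketched for two first-order models as follows. Both parameter sets are invented for illustration: `MODEL_A` is uniform and `MODEL_B` favors repeating the previous base. The sketch sums over the observed transitions only and leaves out any term for the first symbol, which amounts to assuming both models treat the first symbol identically. The natural logarithm is used because the slide does not specify a base.

```python
import math

BASES = "ACGT"

# Hypothetical parameters, not taken from the slides.
MODEL_A = {s: {t: 0.25 for t in BASES} for s in BASES}
MODEL_B = {s: {t: (0.4 if s == t else 0.2) for t in BASES} for s in BASES}


def log_odds(seq, model_a, model_b):
    """Sum of log(a^A / a^B) over consecutive pairs of symbols in `seq`.

    Positive scores favor model A, negative scores favor model B.
    """
    return sum(
        math.log(model_a[prev][curr] / model_b[prev][curr])
        for prev, curr in zip(seq, seq[1:])
    )


print(log_odds("AAAA", MODEL_A, MODEL_B))  # repeats, so model B is favored
print(log_odds("ACGT", MODEL_A, MODEL_B))  # no repeats, so model A is favored
```

Working with a sum of logs rather than a ratio of products keeps the score numerically stable for long sequences.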
