CSE182-L10

HMM applications

Probability of being in specific states
  • What is the probability that we were in state k at step i, given that the model emitted x?

$$\Pr[\pi_i = k \mid x] \;=\; \frac{\Pr[\text{all paths that pass through state } k \text{ at step } i \text{ and emit } x]}{\Pr[\text{all paths that emit } x]}$$

The Forward Algorithm
  • Recall $v[i,j]$: the probability of the most likely path by which the automaton emits $x_1\ldots x_i$ and ends up in state j.
  • Define $f[i,j]$: the probability that the automaton starts in state 1, emits $x_1\ldots x_i$, and ends up in state j.
  • What is the difference?

Most Likely Path versus Probability of Arrival
  • There are multiple paths through states 1..j by which the automaton can output $x_1\ldots x_i$.
  • In computing the Viterbi path, we choose the most likely path:
    • $V[i,j] = \max_{\pi} \Pr[x_1\ldots x_i \mid \pi]$
  • The probability of emitting $x_1\ldots x_i$ and ending up in state j sums over all such paths (see the small example below):
    • $F[i,j] = \sum_{\pi} \Pr[x_1\ldots x_i \mid \pi]$
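
A tiny made-up example of the max-versus-sum distinction (the numbers are purely illustrative, not from the slides): suppose exactly two paths end in state j after emitting $x_1\ldots x_i$, with probabilities 0.03 and 0.02. Then

$$V[i,j] = \max(0.03,\ 0.02) = 0.03, \qquad F[i,j] = 0.03 + 0.02 = 0.05.$$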
The Forward Algorithm
  • Recall that
    • $v(i,j) = \max_{l \in Q} \{\, v(i-1,l) \cdot A[l,j] \,\} \cdot e_j(x_i)$
  • Instead, the forward algorithm sums over all predecessor states (a Python sketch follows below):
    • $F(i,j) = \Big(\sum_{l \in Q} F(i-1,l) \cdot A[l,j]\Big) \cdot e_j(x_i)$
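Here is a minimal Python sketch of the forward recurrence; the transition matrix A, emission table E, and the integer encoding of the sequence are hypothetical placeholders, not definitions from the slides.

```python
import numpy as np

def forward(x, A, E, start=0):
    """Forward algorithm.

    F[i][j] = probability of emitting x[0..i] and ending in state j.
    A[l][j] : transition probability from state l to state j.
    E[j][c] : probability that state j emits symbol c.
    x       : sequence encoded as integer symbol indices.
    """
    F = np.zeros((len(x), A.shape[0]))
    # Base case: one transition out of the start state, emitting x[0].
    F[0] = A[start] * E[:, x[0]]
    # Recurrence: F(i,j) = (sum over l of F(i-1,l) * A[l,j]) * e_j(x_i)
    for i in range(1, len(x)):
        F[i] = (F[i - 1] @ A) * E[:, x[i]]
    return F
```

The total probability Pr[x] is then the sum of the last row (or, in a model with an explicit end state, that sum weighted by the transitions into the end state).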

The Backward Algorithm
  • Define $b[i,j]$: the probability that the automaton, being in state j at step i, emits $x_{i+1}\ldots x_n$ and ends up in the final state.
Forward Backward Scoring
  • $F(i,j) = \Big(\sum_{l \in Q} F(i-1,l) \cdot A[l,j]\Big) \cdot e_j(x_i)$
  • $B(i,j) = \sum_{l \in Q} A[j,l] \cdot e_l(x_{i+1}) \cdot B(i+1,l)$
  • $\Pr[x, \pi_i = k] = F(i,k) \cdot B(i,k)$ (a backward/posterior sketch follows below)
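
Continuing the same hypothetical setup as the forward sketch, the backward recurrence and the posterior state probabilities look like this (again a sketch, assuming no explicit end state, so B is initialized to 1 at the last position):

```python
def backward(x, A, E):
    """Backward algorithm.

    B[i][j] = probability of emitting x[i+1..n-1], given state j at step i.
    """
    B = np.zeros((len(x), A.shape[0]))
    B[-1] = 1.0  # nothing remains to be emitted after the last symbol
    # Recurrence: B(i,j) = sum over l of A[j,l] * e_l(x_{i+1}) * B(i+1,l)
    for i in range(len(x) - 2, -1, -1):
        B[i] = A @ (E[:, x[i + 1]] * B[i + 1])
    return B

def posterior(x, A, E, start=0):
    """Pr[state k at step i | x] = F(i,k) * B(i,k) / Pr[x]."""
    F, B = forward(x, A, E, start), backward(x, A, E)
    joint = F * B  # joint[i][k] = Pr[x, pi_i = k]
    return joint / joint.sum(axis=1, keepdims=True)  # each row's sum is Pr[x]
```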

Application of HMMs

Pos:  1    2    3    4    5    6    7    8
A:   0.9  0.4  0.3  0.6  0.1  0.0  0.2  1.0
C:   0.0  0.2  0.7  0.0  0.3  0.0  0.0  0.0
G:   0.1  0.2  0.0  0.0  0.3  1.0  0.3  0.0
T:   0.0  0.2  0.0  0.4  0.3  0.0  0.5  0.0

  • How do we modify this to handle indels?
Applications of the HMM paradigm
  • Modifying profile HMMs to handle indels
  • States $I_i$: insertion states
  • States $D_i$: deletion states

(Same profile matrix as above.)

Profile HMMs
  • An assignment of states implies an insertion, match, or deletion at each position. Example: ACACTGTA

(Same profile matrix as above.)

[Figure: the example sequence laid out against the profile columns.]

Viterbi Algorithm revisited
  • Define $v^M_j(i)$ as the log-likelihood score of the best path matching $x_1\ldots x_i$ to the profile HMM, ending with $x_i$ emitted by state $M_j$.
  • $v^I_j(i)$ and $v^D_j(i)$ are defined similarly.
Viterbi Equations for Profile HMMs

vMj-1(i-1) + log(A[Mj-1, Mj])

vMj(i) = log (eMj(xi)) + max vIj-1(i-1) + log(A[Ij-1, Mj])

vDj-1(i-1) + log(A[Dj-1, Mj])

vMj(i-1) + log(A[Mj-1, Ij])

vIj(i) = log (eIj(xi)) + max vIj(i-1) + log(A[Ij-1, Ij])

vDj(i-1) + log(A[Dj-1, Ij])
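
A compact Python sketch of these profile-HMM recurrences; the emission tables eM and eI, the transition dictionary tr, and the boundary conventions (the begin state treated as M_0, deletion states silent, no explicit end-state transition) are illustrative assumptions, not taken from the slides.

```python
import numpy as np

NEG_INF = float("-inf")

def profile_viterbi(x, eM, eI, tr, n_cols):
    """Log-space Viterbi over the match/insert/delete states of a profile HMM.

    x       : sequence as a list of symbol indices
    eM, eI  : eM[j][c], eI[j][c] = emission log-probs of M_j, I_j
              (arrays with n_cols + 1 rows; eM[0] is unused)
    tr      : dict ((kind, col), (kind, col)) -> transition log-prob
    n_cols  : number of profile columns (match states M_1 .. M_{n_cols})
    Returns the best log-likelihood of threading x through the profile.
    """
    t = lambda a, b: tr.get((a, b), NEG_INF)  # missing transitions are forbidden
    n = len(x)
    vM = np.full((n + 1, n_cols + 1), NEG_INF)
    vI = np.full((n + 1, n_cols + 1), NEG_INF)
    vD = np.full((n + 1, n_cols + 1), NEG_INF)
    vM[0][0] = 0.0  # the begin state is treated as M_0
    for i in range(n + 1):
        for j in range(n_cols + 1):
            if i > 0 and j > 0:  # M_j emits x[i-1] and advances one column
                vM[i][j] = eM[j][x[i - 1]] + max(
                    vM[i - 1][j - 1] + t(("M", j - 1), ("M", j)),
                    vI[i - 1][j - 1] + t(("I", j - 1), ("M", j)),
                    vD[i - 1][j - 1] + t(("D", j - 1), ("M", j)))
            if i > 0:            # I_j emits x[i-1] and stays in column j
                vI[i][j] = eI[j][x[i - 1]] + max(
                    vM[i - 1][j] + t(("M", j), ("I", j)),
                    vI[i - 1][j] + t(("I", j), ("I", j)),
                    vD[i - 1][j] + t(("D", j), ("I", j)))
            if j > 0:            # D_j is silent: advances a column, emits nothing
                vD[i][j] = max(
                    vM[i][j - 1] + t(("M", j - 1), ("D", j)),
                    vI[i][j - 1] + t(("I", j - 1), ("D", j)),
                    vD[i][j - 1] + t(("D", j - 1), ("D", j)))
    # Best score after consuming all of x at the last column.
    return max(vM[n][n_cols], vI[n][n_cols], vD[n][n_cols])
```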

Compositional Signals
  • CpG islands: in genomic sequence, the CG dinucleotide is rarely seen.
  • The C in a CG pair is prone to methylation, and methylated C subsequently mutates to T.
  • In regions around a gene, methylation is suppressed, and therefore CG is more common.
  • CpG islands: islands of CG on the genome.
  • How can you detect CpG islands?
An HMM for Genomic regions
  • Node A emits A with probability 1, and 0 for all other bases (similarly for C, G, and T).
  • The start and end nodes do not emit any symbol.
  • All outgoing edges from a node are equi-probable, except for the ones coming out of C.

[Diagram: HMM with emitting nodes A, C, G, T and non-emitting start and end nodes; edge labels shown include 0.25, with 0.1 and 0.4 on edges out of C.]
An HMM for CpG islands
  • Node A emits A with probability 1, and 0 for all other bases.
  • The start and end nodes do not emit any symbol.
  • In this model all outgoing edges from a node are equi-probable (0.25), including the ones coming out of C.

[Diagram: the same HMM topology, with all visible edge labels equal to 0.25.]
HMM for detecting CpG Islands

A

B

A

G

A

0.1

end

G

start

end

C

start

0.4

T

C

T

  • In the best parse of a genomic sequence, each base is assigned a state from submodel A or submodel B.
  • Any substring in which multiple consecutive states come from B can be described as a CpG island (a decoding sketch follows below).
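
A toy end-to-end sketch of that detection scheme in Python; the uniform start distribution and the state layout (four A-submodel bases followed by four B-submodel bases) are invented for illustration, since the slides do not specify them.

```python
import numpy as np

BASES = "ACGT"

def viterbi_labels(seq, A, E, labels):
    """Log-space Viterbi decoding; returns the submodel label of each base."""
    with np.errstate(divide="ignore"):  # emission probs of 0 become -inf
        logA, logE = np.log(A), np.log(E)
    x = [BASES.index(c) for c in seq]
    n, k = len(x), A.shape[0]
    v = np.full((n, k), -np.inf)
    back = np.zeros((n, k), dtype=int)
    v[0] = np.log(1.0 / k) + logE[:, x[0]]    # assumed uniform start
    for i in range(1, n):
        scores = v[i - 1][:, None] + logA     # scores[l, j] = come from l, go to j
        back[i] = scores.argmax(axis=0)
        v[i] = scores.max(axis=0) + logE[:, x[i]]
    path = [int(v[-1].argmax())]
    for i in range(n - 1, 0, -1):             # trace best predecessors backwards
        path.append(int(back[i][path[-1]]))
    return [labels[s] for s in reversed(path)]

def cpg_islands(label_seq):
    """Maximal runs of 'B' labels, as (start, end) half-open intervals."""
    islands, start = [], None
    for i, lab in enumerate(label_seq + ["A"]):  # sentinel flushes the final run
        if lab == "B" and start is None:
            start = i
        elif lab != "B" and start is not None:
            islands.append((start, i))
            start = None
    return islands
```

With labels = ["A"] * 4 + ["B"] * 4 and a block-structured transition matrix (rare switches between the two blocks), the maximal runs labeled B are the candidate CpG islands.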
HMM: Summary
  • HMMs are a natural technique for modeling many biological domains.
  • They can capture position-dependent as well as compositional properties.
  • HMMs have been very useful in an important bioinformatics application: gene finding.