Practical applications of HMMs: ChromHMM. Sushmita Roy, Nov 5th. Chromatin organization and gene expression. http:// =eYrQ0EhVCYA. ChIP-seq to measure histone data. Adapted from Dewey lecture and Peter Park Nature Genetics Review.

Practical applications of HMMs: ChromHMM

Sushmita Roy

Nov 5th

Chromatin organization and gene expression

ChIP-seq to measure histone data

Adapted from Dewey lecture and Peter Park Nature Genetics Review

ChIP-seq data for multiple marks

Chromatin state: a specific combination of mark values.

Important because it can be used to segment the genome into biologically meaningful units.

Problem definition

  • Given

    • A collection of genome-wide measurements of chromatin marks

  • Do

    • Segment the genome into N chromatin states

An HMM for segmenting genomes using chromatin marks

  • HMM

    • State: chromatin state

    • Emission: multiple chromatin marks

    • Need a multivariate HMM

Binarizing the chromatin data

  • Each mark is represented by a binary variable v_{t,m} (position t, mark m):

    • 1: mark is present

    • 0: mark is absent
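A minimal sketch of how read counts could be turned into the binary variable v_{t,m}. The slide only says each mark is present (1) or absent (0); the Poisson background test, the p-value threshold of 1e-4, and the function names below are assumptions made for illustration.

```python
import math

def poisson_tail(count, mean):
    """P(X >= count) for X ~ Poisson(mean)."""
    if count <= 0:
        return 1.0
    # Accumulate P(X <= count - 1) term by term, then take the complement.
    term = math.exp(-mean)
    cdf = term
    for k in range(1, count):
        term *= mean / k
        cdf += term
    return max(0.0, 1.0 - cdf)

def binarize(counts, p_threshold=1e-4):
    """Turn per-position read counts for one mark into 0/1 presence calls.

    Assumption: the background rate is the mean count across all positions.
    """
    mean = sum(counts) / len(counts)
    return [1 if poisson_tail(c, mean) < p_threshold else 0 for c in counts]

# Toy read counts for one mark along the genome (made-up numbers).
counts = [0, 1, 2, 1, 30, 28, 0, 1]
print(binarize(counts))
```

Running the same call once per mark gives one binary vector per position, which is the observation the HMM emits.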




[Figure; axis label: Genomic sequence]

ChromHMM with 3 states





ChromHMM notation

  • p_{k,m} denotes the probability of mark m being ON in state k

  • The emission probability of the M marks in a state is a product of M independent Bernoulli probabilities, one per mark

  • b_{i,j} denotes the probability of transitioning from state i to state j

  • a_k: initial probability of state k
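Written out, the product of Bernoulli terms for state k and observation v_t is P(v_t | k) = \prod_{m=1}^{M} p_{k,m}^{v_{t,m}} (1 - p_{k,m})^{1 - v_{t,m}}. A small sketch of that product follows; the function name and the toy numbers are illustrative only.

```python
def emission_prob(p_k, v_t):
    """Probability of the binary mark vector v_t under state k.

    p_k[m] is the probability that mark m is ON in state k;
    v_t[m] is 1 if mark m is present at position t, else 0.
    """
    prob = 1.0
    for p, v in zip(p_k, v_t):
        # Each mark contributes p if it is ON and (1 - p) if it is OFF.
        prob *= p if v == 1 else (1.0 - p)
    return prob

# Toy state with M = 3 marks (made-up parameters).
p_k = [0.9, 0.2, 0.5]
v_t = [1, 0, 1]
print(emission_prob(p_k, v_t))  # the product 0.9 * 0.8 * 0.5
```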

Learning the ChromHMM

  • Need to figure out the number of states

  • Learn HMMs with K = 2 to 80 states, using a penalty factor to penalize the number of parameters

  • State transitions: start with a fully connected HMM, and set transition parameters to zero if they fall below 10^-10

  • Final model had 51 states
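The two learning steps above can be sketched as follows. The slides do not name the penalty, so the BIC-style score here is an assumption chosen for illustration, as is the renormalization of each row after pruning; all function names are hypothetical.

```python
import math

def num_parameters(K, M):
    """Free parameters of a fully connected model with K states and M marks."""
    emissions = K * M          # one Bernoulli parameter per (state, mark)
    transitions = K * (K - 1)  # each row of the transition matrix sums to 1
    initial = K - 1            # initial probabilities sum to 1
    return emissions + transitions + initial

def penalized_score(log_likelihood, K, M, n_obs):
    """BIC-style score (assumed form): higher is better."""
    return log_likelihood - 0.5 * num_parameters(K, M) * math.log(n_obs)

def prune_transitions(b, eps=1e-10):
    """Zero out transition probabilities below eps.

    Renormalizing each row afterwards is an assumption; a row that sums
    to 1 always keeps at least one entry, so the division is safe.
    """
    pruned = []
    for row in b:
        kept = [x if x >= eps else 0.0 for x in row]
        total = sum(kept)
        pruned.append([x / total for x in kept])
    return pruned

print(num_parameters(3, 2))
print(prune_transitions([[0.7, 0.3, 1e-12], [0.2, 0.3, 0.5], [0.0, 0.1, 0.9]]))
```

With a score of this kind, a model with more states is preferred only if its gain in log-likelihood outweighs the extra parameters.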

Learned emission parameters

Emission parameters for state 5


Example output around CAPZA2 gene from ChromHMM

Input chromatin marks

Inferred state sequences

Posterior probability distributions of all 51 states around the CAPZA2 gene

Max posterior state

Posterior probability values of each state
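A minimal sketch of how per-position posteriors and the max posterior state can be computed with the standard forward-backward procedure for HMMs. The slides do not show the computation, so this is a generic illustration rather than ChromHMM's own code; the 2-state, 2-mark parameters are made up.

```python
def emission_prob(p_k, v_t):
    """Product of Bernoulli terms for one state and one binary mark vector."""
    prob = 1.0
    for p, v in zip(p_k, v_t):
        prob *= p if v == 1 else (1.0 - p)
    return prob

def posterior_decode(a, b, p, obs):
    """Forward-backward with per-position rescaling.

    a[k]: initial probability, b[i][j]: transition i -> j,
    p[k][m]: probability mark m is ON in state k, obs[t][m]: binary marks.
    Returns (posteriors[t][k], max_posterior_state[t]).
    """
    K, T = len(a), len(obs)
    e = [[emission_prob(p[k], v) for k in range(K)] for v in obs]

    # Forward pass; each row is rescaled to sum to 1 to avoid underflow.
    fwd = []
    for t in range(T):
        if t == 0:
            row = [a[k] * e[0][k] for k in range(K)]
        else:
            row = [e[t][k] * sum(fwd[t - 1][i] * b[i][k] for i in range(K))
                   for k in range(K)]
        s = sum(row)
        fwd.append([x / s for x in row])

    # Backward pass, rescaled the same way.
    bwd = [[1.0] * K for _ in range(T)]
    for t in range(T - 2, -1, -1):
        row = [sum(b[k][j] * e[t + 1][j] * bwd[t + 1][j] for j in range(K))
               for k in range(K)]
        s = sum(row)
        bwd[t] = [x / s for x in row]

    # Posterior is proportional to forward * backward at each position.
    post = []
    for t in range(T):
        row = [fwd[t][k] * bwd[t][k] for k in range(K)]
        s = sum(row)
        post.append([x / s for x in row])
    best = [row.index(max(row)) for row in post]
    return post, best

a = [0.5, 0.5]
b = [[0.9, 0.1], [0.1, 0.9]]
p = [[0.9, 0.8], [0.1, 0.05]]          # state 0: marks mostly ON; state 1: mostly OFF
obs = [[1, 1], [1, 1], [0, 0], [0, 0]]
post, best = posterior_decode(a, b, p, obs)
print(best)
```

Each row of `post` corresponds to one column of a posterior plot like the one described above, and `best` is the max posterior state track.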

Summary

  • HMMs are powerful models to capture sequential data

  • Very popular in computational biology

    • Gene annotation

    • Representation of a profile: protein domain finding

    • Genome segmentation