High-resolution computational models of genome binding events
Yuan (Alan) Qi
Joint work with the Gifford and Young labs
Dana-Farber Cancer Institute, Jan 2007
ChIP-chip Experiments
• ChIP-chip data: encode valuable information about protein-DNA binding events.
• Goal: decode accurate binding information from the noisy data.
• Challenges: noise; joint influence of multiple binding events.
Joint Binding Deconvolution
JBD: a generative probabilistic graphical model, specified by a data likelihood, prior distributions, and hyperprior distributions.
Shear Distribution
(a) The distribution of DNA fragment sizes produced in the ChIP protocol was experimentally measured and statistically modeled.
(b) An influence function is derived from the measured fragment size distribution.
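One way to picture how an influence function can follow from a fragment size distribution is the minimal sketch below. It is an illustration under stated assumptions, not the derivation used in JBD: the gamma-shaped length distribution and its parameters (`shape`, `scale`) are hypothetical, `influence_function` is a made-up helper name, and `probs` is assumed to describe the lengths of fragments that already contain the binding event, with the event placed uniformly along each fragment.

```python
import numpy as np

def influence_function(lengths, probs, max_dist):
    """Probability that a fragment covering a binding event also covers
    a probe located d bases away, for d = 0..max_dist.

    Assumption (not from the slides): the event is uniformly positioned
    within the fragment, so a fragment of length l reaches a probe at
    distance d with probability max(0, (l - d) / l).
    """
    lengths = np.asarray(lengths, dtype=float)
    probs = np.asarray(probs, dtype=float)
    probs = probs / probs.sum()                  # normalize the length distribution
    d = np.arange(max_dist + 1)[:, None]         # column of distances
    cover = np.clip((lengths[None, :] - d) / lengths[None, :], 0.0, None)
    return (cover * probs[None, :]).sum(axis=1)  # average over fragment lengths

# Hypothetical gamma-shaped fragment length distribution (unnormalized pdf).
lengths = np.arange(1, 2001)
shape, scale = 3.0, 150.0
probs = lengths ** (shape - 1.0) * np.exp(-lengths / scale)

influence = influence_function(lengths, probs, max_dist=2000)
print(influence[[0, 250, 500, 1000]])
```

Under these assumptions the curve equals 1 at the binding site and decays monotonically with distance, which is the qualitative behavior an influence function needs in order to spread one binding event over several neighboring probes.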
Approximate Bayesian Inference
Exact Bayesian posterior of binding events:
The model is non-conjugate and has thousands of variables, so the exact posterior distribution is intractable to compute.
Message-passing algorithm (expectation propagation, EP): EP iteratively refines the factor approximations (i.e., messages) to improve the posterior approximation.
EP in a Nutshell
• Approximate a probability distribution by simpler parametric terms:
• Each approximation term lives in an exponential family (e.g., Gaussian or Gamma distributions).
EP in a Nutshell
Three key steps:
• Deletion: Approximate the “leave-one-out” posterior distribution for the i-th factor.
• Minimization: Minimize the following KL divergence by moment matching.
• Inclusion:
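The three steps can be sketched on a toy problem. The example below is a minimal illustration, not the JBD inference code: it assumes a single scalar unknown with a zero-mean Gaussian prior and a handful of logistic factors (the `slopes` values are invented), Gaussian site approximations stored as natural parameters, and moment matching done by brute-force summation on a grid. The names `ep_1d`, `moments`, and `GRID` are hypothetical.

```python
import numpy as np

GRID = np.linspace(-20.0, 20.0, 4001)  # grid used for numerical moment matching

def moments(log_density):
    """Mean and variance of an unnormalized log-density evaluated on GRID."""
    w = np.exp(log_density - log_density.max())
    w /= w.sum()
    mean = float((w * GRID).sum())
    var = float((w * (GRID - mean) ** 2).sum())
    return mean, var

def ep_1d(log_factors, prior_var=4.0, n_sweeps=20):
    """EP for p(theta) proportional to N(theta; 0, prior_var) * prod_i t_i(theta)."""
    n = len(log_factors)
    tau = np.zeros(n)  # site precisions
    nu = np.zeros(n)   # site precision-times-mean terms
    tau0, nu0 = 1.0 / prior_var, 0.0
    for _ in range(n_sweeps):
        for i, log_t in enumerate(log_factors):
            # Deletion: remove site i to get the leave-one-out (cavity) Gaussian.
            tau_cav = tau0 + tau.sum() - tau[i]
            nu_cav = nu0 + nu.sum() - nu[i]
            # Minimization: match the mean and variance of cavity * exact factor,
            # which minimizes the KL divergence within the Gaussian family.
            log_tilted = -0.5 * tau_cav * GRID**2 + nu_cav * GRID + log_t(GRID)
            mean, var = moments(log_tilted)
            # Inclusion: new site = matched Gaussian divided by the cavity.
            tau[i] = 1.0 / var - tau_cav
            nu[i] = mean / var - nu_cav
    tau_post = tau0 + tau.sum()
    return (nu0 + nu.sum()) / tau_post, 1.0 / tau_post

# Hypothetical logistic factors t_i(theta) = sigmoid(a_i * theta).
slopes = [1.0, 0.5, 1.5, -2.0]
log_factors = [lambda t, a=a: -np.logaddexp(0.0, -a * t) for a in slopes]

ep_mean, ep_var = ep_1d(log_factors)

# Reference: exact posterior moments by direct summation on the same grid.
log_exact = -0.5 * GRID**2 / 4.0 + sum(f(GRID) for f in log_factors)
exact_mean, exact_var = moments(log_exact)
print(ep_mean, ep_var, exact_mean, exact_var)
```

A one-dimensional grid makes the moment-matching step trivial to check against the exact posterior. In a model with thousands of variables the same three steps apply, but the moments have to be computed analytically or by a cheaper approximation.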
Spatial resolution comparison between JBD and other methods
• The average distance of JBD’s Gcn4 binding predictions to motif sites is smaller than for other methods, and JBD identifies more known Gcn4 targets.
JBD resolves proximal binding events better than other methods do. Shown here is the performance of the JBD, MPeak, and Ratio methods on 200 simulated DNA regions, each containing two binding events.
Using binding posterior to guide motif discovery
Approach:
• Use binding posterior probabilities derived from the ChIP-chip data to weight sequence regions differently for motif discovery.
Results:
• The Mig2 motif was found, whereas a standard motif discovery algorithm (e.g., MEME) failed to find it.
• Note that the correct motif for Mig2 was not recovered when using the Ratio method to analyze the ChIP-chip data.
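The idea of weighting sequence regions by binding posterior can be illustrated with a toy k-mer count. This is only a sketch of the weighting principle, not the motif discovery algorithm used in this work: the sequences, the posterior values, and the helper `weighted_kmer_scores` are all invented for illustration, and each k-mer occurrence is simply weighted by the mean posterior over the bases it covers.

```python
from collections import defaultdict

def weighted_kmer_scores(sequences, posteriors, k):
    """Sum, over all occurrences of each k-mer, the mean per-base binding
    posterior of the window it occupies (hypothetical scoring rule)."""
    scores = defaultdict(float)
    for seq, post in zip(sequences, posteriors):
        for start in range(len(seq) - k + 1):
            window = post[start:start + k]
            scores[seq[start:start + k]] += sum(window) / k
    return dict(scores)

# Toy data: "ACGT" is frequent everywhere, "GGGG" sits where binding is likely.
sequences = ["ACGTACGTGGGG", "GGGGACGTACGT"]
posteriors = [[0.05] * 8 + [0.9] * 4,
              [0.9] * 4 + [0.05] * 8]
uniform = [[1.0] * len(s) for s in sequences]  # no positional information

weighted = weighted_kmer_scores(sequences, posteriors, k=4)
unweighted = weighted_kmer_scores(sequences, uniform, k=4)

best_weighted = max(weighted, key=weighted.get)
best_unweighted = max(unweighted, key=unweighted.get)
print(best_weighted, best_unweighted)
```

In this toy case the unweighted count favors the frequent background word `ACGT`, while the posterior-weighted score favors `GGGG`, the word located under the predicted binding positions.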
Positional priors for motif discovery improve robustness to false input DNA sequence regions.