
Motif Enrichment Analysis in Co-Expressed Gene Sets and High-Throughput Sequence Sets

Wyeth Wasserman. Jan. 18, 2012. opossum.cisreg.ca/oPOSSUM3


Presentation Transcript


  1. Motif Enrichment Analysis in Co-Expressed Gene Sets and High-Throughput Sequence Sets • Wyeth Wasserman • Jan. 18, 2012 • opossum.cisreg.ca/oPOSSUM3

  2. Welcome • If you encounter any technical difficulties during the webinar, type a report using the chat option • Slide presentation: ~20 min • Questions will be compiled as they are submitted and answered during the final Q&A/discussion period • During the discussion session, audience members will be able to speak

  3. Webinar Format • Introduction • Walk-Through • Summary • Q&A

  4. Introduction

  5. Overview • Given co-expressed gene sets, what are the key mediators of co-expression? • Focus on TFs • Web-based software system for motif enrichment analysis • Co-expressed genes or sequences • Multiple sets of analysis methods • Available for human, mouse, fly, worm, yeast

  6. Motif Enrichment Analysis • Finds over-represented TFBS in co-expressed gene sets • [Figure: target vs. background sets, with example p-values p=0.66, p=0.55, p=0.04]

  7. What do we need? • Region selection • Where to look for enriched binding sites • Use conservation filter to restrict search space • TFBS profiles to search for • Need a pool of validated profiles • Scoring metrics for enrichment • How to measure motif over-representation

  8. Conserved Region Selection • [Figure: phastCons score vs. genomic position for a gene, with a threshold line and conserved regions CR1, CR2, CR3, CR4]
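
The thresholding idea on this slide can be illustrated with a minimal sketch. It assumes per-base conservation scores are already available as a list of floats; the function name `conserved_regions` and the `min_length` parameter are hypothetical, and the actual oPOSSUM region-selection procedure may differ in its details.

```python
from typing import List, Tuple


def conserved_regions(scores: List[float], threshold: float,
                      min_length: int = 1) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index ranges whose scores are all >= threshold.

    `scores` stands in for per-base phastCons scores (an assumption of this sketch).
    Runs shorter than `min_length` are discarded.
    """
    regions = []
    start = None  # start index of the run currently above threshold, if any
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_length:
                regions.append((start, i))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(scores) - start >= min_length:
        regions.append((start, len(scores)))
    return regions


example_scores = [0.1, 0.8, 0.9, 0.2, 0.7, 0.7, 0.7, 0.1]
print(conserved_regions(example_scores, threshold=0.6))
print(conserved_regions(example_scores, threshold=0.6, min_length=3))
```

Only the positions inside the returned ranges would then be scanned for binding sites, which is how a conservation filter restricts the search space.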

  9. TFBS Profiles • JASPAR 2010: Portales-Casamar et al., Nucleic Acids Research 2009 • Expanded collection of TFBS profiles • 130 vertebrate profiles • 105 insect profiles • 5 nematode profiles • 177 yeast profiles • PBM (104), PBM_HOMEO (176), PBM_BHLH (19) • Standardized 2-level TF classification (class, family)

  10. Scoring Metrics • Z scores • Based on the number of occurrences of the TFBS relative to background • Normalized for sequence length • Simple binomial distribution model • Fisher scores • Fisher exact probability test • Fisher score = -log(Fisher p-value) • Based on the number of genes containing the TFBS relative to background
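
The two metrics on this slide can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the oPOSSUM implementation: the Z score uses a plain normal approximation to the binomial with no continuity correction, the Fisher test is one-tailed (enrichment only) with target and background treated as disjoint gene sets, and the natural logarithm is assumed because the slide does not state a base. All function and parameter names are hypothetical.

```python
import math


def z_score(target_hits: int, target_len: int, bg_hits: int, bg_len: int) -> float:
    """Z score for TFBS occurrences, normalized for sequence length.

    The per-nucleotide hit rate is estimated from the background, and the
    target count is compared to its binomial expectation.
    """
    rate = bg_hits / bg_len
    expected = rate * target_len
    std_dev = math.sqrt(target_len * rate * (1.0 - rate))
    return (target_hits - expected) / std_dev


def fisher_p_value(target_with: int, target_total: int,
                   bg_with: int, bg_total: int) -> float:
    """One-tailed Fisher exact p-value for enrichment of genes containing the TFBS."""
    total = target_total + bg_total
    with_site = target_with + bg_with
    denominator = math.comb(total, target_total)
    max_k = min(target_total, with_site)
    # Hypergeometric upper tail: P(at least `target_with` target genes have the site).
    tail = sum(
        math.comb(with_site, k) * math.comb(total - with_site, target_total - k)
        for k in range(target_with, max_k + 1)
    )
    return tail / denominator


def fisher_score(target_with: int, target_total: int,
                 bg_with: int, bg_total: int) -> float:
    """Fisher score = -log(Fisher p-value); natural log assumed."""
    return -math.log(fisher_p_value(target_with, target_total, bg_with, bg_total))


print(z_score(target_hits=20, target_len=1000, bg_hits=100, bg_len=10000))
print(fisher_score(target_with=3, target_total=3, bg_with=0, bg_total=3))
```

The Z score counts individual site occurrences, so it rewards many sites in few genes, while the Fisher score counts genes with at least one site; the two rankings can therefore disagree.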

  11. Additional Metric for Sequence-Based Analysis • KS scores • Kolmogorov-Smirnov test • Compares the empirical distribution of the distances of the binding sites from the maximum point of confidence (MPC) to the background • Expect real binding sites to be centered around the MPC • KS score = -log(KS test p-value) • [Figure: foreground vs. background distributions around the MPC]
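
As a rough illustration of the KS score, the sketch below runs a two-sample Kolmogorov-Smirnov test on distances of binding sites from the MPC. It is written with the standard library only, uses the asymptotic Kolmogorov series for the p-value (an approximation that is weakest for small samples), and assumes a natural logarithm. The function names and the choice of a two-sided, two-sample test are assumptions of this sketch rather than details confirmed by the slides.

```python
import math
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Maximum absolute difference between the two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_p_value(d, n, m):
    """Asymptotic two-sided p-value from the Kolmogorov distribution."""
    lam = d * math.sqrt(n * m / (n + m))
    if lam < 0.2:
        return 1.0  # the series converges slowly here and the true value is ~1
    total = sum(
        2.0 * (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
        for j in range(1, 101)
    )
    return min(1.0, max(0.0, total))


def ks_score(foreground_distances, background_distances):
    """KS score = -log(KS test p-value); natural log assumed."""
    d = ks_statistic(foreground_distances, background_distances)
    p = ks_p_value(d, len(foreground_distances), len(background_distances))
    return math.inf if p == 0.0 else -math.log(p)


# Hypothetical distances (in bp) of predicted sites from the MPC.
foreground = list(range(0, 20))       # sites concentrated near the MPC
background = list(range(0, 200, 10))  # sites spread over a wider window
print(ks_score(foreground, background))
```

A foreground whose sites cluster near the MPC produces a large score, while a foreground distributed like the background scores near zero.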

  12. Analysis Methods

  13. Walk-Through

  14. http://opossum.cisreg.ca/oPOSSUM3

  15. Human SSA - Input

  16. Human SSA - Results

  17. oPOSSUM methods

  18. Human aCSA - Input

  19. Human aCSA - Input

  20. Human aCSA - Input

  21. Human aCSA - Results

  22. TFBS Cluster Analysis • [Figure: TFBS profile cluster]

  23. TFBS Cluster Analysis (TCA) • Merge TFBS cluster hits • Over-representation analysis based on merged TFBS cluster hits • [Figure: gene with conserved regions CR1, CR2, CR3, CR4 and predicted TFBSs]
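
The merging step can be pictured as collapsing overlapping hits from profiles in the same cluster into single intervals before counting. The sketch below assumes hits are given as inclusive (start, end) coordinates and that any overlap triggers a merge; both the representation and the function name `merge_cluster_hits` are hypothetical, and the real TCA merging rule may differ.

```python
from typing import List, Tuple


def merge_cluster_hits(hits: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Merge overlapping TFBS hits (inclusive coordinates) into cluster hits."""
    merged: List[Tuple[int, int]] = []
    for start, end in sorted(hits):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous merged hit: extend it instead of adding a new one.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


# Hits from several profiles belonging to one TFBS cluster (made-up positions).
hits = [(15, 25), (10, 20), (40, 50), (50, 55), (70, 80)]
print(merge_cluster_hits(hits))
```

Merging first avoids counting the same genomic site several times when similar profiles in a cluster all match it, which would otherwise inflate the over-representation statistics.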

  24. Human TCA - TFBS cluster selection

  25. Human TCA - Results

  26. TFCluster Info Page

  27. Seq SSA - Input

  28. Seq SSA - Input

  29. Seq SSA - Results

  30. KS score

  31. Seq TCA - Input

  32. SUMMARY

  33. oPOSSUM-3 • Web-based system for motif enrichment analysis in co-expressed gene sets and sequences from high-throughput experiments • Important functionalities • Gene-based vs. Sequence-based • Single site vs. Anchored combination site • Individual vs. clusters of TFBS profiles • Human, mouse, fly, worm and yeast
