
An extensive map of RNA-protein interactions in Drosophila melanogaster



  1. An extensive map of RNA-protein interactions in Drosophila melanogaster. Marcus Stoiber (Biostatistics PhD student, UC Berkeley), Gemma May, Mike Duff, Robert Obar, Spyros Artavanis-Tsakonas, Ben Brown, Brenton Graveley, and Susan Celniker

  2. Outline • RIP-seq, Datasets & QC Overview • Differential Binding (DB) Analysis Pipeline • Network, Clustering Analysis • RBPs as global post-transcriptional regulators • Hotspot RNAs • RBPs bind the RNA of other RBPs • Clustering RNA Binding Proteins (RBP) • Comparison to Related Studies • RNAi of RBPs • RIP-chip in Yeast • Motif Enrichment

  3. RIP-seq Overview • RNA immunoprecipitation followed by sequencing • Identifies all RNA binding partners for a single RBP • Smaller studies have examined a few RBPs, but none has successfully surveyed many RBPs • RBP functions (post poly-A): • Export • Translational repression/activation • Localization • Signaling • [Figure labels: Spliceosome, Novel, EJC, hnRNP, SR Proteins; RNA with UTR, intron, exon]

  4. RIP-seq Overview RIP-seq Experiment • [Workflow figure: S2 cells → transfect with HA-tagged protein of interest → lyse cells → immunoprecipitation → confirm IP & elute RNA → Illumina sequencing → computational analysis. Figure labels: RNA, native proteins.]

  5. Datasets & QC Overview RIP-seq Data & QC • 24 RBPs of interest (in biological duplicate) ★: • Spliceosome Core: Cbp20, CG6227, CG6841, Rm62, Smn, snRNP-U1-70K, U2af50 • Exon Junction Complex (EJC): Upf1 • Heterogeneous Nuclear RNP: CG17838, elav, msi, mub, ps • Novel: Fmr1, qkr54B, qkr58E-1 • Other: RpS3, eIF3-S4 • SR Proteins: B52, Rbp1, SC35, SF2, Srp54, tra2 • 4 controls ★✚ (empty vector with HA-tag) and 4 non-RBP ★✚ negative control experiments • ★ Confirmed via analysis of the sequence adjacent to the HA-tag • ✚ Confirmed via leave-one-out DE analysis

  6. Differential Binding (DB) Analysis Pipeline • [Pipeline figure: aligned reads → locus-level read counts (count RNA totals) → DESeq on each replicate separately, with locus dispersion estimation across all samples and controls → p-values for each RNA – RBP replicate → Irreproducible Discovery Rate (IDR) → IDR value for each RNA – RBP pair] • Theory of IDR: • Li, Brown, Huang, Bickel; Measuring Reproducibility of High-Throughput Experiments. • Practice of using IDR: • Landt, et al. ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia.

  7. Network, Clustering Analysis RBP Network: Biological Findings • Hotspot RNAs • RNAs of post-transcriptional regulators • RNAi, NMD, RBPs, splicing • RBPs bind the RNA of other RBPs • Confirms predicted phenomenon in metazoans [1] • RBP-specific RNA partners • Characterization of RBPs of unknown function • Guilt by association • Differential Exon Usage Example (lncRNA) • [1] Kosti, I., Radivojac, P. & Mandel-Gutfreund, Y. An integrated regulatory network reveals pervasive cross-regulation among transcription and splicing factors. PLoS Comput Biol 8, e1002603 (2012).

  8. Hotspot RNAs RBP Network: 30,000 Foot View • “Hot-spot” RNAs: • Bound by 17: Hsp26 • Bound by 16: Smg5 • Bound by 15: Cdc5, CG12065, CG3008, CG8801, Ranbp9, Rpn10 • GO term enrichment for hot-spot RNAs: • Splicing • NMD • RNAi • Neurogenesis • Protein Folding • *** Indicates a global translational regulation mechanism for RBPs *** • * Poisson-Binomial distribution assuming none of the 5191 RNAs are actually differentially bound.
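
The Poisson-Binomial null mentioned on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each RBP binds any given RNA independently with probability equal to that RBP's overall bound fraction, and the per-RBP counts in `bound_per_rbp` are made-up placeholders. Only the total of 5191 RNAs comes from the slide.

```python
def poisson_binomial_pmf(probs):
    """Return P(X = k) for k = 0..len(probs), where X counts successes
    among independent Bernoulli trials with the given probabilities."""
    pmf = [1.0]  # before any trial, P(X = 0) = 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # this trial fails
            nxt[k + 1] += mass * p       # this trial succeeds
        pmf = nxt
    return pmf


def upper_tail(probs, k):
    """P(X >= k) under the Poisson-binomial null."""
    return sum(poisson_binomial_pmf(probs)[k:])


# Hypothetical inputs: number of RNAs called bound by each of 24 RBPs.
N_RNAS = 5191
bound_per_rbp = [150 + 25 * i for i in range(24)]
null_probs = [count / N_RNAS for count in bound_per_rbp]

# Expected number of RNAs bound by at least 15 RBPs if binding were random.
expected_hotspots = N_RNAS * upper_tail(null_probs, 15)
print(f"{expected_hotspots:.3g}")
```

Under this kind of null, an RNA bound by 15 or more RBPs would be extremely unlikely by chance, which is the sense in which the slide calls such RNAs hot-spots.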

  9. Hotspot RNAs • Hotspot RNA signal is driven by the strongest binders. • Below is a similar plot for a GO term which is significant in only four hnRNP factors.

  10. RBPs bind the RNA of other RBPs RBPs bind mRNAs of other RBPs & Hotspot RNAs

  11. RBPs bind the RNA of other RBPs RBPs that bind many RNAs tend to have their mRNA bound by many RBPs • Correlation statistics: • Raw correlation (Pearson): 0.7615207, p-value (permutation test) ≈ 0.002446714 • Raw correlation (Spearman): 0.718886, p-value (permutation test) ≈ 0.002336134 • Possible confounding factors: • Driven by statistical power issues? (Transparent red circles) • Partial correlation adjusted for normalized expression: • Pearson: 0.7857327 (p = 5.849305e-09) • Spearman: 0.7243166 (p = 1.477892e-06) • Driven by native biological expression? (Transparent blue circles) • Partial correlation adjusted for length normalized expression: • Pearson: 0.7911933 (p = 3.05612e-09) • Spearman: 0.750386 (p = 1.968678e-07) • Length normalized expression = (normalized expression / gene length) * mean(all gene lengths), which keeps the two normalizations on the same scale.
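
A permutation test for a correlation, and the length normalization defined on this slide, can be sketched as below. This is an illustrative sketch under assumptions: the slide does not say whether the test was one- or two-sided (two-sided is assumed here), the function names are hypothetical, and the example vectors are invented rather than taken from the study.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permutation_pvalue(x, y, n_perm=10000, seed=0):
    """Two-sided permutation p-value: shuffle y and count how often the
    shuffled |r| is at least as large as the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(pearson(x, y_perm)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one keeps the estimate above zero


def length_normalized(expression, gene_lengths):
    """(normalized expression / gene length) * mean(all gene lengths)."""
    mean_len = mean(gene_lengths)
    return [e / l * mean_len for e, l in zip(expression, gene_lengths)]


# Invented example: RNAs bound by each RBP vs. RBPs binding that RBP's mRNA.
rnas_bound = [120, 300, 450, 80, 610, 900, 240, 700]
rbps_binding_mrna = [2, 5, 6, 1, 9, 12, 4, 8]
print(pearson(rnas_bound, rbps_binding_mrna))
print(permutation_pvalue(rnas_bound, rbps_binding_mrna, n_perm=2000))
```

Multiplying by the mean gene length keeps the length-normalized values on roughly the same scale as the unnormalized ones, which matches the rationale stated on the slide.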

  12. RBP-specific RNA partners RBPs Which Bind Unique RNAs • [Figure legend: hnRNPs, Core Proteins, SR Proteins, EJC] • hnRNPs are more likely to have unique binding partners. • Wilcoxon rank-sum test p-value: 0.0001907

  13. Characterization of RBPs of unknown function Functionally related RBPs bind functionally related mRNAs • Define distance between two RBPs as: • Transcripts: Raw Overlap (Jaccard Distance) • GO terms: Weighted Overlap (Cosine Distance) • Dimension reduction by MDS • Identified global coordination and potential classification of novel RBPs: • Fmr1 – Spliceosome Core / SR • qkr54B and qkr58E-1 – hnRNP • [Figure labels (GO terms): protein phosphorylation, determination of adult lifespan, long-term memory, locomotor rhythm]
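
The two RBP-to-RBP distances named on this slide can be sketched as follows. This is a minimal sketch under assumptions: the slide does not specify how GO terms were weighted, so the weights here are arbitrary, and the RBP names, transcript IDs and weights are invented placeholders. The MDS step is omitted.

```python
import math


def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| for two collections of transcript IDs."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # two empty sets are treated as identical
    return 1.0 - len(a & b) / len(union)


def cosine_distance(u, v):
    """1 - cosine similarity for two sparse vectors given as dicts
    mapping GO term -> weight."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # no shared direction can be defined
    return 1.0 - dot / (norm_u * norm_v)


# Invented example data for two hypothetical RBPs.
bound_transcripts = {
    "rbp_a": {"tx1", "tx2", "tx3"},
    "rbp_b": {"tx2", "tx3", "tx4"},
}
go_weights = {
    "rbp_a": {"long-term memory": 2.0, "locomotor rhythm": 1.0},
    "rbp_b": {"long-term memory": 1.0, "protein phosphorylation": 3.0},
}

print(jaccard_distance(bound_transcripts["rbp_a"], bound_transcripts["rbp_b"]))
print(cosine_distance(go_weights["rbp_a"], go_weights["rbp_b"]))
```

Computing either distance for every pair of RBPs gives a symmetric matrix, which is the kind of input an MDS projection takes.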

  14. Differential Exon Usage Example lncRNA Locus • [Figure: locus CG33229 / CR42862; tracks: negative controls, Srp54, B52]

  15. Comparison to Yeast Study Comparison to Yeast RIP-chip [1] • “RBPs that bind many RNAs tend to have their mRNA bound by many RBPs” appears to be a metazoan-specific phenomenon. • GO coordination appears to be stronger than transcript coordination between functionally related RBPs. • [1] Diverse RNA-Binding Proteins Interact with Functionally Related Sets of RNAs, Suggesting an Extensive Regulatory System; Daniel J. Hogan, Daniel P. Riordan, André P. Gerber, Daniel Herschlag, Patrick O. Brown

  16. Comparison to RNAi Study Correspondence with RNAi • 55 RBPs versus 2 control samples • ~20 overlapping experiments with RIP • In S2 cells • One interpretation is that RIP hits can either: • Directly affect expression of an RNA (RNAi hit) • Localize / sequester an RNA (causing other RNAi hits) • [Figure labels: RNAi, RIP]

  17. Comparison to RNAi Study Correspondence with RNAi

  18. Motif Enrichment • Current motif enrichment algorithms do not work in complex transcript space; they target either DNA space or simple (yeast) transcript space. • For each gene set of interest (red lines), random “matched” sets (grey lines) are chosen. • Calculate hypergeometric p-values for each 7-mer in each random gene set • Plots show rank (x-axis) vs. raw p-value (y-axis) • A correct null would follow a line with slope 1. • Clearly this is not a valid null because of k-mer distributions within genes • [Figure panels: ps, elav, B52]
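
A hypergeometric p-value for one k-mer in one gene set can be sketched as below. The setup is an assumption, since the slide does not spell it out: the population is all genes, a "success" is a gene whose sequence contains the k-mer, and the draw is the gene set. The sequences and names are toy placeholders.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) when `draws` items are sampled without replacement from
    `population` items, of which `successes` are marked."""
    total = comb(population, draws)
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(max(k, 0), upper + 1)
    )
    return favourable / total


def genes_with_kmer(sequences, kmer):
    """Names of genes whose sequence contains the k-mer at least once."""
    return {name for name, seq in sequences.items() if kmer in seq}


# Toy transcript sequences (invented) and a toy gene set of interest.
sequences = {
    "g1": "ACGUUUUUUUAC",
    "g2": "GGAGGAGGCAUC",
    "g3": "CAUUUUUUUGGA",
    "g4": "ACGACGACGACG",
    "g5": "UUUUUUUACGAC",
    "g6": "GCGCGCGCAUAU",
}
gene_set = {"g1", "g3", "g5"}

carriers = genes_with_kmer(sequences, "UUUUUUU")
p_value = hypergeom_upper_tail(
    k=len(carriers & gene_set),
    population=len(sequences),
    successes=len(carriers),
    draws=len(gene_set),
)
print(p_value)
```

Running the same calculation over every 7-mer for a random matched gene set gives the grey curves the slide describes. The slide's point is that these raw p-values do not follow a uniform null.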

  19. Motif Enrichment • Clearly some samples have more significant hits than others • Use the 95% quantile of the most extreme p-value from each random gene set as the cutoff value for enriched motifs in the gene set of interest • 2 of the 3 gene sets of interest identified significant motifs using this method • These motifs closely match the in vitro motifs [1] for these factors • B52 cluster top significant enriched motifs: • GAGGAGG, AGGAGGA, GGAGGAG, AGAAGGA • elav cluster significant enriched motif: • UUUUUUU • [1] Ray, Debashish, et al. "A compendium of RNA-binding motifs for decoding gene regulation." Nature 499.7457 (2013): 172-177.
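
One way to read the cutoff rule on this slide is sketched here. It rests on an assumption: "extreme" is taken to mean the smallest p-value in each random gene set, and the quantile is taken on the -log10 scale so that larger means more extreme. The function names and the p-values in the example are invented.

```python
import math


def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with 0 <= q <= 1."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def enrichment_cutoff(random_set_pvalues, q=0.95):
    """Cutoff on the -log10(p) scale: the q-quantile of the most extreme
    (smallest) p-value observed in each random gene set.
    Assumes every p-value is strictly positive."""
    extremes = [-math.log10(min(pvals)) for pvals in random_set_pvalues]
    return quantile(extremes, q)


def enriched_motifs(motif_pvalues, cutoff):
    """Motifs in the gene set of interest that exceed the cutoff."""
    return sorted(
        motif for motif, p in motif_pvalues.items() if -math.log10(p) > cutoff
    )


# Invented p-values: one list per random matched gene set.
random_sets = [
    [0.20, 0.03, 0.50],
    [0.01, 0.40, 0.09],
    [0.002, 0.30, 0.70],
    [0.05, 0.06, 0.90],
]
# Invented p-values for 7-mers in the gene set of interest.
observed = {"GAGGAGG": 1e-6, "AGGAGGA": 1e-5, "ACGACGA": 0.04}

cutoff = enrichment_cutoff(random_sets)
print(cutoff, enriched_motifs(observed, cutoff))
```

Calibrating against the random sets rather than against the nominal p-value scale is what allows for the k-mer distributions within genes that make the raw hypergeometric null invalid.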

  20. Motif Enrichment Plan • Hypergeometric p-values do not give an accurate rank list of enrichment • Instead, the count of random samples in which a motif is found less often than in the sample of interest gives a valid rank list of enrichment • Possibly add a filter for low-complexity regions prior to this step • Motifs found much more often within the sample of interest than within random samples are clustered: • Use edit distance between motifs, followed by k-means clustering on an n-dimensional MDS projection • Align each set of motifs (ClustalW / Clustal Omega) and produce a PWM
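
The rank statistic and the motif-to-motif distance in this plan can be sketched as follows. This is a sketch only: the motif counts are invented, the function names are hypothetical, and the k-means, MDS and alignment steps are not shown. The four motifs are the B52 motifs listed on the previous slide.

```python
def empirical_rank_score(observed_count, random_counts):
    """Number of random samples in which the motif occurs less often than
    in the sample of interest (higher means more enriched)."""
    return sum(1 for c in random_counts if c < observed_count)


def edit_distance(a, b):
    """Levenshtein distance between two motif strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = cur
    return prev[-1]


# Invented occurrence counts for one motif: gene set of interest vs. random sets.
score = empirical_rank_score(42, [10, 17, 45, 12, 30, 8, 41, 25])
print(score)

# Pairwise edit distances between the B52 motifs listed on the previous slide.
motifs = ["GAGGAGG", "AGGAGGA", "GGAGGAG", "AGAAGGA"]
distances = [[edit_distance(m, n) for n in motifs] for m in motifs]
for motif, row in zip(motifs, distances):
    print(motif, row)
```

The resulting distance matrix is what would be passed to the MDS projection and k-means clustering described on the slide.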

  21. Future Directions Summary of Findings • “RBPs are global regulators of post-transcriptional machinery” • Bind mRNAs of proteins involved in RNAi, NMD, splicing, protein folding • Bind mRNAs of other RBPs • “RBPs which are master regulators must have their translation regulated by many RBPs” • Appears to be a metazoan-specific phenomenon • hnRNPs tend to bind more specific RNA partners • Characterization of 3 RBPs of previously unknown class • Motif enrichment results match previous studies

  22. Acknowledgments • modENCODE Consortium • LBNL: Ben Brown, Susan Celniker • University of Connecticut Health Center: Brenton Graveley, Mike Duff, Gemma May • Harvard Medical School: Robert Obar, Spyros Artavanis-Tsakonas
