
First Thesis Advisor Committee Meeting



Presentation Transcript


  1. First Thesis Advisor Committee Meeting. Alejandro Reyes. TAC members.

  2. DEXSeq, detecting differential usage of exons using RNA-seq

  3. Exon usage and RNA-seq. Exon usage allows expansion of the eukaryotic proteome: • Alternative first exons (promoter) • Alternative last exons (termination signals) • Alternative splicing. RNA-seq allows the study of transcriptomes, but RNA-seq analysis lacks methods that properly estimate biological variation.

  4. pasilla. Drosophila melanogaster S2 cell cultures: • siRNA against the splicing factor pasilla (NOVA): 3 biological replicates (1X single end, 2X paired end) • control (no treatment): 4 biological replicates (2X single end, 2X paired end)

  5. RNA-seq (Wang, Gerstein and Snyder, 2009).

  6. Exon usage (Brooks et al. data). Changes in exon usage should be reflected in the expression of the exon, independently of the expression of the gene.

  7. Exon counting bins. Anders, Reyes and Huber. In press.

  8. Counting rules • Alignment vs. genome • Uniquely aligned reads

  9. Exon counts

  10. Model: counts of exon l of gene i in sample j

  11. Size factor. Model terms: counts of exon l of gene i in sample j; size factor. • Where i is a counting bin • Normalization factor for sequencing depth (Anders and Huber, 2010).
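The slide's formula is not preserved in this transcript. As a minimal sketch, assuming the median-of-ratios estimator described by Anders and Huber (2010), the size factor of a sample is the median, over counting bins, of the ratio between the bin's count in that sample and the bin's geometric mean across samples. The function name, the handling of zero counts, and the toy counts below are illustrative assumptions, not taken from the presentation.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors (illustrative sketch).

    counts[i][j] is the number of reads in counting bin i for sample j.
    Returns one size factor per sample.
    """
    n_samples = len(counts[0])
    # Assumption: bins with a zero count in any sample are skipped,
    # because their geometric mean is zero and the ratio is undefined.
    usable = [row for row in counts if all(k > 0 for k in row)]
    if not usable:
        raise ValueError("no counting bin has nonzero counts in every sample")
    # Log of the geometric mean of each usable bin across samples.
    log_geo_means = [sum(math.log(k) for k in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - g for row, g in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Hypothetical toy data: sample 2 was sequenced twice as deeply as sample 1.
toy_counts = [[10, 20], [100, 200], [30, 60], [0, 5]]
toy_factors = size_factors(toy_counts)
```

Dividing each count by its sample's size factor puts the samples on a common scale, which is what lets the later slides compare exon counts across sequencing depths.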

  12. Dispersion. Model terms: counts of exon l of gene i in sample j; dispersion; size factor. • Distinguishes biological variability within groups. • Mean-variance dependency. • Then one can infer changes due to the difference in condition.

  13. Dispersion • Poisson distribution with mean q and variance q (q is computed). • Biological sample with mean u and variance v (v is estimated from the data). • Negative binomial distribution with mean u and variance v + q.

  14. Dispersion • Standard maximum-likelihood estimates are strongly biased when the number of samples is small. • Cox-Reid conditional maximum likelihood estimation (Robinson, McCarthy and Smyth, 2010).

  15. Dispersion • Cox-Reid dispersion estimate. Anders, Reyes and Huber. In press.

  16. Dispersion • Cox-Reid dispersion estimate • Gamma mean-variance fit • max(fitted value, CR estimate). Anders, Reyes and Huber. In press.
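The dispersion slides can be illustrated with a short sketch. It assumes the common negative binomial parametrization in which the variance is the Poisson shot noise (the mean) plus a biological term equal to the dispersion times the squared mean; this is one way to read the "variance v + q" decomposition. The trend form `a0 + a1 / mu` and its coefficients are hypothetical stand-ins for the gamma mean-variance fit, whose exact form is not given in the transcript.

```python
def nb_variance(mu, alpha):
    """Variance of a negative binomial count with mean mu and dispersion alpha.

    The first term is the Poisson shot noise; the second is the extra
    (biological) variance that grows with the squared mean.
    """
    return mu + alpha * mu ** 2


def fitted_dispersion(mu, a0, a1):
    """Hypothetical parametric dispersion trend (assumed form, not from the slides)."""
    return a0 + a1 / mu


def final_dispersion(cr_estimate, mu, a0, a1):
    """Take the larger of the trend value and the per-exon Cox-Reid estimate."""
    return max(fitted_dispersion(mu, a0, a1), cr_estimate)


# Hypothetical numbers: an exon with mean count 100.
high_estimate = final_dispersion(cr_estimate=0.05, mu=100.0, a0=0.01, a1=2.0)
low_estimate = final_dispersion(cr_estimate=0.001, mu=100.0, a0=0.01, a1=2.0)
```

Taking the maximum is a conservative choice: an exon whose own estimate falls below the trend is not trusted to be less variable than comparable exons, which guards against false positives when few replicates are available.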

  17. Generalized linear model. Model terms: counts of exon l of gene i in sample j; dispersion; size factor; expression strength of gene i in control; fraction of reads in exon l in control; change in expression of gene i caused by the treatment pj; change in the fraction of reads in exon l due to the treatment pj.
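The slide's equation did not survive in the transcript; only its term labels did. A reconstruction consistent with those labels, offered as a sketch rather than the slide's exact notation, is given below. The symbols K, s, \mu, \alpha, \beta and \rho_j (the treatment of sample j, which appears as "pj" in the transcript) are assumed names.

```latex
% Counts of exon l of gene i in sample j, with size factor s_j
% and dispersion \alpha_{il}:
K_{ijl} \sim \mathrm{NB}\left(\text{mean} = s_j \, \mu_{ijl},\;
                              \text{dispersion} = \alpha_{il}\right)

% Log-linear predictor for the mean:
\log \mu_{ijl} = \beta^{G}_{i}            % expression strength of gene i in control
               + \beta^{E}_{il}           % fraction of reads in exon l in control
               + \beta^{C}_{i\rho_j}      % change in expression of gene i caused by treatment \rho_j
               + \beta^{EC}_{il\rho_j}    % change in the fraction of reads in exon l due to treatment \rho_j
```

Under this reading, differential exon usage corresponds to a nonzero interaction term \beta^{EC}, while a change in overall gene expression is absorbed by \beta^{C}.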

  18. Brooks et al. dataset: 253 exons in 159 genes (FlyBase annotation), 10% FDR. Anders, Reyes and Huber. In press.

  19. FDR control and comparison with cuffdiff

  20. Conclusions I. We have developed an R/Bioconductor package to test for differential exon usage: • Correct estimation of biological variation • Flexible statistical test, allowing covariates (NB GLMs) • Parallelization • Visualization • HTML reports. 2 HTSeq Python preprocessing scripts: • defining exon bins • counting reads. 600 monthly downloads, weekly mentions in blogs and mailing lists, cited in two high-impact publications, used as course material in bioinformatics courses, accepted for publication in Genome Research!

  21. Perspectives • Improve the HTML report generator (JavaScript) • Use information from split junction reads

  22. Evolution of exon usage regulation

  23. Spliceosome assembly [Figure: spliceosome assembly through the A, B and C complexes, showing the U1, U2, U4, U5 and U6 snRNPs, U2AF, hnRNP and SR proteins, kinases and phosphatases, RNA helicases, cyclophilins, and ~200 non-snRNP proteins.] Modified from EURASNET.

  24. Evolutionary consequences • Alternative splicing could be a flexible feature for the adaptation of species, making it easy to generate and test new transcripts (Black, 2003). • Alternatively spliced exons have been associated with fast evolution rates and are frequently gained or lost (Ermakova, 2006). • Mutations easily generate splice signals (exonization of introns) or delete them (loss of exons) (Alekseyenko, 2007). • Differences are found between closely related species (Blekhman et al., 2010) and between individuals of the same species (Lalonde et al., 2010). • Most AS events between human and mouse are species-specific (Irimia, 2009).

  25. Functional consequences. The transcriptome is dominated by “noisy” splicing, but tissue-specific isoforms are highly expressed (Pickrell et al., 2010).

  26. AS functional importance (example) Sex determination in Drosophila melanogaster:

  27. AS functional importance (example II) Splicing switch for differentiation:

  28. Functional importance (example III). Misregulation of alternative splicing is associated with many disease phenotypes, including cancer.

  29. Evolution of exon usage regulation Irimia et al, 2009

  30. Evolution of expression. “Rate of gene expression evolution varies among organs, lineages and chromosomes, owing to differences in selective pressures.” • 9 different organisms, 6 tissues, 2 individuals • ~139 samples • ~414,145,196 high-quality reads. General goal: explore the functional and evolutionary aspects of the regulation of exon usage.

  31. One-to-one orthologous exon graph. 1-2-1 human exons: • single hits • 100% coverage • 90% identity • single-exon genes removed. Result: 128,040 conserved exons in 10,673 genes.

  32. Exon usage and gene expression [Figure: Tissue A vs. Tissue B, Species 1 vs. Species 2]

  33. Exon usage primate phylogeny. Differences in exon usage are correlated with observed instances of speciation. Gorilla – Human: incomplete lineage sorting?

  34. Purifying selection differences

  35. Conserved tissue specific DEU

  36. Conserved tissue-specific DEU. Known cases: • Brain: splicing plays important roles in differentiation and in protein-protein interactions in synapses. • Testis: association with developmental processes.

  37. 92% of the genes tested have evidence of conserved differential exon usage (10% FDR). Tissue-specific splicing regulation is conserved, thus it is likely to be functional.

  38. SFmap. SFmap (200 bp upstream and downstream of the exons): • Identifies binding motifs for 17 splicing factors • Takes human-mouse conservation into account • Takes binding motif clusters into account. Background generation (Stuart, 2010): • For each tissue, a background with the same length and expression • Wilcoxon test on the number of motifs
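The enrichment test on this slide can be sketched as follows. The slide only says "Wilcoxon test on the number of motifs"; this sketch assumes the unpaired (rank-sum) variant comparing motif counts around exons of interest against the matched background, with a normal approximation, tie correction, and no continuity correction. The function names and the motif counts are hypothetical.

```python
import math


def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average = (pos + end) / 2 + 1  # mean of the 1-based ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = average
        pos = end + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test; returns (U statistic of x, p-value)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = list(x) + list(y)
    ranks = midranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    # Tie correction for the variance of U.
    tie_sizes = {}
    for v in combined:
        tie_sizes[v] = tie_sizes.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_sizes.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u1, 1.0  # all values identical: no evidence of a shift
    z = (u1 - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return u1, p_value


# Hypothetical motif counts per exon flank: exons with DEU vs. background exons.
deu_motifs = [5, 6, 7, 8]
background_motifs = [1, 2, 3, 4]
u_stat, p_val = rank_sum_test(deu_motifs, background_motifs)
```

A rank-based test suits motif counts because they are small, discrete and skewed; for samples as small as this toy example an exact test would be preferable to the normal approximation.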

  39. Motif enrichment

  40. Explanations • Combinatorial control of splicing decisions • Chromatin state differences • DEU between tissues is tightly regulated. • New associations, different splicing factor doses = including / excluding exons

  41. Evolution of exon usage regulation Irimia et al, 2009

  42. Perspectives and future work (short term). Which protein sequence features are differentially used? (ELMs, phosphorylation signals, intrinsically disordered regions) TODAY'S RESULT: Conserved DEU tends to avoid SMART domains? (structured protein regions?)

  43. Exon usage within a single species. How do the exons with conserved DEU vary between individuals of the same species? • HapMap individuals • RNA-seq samples

  44. Ideas (long term) • Identification of important developmental splicing switches (Gabut et al., 2010), looking at conservation of events • Identification of splicing aberrations in disease • “Master equation” of splicing decisions: SF + DNA binding = exon inclusion/exclusion?

  45. Acknowledgements: Huber Group, Simon Anders, Wolfgang Huber

  46. Verify test
