
Predicting effect of SNPs and de novo variants on splicing

Presented by Alexander Tchourbanov. Presentation structure: previous work on predicting aberrant splicing events induced by common and de novo genetic variants; proposed plan of action; the problem of aberrant splicing.


Presentation Transcript


  1. Predicting effect of SNPs and de novo variants on splicing presented by Alexander Tchourbanov

  2. Presentation structure • Previous work on predicting aberrant splicing events induced by common and de novo genetic variants • Proposed plan of action

  3. Problem of aberrant splicing • Splicing in vertebrate genes is governed by highly degenerate motifs that include the donor, acceptor and branch sites and a repertoire of splicing enhancers and silencers • The integrity of human genes is constantly compromised by de novo mutations • ~15% of disease-associated mutations cause aberrant splicing

  4. Splicing components Image credit: Understanding alternative splicing: towards a cellular code: Arianne J. Matlin, Francis Clark and Christopher W. J. Smith, Nature Reviews Molecular Cell Biology 6, 386-398 (May 2005)

  5. Importance of understanding aberrant splicing • According to the Human Gene Mutation Database (HGMD) Professional 2010.4 (http://www.hgmd.cf.ac.uk): • 60,489 mutations are missense/nonsense • 10,210 mutations have consequences for mRNA splicing • The DBASS5 and DBASS3 databases currently contain 900 well-annotated records of disease-causing aberrant splicing events (Buratti et al., Nucleic Acids Research, 2010).

  6. Importance of understanding aberrant splicing • Chen R, Davydov E, Sirota M, Butte A: Non-Synonymous and Synonymous Coding SNPs Show Similar Likelihood and Effect Size of Human Disease Association. PLoS ONE 2010, 5(10):e13574. • It is frequently difficult to obtain tissue samples for RNA sequencing (brain samples, retina samples) • We need to predict the effect of de novo variants (which include cancer mutations) and common variants. No association study is possible.

  7. Existing elements

  8. Orthologous blocks from the UCSC Genome Browser • 2,333,379 extended exons from 23 Tetrapoda organisms were obtained • A number of experimental reports showed that genes from distantly related Tetrapoda organisms were correctly expressed and post-transcriptionally modified in transgenic animals (Capetanaki Y et al.: Proc Natl Acad Sci USA 1989, Jacobs GH et al.: Science 2007) • The genes encoding well-known RNA-binding proteins involved in splicing regulation are enriched with ultraconserved elements (Bejerano G et al.: Science 2004)

  9. Counting oligos

  10. Comparing oligo counts
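
The transcript preserves only the titles of the two slides above, so the exact counting and comparison procedure is not recoverable. As a rough illustration of what counting oligos and comparing the counts can look like, the following Python sketch counts overlapping k-mers in two sequence sets and scores each oligo by a log-odds (LOD) ratio. The function names, the pseudocount, and the toy sequences are assumptions made for this example, not the author's method.

```python
from collections import Counter
from math import log2


def count_oligos(sequences, k):
    """Count every overlapping k-mer across a collection of sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def lod_scores(signal_counts, background_counts, pseudocount=1.0):
    """Log2 odds of each oligo's frequency in the signal set vs. the background set.

    Both arguments must be Counter objects so that unseen oligos count as 0.
    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    oligos = set(signal_counts) | set(background_counts)
    n = len(oligos)
    signal_total = sum(signal_counts.values()) + pseudocount * n
    background_total = sum(background_counts.values()) + pseudocount * n
    return {
        oligo: log2(
            ((signal_counts[oligo] + pseudocount) / signal_total)
            / ((background_counts[oligo] + pseudocount) / background_total)
        )
        for oligo in oligos
    }


# Toy input (hypothetical sequences, not real exon flanks).
signal = count_oligos(["GGGGGG"], k=3)
background = count_oligos(["ACACAC"], k=3)
scores = lod_scores(signal, background)
```

Oligos with a positive score are over-represented in the signal set relative to the background; a real screen would also need a significance threshold, which this sketch leaves out.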

  11. Example of 5’SS ISEs found

  12. Elements found • Using the orthologous exons available for 23 Tetrapoda organisms, we have identified 2,546 unique splicing regulatory elements. • Among these elements, 203 (7.97%) 3’SS and 177 (6.95%) 5’SS supporting motifs are novel and have not been previously reported in systematic screens detecting such elements. • Among our predicted elements, 41.08% of sequences were heptamers, 51.81% were octamers, and only 6.76% were hexamers and 0.35% pentamers.

  13. Predicting donor splice site The Bayesian 5’ splice site sensor designed during my PhD study performs better than other sensors, including the maximum entropy sensor from MIT.

  14. Exonic length distribution Optimal exonic lengths depend substantially on the strengths of the flanking splicing signals; splice site (SS) strengths are considered in a discrete range from 1 (weakest) to 5 (strongest).

  15. Example of LOD profiles (5’SS ISE)

  16. Exon scoring method • LOD scores associated with the 5’SS, 3’SS, exonic length, competing SSs and enhancer/silencer signals are combined into an exon strength
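
The slide does not say how the LOD scores are combined. One common choice, assumed here purely for illustration, is to add them, which treats the individual signals as independent evidence. The class name, field names, and numeric values below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ExonEvidence:
    """Per-signal LOD scores for one candidate exon (hypothetical field names)."""
    donor_lod: float         # 5'SS strength
    acceptor_lod: float      # 3'SS strength
    length_lod: float        # exonic length
    competing_ss_lod: float  # penalty or bonus from competing splice sites
    regulatory_lod: float    # enhancer/silencer signals


def exon_strength(e):
    """Assumed combination rule: log-odds from independent signals add up."""
    return (e.donor_lod + e.acceptor_lod + e.length_lod
            + e.competing_ss_lod + e.regulatory_lod)


# A variant that only changes the regulatory term (made-up numbers).
reference = ExonEvidence(3.0, 2.5, 0.5, -1.0, 1.0)
variant = ExonEvidence(3.0, 2.5, 0.5, -1.0, 2.5)
delta = exon_strength(variant) - exon_strength(reference)
```

Under this assumption, the effect of a variant is the difference in exon strength between the variant and reference alleles, which makes it easy to see which signal is responsible for the change.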

  17. Existing splicing prediction software • http://www.umd.be/HSF • http://esrsearch.tau.ac.il/ • http://genes.mit.edu/burgelab/rescue-ese/ • http://genes.mit.edu/exonscan/ • http://cryp-skip.img.cas.cz/ • http://cubweb.biology.columbia.edu/pesx/ • Strongest exonic silencers are the splice sites themselves!!!

  18. SpliceScan II performance on mutations

  19. Disturbing circadian pacemaker • For example, the circadian pacemaker period homolog 1 (Per1) gene locus has an intronic non-coding variant, rs885747, that has previously been associated with autism (Nicholas et al., Molecular Psychiatry, 2007). • Haplotype analysis within Per1 gave a single significant result: a global P=0.027 for the markers rs2253820-rs885747 • We predicted the creation of the intronic splicing enhancer GCGGGGT as one possible causative mechanism by which rs885747 promotes an aberrant exonic isoform.
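
The prediction above rests on the idea that a single-base change can create a regulatory motif that is absent from the reference allele. The sketch below shows one simple way to test for that. The flanking sequence, the position, and the substituted base are invented for the example and are not the real genomic context or alleles of rs885747; only the motif GCGGGGT comes from the slide.

```python
def motifs_created(ref_seq, pos, alt_base, motifs):
    """Return (motif, start) pairs for motifs that the substitution creates.

    Only windows that cover the variant position `pos` (0-based) are checked,
    because no other window can differ between the two alleles.
    """
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    created = []
    for motif in motifs:
        k = len(motif)
        first = max(0, pos - k + 1)
        last = min(pos, len(ref_seq) - k)
        for start in range(first, last + 1):
            in_alt = alt_seq[start:start + k] == motif
            in_ref = ref_seq[start:start + k] == motif
            if in_alt and not in_ref:
                created.append((motif, start))
    return created


# Hypothetical 11-nt context with an A>G substitution at index 6.
hits = motifs_created("TTGCGGAGTCC", 6, "G", ["GCGGGGT"])
print(hits)  # [('GCGGGGT', 2)]
```

Swapping the roles of the reference and variant sequences gives the complementary check for motifs that a variant destroys.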

  20. Disturbing circadian pacemaker

  21. Disturbing circadian pacemaker • Per1 is a member of the Period family of genes and is expressed in a circadian pattern in the suprachiasmatic nucleus, the primary circadian pacemaker in the mammalian brain. Genes in this family encode components of the circadian rhythms of locomotor activity, metabolism, and behavior.

  22. SNPs affect splicing

  23. rs849563 variant • Wu S, Yue W, Jia M, Ruan Y, Lu T, Gong X, Shuang M, Liu J, Yang X, Zhang D (Institute of Mental Health, Peking University, Beijing, China): Association of the neuropilin-2 (NRP2) gene polymorphisms with autism in Chinese Han population. Am J Med Genet B Neuropsychiatr Genet. 2007 Jun 5;144B(4):492-5. • A significant genetic association was found between autism and two SNPs of the NRP2 gene (rs849578: P = 0.017, rs849563: P = 0.027), as well as specific haplotypes, especially those formed by rs849563.

  24. rs849563 is synonymous

  25. rs849563 predicted mechanism The neuropilin-2 (NRP2) gene is localized to 2q34, an autism susceptibility locus. NRP2 has been demonstrated to both guide axons and to control neuronal migration in the central nervous system. It has been reported that NRP2 may be required in vivo for sorting migrating cortical and striatal interneurons to their correct destination.

  26. SpliceScan II tool • SpliceScan II tool: http://www.wyomingbioinformatics.org/~achurban/docs/SpliceScanII.tar.gz • Is more sensitive than existing splicing simulators (NetUTR, ExonScan) • Uses a novel 5’ GC SS Bayesian sensor • The method allows prediction of aberrant splicing events associated with genomic variants • ACGMAP companion database: http://www.stritch.luc.edu/node/375

  27. Proposed system architecture • Input data: transcriptome, trios, shotgun, mate pairs • Online submission • Phased reference genomes of healthy individuals • Haplotype trees • Variant calling (GMAP/GSNAP) • Use PolyPhen (Ramensky et al., NAR, 2002), SIFT (Kumar et al., Nature Protocols, 2009) or PANTHER (Thomas et al., Genome Research, 2003) to predict destabilizing effects of non-synonymous genetic variants • Use SpliceScan II to predict the effect of synonymous mutations on splicing • Visualize information in the context of existing information (HGMD, UCSC Genome Browser, dbSNP, Pfam, ASTD) • Variant analysis and visualization
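
The architecture sends non-synonymous variants to protein-effect predictors and synonymous ones to splicing prediction. A minimal sketch of that triage step follows, assuming the standard genetic code and a single-base substitution inside a known codon. The function names and the returned labels are hypothetical, and no external tool is actually invoked.

```python
BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_substitution(ref_codon, offset, alt_base):
    """Label a substitution at `offset` (0-2) within a codon by its coding effect."""
    ref_codon = ref_codon.upper()
    alt_codon = ref_codon[:offset] + alt_base.upper() + ref_codon[offset + 1:]
    if CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]:
        return "synonymous"
    return "non-synonymous"


def next_analysis(ref_codon, offset, alt_base):
    """Pick the downstream analysis step (labels are illustrative only)."""
    if classify_substitution(ref_codon, offset, alt_base) == "synonymous":
        return "splicing effect prediction"
    return "protein effect prediction"


print(next_analysis("GGT", 2, "C"))  # Gly -> Gly
print(next_analysis("GAT", 1, "T"))  # Asp -> Val
```

In practice a non-synonymous variant can also disturb splicing, so a fuller pipeline might run the splicing check on every variant rather than treating the two routes as exclusive.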

  28. Chromosome testing at BGI

  29. DNA swap Craven et al., Nature, 1-4 (2010)

  30. Mito and Tracker Tachibana et al., Nature, 2009

  31. Wellderly study • The Wellderly Study is headed by Scripps Health Chief Academic Officer Dr. Eric J. Topol, who has spent the past four years recruiting healthy elderly individuals • The youngest participant is at least 80 years old; the median age of the study group is 87, and the oldest participant is 108 years old • Participants are free from major diseases and long-term medications • This fall Complete Genomics announced that it will sequence, at its own cost, the whole human genomes of 1,000 participants in the Wellderly Study • Following the announcement, Complete Genomics stock (NASDAQ: GNOM) dropped 7%. • The genomic sequences obtained in this study will be the private property of Complete Genomics • Archon Genomics X PRIZE http://genomics.xprize.org/life-at-100-plus

  32. Third generation platforms • Third-generation platforms (such as Oxford Nanopore, http://www.nanoporetech.com/) will revolutionize the field soon. Clarke, J. et al. “Continuous base identification for single-molecule nanopore DNA sequencing.” Nature Nanotech. 2009.

  33. Thanks!
