1 / 38

MW  11:00-12:15 in Redwood G19 Profs: Serafim Batzoglou, Gill Bejerano TA: Cory McLean

MW  11:00-12:15 in Redwood G19 Profs: Serafim Batzoglou, Gill Bejerano TA: Cory McLean. Lecture 17. Ultraconservation. Sequence Conservation implies Function. (but which function/s?...). Comparative Genomics of Distantly related species:. functional region!. human.

rangle
Download Presentation

MW  11:00-12:15 in Redwood G19 Profs: Serafim Batzoglou, Gill Bejerano TA: Cory McLean

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. MW  11:00-12:15 in Redwood G19 Profs: Serafim Batzoglou, Gill Bejerano TA: Cory McLean http://cs273a.stanford.edu [Bejerano Aut07/08]

  2. Lecture 17 • Ultraconservation http://cs273a.stanford.edu [Bejerano Aut07/08]

  3. Sequence Conservation implies Function • (but which function/s?...) Comparative Genomics of Distantly related species: functional region! human ...CTTTGCGA-TGAGTAGCATCTACTATTT... common ancestor ...ACGTGGGACTGACTA-CATCGACTACGA... anotherspecies Note: the inverse “no conservation  no function”is a much weaker statement given current knowledge http://cs273a.stanford.edu [Bejerano Aut07/08]

  4. How They Measured all human-mouse alignments human-mouse ancestral repeats alignment Difference: 5% of Human Genome [Mouse consortium, Nature 2002] http://cs273a.stanford.edu [Bejerano Aut07/08]

  5. What They Found Human Genome: 3*109 letters 1.5% known function compare to other species >50% junk >5% human genome functional 3x more functional DNA than known! ~106 substrings do not code for protein What do they do then? [Science 2004 Breakthrough of the Year, 5th runner up] http://cs273a.stanford.edu [Bejerano Aut07/08]

  6. Ultraconservation http://cs273a.stanford.edu [Bejerano Aut07/08]

  7. Typical DNA Conservation levels Conserved elements between human and mouse are on average 85% identical. [mouse consortium, 2002] [Bejerano et al., Science 2004] http://cs273a.stanford.edu [Bejerano Aut07/08]

  8. Ultraconserved Elements fish [Bejerano et al., Science 2004] http://cs273a.stanford.edu [Bejerano Aut07/08]

  9. Ultraconserved Elements [Bejerano et al., Science 2004] http://cs273a.stanford.edu [Bejerano Aut07/08]

  10. * * * * * No known function requires this much conservation ? CDS ncRNA TFBS seq. http://cs273a.stanford.edu [Bejerano Aut07/08]

  11. Discovery can be fun (compared to 4 results day before our ScienceExpress paper) 58,300 http://cs273a.stanford.edu [Bejerano Aut07/08]

  12. Ultra Conserved (UC) Elements • Any contiguous block of human-mouse-rat alignment that isidentical in all three species, syntenic and ≥200 bp. • (p=10-22 of finding one such element in slowest rate 2.9G neutral DNA) • Turns out there are 481(!) such blocks of sizes 200-779bp(total of 126Kb) in all chromosomes but 21, Y. *68 (61%) associated with alt-spliced exons http://cs273a.stanford.edu [Bejerano Aut07/08]

  13. Ultra Clusters • exonic • non • possibly By joining two ultras into a cluster when separated <675Kb we obtained 89 clusters (each named after prominent gene/s) Non exonic elements tend to congregate in clusters. Exonic elements are distributed more randomly (tend to overlap an alt-spliced exon). http://cs273a.stanford.edu [Bejerano Aut07/08]

  14. Genomic Distribution • exonic • non • possibly http://cs273a.stanford.edu [Bejerano Aut07/08]

  15. 481 UCEs identified Associate Ultras with Nearby Genes 93 type I genes 111 exonic 114 possibly exonic 256 non-exonic 100 introns 156 intergenic 225 type II genes http://cs273a.stanford.edu [Bejerano Aut07/08]

  16. Functional Annotation of Related Genes p = 1.3 x 10-19 p = 4.8 x 10-10 p = 5.6 x 10-18 p = 1.2 x 10-4 p = 9.1 x 10-6 p = 4.4 x 10-6 ultra related genes Exonic 86 p = 0.39 p = 0.44 p = 0.77 p = 2.5 x 10-20 p = 1.8 x 10-15 p = 7.8 x 10-15 7 Non Exonic 218 Exonic – RNA processing @ transcription regulation Non Exonic – regulation of transcription at DNA level http://cs273a.stanford.edu [Bejerano Aut07/08]

  17. Non Exonic Enhancers • The non exonic ultras are often found in “gene deserts”(140 / 256 >10Kb from a known gene; 88 > 100Kb away). • The genes flanking these ultras are GO enriched for development(p = 10-6), particularly early developmental tasks (p = 2-7 x 10-5) suggesting distal enhancer roles. • Indeed, uc.351 is contained in a proven enhancer of DACH, located 225Kb upstream of it [Nobrega et al., Science 2003]. ultra conserved http://cs273a.stanford.edu [Bejerano Aut07/08]

  18. Zoom to uc.351, 225Kb upstream of DACH ultra conserved e.d 12.5 http://cs273a.stanford.edu [Bejerano Aut07/08]

  19. Validating Regulatory Elements wild type Conserved Element Minimal Promoter Reporter Gene transgenic where is the reporter gene expressed? where is the wild type gene expressed? http://cs273a.stanford.edu [Bejerano Aut07/08]

  20. Predictions and Proofs I Based on public domain genome wide data: ultraconservedelements larger subset does not one subset codes protein generate testable hypotheses for function from existing knowledge (2004) [Pennacchio et al., Nature, 2006] http://cs273a.stanford.edu [Bejerano Aut07/08]

  21. The most conserved elements in the genome • If one concatenates ultras that are 1-2bp away, all 4 longest ultras (1044, 779, 731, 711bp) lie at 3’ end of POLA, near ARX, on chr X. The longest has 8 subs. and no indels (99.3%id) in chicken. • ARX is a homeobox gene involved in CNS development; defects in the gene are linked to epilepsy, mental retardation, autism and cerebral malformations. ultra conserved http://cs273a.stanford.edu [Bejerano Aut07/08]

  22. Exonic uc’s correlate with Alt Splicing • 68 / 111 exonic ultras overlap an exon that shows clear evidence of being alternatively spliced. • Of the 59 GO annotated genes containing these elements: • 24 are RNA binding (p = 8.1 x 10-18), including HNRPU, HNRPDL, HNRPH1, HNRPK, HNRPM. • 16 contain the RNA recognition motif (p = 9.1 x 10-19), including SFRS1, SFRS3, SFRS6, SFRS7, SFRS10, SFRS11. • These ultras often overlap a short exon that is retained only in some tissues. • Such is the explicitly studied, uc.33 in PTBP2 (length 312bp) overlaps a 34bp exon included in the mature transcript only in the brain [Rahman et al., Genomics 2004] http://cs273a.stanford.edu [Bejerano Aut07/08]

  23. Predictions and Proofs II Based on public domain genome wide data: ultraconservedelements larger subset does not one subset codes protein generate testable hypotheses for function from existing knowledge (2004) post transcriptional modification [Pennacchio et al., Nature, 2006] http://cs273a.stanford.edu [Bejerano Aut07/08]

  24. Splicing Microarrays http://cs273a.stanford.edu [Bejerano Aut07/08]

  25. Non-Sense Mediated Decay (NMD) http://cs273a.stanford.edu [Bejerano Aut07/08]

  26. Of the 29 exonic ultraconserved elements in RNA-binding protein genes in human, 15 have human and/or mouse EST evidence suggesting the presence of AS-NMD in those regions. [Ni et al., Genes & Dev 2007 ] http://cs273a.stanford.edu [Bejerano Aut07/08]

  27. Model for Homeostatic Auto/Cross-regulation [Ni et al., Genes & Dev 2007 ] http://cs273a.stanford.edu [Bejerano Aut07/08]

  28. Similar Results = 100% conservation; associated with AS = normal splice = alternative splice = retained intron = normal stop codon = premature termination codon [Lareau et al., Nature 2007 ] http://cs273a.stanford.edu [Bejerano Aut07/08]

  29. Ultraconserved Non-coding RNA About 1/3 of all ultras are expressed. Some are predicted to provide microRNA targets. A few are anti-correlated with miRNAexpression levels. A few even act as oncogenes. miRNA complementarity [Calin et al, Cancer Cell, 2007] http://cs273a.stanford.edu [Bejerano Aut07/08]

  30. Ultras are Under Strong Human Selection Mutational cold spots? NO. Rare (new) mutations are introduced to the population. Fierce purifying selection? YES. Very few of these get anywhere near fixation. A A G A A humans chimp NonSyn DAF Ultra DAF [Katzman et al, Science ,2007] http://cs273a.stanford.edu [Bejerano Aut07/08]

  31. Relation to Human Disease SHH LMBR1 1Mb Limb Lettice et al. HMG 2003 12: 1725-35 [Derti et al., Nature Genetics, 2006] http://cs273a.stanford.edu [Bejerano Aut07/08]

  32. Link to Disease Remains Elusive http://cs273a.stanford.edu [Bejerano Aut07/08]

  33. A Vertebrate Innovation? • Only 24 ultras can be partially traced back through direct sequence search to Ciona, C. Elegans or Drosophila. • All overlap coding exons from known genes (17 of which show clear evidence of alt-splicing inc. EIF2C1, DDX, BCL11A, EVI1, ZFR, CLK4, HNRPH1, GRIA3). • No intronic element in human was found to be coding in another species, although in some cases EST evidence indicates intron retention, presumably not as CDS. • Interestingly, ribosomal DNA (not part of the draft genomes) also harbors 6 ultraconserved elements in 18S, 28S. def def def http://cs273a.stanford.edu [Bejerano Aut07/08]

  34. Genomic Distribution of Ultraconserved Elements • exonic • non • possibly Origins? http://cs273a.stanford.edu [Bejerano Aut07/08]

  35. Computationally Driven Biology Simplified hypothesis generalize survey case study set BIO CS analyze experiment candidates http://cs273a.stanford.edu [Bejerano Aut07/08]

  36. What we do understand.. • Ultraconserved elements exist. • They are maintained via strong on-going selection. • It is a heterogeneous bunch: • Some mediate splicing • Some regulate gene expression • Some express ncRNAs • (categories are not necessarily mutually exclusive) • Knockouts of four regulatory ultras did not lead to severe phenotypes (similar protein cases: Pbx2, Nkx6.2, Gli1) http://cs273a.stanford.edu [Bejerano Aut07/08]

  37. What we don’t understand • Their functional density: • How did they come to be? • What is the selective advantage that lets them persist? http://cs273a.stanford.edu [Bejerano Aut07/08]

  38. Broad Guess • It’s about 3-D structure. • Observation: rDNA (18S, 28S) have ultraconserved stretches,multiple constraints in a complex 3-D structure, the Ribosome. • ncRNA ultras: structure confers function • Splicing related ultras: the Splicosome • Cis-reg ultras: TSS 3-D proximity, chromatinand/or packed TFBS (Transcription factories?) TSS http://cs273a.stanford.edu [Bejerano Aut07/08]

More Related