
Dec 12, 2000

Human Genome Project: sequencing (Dec 12, 2000; draft and finished). Outline: exon-intron structure of genes; models of gene grammar (example: Genscan); models of exon-intron sequence; integrating intrinsic and extrinsic information (example: GenomeScan); the RNA splicing code. Central Dogma: DNA, RNA, protein.

klaus




Presentation Transcript


  1. Human Genome Project: sequencing, Dec 12, 2000 (draft vs. finished)

  2. Outline
     • Exon-intron structure of genes
     • Models of gene grammar
     • Example: Genscan
     • Models of exon-intron sequence
     • Integrating intrinsic, extrinsic information
     • Example: GenomeScan
     • The RNA splicing code

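  3. Central Dogma
     DNA      ACCGGACCGATGCGACTGCCCGAGGACTAGATAT
              TGGCCTGGCTACGCTGACGGGCTCCTGATCTATA
     RNA          GACCGAUGCGACUGCCCGAGGACUAGA
     Protein           M  R  L  P  E  D
     Mapping: DNA strand to DNA strand 1:1, DNA to RNA 1:1, RNA to protein 3:1 (three nucleotides per amino acid). Protein: MRLPED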
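The mapping on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the presentation: the function names `transcribe` and `translate` are hypothetical, the standard genetic code is assumed, and translation is assumed to start at the first AUG and stop at the first in-frame stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order
# (UUU, UUC, UUA, UUG, UCU, ...); '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """DNA to RNA, 1:1: copy the coding strand, writing U for T."""
    return coding_strand.upper().replace("T", "U")


def translate(rna: str) -> str:
    """RNA to protein, 3:1: read codons from the first AUG to the first stop."""
    start = rna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing to translate
    protein = []
    for i in range(start, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# The transcribed region of the slide's DNA sequence.
rna = transcribe("GACCGATGCGACTGCCCGAGGACTAGA")
protein = translate(rna)
```

Applied to the slide's sequence, `rna` is the RNA string shown above and `protein` is the six-residue peptide MRLPED; the UAG codon ends translation.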

  4. Human Splice Signal Motifs: 5' splice signal, 3' splice signal

  5. C. Burge & S. Karlin, 1997, 1998

  6. Genscan HSMM

  7. Human Splice Signal Motifs: 5' splice signal, 3' splice signal (http://genes.mit.edu/pictogram.html)

  8. Semi-Markov HMM Model

  9. Genome Scale Gene Finding Strategies (C. Burge, Nature Genet. 27, 5-7, 2001)

  10. ExoFish: Homo sapiens vs. Tetraodon nigroviridis (Roest Crollius et al., Nature Genet., 2000)

  11. GenomeScan Objectives
     • Combine probabilistic ‘extrinsic’ information (BLAST hits) with a probabilistic model of gene structure/composition
     • Make method efficient and reliable enough to run on an entire vertebrate genome without human supervision
     • Focus on ‘typical case’ when homologous but not identical proteins are available

  12. http://genes.mit.edu/genomescan

  13. Current Human Gene Annotation Efforts
     • Ensembl [http://www.ensembl.org]: Genscan (ab initio) + BLAST (homology) + GeneWise (protein:DNA alignment)
     • NCBI [http://ncbi.nlm.nih.gov]: acembly (cDNA, EST alignments)
     • Burge lab [http://genes.mit.edu/genomescan]: GenomeScan (ab initio + protein sequence homology)
     • Neomorphic/Affymetrix: Genie (ab initio + EST)
     • Celera: Otto (???)
     IGI (International Gene Index) / IPI (EBI)

  14. Human Splice Signal Motifs: 5' splice signal, 3' splice signal

  15. 5’ Splice Signal Scores

  16. Intron Length Distributions

  17. Characterizing the sources of information used for splicing
     • 5’ splice signal (.AG/GTRAGt)
     • 3’ splice signal (…YYYYYY.YAG/)
     • Branch signal (…CTGAC..)
     • Intron length preference
     • Intron composition
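To make the consensus notation concrete, here is a small Python sketch that scans a sequence for candidate 5’ splice sites matching `.AG/GTRAGt`. It is only a strict consensus match, a much cruder stand-in for the probabilistic signal models discussed in the talk. The helper names are hypothetical, and the reading of the notation is an assumption: `.` is taken as any base, `R` as A or G, `/` as the exon-intron boundary, and a lowercase letter as a weak preference that is not enforced.

```python
import re

# IUPAC-style symbols used in the slide's consensus strings.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",     # purine
    "Y": "[CT]",     # pyrimidine
    ".": "[ACGT]",   # any base
}


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Compile a consensus such as '.AG/GTRAGt' into a regex.

    Assumption: lowercase letters are weak preferences, so any base is accepted.
    """
    exon_part, intron_part = consensus.split("/")

    def convert(part: str) -> str:
        return "".join("[ACGT]" if ch.islower() else IUPAC[ch] for ch in part)

    # The lookahead lets overlapping candidate sites all be reported.
    return re.compile(f"(?=({convert(exon_part)})({convert(intron_part)}))")


def find_donor_sites(sequence: str, consensus: str = ".AG/GTRAGt") -> list[int]:
    """Return 0-based positions of the first intron base of each candidate site."""
    pattern = consensus_to_regex(consensus)
    return [match.start(2) for match in pattern.finditer(sequence.upper())]


# Toy sequence (made up for illustration): exon ...AAG followed by intron GTAAGT...
candidate_sites = find_donor_sites("CCAAGGTAAGTCC")
```

A real splice site predictor would score each position with position-specific probabilities rather than accept or reject it outright, since many genuine sites deviate from the consensus.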

  18. Splicing-verified Transcripts (data from the September 2000 GenBank release)

  19. Splice Signal Sequences

  20. IntronScan Accuracy (fivefold cross-validated)
