
Splicing Exons: A Eukaryotic Challenge to Gene Prediction

Ian McCoy. Genes must be identified to make the genome useful. The computational problem: take a seemingly random sequence of characters, millions or billions of bases long, and find the genes.


Presentation Transcript


  1. Splicing Exons: A Eukaryotic Challenge to Gene Prediction • Ian McCoy

  2. Gene Prediction • Genes must be identified to make the genome useful • Computational Problem: Take a seemingly random sequence of characters, millions or billions of bases long, and find the genes.

  3. A Serious Complication • Only about 3% of the human genome contains genes. • In eukaryotes, the genes themselves are split into exons separated by noncoding introns.

  4. Similarity-Based Approach • Instead of looking for a gene for a target protein directly, use a protein in a related organism. • Find all local similarities between a genomic sequence and the target protein sequence. • All substrings that exhibit a certain level of similarity will be called putative exons.
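A minimal sketch of the brute-force search for putative exons described on this slide. It makes several simplifying assumptions that are not in the slides: both sequences are written over the same alphabet (a real pipeline would compare translated DNA against the protein), similarity is a plain Smith-Waterman local alignment score with illustrative match/mismatch/gap values, and the names `local_alignment_score`, `putative_exons`, `min_len`, and `threshold` are hypothetical.

```python
def local_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best Smith-Waterman local alignment score between strings a and b.

    The scoring values are illustrative assumptions, not taken from the slides.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ch_a in a:
        cur = [0]
        for j, ch_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ch_a == ch_b else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
        best = max(best, max(cur))
        prev = cur
    return best


def putative_exons(genomic, target, min_len, threshold):
    """Return (l, r, w) for every genomic substring scoring at least threshold.

    l and r are inclusive 0-based positions; w is the similarity score.
    This is deliberately brute force and only suitable for toy inputs.
    """
    exons = []
    n = len(genomic)
    for l in range(n):
        for r in range(l + min_len, n + 1):
            w = local_alignment_score(genomic[l:r], target)
            if w >= threshold:
                exons.append((l, r - 1, w))
    return exons
```

Because every sufficiently similar substring is reported, the output contains many overlapping candidates; choosing a consistent subset of them is what the exon-chaining step is for.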

  5. Exon-Chaining Problem • Use brute force to generate a set of putative exons. • Represent each exon with three parameters (l, r, w): its left position, its right position, and its weight (the similarity score). • Find a maximum-weight chain of nonoverlapping putative exons.
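To make the (l, r, w) representation concrete, here is a small sketch that stores putative exons as named tuples and finds the heaviest nonoverlapping chain by trying every subset. The `Exon` type, the function names, and the nine sample intervals are hypothetical, and the exhaustive search is exponential, so it only serves as a reference answer for tiny inputs.

```python
from itertools import combinations
from typing import NamedTuple


class Exon(NamedTuple):
    l: int  # left (start) position
    r: int  # right (end) position, inclusive
    w: int  # weight, e.g. a similarity score


def is_chain(exons):
    """True if no two exons overlap (each must end before the next starts)."""
    ordered = sorted(exons)
    return all(a.r < b.l for a, b in zip(ordered, ordered[1:]))


def brute_force_max_chain(exons):
    """Return (total weight, chain) of the heaviest nonoverlapping subset."""
    best_weight, best_chain = 0, ()
    for k in range(1, len(exons) + 1):
        for subset in combinations(exons, k):
            if is_chain(subset):
                weight = sum(e.w for e in subset)
                if weight > best_weight:
                    best_weight, best_chain = weight, tuple(sorted(subset))
    return best_weight, best_chain


# Hypothetical sample data, made up for illustration.
SAMPLE = [Exon(1, 5, 5), Exon(2, 3, 3), Exon(4, 8, 6), Exon(6, 12, 10),
          Exon(7, 17, 12), Exon(9, 10, 1), Exon(11, 15, 7), Exon(13, 14, 0),
          Exon(16, 18, 4)]
```

On `SAMPLE`, the heaviest chain is (2, 3), (4, 8), (9, 10), (11, 15), (16, 18) with total weight 3 + 6 + 1 + 7 + 4 = 21.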

  6. Formulate as Graph Problem • Create a graph G with 2n vertices: n vertices are the starting (left) positions of exons and n vertices are the ending (right) positions of exons. • Sort the set of left and right interval ends into increasing order. • Add an edge of weight w_i from l_i to r_i for each i from 1 to n, plus 2n-1 additional edges of weight 0 connecting adjacent vertices.
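The graph construction on this slide can be sketched as follows. The function name `build_exon_graph` and the tuple layouts are hypothetical; vertices are identified by their 0-based rank in the sorted endpoint list, and ties between a left end and a right end at the same position are broken by placing the left end first (an assumption, so that touching intervals count as overlapping).

```python
def build_exon_graph(intervals):
    """Build the 2n-vertex graph for a list of (l, r, w) intervals.

    Returns (vertices, edges):
      vertices: sorted list of (position, kind, interval_index), kind "L" or "R"
      edges:    list of (from_vertex, to_vertex, weight) using vertex ranks
    """
    vertices = []
    for idx, (l, r, _w) in enumerate(intervals):
        vertices.append((l, "L", idx))
        vertices.append((r, "R", idx))
    vertices.sort()  # "L" sorts before "R", so left ends win ties

    rank = {(kind, idx): i for i, (_pos, kind, idx) in enumerate(vertices)}

    edges = []
    # One weighted edge per interval, from its left end to its right end.
    for idx, (_l, _r, w) in enumerate(intervals):
        edges.append((rank[("L", idx)], rank[("R", idx)], w))
    # 2n-1 zero-weight edges between consecutive vertices.
    for i in range(len(vertices) - 1):
        edges.append((i, i + 1, 0))
    return vertices, edges
```

With this layout, a maximum-weight chain of intervals corresponds to a heaviest path from the first vertex to the last, which is what the dynamic programming recurrence computes.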

  7. Input: A set of weighted intervals (putative exons). Output: The total weight of a maximum-weight chain of nonoverlapping intervals from this set.

  8. Dynamic Programming Algorithm
     ExonChaining(G, n)   // graph, number of intervals
       for i ← 1 to 2n
         s_i ← 0
       for i ← 2 to 2n
         if vertex v_i in G corresponds to the right end of an interval I
           j ← index of the vertex for the left end of interval I
           w ← weight of interval I
           s_i ← max {s_j + w, s_(i-1)}
         else
           s_i ← s_(i-1)
       return s_2n
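A minimal Python sketch of the recurrence above, taking the intervals directly instead of a prebuilt graph. The function name `exon_chaining` and the sample data are hypothetical, and the tie-breaking rule (a left end sorts before a right end at the same position, so touching intervals are treated as overlapping) is an assumption the slides do not address.

```python
def exon_chaining(intervals):
    """Return the weight of a maximum-weight chain of nonoverlapping intervals.

    intervals: list of (l, r, w) tuples with l <= r.
    """
    # Sorted list of the 2n endpoints; kind 0 = left end, 1 = right end.
    endpoints = []
    for idx, (l, r, _w) in enumerate(intervals):
        endpoints.append((l, 0, idx))
        endpoints.append((r, 1, idx))
    endpoints.sort()

    # s[i] = best chain weight using only intervals that end at or before
    # vertex i (1-based); s[0] is a sentinel so that s[i - 1] always exists.
    s = [0] * (len(endpoints) + 1)
    left_vertex = {}  # interval index -> 1-based index of its left end
    for i, (_pos, kind, idx) in enumerate(endpoints, start=1):
        if kind == 0:
            left_vertex[idx] = i
            s[i] = s[i - 1]
        else:
            j = left_vertex[idx]
            w = intervals[idx][2]
            s[i] = max(s[j] + w, s[i - 1])
    return s[-1]


# Hypothetical sample data, made up for illustration.
sample = [(1, 5, 5), (2, 3, 3), (4, 8, 6), (6, 12, 10), (7, 17, 12),
          (9, 10, 1), (11, 15, 7), (13, 14, 0), (16, 18, 4)]
print(exon_chaining(sample))  # prints 21
```

After the O(n log n) sort, the loop visits each of the 2n vertices once, so the chaining step itself is linear in the number of putative exons.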

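  9. Shortcomings • A large number of short exons makes the brute-force search for putative exons less effective. • Exons in the chain may be out of order with respect to the target protein, so a maximum chain need not correspond to a valid alignment.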

  10. Any Questions? • Jones, Neil C., and Pavel A. Pevzner. An Introduction to Bioinformatics Algorithms. Cambridge, MA: MIT Press, 2004, pp. 200-203.
