
Survey of Misannotations and Pseudogenes in the Arabidopsis Genome



Presentation Transcript


  1. Survey of Misannotations and Pseudogenes in the Arabidopsis Genome. Tanmay Prakash

  2. Objectives
     • Find possible misannotations
     • Find possible pseudogenes
     Why:
     • Misannotation can hinder research
     • Pseudogenes can be used to study natural selection

  3. Misannotations
     [Gene model diagram: UTR, CDS, intron, CDS, UTR]
     Many misannotations arise when a gene prediction program mislabels sequence as an intron because it contains a stop codon.

  4. Pseudogenes
     Pseudogenes are DNA sequences that no longer function but resemble the functional genes they once were. There are two types:
     • Processed
     • Non-processed
     Common properties of pseudogenes:
     • Stop codons
     • Frameshift mutations
     • Lack of selective pressure
     Example (deleting one base shifts the reading frame and introduces stop codons, shown as "."):
     agtacatgcataggactcgatcgactc → STCIGLDRL
     agtacatgataggactcgatcgactc → ST..DSID
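
The sequence pair on the Pseudogenes slide can be checked with a short translation routine. This is a minimal sketch, not the presenter's code: it assumes the standard genetic code and reading frame 0, and the names `translate`, `original`, and `mutated` are illustrative only.

```python
# Standard genetic code, laid out in the conventional T, C, A, G codon order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna, stop_symbol="."):
    """Translate reading frame 0, ignoring an incomplete trailing codon."""
    dna = dna.lower()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        # The table marks stop codons with "*"; the slide shows them as ".".
        protein.append(stop_symbol if amino_acid == "*" else amino_acid)
    return "".join(protein)


original = "agtacatgcataggactcgatcgactc"
mutated = "agtacatgataggactcgatcgactc"  # same sequence with one "c" deleted

print(translate(original))  # STCIGLDRL
print(translate(mutated))   # ST..DSID
```

The single deletion changes every codon downstream of it, which is why two stop codons appear immediately and the remaining residues no longer match the original protein.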

  5. Pipeline
     • BLAST search (query: protein domains; subject: Arabidopsis introns) → genes matching in introns
     • HMMER search (query: protein domains; subject: Arabidopsis CDS) → genes matching in CDS
     • Genes matching in both → possibly misannotated genes
     • Check for stop codons and frameshifts; check Ka/Ks → possible pseudogenes

  6. Searches
     • BLAST search (query: protein domains; subject: Arabidopsis introns) → genes matching in introns
     • HMMER search (query: protein domains; subject: Arabidopsis CDS) → genes matching in exons

  7. Genes matching in both → possibly misannotated genes
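
The "matching in both" step amounts to intersecting two hit lists. The sketch below is one way to do that, assuming each search has already been reduced to (gene model ID, domain name) pairs; the helper names, the example IDs, and the domain names are hypothetical and not taken from the presentation. Gene model suffixes such as `.1` are stripped so that different models of one gene are counted once.

```python
def gene_locus(model_id):
    """Collapse a gene model ID such as 'AT1G01010.2' to its locus 'AT1G01010'."""
    return model_id.split(".")[0]


def candidate_misannotations(intron_hits, cds_hits):
    """Return (locus, domain) pairs that appear in both the intron and CDS hits."""
    intron_pairs = {(gene_locus(gene), domain) for gene, domain in intron_hits}
    cds_pairs = {(gene_locus(gene), domain) for gene, domain in cds_hits}
    return intron_pairs & cds_pairs


# Made-up example hits: (gene model ID, domain name)
intron_hits = [("AT1G01010.1", "domainA"), ("AT2G02020.1", "domainB")]
cds_hits = [
    ("AT1G01010.2", "domainA"),  # same locus and domain as an intron hit
    ("AT2G02020.1", "domainC"),  # same locus, different domain
    ("AT3G03030.1", "domainA"),  # no intron hit for this locus
]

candidates = candidate_misannotations(intron_hits, cds_hits)
print(sorted(candidates))  # [('AT1G01010', 'domainA')]
```

Keying on the (locus, domain) pair rather than the locus alone mirrors the requirement that the intron and the exons match the same domain.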

  8. Results
     • 346 genes (not counting alternative gene models) had matches to the same domain in both introns and exons.
     • 299 genes (not counting alternative gene models) had matches to the same domain in an intron and its flanking exons. These are most likely misannotations.

  9. 4 domains with the most possible misannotations

  10. Future Research
     • Identify pseudogenes by looking for stop codons and frameshift mutations in the introns and checking the Ka/Ks value
     • Use a more recent database of domains
     • Follow the same process for the rice genome
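
Ka/Ks is the ratio of the nonsynonymous to the synonymous substitution rate; values well below 1 indicate purifying selection, while values near 1 are what relaxed selective pressure would produce. The sketch below only shows how precomputed Ka and Ks values might be screened. The function names and the `cutoff` of 0.5 are assumptions made for illustration, not values from the presentation, and estimating Ka and Ks themselves requires a dedicated tool.

```python
def ka_ks_ratio(ka, ks):
    """Return Ka/Ks, or None when Ks is zero and the ratio is undefined."""
    if ks == 0:
        return None
    return ka / ks


def looks_unconstrained(ka, ks, cutoff=0.5):
    """Flag a pair whose Ka/Ks exceeds a hypothetical cutoff (assumed, not from the source)."""
    ratio = ka_ks_ratio(ka, ks)
    return ratio is not None and ratio > cutoff


# Made-up substitution rates for two sequence pairs
print(looks_unconstrained(0.02, 0.20))  # False: ratio is about 0.1
print(looks_unconstrained(0.18, 0.20))  # True: ratio is about 0.9
```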

  11. Acknowledgements
     • Dr. Shin-Han Shiu
     • Dr. Kosuke Hanada
     • Dr. Melissa Lehti-Shiu
     • Dr. Gail Richmond
     • HSHSP
