
ENCODE pseudogene updates

Adam Frankish, HAVANA, 13/10/05.



Presentation Transcript


  1. ENCODE pseudogene updates Adam Frankish, HAVANA 13/10/05

  2. Not added - AK125808: Ral-GDS related protein Rgr (Rgr) pseudogene. Figure labels: Reverse strand mRNAs; Translation. The transcripts on which this pseudogene is based do not appear to have a valid translation (only BC007286.1 has a translation, and it looks spurious).

  3. Not added - YalePgene_139. I have been able to reconstruct a coding gene with a full-length CDS at this locus (AC009892.1) and, as discussed previously, would not annotate a coding gene and a pseudogene at the same locus. The majority of the gene (from the 3' end of exon 3 to the final exon, exon 8) is supported by 100% matching, best-in-genome human EST (Em:DN998408.1, Em:BG743947.1) and mRNA (Em:BC033195.1) evidence. Together these support a structure with an ORF extending from the start to the final exon, although there is a small gap in support in exon 5. Using human ESTs that do not originate from this locus, e.g. Em:BM918119.1 (approx. 70% identity at this locus; its best hit in the genome, by Ensembl SSAHA, is a 100% match to the KIR2DL4 gene, also on chr19), the 5' end of exon 3 and two further upstream exons can be clearly identified, and all splice sites are clearly intact. The structure contains a CDS that starts in exon 1 (which shares homology with the N-terminal sequence of several KIR2D family members), ends in the final exon and contains three immunoglobulin domains. These splice sites are preserved despite the lack of transcript evidence from the 5' end of the locus and the quite high degree of divergence between this locus and other gene family members, which suggests that the structure is correct and represents a coding gene rather than a pseudogene.

  4. Not added - YalePgene_139 Protein EST mRNA Supporting evidence

  5. Not added - YalePgene_139 Dot plot of EST Splice donor

  6. Havana+, Yale-, UCSC-: AC006326.4-001, AC006326.2-001, AC063976.2-001, AF277315.12-001, RP11-143H17.1-001, AC009892.5-001, Z84721.2-001, Z84721.4-001, AC103710.2-001, AC103710.4-001, AC129505.5-001, AC087380.10-001, AC087380.14-001, AC002456.1-001, AC009404.5-001, AC114812.7-001, AC011330.5-001, AC011330.8-001, AL162151.3-001. We think the annotation of these as pseudogenes can be supported.

  7. ENm001 - AC006326.2, AC006326.4; UCSC pseudo; Yale pseudo; NADH dehydrogenase 2 (MTND2) pseudogene; heterogeneous nuclear ribonucleoprotein A1 (Hnrpa1) pseudogene; NADH dehydrogenase 4 (MTND4) pseudogene; New cytochrome b (CYTB) pseudogene

  8. ENm002 - AC063976.2 Dot plot Alignment

  9. ENm004 - RP1-127L4.3 UCSC pseudo HAVANA pseudo Yale pseudo

  10. ENm006 - AF277315.12 olfactory receptor family pseudogene

  11. ENm006 - RP11-143H17.1 HAVANA pseudo Frameshift

  12. ENm007 - AC009892.5 HAVANA LIR pseudogene

  13. ENm008 - Z84721.4 HAVANA hemoglobin, alpha pseudogene

  14. ENm009 - AC103710.2 olfactory receptor, family 51, subfamily N, member 1 pseudogene Frameshift

  15. ENm009 - AC103710.4 olfactory receptor, family 52, subfamily Y, member 1 pseudogene

  16. ENm009 - AC129505.5 olfactory receptor, family 52, subfamily Z, member 1 pseudogene No Met First possible Met

  17. ENm009 - AC087380.10 olfactory receptor, family 51, subfamily A, member 10 pseudogene Frameshift

  18. ENm009 - AC087380.14 Novel pseudogene

  19. ENm013 - AC002456.1 ribosomal protein L5 (RPL5) pseudogene

  20. ENr121 - AC009404.5 5-hydroxytryptamine (serotonin) receptor 5B (HTR5B) pseudogene Frameshift

  21. ENr131 - AC114812.7 UDP glycosyltransferase 1 family, polypeptide A2 pseudogene Frameshift

  22. ENr233 - AC011330.5 Novel pseudogene 3’ truncation ~350aa missing, no stop

  23. ENr233 - AC011330.8: stereocilin (STRC) pseudogene; stop codon in exon 20

  24. ENr322 - AL162151.3: pseudogene similar to part of ribosomal protein L3 (RPL3). Figure labels: mRNA dot plot; Protein dot plot

  25. HAVANA pseudogene overlaps exon
      • Non-coding locus: AC008984.4, AC008984.6, AC009892.8, AC006293.1, AC114812.6, AC114812.5, AC005538.2, AC018512.3, RP3-477O4.5
      • Coding locus opposite strand: AC002543.2, RP11-143H17.1, AC010492.4, RP11-398K22.9, RP3-477O4.4
      • Coding locus same strand: AC008984.5, Z84721.2, AC011330.5
      We believe all these pseudogenes are valid.

  26. Non-coding locus Aligned proteins (column collapsed) HAVANA sialyltransferase pseudogene Supporting EST Putative novel transcript

  27. Coding locus opposite strand Protein alignment Non-coding exon HAVANA novel pseudogene ENm001 Pseudogene: AC002543.2

  28. Coding locus same strand LILR pseudogene Frameshift LILRA3

  29. But not… In-frame stop codon KIR2DL3 – coding gene
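
Several slides above cite the same kinds of evidence for or against a pseudogene call: "No Met / First possible Met", "Stop codon in exon 20", "3' truncation ... no stop", "In-frame stop codon". The following is a minimal Python sketch of how such features could be tabulated for a candidate coding sequence. It is an illustration only, not the HAVANA annotation procedure: the function name `orf_report` and its output keys are hypothetical, the input is assumed to be an already spliced, sense-strand nucleotide sequence read in frame 0, and a length that is not a multiple of three is used only as a crude hint of a frameshift or truncation (in practice frameshifts are identified from alignments to the parent protein).

```python
# Hypothetical helper for illustration; not part of any annotation pipeline.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_report(cds: str) -> dict:
    """Summarise simple ORF features of a spliced, sense-strand sequence.

    Assumption: the sequence is read in frame 0 from its first base.
    """
    seq = cds.upper()
    # Split into complete codons, ignoring any trailing 1-2 bases.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]

    # Codon index of the first in-frame ATG, or None if there is none.
    first_met = next((i for i, c in enumerate(codons) if c == "ATG"), None)

    # Stop codons anywhere before the last complete codon are premature.
    premature = [i for i, c in enumerate(codons[:-1]) if c in STOP_CODONS]

    return {
        "starts_with_met": bool(codons) and codons[0] == "ATG",
        "first_met_codon": first_met,
        "length_multiple_of_3": len(seq) % 3 == 0,
        "premature_stop_codons": premature,
        "ends_with_stop": bool(codons) and codons[-1] in STOP_CODONS,
    }


if __name__ == "__main__":
    # Toy sequences, invented for this example (not taken from any locus above).
    for name, toy in [("intact", "ATGGCCAAATAA"),
                      ("disrupted", "GCCATGTAGAAATAA")]:
        print(name, orf_report(toy))
```

On the toy "disrupted" sequence the report shows no initiating Met at the first codon, a first possible Met at codon index 1 and a premature stop at codon index 2. As slide 29 illustrates, an in-frame stop codon does not by itself settle the call, so a report like this only supports manual review.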
