Methods for Creating GO Annotations

Emily Dimmer, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, UK.

  1. Methods for Creating GO Annotations Emily Dimmer European Bioinformatics Institute Wellcome Trust Genome Campus Cambridge UK

  2. The core information needed for a GO annotation • 1. Database object (protein), e.g. Q9ARH1 • 2. GO term ID, e.g. GO:0004674 • 3. Reference ID, e.g. PubMed ID: 12374299 or GOA:InterPro • 4. Evidence code, e.g. TAS
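
The four core fields can be held in a small record type. The following is a minimal sketch, not the GOA data model: the class name `GOAnnotation`, its field names, and the `PMID:` prefix are assumptions made for illustration, and the only check applied to the GO term ID is the `GO:` plus seven digits shape seen in the slide's example.

```python
import re
from dataclasses import dataclass

# Assumed shape of a GO term ID, based on the slide's example GO:0004674.
GO_ID_PATTERN = re.compile(r"GO:\d{7}")


@dataclass(frozen=True)
class GOAnnotation:
    """Hypothetical container for the four core annotation fields."""
    db_object_id: str   # e.g. a protein accession such as Q9ARH1
    go_id: str          # e.g. GO:0004674
    reference: str      # e.g. a PubMed ID or GOA:InterPro
    evidence_code: str  # e.g. TAS

    def __post_init__(self):
        # Reject IDs that do not look like GO:0000000.
        if not GO_ID_PATTERN.fullmatch(self.go_id):
            raise ValueError(f"malformed GO term ID: {self.go_id!r}")
        # All remaining core fields must be present.
        for name in ("db_object_id", "reference", "evidence_code"):
            if not getattr(self, name):
                raise ValueError(f"{name} must not be empty")


example = GOAnnotation("Q9ARH1", "GO:0004674", "PMID:12374299", "TAS")
print(example)
```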

  6. GO Evidence Codes • Every GO annotation includes an evidence code that describes the kind of evidence on which the annotation is based. [Slide graphic labelled: Manually annotated]

  7. Additional fields can be used to further clarify an annotation • Qualifiers (NOT, contributes_to, colocalizes_with) • ‘with’ data, to provide users with more information on the method/experiment applied.

  8. Annotations using the ‘NOT’ qualifier • hSNF2H: ATPase activity (GO:0016887), IDA • Rsf-1: NOT ATPase activity (GO:0016887), IDA • Loyola et al. Mol Cell Biol. 2003 Oct;23(19):6759-68.

  9. Annotations using the ‘contributes_to’ qualifier • A protein that is part of a complex can be annotated to terms that describe: • its individual action • the action of the whole complex (Molecular Function terms) • To differentiate between these two types of annotation, if a protein does not possess the activity itself, the contributes_to qualifier is added to the annotation.

  10. Annotations using the ‘contributes_to’ qualifier • Ring1B: ubiquitin-protein ligase activity, IDA • Bmi-1: ubiquitin-protein ligase activity, IDA, contributes_to • Ring1A: ubiquitin-protein ligase activity, IDA, contributes_to • Pc3: ubiquitin-protein ligase activity, IDA, contributes_to • Cao et al. Mol Cell. 2005 Dec 22;20(6):845-54.

  11. Annotations using the ‘colocalizes_with’ qualifier • Used with Cellular Component terms • Describes proteins that are transiently or peripherally associated with an organelle or complex • CENP-E: condensed chromosome kinetochore, IDA, colocalizes_with • Meyer et al. J Cell Biol. 1997 Feb 24;136(4):775-88.
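
The effect of the three qualifiers on how an annotation should be read can be sketched in a few lines. This is an illustration only: the `Annotation` tuple, its field names, and the helper `asserted_directly` are hypothetical, and the records simply restate the examples from slides 8, 10 and 11 (GO term names are used where the slides give no ID).

```python
from typing import FrozenSet, List, NamedTuple


class Annotation(NamedTuple):
    """Hypothetical record: one annotation plus its qualifiers."""
    protein: str
    go_term: str  # GO ID or term name, as given on the slides
    evidence: str
    qualifiers: FrozenSet[str] = frozenset()


LIGASE = "ubiquitin-protein ligase activity"

ANNOTATIONS: List[Annotation] = [
    Annotation("hSNF2H", "GO:0016887", "IDA"),
    Annotation("Rsf-1", "GO:0016887", "IDA", frozenset({"NOT"})),
    Annotation("Ring1B", LIGASE, "IDA"),
    Annotation("Bmi-1", LIGASE, "IDA", frozenset({"contributes_to"})),
    Annotation("CENP-E", "condensed chromosome kinetochore", "IDA",
               frozenset({"colocalizes_with"})),
]


def asserted_directly(protein: str, go_term: str,
                      annotations: List[Annotation]) -> bool:
    """True if the protein is annotated to the term with neither a NOT
    nor a contributes_to qualifier, i.e. it has the activity itself."""
    return any(
        a.protein == protein
        and a.go_term == go_term
        and not ({"NOT", "contributes_to"} & a.qualifiers)
        for a in annotations
    )


for name, term in [("hSNF2H", "GO:0016887"), ("Rsf-1", "GO:0016887"),
                   ("Ring1B", LIGASE), ("Bmi-1", LIGASE)]:
    print(name, term, asserted_directly(name, term, ANNOTATIONS))
```

Ignoring the qualifier column would wrongly credit Rsf-1 with ATPase activity and Bmi-1 with ligase activity of its own, which is why any consumer of annotations needs a check of this kind.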

  12. Annotations using additional identifiers in the ‘with’ column • Provides further information to support the evidence code used in an annotation • When transferring annotations based on sequence similarity… (columns: Protein, GO term, Evidence, Reference, With) • For protein binding annotations… (columns: Protein, GO term, Evidence, Reference, With)

  13. There are two main types of GO annotation: • Electronic annotation • Manual annotation • Both methods have their advantages. • They can be easily distinguished by the ‘evidence code’ used.
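
Telling the two types apart by evidence code can be sketched as below. The sketch assumes that IEA (Inferred from Electronic Annotation) marks electronic annotations and treats every other code, such as the TAS and IDA codes used elsewhere in these slides, as manual; the function name `annotation_method` is hypothetical.

```python
# Assumption: IEA is the evidence code that marks electronic annotations.
ELECTRONIC_CODES = {"IEA"}


def annotation_method(evidence_code: str) -> str:
    """Classify an annotation as 'electronic' or 'manual' by its evidence code."""
    code = evidence_code.strip().upper()
    return "electronic" if code in ELECTRONIC_CODES else "manual"


for code in ("IEA", "TAS", "IDA"):
    print(code, annotation_method(code))
```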

  14. Electronic Annotation • Fatty acid biosynthesis (Swiss-Prot keyword) -> GO: fatty acid biosynthesis (GO:0006633) • EC:6.4.1.2 (EC number) -> GO: acetyl-CoA carboxylase activity (GO:0003989) • IPR000438: Acetyl-CoA carboxylase carboxyl transferase beta subunit (InterPro entry) -> GO: acetyl-CoA carboxylase activity (GO:0003989) • MF_00527: Putative 3-methyladenine DNA glycosylase (HAMAP) -> GO: DNA repair (GO:0006281) • Very high quality • However, these annotations often use high-level GO terms and provide little detail. • Camon et al. BMC Bioinformatics. 2005; 6 Suppl 1:S17

  15. Mappings of external concepts to GO http://www.geneontology.org/GO.indices.shtml
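
How such a mapping yields electronic annotations can be sketched as a simple lookup. The table below only restates the four pairs shown on the Electronic Annotation slide; the key prefixes (`SP_KW:`, `InterPro:`, `HAMAP:`), the function name, and the accession `P_EXAMPLE` are hypothetical, and the real mapping files are not reproduced here.

```python
from typing import Dict, List, Tuple

# External concept -> GO term IDs, restating the slide's four examples.
# The key prefixes are an assumed naming scheme for this sketch.
EXTERNAL_TO_GO: Dict[str, List[str]] = {
    "SP_KW:Fatty acid biosynthesis": ["GO:0006633"],
    "EC:6.4.1.2": ["GO:0003989"],
    "InterPro:IPR000438": ["GO:0003989"],
    "HAMAP:MF_00527": ["GO:0006281"],
}


def electronic_annotations(
    protein: str, cross_references: List[str]
) -> List[Tuple[str, str, str, str]]:
    """Return (protein, GO ID, evidence code, source) tuples for every
    cross-reference that has a mapping; unmapped ones are ignored."""
    result = []
    for xref in cross_references:
        for go_id in EXTERNAL_TO_GO.get(xref, []):
            # Annotations produced this way carry the electronic code IEA.
            result.append((protein, go_id, "IEA", xref))
    return result


# P_EXAMPLE is a made-up accession used only to exercise the lookup.
for row in electronic_annotations(
    "P_EXAMPLE", ["EC:6.4.1.2", "InterPro:IPR000438", "EC:0.0.0.0"]
):
    print(row)
```

Keeping the source identifier alongside each annotation mirrors the idea of the ‘with’ column: a user can see which external concept triggered the GO term.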

  16. InterProScan http://www.ebi.ac.uk/InterProScan

  17. Output from InterProScan…

  18. Manual Annotation • High-quality, specific annotations made using: • peer-reviewed papers • a range of evidence codes to categorize the types of evidence found in a paper • Very time-consuming and requires trained biologists

  19. Finding GO terms for chicken TaxREB107 protein (Q8UWG7) • increased troponin I reporter gene activity • positive modulator of skeletal muscle gene expression • nucleoli • cytoplasmic • Component: cytoplasm (GO:0005737) • Component: nucleolus (GO:0005730) • Process: positive regulation of transcription (GO:0045941) • Process: positive regulation of skeletal muscle development (GO:0048643)

  20. http://www.geneontology.org/GO.annotation.shtml

  21. Aids for GO manual annotation Many are on the GO Consortium tools page: http://www.geneontology.org/GO.tools.shtml

  22. GoPubMed • Gives an overview of literature abstracts taken from PubMed and categorizes them with Gene Ontology terms • http://gopubmed.org

  23. GoPubMed http://gopubmed.org

  24. http://www.ebi.ac.uk/Rebholz-srv/whatizit

  25. Whatizit • UniProt ACs • GO terms • http://www.ebi.ac.uk/Rebholz-srv/whatizit

  26. Searching for GO terms • http://www.ebi.ac.uk/ego • http://www.godatabase.org • …and more browsers are available on the GO Tools page: http://www.geneontology.org/GO.tools.html

  27. http://www.ebi.ac.uk/ego

  28. Exact match http://www.ebi.ac.uk/ego

  29. GO annotation editors • The GO Consortium is aware there is a need for a light-weight, generic GO annotation tool. • enhanced spreadsheets (e.g. Excel) • Protein2GO (GOA)

  30. Enhanced Spreadsheets • Quick and cheap to start with • However, it is difficult to maintain or update a reasonably sized set of annotations

  31. Protein2GO

  38. How users can view GO annotations • Download and parse an entire gene association file… • …or look at annotations for a protein using one of the GO browsers or a database that integrates GO annotations. • QuickGO: http://www.ebi.ac.uk/ego
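
Parsing a gene association file can be sketched as follows. The sketch assumes the tab-delimited layout in which lines beginning with `!` are comments and the first nine columns are database, object ID, symbol, qualifier, GO ID, reference, evidence code, with/from and aspect; later columns are ignored. The sample rows are made up for illustration (placeholder symbols and a placeholder reference) and are not taken from a real file.

```python
from typing import Dict, Iterable, Iterator

# Assumed names for the first nine tab-separated columns.
COLUMNS = (
    "db", "db_object_id", "db_object_symbol", "qualifier", "go_id",
    "reference", "evidence_code", "with_from", "aspect",
)


def parse_gene_association(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per annotation line, skipping comments and blanks."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("!"):
            continue  # header/comment or empty line
        fields = line.split("\t")
        if len(fields) < len(COLUMNS):
            continue  # too few columns to interpret; skip
        yield dict(zip(COLUMNS, fields))


# Made-up sample rows; symbols and the second reference are placeholders.
SAMPLE = [
    "!gaf-version: 2.0",
    "\t".join(["UniProtKB", "Q9ARH1", "EXAMPLE1", "", "GO:0004674",
               "PMID:12374299", "TAS", "", "F"]),
    "\t".join(["UniProtKB", "X_EXAMPLE", "EXAMPLE2", "NOT", "GO:0016887",
               "REF:placeholder", "IDA", "", "F"]),
]

records = list(parse_gene_association(SAMPLE))
for record in records:
    print(record["db_object_id"], record["qualifier"] or "-",
          record["go_id"], record["evidence_code"])
```

Because the function accepts any iterable of lines, an open file handle can be passed in place of `SAMPLE`, so a large download is read one line at a time.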

  39. http://www.geneontology.org/GO.current.annotations.shtml

  40. http://www.ebi.ac.uk/goa

  41. Acknowledgements • Nicky Mulder, Head of InterPro • Evelyn Camon, GOA Coordinator • Daniel Barrell, GOA Programmer • Rachael Huntley, GOA Curator • David Binns & John Maslen, QuickGO and Protein2GO tools • Achuthanunni C. Balakrishnan, Text-2-GO • Jorge Duarte, IPI sets • Midori Harris, GO Editor • Jane Lomax, GO Curator • Amelia Ireland, GO Curator • Jennifer Clarke, GO Curator • Rolf Apweiler, Head of Sequence Database Group • The Gene Ontology Consortium and 1.5 members of GOA are currently supported by a P41 grant from the National Human Genome Research Institute (NHGRI) [grant HG002273]; GOA is also supported by core EMBL funding.
