
GO Term Integration and Curation in Pathway Tools and EcoCyc

Ingrid M. Keseler, Bioinformatics Research Group, SRI International (keseler@ai.sri.com)


Presentation Transcript


  1. GO Term Integration and Curation in Pathway Tools and EcoCyc
     Ingrid M. Keseler
     Bioinformatics Research Group, SRI International
     keseler@ai.sri.com

  2. History of Classification and GO terms in EcoCyc
     The MultiFun classification scheme was/is used for gene/gene product classification in EcoCyc.
     • Developed by Monica Riley and collaborators
     • Hierarchical classification scheme with 10 major categories for cellular function
     • In 2005, we began to add support for assigning GO terms to genes/gene products

  3. Why go with GO?
     • GO has become the standard ontology/classification scheme for gene products
     • GO is being actively developed with input from the user communities
     • GO is allowing standardization of annotation across all domains of life
     • Data mining across genomes
     • Genome annotation by similarity (e.g. via InterPro, Pfam, TIGRFAM, COG mappings)
     • Tools that take advantage of GO annotations, e.g. microarray data clustering

  4. The Evolution of GO Within EcoCyc
     • 12/2005 -- Mapping of MultiFun terms to GO terms (multifun2go – Ashburner and Lomax): multiple specific GO terms were sometimes mapped to one general MultiFun term, resulting in misleading GO term annotations in EcoCyc; no evidence codes or citations
     • 12/2007 -- Mapping of EC reactions to GO terms (ec2go): imported GO terms for enzymes that catalyzed reactions with full EC number assignments; no evidence codes or citations
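The ec2go import on this slide can be sketched roughly as follows. This is a minimal illustration, not the Pathway Tools importer: it assumes the plain-text external2go layout (`EC:number > GO:term name ; GO:id`, with `!` comment lines), the two sample lines are illustrative rather than copied from a specific release, and `parse_ec2go` is a hypothetical helper name. Only full four-part EC numbers are kept, matching the restriction described above.

```python
import re
from collections import defaultdict

# Illustrative lines in the external2go layout (assumed format, sample data only).
SAMPLE_EC2GO = """\
! comment lines start with an exclamation mark
EC:1.1.1.1 > GO:alcohol dehydrogenase (NAD+) activity ; GO:0004022
EC:1.1.1.- > GO:oxidoreductase activity, acting on the CH-OH group of donors, NAD or NADP as acceptor ; GO:0016616
"""

# A "full" EC number has four numeric fields; partial ones such as 1.1.1.- are skipped.
FULL_EC = re.compile(r"^EC:\d+\.\d+\.\d+\.\d+$")


def parse_ec2go(text):
    """Return {EC number: [GO ids]} for full EC numbers only (hypothetical helper)."""
    mapping = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("!"):
            continue
        ec, _, target = line.partition(" > ")
        # The GO id follows the last " ; "; term names may themselves contain commas.
        go_id = target.rpartition(" ; ")[2].strip()
        if FULL_EC.match(ec.strip()) and go_id.startswith("GO:"):
            mapping[ec.strip()].append(go_id)
    return dict(mapping)


ec_to_go = parse_ec2go(SAMPLE_EC2GO)
print(ec_to_go)
```

Terms imported this way carry no evidence code or citation of their own, which is the limitation the slide points out.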

  5. The Evolution of GO Within EcoCyc
     • 4/2008 -- Importing GO term assignments from UniProt; mostly computational evidence codes
     • Since ~2007 -- Manual curation of GO terms based on publications, with evidence codes (mostly experimental) and literature citations
     • Since ~2008 -- EcoCyc and EcoliWiki are the source of the official E. coli gene-association file (in collaboration with J. Hu and D. Siegele, EcoliWiki, Texas A&M)

  6. Of Requirements and Differences
     • Specific requirements for GO gene-association file
     • Presence of evidence codes and citations
     • Pathway Tools uses a different evidence code ontology; it is therefore necessary to map the evidence codes carefully
     • Some types of evidence require use of a With/From qualifier in GO, e.g. IPI, ISS
     • Annotation with other qualifiers is not required by GO (e.g. NOT, contributes_to, colocalizes_with) and is not (yet) supported by Pathway Tools
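The requirements on this slide could be checked with something like the sketch below. The `PTOOLS_TO_GO` table is a hypothetical mapping written for illustration: the `EV-...` names are assumed to follow the Pathway Tools naming style, and the actual correspondence used for EcoCyc is not given in the slides. The `Annotation` record, the `check` function, and the placeholder identifiers are likewise assumptions. The With/From rule covers only the two codes the slide names (IPI, ISS).

```python
from typing import NamedTuple, Optional

# Hypothetical mapping from Pathway Tools-style evidence codes to GO evidence codes.
PTOOLS_TO_GO = {
    "EV-EXP-IDA": "IDA",
    "EV-EXP-IMP": "IMP",
    "EV-EXP-IPI": "IPI",
    "EV-COMP-HINF": "ISS",
    "EV-COMP-AINF": "IEA",
}

# GO evidence codes that need a With/From value (the two named on the slide).
WITH_REQUIRED = {"IPI", "ISS"}


class Annotation(NamedTuple):
    gene_product: str
    go_id: str
    ptools_evidence: str
    citation: Optional[str]
    with_from: Optional[str] = None


def check(ann):
    """Return (GO evidence code or None, list of problems) for one annotation."""
    problems = []
    go_code = PTOOLS_TO_GO.get(ann.ptools_evidence)
    if go_code is None:
        problems.append(f"no GO evidence code mapped for {ann.ptools_evidence}")
    if not ann.citation:
        problems.append("missing citation")
    if go_code in WITH_REQUIRED and not ann.with_from:
        problems.append(f"{go_code} requires a With/From value")
    return go_code, problems


# Placeholder identifiers, for illustration only.
ok = Annotation("example-protein", "GO:0004022", "EV-EXP-IDA", "PMID:0000000")
incomplete = Annotation("example-protein", "GO:0005515", "EV-EXP-IPI", "PMID:0000000")

print(check(ok))
print(check(incomplete))
```

Keeping the mapping in one explicit table makes it easy to review, which matters because a code with no counterpart is reported rather than silently dropped.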

  7. Tools for the Curator
     • GO classification editor is accessible via the protein editor
     • GO database can be searched in the editor; term definitions are available
     • Tools available locally (ask developers about general availability):
       • Import new GO database (for newly created terms etc.)
       • Export gene-association file
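To give a sense of what an exported gene-association file contains, here is a minimal sketch that formats one annotation as a tab-separated line. It assumes the 15-column GAF 1.0 layout; it is not the Pathway Tools export tool. The column labels, the `to_gaf_line` helper, and every value in the example record are illustrative placeholders.

```python
# Column order assumed from the 15-column GAF 1.0 layout; the labels are informal.
GAF1_COLUMNS = [
    "DB", "DB_Object_ID", "DB_Object_Symbol", "Qualifier", "GO_ID",
    "DB_Reference", "Evidence", "With_From", "Aspect", "DB_Object_Name",
    "DB_Object_Synonym", "DB_Object_Type", "Taxon", "Date", "Assigned_By",
]


def to_gaf_line(record):
    """Join one annotation into a tab-separated line; absent fields stay empty."""
    return "\t".join(record.get(col, "") for col in GAF1_COLUMNS)


# Placeholder record, for illustration only.
example = {
    "DB": "EcoCyc",
    "DB_Object_ID": "EXAMPLE-ID-1",
    "DB_Object_Symbol": "exaA",
    "GO_ID": "GO:0004022",
    "DB_Reference": "PMID:0000000",
    "Evidence": "IDA",
    "Aspect": "F",
    "DB_Object_Type": "protein",
    "Taxon": "taxon:83333",
    "Date": "20090801",
    "Assigned_By": "EcoCyc",
}

line = to_gaf_line(example)
print(line)
```

Optional columns such as the qualifier and With/From are written as empty strings, so every line keeps the same number of fields.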

  8. Manual Curation of GO terms
     • Ongoing when we curate or re-curate gene products within EcoCyc
     • No particular effort to back-fill GO terms; e.g. metabolic enzymes get experimental GO term assignments when we re-curate old metabolic pathways, or when new literature appears
     • Texas A&M team is part of the Reference Genome Annotation Project; GO term assignments from EcoliWiki get imported into EcoCyc on a regular basis

  9. GO Term Statistics for E. coli (8/2009)
     • 3721 gene products annotated with at least one GO term
     • 42724 total GO term annotations, of which 6330 are non-IEA annotations
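The proportions implied by these counts can be worked out directly. The snippet below uses only the three figures from the slide; IEA is the GO evidence code "Inferred from Electronic Annotation", and the variable names are arbitrary.

```python
# Counts taken from the slide (8/2009).
gene_products = 3721
total_annotations = 42724
non_iea = 6330

# Every annotation that is not non-IEA is, by definition, IEA.
iea = total_annotations - non_iea
non_iea_share = 100 * non_iea / total_annotations
per_product = total_annotations / gene_products

print(f"IEA annotations: {iea}")
print(f"non-IEA share of all annotations: {non_iea_share:.1f}%")
print(f"annotations per annotated gene product: {per_product:.1f}")
```

So roughly one annotation in seven rests on non-electronic evidence, consistent with the earlier note that the UniProt import carried mostly computational evidence codes.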

  10. Acknowledgements
      • Peter Karp
      • Suzanne Paley
      • Markus Krummenacker
      • Tomer Altman
      • Jim Hu
      • Debby Siegele
      • GO experts at the GO consortium
