Learn about the formation, evolution, and structure of the Gene Ontology (GO) and its importance in genomic annotation. Discover how GO categorizes gene products into cellular components, molecular functions, and biological processes. Explore the collaborative efforts involved in developing and maintaining GO within the Open Biomedical Ontologies (OBO) framework.
GO and OBO: an introduction
What is the Gene Ontology? • What is OBO? • OBO-Edit demo & practical Jane Lomax EMBL-EBI
Gene Ontology • Built for a very specific purpose: “annotation of genes and proteins in genomic and protein databases” • Applicable to all species
Evolution of GO • Original GO created in 2000 • Three databases involved: • FlyBase (Drosophila) • MGI (Mouse) • SGD (S. cerevisiae) • Used immediately
Evolution of GO • Later databases: • TAIR (Arabidopsis) • TIGR (microbes including prokaryotes) • SWISS-PROT (several thousand species inc. human) • PSU (P. falciparum) • Recent additions • ZFIN (zebrafish) • PAMGO (plant pathogens)
Evolution of GO • GO development traditionally annotation-driven • development directed by use • Terms added as new species annotated • Terms added on an as-needed basis
Evolution of GO • Developed by an international consortium of biologists and computer scientists • members from individual databases • central office at EBI • Development involves collaboration with domain experts from different biological fields • also formal ontologists
Evolution of GO • Resulted in ‘organic’ structure, little formality • Ontological formality added subsequently • philosophical and logical
Growth of GO
How does GO work? • What information might we want to capture about a gene product? • What does the gene product do? • Where and when does it act? • Why does it perform these activities?
GO structure • GO terms divided into three parts: • cellular component • molecular function • biological process
Cellular Component • where a gene product acts
Cellular Component • Enzyme complexes in the component ontology refer to places, not activities.
Molecular Function • activities or “jobs” of a gene product • example: glucose-6-phosphate isomerase activity
Molecular Function • example: insulin binding • example: insulin receptor activity
Molecular Function • example: drug transporter activity
Molecular Function • A gene product may have several functions; a function term refers to a single reaction or activity, not a gene product. • Sets of functions make up a biological process.
Biological Process • a commonly recognized series of events • example: cell division
Biological Process • example: transcription
Biological Process • example: regulation of gluconeogenesis
Biological Process • example: limb development
Biological Process • example: courtship behavior
Ontology Structure • Terms are linked by two relationships • is-a • part-of
Ontology Structure • [diagram: the terms cell, membrane, chloroplast, mitochondrial membrane and chloroplast membrane, linked by is-a and part-of relationships]
Ontology Structure • Ontologies are structured as a hierarchical directed acyclic graph (DAG) • Terms can have more than one parent and zero, one or more children
Ontology Structure • Directed Acyclic Graph (DAG): multiple parentage allowed • [diagram: the terms cell, membrane, chloroplast, mitochondrial membrane and chloroplast membrane]
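The multiple-parentage idea can be sketched in a few lines of Python. This is only an illustration: the term names come from the slide, but the diagram itself does not survive in this transcript, so the specific edges in `EDGES` are assumptions, and `build_parents` and `ancestors` are hypothetical helper names rather than part of any GO tool.

```python
from collections import defaultdict

# Assumed edges (child, relationship, parent); the slide's diagram is not
# recoverable here, so these links are illustrative only.
EDGES = [
    ("membrane", "part_of", "cell"),
    ("chloroplast", "part_of", "cell"),
    ("mitochondrial membrane", "is_a", "membrane"),
    ("chloroplast membrane", "is_a", "membrane"),
    ("chloroplast membrane", "part_of", "chloroplast"),
]


def build_parents(edges):
    """Map each child term to its list of (relationship, parent) pairs."""
    parents = defaultdict(list)
    for child, rel, parent in edges:
        parents[child].append((rel, parent))
    return parents


def ancestors(term, parents):
    """Return every term reachable by following parent links upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _rel, parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


PARENTS = build_parents(EDGES)
```

In this sketch "chloroplast membrane" has two parents, one through each relationship type, which is what distinguishes a DAG from a simple tree.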
Open Biomedical Ontologies (OBO) • GO is a member of OBO • An umbrella project for grouping different ontologies in biological/medical field • a repository for ontologies with defined set of standards • Available from a single source: http://obo.sourceforge.net/
Why do we need OBO? • GO covers small area of biology: • molecular function of a protein • biological process a protein is involved in • cellular location of a protein
Why do we need OBO? • Lots of other aspects that also need to be captured, e.g.: • phenotype • anatomy • genomic • taxonomy
Why do we need OBO? • Many groups develop their own ontologies • e.g. plant ontology, anatomies for specific organisms • No standardisation of ontologies with respect to: • format • scope • relationships • No way of knowing whether such ontologies already exist • No mechanism of distribution for other groups
Why do we need OBO? • Creating ontologies takes a lot of work • Makes sense to reuse existing ontologies where possible • Improves data integration where small set of ontologies used • Allows ontologies to be made available from a single place
Why do we need OBO? • Ultimate aim: a complete set of integrated ontologies completely covering the biomedical domain
OBO requirements To be part of OBO, ontologies must: • Be open, can be used by all without any constraint
OBO requirements: open • Ontologies can be used by anyone without any constraints, except: • original authors are acknowledged • cannot be edited and then released under same name
OBO requirements To be part of OBO, ontologies must: • Be open, can be used by all without any constraint • Be in a common shared syntax
OBO requirements: syntax • Usually the OBO format, same as primary GO format • and adaptations of OBO format • Also accept OWL (Web Ontology Language) format • Allows the same tools to be applied, facilitating shared software implementations
Anatomy of an OBO term
  id: GO:0006094    (unique ID)
  name: gluconeogenesis    (term name)
  namespace: process    (ontology)
  def: The formation of glucose from noncarbohydrate precursors, such as pyruvate, amino acids and glycerol. [http://cancerweb.ncl.ac.uk/omd/index.html]    (definition)
  exact_synonym: glucose biosynthesis    (synonym)
  xref_analog: MetaCyc:GLUCONEO-PWY    (database ref)
  is_a: GO:0006006    (parentage)
  is_a: GO:0006092    (parentage)
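Because each field of the term is a simple tag followed by a value, the example above can be read with very little code. The following Python is a rough sketch under that assumption, not a real OBO parser: `parse_stanza` is a hypothetical helper, and it ignores the stanza headers, quoting, comments and escaping that real OBO files may contain.

```python
# The gluconeogenesis term exactly as shown on the slide.
STANZA = """\
id: GO:0006094
name: gluconeogenesis
namespace: process
def: The formation of glucose from noncarbohydrate precursors, such as pyruvate, amino acids and glycerol. [http://cancerweb.ncl.ac.uk/omd/index.html]
exact_synonym: glucose biosynthesis
xref_analog: MetaCyc:GLUCONEO-PWY
is_a: GO:0006006
is_a: GO:0006092
"""


def parse_stanza(text):
    """Collect 'tag: value' lines into a dict of lists.

    Values are kept in lists because some tags, such as is_a, may repeat.
    """
    term = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first ": " only, so IDs like GO:0006094 stay intact.
        tag, _, value = line.partition(": ")
        term.setdefault(tag, []).append(value)
    return term


TERM = parse_stanza(STANZA)
```

Here `TERM["is_a"]` holds both parent IDs, which mirrors the multiple parentage allowed by the DAG structure.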
OBO requirements To be part of OBO, ontologies must: • Be open, can be used by all without any constraint • Be in a common shared syntax • Not overlap with other ontologies in OBO
OBO requirements: overlapping • Ontologies can (and should) overlap partially, but large overlap should be avoided • Idea is that terms from different ontologies can be combined to form new terms • Striving for accepted standards rather than competition
OBO requirements To be part of OBO, ontologies must: • Be open, can be used by all without any constraint • Be in a common shared syntax • Not overlap with other ontologies in OBO • Share a unique identifier space
OBO requirements: id space • So, for example, the GO identifier is “GO”: • No other OBO ontology could use this id space • Prevents problems where multiple ontologies are used together
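One way to picture the id-space rule is a check that no prefix is claimed by two ontologies. The Python below is a hypothetical sketch only: `id_space` and `find_clashes` are invented names, the ontology names in the usage note are made up, and nothing here reflects how OBO actually enforces the rule.

```python
def id_space(term_id):
    """Return the prefix before the colon, e.g. 'GO' for 'GO:0006094'."""
    prefix, sep, local = term_id.partition(":")
    if not sep or not prefix or not local:
        raise ValueError(f"not a prefixed identifier: {term_id!r}")
    return prefix


def find_clashes(ontologies):
    """Report id spaces used by more than one ontology.

    `ontologies` maps an ontology name to an iterable of its term IDs.
    The result maps each clashing id space to the set of ontology names.
    """
    owners = {}
    for name, term_ids in ontologies.items():
        for term_id in term_ids:
            owners.setdefault(id_space(term_id), set()).add(name)
    return {space: names for space, names in owners.items() if len(names) > 1}
```

For instance, `find_clashes({"gene_ontology": ["GO:0006094"], "made_up_anatomy": ["MA:0000001"]})` returns an empty dict, while two ontologies that both issue `GO:` identifiers would be reported together under the key `"GO"`.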
OBO requirements To be part of OBO, ontologies must: • Be open, can be used by all without any constraint • Be in a common shared syntax • Not overlap with other ontologies in OBO • Share a unique identifier space • Include text definitions of their terms
OBO requirements • In addition, OBO includes ontology of relationships • all ontologies should use these definitions of relationships • For example • part_of • develops_from • regulates
What’s available • demo: http://obo.sourceforge.net/
Editing ontologies • GO is edited using OBO-Edit • stand-alone Java application • available for all platforms • browse, create or edit any ontology in OBO format
OBO-Edit demo • Browsing ontologies • loading ontologies (including loading multiple ontologies) • graph viewer • reasoner/single relationship views • searching/filtering/rendering • help • Creating/editing ontologies • creating a new ontology • adding terms • copying/moving/deleting terms • adding definitions, dbxrefs etc • verification plugin • saving ontologies