slide1

The Gene Ontology Project:

Developing and Using Controlled Vocabularies for Sharing Biological Information

slide2

GO Project Goals

  • Compile structured vocabularies describing aspects of molecular biology
  • Describe gene products using vocabulary terms (annotation)
  • Develop tools:
    • to query and modify the vocabularies and annotations
    • annotation tools for curators
slide3

GO Data

  • GO provides two bodies of data:
    • Terms with definitions and cross-references
    • Gene product annotations with supporting data
slide4

The Three Ontologies

  • Molecular Function — elemental activity or task
    • nuclease, DNA binding, transcription factor
  • Biological Process — broad objective or goal
    • mitosis, signal transduction, metabolism
  • Cellular Component — location or complex
    • nucleus, ribosome, origin recognition complex
slide5

Parent-Child Relationships

A child is a subset of its parent’s elements

slide6

DAG Structure

Directed acyclic graph: each child may have one or more parents
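
The multi-parent structure can be sketched with a small child-to-parents map. The term names and edges below are illustrative assumptions, not copied from the GO files, and `ancestors` and `is_acyclic` are hypothetical helper names.

```python
from typing import Dict, List, Set

# child -> parents. Toy terms and edges chosen for illustration only.
PARENTS: Dict[str, List[str]] = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "dna_binding": ["binding"],
    "helicase": ["catalysis"],
    "dna_helicase": ["dna_binding", "helicase"],  # a child with two parents
}


def ancestors(term: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Return every term reachable by following parent links upward."""
    seen: Set[str] = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def is_acyclic(parents: Dict[str, List[str]]) -> bool:
    """A graph is acyclic when no term is its own ancestor."""
    return all(term not in ancestors(term, parents) for term in parents)
```

A tree would force `dna_helicase` under a single parent; the DAG lets it sit under both.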

slide7

Relationship Types

  • is-a
    • subclass; a is a type of b
  • part-of
    • physical part of (component)
    • subprocess of (process)
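
One way to carry the relationship type is to store it on each edge. This is a minimal sketch: the relation assigned to each pair below is an assumption made for illustration, and `Edge` and `parents_of` are hypothetical names.

```python
from typing import List, NamedTuple, Optional


class Edge(NamedTuple):
    child: str
    relation: str  # "is_a" or "part_of"
    parent: str


# Illustrative edges only; the relation on each pair is assumed, not sourced.
EDGES: List[Edge] = [
    Edge("nucleus", "part_of", "cell"),
    Edge("nucleus", "is_a", "organelle"),
    Edge("mitotic chromosome condensation", "part_of", "mitosis"),
]


def parents_of(term: str, edges: List[Edge], relation: Optional[str] = None) -> List[str]:
    """Parents of `term`, optionally restricted to one relationship type."""
    return [
        e.parent
        for e in edges
        if e.child == term and (relation is None or e.relation == relation)
    ]
```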
slide8

The True Path Rule

Every path from a node back to the root must be biologically accurate
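
Because a term may have several parents, it may also have several paths to the root, and the rule applies to each of them. The sketch below lists those paths so that each can be reviewed. The toy graph and the helper name `paths_to_root` are assumptions for illustration.

```python
from typing import Dict, List

# child -> parents. Toy terms and edges, not taken from the GO files.
PARENTS: Dict[str, List[str]] = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "dna_binding": ["binding"],
    "helicase": ["catalysis"],
    "dna_helicase": ["dna_binding", "helicase"],
}


def paths_to_root(term: str, parents: Dict[str, List[str]]) -> List[List[str]]:
    """Enumerate every path from `term` up to a term with no parents."""
    if not parents[term]:
        return [[term]]
    paths: List[List[str]] = []
    for parent in parents[term]:
        for tail in paths_to_root(parent, parents):
            paths.append([term] + tail)
    return paths


for path in paths_to_root("dna_helicase", PARENTS):
    print(" -> ".join(path))
```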

slide9

GO Annotation

  • Association between gene product and applicable GO terms
  • Provided by member databases
  • Made by manual or automated methods
slide10

DAG Structure

Annotate to any level within DAG

slide11

DAG Structure

  • mitotic chromosome condensation
    • S.c. BRN1, D.m. barren

Annotate to any level within DAG

slide12

DAG Structure

  • mitosis
    • S.c. NNF1
  • mitotic chromosome condensation
    • S.c. BRN1, D.m. barren

Annotate to any level within DAG
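
Annotating at different depths still allows a query at the broader term to find the more specific annotations. The sketch below uses the gene products named on the slide and assumes that mitotic chromosome condensation is a child of mitosis; `products_for` and `is_descendant_or_self` are hypothetical helper names.

```python
from typing import Dict, List

# child -> parents. The single edge is an assumption based on the slide layout.
PARENTS: Dict[str, List[str]] = {
    "mitosis": [],
    "mitotic chromosome condensation": ["mitosis"],
}

# Direct annotations as shown on the slide: gene product -> term.
DIRECT: Dict[str, str] = {
    "S.c. NNF1": "mitosis",
    "S.c. BRN1": "mitotic chromosome condensation",
    "D.m. barren": "mitotic chromosome condensation",
}


def is_descendant_or_self(term: str, target: str, parents: Dict[str, List[str]]) -> bool:
    """True if `term` is `target` or lies below it in the graph."""
    if term == target:
        return True
    return any(is_descendant_or_self(p, target, parents) for p in parents[term])


def products_for(target: str, direct: Dict[str, str], parents: Dict[str, List[str]]) -> List[str]:
    """Gene products annotated to `target` or to any term beneath it."""
    return sorted(
        product
        for product, term in direct.items()
        if is_descendant_or_self(term, target, parents)
    )
```

Querying `mitosis` returns all three gene products, while querying the child term returns only the two annotated there.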

slide13

GO Annotation: Data

  • Database object: gene or gene product
  • GO term ID
  • Reference
    • publication or computational method
  • Evidence supporting annotation
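
The four fields above map naturally onto a small record type. This is a sketch, not the project's file format: the field names are assumptions, and the example values (`geneX`, `GO:0000000`, `REF:placeholder`) are placeholders rather than real data.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    db_object: str  # gene or gene product
    go_id: str      # GO term ID, e.g. "GO:" followed by seven digits
    reference: str  # publication or computational method
    evidence: str   # evidence code supporting the annotation

    def __post_init__(self) -> None:
        # Reject IDs that do not look like GO term identifiers.
        if not re.fullmatch(r"GO:\d{7}", self.go_id):
            raise ValueError(f"not a GO term ID: {self.go_id!r}")


# Placeholder values only; none of these refer to a real annotation.
example = Annotation(
    db_object="geneX",
    go_id="GO:0000000",
    reference="REF:placeholder",
    evidence="IDA",
)
```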
slide14

GO Evidence Codes

  • IDA - Inferred from Direct Assay
  • IMP - Inferred from Mutant Phenotype
  • IGI - Inferred from Genetic Interaction
  • IPI - Inferred from Physical Interaction
  • IEP - Inferred from Expression Pattern
  • TAS - Traceable Author Statement
  • NAS - Non-traceable Author Statement
  • IC - Inferred by Curator
  • ISS - Inferred from Sequence or Structural Similarity
  • IEA - Inferred from Electronic Annotation
  • ND - Not Determined
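
A lookup table of the codes makes it easy to validate annotations or to keep only those backed by chosen kinds of evidence. The sketch below assumes annotations are simple `(gene product, term, code)` tuples; the rows and the helper name `filter_by_evidence` are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

EVIDENCE_CODES: Dict[str, str] = {
    "IDA": "Inferred from Direct Assay",
    "IMP": "Inferred from Mutant Phenotype",
    "IGI": "Inferred from Genetic Interaction",
    "IPI": "Inferred from Physical Interaction",
    "IEP": "Inferred from Expression Pattern",
    "TAS": "Traceable Author Statement",
    "NAS": "Non-traceable Author Statement",
    "IC": "Inferred by Curator",
    "ISS": "Inferred from Sequence or Structural Similarity",
    "IEA": "Inferred from Electronic Annotation",
    "ND": "Not Determined",
}

Row = Tuple[str, str, str]  # (gene product, term, evidence code)


def filter_by_evidence(rows: Iterable[Row], allowed: Iterable[str]) -> List[Row]:
    """Keep rows whose evidence code is in `allowed`; reject unknown codes."""
    allowed_set = set(allowed)
    unknown = allowed_set - set(EVIDENCE_CODES)
    if unknown:
        raise ValueError(f"unknown evidence codes: {sorted(unknown)}")
    return [row for row in rows if row[2] in allowed_set]


# Hypothetical rows for illustration; not real annotations.
ROWS: List[Row] = [
    ("geneA", "termX", "IDA"),
    ("geneB", "termX", "IEA"),
    ("geneC", "termY", "TAS"),
]
```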

slide15

GO Evidence Codes

  • IDA - Inferred from Direct Assay
  • IMP - Inferred from Mutant Phenotype
  • IGI - Inferred from Genetic Interaction
  • IPI - Inferred from Physical Interaction
  • IEP - Inferred from Expression Pattern
  • TAS - Traceable Author Statement
  • NAS - Non-traceable Author Statement
  • IC - Inferred by Curator
  • ISS - Inferred from Sequence or Structural Similarity
  • IEA - Inferred from Electronic Annotation
  • ND - Not Determined

Callouts on the code list: From primary literature; From reviews or introductions; automated

slide16

GO Annotation: Methods

  • Manual
  • Automated
    • sequence similarity
    • transitive annotation
    • nomenclature, other text matching
slide17

Automated Annotation: InterPro Example

  • Diagram nodes: YFP, InterPro entry, GO entry
  • InterPro2go links InterPro entries and GO terms

slide18

Automated Annotation: InterPro Example

  • Run InterProScan to link YFP and InterPro entry
  • InterPro2go links InterPro entries and GO terms

slide19

Automated Annotation: InterPro Example

  • Run InterProScan to link YFP and InterPro entry
  • InterPro2go links InterPro entries and GO terms
  • Infer GO term from the other two links
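
The inference step amounts to composing two lookups: protein to InterPro entry, then InterPro entry to GO term. The sketch below does not call InterProScan or read a real InterPro2go file. Both mappings are hard-coded with placeholder identifiers, the helper name `infer_go_terms` is hypothetical, and tagging the results IEA assumes this automated route counts as electronic annotation.

```python
from typing import Dict, List, Tuple

# Placeholder identifiers: not real InterPro or GO accessions.
# Stands in for InterProScan output: protein -> matched InterPro entries.
SCAN_HITS: Dict[str, List[str]] = {
    "YFP": ["IPR_EXAMPLE_A", "IPR_EXAMPLE_B"],
}

# Stands in for the InterPro2go mapping: InterPro entry -> GO terms.
INTERPRO2GO: Dict[str, List[str]] = {
    "IPR_EXAMPLE_A": ["GO:EXAMPLE_1"],
    "IPR_EXAMPLE_B": ["GO:EXAMPLE_1", "GO:EXAMPLE_2"],
}


def infer_go_terms(
    protein: str,
    scan_hits: Dict[str, List[str]],
    interpro2go: Dict[str, List[str]],
) -> List[Tuple[str, str, str]]:
    """Compose the two links into (protein, GO term, evidence code) rows."""
    terms = set()
    for entry in scan_hits.get(protein, []):
        terms.update(interpro2go.get(entry, []))
    # Deduplicate terms reached through more than one InterPro entry.
    return [(protein, term, "IEA") for term in sorted(terms)]
```

A term reached through two different InterPro entries appears once in the result.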

slide20

GO Annotation: Contributors

  • FlyBase
  • WormBase
  • Saccharomyces Genome Database
  • DictyBase
  • Mouse Genome Informatics
  • Gramene
  • The Arabidopsis Information Resource
  • Compugen, Inc.
  • Swiss-Prot/TrEMBL/InterPro
  • Pathogen Sequencing Unit (Sanger Institute)
  • PomBase (Sanger Institute)
  • Rat Genome Database
  • The Institute for Genomic Research
slide21

GO Annotation: Organisms

  • Fruit fly (Drosophila melanogaster)
  • Budding yeast (Saccharomyces cerevisiae)
  • Fission yeast (Schizosaccharomyces pombe)
  • Human (Homo sapiens)
  • Mouse (Mus musculus)
  • Rice (Oryza sativa)
  • Rat (Rattus norvegicus)
  • Tsetse fly (G. morsitans)
  • Caenorhabditis elegans
  • Arabidopsis thaliana
  • Vibrio cholerae
  • Dictyostelium discoideum
slide23

GO Data Formats

  • flat files
    • working version; updated daily
    • archived monthly
  • XML RDF
    • updated monthly
  • MySQL database
    • updated monthly
slide24

GO Tools

  • Database (and schema)
  • Perl API
  • Browser: AmiGO
  • Editing tool: DAG-Edit
slide26

AmiGO Browser

  • Screenshot callout: gene products annotated to term

slide27

DAG-Edit

  • Screenshot callouts: DAG view, tree view, editing

slide28

What GO is NOT:

  • Not a way to unify biological databases
  • Not a dictated standard
  • Does not define evolutionary relationships
  • Additional ontologies needed to model biology and experimentation
slide29

Terms outside the Scope of GO

  • Names of gene products
  • Protein domains
  • Protein sequence features
  • Phenotypes; diseases
  • Anatomical terms (except as part of terms generated by cross-products)
slide30

The GOBO Proposal

  • Global Open Biology Ontologies
  • Umbrella site for shared genomics and proteomics vocabularies
  • Present incarnation: subdirectory within GO repository: www.geneontology.org/doc/gobo.html
slide31

GOBO Criteria

  • Open source
  • Can be instantiated in DAML+OIL or GO syntax
  • Orthogonal
  • Shared ID space
  • Defined terms
slide32

Some GOBO Ontologies

  • gene
  • gene_attribute
  • gene_structure (SO)
  • gene_variation (ME)
  • gene_product
  • gene_product_attribute
  • molecular_function (GO)
  • protein_family (INTERPRO)
  • phenotype
  • mutant phenotype
  • anatomy

For complete current draft see www.geneontology.org/doc/gobo.html

slide33

www.geneontology.org

  • FlyBase & Berkeley Drosophila Genome Project
  • WormBase
  • Saccharomyces Genome Database
  • DictyBase
  • Mouse Genome Informatics
  • Gramene
  • The Arabidopsis Information Resource
  • Compugen, Inc.
  • Swiss-Prot/TrEMBL/InterPro
  • Pathogen Sequencing Unit (Sanger Institute)
  • PomBase (Sanger Institute)
  • Rat Genome Database
  • Genome Knowledge Base (CSHL)
  • The Institute for Genomic Research

The Gene Ontology Consortium is supported by NHGRI grant HG02273 (R01). The Gene Ontology project thanks AstraZeneca for financial support. The Stanford group acknowledges a gift from Incyte Genomics.