Ontologies for Gene Expression
  • History of ontologies in bioinformatics
  • BioOntologies Consortium
  • Ontologies for the biochemical networks that control gene expression
Ontologies
  • Clear thinking about how to structure information
  • Clearly understand each field in a database
  • Formal and informal definitions for database elements
    • Type of value, range of values
    • Product field of Gene class can be a Protein or an RNA
  • Ability to enforce data correctness
  • Ability to compute with database elements in a reliable fashion
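The idea of typed, range-checked fields can be sketched in a few lines. This is only an illustration, not the schema of any real database: the class names, the `left_end_position` field, and the example values are assumptions made for this sketch. It shows a Product field that accepts only a Protein or an RNA, and a numeric field with a restricted range.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Protein:
    name: str


@dataclass
class RNA:
    name: str


@dataclass
class Gene:
    """Hypothetical Gene class whose fields carry type and range constraints."""
    name: str
    product: Union[Protein, RNA]   # type of value: Protein or RNA only
    left_end_position: int         # range of values: must be >= 1 (assumed rule)

    def __post_init__(self) -> None:
        # Enforce the type constraint on the Product field.
        if not isinstance(self.product, (Protein, RNA)):
            raise TypeError(
                f"product of {self.name} must be a Protein or an RNA, "
                f"got {type(self.product).__name__}"
            )
        # Enforce the range constraint on the position field.
        if self.left_end_position < 1:
            raise ValueError("left_end_position must be a positive coordinate")


# Placeholder instance; the names and coordinate are made up.
gene_x = Gene("geneX", Protein("protein-of-geneX"), left_end_position=100)
```

Because the checks run at construction time, an entry with a string in the Product field or a zero coordinate is rejected before it reaches the database, which is what makes later computation over the entries reliable.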
History of Ontologies in Bioinformatics
  • 1994 Meeting on Interoperation of Molecular Biology Databases (MIMBD-94)
  • BioOntologies meetings in 1997, 1998, 1999, 2000, 2001
  • Ontology tutorials at ISMB conference
  • BioOntologies Consortium
BioOntologies Consortium
  • Concerned with ontology infrastructure for bioinformatics
  • Exchange of ontologies
    • Beware: bioinformatics ontologies are each expressed in a different ontology language
  • Software for constructing, interpreting, applying ontologies
  • http://bioontology.ingenuity.com/
BioOntologies Consortium
  • ISMB-2000 paper evaluating ontology exchange languages for bioinformatics
    • Define criteria for evaluating existing languages
    • No existing languages satisfy all criteria
    • Desired: XML syntax, frame semantics
  • 1999: Karp and Chaudhri develop XOL language
  • 2000: OIL/DAML succeeds XOL
BioOntologies Consortium – Potential Interactions
  • Standards and tools
    • DAML/OIL
    • SRI’s GKB Editor ontology editor
  • Collaborate on ontology development
  • Post ontologies on BioOntologies web site
Be Precise About Ontology Uses
  • Data submission
  • Data exchange among databases
  • High-level database design
  • Mapping from ontologies to database management systems essential
  • Beware of flatfiles
  • Beware of XML
ArrayExpress
  • Ontology for specifying experiments
  • MAML import and export
  • SQL query access
EcoCyc Project Overview
  • E. coli Encyclopedia and model organism database
    • Tracks the evolving annotation of the E. coli genome
    • Over 3000 literature citations
  • Collaborative development via internet
    • Karp (SRI) -- Bioinformatics architect
    • Riley (MBL) -- Metabolic pathways, signal transduction
    • Saier (UCSD) and Paulsen (TIGR) -- Transport
    • Collado (UNAM) -- Regulation of gene expression
  • Ontology: 1000 biological classes
  • Database content: 16,000 instances
  • Over 3,300 registered users
Encoding Transcriptional Regulation in EcoCyc -- Goals
  • Capture transcriptional regulatory mechanisms within a well structured ontology
  • Provide a training set for inference of gene networks
  • Interpret gene-expression datasets in the context of known regulatory mechanisms
  • Compute with regulatory mechanisms and pathways
    • Summary statistics
    • Pattern discovery
    • Complex queries
    • Consistency checking
Pathway Tools Extensions for Transcriptional Regulation
  • Integration of RegulonDB (Collado et al.)
  • Regulation ontology
  • Editing tools for regulatory interactions
  • New visualizations
EcoCyc Ontology for Transcriptional Regulation
  • Terminology: Transcription Unit
    • Definition: A set of coding regions and associated control regions that yield a single transcript
    • “Operons” must have more than one gene
    • Prokaryotic terminology
  • Key features of ontology
    • Model gross structure of transcription units, transcription factors, RNA polymerase
    • Model all molecular interactions as biochemical reactions
      • Binding of transcription factors to ligands and to DNA sites
      • Binding of RNA polymerase to promoter
Ontology for Transcriptional Regulation – Current Limitations
  • Focused on prokaryotic regulation
  • Mechanisms based on control of transcription initiation only, e.g., no attenuation
Ontology for Regulatory Interactions
  • Common slots
    • Citations, Comment, Common-Name, Synonyms
  • Class DNA-Regions
    • Left-End-Position, Right-End-Position, Relative-Start-Distance
    • Class Transcription-Units
      • Components (Promoter, transcription-factor binding sites, genes, terminator)
    • Class Promoters
      • Component-Of
      • Promoter-Strength-Exp, Promoter-Strength-Seq
      • Promoter-Evidence
Ontology for Regulatory Interactions
  • Class DNA-Binding-Sites
    • Component-Of
    • Regulated-Promoter, Relative-Center-Distance
    • Type-Of-Evidence
  • Classes Protein-Complexes, Polypeptides
    • Components / Component-Of
  • Class Binding-Reactions
    • Reactants
    • Activators
    • Inhibitors
EcoCyc Ontology for Transcriptional Regulation
  • One DB object defined for each biological entity and for each molecular interaction

[Figure: database objects for the trp operon. Labels: trp, apoTrpR, TrpR*trp, site001, pro001, RpoSig70, trpLEDCBA, trpL, trpE, trpD, trpC, trpB, trpA; interactions Int001, Int003, Int005]
Integration of RegulonDB
  • RegulonDB has been loaded into EcoCyc
    • RegulonDB originally relational
    • Lisp loader tools developed for relational table dumps
  • Statistics:
    • 528 transcription units
    • 620 promoters
    • 617 DNA binding sites
    • 83 transcription factors
Consistency Checks on RegulonDB Data
  • Find transcription units containing:
    • Undefined components
    • No gene components
    • Genes that are not contiguous
    • Genes with conflicting transcription directions
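The four checks above might look roughly like the following. This is a sketch under stated assumptions, not the checker that was actually run on RegulonDB: the `Gene` record, its `index` field (the gene's ordinal position along the chromosome, used here as a stand-in for contiguity), and the toy data are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Gene:
    name: str
    index: int    # assumed: ordinal position of the gene along the chromosome
    strand: str   # '+' or '-', the direction of transcription


def check_unit(components: List[str], defined: Dict[str, object]) -> List[str]:
    """Return a list of problems found in one transcription unit."""
    problems = []

    # 1. Components that name no object in the database.
    undefined = [c for c in components if c not in defined]
    if undefined:
        problems.append(f"undefined components: {undefined}")

    # 2. Units with no gene among their components.
    genes = [defined[c] for c in components if isinstance(defined.get(c), Gene)]
    if not genes:
        problems.append("no gene components")
        return problems

    # 3. Genes transcribed in different directions.
    if len({g.strand for g in genes}) > 1:
        problems.append("conflicting transcription directions")

    # 4. Genes that do not occupy consecutive positions.
    order = sorted(g.index for g in genes)
    if order != list(range(order[0], order[0] + len(order))):
        problems.append("genes are not contiguous")

    return problems


# Toy database with made-up entries.
defined = {
    "p1": "promoter",
    "g1": Gene("g1", 1, "+"),
    "g2": Gene("g2", 2, "+"),
    "g4": Gene("g4", 4, "-"),
}

clean = check_unit(["p1", "g1", "g2"], defined)
broken = check_unit(["p1", "g1", "g4"], defined)
empty = check_unit(["p1", "zz"], defined)
```

Here `clean` is an empty list, `broken` reports both a direction conflict and a contiguity gap, and `empty` reports an undefined component and a missing gene.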
Interactive Editing Tools
  • SRI created interactive tools for creating and modifying regulatory mechanisms
  • Ongoing updates to RegulonDB occur in EcoCyc
Visualization Capabilities
  • Transcription units
    • Transcription unit containing a gene: araA
    • Details of a transcription unit
  • Regulons: CRP, NARL
  • Pathway control
    • Overview: show reactions controlled by a transcription factor (CRP, FNR); show other reactions controlled by the same TF(s) (e.g., starting from a reaction in purine biosynthesis)
Characterization of the E. coli Genetic Network
  • 551 transcription units include 1115 genes (25%)
  • Controlled by 86 transcription factors
  • All experimentally determined
Visualization of the Full E. coli Genetic Network
  • Influences of transcription factors on other transcription factors
  • 50 of 85 TFs do not affect other TFs
  • Maximum network depth of 3
  • Only CRP has a branching factor greater than 2
  • No feedback loops other than autoregulation
  • Negative auto-regulation is the dominant form of feedback
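Statistics of this kind can be computed from a simple adjacency map of transcription factors. The sketch below runs on a made-up four-node network (A, B, C, D are placeholder names, not E. coli factors), and its definitions are assumptions: "branching factor" is taken as the number of other TFs a TF regulates, and "depth" as the number of TFs on the longest regulatory chain once autoregulation is set aside. The slides do not say how these quantities were defined.

```python
from typing import Dict, Set


def analyze(network: Dict[str, Set[str]]) -> dict:
    """Summarize a TF-to-TF regulatory network given as {regulator: {targets}}."""
    # Autoregulation is a self-edge; record it, then ignore it for the rest.
    autoregulated = {tf for tf, targets in network.items() if tf in targets}
    others = {tf: {t for t in targets if t != tf} for tf, targets in network.items()}

    no_effect = {tf for tf, targets in others.items() if not targets}
    branching = {tf: len(targets) for tf, targets in others.items()}

    depth: Dict[str, int] = {}
    visiting: Set[str] = set()

    def longest(tf: str) -> int:
        # Depth-first search; revisiting a TF still on the stack means a cycle.
        if tf in depth:
            return depth[tf]
        if tf in visiting:
            raise ValueError(f"feedback loop involving {tf}")
        visiting.add(tf)
        d = 1 + max((longest(t) for t in others.get(tf, ())), default=0)
        visiting.discard(tf)
        depth[tf] = d
        return d

    max_depth = max((longest(tf) for tf in others), default=0)
    return {
        "autoregulated": autoregulated,
        "no_effect_on_other_tfs": no_effect,
        "branching": branching,
        "max_depth": max_depth,
    }


# Hypothetical network: A regulates itself, B, and C; B regulates D; D regulates itself.
toy = {"A": {"A", "B", "C"}, "B": {"D"}, "C": set(), "D": {"D"}}
summary = analyze(toy)
```

For the toy network, `summary["max_depth"]` is 3 (the chain A, B, D), C and D affect no other TF, and only A has a branching factor of 2. A network containing a loop between two distinct TFs raises `ValueError`, which is how the sketch would detect feedback other than autoregulation.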