Bio-Ontologies: Their Creation and Design

Dr. Peter Karp

SRI, http://www.ai.sri.com/~pkarp/

Dr. Robert Stevens & Professor Carole Goble

University of Manchester, UK

http://img.cs.man.ac.uk/tambis


Advertisement

The Fourth Annual Bio-Ontologies Meeting

“Sharing Experiences and Spreading Best Practice”

Sponsored by

GlaxoSmithKline Pharmaceuticals

Tivoli Gardens, Copenhagen, Denmark,

26th July 2001

Organised by: Richard Chen, Carole Goble, Robert Stevens, Peter Karp, Pat Hayes, Robin McEntire and Eric Neumann.

http://img.cs.man.ac.uk/stevens/workshop01


Outline

  • What is an ontology?

    • Motivation for ontologies in bioinformatics

    • Definition of an ontology

    • Naming the parts & comparing the types

    • Knowledge representation

  • Building an ontology

    • Methodologies, principles and pitfalls

    • Running example: a macromolecule fragment

    • Ontology Tools

    • Development tools


Ontologies: Definitions, Components, Subtypes


Outline

  • Motivations for ontologies in bioinformatics

  • Definition of ontology

  • Principles and pitfalls of ontology design

  • GKB Editor ontology development tool


Definition of an Ontology

  • Conceptualization of a domain of interest

    • Concepts, relations, attributes, constraints, objects, values

  • An ontology is a specification of a conceptualization

    • Formal notation

    • Documentation

  • A variety of forms, but includes:

    • A vocabulary of terms

    • Some specification of the meaning of the terms

  • Ontologies are defined for reuse


Roles of Ontologies in Bioinformatics

  • Success of many biological DBs depends on

    • High fidelity ontologies

    • Clearly communicating their ontologies

      • Prevent errors on data entry and interpretation

  • Common framework for multidatabase queries

  • Controlled vocabularies for genome annotation

    • Riley ontology, GO

    • EC numbers


Roles of Ontologies in Bioinformatics

  • Information-extraction applications

  • Reuse is a core aspect of ontologies

    • Reuse of existing ontologies faster than designing new ones

    • Reuse decreases semantic heterogeneity of DBs

  • Schema-driven Software

    • Knowledge-acquisition tools

    • Query tools


Definitions

  • Data Model:

    • Primitive data structuring mechanism in which an ontology is expressed

    • Relational data model, object-oriented data model, frame data model

  • Ontology:

    • Domain specific conceptualization expressed within some data model


Components of an Ontology

  • Concepts

    • AKA: Class, Set, Type, Predicate

    • Gene, Reaction, Macromolecule

  • Taxonomy of concepts

    • Generalization ordering among concepts

    • Concept A is a parent of concept B iff every instance of B is also an instance of A

    • Superset / subset

    • “A kind of” vs “a part of”
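The parent/child rule above can be sketched in Python by treating each concept as the set of its known instances. The concept names and instance sets are illustrative assumptions, and a real taxonomy asserts these links rather than deriving them from a finite list of members.

```python
# Toy model: a concept is represented by the set of its known instances.
# All names and memberships here are made up for illustration.
instances = {
    "Macromolecule": {"trpA-protein", "lacZ-mRNA", "glycogen-1"},
    "Protein": {"trpA-protein"},
    "RNA": {"lacZ-mRNA"},
}

def is_parent(a: str, b: str) -> bool:
    """A is a parent of B iff every instance of B is also an instance of A."""
    return instances[b] <= instances[a]

print(is_parent("Macromolecule", "Protein"))  # True
print(is_parent("Protein", "RNA"))            # False
```

The subset test captures "a kind of" only; "a part of" is a different relation and would need its own links.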


Taxonomy of Concepts


Components of an Ontology

  • Objects

    • AKA: Instances, members of the set

    • trpA Gene, Reaction 1.1.2.4

    • Strictly speaking, an ontology with instances is a knowledge base

  • Relations and Attributes

    • AKA: Slots, properties

    • Product of Gene, Map-Position of Gene

    • Reactants of Reaction, Keq of Reaction

  • Values

    • The Product of the trpA Gene is tryptophan-synthetase

    • trpA.Product = tryptophan-synthetase
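As a rough sketch, the object/slot/value vocabulary above maps onto a small Python frame structure. The `Frame` class is a hypothetical illustration, not the data model of any particular frame system, and the map position is an invented value.

```python
from dataclasses import dataclass, field

@dataclass
class Frame:
    """A minimal frame: a named object with slots (relations/attributes)."""
    name: str
    instance_of: str
    slots: dict = field(default_factory=dict)

# Hypothetical encoding of the slide's example.
trpA = Frame(name="trpA", instance_of="Gene")
trpA.slots["Product"] = "tryptophan-synthetase"
trpA.slots["Map-Position"] = 28.3  # illustrative value only

print(trpA.slots["Product"])  # tryptophan-synthetase
```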


Components of an Ontology

  • Constraints and other meta information about relations

    • Slot Product:

      • Value type: Polypeptide or RNA

      • Domain: Genes

    • Slot Map-Position:

      • Value type: Number

      • Domain: Genes

      • Cardinality: At-Most 1

      • Range: 0 <= X <= 100

  • General Axioms

    • Nucleic acids < 20 residues are oligonucleotides
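The slot constraints listed above (value type, domain, cardinality, range) can be checked mechanically. The sketch below is one possible encoding under stated assumptions: `SlotDef` and `violations` are hypothetical names, and Python types stand in for the ontology's value types.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class SlotDef:
    """Meta-information about a slot, mirroring the constraints listed above."""
    name: str
    domain: str                     # class whose instances may carry the slot
    value_type: tuple               # Python types standing in for the value type
    at_most: Optional[int] = None   # cardinality upper bound, if any
    value_range: Optional[Tuple[float, float]] = None  # inclusive numeric bounds

def violations(slot: SlotDef, frame_class: str, values: list) -> list:
    """Return human-readable constraint violations for one slot on one frame."""
    problems = []
    if frame_class != slot.domain:
        problems.append(f"{slot.name}: domain is {slot.domain}, got {frame_class}")
    if slot.at_most is not None and len(values) > slot.at_most:
        problems.append(f"{slot.name}: at most {slot.at_most} value(s) allowed")
    for v in values:
        if not isinstance(v, slot.value_type):
            problems.append(f"{slot.name}: {v!r} has the wrong value type")
        elif slot.value_range and not (slot.value_range[0] <= v <= slot.value_range[1]):
            problems.append(f"{slot.name}: {v!r} is outside {slot.value_range}")
    return problems

map_position = SlotDef("Map-Position", domain="Genes", value_type=(int, float),
                       at_most=1, value_range=(0, 100))

print(violations(map_position, "Genes", [28.3]))       # []
print(violations(map_position, "Genes", [28.3, 150]))  # cardinality and range problems
```

Schema-driven tools can run this kind of check at data entry, which is one way an ontology prevents errors.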


More on Concepts

  • Primitive: properties are necessary

    • Globular protein must have hydrophobic core, but a protein with a hydrophobic core need not be a globular protein

  • Defined: properties are necessary + sufficient

    • Eukaryotic cells must have a nucleus. Every cell that contains a nucleus must be Eukaryotic.
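The difference between primitive and defined concepts can be illustrated with a small sketch. Properties are plain strings and both concept descriptions are toy assumptions; the point is only that membership of a defined concept can be inferred, while a primitive concept can only be checked.

```python
# Properties are plain strings; both concept descriptions are toy assumptions.
PRIMITIVE = {"GlobularProtein": {"has-hydrophobic-core"}}   # necessary only
DEFINED = {"EukaryoticCell": {"is-cell", "has-nucleus"}}    # necessary and sufficient

def consistent_with(concept: str, props: set) -> bool:
    """Primitive concept: asserted membership is checked against necessary properties."""
    return PRIMITIVE[concept] <= props

def recognise(props: set) -> list:
    """Defined concept: membership can be inferred from the properties alone."""
    return [c for c, conds in DEFINED.items() if conds <= props]

print(recognise({"is-cell", "has-nucleus", "has-mitochondria"}))  # ['EukaryoticCell']
print(recognise({"has-hydrophobic-core"}))                        # []
```

Nothing is ever recognised as a `GlobularProtein`, because a hydrophobic core is necessary but not sufficient.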


Ontology Subtypes Expressiveness

  • Controlled vocabulary

    • List of terms

  • Taxonomy

    • Terms in a generalization hierarchy

  • DB schemas (relational and object-oriented)

    • More implementation specific

    • No instance information

    • Limited constraints

  • Frame knowledge bases

  • Description Logics


Ontology Subtypes

  • Database schema

    • Concepts, relations, constraints

    • Perhaps no taxonomy

    • At most hundreds of concepts

  • Taxonomy

    • Concepts, taxonomy, perhaps a few relations

    • Thousands of concepts

  • Knowledge base

    • Concepts, relations, constraints, objects, values

    • Hundreds to hundreds of thousands of concepts and objects


Ontology Subtypes

  • Generic (a.k.a. upper, core or reference)

    • common high level concepts

    • “Physical”, “Abstract”, “Structure”, “Substance”

    • useful for ontology re-use

    • important when generating or analysing natural language expressions

  • Domain-oriented

    • domain specific (e.g. E.coli)

    • domain generalisations (e.g. gene function)

  • Task-oriented

    • task specific (e.g. annotation analysis)

    • task generalisations (e.g. problem solving)


Knowledge Representation

  • Ontologies are best delivered in some computable representation

  • Variety of choices with different:

    • Expressiveness

      • The range of constructs that can be used to formally, flexibly, explicitly and accurately describe the ontology

    • Ease of use

    • Computational complexity

      • Is the language computable in real time?

    • Rigour

      • Satisfiability and consistency of the representation

      • Systematic enforcement mechanisms

    • Unambiguous, clear and well defined semantics

      • “A subclassOf B”: don’t be fooled by syntax!


Languages

  • Vocabularies using natural language

    • Hand crafted, flexible but difficult to evolve, maintain and keep consistent, with poor semantics

    • Gene Ontology

  • Object-based KR: frames

    • Extensively used, good structuring, intuitive. Semantics defined by OKBC standard

    • EcoCyc (uses Ocelot) and RiboWeb (uses Ontolingua)

  • Logic-based: Description Logics

    • Very expressive, model is a set of theories, well defined semantics

    • Automatic derived classification taxonomies

    • Concepts are defined and primitive

    • Expressivity vs. computational complexity balance

    • TAMBIS Ontology (uses FaCT)


Vocabularies: Gene Ontology

  • Hand crafted with simple tree-like structures

  • Position of each concept and its relationships wholly determined by a person

  • Flexible but…

  • Maintenance and consistency preservation difficult and arduous

  • Poor semantics

  • Single hierarchies are limiting


Description Logics

  • Describe knowledge in terms of concepts and relations

  • Concept defined in terms of other roles and concepts

    • Enzyme = protein which catalyses reaction

    • Reason that enzyme is a kind of protein

  • Model built up incrementally and descriptively

  • Uses logical reasoning to figure out:

    • Automatically derived (and evolved) classifications

    • Consistency -- concept satisfaction


Frames and Logics

  • Frames

    • Rich set of language constructs

    • Impose restrictive constraints on how they are combined or used to define a class

    • Only support primitive concepts

    • Taxonomy hand-crafted

  • Description logics

    • Limited set of language constructs

    • Primitives combined to create defined concepts

    • Taxonomy for defined concepts established through logical reasoning

    • Expressivity vs. computational complexity

    • Less intuitive

  • Ideal: both! Current OIL activity uses a mixture. Logics provide reasoning services for frame schemes.


Ontology Exchange

  • To reuse an ontology we need to share it with others in the community

  • Exchanging ontologies requires a language with:

    • common syntax

    • clear and explicit shared meaning

  • Tools for parsing, delivery, visualising etc

  • Exchanging the structure, semantics or conceptualisation?


Ontology Exchange Languages

[Diagram: OIL draws on frames (modelling primitives, OKBC), Description Logics (formal semantics and reasoning support) and Web languages (XML- and RDF-based syntax)]

  • XOL eXtensible Ontology Language

    • XML markup

    • Frame based

    • Rooted in OKBC

    • http://www.ai.sri.com/pkarp/xol/

  • OIL: Ontology Interface Layer / Ontology Inference Layer

    • Gives a semantics to RDF-Schema

    • http://www.ontoknowledge.org/oil


OIL: Ontology Metadata (Dublin Core)

Ontology-container

title “macromolecule fragment”

creator “robert stevens”

subject “macromolecule generic ontology”

description “example for a tutorial”

description.release “2.0”

publisher “R Stevens”

type “ontology”

format “pseudo-xml”

identifier “http://www.ontoknowledge.org/oil/oil.pdf”

source “http://img.cs.man.ac.uk/stevens/tambis-oil.html”

language “OIL”

language “en-uk”

relation.haspart “http://www.ontoRus.com/bio/mmole.onto”


The Three Roots of OIL

  • Description Logics: formal semantics and reasoning support

  • Frame-based systems: epistemological modelling primitives

  • Web languages: XML- and RDF-based syntax


OIL primitive ontology definitions

    slot-def has-backbone
      inverse is-backbone-of
    slot-def has-component
      inverse is-component-of
      properties transitive
    class-def nucleic-acid
    class-def rna
      subclass-of nucleic-acid
      slot-constraint has-backbone
        value-type ribophosphate
    class-def ribophosphate
    class-def deoxyribophosphate
      subclass-of NOT ribophosphate


OIL defined ontology definitions

    class-def defined dna
      subclass-of nucleic-acid AND NOT rna
      slot-constraint has-backbone
        value-type deoxyribophosphate
    class-def defined enzyme
      subclass-of protein
      slot-constraint catalyse
        has-value reaction
    class-def defined kinase
      subclass-of protein
      slot-constraint catalyse
        has-value phosphorylation-reaction
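Because enzyme and kinase are defined classes, a reasoner can work out that every kinase is an enzyme even though nobody asserted it. The sketch below shows the idea with a simple structural comparison; it is far simpler than the tableau reasoning of a system such as FaCT, and the link from phosphorylation-reaction to reaction is an assumption added for the example.

```python
# Hand-asserted taxonomy of primitive classes (child -> parent); an assumption
# made for this sketch, since the slide does not list these links.
PARENT = {"phosphorylation-reaction": "reaction"}

def subsumed_by(child, parent) -> bool:
    """True if `child` equals `parent` or reaches it through asserted parent links."""
    while child is not None:
        if child == parent:
            return True
        child = PARENT.get(child)
    return False

# Defined classes: (superclass, {slot: filler class required by has-value}).
DEFINED = {
    "enzyme": ("protein", {"catalyse": "reaction"}),
    "kinase": ("protein", {"catalyse": "phosphorylation-reaction"}),
}

def defined_subsumes(general: str, specific: str) -> bool:
    """Every `specific` is a `general` if it meets all of general's conditions."""
    g_super, g_slots = DEFINED[general]
    s_super, s_slots = DEFINED[specific]
    return subsumed_by(s_super, g_super) and all(
        slot in s_slots and subsumed_by(s_slots[slot], filler)
        for slot, filler in g_slots.items()
    )

print(defined_subsumes("enzyme", "kinase"))  # True
print(defined_subsumes("kinase", "enzyme"))  # False
```

The derived link kinase → enzyme is what the slides call an automatically derived classification.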


OIL in XML

  • OIL has a DTD, an XML Schema and a mapping to RDF-Schema. See web site for details

    <slot-def>
      <slot name="has-component"/>
      <inverse> <slot name="is-component-of"/> </inverse>
      <properties> <transitive/> </properties>
    </slot-def>

    <class-def> <class name="nucleic-acid"/> </class-def>

    <class-def>
      <class name="rna"/>
      <subclass-of> <class name="nucleic-acid"/> </subclass-of>
      <slot-constraint>
        <slot name="has-backbone"/>
        <value-type> <class name="ribophosphate"/> </value-type>
      </slot-constraint>
    </class-def>
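Once an ontology is serialised as XML, generic tooling can read it. The sketch below parses a well-formed rendering of the rna definition with Python's standard library. The wrapping `<ontology>` element and the exact element and attribute names are assumptions for illustration and may differ from the official OIL DTD.

```python
import xml.etree.ElementTree as ET

# Assumed well-formed rendering of the rna definition; element and attribute
# names here are illustrative and may differ from the official OIL DTD.
OIL_XML = """
<ontology>
  <class-def>
    <class name="rna"/>
    <subclass-of><class name="nucleic-acid"/></subclass-of>
    <slot-constraint>
      <slot name="has-backbone"/>
      <value-type><class name="ribophosphate"/></value-type>
    </slot-constraint>
  </class-def>
</ontology>
"""

root = ET.fromstring(OIL_XML)
for class_def in root.findall("class-def"):
    name = class_def.find("class").get("name")
    parents = [c.get("name") for c in class_def.findall("subclass-of/class")]
    constraints = {
        sc.find("slot").get("name"): sc.find("value-type/class").get("name")
        for sc in class_def.findall("slot-constraint")
    }
    print(name, parents, constraints)
# rna ['nucleic-acid'] {'has-backbone': 'ribophosphate'}
```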


OIL Remarks

  • Tools:

    • Protégé II editor

    • FaCT reasoner

  • Other projects:

    • Semantic Web projects (http://www.semanticweb.org)

    • Agents for the web projects (e.g. DAML)

      A knowledge representation language and inference mechanism for the web


OIL Features

  • Based on standard frame languages

  • Extends expressive power with DL style logical constructs

    • Still has frame look and feel

    • Can still function as a basic frame language

  • OILcore language restricted in some respects so as to allow for reasoning support

    • No constructs with ill defined semantics

    • No constructs that compromise decidability

  • Has both XML and RDF(S) based syntax


OIL Features

  • Semantics clearly defined by mapping to very expressive Description Logic, e.g.:

    • slot-constraint reverse-transcribe-from has-value mRNA or (part-of has-value mRNA)

    • ∃eats.meat ⊓ ∃eats.fish

  • Note the importance of clear semantics:

    • ∃eats.(meat ⊓ fish) is inconsistent (assuming meat and fish are disjoint)

  • Mapping can also be used to provide reasoning support from a Description Logic system (e.g., FaCT)
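The eats example turns on where the conjunction sits. Assuming the first reading requires one meat filler and one fish filler, while the second requires a single filler that is both meat and fish, a toy model makes the difference concrete. The food items and their classes are invented for the sketch.

```python
# Toy interpretation: which classes each food item belongs to (assumed data).
CLASSES = {"steak": {"meat"}, "cod": {"fish"}}

def eats_some(eaten: set, cls: str) -> bool:
    """Existential restriction: at least one eaten item belongs to `cls`."""
    return any(cls in CLASSES[item] for item in eaten)

def eats_some_both(eaten: set, cls_a: str, cls_b: str) -> bool:
    """Existential over a conjunction: one single item belongs to both classes."""
    return any({cls_a, cls_b} <= CLASSES[item] for item in eaten)

diet = {"steak", "cod"}
print(eats_some(diet, "meat") and eats_some(diet, "fish"))  # True
print(eats_some_both(diet, "meat", "fish"))                 # False
```

With meat and fish disjoint, no item can ever satisfy the second reading, which is why a class constrained that way has no possible instances.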


Why Reasoning Support?

  • Key feature of OIL core language is availability of reasoning support

  • Reasoning intended as design support tool

    • Check logical consistency of classes

    • Compute implicit class hierarchy

  • May be less important in small local ontologies

    • Can still be useful tool for design and maintenance

    • More important with larger ontologies/multiple authors

  • Valuable tool for integrating and sharing ontologies

    • Use definitions/axioms to establish inter-ontology relationships

    • Check for consistency and (unexpected) implied relationships

    • Already shown to be useful technique for DB schema integration


Classifying by Reasoning


Finding Inconsistencies


Changing Classifications


DAML+OIL

  • OIL merged with DAML

  • Originally retained frame syntax

  • DAML more concerned with deployment than with building and managing

  • OIL mapped to DAML+OIL, but not reliably reversed

  • FRAME look and feel may return

  • Web ontology language


Building Ontologies


Building Ontologies

  • No field of Ontological Engineering equivalent to Knowledge or Software Engineering;

  • No standard methodologies for building ontologies;

  • Such a methodology would include:

    • a set of stages that occur when building ontologies;

    • guidelines and principles to assist in the different stages;

    • an ontology life-cycle which indicates the relationships among stages.

  • Gruber's guidelines for constructing ontologies are well known.


The Development Lifecycle

  • Two kinds of complementary methodologies emerged:

    • Stage-based, e.g. TOVE [Uschold96]

    • Iterative evolving prototypes, e.g. MethOntology [Gomez Perez94].

  • Most have TWO stages:

    • Informal stage

      • ontology is sketched out using either natural language descriptions or some diagram technique

    • Formal stage

      • ontology is encoded in a formal knowledge representation language, that is machine computable

  • An ontology should ideally be communicated to people and unambiguously interpreted by software

    • the informal representation helps the former

    • the formal representation helps the latter.


A Provisional Methodology

  • A skeletal methodology and life-cycle for building ontologies;

  • Inspired by the software engineering V-process model;

  • The overall process moves through a life-cycle.

The left side charts the processes in building an ontology

The right side charts the guidelines, principles and evaluation used to ‘quality assure’ the ontology


The V-model Methodology

[Diagram: V-model pairing each building stage with its quality-assurance activity; the diagram also carries the label “Ontology in Use”]

  • User Model: identify purpose and scope; knowledge acquisition

    • Evaluation: coverage, verification, granularity

  • Conceptualisation Model: conceptualisation; integrating existing ontologies

    • Conceptualisation principles: commitment, conciseness, clarity, extensibility, coherency

  • Implementation Model: encoding; representation

    • Encoding/representation principles: encoding bias, consistency, house styles and standards, reasoning system exploitation


The ontology building life-cycle

Identify purpose and scope

Knowledge acquisition

Building

Language and representation

Conceptualisation

Integrating existing ontologies

Available development tools

Encoding

Evaluation


User Model: Identify purpose and scope

  • Decide what applications the ontology will support

  • EcoCyc: Pathway engineering, qualitative simulation of metabolism, computer-aided instruction, reference source

  • TAMBIS: retrieval across a broad range of bioinformatics resources

  • The use to which an ontology is put affects its content and style

  • Impacts re-usability of the ontology


User Model: Knowledge Acquisition

  • Specialist biologists; standard text books; research papers and other ontologies and database schema.

  • Motivating scenarios and informal competency questions – informal questions the ontology must be able to answer

  • Evaluation:

    • Fitness for purpose

    • Coverage and competency


Ontology Scenario

  • A molecule ontology;

  • Describes the molecules stored in bioinformatics databases and annotated therein;

  • It should cover the molecules and other chemicals described in the resources;

  • The ontology will be used for querying and annotating information in bioinformatics resources.


Competency Questions

  • Cover the macromolecules found in molecular biology resources and courses;

  • Should accommodate various views on the macromolecules;

  • Should cover the queries people want to ask of macromolecules;

  • In reality, need more detail on these questions, e.g. “give me tRNA genes with anticodon x, from aardvark”.
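A detailed competency question can be written down as an executable query, which shows at once which concepts and slots the ontology has to provide. The sketch below does this for the tRNA question. The gene records, slot names and anticodons are made up for illustration.

```python
# A tiny, made-up knowledge base; real entries would come from curated sources.
GENES = [
    {"name": "tRNA-Ala-1", "product_type": "tRNA", "anticodon": "AGC", "species": "aardvark"},
    {"name": "tRNA-Gly-1", "product_type": "tRNA", "anticodon": "GCC", "species": "aardvark"},
    {"name": "tRNA-Ala-2", "product_type": "tRNA", "anticodon": "AGC", "species": "mouse"},
]

def trna_genes(anticodon: str, species: str) -> list:
    """Competency question: tRNA genes with a given anticodon from a given species."""
    return [
        g["name"] for g in GENES
        if g["product_type"] == "tRNA"
        and g["anticodon"] == anticodon
        and g["species"] == species
    ]

print(trna_genes("AGC", "aardvark"))  # ['tRNA-Ala-1']
```

To answer the question at all, the ontology needs a gene concept, a product type, an anticodon attribute and a species relation. Surfacing such requirements is the purpose of the exercise.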


Acquiring Knowledge

  • Find your knowledge!

  • An important source is your head, but…

  • Use text books, glossaries (many of which lie on the web) and domain experts;

  • Use other ontologies – what did they include and how did they do it?

  • Record your sources of knowledge.

  • Use your competency questions;


Starting Concept List

  • Chemicals – atom, ion, molecule, compound, element;

  • Molecular-compound, ionic-compound, ionic-molecular-compound, …;

  • Ionic-macromolecular-compound and ionic-small-macromolecular-compound;

  • Protein, peptide, polyprotein, enzyme, holo-protein, apo-protein,…

  • Nucleic acid – DNA, RNA, tRNA, mRNA, snRNA, …


Conceptualisation Model: Conceptualisation

  • Identify the key concepts, their properties and the relationships that hold between them;

    • Which ones are essential?

    • What information will be required by the applications?

  • Structure domain knowledge into explicit conceptual models.

  • Identify natural language terms to refer to such concepts, relations and attributes;


Conceptualisation Sketch

[Diagram: hierarchy sketch starting from Chemical, with nodes Molecule, Compound, Element, Ion, Atom, Molecular Compound, Ionic Compound, Molecular Element, Ionic Molecule, Non-Metal, Metal, Ionic Molecular Compound and Metalloid]


Molecule Conceptualisation Sketch

[Diagram: hierarchy sketch with nodes Ionic Macromolecular Compound, Macromolecule, Small Molecule, Polysaccharide, Protein, Nucleic Acid, Peptide, Starch, Glycogen, Enzyme, DNA, RNA, snRNA, mRNA, tRNA and rRNA]


Conceptualisation Model: Naming

  • Determine naming conventions

    • Consistent naming for classes and slots

    • EcoCyc:

      • Classes are capitalized, hyphenated, plural

      • Slot names are uppercase

  • A quality ontology captures relevant biological distinctions with high fidelity


Conceptualisation Model: Pitfalls

  • Pitfall: Missing ontological elements

    • Missing classes: Swiss-Prot Protein complexes

    • Lack of Lipid and Cofactor in example ontology

    • Missing attributes: Genetic code identifier

    • Confuse 1:1 with 1:Many, or 1:Many with Many:Many

      • Cofactor as an attribute of reaction as well as protein

    • Important data is stored within text/comment fields

  • Pitfall: Extra ontological elements

  • Pitfall: Stop over-elaborating – when do I stop?

  • Pitfall: Relevance – do I really need all this detail?


Conceptualisation: Partonomy

  • Part-of relationships very important

  • Several kinds of part-of: component-of, region-of, mixture-of

  • Alpha-helix is a region of a protein, but a protein is a component of a complex

  • Care in placing transitivity
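One way to take care with transitivity is to tag every part-of link with its kind and chain links only for kinds declared transitive. The sketch below assumes, for illustration, that component-of is transitive and region-of is not. The organelle link is invented.

```python
# Asserted part-of links, each tagged with its kind (toy data for illustration).
LINKS = [
    ("alpha-helix", "region-of", "protein"),
    ("protein", "component-of", "complex"),
    ("complex", "component-of", "organelle"),
]
TRANSITIVE_KINDS = {"component-of"}  # assumption: only this kind is transitive

def wholes(part: str, kind: str) -> set:
    """Follow links of one kind; chain them only if that kind is transitive."""
    found, frontier = set(), [part]
    while frontier:
        current = frontier.pop()
        for src, k, dst in LINKS:
            if src == current and k == kind and dst not in found:
                found.add(dst)
                if kind in TRANSITIVE_KINDS:
                    frontier.append(dst)
    return found

print(sorted(wholes("protein", "component-of")))      # ['complex', 'organelle']
print(sorted(wholes("alpha-helix", "component-of")))  # []
print(sorted(wholes("alpha-helix", "region-of")))     # ['protein']
```

Keeping the kinds separate stops the model from concluding that an alpha-helix is a component of a complex.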


Integrating Existing Ontologies

  • Reuse or adapt existing ontologies when possible

    • Save time

    • Correctness

    • Facilitate interoperation

    • Reuse GO to give example ontology Function, Process and Location

  • Integration of ontologies

    • Ontologies have to be aligned

    • Hindered by poor documentation and argumentation

    • Hindered by implicit assumptions

    • Shared generic upper level ontologies should make integration easier


Encoding: Implementation Toolkit

  • Construct ontology using an ontology-development system

    • Does the data model have the right expressivity?

      • Is it just a taxonomy or are relationships needed?

      • Is multiple parentage needed? Inverse relationships?

      • What types of constraints are needed?

    • Are reasoning services needed?

    • What are authoring features of the development tool?

    • Can ontology be exported to a DBMS schema?

    • Can ontology be exported to an ontology exchange language?

    • Is simultaneous updating by multiple authors needed?

    • Size limitations of development tool?


Encoding

Encode sketch in KRL;

  • Use OIL – a frame syntax with reasoning support if we want it;

  • Wide range of expressivity (see cofactor example later);

  • Hand craft a hierarchy – implement the sketch made earlier;

  • This hand-crafted version can be migrated to a more descriptive form later.


Initial Encoding

    class-def chemical
      subclass-of substance
    class-def molecule
      subclass-of chemical
    class-def compound
      subclass-of chemical
    class-def molecular-compound
      subclass-of molecule and compound
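The hand-crafted hierarchy above is just a set of asserted subclass-of links, so it can be mirrored in a small Python mapping and queried. This is a sketch of the structure only. It performs no reasoning beyond following the asserted links.

```python
# The hand-crafted hierarchy from the slide: class -> list of asserted parents.
SUBCLASS_OF = {
    "chemical": ["substance"],
    "molecule": ["chemical"],
    "compound": ["chemical"],
    "molecular-compound": ["molecule", "compound"],  # multiple parentage
}

def ancestors(cls: str) -> set:
    """All classes reachable by following subclass-of links upwards."""
    result = set()
    for parent in SUBCLASS_OF.get(cls, []):
        result.add(parent)
        result |= ancestors(parent)
    return result

print(sorted(ancestors("molecular-compound")))
# ['chemical', 'compound', 'molecule', 'substance']
```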


Encoding: Ontology Implementation Pitfalls

  • Pitfall: Semantic ambiguity

    • Multiple ways to encode the same knowledge

    • Meaning of class definitions unclear

  • Pitfall: Encoding Bias

    • Encoding the ontology changes the ontology


Encoding: Ontology Implementation Pitfalls

  • Pitfall: Redundancy (lack of normalization)

    • Exact same information repeated

    • Presence of computationally derivable information

      • Date of birth and age

      • Sequence length

      • DNA sequence and reverse complement

    • More effort required for entry and update

    • In KB partial updates lead to inconsistency

    • OK if redundant information is maintained automatically
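One way to avoid this kind of redundancy is to store only the primary datum and compute derivable values on demand. The sketch below does this for sequence length and reverse complement. The `DnaSequence` class is a hypothetical illustration and handles only the four unambiguous bases.

```python
from dataclasses import dataclass

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

@dataclass(frozen=True)
class DnaSequence:
    """Stores only the primary datum; derivable values are computed on demand."""
    residues: str

    @property
    def length(self) -> int:
        return len(self.residues)

    @property
    def reverse_complement(self) -> str:
        return self.residues.translate(_COMPLEMENT)[::-1]

seq = DnaSequence("ATGCCG")
print(seq.length)              # 6
print(seq.reverse_complement)  # CGGCAT
```

Since nothing derived is stored, a partial update cannot leave the length or the reverse complement out of step with the sequence.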


Encoding: The Interaction Problem

  • Task influences what knowledge is represented and how it is represented

    • Molecular biology: chemical and physical properties of proteins

    • Bioinformatics: accession number, function, gene

    • Underlying perspectives mean they may not be reconcilable

  • If an ontology has too many conflicting tasks it can end up compromised – TaO experience


Evaluate it - A guide for reusability

  • Conciseness

    • No redundancy

    • Appropriateness – protein molecules at the atomic resolution when amino acid level would do

  • Clarity

  • Consistency

    • Satisfiability – it doesn’t contradict itself

    • Molecule and Compound disjoint, but molecular-compound is (molecule and compound)

  • Commitment

    • Do I have to buy into a load of stuff I don’t really need or want just to get the bit I do?
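The satisfiability example can be checked mechanically: a class that inherits from two classes declared disjoint can have no instances. The sketch below encodes just that one rule with hypothetical axioms. A Description Logic reasoner would detect many more kinds of contradiction.

```python
from itertools import combinations

# Hypothetical axioms reproducing the slide's example.
DISJOINT = {frozenset({"molecule", "compound"})}
SUBCLASS_OF = {"molecular-compound": ["molecule", "compound"]}

def ancestors(cls: str) -> set:
    """All asserted superclasses of `cls`, direct or inherited."""
    result = set()
    for parent in SUBCLASS_OF.get(cls, []):
        result |= {parent} | ancestors(parent)
    return result

def unsatisfiable(cls: str) -> bool:
    """A class is unsatisfiable if two of its superclasses are declared disjoint."""
    supers = ancestors(cls) | {cls}
    return any(frozenset(pair) in DISJOINT for pair in combinations(supers, 2))

print(unsatisfiable("molecular-compound"))  # True
print(unsatisfiable("molecule"))            # False
```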


Documentation: Make Ontology Understandable!

  • Produce clear informal and formal documentation

    • An ontology that cannot be understood will not be reused

    • Genbank feature table

    • NCBI ASN.1 definitions

  • There exists a space of alternative ontology design decisions

    • Semantics / Granularity

    • Terminology

  • Pitfall: Neglecting to record design rationale


Molecules Revisited

[Diagram: hierarchy sketch with nodes Non-Ionic Macromolecular Compound, Ionic Macromolecular Compound, Macromolecule, Small Molecule, Polysaccharide, Protein, Nucleic Acid, Peptide, Starch, Glycogen, Enzyme, DNA, RNA, snRNA, mRNA, tRNA and rRNA]


More Encoding

    class-def chemical
      subclass-of substance
    class-def defined molecule
      subclass-of chemical
      slot-constraint contains-bond min-cardinality 1 has-value covalent-bond
    class-def defined compound
      subclass-of chemical
      slot-constraint has-atom-types greater-than 1
    class-def defined molecular-compound
      subclass-of molecule and compound
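With molecule and compound now defined rather than primitive, individual chemicals can be classified from their descriptions instead of being placed by hand. The sketch below applies the two conditions directly. The bond counts and atom types are simplified chemistry, assumed for illustration only.

```python
def classify(covalent_bonds: int, atom_types: set) -> set:
    """Apply the defined-class conditions to one chemical's description."""
    classes = {"chemical"}
    if covalent_bonds >= 1:            # contains-bond min-cardinality 1 covalent-bond
        classes.add("molecule")
    if len(atom_types) > 1:            # has-atom-types greater-than 1
        classes.add("compound")
    if {"molecule", "compound"} <= classes:
        classes.add("molecular-compound")
    return classes

# Simplified chemistry, assumed for illustration only.
print(sorted(classify(2, {"H", "O"})))    # water
print(sorted(classify(1, {"O"})))         # dioxygen
print(sorted(classify(0, {"Na", "Cl"})))  # sodium chloride
```

Water comes out as a molecular-compound, dioxygen as a molecule only, and sodium chloride as a compound only.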


Cofactor Knowledge

  • Gather knowledge about cofactors, coenzymes and prosthetic groups from glossaries and dictionaries etc.

  • Note that definitions are inconsistent and even contradictory.

  • Synthesise knowledge and make judgements.


Encoding Cofactor

    class-def defined cofactor
      subclass-of metal-ion or small-organic-molecule
      slot-constraint binds-to has-value protein
    class-def defined coenzyme
      subclass-of cofactor
      slot-constraint binds-loosely-to has-value protein
    class-def defined prosthetic-group
      subclass-of cofactor and (not metal-ion)
      slot-constraint binds-strongly-to has-value protein


Cofactor Discussion

  • Classifies as a kind of chemical;

  • Taken from IUPAC definition – document – not a child of organic-molecule and metal-ion;

  • Can express both disjunction and negation in OIL;

  • Uses a slot hierarchy in describing binds-to.
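The slot hierarchy matters because a chemical asserted to bind loosely or strongly to a protein must also count as binding to it. A rough sketch of that inference follows, assuming both specialised slots sit directly under binds-to. The `classify` function is a hypothetical stand-in for what a reasoner would derive from the cofactor definitions.

```python
# Slot hierarchy: each specialised slot implies its parent slot (assumed layout).
SUPER_SLOT = {"binds-loosely-to": "binds-to", "binds-strongly-to": "binds-to"}

def implied_slots(slot: str) -> set:
    """The slot itself plus every slot it specialises."""
    out = {slot}
    while slot in SUPER_SLOT:
        slot = SUPER_SLOT[slot]
        out.add(slot)
    return out

def classify(kind: str, slot: str, target: str) -> set:
    """Toy recognition of cofactor, coenzyme and prosthetic-group."""
    slots = implied_slots(slot)
    found = set()
    if (kind in {"metal-ion", "small-organic-molecule"}
            and "binds-to" in slots and target == "protein"):
        found.add("cofactor")
    if "cofactor" in found and "binds-loosely-to" in slots:
        found.add("coenzyme")
    if "cofactor" in found and kind != "metal-ion" and "binds-strongly-to" in slots:
        found.add("prosthetic-group")
    return found

print(sorted(classify("small-organic-molecule", "binds-strongly-to", "protein")))
# ['cofactor', 'prosthetic-group']
print(sorted(classify("metal-ion", "binds-strongly-to", "protein")))
# ['cofactor']
```

The second call shows the negation at work: a strongly bound metal ion is a cofactor but not a prosthetic group.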


More Discussion

  • Can we define sufficiency conditions for peptide?

  • Mass and length are not easy to use in definition – A protein is > 100 kDa;

  • What about a 99 kDa protein?


Publish the Ontology

  • Formal and informal specifications

  • Intended domain of application

  • Design rationale

  • Limitations

  • See EcoCyc paper in ISMB-93/Bioinformatics 00

  • See TAMBIS paper in Bioinformatics 99


Ontological Pitfalls

  • Stop-over – when do I stop over elaborating?

    • Proteins → amino acid residues → side chains → physical chemical properties …

  • Relevance

    • Do we need to mention all the types of nucleic acid?


Ontology-Development Tools


Ontology Development Tools

  • Development environments

  • Ontology Libraries

  • Ontology publishing and exchange

    • Across all representational forms (logic, frame, etc..)

    • Web compliant

  • Ontology delivery

    • Ontology servers


Development Environments

    • Considerations depend on ontology subtype!

      • Expressiveness of data model

      • Authoring features

      • DBMS export capabilities

      • Ontology-exchange language export capabilities

      • Distributed authoring

      • Size limitations

    • WebOnto

    • Ontosaurus

    • GKB Editor

    • Protégé II

    • Ontolingua

    • GRAIL toolkit etc…

    • Wondertools


GKB Editor: Ontology Development Toolkit

    • Graphical editor for KBs and ontologies

    • Ontologies stored in Ocelot object-oriented knowledge base

      • Expressive, scalable, distributed

      • EcoCyc ontology contains 1K classes, 15K instances

    • Knowledge is graphically portrayed in 3 viewers

    • All operations are schema driven

    • See http://www.ai.sri.com/~gkb/user-man.html


Ocelot Capabilities

    • Frame data model

    • KBs and ontologies stored in files or Oracle

    • Oracle KBs and ontologies:

      • Better scalability -- frame faulting on demand and in background

      • Concurrency control system coordinates changes by multiple users

      • Transaction logging (recall operation history)

    • GFP API provides programmatic interface


Distributed Ontology Development

[Diagram: Users 1 to 4 connect over the Internet to a shared Oracle server]


GKB Editor

    • Taxonomy Viewer

      • Create/delete classes and instances

      • Browse class taxonomy

      • Alter class/subclass links

    • Frame editor

      • Add/remove slots to/from classes

      • Create/delete/edit slot values for instances

    • Frame relationships viewer

      • View and update a network of relationships among instances


Summary

    • A definition of ontology as a characterisation of conceptualisation -- capturing the things we know about a domain;

    • The knowledge within an ontology can be applied to a variety of tasks;

    • Building an ontology -- process and life-cycle;

    • Influences on the choice of encoding language;

    • The desirability of tools for the building, management and exchange of ontologies;


Final remarks

    • The use of ontologies is growing within the bio-molecular world

    • They are a high-cost, but high-benefit solution to a variety of problems confronting the bioinformatics community.


Some References (1)

Review

  • R. Stevens, C.A. Goble, and S. Bechhofer. Ontology-based Knowledge Representation for Bioinformatics. Accepted for Briefings in Bioinformatics.

Bio-ontologies & Systems

  • P.D. Karp. An ontology for biological function based on molecular interactions. Bioinformatics, 16:269-285, 2000.

  • Ashburner et al. Gene Ontology: Tool for the Unification of Biology. Nature Genetics, 25:25-29.

  • R. Altman, M. Bada, X.J. Chai, M. Whirl Carillo, R.O. Chen, and N.F. Abernethy. RiboWeb: An Ontology-Based System for Collaborative Molecular Biology. IEEE Intelligent Systems, 14(5):68-76, 1999.

  • P.G. Baker, C.A. Goble, S. Bechhofer, N.W. Paton, R. Stevens, and A. Brass. An Ontology for Bioinformatics Applications. Bioinformatics, 15(6):510-520, 1999.

  • R.O. Chen, R. Felciano, and R.B. Altman. RiboWeb: Linking Structural Computations to a Knowledge Base of Published Experimental Data. In Proceedings of the 5th International Conference on Intelligent Systems for Molecular Biology, pages 84-87. AAAI Press, 1997.

  • N. Guarino. Concepts, Attributes and Arbitrary Relations: Some Linguistic and Ontological Criteria for Structuring Knowledge Bases. Data & Knowledge Engineering, 8:249-261, 1992.

  • N. Guarino, M. Carrara, and P. Giaretta. An Ontology of Meta-Level Categories. In J. Doyle, E. Sandewall and P. Torasso (eds.), Principles of Knowledge Representation and Reasoning: Proceedings of the Fourth International Conference (KR94), pages 270-280. Morgan Kaufmann, San Mateo, CA, 1994.

  • P. Karp and S. Paley. Integrated Access to Metabolic and Genomic Data. Journal of Computational Biology, 3(1):191-212, 1996.

  • P. Karp, M. Riley, S. Paley, A. Pellegrini-Toole, and M. Krummenacker. EcoCyc: Electronic Encyclopedia of E. coli Genes and Metabolism. Nucleic Acids Research, 27(1):55-58, 1999.

  • S. Schulze-Kremer. Ontologies for Molecular Biology. In Proceedings of the Third Pacific Symposium on Biocomputing, pages 693-704. AAAI Press, 1998.

  • P.G. Baker, A. Brass, S. Bechhofer, C. Goble, N. Paton, and R. Stevens. TAMBIS: Transparent Access to Multiple Bioinformatics Information Sources. An Overview. In Proceedings of the Sixth International Conference on Intelligent Systems for Molecular Biology, pages 25-34. AAAI Press, June 28-July 1, 1998.


Some References (2)

Ontology development and exchange

  • T.R. Gruber. Towards Principles for the Design of Ontologies Used for Knowledge Sharing. In Roberto Poli and Nicola Guarino, editors, International Workshop on Formal Ontology, Padova, Italy, 1993. Available as technical report KSL-93-04, Knowledge Systems Laboratory, Stanford University: ftp.ksl.stanford.edu/pub/KSL_Reports/KSL-93-04.ps.


More References (3)

  • I. Horrocks, D. Fensel, J. Broekstra, M. Crubezy, S. Decker, M. Erdmann, W. Grosso, C. Goble, F. Van Harmelen, M. Klein, M. Musen, S. Staab, and R. Studer. The Ontology Interchange Language OIL: The grease between ontologies. http://www.cs.vu.nl/~dieter/oil.

  • R. Jasper and M. Uschold. A Framework for Understanding and Classifying Ontology Applications. In Twelfth Workshop on Knowledge Acquisition Modeling and Management KAW'99, 1999.

  • M. Uschold and M. Gruninger. Ontologies: Principles, Methods and Applications. Knowledge Engineering Review, 11(2), June

  • N. Guarino and C. Welty. Identity, Unity, and Individuality: Towards a Formal Toolkit for Ontological Analysis. In H. Werner (Ed.), Proceedings of ECAI-2000: The European Conference on Artificial Intelligence, IOS Press, Amsterdam, August 2000, 219-223.

