
Building and Using Ontologies

Robert Stevens

Department of Computer Science

University of Manchester

Manchester UK

http://img.cs.man.ac.uk/stevens

Introduction
  • The nature of bioinformatics resources
  • What is knowledge?
  • What is an ontology?
  • What are the uses of ontologies?
  • Components of an ontology
  • Building an ontology (in brief)

The Nature of Bioinformatics Resources
  • Over 500 databanks and analysis tools that work over resources
  • Repositories of knowledge and data and generation of new knowledge
  • Knowledge often held as free text; some use made of controlled vocabularies
  • Enormous amount of semantic heterogeneity and poor query facilities
  • Knowledge about services not always apparent


What is Knowledge?
  • Knowledge -- all information and an understanding to carry out tasks and to infer new information
  • Information -- data equipped with meaning
  • Data -- un-interpreted signals that reach our senses

[Figure: the raw string PATRICIAGRACEKENNEDYSAIDMINEISAPINT read two ways. As English: "Patricia Grace Kennedy said mine is a pint" (name, noun, verb). As single letter amino acid codes: …CEKENN… (C - cysteine, K - lysine). Accompanying statements: "Pat Baker is a Manchester bioinformatician who drinks beer." and "Protein that acts as a tyrosine kinase in the liver of primates."]
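
The step from data to information can be sketched in a few lines of Python. This is only an illustration: the lookup table `AMINO_ACIDS` and the function `as_residues` are hypothetical names, and the slide itself names only C and K; the entries for E and N are filled in from the standard single-letter amino acid code.

```python
# Single-letter amino acid codes for the letters in the ...CEKENN... fragment.
# The slide lists only C and K; E and N come from the standard code.
AMINO_ACIDS = {
    "C": "cysteine",
    "E": "glutamic acid",
    "K": "lysine",
    "N": "asparagine",
}


def as_residues(fragment):
    """Interpret a string of single-letter codes as a list of residue names."""
    return [AMINO_ACIDS[letter] for letter in fragment]


data = "PATRICIAGRACEKENNEDYSAIDMINEISAPINT"  # un-interpreted signal
fragment = "CEKENN"                           # window taken from the data
information = as_residues(fragment)           # data equipped with meaning
```

The same characters carry no meaning until an interpretation (here, a code table) is applied; knowledge would be what lets us go on to infer something new about the protein.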

Capturing Knowledge
  • Capturing knowledge for both humans and computer applications
  • A set of vocabulary definitions that capture a community’s knowledge of a domain
  • “An ontology may take a variety of forms, but necessarily it will include a vocabulary of terms, and some specification of their meaning. This includes definitions and an indication of how concepts are inter-related which collectively impose a structure on the domain and constrain the possible interpretations of terms.”

What Does an Ontology Do?
  • Captures knowledge
  • Creates a shared understanding – between humans and for computers
  • Makes knowledge machine processable
  • Makes meaning explicit – by definition and context

What is an Ontology?

[Figure: a spectrum of ontology kinds, with labels in order of increasing formality: Catalog/ID; Terms/glossary; Thesauri (“narrower term” relation); Informal is-a; Formal is-a; Formal instance; Frames (properties); Value Restrs.; Disjointness, Inverse, part-of…; General Logical constraints]

Roles of Ontologies in Bioinformatics
  • We can divide ontology use into three types:
    • Domain-oriented, which are either domain specific (e.g. E. coli) or domain generalisations (e.g. gene function or ribosomes);
    • Task-oriented, which are either task specific (e.g. annotation analysis) or task generalisations (e.g. problem solving);
    • Generic, which capture common high level concepts, such as Physical, Abstract and Substance. These are important in ontology management and language applications.

Uses of Ontology
  • Community reference -- neutral authoring.
  • Either defining database schema or defining a common vocabulary for database annotation -- ontology as specification.
  • Providing common access to information. Ontology-based search by forming queries over databases.
  • Understanding database annotation and technical literature.
  • Guiding and interpreting analyses and hypothesis generation
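
Ontology-based search can be illustrated with a small Python sketch: a query for a term is expanded to every more specific term before matching database annotations. Everything here is assumed for illustration: the `IS_A` links, the `RECORDS` list and the function names are hypothetical and are not taken from any real resource.

```python
# Hypothetical is-a links (child -> parent), assumed for illustration only.
IS_A = {
    "enzyme": "protein",
    "protein": "macromolecule",
    "DNA": "nucleic acid",
    "nucleic acid": "macromolecule",
}

# Hypothetical annotated database records: (record id, annotation term).
RECORDS = [("r1", "enzyme"), ("r2", "DNA"), ("r3", "protein")]


def descendants(term):
    """Return the term plus every term that is transitively a kind of it."""
    found = {term}
    changed = True
    while changed:
        changed = False
        for child, parent in IS_A.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def search(term):
    """Find records annotated with the term or any of its specialisations."""
    wanted = descendants(term)
    return [record_id for record_id, annotation in RECORDS if annotation in wanted]
```

A plain keyword match for "protein" would miss the record annotated "enzyme"; expanding the query through the taxonomy finds it.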

Components of an Ontology
  • Concepts: a class of individuals, e.g. the concept Protein and the individual ‘human cytochrome C’
  • Relationships between concepts
    • The is-a-kind-of relationship forms a taxonomy
    • Other relationships give further structure, e.g. is-a-part-of
  • Axioms: disjointness, covering, equivalence, …
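
The components listed above can be held in a very small data structure. The Python sketch below is a toy, not a real ontology API: the class `Ontology`, its attribute names, the "Active Site" part-of link, the disjointness axiom and the deliberately inconsistent "Chimera" concept are all assumptions made for illustration.

```python
class Ontology:
    """A toy container for concepts, individuals, relationships and axioms."""

    def __init__(self):
        self.is_a = {}         # concept -> set of direct parent concepts
        self.part_of = {}      # concept -> set of concepts it is a part of
        self.instance_of = {}  # individual -> concept
        self.disjoint = []     # frozenset pairs of concepts declared disjoint

    def add_concept(self, name, parents=()):
        self.is_a[name] = set(parents)

    def ancestors(self, concept):
        """Every concept reachable by following is-a-kind-of links upwards."""
        result = set()
        stack = list(self.is_a.get(concept, ()))
        while stack:
            current = stack.pop()
            if current not in result:
                result.add(current)
                stack.extend(self.is_a.get(current, ()))
        return result

    def violates_disjointness(self, concept):
        """True if the concept falls under two concepts declared disjoint."""
        kinds = self.ancestors(concept) | {concept}
        return any(pair <= kinds for pair in self.disjoint)


onto = Ontology()
onto.add_concept("Macromolecule")
onto.add_concept("Protein", parents=["Macromolecule"])
onto.add_concept("Nucleic Acid", parents=["Macromolecule"])
onto.add_concept("Enzyme", parents=["Protein"])
onto.instance_of["human cytochrome C"] = "Protein"   # individual of a concept
onto.part_of["Active Site"] = {"Enzyme"}             # hypothetical part-of link
onto.disjoint.append(frozenset({"Protein", "Nucleic Acid"}))      # assumed axiom
onto.add_concept("Chimera", parents=["Protein", "Nucleic Acid"])  # inconsistent on purpose
```

The is-a links give the taxonomy, the part-of links add further structure, and the disjointness axiom lets a program flag "Chimera" as inconsistent.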

Knowledge Representation
  • Ontologies are best delivered in some computable representation
  • Variety of choices with different:
    • Expressiveness
      • The range of constructs that can be used to formally, flexibly, explicitly and accurately describe the ontology
    • Ease of use
    • Computational complexity
      • Is the language computable in real time?
    • Rigour
      • Satisfiability and consistency of the representation
      • Systematic enforcement mechanisms
    • Unambiguous, clear and well defined semantics

Languages
  • Vocabularies using natural language
    • Hand crafted, flexible but difficult to evolve, maintain and keep consistent, with weak semantics
    • Gene Ontology
  • Object-based KR: frames
    • Extensively used, good structuring, intuitive. Semantics defined by OKBC standard
    • EcoCyc (uses Ocelot) and RiboWeb (uses Ontolingua)
  • Logic-based: Description Logics
    • Very expressive, model is a set of theories, well defined semantics
    • Automatically derived classification taxonomies
    • Concepts are either defined or primitive

Building Ontologies
  • No field of Ontological Engineering equivalent to Knowledge or Software Engineering;
  • No standard methodologies for building ontologies;
  • Such a methodology would include:
    • a set of stages that occur when building ontologies;
    • guidelines and principles to assist in the different stages;
    • an ontology life-cycle which indicates the relationships among stages.

The Development Lifecycle
  • Two kinds of complementary methodologies emerged:
    • Stage-based, e.g. TOVE [Uschold96]
    • Iterative evolving prototypes, e.g. MethOntology [Gomez Perez94].
  • Most have TWO stages:
    • Informal stage
      • ontology is sketched out using either natural language descriptions or some diagram technique
    • Formal stage
      • ontology is encoded in a formal knowledge representation language, that is machine computable
    • the informal representation helps the former
    • the formal representation helps the latter.

A Provisional Methodology
  • A skeletal methodology and life-cycle for building ontologies;
  • Inspired by the software engineering V-process model;
  • The overall process moves through a life-cycle.

The left side charts the processes in building an ontology

The right side charts the guidelines, principles and evaluation used to ‘quality assure’ the ontology


The V-model Methodology

[Figure: V-model diagram]
  • Left side (processes): Identify purpose and scope; Knowledge acquisition; Conceptualisation; Integrating existing ontologies; Encoding; Representation
  • Models: User Model; Conceptualisation Model; Implementation Model
  • Right side (quality assurance):
    • Evaluation: coverage, verification, granularity
    • Conceptualisation Principles: commitment, conciseness, clarity, extensibility, coherency
    • Encoding/Representation principles: encoding bias, consistency, house styles and standards, reasoning system exploitation
  • Ontology in Use

The ontology building life-cycle

[Figure: life-cycle diagram with labels: Identify purpose and scope; Knowledge acquisition; Building; Language and representation; Conceptualisation; Integrating existing ontologies; Available development tools; Encoding; Evaluation]

Starting Concept List
  • Chemicals – atom, ion, molecule, compound, element;
  • Molecular-compound, ionic-compound, ionic-molecular-compound, …;
  • Ionic-macromolecular-compound and ionic-small-macromolecular-compound;
  • Protein, peptide, polyprotein, enzyme, holoprotein, apoprotein,…
  • Nucleic acid – DNA, RNA, tRNA, mRNA, snRNA, …


Conceptualisation Sketch

[Figure: taxonomy sketch with nodes: Chemical; Molecule; Compound; Element; Ion; Atom; Molecular Compound; Ionic Compound; Molecular Element; Ionic Molecule; Non-Metal; Metal; Ionic Molecular Compound; Metalloid]


Molecule Conceptualisation Sketch

[Figure: taxonomy sketch with nodes: Ionic Macromolecular Compound; Macromolecule; Small Molecule; Polysaccharide; Protein; Nucleic Acid; Peptide; Starch; Glycogen; Enzyme; DNA; RNA; snRNA; mRNA; tRNA; rRNA]

Initial Encoding

class-def chemical
  subclass-of substance
class-def molecule
  subclass-of chemical
class-def compound
  subclass-of chemical
class-def molecular-compound
  subclass-of molecule and compound
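
To show what a machine can do with even this first encoding, here is a minimal Python sketch that reads the class-def lines above into a parent map and derives all superclasses of a class. The functions `parse` and `superclasses` are hypothetical helpers that understand only the two keywords used on this slide; this is not a parser for any real ontology language.

```python
ENCODING = """
class-def chemical
  subclass-of substance
class-def molecule
  subclass-of chemical
class-def compound
  subclass-of chemical
class-def molecular-compound
  subclass-of molecule and compound
"""


def parse(text):
    """Map each class name to the list of its direct superclasses."""
    parents = {}
    current = None
    for line in text.split("\n"):
        words = line.split()
        if not words:
            continue
        if words[0] == "class-def":
            current = words[-1]          # last word is the class name
            parents[current] = []
        elif words[0] == "subclass-of" and current is not None:
            # "molecule and compound" names two superclasses
            parents[current] = [w for w in words[1:] if w != "and"]
    return parents


def superclasses(parents, name):
    """Collect direct and inherited superclasses of a class."""
    result = set()
    for parent in parents.get(name, []):
        result.add(parent)
        result |= superclasses(parents, parent)
    return result


PARENTS = parse(ENCODING)
```

Asking for `superclasses(PARENTS, "molecular-compound")` follows both parents upwards, so the class inherits from molecule, compound, chemical and substance.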


Molecules Revisited

[Figure: revised taxonomy sketch with nodes: Non-Ionic Macromolecular Compound; Ionic Macromolecular Compound; Macromolecule; Small Molecule; Polysaccharide; Protein; Nucleic Acid; Peptide; Starch; Glycogen; Enzyme; DNA; RNA; snRNA; mRNA; tRNA; rRNA]

More Encoding

class-def chemical
  subclass-of substance
class-def defined molecule
  subclass-of chemical
  slot-constraint contains-bond min-cardinality 1 has-value covalent-bond
class-def defined compound
  subclass-of chemical
  slot-constraint has-atom-types greater-than 1
class-def defined molecular-compound
  subclass-of molecule and compound
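
Because these classes are defined rather than primitive, membership can be derived from a description instead of being asserted by hand. The Python sketch below imitates that with plain predicates; it is an illustration under stated assumptions, not a description logic reasoner. The dictionary keys `covalent_bonds` and `atom_types`, the `DEFINITIONS` table and the example chemicals are all hypothetical.

```python
# Each defined class is a test on a description of a chemical.
# The tests mirror the slot constraints in the encoding above.
DEFINITIONS = {
    "molecule": lambda chem: chem["covalent_bonds"] >= 1,
    "compound": lambda chem: chem["atom_types"] > 1,
}


def classify(chem):
    """Return every class whose definition the description satisfies."""
    classes = ["chemical"]
    for name, test in DEFINITIONS.items():
        if test(chem):
            classes.append(name)
    # molecular-compound is defined as both a molecule and a compound
    if "molecule" in classes and "compound" in classes:
        classes.append("molecular-compound")
    return classes


# Hypothetical descriptions used only to exercise the definitions.
water = {"covalent_bonds": 2, "atom_types": 2}
oxygen_gas = {"covalent_bonds": 1, "atom_types": 1}
sodium_chloride = {"covalent_bonds": 0, "atom_types": 2}
```

Water is placed under molecular-compound without anyone stating it, which is the kind of automatically derived classification that description logics provide.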

Expansion
  • Sketch and encode in cycles
  • Build a taxonomy of a small portion
  • Then build links to other portions
  • Add more detail
  • Document sources, author, date and argumentation.

Summary
  • An ontology captures knowledge for a shared understanding
  • The important question is not whether an artefact is an ontology, but whether it does any good
  • Making our understanding of a domain explicit, consistent and processable
  • Bioinformatics resources are knowledge resources – they need to be both human and machine understandable
