SCEC Ontology Development

Tom Russ

Hans Chalupsky, Stefan Decker, Yolanda Gil, Jihie Kim, Varun Ratnakar

University of Southern California

Information Sciences Institute

Outline
  • Background
    • SCEC Goals
    • Ontology Basics
    • Semantic Interoperability
  • Examples
    • Weather
    • Seismology
    • Building Computational Pathways
  • Ontology Development
    • SCEC Ontology Development
    • Gene Ontology Development
    • Fundamental Ontologies?
  • Big Questions
What is an Ontology?
  • An Ontology is a framework for representing shared conceptualizations of knowledge
  • An Ontology provides:
    • Definitions for objects and relations in the domain
    • Shared vocabulary and common structure for modeling domain knowledge
    • Domain model/theory that captures common knowledge about the domain
Semantic Interoperability Story
  • SCEC Java code for Community Velocity Model
    • Inputs: longitude and latitude
    • Output: Vs30 (m/s)
  • Connection technology: Java serialization
    • In other words: Ship the bits for two double-precision floating-point values through a network connection
    • Make sure you send longitude first!
      • Non-standard convention for geography
      • Probably based on X-Y convention instead
  • Better: More structured input
    • Latitude=34.15 Longitude=-117.58
    • Explicit identification of parameters
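
The contrast between two bare doubles and explicitly identified parameters can be sketched in Python. This is only an illustration, not the SCEC Java code: `GeoPoint` and `geo_point` are hypothetical names, and the range check is an assumption added here to show how labeled input lets some swapped values be caught.

```python
from typing import NamedTuple


class GeoPoint(NamedTuple):
    """A labeled location (hypothetical type, not part of the SCEC code)."""
    latitude: float   # degrees, -90 to 90
    longitude: float  # degrees, -180 to 180


def geo_point(*, latitude: float, longitude: float) -> GeoPoint:
    """Build a GeoPoint from keyword-only arguments and range-check them."""
    if not -90.0 <= latitude <= 90.0:
        raise ValueError(f"latitude out of range: {latitude}")
    if not -180.0 <= longitude <= 180.0:
        raise ValueError(f"longitude out of range: {longitude}")
    return GeoPoint(latitude, longitude)


# Argument order no longer matters because each value is named.
site = geo_point(longitude=-117.58, latitude=34.15)
same_site = geo_point(latitude=34.15, longitude=-117.58)
print(site == same_site)

# Swapped values are caught here because -117.58 is not a valid latitude.
try:
    geo_point(latitude=-117.58, longitude=34.15)
except ValueError as err:
    print("rejected:", err)
```

The range check only catches swaps that push a value out of range; the parameter names are what actually carry the meaning.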
Identify Relevant Domain Concepts

Conditions for Joint Tasks (from: CJCSM 3500.04A 9/13/96, p. 3-11.)
Weather Specification in English (from: CJCSM 3500.04A 9/13/96, p. 3-11.)
  • C 1.3.1.3 Weather
    • Definition: current weather (next 24 hours).
    • Descriptors: clear, partly cloudy, overcast, precipitating, stormy
  • C 1.3.1.3.1 Air Temperature
    • Definition: atmospheric temperature at ground level
    • Descriptors: Hot (> 85° F), Temperate (40° to 85° F), Cold (10° to 39° F), Very Cold (< 10° F)
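
The air temperature descriptors amount to a lookup from a reading to a term. A minimal Python sketch follows; the function name `air_temperature_descriptor` is hypothetical. The ranges are stated in whole degrees (Cold ends at 39, Temperate starts at 40), so the sketch assumes that a fractional reading between 39 and 40 counts as Cold.

```python
def air_temperature_descriptor(temp_f: float) -> str:
    """Map an air temperature in degrees Fahrenheit to its descriptor.

    Boundaries follow the descriptor ranges; readings between 39 and 40
    are treated as Cold, which is an assumption of this sketch.
    """
    if temp_f > 85:
        return "Hot"
    if temp_f >= 40:
        return "Temperate"
    if temp_f >= 10:
        return "Cold"
    return "Very Cold"


for reading in (95, 85, 40, 39, 10, 5):
    print(reading, air_temperature_descriptor(reading))
```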
Formalizing Domain Concepts

A knowledge-based system about “Weather” must know things like these:

  • Terms
    • hot, humid, windy ...
  • Definitions
    • cold = (10° to 39° F)
  • Relationships
    • cold and windy may overlap
    • cold and hot are disjoint
    • cold and very cold are disjoint!
  • Rules
    • IF heavy rain lasts 2 days
    • THEN muddy terrain and excessive runoff
    • (probability .9)
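
Once definitions are formal, relationships such as disjointness can be checked mechanically instead of being asserted by hand. The Python sketch below is one way to do that; it assumes whole-degree readings (so "> 85" becomes 86 and up, and "< 10" becomes 9 and down), and the names `DEFINITIONS`, `overlapping_terms`, and `heavy_rain_rule` are hypothetical. Reading "lasts 2 days" as "2 days or more" is also an assumption.

```python
from itertools import combinations

# Each term is a closed interval in whole degrees Fahrenheit.
# Infinities stand in for the open-ended ranges.
DEFINITIONS = {
    "very cold": (float("-inf"), 9),
    "cold": (10, 39),
    "temperate": (40, 85),
    "hot": (86, float("inf")),
}


def disjoint(a, b):
    """Two closed intervals are disjoint when one ends before the other begins."""
    return a[1] < b[0] or b[1] < a[0]


def overlapping_terms(definitions):
    """Return the pairs of terms whose ranges overlap (empty list: all disjoint)."""
    return [
        (s, t)
        for (s, a), (t, b) in combinations(definitions.items(), 2)
        if not disjoint(a, b)
    ]


def heavy_rain_rule(days_of_heavy_rain):
    """IF heavy rain lasts 2 days THEN muddy terrain and excessive runoff (probability .9)."""
    if days_of_heavy_rain >= 2:  # assumed reading of "lasts 2 days"
        return (("muddy terrain", "excessive runoff"), 0.9)
    return None


print(overlapping_terms(DEFINITIONS))
print(heavy_rain_rule(2))
```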
Earthquake Hazard Analog
  • NEHRP Soil Types
Hypocenter vs. Epicenter
  • The epicenter is the point on the surface directly above the hypocenter.
  • “Directly above”, more formally:
    • The latitude and longitude of the epicenter and hypocenter are the same.
    • The epicenter depth is zero.

PowerLoom:

;; Hypocenter: where the rupture starts, at depth.
(deffunction source-hypocenter ((?s earthquake-source)) :-> (?h location)
  :documentation "The 3D point where the rupture started.")

;; Epicenter: same latitude and longitude as the hypocenter, at zero depth.
(deffunction source-epicenter ((?s earthquake-source)) :-> (?e location)
  :documentation "The point on the earth's surface directly above the hypocenter."
  :axioms (=> (earthquake-source ?s)
              (and (= (latitude-of (source-hypocenter ?s))
                      (latitude-of (source-epicenter ?s)))
                   (= (longitude-of (source-hypocenter ?s))
                      (longitude-of (source-epicenter ?s)))
                   (= (depth-of (source-epicenter ?s)) (units 0 "m")))))
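
For readers who do not know PowerLoom, the three conditions in the axioms can be restated procedurally. The Python sketch below is not a translation of PowerLoom's inference; `Location`, `epicenter_of`, and `satisfies_axioms` are hypothetical names, and the coordinates are arbitrary values chosen for illustration.

```python
from typing import NamedTuple


class Location(NamedTuple):
    latitude: float   # degrees
    longitude: float  # degrees
    depth_m: float    # meters below the surface


def epicenter_of(hypocenter: Location) -> Location:
    """Project the hypocenter straight up to the surface (depth zero)."""
    return Location(hypocenter.latitude, hypocenter.longitude, 0.0)


def satisfies_axioms(hypocenter: Location, epicenter: Location) -> bool:
    """Check the three conditions: same latitude, same longitude, zero depth."""
    return (
        hypocenter.latitude == epicenter.latitude
        and hypocenter.longitude == epicenter.longitude
        and epicenter.depth_m == 0.0
    )


# Arbitrary illustrative hypocenter, 10 km deep.
hypo = Location(latitude=34.0, longitude=-118.0, depth_m=10_000.0)
epi = epicenter_of(hypo)
print(epi, satisfies_axioms(hypo, epi))
```

The logical form says more than the function does: a reasoner can use the same axioms in either direction, for example to reject a claimed epicenter whose depth is not zero.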

PowerLoom
  • Knowledge representation & reasoning system
  • Uses definitions specified in a formal logic
    • First order predicate calculus
    • Expressive: We can say what we need to
  • Inference via logical deductions
  • Support for units and dimensions
  • Browsing tool: Ontosaurus
Ontosaurus
  • Navigation tools and control panel
  • Display of formal information and rules
  • Diagrams and images aid domain familiarization
  • Domain facts
  • Textual documentation

Plan: Building Computational Pathways
  • Simple scenario to illustrate how a user would define computational pathways
  • Behind the scenes, DOCKER uses descriptions of components, their I/O requirements and their constraints to:
    • detect errors in user’s input
    • suggest additional steps needed to make the pathway work
    • make educated guesses about how components selected by the user may be connected to one another
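
How component descriptions support error detection and suggestions can be sketched in a few lines of Python. This is not DOCKER's algorithm, which the slides do not describe. The component names follow the slides, the Community Velocity Model's input and output are as stated earlier, and the input/output lists of the other two components are an assumed reading of the pathway diagrams. `Component`, `missing_inputs`, and `suggest_components` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    inputs: frozenset   # data types the component consumes
    outputs: frozenset  # data types the component produces


GEOCODER = Component("Geocoder", frozenset({"Address"}), frozenset({"Lat/long"}))
CVM = Component("CommunityVelocityModel", frozenset({"Lat/long"}), frozenset({"Vs30"}))
ATTENUATION = Component(
    "AttenuationRelationship (Field-2000)",
    frozenset({"Fault-type", "Magnitude", "Distance", "Vs30"}),  # assumed inputs
    frozenset({"PGA"}),
)
LIBRARY = [GEOCODER, CVM, ATTENUATION]


def missing_inputs(pathway, user_data):
    """Detect errors: data types some selected component needs but nothing supplies."""
    available = set(user_data)
    needed = set()
    for component in pathway:
        available |= component.outputs
        needed |= component.inputs
    return needed - available


def suggest_components(missing, pathway, library=LIBRARY):
    """Suggest additional steps: unused components that produce each missing type."""
    return {
        data_type: [c.name for c in library if data_type in c.outputs and c not in pathway]
        for data_type in sorted(missing)
    }


user_data = {"Address", "Fault-type", "Magnitude", "Distance"}
pathway = [ATTENUATION]
while True:
    gaps = missing_inputs(pathway, user_data)
    print([c.name for c in pathway], "missing:", sorted(gaps))
    if not gaps:
        break
    suggestions = suggest_components(gaps, pathway)
    print("suggested:", suggestions)
    # Accept the first suggestion for the first gap; a real tool would ask the user.
    first_gap = sorted(gaps)[0]
    if not suggestions[first_gap]:
        break  # nothing in the library can fill this gap
    chosen = suggestions[first_gap][0]
    pathway.append(next(c for c in LIBRARY if c.name == chosen))
```

The sketch matches data types by name only and ignores ordering and constraints, so it covers only the simplest part of what the slide describes.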
Compute PGA for an Address Using These Components

[Diagram: component boxes with their input and output labels. Components: EarthquakeForecastModel (USGS-02), AttenuationRelationship (Field-2000), AttenuationRelationship (Campbell-02), Geocoder, Community Velocity Model, DistanceComputation. Data labels: Fault-type, Magnitude, Distance, Vs30, Site Type, Time Span, Address, Lat/long, Lat/long1, Lat/long2, PGA.]

Some Data Paths Connect Easily

[Diagram: component boxes with their input and output labels. Components: EarthquakeForecastModel (USGS-02), AttenuationRelationship (Field-2000), Geocoder, Community Velocity Model, DistanceComputation. Data labels: Fault-type, Magnitude, Distance, Vs30, Time Span, Address, Lat/long, Lat/long1, Lat/long2, PGA.]

Others Require Transformation

[Diagram: component boxes with their input and output labels. Components: EarthquakeForecastModel (USGS-02), AttenuationRelationship (Field-2000), DistanceComputation, Community Velocity Model, Geocoder. Data labels: Fault-type, Magnitude, Distance, Vs30, Time Span, Address, Lat/long, Lat/long1, Lat/long2, PGA.]

SCEC Ontology Development
  • Task-driven
    • Particular application
    • Modeled on domain inferences & reasoning
  • Small team of Computer Scientists
    • Seismology - Tom Russ
    • Models - Jihie Kim, Varun Ratnakar, Tom Russ
  • Small group of Domain Experts
    • Ned Field and Tom Jordan
  • Future
    • Development and curation by domain experts
    • Requires methodology
    • Requires tools
Capture Inference in Ontology
  • Computation and checking of properties
  • Definitions of terms
  • Ned Field’s markup of fault parameter data

The Gene Ontology (GO)
  • Had a successful jumpstart
  • Done by biologists, not knowledge engineers
  • Developed by a wide, distributed community
  • Focused on specific aspects of genomics
    • Fly-base, yeast, mouse
  • Used 24/7 from day 1
  • Accepted widely by the community
  • Extended based on use requirements of a wide community
  • Quite large (30-40K terms)
Jumpstart of GO: Key Decisions (1)
  • Limited scope
    • Limited the domain, though it could have included many more areas
      • Did not let anyone else in until they had gotten somewhere
      • Added new groups incrementally (10)
    • 3 related areas
  • Open (no licenses); use open standards
  • Involve the community
  • Had to develop own software
    • Control over own code
    • KISS: keep it simple, stupid
      • E.g., only two relations
  • Transitivity
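
Transitivity is what makes a small set of relations useful: anything attached to a specific term is implicitly attached to every term above it. The Python sketch below assumes the two relations are is-a and part-of and treats both as transitive. The terms are toy examples invented for illustration, not actual GO content, and `ancestors` is a hypothetical helper.

```python
from collections import defaultdict

# (child, relation, parent) edges; toy terms, not actual GO content.
EDGES = [
    ("nuclear membrane", "part_of", "nucleus"),
    ("nucleus", "is_a", "organelle"),
    ("organelle", "part_of", "cell"),
]


def ancestors(term, edges):
    """Return every term reachable from `term` by following the relations upward."""
    parents = defaultdict(set)
    for child, _relation, parent in edges:
        parents[child].add(parent)

    seen = set()
    stack = [term]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


print(sorted(ancestors("nuclear membrane", EDGES)))
```

With this closure, a query for everything under "cell" also finds items annotated only with "nuclear membrane".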
Key Decisions (2)
  • Use it from the beginning
    • If you wait until the ontology is finished before using it, you will never get there
    • Errors would only be discovered through use
    • Set things up so that you are OK when you have to fix those errors (entire chunks of ontology had to be entirely redone)
    • Minimized the impact of changes: most changes are to relations, which in practice does not affect the annotations
  • Face-to-face meetings 3-4 times a year
  • Satisfied a need of DB users who wanted to ask complex queries (one query to all DBs)
  • Establish migration path
Key Decisions (3)
  • Requests are resolved either:
    • Immediately
    • Over email, if closure can be reached within 2-3 days
      • No voting, only consensus
    • On the agenda for the next meeting
  • Attribution was important
    • Learned that from Flybase
    • Both GO content and annotations are annotated with attribution
  • Unique identifiers within GO
    • The term's lexical string can change; as long as the meaning does not change, the identifier does not change
    • If the definition changes but the term string does not, the identifier changes
    • Small number of relations
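
The identifier policy can be made concrete with a small registry: renaming keeps the identifier, redefining issues a new one. This Python sketch is an assumption-laden illustration, not GO's software. `TermRegistry` and the `T:0000001` identifier format are hypothetical, and the `retired` table that maps an old identifier to its replacement is an addition of this sketch.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Term:
    identifier: str
    label: str
    definition: str


class TermRegistry:
    def __init__(self):
        self._numbers = count(1)
        self.terms = {}    # identifier -> Term
        self.retired = {}  # old identifier -> replacement identifier (assumption)

    def add(self, label, definition):
        identifier = f"T:{next(self._numbers):07d}"  # hypothetical format
        self.terms[identifier] = Term(identifier, label, definition)
        return identifier

    def rename(self, identifier, new_label):
        """The lexical string changes, the meaning does not: same identifier."""
        self.terms[identifier].label = new_label
        return identifier

    def redefine(self, identifier, new_definition):
        """The meaning changes: retire the old identifier and issue a new one."""
        old = self.terms.pop(identifier)
        new_identifier = self.add(old.label, new_definition)
        self.retired[identifier] = new_identifier
        return new_identifier


registry = TermRegistry()
first = registry.add("example term", "original definition")
same = registry.rename(first, "example term (renamed)")
second = registry.redefine(first, "revised definition")
print(first, same, second)
```

Because annotations refer to identifiers rather than strings, a rename never invalidates them, while a change in meaning is visible as a change of identifier.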
Fundamental Ontologies
  • What is out there?
  • Not much.
    • Ontolingua (Stanford University) has a number of small component ontologies
      • Designed as components
      • Not tied to applications
    • DAML is working on fundamental physics ontologies (Jerry Hobbs, SRI International, ISI, Ken Forbus, others)
      • Time
      • Space
        • We would like input from GEON!
Some BIG Questions (from Gene Ontology Workshop)
  • How do you get started?
  • How to ensure the community will accept it (use it)?
  • How do you (can you?) represent alternative views?
  • What is the process to contribute to it?
  • What is the process to make changes to it?
  • What happens when there is an update?
  • How is it implemented? What tools?
  • How is it managed?
  • Who does what, when, where, why?