Ontologies in Data and Application Integration – an Update

Presentation Transcript

  1. Ontologies in Data and Application Integration – an Update • Kai Lin, Bertram Ludäscher • Knowledge-Based Information Systems Lab, Data and Knowledge Systems (DAKS), San Diego Supercomputer Center, University of California San Diego • http://www.geongrid.org

  2. Outline • Motivation • Ontology Cheat Sheet • Ontology-enabled Prototypes and Tools • Data & Service Registration (Structural + Semantic) • Scientific Workflows

  3. Ontology Cheat Sheet (1/2) • What is an ontology? An ontology usually … • specifies a theory (a set of models) by … • defining and relating … • concepts representing features of a domain of interest • Also an overloaded (sometimes sloppy) term for: • Controlled vocabularies • Database schema (relational, XML, …) • Conceptual schema (ER, UML, …) • Thesauri (synonyms, broader term/narrower term) • Taxonomies • Informal/semi-formal representations • “Concept spaces”, “concept maps” • Labeled graphs / semantic networks (RDF) • Formal ontologies, e.g., in [Description] Logic (OWL) • “formalization of a specification”, which constrains the possible interpretations of terms

  4. A Multi-Hierarchical Rock Classification “Ontology” (GSC) • Hierarchies shown: Genesis, Fabric, Composition, Texture

  5. Ontology Cheat Sheet (2/2) • What are ontologies used for? • Conceptual models of a domain or application (communication means, system design, …) • Classification of … • concepts (taxonomy) and • data/object instances through classes • Analysis of ontologies, e.g., • Graph queries (reachability, path queries, …) • Reasoning (concept subsumption, consistency checking, …) • Targets for semantic data registration • Conceptual indexes and views for • searching, • browsing, • querying, and • integration of registered data
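
The "graph queries (reachability)" and "concept subsumption" bullets above can be illustrated with a minimal Python sketch. It treats a taxonomy as a directed graph of is-a edges and answers subsumption by reachability. The concept names are borrowed from the ontology snippet on slide 19; the edge table and the function name `is_subsumed_by` are illustrative assumptions, and this covers only explicitly stated subclass links, not the reasoning a Description Logic (OWL) reasoner performs.

```python
from collections import deque

# Toy is-a edges (child -> list of parents); an illustrative assumption.
SUBCLASS_OF = {
    "SBLTERSite": ["LTERSite"],
    "LTERSite": ["Location"],
    "SpeciesAbundance": ["Measurement"],
}

def is_subsumed_by(concept, ancestor, edges=SUBCLASS_OF):
    """Return True if `ancestor` is reachable from `concept` via is-a edges."""
    seen = {concept}
    queue = deque([concept])
    while queue:
        current = queue.popleft()
        if current == ancestor:
            return True
        for parent in edges.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return False
```

For example, `is_subsumed_by("SBLTERSite", "Location")` follows two edges, while the reverse direction finds no path.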

  6. Application Example: Geologic Map Integration • GEON Metamorphism Equation: Geoscientists + Computer Scientists +/- Energy +/- a few hundred million years → Igneous Geoinformaticists • Figure labels: domain knowledge, knowledge representation, Ontologies!?, Nevada

  7. Geologic Map Integration in the Portal • After registering datasets, ontologies (here: “classes”), and an application (“OMI”), the datasets can be searched and displayed in an integrated way.

  8. Concept-Based Queries and Analysis • After registering a source with one or more ontologies, concept-based queries and analysis can be launched • Here: light-weight client-side processing (SVG)

  9. Ontologies and Data Management • Where do ontologies fit within data management architectures? • Several answers, specifically: • An ontology is similar to a schema or conceptual model, if one exists, but is • Developed independently of a particular application • Probably given in a different language • Inherently more general • Usually not a very good schema (weak structure)

  10. Ontologies and Data Management (watch out for Semantic Data Registration later) • Figure labels: Ontology; “use concepts from (explicitly or implicitly)”; Design Artifact; Conceptual Model; Schema; Metadata; Data

  11. Creating and Sharing Concept Maps (here: Seismology concept map & Cmap tool) • Lock up scientists for 2+ days • Add CS/KRDB types • Create concept maps • Refine • Iterate • From napkin drawings, to concept maps, to ontologies

  12. Graph (RDF) Queries on Ontologies • RQL query: show all “products” • Figure panels: visualisation, query results

  13. Community-Based Ontology Development • Draft of a geochemistry ontology developed by scientists • Current concept maps and emerging ontologies: • Igneous Rocks/Plutons • Seismology • Geochemistry

  14. Protégé (… not so ezOWL yet…)

  15. Sparrow (a poor man’s OWL tool …) Simple ASCII-based RDF and OWL entry and manipulation

  16. Semantic Data Registration (joint work w/ Shawn Bowers)

  17. What is Data/Ontology/… Registration? • A mechanism by which data sources, ontologies, services, … • … are published in a repository/registry • for the purpose of “smart” discovery, querying, integration

  18. Things to Register • Data files (individual files) • Shapefile as a blob (+ file type) • Collections (of files; nested; e.g., satellite data) • Databases (have a schema and can be queried) • Shapefile with schema registered • Ontologies • Services (web + grid services) • Other/external applications

  19. Connecting Datasets to Ontologies • How can we “register” the dataset to concepts in the Ontology?
      Ontology (snippet):
        DataCollectionEvent ⊑ contains.Measurement
        Measurement ⊑ measureOf.MeasurableItem ⊓ hasContext.MeasurementContext
        MeasurementContext ⊑ hasTime.DateTime ⊓ hasLocation.Location
        MeasurableItem ⊑ hasUnit.Unit ⊓ hasValue.UnitValue
        SpeciesCount ⊑ MeasurableItem ⊓ hasSpecies.Species ⊓ hasUnit.RatioUnit …
        SpeciesAbundance ⊑ Measurement ⊓ measureOf.SpeciesCount
        AbundanceCollectionEvent ⊑ DataCollectionEvent ⊓ contains.SpeciesAbundance
        Location ⊑ position.Coordinate
        LTERSite ⊑ Location
        SBLTERSite ⊑ LTERSite ⊓ position.SBLTERCoordinate
        {naples,…} ⊑ SBLTERSite
      Dataset:
        Date        Site  Transect  SP_Code  Count
        2000-09-08  CARP  1         CRGI     0
        2000-09-08  CARP  4         LOCH     0
        2000-09-08  CARP  7         MUCA     1
        2000-09-22  NAPL  7         LOCH     1
        2000-09-18  NAPL  1         PAPA     5
        2000-09-28  BULL  1         CYOS     57
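
One possible way to picture such a registration is as a table that links each dataset column to a concept and property in the ontology. The Python sketch below is only an assumption about what a registration could look like: the `REGISTRATION` table, the column-to-concept choices, and both helper functions are hypothetical and are not taken from the slides. The `Transect` column is deliberately left unregistered to show that a registration may be partial.

```python
# Hypothetical registration: dataset column -> (concept, property).
# Concept and property names come from the ontology snippet; the pairing
# of columns to them is an illustrative assumption.
REGISTRATION = {
    "Date": ("MeasurementContext", "hasTime"),
    "Site": ("MeasurementContext", "hasLocation"),
    "SP_Code": ("SpeciesCount", "hasSpecies"),
    "Count": ("SpeciesCount", "hasValue"),
}

def annotate(row):
    """Key each registered cell by its (concept, property) pair."""
    return {REGISTRATION[col]: value for col, value in row.items() if col in REGISTRATION}

def columns_for(concept):
    """Concept-based lookup: which columns carry data about `concept`?"""
    return sorted(col for col, (c, _) in REGISTRATION.items() if c == concept)

# One row of the dataset shown on the slide.
row = {"Date": "2000-09-22", "Site": "NAPL", "Transect": 7, "SP_Code": "LOCH", "Count": 1}
annotated = annotate(row)
```

With such a table in place, a concept-based search for `SpeciesCount` can be answered without knowing the column names in advance.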

  20. Step 1: Selecting Relevant Concepts • Concepts from an Ontology: • DataCollectionEvent • AbundanceCollectionEvent • Measurement • Abundance • SpeciesAbundance • MeasurementContext • … • Location • LTERSite • SBLTERSite • naples • Species • … • MeasurableItem • SpeciesCount
      Dataset:
        Date        Site  Transect  SP_Code  Count
        2000-09-08  CARP  1         CRGI     0
        2000-09-08  CARP  4         LOCH     0
        2000-09-08  CARP  7         MUCA     1
        2000-09-22  NAPL  7         LOCH     1
        2000-09-18  NAPL  1         PAPA     5
        2000-09-28  BULL  1         CYOS     57

  21. Step 1: Selecting Relevant Concepts (concept list and dataset repeated from slide 20)

  22. Step 2: Generate Object Model • Concepts from an Ontology: • DataCollectionEvent • AbundanceCollectionEvent • Measurement • Abundance • SpeciesAbundance • MeasurementContext • … • Location • LTERSite • SBLTERSite • naples • Species • … • MeasurableItem • SpeciesCount • Object model (figure): nodes AbundanceCollectionEvent, SpeciesAbundance, SpeciesCount, Species, RatioUnit, RatioValue, SBLTERSite, DateTime; edges contains, measureOf, hasValue, hasSpecies, hasUnit, hasTime, hasLoc
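
A generated object model of this kind might be sketched in Python as nested dataclasses. The node and edge names follow the slide; which edge attaches to which node is inferred from the flattened figure and the ontology snippet, so it should be read as an assumption. The `to_event` helper and the `"unspecified"` unit placeholder are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class SpeciesCount:
    has_species: str   # Species (here: the SP_Code value)
    has_value: int     # RatioValue
    has_unit: str      # RatioUnit

@dataclass
class SpeciesAbundance:
    measure_of: SpeciesCount

@dataclass
class AbundanceCollectionEvent:
    has_time: str      # DateTime, kept as an ISO date string
    has_loc: str       # SBLTERSite, kept as the site code
    contains: SpeciesAbundance

def to_event(row, unit="unspecified"):
    """Build one object graph from one dataset row (hypothetical helper)."""
    count = SpeciesCount(has_species=row["SP_Code"], has_value=row["Count"], has_unit=unit)
    return AbundanceCollectionEvent(
        has_time=row["Date"],
        has_loc=row["Site"],
        contains=SpeciesAbundance(measure_of=count),
    )

# One row of the dataset shown on slides 19 to 21.
event = to_event({"Date": "2000-09-28", "Site": "BULL", "Transect": 1, "SP_Code": "CYOS", "Count": 57})
```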

  23. Applications of Semantic Registration • Mentioned before: • Smart data discovery, integration, etc. • New application: • Generating data transformations semi-automatically for chaining together computational services

  24. Problem: Service Reusability • Unless “designed to fit,” independent services are structurally incompatible • Generally, the source output type will not be a subtype of the target input type • Figure: desired connection from Source Service (Ps) to Target Service (Pt); StructuralType Ps and StructuralType Pt are incompatible (⋠)

  25. Service Reusability • A data transformation mapping is required to connect the services, artificially creating subtype compatibility • If such a mapping exists, the services are “structurally feasible” • Figure: desired connection from Source Service (Ps) to Target Service (Pt); StructuralType Ps and StructuralType Pt are incompatible (⋠), while the mapping applied to Ps is compatible (≺) with Pt
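
The idea of creating subtype compatibility through a mapping can be sketched in Python under strong simplifying assumptions. Here a structural type is a flat record of field names and type names, and "subtype" means the source offers every field the target expects; the slides do not define the subtype relation, so both choices are assumptions. The field names echo the XQuery example on slide 29, and the types `ps`, `pt`, and the rename table are hypothetical.

```python
# Assumption: a structural type is a flat record {field_name: type_name},
# and a source is a subtype of a target if it has every target field
# with the same type name.
def is_structural_subtype(source, target):
    return all(source.get(name) == type_name for name, type_name in target.items())

def apply_mapping(record_type, renames):
    """Rename fields of a record type; unmapped fields keep their names."""
    return {renames.get(name, name): type_name for name, type_name in record_type.items()}

ps = {"cnt": "integer", "lsp": "string"}     # hypothetical source output type
pt = {"obs": "integer", "phase": "string"}   # hypothetical target input type
mapping = {"cnt": "obs", "lsp": "phase"}     # hypothetical transformation mapping

directly_compatible = is_structural_subtype(ps, pt)
structurally_feasible = is_structural_subtype(apply_mapping(ps, mapping), pt)
```

In this toy setting `ps` is not a subtype of `pt`, but the mapped type is, which is the situation the slide calls “structurally feasible”.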

  26. Service Reusability • Idea: annotate services with semantic types (concept expressions), primarily for discovery of services • Figure: desired connection from Source Service (Ps) to Target Service (Pt); with respect to the Ontologies (OWL), SemanticType Ps and SemanticType Pt are compatible (⊑)

  27. Service Reusability • Services can be semantically compatible, but structurally incompatible • Figure: SemanticType Ps and SemanticType Pt are compatible (⊑) with respect to the Ontologies (OWL); StructuralType Ps and StructuralType Pt are incompatible (⋠), while the mapping applied to Ps is compatible (≺) with Pt

  28. The Ontology-Driven Framework (work w/ Shawn Bowers, SEEK) • Figure labels: Ontologies (OWL); SemanticType Ps and SemanticType Pt, compatible (⊑); Registration Mapping (Input); Registration Mapping (Output); StructuralType Ps and StructuralType Pt; Correspondence; Generate Transformation; Source Service (Ps), Target Service (Pt), Desired Connection
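
One way to read the framework is that correspondences between the two structural types are derived by going through the ontology: parts of the source and target structures that are registered to compatible concepts are paired up. The Python sketch below shows only the simplest case, where the concepts are equal. The structural paths are taken from the XQuery example on slide 29, while the concept names `Count` and `LifeStagePhase`, the two registration tables, and the function name are hypothetical.

```python
# Hypothetical registration mappings: structural path -> ontology concept.
SOURCE_REGISTRATION = {
    "population/sample/meas/cnt": "Count",
    "population/sample/lsp": "LifeStagePhase",
}
TARGET_REGISTRATION = {
    "cohortTable/measurement/obs": "Count",
    "cohortTable/measurement/phase": "LifeStagePhase",
}

def correspondences(source_reg, target_reg):
    """Pair source and target paths that are registered to the same concept."""
    by_concept = {}
    for path, concept in target_reg.items():
        by_concept.setdefault(concept, []).append(path)
    return sorted(
        (src, tgt)
        for src, concept in source_reg.items()
        for tgt in by_concept.get(concept, [])
    )

pairs = correspondences(SOURCE_REGISTRATION, TARGET_REGISTRATION)
```

A fuller version would match on concept subsumption rather than equality; that part is omitted here.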

  29. Example Generated Data Transformation (in XQuery) • Based on the structural correspondences and certain assumptions, we derive the transformation query:
      <cohortTable> {
        for $s in /population/sample return
          <measurement>
            { for $c in $s/meas/cnt return <obs>{$c/text()}</obs> }
            { for $l in $s/lsp return <phase>{$l/text()}</phase> }
          </measurement>
      } </cohortTable>
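
For readers without an XQuery engine at hand, the same restructuring can be approximated in Python with the standard library's `xml.etree.ElementTree`. This is a hand-written sketch of what the query above does, not output of the framework; the function name `transform` and the sample `SOURCE` document are made up for illustration.

```python
import xml.etree.ElementTree as ET

def transform(population):
    """Rebuild a <population> element as a <cohortTable>, mirroring the XQuery."""
    cohort = ET.Element("cohortTable")
    for sample in population.findall("sample"):
        measurement = ET.SubElement(cohort, "measurement")
        # Each meas/cnt under the sample becomes an <obs>.
        for cnt in sample.findall("meas/cnt"):
            ET.SubElement(measurement, "obs").text = cnt.text
        # Each lsp under the sample becomes a <phase>.
        for lsp in sample.findall("lsp"):
            ET.SubElement(measurement, "phase").text = lsp.text
    return cohort

# Made-up input document, for illustration only.
SOURCE = "<population><sample><meas><cnt>5</cnt></meas><lsp>adult</lsp></sample></population>"
result = ET.tostring(transform(ET.fromstring(SOURCE)), encoding="unicode")
```

Each `sample` yields one `measurement`, with all `obs` elements emitted before all `phase` elements, as in the two nested loops of the query.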

  30. Scientific Workflows (Efrat Jaeger et al.)

  31. Reverse Engineering a Scientific Workflow using the KEPLER Tool (Efrat Jaeger)

  32. A Scientific Workflow in Kepler • Figure callouts: Extract mineral composition for row Id; Igneous Rock Diagrams information; Rock Name

  33. A Scientific Workflow in Kepler

  34. A Scientific Workflow in Kepler

  35. Reverse-Engineered the Geological Map Integration in Kepler

  36. DataMapper Sub-Workflow

  37. Result launched via the BrowserUI actor

  38. KEPLER and YOU • Kepler … • is a community-based, cross-project, open source collaboration • for “minute made” application integration using web (grid) services as basic building blocks • has a joint CVS repository, mailing lists, web site, … • is gaining momentum thanks to contributors and contributions • BSD-style license allows commercial spin-offs • a pre-packaged, shrink-wrapped version (“Kepler-to-GO”) coming soon to a place near you…

  39. F I N – Questions?

  40. Additional Material

  41. The KEPLER GUI (Vergil from Ptolemy II) Drag and drop utilities, director and actor libraries.

  42. Running the workflow

  43. Distributed Workflows in KEPLER • Web and Grid Service plug-ins • WSDL • ProxyInit, GlobusGridJob, GridFTP, DataAccessWizard • SRB • SSH, SCP • Web Service Harvester • Imports all the operations of a specific WS (or of all the WSs in a UDDI repository) as Kepler actors • XSLT and XQuery transformers to link non-fitting services together • Web Service Deployment (…ongoing work…)