Science environment for ecological knowledge

Science Environment for Ecological Knowledge SEEK ... - PowerPoint PPT Presentation



UC Santa Barbara. U New Mexico. UC San Diego. U Kansas. Vermont, Napier, ASU, UNC. Science Environment for Ecological Knowledge . Bertram Ludäscher San Diego Supercomputer Center University of California, San Diego. http://seek.ecoinformatics.org. Architecture Overview .




Presentation Transcript

UC Santa Barbara

U New Mexico

UC San Diego

U Kansas

Vermont, Napier, ASU, UNC

Science Environment for Ecological Knowledge

Bertram Ludäscher

San Diego Supercomputer Center

University of California, San Diego

http://seek.ecoinformatics.org


Architecture Overview

  • Analysis & Modeling System

    • Design and execution of ecological models and analysis

    • End user focus

    • application-/upperware

  • Semantic Mediation System

    • Data Integration of hard-to-relate sources and processes

    • Semantic Types and Ontologies

    • upper middleware

  • EcoGrid

    • Access to ecology data and tools

    • middle-/underware

(cf. GEON + Cyberinfrastructure)

  • Plus Working Groups:

    • Knowledge Representation (SEEK-KR)

    • Classification and Nomenclature (TAXON)

    • Biodiversity and Ecological Analysis and Modeling (BEAM)


SEEK EcoGrid

  • Goal: standardize interfaces (using web and grid services)

    • We have standardized data via EML

    • Integrate diverse data networks from ecology, biodiversity, and environmental sciences

  • Grid-standardized interfaces

    • Uniform interface to:

      • Metacat, SRB, DiGIR, Xanthoria, etc.

      • Anyone can implement these interfaces

      • Hides complexity of underlying systems

  • Metadata-mediated data access

    • Supports multiple metadata standards

    • EML, Darwin Core as foci

  • Computational services

    • Pre-defined analytical services

    • On-the-fly analytical services
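The idea of one uniform interface in front of different back ends can be illustrated with a minimal sketch. Everything named here is hypothetical: `EcoGridNode`, `InMemoryMetacat`, `InMemoryDigir`, and `federated_query` are illustration-only stand-ins, not the actual EcoGrid, Metacat, or DiGIR APIs, and the records are made up.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class EcoGridNode(ABC):
    """Hypothetical uniform interface; not the real EcoGrid API."""

    @abstractmethod
    def query(self, keyword: str) -> List[Dict[str, str]]:
        """Return metadata records that mention the keyword."""


class InMemoryMetacat(EcoGridNode):
    """Stand-in for a Metacat node holding EML-style records."""

    def __init__(self, records):
        self._records = records

    def query(self, keyword):
        return [
            {"source": "metacat", "standard": "EML", "title": r["title"]}
            for r in self._records
            if keyword.lower() in r["title"].lower()
        ]


class InMemoryDigir(EcoGridNode):
    """Stand-in for a DiGIR provider holding Darwin Core-style records."""

    def __init__(self, specimens):
        self._specimens = specimens

    def query(self, keyword):
        return [
            {"source": "digir", "standard": "Darwin Core", "title": s["ScientificName"]}
            for s in self._specimens
            if keyword.lower() in s["ScientificName"].lower()
        ]


def federated_query(nodes, keyword):
    """Send the same query to every node and concatenate the answers."""
    results = []
    for node in nodes:
        results.extend(node.query(keyword))
    return results


nodes = [
    InMemoryMetacat([{"title": "Psychotria biomass survey"}, {"title": "Lake temperature"}]),
    InMemoryDigir([{"ScientificName": "Psychotria limonensis"}]),
]
hits = federated_query(nodes, "psychotria")
```

The caller only sees `query`; each adapter hides how its own system stores and describes data, which is the point of the "anyone can implement these interfaces" bullet.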


Grid versus Web Services

  • Grid Services are Web Services

    • Add authentication, lifecycle management, notification, etc.

    • Globus Toolkit 3: Implements Open Grid Services Architecture (OGSA)

  • Implications for use

    • Write a normal web service extending GridService base class

    • When deployed within GT3, you get these extra functions for ‘free’

    • Supports distributed computation via proxy authentication

  • Problems

    • Complex system to understand

    • GT3 can be difficult to deploy

    • Proposals to incorporate grid services within the Web services community (Web Services Resource Framework [WSRF])


EcoGrid client interactions

  • Modes of interaction

    • Client-server

    • Fully distributed

    • Peer-to-peer

  • EcoGrid Registry

    • Node discovery

    • Service discovery

  • Aggregation services

    • Centralized access

    • Reliability

    • Data preservation
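Node discovery and service discovery can be pictured with a toy registry. This is only a sketch under assumptions: the `Registry` class, the node names, the service type names, and the example.org endpoints are all invented for illustration and do not describe the real EcoGrid Registry.

```python
from collections import defaultdict


class Registry:
    """Toy registry: maps service types to the nodes that offer them."""

    def __init__(self):
        self._nodes = {}                     # node name -> endpoint URL
        self._by_service = defaultdict(set)  # service type -> node names

    def register(self, name, endpoint, services):
        """Record a node and the service types it offers."""
        self._nodes[name] = endpoint
        for service in services:
            self._by_service[service].add(name)

    def discover_nodes(self):
        """Node discovery: list every registered node."""
        return sorted(self._nodes)

    def discover_service(self, service):
        """Service discovery: endpoints of nodes offering a service type."""
        return sorted(self._nodes[n] for n in self._by_service.get(service, ()))


registry = Registry()
registry.register("node-a", "http://example.org/a", ["query", "put"])
registry.register("node-b", "http://example.org/b", ["query"])
```

A client would first ask the registry which endpoints offer a service type and then talk to those endpoints directly.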


Building the EcoGrid

  • LTER Network (24)

  • Natural History Collections (>> 100)

  • Organization of Biological Field Stations (180)

  • UC Natural Reserve System (36)

  • Partnership for Interdisciplinary Studies of Coastal Oceans (4)

  • Multi-agency Rocky Intertidal Network (60)

[Figure: EcoGrid nodes, with sites labeled LUQ, AND, HBR, VCR, and NTL. Legend: Metacat node, SRB node, VegBank node, DiGIR node, Xanthoria node, legacy system.]


Kepler: Scientific Workflows

Query EcoGrid to find data

Archive output to EcoGrid

EML provides semi-automated data binding

Scientific workflows represent knowledge about the process; Kepler captures this knowledge


GARP Invasive Species Model

[Figure: GARP workflow diagram, drawn once for the native range and once for the invasion area. Labeled elements: DiGIR, species presence & absence points (a); SRB, environmental layers (b); EcoGrid Query; Layer Integration, integrated layers (c); training sample and test sample (d); Data Calculation; GARP rule set (e); Map, native range prediction map and invasion area prediction map (f); Validation, model quality parameter (g); User; Sample; +A1, +A2, +A3.]

Scientific workflows represent knowledge about the process; AMS captures this knowledge

Slide from D. Pennington


Kepler Team, Projects, Sponsors

  • Ilkay Altintas (SDM)

  • Chad Berkley (SEEK)

  • Shawn Bowers (SEEK)

  • Jeffrey Grethe (BIRN)

  • Christopher H. Brooks (Ptolemy II)

  • Zhengang Cheng (SDM)

  • Efrat Jaeger (GEON)

  • Matt Jones (SEEK)

  • Edward A. Lee (Ptolemy II)

  • Kai Lin (GEON)

  • Bertram Ludäscher (BIRN, GEON, SDM, SEEK)

  • Steve Mock (NMI)

  • Steve Neuendorffer (Ptolemy II)

  • Jing Tao (SEEK)

  • Mladen Vouk (SDM)

  • Yang Zhao (Ptolemy II)


Kepler Understands EML Data (Chad Berkley, SEEK)


Kepler: Ecological Modeling (Chad Berkley, SEEK)


Database Access (Efrat Jaeger, GEON)

Note: EML descriptions of relational sources would allow automated data ingestion
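As a rough illustration of why a metadata description helps ingestion, the sketch below derives a table definition from an attribute list. The XML is a simplified, namespace-free fragment modeled loosely on EML's `attributeList`; the fragment, the `SQL_TYPES` mapping, and `create_table_sql` are assumptions made for this example, not part of Kepler or of any EML tooling.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical EML-like description of one relational table.
EML_FRAGMENT = """
<dataTable>
  <entityName>plots</entityName>
  <attributeList>
    <attribute><attributeName>site</attributeName><storageType>string</storageType></attribute>
    <attribute><attributeName>count</attributeName><storageType>integer</storageType></attribute>
    <attribute><attributeName>area_m2</attributeName><storageType>float</storageType></attribute>
  </attributeList>
</dataTable>
"""

# Assumed mapping from declared storage types to SQL column types.
SQL_TYPES = {"string": "TEXT", "integer": "INTEGER", "float": "REAL"}


def create_table_sql(eml_xml):
    """Build a CREATE TABLE statement from the attribute descriptions."""
    table = ET.fromstring(eml_xml.strip())
    name = table.findtext("entityName")
    columns = []
    for attr in table.iter("attribute"):
        column = attr.findtext("attributeName")
        sql_type = SQL_TYPES.get(attr.findtext("storageType"), "TEXT")
        columns.append(f"{column} {sql_type}")
    return f"CREATE TABLE {name} ({', '.join(columns)})"


sql = create_table_sql(EML_FRAGMENT)
```

Because column names and types come from the metadata rather than from hand-written code, the same function would work for any table described this way.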





SWF Reengineering (Ilkay, SDM; Ashraf, Efrat, Kai, GEON)



Result launched via BrowserUI actor (coupling with ESRI’s ArcIMS)


Distributed Workflows in KEPLER

  • Web and Grid Service plug-ins

    • WSDL (now) and Grid services (stay tuned …)

    • ProxyInit, GlobusGridJob, GridFTP, DataAccessWizard

    • SSH, SCP, SDSC SRB, OGS?-???… coming

  • WS Harvester

    • Import query-defined WS operations as Kepler actors

  • XSLT and XQuery Data Transformers

    • to link web services that were not “designed to fit” each other

  • WS-deployment interface (planned)


Web Service Actor (Ilkay Altintas, SDM)

[Figure callout: Configure, select service operation]

  • Given a WSDL and the name of an operation of a web service, the actor dynamically customizes itself to implement and execute that method.
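The self-customizing step can be sketched as follows: read a WSDL, look up the named operation, and derive the actor's input and output ports from the operation's messages. This is a simplified illustration, not Kepler's implementation: the `WebServiceActor` class here is hypothetical, the WSDL is a made-up WSDL 1.1 snippet, and the sketch only configures ports without calling any service.

```python
import xml.etree.ElementTree as ET

WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Made-up WSDL 1.1 snippet with a single operation.
SAMPLE_WSDL = """
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:xsd="http://www.w3.org/2001/XMLSchema"
             xmlns:tns="urn:example" targetNamespace="urn:example">
  <message name="GetSpeciesRequest"><part name="genus" type="xsd:string"/></message>
  <message name="GetSpeciesResponse"><part name="species" type="xsd:string"/></message>
  <portType name="TaxonPort">
    <operation name="getSpecies">
      <input message="tns:GetSpeciesRequest"/>
      <output message="tns:GetSpeciesResponse"/>
    </operation>
  </portType>
</definitions>
"""


class WebServiceActor:
    """Hypothetical actor that derives its ports from a WSDL operation."""

    def __init__(self, wsdl_text, operation):
        root = ET.fromstring(wsdl_text.strip())
        # Map each message name to the names of its parts.
        parts = {
            msg.get("name"): [p.get("name") for p in msg.findall("wsdl:part", WSDL_NS)]
            for msg in root.findall("wsdl:message", WSDL_NS)
        }
        op = root.find(f"wsdl:portType/wsdl:operation[@name='{operation}']", WSDL_NS)
        if op is None:
            raise ValueError(f"operation {operation!r} not found in WSDL")
        self.operation = operation
        self.input_ports = parts[self._message(op, "input")]
        self.output_ports = parts[self._message(op, "output")]

    @staticmethod
    def _message(op, direction):
        """Return the local name of the message referenced by input/output."""
        ref = op.find(f"wsdl:{direction}", WSDL_NS).get("message")
        return ref.split(":")[-1]


actor = WebServiceActor(SAMPLE_WSDL, "getSpecies")
```

After construction the actor exposes one port per message part, so it can be wired into a workflow like any other actor.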


Set Parameters and Commit



Specialized WS Actor (after instantiation)


Web Service Harvester (Ilkay Altintas, SDM)

  • Imports the web services in a repository into the actor library.

  • Has the capability to search for web services based on a keyword.



An (oversimplified) Model of the Grid

[Figure: dataflow X → f → Y → g → Z]

  • Hosts: {h1, h2, h3, …}

  • Data@Hosts: d1@{hi}, d2@{hj}, …

  • Functions@Hosts: f1@{hi}, f2@{hj}, …

  • Given: data/workflow:

  • … as a functional plan: […; Y := f(X); Z := g(Y); …]

  • … as a logic plan: […; f(X,Y) ∧ g(Y,Z); …]

  • Find Host Assignment: di → hi, fj → hj for all di, fj

    … s.t. […; Y@hY := f@hf(X@hX), …] is a valid plan
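A brute-force search makes the host-assignment problem concrete. The sketch assumes one particular notion of validity that the slide does not spell out: a step is valid on a host only if both the function and its input are already available there, and no shipping is allowed. The placements and the `find_assignments` helper are invented for illustration.

```python
from itertools import product


def find_assignments(plan, data_at, funcs_at, hosts):
    """Try every host choice per step; keep those where each step is valid.

    plan: list of (output, function, input) triples, e.g. Y := f(X).
    Assumed validity rule: function and input must be on the chosen host.
    """
    valid = []
    for choice in product(hosts, repeat=len(plan)):
        located = {d: set(hs) for d, hs in data_at.items()}
        ok = True
        for (out, fn, inp), host in zip(plan, choice):
            if host not in funcs_at.get(fn, ()) or host not in located.get(inp, ()):
                ok = False
                break
            # The output now exists on the host where it was computed.
            located.setdefault(out, set()).add(host)
        if ok:
            valid.append(dict(zip((step[0] for step in plan), choice)))
    return valid


hosts = ["h1", "h2", "h3"]
data_at = {"X": {"h1", "h2"}}                 # X is replicated on h1 and h2
funcs_at = {"f": {"h2", "h3"}, "g": {"h2"}}   # where each function can run
plan = [("Y", "f", "X"), ("Z", "g", "Y")]     # Y := f(X); Z := g(Y)
assignments = find_assignments(plan, data_at, funcs_at, hosts)
```

With these placements only h2 holds both X and f, and g also runs on h2, so the single valid assignment computes both steps there. Allowing data or functions to be shipped between hosts enlarges the set of valid plans.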


Shipping & Handling Algebra (SHA)

Logical view

  • plan Y@C = f@A of X@B =

  • (1) [ X@B to A, Y@A := f@A(X@A), Y@A to C ]

  • (2) [ f@A => B, Y@B := f@B(X@B), Y@B to C ]

  • (3) [ X@B to C, f@A => C, Y@C := f@C(X@C) ]

Physical view: SHA Plans (1), (2), (3)
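The physical plans can be generated mechanically once the computation host is chosen. The sketch below assumes a scenario with data X on host B, function f on host A, and the result Y wanted on host C; it writes "to" for shipping data and "=>" for shipping a function. The `sha_plans` and `shipping_cost` helpers and the size figures are hypothetical, and the cost model (sum of the sizes shipped) is an assumption.

```python
def sha_plans(data_host, func_host, target_host, hosts):
    """One plan per candidate computation host h."""
    plans = {}
    for h in hosts:
        steps = []
        if data_host != h:
            steps.append(f"X@{data_host} to {h}")      # ship the data
        if func_host != h:
            steps.append(f"f@{func_host} => {h}")      # ship the function
        steps.append(f"Y@{h} := f@{h}(X@{h})")         # compute at h
        if h != target_host:
            steps.append(f"Y@{h} to {target_host}")    # ship the result
        plans[h] = steps
    return plans


def shipping_cost(h, data_host, func_host, target_host, sizes):
    """Assumed cost model: total size of everything that must be shipped."""
    cost = 0
    if data_host != h:
        cost += sizes["X"]
    if func_host != h:
        cost += sizes["f"]
    if h != target_host:
        cost += sizes["Y"]
    return cost


plans = sha_plans("B", "A", "C", ["A", "B", "C"])
sizes = {"X": 1000, "f": 1, "Y": 10}   # made-up sizes: large input, small code
best = min(plans, key=lambda h: shipping_cost(h, "B", "A", "C", sizes))
```

With a large input and a small function, moving the function to the data (computing on B) is the cheapest of the three plans under this cost model.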


Grid-Enabling PTII: Handles

  • (1) A → GA: get_handle

  • (2) GA → A: return &X

  • (3) A → B: send &X

  • (4) B → GB: request &X

  • (5) GB → GA: request &X

  • (6) GA → GB: send *X

  • (7) GB → B: send done(&X)

  • Example:

    • &X = “GA.17”

    • *X = <some_huge_file>

  • Candidate Formalisms:

  • GridFTP

  • SSH, SCP

  • SDSC SRB

  • OGS?-??? … WSRF?

Logical token transfer (3) requires get_handle(1,2); then exec_handle(4,5,6,7) for completion.

[Figure: actors A and B in Keplerspace and grid hosts GA and GB in Gridspace, connected by arrows numbered 1 to 7.]
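The handle exchange can be simulated in a few lines. In this sketch only the short handle travels between the actors A and B, while the payload moves directly between the grid hosts GA and GB. The `GridHost` class and its methods are hypothetical and the transfer is an in-memory copy; a real system would use one of the candidate transports listed on the slide.

```python
class GridHost:
    """Hypothetical grid-side host that stores payloads by handle."""

    def __init__(self, name):
        self.name = name
        self.store = {}
        self._next_id = 17  # so the first handle reads "GA.17", as in the slide example

    def get_handle(self, payload):
        """Steps 1-2: register a payload assumed to live here; return &X."""
        handle = f"{self.name}.{self._next_id}"
        self._next_id += 1
        self.store[handle] = payload
        return handle

    def send(self, handle):
        """Step 6: hand over the payload *X for a handle."""
        return self.store[handle]

    def request(self, handle, hosts):
        """Steps 4-7: fetch *X from the host named in the handle, then confirm."""
        owner = hosts[handle.split(".")[0]]
        self.store[handle] = owner.send(handle)
        return ("done", handle)


hosts = {"GA": GridHost("GA"), "GB": GridHost("GB")}
handle = hosts["GA"].get_handle("<some_huge_file>")  # steps 1-2: A obtains &X
token = handle                                       # step 3: A sends only &X to B
status = hosts["GB"].request(token, hosts)           # steps 4-7: GB pulls *X from GA
```

The workflow engine never touches the large file; it only passes the string handle along the edge from A to B.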


Homogeneous Data Integration

  • Integration of homogeneous or mostly homogeneous data via EML metadata is relatively straightforward


Heterogeneous Data integration

  • Requires advanced metadata and processing

    • Attributes must be semantically typed

    • Collection protocols must be known

    • Units and measurement scale must be known

    • Measurement relationships must be known

      • e.g., that ArealDensity=Count/Area
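A small worked example of the measurement relationship ArealDensity = Count/Area: one source reports counts and plot areas, another already reports densities, and knowing the relationship lets the two be merged. The record layouts, field names, and values are invented for illustration, and both sources are assumed to use square meters.

```python
def areal_density(count, area_m2):
    """ArealDensity = Count / Area, here in individuals per square meter."""
    if area_m2 <= 0:
        raise ValueError("area must be positive")
    return count / area_m2


# Two hypothetical sources that record the same quantity differently.
source_a = [{"site": "s1", "count": 12, "area_m2": 4.0}]
source_b = [{"site": "s2", "density_per_m2": 2.5}]

# Convert source A to the derived measurement, then combine.
merged = [
    {"site": r["site"], "density_per_m2": areal_density(r["count"], r["area_m2"])}
    for r in source_a
] + source_b
```

Without the declared relationship and units, nothing in the raw columns says that `count` and `area_m2` together are comparable to `density_per_m2`.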


Semantic Mediation

  • Label data with semantic types

  • Label inputs and outputs of analytical components with semantic types

  • Use reasoning engines to generate transformation steps

    • Beware analytical constraints

  • Use reasoning engine to discover relevant components

[Figure: Data and Workflow Components linked through an Ontology]


Ecological ontologies

  • What was measured (e.g., biomass)

  • Type of measurement (e.g., Energy)

  • Context of measurement (e.g., Psychotria limonensis)

  • How it was measured (e.g., dry weight)

  • SEEK intends to enable community-created ecological ontologies using OWL

    • Represents a controlled vocabulary for ecological metadata


Extensions: Semantic Types

  • Take concepts and relationships from an ontology to “semantically type” the data-in/out ports

  • Application: e.g., design support:

    • smart/semi-automatic wiring, generation of “massaging actors”

[Figure: actor m1 (normalize) with ports p3 and p4. It takes Abundance Count Measurements for Life Stages and returns Mortality Rate Derived Measurements for Life Stages.]


Semantic Types

  • The semantic type signature

    • Type expressions over the (OWL) ontology

[Figure: actor m1 (normalize) with ports p3 and p4]

SemType m1 ::

    Observation & itemMeasured.AbundanceCount & hasContext.appliesTo.LifeStageProperty

    -> DerivedObservation & itemMeasured.MortalityRate & hasContext.appliesTo.LifeStageProperty
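One use of such signatures is checking whether an output port may be wired to an input port. The sketch below is a strong simplification: it treats a semantic type as a set of concept names (a conjunction), ignores role restrictions such as itemMeasured, and uses a tiny hand-written subclass table in place of an OWL reasoner. The hierarchy, including the `Measurement` concept and the assumption that DerivedObservation is a kind of Observation, is hypothetical.

```python
# Hypothetical ontology fragment: child concept -> parent concept.
SUBCLASS_OF = {
    "DerivedObservation": "Observation",
    "AbundanceCount": "Measurement",
    "MortalityRate": "Measurement",
}


def is_a(concept, ancestor):
    """True if concept equals ancestor or is below it in the hierarchy."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = SUBCLASS_OF.get(concept)
    return False


def subsumed_by(provided, required):
    """Conjunctive types as sets: every required concept must be covered
    by some provided concept that is equal to it or more specific."""
    return all(any(is_a(p, r) for p in provided) for r in required)


m1_output = {"DerivedObservation", "MortalityRate"}
wants_observation = {"Observation"}
wants_abundance = {"Observation", "AbundanceCount"}

can_wire_general = subsumed_by(m1_output, wants_observation)
can_wire_abundance = subsumed_by(m1_output, wants_abundance)
```

Here the output of m1 can feed a port that accepts any Observation, but not one that requires abundance counts; a design tool could use the failed check to suggest a transformation step.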


Extended Type System (here: OWL Semantic Types)

SemType m1 ::

    Observation & itemMeasured.AbundanceCount & hasContext.appliesTo.LifeStageProperty

    -> DerivedObservation & itemMeasured.MortalityRate & hasContext.appliesTo.LifeStageProperty

Substructure association:

XML raw-data =(X)Query=> object model =link=> OWL ontology



Deriving Data Transformations from Semantic Service Registration

[Bowers-Ludaescher, DILS’04]


Structural and Semantic Mappings

[Bowers-Ludaescher, DILS’04]


SEEK Impact

  • Fundamental improvements for researchers

    • Global access to ecologically relevant data

    • Rapidly locate and utilize distributed computation

    • Capture, reproduce, extend analysis process


Acknowledgements

This material is based upon work supported by:

The National Science Foundation under Grant Numbers 9980154, 9904777, 0131178, 9905838, 0129792, and 0225676.

PBI Collaborators: NCEAS, University of New Mexico (Long Term Ecological Research Network Office), San Diego Supercomputer Center, University of Kansas (Center for Biodiversity Research)

Kepler contributors: SEEK, Ptolemy II, SDM/SciDAC, GEON

