
Steven Perry, Dave Vieglais




  1. Steven Perry, Dave Vieglais

  2. Overview WASABI is a framework for building scientific data networks based on RDF, OWL, and open data access protocols.

  3. Objective Build a data access network that… • Can handle many types of objects • Is resilient to changes in data models • Refers to objects with GUIDs • Allows fast & efficient searches • Allows incremental harvesting • Simplifies creation of client software

  4. RDF and OWL RDF described by OWL allows… • Machine readable controlled vocabularies • Distinction between classes and properties • Data objects as resources identified with globally unique LSIDs • Query languages to examine patterns of relationships between objects

  5. Framework Components Provides access to RDF data sets through multiple protocols

  6. Framework Components Libraries for building client applications Provides access to RDF data sets through multiple protocols

  7. Framework Components Web-based client for accessing data on a Wasabi network Libraries for building client applications Provides access to RDF data sets through multiple protocols

  8. A Simple Network

  9. Wasabi Server Server • Stores a cached copy of source data in RDF format called a data set • Each data set is bound to one or more protocol handlers • Standard protocols include OAI, SimpleLSID, and SPARQL

  10. Wasabi Server

  11. Loading Data

  12. Loading Data Loading RDF Data • RDF data can be loaded from one or more files directly into Wasabi • Wasabi will not assign new LSIDs • Wasabi checks to see if any data objects are new or have changed and can scan for deleted data objects
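
The change check on this slide can be pictured as a comparison of content fingerprints keyed by LSID. The sketch below is an illustration only, not Wasabi's code: the function names, the dictionary layout, and the choice of SHA-1 fingerprints are all assumptions made for the example.

```python
import hashlib


def fingerprint(rdf_text):
    """Return a stable digest of one object's serialized RDF (assumed layout)."""
    return hashlib.sha1(rdf_text.encode("utf-8")).hexdigest()


def diff_data_set(cached, incoming):
    """Classify LSIDs as new, changed, or deleted.

    cached   -- dict mapping LSID -> fingerprint stored at the last load
    incoming -- dict mapping LSID -> serialized RDF from the files being loaded
    """
    new, changed = [], []
    for lsid, rdf_text in incoming.items():
        if lsid not in cached:
            new.append(lsid)
        elif cached[lsid] != fingerprint(rdf_text):
            changed.append(lsid)
    # Anything cached but absent from the incoming files is treated as deleted.
    deleted = [lsid for lsid in cached if lsid not in incoming]
    return sorted(new), sorted(changed), sorted(deleted)
```

Because the LSIDs come from the loaded files themselves, the comparison never has to mint identifiers, which matches the slide's point that Wasabi will not assign new LSIDs for RDF input.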

  13. Loading Data

  14. Loading Data Loading Non-RDF Data • Wasabi uses a synchronizer program to generate RDF from SQL output or delimited files • Synch program must know about your source data format • Wasabi can assign LSIDs if needed • Wasabi checks to see if any data objects are new or have changed and can scan for deleted data objects
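
As a rough picture of what a synchronizer does with a delimited file, the sketch below turns each row into triples and builds an LSID from a chosen ID column. Everything here is hypothetical: `rows_to_triples`, its parameters, and the use of raw column names as predicates are assumptions for illustration, not the Wasabi synchronizer's interface.

```python
import csv
import io


def rows_to_triples(delimited_text, authority, namespace, id_column):
    """Turn delimited rows into (subject, predicate, object) triples.

    Each row gets an LSID built from the authority, namespace, and the
    value in id_column; every other non-empty column becomes one triple.
    """
    triples = []
    reader = csv.DictReader(io.StringIO(delimited_text))
    for row in reader:
        lsid = "urn:lsid:%s:%s:%s" % (authority, namespace, row[id_column])
        for column, value in row.items():
            if column == id_column or value == "":
                continue
            # Column names are used directly as predicates here; a real
            # synchronizer would map them to ontology properties.
            triples.append((lsid, column, value))
    return triples
```

This is why the synchronizer "must know about your source data format": it needs to know which column identifies an object and how the remaining columns map to properties.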

  15. OAI-PMH

  16. OAI-PMH Open Archives Initiative Protocol for Metadata Harvesting • Wasabi implementation allows efficient harvesting • Supports incremental harvesting “What objects have changed since Oct-02-2006?” • Notifies clients about deletions
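
In OAI-PMH terms, the incremental question on this slide is a `ListRecords` request with a `from` date. The sketch below builds such a request URL. The endpoint `http://example.org/wasabi/oai` is a placeholder, and `oai_dc` is used only because every OAI-PMH repository must support it; the metadata prefix a Wasabi server uses for RDF is not stated in the slides.

```python
from urllib.parse import urlencode


def list_records_url(base_url, since, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request for records changed since a date.

    since is a UTC date in the YYYY-MM-DD form that OAI-PMH requires.
    """
    params = [
        ("verb", "ListRecords"),
        ("metadataPrefix", metadata_prefix),
        ("from", since),
    ]
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint; "changed since Oct-02-2006" becomes from=2006-10-02.
url = list_records_url("http://example.org/wasabi/oai", "2006-10-02")
```

In the protocol, a deleted record is reported through a `status="deleted"` attribute on the record header, which is how a harvester learns about deletions.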

  17. LSID Resolution

  18. LSID Resolution Life Science Identifier Metadata Resolution • Wasabi supports a simple HTTP-GET LSID metadata resolution service • Supports metadata resolution “What is the RDF metadata for urn:lsid:auth.org:ns:23?” • Compliant LSID resolution through plug-in for IBM LSID resolver.
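
An LSID has the form `urn:lsid:authority:namespace:object`, with an optional trailing revision. The sketch below splits one into its parts and builds an HTTP GET URL for a metadata request. The base URL and the `lsid` query parameter name are assumptions; the slides do not give the actual request format of the simple resolution service.

```python
from urllib.parse import urlencode


def parse_lsid(lsid):
    """Split an LSID into authority, namespace, object id, and optional revision."""
    parts = lsid.split(":")
    if (len(parts) not in (5, 6)
            or parts[0].lower() != "urn"
            or parts[1].lower() != "lsid"):
        raise ValueError("not an LSID: %r" % lsid)
    return {
        "authority": parts[2],
        "namespace": parts[3],
        "object": parts[4],
        "revision": parts[5] if len(parts) == 6 else None,
    }


def metadata_url(base_url, lsid):
    """Build an HTTP GET URL; the 'lsid' parameter name is an assumption."""
    parse_lsid(lsid)  # fail early on malformed identifiers
    return base_url + "?" + urlencode({"lsid": lsid})
```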

  19. LSID Resolution

  20. SPARQL

  21. SPARQL SPARQL Protocol • SPARQL is the W3C candidate for querying RDF • SPARQL protocol bound to HTTP-GET • ASK and SELECT queries return SPARQL XML results • DESCRIBE and CONSTRUCT queries return RDF/XML results
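
The HTTP-GET binding of the SPARQL protocol carries the query text in a `query` parameter, and the result format depends on the query form as listed above. A minimal sketch follows, with a placeholder endpoint and a deliberately naive keyword scan to detect the query form; neither is taken from Wasabi.

```python
from urllib.parse import urlencode

# Result media types by query form, following the slide's two groups.
RESULT_TYPES = {
    "ASK": "application/sparql-results+xml",
    "SELECT": "application/sparql-results+xml",
    "DESCRIBE": "application/rdf+xml",
    "CONSTRUCT": "application/rdf+xml",
}


def query_form(query):
    """Return the first query-form keyword found (naive: ignores comments and strings)."""
    for word in query.split():
        if word.upper() in RESULT_TYPES:
            return word.upper()
    raise ValueError("no SPARQL query form found")


def sparql_get_url(endpoint, query):
    """Encode the query into the 'query' parameter of an HTTP GET request."""
    return endpoint + "?" + urlencode({"query": query})
```

A client can use `RESULT_TYPES[query_form(q)]` to decide whether to parse the response as a result table or as an RDF graph.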

  22. SPARQL SPARQL Query Language Example • “What is urn:lsid:auth.org:person:3424?” DESCRIBE <urn:lsid:auth.org:person:3424> <rdf:RDF xmlns:j.0="http://tdwg.org/onto/bdi/person.owl#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"> <rdf:Description rdf:about="urn:lsid:auth.org:person:3424"> <rdf:type rdf:resource="http://tdwg.org/onto/bdi/person.owl#Person"/> <j.0:givenName>Steven</j.0:givenName> <j.0:familyName>Perry</j.0:familyName> </rdf:Description> </rdf:RDF>
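
A client that receives a DESCRIBE response like this one could read it with any XML or RDF parser. The sketch below uses Python's standard `xml.etree.ElementTree` and assumes the properties are wrapped in an `rdf:Description` element with an `rdf:about` attribute, as well-formed RDF/XML would be. The function name and the flat dictionary output are choices made for the example only.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The slide's response, written out as well-formed RDF/XML (assumed wrapper).
RESPONSE = """<rdf:RDF
    xmlns:j.0="http://tdwg.org/onto/bdi/person.owl#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="urn:lsid:auth.org:person:3424">
    <rdf:type rdf:resource="http://tdwg.org/onto/bdi/person.owl#Person"/>
    <j.0:givenName>Steven</j.0:givenName>
    <j.0:familyName>Perry</j.0:familyName>
  </rdf:Description>
</rdf:RDF>"""


def describe_to_dict(rdf_xml):
    """Flatten the first rdf:Description into a dict of local name -> value."""
    root = ET.fromstring(rdf_xml)
    description = root.find("{%s}Description" % RDF_NS)
    result = {"about": description.get("{%s}about" % RDF_NS)}
    for child in description:
        # ElementTree tags look like "{namespace}local"; keep the local part.
        _, _, local = child.tag[1:].partition("}")
        resource = child.get("{%s}resource" % RDF_NS)
        result[local] = resource if resource is not None else child.text
    return result


person = describe_to_dict(RESPONSE)
```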

  23. SPARQL SPARQL Query Language Example • “What is the genus of the specimen urn:lsid:auth.org:spec:657?” SELECT ?genus WHERE { <urn:lsid:auth.org:spec:657> <spec:identifiedAs> ?txname . ?txname <tn:rank> <tn:Genus> . ?txname <tn:uninomial> ?genus } Result: ?genus = "Heteractis"
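
The SELECT query works by joining three triple patterns on the shared variable `?txname`. The toy matcher below shows that join over an in-memory list of triples. It is a teaching sketch, not a SPARQL engine, and the taxon-name identifier `urn:lsid:auth.org:tn:12` is invented for the example.

```python
def match(pattern, triple, binding):
    """Try to extend binding so that pattern equals triple; return None on failure."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def select(patterns, triples):
    """Return every variable binding that satisfies all patterns."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in triples
            for extended in [match(pattern, triple, binding)]
            if extended is not None
        ]
    return bindings


# Illustrative data; the taxon-name LSID is made up for this sketch.
TRIPLES = [
    ("urn:lsid:auth.org:spec:657", "spec:identifiedAs", "urn:lsid:auth.org:tn:12"),
    ("urn:lsid:auth.org:tn:12", "tn:rank", "tn:Genus"),
    ("urn:lsid:auth.org:tn:12", "tn:uninomial", "Heteractis"),
]
PATTERNS = [
    ("urn:lsid:auth.org:spec:657", "spec:identifiedAs", "?txname"),
    ("?txname", "tn:rank", "tn:Genus"),
    ("?txname", "tn:uninomial", "?genus"),
]
genera = [b["?genus"] for b in select(PATTERNS, TRIPLES)]
```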

  24. Wasabi Server OAI, SPARQL, and LSID are standard protocols, so Wasabi services can be used by non-Wasabi clients.

  25. Wasabi Client Library

  26. Wasabi Client Library Client Library • Contains implementations of clients for protocols used by Wasabi • Can be included in projects that need to communicate with Wasabi servers • Programmatic access to services (hides XML messaging layer) • Provides status and progress listeners • Can be used to query non-Wasabi implementations of OAI or SPARQL

  27. Wasabi Indexer

  28. Wasabi Indexer Indexer • Harvests from 1 or more RDF sources • Sources can be Wasabi servers (via OAI), sets of RDF files, etc. • Multiple types of indices can be fed from a single set of descriptions • Indexers can filter by object type, etc. • Indexers should understand incremental updates and deletions
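
The indexer requirements listed here (filtering by object type, incremental updates, deletions) can be illustrated with a toy in-memory keyword index. This is a sketch under assumed names and an assumed record layout (`lsid`, `type`, `text`, `deleted`); it does not reflect the real Wasabi indexer or its Lucene integration.

```python
class TypeFilteredIndex:
    """Toy keyword index keyed by LSID, with type filtering and deletions."""

    def __init__(self, accepted_types):
        self.accepted_types = set(accepted_types)
        self.postings = {}   # word -> set of LSIDs
        self.documents = {}  # LSID -> set of words currently indexed

    def remove(self, lsid):
        """Remove every posting for one LSID, if it was indexed."""
        for word in self.documents.pop(lsid, set()):
            self.postings[word].discard(lsid)
            if not self.postings[word]:
                del self.postings[word]

    def apply(self, record):
        """Apply one harvested record: a dict with lsid, type, text, deleted."""
        # Drop any earlier version first so updates never leave stale words.
        self.remove(record["lsid"])
        if record.get("deleted") or record["type"] not in self.accepted_types:
            return
        words = set(record["text"].lower().split())
        self.documents[record["lsid"]] = words
        for word in words:
            self.postings.setdefault(word, set()).add(record["lsid"])

    def search(self, word):
        """Return the sorted LSIDs whose text contains the word."""
        return sorted(self.postings.get(word.lower(), set()))
```

Treating every harvested record as "remove, then re-add if still wanted" keeps updates and deletions on the same code path, so an incremental harvest can be replayed record by record.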

  29. Wasabi Indexer

  30. Wasabi Indexer

  31. Wasabi Indexer

  32. Wasabi Portal

  33. Wasabi Portal Portal • Customizable human interface that allows access to 1 or more Wasabi servers • Default portal requires a Lucene index of harvested data. Most portal queries are against the index • To retrieve and display data objects, the portal makes repeated LSID resolution calls so servers can log access

  34. Wasabi Portal Portal • Portal automatically configures search forms and renderers based on downloaded OWL ontologies • Provides simple search, advanced search, ontology browsing, and export of downloaded data to CSV or RDF files

  35. A More Realistic Network

  36. Implementation • http://wasabi.ecoforge.net • Java 1.5 with Spring, Jena, Lucene, and more • Server requires servlet container (Tomcat, WebLogic, etc.) • Server requires JDBC database (MySQL, PostgreSQL, etc.)

  37. Current State • Server, Client Library and Indexer components are feature complete • Portal is still under development • Using experimental OWL data models; awaiting TDWG ontology.

  38. Future Plans • Complete portal • Construct the FishNet2 network (25+ servers) • Construct the PlantCollections network (15+ servers)

  39. Conclusion WASABI is a framework for building scientific data networks based on RDF, OWL, and open data access protocols.

  40. Conclusion • RDF allows us to share complex data models • OWL allows machines to understand the data models and provides opportunities for extending models over time • Standard protocols (OAI, LSID, & SPARQL) allow for integration across data networks and with the semantic web

  41. Support Development of Wasabi is supported by the National Science Foundation as part of the Integrated Community Infrastructure (ICI) project.
