

Biodiversity Information Standards, TDWG Annual Meeting 2011, New Orleans. The GBIF KOS Work Program: Prioritized Requirements and Proposed Solutions. Dag Endresen (dendresen@gbif.org), Knowledge Systems Engineer, GBIF. New Orleans (Louisiana, USA), 20 October 2011.


Presentation Transcript


  1. Biodiversity Information Standards, TDWG Annual Meeting 2011, New Orleans The GBIF KOS Work Program: Prioritized Requirements and Proposed Solutions Dag Endresen (dendresen@gbif.org) Knowledge Systems Engineer GBIF New Orleans (Louisiana, USA) 20 October 2011

  2. Outline • Element vocabularies and value vocabularies • Vocabulary management tools • Vocabularies exchange format (SKOS) • Vocabulary registry (portal) • New data types

  3. Standards Biodiversity Information Standards (TDWG), Dublin Core Metadata Initiative (DCMI), Genomics Standards Consortium (GSC), etc... provide domain standards. We want to reuse, map and relate terms across these standards. Why: Gain understanding across domains

  4. Element vocabulary (glossary) Darwin Core (DwC), Dublin Core (DCMI), Ecological Metadata Language (EML), Gene Ontology (GO), TDWG Ontology, etc... provide definitions for conceptual terms. We want to reuse, map and relate terms from basic vocabularies with concept definitions. Why: reuse terms and share common definitions and a common understanding of biodiversity concepts.

  5. Vocabulary management tools • GBIF Vocabularies • Custom Scratchpad Tool (Drupal) • Semantic Wiki (SpeciesID, Key to Nature) • Protégé (collaborative Protégé) • SKOSEd plugin, Web-Protégé • Top Quadrant EVN (commercial) • Pool Party (commercial) • ThManager (open source) • ISOcat (Clarin, linguistics) • iQvoc (open source) • TemaTres (open source, Spanish)

  6. GBIF Vocabularies http://vocabularies.gbif.org Collaborative development of community terminology, including biodiversity concept definitions and controlled value lists.

  7. Controlled Vocabularies The “Vocabularies” are Value Vocabularies (authority files) of accepted values for terms where controlled values are already available, or are appropriate to develop. • dc:Type: Collection, Dataset, Event, Image, InteractiveResource, Service, Software, Sound, Text, PhysicalObject, StillImage, MovingImage • dwc:basisOfRecord: PreservedSpecimen, FossilSpecimen, LivingSpecimen, HumanObservation, MachineObservation, NomenclaturalChecklist, Occurrence, Taxon, Location • gbif:nomenclatural_code: ICBN, ICZN, ICVCN, ICNB, ICNCP, BioCode Why: standardize how biodiversity data are provided when controlled values are appropriate.
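To illustrate how a value vocabulary standardizes incoming data, here is a minimal Python sketch that checks a record's dwc:basisOfRecord against the accepted values listed on this slide. The function name `check_basis_of_record` and the case-insensitive matching are assumptions made for illustration; they are not part of any GBIF tool.

```python
# Accepted values copied from the slide. Everything else here is hypothetical.
BASIS_OF_RECORD = {
    "PreservedSpecimen", "FossilSpecimen", "LivingSpecimen",
    "HumanObservation", "MachineObservation", "NomenclaturalChecklist",
    "Occurrence", "Taxon", "Location",
}

# Map lower-cased spellings to the canonical value (assumed matching rule).
_CANONICAL = {value.lower(): value for value in BASIS_OF_RECORD}


def check_basis_of_record(raw_value):
    """Return the canonical controlled value, or None if the input is not accepted."""
    return _CANONICAL.get(raw_value.strip().lower())


if __name__ == "__main__":
    for raw in ["PreservedSpecimen", " humanobservation ", "Herbarium sheet"]:
        print(repr(raw), "->", check_basis_of_record(raw))
```

A value that returns `None` would need either correction by the data provider or a proposal to extend the vocabulary.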

  8. Controlled Vocabularies • “Extensions” are Element Vocabularies defining new terms organized as extensions to Core Types (dwc:Taxon and dwc:Occurrence). • Audubon Core (multimedia/images) • DwC-Germplasm (plant genetic resources) • EOL Data Object (species profiles) • GISIN Species Status (invasive species) • …etc Why: Provide a mechanism for thematic communities to define their own specific terms.

  9. GBIF Vocabularies http://vocabularies.gbif.org • Core types – could be more than DwC: Taxon and DwC: Occurrence • habitat, spatial areas, lines, grid, places, images/multimedia, literature, people, institutes, collections, collection specimens, etc…? • “Extensions” = element/attribute vocabulary, definition of terms • Separate the definition of terminology from application models • Is “extensions” the appropriate label? • “Vocabularies” = value vocabulary, authority files • external examples: countries, languages, … • biodiversity domain: taxonRank, basisOfRecord, …

  10. GBIF Vocabularies • GBIF Vocabularies is hosted by the Scratchpads server in London • Install the GBIF Vocabulary Service in Copenhagen? • Further developments are needed. • Package the Vocabulary Service as an open-source tool? • Develop as Drupal modules, migrate to Drupal 7? • Element vocabularies are not always an “extension” of Darwin Core…? • Add management interface with definitions for new core types? • Rename “Extensions” to “Element Vocabularies” or “Attribute Vocabularies”? • Rename “Vocabularies” to “Value Vocabularies” or “Authority files”…? • Export and import of vocabularies to and from other management systems (SKOS, RDF, OWL as vocabulary exchange format?) • SKOS import and export features to be developed? • Improved human-readable interface • Export to HTML/PDF format for human-readable documentation of a vocabulary?

  11. Vocabulary Registry/Portal • GBIF Vocabulary Registry • Is the present registry sufficient? • GBIF Vocabularies • Develop the Scratchpads solution further as a vocabulary registry? • NCBO BioPortal alternative • Start using the NCBO BioPortal software Why: Support the discovery of biodiversity terminology and standard vocabularies.

  12. GBIF Vocabulary Registry The official versions of the “vocabularies” and “extensions” for deployment are available from the GBIF Registry (http://rs.gbif.org). GBIF infrastructure such as the IPT and HIT reads them from there. The registry is a separate service for discovery, distinct from the GBIF Vocabularies site used for management (management ≠ discovery).

  13. GBIF Vocabulary Registry http://rs.gbif.org • Promote SKOS as the preferred vocabulary (exchange) format? • Gradually replace XML Schema for defining standards? • Why: Promote ease of vocabulary exchange, import and export. Simple Knowledge Organization System (SKOS)
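As a rough illustration of SKOS as an exchange format, the following Python sketch serializes a small value vocabulary as SKOS in Turtle syntax using only string formatting. The SKOS namespace is the standard W3C one; the base URI `http://example.org/vocab/nomenclatural_code/`, the `ex:` prefix, the `ex:scheme` identifier and the function name are hypothetical placeholders, not GBIF identifiers.

```python
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"
# Hypothetical base URI used only for this sketch.
EXAMPLE_BASE = "http://example.org/vocab/nomenclatural_code/"


def to_skos_turtle(scheme_label, values, base=EXAMPLE_BASE):
    """Render a flat value vocabulary as a SKOS concept scheme in Turtle.

    Assumes every value is a simple alphanumeric token that is safe to use
    as a Turtle local name and inside a double-quoted literal.
    """
    lines = [
        f"@prefix skos: <{SKOS_NS}> .",
        f"@prefix ex: <{base}> .",
        "",
        "ex:scheme a skos:ConceptScheme ;",
        f'    skos:prefLabel "{scheme_label}" .',
    ]
    for value in values:
        lines += [
            "",
            f"ex:{value} a skos:Concept ;",
            f'    skos:prefLabel "{value}" ;',
            "    skos:inScheme ex:scheme .",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    codes = ["ICBN", "ICZN", "ICVCN", "ICNB", "ICNCP", "BioCode"]
    print(to_skos_turtle("Nomenclatural code", codes))
```

A production exporter would more likely use an RDF library and add definitions, language tags and stable concept URIs; this sketch only shows the shape of the exchanged document.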

  14. GBIF Vocabulary Registry • Add human interface to explore SKOS documents at the GBIF Registry? • OWLDoc (CO-ODE, static HTML) • OWL Ontology Browser (CO-ODE, dynamic)

  15. Using the BioPortal Registry GBIF KOS Task Group: “GBIF should deploy an instance of the BioPortal platform for biodiversity ontologies as a complement to the GBIF Vocabularies Server.”

  16. Using the BioPortal Registry • Include Biodiversity Vocabularies in the NCBO BioPortal…? • Will support the mapping of terms to the major Genomics Vocabularies. • Establish a “GBIF BioPortal” using the same BioPortal software? • Will focus on Biodiversity Community identity and relevance.

  17. Workflow Draft vocabulary -> Review version (… and other SKOS-compliant vocabulary management tools) -> Approve? -> Published version -> Uptake by the GBIF infrastructure, including the IPT and the data portal.
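The draft, review, approve and publish flow on this slide can be sketched as a small state machine. This is only an illustration: the state and action names are hypothetical, and the "reject" transition back to draft is an assumption, since the slide shows only the "Approve?" decision.

```python
# Allowed transitions: (current state, action) -> next state.
# State and action names are hypothetical labels for the slide's boxes.
TRANSITIONS = {
    ("draft", "submit"): "review",
    ("review", "approve"): "published",
    ("review", "reject"): "draft",  # assumption: a failed review returns to draft
}


def advance(state, action):
    """Return the next workflow state, or raise ValueError for an illegal step."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot {action!r} a vocabulary in state {state!r}") from None


if __name__ == "__main__":
    state = "draft"
    for action in ["submit", "reject", "submit", "approve"]:
        state = advance(state, action)
        print(action, "->", state)
```

Only a vocabulary that reaches the published state would be picked up by downstream infrastructure such as the IPT and the data portal.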

  18. GBIF Strategic Plan 2012-2016: “In anticipation of the integration and serving of future data types, GBIF will work closely with partners to enable data integration and interoperability across phenotypic, genomic, taxonomic, geospatial and ecosystem domains.”

  19. GBIF Strategic Plan 2007-2011: “Further activities as part of the Plan will include improving the Data Portal system and expanding the depth and range of data types” “specimen, observation, descriptive, literature, name/concept, image, character, OGC, etc”

  20. New Core Types? • DwC: Taxon • DwC: Occurrence • Audubon Core (images/multimedia) • Invasive Species (invasive in region/country) • New Spatial Objects (from point locations to include polygon, poly-line and grid objects) • etc… • Is the general principle of extending Core Types also suitable for new data types?

  21. Data types • dwc:Identification: dwc:identificationID, dwc:dateIdentified, dwc:identifiedBy, dwc:taxonID, dwc:scientificNameID, dwc:scientificName, … • dwc:Taxon: dwc:taxonID, dwc:scientificNameID, dwc:scientificName, dwc:taxonConceptID, dwc:kingdom, dwc:family, dwc:genus, dwc:specificEpithet, … • gbif:Reference: dc:identifier, dc:bibliographicCitation, dc:title, dc:creator, dc:date, dc:source, dc:language, dwc:taxonRemarks, … • dwc:Occurrence: dwc:occurrenceID, dwc:basisOfRecord, dwc:eventID, dwc:eventDate, dwc:locationID, dwc:decimalLongitude, dwc:decimalLatitude, dwc:taxonID, dwc:scientificNameID, dwc:scientificName, … • dwc:MeasurementOrFact: dwc:measurementID, dwc:measurementValue, dwc:measurementUnit, dwc:measurementDeterminedBy, … • Further terms: dwc:vernacularName, dc:language, dc:temporal, dwc:locality, … etc… • Examples: Bread wheat (Triticum aestivum L.); Hazelnut (Corylus avellana L.); Laurel (Laurus azorica (Seub.) Franco); Almonds (Prunus dulcis (Mill.) D.A.Webb) in Manouba, Tunisia • Namespaces: dc = http://purl.org/dc/terms/ dwc = http://rs.tdwg.org/dwc/terms/ gbif = http://rs.gbif.org/terms/1.0/

  22. Star schema • Record types: dwc:Taxon, dwc:Identification, dwc:MeasurementOrFact, audubon:Image, gbif:Reference, gbif:VernacularNames • Examples: Bread wheat (Triticum aestivum L.); Hazelnut (Corylus avellana L.); Laurel (Laurus azorica (Seub.) Franco) • Namespaces: dc = http://purl.org/dc/terms/ dwc = http://rs.tdwg.org/dwc/terms/ gbif = http://rs.gbif.org/terms/1.0/ audubon = http://rs.tdwg.org/ac/terms/
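A star schema links extension records to a central core record through a shared identifier. The Python sketch below illustrates the idea under the assumption that dwc:Taxon is the core and that extension rows point to it by a taxon identifier. The identifiers (`taxon-1`, `taxon-2`, `taxon-9`), the row contents and the function name `build_star` are hypothetical illustration data, not records from GBIF.

```python
from collections import defaultdict

# Core records keyed by a hypothetical taxon identifier.
core_taxa = {
    "taxon-1": {"scientificName": "Triticum aestivum L."},
    "taxon-2": {"scientificName": "Corylus avellana L."},
}

# Extension rows: (row type, identifier of the core record, row data).
extension_rows = [
    ("gbif:VernacularNames", "taxon-1", {"vernacularName": "Bread wheat"}),
    ("gbif:VernacularNames", "taxon-2", {"vernacularName": "Hazelnut"}),
    ("audubon:Image", "taxon-2", {"title": "hypothetical image record"}),
    ("gbif:Reference", "taxon-9", {"title": "row with no matching core record"}),
]


def build_star(core, rows):
    """Attach each extension row to its core record; collect rows with no core."""
    star = {
        core_id: {"core": record, "extensions": defaultdict(list)}
        for core_id, record in core.items()
    }
    orphans = []
    for row_type, core_id, data in rows:
        if core_id in star:
            star[core_id]["extensions"][row_type].append(data)
        else:
            orphans.append((row_type, core_id))
    return star, orphans


star, orphans = build_star(core_taxa, extension_rows)

if __name__ == "__main__":
    for core_id, entry in star.items():
        print(core_id, entry["core"]["scientificName"], dict(entry["extensions"]))
    print("orphaned rows:", orphans)
```

Because every extension row can reference only one core record, this layout cannot directly link, say, an image to both a taxon and an occurrence, which is one reason to ask whether the principle scales to several core types.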

  23. Star schema (??) • Record types: dwc:Taxon, dwc:Occurrence, dwc:MeasurementOrFact, gbif:Reference, audubon:Image, CollectionObject, Metadata, Place/location, etc… • Examples: Hazelnut (Corylus avellana L.); Laurel (Laurus azorica (Seub.) Franco); Almonds (Prunus dulcis (Mill.) D.A.Webb) in Manouba, Tunisia • Namespaces: dc = http://purl.org/dc/terms/ dwc = http://rs.tdwg.org/dwc/terms/ gbif = http://rs.gbif.org/terms/1.0/ audubon = http://rs.tdwg.org/ac/terms/

  24. Biodiversity Information Standards, TDWG Annual Meeting 2011, New Orleans The GBIF KOS Work Program: Prioritized Requirements and Proposed Solutions Dag Endresen (dendresen@gbif.org) Knowledge Systems Engineer GBIF New Orleans (Louisiana, USA) 20 October 2011
