
GLOBAL BIODIVERSITY

WWW.GBIF.ORG. Global Biodiversity Information Facility. The GBIF Data Repository Tool (new updated version 3.0). Hannu Saarenmaa. EC CHM & GBIF European Regional Nodes Meeting, Copenhagen, 2005-09-15/18. Outline: Objectives and background; Design and installation; Use; Demonstration.


Presentation Transcript


  1. WWW.GBIF.ORG GLOBAL BIODIVERSITY INFORMATION FACILITY The GBIF Data Repository Tool (New updated version 3.0) Hannu Saarenmaa EC CHM & GBIF European Regional Nodes Meeting, Copenhagen, 2005-09-15/18

  2. Outline • Objectives and background • Design and installation • Use • Demonstration

  3. 1. Objectives and background

  4. Challenges in data sharing • Eventually, all data sets become orphans: archiving services are a necessity. • The concept "share once, use many" requires that data repositories be available. • Data from archives must be available to portals such as GBIF through standard mechanisms. • IPR, confidentiality, and benefit sharing must be respected at all times.

  5. Goals of the GBIF Data Repository Tool • Enable data custodians to manage their data and control its publishing. • Provide a mechanism so that spreadsheets, etc., can be used directly for sharing data. • Hide the database complexities from users. • Make available a simple data warehouse tool for those who want to host datasets for the community. • I.e., lower the threshold for data sharing as far as possible.

  6. 2. Design

  7. Functionalities • Data must be formatted according to the Darwin Core standard and its extensions in flat spreadsheet format. • In fact, any flat format will work (rows, columns). • The system checks and parses the data into an embedded MySQL database, which becomes available to the public as a DiGIR/TAPIR resource. • Owners can control the level of detail released: • Fuzzing of geographic coordinates is available • Collector names and time periods can be hidden • Approval of terms and conditions for data use can be required • Owners can revoke a release and update the data. • Metadata can be inherited by the data to replace missing values, as defined. • Includes an embedded image server.
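
The owner-controlled release rules listed on this slide (coordinate fuzzing, hiding collector names and time periods) can be illustrated with a minimal Python sketch. This is not the tool's actual code: the function names, the grid-snapping approach, the 0.1 degree cell size, and the use of current Darwin Core term names (`decimalLatitude`, `decimalLongitude`, `recordedBy`, `eventDate`) are all assumptions made for illustration.

```python
import math


def fuzz_coordinate(value, cell_size=0.1):
    """Snap a decimal-degree coordinate to the lower edge of its grid cell.

    Hypothetical helper: the real tool's fuzzing method is not described
    in the slides; grid snapping is just one common way to do it.
    """
    return round(math.floor(value / cell_size) * cell_size, 6)


def apply_release_rules(record, cell_size=0.1,
                        hidden_fields=("recordedBy", "eventDate")):
    """Return a copy of a record with hidden fields dropped and
    coordinates fuzzed. Field names are assumed, not taken from the tool."""
    released = {k: v for k, v in record.items() if k not in hidden_fields}
    for key in ("decimalLatitude", "decimalLongitude"):
        if released.get(key) not in (None, ""):
            released[key] = fuzz_coordinate(float(released[key]), cell_size)
    return released
```

Under these assumptions, a record located at 60.1699, 24.9384 would be released as 60.1, 24.9, with the collector name and date removed, while the original stays untouched in the owner's copy.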

  8. Component architecture

  9. Installation • For Linux and Windows • Based on Python, Zope 2.10 and MySQL • Supports the DiGIR and TAPIR protocols of TDWG • Turn-key installation • Fits directly into the EC CHM software package

  10. 3. Use

  11. Steps for data owners • Prepare the data files • Create a nested folder structure on the Repository for the collection • Enter default metadata scope (to cover missing values in data, etc.) • Decide on access policies • Upload the files • Publish the data files

  12. Create a collection and assign it to a data custodian...

  13. Enter metadata scope for the collection (inherited)
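
As a rough illustration of inherited metadata, the sketch below fills missing values in a record from default scopes, nearest scope first. It only shows the idea: the function name, the scope ordering (folder, then resource, then collection), and the example field names are assumptions, not the tool's documented behaviour.

```python
def inherit_metadata(record, *scopes):
    """Fill empty or missing fields in a record from default scopes.

    Scopes are given nearest first (e.g. folder, resource, collection),
    so a folder-level default wins over a collection-level one.
    Values already present in the record are never overwritten.
    """
    filled = dict(record)
    for scope in scopes:
        for field, default in scope.items():
            if filled.get(field) in (None, ""):
                filled[field] = default
    return filled
```

For example, a record with an empty `country` column would take the folder's default if one is set, and otherwise fall back to the collection's default.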

  14. Create the resources (databases) of the collection and folders

  15. Upload the files to folder, validate and release them

  16. Prepare the data file(s) in tab-separated format
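
A tab-separated data file of this kind can be read and checked with Python's standard `csv` module. The sketch below is only an illustration of the preparation step: the required column names follow current Darwin Core terms and are an assumption, since the slides do not list which columns the tool expects.

```python
import csv
import io

# Assumed minimal column set; the tool's actual requirements may differ.
REQUIRED_COLUMNS = ("scientificName", "decimalLatitude", "decimalLongitude")


def read_tab_separated(text):
    """Parse tab-separated text with a header row into a list of dicts.

    Raises ValueError if any of the assumed required columns is absent.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))
    return list(reader)
```

Checking the header before upload catches misnamed columns early, which is cheaper than discovering them after the file has been parsed into the repository.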

  17. Access policy options: • Fully open • Standard GBIF policy of acknowledgements • No direct download, and fuzzing for web service access
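
One way to think about the three access policies is as a small set of switches. The Python sketch below is an interpretation of the slide, not the tool's configuration model: the enum members, the setting names, and the mapping from policy to settings are all assumed.

```python
from enum import Enum


class AccessPolicy(Enum):
    """The three options from the slide; member names are hypothetical."""
    FULLY_OPEN = "fully_open"
    GBIF_STANDARD = "gbif_standard"
    RESTRICTED = "restricted"


def release_settings(policy):
    """Map a policy to illustrative release switches (assumed semantics)."""
    return {
        # Restricted data is not offered as a direct file download.
        "direct_download": policy is not AccessPolicy.RESTRICTED,
        # The standard GBIF policy asks users to acknowledge the source.
        "require_acknowledgement": policy is AccessPolicy.GBIF_STANDARD,
        # Restricted data is served through web services in fuzzed form.
        "fuzz_for_web_service": policy is AccessPolicy.RESTRICTED,
    }
```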

  18. Data is now searchable locally and through the DiGIR/TAPIR protocols

  19. 4. Demonstration: http://fmnh.eaudeweb.ro/
