Enhancing Global Biodiversity Through a Unified Entity Registration System
The Global Biodiversity Information Facility (GBIF) registry supports the registration and discovery of entities such as institutions, networks, datasets, schemas, and vocabularies. It directs clients to network resources and models the relationships between entities so that contributions are attributed correctly. Secondary services include monitoring for new resources and technical failures, search through metadata indexing, and tagging of registered objects. The system is built on a MySQL database, a Solr search server, RabbitMQ messaging, and RESTful web services.
Presentation Transcript
Registry. Global Biodiversity Information Facility (GBIF), 2012. Éamonn Ó Tuama
A shared registry
Primary aims:
• To allow the registration and discovery of a growing number of entities: institutions, networks, datasets, schemas, vocabularies, etc.
• To provide the means to direct clients on how to access network resources
• To accurately model the complex relationships between entities, enabling correct attribution (e.g. recognizing data hosting partnerships)
• To provide a reliable identifier “minting” service, allowing distributed systems to connect on common resources
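The identifier “minting” aim can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class name `IdentifierRegistry`, its methods, the in-memory storage, and the use of random UUIDs as the identifier format are not taken from the slides and do not describe the actual GBIF service.

```python
import uuid


class IdentifierRegistry:
    """Toy in-memory minting service (hypothetical, not the GBIF implementation)."""

    def __init__(self):
        # Maps each minted identifier to the entity it was issued for.
        self._by_id = {}

    def mint(self, entity_type, name):
        """Issue a new identifier for an entity and remember the association."""
        # Assumption: random UUIDs serve as identifiers in this sketch.
        identifier = str(uuid.uuid4())
        self._by_id[identifier] = {"type": entity_type, "name": name}
        return identifier

    def resolve(self, identifier):
        """Return the entity registered under an identifier, or None if unknown."""
        return self._by_id.get(identifier)


if __name__ == "__main__":
    registry = IdentifierRegistry()
    dataset_id = registry.mint("DATASET", "DatasetABC")
    print(dataset_id, registry.resolve(dataset_id))
```

The point of a central minting step is that two independent systems holding the same identifier can be sure they refer to the same registered resource.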
A shared registry
Secondary aims:
• To provide network monitoring services, e.g. alerts on new resources or technical failures (servers going offline)
• To offer search capabilities through indexing of metadata
• To enable external classification of registered objects through tagging (both private and public)
AGENT table: holds the information on all entities inside the GBIF Network. These are Organizations, Datasets, Technical Installations, Nodes and Networks.
AGENT_RELATION table: holds the relations between these entities. For example, this table models relations like "OrganizationXYZ owns DatasetABC" or "NodeLMP endorses OrganizationXYZ".
Four tables hold information related to each agent: SERVICE, CONTACT, IDENTIFIER, TAG.
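The AGENT and AGENT_RELATION tables described above can be sketched as follows. This is a simplified illustration, not the real schema: the column names, the relation labels `OWNS` and `ENDORSES`, and the helper functions are assumptions, and SQLite (from the Python standard library) stands in for the MySQL database so the example runs without a server.

```python
import sqlite3

# Hypothetical, simplified schema: column names are illustrative assumptions,
# and SQLite stands in for the MySQL database named in the slides.
SCHEMA = """
CREATE TABLE agent (
    id   INTEGER PRIMARY KEY,
    type TEXT NOT NULL CHECK (type IN
        ('ORGANIZATION', 'DATASET', 'INSTALLATION', 'NODE', 'NETWORK')),
    name TEXT NOT NULL
);
CREATE TABLE agent_relation (
    from_agent_id INTEGER NOT NULL REFERENCES agent(id),
    relation      TEXT    NOT NULL,
    to_agent_id   INTEGER NOT NULL REFERENCES agent(id)
);
"""


def build_registry():
    """Create an in-memory database holding the two example relations."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO agent (id, type, name) VALUES (?, ?, ?)",
        [
            (1, "NODE", "NodeLMP"),
            (2, "ORGANIZATION", "OrganizationXYZ"),
            (3, "DATASET", "DatasetABC"),
        ],
    )
    conn.executemany(
        "INSERT INTO agent_relation (from_agent_id, relation, to_agent_id) "
        "VALUES (?, ?, ?)",
        [(1, "ENDORSES", 2), (2, "OWNS", 3)],
    )
    conn.commit()
    return conn


def describe_relations(conn):
    """Join each relation back to its two agents and render it as a sentence."""
    rows = conn.execute(
        """
        SELECT a.name, r.relation, b.name
        FROM agent_relation AS r
        JOIN agent AS a ON a.id = r.from_agent_id
        JOIN agent AS b ON b.id = r.to_agent_id
        ORDER BY a.id
        """
    ).fetchall()
    return [f"{src} {rel.lower()} {dst}" for src, rel, dst in rows]


if __name__ == "__main__":
    for sentence in describe_relations(build_registry()):
        print(sentence)
```

Storing every entity type in one table and every link in a second table is what lets the database model the network as a graph: attribution questions such as "which node endorses the owner of this dataset" become joins over AGENT_RELATION.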
System Architecture
• A MySQL database (modeling the network graph)
• Solr search server
• RabbitMQ message broadcasting
• XML files stored on the filesystem
• RESTful (JSON) web services
• Considering a SPARQL endpoint