EnParDis: Enabling Parameter Discovery
Roy Lowry, Michael Hughes & Laura Bird
British Oceanographic Data Centre
Background
• In the 1980s oceanographic data management handled <20 parameters
• During the 1990s the number of parameters handled increased dramatically, largely due to JGOFS
• BODC adopted a controlled vocabulary approach to parameter management
• A parameter dictionary with >9000 entries has been developed
• EnParDis is a project to build on this
Background
• BODC’s data holdings are marked up with controlled vocabulary codes
• Mark-up requires copious quantities of skilled labour
• In an ideal world the data would be marked up by scientists using a standard dictionary
• This won’t happen unless scientists can see benefits, such as improved data access
• EnParDis aims to address this
EnParDis Objectives
• Enhance the BODC Parameter Dictionary into a worthy international standard vocabulary
• Develop interoperability between the BODC Parameter Dictionary and other organisations’ dictionaries
• Examine the potential of semantic and ontological tools for dictionary management and parameter discovery
EnParDis Work Outline
Work Package 1
• Map biological entities in the dictionary to ITIS
• Implement ITIS taxonomy as a dictionary parameter grouping tool
EnParDis Work Outline
Work Package 2
• Develop keyword group mappings (SeaSearch and GCMD)
• Investigate feasibility of user-definable groupings
• Investigate techniques to support parameter interconversions (e.g. unit conversions)
EnParDis Work Outline
Work Package 3
• Expand the BODC dictionary to cover other known dictionaries
  • SISMER (France)
  • SMHI (Sweden)
  • DOD (Germany)
  • Pangaea (Germany)
  • Rijkswaterstaat (Netherlands)
  • US JGOFS
  • MEDS (Canada)
  • BIO (Canada)
  • ICES Contaminants Database
  • CF?
  • Any others I can find
EnParDis Work Outline
Work Package 3 (continued)
• Review the current dictionary
  • Revise implementation structure
  • Eliminate classification that is hard-coded into data markup
  • Improve the clarity of parameter descriptions
  • Improve the valid parameter value ranges stored in the dictionary
• Investigate whether the dictionary could be mapped through a data model into an ontology
EnParDis Progress
• WP1 Progress
  • ITIS database installed at BODC
  • Initial mapping between dictionary biological entities and ITIS
  • Over 100 spellings in the dictionary standardised to ITIS
  • A couple of errors in ITIS identified and their correction agreed
  • 163 additional genera/species submitted to go to ITIS; a further 81 identified for submission
  • Access and Web taxonomic browser demonstrators developed
EnParDis Progress
• WP1 Taxonomic Browser
  • Driven by two tables
    • Map between BODC code and ITIS code plus parameter measured
    • Implementation of ITIS taxonomy
      • ITIS code plus taxonomy (27 levels) encoded in a 216-byte string
  • Allows all BODC codes for a given parameter to be extracted at any taxonomic level (e.g. abundance in water column for all species of a given genus)
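The two-table design above can be illustrated with a minimal Python sketch. It is not the BODC implementation: the slide only states that 27 taxonomic levels are encoded in a 216-byte string, so the 8-byte zero-padded field per level (216 / 27), the level indices, the function names, and all codes and identifiers in the sample data are assumptions made for illustration.

```python
# Hypothetical sketch of a fixed-width taxonomy string and the two lookup tables.
LEVELS = 27   # number of taxonomic levels (from the slide)
WIDTH = 8     # assumed bytes per level: 216 / 27
EMPTY = "0" * WIDTH

def encode(ranks: dict[int, int]) -> str:
    """Build the 216-character string from {level index: ITIS code}."""
    fields = [EMPTY] * LEVELS
    for level, itis_code in ranks.items():
        if not 0 <= level < LEVELS:
            raise ValueError(f"level {level} out of range")
        fields[level] = f"{itis_code:0{WIDTH}d}"
    return "".join(fields)

def code_at(encoded: str, level: int) -> int:
    """Read the ITIS code stored at one taxonomic level (0 means unset)."""
    return int(encoded[level * WIDTH:(level + 1) * WIDTH])

def codes_for(mapping, taxonomy, parameter, level, itis_code):
    """Return dictionary codes measuring `parameter` for any taxon under `itis_code`."""
    return sorted(
        code
        for code, (taxon, param) in mapping.items()
        if param == parameter and code_at(taxonomy[taxon], level) == itis_code
    )

# Illustrative level positions and identifiers, all invented for this example.
GENUS, SPECIES = 20, 22

# Table 2: ITIS code -> encoded taxonomy string.
taxonomy = {
    5001: encode({0: 1, GENUS: 4000, SPECIES: 5001}),
    5002: encode({0: 1, GENUS: 4000, SPECIES: 5002}),
    6001: encode({0: 1, GENUS: 4100, SPECIES: 6001}),
}

# Table 1: dictionary code -> (ITIS code, parameter measured).
mapping = {
    "XXAB0001": (5001, "abundance in water column"),
    "XXAB0002": (5002, "abundance in water column"),
    "XXBM0001": (5001, "biomass in water column"),
    "XXAB0003": (6001, "abundance in water column"),
}

genus_codes = codes_for(mapping, taxonomy, "abundance in water column", GENUS, 4000)
```

Because every level sits at a fixed offset, selecting all species of a genus is a single substring comparison per row, which is what makes extraction at any taxonomic level straightforward in a relational table.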
EnParDis Progress
• WP1 Next Stage
  • Operational implementation of ITIS mapping
  • Revision of map design to facilitate operation with RIKZ DONAR and WADI data models
  • Map maintenance protocols
  • ITIS database update protocols
  • Incorporation of demonstrator technology into BODC data retrieval tools
EnParDis Progress
• WP2 Progress
  • First-cut mapping to GCMD parameter valids
  • SEASEARCH groupings implemented (in June)
• WP2 Next Stage
  • Consultations with GCMD on their parameter valids
  • Implementation of GCMD groupings as an alternative to SEASEARCH
  • Look into potential of RDF or OWL for parameter classification management
EnParDis Progress
• WP3 Progress
  • Work commenced on RIKZ, SISMER and US JGOFS manual mappings
  • Dictionary entries in preparation to fill gaps identified by mappings
  • Abbreviated unit titles standardised (NPL)
  • Revised dictionary structure implemented
• WP3 Next Stage
  • Continue mappings and dictionary extension
  • Research more elegant techniques for large-scale mappings (semantic analysis?)
  • Data model mapping and ontology research
Contacting EnParDis
• An e-mail discussion list has been set up for EnParDis
• List address: enpardis@mailman.nerc-bidston.ac.uk
• URL for subscription to the list: mailman.nerc-bidston.ac.uk/mailman/listinfo/enpardis