Ensuring Long Term Access to Remotely Sensed HDF4 Data with Layout Maps

Presentation Transcript


  1. Ensuring Long Term Access to Remotely Sensed HDF4 Data with Layout Maps. Ruth Duerr, NSIDC; Christopher Lynnes, GES DISC; The HDF Group. HDF and HDF-EOS Workshop XII.

  2. Background and basic concept

  3. I’m Plastic Man! HDF4 is EXTENSIBLE, FLEXIBLE, SELF-DESCRIBING.

  4. But there’s a cost…

  5. Complexity!

  13. How do we save HDF users from having to deal with all of the complexity under the hood?

  14. Through the HDF software libraries, either by using the HDF APIs directly or by using HDF tools that depend on the HDF libraries. But what about the future…

  15. There is a risk in depending solely on the HDF libraries to access HDF-formatted data over the long term.
      • It is possible, especially in the distant future, that the libraries may not be available.

  16. Really smart people and software? Maybe future data users and their computers will be so smart that the HDF4 format will be a piece of cake.

  17. Maybe not.

  18. We need an “easy” button.

  19. “If only we could read HDF data with an independent program that does not rely on the HDF API… A possible approach [would be to] extend hdfls to print a hierarchical map of a data file, [and] write ncdump/hdp-like utilities to find, assemble and write out SDSes and vdatas.” (Christopher Lynnes, “Leveraging HDF Utilities”, HDF Workshop X)

  21. HDF4 file layout

  22. HDF4 file layout

  23. The project

  24. HDF4 mapping
      • Problem
          • The complex internal byte layout of HDF files requires one to use the API to access HDF data.
          • This makes long-term readability of HDF data dependent on long-term allocation of resources to support HDF software.
      • Proposed solution
          • Create a map of the layout of data objects in an HDF file, allowing a simple reader to be written to access the data.

  25. HDF4 mapping project activities
      • Assess and categorize HDF4 data held by NASA
          • To determine what types of objects to map.
          • To get an idea of the magnitude of the project.
      • Develop prototype for proof of concept
          • Develop markup-language based layout specification.
          • Develop tool to produce layout for an HDF4 file.
          • Develop and test two independent tools to read HDF4 data based solely on the map files.

  26. Project activities (continued)
      • Assess results and plan next steps
          • Present results and options for proceeding to the community.
          • Assess the likely usefulness of this approach, as well as any desirable modifications.
          • Evaluate the effort required for a full solution that best meets community needs.
          • Submit a proposal for the work needed to provide a full solution.

  27. 1. Assess and categorize

  28. How many HDF4 products?

  29. Product Characteristics Examined
      • Product identification
          • Product name
          • Data level
          • Archive location
          • Product version
          • Whether the product was multi-file
      • Data characteristics
          • For HDF-EOS products: HDF-EOS version
          • For point data: number of point data sets; maximum number of levels
          • For swath data: number of swaths; maximum number of dimensions; organized by time, space, both, or other; whether dimension maps were used
          • For gridded data: number of grids; maximum number of dimensions in a grid; number of projections used; whether any grids were indexed
          • HDF version
          • For raster data: number of 8-bit rasters; number of 24-bit rasters; number of general rasters; whether any rasters had attributes; whether any rasters were compressed; whether any rasters were chunked; whether there were any palettes
          • For SDS data: number of SDSs; maximum number of dimensions; whether any SDS had attributes; whether any SDS was annotated; whether dimension scales were used; whether compression was used and, if so, what kind; whether chunking was used
          • For Vdata: number of Vdata structures; whether any Vdata had attributes; whether any Vdata fields had attributes; whether compression was used and, if so, what kind; whether chunking was used

  30. Other results
      • Slightly more than half of the HDF4 products are in HDF-EOS 2 format.
      • Grids are the most common HDF-EOS data structures in use.
      • No products use a combination of grid, swath, and point data structures.

  31. 2. Prototype and proof of concept

  32. HDF4 mapping prototype workflow (diagram; recoverable labels only)
      • HDF4 File “H4.hdf”
      • hmap linked with HDF4 library
      • HDF4 Mapping File (XML document) “H4.hdf.map.xml”: Groups, Data Objects, Structural and Application Metadata; Locations of Object Data
      • Reader 1 (C program)
      • Reader 2 (Perl Script)
      • Object Data

  33. Proof-of-concept results
      • The HDF Group created prototype map generation software and a draft map specification.
      • Map generator was tested on a wide variety of data products.
      • GES-DISC and NSIDC independently wrote software that uses maps to read data files in NSIDC’s and GES-DISC’s archives.
      • Summary - the concept is feasible!

  34. Example map fragment

          <?xml version="1.0" encoding="utf-8"?>
          <hdf4:HDFMap xmlns:hdf4="http://www.hdfgroup.org/HDF4/HDF4Map">
            <hdf4:RootGroup>
              <hdf4:SDS objName="data1" objPath="/" objID="xid-DFTAG_NDG-2">
                <hdf4:Attribute name="data range" ntDesc="32-bit signed integer">
                  0 255
                </hdf4:Attribute>
                <hdf4:Datatype dtypeClass="INT" dtypeSize="4" byteOrder="BE" />
                <hdf4:Dataspace ndims="2">
                  10 100
                </hdf4:Dataspace>
                <hdf4:Datablock nblocks="1">
                  <hdf4:BlockOffset> 2502 </hdf4:BlockOffset>
                  <hdf4:BlockNbytes> 4000 </hdf4:BlockNbytes>
                </hdf4:Datablock>
              </hdf4:SDS>
            </hdf4:RootGroup>
          </hdf4:HDFMap>
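A reader that relies only on the map, and not on the HDF library, might look roughly like the Python sketch below. It is not one of the project's readers (those were a C program and a Perl script). It parses the fragment from slide 34 and pulls the bytes for `data1` straight out of the data file. Several things here are assumptions rather than facts from the presentation: the function names `describe_sds` and `read_sds` are hypothetical, `dtypeClass="INT"` with `dtypeSize="4"` is treated as a signed 32-bit integer, values are assumed to be stored row-major, each `Datablock` is assumed to hold matching `BlockOffset`/`BlockNbytes` pairs, and compression and chunking are ignored.

```python
import struct
import xml.etree.ElementTree as ET

# Namespace URI copied from the example map fragment.
NS = {"hdf4": "http://www.hdfgroup.org/HDF4/HDF4Map"}

# The fragment from slide 34, embedded so the sketch is self-contained.
MAP_XML = """<?xml version="1.0" encoding="utf-8"?>
<hdf4:HDFMap xmlns:hdf4="http://www.hdfgroup.org/HDF4/HDF4Map">
  <hdf4:RootGroup>
    <hdf4:SDS objName="data1" objPath="/" objID="xid-DFTAG_NDG-2">
      <hdf4:Attribute name="data range" ntDesc="32-bit signed integer">
        0 255
      </hdf4:Attribute>
      <hdf4:Datatype dtypeClass="INT" dtypeSize="4" byteOrder="BE" />
      <hdf4:Dataspace ndims="2"> 10 100 </hdf4:Dataspace>
      <hdf4:Datablock nblocks="1">
        <hdf4:BlockOffset> 2502 </hdf4:BlockOffset>
        <hdf4:BlockNbytes> 4000 </hdf4:BlockNbytes>
      </hdf4:Datablock>
    </hdf4:SDS>
  </hdf4:RootGroup>
</hdf4:HDFMap>
"""


def describe_sds(map_xml, name):
    """Return type, shape, and byte locations of the SDS called `name`."""
    root = ET.fromstring(map_xml.encode("utf-8"))
    for sds in root.iterfind(".//hdf4:SDS", NS):
        if sds.get("objName") != name:
            continue
        dtype = sds.find("hdf4:Datatype", NS)
        dims = [int(n) for n in sds.find("hdf4:Dataspace", NS).text.split()]
        block = sds.find("hdf4:Datablock", NS)
        # Assumption: offsets and sizes appear as matching pairs, in order.
        offsets = [int(e.text) for e in block.findall("hdf4:BlockOffset", NS)]
        sizes = [int(e.text) for e in block.findall("hdf4:BlockNbytes", NS)]
        return {
            "dtype_class": dtype.get("dtypeClass"),
            "item_size": int(dtype.get("dtypeSize")),
            "byte_order": dtype.get("byteOrder"),
            "dims": dims,
            "blocks": list(zip(offsets, sizes)),
        }
    raise KeyError(name)


def read_sds(data_path, info):
    """Read the SDS bytes named by the map and return rows of integers."""
    # Assumption: INT with size 4 means a signed 32-bit integer.
    if (info["dtype_class"], info["item_size"]) != ("INT", 4):
        raise NotImplementedError("this sketch handles 4-byte INT only")
    count = 1
    for dim in info["dims"]:
        count *= dim
    raw = bytearray()
    with open(data_path, "rb") as f:
        for offset, nbytes in info["blocks"]:
            f.seek(offset)
            raw += f.read(nbytes)
    # Sanity check: 10 * 100 values * 4 bytes should equal the 4000 mapped bytes.
    if len(raw) != count * info["item_size"]:
        raise ValueError("map and file disagree on the data size")
    order = ">" if info["byte_order"] == "BE" else "<"
    values = struct.unpack(f"{order}{count}i", raw)
    # Assumption: row-major storage, so split on the last dimension.
    width = info["dims"][-1]
    return [list(values[i:i + width]) for i in range(0, count, width)]
```

Calling `read_sds("H4.hdf", describe_sds(MAP_XML, "data1"))` would seek to byte 2502, read 4000 bytes, and return 10 rows of 100 integers, provided the file really matches the map. The size check is the only validation the sketch performs.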

  35. Next steps

  36. Effort for full implementation
      • Generate maps for existing archives
          • GES-DISC approach: append the map XML to the XML files already kept for each file in their archive
          • NSIDC non-ECS data implementation: add an XML file for each data file in same directory
          • Other systems TBD
      • Generate maps for new data
          • Add map generation as a step in the ingest process using stand-alone tool
          • Request product generation systems to use new API calls that generate maps
      • Develop production quality implementation of mapping tool, and possibly an API.
      • Possibly do similar assessment for HDF5 maps.
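For the one-map-file-per-data-file approach, a first step is finding out which data files still lack a map. The Python sketch below walks an archive directory and lists them. The function name `files_missing_maps` is hypothetical, and the `.hdf` suffix and the `<data file name>.map.xml` naming are assumptions taken from the “H4.hdf” / “H4.hdf.map.xml” pair in the prototype workflow, since real archives may use other conventions. It does not generate any maps, because the presentation does not describe the map tool's command-line interface.

```python
from pathlib import Path


def files_missing_maps(archive_dir, data_suffix=".hdf", map_suffix=".map.xml"):
    """List data files under archive_dir that have no sibling map file."""
    missing = []
    for data_file in sorted(Path(archive_dir).rglob(f"*{data_suffix}")):
        # Assumed convention: "H4.hdf" is mapped by "H4.hdf.map.xml"
        # in the same directory.
        map_file = data_file.with_name(data_file.name + map_suffix)
        if not map_file.exists():
            missing.append(data_file)
    return missing
```

The returned list could then be handed to whatever map generation step an archive adopts at ingest time.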

  37. How you can help
      • Consider what it might take to implement this for your archive - contact Ruth if you’d like support.
      • Review the materials on the wiki and elsewhere - comment heavily!

  38. For more information
      • Wiki page added to Confluence wiki
      • Project page at The HDF Group website: http://www.hdfgroup.org/projects/hdf4mapping/
      • Paper at 2008 fall AGU
      • Paper “Ensuring Long Term Access to Remotely Sensed Data with Layout Maps” in the upcoming TGRSS special issue on archiving and distribution

  39. Thank you. This report is based upon work supported in part by a Cooperative Agreement with the National Aeronautics and Space Administration (NASA) under NASA Award NNX06AC83A. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Aeronautics and Space Administration.