1 / 1

NPOESS Enhanced Description Tool - “ned”

NPOESS Enhanced Description Tool - “ned”. Format Challenges. HDF5 for NPOESS. Geolocation appears in a separate product group and may be in separate HDF5 file. Field metadata, used to interpret data (similar to netCDF CF) are in separate product profile file.

ora-hudson
Download Presentation

NPOESS Enhanced Description Tool - “ned”

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. NPOESS Enhanced Description Tool - “ned” Format Challenges HDF5 for NPOESS • Geolocation appears in a separate product group and may be in separate HDF5 file. • Field metadata, used to interpret data (similar to netCDF CF) are in separate product profile file. • Quality flags must be parsed before they can be interpreted. • Information needed for un-scaling scaled integers is not obvious. • HDF5 indirect reference link API, used to link metadata to the data in NPOESS’ use is complex and not supported by all analysis COTS implementations. • Hierarchical Data Format 5 (HDF5) is the format for delivery of processed products from the National Polar-orbiting Operational Environmental Satellite System (NPOESS) and for the NPOESS Preparatory Project (NPP). • HDF5 is a general purpose library and file format for storing scientific data. Two primary objects: • Dataset, a multidimensional array of data elements • Group, a structure for organizing objects • Efficient storage and I/O, including parallel I/O. • Free, open source software, multiple platforms. • Data stored in HDF5 is used in many fields from computational fluid dynamics to film making. • Data can be stored in HDF5 in an endless variety of ways, so it is important to standardize how NPOESS product data is organized in HDF5. Enhancements by “ned” What is “ned”? HDF5 NPOESS HDF5 NPOESS • A “c” language demonstration exercise • An exploration of using the HDF5 API and other tools to navigate the NPOESS product information model. • How hard is it to deal with the format challenges? The challenges marked with (see left)are addressed by “ned” • Are NPOESS products enough consistent, one to another that a general purpose reader is possible? • Can a computer program make detailed associations between the HDF5 file and the XML product profile? • Can the product profile be used to drive automated parsing of quality flags? • To demonstrate the feasibility of implementation of suggested enhancements in the self-description of NPOESS products. • “ned” makes enhancements to the NPOESS format for challenge items marked by <<HDF5 Group>> / (root) <<HDF5 Group>> All_Data 1 1 <<HDF5 Group>> / (root) <<HDF5 Group>> All_Data 1 1 1 1 • attributes • attributes ned 1 Bit-wise flag fields are unpacked, split into separate fields, then compressed 1 1 * 1 * <<HDF5 Group>> Data Products <<HDF5 Group>> <collection_shortname>_All <<HDF5 Group>> Data Products <<HDF5 Group>> <collection_shortname>_All • attributes 1 1 1 <<HDF5 Dataset>> <QA field> 1 <<HDF5 Dataset>> <collection_shortname>_Aggr <<HDF5 Dataset>> <collection_shortname>_ Aggr • attributes 1 1 • attributes • attributes dataset of data Format Strengths Extract from XML Profile source … 1 dataset of Object References [one per field] 1 dataset of Object References [one per field] * * <<HDF5 Group>> <collection_shortname> 1 <<HDF5 Group>> <collection_shortname> 1 <<HDF5 Dataset>> <field> {nf} {nf} <<HDF5 Dataset>> <field> {nf} {nf} • attributes 1 • attributes 1 • attributes <<HDF5 Dataset>> <collection_shortname>_Gran_n <<HDF5 Dataset>> <collection_shortname>_ Gran_n 1 {nf} {nf} ng dataset of data ng dataset of data • attributes • attributes • Straight HDF5. • No need for additional libraries. • Consistent HDF5 group structure • Organization for each product is the same as all others. • Data “payload” is always in a product group within All_Data group. • Allows for flexible temporal aggregation • Granules are appended by extending dataset dimension. Attribute Aggregation … dataset of Region References (Hyperslabs) [one per field] {nf} dataset of Region References (Hyperslabs) [one per field] 1 HDF5 Enhanced by “ned” XML Product Profile Current Information Model as delivered by NPOESS IDPS. HDF5 file and associated XML Product Profile 1 <field> • attributes <<HDF5 Dataset>> <field> <<HDF5 Dataset>> <field> <<HDF5 Dataset>> <field> <<HDF5 Dataset>> <field> • attributes • attributes • attributes • attributes dataset of data dataset of data dataset of data dataset of data Information Model UML Diagram  SDR  Sensor Data Record  TDR  Temperature Data Record  EDR  Environmental Data Record  IP  Intermediate Product  ARP  Application Related Product  GEO  Geolocation   NPOESS  National Polar-orbiting Operational Environmental Satellite System  NPP  NPOESS Preparatory Project  ned version 0.7 • ANSI C code • HDF hdf5-1.6.5 • libxml 2.6.16 • udunits 1.4 • Development • Apple Xcode • GNU C 4.0 • SLOCCount.. • ansic=3707 • XML=1500 • Submitted for NASA technology transfer open source release. • Written against NPOESS IDPS build 1.4 sample data and profiles. • Numerous discrepancies between profiles and sample data resolved by editing XML profiles (not affecting build 1.4 profile DTD). • After editing profiles, ned successfully parses and acts upon 50 of 53 non-RDR sample files. • Not complete! NED …. open attribute definition configuration file open the input HDF5 file create an output HDF5 file (make the "All_Data" and "Data_Product" paths) for each product in the "Data_Products" path of the input HDF5 file (H5Giterate) { Find and open the associated XML Product Profile create a product group in the "Data_Products" path of the output HDF5 file for the aggregate reference in the "Data_Products" path of the input HDF5 file { create an "aggregate reference stub" in the output HDF5 file for each attribute attached to the aggregate (H5Aiterate) validate type and range attach attribute to the aggregate reference stub of the output HDF5 file hold aggregate metadata for later } for each granule reference in the "Data_Products" path of the input HDF5 file { create a ”granule reference stub" in the output HDF5 file for each attribute attached to the granule (H5Aiterate) validate type and range attach attribute to the granule reference stub of the output HDF5 file "aggregate" the metadata according to attribute definition file (hold for later) } create a product group in the "All_Data" path of the output HDF5 file attach the aggregate metadata as HDF5 attributes of the product group in the "All_Data" path of output file. attach the aggregated granule metadata as HDF5 attributes of the product group in the "All_Data" path of output file. for each field in the associated product group in the "All_Data" path of the input HDF5 file { Find and open the associated "sub-tree" of the XML profile. read metadata from XML profile and validate type and range ** "aggregate" the field metadata to associate with the data aggregation for each NPOESS "datum" in the field { create a data aggregation field in the output HDF5 file copy the data from the input HDF5 file to the output HDF5 file attach the field metadata to the data aggregation field of the output HDF5 file } } ** for each field in the "All_Data" path of the output HDF5 file create the object reference to the field populate the "aggregate reference stub” in the "Data_Products" path of the output HDF5 file ** for each granule for each field in the "All_Data" path of the output HDF5 file create the region reference to the portion of the field appropriate to this granule populate the "granule reference stub"in the "Data_Products" path of the output HDF5 file }     What does ned do? • Check the profile against the HDF5 product. • Validate attributes in HDF5 product (type and value). • For each field, read 16 attributes from product profile and attach them as HDF attributes of the field dataset. • Attach scale and offset values as “Scale_Factor” and “Add_Offset” attributes to field dataset. • Parse bit-wise fields and create separate compressed datasets for each quality flag. • Aggregate granule attributes and attach them as attributes of All_data product group. What does (0.7)not do? • Copy some of the attributes. • Aggregate g-Rings. • Aggregate percent attributes. • Compute reference regions. • Produce a user block. • Discover shared dimensions and create appropriate aggregate attributes. Additional thoughts • Modifications will be necessary to accommodate build 1.5 product profiles. • Should the field attribute names follow the CF conventions? ‘Enhanced” UML Diagram N A S A • N P P S C I E N C E D A T A S E G M E N T • I P O • A L G O R I T H M D I V I S I O N Richard E. Ullman NASA/GSFC/NPP • NOAA/NESDIS/IPO Data / Information Architecture • Algorithm / System Engineering richard.e.ullman@nasa.gov

More Related