
HDF Update

Mike Folk

The HDF Group

HDF and HDF-EOS Workshop XII

Aurora, Colorado

October 16, 2008

Topics


What’s up with The HDF Group?


Announcement!

NASA Commits $3.1M to The HDF Group to Support Earth System Science

NASA Commits …
  • “The HDF Group has received a 3-year contract from NASA to provide ongoing development and support for the HDF technologies used by NASA’s Earth Observing System.
  • The project continues the relationship that was first established in 1994, when HDF was selected as the standard format for the EOS Data and Information System (EOSDIS).
  • Since that time, over 4 petabytes of mission data and derived data products have been stored in HDF4 and HDF5, with an estimated 1.6 million users.”

Under the new contract, The HDF Group will support NASA’s EOS program in five critical areas:
    • Provide user support to EOS data providers and data consumers
    • Perform software development and quality assurance
    • Assure long-term access to HDF data
    • Integrate with complementary technologies and applications
    • Advise follow-on earth systems projects

History of The HDF Group
  • 18 Years at University of Illinois National Center for Supercomputing Applications
  • Spun off from the University in July 2006
  • Non-profit
  • 20+ scientific, technology, professional staff
  • Intellectual property:
    • The HDF Group owns HDF4 and HDF5
    • HDF formats and libraries to remain open
    • BSD-type license


The HDF Group Mission

To ensure long-term accessibility of HDF data through sustainable development and support of HDF technologies.

Goals
  • Maintain, evolve HDF for sponsors and communities that depend on it
  • Provide consulting, training, tuning, development, research
  • Sustain the group for long term to assure data access over time

The HDF Group Services
  • Helpdesk and Mailing Lists
    • Available to all users as a first level of support
  • Standard Support
    • Rapid issue resolution support
  • Consulting
    • Needs assessment, troubleshooting, design reviews, etc.
  • Enterprise Support
    • Coordinating HDF activities across departments
  • Special Projects
    • Adapting customer applications to HDF
    • New features and tools, with changes normally incorporated into open source product
    • Research and Development
  • Training
    • Tutorials and hands-on practical experience

Members of the HDF support community
  • NASA
  • Sandia National Laboratories (2)
  • University of Illinois/NCSA
  • A leading U.S. aerospace company
  • NOAA Science Data Stewardship
  • New projects and partners
    • A major product lifecycle management company
    • A bioinformatics software company
    • Engineer Research and Development Center – Topographic Engineering Center
    • NPOESS
    • ITT VIS

Initiatives and areas of increased interest
  • Bioinformatics
  • High performance computing (HPC)
  • Microsoft products (HPC, .NET, others)
  • Database integration
  • Improving concurrency
  • Performance and storage efficiency
  • Improving high level language support


Basic Library Releases

HDF5

HDF4

Overview of basic library releases

HDF5 1.8.0 (Feb 08)
  • Major release with file format changes and features.
  • File format changes affect backward/forward compatibility with previous releases.
  • See “New Features in Release 1.8.0 and Format Compatibility Considerations”

http://hdfgroup.org/HDF5/doc/ADGuide/CompatFormat180.html

HDF5 1.8 minor releases
  • 1.8.1 (May 08)
    • A minor release with bug fixes
    • Provided full 1.8 support for Fortran applications
    • Enhanced tools with 1.8.0 features
  • HDF5 1.8.2 coming Nov 08
    • Minor bug fixes
    • Tool enhancements

HDF5 1.6 minor releases
  • 1.6.7 (Feb 08)
    • Modification to address Aura issue
  • 1.6.8 coming Nov 08
    • Minor bug fixes

Future HDF5 releases (highlights)
  • Release HDF5 1.10.0
    • Performance improvements
    • Some new features
    • Support for Fortran 2003 features
    • Target date November 2009
  • When to drop support for 1.6.* ?

HDF4 minor releases
  • 4.2r3 (Feb 08)
    • Improved support for apps using HDF4 and NetCDF3
    • Improved support for data sets and coordinate variables with the same name
  • 4.2r4 coming Nov 08
    • Minor bug fixes, tool enhancements
    • Support for C shared libraries
    • Support for 32-bit version on Mac Intel
  • http://hdfgroup.org/products/hdf4/

H4-H5 Conversion Software 2.0 (May)
  • Re-built with HDF5 1.8.1 and HDF 4.2r3.
  • Conversion tool h4toh5 enhanced
    • Converts HDF-EOS2 files to HDF5 files
    • Makes HDF5 files readable by NetCDF4

http://hdfgroup.org/h4toh5/


HDF-EOS library

HDF-EOS2 and HDF-EOS5
  • Auto configuration for HDF-EOS2 and HDF-EOS5
    • Compile and test libraries with automatic configuration tools
    • Thank you, Abe!
  • Testing of EOS2 and EOS5
    • Test daily with HDF4 and HDF5 development code
    • Periodically test on EOS-critical platforms
  • EOS website support

h5check 1.0 (March 2008)
  • A validation tool to verify whether an HDF5 file is encoded according to the HDF5 File Format Specification.
  • To ensure format integrity and long-term compatibility between versions of the HDF5 library.
  • By default, the file is verified against 1.8.x. Can also verify against 1.6.x.
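
As a rough illustration of what format validation starts from, the sketch below only looks for the 8-byte HDF5 format signature at the offsets the File Format Specification allows. It is not how h5check is implemented and covers none of its deeper checks; the function name and the `max_offset` cutoff are assumptions made for this example.

```python
import os
import tempfile

# 8-byte format signature defined by the HDF5 File Format Specification.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def find_hdf5_signature(path, max_offset=1 << 20):
    """Return the byte offset of the HDF5 signature, or None if not found.

    The superblock may start at offset 0 or at 512, 1024, 2048, ...
    (doubling each time), which leaves room for a user block.
    max_offset is an arbitrary cutoff chosen for this sketch.
    """
    with open(path, "rb") as f:
        offset = 0
        while offset <= max_offset:
            f.seek(offset)
            if f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return offset
            offset = 512 if offset == 0 else offset * 2
    return None


def _write_temp(data):
    """Write bytes to a temporary file and return its path (demo helper)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return path


# Demo files: signature at 0, signature after a 512-byte user block, no signature.
paths = [
    _write_temp(HDF5_SIGNATURE + b"\x00" * 64),
    _write_temp(b"\x00" * 512 + HDF5_SIGNATURE),
    _write_temp(b"not an hdf5 file"),
]
offsets = [find_hdf5_signature(p) for p in paths]
for p in paths:
    os.remove(p)
```

A file that passes this test can still be invalid; a full validator has to walk the superblock and every structure it points to.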

Major Improvements for Existing Tools
  • Improved handling of large datasets by h5diff, h5repack, hdiff, and hrepack
  • Other added capabilities
    • h5import: to import strings
    • h5diff: to deal with NaN values
    • h5dump: to dump objects in requested order
    • h5repack:
      • To apply multiple filters to all objects
      • To add a userblock
      • To align datasets in file at byte offsets that support efficient access

In the works: h52jpeg
  • Converts datasets in an HDF5 file to a jpeg image.
  • Prototype available, if you are interested.

Please send us your comments and requests regarding the HDF4 and HDF5 libraries and tools

HDF Java
  • HDF-Java 2.5 release
    • Beta 1 Release Feb 08
    • Full release planned for Dec. 2008
  • HDF5 JNI updated for HDF5 1.8.x with 1.6 flag
  • Binaries for 32-bit Linux and 64-bit Solaris
  • Added daily testing for HDF-Java products

Also in the pipeline
  • Full Java Support for HDF5 1.8.x
    • Add and test new functions in Java wrapper
    • Implement and test new functions in C JNI
    • Use new functions in HDF-Java objects
  • Add many new features
  • Improve performance
  • Revise HDFView User’s Guide


Surviving a System Failure

Surviving a System Failure in HDF5

  • Problem:
    • In the event of an application or system crash, data in HDF5 files are susceptible to corruption
    • Corruption can occur if structural metadata is being written when the crash occurs
  • Initial Objective:
    • Guarantee an HDF5 file with consistent metadata can be reconstructed in the event of a crash
    • No guarantee on state of raw data: the file contains whatever data made it to disk prior to the crash

HDF5 Metadata Journaling Recovery

[Diagram: after an application crashes, the h5recover tool combines the corrupted HDF5 file with its companion journal file to produce a restored HDF5 file.]
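
The slides do not describe the journal format, so the following is only a generic write-ahead journaling sketch, not HDF5's implementation. It assumes metadata can be modelled as a dictionary and that each transaction is logged with begin, set, and commit records; all names are hypothetical. Recovery replays only transactions that reached their commit record, which is what keeps the restored metadata consistent.

```python
import json


def journal_transaction(journal, txn_id, updates, crash_before_commit=False):
    """Append one metadata transaction to the journal (a list of JSON lines).

    crash_before_commit simulates a crash that cuts the transaction short.
    """
    journal.append(json.dumps({"txn": txn_id, "op": "begin"}))
    for key, value in updates.items():
        journal.append(
            json.dumps({"txn": txn_id, "op": "set", "key": key, "value": value})
        )
    if not crash_before_commit:
        journal.append(json.dumps({"txn": txn_id, "op": "commit"}))


def recover(last_consistent_metadata, journal):
    """Replay only committed transactions onto the last consistent metadata."""
    entries = [json.loads(line) for line in journal]
    committed = {e["txn"] for e in entries if e["op"] == "commit"}
    restored = dict(last_consistent_metadata)
    for entry in entries:
        if entry["op"] == "set" and entry["txn"] in committed:
            restored[entry["key"]] = entry["value"]
    return restored


# Transaction 1 completes; transaction 2 is interrupted by a simulated crash.
journal = []
journal_transaction(journal, 1, {"dataset_rows": 200})
journal_transaction(journal, 2, {"dataset_rows": 300}, crash_before_commit=True)

restored = recover({"dataset_rows": 100}, journal)
```

As on the slide, this says nothing about raw data: only the metadata state is guaranteed to be consistent after recovery.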


Faster HDF5 Data Appends

Fast Data Appends
  • Problem: Metadata operations limit the rate at which HDF5 can append data to datasets.
  • Solution: new data structure for indexing chunks:
    • Allows constant-time extend, shrink and lookup of chunks in datasets with a single unlimited dimension
    • Number of metadata I/O operations to append to a dataset is independent of the number of chunks
    • Also allows single-writer/multiple-reader access
  • Details at: http://hdfgroup.uiuc.edu/RFC/HDF5/ReviseChunks/
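
To make the constant-time claim concrete, here is a minimal sketch of an array-style chunk index for a dataset with one unlimited dimension. It is not the data structure from the RFC, whose details are not given in the slides; the class name, the in-memory list, and the example addresses are assumptions for illustration.

```python
class ChunkIndex:
    """Toy index: chunk i covers rows [i * chunk_rows, (i + 1) * chunk_rows)."""

    def __init__(self, chunk_rows):
        self.chunk_rows = chunk_rows
        self.addresses = []  # addresses[i] is the file address of chunk i

    def extend(self, address):
        """Register a newly appended chunk (amortized constant time)."""
        self.addresses.append(address)

    def shrink(self):
        """Drop the last chunk and return its address (constant time)."""
        return self.addresses.pop()

    def lookup(self, row):
        """Return the address of the chunk holding `row` (constant time)."""
        return self.addresses[row // self.chunk_rows]


# Five chunks of 1000 rows each, at made-up file addresses.
index = ChunkIndex(chunk_rows=1000)
for i in range(5):
    index.extend(4096 + i * 8000)

address_for_row_2500 = index.lookup(2500)  # row 2500 falls in chunk 2
removed = index.shrink()
```

Because the chunk number is computed directly from the row, neither lookup nor append has to traverse a tree whose depth grows with the number of chunks.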


HDF Performance Framework

A framework for performance regression testing

HDF Performance Framework

  • A tool for
    • Testing on multiple platforms
    • Testing different versions
    • Long term regression testing
    • Assistance in debugging
  • New for 1.8:
    • API and format versioning
    • Improved reporting interfaces
  • Future related work
    • Quality monitoring of the software, such as code coverage, memory usage
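
The slides do not say how the framework records or compares timings. The sketch below shows one plausible shape for performance regression testing: time an operation several times, keep the median, and flag benchmarks that are slower than a stored baseline by more than a tolerance. The function names, the 10% tolerance, and the sample numbers are all assumptions.

```python
import statistics
import time


def time_operation(func, repeats=5):
    """Run func several times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def find_regressions(baseline, current, tolerance=0.10):
    """Return names of benchmarks slower than baseline by more than tolerance."""
    return sorted(
        name
        for name, seconds in current.items()
        if name in baseline and seconds > baseline[name] * (1 + tolerance)
    )


# Timing a stand-in workload; a real run would time library operations.
median_seconds = time_operation(lambda: sum(range(1000)))

# Made-up timings (seconds) for two library versions, for illustration only.
baseline = {"append": 1.00, "read": 2.00}
current = {"append": 1.05, "read": 2.50}
regressions = find_regressions(baseline, current)
```

Using the median rather than the mean keeps a single slow run, for example one caused by unrelated system load, from triggering a false alarm.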


Other library work

Library Features
  • Improved external link support
    • External link: link to HDF5 object in another file
    • Can more easily specify path lookup of external files
    • Adding external link support for h5ls and h5dump
  • Time datatype improvements
    • Expand time type to support native formats better
    • Adapt tools to display them properly
  • Port to OpenVMS (limited support)

Improving performance
  • Faster file free-space management while file open
    • Many transactions can create many holes
    • Free space management recovers unused space
    • Up to 38x improvement in experiments
  • Direct I/O: file I/O goes directly between application and storage, bypassing operating system read and write caches
  • Disabling automatic metadata cache flushing
  • In experiments, direct I/O combined with metadata cache disabling improved I/O speed by about 2x.


Remote access

Three “remote access” projects
  • HDF5-OPeNDAP handler
    • See talk by Kent Yang: “HDF5 OPeNDAP project update and demo”
  • HDF5-iRODS integration
    • See Peter Cao’s talk Thursday: “HDF5 iRODS”
  • Accessing HDF5 through SSHFS-FUSE

Accessing HDF5 through SSHFS-FUSE
  • Access to files on a remote NFS system is limited
  • Combining FUSE (Filesystem in Userspace) with SSHFS (Secure Shell File System)
    • FUSE provides the application with a local view of the remote file system
      • Another way to mount a remote file system
    • SSHFS allows the local file system to access parts of a remote file
      • e.g., a “read” operation on the remote filesystem can be served through SSH
      • Subsetting can be done efficiently with SSHFS
  • Extract a dataset (5 MB) from a 96 MB HDF5 file
    • Download whole file + subset locally: 9.85 seconds
    • Subset with SSHFS: 0.47 seconds
  • Technical report in the works
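
The reason subsetting is cheap over a mounted remote file system is that a reader can seek to the bytes it needs instead of copying the whole file. The sketch below shows that access pattern on an ordinary local temporary file; on an SSHFS mount the path would simply point into the mount, and how many bytes actually cross the network also depends on SSHFS caching and read-ahead. The helper name is an assumption, and the last lines only work out the speedup implied by the two timings quoted above.

```python
import os
import tempfile


def read_subset(path, offset, length):
    """Read `length` bytes starting at `offset` without reading the rest."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


# Demo on a local 1024-byte file standing in for a remote one.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(bytes(range(256)) * 4)

subset = read_subset(path, offset=300, length=4)
os.remove(path)

# Speedup implied by the timings on the slide (seconds).
whole_file_seconds = 9.85
sshfs_subset_seconds = 0.47
speedup = whole_file_seconds / sshfs_subset_seconds  # roughly 21x
```

In practice the HDF5 library, not the application, issues these seeks and reads when a dataset is opened through the mounted path.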

HDF4 Layout Map Project
  • Problem
    • Long-term readability of HDF data dependent on long-term availability of HDF software
  • Proposed solution
    • Create a map of the layout of data objects in an HDF file, allowing a simple reader to be written to access the data
  • See today’s talk by Folk and Duerr: “Ensuring Long Term Access to Remotely Sensed HDF4 Data with Layout Maps.”
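
The idea of a layout map can be sketched as follows: if a map records where an object's bytes sit in the file and how they are encoded, a reader needs only seek, read, and unpack, with no HDF library. The map format used by the project is not given in the slides, so the dictionary fields below (`offset`, `format`) and the demo file are hypothetical.

```python
import os
import struct
import tempfile


def read_with_map(path, entry):
    """Read one data object using only its layout-map entry.

    entry["offset"] is the byte offset of the data in the file and
    entry["format"] is a struct format string describing its encoding.
    """
    with open(path, "rb") as f:
        f.seek(entry["offset"])
        raw = f.read(struct.calcsize(entry["format"]))
    return list(struct.unpack(entry["format"], raw))


# Demo file: 16 bytes of stand-in header, then four big-endian 16-bit integers.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"HEADERHEADERHEAD")
    f.write(struct.pack(">4h", 10, 20, 30, 40))

layout_map = {"temperature": {"offset": 16, "format": ">4h"}}
values = read_with_map(path, layout_map["temperature"])
os.remove(path)
```

A real map would also have to describe chunked or compressed storage, which this sketch ignores.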

HDF and .NET Framework
  • Prototype .NET wrappers for HDF5 1.8.0
    • Based on subset of HDF5 C routines
  • Released in March 2008
  • Unsupported
    • Considerable interest, but currently no funding to support or maintain
    • Use hdf-forum email list for questions


netCDF-4 Released June 2008!!

Five open source packages
  • PyHDF
    • Python interface to HDF4
    • http://pysclint.sourceforge.net/pyhdf/
  • Geospatial Data Abstraction Library (GDAL)
    • Translator library for Raster Geospatial Data Formats
    • Supports about 100 file formats
    • http://gdal.org/
  • NCAR Command Language (NCL)
    • Interpreted Language for Data Analysis and Visualization
    • http://ncl.ucar.edu/
  • Grid Analysis and Display System (GrADS)
    • Interpreted Language for Data Analysis and Visualization
    • http://iges.org/grads/
  • GNU Data Language (GDL)
    • Interpreted Language for Data Analysis and Visualization
    • http://gnudatalanguage.sourceforge.net/

Evaluation criteria
  • Formats
    • HDF4, HDF5, netCDF
    • Objects supported in each language
  • Installation
    • Availability of binaries
    • Other requirements
  • Adequacy of documentation
  • Technical report available soon.


Windows Virtualization

Motivation: high cost of maintaining many different Windows configurations

Maintenance & Testing with VMware
  • Multiple virtual machines run in parallel
  • Only relevant software installed
  • Each represents a supported configuration
  • Run nightly tests of HDF4, HDF5
  • Each is powered on, tested, cleaned automatically
  • Technical report available soon.

HDF5 Data Transform Pilot Study
  • Tools for Flight Test Data
  • Framework to define and apply transformations to data being read
  • Transformations specified in Python
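
The pilot study's interface is not described in the slides. As a rough sketch of the idea, a transformation can be an ordinary Python function registered against a dataset name and applied to values as they are read; the registry, the dataset name, and the calibration constants below are all hypothetical.

```python
def counts_to_volts(count, scale=0.5, offset=-1.0):
    """Example calibration: convert a raw sensor count to volts."""
    return count * scale + offset


# Hypothetical registry mapping dataset names to their read-time transforms.
TRANSFORMS = {"sensor_a": counts_to_volts}


def read_with_transform(name, raw_values, transforms=TRANSFORMS):
    """Return raw_values with the dataset's transform applied, if one exists."""
    transform = transforms.get(name)
    if transform is None:
        return list(raw_values)
    return [transform(value) for value in raw_values]


calibrated = read_with_transform("sensor_a", [0, 2, 4])
untouched = read_with_transform("sensor_b", [0, 2, 4])
```

Keeping the stored data raw and applying the transform on read means a calibration can change without rewriting the file.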

Science Data Stewardship
  • Goal: migrate data to a single standards-based archive format.
  • Approach: investigate how to store NASA ECS data and metadata in HDF5 Archival Information Packages (AIP).
  • See talk by Yang, Duerr et al: “Using HDF5 Archive Information Package to preserve HDF-EOS2 data”


Thank You All and Thank You NASA!

Acknowledgements

This report is based upon work supported in part by a Cooperative Agreement with the National Aeronautics and Space Administration (NASA) under NASA Awards NNX06AC83A and NNX08AO77A.

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Aeronautics and Space Administration.


Questions/comments?
