
HDF Update

Mike Folk, The HDF Group. HDF and HDF-EOS Workshop XI, November 7, 2007.



  1. HDF Update Mike Folk The HDF Group HDF and HDF-EOS Workshop XI November 7, 2007 The HDF Group

  2. Outline
     • What is The HDF Group?
     • HDF Software Update
     • Other Activities of Interest

  3. What is The HDF Group (THG)?

  4. THG, the Company
     • Spun off from University of Illinois, July 2006
     • Non-profit
     • 20+ scientific, technology, professional staff
     • Intellectual property:
        • THG owns HDF4 and HDF5
        • HDF formats and libraries to remain open
        • Libraries have BSD-type license
     • Continue ties to U of I and NCSA

  5. The mission of The HDF Group is to ensure long-term accessibility of HDF data through sustainable development and support of HDF technologies.

  6. Goals
     • Maintain, evolve HDF for sponsors and communities that depend on it
     • Do consulting, training, tuning, development, research
     • Sustain The HDF Group for long term to assure data access over time

  7. THG Services
     • Helpdesk and Mailing Lists
        • Available to all users as a first level of support
     • Standard Support
        • Rapid issue resolution support
     • Consulting
        • Needs assessment, troubleshooting, design reviews, etc.
     • Enterprise Support
        • Coordinating HDF activities across divisions
     • Special Projects
        • Adapting customer applications to HDF
        • New features and tools, with changes normally incorporated into open source product
     • Research and Development
     • Training
        • Tutorials and hands-on practical experience

  8. HDF Software Update

  9. HDF4 update

  10. HDF 4.2r2 Released in October

  11. New features and changes
     • New APIs added to the SD and GR interfaces:
        • SDreset_maxopenfiles, SDget_maxopenfiles: Modify and report the maximum allowable number of open files
        • SDget_numopenfiles: Gets number of open files
        • SDgetcompinfo, GRgetcompinfo: Get compression info
        • SDgetfilename: Retrieves name of file, given its ID
        • SDgetnamelen: Retrieves length of object name, given its ID
     • SZIP compression
        • Now can be invoked by Fortran API
        • Now available for raster images via GR interface
     • SDS, Vgroup names no longer limited to 64 characters

  12. New features and changes
     • HDF configuration changes
        • --enable-netcdf flag introduced
        • Autotools versions updated
     • Many bug fixes made to hrepack and hdiff
     • See RELEASE.txt for a full list of changes

  13. Platforms to drop/add next release
     Drop
        • Windows XP with MSVC++ 6.0
        • Linux 2.4
        • IRIX64 6.5
        • SunOS 5.8, 5.9
     Add
        • Windows 64-bit (32 and 64-bit binaries)

  14. Platforms tested
     Systems
        • AIX 5.3 (32-bit, 64-bit)
        • FreeBSD 6.2 (32-bit, 64-bit)*
        • HP-UX B.11.23 (32-bit, 64-bit)*
        • IRIX 64 v6.5 (32-bit, 64-bit)
        • Linux 2.4, 2.6*
        • Linux ia64
        • Linux x86_64
        • SunOS 5.8, 5.10* (32-bit, 64-bit)
        • SunOS 5.10 on Intel
        • Windows XP, Vista
        • Mac OS X Intel*
     Compilers
        • IBM C and Fortran compilers
        • GNU gcc 3.4* and GNU Fortran
        • HPUX C and Fortran compilers
        • GNU gcc 3.4 and 4.*
        • Intel C and Fortran versions 9.1 and 10.00
        • SUN WorkShop C and Fortran
        • Visual Studio .NET and 2005 and Intel Fortran
        • Visual Studio 2005 (no Fortran)
        • GNU gcc 4.0.1 with gfortran and g95
     * New platforms
     For detailed info, see RELEASE.txt

  15. HDF5 Update

  16. HDF5 1.6.6

  17. HDF5 1.6.6 release
     • Primarily a bug-fix release
     • Some tool changes (see later slide)
     • http://hdfgroup.org/HDF5/release/obtain5.html

  18. Platforms dropped
     • Operating systems
        • AIX 5.3
        • Solaris 2.8 and 2.9
        • OSF1
        • Windows XP with MSVC++ 6.0
     • Compilers
        • PGI 6.5-*
     http://www.hdfgroup.org/HDF5/release/alpha/obtain518.html

  19. Platforms added
     • Systems
        • Alpha Open VMS
        • Mac OS X 10.4 (Intel)
        • Solaris 2.* on Intel
        • Cray XT3
        • Windows 64-bit (32 and 64-bit)
        • BG/L
     • Compilers
        • PGI V. 7.*
        • Intel 10.*
        • MPICH 1.2.7
        • MPICH2

  20. HDF5 1.8

  21. HDF5 1.8 – new library features
     • Datatype and dataspace features
        • Create datatype from text description
        • Integer to float conversions during I/O
        • Compact storage for N-bit datatypes
        • Offset+size storage filter, saving space
        • “Null” dataspace – datasets with no elements
        • Data transformation filter

  22. HDF5 1.8 – new library features
     • Group improvements
        • Creation order access
        • Compact groups – small groups take less space
        • Large group storage improvements
        • Intermediate group creation
     • Link improvements
        • Unicode names allowed
        • External links – to objects in another file
        • User defined links – create own kinds of links

  23. HDF5 1.8 – new library features
     • Attribute improvements
        • Improved storage for large number of attributes
        • Iterate or look up by creation order
        • Unicode names allowed
     • Support for Unicode UTF-8 character set
     • Shared header information, possibly saving space
     • Metadata cache improvements – faster I/O on files with many objects
     • Better UNIX/Linux portability

  24. HDF5 1.8 – new APIs
     • New extendible error-handling API
     • New APIs to copy objects between files quickly
     • Dimension scale model and API
     • “HDFpacket” API, to read/write packets efficiently

  25. HDF5 1.8 – Backward and Forward Compatibility

  26. HDF5 1.8 and 1.6
     • Differences between 1.8 and 1.6.x
        • Some file format changes
        • Several new routines added
        • Old APIs deprecated – may be removed in later release
     • Consequences
        • Applications requiring 1.8 format changes will generate objects that cannot be read by 1.6 library
        • To exploit 1.8 changes, applications need to be rewritten

  27. “The art of progress is to preserve order amid change, and to preserve change amid order.” (Alfred North Whitehead)

  28. Principle of Maximum File Format Compatibility
     Unless instructed otherwise, the HDF5 library will write objects using the earliest version of the format possible for describing the information.
     This assures older library versions are forward compatible whenever possible:
     • Objects in new files can be read with old versions of the library, if the objects are “known” to the old libraries.
     • New versions of the library can always read objects in files written with older versions.
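
The compatibility principle on slide 28 can be illustrated with a toy model. This is only a sketch under stated assumptions: the feature flags, the version numbers 1 and 2, and the function names are hypothetical and are not HDF5 identifiers or the library's actual logic.

```c
#include <assert.h>

/* Hypothetical feature flags an object might need; not HDF5 identifiers. */
enum feature {
    FEAT_NONE           = 0,
    FEAT_CREATION_ORDER = 1 << 0,  /* assumed to require format version 2 */
    FEAT_EXTERNAL_LINK  = 1 << 1   /* assumed to require format version 2 */
};

/* Pick the earliest (hypothetical) format version that can describe an
 * object using the given features: version 1 unless a newer feature is set. */
int earliest_format_version(unsigned features)
{
    if (features & (FEAT_CREATION_ORDER | FEAT_EXTERNAL_LINK))
        return 2;
    return 1;
}

/* A library that understands versions up to max_known can read an object
 * only if the object was written with a version it knows. */
int can_read(int object_version, int max_known)
{
    assert(object_version >= 1 && max_known >= 1);
    return object_version <= max_known;
}
```

In this model an object that uses no new features is written as version 1, so an old reader (max_known of 1) can still read it from a file produced by a new writer, while a new reader can read everything.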

  29. Command Line Tools

  30. New features for existing tools
     • -V option for all tools
        • Prints HDF5 library version number used by tool
     • h5repack: -L option
        • Use latest version of file format to create objects
     • h5dump: dumps groups/attributes in creation or name order
        -q Q, --sort_by=Q     Sort groups and attributes by index Q
        -z Z, --sort_order=Z  Sort groups and attributes by order Z

  31. New command line tools
     • h5mkgrp
        • Creates new groups and group hierarchies in an HDF5 file
     • h5stat
        • Provides statistics regarding the file, such as number of objects per group, sizes of datasets, amount of free space in file
     • h5copy
        • Copies objects within a file or across files
     • h5check
        • Verifies an HDF5 file against the defined HDF5 File Format Specification
        • Completed for 1.6
        • In progress for 1.8

  32. Tool work in the pipeline
     • Export numeric data formatted in several different ways (such as MS Excel, XML, etc.)
     • Import ASCII data that conforms to a certain format
     • Use a common text format for h5import and h5dump
     • Support NaN in tools such as h5diff. Challenges:
        • NaN is platform specific
        • NaN can have different values on the same machine
        • Checking NaN can be a performance hit
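
The NaN challenges listed on slide 32 can be seen in a few lines of standard C. This is a minimal sketch, not code from h5diff: the function names are hypothetical, and it assumes 64-bit IEEE 754 doubles.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Compare two doubles the way a diff tool might want to: two NaNs count as
 * a match even though the == operator always reports NaN as unequal. */
int values_match(double a, double b)
{
    if (isnan(a) && isnan(b))
        return 1;
    return a == b;
}

/* Return the raw 64-bit pattern of a double. */
uint64_t bits_of(double x)
{
    uint64_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Build a double from a raw 64-bit pattern. */
double from_bits(uint64_t u)
{
    double x;
    memcpy(&x, &u, sizeof x);
    return x;
}
```

Two values built from different bit patterns can both be NaN, so a byte-wise comparison would flag them as different, and the extra isnan test on every element is the kind of per-value cost the slide refers to.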

  33. HDF Java Products

  34. HDF5 Java is Growing UP

  35. HDFView changes
     • HDFView 2.4 released
     • Many new features, such as
        • Support for compound datatypes of 2D+ arrays
        • Support for "filtering fill value" in Image Viewer
        • Effective handling of large 3D images
        • Support large fonts in GUI components
        • New autogain algorithm for image Brightness/Contrast
     • New platforms
        • Mac Intel
        • Linux 64-bit AMD
        • Solaris 64-bit

  36. Other Java products
     • 36 new enhancements and 44 bugs fixed
     • Test suite (using JUnit testing framework)
        • Tests all public methods in the object package
        • Added “make check” to run the test suite
     • Enhanced documentation
        • All public methods in the object package are fully documented

  37. Future work for Java
     • Update HDF5 JNI APIs for HDF5 1.8 release
     • Release HDFView with bug fixes/new features with HDF5 1.8 release
     • Port HDF5-SRB model to HDF5-iRODS model
     • Writing capability for HDF5-iRODS model

  38. Other Activities of Interest

  39. New THG Website

  40. New THG Website

  41. HDF Performance Framework

  42. Goals
     • A framework for performance regression testing
     • A tool for
        • Testing on multiple platforms
        • Testing different versions
        • Long term regression testing
        • Assistance in debugging

  43. Solution
     [Architecture diagram. Labels: HDF5 1.6, HDF5 1.8, Database, cron, A User’s Benchmark, Performance Library, PHP Web Server, www, Graph/Text]

  44. Sample Usage
     Benchmark code instrumented with the performance library:
        H5Perf_startTimer(&time);
        for (i = 0; i < 1000; i++) {
            H5Gcreate(fileid, group_name, (size_t)0);  // Add groups
        }
        H5Perf_endTimer(&time);
        H5Perf_addInstance(db_host, date, time);
     Cron entry:
        00 21 * * * /home/local/hyoklee/src/chicago/test-perf-hdfdap-3.sh
     Database record:
        | 178820 | 2007-08-17 21:51:14 | 10000 groups | creating 10000 empty groups | 1.8.0 | hdfdap | 0.670198 | 4384 |
     Labeled fields: Timestamp, Instance, Name, Version, Platform, Time

  45. Improved Crash Survivability in the HDF5 Library

  46. Crash Survivability in HDF5
     • Problem:
        • Data in HDF5 files is susceptible to corruption in the event of an application or system crash.
        • Corruption is possible if structural metadata is being written when the crash occurs.
     • Initial Objective:
        • Guarantee an HDF5 file with consistent metadata can be reconstructed in the event of a crash.
        • No guarantee on state of raw data – contains whatever made it to disk prior to crash.

  47. Crash Survivability in HDF5
     • Approach: Metadata Journaling
        • When a piece of metadata is modified and in a consistent state, make a journal note.
        • If the application crashes, a recovery program can replay the journal by applying, in order, all metadata writes up to the end of the last completed transaction written to the journal file.
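
The replay rule on slide 47 (apply writes in order, stop at the end of the last completed transaction) can be sketched as follows. The entry layout, the in-memory metadata array, and all names are assumptions made for illustration; the slides do not describe the actual journal format.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical journal entry kinds; the real journal layout is not given. */
enum entry_kind { ENTRY_WRITE, ENTRY_COMMIT };

struct journal_entry {
    enum entry_kind kind;
    size_t slot;   /* metadata slot the write targets (ENTRY_WRITE only) */
    int value;     /* new contents of that slot (ENTRY_WRITE only) */
};

/* Replay the journal into metadata[]: apply, in order, every write that lies
 * before the last ENTRY_COMMIT, and ignore writes after it, since they belong
 * to a transaction that never completed. Returns the number of writes applied. */
size_t replay_journal(const struct journal_entry *j, size_t n,
                      int *metadata, size_t nslots)
{
    size_t last_commit = 0;   /* entries [0, last_commit) are safe to apply */
    size_t applied = 0;

    for (size_t i = 0; i < n; i++)
        if (j[i].kind == ENTRY_COMMIT)
            last_commit = i + 1;

    for (size_t i = 0; i < last_commit; i++) {
        if (j[i].kind != ENTRY_WRITE)
            continue;
        assert(j[i].slot < nslots);
        metadata[j[i].slot] = j[i].value;
        applied++;
    }
    return applied;
}
```

A journal holding two writes, a commit, and then one more write would restore only the first two writes; the trailing write is discarded because its transaction was never committed.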

  48. Faster HDF5 Data Appends

  49. Fast Data Appends
     • Problem: Metadata operations limit the rate at which HDF5 can append data to datasets.
     • Solution: new data structure for indexing chunks:
        • Allows constant time extend, shrink and lookup of chunks in datasets with single unlimited dimension
        • Number of metadata I/O operations to append to dataset is independent of number of chunks
        • Allows single-writer/multiple-reader access
     • Details at: http://www.hdfgroup.uiuc.edu/RFC/HDF5/SkipListChunkIndex/SkipListChunkIndex.html
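
To make the constant-time goal on slide 49 concrete, here is a toy in-memory index for a dataset with one unlimited dimension. It is not the data structure from the linked RFC: it assumes fixed-length chunks addressed by position, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical flat index: entry k holds the file address of chunk k. */
struct chunk_index {
    uint64_t *addr;
    size_t    count;     /* chunks currently in the dataset */
    size_t    capacity;  /* allocated entries */
};

/* Extend: record one more chunk. Amortized constant time through capacity
 * doubling. Returns 0 on success, -1 if memory could not be allocated. */
int chunk_index_append(struct chunk_index *ix, uint64_t address)
{
    if (ix->count == ix->capacity) {
        size_t cap = ix->capacity ? ix->capacity * 2 : 4;
        uint64_t *p = realloc(ix->addr, cap * sizeof *p);
        if (p == NULL)
            return -1;
        ix->addr = p;
        ix->capacity = cap;
    }
    ix->addr[ix->count++] = address;
    return 0;
}

/* Shrink: drop the last chunk in constant time. */
void chunk_index_shrink(struct chunk_index *ix)
{
    assert(ix->count > 0);
    ix->count--;
}

/* Lookup: find the address of the chunk holding a given element. The chunk
 * number is computed directly, so the cost does not grow with chunk count. */
uint64_t chunk_index_lookup(const struct chunk_index *ix,
                            uint64_t element, uint64_t chunk_len)
{
    assert(chunk_len > 0);
    uint64_t k = element / chunk_len;
    assert(k < ix->count);
    return ix->addr[k];
}
```

The sketch shows only why lookup and append need not depend on the number of chunks when there is a single growing dimension; it says nothing about on-disk layout or single-writer/multiple-reader access.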

  50. netCDF-4
