
Introduction to HDF5

Ruth Aydt, Quincey Koziol, The HDF Group, {aydt,koziol}@hdfgroup.org


Presentation Transcript


  1. Introduction to HDF5 Ruth Aydt Quincey Koziol The HDF Group {aydt,koziol}@hdfgroup.org

  2. THE BLIND MEN AND THE ELEPHANT: A HINDOO FABLE, by John Godfrey Saxe. It was six men of Indostan / To learning much inclined, / Who went to see the Elephant / (Though all of them were blind), / That each by observation / Might satisfy his mind. … (see wikipedia.org)

  3. Our Purpose Today 1) Familiarize you with HDF5 and its capabilities. 2) Help you understand how HDF5 might be applied to your data management challenges.

  4. Outline of Sessions • Overview, HDF5 Data Model • Data Model Comparisons, HDF5 File Format • HDF5 Software

  5. What is HDF5? • Open file format • Designed for high volume or complex data • Open source software • Works with data in the format • A data model • Structures for data organization and specification

  6. HDF = Hierarchical Data Format • HDF4 is the first HDF • Originally called HDF; last major release was version 4 • Still supported by The HDF Group • HDF5 benefits from lessons learned with HDF4 • Changes to file format, software, and data model • Not compatible with HDF4 • No plans for HDF6!

  7. Condensed History • 1987: Graphics task force at NCSA began work on architecture-independent format and library, HDF. • 1990: NSF provided funding to improve documentation, testing, and user support. • 1994: NASA selected HDF as standard format for Earth Observing System. • 1996–1998: DOE tri-labs and NCSA, with additional support from NASA, developed HDF5, initially called “BigHDF”. • 2005: NASA funded development of netCDF-4, a new version of netCDF that uses the HDF5 file format. • 2006: The HDF Group, a non-profit corporation, spun off from NCSA and the University of Illinois.

  8. HDF5 in the Formative Years Focus: Serve science and engineering communities • Variety of data sources, often in single workflow • Simulation, observation, visualization, annotation • High volume • Data rates and data sizes • Complex • Data types and data relationships • Variety of system architectures • Data often shared widely • Different users care about different subsets • Data must be accessible far into the future

  9. HDF5 Philosophy • A single platform with multiple uses • One general file format • Self-describing, allows for discovery of objects in the file • Designed for speed and storage efficiency • One software library • Options to adapt I/O and storage to data needs • Layers above and below • One data model • Structures based on mathematical foundations • Supports expression of complex data types and relationships • Work well with other technologies • Attention to compatibility

  10. HDF5 is like…

  11. HDF5 Technology Platform • HDF5 data model • The “building blocks” for data organization and specification • HDF5 software • Library, language interfaces, tools • HDF5 file format • Bit-level organization of HDF5 file

  12. HDF5 Data Model (a.k.a. HDF5 Abstract Data Model, a.k.a. HDF5 Logical Data Model) • HDF5 objects: File, Group, Link, Dataset, Datatype, Dataspace, Attribute

  13. There will be a Quiz! • Use objects from the HDF5 data model to design an HDF5 file to store daily temperature measurements made at various locations throughout a building.

  14. HDF5 File • An HDF5 file is a container that holds data objects. • An HDF5 file is not necessarily a file on a filesystem. • Example objects shown on the slide: a table and a text note.

      lat | lon | temp
      ----|-----|-----
      12  | 23  | 3.1
      15  | 24  | 4.2
      17  | 21  | 3.6

      Experiment Notes: Serial Number: 99378920; Date: 3/13/09; Configuration: Standard 3

  15. HDF5 Dataset • Multi-dimensional array of identically typed data elements • Specifications for single data element and array dimensions • Example HDF5 datatype: Integer, 32-bit, LE • Example HDF5 dataspace: Rank = 3; Dimensions: Dim_0 = 4, Dim_1 = 5, Dim_2 = 7 • HDF5 datasets organize and contain “raw data values”. • HDF5 datatypes describe individual data elements. • HDF5 dataspaces describe the logical layout of the data elements.
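The split between datatype and dataspace on this slide can be made concrete with a little arithmetic. The following is a minimal sketch in plain Python, not the HDF5 library API: it assumes Python's `struct` code `"<i"` as a stand-in for "Integer 32bit LE", and the helper `flat_index` is a hypothetical name used only to show how a logical index maps to a row-major position.

```python
import struct
from math import prod

# Datatype from the slide: 32-bit little-endian integer.
# "<i" is Python's struct code for that layout (an assumption of this
# sketch, not an HDF5 identifier).
ELEMENT_FORMAT = "<i"

# Dataspace from the slide: rank 3 with Dim_0 = 4, Dim_1 = 5, Dim_2 = 7.
dims = (4, 5, 7)

rank = len(dims)                                # number of dimensions
element_count = prod(dims)                      # elements described by the dataspace
element_size = struct.calcsize(ELEMENT_FORMAT)  # bytes per element, from the datatype
raw_bytes = element_count * element_size        # size of the raw data values


def flat_index(index, dims):
    """Row-major (C-order) position of a multi-dimensional index."""
    position = 0
    for i, n in zip(index, dims):
        if not 0 <= i < n:
            raise IndexError(f"index {i} out of range for dimension of size {n}")
        position = position * n + i
    return position


print(rank, element_count, element_size, raw_bytes)  # 3 140 4 560
print(flat_index((3, 4, 6), dims))                   # 139, the last element
```

The datatype alone fixes the size of one element and the dataspace alone fixes how many elements there are; only together do they describe the dataset's raw data.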

  16. HDF5 Datatypes • Describe individual data elements in an HDF5 dataset • Wide range of datatypes supported • integer, float, unsigned, bitfield, … • user-defined (e.g., 13-bit integer) • variable length types (e.g., strings) • reference to object; reference to dataset region • enumerations - names mapped to integers • opaque • array • compound (similar to C structs) • Can be named and shared • Committed datatypes (a.k.a. named datatypes)
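Since the slide compares compound datatypes to C structs, a struct-style record may help picture one. This is a rough sketch in plain Python rather than the HDF5 API: the field names come from the lat/lon/temp table on slide 14, while the field types (two 32-bit integers and one 32-bit float, little-endian) and the names `Reading`, `RECORD`, `encode`, and `decode` are assumptions made for illustration.

```python
import struct
from collections import namedtuple

# One data element with three named fields, similar to a C struct.
Reading = namedtuple("Reading", ["lat", "lon", "temp"])

# Assumed layout: int32, int32, float32, little-endian, no padding.
RECORD = struct.Struct("<iif")


def encode(reading):
    """Pack one Reading into its fixed-size binary form."""
    return RECORD.pack(reading.lat, reading.lon, reading.temp)


def decode(raw):
    """Unpack the binary form back into a Reading."""
    return Reading(*RECORD.unpack(raw))


raw = encode(Reading(lat=12, lon=23, temp=3.1))
restored = decode(raw)

print(RECORD.size)                 # 12 bytes per element
print(restored.lat, restored.lon)  # 12 23
# temp comes back as the nearest 32-bit float, so compare with a tolerance.
print(abs(restored.temp - 3.1) < 1e-6)
```

Every element of a dataset with this datatype would have the same three fields, which is what "identically typed data elements" means for compound data.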

  17. HDF5 Dataset & Datatype • Multi-dimensional array of identically typed data elements • Specifications for single data element • Example HDF5 datatype: Integer, 32-bit, LE • HDF5 datasets organize and contain “raw data values”. • HDF5 datatypes describe individual data elements.

  18. HDF5 Dataspaces • Describe the logical layout of the elements in an HDF5 dataset • NULL • no elements • Scalar • single element • Simple array (most common) • multiple elements organized in a rectangular, multi-dimensional array • rank = number of dimensions (1-32) • dimension sizes = number of elements in each dimension • maximum number of elements in each dimension • may be fixed or unlimited
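The relationship between current dimension sizes and maximum sizes in a simple dataspace can be sketched as a small validation rule. The code below is a plain-Python model and not the HDF5 API: the class `SimpleDataspace`, its `resized` method, the `UNLIMITED` marker, and the example maximum sizes are all hypothetical; only the rank limit of 1 to 32 and the fixed-or-unlimited maximum come from the slide.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

UNLIMITED = None  # marker chosen for this sketch: no upper bound


@dataclass(frozen=True)
class SimpleDataspace:
    dims: Tuple[int, ...]                 # current size of each dimension
    maxdims: Tuple[Optional[int], ...]    # maximum size, or UNLIMITED

    def __post_init__(self):
        if not 1 <= len(self.dims) <= 32:
            raise ValueError("rank must be between 1 and 32")
        if len(self.dims) != len(self.maxdims):
            raise ValueError("dims and maxdims must have the same rank")
        for size, limit in zip(self.dims, self.maxdims):
            if limit is not UNLIMITED and size > limit:
                raise ValueError(f"size {size} exceeds maximum {limit}")

    @property
    def rank(self):
        return len(self.dims)

    def resized(self, new_dims):
        """Return a dataspace with new sizes, validated against maxdims."""
        return SimpleDataspace(tuple(new_dims), self.maxdims)


# Hypothetical example: the first dimension may grow, the others are fixed.
space = SimpleDataspace(dims=(4, 5, 7), maxdims=(UNLIMITED, 5, 7))
grown = space.resized((8, 5, 7))
print(grown.rank, grown.dims)  # 3 (8, 5, 7)

try:
    space.resized((4, 6, 7))   # second dimension is fixed at 5
except ValueError as error:
    print("rejected:", error)
```

An unlimited dimension is what would let a dataset keep accepting new data, for example one more time step, without its layout being redefined.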

  19. HDF5 Dataset & Dataspace • Multi-dimensional array of identically typed data elements • Specifications for array dimensions • Example HDF5 dataspace: Rank = 3; Dimensions: Dim_0 = 4, Dim_1 = 5, Dim_2 = 7 • HDF5 datasets organize and contain “raw data values”. • HDF5 dataspaces describe the logical layout of the data elements.

  20. HDF5 Dataset • Multi-dimensional array of identically typed data elements • Specifications for single data element and array dimensions • Example HDF5 datatype: Integer, 32-bit, LE • Example HDF5 dataspace: Rank = 3; Dimensions: Dim_0 = 4, Dim_1 = 5, Dim_2 = 7 • HDF5 datasets organize and contain “raw data values”. • HDF5 datatypes describe individual data elements. • HDF5 dataspaces describe the logical layout of the data elements.

  21. Compound Datatype in HDF5 Dataset • HDF5 datatype: a compound datatype (shown as a committed datatype) with four fields, drawn in purple, green, red, and blue • Field types listed on the slide: int8, int4, int16, 2x3x2 array of float32 • HDF5 dataspace: Rank = 2; DimSizes = 5 x 3; MaxDimSizes = unlimited x 3

  22. HDF5 Data Model: Are we there yet? • HDF5 objects covered so far: File, Dataset, Datatype, Dataspace • Still to cover: Attribute, Group and Link

  23. HDF5 Attributes • Typically contain user metadata • Have a name and a value • May be associated with • HDF5 datasets • HDF5 committed datatypes • HDF5 groups • Value is described by a datatype and a dataspace • analogous to a dataset

  24. HDF5 Groups and Links • HDF5 groups and links organize data objects. • Every HDF5 file has a root group, “/”. • [Figure: example hierarchy under the root group “/”, with objects labeled SimOut, Viz, Parameters (10;100;1000), Timestep (36,000), the experiment notes (Serial Number: 99378920; Date: 3/13/09; Configuration: Standard 3), and the lat/lon/temp table]
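The way groups and links organize objects can be modeled as named links followed one path component at a time. This is a minimal plain-Python sketch, not the HDF5 API: the `Group` class and `resolve` function are hypothetical, the names SimOut, Viz, and Parameters come from the slide's figure, and the nesting shown here, including the second link named `Params`, is an assumption since the figure's layout did not survive in the transcript.

```python
class Group:
    """A container of named links; each link points at another object."""

    def __init__(self):
        self.links = {}

    def link(self, name, obj):
        """Create a link called `name` to `obj` and return `obj`."""
        self.links[name] = obj
        return obj


def resolve(root, path):
    """Follow an absolute path such as '/SimOut/Parameters' from the root group."""
    node = root
    for name in path.strip("/").split("/"):
        if name:  # an empty name means the path was just "/"
            node = node.links[name]
    return node


root = Group()                                    # every file has a root group "/"
sim_out = root.link("SimOut", Group())
viz = root.link("Viz", Group())
parameters = sim_out.link("Parameters", [10, 100, 1000])

# A second link to the same object: it is reachable by two paths, stored once.
viz.link("Params", parameters)

print(resolve(root, "/SimOut/Parameters"))                                  # [10, 100, 1000]
print(resolve(root, "/Viz/Params") is resolve(root, "/SimOut/Parameters"))  # True
print(resolve(root, "/") is root)                                           # True
```

Keeping links separate from the objects they name is what allows one object to appear in more than one place in the hierarchy.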

  25. Quiz • Use objects from the HDF5 data model to design an HDF5 file to store daily temperature measurements made at various locations throughout a building. • [Figure: one possible design under the root group “/”, labeled Building: SRP-Z and Sensor Type: Temperature, with January, February, … and array axes labeled day of month (1, 2, … 31), meters south, meters east, and meters above ground]
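One way to think through the quiz is to write the design down as data before touching any library. The sketch below is only one possible answer, expressed in plain Python and not the HDF5 API: it assumes one dataset per month with axes (day of month, meters south, meters east, meters above ground), file-level attributes for the building and sensor type, a float32 element type, and hypothetical values for `GRID` and `YEAR`, none of which the slide specifies.

```python
import calendar

# Hypothetical measurement grid: meters south, meters east, meters above ground.
GRID = (10, 20, 3)
# Hypothetical year, needed only to know how many days each month has.
YEAR = 2009
MONTH_NAMES = {1: "January", 2: "February"}


def month_dataset_shape(year, month, grid=GRID):
    """Dataspace dimensions for one month: (day of month,) + spatial grid."""
    days_in_month = calendar.monthrange(year, month)[1]
    return (days_in_month,) + grid


design = {
    # Attributes on the root group "/" carry the user metadata.
    "attributes": {"Building": "SRP-Z", "Sensor Type": "Temperature"},
    # One dataset per month, linked from the root group.
    "datasets": {
        name: {"datatype": "float32", "shape": month_dataset_shape(YEAR, month)}
        for month, name in MONTH_NAMES.items()
    },
}

for name, spec in design["datasets"].items():
    print(f"/{name}: {spec['datatype']} {spec['shape']}")
# /January: float32 (31, 10, 20, 3)
# /February: float32 (28, 10, 20, 3)
```

Other designs are equally defensible, for instance a single dataset with an unlimited day dimension, or one group per location; the exercise is about choosing which model objects carry which information.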

  26. Review • HDF5 consists of • file format • software • data model • file, dataset, datatype, dataspace, attribute, group, link • HDF5 designed to support • management of high-volume, complex data • data sharing and preservation

  27. Test Data and HDF5 Data Model Objects

  28. Stretch Break
