Reading HDF family of formats via NetCDF-Java / CDM

Presentation Transcript

  1. Reading HDF family of formats via NetCDF-Java / CDM • John Caron, UCAR/Unidata

  2. NetCDF-Java library • 100% Java • Open Source (LGPL, MIT) • Independent implementation • Used as a component in other software (partial list) • Integrated Data Viewer, THREDDS Data Server (Unidata) • Panoply (NASA) • ncBrowse (EPIC/NOAA) • Java NEXRAD Viewer (NCDC/NOAA) • MyWorld GIS (Northwestern) • EDC for ArcGIS, ERDDAP (SWFSC/NOAA) • Live Access Server (PMEL/NOAA) • ncWMS (Reading) • Matlab plug-in (USGS)

  3. NetCDF-Java / CDM architecture (diagram) • Labels: Application • Scientific Feature Types • Datatype Adapter • NetcdfDataset • CoordSystem Builder • NetcdfFile • I/O service provider • THREDDS Catalog.xml • NcML • OPeNDAP • NetCDF-3 • NetCDF-4 • HDF5 • GRIB • GINI • NIDS • Nexrad • DMSP • …

  4. Format Readers (IOSP) • General: NetCDF, HDF5, HDF4, OPeNDAP • Gridded: GRIB-1, GRIB-2, GEMPAK • Radar: NEXRAD 2&3, DORADE, CINRAD, Universal Format • Point: BUFR, ASCII • Satellite: DMSP, GINI, McIDAS AREA • Misc: GTOPO, Lightning, etc • Others in development (partial): • AVHRR, GPCP, GACP, SRB, SSMI, HIRS (NCDC)

  5. Lines of code (est.)

  6. Why all the trouble? • ~20-40% C/C++ time spent on portability issues • Platform Independence • Linux, Solaris, Windows (Sun) • Mac OS X (Apple) • AIX, Linux, Windows, z/OS (IBM) • HP-UX (Hewlett-Packard) • Programmer productivity • Object-Oriented • Garbage Collected – no memory leaks • Rich libraries • Open source • Faster than C for some applications

  7. Independent implementation • Written entirely from reading HDF4, HDF5 file specifications • Helped debug (HDF5), validate file specs • File format spec is what will be needed in 100 years to read legacy data • OTOH, semantics not always obvious • Don’t confuse reference implementation with the file/protocol specification

  8. HDF family of formats • HDF5/NetCDF-4 • HDF4 • HDF-EOS • Note: read-only, no parallel I/O, etc.

  9. HDF5/NetCDF-4 • Goal is to read all HDF5 • Can read all HDF5 files for which we have examples • including references, soft links • Complete coverage difficult to guarantee – combinatorial explosion • Some esoteric features we are skipping • File drivers, external files, szip compression • Working on a comprehensive test harness • JNI interface to NetCDF-4/HDF5 library • read every byte and compare

  10. HDF4 / HDF-EOS • Complete, works against all examples • Tested against 400 sample files (27 GB) • thanks to Ruth Duerr (NSIDC) • Spot-checked against HDFView • Need a systematic test to compare reading against the HDF4 C library

  11. Geolocation Primer

  12. Swath Float lat(245, 33477); Float lon(245, 33477); Float time(33477); Float data(245, 33477); You just have to know that it's swath data • 245 points cross track • 33477 along the track • Each scan has a time coordinate

  13. Swath Float lat(33477, 245); Float lon(33477, 245); Float time(33477); Float data(245, 33477);

  14. Swath Float lat(999,999); Float lon(999,999); Float time(999); Float data(999,999);

  15. Swath Float v1(999, 999); Float v2(999, 999); Float v3(999); Float v4(999,999);
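Slides 12 to 15 progressively remove the clues a reader might use to guess the geolocation: first the index ordering, then the distinct sizes, then the names. The following is a minimal sketch of why shape matching alone breaks down. The helper `axes_matching` is hypothetical and is not part of NetCDF-Java; it only checks which axes of a data array have the same length as a 1-D coordinate.

```python
def axes_matching(data_shape, coord_length):
    """Return the indices of the data axes whose length equals coord_length."""
    return [axis for axis, size in enumerate(data_shape) if size == coord_length]


# Slide 12: sizes differ, so the time(33477) axis can be guessed from the shape.
slide12_axes = axes_matching((245, 33477), 33477)

# Slide 14: both axes are 999 long, so the guess is ambiguous.
slide14_axes = axes_matching((999, 999), 999)

print(slide12_axes)  # [1]
print(slide14_axes)  # [0, 1]
```

With equal sizes, a reader cannot tell from the shapes which axis is along-track, which is the case slide 16 warns against.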

  16. If you write data • Don’t rely on variable name conventions • Don’t rely on index ordering • Don’t rely on matching index sizes • Minimize “you just have to know that…”

  17. Dimensions Dimensions d1=999; d2=999; Variables: float v1(d1=999, d2=999); float v2(d1=999, d2=999); float v3(d2=999); float v4(d2=999,d1=999);

  18. Good Variables: float v1(d1=999, d2=999); v1:standard_name = “Latitude”; float v2(d1=999, d2=999); v2:standard_name = “Longitude”; float v3(d2=999); v3:standard_name = “Time”; float v4(d2=999,d1=999); Data_type = “Swath”; Conventions = “My unique name”;
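To illustrate what the "Good" example buys a reader, here is a rough sketch that finds the coordinates of `v4` using only named dimensions and the `standard_name` attribute. The `Variable` class and `find_coordinates` function are hypothetical stand-ins written for this illustration; they are not the NetCDF-Java API, and the actual coordinate system builders may work differently.

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    """Hypothetical in-memory stand-in for a variable declaration."""
    name: str
    dims: tuple
    attrs: dict = field(default_factory=dict)


def find_coordinates(variables, data_var):
    """Map standard_name -> variable name for every labeled variable
    whose dimensions are all shared with data_var."""
    data_dims = set(data_var.dims)
    coords = {}
    for var in variables:
        role = var.attrs.get("standard_name")
        if var is data_var or role is None:
            continue
        if set(var.dims) <= data_dims:
            coords[role] = var.name
    return coords


# The declarations from the slide, with dimension names instead of bare sizes.
v1 = Variable("v1", ("d1", "d2"), {"standard_name": "Latitude"})
v2 = Variable("v2", ("d1", "d2"), {"standard_name": "Longitude"})
v3 = Variable("v3", ("d2",), {"standard_name": "Time"})
v4 = Variable("v4", ("d2", "d1"))

coords = find_coordinates([v1, v2, v3, v4], v4)
time_axis = v4.dims.index(v3.dims[0])  # which axis of v4 the time coordinate indexes

print(coords)     # {'Latitude': 'v1', 'Longitude': 'v2', 'Time': 'v3'}
print(time_axis)  # 0
```

Even though every dimension has length 999 and the variable names carry no meaning, the dimension names and attributes are enough to resolve the roles and the transposed index order of `v4`.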

  19. If you write data • Unique signature • Specify dimensions • Identify georeferencing coordinates • Identify data type • Units are not optional

  20. HDF-EOS, HDF-EOS2 • Read “structural metadata” field to obtain more semantics • Parse text in “ODL” • Data type: Swath, Grid, Point • Dimensions • Geolocation coordinate variable types: Latitude, Longitude, Time
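As a rough illustration of the ODL parsing step, the sketch below turns nested `GROUP` / `OBJECT` blocks into Python dictionaries. The `SAMPLE` text is hand-written for this example in the general style of HDF-EOS structural metadata; it is not taken from a real file. The functions `parse_odl` and `parse_value` are hypothetical, and the parser ignores complications such as multi-line values or commas inside quoted strings.

```python
SAMPLE = """\
GROUP=SwathStructure
  GROUP=SWATH_1
    SwathName="MySwath"
    GROUP=Dimension
      OBJECT=Dimension_1
        DimensionName="d1"
        Size=999
      END_OBJECT=Dimension_1
    END_GROUP=Dimension
    GROUP=GeoField
      OBJECT=GeoField_1
        GeoFieldName="Latitude"
        DimList=("d1","d2")
      END_OBJECT=GeoField_1
    END_GROUP=GeoField
  END_GROUP=SWATH_1
END_GROUP=SwathStructure
END
"""


def parse_value(value):
    """Convert an ODL value: (a,b) -> list, "text" -> str, digits -> int."""
    if value.startswith("(") and value.endswith(")"):
        return [parse_value(item.strip()) for item in value[1:-1].split(",")]
    if value.startswith('"') and value.endswith('"'):
        return value[1:-1]
    try:
        return int(value)
    except ValueError:
        return value  # leave unquoted tokens as plain strings


def parse_odl(text):
    """Parse nested GROUP/OBJECT blocks into nested dictionaries."""
    root = {}
    stack = [root]
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "END":
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key in ("GROUP", "OBJECT"):
            child = {}
            stack[-1][value] = child
            stack.append(child)
        elif key in ("END_GROUP", "END_OBJECT"):
            stack.pop()
        else:
            stack[-1][key] = parse_value(value)
    return root


tree = parse_odl(SAMPLE)
swath = tree["SwathStructure"]["SWATH_1"]
print(swath["Dimension"]["Dimension_1"]["Size"])    # 999
print(swath["GeoField"]["GeoField_1"]["DimList"])   # ['d1', 'd2']
```

Once the text is parsed, the data type (here a swath), the named dimensions, and the geolocation fields can be read straight from the tree.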

  21. HDF-EOS, HDF-EOS2 • Good • Unique signature, identify coordinates and data type • Not so good • ODL • Not using hdf4/5 constructs • Bad • No data units • No time coordinate units!

  22. Better EOS Variables: float v1(999, 999); v1:standard_name = “Latitude”; v1:dims = “d1 d2”; float v2(999, 999); v2:standard_name = “Longitude”; v2:dims = “d1 d2”; float v3(999); v3:standard_name = “Time”; v3:dims = “d2”; float v4(999,999); v4:dims = “d2 d1”;
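A reader could recover shared dimensions from a `dims` attribute like the one proposed here with very little code. This is a minimal sketch, assuming the attribute is a space-separated list of dimension names in index order; the function name `named_shape` is hypothetical.

```python
def named_shape(shape, dims_attr):
    """Pair each axis length with the dimension name given in the dims attribute."""
    names = dims_attr.split()
    if len(names) != len(shape):
        raise ValueError("dims attribute does not match the variable's rank")
    return list(zip(names, shape))


print(named_shape((999, 999), "d1 d2"))  # [('d1', 999), ('d2', 999)]
print(named_shape((999, 999), "d2 d1"))  # [('d2', 999), ('d1', 999)]
print(named_shape((999,), "d2"))         # [('d2', 999)]
```

The second call shows the point of the slide: `v4` has the same shape as `v1`, but the attribute reveals that its axes are in the opposite order.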

  23. NPP (i1.4.0.3_NPP_QUAL) • Good • XML better than ODL • Not so good • Not using hdf4/5 constructs • Bad • No data units • No time coordinate units! • Fatal Error: please reboot • Metadata not in the same file

  24. Summary • NetCDF-Java reads entire HDFx family • Good for Java-philes • Needs more testing • Send example files, $ • Dimensions are not optional • Keep structural and georeferencing metadata in the same file as the data • Can also have specialized external files

  25. Contact caron@ucar.edu Google “netcdf java”

  26. NetCDF-4 and Common Data Model (Data Access Layer)

  27. Dimension primer Float lat(180); Float lon(360); Float alt(20); Float time(1200); Float data(1200,20,180,360);

  28. Unique Name! Float lfip(lfip=180); Float lflop(lflop=180); Float zorg(zorg=20); Float skdf(skdf=1200); Float dglot(skdf=1200,zorg=20, lfip=180,lflop=180);

  29. Float lfip(180); Float lflop(180); Float zorg(20); Float freebish(1200); Float dglot(1200,20,180,180);

  30. Float lat(180); Float lon(180); Float alt(20); Float time(1200); Float data(1200,20,180,180);