
Reading HDF family of formats via NetCDF-Java / CDM

John Caron, UCAR/Unidata

Presentation Transcript


  1. Reading HDF family of formats via NetCDF-Java / CDM • John Caron, UCAR/Unidata

  2. NetCDF-Java library • 100% Java • Open Source (LGPL, MIT) • Independent implementation • Used as a component in other software (partial) • Integrated Data Viewer, THREDDS Data Server (Unidata) • Panoply (NASA) • ncBrowse (EPIC/NOAA) • Java NEXRAD Viewer (NCDC/NOAA) • MyWorld GIS (Northwestern) • EDC for ArcGIS, ERDDAP (SWFSC/NOAA) • Live Access Server (PMEL/NOAA) • ncWMS (Reading) • Matlab plug-in (USGS)

  3. NetCDF-Java / CDM architecture (diagram; labels only) • Application • THREDDS Catalog.xml • Scientific Feature Types • Datatype Adapter • NetcdfDataset • CoordSystem Builder • NetcdfFile • I/O service provider • OPeNDAP, NcML, NetCDF-3, NetCDF-4, HDF5, GRIB, GINI, NIDS, Nexrad, DMSP, …

  4. Format Readers (IOSP) • General: NetCDF, HDF5, HDF4, OPeNDAP • Gridded: GRIB-1, GRIB-2, GEMPAK • Radar: NEXRAD 2&3, DORADE, CINRAD, Universal Format • Point: BUFR, ASCII • Satellite: DMSP, GINI, McIDAS AREA • Misc: GTOPO, Lightning, etc • Others in development (partial): • AVHRR, GPCP, GACP, SRB, SSMI, HIRS (NCDC)
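The idea of one reader per format can be illustrated with a minimal sketch of format detection by magic number, written in Python for brevity. This is not the NetCDF-Java IOSP interface: `MAGIC`, `sniff_format`, and `sniff_file` are hypothetical names, only a few formats from the slide are covered, and only the start of the file is checked (real files may carry the signature at a later offset).

```python
# Hypothetical sketch of IOSP-style format detection; not the NetCDF-Java API.
# Each entry pairs a format name with the signature found at the start of a file.
MAGIC = [
    ("HDF5", b"\x89HDF\r\n\x1a\n"),            # also matches NetCDF-4, which is stored as HDF5
    ("HDF4", b"\x0e\x03\x13\x01"),
    ("NetCDF-3", b"CDF\x01"),
    ("NetCDF-3 (64-bit offset)", b"CDF\x02"),
    ("GRIB", b"GRIB"),
]


def sniff_format(header: bytes) -> str:
    """Return the first format whose signature starts the header, else 'unknown'."""
    for name, magic in MAGIC:
        if header.startswith(magic):
            return name
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 8 bytes of a file and classify them."""
    with open(path, "rb") as f:
        return sniff_format(f.read(8))
```

A dispatcher built this way asks each registered reader in turn whether it recognizes the file, which is why adding a new format does not require touching the others.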

  5. Lines of Code (est.)

  6. Why all the trouble? • ~20-40% C/C++ time spent on portability issues • Platform Independence • Linux, Solaris, Windows (Sun) • Mac OS X (Apple) • AIX, Linux, Windows, z/OS (IBM) • HP-UX (Hewlett-Packard) • Programmer productivity • Object-Oriented • Garbage Collected – no memory leaks • Rich libraries • Open source • Faster than C for some applications

  7. Independent implementation • Written entirely from reading HDF4, HDF5 file specifications • Helped debug (HDF5), validate file specs • File format spec is what will be needed in 100 years to read legacy data • OTOH, semantics not always obvious • Don’t confuse reference implementation with the file/protocol specification

  8. HDF family of formats • HDF5/NetCDF-4 • HDF4 • HDF-EOS • Note: read-only, no parallel I/O, etc.

  9. HDF5/NetCDF-4 • Goal is to read all HDF5 • Can read all HDF5 files for which we have examples • including references, soft links • Complete coverage difficult to guarantee – combinatorial explosion • Some esoteric features we are skipping • File drivers, external files, szip compression • Working on a comprehensive test harness • JNI interface to NetCDF-4/HDF5 library • read every byte and compare
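The "read every byte and compare" step of such a harness might look like the sketch below. It assumes each implementation can be wrapped as a function from a variable name to that variable's raw bytes; the `Reader` type, `first_mismatch`, and `compare_readers` are hypothetical names, and nothing here reflects how the actual harness was built.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical reader type: variable name -> raw bytes of that variable.
Reader = Callable[[str], bytes]


def first_mismatch(a: bytes, b: bytes) -> Optional[int]:
    """Return the index of the first differing byte, or None if the buffers are identical."""
    if a == b:
        return None
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # No differing byte in the common prefix, so one buffer is shorter.
    return min(len(a), len(b))


def compare_readers(names: Iterable[str], reference: Reader, candidate: Reader) -> List[Tuple[str, int]]:
    """Read every variable with both readers; report (name, offset) for each mismatch."""
    failures = []
    for name in names:
        offset = first_mismatch(reference(name), candidate(name))
        if offset is not None:
            failures.append((name, offset))
    return failures
```

Reporting the offset of the first difference, instead of a plain pass/fail, makes it easier to tell a byte-order or padding bug from a wholly wrong read.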

  10. HDF4 / HDF-EOS • Complete, works against all examples • Tested against 400 sample files (27 GB) • thanks to Ruth Duerr (NSIDC) • Spot checked against HDFView • Need systematic test to compare reading against the HDF4 C Library

  11. Geolocation Primer

  12. Swath Float lat(245, 33477); Float lon(245, 33477); Float time(33477); Float data(245, 33477); You just have to know that it's swath data • 245 points cross track • 33477 along the track • Each scan has a time coordinate

  13. Swath Float lat(33477, 245); Float lon(33477, 245); Float time(33477); Float data(245, 33477);

  14. Swath Float lat(999,999); Float lon(999,999); Float time(999); Float data(999,999);

  15. Swath Float v1(999, 999); Float v2(999, 999); Float v3(999); Float v4(999,999);

  16. If you write data • Don’t rely on variable name conventions • Don’t rely on index ordering • Don’t rely on matching index sizes • Minimize “you just have to know that…”

  17. Dimensions Dimensions: d1=999; d2=999; Variables: float v1(d1=999, d2=999); float v2(d1=999, d2=999); float v3(d2=999); float v4(d2=999,d1=999);

  18. Good Variables: float v1(d1=999, d2=999); v1:standard_name = “Latitude”; float v2(d1=999, d2=999); v2:standard_name = “Longitude”; float v3(d2=999); v3:standard_name = “Time”; float v4(d2=999,d1=999); Data_type = “Swath”; Conventions = “My unique name”;
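With named dimensions and `standard_name` attributes in place, a reader can find the georeferencing coordinates of a data variable without guessing. The sketch below models the "Good" file as plain dictionaries; the layout of `variables` and the helpers `coordinates_for` and `axis_of` are assumptions made for illustration, not any library's data model.

```python
# Hypothetical in-memory model of the "Good" example: each variable lists its
# named dimensions and its attributes.
variables = {
    "v1": {"dims": ("d1", "d2"), "attrs": {"standard_name": "Latitude"}},
    "v2": {"dims": ("d1", "d2"), "attrs": {"standard_name": "Longitude"}},
    "v3": {"dims": ("d2",), "attrs": {"standard_name": "Time"}},
    "v4": {"dims": ("d2", "d1"), "attrs": {}},
}


def coordinates_for(var_name, variables):
    """Map standard_name -> variable name for coordinates that share var_name's dimensions."""
    target = set(variables[var_name]["dims"])
    found = {}
    for name, var in variables.items():
        role = var["attrs"].get("standard_name")
        # A coordinate applies if it is labeled and all of its dimensions belong to the target.
        if name != var_name and role and set(var["dims"]) <= target:
            found[role] = name
    return found


def axis_of(var_name, dim_name, variables):
    """Index of the named dimension within the variable's shape."""
    return variables[var_name]["dims"].index(dim_name)


coords = coordinates_for("v4", variables)
time_axis = axis_of("v4", variables[coords["Time"]]["dims"][0], variables)
```

Here `coords` identifies `v1`, `v2`, and `v3` as latitude, longitude, and time for `v4`, and `time_axis` shows that time runs along the first index of `v4`, even though both dimensions have length 999.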

  19. If you write data • Unique signature • Specify dimensions • Identify georeferencing coordinates • Identify data type • Units are not optional

  20. HDF-EOS, HDF-EOS2 • Read “structural metadata” field to obtain more semantics • Parse text in “ODL” • Data type: Swath, Grid, Point • Dimensions • Geolocation coordinate variable types: Latitude, Longitude, Time
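Parsing the ODL text can be sketched with a small stack-based parser. The sample below is made up for illustration and only imitates the nested GROUP/OBJECT layout of structural metadata; `SAMPLE_ODL`, `parse_value`, and `parse_odl` are hypothetical names, and the parser ignores ODL features such as comments, continuation lines, and commas inside quoted strings.

```python
# Made-up sample in the style of HDF-EOS structural metadata (not from a real file).
SAMPLE_ODL = '''\
GROUP=SwathStructure
  GROUP=SWATH_1
    SwathName="MySwath"
    GROUP=Dimension
      OBJECT=Dimension_1
        DimensionName="scan"
        Size=33477
      END_OBJECT=Dimension_1
      OBJECT=Dimension_2
        DimensionName="pixel"
        Size=245
      END_OBJECT=Dimension_2
    END_GROUP=Dimension
    GROUP=GeoField
      OBJECT=GeoField_1
        GeoFieldName="Latitude"
        DimList=("scan","pixel")
      END_OBJECT=GeoField_1
    END_GROUP=GeoField
  END_GROUP=SWATH_1
END_GROUP=SwathStructure
END
'''


def parse_value(text):
    """Convert an ODL value to a list, string, or int; fall back to the raw text."""
    text = text.strip()
    if text.startswith("(") and text.endswith(")"):
        return [parse_value(item) for item in text[1:-1].split(",")]
    if text.startswith('"') and text.endswith('"'):
        return text[1:-1]
    try:
        return int(text)
    except ValueError:
        return text


def parse_odl(text):
    """Parse GROUP/OBJECT nesting into nested dicts keyed by group or object name."""
    root = {}
    stack = [root]
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "END":
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key in ("GROUP", "OBJECT"):
            child = {}
            stack[-1][value] = child
            stack.append(child)
        elif key in ("END_GROUP", "END_OBJECT"):
            stack.pop()
        else:
            stack[-1][key] = parse_value(value)
    return root


swath = parse_odl(SAMPLE_ODL)["SwathStructure"]["SWATH_1"]
```

From the parsed tree, `swath["Dimension"]` yields the named dimensions and `swath["GeoField"]` yields the geolocation fields with their dimension lists, which is the extra semantics the slide refers to.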

  21. HDF-EOS, HDF-EOS2 • Good • Unique signature, identify coordinates and data type • Not so good • ODL • Not using hdf4/5 constructs • Bad • No data units • No time coordinate units!

  22. Better EOS Variables: float v1(999, 999); v1:standard_name = “Latitude”; v1:dims = “d1 d2”; float v2(999, 999); v2:standard_name = “Longitude”; v2:dims = “d1 d2”; float v3(999); v3:standard_name = “Time”; v3:dims = “d2”; float v4(999,999); v4:dims = “d2 d1”;
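A `dims` attribute like the one above lets a reader recover shared dimensions from anonymous shapes and check them for consistency. The following is a minimal sketch under that assumption; `named_shape` and the `dim_sizes` registry are hypothetical names.

```python
def named_shape(shape, dims_attr, dim_sizes):
    """Pair each axis length with the dimension name given in a space-separated dims attribute.

    dim_sizes is a shared registry (name -> length) that is updated in place, so a
    dimension declared with two different lengths is reported as an error.
    """
    names = dims_attr.split()
    if len(names) != len(shape):
        raise ValueError(f"dims lists {len(names)} names for a rank-{len(shape)} variable")
    for name, size in zip(names, shape):
        known = dim_sizes.setdefault(name, size)
        if known != size:
            raise ValueError(f"dimension {name!r} has length {known}, but found {size}")
    return list(zip(names, shape))


sizes = {}
v1 = named_shape((999, 999), "d1 d2", sizes)
v3 = named_shape((999,), "d2", sizes)
v4 = named_shape((999, 999), "d2 d1", sizes)
```

After these calls `sizes` holds one entry per dimension, and `v4` records that its first axis is `d2`, which the shape `(999, 999)` alone could not tell you.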

  23. NPP (i1.4.0.3_NPP_QUAL) • Good • XML better than ODL • Not so good • Not using hdf4/5 constructs • Bad • No data units • No time coordinate units! • Fatal Error: please reboot • Metadata not in the same file

  24. Summary • NetCDF-Java reads entire HDFx family • Good for Java-philes • Needs more testing • Send example files, $ • Dimensions are not optional • Keep structural and georeferencing metadata in the same file as the data • Can also have specialized external files

  25. Contact caron@ucar.edu Google “netcdf java”

  26. NetCDF-4 and Common Data Model (Data Access Layer)

  27. Dimension primer Float lat(180); Float lon(360); Float alt(20); Float time(1200); Float data(1200,20,180,360);

  28. Unique Name! Float lfip(lfip=180); Float lflop(lflop=180); Float zorg(zorg=20); Float skdf(skdf=1200); Float dglot(skdf=1200,zorg=20, lfip=180,lflop=180);

  29. Float lfip(180); Float lflop(180); Float zorg(20); Float freebish(1200); Float dglot(1200,20,180,180);

  30. Float lat(180); Float lon(180); Float alt(20); Float time(1200); Float data(1200,20,180,180);
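The last four slides can be read as an argument against matching coordinates to data axes by length alone. The sketch below makes that concrete; `match_axes_by_size` and `is_ambiguous` are hypothetical helpers written for this illustration.

```python
def match_axes_by_size(data_shape, coord_sizes):
    """For each axis of the data variable, list the coordinates whose length matches."""
    return [
        [name for name, size in coord_sizes.items() if size == axis_len]
        for axis_len in data_shape
    ]


def is_ambiguous(matches):
    """True if any axis has zero or several candidate coordinates."""
    return any(len(candidates) != 1 for candidates in matches)


# Slide 27: all lengths differ, so size matching happens to work.
distinct = match_axes_by_size(
    (1200, 20, 180, 360), {"lat": 180, "lon": 360, "alt": 20, "time": 1200}
)

# Slide 30: lat and lon both have length 180, so the last two axes cannot be told apart.
clashing = match_axes_by_size(
    (1200, 20, 180, 180), {"lat": 180, "lon": 180, "alt": 20, "time": 1200}
)
```

In the first case every axis has exactly one candidate; in the second, the last two axes each match both `lat` and `lon`, so only named, shared dimensions resolve the question.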
