Unidata Infrastructure for Data Services
  • Russ Rew
  • GO-ESSP Workshop, LLNL
  • 2006-06-19
Some Current Unidata Infrastructure Projects
  • LDM for distributing and processing near real-time data
  • Integrated Data Viewer (IDV) for testing infrastructure in platform-independent data visualization and analysis
  • NetCDF C-based interfaces for data access
  • CFIOlib for a CF conventions API (tomorrow)
  • NetCDF Java for advanced data access infrastructure
  • Common Data Model for improving interoperability
  • NcML for metadata annotation and data aggregation
  • THREDDS Data Server (TDS) for remote access to archives
  • GALEON for serving netCDF data through OGC Web Coverage Services (WCS)
LDM-6 for Internet Data Distribution
  • Implements a peer-to-peer system for reliable, event-driven data distribution
  • Supports subscriptions to many near real-time data feeds; no data center needed
  • Data product abstraction is general: model output, observations, text products, satellite data, radar, …
  • Protocols use persistent connections to achieve low latency
  • Highly configurable: inject, distribute, capture, filter, and process arbitrary data products
  • In continuous use by over 160 universities, NOAA, USGS, and NASA, internationally, and for THORPEX global ensembles (TIGGE), …
  • Candidate for use in new WMO weather information system
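
The event-driven, subscription-based distribution described above can be illustrated with a toy sketch. This is not LDM's protocol or API: the `Node` and `Product` classes, the feed names, and the product identifiers are all hypothetical, and everything runs in one process. It only shows the idea that a product is pushed to matching subscribers the moment it is injected, and that a downstream node can itself relay or capture products.

```python
import re


class Product:
    """A generic data product: the payload is opaque bytes (hypothetical class)."""

    def __init__(self, feed, ident, data):
        self.feed = feed    # hypothetical feed name, e.g. "MODEL"
        self.ident = ident  # product identifier used for pattern matching
        self.data = data


class Node:
    """A toy relay node that pushes each injected product to matching subscribers."""

    def __init__(self, name):
        self.name = name
        self._subs = []  # list of (feed, compiled pattern, action)

    def subscribe(self, feed, pattern, action):
        """Register an action for products on `feed` whose identifier matches `pattern`."""
        self._subs.append((feed, re.compile(pattern), action))

    def inject(self, product):
        """Deliver the product immediately (event-driven); return the delivery count."""
        delivered = 0
        for feed, pattern, action in self._subs:
            if feed == product.feed and pattern.search(product.ident):
                action(product)
                delivered += 1
        return delivered


upstream = Node("upstream")
downstream = Node("downstream")
captured = []

# The downstream node captures GRIB products locally.
downstream.subscribe("MODEL", r"\.grib$", captured.append)
# The upstream node relays matching products to the downstream node.
upstream.subscribe("MODEL", r"^gfs/", downstream.inject)

upstream.inject(Product("MODEL", "gfs/2006061900.grib", b"..."))
upstream.inject(Product("RADAR", "gfs/level2.grib", b"..."))      # wrong feed
upstream.inject(Product("MODEL", "nam/2006061900.grib", b"..."))  # pattern mismatch
```

Only the first product reaches `captured`; the other two are filtered out at the upstream node, so no polling or central data center is involved in the sketch.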
IDV (Integrated Data Viewer)
  • Freely available 100% Java reference application and framework for visualization and analysis of geoscience data
  • Provides integrated, time-synchronized 2-D and 3-D visualizations of model output, observations, and remotely sensed data, using the University of Wisconsin's VisAD
  • Handles diverse formats and protocols for local and remote access: GRIB, netCDF, OPeNDAP, ADDE, HTTP, GIS, …
  • Serves as end-to-end test for many Unidata technologies: THREDDS services, Java netCDF, XML bundles, plug-in architecture, interactive collaboration, …
NetCDF’s Niche
  • Simple data model for scientific datasets
  • Portable, self-describing data
  • Appendable, sharable, archivable
  • Direct access for efficient subsetting
  • Metadata via attribute conventions such as CF
  • Flexible remote access via OPeNDAP, HTTP, WCS
  • Lots of applications: NCO, ncbrowse, ncview, IDV, IDL, MATLAB, ArcGIS, ...
  • Language interfaces include C, Java, Fortran, C++, Perl, Python, Ruby, ...
NetCDF-3 Data Model

  • File
    • location: Filename
    • create( ), open( ), …
  • DataType
    • char, byte, short, int, float, double
  • Attribute
    • name: String
    • type: DataType
    • values: 1D array
  • Dimension
    • name: String
    • length: int
    • isUnlimited( )
  • Variable
    • name: String
    • shape: Dimension[ ]
    • type: DataType
    • array: read( ), …

Variables and attributes have one of six primitive data types.

A file has named variables, dimensions, and attributes. Variables also have attributes. Variables may share dimensions, indicating a common grid. One dimension may be of unlimited length.
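
The model above can be written down as a few small classes. This is a minimal in-memory sketch, not the netCDF library: the names `ClassicFile`, `add_dimension`, and `add_variable` are hypothetical, a length of `None` is this sketch's way of marking the unlimited dimension, and no data is read or written.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# The six primitive types of the netCDF-3 data model.
PRIMITIVE_TYPES = ("char", "byte", "short", "int", "float", "double")


@dataclass
class Dimension:
    name: str
    length: Optional[int]  # None marks the unlimited dimension in this sketch

    def is_unlimited(self):
        return self.length is None


@dataclass
class Attribute:
    name: str
    type: str
    values: list  # 1D array of values


@dataclass
class Variable:
    name: str
    type: str
    shape: Tuple[Dimension, ...]
    attributes: Dict[str, Attribute] = field(default_factory=dict)


class ClassicFile:
    """A file holds named dimensions, variables, and (global) attributes."""

    def __init__(self, location):
        self.location = location
        self.dimensions = {}
        self.variables = {}
        self.attributes = {}

    def add_dimension(self, name, length=None):
        already_unlimited = any(d.is_unlimited() for d in self.dimensions.values())
        if length is None and already_unlimited:
            raise ValueError("only one dimension may be of unlimited length")
        self.dimensions[name] = Dimension(name, length)
        return self.dimensions[name]

    def add_variable(self, name, type_, dim_names):
        if type_ not in PRIMITIVE_TYPES:
            raise ValueError("unsupported type: " + type_)
        shape = tuple(self.dimensions[n] for n in dim_names)
        self.variables[name] = Variable(name, type_, shape)
        return self.variables[name]


f = ClassicFile("example.nc")
f.add_dimension("time")  # the single unlimited dimension
f.add_dimension("lat", 73)
f.add_dimension("lon", 144)
temp = f.add_variable("temperature", "float", ["time", "lat", "lon"])
pres = f.add_variable("pressure", "float", ["time", "lat", "lon"])
temp.attributes["units"] = Attribute("units", "char", ["K"])

# Variables that share the same dimensions are on a common grid.
same_grid = temp.shape == pres.shape
```

Adding a second unlimited dimension, or a variable of a type outside the six primitives, raises `ValueError` in this sketch, mirroring the constraints stated above.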

Some NetCDF-3 Limitations
  • Only one shared unlimited dimension
  • No structures, just scalars and multidimensional arrays
  • No strings, just arrays of characters
  • Limited numeric types
  • No ragged arrays or nested structures
  • Only ASCII characters in names
  • Changes to file schema can be expensive
  • Efficient access requires reads in same order as writes
  • No built-in compression
  • Only serial I/O
  • Flat name space limits scalability
NetCDF-4 Features to Address Limitations
  • Multiple unlimited dimensions
  • Portable structured types
  • String type
  • Additional numeric types
  • Variable-length types for ragged arrays
  • Unicode names
  • Efficient dynamic schema changes
  • Multidimensional tiling (chunking)
  • Per variable compression
  • Parallel I/O
  • Nested scopes using Groups
NetCDF-4 Data Model (Common Data Access Model)
  • File
    • location: Filename
    • create( ), open( ), …
  • Group
    • name: String
  • Dimension
    • name: String
    • length: int
    • isUnlimited( )
  • Variable
    • name: String
    • shape: Dimension[ ]
    • type: DataType
    • array: read( ), …
  • Attribute
    • name: String
    • type: DataType
    • values: 1D array
  • DataType
    • PrimitiveType: char, byte, short, int, int64, float, double, unsigned byte, unsigned short, unsigned int, unsigned int64, string
    • UserDefinedType (typename: String): Enum, Opaque, Compound, VariableLength

Variables and attributes have one of twelve primitive data types or one of four user-defined types.

A file has a top-level unnamed group. Each group may contain one or more named subgroups, variables, dimensions, and attributes. Variables also have attributes. Variables may share dimensions, indicating a common grid. One or more dimensions may be of unlimited length.
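
The nested scopes that groups provide can be sketched in a few lines. This is an illustrative in-memory model, not the netCDF-4 API: the `Group` class and its methods are hypothetical, and the lookup rule shown (a dimension defined in a group is visible in that group and its descendants) is stated here as an assumption of the sketch.

```python
class Group:
    """A named scope holding subgroups and dimensions (hypothetical class)."""

    def __init__(self, name="", parent=None):
        self.name = name      # the top-level group is unnamed
        self.parent = parent
        self.groups = {}
        self.dimensions = {}  # name -> length; None marks an unlimited dimension

    def create_group(self, name):
        child = Group(name, parent=self)
        self.groups[name] = child
        return child

    def find_dimension(self, name):
        """Search this group, then each enclosing group, for a dimension."""
        group = self
        while group is not None:
            if name in group.dimensions:
                return group, group.dimensions[name]
            group = group.parent
        raise KeyError(name)

    def path(self):
        """Return a slash-separated path from the top-level group."""
        if self.parent is None:
            return "/"
        prefix = self.parent.path()
        return prefix + self.name if prefix == "/" else prefix + "/" + self.name


root = Group()
root.dimensions["time"] = None     # unlimited
root.dimensions["station"] = None  # a second unlimited dimension is allowed
model = root.create_group("model")
run = model.create_group("run1")
run.dimensions["level"] = 26

owner, length = run.find_dimension("time")  # resolved two levels up, in the root
```

Here `run.path()` is `"/model/run1"`, and `"level"` is not visible from `root`, which is how separate groups avoid name clashes in a large file.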

NetCDF-4 Architecture


  • NetCDF-4 uses HDF5 for storage and high performance
    • Parallel I/O
    • Chunking for efficient access in different orders, efficient use of compression
    • Conversion using “reader makes right” approach
  • Provides simple netCDF interface to subset of HDF5
  • Also supports netCDF classic and 64-bit formats
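
Why chunking helps with access in different orders can be seen by counting how many chunks a request touches. The sketch below is plain arithmetic, not HDF5 or netCDF code: the function `chunks_touched`, the array shape, and both chunk shapes are hypothetical values chosen for illustration.

```python
def chunks_touched(start, count, chunk_shape):
    """Count the chunks that intersect a hyperslab given by start and count."""
    total = 1
    for s, c, k in zip(start, count, chunk_shape):
        first = s // k            # index of the first chunk along this axis
        last = (s + c - 1) // k   # index of the last chunk along this axis
        total *= last - first + 1
    return total


# A hypothetical (time, lat, lon) variable of shape (1000, 180, 360).
by_record = (1, 180, 360)  # one time step per chunk, like record-oriented storage
tiled = (100, 18, 36)      # same number of values per chunk, spread over all axes

# Request A: a full map at one time step.
map_start, map_count = (500, 0, 0), (1, 180, 360)
# Request B: a full time series at one grid point.
series_start, series_count = (0, 90, 180), (1000, 1, 1)

map_by_record = chunks_touched(map_start, map_count, by_record)           # 1
map_tiled = chunks_touched(map_start, map_count, tiled)                   # 100
series_by_record = chunks_touched(series_start, series_count, by_record)  # 1000
series_tiled = chunks_touched(series_start, series_count, tiled)          # 10
```

With one time step per chunk, the map needs 1 chunk but the time series needs 1000. With the tiled layout the worst case drops to 100 chunks, which is the trade-off behind choosing chunk shapes for mixed access patterns. Compression is applied per chunk in this kind of layout, so fewer chunks touched also means less data to decompress.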

[Figure: layered architecture diagram. Applications: NetCDF Java, NetCDF-3, NetCDF-4, HDF5. Libraries: netCDF Java, netCDF-4, netCDF-3, HDF5. Underlying layers: Java VM, POSIX I/O, MPI I/O.]


Status of NetCDF-4
  • NetCDF-4.0-alpha14 currently available for testing
    • Files created with alpha release use unsupported artifacts
    • We’re seeking feedback on performance and functionality
  • NetCDF-4.0-beta waiting for HDF5 1.8-beta
    • Will finalize file format, eliminate necessity for artifacts
    • Expected within a few weeks of HDF5 1.8-beta release, maybe by August 2006
  • HDF5 1.8 currently expected by November 2006
    • Has enhancements specifically for netCDF-4: variable creation order, Unicode names, dimension scales, on-the-fly numeric conversions
  • Plans for netCDF-4.1 and beyond on netCDF-4 web site
Summary
  • Unidata’s LDM-6 implements an event-driven architecture for low-latency data distribution
  • Unidata’s IDV provides a platform-independent visualization and analysis framework and reference application for integrating data from diverse sources
  • Unidata’s netCDF-4 software preserves backward compatibility and eliminates many limitations of netCDF-3 with only a modest increase in complexity