Aggregation and Subsetting in ERDDAP (a middleman data server)

Aggregation and Subsetting in ERDDAP (a middleman data server). http://coastwatch.pfeg.noaa.gov/erddap Bob Simons <[email protected]>, NOAA NMFS SWFSC ERD.

Presentation Transcript


Aggregation and Subsetting in ERDDAP (a middleman data server)

http://coastwatch.pfeg.noaa.gov/erddap

Bob Simons <[email protected]>

NOAA NMFS SWFSC ERD


Aggregating Gridded Data

  • Aggregating time points: 10,000s of data files, each holding sst[latitude][longitude], become one virtual dataset: sst[time][latitude][longitude]

  • Aggregating variables: Many files with one variable per file become one virtual dataset with all variables
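A minimal sketch of the time-point aggregation described on this slide, assuming each source file can be read as one 2-D grid plus the time stamp it belongs to. The `files` list and the `aggregate_time` helper are hypothetical stand-ins for illustration; this is not how ERDDAP itself is implemented.

```python
# Hypothetical in-memory stand-ins for per-time-point files: each "file" holds
# one 2-D grid, sst[latitude][longitude], and the time stamp it covers.
files = [
    {"time": "2012-08-10", "sst": [[10.0, 11.0], [12.0, 13.0]]},
    {"time": "2012-08-12", "sst": [[14.0, 15.0], [16.0, 17.0]]},
    {"time": "2012-08-11", "sst": [[20.0, 21.0], [22.0, 23.0]]},
]


def aggregate_time(files, var):
    """Stack 2-D grids into var[time][latitude][longitude], ordered by time."""
    # ISO 8601 date strings sort chronologically as plain text.
    ordered = sorted(files, key=lambda f: f["time"])
    time_axis = [f["time"] for f in ordered]
    cube = [f[var] for f in ordered]
    return time_axis, cube


time_axis, sst = aggregate_time(files, "sst")
print(time_axis)     # the new time dimension
print(sst[1][0][1])  # sst[time=1][latitude=0][longitude=1]
```

The user then sees one variable with a time axis instead of thousands of separate files.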


Subsetting Gridded Data

  • OPeNDAP Projection Constraints: sst[57:57][121:2:141][163:2:183]
    ERDDAP: sst[(2012-08-12)][(20):2:(40)][(-140):2:(-120)]

  • Huge time-saver: The user can request just what she needs (e.g., 1% of the data).

  • Aggregated datasets need to be subset-able.
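The two constraint styles on this slide can be sketched as follows. The first helper reads an index constraint (start:stride:stop, with an inclusive stop) as Python slices; the second builds the ERDDAP-style string in which parentheses mark domain values. Both helper names and the `DATASET_ID` placeholder are assumptions for illustration; only the constraint strings and the server URL come from the slides.

```python
import re


def hyperslab_to_slices(constraint):
    """Turn an index constraint such as '[57:57][121:2:141]' into Python slices.

    Each bracket is start, start:stop, or start:stride:stop. The stop index is
    inclusive, unlike Python's exclusive slice stop.
    """
    slices = []
    for part in re.findall(r"\[([^\]]+)\]", constraint):
        nums = [int(n) for n in part.split(":")]
        if len(nums) == 1:
            start, stride, stop = nums[0], 1, nums[0]
        elif len(nums) == 2:
            start, stride, stop = nums[0], 1, nums[1]
        else:
            start, stride, stop = nums
        slices.append(slice(start, stop + 1, stride))
    return slices


def erddap_grid_query(var, time, lat, lon, stride=1):
    """Build an ERDDAP-style constraint; parentheses mark domain values."""
    def axis(lo, hi):
        return f"[({lo}):{stride}:({hi})]"
    return f"{var}[({time})]" + axis(*lat) + axis(*lon)


slices = hyperslab_to_slices("[57:57][121:2:141][163:2:183]")
points_per_axis = [len(range(s.start, s.stop, s.step)) for s in slices]

query = erddap_grid_query("sst", "2012-08-12", (20, 40), (-140, -120), stride=2)

# DATASET_ID is a made-up placeholder, not a real dataset on the server.
DATASET_ID = "exampleSstDataset"
url = f"http://coastwatch.pfeg.noaa.gov/erddap/griddap/{DATASET_ID}.csv?{query}"
print(points_per_axis)
print(url)
```

Some HTTP clients require the brackets in the query to be percent-encoded before the request is sent.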


Aggregating In-Situ and Tabular Data

  • A database-like table with rows and columns. E.g., one file has data for one buoy for one month. It isn't a multi-dimensional grid. There are no dimensions.

  • Aggregating features and time points. Features: stations, trajectories, profiles, ... Append into a giant virtual table.
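As a rough illustration of "append into a giant virtual table", the sketch below chains per-file tables lazily, so no combined copy is built until rows are actually requested. The station names, values, and the `aggregate_tables` helper are invented for the example.

```python
from itertools import chain

# Hypothetical per-file tables: one buoy, one month each.
buoy_a_august = [
    {"station": "buoyA", "time": "2012-08-01T00:00:00Z",
     "longitude": -130.0, "latitude": 35.0, "sst": 18.2},
    {"station": "buoyA", "time": "2012-08-02T00:00:00Z",
     "longitude": -130.0, "latitude": 35.0, "sst": 18.4},
]
buoy_b_august = [
    {"station": "buoyB", "time": "2012-08-01T00:00:00Z",
     "longitude": -125.0, "latitude": 22.0, "sst": 24.9},
]


def aggregate_tables(*tables):
    """Lazily append the rows of many per-file tables into one virtual table."""
    return chain.from_iterable(tables)


virtual_table = list(aggregate_tables(buoy_a_august, buoy_b_august))
for row in virtual_table:
    print(row["station"], row["time"], row["sst"])
```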


Subsetting In-Situ and Tabular Data

  • OPeNDAP Selection Constraints (no indices, because there are no multi-dimensional grids):
    longitude,latitude,time,sst&sst>35
    Easy to create. Uses domain units (degC). Very flexible. (Based on a database's SQL SELECT.)

  • Huge time-saver: The user can request just what she needs (e.g., 1% of the data).

  • Aggregated datasets need to be subset-able.
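A simplified sketch of how a selection constraint like the one above can be read: the part before the first `&` lists the columns to return, and each later part filters rows by value. The parser below handles numeric comparisons only and is an illustration, not ERDDAP's own parser; the sample rows and `DATASET_ID` are made up.

```python
import operator
import re
from urllib.parse import quote

OPS = {">=": operator.ge, "<=": operator.le, "!=": operator.ne,
       ">": operator.gt, "<": operator.lt, "=": operator.eq}


def parse_query(query):
    """Split 'col1,col2&name>value&...' into columns and numeric constraints."""
    projection, *constraints = query.split("&")
    columns = projection.split(",")
    parsed = []
    for text in constraints:
        m = re.fullmatch(r"(\w+)(>=|<=|!=|>|<|=)(-?\d+(?:\.\d+)?)", text)
        if m is None:
            raise ValueError(f"unsupported constraint: {text!r}")
        name, op, value = m.groups()
        parsed.append((name, OPS[op], float(value)))
    return columns, parsed


def select(rows, query):
    """Return the requested columns of every row that passes all constraints."""
    columns, constraints = parse_query(query)
    return [
        {c: row[c] for c in columns}
        for row in rows
        if all(op(row[name], value) for name, op, value in constraints)
    ]


# Made-up rows for illustration only.
rows = [
    {"longitude": -130.0, "latitude": 35.0, "time": "2012-08-01", "sst": 18.2},
    {"longitude": 52.0, "latitude": 26.0, "time": "2012-08-01", "sst": 36.1},
    {"longitude": 51.0, "latitude": 25.0, "time": "2012-08-02", "sst": 35.0},
    {"longitude": 50.0, "latitude": 27.0, "time": "2012-08-03", "sst": 37.4},
]
query = "longitude,latitude,time,sst&sst>35"
hot = select(rows, query)

# In a URL, '>' is percent-encoded. DATASET_ID is a hypothetical placeholder.
DATASET_ID = "exampleBuoyDataset"
encoded = quote(query, safe=",&=")
url = f"http://coastwatch.pfeg.noaa.gov/erddap/tabledap/{DATASET_ID}.csv?{encoded}"
print(hot)
print(url)
```

Because the constraint is written in domain units, the user never needs to know how the rows are stored or indexed.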


Don't Treat In-Situ/Tabular Data Like Gridded Data

  • CF DSG (Discrete Sampling Geometries) stores in-situ data as gridded .nc. Fine for storage, not for subsetting.

  • Problem: Indices aren't domain units. How do you request sst>35 with indices?

  • Problem: Indices aren't a real-world sequence.
    Grid: lat[] is a sequence. lat[42:53] has meaning.
    Table: Buoy number isn't. &lat>20&lat<40 is buoy #2, 14, 26, 109, not buoy[42:53].

  • Problem: 5 CF DSG data structures.
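The index problem can be seen in a small example. On a grid axis, the values are sorted, so a latitude band maps to one contiguous index range. In a station table, the station number has no spatial meaning, so the same band picks out scattered indices that no single start:stop range can express. All values below are invented for illustration.

```python
def matching_indices(latitudes, low, high):
    """Indices whose latitude lies strictly between low and high."""
    return [i for i, lat in enumerate(latitudes) if low < lat < high]


def is_contiguous(indices):
    """True if the indices form one unbroken run, i.e. one start:stop range."""
    return indices == list(range(indices[0], indices[-1] + 1)) if indices else True


# Grid axis: latitude is a sorted sequence, so index order is spatial order.
grid_lat = [10.0 + 5.0 * i for i in range(8)]  # 10, 15, ..., 45

# Station table: index is just the order the buoys were listed (hypothetical).
buoy_lat = [55.0, 12.5, 31.0, 47.2, 8.0, 25.4, 61.3, 38.9]

grid_hits = matching_indices(grid_lat, 20, 40)
buoy_hits = matching_indices(buoy_lat, 20, 40)

print(grid_hits, is_contiguous(grid_hits))
print(buoy_hits, is_contiguous(buoy_hits))
```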


Option: Treat Gridded Data Like Tabular Data

  • Standard request: time, lat, lon bounding box.
    What about unusual requests of gridded data, e.g., SST>35 ("select by value")?

  • ERDDAP's EDDTableFromEDDGrid creates a giant virtual table from a gridded dataset.
    Columns: longitude, latitude, time, sst
    Query: e.g., longitude,latitude,time,sst&sst>35
    Response: a table (one data point per row)

  • Risk: huge effort for server.
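A sketch of the idea behind presenting a grid as a table, assuming a tiny in-memory grid: every cell becomes one row, and a select-by-value query is a filter over those rows. The `grid_to_rows` generator is a hypothetical illustration and not the actual EDDTableFromEDDGrid code; it does show where the server cost comes from, since a value constraint cannot skip any cell.

```python
def grid_to_rows(times, lats, lons, sst):
    """Yield one table row per grid cell of sst[time][latitude][longitude]."""
    for t, time in enumerate(times):
        for i, lat in enumerate(lats):
            for j, lon in enumerate(lons):
                yield {"longitude": lon, "latitude": lat,
                       "time": time, "sst": sst[t][i][j]}


# Made-up 2 x 2 x 3 grid for illustration.
times = ["2012-08-11", "2012-08-12"]
lats = [20.0, 22.0]
lons = [-140.0, -138.0, -136.0]
sst = [
    [[30.0, 36.5, 34.0], [35.0, 33.0, 38.2]],
    [[29.0, 31.0, 35.1], [36.0, 20.0, 10.0]],
]

# Equivalent of the query longitude,latitude,time,sst&sst>35
hot_rows = [row for row in grid_to_rows(times, lats, lons, sst) if row["sst"] > 35]

cells_scanned = len(times) * len(lats) * len(lons)
print(len(hot_rows), "rows returned after scanning", cells_scanned, "cells")
```

On a real dataset the cell count is the product of all axis lengths, which is why an unbounded value query can be very expensive.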


Summary: Huge Advantages of Aggregation and Subsetting

  • Users can find and deal with one aggregated dataset.

  • Users can make one subset request to one aggregated dataset.
    Grids: indices to get a temporal and spatial subset.
    Tables (selection constraints): any subset you want.
    (Not: one subset request to each unaggregated file, or worse, using FTP to download lots of entire files.)

  • Don't treat tabular/in-situ data like gridded data.


Aggregation and Subsetting in ERDDAP (a middleman data server)

http://coastwatch.pfeg.noaa.gov/erddap

Bob Simons <[email protected]>

NOAA NMFS SWFSC ERD

