Aggregation and Subsetting in ERDDAP (a middleman data server). http://coastwatch.pfeg.noaa.gov/erddap Bob Simons <bob.simons@noaa.gov> NOAA NMFS SWFSC ERD. Aggregating Gridded Data.

Aggregation and Subsetting in ERDDAP (a middleman data server)

http://coastwatch.pfeg.noaa.gov/erddap

Bob Simons <bob.simons@noaa.gov>

NOAA NMFS SWFSC ERD


Aggregating Gridded Data

  • Aggregating time points: 10,000s of data files, each holding sst[latitude][longitude], become one virtual dataset: sst[time][latitude][longitude]

  • Aggregating variables: many files with one variable per file become one virtual dataset with all variables.
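
The time-point aggregation above can be sketched in a few lines of Python. This is only an illustration of the idea, not ERDDAP's implementation: the `files` list, its `time` and `sst` keys, and the function name `aggregate_time_points` are all hypothetical stand-ins for real data files.

```python
from datetime import date

def aggregate_time_points(files):
    """Stack per-time 2-D grids into one [time][latitude][longitude] array."""
    # Sort by time stamp so the new time axis is monotonic.
    ordered = sorted(files, key=lambda f: f["time"])
    time_axis = [f["time"] for f in ordered]
    sst = [f["sst"] for f in ordered]
    return time_axis, sst

# Two hypothetical files, each holding sst[latitude][longitude] (2 x 3).
files = [
    {"time": date(2012, 8, 13), "sst": [[15.2, 15.4, 15.1], [16.0, 16.3, 16.1]]},
    {"time": date(2012, 8, 12), "sst": [[15.0, 15.1, 14.9], [15.8, 16.0, 15.9]]},
]

time_axis, sst = aggregate_time_points(files)
# The virtual dataset is now indexed as sst[time][latitude][longitude].
print(time_axis[0], sst[0][1][2])
```

The user sees one dataset with a time axis, regardless of how many files sit behind it.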


Subsetting Gridded Data

  • OPeNDAP Projection Constraints
    OPeNDAP (indices): sst[57:57][121:2:141][163:2:183]
    ERDDAP (axis values in parentheses): sst[(2012-08-12)][(20):2:(40)][(-140):2:(-120)]

  • Huge time-saver: the user can request just what she needs (e.g., 1% of the dataset).

  • Aggregated datasets need to be subset-able.
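
As a rough sketch of the two constraint styles above, the Python below expands an index-based `[start:stride:stop]` constraint (the stop index is inclusive) and builds the parenthesized, value-based form. The helper names `opendap_indices` and `erddap_value_constraint` are made up for this example; note that the stride still counts grid points, not degrees.

```python
def opendap_indices(start, stride, stop):
    """Indices selected by a [start:stride:stop] constraint (stop is inclusive)."""
    return list(range(start, stop + 1, stride))

def erddap_value_constraint(start, stop, stride=1):
    """Value-based constraint: axis values go in parentheses instead of indices."""
    if start == stop:
        return f"[({start})]"
    return f"[({start}):{stride}:({stop})]"

# Index form from the slide: every 2nd latitude index from 121 through 141.
lat_indices = opendap_indices(121, 2, 141)

# Value form from the slide: one date, then latitude and longitude ranges.
query = (
    "sst"
    + erddap_value_constraint("2012-08-12", "2012-08-12")
    + erddap_value_constraint(20, 40, 2)
    + erddap_value_constraint(-140, -120, 2)
)
print(len(lat_indices), query)
```

The value form lets the user write the request without first looking up which index corresponds to which date or coordinate.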


Aggregating In-Situ and Tabular Data

  • A database-like table with rows and columns
    E.g., one file has data for one buoy for one month.
    It isn't a multi-dimensional grid. There are no dimensions.

  • Aggregating features and time points:
    Features: stations, trajectories, profiles, ...
    Append into a giant virtual table.
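
A minimal sketch of "append into a giant virtual table", assuming each source file is represented as a dictionary with a `station` name and a list of `rows`. Those names, and the sample values, are hypothetical; a real server would read the rows from files on demand.

```python
def aggregate_tables(files):
    """Append the rows of many small per-station files into one table."""
    table = []
    for f in files:
        for row in f["rows"]:
            # Carry the feature identifier into each row so it stays queryable.
            table.append({"station": f["station"], **row})
    return table

# Hypothetical files: one buoy for one month per file.
files = [
    {"station": "buoy_A", "rows": [
        {"time": "2012-08-01T00:00:00Z", "latitude": 30.0, "longitude": -130.0, "sst": 21.3},
        {"time": "2012-08-02T00:00:00Z", "latitude": 30.0, "longitude": -130.0, "sst": 21.6},
    ]},
    {"station": "buoy_B", "rows": [
        {"time": "2012-08-01T00:00:00Z", "latitude": 45.0, "longitude": -125.0, "sst": 14.8},
    ]},
]

table = aggregate_tables(files)
print(len(table), table[2]["station"])
```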


Subsetting In-Situ and Tabular Data

  • OPeNDAP Selection Constraints (no indices, because no multi-dimensional grids)
    longitude,latitude,time,sst&sst>35
    Easy to create. Uses domain units (degC).
    Very flexible. (Based on a database's SQL SELECT.)

  • Huge time-saver: the user can request just what she needs (e.g., 1% of the dataset).

  • Aggregated datasets need to be subset-able.
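
To make the selection-constraint query above concrete, here is a small Python sketch that applies a query such as `longitude,latitude,time,sst&sst>35` to an in-memory table. It handles only numeric comparisons and is a simplified illustration, not how a server evaluates requests; the `select` function and the sample rows are hypothetical. The last lines show the query percent-encoded for use in a URL.

```python
import operator
import re
from urllib.parse import quote

OPS = {">=": operator.ge, "<=": operator.le, "!=": operator.ne,
       ">": operator.gt, "<": operator.lt, "=": operator.eq}

def select(rows, query):
    """Return the requested columns of the rows that pass every constraint."""
    columns_part, *constraints = query.split("&")
    columns = columns_part.split(",")
    tests = []
    for c in constraints:
        name, op, value = re.fullmatch(r"(\w+)(>=|<=|!=|>|<|=)(.+)", c).groups()
        tests.append((name, OPS[op], float(value)))  # numeric values only
    return [
        {col: row[col] for col in columns}
        for row in rows
        if all(op(row[name], value) for name, op, value in tests)
    ]

rows = [
    {"station": "A", "longitude": -130.0, "latitude": 30.0, "time": "2012-08-12T00:00:00Z", "sst": 36.1},
    {"station": "B", "longitude": -125.0, "latitude": 45.0, "time": "2012-08-12T00:00:00Z", "sst": 28.4},
    {"station": "C", "longitude": -120.0, "latitude": 25.0, "time": "2012-08-12T00:00:00Z", "sst": 35.0},
]

query = "longitude,latitude,time,sst&sst>35"
hot = select(rows, query)

# Characters such as '>' should be percent-encoded when the query goes in a URL.
encoded = quote(query, safe=",&=")
print(hot, encoded)
```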


Don't Treat In-Situ/Tabular Data Like Gridded Data

  • CF DSG (Discrete Sampling Geometries) stores in-situ data as gridded .nc files. Fine for storage, not for subsetting.

  • Problem: Indices aren't domain units. How do you request sst>35 with indices?

  • Problem: Indices aren't a real-world sequence.
    Grid: lat[] is a sequence. lat[42:53] has meaning.
    Table: buoy number isn't. &lat>20&lat<40 selects buoys #2, 14, 26, 109, not buoy[42:53].

  • Problem: 5 CF DSG data structures.
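
The index problem above can be seen in a short Python sketch. On a sorted grid axis, a latitude range maps to one contiguous run of indices; for buoys, the same range picks scattered buoy numbers. The buoy numbers and latitudes here are invented to mirror the slide's example.

```python
def is_contiguous(indices):
    """True if the indices form one unbroken run, so a single [a:b] slice covers them."""
    return indices == list(range(indices[0], indices[-1] + 1))

# Grid: a sorted latitude axis, one value per degree from -90 to 90.
grid_lat = [float(v) for v in range(-90, 91)]
grid_hits = [i for i, lat in enumerate(grid_lat) if 20 < lat < 40]

# Table: hypothetical buoy numbers mapped to their latitudes, in no spatial order.
buoy_lat = {2: 25.0, 3: 48.1, 14: 33.4, 20: 12.0, 26: 38.9, 57: -5.0, 109: 21.7}
buoy_hits = sorted(n for n, lat in buoy_lat.items() if 20 < lat < 40)

print(is_contiguous(grid_hits), grid_hits[0], grid_hits[-1])
print(is_contiguous(buoy_hits), buoy_hits)
```

An index range is enough for the grid, but the buoy selection can only be expressed as a constraint on values.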


Option: Treat Gridded Data Like Tabular Data

  • Standard request: time, lat, lon bounding box
    What about unusual requests of gridded data, e.g., SST>35 ("Select by value")?

  • ERDDAP's EDDTableFromEDDGrid creates a giant virtual table from a gridded dataset.
    Columns: longitude, latitude, time, sst
    Query: e.g., longitude,latitude,time,sst&sst>35
    Response: a table (one data point per row)

  • Risk: huge effort for server.
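
The idea of viewing a grid as a table, and the cost that comes with it, can be sketched as below. This is an illustration under simple assumptions (small in-memory lists, a hypothetical `grid_to_rows` helper), not a description of how EDDTableFromEDDGrid is implemented.

```python
def grid_to_rows(time_axis, lat_axis, lon_axis, sst):
    """Yield one table row per grid cell of sst[time][latitude][longitude]."""
    for t, time in enumerate(time_axis):
        for i, lat in enumerate(lat_axis):
            for j, lon in enumerate(lon_axis):
                yield {"longitude": lon, "latitude": lat, "time": time,
                       "sst": sst[t][i][j]}

# A tiny hypothetical grid: 1 time point x 2 latitudes x 2 longitudes.
time_axis = ["2012-08-12"]
lat_axis = [20.0, 22.0]
lon_axis = [-140.0, -138.0]
sst = [[[34.0, 35.5], [36.2, 30.1]]]

# "Select by value": every cell has to be visited to answer sst>35.
cells_scanned = 0
hot = []
for row in grid_to_rows(time_axis, lat_axis, lon_axis, sst):
    cells_scanned += 1
    if row["sst"] > 35:
        hot.append(row)

print(cells_scanned, hot)
```

Because a value constraint cannot be turned into an index range, the scan grows with the number of grid cells, which is where the risk of huge server effort comes from.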


Summary: Huge Advantages of Aggregation and Subsetting

  • Users can find and deal with one aggregated dataset.

  • Users can make one subset request to one aggregated dataset.
    Grids: indices to get a temporal and spatial subset.
    Tables (selection constraints): any subset you want.
    (Not: one subset request to each unaggregated file, or worse, using FTP to download lots of entire files.)

  • Don't treat tabular/in-situ data like gridded data.

