New Resources in the Research Data Archive

Presentation Transcript


  1. New Resources in the Research Data Archive Doug Schuster

  2. Topic Outline • New Resources • Search/Discovery and Data Delivery • TIGGE • JRA-25 • Routine Updates

  3. Data Search, Discovery and Delivery • Popular Datasets • Google Style Search • Drill Down Style Search • File Level Metadata • Example: • Search for model generated tropical cyclone track data using “Drill Down” method.

  4. Data Search, Discovery, and Delivery

  5. Data Search, Discovery, and Delivery (Drill Down)

  6. Data Search, Discovery, and Delivery (Drill Down)

  7. Data Search, Discovery, and Delivery (File Level Metadata)

  8. Background on TIGGE WMO World Weather Research Programme THORPEX • THe Observing system Research and Predictability EXperiment • THORPEX Interactive Global Grand Ensemble (TIGGE) Archive supports research • Grand Ensemble = ensembles from multiple NWP centers are combined (an ensemble of ensembles) • 10 international NWP centers contributing to TIGGE

  9. Background on TIGGE Three mirrored archive centers • NCAR • ECMWF • CMA {Shared System Development!} • Daily Data Flow Metrics • 245 GB • 1.6 Million gridded fields as separate data packets • 3000+ Files/day

  10. Data Receipt (diagram) • Centers shown: UKMO, CMC, CMA, ECMWF, MeteoFrance, NCAR, NCEP, JMA, KMA, NCDC, CPTEC, BoM • Legend: Archive Centre, Current Data Provider • Transfer methods: IDD/LDM, HTTP, FTP • Unidata IDD/LDM: Internet Data Distribution / Local Data Manager, a commodity internet application to send and receive data

  11. Archive Summary • Online Data • Period: most recent two weeks • ~4 TB, public products • ~2 TB, data preparation, subsetting, DB • Offline Data • Full period of record • ~200 TB, NCAR MSS system
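
As a rough consistency check on these figures, the sketch below multiplies the daily inflow quoted on the data-flow slide (245 GB/day) by the two-week online window. The decimal conversion (1 TB = 1000 GB) and the function name are assumptions made for illustration only.

```python
GB_PER_DAY = 245     # daily inflow quoted on the data-flow slide
ONLINE_DAYS = 14     # "most recent two weeks"
GB_PER_TB = 1000     # assumption: decimal units


def online_volume_tb(gb_per_day: float, days: int) -> float:
    """Estimate the volume held online for a rolling window of `days`."""
    return gb_per_day * days / GB_PER_TB


volume = online_volume_tb(GB_PER_DAY, ONLINE_DAYS)
print(f"{volume:.2f} TB")  # prints "3.43 TB"
```

About 3.4 TB for two weeks of inflow is in line with the ~4 TB quoted for public products online.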

  12. Major Challenges: Ensure data receipt, build complete archive • Exchange manifest files as part of IDD/LDM data transmission between archive centers • Verify send, receive • Automated resend requests for missing fields • Collate data fields into different file types • Harvest and hold metadata in MySQL DBs • Identify location of every field in file set • Updated often • Critical for user interface and background data processing
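
The manifest check described on this slide can be sketched as a comparison between what a sender says it transmitted and what actually arrived. This is a minimal illustration, not the TIGGE implementation: the field identifiers, the function names, and the shape of the resend request are all hypothetical.

```python
def missing_fields(manifest, received):
    """Return field IDs listed in the manifest but not received, in manifest order."""
    got = set(received)
    return [field_id for field_id in manifest if field_id not in got]


def build_resend_request(sender, manifest, received):
    """Build a (hypothetical) resend request for one sender's transmission."""
    missing = missing_fields(manifest, received)
    return {"sender": sender, "complete": not missing, "resend": missing}


# Hypothetical field IDs, invented for this example.
manifest = ["z500_f024", "t2m_f024", "z500_f048"]
received = ["z500_f024", "z500_f048"]

request = build_resend_request("ECMWF", manifest, received)
print(request)
```

Running the check automatically after each transmission is what allows a resend to be requested without manual inspection.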

  13. Major Challenges • Access system must accurately display what common parameters are available as users make selections • Driven by multi-center research (Grand Ensemble) • Parameters vary between centers.
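
One way to picture the "common parameters" requirement is as a set intersection that is recomputed each time the user changes the selected centers. The sketch below assumes a simple in-memory mapping; the center names and parameter lists are placeholders, not the real TIGGE availability.

```python
def common_parameters(available, selected_centers):
    """Return the parameters offered by every selected center, sorted by name."""
    sets = [set(available[center]) for center in selected_centers]
    if not sets:
        return []
    return sorted(set.intersection(*sets))


# Placeholder availability table (hypothetical values).
available = {
    "center_a": ["2m_temperature", "geopotential_height", "total_precipitation"],
    "center_b": ["2m_temperature", "geopotential_height"],
    "center_c": ["2m_temperature", "total_precipitation"],
}

two_centers = common_parameters(available, ["center_a", "center_b"])
all_centers = common_parameters(available, ["center_a", "center_b", "center_c"])
print(two_centers)
print(all_centers)
```

Each added center can only shrink the common list, which is why the display has to be refreshed as selections are made.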

  14. Variance between centers

  15. Get Forecast Data Two User Interfaces • NCAR online file archive • Selection options (Portal or RDA) • Center(s) • Date • File type (sl, pl, etc.) • Initialization time • Forecast length • User customized files • Selection options (Portal) • Same as for files, plus • Parameter Subsets • Grid Interpolation • Spatial subsets • Formats: GRIB2, NetCDF • Real Time • Delayed Mode • Download Options • Point and click using browser, one file at a time • Script to run on local machine • User and password encrypted ‘wget’ commands • Background process to access all files

  16. User access selection demonstration Animation, what you will see • Multiple centers • (ECMWF, UKMO, NCEP, CMA, CMC, KMA) • Fields/Parameters • (Geopotential Height, 2m Temperature) • Levels • (500 hPa, Single Level) • Spatial and temporal ranges • (Global, 3-days, 12Z initializations, 48 hour forecasts) • Regridding to common spatial resolution • (1.5°) • Output format • (netCDF)

  17. Sample Data Request for an Event

  18. Retrieve Completed Subset

  19. Subset Request Animation

  20. Gustav/Hanna Animation

  21. Features of JRA-25/JCDAS at NCAR All data available through web/RDA portal and NCAR MSS, 11 TB • Available dates, 1979 through 2007 • 23 different data products • 4 x daily, GRIB1 format • Monthly mean, netCDF (NCAR derived from binary) format • All data users are registered and must agree to JMA’s ‘Condition of Use’

  22. Typhoon Sepat, 16 August 2007 Images courtesy Dave Stepaniak

  23. Routine Updates • NCEP • FNL Global Tropospheric Analysis (Daily) • BUFR/PREPBUFR obs. data (Weekly) • Unidata IDD data (Daily) • NetCDF format obs collected from GTS • IDD model data (GRIB-2) • GFS • NAM • RUC

  24. Routine Updates • SST • NCEP OI Global SST 1x1 Deg (weekly) • NOAA OI Global 0.25 x 0.25 SST (monthly) • Hadley Centre Global Sea Ice and SST (monthly) • Reanalysis • NNR Yearly updates • NARR Yearly updates • JRA-25

  25. Questions?

  26. Lessons Learned • Manifest files and automated resend are critical for a complete archive • The impact of differing contributions from the NWP centers across the archive should not be underestimated • There are important design considerations to ensure prompt browser interactions • Caching data from the DB
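
The caching point can be illustrated with a memoized lookup, so that repeated browser interactions do not each trigger a database query. This is only a sketch: `sqlite3` stands in for the MySQL databases mentioned earlier, and the table layout, center names, and function name are invented for the example.

```python
import sqlite3
from functools import lru_cache

# In-memory stand-in for the metadata database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fields (center TEXT, parameter TEXT)")
conn.executemany(
    "INSERT INTO fields VALUES (?, ?)",
    [
        ("center_a", "2m_temperature"),
        ("center_a", "geopotential_height"),
        ("center_b", "2m_temperature"),
    ],
)


@lru_cache(maxsize=128)
def parameters_for(center):
    """Look up a center's parameters; repeat calls are served from the cache."""
    rows = conn.execute(
        "SELECT parameter FROM fields WHERE center = ? ORDER BY parameter",
        (center,),
    ).fetchall()
    return tuple(row[0] for row in rows)


first = parameters_for("center_a")   # queries the database
second = parameters_for("center_a")  # answered from the cache
info = parameters_for.cache_info()
print(first, info.hits, info.misses)
```

Because the metadata is updated often, a cache like this would need to be cleared (for example with `parameters_for.cache_clear()`) whenever the underlying tables change.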

  27. Lessons Learned • Computational resource requirements ramp up quickly with multi-dimensional problems • Dimensions: center, ensemble member, parameter, forecast length, etc. • Archive file structure choices greatly impact subsetting ability • TIGGE currently based on synoptic order • Time-series by parameter could be better?
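
The growth described here is multiplicative: the number of fields to handle is the product of the sizes of all dimensions. The short sketch below shows the arithmetic. Only the count of 10 centers comes from the slides; every other size is an assumed round number chosen for illustration.

```python
from math import prod


def field_count(sizes):
    """Number of distinct fields = product of all dimension sizes."""
    return prod(sizes.values())


dimension_sizes = {
    "center": 10,           # from the TIGGE background slide
    "ensemble_member": 20,  # assumption
    "parameter": 30,        # assumption
    "forecast_step": 40,    # assumption
}

base = field_count(dimension_sizes)
with_levels = field_count({**dimension_sizes, "level": 8})  # add one more dimension
print(base, with_levels)  # prints "240000 1920000"
```

Adding a single eight-valued dimension multiplies the workload by eight, which is why resource needs ramp up so quickly.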

  28. Major Challenges • Limited online storage – 4 TB, ≅ 2 weeks temporal coverage • Full archive on NCAR Mass Storage System • User registration and metrics required • Accept data policy; for research and education only • 48 hour delay from forecast initialization time
