
TIGGE phase 1: Experience with exchanging large amounts of NWP data in near real-time



  1. TIGGE phase 1: Experience with exchanging large amounts of NWP data in near real-time. Baudouin Raoult, Data and Services Section, ECMWF

  2. The TIGGE core dataset • THORPEX Interactive Grand Global Ensemble • Global ensemble forecasts to around 14 days generated routinely at different centres around the world • Outputs collected in near real time and stored in a common format for access by the research community • Easy access to long series of data is necessary for applications such as bias correction and the optimal combination of ensembles from different sources

  3. Building the TIGGE database • Three archive centres: CMA, NCAR and ECMWF • Ten data providers: • Already sending data routinely: ECMWF, JMA (Japan), UK Met Office (UK), CMA (China), NCEP (USA), MSC (Canada), Météo-France (France), BOM (Australia), KMA (Korea) • Coming soon: CPTEC (Brazil) • Exchanges using UNIDATA LDM, HTTP and FTP • Operational since 1 October 2006 • 88 TB, growing by ~1 TB/week • 1.5 million fields/day

  4. TIGGE Archive Centres and Data Providers [World map showing the archive centres (ECMWF, NCAR, CMA), the current data providers (ECMWF, UKMO, CMC, CMA, Météo-France, NCEP, JMA, KMA, BoM) and the future data provider (CPTEC)]

  5. Strong governance • Precise definition of: • Which products: list of parameters, levels, steps, units, … • Which format: GRIB2 • Which transport protocol: UNIDATA’s LDM • Which naming convention: WMO file name convention • Only exception: the grid and resolution • Chosen by the data provider, who provides interpolation to a regular lat/lon grid • Best possible model output • Many tools and examples: • Sample dataset available • Various GRIB2 tools, “tigge_check” validator, … • Scripts that implement the exchange protocol • Web site with documentation, sample dataset, tools, news, …

  6. Using SMS (ECMWF’s Supervisor Monitor Scheduler) to handle the TIGGE flow

  7. Quality assurance: homogeneity • Homogeneity is paramount for TIGGE to succeed • The more consistent the archive, the easier it will be to develop applications • There are three aspects to homogeneity: • Common terminology (parameter names, file names, …) • Common data format (format, units, …) • Definition of an agreed list of products (parameters, steps, levels, …) • What is not homogeneous: • Resolution • Base time (although most providers have a run at 12 UTC) • Forecast length • Number of ensemble members

  8. QA: Checking for homogeneity • E.g. cloud cover: instantaneous or six-hourly?

  9. QA: Completeness • The objective is to have 100% complete datasets at the Archive Centres • Completeness may not be achieved for two reasons: • The transfer of the data to the Archive Centre fails • Operational activities at a data provider are interrupted and backfilling past runs is impractical • Incomplete datasets are often very difficult to use • Most of the current tools (e.g. EPSgrams) used for ensemble forecasts assume a fixed number of members from day to day • These tools will have to be adapted
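In practice, checking completeness amounts to comparing the set of fields actually received for a run against the expected catalogue. A minimal Python sketch of such a check is given below; the field tuples and the helper name are illustrative assumptions, not the actual TIGGE monitoring code:

    # Minimal per-run completeness check (illustrative only).
    # `expected` and `received` are sets of (parameter, level, step, member)
    # tuples describing the fields of one forecast run.

    def completeness(expected, received):
        """Return the fraction of expected fields present and the missing ones."""
        missing = expected - received
        fraction = 1.0 - len(missing) / len(expected)
        return fraction, sorted(missing)

    # Example: 2 m temperature, 4 members, two steps, one field missing.
    expected = {("2t", "sfc", step, member)
                for step in (0, 6) for member in range(1, 5)}
    received = expected - {("2t", "sfc", 6, 3)}

    fraction, missing = completeness(expected, received)
    print(f"run is {fraction:.1%} complete, missing: {missing}")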

  10. QA: Checking completeness

  11. GRIB to NetCDF Conversion [Diagram: a GRIB file containing individual messages such as t, ECMF, 1; t, EGRR, 2; d, ECMF, 2; … (parameter, originating centre, ensemble member id) is converted in three steps: gather metadata and message locations, create the NetCDF file structure, and populate the NetCDF parameter arrays t (1,2,3,4) and d (1,2,3,4), where (1,2,3,4) represents the ensemble member id (Realization)]
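As an illustration of the first step (gathering metadata and message locations before any values are decoded), here is a minimal sketch using the ecCodes Python bindings; the file name and the particular keys read are assumptions for the example, not the code of the actual converter:

    import eccodes

    # Step 1: scan the GRIB file and record, for each message, the metadata
    # needed to place it in the NetCDF structure (parameter, originating
    # centre, ensemble member id) together with its position in the file.
    index = []
    with open("tigge_sample.grib2", "rb") as f:      # hypothetical file name
        position = 0
        while True:
            gid = eccodes.codes_grib_new_from_file(f)
            if gid is None:                          # end of file
                break
            position += 1
            index.append({
                "param": eccodes.codes_get(gid, "shortName"),   # e.g. "t", "d"
                "centre": eccodes.codes_get(gid, "centre"),     # e.g. "ecmf", "egrr"
                # ensemble member id (Realization); defined for ensemble products
                "member": eccodes.codes_get(gid, "perturbationNumber"),
                "position": position,
            })
            eccodes.codes_release(gid)

    # Steps 2 and 3 (not shown): derive the dimension sizes from `index`, create
    # the NetCDF file structure, then re-read each message and copy its values
    # (eccodes.codes_get_values) into the matching slice of the parameter array.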

  12. Ensemble NetCDF File Structure • NetCDF file format • Based on available CF conventions • File organization follows the NetCDF file structure proposed by Doblas-Reyes (ENSEMBLES project) • Provides grid/ensemble specific metadata for each member • Data provider • Forecast type (perturbed, control, deterministic) • Allows for multiple combinations of initialization times and forecast periods within one file • Pairs of initialization time and forecast step

  13. Ensemble NetCDF File Structure • NetCDF parameter structure (5 dimensions): • Reftime • Realization (ensemble member id) • Level • Latitude • Longitude • “Coordinate” variables are used to describe: • Realization • Provides the metadata associated with each ensemble grid • Reftime • Allows for multiple initialization times and forecast periods to be contained within one file
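A minimal sketch of how such a five-dimensional parameter and its coordinate variables might be laid out with the netCDF4 Python library is shown below; the dimension sizes, attribute names and packing parameters are illustrative assumptions, not the agreed TIGGE/ENSEMBLES convention:

    import numpy as np
    from netCDF4 import Dataset

    # Illustrative sizes only: 2 reference times, 4 members, 1 level, 1-degree grid.
    nc = Dataset("tigge_ensemble_sketch.nc", "w")
    for name, size in [("reftime", 2), ("realization", 4), ("level", 1),
                       ("latitude", 181), ("longitude", 360)]:
        nc.createDimension(name, size)

    # "Coordinate" variables describing the reftime and realization dimensions.
    reftime = nc.createVariable("reftime", "f8", ("reftime",))
    reftime.units = "hours since 2006-10-01 00:00:00"   # assumed CF-style units
    realization = nc.createVariable("realization", "i4", ("realization",))
    realization.long_name = "ensemble member id"

    # The parameter itself: 5 dimensions, stored as 16-bit packed integers.
    t = nc.createVariable(
        "t", "i2",
        ("reftime", "realization", "level", "latitude", "longitude"),
        fill_value=-32767)
    t.scale_factor = 0.01     # value = scale_factor * packed_value + add_offset
    t.add_offset = 273.15
    t.units = "K"

    # netCDF4 packs the floats automatically using scale_factor/add_offset.
    t[0, 0, 0, :, :] = np.full((181, 360), 288.0)
    nc.close()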

  14. Tool Performance • GRIB-2 Simple Packing to NetCDF 32 BIT • GRIB-2 size x ~2 • GRIB-2 Simple Packing to NetCDF 16 BIT • Similar size • GRIB-2 JPEG 2000 to NetCDF 32 BIT • GRIB-2 size x ~8 • GRIB-2 JPEG 2000 to NetCDF 16 BIT • GRIB-2 size x ~4 • Issue: packing of 4D fields (e.g. 2D + levels + time steps) • Packing in NetCDF is similar to simple packing in GRIB2: • value = scale_factor * packed_value + add_offset • All dimensions share the same scale_factor and add_offset • For 16 bits, only 65536 different values can be encoded. This is a problem if there is a lot of variation in the 4D matrices
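To make the 16-bit limitation concrete, the short sketch below applies this simple-packing formula with one shared scale_factor/add_offset to a field whose values span a wide range; the numbers are made up purely for illustration:

    import numpy as np

    def pack16(values):
        """Simple packing: one scale_factor/add_offset shared by the whole array."""
        vmin, vmax = float(values.min()), float(values.max())
        scale_factor = (vmax - vmin) / 65535.0       # only 65536 distinct codes
        add_offset = vmin
        packed = np.round((values - add_offset) / scale_factor).astype(np.uint16)
        return packed, scale_factor, add_offset

    def unpack(packed, scale_factor, add_offset):
        return scale_factor * packed + add_offset    # value = scale * packed + offset

    # A flattened 4D field mixing small and large magnitudes (e.g. several
    # levels and steps): the shared scale is dominated by the overall range,
    # so the small-amplitude part loses most of its precision.
    field = np.concatenate([np.linspace(0.0, 1.0, 1000),       # small-amplitude part
                            np.linspace(0.0, 10000.0, 1000)])  # large-amplitude part
    packed, s, o = pack16(field)
    err = np.abs(unpack(packed, s, o) - field).max()
    print(f"max absolute error {err:.3f} with one shared scale over a 0..10000 range")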

  15. GRIB2 • WMO standard • Fine control over the numerical accuracy of grid values • Good compression (lossless JPEG 2000) • GRIB is a record format • Many GRIB messages can be written in a single file • GRIB edition 2 is template based • It can easily be extended

  16. NetCDF • Work on the converter gave us a good understanding of both formats • NetCDF is a file format • Merging/splitting NetCDF files is non-trivial • Need to agree on a convention (CF) • Only lat/lon and reduced grids (?) so far; work is in progress to add other grids to the CF conventions • There is no way to support multiple grids in the same file • A convention must be chosen for storing multiple fields per NetCDF file • All levels? All variables? All time steps? • Simple packing is possible, but only as a convention • 2 to 8 times larger than GRIB2

  17. Conclusion • True interoperability • Data format, units • Clear definition of the parameters (semantics) • Common tools are required (the only guarantee of true interoperability) • Strong governance is needed • GRIB2 vs NetCDF • Different usage patterns • NetCDF: file based, little compression, need to agree on a convention • GRIB2: record based, easier to manage large volumes, WMO standard
