Verification Issues at NCEP


Presentation Transcript


  1. Verification Issues at NCEP
     Zoltan Toth and Yuejian Zhu
     Environmental Modeling Center, NOAA/NWS/NCEP
     Acknowledgements: Geoff DiMego, Mark Iredell and Stephen Lord (EMC)

  2. Contents
     • Design Principles: Modularity, Flexibility, Portability
       • An EMC-wide verification system is not worth doing unless it is modular, flexible, and portable. Most people will want to use such a system, and many will be able to contribute to its development.
     • Verification Statistics
       • Types, scope, etc.
     • Required diagnostic/verification scores
       • Deterministic, probabilistic, …
     • Sample Modularity Design
       • General scripts - inputs
       • Components
     • Required Display Capabilities

  3. Design Principles
     • Modularity. The verification system should be broken into modules with pre-defined interfaces so that different users can work on different parts of the code without affecting each other. This common shared stream of modules would be unified, and much of the code would be shared across EMC. The design of the verification system would be bought into by all EMC groups. User groups can make necessary changes for their special applications. The VSDB format is an example of this kind of interface, but the statistic-generating codes and the display codes could be more modular.
     • Flexibility. New kinds of verification scores need to be easily added to the verification system without having to ask an expert. We cannot readily anticipate ahead of time what these new kinds of scores will be. Modularity would allow users to modify specific parts without detailed knowledge of the rest of the software. The system should be adaptable to other modeling systems: ocean, land, cryosphere, space, single column, OSSEs, etc.
     • Portability. All codes should be able to run efficiently on the CCS or on the EMC workstations. GUI menu-driven software should be able to reach all the way back into the robotic tape archives if necessary.
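To illustrate the kind of pre-defined interface the modularity principle calls for, here is a minimal sketch (not part of the original design; names such as Metric and register_metric are hypothetical) of how a new score could be plugged in without touching the rest of the system:

```python
from abc import ABC, abstractmethod
import numpy as np

class Metric(ABC):
    """Pre-defined interface: every score implements the same two calls."""
    name: str

    @abstractmethod
    def partial(self, fcst: np.ndarray, truth: np.ndarray) -> dict:
        """Compute partial sums for one verification unit (one fcst/truth pair)."""

    @abstractmethod
    def final(self, partials: list) -> float:
        """Aggregate partial sums (e.g. over time or space) into the final statistic."""

_REGISTRY = {}

def register_metric(metric: Metric) -> None:
    """New scores are added by registration only; no other module changes."""
    _REGISTRY[metric.name] = metric

class RMSE(Metric):
    name = "rmse"
    def partial(self, fcst, truth):
        err = fcst - truth
        return {"sum_sq": float(np.sum(err ** 2)), "n": err.size}
    def final(self, partials):
        return float(np.sqrt(sum(p["sum_sq"] for p in partials) /
                             sum(p["n"] for p in partials)))

register_metric(RMSE())
```

The split into partial and final calls is what makes later aggregation in time or space possible without recomputing from the raw fields.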

  4. Verification Statistics
     • Major types of statistics
       • Diagnostics (depend on the forecast only)
       • Verification (depends on comparison of the forecast to an estimate of the truth)
     • Scope of statistics
       • Point-wise at a given time (e.g., absolute error)
       • Multivariate, defined over a set of variables
       • Expanded in space (e.g., PAC)
       • Expanded in time (e.g., temporal correlation)
       • Expanded over other variables
     • Choice of domain
       • Time
       • Single level
       • Multiple levels
       • Space
       • 3-D grid domain
     • Choice of variable(s)
       • Single
       • Multiple
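As a toy illustration of the scope distinction above (a hedged sketch with made-up arrays standing in for forecast and truth), a point-wise statistic is computed independently at each grid point and time, while a time-expanded statistic such as temporal correlation collapses the time dimension at each point:

```python
import numpy as np

rng = np.random.default_rng(0)
fcst = rng.normal(size=(10, 4, 5))    # (time, lat, lon) -- synthetic data
truth = fcst + rng.normal(scale=0.5, size=fcst.shape)

# Point-wise at a given time: one value per grid point per time
abs_err = np.abs(fcst - truth)                       # shape (10, 4, 5)

# Expanded in time: temporal correlation at each grid point
fa = fcst - fcst.mean(axis=0)
ta = truth - truth.mean(axis=0)
temporal_corr = (fa * ta).sum(axis=0) / np.sqrt(
    (fa ** 2).sum(axis=0) * (ta ** 2).sum(axis=0))   # shape (4, 5)
```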

  5. Verification Statistics
     • Types of verifying data – all statistics to be identically computed for both types
       • Observations
       • NWP analysis
     • Error specification for verifying data
       • Standard deviation
       • Probability distribution
     • Types of forecasts
       • Single forecast
       • Ensemble of forecasts
     • Forecast data format
       • Gridded (single point or lat/lon array)
       • Feature-based (e.g., position and intensity of a hurricane; generation of this could be considered part of the "forward model")
     • Forecast data type
       • Operational
       • Parallel
       • User-supplied experimental
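One way to carry these attributes through the system is a single metadata-bearing object per verification unit. The sketch below is purely illustrative; the field names are assumptions, not the actual VSDB schema or any EMC datatype:

```python
from dataclasses import dataclass
from typing import Literal, Optional
import numpy as np

@dataclass
class VerificationUnit:
    """One forecast/truth pairing plus the metadata needed to score it.
    Field names are illustrative, not an operational format."""
    variable: str                                   # e.g. "T2m", "APCP"
    lead_hours: int
    fcst_kind: Literal["single", "ensemble"]
    fcst_source: Literal["operational", "parallel", "experimental"]
    truth_kind: Literal["observation", "analysis"]
    data_format: Literal["gridded", "feature"]
    fcst: np.ndarray                                # gridded field or feature attributes
    truth: np.ndarray
    truth_error_std: Optional[float] = None         # optional error specification for the truth
```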

  6. Verification Statistics
     • Event definition for probabilistic scores
       • User-defined thresholds
       • Climatological percentiles (based on, e.g., a global/regional reanalysis)
       • Defined by ensemble members (e.g., Talagrand statistics)
     • Generation of probabilistic forecasts
       • Based on ensemble forecasts (with user-defined weights)
       • User-supplied pdf (based on statistical or other methods)
     • Benchmarks for skill-score-type statistics
       • Climatology
       • Persistence
       • Choice of another forecast system
     • Manipulation of partial statistics
       • Aggregate in time
       • Aggregate in space
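A minimal sketch of generating an event probability from an ensemble with user-defined (possibly unequal) member weights; the array layout and function name are assumptions for illustration, not an NCEP convention:

```python
import numpy as np
from typing import Optional

def event_probability(ens: np.ndarray, threshold: float,
                      weights: Optional[np.ndarray] = None) -> np.ndarray:
    """Probability that the verified quantity exceeds `threshold`,
    derived from an ensemble of shape (n_members, ...)."""
    n = ens.shape[0]
    w = np.full(n, 1.0 / n) if weights is None else weights / weights.sum()
    exceed = (ens > threshold).astype(float)        # 1 where a member exceeds the threshold
    return np.tensordot(w, exceed, axes=(0, 0))     # weighted average over members

# Example: 5-member ensemble at 3 grid points, event = "value > 0"
ens = np.array([[ 0.2, -0.1, 1.3],
                [ 0.5, -0.4, 0.9],
                [-0.3,  0.2, 1.1],
                [ 0.1, -0.2, 0.8],
                [ 0.4, -0.5, 1.0]])
print(event_probability(ens, 0.0))                  # -> [0.8 0.2 1.0]
```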

  7. Required diagnostic/verification scores
     • Single forecast
       • CFS
         • NINO 3.4 anomaly correlation
         • Bias-corrected US 2-meter temperature (AC, RMS)
         • Bias-corrected US precipitation (AC, RMS)
         • Weekly, monthly, seasonal, annual, interannual stats
       • GFS
         • Feature tracking
           • Hurricane tracks: raw track errors and errors relative to CLIPER; frequency of being the best; by storm and basin
           • Hurricane intensity
           • Extratropical storm statistics
         • Verification against observations
           • Support interpolation from both pressure levels and native model levels
           • Horizontal bias and error maps
           • Vertical bias and error by region
           • Time series of error fits
           • Fits by month and year
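The raw track error mentioned above is typically the great-circle distance between the forecast and best-track storm positions. A small self-contained sketch (haversine formula, made-up positions; not the operational tracker or CLIPER comparison):

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0

def track_error_km(lat_f, lon_f, lat_o, lon_o):
    """Great-circle distance (km) between forecast and observed storm centers."""
    phi_f, phi_o = np.radians(lat_f), np.radians(lat_o)
    dphi = phi_o - phi_f
    dlam = np.radians(lon_o - lon_f)
    a = np.sin(dphi / 2) ** 2 + np.cos(phi_f) * np.cos(phi_o) * np.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

# 48-h position error for one storm (hypothetical numbers)
print(round(track_error_km(25.3, -78.0, 25.9, -77.2), 1))   # roughly 100 km
```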

  8. Required diagnostic/verification scores
     • Single forecast
       • GFS (continued)
         • Verification against analyses
           • All fields in the master pressure GRIB file can be compared
           • All kinds of fields, including tracers
           • All kinds of levels, including iso-IPV
           • Single-field diagnostics (without a verifying field)
             • Mean, mode, median, range, variance
           • Masking capability - only over snow-covered areas, etc.
           • Region selection
           • Anomaly correlation
           • RMS error
           • FHO statistics by threshold
           • Count of differences and largest difference
           • Superanalysis verification
       • GDAS
         • All statistics segregated by instrument type
         • Observation counts and quality-mark counts
         • Guess fits to observations by instrument type
         • Bias correction statistics
         • Contributions to penalty
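FHO statistics reduce a field comparison to three fractions per threshold (forecast coverage, hit coverage, observed coverage), from which threshold skill scores can later be built. A hedged sketch of that reduction with synthetic precipitation fields (not the operational VSDB generator):

```python
import numpy as np

def fho(fcst: np.ndarray, anal: np.ndarray, threshold: float) -> dict:
    """F = fraction of points where the forecast exceeds the threshold,
    H = fraction where both forecast and analysis exceed it (hits),
    O = fraction where the analysis exceeds it."""
    f_ev = fcst >= threshold
    o_ev = anal >= threshold
    n = fcst.size
    return {"F": f_ev.sum() / n,
            "H": (f_ev & o_ev).sum() / n,
            "O": o_ev.sum() / n,
            "N": n}

# Example: 24-h precipitation fields (mm) against a 10 mm threshold
rng = np.random.default_rng(1)
fcst = rng.gamma(2.0, 4.0, size=(50, 50))
anal = rng.gamma(2.0, 4.0, size=(50, 50))
print(fho(fcst, anal, 10.0))
```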

  9. Required diagnostic/verification scores
     • Ensemble forecasts
       • Point-wise
         • Ensemble mean statistics
           • RMS error
           • PAC correlation
         • Spread
         • Best-member frequency statistics
       • Multivariate (for a particular spatial domain; cannot be aggregated in space)
         • Perturbation vs. Error Correlation Analysis (PECA)
         • Independent degrees of freedom (DOF)
         • Explained error variance
     • Probabilistic forecasts
       • Point-wise (computed point by point, then aggregated in space and time)
         • Brier Skill Score (incl. Reliability & Resolution components)
         • Ranked Probability Skill Score (incl. Reliability & Resolution components)
         • Continuous Ranked Probability Skill Score (incl. Reliability & Resolution components)
         • Relative Operating Characteristics (ROC)
         • Relative Economic Value
         • Information content
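As one concrete instance of the probabilistic scores listed, a minimal Brier score with its standard binned reliability/resolution/uncertainty decomposition (illustrative only; not the EMC implementation, and bin count is an arbitrary choice here):

```python
import numpy as np

def brier_decomposition(prob: np.ndarray, obs: np.ndarray, n_bins: int = 10) -> dict:
    """Brier score and its Murphy decomposition, BS ≈ REL - RES + UNC.
    prob: forecast probabilities in [0, 1]; obs: binary outcomes (0/1)."""
    bs = float(np.mean((prob - obs) ** 2))
    obar = obs.mean()                               # climatological event frequency
    unc = obar * (1.0 - obar)
    bins = np.clip((prob * n_bins).astype(int), 0, n_bins - 1)
    rel = res = 0.0
    for k in range(n_bins):
        sel = bins == k
        if not sel.any():
            continue
        nk = sel.sum()
        pk = prob[sel].mean()                       # mean forecast probability in the bin
        ok = obs[sel].mean()                        # observed frequency in the bin
        rel += nk * (pk - ok) ** 2
        res += nk * (ok - obar) ** 2
    rel /= prob.size
    res /= prob.size
    return {"BS": bs, "REL": rel, "RES": res, "UNC": unc,
            "BSS": 1.0 - bs / unc if unc > 0 else np.nan}

# Synthetic, perfectly reliable forecasts: REL should be near zero
rng = np.random.default_rng(3)
p = rng.uniform(size=5000)
o = (rng.uniform(size=5000) < p).astype(float)
print(brier_decomposition(p, o))
```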

  10. Sample Modularity Design
     • Input
       • Agree on a list of things that need to be provided to define a verification (see the major points above), and on how all this information would be passed between the modules. It would be nice to have a common form, but we may need different templates for unique applications (verifying gridded data; hurricane tracks; ensembles?). Example:
       • Verification statistics: Specify
         • Predefined
         • User provided?
       • Volume
         • Space - specify a 3-D box
         • Time - specify period, lead time, time frequency
       • Forecast: Choose from
         • Operational
         • Parallel
         • Experimental (define location)
       • Truth: Choose from
         • Analysis
         • Observation types
       • Variable: Choose one or more [in the case of a hurricane track, for example, this would not be relevant]
       • Output format: Choose one or both
         • Table (select from pre-designed choices)
         • Graph (select from pre-designed choices)
       • Output display: Choose
         • Default, corresponding to the pre-designed configuration choices above
         • Interactive manipulation of display format, choices, etc.
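A concrete, purely hypothetical rendering of such an input template as a single configuration object might look like the following; the keys mirror the choices listed above rather than any actual EMC interface:

```python
# Hypothetical verification-request template; keys mirror the slide, not an EMC format.
verification_request = {
    "statistics": ["rmse", "anomaly_correlation"],    # predefined or user provided
    "volume": {
        "space": {"lat": (20.0, 60.0), "lon": (230.0, 300.0), "levels_hPa": [500]},
        "time": {"period": ("2007-01-01", "2007-03-31"),
                 "lead_hours": [24, 48, 72, 96, 120],
                 "frequency_hours": 12},
    },
    "forecast": {"source": "operational"},            # operational | parallel | experimental
    "truth": {"kind": "analysis"},                     # analysis | observation types
    "variables": ["HGT"],                              # one or more
    "output": {"format": ["table", "graph"], "display": "default"},
}
```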

  11. Sample Modularity Design
     • "Driver" component
       • Runs all modules as needed. Some functions:
         • Checks whether the requested statistic has already been archived or can be computed from intermediate statistics in the database
         • Prepares the main script that calls subscripts for the data preprocessor, verification engine, and database steps for each verification unit (i.e., one comparison of forecast and truth), if missing statistics need to be computed
         • Prepares the post-processing script for computation of the final verification statistics
         • Prepares and calls the display script
     • Input data component
       • Reads whatever format the raw forecasts and observations come in and puts them into objects containing data and metadata; prepares corresponding climatological information if needed; prepares any other data needed (e.g., event definitions for probabilistic forecasting?)
       • Read GRIB or BUFR
       • Requires an input object datatype
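A hedged, toy-sized sketch of the driver logic (reuse archived partial statistics, compute only what is missing). All functions and the in-memory "archive" are stand-ins for the real preprocessor, verification engine, database, and script generation:

```python
import numpy as np

archive = {}                                   # stand-in for the stat database / VSDB archive

def preprocess(unit):
    """Input data component stand-in: returns synthetic fcst/truth fields."""
    rng = np.random.default_rng(unit)
    truth = rng.normal(size=(4, 5))
    return truth + rng.normal(scale=0.3, size=truth.shape), truth

def verification_engine(fcst, truth):
    """Partial statistics for one verification unit (sums aggregated later)."""
    err = fcst - truth
    return {"sum_sq": float((err ** 2).sum()), "n": err.size}

def postprocess(partials):
    """Final statistic from accumulated partial sums (here: RMSE)."""
    return float(np.sqrt(sum(p["sum_sq"] for p in partials) /
                         sum(p["n"] for p in partials)))

def run_verification(units):
    """Driver: check the archive first, compute missing stats, then finalize."""
    for unit in units:
        if unit not in archive:                # requested stat not yet archived
            fcst, truth = preprocess(unit)
            archive[unit] = verification_engine(fcst, truth)
    return postprocess([archive[u] for u in units])

print(run_verification(units=range(10)))       # ~0.3 with the synthetic data above
```

A second call over an overlapping set of units would reuse the archived partial sums, which is the behavior the driver description above asks for.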

  12. Sample Modularity Design
     • Forward model component
       • Puts input objects into the requested verification space (grid or observation location)
       • Interpolate to the verification grid and time
       • Compute derived quantities (unit changes, vorticity, radiances, etc.)
       • Compute anomalies, trends, indices, bias corrections
       • Compute probabilistic ensemble quantities??
       • Apply any filters and averaging
       • Feature tracking
     • Partial statistics component
       • Computes statistics in the verification space
       • Partial sums
       • Threshold stats
       • Probabilistic stats
       • Output VSDB (MySQL)
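A minimal sketch of the forward-model step that maps a gridded forecast to observation locations before partial statistics are formed. SciPy's RegularGridInterpolator is just one possible tool, and the grid, station locations, and observation values below are synthetic:

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Synthetic forecast field on a regular lat/lon grid
lats = np.linspace(20.0, 60.0, 41)
lons = np.linspace(230.0, 300.0, 71)
field = np.add.outer(lats, np.zeros_like(lons)) * 0.1   # toy field varying with latitude

# Forward model: interpolate the gridded forecast to the observation locations
interp = RegularGridInterpolator((lats, lons), field)
obs_lat = np.array([25.4, 38.9, 47.6])
obs_lon = np.array([279.9, 282.9, 237.7])
fcst_at_obs = interp(np.column_stack([obs_lat, obs_lon]))

# Partial statistics in observation space (sums that can be aggregated later)
obs_val = np.array([2.3, 4.1, 4.9])                     # made-up verifying observations
err = fcst_at_obs - obs_val
partial = {"sum": float(err.sum()), "sum_sq": float((err ** 2).sum()), "n": err.size}
print(partial)
```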

  13. Sample Modularity Design
     • Final statistics component (i.e., FVS)
       • Full sums
       • Interactively selectable
       • Output VSDB-like?
       • Can a commercial package do this?
     • Database
     • Display
       • Menu driven
       • Web resident
       • Reaches all the way back to the input step if necessary
       • Plots output directly from the "driver" component step if requested
       • Commercial package like IDL?
     • Output format
       • Table
       • Graphics

  14. Required Display Capability
     • User-interactive statistic selection
     • User-interactive display options
     • GUI browser interface
     • Same interface whether operational or own experiment
     • Professional output

  15. BACKGROUND

  16. General Issues
     • Event definition - must be able to define events in three different ways:
       • Based on a reanalysis (global, regional, or other) climatology. Example definition of an event: falling between the 20th and 30th percentiles of the climatological distribution (or falling below or above a certain percentile). The actual range of values is to be derived automatically from the climatological distribution.
       • Defined by the user. Example: temperature in the range between 2 and 4 C. The corresponding climatological percentile values are to be determined automatically by consulting the climatological distribution of the variable.
       • Based on the ensemble distribution (if verifying ensemble-based probabilities). Example: the range between the 3rd and 4th ensemble members. Climatological percentile values to be determined as above.
     • Generation of probabilistic forecasts:
       • Based on an ensemble of forecasts; the user should be able to specify unequal weights for the various members
       • Other methods - user-supplied cumulative or probability density functions
     • Reference scores for computing skill scores and similar measures - where appropriate, the user must be able to select from three alternate references:
       • Climatological forecast
       • Persistence forecast
       • User-specified probabilistic forecast (other than the system being tested)
     • Verifying data - the user should be able to compute scores against either analyses or observations
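A hedged sketch of the first two event-definition routes (the "climatology" here is a synthetic sample, not a reanalysis): deriving a value range from climatological percentiles, and deriving percentiles from user-specified values:

```python
import numpy as np

rng = np.random.default_rng(2)
climatology = rng.normal(loc=10.0, scale=5.0, size=10_000)   # stand-in for a reanalysis climatology

# Route 1: event defined by climatological percentiles -> derive the value range
lo_val, hi_val = np.percentile(climatology, [20, 30])
print(f"20th-30th percentile event corresponds to values {lo_val:.1f} to {hi_val:.1f}")

# Route 2: event defined by user values -> derive the climatological percentiles
def value_to_percentile(x, clim):
    """Empirical climatological percentile of a user-specified value."""
    return 100.0 * np.mean(clim <= x)

print(f"2-4 C event spans percentiles "
      f"{value_to_percentile(2.0, climatology):.1f} to "
      f"{value_to_percentile(4.0, climatology):.1f}")
```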
