
Fair scores for ensemble forecasts Chris Ferro University of Exeter


Presentation Transcript


  1. Fair scores for ensemble forecasts Chris Ferro University of Exeter 13th EMS Annual Meeting and 11th ECAM (10 September 2013, Reading, UK)

  2. Evaluating ensemble forecasts Multiple predictions, e.g. model simulations from several initial conditions. Want scores that favour ensembles whose members behave as if they and the observation are drawn from the same probability distribution.

  3. Current practice is unfair Current practice evaluates a proper scoring rule for the empirical distribution function of the ensemble. A scoring rule, s(p,y), for a probability forecast, p, and an observation, y, is proper if (for all p) the expected score, Ey{s(p,y)}, is optimized when y has distribution p. Proper scoring rules favour probability forecasts that behave as if the observations are randomly sampled from the forecast distributions.

  4. Examples of proper scoring rules Brier score: s(p,y) = (p – y)2 for observation y = 0 or 1, and probability forecast 0 ≤ p ≤ 1. Ensemble Brier score: s(x,y) = (i/n – y)2 where i of the n ensemble members predict the event {y = 1}. CRPS: s(p,y) = ∫ {p(t) – H(t – y)}2 dt for real y and forecast p(t) = Pr(y ≤ t), where H is the Heaviside function. Ensemble CRPS: s(x,y) = ∫ {i(t)/n – H(t – y)}2 dt, where i(t) of the n ensemble members predict the event {y ≤ t}.
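These ensemble scores can be evaluated directly; a minimal Python sketch (function names are illustrative, and the ensemble CRPS is computed via its standard kernel form, which equals the integral definition):

```python
import numpy as np

def ensemble_brier(x, y, threshold):
    """Ensemble Brier score (i/n - y)^2, where i counts the members
    forecasting the event {y = 1} (here: exceeding `threshold`)."""
    x = np.asarray(x, dtype=float)
    i = np.sum(x > threshold)
    return (i / x.size - y) ** 2

def ensemble_crps(x, y):
    """Ensemble CRPS, kernel form equivalent to the integral of
    {i(t)/n - H(t - y)}^2:
    mean_i |x_i - y|  -  sum_{i,j} |x_i - x_j| / (2 n^2)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    term1 = np.mean(np.abs(x - y))
    term2 = np.abs(x[:, None] - x[None, :]).sum() / (2 * n * n)
    return term1 - term2

# Two-member ensemble {0, 1}, observation 0.5: the absolute-error term
# is 0.5 and the spread term is 2/(2*4) = 0.25, so the score is 0.25.
print(ensemble_crps([0.0, 1.0], 0.5))  # 0.25
```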

  5. Example: ensemble CRPS Observations y ~ N(0,1) and n ensemble members xi ~ N(0,σ2) for i = 1, ..., n. Plot of the expected ensemble CRPS against σ for n = 2, 4 and 8. The ensemble CRPS is optimized when the ensemble is underdispersed (σ < 1).
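The underdispersion effect on this slide can be reproduced by simulation; a sketch for n = 2, using the kernel form of the ensemble CRPS (the seed, sample size and σ values are arbitrary choices):

```python
import numpy as np

def ensemble_crps(x, y):
    """Ensemble CRPS, kernel form:
    mean_i |x_i - y| - sum_{i,j} |x_i - x_j| / (2 n^2).
    x has shape (trials, n); y has shape (trials,)."""
    n = x.shape[-1]
    term1 = np.mean(np.abs(x - y[:, None]), axis=1)
    term2 = np.abs(x[:, :, None] - x[:, None, :]).sum(axis=(1, 2)) / (2 * n * n)
    return term1 - term2

rng = np.random.default_rng(0)
trials, n = 100_000, 2
y = rng.standard_normal(trials)                    # observations ~ N(0,1)
avg = {s: ensemble_crps(s * rng.standard_normal((trials, n)), y).mean()
       for s in (0.4, 1.0)}                        # members ~ N(0, s^2)
# The underdispersed ensemble (sigma = 0.4) receives the better
# (lower) average score than the well-dispersed one (sigma = 1).
print(avg)
```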

  6. Fair scoring rules for ensembles Interpret the ensemble as a random sample. Fair scoring rules favour ensembles that behave as if the observations are sampled from the same distribution. A scoring rule, s(x,y), for an ensemble forecast, x, sampled from p, and an observation, y, is fair if (for all p) the expected score, Ex,y{s(x,y)}, is optimized when y ~ p. Fricker, Ferro, Stephenson (2013) Three recommendations for evaluating climate predictions. Meteorological Applications, 20, 246-255 (open access)

  7. Characterization: binary case Let y = 1 if an event occurs, and let y = 0 otherwise. Let si,y be the (finite) score when i of n ensemble members forecast the event and the observation is y. The (negatively oriented) score is fair if (n – i)(si+1,0 – si,0) = i(si-1,1 – si,1) for i = 0, 1, ..., n and si+1,0 ≥ si,0 for i = 0, 1, ..., n – 1. Ferro (2013) Fair scores for ensemble forecasts. Submitted.
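This characterization can be checked exactly for the fair Brier score of the following slide; a sketch in exact rational arithmetic (function names are illustrative, and out-of-range boundary terms are treated as carrying a zero coefficient):

```python
from fractions import Fraction

def fair_brier(i, y, n):
    """Fair Brier score s_{i,y} = (i/n - y)^2 - i(n - i)/{n^2 (n - 1)},
    for n >= 2 ensemble members."""
    i, n = Fraction(i), Fraction(n)
    return (i / n - y) ** 2 - i * (n - i) / (n ** 2 * (n - 1))

def is_fair(s, n):
    """Check (n - i)(s_{i+1,0} - s_{i,0}) = i(s_{i-1,1} - s_{i,1})
    for i = 0, 1, ..., n (boundary terms have zero coefficient),
    plus monotonicity s_{i+1,0} >= s_{i,0}."""
    for i in range(n + 1):
        lhs = (n - i) * (s(i + 1, 0, n) - s(i, 0, n)) if i < n else 0
        rhs = i * (s(i - 1, 1, n) - s(i, 1, n)) if i > 0 else 0
        if lhs != rhs:
            return False
    return all(s(i + 1, 0, n) >= s(i, 0, n) for i in range(n))

print(all(is_fair(fair_brier, n) for n in range(2, 10)))       # True
print(is_fair(lambda i, y, n: (Fraction(i) / n - y) ** 2, 4))  # False
```

The final line confirms that the plain ensemble Brier score, without the correction term, fails the condition.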

  8. Examples of fair scoring rules Ensemble Brier score: s(x,y) = (i/n – y)2 where i of the n ensemble members predict the event {y = 1}. Fair Brier score: s(x,y) = (i/n – y)2 – i(n – i)/{n2(n – 1)}. Ensemble CRPS: s(x,y) = ∫ {i(t)/n – H(t – y)}2 dt, where i(t) of the n ensemble members predict the event {y ≤ t}. Fair CRPS: if (x1, ..., xn) are the n ensemble members, s(x,y) = (1/n) Σi |xi – y| – {1/(2n(n – 1))} Σi Σj |xi – xj|.
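Both fair scores are small closed-form adjustments of their unfair counterparts; a minimal sketch (function names are illustrative; the fair CRPS simply divides the spread term by 2n(n – 1) instead of 2n2):

```python
import numpy as np

def fair_brier(x, y, threshold):
    """Fair Brier score (i/n - y)^2 - i(n - i)/{n^2 (n - 1)},
    where i counts members exceeding `threshold`; needs n >= 2."""
    x = np.asarray(x, dtype=float)
    n = x.size
    i = np.sum(x > threshold)
    return (i / n - y) ** 2 - i * (n - i) / (n ** 2 * (n - 1))

def fair_crps(x, y):
    """Fair CRPS: mean_i |x_i - y| - sum_{i,j} |x_i - x_j| / (2 n (n - 1)).
    Identical to the ensemble CRPS except for the n(n - 1) divisor."""
    x = np.asarray(x, dtype=float)
    n = x.size
    term1 = np.mean(np.abs(x - y))
    term2 = np.abs(x[:, None] - x[None, :]).sum() / (2 * n * (n - 1))
    return term1 - term2

# For the two-member ensemble {0, 1} and observation 0.5, the larger
# spread term exactly cancels the absolute-error term.
print(fair_crps([0.0, 1.0], 0.5))  # 0.0
```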

  9. Example: ensemble CRPS Observations y ~ N(0,1) and n ensemble members xi ~ N(0,σ2) for i = 1, ..., n. Plot of the expected fair CRPS against σ for n = 2, 4 and 8, with the unfair ensemble CRPS for comparison. The fair CRPS is always optimized when the ensemble is well dispersed (σ = 1), for all n.
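Repeating the earlier simulation with the fair score illustrates the contrast; a sketch for n = 2 (seed, sample size and σ grid are arbitrary choices):

```python
import numpy as np

def fair_crps(x, y):
    """Fair CRPS: mean_i |x_i - y| - sum_{i,j} |x_i - x_j| / (2 n (n - 1)).
    x has shape (trials, n); y has shape (trials,)."""
    n = x.shape[-1]
    term1 = np.mean(np.abs(x - y[:, None]), axis=1)
    term2 = (np.abs(x[:, :, None] - x[:, None, :]).sum(axis=(1, 2))
             / (2 * n * (n - 1)))
    return term1 - term2

rng = np.random.default_rng(0)
trials, n = 200_000, 2
y = rng.standard_normal(trials)                    # observations ~ N(0,1)
avg = {s: fair_crps(s * rng.standard_normal((trials, n)), y).mean()
       for s in (0.5, 1.0, 1.5)}                   # members ~ N(0, s^2)
# The well-dispersed ensemble (sigma = 1) receives the lowest
# average fair score, unlike under the unfair ensemble CRPS.
print(avg)
```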

  10. Summary Evaluate ensemble forecasts (not only probability forecasts) to learn about ensemble prediction systems. Use fair scoring rules to favour ensembles whose members behave as if they and the observation are drawn from the same probability distribution. Unfair scoring rules will favour ensembles whose members are drawn from mis-calibrated distributions.

  11. Dependent ensemble members A scoring rule, s(x,y), for an exchangeable ensemble, x, with marginal distribution p, and an observation, y, is fair if (for all p) the expected score is optimized when y ~ p. Fair scores exist only for some dependence structures. We rarely know the ‘correct’ dependence structure for an ensemble, and using an estimate sacrifices fairness. Use scores that are fair for those dependence structures that may be adopted when using the ensemble.
