
New developments and issues in forecast verification


Presentation Transcript


  1. New developments and issues in forecast verification Barbara Brown bgb@ucar.edu Co-authors and contributors: Randy Bullock, John Halley Gotway, Chris Davis, David Ahijevych, Eric Gilleland, Lacey Holland NCAR Boulder, Colorado October 2007

  2. Issues • Uncertainty in verification statistics • Diagnostic and user relevant verification • Verification of high-resolution forecasts • Spatial forecast verification • Incorporation of observational uncertainty • Verification of probabilistic and ensemble forecasts • Verification of extremes • Properties of verification measures • Propriety, Equitability

  3. Issues and new developments • Uncertainty in verification statistics • Diagnostic and user relevant verification • Verification of high-resolution forecasts • Spatial forecast verification • Incorporation of observational uncertainty • Verification of probabilistic and ensemble forecasts • Verification of extremes • Properties of verification measures • Propriety, Equitability

  4. Uncertainty in verification measures. Model precipitation example: Equitable Threat Score (ETS). Confidence intervals take into account various sources of error, including sampling and observational error. Computation of confidence intervals for verification statistics is not always straightforward.
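As an illustration (not from the talk), here is a minimal sketch of how an ETS and a simple percentile-bootstrap confidence interval might be computed; the synthetic data, the 1 mm event threshold, and the resampling choices are all assumptions made for the example.

```python
# Illustrative sketch: Equitable Threat Score (ETS) for a binary precipitation
# forecast, with a simple percentile-bootstrap confidence interval.
import numpy as np

def ets(fcst_event, obs_event):
    """ETS from paired boolean arrays of forecast/observed event occurrence."""
    hits = np.sum(fcst_event & obs_event)
    false_alarms = np.sum(fcst_event & ~obs_event)
    misses = np.sum(~fcst_event & obs_event)
    total = fcst_event.size
    hits_random = (hits + misses) * (hits + false_alarms) / total
    return (hits - hits_random) / (hits + misses + false_alarms - hits_random)

rng = np.random.default_rng(0)
fcst = rng.gamma(2.0, 2.0, size=5000)          # hypothetical forecast rain (mm)
obs = fcst + rng.normal(0.0, 2.0, size=5000)   # hypothetical "observed" rain (mm)
f_evt, o_evt = fcst > 1.0, obs > 1.0           # event = rain exceeding 1 mm

# Percentile bootstrap over grid points. This ignores spatial correlation,
# which is one reason real confidence intervals are not straightforward.
scores = []
for _ in range(1000):
    idx = rng.integers(0, fcst.size, fcst.size)
    scores.append(ets(f_evt[idx], o_evt[idx]))
lo, hi = np.percentile(scores, [2.5, 97.5])
print(f"ETS = {ets(f_evt, o_evt):.3f}  (95% CI: {lo:.3f} to {hi:.3f})")
```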

  5. User-relevant verification: Good forecast or bad forecast? (Figure: forecast area F and observed area O.)

  6. User-relevant verification: Good forecast or bad forecast? (Figure: forecast area F and observed area O.) If I’m a water manager for this watershed, it’s a pretty bad forecast…

  7. User-relevant verification: Good forecast or bad forecast? (Figure: forecast area F and observed area O, with a flight route from A to B.) If I’m an aviation traffic strategic planner… it might be a pretty good forecast. Different users have different ideas about what makes a good forecast.

  8. Diagnostic and user relevant forecast evaluation approaches • Provide the link between weather forecasting and forecast value • Identify and evaluate attributes of the forecasts that are meaningful for particular users • Users could be managers, forecast developers, forecasters, decision makers • Answer questions about forecast performance in the context of users’ decisions • Example questions: How do model changes impact user-relevant variables? What is the typical location error of a thunderstorm? Size of a temperature error? Timing error? Lead time?

  9. Diagnostic and user relevant forecast evaluation approaches (cont.) • Provide more detailed information about forecast quality • What went wrong? What went right? • How can the forecast be improved? • How do 2 forecasts differ from each other, and in what ways is one better than the other?

  10. Which rain forecast is better? (Figure, from E. Ebert: observed 24-h rain near Sydney, 21 Mar 2004, compared with a mesoscale model at 5 km, RMS = 13.0, and a global model at 100 km, RMS = 4.6.) High vs. low resolution: “smooth” forecasts generally “win” according to traditional verification approaches.
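The point can be reproduced numerically. The sketch below, using invented synthetic fields, shows a blurred forecast scoring a lower RMS error than a realistic but displaced one.

```python
# Quick numerical illustration: a smooth forecast can beat a detailed but
# displaced forecast on RMS error even if the detailed one looks more realistic.
import numpy as np
from scipy.ndimage import uniform_filter

rng = np.random.default_rng(6)
obs = rng.gamma(0.5, 4.0, size=(100, 100))     # patchy "observed" rain field

detailed = np.roll(obs, 8, axis=1)             # realistic structure, displaced 8 points
smooth = uniform_filter(obs, size=25)          # heavily blurred, low-amplitude field

rms = lambda f: np.sqrt(np.mean((f - obs) ** 2))
print(f"RMS, detailed but displaced: {rms(detailed):.1f}")
print(f"RMS, smooth:                 {rms(smooth):.1f}")   # usually the smaller of the two
```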

  11. Traditional “measures”-based approaches. Consider forecasts and observations of some dichotomous field on a grid. Some problems with this approach: (1) Non-diagnostic – it doesn’t tell us what was wrong with the forecast, or what was right. (2) Ultra-sensitive to small errors in the simulation of localized phenomena (in the slide’s five example forecasts, CSI = 0 for the first four and CSI > 0 for the fifth).
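A small sketch of the second problem, using made-up grids: a feature forecast a few grid points away from where it was observed gets the same CSI as forecasting nothing at all.

```python
# The "double penalty" problem: a small feature forecast slightly displaced
# scores CSI = 0, indistinguishable from a forecast with no event at all.
import numpy as np

def csi(fcst, obs):
    hits = np.sum(fcst & obs)
    false_alarms = np.sum(fcst & ~obs)
    misses = np.sum(~fcst & obs)
    denom = hits + false_alarms + misses
    return hits / denom if denom else float("nan")

obs = np.zeros((50, 50), dtype=bool)
obs[20:25, 20:25] = True                       # observed 5x5 rain feature

displaced = np.roll(obs, shift=10, axis=1)     # same feature, shifted 10 points east
empty = np.zeros_like(obs)                     # no event forecast anywhere

print("CSI, displaced feature:", csi(displaced, obs))   # 0.0
print("CSI, no forecast      :", csi(empty, obs))       # 0.0 as well
```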

  12. Spatial forecasts. Weather variables defined over spatial domains have coherent structure and features. Spatial verification techniques aim to: • account for uncertainties in timing and location • account for field spatial structure • provide information on error in physical terms • provide information that is diagnostic and meaningful to forecast users

  13. Recent research on spatial verification methods • Neighborhood verification methods – give credit to “close” forecasts • Scale decomposition methods – measure scale-dependent error • Object- and feature-based methods – evaluate attributes of identifiable features • Field verification approaches – measure distortion and displacement (phase error) for the whole field

  14. Neighborhood verification • Also called “fuzzy” verification • Upscaling: put observations and/or forecast on a coarser grid, then calculate traditional metrics
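A rough sketch of upscaling, assuming gridded fields whose dimensions divide evenly by the block size; the synthetic field values, event threshold, and block sizes are illustrative only.

```python
# Upscaling: average forecast and observation onto a coarser grid, then
# compute a traditional score (here CSI) at each block size.
import numpy as np

def upscale(field, block):
    """Average a 2-D field over non-overlapping block x block windows."""
    ny, nx = field.shape
    return field.reshape(ny // block, block, nx // block, block).mean(axis=(1, 3))

rng = np.random.default_rng(1)
fcst = rng.gamma(2.0, 1.0, size=(64, 64))
obs = np.roll(fcst, 3, axis=0) + rng.normal(0, 0.5, size=(64, 64))   # displaced + noisy

for block in (1, 4, 8, 16):
    f_c, o_c = upscale(fcst, block), upscale(obs, block)
    f_evt, o_evt = f_c > 2.0, o_c > 2.0
    union = np.sum(f_evt | o_evt)
    csi = np.sum(f_evt & o_evt) / union if union else float("nan")
    print(f"block = {block:2d}  CSI = {csi:.2f}")
```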

  15. Neighborhood verification. Treatment of forecast data within a window: • Mean value (upscaling) • Occurrence of event in window • Frequency of event in window → probability • Distribution of values within window • Fractions skill score (Roberts 2005; Roberts and Lean 2007). (Figure: observed and forecast fields with neighborhood windows.) Ebert (2007, Meteorological Applications) provides a review and synthesis of these approaches.
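For example, the fractions skill score can be sketched roughly as below, following Roberts and Lean (2007) in spirit; the uniform-filter neighborhood, synthetic fields, and thresholds are assumptions, not the authors' code.

```python
# Fractions skill score (FSS): compare neighborhood event fractions rather
# than point matches. Score improves as the neighborhood window grows.
import numpy as np
from scipy.ndimage import uniform_filter

def fss(fcst, obs, threshold, window):
    """FSS for one intensity threshold and square neighborhood size."""
    pf = uniform_filter((fcst > threshold).astype(float), size=window)
    po = uniform_filter((obs > threshold).astype(float), size=window)
    num = np.mean((pf - po) ** 2)
    den = np.mean(pf ** 2) + np.mean(po ** 2)
    return 1.0 - num / den if den > 0 else float("nan")

rng = np.random.default_rng(2)
obs = rng.gamma(2.0, 1.0, size=(128, 128))
fcst = np.roll(obs, 5, axis=1)        # a "perfect" field, displaced by 5 grid points

for window in (1, 5, 11, 21):
    print(f"window = {window:2d}  FSS = {fss(fcst, obs, threshold=4.0, window=window):.2f}")
```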

  16. Scale decomposition • Wavelet component analysis (Briggs and Levine, 1997; Casati et al., 2004) • Removes noise • Examines how different scales contribute to traditional scores • Does the forecast power spectrum match the observed power spectrum?

  17. Scale decomposition • Casati et al. (2004) intensity-scale approach • Wavelets applied to the binary image • Traditional score as a function of intensity threshold and scale. (Figure: skill as a function of spatial scale, 5 to 640 km, and intensity threshold, 1/16 to 32 mm/h, for an intense displaced storm; example slice at threshold = 1 mm/h.)
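A simplified illustration of the intensity-scale idea: decompose the binary error field into Haar-like scale components by differencing successive block averages, then examine how much squared error lives at each scale. This follows the spirit of Casati et al. (2004) but is not their exact algorithm, and the fields are synthetic.

```python
# Haar-like multiresolution decomposition of the binary error field.
# The detail components at the dyadic scales are orthogonal, so their
# mean-square contributions (plus the domain-mean term) sum to the total MSE.
import numpy as np

def block_mean(field, block):
    """Block-average a 2-D field and expand back to full resolution."""
    ny, nx = field.shape
    coarse = field.reshape(ny // block, block, nx // block, block).mean(axis=(1, 3))
    return np.kron(coarse, np.ones((block, block)))

rng = np.random.default_rng(3)
obs = (rng.gamma(2.0, 1.0, size=(64, 64)) > 4.0).astype(float)   # binary observed events
fcst = np.roll(obs, 8, axis=0)                                    # displaced binary forecast
err = fcst - obs

prev = err
for block in (2, 4, 8, 16, 32, 64):
    smoothed = block_mean(err, block)
    component = prev - smoothed        # detail between two successive scales
    print(f"scale ~{block:2d} grid lengths: MSE contribution {np.mean(component**2):.4f}")
    prev = smoothed
```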

  18. Feature-based verification • Composite approach (Nachamkin) • Contiguous rain area approach (CRA; Ebert and McBride, 2000; Gallus and others) • Error components • displacement • volume • pattern

  19. Feature- or object-based verification • Baldwin object-based approach • Cluster analysis (Marzban and Sandgathe) • SAL approach for watersheds • Method for Object-based Diagnostic Evaluation (MODE) • Others…

  20. MODE object definition. Two parameters used to identify objects: • Convolution radius • Precipitation threshold. Raw values are “restored” to the objects, to allow evaluation of precipitation amount distributions and other characteristics. (Figure: raw field and the objects derived from it.)
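A minimal sketch of the convolve-then-threshold idea, not the MET implementation: a circular kernel smooths the raw field, the smoothed field is thresholded, and raw values are restored inside the resulting objects. The kernel shape, synthetic field, radius, and threshold are assumptions for the example.

```python
# Convolution-radius + threshold object definition, in the spirit of MODE.
import numpy as np
from scipy import ndimage

def define_objects(raw, radius, threshold):
    """Return labeled objects and the raw values restored inside them."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    disk = (x**2 + y**2) <= radius**2                       # circular convolution kernel
    smoothed = ndimage.convolve(raw, disk / disk.sum(), mode="constant")
    mask = smoothed >= threshold                            # threshold the smoothed field
    labels, n_objects = ndimage.label(mask)                 # connected regions = objects
    restored = np.where(mask, raw, 0.0)                     # raw values kept inside objects
    return labels, n_objects, restored

rng = np.random.default_rng(4)
raw = rng.gamma(0.3, 2.0, size=(100, 100))                  # spotty precipitation-like field
labels, n, restored = define_objects(raw, radius=5, threshold=0.8)
print(f"{n} objects identified")
```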

  21. Object merging and matching • Definitions – Merging: associating objects in the same field; Matching: associating objects between fields • Fuzzy logic approach • Attributes – used for matching, merging, and evaluation. Example single attributes: location, size (area), orientation angle, intensity (0.10, 0.25, 0.50, 0.75, 0.90 quantiles). Example paired attributes: centroid/boundary distance, size ratio, angle difference, intensity differences.
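An illustrative sketch of the fuzzy-logic matching idea: each paired attribute is mapped to an interest value in [0, 1], and the values are combined with weights into a total interest score. The interest functions, weights, and match threshold below are invented for the example; MODE's actual functions and defaults are set in its configuration.

```python
# Fuzzy-logic combination of paired object attributes into a total interest.
import numpy as np

def interest_distance(dist_gridpts, zero_interest_at=200.0):
    """Interest decays linearly with centroid distance (made-up scale)."""
    return max(0.0, 1.0 - dist_gridpts / zero_interest_at)

def interest_area_ratio(area_f, area_o):
    """Interest is the ratio of the smaller to the larger object area."""
    return min(area_f, area_o) / max(area_f, area_o)

def interest_angle(diff_deg):
    """Interest decays linearly with orientation-angle difference."""
    return max(0.0, 1.0 - abs(diff_deg) / 90.0)

def total_interest(pair, weights=(2.0, 1.0, 1.0)):
    values = (interest_distance(pair["centroid_dist"]),
              interest_area_ratio(pair["area_fcst"], pair["area_obs"]),
              interest_angle(pair["angle_diff"]))
    return np.average(values, weights=weights)

pair = {"centroid_dist": 40.0, "area_fcst": 350.0, "area_obs": 300.0, "angle_diff": 15.0}
score = total_interest(pair)
print(f"total interest = {score:.2f} -> {'match' if score >= 0.7 else 'no match'}")
```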

  22. Object-based example: 1 June 2005. WRF ARW (24-h) forecast vs. Stage II analysis. Radius = 15 grid squares, threshold = 0.05”.

  23. Object-based example, 1 June 2006 • Area ratios: (1) 1.3, (2) 1.2, (3) 1.1 ⇒ all forecast areas were somewhat too large • Location errors: (1) too far west, (2) too far south, (3) too far north. (Figure: WRF ARW-2 objects with Stage II objects overlaid; objects numbered 1–3.)

  24. Object-based example, 1 June 2006 • Ratio of median intensities in objects: (1) 1.3, (2) 0.7, (3) 1.9 • Ratio of 0.90 quantiles of intensities in objects: (1) 1.8, (2) 2.9, (3) 1.1 ⇒ all WRF 0.90-quantile intensities were too large; 2 of 3 median intensity values were too large. (Figure: WRF ARW-2 objects with Stage II objects overlaid; objects numbered 1–3.)

  25. Object-based example, 1 June 2006 • MODE provides information about areas, displacement, intensity, etc. • In contrast, the traditional scores for this case are POD = 0.40, FAR = 0.56, CSI = 0.27. (Figure: WRF ARW-2 objects with Stage II objects overlaid.)

  26. Applications of MODE • Climatological summaries of object characteristics • Evaluation of individual forecasting systems • Systematic errors • Matching capabilities (overall skill measure) • Model diagnostics • User-relevant information • Performance as a function of scale • Comparison of forecasting systems • As above

  27. Example summary statistics 22-km WRF forecasts from 2001-2002

  28. Example summary statistics

  29. Example summary statistics • MODE “Rose Plots” • Displacement of matched forecast objects

  30. Verification “Quilts” • Forecast performance attributes as a function of spatial scale • Can be created for almost any attribute or statistic • Provides a summary of performance • Guides selection of parameters. (Figure: verification quilt showing a measure of matching capability as a function of threshold (in/100) and convolution radius (grid squares); warm colors indicate stronger matches; based on 9 cases.)
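A toy version of a quilt can be built by looping over radii and thresholds and filling a matrix with any chosen score. The sketch below uses the CSI of smoothed, thresholded synthetic fields purely as a stand-in for MODE's matching statistic; the fields, radii, and thresholds are made up.

```python
# Toy verification "quilt": one score evaluated on a grid of smoothing radii
# and intensity thresholds, displayed as a heat map.
import numpy as np
import matplotlib.pyplot as plt
from scipy import ndimage

rng = np.random.default_rng(5)
obs = rng.gamma(0.3, 2.0, size=(100, 100))
fcst = np.roll(obs, 6, axis=1) + rng.normal(0, 0.2, size=obs.shape)

radii = [1, 3, 5, 10, 15]
thresholds = [0.25, 0.5, 1.0, 2.0]
quilt = np.zeros((len(radii), len(thresholds)))

for i, r in enumerate(radii):
    f_s = ndimage.uniform_filter(fcst, size=2 * r + 1)   # smooth both fields
    o_s = ndimage.uniform_filter(obs, size=2 * r + 1)
    for j, t in enumerate(thresholds):
        hits = np.sum((f_s > t) & (o_s > t))
        union = np.sum((f_s > t) | (o_s > t))
        quilt[i, j] = hits / union if union else np.nan

plt.imshow(quilt, cmap="YlOrRd", aspect="auto")
plt.xticks(range(len(thresholds)), thresholds); plt.xlabel("threshold")
plt.yticks(range(len(radii)), radii); plt.ylabel("radius (grid squares)")
plt.colorbar(label="CSI of smoothed fields"); plt.title("Toy verification quilt")
plt.savefig("quilt.png")
```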

  31. MODE availability. Available as part of the Model Evaluation Tools (MET): http://www.dtcenter.org/met/users/

  32. How can we (rationally) decide which method(s) to use? • MODE is just one of many new approaches… • What methods should be recommended to operational centers and others doing verification? • What are the differences between the various approaches? • What different forecast attributes can each approach measure? • What can they tell us about forecast performance? • How can they be used to improve forecasts or help decision makers? • Which methods are most useful for specific types of applications?

  33. Spatial verification method intercomparison project • Methods applied to same datasets • WRF forecast and gridded observed precipitation in Central U.S. • NIMROD, MAP D-PHASE/COPS, MeteoSwiss cases • Perturbed cases • Idealized cases • Subjective forecast evaluations

  34. Intercomparison web page • References • Background • Data and cases • Software http://www.ral.ucar.edu/projects/icp/

  35. Subjective evaluation • Model performance rated on a scale from 1 to 5 (5 = best) • N ≈ 22

  36. Subjective evaluation. (Figure: observed field compared with Model A, Model B, and Model C.)

  37. Conclusion • Many new spatial verification methods are becoming available – a new world of verification • Intercomparison project will help lead to better understanding of new methods • Many other issues remain: • Ensemble and probability forecasts • Extreme and high impact weather • Observational uncertainty • Understanding fundamentals of new methods and measures (e.g., equitability, propriety)
