Diagnostic verification and extremes: 1st Breakout

Presentation Transcript


1. Diagnostic verification and extremes: 1st Breakout
• Discussed the need for a toolkit that builds beyond current capabilities (e.g., NCEP)
• Identified (and began to address) 3 major questions:
  • How should confidence intervals and hypothesis tests be computed, especially when there are spatial and temporal correlations?
  • What methods should be included for evaluating extremes?
  • What diagnostic approaches should be included initially; in a 2nd tier; in a 3rd tier?

2. Confidence intervals and hypothesis tests
• Need to appropriately take autocorrelations into account
  • Reduce the sample by eliminating cases
  • Block re-sampling (Candille's results indicate spatial correlation may have more impact than temporal, at least for upper air)
  • Identify situations when parametric approaches are "ok"
• Bootstrapping approaches are computer-intensive and require lots of data storage
  • May not always be practical in operational settings
  • Can also bootstrap on contingency tables
  • Best with the percentile method (see the sketch after this slide)
• Need a way to measure confidence in spatial methods (e.g., matching, shifting, etc.)
  • E.g., Caren's CIs on cluster matches
• In the future, need to include other types of uncertainty in addition to sampling variability
  • E.g., observation uncertainty
  • Could maybe use information from data assimilation
  • Could consider including a way to measure sensitivity to observational variations – re-sampling with random noise added to the obs, or using a parametric model. This would be a way to get an initial estimate of the variation in verification statistics due to obs uncertainty
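The percentile bootstrap mentioned above can be sketched as follows. This is a minimal illustration, not toolkit code: the names ets and percentile_ci are hypothetical, and the resampling assumes independent forecast/observation cases; with spatial or temporal autocorrelation, contiguous blocks of cases would be resampled instead (block bootstrap).

```python
import numpy as np

def ets(hits, misses, false_alarms, correct_negatives):
    """Equitable Threat Score from 2x2 contingency-table counts."""
    n = hits + misses + false_alarms + correct_negatives
    hits_random = (hits + misses) * (hits + false_alarms) / n
    return (hits - hits_random) / (hits + misses + false_alarms - hits_random)

def percentile_ci(fcst_yes, obs_yes, n_boot=1000, alpha=0.05, seed=None):
    """Percentile-bootstrap confidence interval for ETS.

    Resamples matched forecast/observation cases with replacement and
    rebuilds the contingency table for each resample (independence assumed).
    """
    rng = np.random.default_rng(seed)
    f_all = np.asarray(fcst_yes, dtype=bool)
    o_all = np.asarray(obs_yes, dtype=bool)
    n = f_all.size
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)          # resample case indices
        f, o = f_all[idx], o_all[idx]
        stats.append(ets((f & o).sum(), (~f & o).sum(),
                         (f & ~o).sum(), (~f & ~o).sum()))
    return tuple(np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)]))
```

Resampling the underlying cases generalizes easily to other statistics; bootstrapping the contingency table directly, as also noted in the discussion, is an alternative when only the table is stored.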

3. Methods for evaluation of extremes
• Need to distinguish (in our minds) between extremes and high-impact weather
  • We really mean rare events here; they need to be treated in a different way statistically
• The user should define thresholds for extremes
  • May be based on quantiles of the sample distribution
  • Could use extreme value theory to help with this (e.g., return-level methods)
  • Need to tell the user when a threshold is not appropriate (i.e., insufficient cases)
• The extreme dependency score is appropriate in many cases
  • Also compute standard scores: Yule's Q, odds ratio, ORSS, ETS, etc. (see the sketch below)
• Look into Rick Katz's EVT method (compares extreme value distributions)
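As an illustration of the rare-event scores named above, the sketch below computes the extreme dependency score and the odds-ratio-based scores from 2x2 contingency-table counts. The function name extremes_scores is hypothetical, and degenerate cases (empty cells, zero base rate) are not handled.

```python
import math

def extremes_scores(a, b, c, d):
    """Rare-event scores: a=hits, b=false alarms, c=misses, d=correct negatives."""
    n = a + b + c + d
    base_rate = (a + c) / n      # how rare the observed event is
    hit_rate = a / (a + c)
    # Extreme dependency score, 2*ln(p)/ln(p*H) - 1: designed to avoid the
    # tendency of scores like ETS to collapse to zero as events become rarer.
    eds = 2.0 * math.log(base_rate) / math.log(base_rate * hit_rate) - 1.0
    odds_ratio = (a * d) / (b * c)
    # Odds ratio skill score; algebraically the same quantity as Yule's Q.
    orss = (a * d - b * c) / (a * d + b * c)
    return {"EDS": eds, "odds ratio": odds_ratio, "ORSS / Yule's Q": orss}
```

A user-defined extreme threshold based on a sample quantile, as suggested above, could be as simple as numpy.quantile(obs, 0.99) applied to the observed sample.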

4. Diagnostic methods
• Goal: Identify different tiers of methods/capabilities that will be implemented over time, starting with Tier 1 in the 1st release
• Initial discussion: Stratification
• Friday discussion: Specific methods

5. Stratification
• Tier 1: Based on metadata, including time of day, season, location, etc. (a small sketch follows below)
  • The user may need to do homework to select stratifications
• Tier 2: Based on other information from the model or observations, such as temperature, wind direction, etc. (any kind of scalar)
  • Could also include non-meteorological data (e.g., air traffic)
  • Should also include derived parameters – e.g., potential vorticity
• Tier 3: Based on a feature such as the location or strength of the jet core, the cyclone track, etc.
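A Tier-1 metadata stratification of matched forecast/observation pairs might look like the following. The column names and the tiny pairs table are invented for illustration, and pandas is assumed only as a convenient stand-in for whatever data structures the toolkit adopts.

```python
import pandas as pd

# Hypothetical matched forecast/observation pairs with metadata columns.
pairs = pd.DataFrame({
    "valid_time": pd.to_datetime(["2008-01-15 00:00", "2008-07-15 12:00"]),
    "station": ["KDEN", "KBOS"],
    "fcst": [271.2, 298.4],
    "obs": [270.8, 299.1],
})

# Tier-1 style stratification on metadata: season and time of day.
pairs["season"] = pairs["valid_time"].dt.month.map(
    {12: "DJF", 1: "DJF", 2: "DJF", 3: "MAM", 4: "MAM", 5: "MAM",
     6: "JJA", 7: "JJA", 8: "JJA", 9: "SON", 10: "SON", 11: "SON"})
pairs["hour"] = pairs["valid_time"].dt.hour

# Compute a simple statistic (mean error) within each stratum.
bias_by_stratum = (pairs.assign(error=pairs["fcst"] - pairs["obs"])
                        .groupby(["season", "hour"])["error"].mean())
print(bias_by_stratum)
```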

6. Specific methods and capabilities
• Tier 1
  • NCEP operational capability
  • Allow the user to specify space and time aggregation
    • Ex: User-input masks (e.g., small region, climate zone, etc.)
    • Ex: Allow individual station statistics, or groups of stations
  • Include traditional methods for:
    • Contingency tables
    • Continuous variables (see the sketch below)
    • Probability forecasts
    • (Ensemble forecasts?)
  • GO Index
  • Confidence intervals for most statistics
  • Underlying distributions of forecasts, obs, and errors
  • Basic observations (surface, ACARS, soundings, radar – Stage II, Stage IV; etc.)
  • Extract basic forecast/obs pairs or grids for use elsewhere
  • Basic plotting capabilities (e.g., some capabilities from the R toolkit, NRL)
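The traditional continuous-variable statistics listed for Tier 1 reduce to a handful of summary measures over matched pairs. This is a minimal sketch with a hypothetical continuous_scores helper, not the toolkit's interface.

```python
import numpy as np

def continuous_scores(fcst, obs):
    """Traditional continuous-variable statistics for matched forecast/obs pairs."""
    fcst, obs = np.asarray(fcst, float), np.asarray(obs, float)
    err = fcst - obs
    return {
        "ME": err.mean(),                      # mean error (bias)
        "MAE": np.abs(err).mean(),             # mean absolute error
        "RMSE": np.sqrt((err ** 2).mean()),    # root-mean-square error
        "corr": np.corrcoef(fcst, obs)[0, 1],  # Pearson correlation
    }
```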

7. Specific methods and capabilities
• Tier 2
  • Allow alternative obs/forecast pairs (e.g., along satellite tracks)
  • Additional variables
  • Additional spatial methods, based on the VX intercomparison
    • Scale-dependent methods
    • Fuzzy methods (one example is sketched below)
    • Etc.
  • "Trial" methods
  • Additional diagnostic graphical capabilities (e.g., NRL capabilities)
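As one concrete example of a fuzzy (neighborhood) spatial method, the fractions skill score compares the fraction of grid points exceeding a threshold within a moving window on the forecast and observed grids. The sketch below is illustrative only; it does not prescribe which spatial methods the toolkit will adopt, and scipy is assumed for the neighborhood averaging.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def fss(fcst_grid, obs_grid, threshold, window):
    """Fractions Skill Score for a given threshold and square neighborhood size."""
    # Neighborhood fractions of points exceeding the threshold.
    pf = uniform_filter((fcst_grid >= threshold).astype(float), size=window, mode="constant")
    po = uniform_filter((obs_grid >= threshold).astype(float), size=window, mode="constant")
    mse = np.mean((pf - po) ** 2)
    mse_ref = np.mean(pf ** 2) + np.mean(po ** 2)   # reference (no-skill) MSE
    return 1.0 - mse / mse_ref if mse_ref > 0 else np.nan
```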

8. Specific methods and capabilities
• Tier 3
  • Integrate methods into a more user-focused framework
  • Incorporate user datasets
  • Decision-making aspects

9. General comments
• User training will be important, even for Tier 1
  • But – want to let users do what they want to do
  • As the system develops, we will need to provide some guidance on what should be done to answer particular questions
• Should be able to evaluate any model on any domain
• Demonstrate equivalence to NCEP output
• Need for a verification-method testbed capability
