Data and algorithmic uncertainty in the transition from research to operations. CUAHSI Workshop, Logan, Utah July 17-19, 2013 Matthijs Lemans Deltares USA. Presentation Overview. Research to operations in literature (brief) Data and algorithmic uncertainty
Data and algorithmic uncertainty in the transition from research to operations CUAHSI Workshop, Logan, Utah July 17-19, 2013 Matthijs Lemans Deltares USA
Presentation Overview • Research to operations in literature (brief) • Data and algorithmic uncertainty • Examples of ‘research to operation’ implementations • Improvement of ESP skill using climate indices • OpenDA for calibration of models and Ensemble Kalman Filtering (brief) • Conclusions
Research to Operations • Some examples from Literature: • Goodman (2004), the NASA Short-term Prediction Research and Transition (SPoRT) Center: A Collaborative Model for Accelerating Research into Operations • Serafin (2002), transition of weather research to operations • American Meteorological Society (AMS): Second Conference on Transition of Research to Operations in 2012 • ….
Research to Operations Common notes: • A research agenda that includes non-scientific factors (costs, technical feasibility, performance, user value) is very important • Model advances made by researchers must first be translated into the framework of the operational models • Research products should preferably be available to the forecasters in near real-time • It is difficult for the research community to test and troubleshoot the operational models • Share the same model framework
Transition of data and algorithmic uncertainty • Step 1: Conducting research • clear assessment of the algorithmic changes (focus) • data sets tightly controlled • Step 2: Operational testing • assess the technical and scientific performance of the new algorithm • test cases, push batches of operational raw data through the algorithm • data pre- and post-processing important • some algorithmic tuning may still be required • Step 3: Implementation and operational use • algorithmic uncertainty small, data uncertainty high
Example 1: Improvement of ESP skill using climate indices Ensemble Streamflow Prediction (ESP) represents the probabilistic forecast. Long-term weather is represented by an ensemble of historical time series (55 years). Add climate information, like ENSO, to the meteo ensemble and use that in the current daily operational forecasts (based on Delft-FEWS). [Figure: MAP/MAT time series and the initial state feed the hydrological model, which produces Q over time; climate phase info is added to the meteo ensemble]
Research: Climate Indices Large scale atmospheric circulation patterns (such as ENSO and PDO) are known to affect monthly temperatures and precipitation in the Pacific Northwest Retrieved from web sites: http://www.cpc.ncep.noaa.gov/ http://www.esrl.noaa.gov/psd/ http://www.cawr.gov.au/ (Wheeler, MJO) http://jisao.washington.edu/ (PDO)
Research: Preprocessing of the meteo ensemble Use a combination of two methods: • Subsampler: reduce the original ESP by making a selection of ensemble members that match the current phase of climatic modes • Resampler: add synthetic ensemble traces from a stochastic sampling method while using climatic mode information • These methods are based on literature and augmented with new ideas
Research: Subsampler Reduce the original ESP of 55 members For each month in the hindcast suite, select the years with most similar climate indices (at forecast time) Select ESP members with similar phases Dismiss ESP members with dissimilar phases Climate phase at forecast time
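The selection step described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the Delft-FEWS implementation: it assumes a single climate index and absolute difference as the similarity measure, and the function name `subsample_years` and all index values are hypothetical.

```python
def subsample_years(current_index, historical_index, n_keep):
    """Return the n_keep historical years whose climate index is closest
    to the index observed at forecast time (assumed similarity measure:
    absolute difference; ties are broken by year)."""
    ranked = sorted(
        historical_index,
        key=lambda year: (abs(historical_index[year] - current_index), year),
    )
    return sorted(ranked[:n_keep])


# Made-up index values per historical year, for illustration only.
history = {1982: 2.1, 1985: -0.6, 1988: -1.5, 1991: 1.4, 1997: 2.3, 2000: -0.9}

# Keep the three ESP members (years) most similar to the current phase.
selected = subsample_years(current_index=1.8, historical_index=history, n_keep=3)
print(selected)
```

The ESP members belonging to the years not returned would be dismissed from the reduced ensemble.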
Research: Testing the algorithm • Reference forecast system needed to assess skill of the methods • Series of monthly historical forecasts for 54 years (=hindcasts) for three basins, using FEWS (next 3 slides) • Clean data set for observed P and T (used in historical run and ensemble forecasts) • Comparing streamflow from historical run with forecasts • Observations about the state (snowpack, soil moisture, water level) not included in forecast (difference with operational system) • Exporting results to a scripting language for further analyses • Skill metrics: RMSE, Brier Score and CRPS, with scores averaged over a large number of forecasts
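For reference, the three skill metrics named above can be written compactly in Python. This is a sketch using textbook definitions, not the scripts used in the study; the CRPS here is the common ensemble estimator (mean absolute error of the members minus half the mean absolute difference between members), and all function names are illustrative.

```python
import math


def rmse(forecasts, observations):
    """Root mean square error of paired forecasts and observations."""
    squared = [(f - o) ** 2 for f, o in zip(forecasts, observations)]
    return math.sqrt(sum(squared) / len(squared))


def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probabilities and
    binary outcomes (1 = event occurred, 0 = it did not)."""
    squared = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared) / len(squared)


def ensemble_crps(members, observation):
    """CRPS of one ensemble forecast against one observation."""
    m = len(members)
    error_term = sum(abs(x - observation) for x in members) / m
    spread_term = sum(abs(a - b) for a in members for b in members) / (2 * m * m)
    return error_term - spread_term


def mean_crps(ensembles, observations):
    """Average CRPS over a series of forecasts (e.g. a hindcast suite)."""
    scores = [ensemble_crps(e, o) for e, o in zip(ensembles, observations)]
    return sum(scores) / len(scores)
```

For a single-member ensemble the CRPS reduces to the absolute error, which makes it directly comparable with deterministic forecasts.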
Delft-FEWS (Flood Early Warning System) • FEWS is an operational data management system • Toolbox for development of forecasting systems • Binding dataflows + models • Fully ‘configurable’ by user • Real-Time • Rapid implementation, scalable & flexible • Highly resilient & automatic / manual & stand alone
Delft-FEWS User Community www.delft-fews.com (free download) • USA, NWS (Flw) • USA, BPA (Flw, Res) • Canada (Flw) • UK (Flw, Gw) • Netherlands (Dr, Flw, Wq, Ds) • Germany (Flw) • Switzerland (Flw) • Italy (Flw) • Austria (Flw, Res) • Spain (Flw) • Singapore (Wq, Flw) • Taiwan (Flw) • South Korea (Wq) • Australia (Flw) • Sudan • Georgia • Mekong River Commission (Flw) • Indonesia (Peat, Flw) • Azerbaijan (Flw) • Zambezi (Dr, Flw) • Colombia (Flw) • Bolivia (Flw) • Uruguay (Flw) • Brazil (Flw, Res) • ... Legend: Flw = Flow, Dr = Drought, Wq = Water Quality, Res = Reservoir operation, Ds = Dike strength, Gw = Ground Water. Map status: operational service / in development
Delft-FEWS Concept: import, export & dissemination Data (feeds) • Meteo • Hydro • WQ • ... Delft-FEWS modules • Import • Validation • Transformation (e.g. rating curves) • Interpolation (linear and spatial) • Data hierarchy • General adapter for models • Data assimilation • Manual interaction • Export / report (files, html, pdf,...) • Data Visualization! • … Native models (connected through a model adapter): • Rainfall-runoff • Hydraulics • WQ • ...
Comparison with reference forecast system More restrictions • Focus on June volumes • Focus on only 1 climate index Positive skill for reduced ESP Reduction of skill for smaller ESP
Reduced Ensemble – effect of ensemble size A smaller ensemble will give less accurate estimates of mean and other quantiles. Less skill! Therefore, add synthetic time series from a weather resampler C.A.T. Ferro: Comparing Probabilistic Forecasting Systems with the Brier Score, Weather and Forecasting 22, pp 1076-1088 (2007).
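The effect of ensemble size on the sampling noise of the ensemble mean can be illustrated with a small Monte Carlo experiment. This is a toy sketch, not the analysis from the presentation or from Ferro (2007): it assumes members drawn independently from a standard normal distribution, and the ensemble sizes (10 versus 55) and the function name are chosen only for illustration.

```python
import random
import statistics


def spread_of_ensemble_mean(ensemble_size, n_repeats, rng):
    """Standard deviation of the ensemble mean across many independent
    ensembles of the given size (members ~ N(0, 1), an assumption)."""
    means = []
    for _ in range(n_repeats):
        members = [rng.gauss(0.0, 1.0) for _ in range(ensemble_size)]
        means.append(statistics.mean(members))
    return statistics.pstdev(means)


rng = random.Random(0)  # fixed seed so the experiment is repeatable
spread_small = spread_of_ensemble_mean(10, 2000, rng)
spread_full = spread_of_ensemble_mean(55, 2000, rng)
print(f"10 members: {spread_small:.3f}, 55 members: {spread_full:.3f}")
```

Under these assumptions the noise in the mean scales roughly as one over the square root of the ensemble size, which is why a strongly reduced ensemble loses skill unless members are added back.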
Research: Weather resampler Re-sampling of historical periods of weather, to generate a synthetic time series of arbitrary length. The resampler can be conditioned on climate phase. Repeated runs produce an ensemble.
Weather resampler - algorithm 1. Start from current climate indices 2. Calculate difference to climate indices in historical years 3. From the N most similar historical years, randomly pick one 4. Add a period of weather from that year to the time series 5. Return to 2 [Figure: blocks of MAP and MAT from different historical years placed one after another along a time axis running from the past, through now, into the future]
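The loop above can be sketched as follows. This is a simplified illustration rather than the operational Python/Java code: it assumes one climate index that stays fixed during the loop, absolute difference as the similarity measure, and one weather value per period; all names and numbers are hypothetical.

```python
import random


def most_similar_years(current_index, historical_index, n_similar):
    """Rank historical years by index difference and keep the N closest."""
    ranked = sorted(
        historical_index,
        key=lambda year: (abs(historical_index[year] - current_index), year),
    )
    return ranked[:n_similar]


def resample_trace(current_index, historical_index, weather, n_similar, n_periods, rng):
    """Build one synthetic trace by stitching together periods of weather
    taken from randomly chosen, climatically similar historical years."""
    years, trace = [], []
    for period in range(n_periods):
        # Compare the current index with the historical years.
        candidates = most_similar_years(current_index, historical_index, n_similar)
        # Randomly pick one of the N most similar years.
        year = rng.choice(candidates)
        # Append that year's weather for this period.
        years.append(year)
        trace.append(weather[year][period])
    return years, trace


# Made-up index values and MAP values per period, for illustration only.
historical_index = {1960: -0.4, 1982: 2.1, 1985: -0.6, 1991: 1.4}
weather = {
    1960: [10.0, 12.0, 8.0],
    1982: [20.0, 22.0, 18.0],
    1985: [11.0, 9.0, 7.0],
    1991: [19.0, 21.0, 17.0],
}

rng = random.Random(1)
# Repeated runs produce an ensemble of synthetic traces.
ensemble = [
    resample_trace(1.8, historical_index, weather, n_similar=2, n_periods=3, rng=rng)
    for _ in range(5)
]
```

Keeping the list of sampled years next to each trace records which historical year supplied each period, which matters for any other variable that is tied to historical years.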
[Figure: Julian Day time series, ensemble; years shown: 1991, 1985, 1982, 1960]
Resampler – conclusions (brief) • Test runs so far show that the resampler using ENSO, SOI and PDO indices has skill compared to the ESP • Some fine tuning was needed (autocorrelation, fixing resampling per forecast day, etc.) • Combine the two methods: loss of skill for the reduced ensemble due to statistical noise can be compensated by adding synthetic members
Operational testing: Implementation in FEWS • Standalone testing with real-time operational data (in FEWS) • Lot of QC! • New methods more constrained now: • Climate indices are imported from ftp sites • N_index – imported CSV time series with number of indices. Used by the Subsampler to create a reduced ensemble. Used by the Resampler to give weights to the climate indices • Subsampler is implemented in Delft-FEWS Java code. Generates a reduced ensemble from a full historical ensemble • Resampler is done in two steps (Python and Java) • Finally, the two meteo ensembles are merged and used as input for the hydrological models
Unknown unknowns…. • Synthetic flow traces cannot be associated with a historical year anymore. • But also other information than meteo, like power load and monthly change factors (for forecasting an external official volume forecast at the end of the next month) depend on historical years. • They can be assembled similarly to MAP and MAT, using the same JulianDays ensemble timeseries. Done already for the monthly change factors
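One way to picture this assembly: if the ensemble time series records which historical year supplied each period of a synthetic trace, any other year-dependent quantity can be stitched together with the same lookup. The sketch below assumes exactly that; the data layout, the function name `assemble_from_years` and the power load numbers are hypothetical and do not reflect the Delft-FEWS data model.

```python
def assemble_from_years(sampled_years, historical_series):
    """For each period, take the value from the historical year that
    supplied the weather for that period in the synthetic trace."""
    return [
        historical_series[year][period]
        for period, year in enumerate(sampled_years)
    ]


# Source year of each period in one synthetic trace (illustrative).
sampled_years = [1991, 1985, 1982, 1960]

# Made-up power load per historical year and period.
power_load = {
    1960: [100, 101, 102, 103],
    1982: [110, 111, 112, 113],
    1985: [120, 121, 122, 123],
    1991: [130, 131, 132, 133],
}

synthetic_load = assemble_from_years(sampled_years, power_load)
print(synthetic_load)
```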
Example 2: OpenDA operational in FEWS OpenDA (www.openda.org) for automated calibration for NWS • Too many basins • Bad performance Water Quality Forecasting System for the Four Major Rivers in Korea • HSPF and EFDC 3D water quality modeling • Better accuracy by using Ensemble Kalman Filtering (in OpenDA) for improving initial states of EFDC
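To show what an Ensemble Kalman Filter analysis step does to an initial state, here is a toy sketch for a single scalar state that is observed directly. It does not use the OpenDA API and is not the EFDC setup: it assumes a stochastic (perturbed observation) filter with Gaussian observation error, and all names and numbers are illustrative.

```python
import random
import statistics


def enkf_update(forecast, observation, obs_error_std, rng):
    """One analysis step for a scalar, directly observed state.
    Each member is nudged towards its own perturbed copy of the
    observation, weighted by the Kalman gain."""
    forecast_variance = statistics.variance(forecast)
    gain = forecast_variance / (forecast_variance + obs_error_std ** 2)
    analysis = []
    for member in forecast:
        perturbed_obs = observation + rng.gauss(0.0, obs_error_std)
        analysis.append(member + gain * (perturbed_obs - member))
    return analysis


rng = random.Random(42)  # fixed seed so the example is repeatable
# Forecast ensemble of a model state (made-up values around 10).
forecast = [rng.gauss(10.0, 1.0) for _ in range(50)]
analysis = enkf_update(forecast, observation=12.0, obs_error_std=0.5, rng=rng)

print(statistics.mean(forecast), statistics.mean(analysis))
```

The analysis ensemble is pulled towards the observation and its spread shrinks, and the updated members then serve as initial states for the next forecast.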
Conclusions • “Begin with the end in mind” (Stephen Covey) • (User friendly, robust, performance, technical feasibility) • “There are known unknowns... But there are also unknown unknowns” (Donald Rumsfeld) • FEWS can be used by scientists and operational forecasters as a shared model framework. This allows: • Translating model advances made by researchers into the framework of the operational models • Easier testing and troubleshooting of the operational models by the research community • Conducting hindcasts with a canned and controlled dataset or with a batch of raw operational data