DATA ASSIMILATION WITH IMPERFECT MODELS Zoltan Toth and Malaquias Pena Mendez¹ Environmental Modeling Center NOAA/NWS/NCEP USA ¹SAIC at Environmental Modeling Center, NCEP/NWS Acknowledgements: Dusanka Zupanski, Guocheng Yuan http://wwwt.emc.ncep.noaa.gov/gmb/ens/index.html
OUTLINE • ANALYSIS ERRORS • Observational errors • Background errors • Chaotic errors • Model-related errors • Stochastic errors • Systematic errors • Tendency errors • State errors FORECAST DRIFT
OUTLINE / SUMMARY • ANALYSIS ERRORS • Observational errors • Background errors • Chaotic errors • Model-related errors • Stochastic errors • Systematic errors • Tendency errors • State errors FORECAST DRIFT HOW TO REDUCE DRIFT-INDUCED FORECAST ERRORS? • Mapping initial state on model attractor • Estimating asymptotic errors • Reducing model-related errors • Reducing total forecast errors • IMPROVED ANALYSES
DATA ASSIMILATION BASICS • GOAL • Represent nature as truthfully as possible • USE OBSERVATIONAL DATA • Incomplete coverage in • Space • Time • Variables • Noisy • Assume for this study that observations are unbiased • Otherwise, de-bias as in Derber & Wu • NEED OTHER (BACKGROUND) INFORMATION TO • Complete coverage • Filter out noise • Choices • Climatology – Independent of current situation • Persistence – Dynamics of situation ignored • Use dynamical short-range forecast - Best choice with caveats • “Chaotic” forecast errors related to initial uncertainty • Errors related to use of imperfect model
DATA ASSIMILATION BASICS - 2 • COMBINE OBSERVATIONS & BACKGROUND • Statistical procedure • Undesirable effects from dynamics point of view • Observational noise • Reduced but not eliminated • Weights on observations & background • Is truth in between? • Minimize arbitrariness by • Relying more on dynamically consistent information • Eg, ensemble-based background covariance • Other approaches? • CRUCIAL ROLE OF BACKGROUND • How to generate? • How to use? • BASICS ABOUT FORECASTING • Well known facts • Some assumptions critical • Will revisit a few
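The statistical combination of observations and background can be sketched as follows. This is a minimal illustration, assuming every model variable is observed directly (observation operator equal to the identity) and that the error covariances B and R are known; the function name `analysis_update` and all numbers are hypothetical, not taken from the talk.

```python
import numpy as np

def analysis_update(background, obs, B, R):
    """Weighted combination of background and observations (H = I assumed)."""
    innovation = obs - background
    # Analysis increment B (B + R)^-1 (y - xb); solve() avoids an explicit inverse.
    increment = B @ np.linalg.solve(B + R, innovation)
    return background + increment

# Illustrative values only: equal background and observation error variances.
background = np.array([1.0, 2.0, 3.0])
obs = np.array([2.0, 2.0, 5.0])
B = np.eye(3) * 1.0
R = np.eye(3) * 1.0
analysis = analysis_update(background, obs, B, R)
```

With equal variances the analysis falls halfway between background and observations, which is the "is truth in between?" situation the slide questions.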
NWP FORECASTING BASICS • SOURCES OF FORECAST ERRORS • Initial conditions • Arise due to initial error (imperfect analysis) and unstable dynamics • Reasonably well understood • Model • Imperfect representation of nature • Caused, for example, by use of • Limited domain • Limited temporal/spatial/physical resolution (truncation) • Structural errors • Parametric errors • HOW TO REDUCE FORECAST ERRORS? • Reduce initial errors • Make model more similar to nature • USE OF FORECASTS • General applications • In DA cycles
HOW TO TREAT FORECAST ERRORS IN DA • CHAOTIC ERRORS • Statistical approach • “NMC” method (differences between past short-range forecasts verifying at same time) • Ensemble method (differences between past ensemble forecasts verifying at same time) • Dynamical approach • 4DVAR – Norm-dependent adjustments • Ensemble-based DA (large ensemble of current forecasts) – Norm-independent adjustments • MODEL-RELATED ERRORS • 3 approaches used to cope with model errors in DA: • Assume model-related errors don’t differ from chaotic errors (Ignore problem) • Inflation of background errors (ie, move analysis closer to observations) • Multiply background error covariance matrix in 3/4DVAR • Increase initial perturbation size in ensemble-based DA • Assume model-related errors are stochastic with characteristics different from chaotic errors (D. Zupanski et al) • Introduce additional (model) error covariance term (allow analysis to move closer to obs.) • How are statistics determined? • Assume errors are systematic (Dee et al) • Estimate systematic difference between analysis and background • Before their use, move background by systematic difference closer to analysis • Move background toward nature: IS THIS THE RIGHT MOVE? Treat initial and model error the same way?
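The effect of background error inflation can be illustrated with a scalar case. The sketch below assumes a single directly observed variable with background variance `b_var` and observation variance `r_var`; the helper `obs_weight` and the inflation factor of 1.5 are hypothetical choices for illustration.

```python
def obs_weight(b_var, r_var, inflation=1.0):
    """Weight given to the observation in a scalar analysis, with optional
    multiplicative inflation of the background error variance."""
    b_inflated = inflation * b_var
    return b_inflated / (b_inflated + r_var)

# Illustrative variances only.
w_plain = obs_weight(1.0, 2.0)
w_inflated = obs_weight(1.0, 2.0, inflation=1.5)
```

Inflating the background variance raises the observation weight, so the analysis is drawn closer to the observations regardless of whether the extra background error is chaotic or model-related in origin.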
MODEL-RELATED BACKGROUND ERRORS • TWO COMPONENTS • Systematic • Time mean difference • Stochastic • What’s left over • Ignore for now • SYSTEMATIC ERROR • Estimate as • Climate mean difference • Regime dependent difference • Based on most recent data • TRADITIONAL PARADIGM FOR ANALYSIS/FORECAST SYSTEM • Estimate the state of nature as truthfully as possible (analysis); • Run numerical model forecast from the analysis field; • Statistically assess the systematic error in the numerical forecast; • Remove the estimated systematic error from the forecast. • ASSUMPTION • Removing systematic error will improve quality of analysis/forecast system • WILL IT???
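The last two steps of the traditional paradigm (assess and remove the systematic error) can be sketched with synthetic data. Everything here is an assumption made for illustration: the array shapes, the linearly growing `true_bias`, and the noise level. In practice the bias would be estimated from an independent training sample rather than from the forecasts being corrected.

```python
import numpy as np

rng = np.random.default_rng(1)
n_cases, n_leads = 200, 5

# Hypothetical lead-time dependent systematic error (drift growing with lead time).
true_bias = np.linspace(0.0, 2.0, n_leads)

# Synthetic verifying analyses and forecasts, shape (cases, lead times).
verification = rng.normal(size=(n_cases, n_leads))
forecasts = verification + true_bias + 0.1 * rng.normal(size=(n_cases, n_leads))

# Statistically assess the systematic error separately for each lead time...
bias_estimate = (forecasts - verification).mean(axis=0)

# ...and remove it from the forecasts.
corrected = forecasts - bias_estimate
```

Note that the correction is applied after the model has already drifted; it does not change the trajectory the model followed.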
SYSTEMATIC MODEL-RELATED FORECAST ERRORS Attractors of nature & model are different [Figure: nature and forecast trajectories] ORIGIN OF SYSTEMATIC ERROR IN FORECAST Systematic difference between nature and our model – Model world is different from reality • Tendencies are different • Phenomena evolve differently in time • Ignore for now • States (ie, realizable, natural states) are different • Phenomena not (exactly) the same • Eg, climate means of nature and model are different START MODEL FROM STATE OF NATURE • State of nature not compatible with model • Initial condition near nature is off the model attractor • Forecast drifts toward model attractor • Drift-induced errors introduced REDUCE SYSTEMATIC MODEL-RELATED ERRORS? • Tendencies will be imperfect • Accept that, but • Can we reduce drift-related errors?
SEARCH FOR BEST INITIAL CONDITIONS FOR IMPERFECT MODELS How can we reduce drift-induced errors? What is the best initial condition for an imperfect model? A state as close to nature as possible (“perfect” initial condition)? - Traditional, “fidelity” paradigm On/near attractor of nature Off attractor of model Forecast drifts from attractor of nature to that of model Lead-time dependent systematic errors A state on/near model attractor? – New paradigm? No forecast drift “Imperfect” initial conditions? How to find state on model attractor corresponding to state in nature? Is there a model trajectory that would “shadow” nature? Find an initial state on/near a model trajectory that corresponds with observed state Estimate vector mapping points in nature to points on model attractor Does such mapping exist? For now, assume it does
HOW TO USE MAPPING IN DA / NWP FORECASTING? • Estimate the mapping between nature and the model attractors (challenging step) • Map the observed state of nature into the space of the model attractor (new step) • Move obs. with mapping vector • Analyze data (standard procedure) • Run the model from the mapped initial condition (standard procedure) • “Remap” the analysis and forecast back to the phase space of nature (new step)
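The map / analyze / run / remap sequence can be sketched with a deliberately simple scalar toy, which is not the system used in the talk. Assumptions: nature relaxes toward 0, the "model" relaxes toward a shifted fixed point `C`, and the mapping vector `M` is known exactly. All names and numbers are hypothetical.

```python
import numpy as np

C = 2.5   # hypothetical offset between the model's and nature's fixed points
M = C     # mapping vector; assumed known here, normally it must be estimated

def nature_forecast(x0, t):
    """Exact solution of dx/dt = -x (toy 'nature')."""
    return x0 * np.exp(-t)

def model_forecast(x0, t):
    """Exact solution of dx/dt = -(x - C) (toy imperfect 'model')."""
    return C + (x0 - C) * np.exp(-t)

def analyze(background, obs, weight=0.5):
    """Scalar analysis: weighted combination of background and observation."""
    return background + weight * (obs - background)

truth0, obs, background, lead = 1.0, 1.0, 1.2, 1.0
truth = nature_forecast(truth0, lead)

# Fidelity paradigm: analyze in nature's space, run the model from there.
fc_traditional = model_forecast(analyze(background, obs), lead)

# Mapping paradigm: move obs (and, for this single cycle, the background)
# with the mapping vector, analyze, run the model, then remap.
analysis_mapped = analyze(background + M, obs + M)
fc_remapped = model_forecast(analysis_mapped, lead) - M

err_traditional = abs(fc_traditional - truth)
err_remapped = abs(fc_remapped - truth)
```

In this toy the traditional forecast drifts toward `C`, while the remapped forecast retains only the decayed initial analysis error. A chaotic system with a more complicated attractor will not behave this cleanly.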
COMPARING THE FIDELITY AND MAPPING PARADIGMS TRADITIONAL PARADIGM (analysis cycle: move analysis toward nature) • Estimate the state of nature as truthfully as possible (traditional DA) • Run numerical model forecasts from the analysis field • Statistically assess the systematic error in the numerical forecasts • Correct the numerical forecasts for systematic errors MAPPING PARADIGM (analysis cycle: move analysis to model attractor) • Estimate the mapping between nature and the model attractors • Map the observed state of nature into the space of the model attractor • Move obs. with mapping vector • Analyze data • Run the model from the mapped initial condition • “Remap” the analysis and forecast back to the phase space of nature
MAPPING QUESTIONS Does mapping exist? • Assumption: Forecasting would not be possible with imperfect models if mapping did not exist • Not sure, have to try and see • Mapping exists and is well estimated if forecast errors are reduced with the mapping vs. the fidelity paradigm If it exists, how to estimate it? • Don’t need perfect estimate of mapping • Initial state must be closer to model attractor than with fidelity paradigm • Remapping mitigates potential problems with poor mapping vector estimate • The bigger the difference between nature and model, the less likely we can find mapping vector
MAPPING VECTOR • Definition • Vector that provides best remapped forecast performance • Estimation 1. Climate mean technique • Difference between long-term time means of forecast trajectory & nature • In practice, nature is not known • Use traditional analyses as proxy 2. Adaptive technique • If systematic errors are regime dependent, or climate means are not available • Details later
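The time-mean estimate of the mapping vector can be sketched as below. Two assumptions are made: the sign convention (model mean minus analysis mean, so that adding the vector moves a state of nature toward the model attractor), and the synthetic input arrays, which stand in for a long free model run and a long record of traditional analyses. The function name is hypothetical.

```python
import numpy as np

def climate_mean_mapping_vector(model_states, analysis_states):
    """Difference between long-term time means of a model trajectory and of
    traditional analyses (used as a proxy for nature). Rows are times."""
    return model_states.mean(axis=0) - analysis_states.mean(axis=0)

# Synthetic stand-ins: 'analyses' are random 3-variable states and the model
# climate is the same cloud shifted by a hypothetical offset.
rng = np.random.default_rng(0)
analyses = rng.normal(size=(1000, 3))
offset = np.array([0.0, 0.0, 2.5])
model_run = analyses + offset

mapping_vector = climate_mean_mapping_vector(model_run, analyses)
mapped_obs = analyses[0] + mapping_vector     # move an observed state
remapped = mapped_obs - mapping_vector        # and bring it back
```

A single constant vector can only capture a mean shift between the two attractors; regime-dependent differences are what the adaptive technique is meant to address.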
MODEL & DA DETAILS • Lorenz (1963) 3-variable model: “NATURE” σ = 10, b = 8/3, r = 28; “MODEL” σ = 9, z = z + 2.5 • Runge-Kutta numerical scheme with a time step of 0.01 • Three initialization schemes • Perfect initial conditions • Replacement • All 3 variables observed • Observational error = 2 (~5% natural variability) • 3-DVAR: • 15 time step cycle length (~7 hrs in atmosphere) • Diagonal R, R=2 • B based on independent forecast errors, empirically tuned variance
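A sketch of the "nature" and "model" systems follows. Three assumptions are made, since the slide does not spell them out: the parameter that differs (10 vs. 9) is the Lorenz σ, "z = z + 2.5" means the model attractor is displaced by 2.5 in z (implemented by evaluating tendencies at z minus the shift), and the Runge-Kutta scheme is the classical fourth-order one. Function names and initial states are hypothetical.

```python
import numpy as np

def lorenz63(state, sigma, r=28.0, b=8.0 / 3.0, z_shift=0.0):
    """Lorenz (1963) tendencies; z_shift displaces the attractor in z (assumed)."""
    x, y, z = state
    z = z - z_shift
    return np.array([sigma * (y - x), x * (r - z) - y, x * y - b * z])

def rk4_step(state, dt=0.01, **params):
    """One classical fourth-order Runge-Kutta step."""
    k1 = lorenz63(state, **params)
    k2 = lorenz63(state + 0.5 * dt * k1, **params)
    k3 = lorenz63(state + 0.5 * dt * k2, **params)
    k4 = lorenz63(state + dt * k3, **params)
    return state + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def integrate(state, n_steps, **params):
    """Return the trajectory, including the initial state, as an array."""
    trajectory = np.empty((n_steps + 1, 3))
    trajectory[0] = state
    for i in range(n_steps):
        trajectory[i + 1] = rk4_step(trajectory[i], **params)
    return trajectory

nature = integrate(np.array([1.0, 1.0, 20.0]), 5000, sigma=10.0)
model = integrate(np.array([1.0, 1.0, 22.5]), 5000, sigma=9.0, z_shift=2.5)

# One 15-step cycle: a model first guess started from a state of nature.
first_guess = integrate(nature[1000], 15, sigma=9.0, z_shift=2.5)[-1]
```

Starting the model from `nature[1000]` reproduces the fidelity-paradigm setup, where the initial condition lies on nature's attractor rather than on the model's.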
RESULTS – CLIMATE MEAN MAPPING Except for very short lead time, mapped forecast beats traditional forecast with or without bias correction Remapped forecast beats traditional forecast at all leads PERFECT INITIAL CONDITIONS 67% error reduction • Drift-induced errors much reduced • Shadowing period extended 3-fold
RESULTS – CLIMATE MEAN MAPPING REPLACEMENT 3-DVAR Remapped analysis beats traditional analysis In the presence of initial errors, error reduction is smaller (~20%)
ADAPTIVE MAPPING VECTOR ESTIMATION • Needed when • Systematic errors are regime dependent or • Climate means are not available • Iterative process: • Based on relatively small amount of data • Length of iteration period ~15 days • 10 iterations (~half year) • Allow first guess fields to drift closer to model attractor with each iteration: M = Mprior + MIncr ALGORITHM 1. Mprior = M = 0 2. Use M in mapping algorithm during next iteration period 3. Estimate increment MIncr from that iteration period 4. Update M by M = Mprior + MIncr • Repeat steps 2-4 for each iteration [Figure: mapping vector estimate vs. iteration number]
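The iteration M = Mprior + MIncr can be sketched on a scalar toy. The slide does not state how MIncr is computed, so this sketch assumes it is the period-mean difference between the model first guess and the mapped observation. The toy dynamics (a damped map with shared random forcing and an offset `C` unknown to the algorithm), the error-free observations, the replacement initialization, and the cycle count are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
A, C = 0.8, 2.5   # toy damping factor and true model offset (hypothetical)

def nature_step(x, forcing):
    return A * x + forcing

def model_step(x, forcing):
    # Same dynamics as nature, but around a fixed point displaced by C.
    return A * (x - C) + C + forcing

def run_period(M, x_nature, n_cycles):
    """Cycle with replacement initialization from mapped (error-free) obs and
    return the mean first-guess-minus-mapped-observation difference."""
    diffs = []
    for _ in range(n_cycles):
        forcing = rng.normal()
        first_guess = model_step(x_nature + M, forcing)  # start from mapped state
        x_nature = nature_step(x_nature, forcing)        # nature moves on
        diffs.append(first_guess - (x_nature + M))
    return np.mean(diffs), x_nature

M, x_nature = 0.0, 0.0                 # step 1: Mprior = M = 0
history = [M]
for _ in range(10):                    # 10 iterations, as on the slide
    M_prior = M
    M_incr, x_nature = run_period(M, x_nature, n_cycles=50)  # steps 2-3
    M = M_prior + M_incr               # step 4
    history.append(M)
```

In this toy each increment removes a fixed fraction of the remaining offset, so the estimate approaches `C` gradually as the first guess is allowed to drift closer to the model attractor with each iteration.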
RESULTS – ADAPTIVE MAPPING Comparison with climate mean mapping 3-DVAR 20+% error reduction • Mapping vector varies over attractor • Adaptive mapping captures at least some regime-dependent fluctuations • Remapped forecasts with adaptive mapping beat those with climate mean mapping • Differences are • Small but • Highly significant
CONCLUSIONS • “Perfectly” known state is not best initial condition for imperfect models • Intentionally moving initial condition away from nature, toward model attractor yields superior forecast • Mapped forecasts used in DA yield superior analysis • Adaptive, regime dependent mapping vector estimation needs less data and yields improved analysis/forecast performance Taking a step back brings us closer to reality • If, like a fly, attracted too close to the light you get burnt; • By staying back, we can better understand / simulate nature
DISCUSSION • Drastic departure from traditional thinking • Only limited attempts to deviate from fidelity paradigm in literature • Representativeness error • Schneider et al ocean anomaly initialization • M. Clark et al conceptual snow model • Concept of mapping proven with simple system • Works “even” or “only” in simple systems? • Applicable to more complex systems? Only testing will tell • If it would work in real applications, several implications • Improved forecasts • Model used properly, no drift, less arbitrariness • Improved analyses • After remapping • Easier assessment of model errors • “Asymptotic” model error • No need for lead-time dependent bias correction • Significant savings by not having to generate huge reforecast dataset
OUTLINE / SUMMARY • ANALYSIS ERRORS • Observational errors • Background errors • Chaotic errors • Model-related errors • Stochastic errors • Systematic errors • Tendency errors • State errors FORECAST DRIFT SEPARATE INITIAL VALUE AND MODEL-RELATED ERRORS - NEED DIFFERENT ACTIONS To reduce initial error, draw to “nature”; to reduce model error, draw to model attractor HOW TO REDUCE DRIFT-INDUCED FORECAST ERRORS? • Mapping initial state on model attractor • Estimating asymptotic errors • Reducing model-related errors • Reducing total forecast errors • IMPROVED ANALYSES