
Forecasting Health Services with Time Series Methods: Case Study






Presentation Transcript


    1. Forecasting Health Services with Time Series Methods: Case Study. Tim Bruckner, Assistant Professor, Public Health & Planning, Policy, and Design, University of California, Irvine. tim.bruckner@uci.edu

    3. The value of forecasts? Professor of Economics, Yale University, September 1929 “Stock prices have reached what looks like a permanently high plateau”

    4. The value of forecasts? Professor of Economics, Yale University, September 1929 “Stock prices have reached what looks like a permanently high plateau” just before the stock market crash and Great Depression

    5. We must make decisions today. Policymakers must allocate health budgets and set priorities based, in part, on expectations of future need and capacity. Time series methods provide forecasting options that, in the long run, routinely outperform other regression methods.

    6. Learning Objectives Be able to clearly delineate forecasting goals given the context of the situation Describe three general forms of autocorrelation in a time series Understand the univariate ARIMA forecasting strategy and its applications

    7. Case Study: Children’s Mental Health. California’s publicly funded children’s Medicaid Early and Periodic Screening, Diagnostic, and Treatment (EPSDT) program. Services for children 5 to 21 years. Serves 130,000 children per month. Annual costs > $1 billion. The CA Dept of Mental Health wanted to improve their forecasting accuracy.

    8. EPSDT: What is the context? GOALS: Point forecast of annual total costs (vs. interval or monthly); error less than 4%; 12 to 24 month lead time (2 years ahead); flexibility to incorporate “what if” policy changes into forecasts; use their own expertise rather than outsource; a transferable process in the presence of staff turnover. DATA: Is it of good quality? Consistently measured? Cost data are not immediately available (6-month delay). What data are usable? Recent data, post-expansion, are most relevant.

    9. Their Original Forecast: Stepwise Auto-Regression with Linear Trend. Cost2008 = (Wt2007 × Cost2007) + (Wt2006 × Cost2006) + (Wt2005 × Cost2005) + (Wt2004 × Cost2004) + (Wt × TimeTrend) + Error. The method weighted the most recent years more heavily than past years. Good accuracy: 2-4% error. But 4% of $1 billion = $40 million!
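The weighted autoregression on slide 9 can be sketched as follows. The annual cost figures, the weights, and the trend coefficient below are all hypothetical illustrations; the slides report only the model's form and its 2-4% error, not its fitted values.

```python
# Sketch of a weighted autoregression with linear trend, as on slide 9.
# All numbers below are invented for illustration.

def weighted_ar_forecast(costs, weights, trend_coef, n_years):
    """Forecast next year's cost from the last len(weights) annual costs.

    costs: annual totals, oldest first; weights: newest-year weight first.
    """
    recent = costs[::-1][:len(weights)]      # newest year first
    ar_part = sum(w * c for w, c in zip(weights, recent))
    trend_part = trend_coef * (n_years + 1)  # linear time trend term
    return ar_part + trend_part

# Hypothetical annual costs in $millions, FY2004..FY2007 (oldest first)
annual_costs = [820.0, 870.0, 930.0, 1000.0]
# Hypothetical weights: the most recent year counts most
weights = [0.55, 0.25, 0.12, 0.08]
forecast = weighted_ar_forecast(annual_costs, weights,
                                trend_coef=5.0, n_years=len(annual_costs))
print(round(forecast, 1))
```

Even a small percentage error matters at this scale: 4% of a $1 billion budget is a $40 million miss.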

    10. Quality of Forecasts Based on ARIMA. Forecasts are extrapolations of historical data. A well-behaved history tends to lead to more accurate forecasts (ex: the distance between the Moon and Earth on Jan 1, 2020).

    11. Quality of Forecasts. But an erratic history tends to lead to less accurate forecasts. “Forecasting is like driving a car blindfolded with help from someone looking out of the rear window” (anon.). In health policy, we deal with stochastic (not deterministic) series, with varying levels of predictability.

    12. EPSDT Costs, FY 1994 to 2006

    14. AR - I - MA. Autoregressive: tendency for high or low values to exhibit “memory” in subsequent periods; a shock stays in the system indefinitely but diminishes exponentially (ex: temperature set by a thermostat). Integrated: the time series has a non-constant mean; differencing t−1 from t is a strategy to remove it (ex: odometer). Moving Average: a shock persists for q observations and then is gone, an “echo” in subsequent periods (ex: aftershock).
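The three forms of autocorrelation on slide 14 can be sketched numerically. The parameter values here (phi = 0.5, theta = 0.5) are illustrative assumptions, not estimates from the EPSDT series.

```python
# Toy illustration of the three ARIMA components: how a unit shock behaves
# under AR "memory", under an MA "echo", and how differencing handles an
# integrated (trending) series. Parameter values are invented.

def ar1_response(phi, horizon):
    """Effect of a unit shock on an AR(1): stays in, decays exponentially."""
    return [phi ** t for t in range(horizon)]

def ma1_response(theta, horizon):
    """Effect of a unit shock on an MA(1): echoes once, then vanishes."""
    return [1.0, theta] + [0.0] * (horizon - 2)

def difference(series, lag=1):
    """Integrated series: subtract t-lag from t to remove a trend."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

print(ar1_response(0.5, 5))              # [1.0, 0.5, 0.25, 0.125, 0.0625]
print(ma1_response(0.5, 5))              # [1.0, 0.5, 0.0, 0.0, 0.0]
print(difference([10, 12, 14, 16, 18]))  # trending series -> [2, 2, 2, 2]
```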

    15. Box-Jenkins (ARIMA) models. The expected value of costs is not its mean, due to patterns (e.g., trend, seasonality). Earlier values of the dependent variable itself are used to remove those patterns, so that the expected value of the residuals = 0. Use the best-fitting ARIMA model to perform “out of sample,” step-ahead forecasts.

    Notes: The deviation of a particular point from the regression line (its predicted value) is called the residual value. The approach is analogous to regression analysis: arrive at a best-fitting model (based on the history of the dependent variable) and then examine the residuals. The observed time series is the realization of some underlying stochastic process, a realization that is used to build a model of the process which generated the series. Box-Jenkins makes no assumptions about the shape of the dependent variable; we have no a priori expectation and impose no filter. We then build the best-fitting ARIMA model empirically, from empirically derived characteristics of the series. Removing autocorrelation from the dependent variable before testing the effect of an independent variable yields the added benefit of avoiding spurious associations induced by shared trends and cycles; the estimated coefficients are net of shared autocorrelation. A few assumptions: (1) Homogeneous-sense stationarity: the process has to be level, E[Yt] = θ0, with no drift or trends; this can be accomplished by differencing, i.e., applying a dth-order differencing operator (the backward-shift operator) to the series. (2) Stationary variance: a single constant variance throughout the series' course; this is usually fine after differencing, or after log-transformation of the data (first differencing of log-transformed data leads to stationary variance). Because ARIMA models must be identified from the data to be modeled, time series of over 50 observations are recommended/required.
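The identification step in the notes, examining a series' autocorrelation before and after differencing, can be sketched with a minimal sample-autocorrelation calculator. This is illustration only; the slides' actual workflow uses Stata's corrgram/ac commands.

```python
# Minimal sample autocorrelation function (ACF), the core diagnostic used
# to identify an ARIMA model. Pure Python for illustration.

def acf(series, max_lag):
    """Sample autocorrelations at lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((y - mean) ** 2 for y in series)
    out = []
    for k in range(1, max_lag + 1):
        cov = sum((series[t] - mean) * (series[t - k] - mean)
                  for t in range(k, n))
        out.append(cov / var)
    return out

# A trending (non-stationary) series shows a slowly decaying ACF,
# which signals the need for differencing before fitting AR/MA terms.
trend = list(range(24))
r = acf(trend, 3)
print([round(x, 2) for x in r])  # large, slowly declining correlations
```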

    16. Forecast FY 04-05 from FY 01-FY 04. We cannot assess accuracy for FY 07-08 unless we wait for future observations to become available. Instead, check the forecasting ability of the model using data already at hand; FY 04-05 was the last complete year. Univariate method: depends only on present and past values of the single series being forecasted.

    17. Out-of-Sample Forecast: 04-05. Use monthly values from July 2001 to June 2004 (n = 36). Identify autocorrelation and specify the appropriate error term. Compute the 1-step-ahead minimum mean squared error (MMSE) forecast.
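The out-of-sample check on slide 17 can be sketched as: fit on 36 months, forecast the next month, and score against the held-out value. The synthetic data and the simple AR(1) specification are assumptions for illustration; the actual model was identified from the real EPSDT series.

```python
import random

# Hedged sketch of an out-of-sample, one-step-ahead forecast check.
# Data are simulated; a real analysis would use the observed cost series.

random.seed(0)

def simulate_ar1(n, phi=0.7, mu=100.0, sigma=2.0):
    """Generate n points from an AR(1) process around level mu."""
    y, prev = [], mu
    for _ in range(n):
        prev = mu + phi * (prev - mu) + random.gauss(0, sigma)
        y.append(prev)
    return y

def fit_ar1(y):
    """Least-squares fit of y[t] on y[t-1]; returns (intercept, slope)."""
    x, z = y[:-1], y[1:]
    mx, mz = sum(x) / len(x), sum(z) / len(z)
    phi = (sum((a - mx) * (b - mz) for a, b in zip(x, z))
           / sum((a - mx) ** 2 for a in x))
    return mz - phi * mx, phi

series = simulate_ar1(37)          # 36 months for fitting + 1 held out
train, actual = series[:36], series[36]
c, phi = fit_ar1(train)
forecast = c + phi * train[-1]     # 1-step-ahead MMSE forecast for an AR(1)
pct_error = abs(forecast - actual) / actual * 100
print(round(pct_error, 2))         # percent error against the held-out month
```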

    ∇12 indicates that the variable has been differenced at lag 12 (i.e., the value at month t subtracted from the value at month t + 12). Yt is EPSDT costs during month t. B3 applies the “autoregressive” parameter at the third lag: a proportion (estimated by φ, which is always less than 1 in absolute value) of the estimated value of y at month t is “remembered” into month t + 3. at is the error term at month t.
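The ∇12 operation can be sketched on a made-up 24-month series with a pure 12-month pattern; differencing at lag 12 removes the annual cycle exactly in this artificial case.

```python
# The seasonal differencing operator from the slide: subtract the value
# 12 months earlier. The input series is invented for illustration.

def seasonal_difference(y, lag=12):
    """Return y[t] - y[t - lag] for all t where the lagged value exists."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]

# 24 months with a repeating annual pattern (level 100 plus a monthly bump)
seasonal = [100 + 10 * (m % 12) for m in range(24)]
print(seasonal_difference(seasonal))  # twelve zeros: the cycle is removed
```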

    20. Other ARIMA approaches. “What if” scenarios: the intervention approach quantifies past shocks to predict the impact of future changes (example: how did the change in age qualification affect EPSDT?). Multivariate approach: incorporate time-series independent variables on the “right side” (examples for EPSDT: caseload, services per client, unit cost). If you have lead data on these, in advance of total cost data, this is useful.
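The intervention idea can be sketched, in heavily simplified form, as estimating the size of a known past policy shock. The series and effect size are invented, and a plain before/after mean difference stands in for the full analysis; a real intervention model estimates the step effect jointly with the ARIMA noise structure.

```python
# Simplified "what if" intervention sketch: quantify a past step change
# (e.g., an eligibility-rule change) with a known change date.
# All numbers are hypothetical.

def fit_step_effect(y, t_change):
    """Difference in mean level before vs. after a known policy change."""
    before, after = y[:t_change], y[t_change:]
    return sum(after) / len(after) - sum(before) / len(before)

# 24 months of hypothetical costs: level 100 before, level 112 after
costs = [100.0] * 12 + [112.0] * 12
effect = fit_step_effect(costs, t_change=12)
print(effect)  # estimated step effect, usable in "what if" forecasts
```

Once a past shock is quantified this way, a comparable future policy change can be added to the forecast as a scenario adjustment.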

    21. Practical Limitations. Data availability often dictates the approach. The method requires some practice. ARIMA forecasts perform best for short lead times; long-term accuracy is questionable for any method. There is considerable heterogeneity of economic conditions in CA.

    22. Summary. Time series methods outperform other forecasting approaches. A clear understanding of planning goals and data attributes benefits any forecast. Autoregressive, Integrated, and Moving Average parameters reflect three general forms of autocorrelation. ARIMA forecasts perform best in stable systems, but can flexibly handle perturbations.

    23. Computer Resources. STATA: tsset; arima; predict; corrgram; ac; pac. SAS: proc ARIMA. SCA: http://www.scausa.com/
