
Shrinkage Estimation of Vector Autoregressive Models


Presentation Transcript


  1. Shrinkage Estimation of Vector Autoregressive Models Pawin Siriprapanukul pawin@econ.tu.ac.th 11 January 2010

  2. Introduction (1) • We want to forecast: • The rate of growth of employment, • The change in annual inflation, • The change in the federal funds rate. • A standard and simple system approach in economics is the VAR.

  3. Introduction (2) • OLS provides the efficient estimator for the VAR. • However, there is a lot of evidence showing that the Bayesian VAR outperforms the unrestricted OLS VAR in out-of-sample forecasting: • Litterman (1986), and Robertson and Tallman (1999).

  4. Introduction (3) • Banbura et al. (2008) also show that it is possible and satisfactory to employ many endogenous variables with long lags in the Bayesian VAR (131 variables, 13 lags). • We see some studies following this direction.

  5. Introduction (4) • There is another related literature on forecasting using a large number of predictors in the model. • A popular method is the “Approximate Factor Model”, proposed by Stock and Watson (2002).

  6. Introduction (5) • In this literature, it has been shown that using a larger number of predictors (independent variables) does not always improve forecasting performance. • Bai and Ng (2008) show that selecting variables with the LASSO or the elastic net before applying the approximate factor model methodology can outperform bigger models.

  7. Introduction (6) • Even though they interpret their results differently, we see this as evidence of redundancy in models with many predictors. • Now, considering a VAR with many endogenous variables and long lags, we think redundancy should be the case as well.

  8. Introduction (7) • We have not gone into VARs with many endogenous variables yet, but we are working with 13 lags in the VAR.

  9. Bias-Variance Tradeoff (1) • Suppose the OLS estimate is unbiased. • Gauss-Markov Theorem: • The OLS estimate has the smallest variance among all linear unbiased estimates. • However, we know that there are some biased estimates that have smaller variances than the OLS estimate.

  10. Bias-Variance Tradeoff (2) • [Diagram: the true model, the OLS estimate (unbiased, but high variance), and the shrinkage estimate (biased, but small variance).]

  11. VAR (1) • We consider a VAR relationship. • Note here that we cannot write the bias-variance tradeoff for the VAR. • The OLS estimate is biased in finite samples. • We still think similar logic applies; however, the direction of shrinkage may be important.

  12. VAR (2) • With T observations, we have: Y = XB + U, • where Y is the T × n matrix of the endogenous variables, X is the T × np matrix stacking the first p lags of the variables, B is the np × n coefficient matrix, and U is the T × n matrix of disturbances. • We assume the rows of U are i.i.d. N(0, Ψ).
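A minimal Python sketch (not the author's code) of stacking the data into the Y and X matrices of this slide; the function name and the assumed array shapes are illustrative.

    import numpy as np

    def build_var_matrices(data, p):
        # data: (T + p) x n array of demeaned stationary series; p: lag order.
        # Returns Y (T x n) and X (T x n*p), where row t of X stacks lags 1..p.
        Y = data[p:]
        X = np.hstack([data[p - k:-k] for k in range(1, p + 1)])
        return Y, X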

  13. VAR (3) • The unrestricted OLS estimator is: B̂ = (X′X)⁻¹X′Y. • This estimator may not be defined if we have too many endogenous variables or too many lags, since X′X becomes singular when np exceeds T.
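A hedged sketch of the unrestricted OLS fit for this matrix form, assuming the Y and X built above; lstsq is used so the code still returns a (non-unique) solution when X′X is singular.

    import numpy as np

    def var_ols(Y, X):
        # Unrestricted OLS: B_hat = (X'X)^{-1} X'Y, computed via least squares.
        # When n*p exceeds T, X'X is singular and the estimator is not well defined.
        B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
        return B_hat  # shape (n*p) x n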

  14. Bayesian VAR (1) • This is a shrinkage regression. • We follow Kadiyala and Karlsson (1997) and Banbura et al. (2008) in using the Normal-(Inverted)-Wishart as our prior distribution. • We work with stationary and demeaned variables. Hence, we set the mean of the prior distribution to zero.

  15. Bayesian VAR (2) • We can write the (point) estimator of our Bayesian VAR as: B̃ = (X′X + Ω0⁻¹)⁻¹X′Y, • where Ω0 is the scaling matrix of the prior covariance of the coefficients.
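A rough sketch under the assumption that, with a zero prior mean and a prior coefficient covariance scaled by Ω0, the posterior mean takes the ridge-like form above; Omega0_inv and the function name are illustrative, not the paper's notation.

    import numpy as np

    def bvar_posterior_mean(Y, X, Omega0_inv):
        # Assumed Normal-(Inverted)-Wishart posterior mean with zero prior mean:
        # B_tilde = (X'X + Omega0_inv)^{-1} X'Y.
        return np.linalg.solve(X.T @ X + Omega0_inv, X.T @ Y)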

  16. Ridge Regression (1) • Well-known in the statistical literature. • Can be defined as: β̂ = argmin_β ‖y − Xβ‖² + λ‖β‖². • This is a regression that imposes a penalty on the size of the estimated coefficients.

  17. Ridge Regression (2) • The solution of the previous problem is: β̂ = (X′X + λI)⁻¹X′y. • Observe the similarity with the Bayesian VAR estimator: B̃ = (X′X + Ω0⁻¹)⁻¹X′Y.
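A one-function sketch of the ridge closed form, to set beside the Bayesian VAR expression; lam is the penalty weight on the squared coefficients.

    import numpy as np

    def ridge(X, y, lam):
        # Ridge solution: beta_hat = (X'X + lam * I)^{-1} X'y.
        k = X.shape[1]
        return np.linalg.solve(X.T @ X + lam * np.eye(k), X.T @ y)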

  18. BVAR v RR (1) • Proposition 1: • The BVAR estimator can be seen as the solution of the optimization problem: • where is the (j,j)-th element of the matrix .

  19. BVAR v RR (2) • Proposition 2: • Let , we have: • Where • Note: If , is just standardized .

  20. LASSO (1) • Least Absolute Shrinkage and Selection Operator. • The LASSO estimate can be defined as: β̂ = argmin_β ‖y − Xβ‖² + λ Σj |βj|.
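A minimal sketch of fitting the LASSO for a single equation with scikit-learn; note that scikit-learn scales the squared-error term by 1/(2T), so its alpha is not numerically the same λ as in the definition above.

    from sklearn.linear_model import Lasso

    def lasso_equation(X, y, alpha):
        # L1-penalized least squares; many coefficients are set exactly to zero.
        model = Lasso(alpha=alpha, fit_intercept=False, max_iter=10000)
        return model.fit(X, y).coef_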

  21. LASSO (2) • The LASSO is proposed because: • Ridge regression is not parsimonious. • Ridge regression may generate huge prediction errors when the vector of true (unknown) coefficients is sparse. • The LASSO can outperform RR if: • The true (unknown) coefficients contain a lot of zeros.

  22. LASSO (3) • If there are a lot of irrelevant variables in the model, setting their coefficients to zero every time can reduce variance without disturbing the bias that much. • We see that a VAR with 13 lags may contain a lot of irrelevant variables.

  23. The Elastic Net (1) • Zou and Hastie (2005) propose another estimate that can further improve on the LASSO. • It is called the elastic net, and the naïve version can be defined as: β̂ = argmin_β ‖y − Xβ‖² + λ1 Σj |βj| + λ2 Σj βj².
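A corresponding sketch for the naive elastic net via scikit-learn; its (alpha, l1_ratio) parameterization has to be mapped to the (λ1, λ2) pair in the definition above.

    from sklearn.linear_model import ElasticNet

    def elastic_net_equation(X, y, alpha, l1_ratio):
        # Penalty: alpha * (l1_ratio * ||b||_1 + 0.5 * (1 - l1_ratio) * ||b||_2^2).
        model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio,
                           fit_intercept=False, max_iter=10000)
        return model.fit(X, y).coef_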

  24. The Elastic Net (2) • We modify the elastic net to allow treating different lagged variables differently. • Our modified naïve elastic net is:

  25. Implementation • We can use the algorithm called “LARS” proposed by Efron, Hastie, Johnstone, and Tibshirani (2004) to implement both LASSO and EN efficiently. • This can be applied to our modified version as well.
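For illustration, the LARS-based LASSO path is available in scikit-learn; the lag-weighted modification would require rescaling the columns of X first, which is not shown here.

    from sklearn.linear_model import lars_path

    def lasso_path_lars(X, y):
        # Full LASSO solution path via the LARS algorithm of Efron et al. (2004).
        alphas, active, coefs = lars_path(X, y, method='lasso')
        return alphas, coefs  # coefs[:, j] is the coefficient vector at alphas[j]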

  26. Empirical Study (1) • I use the US data set from Stock and Watson (2005). • Monthly data cover Jan 1959 – Dec 2003. • There are 132 variables, but I use only 7. • I transformed the data as in De Mol, Giannone, and Reichlin (2008) to obtain stationarity. • Their replication file can be downloaded. • Their transformation makes every variable an annual growth rate or a change in annual growth.

  27. Empirical Study (2) • Out-of-sample performance. • In each month from Jan 1981 to Dec 2003 (276 times), we estimate each model on the most recent 120 observations to make one forecast. • Performance is measured by the Relative Mean Squared Forecast Error (RMSFE), with OLS as the benchmark regression.
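A hedged sketch of this rolling exercise; fit_and_forecast and benchmark are hypothetical callables (e.g. a shrinkage estimator and the OLS benchmark) mapping a training window and horizon h to a forecast of the target variables.

    import numpy as np

    def rolling_rmsfe(data, fit_and_forecast, benchmark, window=120, h=1):
        # Each month, re-estimate on the most recent `window` observations,
        # forecast h steps ahead, and report the MSFE relative to the benchmark.
        errs, bench_errs = [], []
        for t in range(window, data.shape[0] - h + 1):
            train, actual = data[t - window:t], data[t + h - 1]
            errs.append((fit_and_forecast(train, h) - actual) ** 2)
            bench_errs.append((benchmark(train, h) - actual) ** 2)
        return np.mean(errs, axis=0) / np.mean(bench_errs, axis=0)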

  28. Empirical Study (3) • There are 3 variables that we want to forecast: • Employment (EMPL), • Annual inflation (INF), • The federal funds rate (FFR). • The order of the VAR is p = 13. • There are 4 forecast horizons (1, 3, 6, 12) and 3 values of π (0, 1, 2).

  29. Empirical Study (4) • The most time-consuming part is figuring out suitable parameters for each regression. • We use grid searches on out-of-sample performance during the test period Jan 1971 – Dec 1980 (120 times). • Bayesian VAR: we employ the procedure from my previous chapter. • LASSO: a grid of 90 values. • Modified elastic net: a grid of 420 pairs of values.
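A minimal sketch of the penalty-selection step on the pre-sample test period, assuming a hypothetical make_forecaster(lam) factory; the grid itself is illustrative, not the paper's actual 90-value or 420-pair grids.

    import numpy as np

    def select_penalty(test_data, make_forecaster, grid, window=120, h=1):
        # Pick the penalty with the smallest out-of-sample MSFE on the test period.
        best_lam, best_msfe = None, np.inf
        for lam in grid:
            forecast = make_forecaster(lam)
            sq_errs = [np.mean((forecast(test_data[t - window:t], h)
                                - test_data[t + h - 1]) ** 2)
                       for t in range(window, test_data.shape[0] - h + 1)]
            if np.mean(sq_errs) < best_msfe:
                best_lam, best_msfe = lam, np.mean(sq_errs)
        return best_lam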

  30. Empirical Study (5) • We also employ the combination of the LASSO and the Bayesian VAR. • The LASSO discards some variables that tend to correspond to zero true coefficients. • The Bayesian VAR is similar to ridge regression, which assigns a better amount of shrinkage to the positive coefficients.

  31. Empirical Study (6) • For the smallest model, we use the 3 variables to forecast themselves.

  32. Empirical Study (7)

  33. Empirical Study (8)

  34. Empirical Study (9) • Comparing different regressions (π = 0).

  35. Empirical Study (10) • Comparing different regressions (π = 0).

  36. Empirical Study (11) • When we change to the 7-variable VAR.

  37. Conclusion • Even though the empirical results are not impressive, we still think this is a promising way to improve the performance of Bayesian VARs. • When the model becomes bigger, e.g., models with 131 endogenous variables, this should be more relevant. • We can think of some cautions, like those of Boivin and Ng (2006), for the VAR as well.

  38. Thank you very much.
