
STAT 497 LECTURE NOTE 12. COINTEGRATION. Multivariate Unit Root Processes.


COINTEGRATION

- Generally, we cannot reject the null hypothesis that many time series have unit roots. For example, log consumption and log output are both non-stationary, but (log consumption − log output) is stationary. This situation is called cointegration. The practical problem is that when we have cointegration, the asymptotics change completely. Furthermore, we really do not have enough data to tell definitively whether or not we have cointegrated series.

- A univariate nonstationary time series $Y_t$ is said to be integrated of order d, I(d), if its (d−1)th difference is nonstationary but its d-th difference is stationary.
- If $Y_t$ is nonstationary but $\Delta Y_t = (1-B)Y_t$ is stationary, then $Y_t$ is integrated of order 1:
$Y_t \sim I(1)$ but $\Delta Y_t \sim I(0)$
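
As a small worked illustration (a standard textbook case, not taken from these notes), a random walk started at $Y_0 = 0$ is I(1): its variance grows with t, while its first difference is white noise.

```latex
Y_t = Y_{t-1} + a_t, \qquad a_t \sim \mathrm{WN}(0, \sigma^2), \qquad Y_0 = 0
\;\Longrightarrow\;
Y_t = \sum_{s=1}^{t} a_s, \qquad \operatorname{Var}(Y_t) = t\,\sigma^2 ,
\qquad
\Delta Y_t = (1 - B)\,Y_t = a_t \sim I(0).
```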

- In many time series, integrated processes are considered together and they form equilibrium relationships.
- Short-term and long-term interest rates
- Income and consumption
- This leads to the concept of cointegration.
- The idea behind cointegration is that although a multivariate time series is integrated, certain linear transformations of the series may be stationary.
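
The idea can be illustrated with a minimal simulation sketch. The loadings (1 and 0.5), the sample size, and the seed are assumptions chosen for illustration; they are not taken from the lecture. Two series share one random-walk trend, so each is I(1), while the combination y1 − 2·y2 cancels the trend.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 2000

# Shared stochastic trend: a pure random walk, hence I(1).
trend = np.cumsum(rng.normal(size=T))

# Two series loading on the same trend plus stationary noise (hypothetical loadings).
y1 = trend + rng.normal(size=T)
y2 = 0.5 * trend + rng.normal(size=T)

# The linear combination y1 - 2*y2 removes the trend and leaves only noise.
combo = y1 - 2.0 * y2

print("sample variance of y1:        ", np.var(y1))
print("sample variance of y1 - 2*y2: ", np.var(combo))
```

The sample variance of y1 depends on the length of the sample, whereas the variance of the combination stays near 1 + 4 = 5 whatever the sample size.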

- According to Granger, causality can be further sub-divided into long-run and short-run causality.
- This requires the use of error correction models or VECMs, depending on the approach for determining causality.
- Long-run causality is determined by the error correction term: if it is significant, this is evidence of long-run causality from the explanatory variable to the dependent variable.
- Short-run causality is determined as before, with a test on the joint significance of the lagged explanatory variables, using an F-test or Wald test.

- Before the ECM can be formed, there first has to be evidence of cointegration. Given that cointegration implies a significant error correction term, cointegration can be viewed as an indirect test of long-run causality.
- It is possible to have evidence of long-run causality, but not short-run causality and vice versa.
- In multivariate causality tests, the testing of long-run causality between two variables is more problematic, as it is impossible to tell which explanatory variable is causing the causality through the error correction term.
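
The two kinds of causality can be sketched with a single-equation ECM estimated by OLS. Everything in this sketch is an assumption made for illustration: the data-generating process, the adjustment speed 0.3, the known cointegrating relation y − x, and the helper name `ols_t`. With only one lagged explanatory term, the joint F-test for short-run causality reduces to a t-test.

```python
import numpy as np

def ols_t(y, X):
    """Return OLS coefficients and their conventional t-statistics."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    s2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
    return b, b / se

rng = np.random.default_rng(5)
T = 1000
x = np.cumsum(rng.normal(size=T))      # explanatory variable: a random walk
y = np.zeros(T)
for t in range(1, T):
    # y closes 30% of last period's gap (y - x); 0.3 is a made-up value
    y[t] = y[t - 1] - 0.3 * (y[t - 1] - x[t - 1]) + rng.normal()

dy, dx = np.diff(y), np.diff(x)
gap = y - x                            # cointegrating relation, assumed known

# ECM: dy_t = c + g * gap_{t-1} + d * dx_{t-1} + h * dy_{t-1} + e_t
X = np.column_stack([np.ones(T - 2), gap[1:-1], dx[:-1], dy[:-1]])
b, t_stats = ols_t(dy[1:], X)

print("error correction term: coef", b[1], "t", t_stats[1])  # long-run causality
print("lagged dx:             coef", b[2], "t", t_stats[2])  # short-run causality
```

In this design the error correction term should be strongly significant, while the lagged difference of x has a true coefficient of zero, so long-run causality is present without short-run causality.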

- If we regress a series y with a unit root on regressors that also have unit roots, the usual t-tests on the regression coefficients show statistically significant relationships even when none exists in reality.
- The spurious regression problem can also appear with I(0) series (see Granger, Hyung and Jeon (1998)). This tells us that the problem is generated by using the WRONG CRITICAL VALUES.
- In a spurious regression the errors are correlated, and the standard t-statistic is wrongly calculated because the variance of the errors is not consistently estimated. In the I(0) case the solution is:

- How do we detect a Spurious Regression (between I(1) series)?
Looking at the correlogram of the residuals and also by testing for a unit root on them.

- How do we convert a Spurious Regression into a valid regression?
By taking differences.

- Does this solve the spurious regression problem?
It solves the statistical problems but not the problem of economic interpretation. By taking differences we lose information, and a regression involving growth rates does not contain the same information as a regression involving the levels of the variables.

Typical symptom: “High $R^2$, t-values, F-value, but low DW”
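
The symptom can be reproduced in a small Monte Carlo sketch. The sample size, number of replications, the nominal 1.96 cutoff, and the helper name `ols_summary` are assumptions for illustration. Two independent random walks are regressed on each other in levels and then in differences.

```python
import numpy as np

def ols_summary(y, x):
    """OLS of y on a constant and x: slope t-statistic, R^2, Durbin-Watson."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, k = X.shape
    sigma2 = resid @ resid / (n - k)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    t_slope = beta[1] / np.sqrt(cov[1, 1])
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    dw = np.sum(np.diff(resid) ** 2) / (resid @ resid)
    return t_slope, r2, dw

rng = np.random.default_rng(1)
n_rep, T = 500, 200
reject_levels, reject_diffs, dw_levels = 0, 0, []
for _ in range(n_rep):
    y = np.cumsum(rng.normal(size=T))   # independent random walks
    x = np.cumsum(rng.normal(size=T))
    t_lev, _, dw = ols_summary(y, x)
    t_dif, _, _ = ols_summary(np.diff(y), np.diff(x))
    reject_levels += int(abs(t_lev) > 1.96)
    reject_diffs += int(abs(t_dif) > 1.96)
    dw_levels.append(dw)

print("rejection rate in levels:     ", reject_levels / n_rep)
print("rejection rate in differences:", reject_diffs / n_rep)
print("mean DW in levels:            ", np.mean(dw_levels))
```

The levels regressions reject far more often than the nominal 5% and show a low Durbin-Watson statistic, while the regressions in differences behave roughly as the usual theory predicts.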

1. Egyptian infant mortality rate (Y), 1971-1990, annual data, on gross aggregate income of American farmers (I) and total Honduran money supply (M)

$\hat{Y} = 179.9 - 0.2952\,I - 0.0439\,M$, $R^2 = 0.918$, DW = 0.4752, F = 95.17

(16.63) (-2.32) (-4.26) Corr = 0.8858, -0.9113, -0.9445

2. US export index (Y), 1960-1990, annual data, on Australian males' life expectancy (X)

$\hat{Y} = -2943 + 45.7974\,X$, $R^2 = 0.916$, DW = 0.3599, F = 315.2

(-16.70) (17.76) Corr = 0.9570

3. US defense expenditure (Y), 1971-1990, annual data, on population of South Africa (X)

$\hat{Y} = -368.99 + 0.0179\,X$, $R^2 = 0.940$, DW = 0.4069, F = 280.69

(-11.34) (16.75) Corr = 0.9694

4. Total crime rates in the US (Y), 1971-1991, annual data, on life expectancy in South Africa (X)

$\hat{Y} = -24569 + 628.9\,X$, $R^2 = 0.811$, DW = 0.5061, F = 81.72

(-6.03) (9.04) Corr = 0.9008

5. Population of South Africa (Y), 1971-1990, annual data, on total R&D expenditure in the US (X)

$\hat{Y} = 21698.7 + 111.58\,X$, $R^2 = 0.974$, DW = 0.3037, F = 696.96

(59.44) (26.40) Corr = 0.9873

- Does a regression between two I(1) variables make sense?
Yes, if the regression errors are I(0).

- Can this be possible?
David Hendry asked Clive Granger the same question some time ago. Clive answered "No way!" but also said that he would think about it. On the plane trip back home to San Diego, Clive thought about it and concluded that yes, it is possible. It is possible when both variables share the same source of I(1)-ness (co-I(1)), when both variables move together in the long run (co-move), ... when both variables are COINTEGRATED!

- An m×1 vector time series $Y_t$ is said to be cointegrated of order (d, b), CI(d, b), where $0 < b \le d$, if each of its component series $Y_{it}$ is I(d) but some linear combination of the series $\beta' Y_t$ is I(d−b) for some nonzero constant vector $\beta'$.
- $\beta'$ is the cointegrating vector, or the long-run parameter, and it is not unique.
- The most common case is d = b = 1.

- More generally, if the m×1 vector series $Y_t$ contains more than two components, each being I(1), then there may exist k (< m) linearly independent 1×m vectors $\beta_1', \beta_2', \ldots, \beta_k'$ such that $\beta' Y_t$ is a stationary k×1 vector process, where
$$\beta' = \begin{bmatrix} \beta_1' \\ \beta_2' \\ \vdots \\ \beta_k' \end{bmatrix}$$
is a k×m cointegrating matrix.

- The number of linearly independent cointegrating vectors is called the cointegrating rank.
$Y_t$ is then cointegrated of rank k.

- Consider the following system of processes
where the three error terms are uncorrelated white noise processes. Clearly, all three processes are individually I(1). Let $y_t = (x_{1t}, x_{2t}, x_{3t})'$ and $\beta = (1, 1, 2)'$; then $\beta' y_t = a_{1t}$, which is an I(0) process. Another cointegrating relationship is between $x_{2t}$ and $x_{3t}$. So we can let $\beta^* = (0, 1, -3)'$; then $\beta^{*\prime} y_t = a_{2t}$ is also I(0).
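
A numerical sketch of such a system follows. The three equations used here are an assumption: they are one system consistent with the two vectors (1, 1, 2) and (0, 1, −3) as quoted in the text, and they are not necessarily the system shown on the original slide.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 1000
a1, a2, a3 = rng.normal(size=(3, T))   # uncorrelated white noise errors

# Assumed system (hypothetical, chosen to match the vectors quoted in the text):
x3 = np.cumsum(a3)                     # random walk, I(1)
x2 = 3.0 * x3 + a2                     # inherits the unit root of x3
x1 = -x2 - 2.0 * x3 + a1               # inherits the unit root of x2 and x3

y = np.vstack([x1, x2, x3])            # shape (3, T)
beta = np.array([1.0, 1.0, 2.0])
beta_star = np.array([0.0, 1.0, -3.0])

z1 = beta @ y                          # x1 + x2 + 2*x3, which reduces to a1
z2 = beta_star @ y                     # x2 - 3*x3, which reduces to a2

# Two linearly independent cointegrating vectors: cointegrating rank 2.
rank = np.linalg.matrix_rank(np.vstack([beta, beta_star]))
print("cointegrating rank:", rank)
```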

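- Let $Y_t$ be m×1. Suppose we estimate a VAR(p)
$$Y_t = \Phi_1 Y_{t-1} + \Phi_2 Y_{t-2} + \cdots + \Phi_p Y_{t-p} + a_t$$
or
$$Y_t = \Phi(B)\,Y_{t-1} + a_t, \qquad \Phi(B) = \Phi_1 + \Phi_2 B + \cdots + \Phi_p B^{p-1}.$$

- Let us say we have a unit root. Then we can write
$$Y_t = \Phi(1)\,Y_{t-1} + \sum_{j=1}^{p-1} \Phi_j^* \Delta Y_{t-j} + a_t, \qquad \Phi_j^* = -\sum_{i=j+1}^{p} \Phi_i .$$
- This is like a multivariate version of the augmented Dickey-Fuller test.

- Rearranging the equation
$$\Delta Y_t = (\Phi(1) - I)\,Y_{t-1} + \sum_{j=1}^{p-1} \Phi_j^* \Delta Y_{t-j} + a_t$$
where $\mathrm{Rank}(\Phi(1) - I) < m$. There are two cases:

- $\Phi(1) = I$: then we have m independent unit roots, so there is no cointegration, and we should run the VAR in differences.
- $0 < \mathrm{Rank}(\Phi(1) - I) = k < m$: then we can write $\Phi(1) - I = \alpha\beta'$, where $\alpha$ and $\beta$ are m×k. The equation becomes:
$$\Delta Y_t = \alpha\beta' Y_{t-1} + \sum_{j=1}^{p-1} \Phi_j^* \Delta Y_{t-j} + a_t .$$
This is called a vector error correction model (VECM).

- Note that if you run OLS in differences when there is cointegration, then the model is misspecified and the results will be biased. What can you do?
(a) If you know the location of the unit roots and the cointegration relations, then you can run the VECM by doing OLS of $\Delta Y_t$ on lags of $\Delta Y_t$ and $\beta' Y_{t-1}$.

(b) If you know nothing, then you can either (i) run OLS in levels, or (ii) test (many times) to estimate the cointegrating relations, and then run the VECM. The problem with this approach is that you are testing many times and you are estimating the cointegrating relationships. This leads to poor finite-sample properties.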
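
Case (a) can be sketched for a bivariate VAR(1). The names `alpha` and `beta` for the two m×k factors of the long-run matrix, and their numerical values, are assumptions made for this illustration. The long-run matrix has rank 1, and OLS of the differences on the known error correction term recovers the adjustment coefficients.

```python
import numpy as np

# Hypothetical adjustment and cointegrating vectors for m = 2, k = 1.
alpha = np.array([[-0.4], [0.1]])
beta = np.array([[1.0], [-1.0]])

Pi = alpha @ beta.T            # long-run matrix (VAR coefficient sum minus identity)
Phi1 = np.eye(2) + Pi          # implied VAR(1) coefficient matrix
print("rank of long-run matrix:", np.linalg.matrix_rank(Pi))

rng = np.random.default_rng(3)
T = 2000
Y = np.zeros((T, 2))
for t in range(1, T):
    Y[t] = Phi1 @ Y[t - 1] + rng.normal(size=2)

# VECM by OLS with beta known: regress dY_t on the error correction term beta'Y_{t-1}.
dY = np.diff(Y, axis=0)
ect = Y[:-1] @ beta            # shape (T-1, 1)
alpha_hat = np.linalg.lstsq(ect, dY, rcond=None)[0].T
print("estimated adjustment coefficients:", alpha_hat.ravel())
```

With a VAR(1) there are no lagged differences on the right-hand side, which keeps the regression to a single regressor.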

- Procedures designed to distinguish a system without cointegration from a system with at least one cointegrating relationship; they do not estimate the number of cointegrating vectors (the k). Tests are conditional on a pretest for unit roots in each of the variables.
- When the cointegration vector is known: construct the hypothesized linear combination that is stationary, treat it as data, and apply a Dickey-Fuller unit root test to that linear combination. The null hypothesis is that there is a unit root, or no cointegration.

- When the cointegration vector is not known: assume that, if there exists a cointegrating relation, the coefficient on $Y_{1t}$ is nonzero, allowing us to express the "static regression" equation as
$$Y_{1t} = \gamma_2 Y_{2t} + \cdots + \gamma_m Y_{mt} + u_t .$$
- You can apply a unit root test to the estimated OLS residual from estimation of the above equation, but:
- Include a constant in the static regression if the alternative allows for a nonzero mean in $u_t$.
- Include a trend in the static regression if the alternative is stochastic cointegration, i.e., a nonzero trend for $A'Y_t$.

- The first step in testing cointegration is to test the null hypothesis of a unit root in each component series $Y_{it}$ individually, using the univariate unit root tests.
- If the hypothesis is not rejected, then the next step is to test cointegration among the components, i.e., to test whether $\beta' Y_t$ is stationary.

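- In practice, the cointegration vector is unknown. One way to test the existence of cointegration is the regression method (Engle & Granger, 1986, 1987).
- If $Y_t = (Y_{1t}, Y_{2t}, \ldots, Y_{mt})'$ is cointegrated, $\beta' Y_t$ is stationary, where $\beta = (\beta_1, \beta_2, \ldots, \beta_m)'$. Then $(1/\beta_1)\beta$ is also a cointegrating vector, where $\beta_1 \neq 0$.

- Consider the regression model for $Y_{1t}$
$$Y_{1t} = \phi_1 Y_{2t} + \cdots + \phi_{m-1} Y_{mt} + \varepsilon_t$$
and check whether $\varepsilon_t$ is I(1) or I(0).

- If $\varepsilon_t \sim I(1)$, then $Y_t$ is not cointegrated.
- If $\varepsilon_t \sim I(0)$, then $Y_t$ is cointegrated with a normalized cointegrating vector
$\beta' = (1, -\phi_1, \ldots, -\phi_{m-1})$.

- In testing the error series for nonstationarity,
- Calculate the OLS estimate $\hat\beta' = (1, -\hat\phi_1, \ldots, -\hat\phi_{m-1})$.
- Use the residual series $\hat\varepsilon_t$ for the test, using the standard ADF or PP test.
- $H_0: \rho = 1$ vs $H_1: \rho < 1$ for the model $\varepsilon_t = \rho\,\varepsilon_{t-1} + a_t$, where $|\rho| < 1$ if $\varepsilon_t \sim I(0)$.
- Equivalently, $H_0: \lambda = 0$ vs $H_1: \lambda < 0$ for the model $\Delta\varepsilon_t = \lambda\,\varepsilon_{t-1} + a_t$, where $\lambda = \rho - 1$.

- t-statistic: $T = \hat\lambda / S_{\hat\lambda}$
- The critical values are obtained by simulation (Engle & Granger, 1987).

Level of significance    1%       5%
p = 1                   -4.07    -3.37
p > 1                   -3.73    -3.17

- If T < critical value, reject $H_0$: cointegration exists.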
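
The two-step regression method can be sketched as follows. The simulated data, the true coefficient 2, the inclusion of a constant in the static regression, and the function name `engle_granger_t` are assumptions for illustration. The critical value is the 5% entry for p = 1 from the table, taken as negative because the rule rejects when T falls below it.

```python
import numpy as np

def engle_granger_t(y1, y2):
    """Two-step residual-based test: returns (slope estimate, t-statistic on lambda)."""
    # Step 1: static OLS regression of y1 on a constant and y2.
    X = np.column_stack([np.ones(len(y2)), y2])
    coef, *_ = np.linalg.lstsq(X, y1, rcond=None)
    e = y1 - X @ coef
    # Step 2: Dickey-Fuller regression  d e_t = lambda * e_{t-1} + a_t  (no constant).
    de, lag = np.diff(e), e[:-1]
    lam = (lag @ de) / (lag @ lag)
    resid = de - lam * lag
    s2 = resid @ resid / (len(de) - 1)
    se = np.sqrt(s2 / (lag @ lag))
    return coef[1], lam / se

rng = np.random.default_rng(4)
T = 500
y2 = np.cumsum(rng.normal(size=T))        # I(1) regressor
y1 = 2.0 * y2 + rng.normal(size=T)        # cointegrated by construction

slope_hat, t_stat = engle_granger_t(y1, y2)
CRIT_5PCT = -3.37                         # 5% value for p = 1 (negative sign assumed)
print("slope estimate:", slope_hat)
print("t-statistic:", t_stat, "reject H0 (cointegration):", t_stat < CRIT_5PCT)
```

Because the residuals are estimated, the ordinary Dickey-Fuller tables do not apply, which is why the sketch compares against the simulated Engle-Granger value instead.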

- To test whether the variables are cointegrated or not, one of the well-known tests is the Johansen trace test. The Johansen test is used to test for the existence of cointegration and is based on estimating the ECM by maximum likelihood, under various assumptions about the trend or intercept parameters and the number k of cointegrating vectors, and then conducting likelihood ratio tests.

- Assuming that the ECM errors are independent and $N_m[0, \Sigma]$ distributed, and given the cointegrating restrictions on the trend or intercept parameters, the maximum likelihood $L_{\max}(k)$ is a function of the cointegration rank k.
- The trace test is based on the log-likelihood ratio $\ln[L_{\max}(k)/L_{\max}(m)]$, and is conducted sequentially for k = m-1, ..., 1, 0. The name comes from the fact that the test statistics involved are the trace (the sum of the diagonal elements) of a diagonal matrix of generalized eigenvalues. This test examines the null hypothesis that the cointegration rank is less than or equal to k against the alternative that the cointegration rank is greater than k. If the trace statistic is greater than the critical value for a certain rank k, then this null hypothesis is rejected.

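- Consider a non-stationary cointegrated VAR(p) model
$$Y_t = \Phi_1 Y_{t-1} + \cdots + \Phi_p Y_{t-p} + a_t$$
where the $a_t$ are normally distributed with mean 0 and covariance matrix $\Sigma$. In a series of influential papers, Johansen (1988, 1991) and Johansen and Juselius (1990) proposed practical full maximum likelihood estimation and testing approaches based on the error correction representation (ECM).

- Consider the ECM
$$\Delta Y_t = \Pi_0 d_t + \sum_{j=1}^{p-1} \Pi_j \Delta Y_{t-j} + \gamma A' Y_{t-1} + \varepsilon_t$$
where $\Delta Y_t = Y_t - Y_{t-1}$, $d_t$ is a vector of deterministic variables, such as a constant and seasonal dummy variables, the $\Pi_j$ (j = 1, ..., p-1) are m×m, $\gamma$ and $A$ are m×k parameter matrices, the $\varepsilon_t$ are i.i.d. $N_m(0, \Sigma)$ errors, and $\det\left(I_m - \sum_{j=1}^{p-1} \Pi_j B^j\right)$ has all of its roots outside the unit circle.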

- This ECM is based on the Engle-Granger (1987) error correction representation theorem for cointegrated systems, and the asymptotic inference involved is related to the work of Sims, Stock, and Watson (1990).
- By step-wise concentrating all the parameter matrices in the likelihood function out except for the matrix A, Johansen shows that the maximum likelihood estimator of A can be derived as the solution of a generalized eigenvalue problem. Likelihood ratio tests of hypotheses about the number of cointegrating vectors can then be based on these eigenvalues. Moreover, Johansen (1988) also proposes likelihood ratio tests for linear restrictions on these cointegrating vectors.

- The Johansen test for the existence of cointegration is based on the estimation of the above ECM by maximum likelihood and is used to test the hypothesis
$$H(k): \Pi = \gamma A'$$
where $\Pi$ is the m×m coefficient matrix on the lagged level of $Y_t$ in the ECM, $\gamma$ and $A$ are m×k, and k is less than m. This formulation shows that I(1) models form a nested sequence of models
$$H(0) \subset H(1) \subset \cdots \subset H(m)$$
where H(m) is the unrestricted VAR model or I(0) model, and H(0) corresponds to the restriction $\Pi = 0$, which is the VAR model in differences. Since $\Pi = \gamma A'$, it is equivalent to test that $A$ and $\gamma$ are of full column rank k, the number of independent cointegrating vectors that form the matrix $A$. The test has been named the Johansen trace test because the likelihood ratio test statistic is the trace of a diagonal matrix of generalized eigenvalues.

Sequential tests:

i. $H_0$: k = 0 (at most zero cointegrating vectors). Cannot be rejected → stop → k = 0; rejected → next test.

ii. $H_0$: k ≤ 1 (at most one cointegrating vector). Cannot be rejected → stop → k = 1; rejected → next test.

iii. $H_0$: k ≤ 2 (at most two cointegrating vectors). Cannot be rejected → stop → k = 2; rejected → next test.
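
The sequential procedure can be written as a short loop. The function name `select_rank` and all numbers below are hypothetical; the statistics and critical values are illustrative only and do not come from any published table.

```python
def select_rank(trace_stats, crit_values):
    """Return the first k for which H0: rank <= k is not rejected (m if all are rejected)."""
    for k, (stat, crit) in enumerate(zip(trace_stats, crit_values)):
        if stat <= crit:          # cannot reject H0: rank <= k, so stop here
            return k
    return len(trace_stats)       # every null rejected: full rank m

# Hypothetical trace statistics and critical values for m = 3 (illustrative only).
trace_stats = [45.2, 12.1, 2.3]
crit_values = [30.0, 15.0, 4.0]
print(select_rank(trace_stats, crit_values))
```

Here the first null (k = 0) is rejected and the second (k ≤ 1) is not, so the procedure stops with k = 1.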

(i) Rank k = m: all variables in x are I(0), not an interesting case to start with.

(ii) Rank k = 0: there are no linear combinations of x that are I(0), no cointegration exists, and $\Pi$ is full of zeros. Model the differenced series.

(iii) Rank k ≤ (m-1): up to (m-1) cointegration relationships $\beta' x_{t-k}$,

i.e., k ≤ (m-1) rows of $\Pi$ form k linearly independent combinations of the variables in x, each of which is I(0); alternatively, there are (m-k) nonstationary vectors forming I(1) stochastic trends.

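- Under some regularity conditions, we can write the cointegrated process as an Error Correction Model (ECM):
$$\Delta Y_t = \alpha\beta' Y_{t-1} + \sum_{i=1}^{p-1} \Phi_i^* \Delta Y_{t-i} + a_t$$
where $\Delta$ is the difference operator, $\Delta Y_t = Y_t - Y_{t-1}$, and the $a_t$'s are i.i.d. $N(0, \Sigma)$.

- We can write this ECM as
$$Z_{0t} = \alpha\beta' Z_{1t} + \Psi Z_{2t} + a_t$$
where $Z_{0t} = \Delta Y_t$, $Z_{1t} = Y_{t-1}$, $Z_{2t} = (\Delta Y_{t-1}', \ldots, \Delta Y_{t-p+1}')'$, and $\Psi = (\Phi_1^*, \ldots, \Phi_{p-1}^*)$.

- The likelihood ratio statistic for the hypothesis
$$H(k): \mathrm{rank}(\alpha\beta') \le k$$
is given by
$$-2\ln Q = -T \sum_{i=k+1}^{m} \ln(1 - \lambda_i)$$
where $\lambda_i$ denotes the eigenvalues of
$$\left|\lambda S_{11} - S_{10} S_{00}^{-1} S_{01}\right| = 0$$
and are ordered by
$$\lambda_1 > \lambda_2 > \cdots > \lambda_m ,$$
where
$$S_{ij} = \frac{1}{T} \sum_{t=1}^{T} R_{it} R_{jt}', \qquad i, j = 0, 1,$$
and $R_{0t}$ and $R_{1t}$ are the residuals from regressing $Z_{0t}$ and $Z_{1t}$ on $Z_{2t}$.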

- If the test statistics are greater than the critical value for rank k, then the null hypothesis that the cointegration rank is equal to k is rejected.

- The likelihood ratio (trace) statistic $-2\ln Q$ has the following limiting distribution, which can be expressed in terms of an (m-k)-dimensional Brownian motion $W$ as
$$\mathrm{tr}\left\{ \int_0^1 (dW)\,W' \left[ \int_0^1 W W'\,dr \right]^{-1} \int_0^1 W\,(dW)' \right\}$$
- The percentiles of the asymptotic distribution for the trace statistic are tabulated in Johansen (1988, Table 1) using simulation analysis.

- An alternative LR statistic, given by
$$\lambda_{\max} = -T \ln(1 - \lambda_{k+1})$$
and called the maximal eigenvalue statistic, examines the null hypothesis of k cointegrating vectors versus the alternative of k+1 cointegrating vectors. The asymptotic distribution of this statistic is given by the maximum eigenvalue of the stochastic matrix in the limiting distribution of the trace statistic.
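
The eigenvalue computation behind both statistics can be sketched with NumPy for the simplest case. The restriction to a VAR(1) with no deterministic terms (so that no preliminary regression on lagged differences is needed), the simulated data, and the function name `johansen_trace_var1` are assumptions for illustration. Critical values are not included.

```python
import numpy as np

def johansen_trace_var1(Y):
    """Eigenvalues and trace statistics for a VAR(1) without deterministic terms."""
    R0 = np.diff(Y, axis=0)            # differences dY_t
    R1 = Y[:-1]                        # lagged levels Y_{t-1}
    T = R0.shape[0]
    S00 = R0.T @ R0 / T
    S01 = R0.T @ R1 / T
    S11 = R1.T @ R1 / T
    # |lambda*S11 - S10 S00^{-1} S01| = 0  <=>  eigenvalues of S11^{-1} S10 S00^{-1} S01
    M = np.linalg.solve(S11, S01.T @ np.linalg.solve(S00, S01))
    lam = np.sort(np.linalg.eigvals(M).real)[::-1]
    # trace[k] = -T * sum_{i > k} ln(1 - lambda_i), for k = 0, ..., m-1
    trace = np.array([-T * np.sum(np.log(1.0 - lam[k:])) for k in range(len(lam))])
    return lam, trace

# Simulated bivariate system with one cointegrating relation (hypothetical values).
rng = np.random.default_rng(6)
alpha = np.array([[-0.4], [0.1]])
beta = np.array([[1.0], [-1.0]])
Phi1 = np.eye(2) + alpha @ beta.T
Y = np.zeros((1000, 2))
for t in range(1, len(Y)):
    Y[t] = Phi1 @ Y[t - 1] + rng.normal(size=2)

lam, trace = johansen_trace_var1(Y)
max_eig = -(len(Y) - 1) * np.log(1.0 - lam)   # maximal eigenvalue statistics
print("eigenvalues:", lam)
print("trace statistics for k = 0, 1:", trace)
print("max-eigenvalue statistics:", max_eig)
```

The last trace statistic and the last maximal eigenvalue statistic coincide, since both involve only the smallest eigenvalue.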

- Consider the following four-dimensional system of U.S. economic variables. Quarterly data for the years 1954 to 1987 are used (Lütkepohl 1993, Table E.3.). The following statements plot the series and proceed with the VARMAX procedure.

/* Global plot settings and title */
symbol1 v=none height=1 c=black;
symbol2 v=none height=1 c=black;
title 'Analysis of U.S. Economic Variables';

/* Read the quarterly data; y1 and y2 are converted to logs */
data us_money;
   date=intnx( 'qtr', '01jan54'd, _n_-1 );
   format date yyq. ;
   input y1 y2 y3 y4 @@;
   y1=log(y1);
   y2=log(y2);
   label y1='log(real money stock M1)'
         y2='log(GNP in bil. of 1982 dollars)'
         y3='Discount rate on 91-day T-bills'
         y4='Yield on 20-year Treasury bonds';
datalines;
... data lines omitted ...
;

legend1 across=1 frame label=none;

/* Plot the two log series (y1, y2) against time */
proc gplot data=us_money;
   symbol1 i = join l = 1;
   symbol2 i = join l = 2;
   axis2 label = (a=-90 r=90 " ");
   plot y1 * date = 1 y2 * date = 2 / overlay vaxis=axis2 legend=legend1;
run;

/* Plot the two interest rate series (y3, y4) against time */
proc gplot data=us_money;
   symbol1 i = join l = 1;
   symbol2 i = join l = 2;
   axis2 label = (a=-90 r=90 " ");
   plot y3 * date = 1 y4 * date = 2 / overlay vaxis=axis2 legend=legend1;
run;

/* Unit root tests, Johansen cointegration test, and a VECM(2) of rank 1 */
proc varmax data=us_money;
   id date interval=qtr;
   model y1-y4 / p=2 lagmax=6 dftest
                 print=(iarr(3))
                 cointtest=(johansen=(iorder=2))
                 ecm=(rank=1 normalize=y1);
   cointeg rank=1 normalize=y1 exogeneity;
run;

- This example performs the Dickey-Fuller test for stationarity, the Johansen cointegration test with integrated order 2, and the exogeneity test. A VECM(2) is fitted to the data. From the procedure output, you can see that the series have unit roots and are cointegrated in rank 1 with integrated order 1. The fitted VECM(2) is given as

The exogeneity test checks whether each variable is weakly exogenous with respect to the other variables. The variable y1 is not weakly exogenous with respect to the other variables, y2, y3, and y4; the variable y2 is not weakly exogenous with respect to the other variables, y1, y3, and y4.

If a variable can be taken as "given" without losing information for the purpose of statistical inference, it is called weakly exogenous.

Weak exogeneity corresponds to long-run noncausality.
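
As a hedged illustration in a bivariate VECM of rank 1 (the symbols $\alpha_1$, $\alpha_2$ for the adjustment coefficients and $\beta$ for the cointegrating vector are assumed notation), weak exogeneity of the second variable with respect to the long-run parameters amounts to a zero adjustment coefficient in its equation:

```latex
\begin{aligned}
\Delta Y_{1t} &= \alpha_1\,\beta' Y_{t-1} + \text{(lagged differences)} + a_{1t},\\
\Delta Y_{2t} &= \alpha_2\,\beta' Y_{t-1} + \text{(lagged differences)} + a_{2t},
\end{aligned}
\qquad
H_0:\ \alpha_2 = 0 .
```

Under $H_0$ the second variable does not react to the disequilibrium $\beta' Y_{t-1}$, which is the sense in which weak exogeneity is read as long-run noncausality.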