
Multiple Regression



Presentation Transcript


  1. Multiple Regression Dr. C. Ertuna

  2. Multiple Regression (CLR) (5.2) A Multiple Linear Regression Model looks like: Yt = β1 + β2X2t + β3X3t + … + βkXkt + ut

  3. Assumptions of CLR Model • Linearity • Variance in X: Var(X) ≠ 0 • Non-Stochastic X: Cov(Xs, ut) = 0 • No Multicollinearity • Homoskedasticity: Var(ut) = σ2 • Zero expected value for Residuals: E(ut) = 0 • Serial Independence: Cov(us, ut) = 0 for s ≠ t • Normality of Residuals

  4. Goodness of Fit Measures R2 cannot be used to compare models with a different number of explanatory variables, even if the functional forms of the models are the same: additional regressors will decrease RSS and hence increase R2. Adjusted R2 takes into account the number of additional regressors: Adj-R2 &lt; R2 (p. 65, Eq. 5.57)
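The point that the adjusted value is always the smaller of the two can be checked directly from RSS and TSS. A minimal sketch follows; all the numbers in it are made up for illustration, not taken from the textbook.

```python
# Hypothetical numbers; a sketch of R^2 vs. adjusted R^2, not textbook data.

def r_squared(rss, tss):
    """Plain R^2 = 1 - RSS/TSS."""
    return 1.0 - rss / tss

def adj_r_squared(rss, tss, n, k):
    """Adjusted R^2 penalizes extra parameters (k = regressors + intercept)."""
    return 1.0 - (rss / (n - k)) / (tss / (n - 1))

tss = 100.0        # total sum of squares (hypothetical)
rss = 40.0         # residual sum of squares (hypothetical)
n, k = 30, 4       # 30 observations, 3 regressors plus intercept

r2 = r_squared(rss, tss)
r2_adj = adj_r_squared(rss, tss, n, k)
print(round(r2, 4), round(r2_adj, 4))  # the adjusted value is the smaller one
```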

  5. Goodness of Fit Measures (Cont.) AIC (Akaike Information Criterion): the best criterion among the alternatives. The decision rule is that the model with the lowest AIC is the best. AIC = (RSS/n) * exp(2k/n), where k = number of regressors + intercept, n = number of observations, and RSS = residual sum of squares.

  6. Goodness of Fit Measures (Cont.) An equivalent log form of the same criterion: AIC = n * ln(RSS/n) + 2k, where k = number of regressors + intercept, n = number of observations, and RSS = residual sum of squares. This is n times the log of the form on the previous slide, so both forms rank models identically.
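A small sketch, using hypothetical RSS values and sample sizes, confirms that the two AIC formulas on the slides above select the same model:

```python
import math

def aic_exp_form(rss, n, k):
    """AIC = (RSS/n) * exp(2k/n), as on slide 5."""
    return (rss / n) * math.exp(2.0 * k / n)

def aic_log_form(rss, n, k):
    """AIC = n*ln(RSS/n) + 2k, as on slide 6."""
    return n * math.log(rss / n) + 2.0 * k

# Hypothetical models: one extra regressor lowers RSS from 120 to 100.
n = 50
a1 = aic_exp_form(120.0, n, 3)   # smaller model: 2 regressors + intercept
a2 = aic_exp_form(100.0, n, 4)   # larger model:  3 regressors + intercept
b1 = aic_log_form(120.0, n, 3)
b2 = aic_log_form(100.0, n, 4)
print(a2 < a1, b2 < b1)  # both criteria prefer the same (larger) model here
```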

  7. Testing Model Parameters For several reasons a researcher may wish to test whether certain regression parameters (or a certain set of regression parameters) are equal to one another or to a specific value. For example: Is the impact of two regressors (say, X2 and X3) the same? That is, is β2 = β3? This amounts to setting restrictions on those parameters, which is why it is also called “Testing Linear Restrictions.”

  8. Restricted and Unrestricted Models Restrictions on a model can take different forms: • A combination of coefficients assumes a certain value: for example, β2 + β3 = 1. • One coefficient assumes a certain value, as in the Redundant Variable case β3 = 0 (meaning X3 is a redundant regressor that does not contribute to the explanation of the model).

  9. Testing Linear Restrictions There are several methods to test linear restrictions, such as: • Likelihood Ratio Test (based on estimation of both restricted and unrestricted model), • Wald Test (based on estimation of unrestricted model), and • Lagrange Multiplier Test (based on estimation of restricted model).

  10. Restricted and Unrestricted Models An unrestricted model is a model prior to any restrictions. In general, unrestricted models have more regressors than their restricted versions. The fundamental concept behind any linear restriction test is that the RSS of the restricted model (with fewer explanatory variables) is greater than the RSS of the unrestricted model (which has more explanatory variables).

  11. Ho of Linear Restriction Tests In all three approaches to linear restriction tests, the Null Hypothesis is as follows. Ho: There is no difference between the restricted and unrestricted models in terms of Goodness of Fit; hence we do not need the extra regressor(s) (or the unrestricted model). Put another way, the change in Goodness of Fit between the two specifications is statistically insignificant.

  12. Ha of Linear Restriction Tests If, on the other hand, p-value &lt; α, then the unrestricted model provides a better Goodness of Fit than the restricted model. For example, if the restriction is β3 = 0 and p-value &lt; α, that means β3 ≠ 0; in other words, regressor X3 is not redundant: it does contribute to the explanation of the model.

  13. Linear Restriction Application The t-test in SPSS’s Coefficients table output is a special case of the Wald test. Ho: βi = 0 (it tests whether Xi is redundant or not). The F-test in SPSS’s ANOVA table output is a form of the Likelihood Ratio test. Ho: β2 = β3 = … = βk = 0 (the joint significance of the X’s is tested).

  14. Example: Omitted Variable Test Page 78. Two Models. Model-1: Table 5.4 &amp; Model-2: Table 5.5. Using the F-form of the Likelihood Ratio test: F(kU − kR, N − kU) = ((RSSR − RSSU) / (kU − kR)) / (RSSU / (N − kU)) p-value = FDIST(F-value; kU − kR; N − kU)

  15. Definition of Test Parameters RSSR = Residual Sum of Squares for the Restricted Model (the model with fewer variables). RSSU = Residual Sum of Squares for the Unrestricted Model (the model with more variables). kR = Number of parameters (including the intercept) in the Restricted model. kU = Number of parameters (including the intercept) in the Unrestricted model. N = Number of observations of the unrestricted model.

  16. Steps in Omitted Variable Test F(kU − kR, N − kU) = ((RSSR − RSSU) / (kU − kR)) / (RSSU / (N − kU)); p-value = FDIST(F-value; kU − kR; N − kU). 1) Get RSS and k of the Restricted Model (the model with fewer variables). 2) Get RSS, k, and N of the Unrestricted Model (the model with more variables). 3) Apply the F-form of the Likelihood Ratio test (get organized and use Excel for computation).
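The three steps above can be sketched in code. The RSS values, parameter counts, and sample size below are hypothetical stand-ins, not the figures from the textbook's Tables 5.4 and 5.5:

```python
def lr_f_test(rss_r, rss_u, k_r, k_u, n):
    """F-form of the Likelihood Ratio test for linear restrictions.
    rss_r, rss_u: residual sums of squares of the restricted/unrestricted model;
    k_r, k_u: number of parameters (incl. intercept); n: observations."""
    df1 = k_u - k_r          # number of restrictions
    df2 = n - k_u            # residual degrees of freedom, unrestricted model
    f = ((rss_r - rss_u) / df1) / (rss_u / df2)
    return f, df1, df2

# Hypothetical numbers: dropping one regressor raises RSS from 80 to 100.
f, df1, df2 = lr_f_test(rss_r=100.0, rss_u=80.0, k_r=3, k_u=4, n=44)
print(round(f, 2), df1, df2)
# The p-value is then FDIST(f, df1, df2) in Excel,
# or scipy.stats.f.sf(f, df1, df2) in Python.
```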

  17. Omitted Variable Test: Step-1

  18. Omitted Variable Test: Step-2

  19. Omitted Variable Test: Step-3 Use the formula on page 70 to compute the F-value in Excel, where Model-1 is the restricted model. Why? Because it does not have the variable EDUC; in other words, the regression parameter for EDUC is set to zero in Model-1 (the restriction). Then use =FDIST() to compute the p-value.

  20. F-form of the Likelihood Ratio test

  21. DECISION Since the p-value is smaller than alpha, we decide that the restriction does not hold. In other words, the unrestricted model is better than the restricted model. In particular, the variable “EDUC” should be part of the model’s explanatory variables.

  22. END

  23. Test for Marginal Contribution of a New Variable • A very useful test in deciding whether a new variable should be retained in the model. • E.g.: the mortality rate of a country is a function of its National Income, literacy rate, and health indicators. • The question is whether we should include per capita income (PCI) in the model. • Estimate a model without PCI and get R2(old). • Re-estimate including PCI and get R2(new). Ho: Addition of the new variable does not improve the model. H1: Addition of the new variable improves the model. If the estimated F is higher than the critical F-table value, reject the null hypothesis; it means PCI needs to be included in the above example.
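A sketch of the standard incremental-F statistic built from the two R2 values, F = ((R2new − R2old)/m) / ((1 − R2new)/(n − k)), where m is the number of added regressors and k is the parameter count of the larger model. The R2 values and sample size are invented for illustration:

```python
def marginal_f(r2_old, r2_new, m, n, k_new):
    """F statistic for the marginal contribution of m added regressors.
    k_new = number of parameters (incl. intercept) in the larger model."""
    return ((r2_new - r2_old) / m) / ((1.0 - r2_new) / (n - k_new))

# Hypothetical: adding PCI raises R^2 from 0.70 to 0.75, with n = 40, k_new = 5.
f = marginal_f(r2_old=0.70, r2_new=0.75, m=1, n=40, k_new=5)
print(round(f, 2))  # compare against the critical F(m, n - k_new) table value
```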

  24. Testing Equality of Coefficients • To test whether 2 slope coefficients are equal. • t-test approach: Ho: β2 = β3 ↔ β2 − β3 = 0.
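The t statistic for this hypothesis is the difference of the estimates over its standard error, se(b2 − b3) = sqrt(Var(b2) + Var(b3) − 2 Cov(b2, b3)). A minimal sketch with hypothetical coefficient estimates and (co)variances:

```python
import math

def t_equal_coeffs(b2, b3, var_b2, var_b3, cov_b2b3):
    """t statistic for Ho: beta2 = beta3.
    se(b2 - b3) = sqrt(Var(b2) + Var(b3) - 2*Cov(b2, b3))."""
    se = math.sqrt(var_b2 + var_b3 - 2.0 * cov_b2b3)
    return (b2 - b3) / se

# Hypothetical estimates and covariance entries from a fitted model:
t = t_equal_coeffs(b2=1.8, b3=1.2, var_b2=0.04, var_b3=0.05, cov_b2b3=0.005)
print(round(t, 2))  # compare against the critical t value with n - k d.f.
```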

  25. Testing a Linear Equality Restriction • Theory might lead you to impose certain “a priori” restrictions on your model. • E.g.: constant returns to scale in the Cobb-Douglas model: β2 + β3 = 1. This is a linear restriction. • How do you check whether this is valid or not? • One way is using a t-test: Ho: β2 + β3 = 1 ↔ β2 + β3 − 1 = 0.
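The t-test here mirrors the previous slide, with a plus sign in the standard error: se(b2 + b3) = sqrt(Var(b2) + Var(b3) + 2 Cov(b2, b3)). A sketch with hypothetical Cobb-Douglas estimates:

```python
import math

def t_crs(b2, b3, var_b2, var_b3, cov_b2b3):
    """t statistic for Ho: beta2 + beta3 = 1 (constant returns to scale).
    se(b2 + b3) = sqrt(Var(b2) + Var(b3) + 2*Cov(b2, b3))."""
    se = math.sqrt(var_b2 + var_b3 + 2.0 * cov_b2b3)
    return (b2 + b3 - 1.0) / se

# Hypothetical output elasticities of labor and capital:
t = t_crs(b2=0.65, b3=0.45, var_b2=0.01, var_b3=0.02, cov_b2b3=-0.005)
print(round(t, 2))  # small |t| => cannot reject constant returns to scale
```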
