Learn about assumptions of regression analysis, dealing with multicollinearity and heteroscedasticity, and solutions to maintain reliable estimates in modeling. Explore the impact of these issues and how to address them effectively.
Assumptions of Regression Analysis
(1) The independent variables do not form a linearly dependent set, i.e. the explanatory variables are not perfectly correlated.
(2) Homoscedasticity: the probability distributions of the error term have a constant variance for all values of the independent variables (the Xi's).
Perfect multicollinearity is a violation of assumption (1). Heteroscedasticity is a violation of assumption (2).
Multicollinearity is a problem with time series regression. Suppose we wanted to estimate the following specification using quarterly time series data: $\text{Auto Sales}_t = \beta_0 + \beta_1 \text{Income}_t + \beta_2 \text{Prices}_t$, where $\text{Income}_t$ is (nominal) income in quarter t and $\text{Prices}_t$ is an index of auto prices in quarter t. The data reveal a strong (positive) correlation between nominal income and car prices.
[Figure: Approximate linear relationship between explanatory variables. Vertical axis: car prices; horizontal axis: (nominal) income.]
Why is multicollinearity a problem?
• In the case of perfectly collinear explanatory variables, OLS does not work.
• In the case where there is an approximate linear relationship among the explanatory variables (Xi's), the estimates of the coefficients are still unbiased, but you run into the following problems:
  • High standard errors of the estimates of the coefficients, and thus low t-ratios.
  • Co-mingling of the effects of the explanatory variables.
  • Estimates of the coefficients tend to be "unstable."
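The link between correlated regressors and high standard errors can be made concrete. In a model with exactly two explanatory variables, the variance of each slope estimate is multiplied by 1 / (1 - r²), where r is the sample correlation between the two regressors (this factor is commonly called the variance inflation factor). The sketch below is only an illustration: the function names and the income and price figures are made up for this example and do not come from the slides.

```python
import math


def pearson_r(x, y):
    """Sample correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def variance_inflation(x1, x2):
    """Factor by which Var(slope) is inflated in a two-regressor model."""
    r = pearson_r(x1, x2)
    if math.isclose(abs(r), 1.0, rel_tol=1e-12):
        # Perfect collinearity: the factor is infinite and OLS has no unique solution.
        raise ValueError("regressors are perfectly collinear")
    return 1.0 / (1.0 - r * r)


# Hypothetical quarterly data (illustrative values only).
income = [100, 104, 109, 113, 118, 124, 129, 135]
prices = [50, 52, 55, 56, 60, 62, 64, 68]

if __name__ == "__main__":
    print("correlation:", pearson_r(income, prices))
    print("variance inflation:", variance_inflation(income, prices))
```

A factor near 1 means the regressors are close to uncorrelated. The closer r gets to 1 in absolute value, the larger the factor, and at exactly 1 the function raises an error, mirroring the case where OLS does not work.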
What to do about multicollinearity
• Increase sample size
• Delete one or more explanatory variables
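To see why a larger sample helps, consider the two-regressor case, where the standard deviation of a slope estimate is σ / sqrt((n - 1) · s² · (1 - r²)). Here σ is the error standard deviation, s is the sample standard deviation of the regressor, and r is the correlation between the two regressors. The sketch below just evaluates that formula. The function name and all numbers are assumptions chosen for illustration, and it holds s and r fixed as n grows, which real data will only approximately do.

```python
import math


def slope_std_error(sigma, n, sd_x, r):
    """Std. deviation of a slope estimate in a two-regressor OLS model.

    sigma: error standard deviation (treated as known here)
    n:     number of observations
    sd_x:  sample standard deviation of the regressor
    r:     sample correlation between the two regressors
    """
    sxx = (n - 1) * sd_x ** 2  # sum of squared deviations of the regressor
    return sigma / math.sqrt(sxx * (1.0 - r ** 2))


if __name__ == "__main__":
    # Illustrative values: strong collinearity (r = 0.99), growing sample.
    for n in (41, 161, 641):
        print(n, slope_std_error(sigma=10.0, n=n, sd_x=5.0, r=0.99))
```

With the correlation held fixed, quadrupling n - 1 halves the standard error, so more data can offset part of the inflation caused by collinearity without removing it.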
Understanding heteroscedasticity
This problem pops up when using cross-sectional data.
Consider the following model: Yi is the “determined” part of the equation and εi is the error term. Remember that in regression we assume E(εi) = 0.
Two distributions with the same mean and different variances
JAR #1: 4, -4, 0, -2, 2 (mean = 0)
JAR #2: 400, -400, 0, -200, 200 (mean = 0)
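The arithmetic behind the two jars can be checked directly. The short sketch below uses Python's standard `statistics` module and treats each jar as a complete population, which is an assumption made for this illustration.

```python
import statistics

jar_1 = [4, -4, 0, -2, 2]
jar_2 = [400, -400, 0, -200, 200]

for name, jar in (("JAR #1", jar_1), ("JAR #2", jar_2)):
    # pvariance treats the values as the whole population (divides by n).
    print(name, "mean:", statistics.mean(jar), "variance:", statistics.pvariance(jar))
```

Both jars average to zero, but the values in the second jar are 100 times larger, so its variance is 10,000 times larger (80,000 versus 8).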
[Figure: The disturbance distributions of heteroscedasticity. Axes: f(x), Y, and X, with error distributions drawn at X1 and X2.]
[Figure: Scatter diagram of ascending heteroscedasticity. Vertical axis: spending for electronics; horizontal axis: household income.]
Why is heteroscedasticity a problem?
• Heteroscedasticity does not give us biased estimates of the coefficients. However, it does make the standard errors of the estimates unreliable. Often the conventional formula understates them.
• Because of this, t-tests cannot be trusted. We run the risk of rejecting a null hypothesis that should not be rejected.
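Both claims can be checked with a small simulation. The sketch below draws many samples from a simple regression whose error spread grows with X, then compares the actual spread of the slope estimates with the average standard error reported by the conventional OLS formula. Everything in it is an assumption made for illustration: the true coefficients (2.0 and 0.5), the error standard deviation of 0.1·X², the sample size, and the function name.

```python
import random
import statistics


def simulate(reps=2000, n=50, seed=0):
    """Monte Carlo comparison of true vs. reported slope uncertainty."""
    rng = random.Random(seed)
    x = [1 + 9 * i / (n - 1) for i in range(n)]  # fixed design on [1, 10]
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)

    slopes, reported_se = [], []
    for _ in range(reps):
        # Error standard deviation rises with x (ascending heteroscedasticity).
        y = [2.0 + 0.5 * xi + rng.gauss(0, 0.1 * xi * xi) for xi in x]
        y_bar = sum(y) / n
        b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
        b0 = y_bar - b1 * x_bar
        rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
        # Conventional OLS standard error, which assumes constant variance.
        reported_se.append((rss / (n - 2) / sxx) ** 0.5)
        slopes.append(b1)

    return statistics.mean(slopes), statistics.stdev(slopes), statistics.mean(reported_se)


if __name__ == "__main__":
    mean_slope, true_sd, avg_se = simulate()
    print("mean slope estimate:", mean_slope)
    print("actual std. dev. of slope estimates:", true_sd)
    print("average reported std. error:", avg_se)
```

Under these assumptions the average slope estimate stays close to the true value of 0.5, while the reported standard error comes out smaller than the actual spread of the estimates, which is the sense in which t-ratios look better than they should.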