Chapter 4: Simple or Bivariate Regression


Terms

• Dependent variable (LHS): the series we are trying to estimate
• Independent variable (RHS): the data we are using to estimate the LHS

The line and the regression line
• Y = f(X)… there is assumed to be a relationship between X and Y.
• Y = mX + b (a simple line)

Because the line we are looking for is an estimate based on a sample of the population, and not every observation falls on the estimated line, we have error (e).

• Y = b0 + b1X1 + e
What is b?
• b0 represents the intercept term.
• b1 represents the slope of the estimated regression line.

This term (b1) can be interpreted as the rate of change in Y per unit change in X… just like a simple line equation.

Population vs Sample
• Population: Y = b0 + b1X1 + e (we don't often have this data)
• Sample: Y(hat) = b0 + b1X1 (we usually have this)

Y - Y(hat) = e (a.k.a. error, or the residuals)

Residuals another way
• Residuals can also be constructed by solving for e in the regression equation.
• e = Y - b0 - b1*X
The goal of Ordinary Least-Squares Regression (the type we are going to use)
• Minimize the sum of squared residuals.
• We could calculate the regression line and the residuals by hand….but, we ain’t gonna.
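We won't do it by hand, but the mechanics are only a few lines. The sketch below uses made-up data (not anything from the text) and the standard closed-form OLS estimates:

```python
import numpy as np

# Illustrative sketch of OLS "by hand" (made-up data, not from the text).
rng = np.random.default_rng(0)
X = np.arange(1.0, 13.0)                       # 12 observations
Y = 3.0 + 2.0 * X + rng.normal(0, 1, size=12)  # true line plus noise

# Closed-form OLS estimates:
#   b1 = sum((X - Xbar)(Y - Ybar)) / sum((X - Xbar)^2),  b0 = Ybar - b1*Xbar
b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b0 = Y.mean() - b1 * X.mean()

# Residuals, exactly as on the slide: e = Y - b0 - b1*X
e = Y - b0 - b1 * X
sse = np.sum(e ** 2)  # the quantity OLS minimizes

print(b0, b1, sse)
```

By construction the residuals sum to (essentially) zero, and no other straight line yields a smaller sum of squared residuals.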
First step: ALWAYS, look at your data
• Plot it against time, or
• Plot it against your independent variable.
• Why?...because dissimilar data can potentially generate very similar summary statistics…pictures help discern the differences…
Dissimilar data with similar stats

The X's have the same mean and standard deviation, and the Y's have the same mean and standard deviation. From this we might conclude that the data sets are identical, but we'd be wrong. What do they look like? Although they result in the same OLS regression, they are very different.
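The classic demonstration of this is Anscombe's quartet. Below, two of its four published data sets (used here as a stand-in for the slide's figure) share the same means, standard deviations, and OLS line, despite looking nothing alike when plotted:

```python
import numpy as np

# Two of the four published data sets from Anscombe's quartet: identical
# summary statistics and OLS fit, very different pictures.
x = np.array([10.0, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5])
y1 = np.array([8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
               7.24, 4.26, 10.84, 4.82, 5.68])  # roughly linear cloud
y2 = np.array([9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
               6.13, 3.10, 9.13, 7.26, 4.74])   # a smooth curve

def ols(x, y):
    """Closed-form bivariate OLS: returns (intercept, slope)."""
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    return y.mean() - b1 * x.mean(), b1

print(y1.mean(), y2.mean())  # same mean (~7.50)
print(y1.std(), y2.std())    # same st. dev.
print(ols(x, y1))            # ~ (3.0, 0.5)
print(ols(x, y2))            # ~ (3.0, 0.5)
```

Only a plot reveals the difference, which is exactly the point of "ALWAYS look at your data."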

Forecasting: Simple Linear Trend

Disposable Personal Income (DPI)

• It’s sometimes reasonable to make a forecast on the basis of just a linear trend, where Y is assumed to be a function of time (T).
• The regression looks like the following: Y(hat) = b0 + b1(T)
• Where Y(hat) is the series you want to estimate. In this case, it’s DPI.
To forecast with Simple OLS in ForecastX…
• You need to construct an index of T. For this data set, there are 144 months, so the index runs from 1 to 144.

The Data

Jan 1993: DPI1 = 4588.58 + 27.93(1) = 4616.51

Feb 1993: DPI2 = 4588.58 + 27.93(2) = 4644.44

…

Dec 2004: DPI144 = 4588.58 + 27.93(144) = 8610.50

Jan 2005: DPI145 = 4588.58 + 27.93(145) = 8638.43

And, so on…

To forecast, we just need the index for the month (T)
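The arithmetic above, using the slide's estimated coefficients, can be packaged as a tiny forecasting function:

```python
# The slide's trend model with its estimated coefficients:
#   DPI_t = 4588.58 + 27.93 * t, where t = 1 is Jan 1993.
b0, b1 = 4588.58, 27.93

def dpi_forecast(t):
    """Trend forecast of DPI for time index t."""
    return b0 + b1 * t

print(round(dpi_forecast(1), 2))    # Jan 1993: 4616.51
print(round(dpi_forecast(144), 2))  # Dec 2004: 8610.50
print(round(dpi_forecast(145), 2))  # first out-of-sample month: 8638.43
```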
Output

Hypothesis test for slope = 0 and intercept = 0… what does it say? The figure shows the sampling distribution of t under H0: "Do Not Reject H0" lies between the critical values -2.045 and 2.045, and "Reject H0" lies in both tails. The reported t-stats (138.95 and 297.80) fall far into the rejection region, so we reject H0 for both the intercept and the slope.

Just to note…
• In the previous model, the only thing we are using to predict DPI is the progression of time.
• There are many more things that have the potential of increasing or decreasing DPI.
• We don’t account for anything else…yet.
The benefits of regression
• The true benefit of regression models is their ability to examine cause and effect.
• In trend models (everything we’ve seen until now), we are depending on observed patterns of past values to predict future values.
• In a Causal model, we are hypothesizing a relationship between the dependent variable (the variable we are interested in predicting) and one or more independent variables (the data we use to predict).
Back to Jewelry
• There are many things that might influence the total monthly sales of jewelry…things like
• - # Weddings
• - # Anniversaries
• - DPI
• Since this is bivariate regression, for now we will focus on DPI as the sole independent variable used to predict jewelry sales.
Let’s look at the jewelry sales data plotted against DPI
• In the plot, the Christmas (December) observations stand well apart from the other months.
• The big differences in sales during the December months will make it hard to estimate with a bivariate regression.
• We will use both the unadjusted and the seasonally adjusted series to see the difference in model accuracy.

Jewelry Example
• Our dependent (Y) variable is monthly jewelry sales
• unadjusted in the first example
• seasonally adjusted in the second example
• Our only independent variable (X) is DPI, so

the models we are going to estimate are:

• JS= b0 + b1*(DPI) + e
• SAJS= b0 + b1*(DPI) + e
Things to consider with ANY regression
• Do the signs on the b’s make sense?
• Your expectation should have SOME logical basis.
• If the sign is not what is expected, your regression may be:
• Underspecified: move on to multiple regression.
• Misspecified: consider other RHS variables that might provide a better measure.
Consider the Jewelry Example
• Do we get the right sign? i.e., what’s the relationship between DPI and sales?
• What is a normal good?
• What kind of good is jewelry, normal or inferior?
• What would be the expected sign if we were looking at a good we thought was an inferior good?
Things to consider with ANY regression
• If you DO get the expected signs, are the effects statistically significant?
• Do the t-stats indicate a strong relationship?
• Can you reject the null that the relationship (slope) is 0?
Things to consider with ANY regression
• Are the effects economically significant?
• Even with statistically significant results, a very small slope indicates a very large change in the RHS variable is necessary to get any change in the LHS.
• There is no hard & fast rule here. It requires judgment.
Consider the Jewelry Example
• In the jewelry example, it takes a \$250 million (i.e., \$0.25 billion) increase in DPI to increase (adjusted) jewelry sales by \$1 million. Is this a large or a small slope?

Let’s think of it a little differently…

• This would be roughly a \$1 increase in (adjusted) jewelry sales for every \$250 increase in disposable personal income.
• Does this pass the “sniff test?”
Things to consider with ANY regression
• Does the regression explain much?
• In linear regressions, the fraction of the “variance” in the dependent variable “explained” by the independent variable is measured by the R-squared (a.k.a. the coefficient of determination).

• Trend: R-sq = .9933
• Causal (w/ seasonality): R-sq = .0845
• Causal (w/o seasonality): R-sq = .8641

Although the causal model explains less of the variance, we now have some evidence that sales are related to DPI.
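R-squared comes straight from the residuals. A quick sketch of the computation (toy data, not the jewelry or trend series):

```python
import numpy as np

# R-squared from a bivariate fit: R^2 = 1 - SSE/SST, the share of the
# variation in Y "explained" by the fitted line. Data here are made up.
rng = np.random.default_rng(3)
x = np.linspace(0.0, 10.0, 50)
y = 5.0 + 1.5 * x + rng.normal(0, 1, size=50)

# Closed-form OLS fit
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

sse = np.sum((y - y_hat) ** 2)     # unexplained variation
sst = np.sum((y - y.mean()) ** 2)  # total variation
r_sq = 1.0 - sse / sst
print(round(r_sq, 4))
```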

Another thing to consider about the first model w/seasonality in it
• The first model was poorly specified when we were using the series with seasonality in it.
• The de-seasonalized data provides better fit in the simple regression.
• …why?
• Well, income is obviously related to sales, but so is the month of the year (e.g., Dec), so we need to adjust or account.
• Adjust for seasonality (use a more appropriate RHS variable), or
• Account for it in the model (move to multi-var and include the season in the regression…to be covered next chapt.)
Question
• Why would we want to forecast Jewelry sales based on a series like DPI?
• DPI is very close to a linear trend… we have a good idea what it might look like several periods from now.
Other examples of simple regression models:Cross section (all in the same time)
• Car mileage as a function of engine size
• What do we expect this relationship to be on average?
• Body weight as a function of height
• What do we expect this relationship to be on average?
• Income as a function of educational attainment
• What do we expect this relationship to be on average?
Assumptions of the OLS regression
• One assumption of the OLS model is that the error terms DON’T have any regular patterns. Specifically, this means…
• Errors are independently distributed
• And, they are normally distributed
• They have a mean of 0
• They have a constant variance
Errors are independently distributed

Errors might not be independently distributed if we have Serial Correlation (or Autocorrelation)

• Serial correlation occurs when one period’s error is related to another period’s error
• You can have both positive and negative serial correlation
Negative Serial Correlation
• Negative serial correlation occurs when positive errors are followed by negative errors (or vice versa)

Positive Serial Correlation
• Positive serial correlation occurs when positive errors tend to be followed by positive errors

What does Serial Correlation Cause?
• The estimates for b are unbiased, but the errors are underestimated…this means our t-stats are overstated.
• If our t-stats are overstated, then it’s possible we THINK we have a significant effect for b, when we really don’t.
• Additionally, R-squared and F-stat are both unreliable.
Durbin-Watson Statistic
• The Durbin-Watson Statistic is used to test for the existence of serial correlation.

DW = [Σ (e_t - e_(t-1))²] / [Σ e_t²] (a ratio of sums of squared errors)

The Durbin-Watson Statistic ranges from 0 to 4.

Evaluation of the DW Statistic

The rule of thumb: if DW is near 2 (i.e., from 1.5 to 2.5), there is no evidence of serial correlation present.

For more precise evaluation, you have to calculate and compare 5 inequalities and determine which of the 5 is true. The critical values DWL (lower) and DWU (upper) come from a table indexed by the number of RHS variables and the number of observations.

Evaluation of the DW Statistic (choose the true region)

• A: 4 > DW > (4 - DWL) → negative serial correlation
• B: (4 - DWL) > DW > (4 - DWU) → indeterminate
• C: (4 - DWU) > DW > DWU → no observed serial correlation
• D: DWU > DW > DWL → indeterminate
• E: DWL > DW > 0 → positive serial correlation

For Example
• Suppose we get a DW of 0.21 with 36 observations…

From the table: DWL = 1.41, DWU = 1.52

• The rest is just filling in and evaluating:
• A: 4 > 0.21 > (4 - 1.41) → False
• B: (4 - 1.41) > 0.21 > (4 - 1.52) → False
• C: (4 - 1.52) > 0.21 > 1.52 → False
• D: 1.52 > 0.21 > 1.41 → False
• E: 1.41 > 0.21 > 0 → True: positive serial correlation
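The DW formula is easy to compute directly from a vector of residuals. Here is a sketch with simulated residuals (not data from the text) showing the three characteristic regions:

```python
import numpy as np

# Durbin-Watson from a vector of residuals:
#   DW = sum_{t=2..n} (e_t - e_{t-1})^2 / sum_{t=1..n} e_t^2
def durbin_watson(e):
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

rng = np.random.default_rng(1)
pos = np.cumsum(rng.normal(size=200))  # random walk: positive serial corr., DW near 0
neg = np.array([1.0, -1.0] * 100)      # alternating signs: negative serial corr., DW near 4
ind = rng.normal(size=200)             # independent errors: DW near 2

print(round(durbin_watson(pos), 2))
print(round(durbin_watson(neg), 2))
print(round(durbin_watson(ind), 2))
```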

Errors are Normally Distributed

Each observation’s error is normally distributed around the estimated regression line. Errors can be +/-, but they are grouped around the OLS regression line.

When might errors be distributed some other way?
• One example would be a dependent variable that’s 0/1 or similar (discrete and/or limited).
• Employed/Unemployed
• Full-time/Part-time
• = 1 if above a certain value, = 0 if not.
Errors have a mean of 0

A “+” error is just as likely as a “-” error, and they balance out around the OLS regression line.

Variance (or st. dev.) of errors is constant across values of the RHS variable

The errors spread evenly around the OLS regression line at every value of X.

What would it look like if variance wasn’t constant?

Here is one specific type of non-constant variance: the mean is still 0, but the errors get larger as X gets larger. This is referred to as heteroscedasticity. Yes, you heard right: heteroscedasticity… Looking at it from another angle, errors can be + or -, but they should be stable over time or X.

Heteroscedasticity
• Can cause the estimated St. Error (those reported by the statistical package) to be smaller than the actual St. Error.
• This messes up the estimated t-stats. The estimated t-stats are reported as larger than they actually are…Based on the estimated t-stats, we might reject the null, when we really shouldn’t.
Common causes of Heteroscedasticity
• Personal hygiene aside, there are several potential sources of this problem.
• Model misspecification
• Omitting an important variable
• Improper functional form (may be non-linearity in the relationship between x&y)
Data problems and fixes for the bivariate model
• Trends…no problem.
• Adapting the bivariate model to forecast seasonal data.
• You might think the bivariate model is too simple to handle seasonality…well, it’s simple, but with a trick or two, you can extend its capabilities quite a bit.
Forecasting SA Total Houses Sold: two Bivariate Regressions (linear trend & DPI)
• What are the “causal factors” for house purchases?
• Income
• Time trend (Inflation)
• Employment (rate)
• Interest rates
• Consumer confidence
• Price of housing
• Price of rental units (\$ substitutes)
• Price of insurance (\$ complements)
• Property taxes (other costs)

We will focus on two of these: income (DPI) and the time trend.

• Compute a set of seasonal indices
• De-seasonalize the data
• Forecast the de-seasonalized series
• Re-seasonalize the forecast
• Calculate your measures of accuracy
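These steps can be sketched end-to-end on a toy series. Everything below is assumed for illustration: the data are made up, and the trend used to build the indices is taken as known (in practice the Decomposition Model estimates it):

```python
import numpy as np

# Toy monthly series (assumed, not the housing data): a linear trend times a
# known multiplicative seasonal pattern, peaking late in the year.
t = np.arange(1, 49)  # 4 years of months
pattern = np.array([0.8, 0.9, 1.0, 1.1, 1.0, 0.9,
                    0.8, 0.9, 1.0, 1.2, 1.3, 1.5])
y = (100.0 + 2.0 * t) * np.tile(pattern, 4)  # observed series

# 1) Seasonal indices: average y/trend by calendar month
#    (trend assumed known here; the Decomposition Model would estimate it)
trend = 100.0 + 2.0 * t
idx = np.array([np.mean((y / trend)[m::12]) for m in range(12)])

# 2) De-seasonalize: divide each observation by its month's index
sa = y / np.tile(idx, 4)

# 3) Fit the trend on the adjusted series (closed-form OLS)
b1 = np.sum((t - t.mean()) * (sa - sa.mean())) / np.sum((t - t.mean()) ** 2)
b0 = sa.mean() - b1 * t.mean()

# 4) Forecast 12 months ahead, then re-seasonalize (multiply back by idx)
t_new = np.arange(49, 61)
forecast = (b0 + b1 * t_new) * idx
print(forecast.round(1))
```

Because the toy seasonality is exactly multiplicative, the fitted trend recovers the true slope and intercept, and the re-seasonalized forecast reproduces the seasonal swings.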
Getting your indices and de-seasonalizing the data
• What we need to do is “decompose” the series to get the seasonal index for each month. We could use the Winters model to estimate the seasonal indices, but instead we use the Decomposition Model.
• We estimate the Decomposition Model and choose the multiplicative option to get the index (in multiplicative form).

(We will cover this model later, but for now… well, just think of it as magic!!!)

Getting the Seasonal Indices

The seasonal indices are repeated down the column next to each month.

Applying the Index to the Data

Each month’s observation is divided by its seasonal index to give the Seasonally Adjusted Total Housing Sales (SATHS). There is still a trend, but we aren’t…

Using the Adjusted Sales Data (SATHS)

Let’s now forecast adjusted housing sales as a function of time (a 12-month forecast). The equation we are estimating is:

SATHS = b0 + b1(Time) + e

What do we expect for the sign of b1?

Data

There are two ways to approach this in ForecastX:

• Use the “Linear Regression” model without a time index, or
• Use “Multiple Regression” with both the time index and the year and month variable.

Both provide essentially the same results.

Houses Sold and Time

Regression Results

Re-seasonalize: multiply each forecast value by its month’s seasonal index.

The Trend Forecast

The thing to note here is that the simple linear model is capturing some of the seasonal fluctuations… WOW!!!

What have we done?!
• Really, we have simply used a little math to incorporate the estimated seasonal variation into the bivariate forecast… without actually estimating it that way.
• The same steps are involved here.
• We’ve already obtained the seasonal indices and computed the de-seasonalized data.
• All we need to do is make the forecast, re-seasonalize the data, and calculate our measures of accuracy.
Data

Houses Sold and DPI

What can we say about the bivariate model and seasonality?
• It’s really easy to forecast when there is a trend…that’s just the slope b.
• Although there are a few steps involved, it’s not terribly difficult to forecast a series that has seasonality.
• So, we can (substantially) do what the Winters model can, with the added benefit of being able to say “why” something is happening.
• We also conserve degrees of freedom.
Other problems

Serial or Autocorrelation

Remember, autocorrelation occurs when adjacent observations are correlated. This causes our estimated standard errors to be too small and our t-stats to be too big, messing up inference.

Causes
• Long-term cycles or trends
• Inflation
• Population growth
• i.e., any outside force that affects both series
• Misspecification
• Leaving out an important variable (see above)
• Failing to include the correct functional form of the RHS, i.e., a non-linear term (may necessitate the move to multivariate regression)
Considerations

We generally aren’t concerned with AC or SC if we are just estimating the time trend (y=f(t)) in OLS.

• It’s mainly when we want to figure out the causal relationship between the RHS and the LHS that we need to worry about SC or AC.
Using what we know…
• We have looked at the DW statistic, how it’s calculated and what it measures.
• Let’s look at a Bivariate forecast that has a couple of problems and see if we can use some of the tools we currently have to identify the problems and fix them, if possible.
Example of Autocorrelation in Action: Background

Remember in Macroeconomics, a guy named Keynes made a couple of observations about aggregate consumption…

• What was not consumed out of current income is saved; and
• Current consumption depends on current income in a less than proportionate manner.
Keynesian Theory of Consumption
• Keynes’ theories placed emphasis on a parameter called the Marginal Propensity to Consume (MPC), which is the slope of the aggregate consumption function and a key factor in determining the "multiplier effect" of tax and spending policies.

What is MPC in everyday terms?
• MPC can be thought of as the share of each additional dollar of income that’s spent on consumption.
• The multiplier effect is the economic stimulus effect that comes from the spending and re-spending of that portion of the dollar.
• Good taxing and spending policies take into account the MPC; a higher MPC means a larger multiplier effect.
MPC in action
• From a fiscal policy standpoint we want to
• inject \$ into activities that have a high multiplier effect, or
• provide lower taxes or higher subsidies to people who spend all their income (i.e., MPC=1).
• Before you say, “hey, I don’t like this idea”… I am talking about YOU!!!
OK now, why do we care?
• Knowledge of the MPC can give us some idea how to stimulate the economy in recession or put the brakes on an overheated economy with government policies.
• For example, consider the last recession. Think of the policies that were used by the feds.
• Income tax rates were reduced
• Tax rebate checks (based on # of dependents)
• Other federal expenditures increased (?)
What were the intended effects of these policies?
• Anything aimed at increasing disposable income is expected to increase C.

C + I + G + (X - M) = GNP

If the relationship holds, then increasing C has what effect on GNP? As C gets larger, GNP is expected to grow.

The Regression (with “issues”)

We can obtain an estimate of the MPC by applying OLS to the following aggregate consumption function:

GC = b0 + b1*(GNP) + e

Where the slope, b1, is the estimate of the MPC.

The Data

Output… Everything looks good, but…

The coefficient estimates and t-stats look pretty good… maybe too good. The DW statistic indicates serial correlation.

Things we need to keep in mind
• Both variables are probably non-stationary. In fact, that can be shown using ForecastX’s “Analyze” button and creating the Correlograms for both series (see Chpt 2, p83).
• And, therefore they may have a common trend. In other words, the regression may be plagued by serial correlation.
• Non-stationarity is not such a big deal in the time-trend models, because we aren’t trying to establish a causal relationship and we have models that can deal with it (linear).
In OLS
• In OLS models non-stationarity IS a problem in forecasting, because we ARE trying to establish a relationship, and
• If there is a “common trend” in the LHS and RHS, we erroneously attribute the trend’s influence to the RHS variable.
Detecting Serial correlation and Autocorrelation Graphically: What is the ACF?
• We just learned what the DW does, but there are also graphical ways to spot SC/AC.
• The ACF measures the autocorrelation between the series and its values from each previous period.
• If a time series is stationary, the autocorrelation should diminish toward 0 quickly as we go back in time.

Note: If a time series has seasonality, the autocorrelation is usually highest between like seasons in different years.
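The sample ACF is simple to compute directly. A minimal sketch (the formula is the standard lagged-correlation estimator; the series are simulated, not the consumption data) contrasting a non-stationary trend with stationary noise:

```python
import numpy as np

# Sample autocorrelation at lag k:
#   r_k = sum_t (y_t - ybar)(y_{t+k} - ybar) / sum_t (y_t - ybar)^2
def acf(y, nlags):
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    denom = np.sum(d ** 2)
    return np.array([np.sum(d[: len(d) - k] * d[k:]) / denom
                     for k in range(nlags + 1)])

trend = np.arange(100.0)                           # non-stationary: ACF decays slowly
noise = np.random.default_rng(4).normal(size=100)  # stationary: ACF drops toward 0 fast

a_trend = acf(trend, 5)
a_noise = acf(noise, 5)
print(a_trend.round(2))
print(a_noise.round(2))
```

The trending series shows autocorrelations near 1 that fade only slowly, while the white-noise series drops to near 0 after lag 0.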

Be suspicious of results that look “too good”
• The forecaster should be suspicious of the results because, in a sense, they are too good.
• Both the ACF and the DW stat (0.16) indicate that the original two series likely have strong positive serial correlation.
A Potential Method for Fixing a Non-Stationary Series

What method can we use to potentially fix a non-stationary series?…think back

• Right now, we only know about first-differencing or “de-trending,” so let’s use that.

Just to note: the Holt and Winters models allow for trend, but not for RHS variables, so we can’t use them directly to find the MPC.

Spurious Regression & First Differences
• When significant autocorrelation is present, spurious regression may arise: our results appear to be highly accurate when in fact they are not, since the OLS estimator of the regression error variance is biased downward.
• To investigate this possibility, we will re-estimate our consumption function using first differences of the original data.
• This transformation is designed to eliminate any common linear trend in the data.
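The levels-vs-differences comparison can be sketched on simulated data (not the actual GC/GNP series): a random-walk error term makes the levels regression's residuals strongly serially correlated, while first-differencing restores roughly independent errors.

```python
import numpy as np

# Simulated data (assumed, not the real series): consumption depends on GNP
# with true MPC = 0.9, but the error term is a random walk, so the levels
# regression shows strong positive serial correlation (DW near 0).
rng = np.random.default_rng(5)
t = np.arange(200.0)
gnp = 100.0 + 5.0 * t + rng.normal(0, 2, size=200)
gc = 20.0 + 0.9 * gnp + np.cumsum(rng.normal(0, 2, size=200))

def ols_resid(x, y):
    """Bivariate OLS: return (slope, residuals)."""
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    return b1, y - b0 - b1 * x

def dw(e):
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

b_lvl, e_lvl = ols_resid(gnp, gc)                    # levels regression
b_dif, e_dif = ols_resid(np.diff(gnp), np.diff(gc))  # first differences

print(round(b_lvl, 3), round(dw(e_lvl), 2))  # DW far below 2
print(round(b_dif, 3), round(dw(e_dif), 2))  # DW near 2
```

The differenced regression still recovers a slope near the true MPC, but now with residuals that pass the DW rule of thumb, which is the point of the transformation.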
ACF for first-differenced data

Didn’t completely take care of it in this series…

but, we use it anyway.

Scatter Plot of First Differences

There is still a positive relationship in the differenced data,

but it has more error and it’s weaker.

Back to the Data

Let’s now estimate the differenced model. These are the more accurate results.

Summary Serial and Autocorrelation
• Serial and/or Autocorrelation can have a serious effect on your estimates and your inference.
• We can use both the DW and Correlation Coefficients (Correlograms) to detect serial or autocorrelation.
• 1st differencing can often help in purging SC or AC from the data.