# Econometric Forecasting with Linear Regression







A Brief Introduction

I. Fundamental Concepts
• Data (variables). Can be in three forms:
• Interval – There is a common scale to measure the variable, so that a value of two is actually twice a value of one. Examples: % of vote, degrees Fahrenheit, number killed, duration of regime, number of soldiers, GDP
• Ordinal – There is a rank-ordering to the variable, so 2 > 1, but the scale varies so that 2 is not exactly twice one. Examples: Yes/No variables, how close a bill is to passage (no houses, one house, both houses, signature), war outcomes (win, lose, or draw)
• Nominal – There are numbers, but they are completely arbitrary. Examples: country codes, leader names, strategy choices, apples and oranges.
B. Dependent Variable: What you are trying to predict
• Examples include % of the two-party Presidential vote, % seats held by Dems, war/non-war, political (in)stability, etc.
• Easiest to have a continuous (interval) DV, but techniques exist for all three types
C. Independent Variables: What variables predict the DV
• Can be either interval or ordinal. So…
• Transform nominal into ordinal. Example: Is this country the US? A nominal variable (USA) becomes an ordinal one (Yes or No).
• Again, examples in syllabus
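The nominal-to-ordinal transformation above can be sketched in a few lines of Python; the country list is a hypothetical example, not data from the slides:

```python
# Sketch: turning a nominal variable (country name) into a yes/no
# indicator, as described above. The country names are made up.
countries = ["USA", "Canada", "Mexico", "USA", "Brazil"]

# "Is this country the US?" -> 1 for yes, 0 for no
is_usa = [1 if c == "USA" else 0 for c in countries]
print(is_usa)  # [1, 0, 0, 1, 0]
```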
D. Correlation
• Positive (or direct) correlation: the values of the IV and DV move up and down together (poverty and crime, CO2 and global temperature, drug addiction and prostitution, geographic proximity and conflict)
• Negative (or inverse): The values of the IV and DV move in opposite directions (alcohol and coordination, democracy and interstate conflict, war and development)
• Conditional: Direction depends on the value of some other variable

[Diagram: independent variables linked to the dependent variable by statistical relationships]

II. Modeling Relationships
• Simplest tool: the scatterplot or scatter diagram. Example from medicine:
Example
• A researcher believes that there is a linear relationship between the BMI (kg/m²) of pregnant mothers and the birth weight (BW, in kg) of their newborns
• The following data set provides information on 15 pregnant mothers who were contacted for this study:

| BMI (kg/m²) | Birth weight (kg) |
|------------:|------------------:|
| 20 | 2.7 |
| 30 | 2.9 |
| 50 | 3.4 |
| 45 | 3.0 |
| 10 | 2.2 |
| 30 | 3.1 |
| 40 | 3.3 |
| 25 | 2.3 |
| 50 | 3.5 |
| 20 | 2.5 |
| 10 | 1.5 |
| 55 | 3.8 |
| 60 | 3.7 |
| 50 | 3.1 |
| 35 | 2.8 |

Scatter Diagrams / Scatterplots
• A scatter diagram plots bivariate observations (X, Y): BMI (the IV) is X, and birth weight (the DV) is Y
• Y is the dependent variable (Dependent goes Down the side)
• X is the independent variable (goes across the graph)
B. Interpreting Scatterplots
• People tend to mentally fit a line or curve to describe the shape of the scatterplot
• Examples:
[Figure: scatterplots of linear correlation – strong relationships vs. weak relationships]

[Figure: scatterplots of linear (lack of) correlation – no relationship]

[Figure: scatterplots contrasting linear relationships with curvilinear relationships]

C. What does the line mean?
• Intended to simplify the relationship. The line is ultimately an estimate, usually known to be wrong (but close enough to be useful)
• The line is probabilistic, not deterministic – otherwise it would pass perfectly through every point on the scatterplot
• This is the key difference between predicting politics and predicting planetary orbits: Kepler’s equations are deterministic, but econometric models are probabilistic
D. Problem: How do we draw the “right” line?
• Sample scatterplot:

[Figure: sample scatterplot, with X and Y axes each running from 0 to 60]

Thinking Challenge

How would you draw a line through the points? How do you determine which line ‘fits best’?


E. Solution: Regression
• Regression = using an equation to find the line (or curve) that most closely fits the data
1. Linear Regression Model
• a. Relationship between variables is a linear function:

Y = β0 + β1X + ε

where:
• Y = the dependent variable
• X = the independent (explanatory or control) variable
• β0 = the constant, or Y-intercept
• β1 = the coefficient of X, or slope
• ε = the random error term

b. Linear Equations


c. Quick math review
• As you remember from high school math, the basic equation of a line is given by y=mx+b where m is the slope and b is the y-intercept
• One definition of m is that for every one unit increase in x, there is an m unit increase in y
• One definition of b is the value of y when x is equal to zero
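Both definitions above can be checked numerically with a tiny sketch; the slope and intercept values here are arbitrary illustrations:

```python
# Sketch of the line y = m*x + b and the two interpretations above.
def line(x, m, b):
    return m * x + b

m, b = 2.0, 5.0  # slope and intercept (arbitrary example values)

# b is the value of y when x = 0
print(line(0, m, b))                  # 5.0
# a one-unit increase in x produces an m-unit increase in y
print(line(3, m, b) - line(2, m, b))  # 2.0
```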
Sample Scatterplot
• Look at the data in this picture
• Does there seem to be a correlation (linear relationship) in the data?
• Is the data perfectly linear?
• Could we fit a line to this data?
2. What is linear regression?
• Linear regression tries to find the best line (or curve) to fit the data
• The equation of the line is Y = β0 + β1X + ε
• The method of finding the best line (curve) is least squares, which minimizes the sum of the squared distances from the line to each of the points
3. Ordinary Least Squares (OLS): The most common form of linear regression
• Find the values of β0 and β1 that minimize the squared vertical distance from the line to each of the points. This is the same as minimizing the sum of the εi² terms
• Why minimize squared errors? ‘Best fit’ means the differences between actual Y values and predicted Y values are at a minimum. But positive differences offset negative ones (errors of 10 and -10 add to zero). Squaring the errors solves the problem: 10 × 10 = 100, and -10 × -10 also = 100.
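As an illustration (not part of the original slides), the closed-form least-squares solution for a simple regression can be computed directly on the BMI / birth-weight data from the earlier table:

```python
# Minimal OLS sketch using the BMI / birth-weight data from the table above.
# The slope and intercept below minimize the sum of squared errors.
bmi = [20, 30, 50, 45, 10, 30, 40, 25, 50, 20, 10, 55, 60, 50, 35]
bw  = [2.7, 2.9, 3.4, 3.0, 2.2, 3.1, 3.3, 2.3, 3.5, 2.5, 1.5, 3.8, 3.7, 3.1, 2.8]

n = len(bmi)
x_bar = sum(bmi) / n
y_bar = sum(bw) / n

# Closed-form solution: b1 = Sxy / Sxx, b0 = y_bar - b1 * x_bar
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(bmi, bw))
s_xx = sum((x - x_bar) ** 2 for x in bmi)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
print(f"BW = {b0:.3f} + {b1:.3f} * BMI")
```

The positive slope b1 says that each additional unit of BMI is associated with a small increase in predicted birth weight, consistent with the researcher's hypothesis.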
c. Least Squares Graphically: Predicted Values of Y vs. Actual Values of Y

For each observation i, the equation gives merely an estimate, not the actual value. There are errors (εi), and the line minimizes the sum ε1² + ε2² + ε3² + ε4² + ε5² + …

d. Recap: Interpreting the Linear Regression Formula

• Regression Formula: Y = a + bX, Y = α + βX, Y = α + β1X1, Y = β0 + β1X1, etc. – all are the same formula!
• Y = the predicted value of the dependent variable (its estimated mean given X)
• a (or alpha: α, or beta-zero: β0) = the Y-intercept, or the value of Y when X = 0 (the constant)
• b (or beta: β) = the regression coefficient, the slope of the regression line, or the amount of change produced in Y by a unit change in X
• A positive sign on the regression coefficient means a positive direction of association
• A negative sign means a negative direction of association
• X = the value of the independent variable
Example
• What is:
• Y?
• X?
• β1?
• β0?
e. Multivariate Linear Regression


• Typical formula: Y = β0 + β1X1 + β2X2 + β3X3, etc.
• DV, constant haven’t changed
• But now there are several independent variables
• Each IV has its own coefficient. So the first X may be positively related to Y, while the others might be negatively related to Y.
• Could plot the effect of any one independent variable on Y as a line, but can no longer plot the whole equation since there are now as many dimensions as there are independent variables (plus one, for Y).
• Multivariate regression is best interpreted by consulting tables of coefficients, evaluating the effect of each X separately (i.e. all else being equal)
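A multivariate fit like the one above can be sketched with NumPy's least-squares solver (assuming NumPy is available); the data here are made up so that the true coefficients are known in advance:

```python
import numpy as np

# Multivariate OLS sketch: Y = b0 + b1*X1 + b2*X2, fit by least squares.
# The numbers are invented purely for illustration.
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([5.0, 3.0, 6.0, 2.0, 7.0])
Y  = 1.0 + 2.0 * X1 - 0.5 * X2  # exact linear data, so the fit is perfect

# Design matrix with a leading column of ones for the constant b0
X = np.column_stack([np.ones_like(X1), X1, X2])
coefs, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(coefs)  # approximately [1.0, 2.0, -0.5]
```

Note the signs: X1's coefficient is positive while X2's is negative, exactly the situation described above where different IVs relate to Y in different directions.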
F. Other statistics generated by linear regression

1. R²: The proportion of the variation in the dependent variable (Y) that is explained by the independent variable (X)

R² = explained variation / total variation

R² ranges between 0 (no reduction in error) and 1 (no errors remain – the model perfectly predicts the dependent variable)

R² is a comparative measure – it compares the amount of error made by the linear regression to the amount of error made by guessing the mean (average) value of Y for every case (e.g. Y = 12 for every case)

Example: Regression vs. guessing the mean

R² reflects how much variation remains when you know X (i.e. how well your line fits the data) compared to how much variation there is when you don't know X (in which case you just guess the mean of Y for every case).

[Figures: the same scatterplot shown twice – first with the regression line and its residuals, then with a horizontal line at the mean of Y and its larger residuals]
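The comparison can be made concrete (again using the BMI / birth-weight data from the earlier table) in a short sketch:

```python
# R^2 sketch: the regression's leftover squared error versus the error
# made by always guessing the mean of Y, using the BMI data above.
bmi = [20, 30, 50, 45, 10, 30, 40, 25, 50, 20, 10, 55, 60, 50, 35]
bw  = [2.7, 2.9, 3.4, 3.0, 2.2, 3.1, 3.3, 2.3, 3.5, 2.5, 1.5, 3.8, 3.7, 3.1, 2.8]

n = len(bmi)
x_bar, y_bar = sum(bmi) / n, sum(bw) / n
s_xx = sum((x - x_bar) ** 2 for x in bmi)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(bmi, bw)) / s_xx
b0 = y_bar - b1 * x_bar

# Error left over after the regression...
ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(bmi, bw))
# ...versus error from guessing the mean of Y for every case
ss_tot = sum((y - y_bar) ** 2 for y in bw)

r2 = 1 - ss_res / ss_tot
print(f"R^2 = {r2:.3f}")
```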

2. Statistical Significance

a. Statistical significance of the regression model: measured by one of a number of indicators (χ2, for example). No need to understand the indicator itself to interpret it – look for the “p value” associated with the indicator.

b. Statistical significance of each regression coefficient (β1, for example): also measured by a p value.

c. The key is to find p and see whether p < .05 (the usual threshold in the social sciences). If yes, the result is statistically significant; if not, it is not.

d. Finding and interpreting p


• The p value is the probability that random data (i.e. no real relationship with Y) would have coincidentally given you an association this strong. Hence, lower values of p are “better.”
• Authors sometimes say “significant at the .001 level.” This means p < .001. There may or may not be a table of p values for coefficients – authors frequently use asterisks to highlight coefficients at a given level of significance.
• If the model is not significant, the author has failed to discover a significant correlation between the model’s predicted values of Y and the actual values of Y.
• If a coefficient is not significant, then the author has failed to discover a significant correlation between that particular X and Y.
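The definition of p – the chance that random data would coincidentally produce an association this strong – can be illustrated with a permutation test. This is a teaching sketch, not what regression software actually reports (packages typically use a t-test on the coefficient), and it uses the BMI data from the earlier table:

```python
import random

# Permutation-test illustration of the p-value idea: how often does
# shuffled (i.e. random, no real relationship) data produce a correlation
# as strong as the one we observed?
random.seed(0)

x = [20, 30, 50, 45, 10, 30, 40, 25, 50, 20, 10, 55, 60, 50, 35]
y = [2.7, 2.9, 3.4, 3.0, 2.2, 3.1, 3.3, 2.3, 3.5, 2.5, 1.5, 3.8, 3.7, 3.1, 2.8]

def corr(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    sa = sum((u - ma) ** 2 for u in a) ** 0.5
    sb = sum((v - mb) ** 2 for v in b) ** 0.5
    return cov / (sa * sb)

observed = abs(corr(x, y))
trials, hits = 2000, 0
for _ in range(trials):
    shuffled = y[:]
    random.shuffle(shuffled)  # destroys any real X-Y relationship
    if abs(corr(x, shuffled)) >= observed:
        hits += 1
p = hits / trials
print(f"observed |r| = {observed:.3f}, permutation p ~= {p:.4f}")
```

Because the BMI / birth-weight association is strong, shuffled data almost never matches it, so the estimated p is very small.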
e. Two common mistakes


• “p < .6 so the relationship is statistically insignificant, and therefore I conclude that X doesn’t affect Y” – Not true, because p could be .001. All we know is that it is less than .6. In other words, absence of evidence is not evidence of absence. Indeed, when the number of cases is very small, all of the p values – even for real relationships – are likely to be too large to make the coefficients statistically significant
• “p < .000001 so the relationship between X and Y is very strong” – Not true, because the p value for any coefficient (no matter how tiny) becomes smaller as the number of cases increases. With millions of cases, just about every relationship is “statistically significant,” but many are substantively trivial
3. Substantive significance

This depends on what you are looking for!

• What units are X and Y measured in?
• Does the coefficient mean that small increases in X lead to large increases in Y? If statistically significant, this is also substantively significant
• Does the coefficient mean that large increases in X produce only trivial changes in Y? Then regardless of statistical significance, the relationship is substantively uninteresting
• This is a qualitative judgment based on your needs, but it takes into account the numbers
Example


• Research hypothesis: The level of economic development has a positive effect on civil liberties in countries of the world
• Dependent variable: civil liberties
• Interval-ratio
• Independent variable: GDP per capita ($1,000)
• Measure of the level of the economic development
• Interval-ratio
Example


• Regression Coefficient (beta) = .257
• Substantive significance
• An increase of $1,000 in the level of GDP per capita increases the civil liberties score by .257.
• On a 5-point scale, this is interesting. On a 1000-point scale it would not be interesting.
• Statistical significance:
• p < .001
• Statistically significant at the .001 or .1% level
• R square=.525
• GDP per capita explains 52.5% of variation in civil liberties
• Research hypothesis: was not falsified by bivariate regression analysis (i.e. was consistent with the regression)
• The level of economic development has a positive and statistically significant effect on civil liberties
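The substantive-significance arithmetic in this example is worth checking directly; this sketch only multiplies the reported coefficient by hypothetical GDP increases:

```python
# Worked check of the example above: with beta = .257 per $1,000 of GDP
# per capita, how much does predicted civil liberties change?
beta = 0.257  # reported coefficient, per $1,000 of GDP per capita

for delta_gdp in [1, 5, 10]:  # increases in thousands of dollars
    change = beta * delta_gdp
    print(f"+${delta_gdp},000 GDP/capita -> +{change:.3f} civil liberties")
```

A $10,000 increase moves the score by about 2.57 points – a large shift on a 5-point scale, which is why the slide calls this substantively interesting.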
4. Confidence Intervals

• Linear regression predicts best near the mean values of X. Extreme values of X (low or high) are associated with greater error when predicting Y.
• Solution: Confidence intervals. A 95% confidence interval is where 95% of observations of Y at a given value of X are expected to fall, given the significance of the coefficient of X.
• Example: Polls with “margins of error” (typically 95% confidence intervals)
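The point that predictions are less precise at extreme values of X can be sketched with the BMI data from the earlier table. This uses the normal-approximation multiplier 1.96 (a t-value would be slightly larger for n = 15), so treat it as an illustration rather than an exact interval:

```python
# Sketch: approximate 95% intervals around predictions from the BMI
# example. The interval widens as X moves away from its mean, so
# predictions at extreme X values carry more error.
bmi = [20, 30, 50, 45, 10, 30, 40, 25, 50, 20, 10, 55, 60, 50, 35]
bw  = [2.7, 2.9, 3.4, 3.0, 2.2, 3.1, 3.3, 2.3, 3.5, 2.5, 1.5, 3.8, 3.7, 3.1, 2.8]

n = len(bmi)
x_bar, y_bar = sum(bmi) / n, sum(bw) / n
s_xx = sum((x - x_bar) ** 2 for x in bmi)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(bmi, bw)) / s_xx
b0 = y_bar - b1 * x_bar

# Residual standard error of the regression
ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(bmi, bw))
se = (ss_res / (n - 2)) ** 0.5

widths = []
for x0 in [15, 35, 60]:  # below, near, and above the mean BMI (~35.3)
    half = 1.96 * se * (1 + 1 / n + (x0 - x_bar) ** 2 / s_xx) ** 0.5
    widths.append(half)
    print(f"BMI {x0}: predicted BW {b0 + b1 * x0:.2f} kg +/- {half:.2f}")
```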
III. Extrapolation
• Also known as “time series analysis.”

A. Simplest form: Yt = Yt-1 + α

• Y is the DV, t is time, and α is a constant
• If Yt-1 is 100 and α is 1, then Y will be 101, 102, 103, etc. as time passes
• Note that this is simply a rearranged linear regression equation: the DV is predicted by previous values of the DV (which fill in as the IVs in the model)
B. More common form
B. More common form
• Form: Yt = βYt-1 + α
• β is the multiplicative relationship between Yt-1 and Yt
• So if β = 1 (and α = 0), then Y never changes over time
• If β > 1, then Y increases over time
• If β < 1, then Y diminishes over time
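Both extrapolation forms above can be simulated with one small function; the starting value here is a made-up example:

```python
# Sketch of the extrapolation forms above: Y_t = beta * Y_{t-1} + alpha.
# The additive form is the special case beta = 1.
def extrapolate(y0, beta, alpha, steps):
    ys = [y0]
    for _ in range(steps):
        ys.append(beta * ys[-1] + alpha)
    return ys

print(extrapolate(100, 1.0, 1.0, 3))   # additive: [100, 101.0, 102.0, 103.0]
print(extrapolate(100, 1.05, 0.0, 3))  # beta > 1: Y grows over time
print(extrapolate(100, 0.5, 0.0, 3))   # beta < 1: Y shrinks over time
```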
C. Why use time series analysis?

1. Time’s arrow: Since cause must precede effect, time series analysis can be used to rule out the possibility that Y causes X

2. Autocorrelation: Sometimes we need to address the correlation of a variable with itself over time. Example: to predict defense budget, first thing to know is that it’s usually similar to last year’s budget. Then one can add IVs that might cause it to increase or decrease.

3. Omitted variable bias: Failing to “control” for a relevant IV (one that may correlate with both X and Y) can generate “false positives” – statistically significant relationships between variables that are causally unrelated (example: high correlation between Vietnam vets and supermarkets)

IV. Forecasting with Econometric Models
• Is the relationship causal? Difficult to know for sure…
• Possibility of coincidence: Addressed by requiring models to be statistically significant. Chance remains, but is low.
• Sources of bias:
• Y causes X. That is, perhaps the researcher has reversed the DV and IV. Use time-series analysis to rule this out.
• Faulty data – But only if the data is biased in some manner that makes X and Y correlate. Random noise is already accounted for. Example of bias = serial autocorrelation, or correlation across time. Many things (kids and dogs) grow larger over time. But height of your kid does not cause your dog to get bigger!
• Omitted variables – suppose Z causes X and Z causes Y. Then X and Y will appear to be causally related when in fact they are merely correlated. Adding Z to the model would reveal that X has no independent effect on Y.
B. Extrapolating
• Requires either
• The ability to forecast the IVs themselves, or
• A model that forecasts Y(t) from IVs in t-1, t-10, etc.
• Long-term forecasting models are rare. Why?
C. Process
• Find a linear regression (OLS) that forecasts something
• Find the future values of X
• Plug these into the equation
• Multiply each X by its corresponding β (order of operations)
• Add it all together. Don’t forget the intercept.
• Presto! You have a forecast for Y!
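The steps above reduce to a one-line computation; the coefficients and future X values in this sketch are hypothetical, not from a fitted model:

```python
# Sketch of the forecasting recipe above: multiply each future X by its
# coefficient, add them up, and don't forget the intercept.
intercept = 2.0                # b0 from a (hypothetical) fitted OLS model
coefs = [0.5, -1.2, 3.0]       # b1, b2, b3
future_x = [4.0, 1.0, 0.5]     # forecast values of X1, X2, X3

forecast = intercept + sum(b * x for b, x in zip(coefs, future_x))
print(f"forecast for Y = {forecast:.2f}")  # prints: forecast for Y = 4.30
```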