Lecture 2.4 Preview: Interval Estimates and Hypothesis Testing


# Lecture 2.4 Preview: Interval Estimates and Hypothesis Testing

Lecture 2.4 Preview: Interval Estimates and Hypothesis Testing. Clint’s Assignment: Taking Stock. Estimate Reliability: Interval Estimate Question. Normal Distribution versus the Student t -Distribution: One Last Complication.


## PowerPoint Slideshow about 'Lecture 2.4 Preview: Interval Estimates and Hypothesis Testing' - zarita

Presentation Transcript

Clint’s Assignment: Taking Stock

Estimate Reliability: Interval Estimate Question

Normal Distribution versus the Student t-Distribution: One Last Complication

Assessing the Reliability of a Coefficient Estimate: Applying the Student t-Distribution

Theory Assessment: Hypothesis Testing

Motivating Hypothesis Testing: The Cynic

Formalizing Hypothesis Testing: The Steps

Summary: The Ordinary Least Squares (OLS) Estimation Procedure

Standard Ordinary Least Squares (OLS) Premises

Ordinary Least Squares (OLS) Estimation Procedure: Three Important Parts

Properties of the Ordinary Least Squares Estimation Procedure

Clint’s Assignment: Taking Stock

Theory: Studying more results in higher quiz scores.

The Model: $y_t = \beta_{Const} + \beta_x x_t + e_t$

$y_t$ = Actual quiz score

$x_t$ = Minutes studied

$e_t$ = Error term

$\beta_{Const}$ = Points given for showing up

$\beta_x$ = Points earned for each minute studied

Clint wishes to find the values of $\beta_{Const}$ and $\beta_x$.

But $\beta_{Const}$ and $\beta_x$ are not observable.

Clint can never determine the actual values of $\beta_{Const}$ and $\beta_x$. How can he proceed?

First Quiz

| Student | x (minutes studied) | y (quiz score) |
|---|---|---|
| 1 | 5 | 66 |
| 2 | 15 | 87 |
| 3 | 25 | 90 |

Ordinary Least Squares (OLS) Estimates

$Esty = 63 + 1.2x$

$b_{Const} = 63$ = Estimated points given for showing up

$b_x = 1.2$ = Estimated points for each minute studied
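As a quick check on these estimates, the sketch below recomputes them from the three first-quiz observations using the textbook least squares formulas for a one-variable regression. The function name `ols_estimates` is hypothetical, chosen only for this illustration.

```python
# First-quiz data from the slides: minutes studied (x) and quiz score (y).
x = [5, 15, 25]
y = [66, 87, 90]


def ols_estimates(x, y):
    """Return (b_const, b_x) for the simple regression y = b_const + b_x * x.

    Hypothetical helper for illustration; it applies the standard
    least squares formulas for a single explanatory variable.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross products and sum of squared deviations of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b_x = sxy / sxx
    b_const = y_bar - b_x * x_bar
    return b_const, b_x


b_const, b_x = ols_estimates(x, y)
print(f"Esty = {b_const:.1f} + {b_x:.1f}x")
```

With only three observations the arithmetic is easy to follow by hand: the cross products sum to 240 and the squared deviations of x sum to 200, giving a slope of 1.2.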

Clint’s Assignment

Coefficient Reliability: How reliable is the coefficient estimate, 1.2, calculated from the first quiz? That is, how confident should Clint be that the coefficient estimate, 1.2, will be close to the actual value?

Theory Confidence: How much confidence should Clint have in the theory that additional studying increases quiz scores?

General Properties of the Ordinary Least Squares (OLS) Estimation Procedure

When the standard ordinary least squares premises are met, the following equations describe the coefficient estimate’s probability distribution:

$Mean[b_x] = \beta_x$

$Var[b_x] = \dfrac{Var[e]}{\sum_{t=1}^{T} (x_t - \bar{x})^2}$

Importance of the Probability Distribution’s Mean (Center) and Variance (Spread)

Mean: When the mean of the coefficient estimate’s probability distribution, $Mean[b_x]$, equals the actual value of the coefficient, $\beta_x$, the estimation procedure is unbiased.

Unbiased does not mean that the estimate will equal the actual value.

In fact, we can be all but certain that the estimate will not equal the actual value.

Unbiased does mean that the estimation procedure does not systematically underestimate or overestimate the actual coefficient value.

Formally, the mean of the estimate’s probability distribution equals the actual value.

For more intuition, suppose that the estimate’s probability distribution is symmetric: the chance that the estimate is too high equals the chance that it is too low.

Variance: When the estimation procedure for the coefficient value is unbiased, the variance of the estimate’s probability distribution, $Var[b_x]$, determines the reliability of the estimate.

As the variance decreases, the estimate is more likely to be close to the actual coefficient value.

The Problem: But there is a problem here, isn’t there?

Econometrician’s Philosophy: If you lack the information to determine the value directly, estimate the value to the best of your ability using the information you do have.

OLS Estimation Procedure: Three Estimation Procedures

The ordinary least squares (OLS) estimation procedure actually includes three procedures:

A Procedure to Estimate the Value of the Parameters

A Procedure to Estimate the Variance of the Error Term’s Probability Distribution

A Procedure to Estimate the Variance of the Coefficient Estimate’s Probability Distribution

Good News: When the standard ordinary least squares (OLS) premises are satisfied:

Each of the three procedures is unbiased.

The procedure to estimate the value of the parameters is the best linear unbiased estimation procedure.
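The second and third procedures can be sketched for the first-quiz data as follows. This is a minimal illustration, not the lecture's own code: it assumes the error variance is estimated as the sum of squared residuals (SSR) divided by the degrees of freedom, and that the coefficient variance is estimated with the standard one-variable formula, the estimated error variance divided by the sum of squared deviations of x. All variable names are made up for the example.

```python
import math

# First-quiz data and the OLS estimates reported in the slides.
x = [5, 15, 25]
y = [66, 87, 90]
b_const, b_x = 63.0, 1.2

# Procedure 2: estimate the variance of the error term's distribution.
residuals = [yi - (b_const + b_x * xi) for xi, yi in zip(x, y)]
ssr = sum(r ** 2 for r in residuals)      # sum of squared residuals
degrees_of_freedom = len(x) - 2           # sample size minus estimated parameters
est_var_e = ssr / degrees_of_freedom

# Procedure 3: estimate the variance of the coefficient estimate's distribution.
x_bar = sum(x) / len(x)
sxx = sum((xi - x_bar) ** 2 for xi in x)
est_var_bx = est_var_e / sxx
se_bx = math.sqrt(est_var_bx)             # standard error of b_x

print(f"SSR = {ssr:.0f}, EstVar[e] = {est_var_e:.0f}")
print(f"EstVar[bx] = {est_var_bx:.2f}, SE[bx] = {se_bx:.4f}")
```

The standard error this produces, about .5196, is the value used in the interval estimate and hypothesis test calculations.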

Coefficient Reliability: How reliable is the coefficient estimate, 1.2, calculated from the first quiz? That is, how confident should Clint be that the coefficient estimate, 1.2, will be close to the actual value?

Interval Estimate Question: What is the probability that the coefficient estimate, 1.2, lies within _____ of the actual coefficient value? _____.

One Last Complication

We do not know the actual value of the variance for the coefficient estimate’s probability distribution.

To use the normal distribution we must know the actual value of the variance (and the standard deviation) for the random variable’s probability distribution.

We must estimate its variance.

We cannot use the normal distribution when dealing with the coefficient estimate.

Instead we must use another distribution, the Student t-distribution.

The Normal Distribution Versus the Student t-Distribution

| | Normal distribution | Student t-distribution |
|---|---|---|
| Standard deviation | Known | Not known; must be estimated |
| Statistic | z equals the number of standard deviations the value lies from the mean | t equals the number of estimated standard deviations the value lies from the mean |

Why is the Student t-distribution more “spread out”? Since we must estimate the value of the standard deviation, we are introducing an additional element of uncertainty into the mix.

Hence, the Student t-distribution is more “spread out” than the normal distribution.

Furthermore, the Student t-distribution is more complicated than the normal distribution: its “spread” depends on the degrees of freedom.

The Normal Distribution’s z and the Student t-Distribution’s t

When the standard deviation is known, use the normal distribution:

$$z = \frac{\text{Value of Random Variable} - \text{Mean of Random Variable}}{\text{Standard Deviation of Random Variable}}$$

= Number of Standard Deviations from the Mean

When the standard deviation must be estimated, use the t-distribution:

$$t = \frac{\text{Value of Random Variable} - \text{Mean of Random Variable}}{\text{Estimated Standard Deviation of Random Variable}}$$

= Number of Estimated Standard Deviations (Standard Errors) from the Mean

t-distribution is affected by the degrees of freedom.

Coefficient Reliability: How reliable is the coefficient estimate, 1.2, calculated from the first quiz? That is, how confident should Clint be that the coefficient estimate, 1.2, will be close to the actual value?

Interval Estimate Question: What is the probability that the coefficient estimate, 1.20, lies within 1.50 of the actual coefficient value? _____.

First Blank: We begin by filling in the first blank, choosing our “close to” value. Suppose that we choose 1.50:

Close To Criterion = 1.50

So we write 1.50 in the first blank.

Interval Estimate Question: What is the probability that the coefficient estimate, 1.20, lies within 1.50 of the actual coefficient value? .78.

Second Blank: Calculate the probability that the estimate lies within 1.50 of the actual value.

Convert 1.50 into standard errors:

$$t = \frac{1.50}{.5196} = 2.89$$

t = Number of standard errors from the mean

Question: Why does the actual value equal the distribution mean?

Answer: The ordinary least squares (OLS) estimation procedure is unbiased.

Degrees of Freedom = Sample Size − Number of Estimated Parameters = 3 − 2 = 1

[Figure: Student t-distribution centered on the actual value, $\beta_x$. The points $\beta_x - 1.50$ and $\beta_x + 1.50$ lie 2.89 SE’s below and above the mean (t = −2.89 and t = +2.89). Left tail: .11. Right tail: .11. Area between: .78.]

Probability that the estimate lies within 1.50 of the actual value

equals

Probability that the estimate lies within 2.89 SE’s of the actual value

= Probability between t’s of −2.89 and +2.89

= 1.00 − (.11 + .11)

= 1.00 − .22

= .78
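The interval estimate calculation can be reproduced without statistical tables. The sketch below relies on one fact that holds only for this example: with 1 degree of freedom, the Student t-distribution is the Cauchy distribution, whose cumulative distribution function has the closed form 0.5 + arctan(t)/π. The helper name `t_cdf_df1` is hypothetical, and the code would not apply to other degrees of freedom.

```python
import math


def t_cdf_df1(t):
    """CDF of the Student t-distribution with 1 degree of freedom.

    Hypothetical helper: valid only for DF = 1, where the t-distribution
    coincides with the Cauchy distribution.
    """
    return 0.5 + math.atan(t) / math.pi


close_to = 1.50          # the "close to" criterion chosen in the slides
se_bx = 0.5196           # standard error of the coefficient estimate

t = close_to / se_bx                 # number of standard errors
tail = 1.0 - t_cdf_df1(t)            # probability in one tail
prob_within = 1.0 - 2.0 * tail       # probability between -t and +t

print(f"t = {t:.2f}, each tail = {tail:.2f}, within = {prob_within:.2f}")
```

Each tail rounds to .11, as in the slides. The unrounded probability comes out slightly above .78 because the slides round each tail to .11 before subtracting.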

Clint’s Assignment: Theory Confidence. How much confidence should Clint have in the theory that additional studying increases quiz scores?

Theory: Additional studying increases quiz scores.

Step 0: Construct a model reflecting the theory to be tested

yt = Const + xxt + et

yt = Actual quiz score

xt = Minutes studied

et = Error term

Const reflects points given for showing up

x reflects points earned for each minute studied

First Quiz Student xy 1 5 66 2 15 87 3 25 90

The theory suggests that x should be positive.

Step 1: Collect data, run the regression, and interpret the estimates

The estimated equation:

$Esty = 63 + 1.2x$

$b_{Const}$ = Estimate of $\beta_{Const}$ = 63

$b_x$ = Estimate of $\beta_x$ = 1.2

Interpretation: The regression suggests that students receive

63 points for showing up

1.2 additional points for each minute studied

Critical Result: The parameter estimate evidence suggests that the theory postulating the benefits of additional studying is correct.

The coefficient estimate is positive.

More specifically, the coefficient estimate lies 1.2 above 0.

Step 2: Play the cynic and challenge the results; construct the null and alternative hypotheses:

Cynic’s view: Sure, the coefficient estimate was positive, but this result was just “the luck of the draw.” In fact, studying has no impact on quiz scores; the actual coefficient, $\beta_x$, equals 0.

H0: $\beta_x = 0$ Cynic is correct: Studying has no impact on a student’s quiz score

H1: $\beta_x > 0$ Cynic is incorrect: Additional studying increases quiz scores

Question: Can we dismiss the cynic’s view as being impossible?

No

Step 3: Formulate the question to assess the cynic’s view, to assess the null hypothesis.

Generic Question: What is the probability that the results would be like those we actually obtained (or even stronger), if the cynic is correct and studying actually has no impact?

Specific Question: The regression’s coefficient estimate was 1.2. What is the probability that the coefficient estimate, $b_x$, in one regression would be 1.2 or more, if H0 were true (if the actual coefficient, $\beta_x$, equaled 0)?

Answer: Prob[Results IF Cynic Correct] or equivalently Prob[Results IF H0 True]

| Prob[Results IF H0 True] small | Prob[Results IF H0 True] large |
|---|---|
| Unlikely that H0 is true | Likely that H0 is true |
| Reject H0 | Do not reject H0 |

H0: $\beta_x = 0$ Cynic is correct: Studying has no impact on quiz score

H1: $\beta_x > 0$ Cynic is incorrect: As studying increases, the quiz score increases

Step 4: Use the estimation procedure’s general properties to calculate Prob[Results IF H0 True].

Estimate was 1.2: What is the probability that the coefficient estimate in one regression would be 1.2 or more, if H0 were true (if the actual coefficient, $\beta_x$, equaled 0)?

OLS estimation procedure unbiased, and if H0 were true: $Mean[b_x] = \beta_x = 0$

Standard error: $SE[b_x] = .5196$

Number of observations minus number of parameters: $DF = 3 - 2 = 1$

Question: What do we know about the probability distribution of the coefficient estimates, $b_x$?

t-distribution: Mean = 0, SE = .5196, DF = 1

Use the Econometrics Lab.

[Figure: t-distribution of $b_x$ centered at 0; the area to the right of 1.2 is .13.]

Prob[Results IF H0 True] ≈ .13

Using EViews to Calculate Prob[Results IF H0 True]

OLS estimator is unbiased, and assume the cynic is correct: $Mean[b_x] = \beta_x = 0$

EViews SE column: $SE[b_x] = .5196$

Number of observations minus number of parameters: $DF = 3 - 2 = 1$

t-Statistic Column: How many standard errors (number of estimated standard deviations) does the coefficient estimate, 1.2, lie from 0?

$$t = \frac{1.2 - 0}{.5196} = 2.309 = \text{t-Statistic Column}$$

The estimate, 1.2, lies about 2.3 standard errors from 0.

Tails Probability: What is the “tails probability,” the probability that the coefficient estimate, $b_x$, resulting from one regression would lie at least 1.2 from 0, if the actual coefficient, $\beta_x$, equaled 0?

[Figure: t-distribution of $b_x$ with Mean = 0, SE = .5196, DF = 1. The left tail (below −1.2) and the right tail (above 1.2) each have probability .13.]

Tails Probability ≈ .26

NB: The Prob. Column is based on the premise that the actual coefficient, $\beta_x$, equals 0.

Tails Probability: EViews Prob. Column

[Figure: t-distribution of $b_x$ with Mean = 0, SE = .5196, DF = 1. The left tail (below −1.2) and the right tail (above 1.2) each have probability .2601/2.]

Question to Assess Cynic’s View: What is the probability of obtaining a result like the one calculated from the first quiz data (a coefficient estimate, $b_x$, of 1.2 or more), if studying actually has no impact on quiz scores (if the actual coefficient, $\beta_x$, were 0)?

[Figure: t-distribution of $b_x$ with Mean = 0, SE = .5196, DF = 1. The right tail, above 1.2, has probability .2601/2.]

Prob[Results IF H0 True] = .2601/2 ≈ .13

Tails Probability = .2601
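The t-statistic and the tails probability can be checked with a few lines of arithmetic. This sketch is not how EViews computes its Prob. column; it simply uses the fact that a t-distribution with 1 degree of freedom is the Cauchy distribution, so each tail beyond ±t has probability 0.5 − arctan(t)/π. The variable names are illustrative.

```python
import math

b_x = 1.2        # coefficient estimate from the first quiz
se_bx = 0.5196   # standard error of the coefficient estimate
null_value = 0   # value of the actual coefficient if H0 were true

# How many standard errors the estimate lies from the null value.
t_stat = (b_x - null_value) / se_bx

# With DF = 1 the t-distribution is the Cauchy distribution,
# so the probability beyond t_stat in one tail has a closed form.
one_tail = 0.5 - math.atan(t_stat) / math.pi
tails_probability = 2.0 * one_tail

print(f"t-statistic = {t_stat:.3f}")
print(f"Tails probability = {tails_probability:.2f}")
print(f"Prob[Results IF H0 True] = {one_tail:.2f}")
```

Because the alternative hypothesis is one-sided, only the right tail matters for Prob[Results IF H0 True], which is why the tails probability is halved.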

H0: $\beta_x = 0$ Cynic is correct: Studying has no impact on a student’s quiz score

H1: $\beta_x > 0$ Cynic is incorrect: As studying increases, quiz score increases

Prob[Results IF H0 True] ≈ .13

Step 5: Decide on the standard of proof, a significance level

The significance level is the dividing line between the probability being small and the probability being large.

| Prob[Results IF H0 True] Less Than Significance Level | Prob[Results IF H0 True] Greater Than Significance Level |
|---|---|
| Prob[Results IF H0 True] small | Prob[Results IF H0 True] large |
| Unlikely that H0 is true | Likely that H0 is true |
| Reject H0 | Do not reject H0 |

Would we reject H0 at a 1 percent (.01) significance level?

No, because .13 is greater than .01.

Would we reject H0 at a 5 percent (.05) significance level?

No, because .13 is greater than .05.

Would we reject H0 at a 10 percent (.10) significance level?

No, because .13 is greater than .10.

At the “traditional” significance levels, we could not reject the null hypothesis; we cannot reject the notion that studying has no impact on quiz scores.
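The decision rule in Step 5 amounts to a single comparison repeated for each significance level. A minimal sketch, using the probability from Step 4 and hypothetical names:

```python
# Prob[Results IF H0 True] from Step 4: half of the tails probability.
prob_results_if_h0_true = 0.2601 / 2

# The three "traditional" significance levels discussed in the slides.
significance_levels = [0.01, 0.05, 0.10]

decisions = {}
for level in significance_levels:
    # Reject H0 only when the probability falls below the significance level.
    if prob_results_if_h0_true < level:
        decisions[level] = "Reject H0"
    else:
        decisions[level] = "Do not reject H0"

for level, decision in decisions.items():
    print(f"Significance level {level:.2f}: {decision}")
```

Since .13 exceeds all three levels, every comparison leads to the same decision, matching the conclusion drawn in the slides.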

Summary: Standard Regression Assumptions and the Ordinary Least Squares (OLS) Estimation Procedure

The Model: $y_t = \beta_{Const} + \beta_x x_t + e_t$

$\beta_{Const}$ and $\beta_x$ are the parameters

$y_t$ = Dependent variable

$x_t$ = Explanatory variable

$e_t$ = Error term

Role of the Error Term

The error term is a random variable representing random influences: Mean[et] = 0

Standard Ordinary Least Squares (OLS) Premises

Error Term Equal Variance Premise: The variance of the error term’s probability distribution for each observation is the same.

Error Term/Error Term Independence Premise: The error terms are independent.

Explanatory Variable Constant Premise: The explanatory variables, the $x_t$’s, are constants; the explanatory variables, the $x_t$’s, are not random variables.

Explanatory Variable/Error Term Independence Premise: The explanatory variables, the $x_t$’s, and the error terms, the $e_t$’s, are not correlated.

OLS Estimation Procedure Includes Three Estimation Procedures

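Good News: When the standard OLS regression assumptions are met, each of these procedures is unbiased.

Good News: When the standard OLS regression assumptions are met, the OLS estimation procedure is BLUE.

Value of the parameters, $\beta_{Const}$ and $\beta_x$:

$$b_x = \frac{\sum_{t=1}^{T} (y_t - \bar{y})(x_t - \bar{x})}{\sum_{t=1}^{T} (x_t - \bar{x})^2} \qquad b_{Const} = \bar{y} - b_x \bar{x}$$

Variance of the error term’s probability distribution, $Var[e]$:

$$EstVar[e] = \frac{SSR}{\text{Degrees of Freedom}}$$

Variance of the coefficient estimate’s probability distribution, $Var[b_x]$:

$$EstVar[b_x] = \frac{EstVar[e]}{\sum_{t=1}^{T} (x_t - \bar{x})^2}$$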