Prediction concerning the response Y


Where does this topic fit in?

  • Model formulation

  • Model estimation

  • Model evaluation

  • Model use


Translating two research questions into two reasonable statistical answers

  • What is the mean weight, μ, of all American women, aged 18-24?

    • If we want to estimate μ, what would be a good estimate?

  • What is the weight, y, of a randomly selected American woman, aged 18-24?

    • If we want to predict y, what would be a good prediction?
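A minimal sketch of the two questions when no predictor is used. The weights below are made up for illustration (they are not from the slides). The sample mean serves as both the estimate of μ and the prediction of y; what differs is the standard error attached to each.

```python
import math
import statistics

# Hypothetical sample of weights (lb); illustrative only, not from the slides.
weights = [118, 125, 131, 140, 122, 135, 128, 150, 119, 132]

n = len(weights)
ybar = statistics.mean(weights)   # best estimate of mu AND best prediction of y
s = statistics.stdev(weights)     # sample standard deviation

se_mean = s / math.sqrt(n)            # uncertainty in estimating the mean mu
se_pred = s * math.sqrt(1 + 1 / n)    # uncertainty in predicting one new y

print(f"estimate of mu = prediction of y = {ybar:.1f}")
print(f"SE for mean: {se_mean:.2f}, SE for new observation: {se_pred:.2f}")
```

The same best guess answers both questions, but the prediction carries the extra person-to-person variation, so its standard error is always larger.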


Could we do better by taking into account a person’s height?


One thing to estimate (μy) and one thing to predict (y)


Two different research questions

  • What is the mean response μY when the predictor value is xh?

  • What value will a new observation Ynew be when the predictor value is xh?


Example: Skin cancer mortality and latitude

  • What is the expected (mean) mortality rate for all locations at 40° N latitude?

  • What is the predicted mortality rate for 1 new randomly selected location at 40° N?


“Point estimators”

The predicted response ŷh = b0 + b1xh is the best answer to each research question. That is, it is:

  • the best guess of the mean response at xh

  • the best guess of a new observation at xh

But, as always, to be confident in the answer to our research question, we should put an interval around our best guess.
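As a rough sketch of the point estimate, the code below fits a least-squares line and evaluates it at xh. The (latitude, mortality) pairs are hypothetical and are not the data set behind the slides; `fit_line` is an illustrative helper name.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for simple linear regression."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

# Hypothetical (latitude, mortality) pairs; NOT the data set behind the slides.
lat = [30.0, 34.0, 38.0, 42.0, 46.0]
mort = [210.0, 185.0, 160.0, 140.0, 115.0]

b0, b1 = fit_line(lat, mort)
x_h = 40.0
y_hat_h = b0 + b1 * x_h   # point estimate of mu_Y and point prediction of Y_new at x_h
print(round(y_hat_h, 2))
```

One number, `y_hat_h`, answers both research questions; the interval placed around it is what differs.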


It is dangerous to “extrapolate” beyond scope of model.


A confidence interval for the population mean response μY

… when the predictor value is xh


Again, what are we estimating?


(1-α)100% t-interval for mean response μY

Formula in words:

Sample estimate ± (t-multiplier × standard error)

Formula in notation:
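The slide's own formula did not survive in this transcript. The standard simple linear regression form is sketched below, with symbols that are assumptions here: ŷh is the fitted value at xh, n is the sample size, x̄ is the mean of the predictor values, and t(α/2, n−2) is the t-multiplier with n − 2 degrees of freedom.

```latex
\hat{y}_h \pm t_{(\alpha/2,\, n-2)} \times \sqrt{MSE \left( \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)}
```

The square-root term is the standard error of the fit, which Minitab reports as "SE Fit".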


Example: Skin cancer mortality and latitude

Predicted Values for New Observations

New Obs     Fit  SE Fit          95.0% CI            95.0% PI
      1  150.08    2.75  (144.56, 155.61)    (111.23, 188.93)

Values of Predictors for New Observations

New Obs   Lat
      1  40.0
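A quick arithmetic check of the row above, as a sketch: the confidence interval should equal Fit ± (t-multiplier × SE Fit), so dividing its half-width by SE Fit recovers the multiplier. Minitab prints rounded values, so the results are approximate.

```python
# Values copied from the Minitab row above (New Obs 1, Lat = 40.0).
fit = 150.08
se_fit = 2.75
ci_low, ci_high = 144.56, 155.61

half_width = (ci_high - ci_low) / 2      # t-multiplier x standard error
t_multiplier = half_width / se_fit       # multiplier implied by the printed output
midpoint = (ci_low + ci_high) / 2        # should sit at the fitted value (up to rounding)

print(f"half-width = {half_width:.3f}")
print(f"implied t-multiplier = {t_multiplier:.3f}")
print(f"midpoint = {midpoint:.3f} vs fit = {fit}")
```

The implied multiplier is close to 2, as expected for a 95% t-interval with a moderate number of degrees of freedom.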


Factors affecting the length of the confidence interval for μY

  • As the confidence level decreases, …

  • As MSE decreases, …

  • As the sample size increases, …

  • The more spread out the predictor values, …

  • The closer xh is to the sample mean, …
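The factors listed above can be explored numerically. The sketch below computes the standard error of the fit for a baseline and then changes one factor at a time; the predictor values and MSE are hypothetical, and `se_fit` is an illustrative helper. The confidence level is not shown because it acts only through the t-multiplier.

```python
import math

def se_fit(mse, x, x_h):
    """Standard error of the fitted value at x_h in simple linear regression."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return math.sqrt(mse * (1 / n + (x_h - xbar) ** 2 / sxx))

# Hypothetical predictor values and MSE, chosen only for illustration.
x_base = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]      # mean 5.5
base = se_fit(mse=4.0, x=x_base, x_h=8)

smaller_mse = se_fit(mse=1.0, x=x_base, x_h=8)
larger_n = se_fit(mse=4.0, x=x_base * 3, x_h=8)   # same values, three times the data
more_spread = se_fit(mse=4.0, x=[2 * xi - 5.5 for xi in x_base], x_h=8)  # same mean, doubled spread
closer_xh = se_fit(mse=4.0, x=x_base, x_h=5.5)    # x_h at the sample mean

for label, value in [("baseline", base), ("smaller MSE", smaller_mse),
                     ("larger n", larger_n), ("more spread-out x", more_spread),
                     ("x_h at the mean", closer_xh)]:
    print(f"{label:>18}: SE of fit = {value:.3f}")
```

A smaller standard error gives a shorter interval at the same confidence level.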


Does the estimate of μY when xh = 1 vary more here …?

Variable     N  StDev
yhat(x=1)    5  0.320


… or here?

Variable     N  StDev
yhat(x=1)    5  2.127


Does the estimate of μY vary more when xh = 1 or when xh = 5.5?

Variable       N  StDev
yhat(x=1)      5  2.127
yhat(x=5.5)    5  0.512
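The comparisons above can be imitated with a small simulation. This is only a sketch: the true line, error standard deviation, and x designs are hypothetical, and it uses 2000 repeated samples rather than the 5 fitted values summarized on the slides.

```python
import random
import statistics

def fitted_value(x, y, x_h):
    """Least-squares fitted value at x_h."""
    xbar, ybar = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar + b1 * (x_h - xbar)

def sd_of_fits(x, x_h, reps=2000, beta0=2.0, beta1=1.5, sigma=1.0, seed=1):
    """SD of the fitted value at x_h across repeated samples from a hypothetical true line."""
    rng = random.Random(seed)
    fits = []
    for _ in range(reps):
        y = [beta0 + beta1 * xi + rng.gauss(0, sigma) for xi in x]
        fits.append(fitted_value(x, y, x_h))
    return statistics.stdev(fits)

x_spread = [1, 3, 5.5, 8, 10]          # predictor values spread out (mean 5.5)
x_clustered = [4.5, 5, 5.5, 6, 6.5]    # predictor values bunched together (mean 5.5)

print("spread x,    x_h = 1  :", round(sd_of_fits(x_spread, 1), 3))
print("clustered x, x_h = 1  :", round(sd_of_fits(x_clustered, 1), 3))
print("clustered x, x_h = 5.5:", round(sd_of_fits(x_clustered, 5.5), 3))
```

The fitted value varies least when the predictor values are spread out and when xh sits near their mean.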


Example: Skin cancer mortality and latitude

Predicted Values for New Observations

New Obs     Fit  SE Fit        95.0% CI          95.0% PI
      1  150.08    2.75  (144.6, 155.6)   (111.2, 188.93)
      2  221.82    7.42  (206.9, 236.8)   (180.6, 263.07) X

X denotes a row with X values away from the center

Values of Predictors for New Observations

New Obs  Latitude
      1      40.0
      2      28.0

Mean of Lat = 39.533


When is it okay to use the confidence interval for μY formula?

  • When xh is a value within the scope of the model – xh does not have to be one of the actual x values in the data set.

  • When the “LINE” assumptions are met.

    • The formula works okay even if the error terms are only approximately normal.

    • If you have a large sample, the error terms can even deviate substantially from normality.


Prediction interval for a new response Ynew


Again, what are we predicting?


(1-α)100% prediction interval for new response Ynew

Formula in words:

Sample prediction ± (t-multiplier × standard error)

Formula in notation:
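The slide's own formula did not survive in this transcript. The standard simple linear regression form is sketched below, with symbols that are assumptions here: ŷh is the fitted value at xh, n is the sample size, x̄ is the mean of the predictor values, and t(α/2, n−2) is the t-multiplier with n − 2 degrees of freedom.

```latex
\hat{y}_h \pm t_{(\alpha/2,\, n-2)} \times \sqrt{MSE \left( 1 + \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)}
```

The leading 1 inside the parentheses accounts for the variation of an individual response around its mean.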


Example: Skin cancer mortality and latitude

Predicted Values for New Observations

New Obs     Fit  SE Fit          95.0% CI            95.0% PI
      1  150.08    2.75  (144.56, 155.61)    (111.23, 188.93)

Values of Predictors for New Observations

New Obs   Lat
      1  40.0
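The printed row is enough to back out the standard error behind the prediction interval. This is a sketch that assumes both intervals use the same t-multiplier and that the squared prediction standard error equals MSE plus the squared SE Fit; Minitab's rounding makes the results approximate.

```python
import math

# Values copied from the Minitab row above (New Obs 1, Lat = 40.0).
se_fit = 2.75
ci_low, ci_high = 144.56, 155.61
pi_low, pi_high = 111.23, 188.93

# Both intervals share the same t-multiplier, so recover it from the CI.
t_multiplier = (ci_high - ci_low) / 2 / se_fit

# Standard error of prediction implied by the PI half-width.
se_pred = (pi_high - pi_low) / 2 / t_multiplier

# se_pred^2 = MSE + se_fit^2, so the implied MSE is the difference.
mse = se_pred ** 2 - se_fit ** 2

print(f"implied SE of prediction = {se_pred:.2f}")
print(f"implied MSE = {mse:.1f}, sqrt(MSE) = {math.sqrt(mse):.2f}")
```

Almost all of the prediction interval's width comes from the MSE term rather than from SE Fit, which is why the PI is so much wider than the CI.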


When is it okay to use the prediction interval for Ynew formula?

  • When xh is a value within the scope of the model – xh does not have to be one of the actual x values in the data set.

  • When the “LINE” assumptions are met.

    • The formula for the prediction interval depends strongly on the assumption that the error terms are normally distributed.


What’s the difference in the two formulas?

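Confidence interval for μY:

$$\hat{y}_h \pm t_{(\alpha/2,\, n-2)} \times \sqrt{MSE \left( \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)}$$

Prediction interval for Ynew:

$$\hat{y}_h \pm t_{(\alpha/2,\, n-2)} \times \sqrt{MSE \left( 1 + \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)}$$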


Prediction of Ynew if the mean μY is known

Suppose it were known that the mean skin cancer mortality at xh = 40° N is 150 deaths per million (with variance 400).

What is the predicted skin cancer mortality in Columbus, Ohio?
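A sketch of the answer when the mean and variance are treated as known: the prediction is the mean itself, and an approximate 95% range follows from the standard deviation. The normality assumption and the use of the normal 97.5th percentile are assumptions made here for illustration.

```python
from statistics import NormalDist

mean_mortality = 150.0   # known mean at 40 degrees N (from the slide)
variance = 400.0         # known variance (from the slide)
sigma = variance ** 0.5  # standard deviation = 20

# With the mean known, the best prediction for one location is the mean itself.
prediction = mean_mortality

# Assuming normally distributed mortality, about 95% of locations fall within
# z * sigma of the mean, where z is the 97.5th percentile of the standard normal.
z = NormalDist().inv_cdf(0.975)
low, high = prediction - z * sigma, prediction + z * sigma

print(f"prediction = {prediction:.0f}, 95% interval = ({low:.1f}, {high:.1f})")
```

Here the only uncertainty is the variation of individual locations around the known mean.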


And then reality sets in

  • The mean μY is not known.

  • Estimate it with the predicted response ŷh.

  • The cost of using ŷh to estimate μY is the variance of ŷh.

  • The variance σ² is not known.

  • Estimate it with MSE.


Variance of the prediction

The variation in the prediction of a new response depends on two components:

1. the variation due to estimating the mean μY with ŷh

2. the variation in Y

Their sum, Var(ŷh) + σ², is estimated by [SE(ŷh)]² + MSE.
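Written out for simple linear regression, the estimated variance of the prediction can be sketched as follows. The notation (n, x̄, xi) is assumed here because the slide's formula image is missing from the transcript.

```latex
\widehat{\mathrm{Var}}(\text{prediction})
  = \underbrace{MSE \left( \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)}_{\text{estimating } \mu_Y \text{ with } \hat{y}_h}
  + \underbrace{MSE}_{\text{variation in } Y}
  = MSE \left( 1 + \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)
```

The second component does not shrink as more data are collected, since it reflects how individual responses scatter around their mean.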


What’s the effect of the difference in the two formulas?

  • A (1-α)100% confidence interval for μY at xh will always be narrower than a (1-α)100% prediction interval for Ynew at xh.

  • The confidence interval’s standard error can approach 0, whereas the prediction interval’s standard error cannot get close to 0.
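A numerical sketch of the second point: as the sample size grows, the standard error for the mean response shrinks toward 0, while the standard error for a new response levels off near √MSE. The MSE, x design, and xh are hypothetical, and `standard_errors` is an illustrative helper.

```python
import math

def standard_errors(mse, n, sxx, x_h, xbar):
    """Return (SE for the mean response, SE for a new response) at x_h."""
    leverage = 1 / n + (x_h - xbar) ** 2 / sxx
    return math.sqrt(mse * leverage), math.sqrt(mse * (1 + leverage))

mse = 4.0  # hypothetical MSE, so sqrt(MSE) = 2
for n in (10, 100, 1000, 10000):
    x = [10 * i / (n - 1) for i in range(n)]      # n evenly spaced x values on [0, 10]
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    se_mean, se_new = standard_errors(mse, n, sxx, x_h=7.0, xbar=xbar)
    print(f"n = {n:>5}: SE(mean) = {se_mean:.4f}, SE(new) = {se_new:.4f}")
```

Because the prediction standard error always exceeds the standard error for the mean, the prediction interval is wider at every sample size.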


Confidence intervals and prediction intervals for response in Minitab

  • Stat >> Regression >> Regression …

  • Specify response and predictor(s).

  • Select Options…

    • In “Prediction intervals for new observations” box, specify either the X value or a column name containing multiple X values.

    • Specify confidence level (default is 95%).

  • Click on OK in each dialog box.

  • Results appear in session window.


In this example, worksheet column C6 holds the two new latitude values, 40 and 28.


Example: Skin cancer mortality and latitude

Predicted Values for New Observations

New Obs     Fit  SE Fit        95.0% CI          95.0% PI
      1  150.08    2.75  (144.6, 155.6)   (111.2, 188.93)
      2  221.82    7.42  (206.9, 236.8)   (180.6, 263.07) X

X denotes a row with X values away from the center

Values of Predictors for New Observations

New Obs  Latitude
      1      40.0
      2      28.0

Mean of Lat = 39.533


A plot of the confidence interval and prediction interval in Minitab

  • Stat >> Regression >> Fitted line plot …

  • Specify predictor and response.

  • Under Options …

    • Select Display confidence bands.

    • Select Display prediction bands.

    • Specify desired confidence level (95% default)

  • Select OK in each dialog box.

