
Prediction concerning the response Y

Where does this topic fit in?

- Model formulation
- Model estimation
- Model evaluation
- Model use

Translating two research questions into two reasonable statistical answers

- What is the mean weight, μ, of all American women, aged 18-24?
- If we want to estimate μ, what would be a good estimate?

- What is the weight, y, of a randomly selected American woman, aged 18-24?
- If we want to predict y, what would be a good prediction?

One thing to estimate (μY) and one thing to predict (y)

Two different research questions

- What is the mean response μY when the predictor value is xh?
- What value will a new observation Ynew be when the predictor value is xh?

Example: Skin cancer mortality and latitude

- What is the expected (mean) mortality rate for all locations at 40° N latitude?
- What is the predicted mortality rate for 1 new randomly selected location at 40° N?

Example: Skin cancer mortality and latitude

Point estimators

The fitted value $\hat{y}_h = b_0 + b_1 x_h$ is the best answer to each research question. That is, it is:
- the best guess of the mean response at xh
- the best guess of a new observation at xh

But, as always, to be confident in the answer to our research question, we should put an interval around our best guess.

It is dangerous to “extrapolate” beyond the scope of the model.

A confidence interval for the population mean response μY

… when the predictor value is xh

Again, what are we estimating?

(1-α)100% t-interval for mean response μY

Formula in words:

Sample estimate ± (t-multiplier × standard error)

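Formula in notation:

$$\hat{y}_h \pm t_{(\alpha/2,\, n-2)} \times \sqrt{MSE \left( \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)}$$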

Example: Skin cancer mortality and latitude

Predicted Values for New Observations

New Obs     Fit  SE Fit          95.0% CI          95.0% PI
      1  150.08    2.75  (144.56, 155.61)  (111.23, 188.93)

Values of Predictors for New Observations

New Obs   Lat
      1  40.0
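
As a quick check of the formula in words, the reported confidence interval can be taken apart using only the numbers in the Minitab output above. The sketch below recovers the t-multiplier that the output implies and then rebuilds the interval from its three ingredients. The variable names are illustrative, and the recovered multiplier is only as precise as the rounded output allows.

```python
# Numbers copied from the Minitab output for Lat = 40.0
fit = 150.08
se_fit = 2.75
ci_lower, ci_upper = 144.56, 155.61

# Sample estimate ± (t-multiplier × standard error)
# so the half-width of the interval equals t-multiplier × standard error
half_width = (ci_upper - ci_lower) / 2
t_multiplier = half_width / se_fit

# Rebuild the interval from estimate, multiplier, and standard error
rebuilt = (fit - t_multiplier * se_fit, fit + t_multiplier * se_fit)

print(f"implied t-multiplier: {t_multiplier:.3f}")
print(f"rebuilt 95% CI: ({rebuilt[0]:.2f}, {rebuilt[1]:.2f})")
```

The implied multiplier comes out close to 2, which is what a 95% t-interval with a moderately large number of degrees of freedom would use.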

Factors affecting the length of the confidence interval for μY

- As the confidence level decreases, …
- As MSE decreases, …
- As the sample size increases, …
- The more spread out the predictor values, …
- The closer xh is to the sample mean, …
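
One way to explore these factors is to compute the standard error of the fitted mean response directly and change one ingredient at a time. The following is a minimal sketch for simple linear regression; the function name `se_fit`, the predictor values, and the MSE values are all hypothetical and chosen only for illustration. The confidence level is not part of the standard error: it enters through the t-multiplier, which is smaller for a lower confidence level.

```python
import math

def se_fit(x, mse, xh):
    """Standard error of the fitted mean response at xh (simple linear regression)."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return math.sqrt(mse * (1 / n + (xh - xbar) ** 2 / sxx))

# Hypothetical predictor values and MSE (not from the slides)
x_base = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]   # sample mean is 5.5
baseline = se_fit(x_base, mse=4.0, xh=7.0)

smaller_mse = se_fit(x_base, mse=1.0, xh=7.0)       # MSE reduced
larger_n = se_fit(x_base * 4, mse=4.0, xh=7.0)      # same x pattern, n = 40
# Same mean (5.5) but predictor values twice as spread out
more_spread = se_fit([2 * xi - 5.5 for xi in x_base], mse=4.0, xh=7.0)
at_mean = se_fit(x_base, mse=4.0, xh=5.5)           # xh at the sample mean

for label, value in [("baseline", baseline), ("smaller MSE", smaller_mse),
                     ("larger n", larger_n), ("more spread in x", more_spread),
                     ("xh at the mean", at_mean)]:
    print(f"{label:>18}: {value:.3f}")
```

Each variation yields a smaller standard error than the baseline, and therefore a shorter interval for the same t-multiplier.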

Does the estimate of μY vary more when xh = 1 or when xh = 5.5?

Var          N  StDev
yhat(x=1)    5  2.127
yhat(x=5.5)  5  0.512
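
The same comparison can be sketched by simulation: draw many samples from one population line, refit the line each time, and record the fitted value at two predictor values. Everything below is an assumption made for illustration (the population intercept, slope, error standard deviation, and the predictor values 1 to 10, whose mean is 5.5); it is not the data behind the table above.

```python
import random
import statistics

random.seed(1)

# Hypothetical population line and error SD (not from the slides)
beta0, beta1, sigma = 10.0, 2.0, 3.0
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]   # sample mean of x is 5.5
xbar = sum(x) / len(x)
sxx = sum((xi - xbar) ** 2 for xi in x)

def fitted_value(y, xh):
    """Least-squares fitted value at xh for responses y observed at x."""
    ybar = sum(y) / len(y)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    return b0 + b1 * xh

yhat_at_1, yhat_at_mean = [], []
for _ in range(2000):
    # New sample from the same population line
    y = [beta0 + beta1 * xi + random.gauss(0, sigma) for xi in x]
    yhat_at_1.append(fitted_value(y, 1))
    yhat_at_mean.append(fitted_value(y, 5.5))

sd_at_1 = statistics.stdev(yhat_at_1)
sd_at_mean = statistics.stdev(yhat_at_mean)
print(f"StDev of yhat(x=1):   {sd_at_1:.3f}")
print(f"StDev of yhat(x=5.5): {sd_at_mean:.3f}")
```

In this setup the fitted value varies more at x = 1, far from the mean of the predictor values, than at x = 5.5.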

Example: Skin cancer mortality and latitude

Predicted Values for New Observations

New Obs     Fit  SE Fit        95.0% CI         95.0% PI
      1  150.08    2.75  (144.6, 155.6)  (111.2, 188.93)
      2  221.82    7.42  (206.9, 236.8)  (180.6, 263.07) X

X denotes a row with X values away from the center

Values of Predictors for New Observations

New Obs  Latitude
      1      40.0
      2      28.0

Mean of Lat = 39.533

When is it okay to use the confidence interval for μY formula?

- When xh is a value within the scope of the model – xh does not have to be one of the actual x values in the data set.
- When the “LINE” assumptions are met.
- The formula works okay even if the error terms are only approximately normal.
- If you have a large sample, the error terms can even deviate substantially from normality.

Prediction interval for a new response Ynew

Again, what are we predicting?

(1-α)100% prediction interval for new response Ynew

Formula in words:

Sample prediction ± (t-multiplier × standard error)

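Formula in notation:

$$\hat{y}_h \pm t_{(\alpha/2,\, n-2)} \times \sqrt{MSE \left( 1 + \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)}$$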

Example: Skin cancer mortality and latitude

Predicted Values for New Observations

New Obs     Fit  SE Fit          95.0% CI          95.0% PI
      1  150.08    2.75  (144.56, 155.61)  (111.23, 188.93)

Values of Predictors for New Observations

New Obs   Lat
      1  40.0
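
The output reports SE Fit but not the standard error behind the prediction interval. Because both intervals use the same t-multiplier, that standard error can be backed out from the printed endpoints. The sketch below does so, and then uses the relation (prediction standard error)² = MSE + (SE Fit)² to see what MSE the output implies. Only numbers from the output are used; the results are approximate because the output is rounded, and the variable names are illustrative.

```python
import math

# Numbers copied from the Minitab output for Lat = 40.0
se_fit = 2.75
ci_lower, ci_upper = 144.56, 155.61
pi_lower, pi_upper = 111.23, 188.93

# Both intervals share the same t-multiplier, so recover it from the CI
t_multiplier = (ci_upper - ci_lower) / 2 / se_fit

# Standard error used by the prediction interval
se_pred = (pi_upper - pi_lower) / 2 / t_multiplier

# se_pred^2 = MSE + se_fit^2, so the output implies a value for MSE
implied_mse = se_pred ** 2 - se_fit ** 2

print(f"standard error for prediction: {se_pred:.2f}")
print(f"implied MSE: {implied_mse:.1f} (root MSE about {math.sqrt(implied_mse):.1f})")
```

The prediction standard error is several times larger than SE Fit, which is why the prediction interval is so much wider than the confidence interval at the same latitude.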

When is it okay to use the prediction interval for Ynew formula?

- When xh is a value within the scope of the model – xh does not have to be one of the actual x values in the data set.
- When the “LINE” assumptions are met.
- The formula for the prediction interval depends strongly on the assumption that the error terms are normally distributed.

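What’s the difference in the two formulas?

Confidence interval for μY:

$$\hat{y}_h \pm t_{(\alpha/2,\, n-2)} \times \sqrt{MSE \left( \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)}$$

Prediction interval for Ynew:

$$\hat{y}_h \pm t_{(\alpha/2,\, n-2)} \times \sqrt{MSE \left( 1 + \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)}$$

The prediction interval’s standard error carries one extra MSE term: the “1” inside the parentheses.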

Prediction of Ynew if the mean μY is known

Suppose it were known that the mean skin cancer mortality at xh = 40° N is 150 deaths per million (with variance 400).

What is the predicted skin cancer mortality in Columbus, Ohio?
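
With the mean and variance known, the arithmetic is short enough to sketch. The block below assumes that mortality at this latitude is normally distributed (an assumption, in line with the “LINE” conditions) and uses the standard normal quantile for a 95% interval. The variable names are illustrative.

```python
from statistics import NormalDist

mean_mortality = 150.0   # known mean at 40° N, deaths per million (from the slide)
variance = 400.0         # known variance (from the slide)
sd = variance ** 0.5     # 20 deaths per million

# Assumption: responses at this latitude are normally distributed
z = NormalDist().inv_cdf(0.975)   # about 1.96 for a 95% interval
lower = mean_mortality - z * sd
upper = mean_mortality + z * sd

print(f"prediction: {mean_mortality:.0f}")
print(f"95% prediction interval: ({lower:.1f}, {upper:.1f})")
# prints an interval of (110.8, 189.2)
```

The best single prediction is the known mean, 150, and the width of the interval comes entirely from the variation in Y itself, since nothing has to be estimated.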

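And then reality sets in

- The mean μY is not known.
  - Estimate it with the predicted response $\hat{y}_h$.
  - The cost of using $\hat{y}_h$ to estimate μY is the variance of $\hat{y}_h$.
- The variance σ² is not known.
  - Estimate it with MSE.

Variance of the prediction

The variation in the prediction of a new response depends on two components:

1. the variation due to estimating the mean μY with $\hat{y}_h$
2. the variation in Y

Adding the two components gives

$$\sigma^2 \left( \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right) + \sigma^2 = \sigma^2 \left( 1 + \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)$$

which is estimated by:

$$MSE \left( 1 + \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)$$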

What’s the effect of the difference in the two formulas?

- A (1-α)100% confidence interval for μY at xh will always be narrower than a (1-α)100% prediction interval for Ynew at xh.
- The confidence interval’s standard error can approach 0, whereas the prediction interval’s standard error cannot get close to 0.
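
Both points can be seen numerically by letting the sample size grow while holding everything else fixed. The sketch below is a hypothetical setup, not taken from the slides: predictor values equally spaced on [0, 10], MSE fixed at 4, and xh = 7. The function name `standard_errors` is illustrative.

```python
import math

def standard_errors(n, mse, xh):
    """Return (se_fit, se_pred) at xh for n equally spaced x values on [0, 10]."""
    x = [10 * i / (n - 1) for i in range(n)]
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    mean_part = 1 / n + (xh - xbar) ** 2 / sxx      # term shared by both formulas
    se_fit = math.sqrt(mse * mean_part)             # confidence interval
    se_pred = math.sqrt(mse * (1 + mean_part))      # prediction interval
    return se_fit, se_pred

mse = 4.0  # hypothetical; sqrt(MSE) = 2 is the floor for se_pred
results = {n: standard_errors(n, mse, xh=7.0) for n in (10, 100, 1000, 10000)}
for n, (se_fit, se_pred) in results.items():
    print(f"n={n:>5}: se_fit={se_fit:.3f}  se_pred={se_pred:.3f}")
```

As n grows, `se_fit` shrinks toward 0 while `se_pred` levels off just above the square root of MSE, so the prediction interval stays wide no matter how much data is collected.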

Confidence intervals and prediction intervals for response in Minitab

- Stat >> Regression >> Regression …
- Specify response and predictor(s).
- Select Options…
- In “Prediction intervals for new observations” box, specify either the X value or a column name containing multiple X values.
- Specify confidence level (default is 95%).

- Click OK in the Options dialog, then click OK in the Regression dialog.
- Results appear in session window.
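
For readers who want to see what the software is computing, the same quantities can be sketched in plain Python. This is a minimal illustration of the two interval formulas, not Minitab's implementation. The function name `intervals` and the data are hypothetical (this is not the skin cancer data set), and the t-multiplier is passed in by the caller, here taken from a t-table for 8 degrees of freedom.

```python
import math

def intervals(x, y, xh, t_mult):
    """Fit y = b0 + b1*x by least squares; return fit, SE fit, CI and PI at xh.

    t_mult is the t-multiplier t(alpha/2, n-2), supplied by the caller.
    """
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    # MSE: sum of squared residuals divided by n - 2
    mse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)

    fit = b0 + b1 * xh
    se_fit = math.sqrt(mse * (1 / n + (xh - xbar) ** 2 / sxx))
    se_pred = math.sqrt(mse + se_fit ** 2)
    ci = (fit - t_mult * se_fit, fit + t_mult * se_fit)
    pi = (fit - t_mult * se_pred, fit + t_mult * se_pred)
    return fit, se_fit, ci, pi

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [12.1, 13.8, 16.2, 17.9, 20.3, 21.7, 24.4, 25.9, 28.2, 29.8]

# n = 10, so n - 2 = 8 degrees of freedom; a t-table gives 2.306 for 95%
fit, se_fit, ci, pi = intervals(x, y, xh=4.5, t_mult=2.306)
print(f"Fit {fit:.2f}  SE Fit {se_fit:.3f}")
print(f"95% CI ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"95% PI ({pi[0]:.2f}, {pi[1]:.2f})")
```

Passing the multiplier in keeps the sketch free of third-party libraries; a statistics package would normally look it up from the confidence level and the degrees of freedom.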

Example in Minitab: Skin cancer mortality and latitude

Predicted Values for New Observations

New Obs     Fit  SE Fit        95.0% CI         95.0% PI
      1  150.08    2.75  (144.6, 155.6)  (111.2, 188.93)
      2  221.82    7.42  (206.9, 236.8)  (180.6, 263.07) X

X denotes a row with X values away from the center

Values of Predictors for New Observations

New Obs  Latitude
      1      40.0
      2      28.0

Mean of Lat = 39.533

A plot of the confidence interval and prediction interval in Minitab

- Stat >> Regression >> Fitted line plot …
- Specify predictor and response.
- Under Options …
  - Select Display confidence bands.
  - Select Display prediction bands.
  - Specify desired confidence level (95% default).
- Select OK in the Options dialog, then select OK in the Fitted line plot dialog.