Slides Prepared by
This presentation is the property of its rightful owner.
Sponsored Links
1 / 61

Slides Prepared by JOHN S. LOUCKS St. Edward’s University PowerPoint PPT Presentation


  • 78 Views
  • Uploaded on
  • Presentation posted in: General

Slides Prepared by JOHN S. LOUCKS St. Edward’s University. Chapter 14 Simple Linear Regression. Simple Linear Regression Model Least Squares Method Coefficient of Determination Model Assumptions Testing for Significance Excel’s Regression Tool Using the Estimated Regression Equation

Download Presentation

Slides Prepared by JOHN S. LOUCKS St. Edward’s University

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Slides prepared by john s loucks st edward s university

Slides Prepared by

JOHN S. LOUCKS

St. Edward’s University


Chapter 14 simple linear regression

Chapter 14 Simple Linear Regression

  • Simple Linear Regression Model

  • Least Squares Method

  • Coefficient of Determination

  • Model Assumptions

  • Testing for Significance

  • Excel’s Regression Tool

  • Using the Estimated Regression Equation

    for Estimation and Prediction

  • Residual Analysis: Validating Model Assumptions

  • Outliers and Influential Observations


The simple linear regression model

The Simple Linear Regression Model

  • Simple Linear Regression Model

    y = 0 + 1x+ 

  • Simple Linear Regression Equation

    E(y) = 0 + 1x

  • Estimated Simple Linear Regression Equation

    y = b0 + b1x

^


Least squares method

Least Squares Method

  • Least Squares Criterion

    where:

    yi = observed value of the dependent variable

    for the ith observation

    yi = estimated value of the dependent variable

    for the ith observation

^


The least squares method

The Least Squares Method

  • Slope for the Estimated Regression Equation

  • y-Intercept for the Estimated Regression Equation

    b0 = y - b1x

    where:

    xi = value of independent variable for ith observation

    yi = value of dependent variable for ith observation

    x = mean value for independent variable

    y = mean value for dependent variable

    n = total number of observations

_

_

_

_


Example reed auto sales

Example: Reed Auto Sales

  • Simple Linear Regression

    Reed Auto periodically has a special week-long sale. As part of the advertising campaign Reed runs one or more television commercials during the weekend preceding the sale. Data from a sample of 5 previous sales are shown below.

    Number of TV AdsNumber of Cars Sold

    114

    324

    218

    117

    327


Example reed auto sales1

Example: Reed Auto Sales

  • Slope for the Estimated Regression Equation

    b1 = 220 - (10)(100)/5 = 5

    24 - (10)2/5

  • y-Intercept for the Estimated Regression Equation

    b0 = 20 - 5(2) = 10

  • Estimated Regression Equation

    y = 10 + 5x

^


Using excel to develop a scatter diagram and compute the estimated regression equation

Using Excel to Develop a Scatter Diagramand Compute the Estimated Regression Equation

  • Formula Worksheet (showing data)


Using excel to develop a scatter diagram and compute the estimated regression equation1

Using Excel to Develop a Scatter Diagramand Compute the Estimated Regression Equation

  • Producing a Scatter Diagram

    Step 1 Select cells B1:C6

    Step 2 Select the Chart Wizard

    Step 3 When the Chart Type dialog box appears:

    Choose XY (Scatter) in the Chart type list

    Choose Scatter from the Chart sub-type display

    Select Next >

    Step 4 When the Chart Source Data dialog box appears

    Select Next >

    … continued


Using excel to develop a scatter diagram and compute the estimated regression equation2

Using Excel to Develop a Scatter Diagramand Compute the Estimated Regression Equation

  • Producing a Scatter Diagram

    Step 5 When the Chart Options dialog box appears:

    Select the Titles tab and then

    Delete Cars Sold in the Chart title box

    Enter TV Ads in the Value (X) axis box

    Enter Cars Sold in the Value (Y) axis box

    Select the Legend tab and then

    Remove the check in the Show Legend box

    Select Next >

    … continued


Using excel to develop a scatter diagram and compute the estimated regression equation3

Using Excel to Develop a Scatter Diagramand Compute the Estimated Regression Equation

  • Producing a Scatter Diagram

    Step 6 When the Chart Location dialog box appears:

    Specify the location for the new chart

    Select Finish to display the scatter diagram


Using excel to develop a scatter diagram and compute the estimated regression equation4

Using Excel to Develop a Scatter Diagramand Compute the Estimated Regression Equation

  • Adding the Trendline

    Step 1 Position the mouse pointer over any data

    point and right click to display the Chart menu

    Step 2 Select the Add Trendline option

    Step 3 When the Add Trendline dialog box appears:

    On the Type tab select Linear

    On the Options tab select the Displayequation on chart box

    Click OK


Using excel to develop a scatter diagram and compute the estimated regression equation5

Using Excel to Develop a Scatter Diagramand Compute the Estimated Regression Equation

  • Scatter Diagram


The coefficient of determination

^

^

The Coefficient of Determination

  • Relationship Among SST, SSR, SSE

    SST = SSR + SSE

  • Coefficient of Determination

    r2 = SSR/SST

    where:

    SST = total sum of squares

    SSR = sum of squares due to regression

    SSE = sum of squares due to error


Example reed auto sales2

Example: Reed Auto Sales

  • Coefficient of Determination

    r2 = SSR/SST = 100/114 = .8772

    The regression relationship is very strong since 88% of the variation in number of cars sold can be explained by the linear relationship between the number of TV ads and the number of cars sold.


Using excel to compute the coefficient of determination

Using Excel to Computethe Coefficient of Determination

  • Producing R2

    Step 1 Position the mouse pointer over any data

    point in the scatter diagram and right click

    Step 2 When the Chart menu appears:

    Select the Add Trendline option

    Step 3 When the Add Trendline dialog box appears:

    On the Options tab, select the Display R- squared value on chart box

    Click OK


Using excel to compute the coefficient of determination1

Using Excel to Computethe Coefficient of Determination

  • Value Worksheet (showing R2)


The correlation coefficient

The Correlation Coefficient

  • Sample Correlation Coefficient

    where:

    b1 = the slope of the estimated regression

    equation


Example reed auto sales3

Example: Reed Auto Sales

  • Sample Correlation Coefficient

    The sign of b1 in the equation is “+”.

    rxy = +.9366


Model assumptions

Model Assumptions

  • Assumptions About the Error Term 

    • The error  is a random variable with mean of zero.

    • The variance of  , denoted by  2, is the same for all values of the independent variable.

    • The values of  are independent.

    • The error  is a normally distributed random variable.


Testing for significance

Testing for Significance

  • To test for a significant regression relationship, we must conduct a hypothesis test to determine whether the value of b1 is zero.

  • Two tests are commonly used

    • t Test

    • F Test

  • Both tests require an estimate of s2, the variance of e in the regression model.


Testing for significance1

Testing for Significance

  • An Estimate of s2

    The mean square error (MSE) provides the estimate

    of s2, and the notation s2 is also used.

    s2 = MSE = SSE/(n-2)

    where:


Testing for significance2

Testing for Significance

  • An Estimate of s

    • To estimate s we take the square root of s 2.

    • The resulting s is called the standard error of the estimate.


Testing for significance t test

Testing for Significance: t Test

  • Hypotheses

    H0: 1 = 0

    Ha: 1 = 0

  • Test Statistic

  • Rejection Rule

    Reject H0 if t < -tor t > t

    where tis based on a t distribution with

    n - 2 degrees of freedom.


Example reed auto sales4

Example: Reed Auto Sales

  • t Test

    • Hypotheses H0: 1 = 0

      Ha: 1 = 0

    • Rejection Rule

      For  = .05 and d.f. = 3, t.025 = 3.182

      Reject H0 if t > 3.182

    • Test Statistics

      t = 5/1.08 = 4.63

    • Conclusions

      Reject H0


Confidence interval for 1

Confidence Interval for 1

  • We can use a 95% confidence interval for 1 to test the hypotheses just used in the t test.

  • H0 is rejected if the hypothesized value of 1 is not included in the confidence interval for 1.


Confidence interval for 11

Confidence Interval for 1

  • The form of a confidence interval for 1 is:

    where b1 is the point estimate

    is the margin of error

    is the t value providing an area

    of a/2 in the upper tail of a

    t distribution with n - 2 degrees

    of freedom


Example reed auto sales5

Example: Reed Auto Sales

  • Rejection Rule

    Reject H0 if 0 is not included in the confidence interval for 1.

  • 95% Confidence Interval for 1

    = 5 +/- 3.182(1.08) = 5 +/- 3.44

    or 1.56 to 8.44

  • Conclusion

    Reject H0


Slides prepared by john s loucks st edward s university

Testing for Significance: F Test

  • Hypotheses

    H0: 1 = 0

    Ha: 1 = 0

  • Test Statistic

    F = MSR/MSE

  • Rejection Rule

    Reject H0 if F > F

    where F is based on an F distribution with 1 d.f. in

    the numerator and n - 2 d.f. in the denominator.


Slides prepared by john s loucks st edward s university

Example: Reed Auto Sales

  • F Test

    • Hypotheses H0: 1 = 0

      Ha: 1 = 0

    • Rejection Rule

    • For  = .05 and d.f. = 1, 3: F.05 = 10.13

    • Reject H0 if F > 10.13.

    • Test Statistic

    • F = MSR/MSE = 100/4.667 = 21.43

    • Conclusion

    • We can reject H0.


Some cautions about the interpretation of significance tests

Some Cautions about theInterpretation of Significance Tests

  • Rejecting H0: b1 = 0 and concluding that the relationship between x and y is significant does not enable us to conclude that a cause-and-effect relationship is present between x and y.

  • Just because we are able to reject H0: b1 = 0 and demonstrate statistical significance does not enable us to conclude that there is a linear relationship between x and y.


Using excel s regression tool

Using Excel’s Regression Tool

  • Up to this point, you have seen how Excel can be used for various parts of a regression analysis.

  • Excel also has a comprehensive tool in its Data Analysis package called Regression.

  • The Regression tool can be used to perform a complete regression analysis.


Using excel s regression tool1

Using Excel’s Regression Tool

  • Formula Worksheet (showing data)


Using excel s regression tool2

Using Excel’s Regression Tool

  • Performing the Regression Analysis

    Step 1 Select the Tools pull-down menu

    Step 2 Choose the Data Analysis option

    Step 3 Choose Regression from the list of

    Analysis Tools

    … continued


Using excel s regression tool3

Using Excel’s Regression Tool

  • Performing the Regression Analysis

    Step 4 When the Regression dialog box appears:

    Enter C1:C6 in the Input Y Range box

    Enter B1:B6 in the Input X Range box

    Select Labels

    Select Confidence Level

    Enter 95 in the Confidence Level box

    Select Output Range

    Enter A9 (any cell) in the Ouput Range box

    Click OK to begin the regression analysis


Using excel s regression tool4

Using Excel’s Regression Tool

  • Value Worksheet

Data

Regression Statistics Output

ANOVA Output

Regression Equation Output


Using excel s regression tool5

Using Excel’s Regression Tool

  • Estimated Regression Equation Output (left portion)

Note: Columns F-I are not shown.


Using excel s regression tool6

Using Excel’s Regression Tool

  • Estimated Regression Equation Output (right portion)

Note: Columns C-E are hidden.


Using excel s regression tool7

Using Excel’s Regression Tool

  • ANOVA Output


Using excel s regression tool8

Using Excel’s Regression Tool

  • Regression Statistics Output


Slides prepared by john s loucks st edward s university

Using the Estimated Regression Equationfor Estimation and Prediction

  • Confidence Interval Estimate of E(yp)

  • Prediction Interval Estimate of yp

    yp+t/2 sind

    where the confidence coefficient is 1 -  and

    t/2 is based on a t distribution with n - 2 d.f.


Example reed auto sales6

Example: Reed Auto Sales

  • Point Estimation

    If 3 TV ads are run prior to a sale, we expect the mean number of cars sold to be:

    y = 10 + 5(3) = 25 cars

  • Confidence Interval for E(yp)

    95% confidence interval estimate of the mean number of cars sold when 3 TV ads are run is:

    25 + 4.61 = 20.39 to 29.61 cars

  • Prediction Interval for yp

    95% prediction interval estimate of the number of cars sold in one particular week when 3 TV ads are run is: 25 + 8.28 = 16.72 to 33.28 cars

^


Using excel to develop confidence and prediction interval estimates

Using Excel to Develop Confidence and Prediction Interval Estimates

  • Formula Worksheet (confidence interval portion)


Using excel to develop confidence and prediction interval estimates1

Using Excel to Develop Confidence and Prediction Interval Estimates

  • Value Worksheet (confidence interval portion)


Using excel to develop confidence and prediction interval estimates2

Using Excel to Develop Confidence and Prediction Interval Estimates

  • Formula Worksheet (prediction interval portion)


Using excel to develop confidence and prediction interval estimates3

Using Excel to Develop Confidence and Prediction Interval Estimates

  • Value Worksheet (prediction interval portion)


Residual analysis

Residual Analysis

  • If the assumptions about the error term e appear questionable, the hypothesis tests about the significance of the regression relationship and the interval estimation results may not be valid.

  • The residuals provide the best information about e.

  • Much of the residual analysis is based on an examination of graphical plots.


Residual plot against x

Residual Plot Against x

  • If the assumption that the variance of e is the same for all values of x is valid, and the assumed regression model is an adequate representation of the relationship between the variables:

    The residual plot should give an overall

    impression of a horizontal band of points


Slides prepared by john s loucks st edward s university

Example: Reed Auto Sales

  • Residuals


Using excel s regression tool to construct a residual plot

Using Excel’s Regression Tool to Construct a Residual Plot

  • Producing a Residual Plot

    • The steps outlined earlier to obtain the regression output are performed with one change.

    • When the Regression dialog box appears, we must also select the Residual Plot option.

    • The output will include two new items:

      • A plot of the residuals against the independent variable, and

      • A list of predicted values of y and the corresponding residual values.


Using excel s regression tool to construct a residual plot1

Using Excel’s Regression Tool to Construct a Residual Plot

  • Value Worksheet (showing Residual Plot)


Using excel s regression tool to construct a residual plot2

Using Excel’s Regression Tool to Construct a Residual Plot

  • Value Worksheet (Residual Output portion)


Standardized residuals

Standardized Residuals

  • Standardized Residual for Observation i

    where:


Standardized residual plot

Standardized Residual Plot

  • The standardized residual plot can provide insight about the assumption that the error term e has a normal distribution.

  • If this assumption is satisfied, the distribution of the standardized residuals should appear to come from a standard normal probability distribution.


Using excel to construct a standardized residual plot

Using Excel to Construct a Standardized Residual Plot

  • Excel’s Regression tool be used to obtain the standardized residuals.

  • The steps described earlier in order to conduct a regression analysis are performed with one change:

    • When the Regression dialog box appears, we must select the Standardized Residuals option

  • The Standardized Residuals option does not automatically produce a standardized residual plot.


Using excel to construct a standardized residual plot1

Using Excel to Construct a Standardized Residual Plot

  • Value Worksheet


Using excel to construct a standardized residual plot2

Using Excel to Construct a Standardized Residual Plot

  • Excel’s Chart Wizard can be used to construct the standardized residual plot.

  • A scatter diagram is developed in which:

    • The values of the independent variable are placed on the horizontal axis

    • The values of the standardized residuals are placed on the vertical axis


Using excel to construct a standardized residual plot3

Using Excel to Construct a Standardized Residual Plot

  • Standardized Residual Plot


Standardized residual plot1

Standardized Residual Plot

  • All of the standardized residuals are between –1.5 and +1.5 indicating that there is no reason to question the assumption that e has a normal distribution.


Outliers and influential observations

Outliers and Influential Observations

  • Detecting Outliers

    • An outlier is an observation that is unusual in comparison with the other data.

    • Minitab classifies an observation as an outlier if its standardized residual value is < -2 or > +2.

    • This standardized residual rule sometimes fails to identify an unusually large observation as being an outlier.

    • This rule’s shortcoming can be circumvented by using studentized deleted residuals.

    • The |i th studentized deleted residual| will be larger than the |i th standardized residual|.


End of chapter 14

End of Chapter 14


  • Login