
# Chapter 12 - PowerPoint PPT Presentation



### Chapter 12

Multiple Regression Analysis and Model Building

### Chapter 12 - Chapter Outcomes

After studying the material in this chapter, you should be able to:

Understand the general concepts behind model building using multiple regression analysis.

Apply multiple regression analysis to business decision-making situations.

Analyze the computer output for a multiple regression model and test the significance of the independent variables in the model.

### Chapter 12 - Chapter Outcomes (continued)

After studying the material in this chapter, you should be able to:

Recognize potential problems when using multiple regression analysis and take the steps to correct the problems.

Incorporate qualitative variables into the regression model by using dummy variables.

### Multiple Regression Analysis

SIMPLE LINEAR REGRESSION MODEL (POPULATION MODEL)

$$y = \beta_0 + \beta_1 x + \varepsilon$$

where:

y = Value of the dependent variable

x = Value of the independent variable

$\beta_0$ = Population’s y-intercept

$\beta_1$ = Slope of the population regression line

$\varepsilon$ = Error term, or residual

### Multiple Regression Analysis

ESTIMATED SIMPLE LINEAR REGRESSION MODEL

$$\hat{y} = b_0 + b_1 x$$

where:

$b_0$ = Estimated y-intercept

$b_1$ = Estimated slope coefficient

### Multiple Regression Analysis

A residual or prediction error is the difference between the actual value of y and the predicted value of y.
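As a minimal sketch of this definition, the code below fits a least squares line to a small hypothetical data set (the numbers are illustrative, not from the chapter) and computes each residual as the actual y minus the predicted y. The function name `fit_simple_regression` is an assumption made for this example.

```python
def fit_simple_regression(x, y):
    """Return (b0, b1) for the least squares line y_hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical sample data, not taken from the chapter.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_simple_regression(x, y)
predicted = [b0 + b1 * xi for xi in x]

# Residual = actual y minus predicted y.
residuals = [yi - y_hat for yi, y_hat in zip(y, predicted)]

print("b0 =", b0, "b1 =", b1)
print("residuals =", residuals)
```

For a least squares line with an intercept, the residuals sum to zero (up to rounding), which is a quick sanity check on the fit.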

### Multiple Regression Analysis

The standard error of the estimate refers to the standard deviation of the model errors. The standard error measures the dispersion of the actual values of the dependent variable around the fitted regression plane.

### Multiple Regression Analysis

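MULTIPLE REGRESSION MODEL (POPULATION MODEL)

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$$

where:

$\beta_0$ = Population’s regression constant

$\beta_j$ = Population’s regression coefficient for variable j; j = 1, 2, …, k

k = Number of independent variables

$\varepsilon$ = Model error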

### Multiple Regression Analysis

ESTIMATED MULTIPLE REGRESSION MODEL

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k$$
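One way to obtain the estimated coefficients b0, b1, …, bk is ordinary least squares. The sketch below assumes NumPy is available and uses hypothetical, noise-free data generated from known coefficients so the result is easy to check; real data would include error and the estimates would only approximate the population values.

```python
import numpy as np

# Hypothetical data: y is built from known coefficients (1, 2, 3) with no noise.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0])
y = 1.0 + 2.0 * x1 + 3.0 * x2

# Design matrix: a column of ones for the constant, then one column per variable.
X = np.column_stack([np.ones_like(x1), x1, x2])

# Least squares solution of X b = y; b holds (b0, b1, b2).
b, _, _, _ = np.linalg.lstsq(X, y, rcond=None)

# Predicted values from the estimated model.
y_hat = X @ b

print("estimated coefficients:", b)
```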

### Multiple Regression Analysis

A model is a representation of an actual system using either a physical or mathematical portrayal.

Model Specification
• Decide what you want to do and select the dependent variable.
• List the potential independent variables for your model.
• Gather the sample data (observations) for all variables.

### Multiple Regression Analysis

The correlation coefficient is a quantitative measure of the strength of the linear relationship between two variables. The correlation coefficient, r, ranges between -1.0 and +1.0.

### Multiple Regression Analysis

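CORRELATION COEFFICIENT

One x variable with y

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\left[\sum (x - \bar{x})^2\right]\left[\sum (y - \bar{y})^2\right]}}$$

or

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$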

### Multiple Regression Analysis

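CORRELATION COEFFICIENT

One x variable with another x

$$r = \frac{\sum (x_i - \bar{x}_i)(x_j - \bar{x}_j)}{\sqrt{\left[\sum (x_i - \bar{x}_i)^2\right]\left[\sum (x_j - \bar{x}_j)^2\right]}}$$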
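The correlation coefficient can be computed directly from deviations about the means. The following is a small sketch in plain Python; the function name `correlation` and the data are assumptions for illustration. The same function applies whether the second variable is y or another x.

```python
import math


def correlation(x, y):
    """Pearson correlation coefficient r between two equal-length sequences."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical data, not from the chapter.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_xy = correlation(x, y)
r_perfect = correlation(x, [2 * xi for xi in x])     # perfect positive linear relationship
r_inverse = correlation(x, [10 - 2 * xi for xi in x])  # perfect negative linear relationship

print(r_xy, r_perfect, r_inverse)
```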

### Multiple Regression Analysis (Example)

Multiple Regression Model:

House Characteristics:
• x1 = Square feet = 2,100
• x2 = Age = 15
• x3 = Number of bedrooms = 4
• x4 = Number of baths = 3
• x5 = Size of garage = 2

Point Estimate for Sale Price:

### Coefficient of Determination

MULTIPLE COEFFICIENT OF DETERMINATION

The percentage of variation in the dependent variable explained by the independent variables in the regression model:

$$R^2 = \frac{SSR}{SST}$$

where:

SSR = Sum of squares regression

SST = Total sum of squares
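The multiple coefficient of determination is conventionally the ratio of the regression sum of squares to the total sum of squares. The sketch below computes it from actual and predicted values; the data are hypothetical, and the predictions are assumed to come from a least squares fit with a constant term (so that SSR = SST - SSE holds).

```python
def r_squared(y, y_hat):
    """R^2 = SSR / SST for predictions from a least squares fit with an intercept."""
    y_bar = sum(y) / len(y)
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # sum of squares error
    ssr = sst - sse                                        # sum of squares regression
    return ssr / sst


# Hypothetical actual values and the least squares predictions for them.
y = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]

r2 = r_squared(y, y_hat)
print("R-squared =", r2)
```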

### Model Diagnosis

• Is the overall model significant?
• Are the individual variables significant?
• Is the standard deviation of the model error too large to provide meaningful results?
• Is multicollinearity a problem?

### Is the Model Significant?

$$H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$$

$$H_A: \text{At least one } \beta_i \neq 0$$

If the null hypothesis is true, the overall regression model is not useful for predictive purposes.

### Is the Model Significant?

F-TEST STATISTIC

$$F = \frac{SSR / k}{SSE / (n - k - 1)}$$

where:

SSR = Sum of squares regression

SSE = Sum of squares error

n = Number of data points

k = Number of independent variables

Degrees of freedom: $D_1 = k$ and $D_2 = n - k - 1$
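The F statistic for overall significance is the mean square regression, SSR / k, divided by the mean square error, SSE / (n - k - 1). A small sketch with hypothetical sums of squares follows; the function name `f_statistic` is an assumption, and the resulting value would still need to be compared with a critical F value for D1 and D2 degrees of freedom from a table or statistics package.

```python
def f_statistic(ssr, sse, n, k):
    """F = (SSR / k) / (SSE / (n - k - 1)), with D1 = k and D2 = n - k - 1."""
    d1 = k
    d2 = n - k - 1
    f = (ssr / d1) / (sse / d2)
    return f, d1, d2


# Hypothetical values, not from the chapter.
f, d1, d2 = f_statistic(ssr=80.0, sse=20.0, n=25, k=4)
print("F =", f, "with", d1, "and", d2, "degrees of freedom")
```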

### Is the Model Significant?

ADJUSTED R-SQUARED

Adjusted R-squared is a measure of the percentage of explained variation in the dependent variable that takes into account the relationship between the number of cases and the number of independent variables in the regression model.

$$R_{adj}^2 = 1 - (1 - R^2)\left(\frac{n - 1}{n - k - 1}\right)$$

where:

n = Number of data points

k = Number of independent variables
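Adjusted R-squared is commonly computed as 1 - (1 - R²)(n - 1)/(n - k - 1). The sketch below uses hypothetical inputs to show how the adjustment penalizes a model for each added independent variable when R-squared does not rise.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Hypothetical values: same R-squared and sample size, different model sizes.
adj_small = adjusted_r_squared(r2=0.8, n=25, k=4)
adj_large = adjusted_r_squared(r2=0.8, n=25, k=8)

print(adj_small, adj_large)
```

With R-squared held fixed, the model with more independent variables receives the lower adjusted value.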

### Are the Individual Variables Significant?

t-TEST FOR SIGNIFICANCE OF EACH REGRESSION COEFFICIENT

$$t = \frac{b_i - 0}{s_{b_i}}$$

where:

$b_i$ = Sample slope coefficient for the ith independent variable

$s_{b_i}$ = Estimate of the standard error for the ith sample slope coefficient

n - k - 1 = Degrees of freedom

### Are the Individual Variables Significant? (From Figure 12-7)

$\alpha/2 = 0.01$ in each tail

Decision Rule: If $-2.364 \le t \le 2.364$, accept $H_0$; otherwise, reject $H_0$.
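The test for an individual coefficient divides the sample slope coefficient by its standard error and compares the result with the critical t-value. The sketch below applies the decision rule with the critical value 2.364 quoted on the slide; the coefficient and standard error values are hypothetical, and the function names are assumptions for illustration.

```python
def t_statistic(b_i, s_b_i):
    """t = (b_i - 0) / s_b_i for testing H0: beta_i = 0."""
    return (b_i - 0.0) / s_b_i


def reject_null(t, t_critical):
    """Two-tailed rule: reject H0 when t falls outside [-t_critical, t_critical]."""
    return abs(t) > t_critical


T_CRITICAL = 2.364  # critical value from the slide's decision rule

# Hypothetical coefficients and standard errors, not from the chapter.
t_a = t_statistic(b_i=10.0, s_b_i=4.0)
t_b = t_statistic(b_i=5.0, s_b_i=4.0)

print(t_a, reject_null(t_a, T_CRITICAL))
print(t_b, reject_null(t_b, T_CRITICAL))
```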

### Is the Standard Deviation of the Regression Model Too Large?

ESTIMATE FOR THE STANDARD DEVIATION OF THE MODEL

$$s_\varepsilon = \sqrt{\frac{SSE}{n - k - 1}}$$

where:

SSE = Sum of squares error

n = Sample size

k = Number of independent variables
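The estimate of the model's standard deviation is the square root of SSE divided by its degrees of freedom, n - k - 1. A brief sketch with hypothetical values:

```python
import math


def model_std_dev(sse, n, k):
    """Estimated standard deviation of the model error: sqrt(SSE / (n - k - 1))."""
    return math.sqrt(sse / (n - k - 1))


# Hypothetical values, not from the chapter.
s_e = model_std_dev(sse=80.0, n=25, k=4)
print("standard error of the estimate =", s_e)
```

Whether the result is "too large" depends on the units and typical size of the dependent variable, so the number has to be judged against the intended use of the predictions.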

### Is Multicollinearity A Problem?

Multicollinearity refers to the situation when high correlation exists between two independent variables. This means the two variables contribute redundant information to the multiple regression model. When highly correlated independent variables are included in the regression model, they can adversely affect the regression results.

Some Indications of Severe Multicollinearity
• Incorrect signs on the coefficients.
• A sizable change in the values of the previous coefficients when a new variable is added to the model.
• A variable that was previously significant in the model becomes insignificant when a new independent variable is added.
• The estimate of the standard deviation of the model increases when a variable is added to the model.

### Is Multicollinearity A Problem?

The variance inflation factor is a measure of how much the variance of an estimated regression coefficient increases if the independent variables are correlated. A VIF equal to one for a given independent variable indicates that this independent variable is not correlated with the remaining independent variables in the model. The greater the multicollinearity, the larger the VIF will be.

### Is Multicollinearity A Problem?

VARIANCE INFLATION FACTOR

$$VIF = \frac{1}{1 - R_j^2}$$

where:

$R_j^2$ = Coefficient of determination when the jth independent variable is regressed against the remaining k - 1 independent variables.
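The variance inflation factor for variable j is 1 / (1 - R_j²), where R_j² comes from regressing that variable on the other independent variables. The sketch below assumes NumPy and uses two small hypothetical data sets: one with uncorrelated variables and one with nearly collinear variables. The function name `vif` is an assumption for this example.

```python
import numpy as np


def vif(X):
    """Return the VIF of each column of X (rows = observations, columns = variables)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    factors = []
    for j in range(k):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress variable j on the remaining variables plus a constant.
        A = np.column_stack([np.ones(n), others])
        coef, _, _, _ = np.linalg.lstsq(A, target, rcond=None)
        sse = np.sum((target - A @ coef) ** 2)
        sst = np.sum((target - target.mean()) ** 2)
        r2_j = 1.0 - sse / sst
        factors.append(1.0 / (1.0 - r2_j))
    return factors


# Hypothetical uncorrelated variables: each VIF should be close to 1.
uncorrelated = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]])
vif_uncorrelated = vif(uncorrelated)

# Hypothetical nearly collinear variables: each VIF should be large.
collinear = np.column_stack([[1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.9, 5.1]])
vif_collinear = vif(collinear)

print(vif_uncorrelated, vif_collinear)
```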

### Multiple Regression Analysis

CONFIDENCE INTERVAL FOR THE REGRESSION COEFFICIENT

$$b_i \pm t_{\alpha/2}\, s_{b_i}$$

where:

$b_i$ = Point estimate for the regression coefficient $\beta_i$

$t_{\alpha/2}$ = Critical t-value for a $1 - \alpha$ confidence interval

$s_{b_i}$ = The standard error of the ith regression coefficient
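A confidence interval for a regression coefficient is the point estimate plus and minus the critical t-value times the coefficient's standard error. The sketch below uses hypothetical inputs; the critical value must come from a t table or statistics package for n - k - 1 degrees of freedom.

```python
def coefficient_interval(b_i, s_b_i, t_critical):
    """Return (lower, upper) limits of b_i +/- t_critical * s_b_i."""
    margin = t_critical * s_b_i
    return b_i - margin, b_i + margin


# Hypothetical coefficient, standard error, and critical t-value.
lower, upper = coefficient_interval(b_i=10.0, s_b_i=4.0, t_critical=2.364)
print("interval:", lower, "to", upper)
```

If the interval excludes zero, the two-tailed t-test at the same significance level rejects the hypothesis that the coefficient is zero.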

### Using Qualitative Independent Variables

A dummy variable is a variable that is assigned a value equal to 0 or 1 depending on whether the observation possesses a given characteristic or not.
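As a sketch of how a qualitative variable enters the model, the code below converts a yes/no characteristic into a 0/1 dummy variable and includes it as an ordinary column in a least squares fit. It assumes NumPy; the data are hypothetical and noise-free so the recovered coefficients can be checked, and the variable names are illustrative only.

```python
import numpy as np

# Hypothetical data: a quantitative variable and a yes/no characteristic.
experience = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
has_mba = ["yes", "no", "yes", "no", "no", "yes"]

# Dummy variable: 1 if the observation has the characteristic, 0 otherwise.
mba = np.array([1.0 if value == "yes" else 0.0 for value in has_mba])

# Noise-free response built from known coefficients (50, 2, 10) for illustration.
salary = 50.0 + 2.0 * experience + 10.0 * mba

X = np.column_stack([np.ones_like(experience), experience, mba])
b, _, _, _ = np.linalg.lstsq(X, salary, rcond=None)

print("b0, b1, b2 =", b)
```

The coefficient on the dummy variable is the estimated shift in the dependent variable for observations with the characteristic, holding the other variables fixed.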

### Using Qualitative Independent Variables (Example 12-2)

Dummy Variable:

Estimated Regression:

### Using Qualitative Independent Variables (Figure 12-11)

[Figure 12-11: separate regression lines for MBAs and Non-MBAs. b2 = 35,236 = Regression coefficient on the dummy variable.]

### Nonlinear Relationships

[Figure: Exponential Relationship of Increased Demand for Electricity versus Population Growth. Axes: Electricity Demand, Population.]

[Figure: Diminishing Returns Relationship of Advertising versus Sales. Axis: Sales.]

### Nonlinear Relationships

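POLYNOMIAL POPULATION REGRESSION MODEL

$$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_p x^p + \varepsilon$$

where:

$\beta_0$ = Population’s regression constant

$\beta_j$ = Population’s regression coefficient for variable $x^j$; j = 1, 2, …, p

p = Order of the polynomial

$\varepsilon$ = Model error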

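A polynomial model is still linear in its coefficients, so it can be estimated by least squares after adding powers of x as extra columns. The sketch below assumes NumPy and fits a second-order model to hypothetical, noise-free data; the function name `fit_polynomial` is an assumption for this example.

```python
import numpy as np


def fit_polynomial(x, y, order):
    """Least squares estimates (b0, b1, ..., bp) for a polynomial of the given order."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Columns are x^0 (the constant), x^1, ..., x^order.
    X = np.column_stack([x ** power for power in range(order + 1)])
    b, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    return b


# Hypothetical data generated from y = 1 + 2x + 3x^2 with no error term.
x = [0, 1, 2, 3, 4]
y = [1 + 2 * v + 3 * v ** 2 for v in x]

b = fit_polynomial(x, y, order=2)
print("b0, b1, b2 =", b)
```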
### Nonlinear Relationships

Interaction refers to the case in which one independent variable (such as x2) affects the relationship between another independent variable (x1) and a dependent variable (y).

### Nonlinear Relationships

A composite model is the model that contains both the basic terms and the interactive terms.
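An interactive term is typically formed as the product of two independent variables and added to the model alongside the basic terms. The sketch below assumes NumPy and hypothetical, noise-free data; it fits y on x1, x2, and the product x1·x2.

```python
import numpy as np

# Hypothetical data, not from the chapter.
x1 = np.array([0.0, 1.0, 0.0, 1.0, 2.0, 2.0])
x2 = np.array([0.0, 0.0, 1.0, 1.0, 1.0, 3.0])

# Noise-free response built from known coefficients (1, 2, 3, 4) for illustration.
y = 1.0 + 2.0 * x1 + 3.0 * x2 + 4.0 * x1 * x2

# Basic terms (constant, x1, x2) plus the interactive term x1 * x2.
X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
b, _, _, _ = np.linalg.lstsq(X, y, rcond=None)

# With interaction, the slope on x1 depends on the level of x2.
slope_x1_when_x2_is_0 = b[1] + b[3] * 0.0
slope_x1_when_x2_is_2 = b[1] + b[3] * 2.0

print("coefficients:", b)
print(slope_x1_when_x2_is_0, slope_x1_when_x2_is_2)
```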

### Nonlinear Relationships

[Figure: A Composite Model, with its Basic Terms and Interactive Terms labeled.]

### Stepwise Regression

Stepwise regression refers to a method which develops the least squares regression equation in steps, either through forward selection, backward elimination, or through standard stepwise regression.

### Stepwise Regression

The coefficient of partial determination is the measure of the marginal contribution of each independent variable, given that other independent variables are in the model.

### Best Subsets Regression

$C_p$ STATISTIC

$$C_p = \frac{(1 - R_p^2)(n - T)}{1 - R_T^2} - (n - 2p)$$

where:

p = k + 1, where k = Number of independent variables in the model

T = 1 + The total number of independent variables to be considered for inclusion in the model

$R_p^2$ = Coefficient of multiple determination for the model with p = k + 1 parameters

$R_T^2$ = Coefficient of multiple determination for the model that contains all T parameters

n = Sample size
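In its R-squared form, the Cp statistic is (1 - Rp²)(n - T) / (1 - RT²) - (n - 2p). The sketch below evaluates it for hypothetical values; the function name `cp_statistic` is an assumption. As a sanity check, the full model (p = T, Rp² = RT²) always gives Cp = T.

```python
def cp_statistic(r2_p, r2_t, n, p, t):
    """Cp = (1 - R_p^2)(n - T) / (1 - R_T^2) - (n - 2p)."""
    return (1.0 - r2_p) * (n - t) / (1.0 - r2_t) - (n - 2 * p)


# Hypothetical values: 5 candidate variables (T = 6), a subset with k = 2 (p = 3).
cp_subset = cp_statistic(r2_p=0.80, r2_t=0.85, n=30, p=3, t=6)

# The full model evaluated against itself.
cp_full = cp_statistic(r2_p=0.85, r2_t=0.85, n=30, p=6, t=6)

print(cp_subset, cp_full)
```

Subsets whose Cp is close to or below p are usually the ones considered further.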

### Analysis of Residuals

The following problems can be inferred through graphical analysis of residuals:

• The regression function is not linear.
• The model errors do not have a constant variance.
• The model errors are not independent.
• The model errors are not normally distributed.

### Analysis of Residuals

RESIDUAL

$$e_i = y_i - \hat{y}_i$$

### Analysis of Residuals (Figure 12-36)

[Figure 12-36: residual plots with Residuals on the vertical axis (-3 to 3). (a) Nonlinear Pattern; (b) Linear Pattern.]

### Analysis of Residuals (Figure 12-39)

[Figure 12-39: plots of Residuals (-3 to 3) against x1. (a) Variance Decreases as x Increases; (b) Variance Increases as x Increases; (c) Constant Variance.]

### Analysis of Residuals (Figure 12-42)

[Figure 12-42: plots of Residuals (-3 to 3) against Time. (a) Independent Residuals; (b) Residuals Not Independent.]

### Analysis of Residuals

STANDARDIZED RESIDUAL

where:

ei = Residual value

s = Estimate of the standard error of the estimate

xp = Value of x used to generate the predicted y value

### Key Terms

• Aptness
• Coefficient of Partial Determination
• Composite Model
• Correlation Coefficient
• Correlation Matrix
• Dummy Variables
• Interaction
• Model
• Multicollinearity
• Multiple Coefficient of Determination
• Multiple Regression Model
• Polynomial Model
• Residual (Prediction Error)
• Second-Order Regression Model
• Standard Error of the Estimate
• Standardized Residual
• Variance Inflation Factor