
Improve Phase Advanced Process Modeling Multiple Linear Regression (MLR)



Presentation Transcript


  1. Improve Phase: Advanced Process Modeling, Multiple Linear Regression (MLR)

  2. Advanced Modeling & Regression Welcome to Improve Review Corr./Regression Process Modeling: Regression Non-Linear Regression Advanced Process Modeling: MLR Transforming Process Data Designing Experiments Multiple Regression Wrap Up & Action Items

  3. Correlation and Linear Regression Review
  Correlation and Linear Regression are used:
  • With historical process data. This is NOT a form of experimentation.
  • To determine if two variables are related in a linear fashion.
  • To understand the strength of the relationship.
  • To understand what happens to the value of Y when the value of X is increased by one unit.
  • To establish a Prediction Equation enabling us to predict Y for any level of X.
  Correlation explores association. Correlation and regression do not imply a causal relationship. Designed experiments allow for true cause and effect relationships to be identified.
  Correlations: Stirrate, Impurity
  Pearson correlation of Stirrate and Impurity = 0.966, P-value = 0.000

  4. Correlation Review
  (Figure: correlation scale running from r = -1.0 through 0 to +1.0, marking strong negative correlation, no correlation and strong positive correlation.)
  • Correlation is used to measure the linear relationship between two Continuous Variables (bi-variate data).
  • The Pearson Correlation Coefficient, "r", will always fall between -1 and +1.
  • A Correlation of -1 indicates a strong negative relationship: as one factor increases, the other decreases.
  • A Correlation of +1 indicates a strong positive relationship: as one factor increases, so does the other.
  Decision points: P-value > 0.05, Ho: no relationship; P-value < 0.05, Ha: there is a relationship.
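
For readers who want to reproduce this correlation test outside MINITAB™, here is a minimal Python sketch. The data values below are illustrative placeholders, not the Stirrate/Impurity worksheet.

# Pearson correlation between two continuous variables, with its p-value.
# scipy.stats.pearsonr tests Ho: no linear relationship.
from scipy.stats import pearsonr

x = [20, 22, 25, 27, 30, 33, 35, 38, 40, 43, 45]                        # placeholder X values
y = [9.5, 10.2, 11.0, 11.8, 12.9, 13.6, 14.8, 15.5, 16.4, 17.3, 18.1]   # placeholder Y values

r, p_value = pearsonr(x, y)
print(f"Pearson r = {r:.3f}, P-value = {p_value:.3f}")
# If p_value < 0.05, reject Ho and conclude the linear relationship is significant.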

  5. Linear Regression Review
  (Fitted Line Plot: Impurity = -0.289 + 0.4566 Stirrate; S = 0.919316, R-Sq = 93.4%, R-Sq(adj) = 92.7%.)
  • Linear Regression is used to model the relationship between a Continuous response variable (Y) and one or more Continuous independent variables (X). The independent predictor variables are most often Continuous but can be ordinal (for example, Shift 1, 2, 3).
  • P-value > 0.05, Ho: Regression Equation is not significant; P-value < 0.05, Ha: Regression Equation is significant.
  • The slope of the line is the change in Impurity for every one unit change in Stirrate.

  6. Regression Analysis Review
  • Correlation tells us the strength of a linear relationship, not the numerical relationship.
  • The last step to proper analysis of Continuous Data is to determine the Regression Equation.
  • The Regression Equation can mathematically predict Y for any given X.
  • The Regression Equation from MINITAB™ is the best fit for the plotted data.
  Prediction Equations:
  Y = a + bx (Linear or 1st order model)
  Y = a + bx + cx^2 (Quadratic or 2nd order model)
  Y = a + bx + cx^2 + dx^3 (Cubic or 3rd order model)
  Y = a(b^x) (Exponential)
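
As a side note, the 1st-, 2nd- and 3rd-order prediction equations above can be fitted and compared with a short Python sketch using numpy only; the data values are illustrative placeholders.

# Fit 1st-, 2nd- and 3rd-order prediction equations to the same (x, y) data
# and compare R-squared for each. numpy.polyfit returns coefficients from
# the highest power down to the constant term.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])      # placeholder X data
y = np.array([2.1, 3.9, 6.2, 8.1, 10.3, 11.8, 14.2, 15.9])  # placeholder Y data

for order in (1, 2, 3):
    coeffs = np.polyfit(x, y, deg=order)
    y_hat = np.polyval(coeffs, x)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r_sq = 1 - ss_res / ss_tot
    print(f"order {order}: coefficients = {np.round(coeffs, 4)}, R-sq = {r_sq:.3f}")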

  7. Simple versus Multiple Regression Review
  Simple Regression: one X, one Y. Analyze in MINITAB™ using Stat > Regression > Fitted Line Plot or Stat > Regression > Regression.
  Multiple Regression: two or more X's, one Y. Analyze in MINITAB™ using Stat > Regression > Best Subsets and Stat > Regression > Regression.
  In both cases the R-sq value estimates the amount of variation explained by the model.

  8. Regression Step Review
  The basic steps to follow in Regression are as follows:
  1. Create a Scatter Plot (Graph > Scatterplot).
  2. Determine Correlation (Stat > Basic Statistics > Correlation); look for a P-value less than 0.05.
  3. Run a Fitted Line Plot choosing the linear option (Stat > Regression > Fitted Line Plot).
  4. Run Regression (Stat > Regression > Regression) and check for unusual observations.
  5. Evaluate R2, adjusted R2 and P-values.
  6. Run Non-linear Regression if necessary (Stat > Regression > Fitted Line Plot).
  7. Analyze residuals to validate assumptions (Stat > Regression > Fitted Line Plot > Graphs): Normally distributed, Equal variance, Independence.
  8. Confirm one or two points do not overly influence the model.
  One step at a time….

  9. Simple Regression Example • This data set is from the mining industry. It is an evaluation of ore concentrators. Graph > Scatterplot…

  10. Correlation Example Correlations: PGM concentrate (g/ton), Agitator RPM Pearson Correlation of PGM concentrate (g/ton) and Agitator RPM = 0.847 P-value = 0.001

  11. Regression Line Example Stat > Regression > Fitted Line Plot…

  12. Linear Regression Example
  Regression Analysis: PGM concentrate (g/ton) versus Agitator RPM
  The regression equation is
  PGM concentrate (g/ton) = 1.12 + 1.33 Agitator RPM

  Predictor      Coef     SE Coef     T      P
  Constant       1.119    7.106     0.16  0.878
  Agitator RPM   1.3332   0.2642    5.05  0.001

  S = 9.08220   R-Sq = 71.8%   R-Sq(adj) = 69.0%

  Analysis of Variance
  Source          DF      SS      MS      F      P
  Regression       1  2101.1  2101.1  25.47  0.001
  Residual Error  10   824.9    82.5
  Total           11  2925.9

  Unusual Observations
       Agitator  PGM concentrate
  Obs  RPM       (g/ton)            Fit   SE Fit  Residual  St Resid
    3  32.0      23.30            43.78     3.21    -20.48    -2.41R
  R denotes an observation with a large standardized residual.

  The P-value is < 0.05, therefore the Regression is significant. The unusual observation suggests a Non-linear analysis may explain more of the variation in the data.
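
The same style of analysis can be approximated outside MINITAB™. A hedged sketch using Python's statsmodels follows; the file name and column labels are assumptions about how the ore-concentrator worksheet might be exported, not part of the original material.

# Simple linear regression with a full summary, similar in spirit to
# MINITAB's Stat > Regression > Regression output.
# The CSV file name and column labels below are assumptions.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("ore_concentrator.csv")          # hypothetical export of the worksheet
X = sm.add_constant(df["Agitator RPM"])           # adds the intercept term
y = df["PGM concentrate (g/ton)"]

model = sm.OLS(y, X).fit()
print(model.summary())                            # coefficients, t, p, R-sq, ANOVA table
print("S =", model.mse_resid ** 0.5)              # standard deviation of residual error

# Standardized residuals larger than about 2 in absolute value flag unusual observations.
influence = model.get_influence()
print(influence.resid_studentized_internal)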

  13. Regression Line Example Stat>Regression>Fitted Line Plot

  14. Linear and Non-Linear Regression Example
  Linear Model
  Regression Analysis: PGM concentrate (g/ton) versus Agitator RPM
  The regression equation is
  PGM concentrate (g/ton) = 1.119 + 1.333 Agitator RPM
  S = 9.08220   R-Sq = 71.8%   R-Sq(adj) = 69.0%
  Analysis of Variance
  Source      DF       SS       MS      F      P
  Regression   1  2101.07  2101.07  25.47  0.001
  Error       10   824.86    82.49
  Total       11  2925.93

  Non-Linear Model
  Polynomial Regression Analysis: PGM concentrate (g/ton) versus Agitator RPM
  The regression equation is
  PGM concentrate (g/ton) = 30.53 - 1.460 Agitator RPM + 0.05586 Agitator RPM**2
  S = 7.61499   R-Sq = 82.2%   R-Sq(adj) = 78.2%
  Analysis of Variance
  Source      DF       SS       MS      F      P
  Regression   2  2404.04  1202.02  20.73  0.000
  Error        9   521.89    57.99
  Total       11  2925.93
  Sequential Analysis of Variance
  Source     DF       SS      F      P
  Linear      1  2101.07  25.47  0.001
  Quadratic   1   302.97   5.22  0.048

  More variation is explained by the Non-linear model: its R-squared is higher and its S statistic (the estimated Standard Deviation of the error in the model) is lower.

  15. Residual Analysis Example

  16. Residual Analysis Example • Normally Distributed residuals (Normal Probability Plot) • Equal variance (Residuals vs. Fitted Values) • Independence (Residuals vs. Order of Data)
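
A minimal Python sketch of the same three residual checks, assuming a fitted statsmodels result named model (carried over from the earlier sketch):

# Residual diagnostics for a fitted OLS result `model`:
# normality, equal variance, and independence.
import matplotlib.pyplot as plt
import scipy.stats as stats

residuals = model.resid
fitted = model.fittedvalues

fig, axes = plt.subplots(1, 3, figsize=(12, 4))

stats.probplot(residuals, dist="norm", plot=axes[0])   # normal probability plot
axes[0].set_title("Normal Probability Plot")

axes[1].scatter(fitted, residuals)                     # look for constant spread
axes[1].axhline(0, linestyle="--")
axes[1].set_title("Residuals vs. Fitted Values")

axes[2].plot(range(1, len(residuals) + 1), residuals, marker="o")  # look for patterns over time
axes[2].set_title("Residuals vs. Order of Data")

plt.tight_layout()
plt.show()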

  17. Non-Linear Relationships Summary
  Methods to find Non-linear relationships:
  • Scatter Plot indicating curvature.
  • Unusual observations in the Linear Regression model.
  • Trends in the Residuals versus Fitted Values Plot in Simple Linear Regression.
  • Subject matter expert knowledge or team experience.

  18. Types of Non-Linear Relationships Oh, which formula to use?!

  19. Mailing Response Example • This example will demonstrate how to use confidence and prediction intervals. • What percent discount should be offered to achieve a minimum 10% response from the mailing? The discount is created through sales coupons being sent in the mail. Clip ’em!

  20. Mailing Response Scatterplot Graph > Scatterplot…

  21. Mailing Response Correlation Correlations: % discount, % response from mailing Pearson Correlation of % discount and % response from mailing = 0.972 P-value = 0.000

  22. Mailing Response Fitted Line Plot
  Regression Analysis: % response from mailing versus % discount
  The regression equation is
  % response from mailing = -11.2 + 1.83 % discount

  Predictor     Coef     SE Coef      T      P
  Constant     -11.215    2.541   -4.41  0.001
  % discount     1.8301   0.1179  15.52  0.000

  S = 5.60971   R-Sq = 94.5%   R-Sq(adj) = 94.1%

  Analysis of Variance
  Source          DF      SS      MS       F      P
  Regression       1  7580.0  7580.0  240.87  0.000
  Residual Error  14   440.6    31.5
  Total           15  8020.5

  Note there are no unusual observations. Even though the R-squared values are high, a Non-linear fit may be better based on the Fitted Line Plot.

  23. Mailing Response Non-linear Fitted Line Plot
  Notice the R-squared value for the Non-linear fit increased to 98.6% from 94.5% in the Linear Regression.
  Polynomial Regression Analysis: % response from mailing versus % discount
  The regression equation is
  % response from mailing = -0.416 + 0.1526 % discount + 0.04166 % discount**2

  S = 2.91382   R-Sq = 98.6%   R-Sq(adj) = 98.4%

  Analysis of Variance
  Source      DF       SS       MS       F      P
  Regression   2  7910.14  3955.07  465.83  0.000
  Error       13   110.37     8.49
  Total       15  8020.51
  Sequential Analysis of Variance
  Source     DF       SS       F      P
  Linear      1  7579.95  240.87  0.000
  Quadratic   1   330.19   38.89  0.000

  24. Confidence and Prediction Intervals
  The original task: what percent discount should be offered to achieve a minimum 10% response from the mailing? To answer this question it is necessary to evaluate the Confidence and Prediction Intervals, displayed through the Options sub-dialog of the Fitted Line Plot.

  25. Confidence and Prediction Intervals
  Draw a horizontal line at 10% response. Draw a vertical line where the horizontal line intersects the lower prediction interval curve. With 95% confidence, a discount of 18% should create at least a 10% response from the mailing.

  26. Confidence and Prediction Intervals The Prediction Interval is the range where a new observation is expected to fall. In this case we are 95% confident an 18% discount will yield between 10% and 23% response from the mailing. The Confidence Interval is the range where the Prediction Equation is expected to fall. The true Prediction Equation could be different. However, given the data we are 95% confident the true Prediction Equation falls within the Confidence Interval.
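
For completeness, the same confidence and prediction intervals can be computed with statsmodels. The file name and the renamed columns below are assumptions, and 18% is the discount value discussed above.

# Confidence and prediction intervals for the quadratic mailing-response model.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("mailing_response.csv")          # hypothetical export of the worksheet
df = df.rename(columns={"% discount": "discount",                 # rename MINITAB-style labels
                        "% response from mailing": "response"})   # to formula-friendly names

model = smf.ols("response ~ discount + I(discount**2)", data=df).fit()

new = pd.DataFrame({"discount": [18]})
pred = model.get_prediction(new)
print(pred.summary_frame(alpha=0.05))
# mean_ci_lower / mean_ci_upper -> 95% confidence interval for the fitted curve
# obs_ci_lower  / obs_ci_upper  -> 95% prediction interval for a new observation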

  27. Residual Analysis To complete the example check the Residual Analysis for validation of the assumptions for Regression Analysis.

  28. Transforming Process Data
  In the case where data is Non-linear it is possible to perform Regression using two different methods:
  • Non-linear Regression (already discussed)
  • Linear Regression on transformed data
  Either the X or the Y may be transformed, and any statistical tool requiring a transformation uses these same methods.
  Advantages of transforming data:
  • Linear Regression is easier to visually understand and manage.
  • Non-normal Data can be changed to resemble Normal Data for statistical analyses where Normality is required.
  Disadvantages of transforming data:
  • Transformed units are difficult to interpret.
  • Transformation is difficult without automation or computers.

  29. Transforming Process Data
  Transformation      Power (p)
  Cube                    3
  Square                  2        x_trans = x^p (for p not equal to 0)
  No Change               1        x_trans = log(x) (for p = 0)
  Square Root           0.5
  Logarithm               0
  Reciprocal Root      -0.5
  Reciprocal             -1
  Transform Rules:
  • The transform must preserve the relative order of the data.
  • The transform must be a smooth and continuous function.
  • It is most often useful when the ratio of largest to smallest value is greater than two. In most cases the transform will have little effect when this rule is violated.
  • All external reference points (spec limits, etc.) must use the same transform.

  30. Effect of Transformation The transformed data now shows a Normal Distribution.

  31. Transforming Data Using MINITAB™ (Transform.MTW)
  The Box-Cox Transformation procedure in MINITAB™ is a method of determining the transform power (called "lambda" in the software) for a set of data. Stat > Control Charts > Box-Cox Transformation.
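
scipy provides a comparable Box-Cox routine. A small sketch follows, with placeholder positive data; it also shows that the transform can be reversed, which addresses the usual complaint about Box-Cox.

# Box-Cox transformation: scipy estimates the power (lambda) by maximum
# likelihood, much like MINITAB's Box-Cox routine. Data must be positive.
import numpy as np
from scipy.stats import boxcox
from scipy.special import inv_boxcox

raw = np.array([0.4, 0.9, 1.3, 2.1, 3.6, 5.2, 7.9, 12.4])   # placeholder positive data

transformed, lam = boxcox(raw)
print("estimated lambda:", round(lam, 3))

# Reversing the transform recovers the original units.
restored = inv_boxcox(transformed, lam)
print(np.allclose(restored, raw))   # True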

  32. Box Cox Transform Before Transform After Transform

  33. Transforming Without the Box Cox Routine (Transform.MTW)
  An alternative method of transforming data is to use standard transforms; the square root and natural log transforms are the most commonly used. A disadvantage of using the Box-Cox transformation is the difficulty in reversing the transformation. The column of process data is in C1, labeled Pos Skew. Remember this data was not Normally Distributed, as determined with the Anderson-Darling Normality test. Using the MINITAB™ calculator, calculate the square root of each observation in C1 and store it in C3, calling it "Square Root".

  34. Transforming Without the Box Cox Routine (Transform.MTW)
  • The output should resemble this view.
  • Confirm that the new data set found in C3 is Normally Distributed.
  • Our transform is the square root, the same as a Box-Cox transform with lambda = 0.5.
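
The equivalent manual transform and normality check in Python, assuming the worksheet has been exported to a CSV file (the file name is hypothetical; the column label Pos Skew follows the description above):

# Manual square-root transform of a positively skewed column, followed by an
# Anderson-Darling normality check.
import numpy as np
import pandas as pd
from scipy.stats import anderson

df = pd.read_csv("Transform.csv")                 # hypothetical export of Transform.MTW
df["Square Root"] = np.sqrt(df["Pos Skew"])       # equivalent to Box-Cox with lambda = 0.5

result = anderson(df["Square Root"], dist="norm")
print("A-squared =", round(result.statistic, 3))
# Compare the statistic against result.critical_values at the 5% significance level;
# a statistic below the critical value is consistent with normality.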

  35. Multiple Linear Regression
  • Multiple Linear Regression investigates the effects of multiple input variables on an output simultaneously.
  • Use it if R2 is not as high as desired in the Simple Linear Regression, or if process knowledge implies more than one input affects the output.
  • The assumptions for residuals in Simple Regression are still necessary for Multiple Linear Regression.
  • An additional assumption for MLR is the independence of the predictors (X's).
  • MINITAB™ can test for multicollinearity (Correlation between the predictors, or X's).
  • Model error (residuals) is impacted by the addition of measurement error for all the input variables. Ah oh…model error!

  36. Definitions of MLR Equation Elements
  Y = β0 + β1X1 + β2X2 + β3X3 + ε
  The definitions for the elements of the Multiple Linear Regression model are as follows:
  • Y: the response (dependent) variable.
  • X1, X2, X3: the predictor (independent) inputs, used to explain the variation in the observed response variable Y.
  • β0: the value of Y when all the explanatory variables (the X's) are equal to zero.
  • β1, β2, β3 (Partial Regression Coefficients): the amount by which the response variable (Y) changes when the corresponding Xi changes by one unit, with the other input variables remaining constant.
  • ε (Error or Residual): the observed Y minus the predicted value of Y from the Regression.
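
A compact sketch of this model form fitted by ordinary least squares; the predictors and response here are synthetic placeholders generated only to make the example runnable.

# Fitting Y = b0 + b1*X1 + b2*X2 + b3*X3 + e by ordinary least squares.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 3))                      # three placeholder predictors
y = 5 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=30)

model = sm.OLS(y, sm.add_constant(X)).fit()
print(model.params)        # b0, b1, b2, b3 (partial regression coefficients)
print(model.resid[:5])     # e: observed Y minus predicted Y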

  37. MLR Step Review
  The basic steps to follow in Multiple Linear Regression are:
  1. Create a Matrix Plot (Graph > Matrix Plot).
  2. Run Best Subsets Regression (Stat > Regression > Best Subsets).
  3. Evaluate R2, adjusted R2, Mallows' Cp, the number of predictors and S.
  4. Iteratively determine the appropriate regression model (Stat > Regression > Regression > Options).
  5. Analyze residuals (Stat > Regression > Regression > Graphs): Normally Distributed, Equal variance, Independence.
  6. Confirm one or two points do not overly influence the model.
  7. Verify your model by running present process data to confirm your model error.

  38. Multiple Linear Regression Model Selection
  When comparing and verifying models consider the following:
  • There should be a reasonably small difference between R2 and adjusted R2 (much less than a 10% difference).
  • When more terms are included in the model, does the adjusted R2 increase?
  • Use the Mallows' Cp statistic. It should be small and less than the number of terms in the model.
  • Models with a smaller S (Standard Deviation of error for the model) are desired.
  • Simpler models should be weighed against models with multiple predictors (independent variables).
  • The best technique is to use MINITAB™'s Best Subsets command.
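
Best Subsets itself is a MINITAB™ command, but the same comparison table (R-sq, adjusted R-sq, Mallows' Cp and S for every subset of predictors) can be sketched in Python. The DataFrame, response column and predictor names passed to the function are assumptions.

# All-subsets regression in the spirit of the Best Subsets command.
from itertools import combinations
import statsmodels.api as sm

def best_subsets(df, y_col, x_cols):
    # df: pandas DataFrame; y_col: response column name; x_cols: candidate predictor names.
    y = df[y_col]
    full = sm.OLS(y, sm.add_constant(df[list(x_cols)])).fit()
    mse_full = full.mse_resid                       # error variance of the full model, used in Cp
    n = len(df)
    rows = []
    for k in range(1, len(x_cols) + 1):
        for subset in combinations(x_cols, k):
            fit = sm.OLS(y, sm.add_constant(df[list(subset)])).fit()
            p = k + 1                               # parameters including the intercept
            cp = fit.ssr / mse_full - (n - 2 * p)   # Mallows' Cp
            rows.append((subset, fit.rsquared, fit.rsquared_adj,
                         cp, fit.mse_resid ** 0.5))
    return sorted(rows, key=lambda r: r[2], reverse=True)   # best adjusted R-sq first

Sorting by adjusted R-sq and then looking for a small Cp (close to the number of terms) and a small S mirrors the selection criteria listed above.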

  39. Flight Regression Example • An airplane manufacturer wanted to see what variables affect flight speed. The historical data available covered a period of 10 months. Graph > Matrix Plot… Flight Regression MLR.MTW

  40. Flight Regression Example Matrix Plot
  Look for plots that show Correlation between the output response (Flight Speed) and the predictors. Since two or more predictors show Correlation with the response, run MLR.

  41. Flight Regression Example Best Subsets
  Best Subsets Regression: Flight Speed versus Altitude, Turbine Angle, ...
  Response is Flight Speed
  Predictor columns (marked X when included): Altitude, Turbine Angle, Fuel/Air ratio, ICR, Temp

  Vars  R-Sq  R-Sq(adj)  Mallows C-p       S
    1   72.1       71.1         38.4  28.054  X
    1   39.4       37.2        112.8  41.358  X
    2   85.9       84.8          9.0  20.316  X X
    2   82.0       80.6         17.9  22.958  X X
    3   87.5       85.9          7.5  19.561  X X X
    3   86.5       84.9          9.6  20.267  X X X
    4   89.1       87.3          5.7  18.589  X X X X
    4   88.1       86.1          8.2  19.481  X X X X
    5   89.9       87.7          6.0  18.309  X X X X X

  42. Flight Regression Example Model Selection
  Best Subsets Regression: Flight Speed versus Altitude, Turbine Angle, ...
  Response is Flight Speed
  Predictor columns (marked X when included): Altitude, Turbine Angle, Fuel/Air ratio, ICR, Temp

  Vars  R-Sq  R-Sq(adj)  Mallows C-p       S
    1   72.1       71.1         38.4  28.054  X
    1   39.4       37.2        112.8  41.358  X
    2   85.9       84.8          9.0  20.316  X X
    2   82.0       80.6         17.9  22.958  X X
    3   87.5       85.9          7.5  19.561  X X X
    3   86.5       84.9          9.6  20.267  X X X
    4   89.1       87.3          5.7  18.589  X X X X
    4   88.1       86.1          8.2  19.481  X X X X
    5   89.9       87.7          6.0  18.309  X X X X X

  What model would you select? Let's consider the 5-predictor model:
  • Highest R-Sq(adj)
  • Lowest Mallows Cp
  • Lowest S
  • However, there are many terms.

  43. Flight Regression Example Model Selection Stat>Regression>Regression… …Options

  44. Flight Regression Example Model Selection
  Regression Analysis: Flight Speed versus Altitude, Turbine Angle, ...
  The regression equation is
  Flight Speed = 770 + 0.153 Altitude + 5.81 Turbine Angle + 8.70 Fuel/Air ratio - 52.3 ICR + 4.11 Temp

  Predictor          Coef    SE Coef      T      P   VIF
  Constant         770.4     229.7     3.35  0.003
  Altitude           0.15318   0.06605  2.32  0.030   2.3
  Turbine Angle      5.806     2.843    2.04  0.053   1.4
  Fuel/Air ratio     8.696     3.327    2.61  0.016   3.2
  ICR              -52.269     6.157   -8.49  0.000   2.6
  Temp               4.107     3.114    1.32  0.200   5.4

  S = 18.3088   R-Sq = 89.9%   R-Sq(adj) = 87.7%

  The Variance Inflation Factor (VIF) detects Correlation among predictors:
  • VIF = 1 indicates no relation among predictors.
  • VIF > 1 indicates predictors are correlated to some degree.
  • VIF between 5 and 10 indicates the Regression Coefficients are poorly estimated and are unacceptable.
  The VIF for Temp indicates it should be removed from the model. Go back to the Best Subsets analysis and select the best model that does not include the predictor Temp.
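
The VIF column in the output above is reproducible with statsmodels. A short sketch, assuming a DataFrame X that holds only the five predictor columns:

# Variance Inflation Factors for each predictor, used to detect multicollinearity.
# `X` is assumed to be a pandas DataFrame of predictor columns only.
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

X_const = sm.add_constant(X)                      # VIF is computed against a model with an intercept
vifs = {col: variance_inflation_factor(X_const.values, i)
        for i, col in enumerate(X_const.columns) if col != "const"}
print(pd.Series(vifs).round(1))                   # values in the 5-10 range flag poorly estimated coefficients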

  45. Flight Regression Example Model Selection
  Note: it is not necessary to re-run the Best Subsets analysis; the numbers do not change.
  Best Subsets Regression: Flight Speed versus Altitude, Turbine Angle, ...
  Response is Flight Speed
  Predictor columns (marked X when included): Altitude, Turbine Angle, Fuel/Air ratio, ICR, Temp

  Vars  R-Sq  R-Sq(adj)  Mallows C-p       S
    1   72.1       71.1         38.4  28.054  X
    1   39.4       37.2        112.8  41.358  X
    2   85.9       84.8          9.0  20.316  X X
    2   82.0       80.6         17.9  22.958  X X
    3   87.5       85.9          7.5  19.561  X X X
    3   86.5       84.9          9.6  20.267  X X X
    4   89.1       87.3          5.7  18.589  X X X X
    4   88.1       86.1          8.2  19.481  X X X X
    5   89.9       87.7          6.0  18.309  X X X X X

  Select a model with 4 terms, because Temp was removed as a predictor since it had Correlation with the other variables. Re-run the Regression.

  46. Flight Regression Example Model Selection
  Regression Analysis: Flight Speed versus Altitude, Turbine Angle, ...
  The regression equation is
  Flight Speed = 616 + 0.117 Altitude + 6.70 Turbine Angle + 12.2 Fuel/Air ratio - 48.2 ICR

  Predictor          Coef    SE Coef      T      P   VIF
  Constant         616.1     200.7     3.07  0.005
  Altitude           0.11726   0.06109  1.92  0.067   1.9
  Turbine Angle      6.702     2.802    2.39  0.025   1.3
  Fuel/Air ratio    12.151     2.082    5.84  0.000   1.2
  ICR              -48.158     5.391   -8.93  0.000   1.9

  S = 18.5889   R-Sq = 89.1%   R-Sq(adj) = 87.3%

  • The VIF values are now acceptable.
  • Evaluate the P-values. If P > 0.05, the term should be removed from the Regression.
  • Remove Altitude and re-run the model.

  47. Flight Regression Example Model Selection
  Regression Analysis: Flight Speed versus Turbine Angle, Fuel/Air ratio, ICR
  The regression equation is
  Flight Speed = 887 + 4.82 Turbine Angle + 12.1 Fuel/Air ratio - 55.0 ICR

  Predictor          Coef    SE Coef       T      P   VIF
  Constant         886.6     150.4      5.90  0.000
  Turbine Angle      4.822     2.763    1.75  0.093   1.1
  Fuel/Air ratio    12.106     2.191    5.53  0.000   1.2
  ICR              -55.009     4.251  -12.94  0.000   1.1

  S = 19.5613   R-Sq = 87.5%   R-Sq(adj) = 85.9%

  The P-value for Turbine Angle now indicates it should be removed (P > 0.05) and the Regression re-run.
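
The remove-and-refit loop used on these slides can be expressed as a small backward-elimination sketch; df, the response column name and the starting predictor list are assumed to exist as in the earlier sketches.

# Backward elimination: refit, drop the predictor with the largest p-value
# above alpha, and repeat until all remaining terms are significant.
import statsmodels.api as sm

def backward_eliminate(df, y_col, predictors, alpha=0.05):
    predictors = list(predictors)
    while predictors:
        fit = sm.OLS(df[y_col], sm.add_constant(df[predictors])).fit()
        pvals = fit.pvalues.drop("const")
        worst = pvals.idxmax()
        if pvals[worst] <= alpha:                 # all terms significant: stop
            return fit
        predictors.remove(worst)                  # drop the weakest term and refit
    return None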

  48. Flight Regression Final Regression Model
  Regression Analysis: Flight Speed versus Fuel/Air ratio, ICR
  The regression equation is
  Flight Speed = 1101 + 10.9 Fuel/Air ratio - 55.2 ICR

  Predictor          Coef    SE Coef       T      P   VIF
  Constant        1101.04     90.00     12.23  0.000
  Fuel/Air ratio    10.921     2.163     5.05  0.000   1.1
  ICR              -55.197     4.414   -12.51  0.000   1.1

  S = 20.3162   R-Sq = 85.9%   R-Sq(adj) = 84.8%

  Analysis of Variance
  Source          DF     SS     MS      F      P
  Regression       2  65500  32750  79.35  0.000
  Residual Error  26  10731    413
  Total           28  76231

  Source          DF  Seq SS
  Fuel/Air ratio   1     951
  ICR              1   64549

  Unusual Observations
       Fuel/Air  Flight
  Obs  ratio     Speed      Fit    SE Fit  Residual  St Resid
    1  40.6      618.00   624.29    11.55     -6.29    -0.38 X
   22  36.3      578.00   524.45     5.43     53.55     2.74 R
  R denotes an observation with a large standardized residual.
  X denotes an observation whose X value gives it large influence.

  This is the final Regression model because all remaining terms are statistically significant (we wanted 95% confidence, or P-value < 0.05) and the R-Sq shows the remaining terms explain 85.9% of the variation in flight speed. Consider removing the Outlier, but be careful: this is historical data with no further information available. Remember the objective is to get information to be used in a Designed Experiment where true cause and effect relationships can be established. Note the ICR predictor accounts for 84.7% of the variation (84.7% = 64549 / 76231).

  49. Flight Regression Example Residual Analysis

  50. Flight Regression Example Residual Analysis • Normally Distributed Residuals (Normal Probability Plot) • Equal Variance (Residuals vs. Fitted Values) • Independence (Residuals vs. Order of Data)
