Statistical Tools for Multivariate Six Sigma. Dr. Neil W. Polhemus CTO & Director of Development StatPoint, Inc. The Challenge. The quality of an item or service usually depends on more than one characteristic.
Dr. Neil W. Polhemus
CTO & Director of Development
The quality of an item or service usually depends on more than one characteristic.
When the characteristics are not independent, considering each characteristic separately can give a misleading estimate of overall performance.
Proper analysis of data from such processes requires the use of multivariate statistical techniques.
Characteristic #1: tensile strength - 115 ± 1
Characteristic #2: diameter - 1.05 ± 0.05
n = 100
Determines joint probability of being within the specification limits on all characteristics
Defined to give the
same DPM as in the
S = sample covariance matrix
= vector of sample means
Subtracts the value of T-squared if each variable is removed.
Large values indicate that a variable has an important contribution.
Plots the determinant of the variance-covariance matrix for data that is sampled in subgroups.
When the number of variables is large, the dimensionality of the problem often makes it difficult to determine the underlying relationships.
Reduction of dimensionality can be very helpful.
The goal of a principal components analysis (PCA) is to construct k linear combinations of the p variables X that contain the greatest variance.
Shows the number of significant components.
Similar to PCA, except that it finds components that minimize the variance in both the X’s and the Y’s.
May be used with many X variables, even exceeding n.
Starts with number of components equal to the minimum of p and (n-1).
Principal components can also be used to classify new observations.
A useful method for classification is a Bayesian classifier, which can be expressed as a neural network.
When more than one characteristic is important, finding the optimal operating conditions usually requires a tradeoff of one characteristic for another.
One approach to finding a single solution is to use desirability functions.
Myers and Montgomery (2002) describe an experiment on a chemical process:
where m = # of factors and 0 ≤ Ij ≤ 5. D ranges from 0 to 1.