1 Statistical Tools for Multivariate Six Sigma Dr. Neil W. Polhemus CTO & Director of Development StatPoint, Inc.
2 The Challenge The quality of an item or service usually depends on more than one characteristic. When the characteristics are not independent, considering each characteristic separately can give a misleading estimate of overall performance.
3 The Solution Proper analysis of data from such processes requires the use of multivariate statistical techniques.
4 Outline
Multivariate SPC
  - Multivariate control charts
  - Multivariate capability analysis
Data exploration and modeling
  - Principal components analysis (PCA)
  - Partial least squares (PLS)
  - Neural network classifiers
Design of experiments (DOE)
  - Multivariate optimization
5 Example #1 Textile fiber Characteristic #1: tensile strength ± 1 Characteristic #2: diameter ± 0.05
6 Sample Data n = 100
7 Individuals Chart - strength
8 Individuals Chart - diameter
9 Capability Analysis - strength
10 Capability Analysis - diameter
11 Scatterplot
12 Multivariate Normal Distribution
13 Control Ellipse
14 Multivariate Capability Determines joint probability of being within the specification limits on all characteristics
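The effect of correlation on the joint in-spec probability can be illustrated with a small simulation. This is only a sketch under stated assumptions: the means, standard deviations, correlation, and specification limits below are hypothetical values (not the textile fiber data), and the function name is invented for illustration. It compares the simulated joint probability with the product of the two marginal probabilities, which is what treating the characteristics as independent would give.

```python
import math
import random

def simulate_joint_capability(mean, sd, rho, lsl, usl, n=20000, seed=1):
    """Estimate in-spec probabilities for two correlated normal characteristics.

    Returns (joint, product_of_marginals), both estimated from the same draws.
    """
    rng = random.Random(seed)
    in1 = in2 = both = 0
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # Mixing z1 into the second variable gives the pair correlation rho.
        x1 = mean[0] + sd[0] * z1
        x2 = mean[1] + sd[1] * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        ok1 = lsl[0] <= x1 <= usl[0]
        ok2 = lsl[1] <= x2 <= usl[1]
        in1 += ok1
        in2 += ok2
        both += ok1 and ok2
    return both / n, (in1 / n) * (in2 / n)

# Hypothetical process: standardized characteristics, specs at +/- 1 sigma.
joint, naive = simulate_joint_capability(
    mean=(0.0, 0.0), sd=(1.0, 1.0), rho=0.9,
    lsl=(-1.0, -1.0), usl=(1.0, 1.0))
print(f"joint in-spec probability:  {joint:.3f}")
print(f"product of the marginals:   {naive:.3f}")
```

With strongly correlated characteristics the two numbers differ noticeably, which is the reason for estimating the joint probability directly.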
15 Multivariate Capability
16 Capability Ellipse
17 Mult. Capability Indices Defined so that a given index value corresponds to the same DPM (defects per million) as in the univariate case.
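One way to read "the same DPM as in the univariate case" is to convert an estimated DPM back into the index that a centered, normally distributed univariate process with two-sided limits would need in order to produce it. The sketch below assumes that convention; the function names are illustrative and the exact definition used by any particular software package may differ.

```python
from statistics import NormalDist

def dpm_from_index(cp):
    """DPM of a centered normal process with two-sided specs at +/- 3*cp sigma."""
    return 2.0 * (1.0 - NormalDist().cdf(3.0 * cp)) * 1e6

def index_from_dpm(dpm):
    """Index that would produce the given DPM in the centered univariate case."""
    if not 0.0 < dpm < 1e6:
        raise ValueError("dpm must be strictly between 0 and 1,000,000")
    # Half of the defects fall in each tail under the centered assumption.
    z = NormalDist().inv_cdf(1.0 - dpm / 2e6)
    return z / 3.0
```

Under this convention a joint DPM estimated from the multivariate distribution can be reported on the familiar capability-index scale.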
18 Test for Normality
19 More than 2 Characteristics Calculate T-squared: $T_i^2 = (x_i - \bar{x})' S^{-1} (x_i - \bar{x})$, where $S$ = sample covariance matrix and $\bar{x}$ = vector of sample means.
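A minimal sketch of the T-squared calculation for individual observations, written in plain Python so that each step is visible. It assumes the usual form, the deviation vector from the sample means weighted by the inverse of the sample covariance matrix; the helper names and the readings are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def mean_and_cov(data):
    """Vector of sample means and sample covariance matrix (divisor n - 1)."""
    n, p = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(p)]
    cov = [[sum((row[j] - mean[j]) * (row[k] - mean[k]) for row in data) / (n - 1)
            for k in range(p)] for j in range(p)]
    return mean, cov

def t_squared(x, mean, cov):
    """(x - mean)' S^-1 (x - mean), computed without forming the inverse."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    return sum(di * wi for di, wi in zip(d, solve(cov, d)))

# Hypothetical readings on three characteristics.
readings = [[10.1, 1.02, 5.0], [9.8, 0.97, 5.3], [10.4, 1.05, 4.9],
            [10.0, 1.00, 5.1], [9.7, 0.96, 5.4], [10.3, 1.03, 4.7]]
mean, cov = mean_and_cov(readings)
t2_values = [t_squared(row, mean, cov) for row in readings]
```

A T-squared chart plots these values in time order against a control limit; the limit itself is not computed in this sketch.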
20 T-Squared Chart
21 T-Squared Decomposition For each variable, subtracts from the full T-squared the value of T-squared obtained when that variable is removed. Large differences indicate that the variable makes an important contribution.
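The decomposition can be sketched by recomputing T-squared with each variable deleted in turn and taking the difference from the full value. This is an illustrative implementation under that reading of the slide; the function names are hypothetical and the mean vector and covariance matrix are assumed to be supplied by the caller.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def t_squared(x, mean, cov):
    """(x - mean)' S^-1 (x - mean)."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    return sum(di * wi for di, wi in zip(d, solve(cov, d)))

def t_squared_contributions(x, mean, cov):
    """Return the full T-squared and, per variable, the drop when it is removed."""
    full = t_squared(x, mean, cov)
    contributions = []
    for j in range(len(x)):
        keep = [k for k in range(len(x)) if k != j]
        reduced = t_squared([x[k] for k in keep],
                            [mean[k] for k in keep],
                            [[cov[r][c] for c in keep] for r in keep])
        contributions.append(full - reduced)
    return full, contributions
```

When the variables are uncorrelated, each contribution reduces to the squared standardized deviation of that variable, which gives a quick sanity check.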
22 Control Ellipsoid
23 Multivariate EWMA Chart
24 Generalized Variance Chart Plots the determinant of the variance-covariance matrix for data that is sampled in subgroups.
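As a rough sketch, the plotted statistic can be computed by forming the sample covariance matrix of each subgroup and taking its determinant. The subgroup values and function names below are hypothetical, and control limits are not computed here.

```python
def covariance_matrix(rows):
    """Sample covariance matrix (divisor n - 1) of one subgroup."""
    n, p = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[j] - mean[j]) * (r[k] - mean[k]) for r in rows) / (n - 1)
             for k in range(p)] for j in range(p)]

def determinant(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in a]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= m[col][col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= f * m[col][c]
    return det

def generalized_variances(subgroups):
    """One determinant per subgroup, in the order the subgroups were sampled."""
    return [determinant(covariance_matrix(g)) for g in subgroups]

# Hypothetical subgroups of four items, two characteristics each.
subgroups = [
    [[10.0, 1.00], [10.4, 1.03], [9.7, 0.98], [10.1, 1.02]],
    [[10.2, 0.99], [9.9, 1.04], [10.5, 1.01], [9.8, 0.97]],
]
values = generalized_variances(subgroups)
```

Each subgroup needs more observations than variables; otherwise its covariance matrix is singular and the determinant is zero.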
25 Data Exploration and Modeling When the number of variables is large, the dimensionality of the problem often makes it difficult to determine the underlying relationships. Reduction of dimensionality can be very helpful.
26 Example #2
27 Matrix Plot
28 Analysis Methods
  - Predicting certain characteristics based on others (regression and ANOVA)
  - Separating items into groups (classification)
  - Detecting unusual items
29 Multiple Regression
30 Principal Components The goal of a principal components analysis (PCA) is to construct k ≤ p uncorrelated linear combinations of the p variables X that capture as much of the total variance as possible.
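To make the idea concrete, the first principal component can be approximated by power iteration on the sample covariance matrix. This is a teaching sketch rather than how a statistics package would do it (a full eigendecomposition is the normal route). It works on the covariance matrix, not the correlation matrix, and it assumes the starting vector is not orthogonal to the leading eigenvector. All names are illustrative.

```python
import math

def covariance_matrix(data):
    """Sample covariance matrix (divisor n - 1)."""
    n, p = len(data), len(data[0])
    mean = [sum(r[j] for r in data) / n for j in range(p)]
    return [[sum((r[j] - mean[j]) * (r[k] - mean[k]) for r in data) / (n - 1)
             for k in range(p)] for j in range(p)]

def first_principal_component(data, iterations=500):
    """Return (weights, variance, share of total variance) for component 1."""
    cov = covariance_matrix(data)
    p = len(cov)
    # Unequal starting weights make an exactly orthogonal start less likely.
    v = [1.0 / (j + 1) for j in range(p)]
    for _ in range(iterations):
        w = [sum(cov[j][k] * v[k] for k in range(p)) for j in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    # Rayleigh quotient: variance of the linear combination with weights v.
    variance = sum(v[j] * sum(cov[j][k] * v[k] for k in range(p))
                   for j in range(p))
    total = sum(cov[j][j] for j in range(p))
    return v, variance, variance / total
```

The returned share is the quantity a "percentage explained" table reports for the first component; the sign of the weight vector is arbitrary.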
31 Scree Plot Plots the eigenvalues in decreasing order to help judge the number of significant components.
32 Percentage Explained
33 Components
34 Interpretation
35 Principal Component Regression
36 Partial Least Squares (PLS) Similar to PCA, except that it finds components that explain as much as possible of the variance in both the X’s and the Y’s. May be used with many X variables, even when their number exceeds the number of observations n.
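For a single response, the first PLS component can be sketched in a few lines: after centering, the weight vector is proportional to X'y, which is the unit-length direction in X space whose scores have the largest covariance with y. This illustrates the general idea only and is not necessarily the algorithm behind the slides. The function name is hypothetical and the columns are centered but not scaled.

```python
import math

def first_pls_component(x_rows, y):
    """Return (weights, scores, y-loading) of the first PLS component."""
    n, p = len(x_rows), len(x_rows[0])
    x_mean = [sum(r[j] for r in x_rows) / n for j in range(p)]
    y_mean = sum(y) / n
    xc = [[r[j] - x_mean[j] for j in range(p)] for r in x_rows]
    yc = [v - y_mean for v in y]
    # Weight vector proportional to X'y (covariance of each column with y).
    xty = [sum(xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(c * c for c in xty))
    if norm == 0.0:
        raise ValueError("no X column is correlated with y")
    w = [c / norm for c in xty]
    # Scores: projection of each centered observation onto the weights.
    t = [sum(xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    # Regress centered y on the scores to get the y-loading.
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return w, t, q
```

Nothing in this step requires inverting X'X, which is why the approach still works when there are more X variables than observations.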
37 Component Extraction Starts with number of components equal to the minimum of p and (n-1).
38 Coefficient Plot
39 Model in Original Units
40 Classification Principal components can also be used to classify new observations. A useful method for classification is a Bayesian classifier, which can be expressed as a neural network.
41 6 Types of Automobiles
42 Neural Networks
43 Bayesian Classifier
  - Begins with prior probabilities for membership in each group
  - Uses a Parzen-like density estimator of the density function for each group
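A minimal sketch of such a classifier, assuming spherical Gaussian kernels with a single smoothing parameter sigma shared by all groups. The kernel choice, the function names, and the group labels used in any call are assumptions made for illustration, not details taken from the slides.

```python
import math

def parzen_density(x, sample, sigma):
    """Average of spherical Gaussian kernels centered on the training points."""
    p = len(x)
    norm = (2.0 * math.pi * sigma ** 2) ** (p / 2.0)
    total = 0.0
    for point in sample:
        d2 = sum((a - b) ** 2 for a, b in zip(x, point))
        total += math.exp(-d2 / (2.0 * sigma ** 2))
    return total / (len(sample) * norm)

def classify(x, groups, priors, sigma):
    """Return (most probable group, posterior probability of every group)."""
    scores = {name: priors[name] * parzen_density(x, pts, sigma)
              for name, pts in groups.items()}
    total = sum(scores.values())
    if total == 0.0:
        raise ValueError("all densities underflowed; try a larger sigma")
    posteriors = {name: s / total for name, s in scores.items()}
    return max(posteriors, key=posteriors.get), posteriors
```

A small sigma makes the estimated densities spiky and the classification regions ragged, while a large sigma smooths them out, which is why sigma is usually tuned on a training set.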
44 Options The prior probabilities may be determined in several ways. A training set is usually used to find a good value for σ.
45 Output
46 Classification Regions
47 Changing Sigma
48 Overlay Plot
49 Outlier Detection
50 Cluster Analysis
51 Design of Experiments When more than one characteristic is important, finding the optimal operating conditions usually requires a tradeoff of one characteristic for another. One approach to finding a single solution is to use desirability functions.
52 Example #3 Myers and Montgomery (2002) describe an experiment on a chemical process:

Response variable       Goal
Conversion percentage   Maximize
Thermal activity        Maintain between 55 and 60

Input factor   Low         High
Time           8 minutes   17 minutes
Temperature    160 °C      210 °C
Catalyst       1.5%        3.5%
53 Experiment
54 Step #1: Model Conversion
55 Step #2: Optimize Conversion
56 Step #3: Model Activity
57 Step #4: Optimize Activity
58 Step #5: Select Desirability Fcns. Maximize
59 Desirability Function Hit Target
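60 Combined Desirability $D = \left(d_1^{I_1} d_2^{I_2} \cdots d_m^{I_m}\right)^{1/\sum_{j=1}^{m} I_j}$, where $d_j$ is the desirability of the j-th response, m = # of responses, and the impact weights satisfy $0 \le I_j \le 5$. D ranges from 0 to 1.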
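The desirability approach can be sketched as follows, assuming the common power-function shapes for the "maximize" and "hit target" cases and a weighted geometric mean, $D = (\prod_j d_j^{I_j})^{1/\sum_j I_j}$, for the combination. The function names, shape exponents, and any limits passed in are hypothetical; only the 55 to 60 range for thermal activity comes from the example.

```python
def desirability_maximize(y, low, high, s=1.0):
    """0 at or below `low`, 1 at or above `high`, a power curve in between."""
    if y <= low:
        return 0.0
    if y >= high:
        return 1.0
    return ((y - low) / (high - low)) ** s

def desirability_target(y, low, target, high, s=1.0, t=1.0):
    """1 at `target`, falling to 0 at `low` and `high`."""
    if y <= low or y >= high:
        return 0.0
    if y <= target:
        return ((y - low) / (target - low)) ** s
    return ((high - y) / (high - target)) ** t

def combined_desirability(d, impacts):
    """Weighted geometric mean of the individual desirabilities."""
    total = sum(impacts)
    if total <= 0:
        raise ValueError("at least one impact weight must be positive")
    product = 1.0
    for dj, ij in zip(d, impacts):
        product *= dj ** ij
    return product ** (1.0 / total)

# Hypothetical predicted responses at one candidate operating point.
d_conversion = desirability_maximize(92.0, low=80.0, high=100.0)
d_activity = desirability_target(58.0, low=55.0, target=57.5, high=60.0)
overall = combined_desirability([d_conversion, d_activity], impacts=[3, 3])
```

Because the combination is a geometric mean, a desirability of zero on any response with a positive impact weight forces the combined value to zero, so a candidate point cannot compensate for an unacceptable response with a good one.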
61 Example
62 Desirability Contours
63 Desirability Surface
64 Overlaid Contours
65 References
Johnson, R. A. and Wichern, D. W. (2002). Applied Multivariate Statistical Analysis. Upper Saddle River: Prentice Hall.
Mason, R. L. and Young, J. C. (2002). Multivariate Statistical Process Control with Industrial Applications. Philadelphia: SIAM.
Montgomery, D. C. (2005). Introduction to Statistical Quality Control, 5th edition. New York: John Wiley and Sons.
Myers, R. H. and Montgomery, D. C. (2002). Response Surface Methodology: Process and Product Optimization Using Designed Experiments, 2nd edition. New York: John Wiley and Sons.
66 PowerPoint Slides Available at: