Design and Analysis of Experiments Dr. Tai-Yue Wang Department of Industrial and Information Management National Cheng Kung University Tainan, TAIWAN, ROC
Two-Level Factorial Designs
Outline
- Introduction
- The 2^2 design
- The 2^3 design
- The general 2^k design
- A single replicate of the 2^k design
- Additional examples of unreplicated 2^k designs
- 2^k designs are optimal designs
- The addition of center points to the 2^k design
Introduction
- Special case of the general factorial design: k factors, each with two levels
- Factors may be qualitative or quantitative
- A complete replicate of such a design is called a 2^k factorial design
- Assumptions: the factors are fixed, the design is completely randomized, and the usual normality assumption holds
- Used in factor screening experiments
- The response is assumed to be linear between the two levels of each factor
The 2^2 Design
Factor A   Factor B   Treatment combination   Replication: I   II   III   IV
   -          -       A low, B low
   +          -       A high, B low
   -          +       A low, B high
   +          +       A high, B high
The 2^2 Design
- "-" and "+" denote the low and high levels of a factor, respectively
- Low and high are arbitrary terms
- Geometrically, the four runs form the corners of a square
- Factors can be quantitative or qualitative, although their treatment in the final model will be different
The 2^2 Design
- Estimate factor effects
- Formulate the model
  - With replication, use the full model
  - With an unreplicated design, use normal probability plots
- Perform statistical testing (ANOVA)
- Refine the model
- Analyze residuals (graphical)
- Interpret results
Standard order (Yates's order)
Effect    (1)     a     b    ab
A          -1    +1    -1    +1
B          -1    -1    +1    +1
AB         +1    -1    -1    +1
- The effects A, B, and AB are orthogonal contrasts, each with one degree of freedom
- Thus 2^k designs are orthogonal designs
The 2^2 Design: ANOVA table
The 2^2 Design: algebraic signs for calculating effects in the 2^2 design
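The sign-table calculation can be sketched in a few lines of Python. This is a minimal illustration, assuming n replicates and the usual formulas effect = contrast / (n 2^(k-1)) and SS = contrast^2 / (n 2^k) with k = 2. The function name `effects_2x2` and the treatment totals are hypothetical and are not taken from the slides.

```python
# Sign table for the 2^2 design in standard (Yates) order: (1), a, b, ab.
SIGNS = {
    "A":  [-1, +1, -1, +1],
    "B":  [-1, -1, +1, +1],
    "AB": [+1, -1, -1, +1],
}

def effects_2x2(totals, n):
    """Effect estimates and sums of squares from treatment totals.

    totals: response totals over the n replicates, ordered (1), a, b, ab.
    """
    out = {}
    for name, signs in SIGNS.items():
        contrast = sum(s * t for s, t in zip(signs, totals))
        out[name] = {
            "effect": contrast / (2 * n),   # contrast / (n * 2^(k-1)), k = 2
            "ss": contrast ** 2 / (4 * n),  # contrast^2 / (n * 2^k), k = 2
        }
    return out

# Hypothetical totals for illustration only (not data from the slides).
totals = [80, 100, 60, 90]
result = effects_2x2(totals, n=3)
for name, stats in result.items():
    print(name, round(stats["effect"], 3), round(stats["ss"], 3))
```

Each effect carries one degree of freedom, so its sum of squares is also its mean square in the ANOVA table.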
The 2^2 Design: regression model
- x_1 and x_2 are coded variables in this case
- Conc. and Catalyst are the corresponding natural variables
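The relation between coded and natural variables can be sketched as follows, assuming the usual coding in which the low level maps to -1 and the high level to +1. The helper names and the factor levels (15 and 25) are hypothetical; the slides do not list the levels.

```python
def to_coded(natural, low, high):
    """Map a natural-unit value to coded units: low -> -1, high -> +1."""
    center = (low + high) / 2
    half_range = (high - low) / 2
    return (natural - center) / half_range

def to_natural(coded, low, high):
    """Inverse mapping from coded units back to natural units."""
    return (low + high) / 2 + coded * (high - low) / 2

# Hypothetical low and high levels for one factor (assumption, not from the slides).
conc_low, conc_high = 15.0, 25.0
print(to_coded(15.0, conc_low, conc_high))   # low level
print(to_coded(20.0, conc_low, conc_high))   # center of the range
print(to_natural(0.5, conc_low, conc_high))  # halfway between center and high
```

Substituting the coded expressions into a model fitted in coded units gives the equivalent model in natural units, which is why the two sets of coefficients differ.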
The 2^2 Design: regression model
Factorial Fit: Yield versus Conc., Catalyst

Estimated Effects and Coefficients for Yield (coded units)
Term              Effect   Coef   SE Coef   T   P
Constant
Conc.
Catalyst
Conc.*Catalyst

S =    PRESS = 70.5
R-Sq = 90.30%   R-Sq(pred) = 78.17%   R-Sq(adj) = 86.66%

Analysis of Variance for Yield (coded units)
Source                DF   Seq SS   Adj SS   Adj MS   F   P
Main Effects
2-Way Interactions
Residual Error
  Pure Error
Total
The 2^2 Design: regression model
Estimated Coefficients for Yield using data in uncoded units
Term              Coef
Constant
Conc.
Catalyst
Conc.*Catalyst

Regression model (without interaction)
Estimated Coefficients for Yield using data in uncoded units
Term              Coef
Constant
Conc.
Catalyst
The 2^2 Design: response surface (note: the catalyst axis is reversed relative to the figure in the textbook)
The 2^3 Design
- Three factors, each at two levels
- Eight treatment combinations
The 2^3 Design: design matrix, or geometric notation
The 2^3 Design: algebraic signs
The 2^3 Design: Properties of the Table
- Except for column I, every column has an equal number of + and - signs
- The sum of the products of signs in any two columns is zero
- Multiplying any column by I leaves that column unchanged (I is the identity element)
- The product of any two columns yields a column in the table
- The design is therefore orthogonal; orthogonality is an important property shared by all factorial designs
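These table properties can be checked numerically. The sketch below builds the 2^3 sign table in standard order and tests each property; the helper `column` and the variable names are illustrative choices, not part of the slides.

```python
from itertools import combinations, product

# Runs in standard order (1), a, b, ab, c, ac, bc, abc: A changes fastest.
runs = [(a, b, c) for c, b, a in product((-1, 1), repeat=3)]

def column(label):
    """Sign column for an effect label such as 'A', 'BC' or 'ABC'; 'I' is all +1."""
    index = {"A": 0, "B": 1, "C": 2}
    col = []
    for run in runs:
        sign = 1
        for letter in label:
            if letter != "I":
                sign *= run[index[letter]]
        col.append(sign)
    return col

labels = ["A", "B", "C", "AB", "AC", "BC", "ABC"]
table = {label: column(label) for label in labels}
table["I"] = column("I")

# Property 1: every column except I has as many + signs as - signs.
balanced = all(sum(table[label]) == 0 for label in labels)

# Property 2: the sum of products of signs in any two columns is zero.
orthogonal = all(
    sum(x * y for x, y in zip(table[p], table[q])) == 0
    for p, q in combinations(labels, 2)
)

# Property 3: multiplying by I leaves a column unchanged.
identity_ok = [x * y for x, y in zip(table["I"], table["A"])] == table["A"]

# Property 4: the product of two columns is another column, e.g. AB x BC = AC.
ab_times_bc = [x * y for x, y in zip(table["AB"], table["BC"])]

print(balanced, orthogonal, identity_ok, ab_times_bc == table["AC"])
```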
The 2^3 Design: example
- Nitride etch process
- Factors: gap, gas flow, and RF power
The 2^3 Design: example, full model

Estimated Effects and Coefficients for Etch Rate (coded units)
Term                  Effect   Coef   SE Coef   T   P
Constant
Gap
Gas Flow
Power
Gap*Gas Flow
Gap*Power
Gas Flow*Power
Gap*Gas Flow*Power

S =    PRESS =
R-Sq = 96.61%   R-Sq(pred) = 86.44%   R-Sq(adj) = 93.64%

Analysis of Variance for Etch Rate (coded units)
Source                DF   Seq SS   Adj SS   Adj MS   F   P
Main Effects
2-Way Interactions
3-Way Interactions
Residual Error
  Pure Error
Total
The 2^3 Design: example, reduced model
Factorial Fit: Etch Rate versus Gap, Power

Estimated Effects and Coefficients for Etch Rate (coded units)
Term         Effect   Coef   SE Coef   T   P
Constant
Gap
Power
Gap*Power

S =    PRESS =
R-Sq = 96.08%   R-Sq(pred) = 93.02%   R-Sq(adj) = 95.09%

Analysis of Variance for Etch Rate (coded units)
Source                DF   Seq SS   Adj SS   Adj MS   F   P
Main Effects
2-Way Interactions
Residual Error
  Pure Error
Total
The 2^3 Design: example, model summary statistics for the reduced model
- R^2 and adjusted R^2
- R^2 for prediction (based on PRESS)
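The three summary statistics follow from the sums of squares by the standard definitions R^2 = 1 - SS_E/SS_T, adjusted R^2 = 1 - (SS_E/df_E)/(SS_T/df_T), and R^2 for prediction = 1 - PRESS/SS_T. The sketch below uses hypothetical function names. The consistency check assumes 16 runs (two replicates of the 2^3, which the slides do not state for this example) and 4 model parameters.

```python
def model_summary(ss_total, ss_error, press, n_obs, n_params):
    """Return (R^2, adjusted R^2, R^2 for prediction) from sums of squares."""
    r2 = 1 - ss_error / ss_total
    r2_adj = 1 - (ss_error / (n_obs - n_params)) / (ss_total / (n_obs - 1))
    r2_pred = 1 - press / ss_total
    return r2, r2_adj, r2_pred

def adjusted_from_r2(r2, n_obs, n_params):
    """Adjusted R^2 expressed through R^2 (same definition, rearranged)."""
    return 1 - (n_obs - 1) / (n_obs - n_params) * (1 - r2)

# Consistency check against the reduced-model output (R-Sq = 96.08%),
# assuming n_obs = 16 and n_params = 4 (constant, Gap, Power, Gap*Power).
adj = adjusted_from_r2(0.9608, n_obs=16, n_params=4)
print(round(100 * adj, 2))

# Hypothetical sums of squares, for illustration only.
r2, r2_adj, r2_pred = model_summary(100.0, 10.0, 20.0, n_obs=16, n_params=4)
print(r2, r2_adj, r2_pred)
```

Under those assumptions the computed value agrees with the reported R-Sq(adj) = 95.09% up to rounding of the reported R-Sq.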
The Regression Model
Cube Plot of Ranges
- What do the large ranges when gap and power are at the high level tell you?
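The General 2^k Factorial Design
- There will be k main effects, $\binom{k}{2}$ two-factor interactions, $\binom{k}{3}$ three-factor interactions, ..., and one k-factor interaction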
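The number of effects of each order in a 2^k design is a binomial coefficient, and the counts sum to 2^k - 1. A short sketch (the function name `effect_counts` is illustrative):

```python
from math import comb

def effect_counts(k):
    """Number of effects of each order in a 2^k design (order 1 = main effects)."""
    return {order: comb(k, order) for order in range(1, k + 1)}

counts = effect_counts(4)
total = sum(counts.values())
print(counts)  # main effects, two-, three- and four-factor interactions
print(total)   # equals 2**4 - 1
```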
The General 2^k Factorial Design: statistical analysis
Unreplicated 2^k Factorial Designs
- These are 2^k factorial designs with one observation at each corner of the "cube"
- An unreplicated 2^k factorial design is also sometimes called a "single replicate" of the 2^k
- These designs are very widely used
- Risks: if there is only one observation at each corner, is there a chance of unusual response observations spoiling the results?
- Modeling "noise"?
Unreplicated 2^k Factorial Designs
- If the factor levels are spaced too closely, the chance increases that the noise will overwhelm the signal in the data
- More aggressive spacing is usually best
Unreplicated 2^k Factorial Designs
- Lack of replication causes potential problems in statistical testing
  - Replication admits an estimate of "pure error" (a better phrase is an internal estimate of error)
  - With no replication, fitting the full model results in zero degrees of freedom for error
- Potential solutions to this problem
  - Pooling high-order interactions to estimate error
  - Normal probability plotting of effects (Daniel, 1959)
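The normal probability plotting approach can be sketched as follows: estimate every effect from the single replicate, sort the estimates, and pair each with a normal quantile. Effects that fall far from the straight line through the bulk of the points are candidates for being real. The plotting position (i - 0.5)/m is one common convention and is an assumption here; the function names and the response values are hypothetical.

```python
from itertools import combinations, product
from math import prod
from statistics import NormalDist

def effect_estimates(y, k):
    """Effect estimates for a single-replicate 2^k design; y in standard order."""
    # Reversing each tuple makes the first factor (A) change fastest.
    runs = [tuple(reversed(r)) for r in product((-1, 1), repeat=k)]
    letters = "ABCDEFGH"[:k]
    effects = {}
    for order in range(1, k + 1):
        for idx in combinations(range(k), order):
            contrast = sum(prod(run[i] for i in idx) * yi for run, yi in zip(runs, y))
            effects["".join(letters[i] for i in idx)] = contrast / 2 ** (k - 1)
    return effects

def normal_scores(effects):
    """Pair each effect, sorted ascending, with the normal quantile at (i - 0.5)/m."""
    ordered = sorted(effects.items(), key=lambda item: item[1])
    m = len(ordered)
    nd = NormalDist()
    return [(name, est, nd.inv_cdf((i - 0.5) / m))
            for i, (name, est) in enumerate(ordered, start=1)]

# Hypothetical 2^3 responses in standard order: (1), a, b, ab, c, ac, bc, abc.
y = [10, 20, 11, 21, 10, 20, 11, 21]
effects = effect_estimates(y, k=3)
scores = normal_scores(effects)
for name, estimate, quantile in scores:
    print(f"{name:>3} {estimate:6.2f} {quantile:6.2f}")
```

Plotting the estimate against the quantile (with any plotting library) gives the normal probability plot of effects.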
Unreplicated 2^k Factorial Designs: example
- A 2^4 factorial was used to investigate the effects of four factors on the filtration rate of a resin
- The factors are A = temperature, B = pressure, C = mole ratio, D = stirring rate
- The experiment was performed in a pilot plant
Unreplicated 2^k Factorial Designs: example, full model
Unreplicated 2^k Factorial Designs: example, reduced model
Factorial Fit: Filtration versus Temperature, Conc., Stir Rate

Estimated Effects and Coefficients for Filtration (coded units)
Term                     Effect   Coef   SE Coef   T   P
Constant
Temperature
Conc.
Stir Rate
Temperature*Conc.
Temperature*Stir Rate

S =    PRESS =
R-Sq = 96.60%   R-Sq(pred) = 91.28%   R-Sq(adj) = 94.89%

Analysis of Variance for Filtration (coded units)
Source                DF   Seq SS   Adj SS   Adj MS   F   P
Main Effects
2-Way Interactions
Residual Error
  Lack of Fit
  Pure Error
Total
Unreplicated 2^k Factorial Designs: example, design projection
- Since factor B is negligible, the experiment can be interpreted as a 2^3 factorial design in factors A, C, and D with 2 replicates
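The projection property is easy to verify by enumeration: dropping the B column from the 16 runs of the 2^4 leaves each of the 8 combinations of (A, C, D) exactly twice. A minimal sketch with illustrative variable names:

```python
from collections import Counter
from itertools import product

# All 16 runs of the 2^4 design as (A, B, C, D) in coded units.
runs = list(product((-1, 1), repeat=4))

# Project onto factors A, C, D by ignoring the level of B.
projected = Counter((a, c, d) for a, b, c, d in runs)

print(len(projected))           # distinct (A, C, D) combinations
print(set(projected.values()))  # replicates per combination
```

The two runs that share an (A, C, D) combination differ only in the level of B, so they act as replicates only to the extent that B and its interactions really are negligible.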
Unreplicated 2^k Factorial Designs: example, design projection
Factorial Fit: Filtration versus Temperature, Conc., Stir Rate

Estimated Effects and Coefficients for Filtration (coded units)
Term                           Effect   Coef   SE Coef   T   P
Constant
Temperature
Conc.
Stir Rate
Temperature*Conc.
Temperature*Stir Rate
Conc.*Stir Rate
Temperature*Conc.*Stir Rate

S =    PRESS = 718
R-Sq = 96.87%   R-Sq(pred) = 87.47%   R-Sq(adj) = 94.13%

Analysis of Variance for Filtration (coded units)
Source                DF   Seq SS   Adj SS   Adj MS   F   P
Main Effects
2-Way Interactions
3-Way Interactions
Residual Error
  Pure Error
Total
Dealing with Outliers
- Replace with an estimate
  - Make the highest-order interaction zero
  - In this case, estimate cd such that ABCD = 0
- Analyze only the data you have
  - Now the design is not orthogonal
  - Consequences?
Duplicate Measurements on the Response
- Four wafers are stacked in the furnace
- Four factors: temperature, time, gas flow, and pressure
- Response: thickness
- The four measurements are treated as duplicates, not replicates
- Use the average as the response
Duplicate Measurements on the Response
- Stat > DOE > Factorial > Pre-process Response for Analyze
- Stat > DOE > Factorial > Analyze Factorial Design
Duplicate Measurements on the Response
Factorial Fit: average versus Temperature, Time, Pressure

Estimated Effects and Coefficients for average (coded units)
Term                     Effect   Coef   SE Coef   T   P
Constant
Temperature
Time
Pressure
Temperature*Time
Temperature*Pressure

S =    PRESS =
R-Sq = 98.39%   R-Sq(pred) = 95.88%   R-Sq(adj) = 97.59%

Analysis of Variance for average (coded units)
Source                DF   Seq SS   Adj SS   Adj MS   F   P
Main Effects
2-Way Interactions
Residual Error
  Lack of Fit
  Pure Error
Total
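The 2^k design and design optimality
- The model parameter estimates in a 2^k design (and the effect estimates) are least squares estimates
- For example, for a 2^2 design the model is
  $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \varepsilon$
- The four observations from a 2^2 design are
  $(1) = \beta_0 + \beta_1(-1) + \beta_2(-1) + \beta_{12}(+1) + \varepsilon_1$
  $a = \beta_0 + \beta_1(+1) + \beta_2(-1) + \beta_{12}(-1) + \varepsilon_2$
  $b = \beta_0 + \beta_1(-1) + \beta_2(+1) + \beta_{12}(-1) + \varepsilon_3$
  $ab = \beta_0 + \beta_1(+1) + \beta_2(+1) + \beta_{12}(+1) + \varepsilon_4$
- In matrix form: $y = X\beta + \varepsilon$, with $y = [(1), a, b, ab]'$, $\beta = [\beta_0, \beta_1, \beta_2, \beta_{12}]'$, and
  $X = \begin{bmatrix} 1 & -1 & -1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & 1 & 1 \end{bmatrix}$
- The least squares estimate is $\hat\beta = (X'X)^{-1}X'y$, and here $X'X = 4I_4$
- The matrix $X'X$ is diagonal, a consequence of the orthogonal design
- The elements of $X'y$ are the "usual" contrasts:
  $\hat\beta = \frac{1}{4}\begin{bmatrix} (1) + a + b + ab \\ a + ab - b - (1) \\ b + ab - a - (1) \\ (1) - a - b + ab \end{bmatrix}$
- The regression coefficient estimates are exactly half of the "usual" effect estimates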
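The least squares calculation for the 2^2 design can be reproduced without a linear algebra library, because X'X is diagonal and each coefficient is simply an element of X'y divided by the matching diagonal element. This is a sketch under that assumption; the four response values are hypothetical.

```python
# Model matrix for y = b0 + b1*x1 + b2*x2 + b12*x1*x2, runs in standard order.
X = [
    [1, -1, -1, +1],  # (1)
    [1, +1, -1, -1],  # a
    [1, -1, +1, -1],  # b
    [1, +1, +1, +1],  # ab
]
y = [28.0, 36.0, 18.0, 31.0]  # hypothetical observations (1), a, b, ab

p = len(X[0])
XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
Xty = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(p)]

# Orthogonal columns make X'X diagonal, so (X'X)^-1 X'y is an elementwise division.
is_diagonal = all(XtX[i][j] == 0 for i in range(p) for j in range(p) if i != j)
beta_hat = [Xty[j] / XtX[j][j] for j in range(p)]

# "Usual" effect estimate of A: average at high A minus average at low A.
effect_A = (y[1] + y[3]) / 2 - (y[0] + y[2]) / 2

print(XtX)
print(beta_hat)
print(effect_A, beta_hat[1])
```

The coefficient of x_1 is half the A effect because the effect measures a two-unit change in x_1 (from -1 to +1), while the coefficient measures a one-unit change.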
The 2^k design and design optimality
- The matrix $X'X$ has interesting and useful properties:
  - $V(\hat\beta) = \sigma^2 \times (\text{diagonal element of } (X'X)^{-1}) = \sigma^2/4$: the minimum possible value for a four-run design
  - $|X'X| = 256$: the maximum possible value for a four-run design
- Notice that these results depend on both the design that you have chosen and the model
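The 2^k design and design optimality
- The 2^2 design is called a D-optimal design. In fact, every 2^k design is a D-optimal design for fitting the first-order model with interaction.
- Consider the variance of the predicted response in the 2^2 design:
  $V[\hat{y}(x_1, x_2)] = \sigma^2 x'(X'X)^{-1}x = \frac{\sigma^2}{4}\left(1 + x_1^2 + x_2^2 + x_1^2 x_2^2\right)$, where $x' = [1, x_1, x_2, x_1 x_2]$
- The maximum prediction variance over the design region is $\sigma^2$, attained at the corners $x_1 = \pm 1$, $x_2 = \pm 1$
- The 2^2 design is called a G-optimal design because it minimizes the maximum prediction variance. In fact, every 2^k design is a G-optimal design for fitting the first-order model with interaction.
- The 2^2 design is called an I-optimal design because it gives the smallest possible value of the average prediction variance, which here is $\frac{1}{4}\int_{-1}^{1}\int_{-1}^{1} V[\hat{y}(x_1, x_2)]\,dx_1\,dx_2 = \frac{4\sigma^2}{9}$. In fact, every 2^k design is an I-optimal design for fitting the first-order model with interaction.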
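The quantities behind the D, G, and I criteria can be computed directly for the 2^2 design. The sketch below takes sigma^2 = 1, uses the diagonal structure of X'X to get the determinant and the prediction variance, and approximates the average prediction variance with a midpoint rule. The grid sizes and all names are arbitrary illustrative choices; the code evaluates the criteria for this one design and does not by itself prove optimality.

```python
from math import prod

# Model matrix of the 2^2 design for the model with terms 1, x1, x2, x1*x2.
X = [
    [1, -1, -1, +1],
    [1, +1, -1, -1],
    [1, -1, +1, -1],
    [1, +1, +1, +1],
]
p = len(X[0])
XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
diag = [XtX[i][i] for i in range(p)]

# D criterion: determinant of X'X (product of the diagonal, since X'X is diagonal).
det_XtX = prod(diag)

def scaled_pred_var(x1, x2):
    """V[y_hat(x1, x2)] / sigma^2 = x'(X'X)^-1 x for a diagonal X'X."""
    f = [1.0, x1, x2, x1 * x2]
    return sum(fj * fj / d for fj, d in zip(f, diag))

# G criterion: maximum scaled prediction variance over a grid on [-1, 1]^2.
edge = [-1 + 2 * i / 20 for i in range(21)]
max_var = max(scaled_pred_var(a, b) for a in edge for b in edge)

# I criterion: average scaled prediction variance (midpoint rule on an m x m grid).
m = 200
mid = [-1 + (2 * i + 1) / m for i in range(m)]
avg_var = sum(scaled_pred_var(a, b) for a in mid for b in mid) / m ** 2

print(det_XtX, max_var, round(avg_var, 4))
```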
The 2^k design and design optimality
- Minitab provides a "Select Optimal Design" function for cases where you have a full factorial design and want to reduce it to a partial (fractional) design
- It provides only D-optimal designs
- One must first have a full factorial design and then choose the number of design points to use
The 2^k design and design optimality
- These results give us some assurance that these designs are "good" designs in some general ways
- Factorial designs typically share some (most) of these properties
- There are excellent computer routines for finding optimal designs
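Addition of Center Points to a 2^k Design
- Based on the idea of replicating some of the runs in a factorial design
- Runs at the center provide an estimate of error and allow the experimenter to distinguish between two possible models:
  - First-order model with interaction: $y = \beta_0 + \sum_{j=1}^{k} \beta_j x_j + \sum_{i<j} \beta_{ij} x_i x_j + \varepsilon$
  - Second-order model: $y = \beta_0 + \sum_{j=1}^{k} \beta_j x_j + \sum_{i<j} \beta_{ij} x_i x_j + \sum_{j=1}^{k} \beta_{jj} x_j^2 + \varepsilon$, where the $\beta_{jj}$ are the pure quadratic effects
- When adding center points, we assume that the k factors are quantitative
- Example: a 2^2 design with center points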
Addition of Center Points to a 2^k Design
- Five points: (-,-), (-,+), (+,-), (+,+), and (0,0), with n_F = 4 and n_C = 4
- Let $\bar{y}_F$ be the average of the four runs at the four factorial points and let $\bar{y}_C$ be the average of the n_C runs at the center point
- If the difference $\bar{y}_F - \bar{y}_C$ is small, the center points lie on or near the plane passing through the factorial points and there is no quadratic effect
- The hypotheses are: $H_0: \sum_{j=1}^{k} \beta_{jj} = 0$ versus $H_1: \sum_{j=1}^{k} \beta_{jj} \neq 0$
- Test statistic: $SS_{\text{Pure quadratic}} = \dfrac{n_F n_C (\bar{y}_F - \bar{y}_C)^2}{n_F + n_C}$, with one degree of freedom
Addition of Center Points to a 2^k Design: example
- Example 6.2 is a 2^4 factorial
- Adding center points at x1 = x2 = x3 = x4 = 0 gives four additional responses (filtration rates): 73, 75, 66, 69
- So $\bar{y}_C = 70.75$ and $\bar{y}_F = 70.06$
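The curvature check for this example can be sketched as below. The four center-point responses and the factorial average 70.06 are the values quoted on the slide, and n_F = 16 follows from the single replicate of the 2^4. Using the sample variance of the center runs as the error mean square (3 degrees of freedom) is an assumption of this sketch, as are the variable names.

```python
from statistics import mean, variance

center_runs = [73, 75, 66, 69]  # responses at x1 = x2 = x3 = x4 = 0
y_bar_f = 70.06                 # average of the 16 factorial runs (as quoted)
n_f, n_c = 16, len(center_runs)

y_bar_c = mean(center_runs)

# Single-degree-of-freedom sum of squares for pure quadratic curvature.
ss_pure_quadratic = n_f * n_c * (y_bar_f - y_bar_c) ** 2 / (n_f + n_c)

# Error estimate from the center points alone (assumption: n_c - 1 = 3 df).
ms_error = variance(center_runs)

f_ratio = ss_pure_quadratic / ms_error
print(y_bar_c, round(ss_pure_quadratic, 4), ms_error, round(f_ratio, 4))
```

An F ratio well below 1 gives no indication of curvature, consistent with the center points lying close to the plane through the factorial points.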
Addition of Center Points to a 2^k Design: example
Term                                      Effect   Coef   SE Coef   T   P
Constant
Temperature
Pressure
Conc.
Stir Rate
Temperature*Pressure
Temperature*Conc.
Temperature*Stir Rate
Pressure*Conc.
Pressure*Stir Rate
Conc.*Stir Rate
Temperature*Pressure*Conc.
Temperature*Pressure*Stir Rate
Temperature*Conc.*Stir Rate
Pressure*Conc.*Stir Rate
Temperature*Pressure*Conc.*Stir Rate
Ct Pt

Analysis of Variance for Filtration (coded units)
Source                DF   Seq SS   Adj SS   Adj MS   F   P
Main Effects
2-Way Interactions
3-Way Interactions
4-Way Interactions
Curvature
Residual Error
  Pure Error
Total
Addition of Center Points to a 2^k Design
- If curvature is significant, augment the design with axial runs to create a central composite design (CCD)
- The CCD is a very effective design for fitting a second-order response surface model
Addition of Center Points to a 2^k Design
- Use current operating conditions as the center point
- Check for "abnormal" conditions during the time the experiment was conducted
- Check for time trends
- Use center points as the first few runs when there is little or no information available about the magnitude of error
Center Points and Qualitative Factors