1 Design of Experiments
2 DESIGN OF EXPERIMENTS
Purposeful changes of the inputs (factors) to a process in order to observe corresponding changes in the output (response).
[Diagram: Inputs -> Process -> Outputs]
Douglas Montgomery, Design and Analysis of Experiments
3 Why use DOE?
A basis of action -- allows purposeful changes.
An analytic study -- one in which action will be taken on a cause-and-effect system to improve performance of a product or process in the future.
Follows the scientific approach to problem solving.
Provides a way to measure natural variation.
Permits the clear analysis of complex effects.
Most efficient way to derive the required information at the least expenditure of resources.
Moen, Nolan and Provost, Improving Quality Through Planned Experimentation
4 Interactions
Varying factors together vs. one at a time.
[Figure: illustration labeled BUCK and DOE]
George Box, Do Interactions Really Matter, Quality Engineering, 1990.
5 [Figure: BUCK and DOE illustration, continued] Voila!
George Box, Do Interactions Really Matter, Quality Engineering, 1990.
6 Industry Example
Experiment run at SKF -- the largest producer of rolling bearings in the world.
Looked at three factors: heat treatment, outer ring osculation and cage design.
Results:
  choice of cage design did not matter (contrary to previously accepted folklore -- considerable savings)
  life of bearing increased fivefold if osculation and heat treatment are increased together -- saved millions of dollars!
George Box, Do Interactions Really Matter, Quality Engineering, 1990.
7 Bearings like this have been made for decades. Why did it take so long to discover this improvement? One factor at a time vs. interaction effects!
[Figure: cube plot with axes Osculation, Cage and Heat]
George Box, Do Interactions Really Matter, Quality Engineering, 1990.
8 The Power of Interactions!
[Figure: plot with axes Osculation and Heat]
George Box, Do Interactions Really Matter, Quality Engineering, 1990.
9 2^2 Design Example
Consider an investigation into the effect of the concentration of the reactant and the amount of catalyst on the reaction time of a chemical process.
Factor                 L        H
reactant (factor A)    15%      25%
catalyst (factor B)    1 bag    2 bags
Douglas Montgomery, Design and Analysis of Experiments
10 Design Matrix for 2^2
Main effects    Interaction
A      B        AB             Total    Average
-      -        +
+      -        -
-      +        -
+      +        +
11 Replicates
Factor Settings    I    II    III    Total
A- B-
A+ B-
A- B+
A+ B+
[Table cell values not recoverable from the slide text]
Douglas Montgomery, Design and Analysis of Experiments
12 An effect is the difference in the average response at one level of the factor versus the other level of the factor.
[Figure: response at the - and + levels of A]
A effect = ([sum of totals at A+] - [sum of totals at A-]) / 2(3) = 8.33
Douglas Montgomery, Design and Analysis of Experiments
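13 Use a matrix to find the effects of each factor, including the interaction effect between the two factors.
[Table: design matrix with columns A, B, AB, Total and Average, plus rows Avg -, Avg + and Effect; the one surviving entry in the Effect row is 8.4, which matches the A effect of 8.33 up to rounding]
Douglas Montgomery, Design and Analysis of Experiments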
14 Completing the matrix with the effect calculations:
[Table: the same matrix with the Avg -, Avg + and Effect rows filled in for A, B and AB; cell values not recoverable]
Douglas Montgomery, Design and Analysis of Experiments
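The sign-table calculation described on the preceding slides can be sketched in Python. The replicate values are an assumption: they are the ones recalled from the cited Montgomery textbook example, and although they agree with the slides' grand total of 330 and A effect of 8.33, the slide text itself did not preserve them. The names `SIGNS`, `REPLICATES` and `effects` are illustrative only.

```python
# Sign table for a 2^2 design in standard order: A-B-, A+B-, A-B+, A+B+.
SIGNS = {
    "A":  [-1, +1, -1, +1],
    "B":  [-1, -1, +1, +1],
    "AB": [+1, -1, -1, +1],   # product of the A and B columns
}

# ASSUMPTION: replicate values recalled from the cited textbook example;
# they are not present in the slide text.
REPLICATES = [
    [28, 25, 27],  # A-, B-
    [36, 32, 32],  # A+, B-
    [18, 19, 23],  # A-, B+
    [31, 30, 29],  # A+, B+
]


def effects(replicates, signs=SIGNS):
    """Effect = (average response at +) - (average response at -)."""
    n = len(replicates[0])                  # replicates per run
    totals = [sum(run) for run in replicates]
    runs_per_level = len(replicates) // 2   # runs at each level of a column
    result = {}
    for name, column in signs.items():
        contrast = sum(s * t for s, t in zip(column, totals))
        result[name] = contrast / (runs_per_level * n)
    return result


EFFECTS = effects(REPLICATES)
for name, value in EFFECTS.items():
    print(f"{name:>2} effect = {value:.2f}")
```

The divisor `runs_per_level * n` is the 2(3) that appears in the A effect formula: two runs at each level, three replicates per run.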
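15 Dot Diagram
[Figure: dot diagram of the effects B, AB and A]
Douglas Montgomery, Design and Analysis of Experiments
16 Response Plots
[Figure: response plots for factor A and factor B]
Douglas Montgomery, Design and Analysis of Experiments
17 Interaction Response Plot
[Figure: interaction response plot of factors A and B at their - and + levels]
Douglas Montgomery, Design and Analysis of Experiments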
18 Normal Probability Plots
Effects are the differences between two averages.
As we know, the distribution of averages is approximately normal.
A normal probability plot (NPP) can be used to identify the effects that are different from noise.
Soren Bisgaard, A Practical Introduction to Experimental Design
19 Construction of NPP
Can be constructed with effects on the horizontal axis and cumulative percentages on the vertical axis -- but this requires normal probability paper.
Can also be constructed using the inverse standard normal of the plotting point, (i - 0.5) / n.
Look for effects that fall away from the plotted ‘vertical’ reference line.
Soren Bisgaard, A Practical Introduction to Experimental Design
20 Steps in constructing NPP
1. Compute effects.
2. Order effects from smallest to largest.
3. Let i be the order number (1 to n).
4. Calculate the probability plotting position of the ordered effect using the formula p = (i - 0.5) / n.
5. Using a standard normal table, determine the Z value corresponding to each left tail probability of step 4.
6. Plot the effects on the horizontal axis and Z on the vertical.
7. Fit a line through the most points.
8. Those ‘off the line’ are significant effects.
Soren Bisgaard, A Practical Introduction to Experimental Design
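Steps 2 through 5 can be sketched with the Python standard library, using `statistics.NormalDist().inv_cdf` in place of a printed normal table. The three effect values are the ones quoted on the slides (8.33, -5 and 1.7); the function name `npp_points` is illustrative only, and the plotting itself (steps 6 to 8) is left out.

```python
from statistics import NormalDist


def npp_points(effects):
    """Return (label, effect, p, z) rows ordered from smallest to largest effect."""
    ordered = sorted(effects.items(), key=lambda item: item[1])   # step 2
    n = len(ordered)
    rows = []
    for i, (label, effect) in enumerate(ordered, start=1):         # step 3
        p = (i - 0.5) / n                                           # step 4
        z = NormalDist().inv_cdf(p)                                 # step 5
        rows.append((label, effect, p, z))
    return rows


EFFECTS = {"A": 8.33, "B": -5.0, "AB": 1.7}   # effect values quoted on the slides
ROWS = npp_points(EFFECTS)
for label, effect, p, z in ROWS:
    print(f"{label:>2}  effect={effect:6.2f}  p={p:.3f}  z={z:+.3f}")
```

With only three effects the plot is a thin illustration; the method earns its keep with larger designs, such as the seven effects of a 2^3 experiment.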
22 Plot a reference line through the majority of points. Look for effects which are off this line.
23 Prediction Equation
The ‘intercept’ in the equation is the overall average of all observations (330 / 12 = 27.5).
The coefficients of the factors in the model are 1/2 the effect.
Y = 27.5 + 8.33/2 A - 5/2 B + 1.7/2 AB
or
Y = 27.5 + 4.17 A - 2.5 B + 0.85 AB
note: A and B will be values between -1 and +1.
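As a rough sketch, the prediction equation can be written as a small Python function. The intercept is taken as 330 / 12 (the sum and count of observations quoted on the total sum of squares slide), and the effects are the values quoted on the slides; the function name `predict` and its parameter names are illustrative only.

```python
def predict(a, b, intercept=330 / 12, effect_a=8.33, effect_b=-5.0, effect_ab=1.7):
    """Predicted response for coded factor levels a and b, each in [-1, +1].

    Each coefficient is half the corresponding effect.
    """
    return (intercept
            + (effect_a / 2) * a
            + (effect_b / 2) * b
            + (effect_ab / 2) * a * b)


CENTER = predict(0, 0)          # both factors at the midpoint of their range
HIGH_A_LOW_B = predict(+1, -1)  # reactant at 25%, catalyst at 1 bag
print(f"center: {CENTER:.2f}")
print(f"A high, B low: {HIGH_A_LOW_B:.2f}")
```

At the center of the design every factor term vanishes, so the prediction is just the overall average.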
24 Analysis of Variance
Source of Variation    Sum of Squares    Degrees of Freedom    Mean Square    F
A *
B *
AB
Error
Total
* = significant at 1% (see F table)
[Table cell values not recoverable from the slide text]
25 Calculating SS, df and MS for Effects and Interactions
Shown for row A of the ANOVA table:
Sum of Squares: SS = Effect^2 x n = Effect^2 x 3, where n = replicates
Degrees of Freedom: always 1 for this type of design
Mean Square: SS / df
Use this same process for A, B and AB.
26 Calculating total sum of squares and total degrees of freedom
Total sum of squares: found by adding up every squared observation and then subtracting what is called a correction factor (sum all the observations, square this amount, then divide by the number of observations).
SST = (sum of squared observations) - (330^2 / 12)
Total df = n - 1 = 12 - 1 = 11
27 Calculating error sum of squares, df and mean square
Error sum of squares: found by subtraction: Total SS - SS A - SS B - SS AB
Error df: found by subtraction: Total df - A df - B df - AB df = 11 - 1 - 1 - 1 = 8
Error mean square: SS / df
28 Calculating F ratios
F ratios: F = MS (A or B or AB) / MS (error), where MS (error) = 3.92
A and B are flagged * in the ANOVA table.
Compare to F table.
29 Interpreting F ratios
F table at num df = 1 and denom df = 8 [tabulated values not recoverable from the slide text]
The F ratios confirm that factors A and B are significant at the 1% level.
The F ratio for AB shows there is not a significant interaction.
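The ANOVA arithmetic from the preceding slides (effect sums of squares, total by correction factor, error by subtraction, then F ratios) can be sketched as follows. Two things are assumptions: the replicate values, which are recalled from the cited Montgomery textbook example because the slide text did not preserve them, and the constant 11.26, which is the standard tabulated 1% critical value of F at 1 and 8 degrees of freedom. All function and variable names are illustrative.

```python
SIGNS = {
    "A":  [-1, +1, -1, +1],
    "B":  [-1, -1, +1, +1],
    "AB": [+1, -1, -1, +1],
}

# ASSUMPTION: replicate values recalled from the cited textbook example.
REPLICATES = [
    [28, 25, 27],  # A-, B-
    [36, 32, 32],  # A+, B-
    [18, 19, 23],  # A-, B+
    [31, 30, 29],  # A+, B+
]

F_CRIT_1PCT = 11.26  # tabulated F(0.01; 1, 8)


def anova_2x2(replicates, signs=SIGNS):
    n = len(replicates[0])                               # replicates per run
    observations = [y for run in replicates for y in run]
    count = len(observations)
    totals = [sum(run) for run in replicates]

    # Total SS: sum of squared observations minus the correction factor.
    correction = sum(observations) ** 2 / count
    ss_total = sum(y * y for y in observations) - correction

    table = {}
    for name, column in signs.items():
        effect = sum(s * t for s, t in zip(column, totals)) / (2 * n)
        ss = effect ** 2 * n                             # SS = Effect^2 x n
        table[name] = {"ss": ss, "df": 1, "ms": ss / 1}

    # Error terms are found by subtraction.
    ss_error = ss_total - sum(row["ss"] for row in table.values())
    df_error = (count - 1) - len(signs)
    ms_error = ss_error / df_error

    for name in signs:
        table[name]["f"] = table[name]["ms"] / ms_error
    table["Error"] = {"ss": ss_error, "df": df_error, "ms": ms_error}
    table["Total"] = {"ss": ss_total, "df": count - 1}
    return table


TABLE = anova_2x2(REPLICATES)
for source, row in TABLE.items():
    f_ratio = row.get("f")
    flag = " *" if f_ratio is not None and f_ratio > F_CRIT_1PCT else ""
    f_text = f"{f_ratio:7.2f}" if f_ratio is not None else "       "
    print(f"{source:<6} SS={row['ss']:8.2f}  df={row['df']:2d}  F={f_text}{flag}")
```

Under these assumed values the error mean square comes out near the 3.92 quoted on the slide, and only A and B clear the 1% critical value.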
30 Exercise
You will conduct a 2^3 experiment with 2 replicates.
Factors:           L    H
A -- Tower         3    5
B -- Front Stop    0    2
C -- Back Stop     5    7
31 Requirements:
1. Collect data -- total of 16 observations (random order).
2. Fill in matrix and compute effects.
3. Put averages on a cube plot.
4. Plot effects on dot plot and normal probability plot.
5. Create appropriate response plots for significant interactions and main effects.
6. Interpret results and make recommendations to management.
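One way to produce the randomized run order for requirement 1 is sketched below. It assumes the factor levels read as Tower 3/5, Front Stop 0/2 and Back Stop 5/7 from the exercise slide; the function name `design_runs` and the seed value are illustrative only.

```python
import itertools
import random

# Low and high settings for each factor, as read from the exercise slide.
FACTORS = {"Tower": (3, 5), "Front Stop": (0, 2), "Back Stop": (5, 7)}


def design_runs(factors, replicates=2, seed=None):
    """List every factor combination `replicates` times, in random order."""
    names = list(factors)
    runs = []
    for levels in itertools.product((0, 1), repeat=len(names)):
        setting = {name: factors[name][level] for name, level in zip(names, levels)}
        runs.extend(dict(setting) for _ in range(replicates))
    random.Random(seed).shuffle(runs)   # randomize the order of all runs
    return runs


RUNS = design_runs(FACTORS, replicates=2, seed=2024)
for order, run in enumerate(RUNS, start=1):
    print(order, run)
```

Shuffling all 16 runs together, rather than running one replicate block after the other, is one reasonable reading of "random order"; fixing the seed only makes the run sheet reproducible.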
32 Design Matrix
33 Cube Plot
34 Response Plots
35 Normal Probability Plot
[Blank plot: Z on the vertical axis, Effect on the horizontal axis]
36 ANOVA table
Source of Variation    Sum of Squares    Degrees of Freedom    Mean Square    F
A
B
C
AB
AC
BC
ABC
Error
Total
note: for 2^3 the SS = effect^2 x 2n
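The note SS = effect^2 x 2n can be checked against the usual contrast form of the sum of squares, contrast^2 / (8n). The sketch below does that for all seven effects of a 2^3 design. The run totals are purely hypothetical, chosen only to make the arithmetic easy to follow; they are not exercise data, and the function name is illustrative.

```python
from math import prod

N_REPLICATES = 2

# HYPOTHETICAL run totals keyed by the (A, B, C) signs; not exercise data.
TOTALS = {
    (-1, -1, -1): 20, (+1, -1, -1): 30,
    (-1, +1, -1): 22, (+1, +1, -1): 32,
    (-1, -1, +1): 24, (+1, -1, +1): 34,
    (-1, +1, +1): 26, (+1, +1, +1): 36,
}

# Which factor columns are multiplied together for each effect.
COLUMNS = {
    "A": (0,), "B": (1,), "C": (2,),
    "AB": (0, 1), "AC": (0, 2), "BC": (1, 2), "ABC": (0, 1, 2),
}


def effects_and_ss(totals, n):
    """Return {label: (effect, ss, ss_from_contrast)} for a 2^3 design."""
    out = {}
    for label, idx in COLUMNS.items():
        contrast = sum(prod(signs[i] for i in idx) * total
                       for signs, total in totals.items())
        effect = contrast / (4 * n)          # 4 runs at each level, n replicates
        ss = effect ** 2 * 2 * n             # the slide's shortcut
        ss_from_contrast = contrast ** 2 / (8 * n)
        out[label] = (effect, ss, ss_from_contrast)
    return out


RESULTS = effects_and_ss(TOTALS, N_REPLICATES)
for label, (effect, ss, ss_check) in RESULTS.items():
    print(f"{label:<3} effect={effect:5.2f}  SS={ss:7.2f}  contrast form={ss_check:7.2f}")
```

Each effect carries 1 degree of freedom, so its mean square equals its sum of squares, as in the 2^2 case.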
37 Why use 2^k designs?
Easy to use and data analysis can be performed using graphical methods.
Relatively few runs required.
2^k designs have been found to meet the majority of the experimental needs of those involved in the improvement of quality.
2^k designs are easy to use in sequential experimentation.
Fractions of the 2^k (fractional factorials) can be used to further reduce the experiment size.
Moen, Nolan and Provost, Improving Quality Through Planned Experimentation
38 A review of the concepts behind Analysis of Variance
39 Analysis of Variance
ANOVA is used to compare the means of two or more populations.
The procedure is based on the spread (variance) between the sample averages of the populations and the spread within each sample.
Possibly the most widely used procedure across disciplines.
40 Example
Consider a cereal manufacturer who wants to evaluate the impact on sales of four package designs. Each of ten stores is randomly assigned to one of the designs and sales data are collected for a given period. This type of design is called a Completely Randomized Design.
41 It’s called analysis of VARIANCE!
Recall that variance is the “almost average” of the squared differences of a set of data around its mean.
For this set of data then, with y1 ... y10 the ten store sales and ybar their overall average, we have:
(y1 - ybar)^2 + (y2 - ybar)^2 + ... + (y10 - ybar)^2 = 304 units of variation
42 Variation
What can account for this variation?
  type of package design (SSB) -- treatment
  everything else (SSE) -- error
The total variation can be expressed in this relationship:
SST = SSB + SSE
43 Why didn’t we sell the same amount of each package type? Why not the same at each store for package 1? (or 2?, or 3?, or 4?) Thousands of extraneous factors!
SSE = [2 squared deviations from the package 1 average]
    + [3 squared deviations from the package 2 average]
    + [3 squared deviations from the package 3 average]
    + [2 squared deviations from the package 4 average]
    = 46 units of variation
44 What number best represents the long-term average for package 1? The package 1 average (just as the package 2 average does for package 2 designs, and so forth).
How do these package averages vary from the overall average? Squaring each package's deviation from the overall average gives:
9 + 25 + 1 + 81 = 116
But we must weight each one by the number of observations in that average -- that gives us...
2(9) + 3(25) + 3(1) + 2(81) = 258
(note: 304 = 258 + 46)
45 OK -- we’ve got some ‘sum of squares’
But this procedure is called ‘analysis of variance’
FACT: variance = sum of squares divided by appropriate df
Let's organize our results so far in an ANOVA table:
Sources of variation    Sum of squares    df    Variance
SSB                     258               3     86
SSE                     46                6     7.67
Total                   304               9
46 degrees of freedom
As the name implies, this is the number of things that are free to vary and still get the same result. For example, if I told you the average of five numbers is 7, you could pick any four numbers, and as long as I pick the fifth I can ensure the average is 7.
Generally speaking, the df will be one less than the number of things being compared. For example,
SSB df = 4 (package designs) - 1 = 3
SSE df = (2 - 1) + (3 - 1) + (3 - 1) + (2 - 1) = 6
Total df = 10 - 1 = 9 (= 3 + 6)
47 A ratio of variances
We next form a ratio of variances: F = 86 / 7.67 = 11.2
We need a reference distribution to evaluate this -- we compare it to the Fisher distribution (F distribution).
A Short Fisher Table
[Table: columns df for Denominator, Degree of Confidence, df for Numerator; values not recoverable from the slide text]
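The whole cereal-package calculation, from sums of squares to the variance ratio, can be sketched from the figures the slides give: SST = 304, SSB = 258, and group sizes of 2, 3, 3 and 2 stores. The constant 9.78 is an addition: it is the standard tabulated 1% critical value of F at 3 and 6 degrees of freedom. The function name is illustrative.

```python
def one_way_anova(ss_between, ss_total, group_sizes):
    """Build the one-way ANOVA quantities from sums of squares and group sizes."""
    ss_error = ss_total - ss_between                     # SST = SSB + SSE
    df_between = len(group_sizes) - 1
    df_error = sum(size - 1 for size in group_sizes)
    ms_between = ss_between / df_between                 # variance = SS / df
    ms_error = ss_error / df_error
    return {
        "ss_error": ss_error,
        "df_between": df_between,
        "df_error": df_error,
        "df_total": sum(group_sizes) - 1,
        "ms_between": ms_between,
        "ms_error": ms_error,
        "f": ms_between / ms_error,
    }


F_CRIT_1PCT = 9.78   # tabulated F(0.01; 3, 6), not taken from the slides

RESULT = one_way_anova(ss_between=258, ss_total=304, group_sizes=[2, 3, 3, 2])
print(f"SSE = {RESULT['ss_error']}, df = {RESULT['df_between']} and {RESULT['df_error']}")
print(f"F = {RESULT['ms_between']:.2f} / {RESULT['ms_error']:.2f} = {RESULT['f']:.2f}")
print("exceeds 1% critical value:", RESULT["f"] > F_CRIT_1PCT)
```

Working from the sums of squares rather than raw sales keeps the sketch faithful to the slides, which do not preserve the individual store figures.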
48 Making a decision
We are really carrying out a hypothesis test.
Our Ho is that all package design means are equal. Ho: mu1 = mu2 = mu3 = mu4
Our Ha is that at least one mean is different from the rest.
We can make two decisions with our data:
1. No difference in means. This is one of those less than 1 in 100 times we would get a value this large.
2. The null is false -- we reject the null and accept Ha.
49 Summary of ANOVA concept 1. Decompose the total sum of squares. 2. Convert sum of squares into variances. 3. Compute variance ratio and compare to F table.
50 Assumptions! 1. Populations being compared are normally distributed -- moderate departures OK -- “robust” in this regard. 2. Variances of the populations are equal -- can be tested -- if this assumption is not met there is “trouble in River City.” 3. Observations are statistically independent (use randomization).