Statistical Design of Experiments


1 Statistical Design of Experiments
SECTION III SINGLE FACTOR EXPERIMENTS Monday, Aug 13, 2007

2 INTRODUCTION TO DESIGN OF EXPERIMENTS (DOE)
Definition:
A systematic procedure for manipulating process variables to guide the search for the optimum of the process
Based on a mathematical model used to approximate the process
Dr. Gary Blau, Sean Han Monday, Aug 13, 2007

3 PURPOSE OF DOE
Understand and quantify process fluctuations or variability
Identify the most important variables affecting the output levels
Maximize profitability and quality of product
Accomplish all of the above in the MINIMAL number of experiments and with LITTLE knowledge of the process

4 WHEN TO USE DOE
Want to develop new knowledge, NOT only want to “make product” or demonstrate
Already know some things about the process, NOT a new, unfamiliar process
Have resources to make several runs, NOT can afford many more runs

5 LIMITATIONS
Yield “black box” polynomial models only
Answer “which” and “how” questions but not “why” questions
Terms/parameters in the model equations lack physical significance
Preclude extrapolation/scale-up

6 PHASE OF A PROJECT: The Experiment
Statement of the problem
Choice of response
Selection of factors that can be controlled and varied
Feasible ranges and choice of levels of these factors (Prior Information)

7 PHASE OF A PROJECT: The Design
Number of experiments
Sequential experimentation
[Diagram labels: Hypothesis, Consequence, Experimentation]
Randomization/Blocking/Repeating
Mathematical model to describe experiment

8 PHASE OF A PROJECT: The Analysis
Data collection and processing
Computation of test statistics
Interpretation of results

9 PURPOSE OF SINGLE FACTOR EXPERIMENTS
Quantify the relationship between a single factor and a single measured or response variable
Compare the relative effectiveness of two or more treatments (levels of the factor)
Estimate the level of the factor that optimizes the response variable

10 DEFINITION OF TERMS
Factor - controllable variable
Level - value of the factor
Treatment - distinct collection of factor levels

11 CONSIDERATIONS IN PLANNING THE EXPERIMENTS
Factor Levels
Replicates
Randomization
Blocking (Restriction on Randomization)

12 FACTOR LEVELS
The range of levels over which a factor is examined is determined by exploratory testing, subjective knowledge and/or the literature (this is your prior knowledge)

13 FACTOR LEVELS FOR QUANTITATIVE FACTORS
For quantitative factors, space the levels far enough apart that effects can be both detected and estimated against the experimental error, but not too far apart.
[Figure: Response versus Level of Factor at the (-) and (+) levels, marking the Effect and the Experimental Error]

14 REPLICATION
Repeat of an entire experiment or run (trial)
Used to define and understand sources of experimental error
Used to test for significance of different levels
Necessary number of replicates: a function of experimental error and the difference to be detected in the response (yield)
Replication is the repetition of the creation of a phenomenon so that the variability associated with the phenomenon can be estimated.
Replications and repeated measurements are dealt with differently in statistical experimental design and analysis.

15 NUMBER OF REPEATS / REPLICATES
The number of replicates, n, needed is a function of the experimental error, σ², the difference to be detected in the response, D, and the degree of confidence you want in the result (α):
[equation not preserved in this transcript]
where Z(α/2) is the Z value at α/2.
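The formula on this slide did not survive in the transcript, so the following is only a sketch under a stated assumption: the common normal-approximation rule n = (Z(α/2)·σ/D)², which picks the smallest n for which the half-width of a confidence interval on a mean is no larger than D. The function name `replicates_needed` is hypothetical, and the slide's own formula may differ (for example by a factor of 2 when two treatment means are compared).

```python
import math
from statistics import NormalDist


def replicates_needed(sigma: float, D: float, alpha: float = 0.05) -> int:
    """Smallest n with Z(alpha/2) * sigma / sqrt(n) <= D.

    Assumed rule, not taken from the slide: n = (Z(alpha/2) * sigma / D) ** 2,
    rounded up to the next whole replicate.
    """
    if sigma <= 0 or D <= 0 or not 0 < alpha < 1:
        raise ValueError("sigma and D must be positive and 0 < alpha < 1")
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    return math.ceil((z * sigma / D) ** 2)


# Example: error standard deviation of 2 units, difference of 1 unit, alpha = 0.05
n = replicates_needed(sigma=2.0, D=1.0, alpha=0.05)
```

Under this assumed rule, halving the difference to be detected roughly quadruples the number of replicates, which is why D should be chosen with care.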

16 RANDOMIZATION
Randomization is conducted in order to minimize the effect of uncontrolled factors (nuisance variables).
Randomization involves randomly allocating the experimental units across the treatment groups. Thus, if the experiment compares a new drug against a standard drug used as a control, the patients should be allocated to the new drug or the control by a random process.

17 RANDOMIZATION
Possible nuisance variables:
Warm-up time for tester
Operator differences
Time of day
Raw material changes
Batch/lot differences
Equipment wear
Systematic process change
Catalyst deactivation

18 RANDOMIZATION
What should be randomized?
Assignment of experimental units to treatments
Order of running experiments
Order of evaluating experimental results
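As an illustration of the first two items, here is a minimal sketch of randomizing a completely randomized design with Python's standard library. The function name `randomize_crd`, the unit labels, and the seed are hypothetical choices made for this example, not part of the original slides.

```python
import random


def randomize_crd(units, treatments, seed=None):
    """Randomly assign units to treatments (equal replication) and pick a run order."""
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    rng = random.Random(seed)  # a seed makes the plan reproducible and auditable
    reps = len(units) // len(treatments)
    labels = [t for t in treatments for _ in range(reps)]
    rng.shuffle(labels)                      # random assignment of units to treatments
    assignment = dict(zip(units, labels))
    run_order = list(units)
    rng.shuffle(run_order)                   # independent random order of running
    return assignment, run_order


# Nine experimental units, three treatments, three replicates each
units = [f"unit{i}" for i in range(1, 10)]
assignment, run_order = randomize_crd(units, ["A", "B", "C"], seed=2007)
```

The same shuffle can be applied a third time to decide the order in which results are evaluated.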

19 A SINGLE FACTOR AT THREE LEVELS EXAMPLE
Randomly apply treatments A, B and C to nine objects.
Results: [table of responses for treatments A, B and C with their averages; values not preserved in this transcript]
Is there a difference between the treatments?

20 PLOT OF EXAMPLE DATA IN JMP

21 ANALYSIS OF EXAMPLE DATA IN JMP
Ybar = (3+4+5)/3 = 4
SSTotal = ΣY² − (36×36)/9 = 24
SStreatment = (9² + 12² + 15²)/3 − (36×36)/9 = 6
SSerror = 24 − 6 = 18
MStreatment = 6/2 = 3
MSerror = 18/6 = 3
F = 3/3 = 1 < F(.05,2,6) = 5.14
Conclude that there are no significant differences between the treatments.
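The hand calculation can be sketched in a few lines of Python. The nine data values below are hypothetical: the slide's table is not preserved, so they were chosen only to reproduce the summary figures quoted here (treatment averages 3, 4 and 5, grand total 36, SSTotal = 24, SSerror = 18). The function name `one_way_anova` is likewise an assumption for illustration.

```python
def one_way_anova(groups):
    """Return (ss_total, ss_treatment, ss_error, F) for a one-way layout.

    groups maps each treatment label to its list of responses.
    """
    values = [y for g in groups.values() for y in g]
    n_total = len(values)
    k = len(groups)
    correction = sum(values) ** 2 / n_total            # (grand total)^2 / N
    ss_total = sum(y * y for y in values) - correction
    ss_treatment = sum(sum(g) ** 2 / len(g) for g in groups.values()) - correction
    ss_error = ss_total - ss_treatment
    ms_treatment = ss_treatment / (k - 1)
    ms_error = ss_error / (n_total - k)
    return ss_total, ss_treatment, ss_error, ms_treatment / ms_error


# Hypothetical data consistent with the slide's summary statistics
data = {"A": [1, 3, 5], "B": [2, 4, 6], "C": [4, 5, 6]}
ss_total, ss_treatment, ss_error, F = one_way_anova(data)

F_CRITICAL = 5.14  # F(.05, 2, 6), as quoted on the slide
significant = F > F_CRITICAL
```

With these values the treatment and error mean squares are equal, so the spread between treatment averages is no larger than what the scatter within treatments already explains.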

22 SAME EXAMPLE WITH DIFFERENT RESULTS
[Table of responses for treatments A, B and C with their averages; values not preserved in this transcript]
Now is there a difference between the treatments?

23 PLOT OF EXAMPLE DATA IN JMP

24 ANALYSIS OF EXAMPLE WITH NEW DATA IN JMP
There are significant differences between the treatments.

25 EXAMPLE COMPARISON
Variability in the data affects the ability to discriminate between levels.
Without replication, no estimation of experimental error is possible.
Intuition should guide analysis.

26 % TABLET DISSOLUTION EXAMPLE
Problem: Determine the effects of four different excipients on tablet dissolution after 45 minutes.
Background: Twenty tablets were selected at random. Four different treatments (excipients) were applied to the tablets. The % dissolution was measured, and the results are shown in the data set below.

27 RESULTS OF % TABLET DISSOLUTION EXAMPLE
Results of completely randomized design for four excipient types:
[data table not preserved in this transcript]
Overall = 50.3
Is there a difference between the excipients?

28 JMP PLOT OF % DISSOLUTION DATA
In each diamond (rhombus), the middle line is the average for that group. The distance between the upper line and the middle line (equal to the distance between the middle line and the lower line) is the mean square error for the overall sample. The upper vertex is the upper 95% confidence limit for the group mean, with the standard error computed from the pooled estimate of error variance.

29 JMP ANALYSIS OF % DISSOLUTION DATA
There are significant differences between the excipients.

30 JMP ANALYSIS OF % DISSOLUTION DATA (CONTINUED)

31 BLOCKING
Blocking: sort experimental units into “blocks” which are reasonably alike in order to reduce the impact of non-controlled sources of variability.
Blocking is the arranging of experimental units in groups (blocks) that are similar to one another. For example, an experiment is designed to test a new drug on patients. There are two levels of the treatment, drug and placebo, administered to male and female patients in a double-blind trial. The sex of the patient is a blocking factor accounting for treatment variability between males and females. This reduces sources of variability and thus leads to greater precision.

32 COMPLETELY RANDOMIZED AND RANDOMIZED BLOCK DESIGN
Completely Randomized Design (CRD)
Order in which treatments are selected and the runs are carried out is completely random
Completely Randomized Block Design (CRBD)
Treatments and runs are randomly assigned within the blocks
Every treatment must occur in each block

33 API SHELF LIFE STUDY (CRD)
A completely randomized design was conducted on expensive tablets to determine the % API after one year with three different coatings:
Residual API Concentration (%) for coatings A, B and C, with averages [values not preserved in this transcript]
Is there a difference between the coatings?
Mathematical Model: Y = Overall Mean + Treatment Effect + Error

34 JMP PLOT OF % API

35 JMP ANALYSIS OF CRD
There are no significant differences between the coatings.

36 API SHELF LIFE STUDY WITH RANDOMIZED BLOCK DESIGN
This time the production lots of the tablets were identified, and the coatings were assigned to tablets at random within each lot. The data are the same as in the previous example:
Residual API Concentration (%) for coatings A, B and C in each of four lots, with averages [values not preserved in this transcript]
Now is there a difference between coatings?
Mathematical Model: Y = Overall Mean + Treatment Effect + Block Effect + Error
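Written in symbols, the two word models could look as follows. The notation is an assumption introduced here for illustration (it does not appear on the slides): i indexes coatings, j indexes lots, μ is the overall mean, τ_i the treatment effect, β_j the block effect and ε_ij the random error.

```latex
\begin{align*}
\text{CRD:} \quad & Y_{ij} = \mu + \tau_i + \varepsilon_{ij} \\
\text{Randomized block:} \quad & Y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij} \\
\text{Partition of variation:} \quad & SS_{\text{Total}} = SS_{\text{Treatment}} + SS_{\text{Block}} + SS_{\text{Error}}
\end{align*}
```

In the CRD model the block term is absent, so any lot-to-lot variation ends up inside the error sum of squares.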

37 JMP PLOT OF RANDOMIZED BLOCK DESIGN

38 JMP ANALYSIS OF RANDOMIZED BLOCK DESIGN
Both factors (coating and lot) are significant.

39 EXAMPLE COMPARISON
Blocking removes the effect of the blocking variable (here, the production lot) from the comparison among treatments.
Without blocking, no block effect is taken into consideration, so lot-to-lot variation is counted as experimental error.
With blocking, both factors became significant.
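The effect of blocking on the treatment F ratio can be sketched by analysing one data table twice, once ignoring the blocks and once accounting for them. The numbers below are hypothetical (the slide's API data are not preserved in the transcript) and were made up so that lot-to-lot differences are large compared with coating differences. The function names are also assumptions for this example.

```python
def sums_of_squares(table):
    """table[i][j] is the response for treatment i in block j (one value per cell)."""
    t, b = len(table), len(table[0])
    grand_total = sum(sum(row) for row in table)
    correction = grand_total ** 2 / (t * b)
    ss_total = sum(y * y for row in table for y in row) - correction
    ss_treatment = sum(sum(row) ** 2 for row in table) / b - correction
    block_totals = [sum(table[i][j] for i in range(t)) for j in range(b)]
    ss_block = sum(total ** 2 for total in block_totals) / t - correction
    return ss_total, ss_treatment, ss_block


def treatment_f_ratio(table, use_blocks):
    """F ratio for treatments, with or without a block term in the model."""
    t, b = len(table), len(table[0])
    ss_total, ss_treatment, ss_block = sums_of_squares(table)
    if use_blocks:   # randomized block analysis: block variation removed from error
        ss_error, df_error = ss_total - ss_treatment - ss_block, (t - 1) * (b - 1)
    else:            # CRD analysis: block variation stays in the error term
        ss_error, df_error = ss_total - ss_treatment, t * (b - 1)
    return (ss_treatment / (t - 1)) / (ss_error / df_error)


# Hypothetical residual API (%): rows are coatings A, B, C; columns are four lots
api = [
    [90.0, 95.2, 99.9, 105.1],
    [91.1, 95.9, 101.2, 105.8],
    [92.0, 97.1, 101.9, 107.2],
]
f_crd = treatment_f_ratio(api, use_blocks=False)
f_block = treatment_f_ratio(api, use_blocks=True)
```

For data of this shape the treatment F ratio is far larger in the blocked analysis, because the same treatment mean square is compared against a much smaller error mean square.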

40 EXAMPLE OF A LATIN SQUARE DESIGN
A fleet manager wants to select, from among four brands of tires (A, B, C, and D), the one that will give the least amount of tire wear after 20,000 miles. The response variable is the difference in maximum tread thickness on a tire between the time it is mounted on the wheel of the car and after it has completed 20,000 miles. Four tires of each brand will be used to get an estimate of experimental error. In order to conduct this study, four cars (a BMW, a Ferrari, a Lamborghini, and a Mercedes Benz) were selected to accommodate the 16 tires.
Problem: How do we assign the 16 tires to the four cars?

41 DIFFERENT DESIGNS FOR TIRE WEAR EXAMPLE
(A) Randomly assign one brand to each car (randomized complete design):
Good – Reproducibility
Bad – Cars confounded with brands

42 DIFFERENT DESIGNS FOR TIRE WEAR EXAMPLE
(B) Randomly assign brands to cars (completely randomized design):
Good – Randomized
Bad – Not each brand on each car

43 DIFFERENT DESIGNS FOR TIRE WEAR EXAMPLE
(C) Randomly assign brands to the different cars and make sure each brand is on each car:
Good – Each brand on each car
Bad – Have not balanced positions on cars

44 ADDITIONAL RESTRICTION ON RANDOMIZATION
(D) Randomly assign brands to cars and positions so that each brand is on each car and at each position (Latin Square Design)
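One simple way to produce such a layout is to start from a cyclic square and then randomly permute its rows, columns and symbols. The sketch below assumes that approach; the function names are hypothetical, and the procedure does not sample uniformly from all possible Latin squares, which may or may not matter for a given study.

```python
import random


def random_latin_square(symbols, seed=None):
    """Return an n x n Latin square as a list of rows, built by shuffling a cyclic square."""
    rng = random.Random(seed)
    n = len(symbols)
    # Cyclic square: entry (r, c) = (r + c) mod n, always a valid Latin square
    square = [[(r + c) % n for c in range(n)] for r in range(n)]
    rng.shuffle(square)                                   # permute rows
    cols = list(range(n))
    rng.shuffle(cols)                                     # permute columns
    square = [[row[c] for c in cols] for row in square]
    labels = list(symbols)
    rng.shuffle(labels)                                   # permute symbols
    return [[labels[v] for v in row] for row in square]


def is_latin_square(square):
    """True if every symbol occurs exactly once in each row and each column."""
    n = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return len(symbols) == n and rows_ok and cols_ok


# Rows are wheel positions, columns are cars, entries are tire brands
positions = ["LF", "RF", "LR", "RR"]
cars = ["BMW", "Ferrari", "Lambo", "MB"]
design = random_latin_square(["A", "B", "C", "D"], seed=2007)
plan = {(positions[r], cars[c]): design[r][c] for r in range(4) for c in range(4)}
```

Each (position, car) pair in `plan` receives exactly one brand, and every brand appears once on each car and once at each position.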

45 TIRE WEAR DATA
Design (D) is called a Latin square design, with two restrictions on randomization. The results for this design are (only the values in the MB column are preserved in this transcript):

Position   BMW   Ferrari   Lambo   MB
LF         A=    B=        C=      D=25
RF         B=    C=        D=      A=37
LR         C=    D=        A=      B=24
RR         D=    A=        B=      C=30

46 JMP ANALYSIS OF TIRE WEAR DATA
Therefore significant differences exist between brands and cars, but positions are not significant.

47 DETERMINE THE OPTIMUM LEVEL
In the case of quantitative factors (i.e., factors which have measurable levels), it is often important to determine the level which optimizes (i.e., maximizes or minimizes) the response variable. By spacing the levels uniformly over the range, the optimum can be observed. However, this may require a large number of levels, and only the optimal one may be of interest. A more useful approach is to sequentially search out the optimal level.

48 OPTIMUM SEEKING METHOD
Unimodality Assumption: Assume the response has only one optimum value and is monotonically decreasing (in the case of a maximum) or increasing (in the case of a minimum) as the factor level moves away from this value in either direction.
Two Methods:
Dichotomous Search
Golden Section Search

49 DICHOTOMOUS SEARCH
The dichotomous search works by repeatedly locating two experiments symmetrically about the midpoint of the current interval and sufficiently far apart to detect a difference. The interval which contains the optimum value is selected for further subdivision.

50 GRAPHICAL ILLUSTRATION OF DICHOTOMOUS SEARCH
Suppose we want to find the optimum (maximum) value of response Y for a single factor X in the interval between two points a and b. The method computes the experiment points c1 = (a+b)/2 − d/2 and c2 = (a+b)/2 + d/2, where d is the distance that will produce a minimum detectable change in response. If the response Y at X = c1 is greater than Y at X = c2, then the next working interval will be [a, c2]; otherwise the next working interval will be [c1, b]. The algorithm is then applied to the new working interval where the optimum exists, meaning that the algorithm is inherently recursive.
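The procedure can be sketched as a short loop. Here `response` stands in for running an experiment at a given factor level; the stopping rule (stop once the interval is no wider than 2d), the `max_rounds` safeguard and the quadratic test response are assumptions added for this example, since the slide does not state when to stop.

```python
def dichotomous_search(response, a, b, d, max_rounds=100):
    """Narrow [a, b] around the maximum of a unimodal response.

    d is the spacing between the two experiments in each round. The interval
    cannot shrink below d, so the loop stops once its width is at most 2 * d.
    Returns the final interval and the number of rounds (two experiments each).
    """
    if d <= 0 or b - a <= d:
        raise ValueError("need d > 0 and an interval wider than d")
    rounds = 0
    while (b - a) > 2 * d and rounds < max_rounds:
        mid = (a + b) / 2
        c1, c2 = mid - d / 2, mid + d / 2
        if response(c1) > response(c2):
            b = c2          # maximum lies in [a, c2]
        else:
            a = c1          # maximum lies in [c1, b]
        rounds += 1
    return a, b, rounds


# Hypothetical noise-free response with its maximum at X = 2
def yield_curve(x):
    return -(x - 2.0) ** 2


lo, hi, rounds = dichotomous_search(yield_curve, a=0.0, b=5.0, d=0.1)
experiments = 2 * rounds
```

In a real study each call to `response` is a physical run with experimental error, which is why d must be large enough for the two responses to be told apart.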

51 GOLDEN SECTION SEARCH
Analogous to the dichotomous search strategy, the golden section search finds the optimum (minimum or maximum) by successively narrowing the range of values inside which the optimum is known to exist. The technique derives its name from the fact that, rather than halving the interval, the experiments are symmetrically located by the “golden section” ratio.

52 GRAPHICAL ILLUSTRATION OF GOLDEN SECTION
[Figure captions: Locate the first two experiments as shown. The optimum (maximum) lies in the interval [a, d] by unimodality. Discard [d, b]. Locate the third experiment symmetrically, such that the new d is the old c and the new c is calculated by b − c = d − a.]
Suppose we want to find the optimum (maximum) value of response Y for a single factor X in the interval between two points a and b. The method divides the interval in three by computing c = a + (b−a)×0.382 and d = a + (b−a)×0.618. If the response Y at X = c is greater than Y at X = d, then the next working interval will be [a, d]; otherwise the next working interval will be [c, b]. The algorithm is then applied to the new working interval where the optimum exists, meaning that the algorithm is inherently recursive.
The experiment points satisfy the golden ratio relation (c−a)/(b−c) = (d−c)/(b−d). If the next working interval is [a, d], we only need to calculate a new c in the new round, because the new d will be the old c. If the next working interval is [c, b], then we only calculate a new d in the new round; the new c will be the old d. This significantly reduces the workload, since only one new experiment is needed per round. Per experiment, this method converges faster than the dichotomous search: each new experiment shrinks the interval to 61.8% of its length, whereas the dichotomous search needs two experiments to roughly halve it.
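A minimal sketch of the golden section search, reusing one of the two previous experiments in every round as described above. The tolerance-based stopping rule, the function name and the quadratic test response are assumptions made for this example.

```python
import math

INV_PHI = (math.sqrt(5) - 1) / 2   # 0.618..., so 1 - INV_PHI = 0.382...


def golden_section_search(response, a, b, tol):
    """Narrow [a, b] around the maximum of a unimodal response.

    Returns the final interval and the total number of experiments run.
    """
    if tol <= 0 or b <= a:
        raise ValueError("need tol > 0 and a < b")
    c = b - INV_PHI * (b - a)      # a + 0.382 * (b - a)
    d = a + INV_PHI * (b - a)      # a + 0.618 * (b - a)
    yc, yd = response(c), response(d)
    experiments = 2
    while (b - a) > tol:
        if yc > yd:
            # Maximum lies in [a, d]; the old c becomes the new d
            b, d, yd = d, c, yc
            c = b - INV_PHI * (b - a)
            yc = response(c)
        else:
            # Maximum lies in [c, b]; the old d becomes the new c
            a, c, yc = c, d, yd
            d = a + INV_PHI * (b - a)
            yd = response(d)
        experiments += 1           # only one new experiment per round
    return a, b, experiments


# Hypothetical noise-free response with its maximum at X = 2
def yield_curve(x):
    return -(x - 2.0) ** 2


lo, hi, experiments = golden_section_search(yield_curve, a=0.0, b=5.0, tol=0.2)
```

On this test response the search brackets the maximum to within the tolerance using one new experiment per round after the first two.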

53 SUMMARY
Randomization, Blocking and Replication are key design considerations in testing for significance of levels in single factor experiments.
Optimum seeking methods are useful tools to reduce the number of experiments required to find the best conditions.

