Design of Experiments.

Introduction. Experiments are performed to discover something about a particular process or system. An experiment is a test or series of tests in which purposeful changes are made to the input variables of a process or system, so that we can observe and identify the reasons for changes in the output response. In DOE we first lay out the process by choosing an appropriate number of input and output parameters. In the next stage, statistical analysis is used to extract meaningful information about the process.

Area of application. Experimental design is a critically important engineering tool for improving the performance of a manufacturing process. It is also used extensively in the development of new processes. Applying experimental design techniques in process development can result in:
a. Improved process yields.
b. Reduced variability and closer conformance to nominal or target requirements.
c. Reduced development time.
d. Reduced overall cost.

Objective of DOE. The main objectives of DOE are:
a. Characterizing a process.
b. Optimizing a process.
Process characterization: In this stage the main objective is to find the factors that have a significant effect on the response (output) parameters. This is sometimes called factor screening.
Process optimization: In this stage the main objective is to find the region of the important factors that leads to the best possible response. For example, if the output is yield we look for a region of maximum yield; if the output is variability we look for a region of minimum variability.

Basic Principles of DOE. The three basic principles of DOE are:
a. Replication
b. Randomization
c. Blocking
Replication: By replication we mean repetition of the basic design. Replication has two important properties. First, it allows the experimenter to estimate the experimental error. Second, it allows the effect of a factor to be estimated more precisely.

Randomization: By randomization we mean that both the allocation of the experimental material and the order in which the individual runs or trials are performed are determined randomly. Statistical methods require that the observations (or errors) be independently distributed random variables, and randomization usually makes this assumption valid. Proper randomization also helps to "average out" the effects of extraneous factors that may be present.
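As a minimal sketch of run-order randomization, the snippet below shuffles the run labels of an experiment. The function name `randomized_run_order` and the choice of 8 runs are illustrative assumptions, not part of the original slides.

```python
import random

def randomized_run_order(n_runs, seed=None):
    """Return the run labels 1..n_runs in a random execution order.

    Hypothetical helper: a fixed seed is used only to make the
    randomization reproducible for documentation purposes.
    """
    rng = random.Random(seed)
    order = list(range(1, n_runs + 1))
    rng.shuffle(order)  # in-place random permutation of the run labels
    return order

# Example: randomize the execution order of an 8-run design.
order = randomized_run_order(8, seed=1)
print(order)
```

Each run is still performed exactly once; only the sequence in which the runs are executed changes.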

Blocking: Blocking is used to reduce or eliminate the variability due to nuisance factors (factors that may influence the response but are not of direct interest). For example, an experiment in a chemical process may require 2 batches of raw material. There may be a difference between the batches due to supplier-to-supplier variability, and we are not interested in this difference. We would treat the batches of raw material as a nuisance factor and use them as blocks. There are three kinds of nuisance factor:
a. Unknown and uncontrollable: randomization is used to deal with this situation.
b. Known and uncontrollable: analysis of covariance (ANCOVA) is used to deal with this situation.
c. Known and controllable: blocking techniques can be used to deal with this situation.

Guidelines of DOE.
1. Recognition and statement of the problem.
2. Choice of the factors, levels, and ranges.
3. Selection of the response variable.
4. Choice of the experimental design.
5. Performing the experiment.
6. Statistical analysis of the data.
7. Conclusions and recommendations.

Factorial Design. By factorial design we mean that in each complete trial or replication of the experiment, all possible combinations of the levels of the factors are investigated. Using a factorial design we can estimate the effects of two or more factors. A factorial design is also known as a crossed design. Two common special cases are the 2^k design (each factor has two levels) and the 3^k design (each factor has three levels).

The 2^k design is important in the early stages of experimental work, when there are likely to be many factors to investigate. It provides the smallest number of runs with which k factors can be studied in a complete factorial design. These designs are widely used in factor screening experiments. In many experiments involving a 2^k design, we examine the magnitude and direction of the factor effects to determine which variables are likely to be important. ANOVA can generally be used to confirm this interpretation.

Terms.
Effect: The average change in the response when a factor moves from its low level to its high level. For example, if the main effect of factor A is 21.5, the response increases by 21.5 units on average when A goes from its low to its high level.
Interaction effect: An interaction occurs when the difference in response between the levels of one factor is not the same at all levels of the other factors. The significance of this effect can be read from the ANOVA table. If p < 0.05 for an interaction effect, it is considered significant at the 5% level.
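The definitions above can be sketched on a small 2^2 experiment with factors coded as -1 (low) and +1 (high). The four response values are illustrative numbers chosen for this sketch, not data from the presentation, and `effect` is a hypothetical helper.

```python
from statistics import mean

# Illustrative 2^2 experiment: (A, B, response), factors coded -1 / +1.
runs = [(-1, -1, 20.0), (1, -1, 40.0), (-1, 1, 30.0), (1, 1, 52.0)]

def effect(runs, contrast):
    """Average response where the contrast is +1 minus average where it is -1."""
    high = [y for a, b, y in runs if contrast(a, b) > 0]
    low = [y for a, b, y in runs if contrast(a, b) < 0]
    return mean(high) - mean(low)

effect_A = effect(runs, lambda a, b: a)       # main effect of A
effect_B = effect(runs, lambda a, b: b)       # main effect of B
effect_AB = effect(runs, lambda a, b: a * b)  # AB interaction effect

print(effect_A, effect_B, effect_AB)  # 21.0 11.0 1.0
```

Here the interaction effect is small compared with the main effects, so the effect of A is nearly the same at both levels of B.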

General formula of factorial design. The number of runs in a full 2^k or 3^k factorial design follows directly from the number of factors k.
2^k design: With 2 factors, each at 2 levels, the number of runs is 4 (2^2); with 3 factors it is 8 (2^3).
3^k design: With 2 factors, each at 3 levels, the number of runs is 9 (3^2); with 3 factors it is 27 (3^3).
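The run counts above can be checked by enumerating the level combinations. This is a minimal sketch; `full_factorial` is a hypothetical helper and the coded levels (-1, 0, +1) are an assumption made for illustration.

```python
from itertools import product

def full_factorial(levels_per_factor):
    """Return every combination of factor levels (one tuple per run)."""
    return list(product(*levels_per_factor))

two_level = full_factorial([(-1, 1)] * 3)       # 2^3 design
three_level = full_factorial([(-1, 0, 1)] * 2)  # 3^2 design

print(len(two_level), len(three_level))  # 8 9
```

The same helper reproduces the other counts in the text, for example 27 runs for three factors at three levels.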

Fitting Response Curves and Surfaces.
Response curve: An equation in the quantitative factors that relates the response to the factors. This equation can be used for interpolation, i.e., predicting the response at factor levels between those actually used in the experiment. When at least two factors are quantitative, we can fit a response surface for predicting the response at various combinations of the design factors.
Contour plot / surface plot: These are graphical presentations of the fitted response surface. They show the region where the optimum (minimum or maximum) of the response variable can be obtained.

Example 1: 2^4 full factorial design. A chemical product is produced in a pressure vessel. A factorial experiment is carried out in the pilot plant to study the influence of four factors on the filtration rate of this product.
Input factors:
a. Temperature: 100 to 120 degrees
b. Pressure: 60 psi to 75 psi
c. Concentration of formaldehyde: 6% to 10%
d. Stirring rate: 25 to 40
Output parameter (response): Filtration rate (gal/h)

Data set. The data set shown on the slide lists the four factors with their level combinations. The output is the filtration rate.

ANOVA table. From the ANOVA table it is clear that Temp and the Temp * Concentration interaction have a significant effect on the response (p < 0.05). Concentration and Stirring rate also contribute to the filtration rate.

Normal plot: It is clear from the first graph that Temp, Stirring rate, and Concentration have a positive effect on the response.
Pareto chart: It shows that Temp has the highest positive effect on the response (21.65 units), and that the Temp and Concentration interaction has a negative effect on the response.

Conclusion from analysis. The main effects of Temp, Concentration, and Stirring rate plotted in the previous graphs are positive. If we considered only the main effects, we would run all three factors at their high level to maximize the filtration rate. However, it is always important to examine the interaction effects. Remember that main effects do not have much meaning when they are involved in significant interactions.

Interaction plot. Temp and Concentration interaction: The Temp effect is very small when Concentration is at its high level and large when Concentration is at its low level. Conclusion: the best result is obtained with low Concentration and high Temp.

Temp and Stirring rate interaction: The Temp effect is large at the high level of Stirring rate and smaller at the low level of Stirring rate. Conclusion: the best filtration rate is obtained when Temp and Stirring rate are at their high levels and Concentration is at its low level.

Fractional Factorial Design. In a 2^k factorial design, the number of runs required for the full factorial grows rapidly as the number of factors increases. For example, with 6 factors each at two levels, a full factorial design needs 2^6 = 64 runs. A full factorial design is sometimes not advisable or not possible because of a lack of material, money, or time. In these conditions fractional factorials are used, which "sacrifice" interaction effects so that main effects can still be computed correctly. In a fractional factorial we assume that certain higher-order interactions are negligible.

In a fractional factorial design, information on the main effects and low-order interaction effects is obtained by running only a fraction of the complete factorial design. The major use of fractional factorial designs is in screening experiments. Screening is performed in the early stages of a project, when many factors are under consideration and we need to find the most significant ones. The factors identified as important are then investigated more thoroughly in subsequent experiments.

Aliases: In a fractional factorial design we assume that some higher-order interactions are not important for the analysis, so the analysis gives no separate information on those terms. A fractional factorial experiment is generated from a full factorial experiment by choosing an alias structure. The alias structure determines which effects are confounded with each other. For example, with 4 factors a full factorial design needs 16 runs. Instead, we can run a full factorial in three factors, say A, B, and C (8 runs), and set the level of the fourth factor by D = A*B*C. This expression is called the generator of the design.

When the experiment is run and the experimenter estimates the effect of factor D, what is really being estimated is a combination of the main effect of D and the three-factor interaction ABC. We say that ABC is confounded with D, or aliased with D.
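The aliasing produced by the generator D = A*B*C can be made concrete by building the 8-run half fraction in coded units (-1, +1) and comparing columns. This is a sketch under those assumptions; the variable names are illustrative.

```python
from itertools import product

# Full 2^3 factorial in A, B, C; D is set by the generator D = A*B*C.
design = [(a, b, c, a * b * c) for a, b, c in product((-1, 1), repeat=3)]

col_D = [d for a, b, c, d in design]
col_ABC = [a * b * c for a, b, c, d in design]
col_AB = [a * b for a, b, c, d in design]
col_CD = [c * d for a, b, c, d in design]

# Identical columns give identical contrasts, so the effects cannot be separated.
print(col_D == col_ABC)  # True: D is aliased with ABC
print(col_AB == col_CD)  # True: AB is aliased with CD
```

The second comparison shows a consequence of the same generator: two-factor interactions are aliased in pairs, because C*D = C*(A*B*C) = A*B.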

Design Resolution. The resolution of a design indicates which effects are aliased (confounded) with which other effects.
Rule: A design is of resolution R if no p-factor effect is aliased with another effect containing fewer than (R - p) factors.
Resolution III: No main effect is aliased with any other main effect, but main effects are aliased with two-factor interactions, and two-factor interactions may be aliased with each other.
Resolution IV: No main effect is aliased with any other main effect or with any two-factor interaction, but two-factor interactions are aliased with each other.
Resolution V: No main effect or two-factor interaction is aliased with any other main effect or two-factor interaction, but two-factor interactions are aliased with three-factor interactions.
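A common way to determine resolution, not stated on the slides, is to take the length of the shortest word in the complete defining relation. The sketch below assumes each generator is supplied as a defining word (for example D = ABC becomes the word "ABCD"); the function name `resolution` is hypothetical.

```python
from itertools import combinations

def resolution(generator_words):
    """Length of the shortest word in the complete defining relation.

    generator_words: defining words such as ["ABCD"] for D = ABC.
    Products of words are formed by symmetric difference, because any
    squared letter equals the identity and drops out.
    """
    words = [frozenset(w) for w in generator_words]
    relation = set()
    for r in range(1, len(words) + 1):
        for combo in combinations(words, r):
            word = frozenset()
            for g in combo:
                word = word ^ g
            relation.add(word)
    return min(len(w) for w in relation)

print(resolution(["ABCD"]))          # 4: half fraction with D = ABC
print(resolution(["ABCDE"]))         # 5: half fraction with E = ABCD
print(resolution(["ABCE", "BCDF"]))  # 4: quarter fraction, E = ABC and F = BCD
```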

Example 2: Fractional factorial. Five factors of a manufacturing process for an integrated circuit were investigated in a 2^(5-1) design. The factors are:
a. Aperture setting (small, large)
b. Exposure time (20% below nominal, 20% above nominal)
c. Develop time (30 sec, 45 sec)
d. Mask dimension (small, large)
e. Etch time (14.5 min, 15.5 min)
Objective: Improve the process yield using 16 runs.

The design shown on the slide was generated by taking a full factorial in the first four factors and setting the level of the last factor to the product of the levels in the first four columns.

First table: The circled terms are confounded with the corresponding main effects and first-order (two-factor) interactions, so the ANOVA table gives no separate information on those interaction effects.
Second table: There are no values in the F and p columns because no degrees of freedom are left for the error term. We have to pool some interactions whose SS values are very low and use them as the error term.

First table: The ANOVA table shows the significant factors (p < 0.05).
Pareto chart: It is clear from the Pareto chart that Develop time has the highest positive effect on the yield, followed by Aperture, Exposure time, and so on.

Plackett-Burman Design. Plackett-Burman designs are very useful for screening a large number of factors to identify those that may be important (i.e., those that are related to the dependent variable of interest). They allow one to test the largest number of factor main effects with the smallest number of observations, that is, to construct a resolution III design with as few runs as possible. Plackett and Burman (1946) showed how a full factorial design can be fractionalized in a different manner, to yield saturated designs where the number of runs is a multiple of 4 rather than a power of 2. These designs are sometimes called Hadamard matrix designs. You do not have to use all available factors in these designs. In fact, it is sometimes useful to generate a saturated design for one more factor than you expect to test. This allows you to estimate the random error variability and to test the statistical significance of the parameter estimates.
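As an illustration of a run count that is a multiple of 4 but not a power of 2, the sketch below builds the 12-run Plackett-Burman design for up to 11 factors by cyclically shifting a generating row and appending a row of all low levels. The generating row is the commonly tabulated one for 12 runs; the function name and the orthogonality check are illustrative additions, not part of the original slides.

```python
# Commonly tabulated generating row for the 12-run design (+1 high, -1 low).
GENERATOR_12 = [1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1]

def plackett_burman_12():
    """Return the 12 x 11 design matrix as a list of rows."""
    n = len(GENERATOR_12)
    # Rows 1..11: cyclic shifts of the generating row.
    rows = [GENERATOR_12[n - s:] + GENERATOR_12[:n - s] for s in range(n)]
    # Row 12: every factor at its low level.
    rows.append([-1] * n)
    return rows

design = plackett_burman_12()

# Every pair of factor columns should be orthogonal (dot product zero).
orthogonal = all(
    sum(row[i] * row[j] for row in design) == 0
    for i in range(11) for j in range(i + 1, 11)
)
print(len(design), len(design[0]), orthogonal)  # 12 11 True
```

Orthogonal, balanced columns are what allow each main effect to be estimated independently of the other main effects.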

Example 3: Plackett-Burman design. This is an example of a formulation process with 35 factors.
Dependent variable: Content uniformity of the chemical.
Objective: Out of the 35 factors, screen out only those that are most significant for content uniformity.
In this data set only 40 runs were used to check the effects of the 35 factors.

Data set. The first variable is the dependent variable, Content Uniformity. Variables 2 to 35 are all independent variables.

Analysis. First table: The ANOVA table shows which factors are most important for the dependent variable. Pareto chart: It shows the important factors in descending order of effect.

3^k Factorial Design. In this design each factor has 3 levels: low, medium, and high. Three-level factorial designs are very useful for estimating the nonlinear effect of a factor. Because two-level factorial designs assume only linear effects of the factors, they cannot capture a nonlinear relationship. Unfortunately, the three-level design is prohibitive in terms of the number of runs, and thus in terms of cost and effort. A two-level design with center points is much less expensive and is still a very good (and simple) way to establish the presence or absence of curvature.

Mixed Level Design. In some cases factors with more than 2 levels have to be examined, for example when one suspects that the effect of a factor on the dependent variable is not simply linear. At least 3 levels are needed to test for the linear and quadratic effects (and interactions) of those factors. Some factors may also be categorical in nature, with more than 2 categories. In these situations a mixed-level design can be used to check both linear and quadratic effects, for example when the experimenter wants to test the effect of temperature (3 levels) and pressure (2 levels) on the yield of a chemical process.

Example 4: Mixed level design. A soft drink bottler is interested in obtaining more uniform fill heights in the bottles produced by his manufacturing process. The filling machine theoretically fills each bottle to the correct target height, but in practice there is variation in the process.
Main objective: Find the significant parameters and the settings for which the variation around the target is minimal.
Input parameters:
a. Percent carbonation (10%, 12%, 14%)
b. Pressure (25 psi, 30 psi)
c. Line speed (200 bpm, 250 bpm)
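The treatment combinations of this mixed-level design can be enumerated from the levels listed above. This is a minimal sketch; the dictionary keys are illustrative names, and replication is not included.

```python
from itertools import product

# Factor levels taken from Example 4; the key names are illustrative.
levels = {
    "carbonation_pct": (10, 12, 14),
    "pressure_psi": (25, 30),
    "line_speed_bpm": (200, 250),
}

# One dictionary per treatment combination: 3 x 2 x 2 = 12 in total.
runs = [dict(zip(levels, combo)) for combo in product(*levels.values())]

print(len(runs))  # 12
print(runs[0])
```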

Data Set. The first factor, percent carbonation, has three levels. Line speed and pressure each have two levels.

Analysis. ANOVA table: It is clear that all main effects are significant for the response. Pareto chart: Carbonation has the largest effect of the three variables, followed by pressure and line speed.

Mean Plot. First graph: The response increases with carbonation. Second graph: The response increases with line speed.

From the main effect plots it is clear that all three factors have a positive effect, so increasing any of them moves the average deviation from the fill target upward. Pressure and carbonation show a small interaction, with a response that increases roughly linearly.

Fitting Regression Model. Objective: In many problems two or more variables are related, and it is of interest to model and explore this relationship. For example, in a chemical process the yield is related to the operating temperature. A chemical engineer may want to build a model relating yield to temperature. This model can be used for prediction, optimization, or process control. The relationship between these variables is characterized by a mathematical model called a regression model. In most cases the true functional relationship is unknown, so experimenters choose an appropriate function to approximate it. Low-order polynomials are widely used as approximating functions.

Fitting a linear regression model. Suppose we want to develop an empirical model relating the yield of a chemical process to temperature and catalyst concentration. A linear model can be written as
Y = b0 + b1*X1 + b2*X2 + ε    (1)
where Y = yield, X1 = temperature, X2 = catalyst concentration, and ε is the random error term. b1 and b2 are known as partial regression coefficients: b1 measures the expected change in Y per unit change in X1 when X2 is held constant. A more complicated model can be written as
Y = b0 + b1*X1 + b2*X2 + b3*X1^2 + b4*X2^2 + b5*X1*X2 + ε    (2)
Equation (2) is known as a response surface model. The method of least squares is typically used to estimate the unknown parameters.
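For a two-level factorial design with factors coded as -1 and +1, the least squares estimates of model (1) have a simple closed form, because the coded columns are orthogonal. The sketch below relies on that assumption and uses illustrative response values that do not come from the presentation; for general, non-orthogonal data the normal equations would have to be solved instead.

```python
# Illustrative 2^2 experiment in coded units: (X1, X2, Y).
data = [(-1, -1, 20.0), (1, -1, 40.0), (-1, 1, 30.0), (1, 1, 52.0)]

def fit_coded_linear_model(data):
    """Least squares fit of Y = b0 + b1*X1 + b2*X2 for an orthogonal coded design.

    With balanced, orthogonal -1/+1 columns, each coefficient is the
    column contrast divided by the number of runs.
    """
    n = len(data)
    b0 = sum(y for _, _, y in data) / n
    b1 = sum(x1 * y for x1, _, y in data) / n
    b2 = sum(x2 * y for _, x2, y in data) / n
    return b0, b1, b2

def predict(coefficients, x1, x2):
    b0, b1, b2 = coefficients
    return b0 + b1 * x1 + b2 * x2

coefficients = fit_coded_linear_model(data)
print(coefficients)                 # (35.5, 10.5, 5.5)
print(predict(coefficients, 1, 1))  # 51.5
```

In coded units each regression coefficient is half of the corresponding factor effect, since the effect measures a change of two coded units (from -1 to +1).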

Example 5: Regression analysis. Here we use the data of Example 1.
Independent factors:
a. Temperature
b. Pressure
c. Concentration
d. Stirring rate
Dependent variable: Filtration rate.
Objective:
a. Build a functional relationship between the four factors and the dependent variable.
b. Check the accuracy of this model.
c. Predict the filtration rate for any combination of the factors.

First table: Regression coefficients for the main effects as well as the interaction effects (R-square value shown on the slide).
Second table: Modified regression equation after dropping some insignificant interaction effects (R-square value shown on the slide).
We then carry out diagnostic checking. If the second model passes all checks, we continue with it.

Diagnostic Checking. In this section we perform three major analyses for diagnostic checking:
a. Normal probability plot of the residuals (checks the normality assumption for the residuals).
b. Predicted vs residual plot (based on studentized residuals, to detect potential outliers).
c. Residual vs independent variable plot (helps to detect which factors contribute strongly to process variability).

Normal probability plot. The p-value is greater than 0.05, so there is no evidence against the assumption that the residuals are normally distributed.

Predicted vs residual plot. The graph gives no indication of outliers in the data set: the absolute values of all studentized residuals are below 2.5.

Residual vs independent variable. Pressure is not very important as far as the average filtration rate is concerned, but it is very important in its effect on process variability: higher pressure results in more variability in the filtration rate.

Prediction. Objective: The main objective of this part is to interpolate the value of the dependent variable for any values of the independent factors. The regression model we have estimated has passed all the diagnostic checks, so we can use it to predict the filtration rate. For example, suppose we want to predict the filtration rate for Temp = 115, Pressure = 65, Concentration = 8, and Stirring rate = 30. We substitute these values into the equation to calculate the predicted filtration rate for this combination.
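If the regression model was fitted in coded units (-1 at the low level, +1 at the high level), which the transcript does not state, the natural factor values must be converted to coded units before substitution. The sketch below performs that conversion for the prediction point above, using the factor ranges from Example 1; the names `FACTOR_RANGES` and `to_coded` are hypothetical.

```python
# Low and high levels of each factor, taken from Example 1.
FACTOR_RANGES = {
    "temperature": (100, 120),
    "pressure": (60, 75),
    "concentration": (6, 10),
    "stirring_rate": (25, 40),
}

def to_coded(value, low, high):
    """Map a natural value to coded units: low -> -1, center -> 0, high -> +1."""
    center = (low + high) / 2
    half_range = (high - low) / 2
    return (value - center) / half_range

point = {"temperature": 115, "pressure": 65, "concentration": 8, "stirring_rate": 30}
coded = {name: to_coded(value, *FACTOR_RANGES[name]) for name, value in point.items()}

for name, x in coded.items():
    print(f"{name}: {x:+.3f}")
```

All four coded values lie between -1 and +1, which confirms that this prediction is an interpolation inside the experimental region.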

For the combination of factors above, the predicted filtration rate is 72.5, with a 95% confidence interval of 64.91 to 80.10.

Response Desirability Profiling. Basic idea: A typical problem in product development is to find a set of conditions, or levels of the input variables, that produces the most desirable product in terms of its characteristics, or responses on the output variables. The procedures used to solve this problem generally involve two steps:
a. Predicting the responses on the dependent variables with an equation based on the levels of the independent variables (regression method).
b. Finding the levels of the X variables that simultaneously produce the most desirable predicted responses on the Y variables.
Step b is an optimization process. In this example we want to find the values of the independent variables for which the filtration rate is approximately 90, with a minimum acceptable value of 80 and a maximum acceptable value of 95.
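One common way to express such a target is a "target is best" desirability function that equals 1 at the target and falls to 0 at the acceptable limits. The sketch below uses the limits from this example (80, 90, 95) and assumes a linear shape on both sides, which is an assumption of this illustration and not something the slides specify.

```python
def desirability(y, low=80.0, target=90.0, high=95.0):
    """Linear target-is-best desirability: 1 at the target, 0 at or beyond the limits."""
    if y <= low or y >= high:
        return 0.0
    if y <= target:
        return (y - low) / (target - low)
    return (high - y) / (high - target)

for y in (72.5, 85.0, 90.0, 92.5):
    print(y, desirability(y))
```

Optimization then searches the factor space for the settings whose predicted response has the highest desirability.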

In this example the target filtration rate was 90. The corresponding factor settings are Temp = 120, Pressure = 75, and Concentration = 6, together with a Stirring rate whose value is not given in the transcript. The graph shows how the response varies across the levels of the factors.

Regression vs Response Desirability Profiling. The objective of these two methods is almost the same: how to get a consistent output. Only the approaches differ.
Regression: Find the output for predefined inputs.
Profiling: Find the values of the input parameters for a predefined output.
The two methods can be combined to find the optimum region for any process. Both methods depend completely on the data set, so to get a satisfactory result the user must have some understanding of the process.

Thanks. Krishnendu Kundu, StatSoft India.