/ department of mathematics and computer science 1212 1 2DS01 Statistics 2 for Chemical Engineering lecture 2

http://www.win.tue.nl/~sandro/2DS01

Contents
- why is design of experiments useful?
- regression analysis and effects
- $2^p$ experiments
- blocks
- $2^{p-k}$ experiments (fractional factorial experiments)
- software
- literature

Traditional approach to experimentation
- change setting of one factor
- perform measurement(s)
- change setting of another factor
- perform measurement(s)
- ...
This is called a One-Factor-At-a-Time (OFAT) or Change-One-Separate-factor-at-a-Time (COST) strategy.

Use of statistics in design of experiments
Design of Experiments (short: DOE) is a general term for a collection of statistical techniques for systematic experimentation. The most important benefits of using DOE are:
- fewer experiments are needed
- more precise estimates of parameter effects
- interactions between factors are taken into account

Interactions
Factors may influence each other. E.g., the optimal setting of a factor may depend on the settings of the other factors. When factors are optimised separately, the overall result (as a function of all factors) may be suboptimal.

[Figure: response contours (levels 30, 40, 50, 60) with the labels "The real maximum", "The apparent maximum", "factor A has been optimised" and "factor B has been optimised"]
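The gap between the apparent and the real maximum can be reproduced with a small numerical sketch. The response function, the integer grid 0..10 and the starting point below are hypothetical and chosen only for illustration; they are not taken from the lecture.

```python
# Hypothetical response surface with a strong A-B interaction (not from the lecture).
def response(a, b):
    return 60 - 5 * (a - b) ** 2 - 0.5 * (a + b - 10) ** 2

levels = range(0, 11)  # assumed integer settings 0..10 for both factors

# OFAT: start at b = 0, optimise A, then optimise B with A kept fixed.
b0 = 0
a_best = max(levels, key=lambda a: response(a, b0))
b_best = max(levels, key=lambda b: response(a_best, b))
ofat_value = response(a_best, b_best)

# Exhaustive search over all level combinations, for comparison.
true_a, true_b = max(((a, b) for a in levels for b in levels),
                     key=lambda ab: response(*ab))
true_value = response(true_a, true_b)

print("OFAT optimum:", (a_best, b_best), ofat_value)
print("True optimum:", (true_a, true_b), true_value)
```

With this particular function, one OFAT pass stops well below the true maximum because the best setting of A depends on the setting of B.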

Types of experimental designs
- “screening designs”: these designs are used to investigate which factors are important (“significant”).
- “response surface designs”: these designs are used to determine the optimal settings of the significant factors.

Three factors: example
Response: deviation of filling height of bottles
Factors:
- A: carbon dioxide level (%)
- B: pressure (psi)
- C: speed (bottles/min)

Effects
How do we determine whether an individual factor is of importance? Measure the outcome at 2 different settings of that factor. Scale the settings such that they become the values +1 and -1.

[Figure: measurement plotted against the setting of factor A (-1 and +1); the effect is the difference between the two measurements. N.B. effect = 2 * slope]
Example: the measurement is 35 at setting -1 and 50 at setting +1, so
Effect factor A = 50 – 35 = 15

More factors
We denote factors with capitals: A, B, …
Each factor only attains two settings: -1 and +1.
The joint setting of all factors in one measurement is called a level combination.

More factors

 A    B
-1   -1
-1   +1
+1   -1
+1   +1

Each row is a level combination.

Notation
A level combination consists of small letters. The small letters denote which factors are set at +1; the letters that do not appear are set at -1.
Example: ac means: A and C at +1, the remaining factors at -1.
N.B. (1) means that all factors are set at -1.
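The notation rule can be written down as a small helper. This is a minimal sketch; the function name `level_combination` and the convention of passing the factors as a string such as "ABC" are assumptions made for illustration.

```python
def level_combination(label, factors):
    """Translate a label such as 'ac' into coded settings per factor."""
    letters = "" if label == "(1)" else label
    unknown = set(letters.upper()) - set(factors)
    if unknown:
        raise ValueError(f"unknown factors: {sorted(unknown)}")
    # Letters that appear are set at +1, all other factors at -1.
    return {f: (+1 if f.lower() in letters else -1) for f in factors}

print(level_combination("ac", "ABC"))   # {'A': 1, 'B': -1, 'C': 1}
print(level_combination("(1)", "ABC"))  # {'A': -1, 'B': -1, 'C': -1}
```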

An experiment consists of performing measurements at different level combinations. A run is a measurement at one level combination.
Suppose that there are 2 factors, A and B. We perform 4 measurements with the following settings:
- A -1 and B -1 (short: (1))
- A +1 and B -1 (short: a)
- A -1 and B +1 (short: b)
- A +1 and B +1 (short: ab)

A $2^2$ Experiment with 4 runs

         A    B    yield
(1)     -1   -1
b       -1   +1
a       +1   -1
ab      +1   +1

Note:
- CAPITALS for factors and effects (A, BC, CDEF)
- small letters for level combinations (= settings of the experiments) (a, bc, cde, (1))

Graphical display
[Figure: square in the (A, B) plane, both axes running from -1 to +1, with the level combinations (1), a, b and ab at its corners]

[Figure: the same square with the measurements at its corners: (1) = 35, a = 50, b = 40, ab = 60]
2 estimates for effect A:
- at B = -1: 50 - 35 = 15
- at B = +1: 60 - 40 = 20
Which estimate is superior? Combine both estimates: ½(50-35) + ½(60-40) = 17.5

In the same way we estimate the effect B (note that all 4 measurements are used!): ½(40-35) + ½(60-50) = 7.5

The interaction effect AB is the difference between the estimates for the effect A: ½(60-40) - ½(50-35) = 2.5

Interaction effects
Cross terms in linear regression models cause interaction effects:
$Y = 3 + 2 x_A + 4 x_B + 7 x_A x_B$
$x_A \to x_A + 1 \Rightarrow Y \to Y + 2 + 7 x_B$, so the increase depends on $x_B$. Likewise for $x_B \to x_B + 1$.
This explains the notation AB.
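A short numerical check of this statement, using the regression model from the slide. The function name `y` is only illustrative.

```python
def y(x_a, x_b):
    # Model from the slide: Y = 3 + 2 x_A + 4 x_B + 7 x_A x_B
    return 3 + 2 * x_a + 4 * x_b + 7 * x_a * x_b

for x_b in (-1, +1):
    increase = y(1, x_b) - y(0, x_b)  # change in Y when x_A goes from 0 to 1
    print(f"x_B = {x_b:+d}: increase in Y = {increase}, 2 + 7 x_B = {2 + 7 * x_b}")
```

The increase is -5 at $x_B = -1$ and 9 at $x_B = +1$, so the effect of changing $x_A$ indeed depends on $x_B$.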

[Figure "No interaction": Output against Factor A (low, high) with one line for B low and one for B high; plotted values 20, 50, 55, 25]

[Figure "Interaction I": Output against Factor A (low, high) with one line for B low and one for B high; plotted values 20, 50, 55, 45]

[Figure "Interaction II": Output against Factor A (low, high); B low: 55, 50; B high: 20, 45]

[Figure "Interaction III": Output against Factor A (low, high); B low: 55, 20; B high: 20, 45]

Trick to Compute Effects
(coded) settings and measurements:

         I    A    B    AB   yield
(1)      +    -    -    +    35
b        +    -    +    -    40
a        +    +    -    -    50
ab       +    +    +    +    60

Effect estimates:
Effect A = ½(-35 - 40 + 50 + 60) = 17.5
Effect B = ½(-35 + 40 – 50 + 60) = 7.5
Effect AB = ½(60-40) - ½(50-35) = 2.5
The column AB equals the product of the columns A and B.
Computational rules: I×A = A, I×B = B, A×B = AB etc. This holds true in general (i.e., also for more factors).
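The sign-column trick translates directly into code. The following is a minimal sketch using the four yields from the slides; the helper names `sign` and `effect` are assumptions made for illustration.

```python
runs = {"(1)": 35, "b": 40, "a": 50, "ab": 60}  # yields from the slides

def sign(label, factor):
    # +1 if the factor's small letter appears in the level combination, else -1
    return +1 if factor.lower() in label else -1

def effect(word, runs):
    """Effect of a word such as 'A' or 'AB': signed sum of yields divided by N/2."""
    total = 0
    for label, measured in runs.items():
        s = 1
        for factor in word:          # the column of AB is the product of columns A and B
            s *= sign(label, factor)
        total += s * measured
    return total / (len(runs) / 2)

for word in ("A", "B", "AB"):
    print(word, effect(word, runs))
```

This reproduces the estimates 17.5, 7.5 and 2.5 found above.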

3 Factors: a $2^3$ Design

         A    B    C    Yield
(1)      -    -    -    5
a        +    -    -    2
b        -    +    -    7
ab       +    +    -    1
c        -    -    +    7
ac       +    -    +    6
bc       -    +    +    9
abc      +    +    +    7

Scheme $2^3$ design
[Figure: cube with axes A, B and C and the yields at its corners: (1)=5, a=2, b=7, ab=1, c=7, ac=6, bc=9, abc=7]
effect A = ¼(+16-28) = -3
effect AB = ¼(+20-24) = -1
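The same computation can be carried out for all main effects and interactions of the $2^3$ design at once. This is a sketch under the assumption that the yields are stored in a dictionary keyed by level combination; the variable names are illustrative.

```python
from itertools import combinations, product
from math import prod

factors = "ABC"
yields = {"(1)": 5, "a": 2, "b": 7, "ab": 1, "c": 7, "ac": 6, "bc": 9, "abc": 7}

# All 2^3 level combinations as dicts of coded settings, paired with their yield.
design = []
for settings in product((-1, +1), repeat=len(factors)):
    row = dict(zip(factors, settings))
    label = "".join(f.lower() for f in factors if row[f] == +1) or "(1)"
    design.append((row, yields[label]))

# Every main effect and interaction: signed sum of yields divided by N/2.
effects = {}
for order in range(1, len(factors) + 1):
    for word in combinations(factors, order):
        contrast = sum(prod(row[f] for f in word) * y for row, y in design)
        effects["".join(word)] = contrast / (len(design) / 2)

print(effects)
```

The values for A and AB agree with the hand computations ¼(+16-28) = -3 and ¼(+20-24) = -1.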

Back to 2 factors – Blocking
Suppose that we cannot perform all measurements on the same day. We are not interested in the difference between the 2 days, but we must take the effect of this into account. How do we accomplish that?

         I    A    B    AB   day
(1)      +    -    -    +    1
b        +    -    +    -    1
a        +    +    -    -    2
ab       +    +    +    +    2

The day is a “hidden” block effect.

Back to 2 factors – Blocking

         I    A    B    AB   day
(1)      +    -    -    +    -
b        +    -    +    -    -
a        +    +    -    -    +
ab       +    +    +    +    +

We note that the columns A and day are the same. Consequence: the effect of A and the day effect cannot be distinguished. This is called confounding (or aliasing).

Back to 2 factors – Blocking

         I    A    B    AB   day
(1)      +    -    -    +    ?
b        +    -    +    -    ?
a        +    +    -    -    ?
ab       +    +    +    +    ?

A general guideline is to confound the day effect with an interaction of the highest possible order. How can we accomplish that here?

Back to 2 factors – Blocking
Solution: day 1: a, b; day 2: (1), ab (or interchange the days!)

         I    A    B    AB   day
(1)      +    -    -    +    +
b        +    -    +    -    -
a        +    +    -    -    -
ab       +    +    +    +    +

Within each day, decide by drawing lots which experiment must be performed first. In general, the order of experiments must be determined by drawing lots. This is called randomisation.
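The blocking rule and the randomisation step can be sketched as follows. The day is taken from the sign of the AB column, and the drawing of lots is imitated with Python's `random` module; the fixed seed and the variable names are assumptions made only to keep the sketch repeatable.

```python
import random

def sign(label, factor):
    return +1 if factor.lower() in label else -1

runs = ["(1)", "a", "b", "ab"]

# Confound the day effect with AB: the day follows the sign of the AB column.
days = {1: [], 2: []}
for label in runs:
    ab = sign(label, "A") * sign(label, "B")
    days[1 if ab == -1 else 2].append(label)

# Randomisation: draw lots for the run order within each day.
rng = random.Random(2024)  # seed fixed only to make the sketch repeatable
schedule = {day: rng.sample(labels, k=len(labels)) for day, labels in days.items()}
print(schedule)
```

Day 1 then contains a and b, day 2 contains (1) and ab, each in a random order.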

Here is a scheme for 3 factors. Interactions of order 3 or higher can be neglected in practice. How should we divide the experiments over 2 days (day 1 and day 2)?

Fractional experiments
Often the number of parameters is too large to allow a complete $2^p$ design (i.e., all $2^p$ possible settings -1 and +1 of the p factors).
By performing only a well-chosen subset of the $2^p$ experiments, we can still estimate the main effects and (possibly) the 2nd order interactions with relatively few runs.

Fractional experiments

         I    A    B    AB   C    AC   BC   ABC
(1)      +    -    -    +    -    +    +    -
a        +    +    -    -    -    -    +    +
b        +    -    +    -    -    +    -    +
ab       +    +    +    +    -    -    -    -
c        +    -    -    +    +    -    -    +
ac       +    +    -    -    +    +    -    -
bc       +    -    +    -    +    -    +    -
abc      +    +    +    +    +    +    +    +

Fractional experiments

         I    A    B    AB   C    AC   BC   ABC
(1)      +    -    -    +    -    +    +    -
a        +    +    -    -    -    -    +    +
b        +    -    +    -    -    +    -    +
ab       +    +    +    +    -    -    -    -

With this half fraction (only 4 = ½×8 experiments) we see that a number of columns are the same (apart from a minus sign):
I = -C, A = -AC, B = -BC, AB = -ABC
We say that these factors are confounded or aliased. In this particular case we have an ill-chosen fraction, because I and C are confounded.

Fractional experiments – Better Choice: I = ABC

         I    A    B    AB   C    AC   BC   ABC
a        +    +    -    -    -    -    +    +
b        +    -    +    -    -    +    -    +
c        +    -    -    +    +    -    -    +
abc      +    +    +    +    +    +    +    +

Aliasing structure: I = ABC, A = BC, B = AC, C = AB
The other “best choice” would be: I = -ABC
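The aliasing structure of a fraction can be found mechanically by comparing sign columns. The sketch below selects the runs with ABC = +1 and groups the effect words whose columns coincide; the names `column_value` and `groups` are illustrative assumptions.

```python
from itertools import product

words = ["I", "A", "B", "AB", "C", "AC", "BC", "ABC"]

def column_value(word, row):
    """Sign of an effect word in one run (the word 'I' always gives +1)."""
    value = 1
    for f in word:
        if f != "I":
            value *= row[f]
    return value

full = [dict(zip("ABC", s)) for s in product((-1, +1), repeat=3)]
half = [row for row in full if column_value("ABC", row) == +1]  # I = ABC

# Group words whose columns coincide on the half fraction.
groups = {}
for w in words:
    column = tuple(column_value(w, row) for row in half)
    groups.setdefault(column, []).append(w)

for aliased in groups.values():
    print(" = ".join(aliased))
```

For this fraction the four groups are {I, ABC}, {A, BC}, {B, AC} and {AB, C}, in agreement with the aliasing structure on the slide.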

In the case of 3 factors, further reducing the number of experiments is not possible in practice, because this leads to undesired confounding. E.g., with only the runs a and abc:

         I    A    B    AB   C    AC   BC   ABC
a        +    +    -    -    -    -    +    +
abc      +    +    +    +    +    +    +    +

I = A = BC = ABC, B = C = AB = AC
Other quarter fractions also have confounded main effects, which is unacceptable.

Further remarks on fractions
- There exist computational rules for aliases. E.g., it follows from A = C that AB = BC. Note that $I = A^2 = B^2 = C^2$ etc. always holds (see the next lecture).
- Tables and software are available for choosing a suitable fraction. The extent of confounding is indicated by the resolution. Resolution III is a minimum; designs with a higher resolution are very much preferred.
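The computational rules for aliases amount to multiplying effect words and cancelling squared letters. The following is a minimal sketch; the function name `multiply` is an assumption, and it presumes that no factor is itself named I.

```python
def multiply(word1, word2):
    """Product of two effect words, using A*A = B*B = ... = I."""
    # Letters that occur in both words cancel, so the symmetric difference remains.
    letters = set(word1.replace("I", "")) ^ set(word2.replace("I", ""))
    return "".join(sorted(letters)) or "I"

# From A = C it follows (multiply both sides by B) that AB = BC.
print(multiply("A", "B"), "=", multiply("C", "B"))

# Aliases of the main effects under the defining relation I = ABC.
for w in ("A", "B", "C"):
    print(w, "=", multiply(w, "ABC"))
```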

Plackett-Burman designs
So far we discussed fractional designs for screening. This is sensible if one cannot exclude the possibility of interactions. If one knows from foreknowledge that there are no interactions, or if one is for some reason only interested in main effects, then Plackett-Burman designs are preferred. They are able to detect significant main effects using only very few runs. A disadvantage of these designs is their complicated aliasing structure.

Number of measurements
For every main or interaction effect that has to be estimated separately, at least one measurement is necessary. If there are k blocks, then this requires k - 1 additional measurements. The remaining measurements are used for estimation of the variance. It is important to have sufficient measurements for the variance.
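This bookkeeping can be sketched as a one-line function. It assumes, in addition to the rule above, that one measurement is used up by the overall average; that assumption is consistent with the degrees of freedom reported in the analysis tables later in this lecture. The function name `error_df` is illustrative.

```python
def error_df(n_runs, n_effects, n_blocks=1):
    """Degrees of freedom left for the variance estimate."""
    # 1 for the average (assumption), 1 per estimated effect, k - 1 for k blocks
    return n_runs - 1 - n_effects - (n_blocks - 1)

print(error_df(8, 6))              # 2^3 design, main effects + 2-way interactions
print(error_df(8, 3))              # 2^3 design, only main effects
print(error_df(8, 6, n_blocks=2))  # same with 2 blocks: saturated
print(error_df(8, 3, n_blocks=2))
```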

Choice of design
After a design has been chosen, the factors A, B, … must be assigned to the factors of the experiment. It is recommended to combine any foreknowledge on the factors with the alias structure:
- never confound two effects that might both be significant
- if you know that a certain effect will not be significant, you can confound it with an effect that might be significant
The individual measurements must be performed in a random order.

Centre points and Replications
If there are not enough measurements to obtain a good estimate of the variance, then one can perform replications. Another possibility is to add centre points.
[Figure: square in the (A, B) plane with corners (1), a, b and ab and a centre point in the middle]
Adding centre points serves two purposes:
- a better variance estimate
- the possibility to test for curvature using a lack-of-fit test

Curvature
A design in which each factor is only allowed to attain the levels -1 and +1 implicitly assumes a linear model. This is because, when only the function values at -1 and +1 are known, the functions 1 and $x^2$ cannot be distinguished (both equal 1 at these levels). We can distinguish them by adding the level 0. This is the idea behind adding centre points.

Analysis of a Design

         A    B    C    Yield
(1)      -    -    -    5
a        +    -    -    2
b        -    +    -    7
ab       +    +    -    1
c        -    -    +    7
ac       +    -    +    6
bc       -    +    +    9
abc      +    +    +    7

Analysis of a Design – With 2-way Interactions

Analysis Summary
----------------
File name:

Estimated effects for Yield
----------------------------------------------------------------------
average = 5.5  +/- 0.25
A:A     = -3.0 +/- 0.5
B:B     = 1.0  +/- 0.5
C:C     = 3.5  +/- 0.5
AB      = -1.0 +/- 0.5
AC      = 1.5  +/- 0.5
BC      = 0.5  +/- 0.5
----------------------------------------------------------------------
Standard errors are based on total error with 1 d.f.

Analysis of a Design – With 2-way Interactions

Analysis of Variance for Yield
--------------------------------------------------------------------------------
Source          Sum of Squares   Df   Mean Square   F-Ratio   P-Value
--------------------------------------------------------------------------------
A:A                  18.0         1      18.0        36.00     0.1051
B:B                   2.0         1       2.0         4.00     0.2952
C:C                  24.5         1      24.5        49.00     0.0903
AB                    2.0         1       2.0         4.00     0.2952
AC                    4.5         1       4.5         9.00     0.2048
BC                    0.5         1       0.5         1.00     0.5000
Total error           0.5         1       0.5
--------------------------------------------------------------------------------
Total (corr.)        52.0         7

R-squared = 99.0385 percent
R-squared (adjusted for d.f.) = 93.2692 percent
Standard Error of Est. = 0.707107
Mean absolute error = 0.25
Durbin-Watson statistic = 2.5
Lag 1 residual autocorrelation = -0.375
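The sums of squares and F-ratios in this table can be recovered from the estimated effects. The sketch below uses the standard relation for two-level designs, sum of squares = (N/4) × effect², with N the number of runs; the effect values and the corrected total are copied from the output above, and P-values are left out because they need an F-distribution routine.

```python
effects = {"A": -3.0, "B": 1.0, "C": 3.5, "AB": -1.0, "AC": 1.5, "BC": 0.5}
n_runs = 8
total_ss = 52.0  # "Total (corr.)" from the table

# Sum of squares per effect in a two-level design: (N / 4) * effect^2
ss = {name: n_runs / 4 * e ** 2 for name, e in effects.items()}

error_ss = total_ss - sum(ss.values())
error_df = n_runs - 1 - len(effects)
error_ms = error_ss / error_df
f_ratio = {name: s / error_ms for name, s in ss.items()}

for name in effects:
    print(f"{name:3s} SS = {ss[name]:5.1f}  F = {f_ratio[name]:6.2f}")
print("Total error SS =", error_ss, "with", error_df, "d.f.")
```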

Analysis of a Design – Only Main Effects

Analysis Summary
----------------
File name:

Estimated effects for Yield
----------------------------------------------------------------------
average = 5.5  +/- 0.484123
A:A     = -3.0 +/- 0.968246
B:B     = 1.0  +/- 0.968246
C:C     = 3.5  +/- 0.968246
----------------------------------------------------------------------
Standard errors are based on total error with 4 d.f.

Effect estimates remain the same!

Analysis of a Design – Only Main Effects

Analysis of Variance for Yield
--------------------------------------------------------------------------------
Source          Sum of Squares   Df   Mean Square   F-Ratio   P-Value
--------------------------------------------------------------------------------
A:A                  18.0         1      18.0         9.60     0.0363
B:B                   2.0         1       2.0         1.07     0.3601
C:C                  24.5         1      24.5        13.07     0.0225
Total error           7.5         4       1.875
--------------------------------------------------------------------------------
Total (corr.)        52.0         7

R-squared = 85.5769 percent
R-squared (adjusted for d.f.) = 74.7596 percent
Standard Error of Est. = 1.36931
Mean absolute error = 0.8125
Durbin-Watson statistic = 2.16667 (P=0.3180)
Lag 1 residual autocorrelation = -0.125

Analysis of a Design with Blocks

         Block   A    B    C    Yield
(1)        1     -    -    -    5
ab         1     +    +    -    1
ac         1     +    -    +    6
bc         1     -    +    +    9
a          2     +    -    -    2
b          2     -    +    -    7
c          2     -    -    +    7
abc        2     +    +    +    7

Analysis of a Design with Blocks – With 2-way Interactions

Analysis of Variance for Yield
--------------------------------------------------------------------------------
Source          Sum of Squares   Df   Mean Square   F-Ratio   P-Value
--------------------------------------------------------------------------------
A:A                  18.0         1      18.0
B:B                   2.0         1       2.0
C:C                  24.5         1      24.5
AB                    2.0         1       2.0
AC                    4.5         1       4.5
BC                    0.5         1       0.5
blocks                0.5         1       0.5
Total error           0.0         0
--------------------------------------------------------------------------------
Total (corr.)        52.0         7

R-squared = 100.0 percent
R-squared (adjusted for d.f.) = 100.0 percent

Saturated design: 0 df for the error term → no testing possible

Analysis of a Design with Blocks – Only Main Effects

Analysis of Variance for Yield
--------------------------------------------------------------------------------
Source          Sum of Squares   Df   Mean Square   F-Ratio   P-Value
--------------------------------------------------------------------------------
A:A                  18.0         1      18.0         7.71     0.0691
B:B                   2.0         1       2.0         0.86     0.4228
C:C                  24.5         1      24.5        10.50     0.0478
blocks                0.5         1       0.5         0.21     0.6749
Total error           7.0         3       2.33333
--------------------------------------------------------------------------------
Total (corr.)        52.0         7

R-squared = 86.5385 percent
R-squared (adjusted for d.f.) = 76.4423 percent
Standard Error of Est. = 1.52753
Mean absolute error = 0.75
Durbin-Watson statistic = 3.21429 (P=0.0478)
Lag 1 residual autocorrelation = -0.642857

Analysis of a Fractional Design (I = -ABC)

         A    B    C    Yield
(1)      -    -    -    5
ac       +    -    +    6
bc       -    +    +    9
ab       +    +    -    1

Analysis of a Fractional Design (I = -ABC)

Analysis of Variance for Yield
--------------------------------------------------------------------------------
Source          Sum of Squares   Df   Mean Square   F-Ratio   P-Value
--------------------------------------------------------------------------------
A:A-BC               12.25        1      12.25
B:B-AC                0.25        1       0.25
C:C-AB               20.25        1      20.25
Total error           0.0         0
--------------------------------------------------------------------------------
Total (corr.)        32.75        3

R-squared = 100.0 percent
R-squared (adjusted for d.f.) = 0.0 percent

Estimated effects for Yield
----------------------------------------------------------------------
average = 5.25
A:A-BC  = -3.5
B:B-AC  = -0.5
C:C-AB  = 4.5
----------------------------------------------------------------------
No degrees of freedom left to estimate standard errors.
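The labels A-BC, B-AC and C-AB can be read literally: in this half fraction each estimate equals the main effect minus its aliased interaction from the full design. The sketch below checks this with the four yields of the fraction and the effect estimates of the full $2^3$ design reported earlier in this lecture; the helper name `main_effect` is an assumption.

```python
fraction = {"(1)": 5, "ac": 6, "bc": 9, "ab": 1}  # the runs with ABC = -1
full_effects = {"A": -3.0, "B": 1.0, "C": 3.5, "AB": -1.0, "AC": 1.5, "BC": 0.5}

def main_effect(factor, runs):
    # Signed sum of the yields divided by N/2
    contrast = sum((+1 if factor.lower() in label else -1) * y
                   for label, y in runs.items())
    return contrast / (len(runs) / 2)

for factor, partner in (("A", "BC"), ("B", "AC"), ("C", "AB")):
    estimate = main_effect(factor, fraction)
    from_full = full_effects[factor] - full_effects[partner]
    print(f"{factor}-{partner}: fraction {estimate}, full design {from_full}")
```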

Analysis of a Design with Centre Points

          A    B    Yield
(1)       -    -    5
a         +    -    6
b         -    +    9
ab        +    +    1
centre    0    0    8
centre    0    0    8
centre    0    0    7

Pure error: estimated from the three replicated centre points.

Analysis of a Design with Centre Points

Analysis of Variance for Yield
--------------------------------------------------------------------------------
Source          Sum of Squares   Df   Mean Square   F-Ratio   P-Value
--------------------------------------------------------------------------------
A:A                  12.25        1      12.25       36.75     0.0261
B:B                   0.25        1       0.25        0.75     0.4778
AB                   20.25        1      20.25       60.75     0.0161
Lack-of-fit          10.0119      1      10.0119     30.04     0.0317
Pure error            0.666667    2       0.333333
--------------------------------------------------------------------------------
Total (corr.)        43.4286      6

R-squared = 75.4112 percent
R-squared (adjusted for d.f.) = 50.8224 percent
Standard Error of Est. = 0.57735
Mean absolute error = 1.18367
Durbin-Watson statistic = 0.801839 (P=0.1157)
Lag 1 residual autocorrelation = 0.524964

P-Value < 0.05 → Lack-of-fit!
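The lack-of-fit and pure error lines can be reproduced by hand. The sketch below uses the usual single-degree-of-freedom curvature sum of squares, which compares the mean of the factorial runs with the mean of the centre points; that this is the formula behind the software output is an assumption, although the numbers agree. The P-value is omitted because it needs an F-distribution routine.

```python
from statistics import mean

factorial = [5, 6, 9, 1]   # yields at (1), a, b, ab
centre = [8, 8, 7]         # yields at the three centre points

n_f, n_c = len(factorial), len(centre)

# Curvature (lack-of-fit) sum of squares with 1 d.f.
curvature_ss = n_f * n_c * (mean(factorial) - mean(centre)) ** 2 / (n_f + n_c)

# Pure error from the spread of the replicated centre points (n_c - 1 d.f.)
pure_error_ss = sum((y - mean(centre)) ** 2 for y in centre)
pure_error_ms = pure_error_ss / (n_c - 1)

f_ratio = curvature_ss / pure_error_ms
print(round(curvature_ss, 4), round(pure_error_ss, 4), round(f_ratio, 2))
```

The rounded results, 10.0119, 0.6667 and 30.04, match the table.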

Confirmation experiment
When a screening design has been performed and analysed, this yields a list of significant factors and interactions. Before continuing with follow-up experiments for determining optimal settings of the significant factors and interactions, it is recommended to verify the results through a confirmation experiment.

Short checklist for DOE
- clearly state the objective of the experiment
- check constraints on the experiment
  - constraints on factor combinations and/or changes
  - constraints on size of experiment
- make sure that measurements are obtained under constant external conditions (if not, apply blocking!)
- include centre points to validate model assumptions
  - check of constant variance
  - check of non-linearity
- make a clear protocol of execution of the experiment (including randomised order of measurements)

Software
- Statgraphics: menu Special -> Experimental Design
- StatLab: http://www.win.tue.nl/statlab2/
- Design Wizard (illustrates blocks and fractions): http://www.win.tue.nl/statlab2/designApplet.html
- Box (simple optimization illustration): http://www.win.tue.nl/~marko/box/box.html

Literature
- J. Trygg and S. Wold, Introduction to Experimental Design – What is it? Why and Where is it Useful?, homepage of chemometrics, editorial August 2002: www.acc.umu.se/~tnkjtg/Chemometrics/editorial/aug2002.html
- V. Czitrom, One-Factor-at-a-Time Versus Designed Experiments, American Statistician 53 (1999), 126-131
- Thumbnail Handbook for Factorial DOE, StatEase
