Introduction to design Olav M. Kvalheim
Content: Making your data work twice; Effect of correlation on data interpretation; Effect of interaction on data interpretation
Chemometrics/Infometrics: design of information-rich experiments and use of multivariate methods to extract the maximum relevant information from data
Making your data work twice
What is information? Three cases, A, B and C. A: mean value, no standard deviation given. B: mean value with standard deviation given, large standard deviation. C: mean value, low standard deviation.
Measurement strategy? (Figure: unknowns A and B, calibration weights.) Hotelling (1944) Ann. Math. Statistics 15,
The univariate weighing design. Weigh A and B separately: $m_A \pm \sigma_A$ and $m_B \pm \sigma_B$, with $\sigma_A = \sigma_B = \sigma$. Precision is $\sigma$ for both A and B.
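The multivariate design. Weigh A and B jointly to determine sum and difference: $m_A + m_B = S$ and $m_A - m_B = D$, so that $m_A = \tfrac{1}{2}S + \tfrac{1}{2}D$ and $m_B = \tfrac{1}{2}S - \tfrac{1}{2}D$. Precision for S: $\sigma$. Precision for D: $\sigma$. Precision is $\sigma/\sqrt{2} \approx 0.7\sigma$ for both A and B.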
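The gain from weighing sum and difference can be checked numerically. The following is a minimal sketch, assuming independent weighing errors with a common standard deviation (set to 1 here), so that the covariance of the least-squares estimates is sigma^2 (X^T X)^-1. The function name `estimator_std` and the coding of the design rows are illustrative assumptions, not part of the slides.

```python
import math

def estimator_std(design, sigma=1.0):
    """Standard deviation of each least-squares mass estimate for a 2x2 design.

    Each row of `design` is one weighing; an entry is +1 or -1 depending on
    the pan the mass sits on, or 0 if the mass is absent. Independent errors
    with standard deviation `sigma` are assumed (hypothetical setup).
    """
    (a, b), (c, d) = design
    # Entries of X^T X for the design matrix X = [[a, b], [c, d]]
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    det = p * r - q * q
    # The diagonal of (X^T X)^-1 holds the variance factors of m_A and m_B
    return sigma * math.sqrt(r / det), sigma * math.sqrt(p / det)

univariate = [(1, 0), (0, 1)]     # weigh A alone, then B alone
multivariate = [(1, 1), (1, -1)]  # weigh the sum S, then the difference D

std_uni = estimator_std(univariate)
std_multi = estimator_std(multivariate)
print("univariate:  ", std_uni)
print("multivariate:", std_multi)
```

Both designs use two weighings; only the combination of masses on the balance differs.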
Univariate vs bivariate strategy: precision is improved by about 30% by using a multivariate design with the same number of measurements as the univariate design!
Univariate vs multivariate weighing: with N masses to weigh, a multivariate design provides an estimate of each mass with a precision of $\sigma/\sqrt{N}$. The larger the number of unknowns, the larger the gain in precision using a multivariate weighing design.
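The N-mass case can be illustrated with a design whose columns are mutually orthogonal. The sketch below assumes that N is a power of two, that a two-pan balance is used (so every mass enters every weighing as +1 or -1), and that errors are independent with unit standard deviation. It builds the design by Sylvester doubling of a Hadamard matrix; the function names are hypothetical.

```python
import math

def sylvester_hadamard(k):
    """Return a 2^k x 2^k matrix of +1/-1 entries with orthogonal columns."""
    h = [[1]]
    for _ in range(k):
        # Doubling step: [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h

def orthogonal_design_std(design, sigma=1.0):
    """Per-mass standard deviation for a design with orthogonal columns."""
    n_cols = len(design[0])
    # Verify orthogonality, so that X^T X is diagonal and trivially inverted
    for i in range(n_cols):
        for j in range(i + 1, n_cols):
            assert sum(row[i] * row[j] for row in design) == 0
    return [sigma / math.sqrt(sum(row[j] ** 2 for row in design))
            for j in range(n_cols)]

for k in (1, 2, 3):
    n = 2 ** k
    stds = orthogonal_design_std(sylvester_hadamard(k))
    # Compare the computed precision with 1/sqrt(N)
    print(n, stds[0], 1 / math.sqrt(n))
```

Weighing each of the N masses separately would use the same N weighings but leave the precision at sigma for every mass.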
Effect of correlation on data interpretation (figure: variables X1 and X2)
Example: process output is a function of temperature and amount of catalyst.
Correlation between amount of catalyst and amount produced: strong positive correspondence.
Correlation between temperature and produced amount: weak positive correspondence.
Conclusion from correlation analysis: increase amount of catalyst and temperature to increase production.
Result of test: produced amount was lowered!
Bivariate regression model: Produced amount = $b_1$ * Catalyst + $b_2$ * Temperature, where $b_1$ and $b_2$ are the fitted regression coefficients.
Correlation between temperature and amount of catalyst: strong positive correspondence.
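The trap in this example can be reproduced with a small made-up data set. The sketch below is not the data behind the slides: it assumes a process in which catalyst raises output and temperature lowers it, with temperature strongly correlated with catalyst. All numbers and function names are hypothetical.

```python
import math

# Made-up process data (not from the source): temperature tracks catalyst
catalyst = [1, 2, 3, 4, 5, 6, 7, 8]
offset = [1, -1, -1, 1, 1, -1, -1, 1]
temperature = [c + e for c, e in zip(catalyst, offset)]
# Assumed "true" process: catalyst helps (+3), temperature hurts (-2)
produced = [3 * c - 2 * t for c, t in zip(catalyst, temperature)]

def centered_cross(x, y):
    """Sum of products of deviations from the means."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y))

def corr(x, y):
    """Pearson correlation coefficient."""
    return centered_cross(x, y) / math.sqrt(
        centered_cross(x, x) * centered_cross(y, y))

def bivariate_regression(x1, x2, y):
    """Least-squares slopes of y on x1 and x2 (intercept removed by centering)."""
    s11, s22 = centered_cross(x1, x1), centered_cross(x2, x2)
    s12 = centered_cross(x1, x2)
    s1y, s2y = centered_cross(x1, y), centered_cross(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

r_cat = corr(catalyst, produced)
r_temp = corr(temperature, produced)
r_ct = corr(catalyst, temperature)
b_cat, b_temp = bivariate_regression(catalyst, temperature, produced)
print("corr(catalyst, produced)    =", round(r_cat, 3))
print("corr(temperature, produced) =", round(r_temp, 3))
print("corr(catalyst, temperature) =", round(r_ct, 3))
print("regression slopes           =", b_cat, b_temp)
```

With these made-up numbers, both variables correlate positively with the produced amount, yet the regression slope for temperature is negative. The positive correlation of temperature with output is inherited from its correlation with catalyst.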
Solution to correlation problem: multivariate design. Change many process variables simultaneously according to an experimental design.
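One common kind of experimental design is the two-level full factorial, used here only as an illustration since the slides do not name a specific design. A minimal sketch, assuming each process variable is coded as -1 (low) or +1 (high); the function names are hypothetical.

```python
from itertools import product

def full_factorial(n_factors):
    """All 2^n combinations of low (-1) and high (+1) coded factor levels."""
    return [list(levels) for levels in product((-1, 1), repeat=n_factors)]

def column_dot(design, i, j):
    """Dot product of two design columns; zero means they are uncorrelated."""
    return sum(run[i] * run[j] for run in design)

# Two factors, e.g. column 0 = catalyst and column 1 = temperature
design = full_factorial(2)
for run in design:
    print(run)
print("dot product of the two columns:", column_dot(design, 0, 1))
```

Because every column is balanced and every pair of columns is orthogonal, the settings of the variables are uncorrelated by construction, which removes the ambiguity seen in the catalyst and temperature example.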
Effect of interaction on data interpretation (figure: variables X1 and X2)
The yield of a chemical reaction is a function of temperature (t) and concentration (c): $y = f(t, c)$. The task: optimise the yield for the reaction!
Figure: response surface in the presence of interaction (axes: concentration, M; temperature, ºC).
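Why interaction matters for optimisation can be sketched with a hypothetical yield surface. The function below is invented for illustration and is not the surface shown in the figure. It uses coded settings 0 to 10 for both variables and contains a cross term in t and c. The one-variable-at-a-time search is run for a single pass over each variable, starting from an arbitrarily chosen concentration.

```python
def yield_pct(t, c):
    """Hypothetical yield surface with a t*c interaction (not from the source)."""
    return 100 - 4 * (t - c) ** 2 - (t + c - 14) ** 2

levels = range(0, 11)  # coded settings 0..10 for temperature and concentration

# One variable at a time: vary temperature at a fixed starting concentration,
# then vary concentration at the best temperature found.
c_start = 2
t_best = max(levels, key=lambda t: yield_pct(t, c_start))
c_best = max(levels, key=lambda c: yield_pct(t_best, c))
cost_result = (t_best, c_best, yield_pct(t_best, c_best))

# Reference: exhaustive search over the whole grid of settings
grid_result = max(((t, c, yield_pct(t, c)) for t in levels for c in levels),
                  key=lambda r: r[2])

print("one variable at a time:", cost_result)
print("true optimum on grid:  ", grid_result)
```

On this invented surface, the best temperature depends on the concentration, so a single pass over each variable stops short of the optimum found by searching both variables together.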
Figure: efficiency of information extraction. Information versus number of experiments for the univariate design (COST) and the multivariate design.
Multivariate design vs. univariate design: correct models possible (interactions); efficient experimentation; improved precision/information quality.
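The point about interactions can be made concrete with four runs of a two-level factorial design. A minimal sketch, assuming a noise-free, made-up response in coded units (-1 and +1) of the form y = b0 + b1*x1 + b2*x2 + b12*x1*x2. The coefficients 70, 5, 3 and 4 are hypothetical and serve only to show that the design recovers them.

```python
from itertools import product

def true_yield(x1, x2):
    """Hypothetical response in coded units, including an interaction term."""
    return 70 + 5 * x1 + 3 * x2 + 4 * x1 * x2

# Four runs: every combination of low (-1) and high (+1) for both variables
runs = list(product((-1, 1), repeat=2))
y = [true_yield(x1, x2) for x1, x2 in runs]
n = len(runs)

# With orthogonal +1/-1 columns, each coefficient is a signed average of y
b0 = sum(y) / n
b1 = sum(x1 * yi for (x1, _), yi in zip(runs, y)) / n
b2 = sum(x2 * yi for (_, x2), yi in zip(runs, y)) / n
b12 = sum(x1 * x2 * yi for (x1, x2), yi in zip(runs, y)) / n

print("b0, b1, b2, b12 =", b0, b1, b2, b12)
```

All four runs contribute to every coefficient, including the interaction term b12. A design that changes one variable at a time from a common starting point never observes both variables at their high level together, so it cannot estimate b12.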