1 Design of experiments for computer simulations

Let X = (X_1, ..., X_p) ∈ R^p denote the vector of input values chosen for the computer program. Each X_j is continuously adjustable between a lower and an upper limit, or between 0 and 1 after transformation. Let Y = (Y_1, ..., Y_q) ∈ R^q denote the vector of q output quantities:

    Y = f(X),  X ∈ [0,1]^p

Important considerations:
- the number of inputs, p
- the number of outputs, q
- the speed with which f can be computed
- the runs are deterministic, not stochastic

Why is a statistical approach called for?
2 Design of experiments for computer simulations

The conventional one-factor-at-a-time approach:
- may miss good combinations of X because it does not fully explore the design space
- is slow, especially when p is large
- may be misleading when interactions among the components of X are strong

Randomness is required in order to generate probability statements or confidence intervals. It can be introduced in two ways:
- by modeling the function f as a realization of a Gaussian process
- by taking random input points
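The second route, random input points, can be sketched in a few lines of Python. The simulator f below is a hypothetical stand-in for a deterministic code; the randomness comes entirely from the sampled inputs, which is what licenses the confidence interval:

```python
import math
import random

def f(x1, x2):
    """Hypothetical stand-in for an expensive deterministic simulator."""
    return math.sin(3 * x1) + x2 ** 2

random.seed(0)
n = 1000
# Random input points in [0,1]^2 supply the randomness that the deterministic
# code f lacks, so ordinary sampling theory applies to the outputs.
ys = [f(random.random(), random.random()) for _ in range(n)]

mean = sum(ys) / n
sd = math.sqrt(sum((y - mean) ** 2 for y in ys) / (n - 1))
half = 1.96 * sd / math.sqrt(n)   # approximate 95% CI half-width for the mean of f
print(f"mean output over [0,1]^2: {mean:.3f} +/- {half:.3f}")
```

The same interval logic carries over unchanged when f is a real simulation code: the inputs, not the code, are the source of randomness.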
3 Goals in computer experiments

Optimization. Standard optimization methods (e.g. quasi-Newton or conjugate gradients) can be unsatisfactory for computer experiments because they usually require first and possibly second derivatives of f. Standard methods also depend strongly on having good starting values. Computer experimentation is useful in the early stages of optimization, where one is searching for a suitable starting value, and for locating several widely separated regions of the predictor space that might all have good Y values.
4 Goals in computer experiments

Visualization. Being able to compute a function f at any given X does not necessarily imply that one "understands" the function. Computer simulation results can be used to help identify strong dependencies.

Approximation. If the original program f is exceedingly expensive to evaluate, it may be approximated by some very simple function that holds adequately in a region of interest, though not necessarily over the entire domain of f. Optimization may then be done using a large number of runs of the simple function.
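The approximation idea can be illustrated with a minimal sketch: three "expensive" runs, a quadratic surrogate through them, and the surrogate's minimum taken analytically. The function cosh(x - 0.6), with its minimum at x = 0.6, is only a stand-in for the expensive code:

```python
import math

def f(x):
    # Hypothetical stand-in for an expensive simulator; true minimizer is x = 0.6.
    return math.cosh(x - 0.6)

# Three "expensive" runs on a coarse, equally spaced grid over [0, 1].
x0, x1, x2 = 0.0, 0.5, 1.0
f0, f1, f2 = f(x0), f(x1), f(x2)

# The unique quadratic through the three points has its vertex at
# x1 + h*(f0 - f2) / (2*(f0 - 2*f1 + f2)) for spacing h; use it as a cheap
# approximate minimizer instead of running f many more times.
h = x1 - x0
x_star = x1 + h * (f0 - f2) / (2 * (f0 - 2 * f1 + f2))
print(f"surrogate minimizer: {x_star:.4f}")   # close to the true minimizer 0.6
```

Only three evaluations of f are needed; all further work is done on the surrogate, which is the point of the approximation goal above.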
5 Approaches to computer experiments

There are two main statistical approaches to computer experiments: one is based on Bayesian statistics; the other is a frequentist one based on sampling techniques. It is essential to introduce randomness in both approaches.

Frequentist approach. For a scalar function Y = f(X), consider a regression model

    Y = f(X) ≈ β'Z(X),

where Z(X) is a vector of regressors. The coefficients can be determined by the least squares method with respect to some distribution F on [0,1]^p:

    β_LS = ( ∫ Z(X) Z(X)' dF )^{-1} ∫ Z(X) f(X) dF

The quality of the approximation may be assessed globally by the integrated mean squared error

    ∫ ( f(X) - β'Z(X) )^2 dF
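A Monte Carlo sketch of the least squares formula, under simple illustrative assumptions: basis Z(x) = (1, x), F uniform on [0,1], and f(x) = e^x standing in for the simulator. Both integrals in the formula for β_LS are estimated by sampling from F:

```python
import math
import random

def f(x):
    """Hypothetical scalar simulator output."""
    return math.exp(x)

# Basis Z(x) = (1, x); F = uniform on [0,1].  Estimate the moment matrix
# ∫ Z Z' dF and the vector ∫ Z f dF by Monte Carlo over draws from F.
random.seed(1)
n = 200_000
m = [[0.0, 0.0], [0.0, 0.0]]   # estimate of ∫ Z Z' dF
v = [0.0, 0.0]                 # estimate of ∫ Z f dF
for _ in range(n):
    x = random.random()
    z = (1.0, x)
    for i in range(2):
        v[i] += z[i] * f(x) / n
        for j in range(2):
            m[i][j] += z[i] * z[j] / n

# Solve the 2x2 normal equations m * beta = v directly.
det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
beta = ((m[1][1] * v[0] - m[0][1] * v[1]) / det,
        (m[0][0] * v[1] - m[1][0] * v[0]) / det)
# Exact L2-best line for e^x on [0,1] is (4e - 10) + (18 - 6e) x, i.e. about 0.873 + 1.690 x.
print(f"beta_LS ~ ({beta[0]:.3f}, {beta[1]:.3f})")
```

With a real simulator the only change is replacing f and enlarging Z; the integrals are estimated from the design points exactly as here.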
6 Frequentist experimental design

Assume the region of interest is the unit cube [0,1]^p, with p = 5.

Grids (choose k different values for each of X_1 through X_p and run all k^p combinations) work well but are completely impractical when p is large. In situations where one of the responses Y_k depends very strongly on only one or two of the inputs X_j, the grid design leads to much wasteful duplication.
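A quick check of how fast k^p grows, with k = 5 values per input and p = 5 inputs as in the setting above:

```python
from itertools import product

k, p = 5, 5
# Full grid: every combination of k equally spaced values in each of p inputs.
grid = list(product([i / (k - 1) for i in range(k)], repeat=p))
print(len(grid))   # k**p = 3125 runs already at p = 5
# The run count grows as k**p: the same grid at p = 10 needs 9,765,625 runs.
```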
7 Frequentist experimental design

Good lattice points (based on number theory)
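One standard number-theoretic construction is the Korobov lattice: the i-th point is the fractional part of i · (1, g, g^2, ..., g^(p-1)) / n. A sketch, with n = 17 and generator g = 3 chosen purely for illustration:

```python
def korobov_lattice(n, g, p):
    """Good lattice points: x_i = frac(i * (1, g, g^2, ..., g^(p-1)) / n)."""
    gens = [pow(g, j, n) for j in range(p)]
    return [tuple(((i * gj) % n) / n for gj in gens) for i in range(n)]

pts = korobov_lattice(n=17, g=3, p=5)   # n prime; g = 3 is an illustrative choice
# Because n is prime and no generator is a multiple of n, every 1-D projection
# visits each of the 17 grid values 0/17, 1/17, ..., 16/17 exactly once.
for j in range(5):
    assert sorted(x[j] for x in pts) == [i / 17 for i in range(17)]
```

The quality of the lattice depends on the choice of g; tables of "good" generators for given n and p exist in the number-theoretic literature.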
8 Frequentist experimental design

Latin hypercubes
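A minimal Latin hypercube sampler in plain Python, using the standard stratified-permutation construction (n = 17 runs for p = 5 inputs, sizes chosen for illustration):

```python
import random

def latin_hypercube(n, p, rng):
    """One random Latin hypercube: each input is stratified into n equal bins,
    and each bin is used exactly once per input."""
    cols = []
    for _ in range(p):
        perm = list(range(n))
        rng.shuffle(perm)                              # random bin order per input
        cols.append([(k + rng.random()) / n for k in perm])  # jitter inside each bin
    return list(zip(*cols))   # n points in [0,1]^p

rng = random.Random(42)
pts = latin_hypercube(n=17, p=5, rng=rng)
# Every 1-D projection has exactly one point per bin [k/17, (k+1)/17).
for j in range(5):
    assert sorted(int(x[j] * 17) for x in pts) == list(range(17))
```

Unlike the grid, the run count n is decoupled from p, yet each input is still explored at n distinct levels.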
9 Frequentist experimental design

Randomized orthogonal arrays
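A small illustration built on the textbook strength-2 array OA(9, 4, 3, 2) over GF(3). This is a sketch: a fully randomized orthogonal array would also apply random level permutations within each column, which is omitted here for brevity:

```python
import random

# Strength-2 orthogonal array OA(9, 4, 3, 2): rows indexed by (a, b) in GF(3)^2,
# columns given by the linear forms a, b, a+b, a+2b (mod 3).
oa = [(a, b, (a + b) % 3, (a + 2 * b) % 3) for a in range(3) for b in range(3)]

# Strength 2: every pair of columns contains each of the 9 level pairs exactly once,
# so all 2-D projections of the design are full 3x3 grids.
for c1 in range(4):
    for c2 in range(c1 + 1, 4):
        pairs = sorted((r[c1], r[c2]) for r in oa)
        assert pairs == [(i, j) for i in range(3) for j in range(3)]

# Randomization step (simplified): jitter each level into its third of [0,1].
rng = random.Random(7)
pts = [tuple((lvl + rng.random()) / 3 for lvl in row) for row in oa]
```

The resulting nine points cover every pairwise combination of levels for four inputs, which is what makes orthogonal arrays attractive when low-order interactions dominate.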
10 Example – critical specimen size study
11 W_critical = f(t, h; E, σ_y, σ_0, e; k)

[Figure: specimen size requirements for tensile shear tests of 0.8 mm gauge steel sheets; annotated dimensions include 19 mm and 76 mm (ANSI/AWS), 25 mm and 102 mm (MIL), and 35 mm, 45 mm, and 105 mm (ISO).]
12 [Figure: response quantities of the tensile shear test: peak load P, energy E, and maximum displacement D.]

13 [Figure: peak load (kN) results.]
14 Table 1. Ranges selected for computer simulation.

    t (mm)       0.5 ~ 2.0
    h (mm)       0.1 ~ 1.5
    E (GPa)      190 ~ 200
    σ_y (MPa)    205 ~ 1725
    σ_0 (MPa)    50 ~ 200
    e (%)        2 ~ 65
    k            1.0 ~ 3.0

Table 2. Design matrix and simulation results.

    Run | v1: t (mm) | v2: h (mm) | v3: E (MPa) | v4: k | v5: σ_y (MPa) | v6: e (%) | v7: σ_uts (MPa) | W_critical (mm)
15 Coupon size determination simulation

A two-level full factorial would require 2^7 = 128 runs. In the computer experiment, N levels of each variable can be chosen (based on the number of variables n), and N is also the total number of runs needed: N = 17 for the seven variables in the example. The computer simulation results are then used to model the dependence of critical specimen size on the variables by the Kriging regression method.
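The Kriging step can be sketched minimally: a squared-exponential correlation, weights from one linear solve, and a predictor that interpolates the runs exactly. This is a one-input illustration with a hypothetical sin(2x) standing in for the simulation output; the actual study fit W_critical on seven inputs:

```python
import math

def corr(a, b, ell=0.2):
    """Squared-exponential correlation, a common Kriging choice (ell assumed)."""
    return math.exp(-((a - b) ** 2) / (2 * ell ** 2))

def solve(A, b):
    """Gauss-Jordan elimination with partial pivoting for a small dense system."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c:
                fac = M[r][c] / M[c][c]
                M[r] = [mr - fac * mc for mr, mc in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

# Hypothetical "simulation" outputs at 4 design points.
xs = [0.0, 0.3, 0.6, 1.0]
ys = [math.sin(2 * x) for x in xs]

K = [[corr(a, b) for b in xs] for a in xs]   # correlation matrix of the runs
w = solve(K, ys)                             # weights from one linear solve

def predict(x):
    """Kriging-style predictor: a correlation-weighted combination of the runs,
    which reproduces each run exactly (no noise term for deterministic codes)."""
    return sum(wi * corr(x, xi) for wi, xi in zip(w, xs))

assert all(abs(predict(x) - y) < 1e-6 for x, y in zip(xs, ys))
```

Exact interpolation is the appropriate behavior here because, as noted earlier, the simulation runs are deterministic: there is no observation noise to smooth over.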
16 Fitted approximations of increasing complexity (the coefficients and exponents did not survive extraction; only the variables entering each model are shown):

    W_critical,1 = g_1(t)
    W_critical,2 = g_2(t, σ_y, h)
    W_critical,3 = g_3(t, σ_y, h, E, e·(σ_uts − σ_y))