Sampling plans

Sampling plans
Given a domain, we can reduce the prediction error by a good choice of the sampling points.
The choice of sampling locations is called "design of experiments" or DOE.
The simplest DOE is the full factorial design, where we sample each variable (factor) at a fixed number of values (levels).
For example, with four factors and three levels each we sample 3^4 = 81 points.
Full factorial design is not practical except in low dimensions.
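
A minimal sketch of building such a design in Matlab (fullfact is in the Statistics and Machine Learning Toolbox); the rescaling of the levels to [0,1] is an illustrative assumption:

levels = [3 3 3 3];                  % 4 factors, 3 levels each
plan = fullfact(levels);             % 3^4 = 81 rows of level indices 1..3
plan = (plan - 1) ./ (levels - 1);   % rescale indices to points in [0,1]^4
size(plan)                           % 81 x 4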

Prediction variance for full factorial design
Recall that the standard error (the square root of the prediction variance) is $s_{\hat y}(\mathbf{x}) = \sigma \sqrt{\mathbf{x}^{(m)T} (X^T X)^{-1} \mathbf{x}^{(m)}}$, where $\mathbf{x}^{(m)}$ is the vector of monomials evaluated at the prediction point.
We start with the simplest design domain: a box, normalized to $-1 \le x_i \le 1$.
The cheapest full factorial design has two levels (not good for quadratic polynomials).
For a linear polynomial in two variables, $X^T X = 4I$, so the standard error is $\sigma \sqrt{(1 + x_1^2 + x_2^2)/4}$.
The maximum error is at the vertices. Why do we get this result?
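
This is easy to check numerically; a minimal sketch for the two-variable case, with sigma normalized to 1:

% Two-level full factorial for a linear model in two variables:
% the four vertices of the box [-1,1]^2
X = [ones(4,1), [-1 -1; -1 1; 1 -1; 1 1]];   % columns: 1, x1, x2
XtXi = inv(X'*X);                            % X'*X = 4*eye(3), so the design is orthogonal
[x1,x2] = meshgrid(linspace(-1,1,21));
v = zeros(size(x1));
for i = 1:numel(x1)
    xm = [1; x1(i); x2(i)];                  % monomial vector at the prediction point
    v(i) = xm'*XtXi*xm;                      % prediction variance / sigma^2
end
[min(v(:)) max(v(:))]                        % 1/4 at the origin, 3/4 at the four vertices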

Designs for linear RS
Traditionally use only two levels.
A design is orthogonal when $X^T X$ is diagonal.
Full factorial design is orthogonal; it is not so easy to produce other orthogonal designs with fewer points.
Stability, meaning small variation of the prediction variance over the domain, is also a desirable property.

Example
Compare an orthogonal design based on an equilateral triangle to a right triangle with vertices at box corners; both are saturated.
Here we take the equilateral triangle with vertices on the circle of radius $\sqrt{2}$ through the corners of the box $[-1,1]^2$, and the right triangle with vertices (0,0), (1,0), (0,1).
Linear polynomial: $y = b_1 + b_2 x_1 + b_3 x_2$.
For the right triangle we obtain $\mathrm{Var}[\hat y]/\sigma^2 = 1 - 2x_1 - 2x_2 + 2x_1^2 + 2x_2^2 + 2x_1 x_2$.

Comparison
For the equilateral triangle the prediction variance is $\mathrm{Var}[\hat y]/\sigma^2 = (1 + x_1^2 + x_2^2)/3$.
The maximum variance, at (1,1), is three times larger than the lowest one (1 versus 1/3 at the origin).
For the right triangle the maximum variance, 3 at (1,1), is nine times the lowest, 1/3 at (1/3,1/3).
A fairer comparison is when we restrict the equilateral triangle to lie inside the box; the prediction variance is then doubled.
The maximum error and the stability are still better, but the variance of the coefficients is not as good.
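
Both variance ratios can be verified numerically; a minimal sketch, assuming the vertex coordinates chosen above:

% Saturated designs for y = b1 + b2*x1 + b3*x2
a = pi/2 + [0 2*pi/3 4*pi/3]';                      % equilateral triangle, circumradius sqrt(2)
Xeq = [ones(3,1), sqrt(2)*[cos(a) sin(a)]];
Xrt = [1 0 0; 1 1 0; 1 0 1];                        % right triangle (0,0), (1,0), (0,1)
pv  = @(X,x1,x2) [1 x1 x2]*inv(X'*X)*[1; x1; x2];   % prediction variance / sigma^2
[pv(Xeq,1,1) pv(Xeq,0,0)]                           % 1 and 1/3: ratio 3
[pv(Xrt,1,1) pv(Xrt,1/3,1/3)]                       % 3 and 1/3: ratio 9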

Quadratic RS
Need at least (n+1)(n+2)/2 points.
Need at least three points in every direction.
The simplest DOE is the three-level full factorial design.
It is impractical for n > 5, and the ratio between the number of points and the number of coefficients is also unreasonable: for example, for n = 8 we get 3^8 = 6561 samples for 45 coefficients.
My rule of thumb is that you want twice as many points as coefficients.
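
A one-line sketch of how quickly the point count outruns the coefficient count:

% Three-level full factorial points vs. quadratic coefficients
for n = 1:8
    fprintf('n = %d: 3^n = %5d points for %2d coefficients\n', n, 3^n, (n+1)*(n+2)/2);
end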

Central Composite Design
Includes the 2^n vertices and 2n axial ("face") points, plus n_c repetitions of the central point.
The distance α of the axial points from the center can vary. We can choose α to
– achieve a spherical design
– achieve rotatability (the prediction variance is a spherical function)
– stay in the box (face-centered design, FCCCD)
Still impractical for n > 8.
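
These variants can be generated with ccdesign from Matlab's Statistics and Machine Learning Toolbox; a sketch for n = 3:

% Central composite designs for n = 3: 2^3 = 8 vertices + 2*3 = 6 axial points + center points
d_sph = ccdesign(3,'type','circumscribed');   % spherical CCD: axial points at distance alpha > 1
d_fc  = ccdesign(3,'type','faced');           % face-centered CCD (FCCCD): alpha = 1, stays in box
size(d_sph)                                   % 8 + 6 + (default number of center points) rows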

Spherical CCD
Figure: spherical CCD, from Myers and Montgomery's Response Surface Methodology, Figure 7.4 in the 1995 edition (Fig. 7.5 is on the next slide).

Spherical CCD for n=3

Repeated observations at origin
Unlike for linear designs, the prediction variance is high at the origin.
Repetition at the origin decreases the variance there and improves stability.
What other rationale is there for choosing the origin for repetition?
Repetition also gives an independent measure of the magnitude of the noise, which can also be used for lack-of-fit tests.
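
A sketch of the pure-error noise estimate obtained from center replicates (the observation values are made up for illustration):

yc = [2.11; 1.93; 2.04; 1.88; 2.07];   % hypothetical replicate observations at the origin
sigma_hat = std(yc)                    % noise estimate, independent of any fitted surrogate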

Without repetition (9 points)
Figure: contours of the prediction variance for the spherical CCD design, from Myers and Montgomery's Response Surface Methodology, Figure 7.10 in the 1995 edition (the companion figure is on the next slide).

Center repeated 5 times (13 points)
d = ccdesign(2,'center','uniform')
The returned d contains the four vertices, the four axial points, and five replicates of the center point.

Variance optimal designs
Full factorial designs and CCDs are not flexible in the number of points.
The standard errors of the coefficients are the square roots of the diagonal of $\Sigma_b = \sigma^2 (X^T X)^{-1}$.
A key to most optimal DOE methods is the moment matrix $M = X^T X / n_y$, where $n_y$ is the number of points.
A good design of experiments will maximize the terms in this matrix, especially the diagonal elements.
D-optimal designs maximize the determinant of the moment matrix, which is inversely proportional to the square of the volume of the confidence region on the coefficients.

Example
Given the model $y = b_1 x_1 + b_2 x_2$ and the two data points (0,0) and (1,0), find the optimum third data point (p,q) in the unit square.
We have $\det(X^T X) = (1 + p^2)q^2 - (pq)^2 = q^2$, which is maximized by q = 1 regardless of p, so the third point is (p,1) for any value of p.
Finding a D-optimal design in higher dimensions is a difficult optimization problem, often solved heuristically.
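
A brute-force scan over the unit square confirms this; a minimal sketch:

% det(X'X) for X = [0 0; 1 0; p q] and the model y = b1*x1 + b2*x2
[p,q] = meshgrid(0:0.05:1);
d = (1 + p.^2).*q.^2 - (p.*q).^2;   % simplifies to q.^2, independent of p
max(d(:))                           % 1, attained along the entire edge q = 1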

Matlab example
>> ny=6; nbeta=6;
>> [dce,x] = cordexch(2,ny,'quadratic');
>> dce'
>> scatter(dce(:,1),dce(:,2),200,'filled')
>> det(x'*x)/ny^nbeta
ans =

With 12 points:
>> ny=12;
>> [dce,x] = cordexch(2,ny,'quadratic');
>> dce'
>> scatter(dce(:,1),dce(:,2),200,'filled')
>> det(x'*x)/ny^nbeta
ans = 0.0102

Other criteria
A-optimality minimizes the trace of the inverse of the moment matrix, i.e., the sum of the variances of the coefficients.
G-optimality minimizes the maximum of the prediction variance over the domain.

Example
For the previous example, find the A-optimal design.
We have $\mathrm{tr}[(X^T X)^{-1}] = (q^2 + 1 + p^2)/q^2 = 1 + (1 + p^2)/q^2$, which is minimized by p = 0 and q = 1.
Minimum at (0,1), with trace 2.
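
The same kind of grid scan verifies this; a minimal sketch:

% trace((X'X)^{-1}) for X = [0 0; 1 0; p q], model y = b1*x1 + b2*x2
[p,q] = meshgrid(0:0.05:1, 0.05:0.05:1);   % keep q > 0 so X'X is invertible
t = 1 + (1 + p.^2)./q.^2;                  % trace of the inverse, derived above
[tmin,i] = min(t(:));
[p(i) q(i) tmin]                           % p = 0, q = 1, minimum trace = 2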