Cost of surrogates In linear regression, fitting involves solving a set of linear equations once. For moving least squares, we must form and solve the system at every prediction point. With radial basis neural networks we have to optimize the selection of neurons, which again entails multiple solutions of the linear system; we may find the best spread by minimizing cross-validation errors. Kriging, our next surrogate, is even more expensive: it has a spread constant in every direction, and we must perform an optimization to find the best set of constants. With many hundreds of data points this can become a significant computational burden.

Kriging - Introduction The method was invented in the 1950s by the South African geologist Daniel Krige (1919-2013) for predicting the distribution of minerals. It became very popular for fitting surrogates to expensive computer simulations in the 21st century, and it is one of the best surrogates available. It probably became popular late mostly because of the high computational cost of fitting it to data.

Kriging philosophy We assume that the data are sampled from an unknown function that obeys simple correlation rules: the value of the function at a point is correlated with the values at neighboring points based on their separation in different directions. The correlation is strong with nearby points and weak with faraway points, and its strength does not change with location (stationarity). Normally kriging is used with the assumption that there is no noise, so it interpolates the function values exactly. It works out to be a local surrogate, and it uses basis functions that are very similar to radial basis functions.

Reminder: Covariance and Correlation The covariance of two random variables X and Y is $\mathrm{Cov}(X,Y)=E[(X-\mu_X)(Y-\mu_Y)]$. The covariance of a random variable with itself is the square of its standard deviation, $\mathrm{Cov}(X,X)=\sigma_X^2$. The covariance matrix of a random vector contains the covariances of its components. The correlation is the covariance normalized by the standard deviations, $\rho(X,Y)=\mathrm{Cov}(X,Y)/(\sigma_X\sigma_Y)$, so the correlation matrix has 1 on the diagonal.
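These quantities are easy to check numerically; a minimal MATLAB sketch using the built-in cov and corrcoef functions:
x = randn(1000,1);                % first random sample
y = 0.8*x + 0.6*randn(1000,1);    % second sample, correlated with x
C = cov(x,y)                      % 2x2 covariance matrix; C(1,1) = var(x)
R = corrcoef(x,y)                 % correlation matrix; 1 on the diagonal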

Correlation between function values at nearby points for sin(x) Generate 10 random numbers, translate them by a little (0.1) and by more (1.0):
x = 10*rand(1,10);
xnear = x + 0.1;
xfar = x + 1;
Calculate the sine function at the three sets:
y = sin(x);
ynear = sin(xnear);
yfar = sin(xfar);
Compare the correlations:
r = corrcoef(y,ynear)
rfar = corrcoef(y,yfar)
The correlation decays to about 0.4 over one sixth of the wavelength (a shift of 1.0 against the sine wavelength of 2π ≈ 6.28).

Gaussian correlation function The most common correlation model is the Gaussian, where the correlation between the function values at two points $x$ and $x'$ decays with their separation in each direction: $\mathrm{Corr}[Z(x),Z(x')]=\exp\left(-\sum_{i=1}^{d}\theta_i\,(x_i-x'_i)^2\right)$. The parameters $\theta_i$ control how quickly the correlation decays in each direction.
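A minimal MATLAB sketch of this correlation model (the name gausscorr and the one-dimensional example are illustrative, not from the slides):
% Gaussian correlation between points x1 and x2 (row vectors);
% theta is a row vector of decay parameters, one per direction
gausscorr = @(x1, x2, theta) exp(-sum(theta .* (x1 - x2).^2));
r_near = gausscorr(0, 0.1, 1)   % nearby points: correlation close to 1
r_far  = gausscorr(0, 2.0, 1)   % distant points: correlation close to 0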

Universal Kriging The trend function is most often a low-order polynomial, giving the model $y(x)=\sum_i \beta_i f_i(x)+Z(x)$: a linear trend plus systematic departures. We will cover ordinary kriging, where the linear trend is just a constant to be estimated from the data; there is also simple kriging, where the constant is assumed to be known. Assumption: the systematic departures Z(x) are correlated. The kriging prediction comes with a normal distribution of the uncertainty in the prediction. [Figure: sampled data points decomposed into a linear trend model plus systematic departure.]

Notation
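The prediction formulas below use the standard ordinary kriging notation (an assumption; the slides' own symbols may differ): $y$ is the vector of the $n$ sampled function values; $R$ is the $n \times n$ correlation matrix with entries $R_{jk}=\mathrm{Corr}[Z(x_j),Z(x_k)]$; $r(x)$ is the vector of correlations between a prediction point $x$ and the $n$ data points; and $\mathbf{1}$ is a vector of ones, the basis of the constant trend.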

Prediction and shape functions
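With this notation, the ordinary kriging predictor takes its standard form (the usual textbook result, given here as a sketch):

$\hat{y}(x) = \hat{\mu} + r(x)^T R^{-1} (y - \hat{\mu}\mathbf{1})$

The entries of $r(x)$ act as shape functions attached to the data points, much like radial basis functions, and $R^{-1}(y-\hat{\mu}\mathbf{1})$ supplies the corresponding coefficients.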

Fitting the data
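For given correlation parameters $\theta$, the constant trend and the process variance have closed-form maximum likelihood estimates (standard results, sketched here):

$\hat{\mu} = \dfrac{\mathbf{1}^T R^{-1} y}{\mathbf{1}^T R^{-1}\mathbf{1}}, \qquad \hat{\sigma}^2 = \dfrac{(y-\hat{\mu}\mathbf{1})^T R^{-1}(y-\hat{\mu}\mathbf{1})}{n}$

The correlation parameters $\theta$ themselves have no closed form and must be found by numerically maximizing the likelihood (or minimizing the cross-validation error); this is the expensive optimization mentioned on the first slide.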

Prediction variance The square root of the variance is called the standard error. The uncertainty at any x is normally distributed, with the kriging prediction as its mean and the standard error as its standard deviation.
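For ordinary kriging the prediction variance has the standard closed form (again a sketch of the usual result):

$V[\hat{y}(x)] = \hat{\sigma}^2\left[1 - r^T R^{-1} r + \dfrac{(1 - \mathbf{1}^T R^{-1} r)^2}{\mathbf{1}^T R^{-1}\mathbf{1}}\right]$

It vanishes at the data points, where the surrogate interpolates, and grows with distance from them.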

Kriging fitting problems The maximum-likelihood or cross-validation optimization problem solved to obtain the kriging fit is often ill-conditioned, leading to a poor fit or a poor estimate of the prediction variance. A poor estimate of the prediction variance can be detected by comparing it to the cross-validation error. Poor fits are often characterized by the kriging surrogate having large curvature near data points (see the example on the next slide). It is recommended to check the fit visually by plotting the kriging surrogate together with its standard error.
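A minimal sketch of the recommended check, comparing leave-one-out cross-validation errors with the predicted standard errors (fitrgp, MATLAB's Gaussian process regression fit from the Statistics and Machine Learning Toolbox, stands in for the kriging fit here):
x = linspace(0,10,20)';  y = sin(x);           % sample data
cvErr = zeros(size(y));  se = zeros(size(y));
for j = 1:numel(y)
    idx = true(size(y));  idx(j) = false;      % leave point j out
    gp = fitrgp(x(idx), y(idx), 'KernelFunction','squaredexponential');
    [yp, sd] = predict(gp, x(j));              % prediction and standard error at the left-out point
    cvErr(j) = y(j) - yp;  se(j) = sd;
end
disp([cvErr se])   % |cvErr| should be commensurate with se if the variance estimate is sound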

Example of poor fits [figure showing kriging fits with large curvature near data points; SE: standard error]

Problems Fit the quadratic function of Slide 13 with kriging using different options, such as different covariance and trend functions, and compare the accuracy of the fits. For this problem, compare the standard error with the actual error.
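A possible starting point in MATLAB, again using fitrgp as the kriging fit (an assumption; the quadratic below is a stand-in for the function of Slide 13, which is not part of this section):
f = @(x) x.^2;                      % stand-in quadratic test function
x = linspace(-1,1,8)';  y = f(x);   % training data
xt = linspace(-1,1,100)';           % test points
gp = fitrgp(x, y, 'KernelFunction','matern52', 'BasisFunction','constant');
[yp, se] = predict(gp, xt);         % prediction and its standard error
actualErr = abs(f(xt) - yp);        % actual error at the test points
plot(xt, actualErr, xt, se); legend('actual error','standard error');
Repeating the fit with other kernels (e.g. 'squaredexponential') and basis functions ('linear', 'pureQuadratic') shows how the options change both the accuracy and the standard error estimate.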