
Local surrogates
To model a complex wavy function we need a lot of data, and modeling a wavy function with high-order polynomials is inherently ill-conditioned. With a lot of data we normally predict function values using only nearby data, so we may fit several local surrogates, as in the figure. For example, if you have the price of gasoline on the first of every month from 2000 through 2009, how many values would you use to estimate the price on June 15, 2007?

Popular local surrogates
– Moving least squares: weight points near the prediction location more heavily.
– Radial basis neural network: regression with local functions that decay away from the data points.
– Kriging: also built on radial basis functions, but its fitting philosophy is based not on the error at the data points but on the correlation between function values at near and far points.

Review of Linear Regression
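The equations of this review slide did not survive extraction. As a standard reminder (the notation here is assumed, not taken from the slide): linear regression expresses the surrogate as a linear combination of basis functions, ŷ(x) = Σ_j b_j ξ_j(x), and the least-squares coefficients are found by solving one set of linear equations,

  X^T X b = X^T y,   i.e.   b = (X^T X)^{-1} X^T y,

where X_{ij} = ξ_j(x_i) collects the basis functions evaluated at the n data points. The fit is solved once, for the whole domain.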

Moving least squares
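The body of this slide was lost. A hedged sketch of the idea: moving least squares refits a low-order polynomial at every prediction point x, giving each data point x_i a weight that decays with its distance from x. A common choice (assumed here, not taken from the slide) is a Gaussian weight

  w_i(x) = exp( -||x - x_i||^2 / θ^2 ),

where θ controls the decay rate, so that nearby points dominate the local fit.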

Weighted least squares
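With W(x) = diag(w_1, ..., w_n) built from weights like those above, the standard weighted least squares solution is

  b(x) = ( X^T W(x) X )^{-1} X^T W(x) y.

Unlike ordinary linear regression, this system has to be re-solved at every prediction point, which is the main cost of moving least squares.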

Six-hump camelback function
Definition: see the reconstruction below. The function is fitted with moving least squares using quadratic polynomials.
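The definition itself was an image and did not survive extraction; the standard six-hump camelback function (its exact use on this slide is inferred) can be written in Matlab as

  % six-hump camelback test function, usually on x1 in [-2,2], x2 in [-1,1]
  camel = @(x1,x2) (4 - 2.1*x1.^2 + x1.^4/3).*x1.^2 ...
                   + x1.*x2 + (-4 + 4*x2.^2).*x2.^2;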

Effect of number of points and decay rate.

Radial basis neural networks
[Network diagram: the input x feeds radial basis functions with outputs a_1, a_2, a_3; these are combined through weights W_1, W_2, W_3 and a bias b into the output ŷ(x); a 0.5 level is marked on the radial basis function.]

In regression notation
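The equations were lost in extraction. A hedged reconstruction, assuming Matlab's radbas convention (radbas(n) = exp(-n^2), with the input scaled so the function drops to 0.5 at one spread s from its center, since 0.8326 ≈ sqrt(ln 2)):

  ŷ(x) = b + Σ_i W_i a_i(x),   a_i(x) = exp( -(0.8326 ||x - x_i|| / s)^2 ),

so each basis function equals 0.5 exactly at distance s from its center, which may be the 0.5 marked in the diagram above.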

Example
Evaluate the function y = x + 0.5 sin(5x) at 21 points in the interval [1,9], fit an RBF to it, and compare the surrogate to the function over the interval [0,10]; see the sketch below. Fitting with the default options in Matlab achieves zero rms error by using all the data points as basis functions (neurons). The interpolation is very good, but even mild extrapolation is horrible.
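A minimal sketch of this experiment, assuming the legacy Neural Network Toolbox interface (newrb, sim):

  % 21 samples of y = x + 0.5*sin(5x) on [1,9]
  x = linspace(1, 9, 21);
  y = x + 0.5*sin(5*x);

  % default options: error goal 0 and spread 1; with goal = 0 every
  % data point becomes a neuron, so the rms error at the data is zero
  net = newrb(x, y);

  % compare the surrogate to the true function over the wider [0,10]
  xt   = linspace(0, 10, 201);
  yhat = sim(net, xt);
  plot(xt, xt + 0.5*sin(5*xt), 'k-', xt, yhat, 'r--', x, y, 'bo');
  legend('true function', 'RBF surrogate', 'data');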

Accept 0.1 mean squared error
net = newrb(x,y,0.1,1,20,1); with the spread set to 1 (11 neurons were used). With about half of the data points used as basis functions, the fit is more like polynomial regression. The interpolation is not as good, but the trend is captured, so the extrapolation is not as disastrous. Obviously, if we just wanted to capture the trend, we would have done better with a polynomial.
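If only the trend were wanted, a hypothetical comparison with a straight-line fit (reusing the variables from the sketch above):

  p      = polyfit(x, y, 1);     % least-squares straight line
  ytrend = polyval(p, xt);       % its prediction over [0,10]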

Too narrow a spread
net = newrb(x,y,0.1,0.2,20,1); (17 neurons used). With a spread of 0.2 and the points 0.4 apart (21 points in [1,9]), the shape functions decay to less than 0.02 at the nearest data point. This means that each data point is fitted individually, so we get spikes at the data points. A rule of thumb is that the spread should not be smaller than the distance to the nearest point.
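A quick check of that decay figure, assuming the simple Gaussian shape function exp(-(d/spread)^2):

  spread = 0.2;  d = 0.4;        % spacing of 21 points in [1,9]
  a = exp(-(d/spread)^2)         % exp(-4) ≈ 0.018, i.e. below 0.02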