Data mining and statistical learning - lecture 6


Overview

Basis expansion
Splines
(Natural) cubic splines
Smoothing splines
Nonparametric logistic regression
Multidimensional splines
Wavelets

Linear basis expansion (1)

Linear regression. True model: y = β0 + β1x1 + β2x2 + β3x3 + ε
Question: how do we find β?
Answer: solve the least-squares normal equations (a system of linear equations) to obtain β̂.

x1  x2  x3   y
 1  -3   6  12
 …

Linear basis expansion (2)

Nonlinear model. True model: y = f(x1, x2, x3) + ε, where f is nonlinear in the inputs
Question: how do we find f?
Answer: A) Introduce new variables um = hm(x1, x2, x3), one per basis function

x1  x2  x3   y
 1  -3  -1  12
 …

Linear basis expansion (3)

Nonlinear model
B) Transform the data set by evaluating the basis functions at each observation
C) Apply linear regression to the transformed data to obtain β̂

u1    u2     u3    u4   y
-3   -1.1  -0.84    1  12
 …

Linear basis expansion (4)

Conclusion: we can easily fit any model of the type

  f(X) = Σm βm hm(X)

i.e., we can easily undertake a linear basis expansion in X.

Example: if the model is known to be nonlinear but its exact form is unknown, we can try introducing interaction terms such as hm(X) = Xj Xk.
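A minimal numerical sketch of steps A)–C): expand the inputs into basis-function columns and apply ordinary least squares. The particular true model and basis functions here are illustrative assumptions, not the (unspecified) model from the slides.

```python
import numpy as np

# Hypothetical true model: y = 1 + 2*x - 0.5*x^2 + 3*x*z (no noise),
# fitted by a linear basis expansion with basis {1, x, x^2, x*z}.
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 50)
z = rng.uniform(-2, 2, 50)
y = 1.0 + 2.0 * x - 0.5 * x**2 + 3.0 * x * z

# Each column of H is one basis function h_m evaluated on the data
H = np.column_stack([np.ones_like(x), x, x**2, x * z])
beta, *_ = np.linalg.lstsq(H, y, rcond=None)
print(np.round(beta, 3))   # recovers [1.0, 2.0, -0.5, 3.0]
```

With no noise and a basis that spans the true model, least squares recovers the coefficients exactly.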

Piecewise polynomial functions

Assume X is one-dimensional.
Def. Assume the domain [a, b] of X is split into intervals [a, ξ1], [ξ1, ξ2], ..., [ξn, b]. Then f(X) is said to be piecewise polynomial if f(X) is represented by a separate polynomial on each interval.
Note. The points ξ1, ..., ξn are called knots.

Piecewise polynomials

Example. Continuous piecewise linear function with knots ξ1, ξ2.
Alternative A. Introduce a linear function on each interval together with continuity constraints at the knots (4 free parameters). [Figure 5.1, lower left]
Alternative B. Use a basis expansion (4 free parameters):

  h1(X) = 1,  h2(X) = X,  h3(X) = (X − ξ1)+,  h4(X) = (X − ξ2)+

Theorem. The two formulations are equivalent.
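A sketch of Alternative B for a continuous piecewise-linear fit: build the 4-column basis {1, X, (X − ξ1)+, (X − ξ2)+} and fit by least squares. The target function and knot locations are illustrative choices.

```python
import numpy as np

def pw_linear_basis(x, knots):
    """Basis {1, X} plus one hinge (X - xi)_+ per knot."""
    cols = [np.ones_like(x), x]
    cols += [np.maximum(x - k, 0.0) for k in knots]
    return np.column_stack(cols)

x = np.linspace(0, 3, 61)
# a continuous piecewise-linear target whose slope changes at x = 1 and x = 2
y = np.where(x < 1, x, np.where(x < 2, 1 + 2 * (x - 1), 3 - (x - 2)))

H = pw_linear_basis(x, knots=[1.0, 2.0])
beta, *_ = np.linalg.lstsq(H, y, rcond=None)
yhat = H @ beta
print(np.max(np.abs(yhat - y)))   # essentially zero: the basis spans the target
```

Because the target lies in the span of the 4 basis functions, the fit reproduces it up to rounding error.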

Splines

Definition. A piecewise polynomial of degree M − 1 is called an order-M spline if it has continuous derivatives up to order M − 2 at the knots.
Alternative definition. An order-M spline is a function which can be represented by the truncated power basis (K = #knots):

  hj(X) = X^(j−1), j = 1, ..., M;   h(M+l)(X) = (X − ξl)+^(M−1), l = 1, ..., K

Theorem. The two definitions are equivalent.
Terminology. An order-4 spline is called a cubic spline. [Figure 5.2; compare the bases and the numbers of free parameters]
Note. For cubic splines the discontinuity at the knots is not visible to the eye.
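A sketch of a cubic (order-4) spline fit via the truncated power basis hj(X) = X^(j−1), j = 1..4, plus (X − ξk)+³ per knot. The target function, knots, and noise level are illustrative assumptions.

```python
import numpy as np

def cubic_spline_basis(x, knots):
    """Truncated power basis for an order-4 (cubic) spline."""
    cols = [x**j for j in range(4)]                     # 1, X, X^2, X^3
    cols += [np.maximum(x - k, 0.0)**3 for k in knots]  # (X - xi_k)_+^3
    return np.column_stack(cols)

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 1, 80))
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, 80)

H = cubic_spline_basis(x, knots=[0.25, 0.5, 0.75])
beta, *_ = np.linalg.lstsq(H, y, rcond=None)
yhat = H @ beta          # fitted curve, C^2-continuous at the knots
```

With 3 knots the fit has 7 free parameters; the truncated power construction guarantees continuity of f, f', and f'' at each knot.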

Variance of spline estimators – boundary effects

The pointwise variance of spline fits grows rapidly near the boundaries, where the data constrain the polynomial pieces the least. [Figure 5.3]

Natural cubic spline

Def. A cubic spline f on [a, b] is called a natural cubic spline if its 2nd and 3rd derivatives are zero at a and b.
Note. This implies that f is linear on the two extreme intervals.
Basis functions of natural cubic splines (K knots, K free parameters):

  N1(X) = 1,  N2(X) = X,  N(k+2)(X) = dk(X) − d(K−1)(X),
  where dk(X) = [(X − ξk)+³ − (X − ξK)+³] / (ξK − ξk)
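One way to see the boundary condition in practice: SciPy's interpolating cubic spline with `bc_type='natural'` forces the second derivative to zero at both ends (this is an interpolating natural spline, not yet a smoother; the data here are an illustrative example).

```python
import numpy as np
from scipy.interpolate import CubicSpline

x = np.linspace(0, 4, 9)
y = np.cos(x)
cs = CubicSpline(x, y, bc_type='natural')   # natural boundary conditions

# second derivatives at the boundary knots are zero by construction
print(float(cs(x[0], 2)), float(cs(x[-1], 2)))
```

Evaluating `cs(t, 2)` returns the second derivative of the spline at t; outside [0, 4] the natural spline extrapolates linearly.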

Fitting smooth functions to data

Minimize a penalized sum of squared residuals

  RSS(f, λ) = Σi (yi − f(xi))² + λ ∫ (f''(t))² dt

where λ ≥ 0 is the smoothing parameter:
λ = 0: any function interpolating the data
λ = +∞: the least-squares line fit

Optimality of smoothing splines

Theorem. The function f minimizing RSS(f, λ) for a given λ is a natural cubic spline with knots at all unique values of xi (note: N knots!).
The optimal spline can therefore be computed in the natural spline basis: writing f(x) = Σj Nj(x) θj,

  θ̂ = (NᵀN + λΩ)⁻¹ Nᵀy,  where Ωjk = ∫ Nj''(t) Nk''(t) dt

A smoothing spline is a linear smoother

The fitted values are linear in the response values:

  f̂ = N(NᵀN + λΩ)⁻¹ Nᵀy = Sλ y

Degrees of freedom of smoothing splines

The effective degrees of freedom is dfλ = trace(Sλ), i.e., the sum of the diagonal elements of Sλ.
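A sketch of computing dfλ = trace(Sλ) numerically, using a second-difference penalty matrix as a simple stand-in for the spline penalty (an assumption for illustration):

```python
import numpy as np

def smoother_matrix(n, lam):
    """S_lam = (I + lam * K)^{-1} with a second-difference penalty K."""
    D2 = np.diff(np.eye(n), n=2, axis=0)
    return np.linalg.inv(np.eye(n) + lam * D2.T @ D2)

S = smoother_matrix(50, lam=10.0)
df = np.trace(S)          # effective degrees of freedom
print(round(df, 2))       # strictly between 2 (line fit) and 50 (interpolation)
```

The penalty's null space contains constant and linear sequences, so dfλ never drops below 2, matching the λ → ∞ limit of a straight-line fit.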

Smoothing splines and eigenvectors

It can be shown that Sλ = (I + λK)⁻¹, where K is the so-called penalty matrix.
Furthermore, the eigen-decomposition is

  Sλ = Σk ρk(λ) uk ukᵀ,  with ρk(λ) = 1 / (1 + λ dk)

Note: dk and uk are the eigenvalues and eigenvectors, respectively, of K.

Smoothing splines and shrinkage

The smoothing spline decomposes the vector y with respect to the basis of eigenvectors and shrinks the respective contributions:

  f̂ = Σk uk ρk(λ) ⟨uk, y⟩

The eigenvectors, ordered by decreasing ρk, increase in complexity; the higher the complexity, the more the contribution is shrunk.
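A sketch verifying the shrinkage view: with K = U diag(d) Uᵀ, applying Sλ = (I + λK)⁻¹ is the same as shrinking each eigen-component of y by 1/(1 + λ dk). The penalty matrix here is again a second-difference stand-in, an assumption for illustration.

```python
import numpy as np

n, lam = 40, 5.0
D2 = np.diff(np.eye(n), n=2, axis=0)
K = D2.T @ D2                      # stand-in penalty matrix (symmetric PSD)
d, U = np.linalg.eigh(K)           # eigenvalues d_k >= 0, eigenvectors u_k

rng = np.random.default_rng(3)
y = rng.normal(size=n)

fhat_direct = np.linalg.solve(np.eye(n) + lam * K, y)     # S_lam @ y
fhat_eigen = U @ ((U.T @ y) / (1.0 + lam * d))            # shrink per component
print(np.allclose(fhat_direct, fhat_eigen))               # True
```

The two computations agree exactly, since both express (I + λK)⁻¹ y, once directly and once in the eigenbasis of K.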

Smoothing splines and local curve fitting

The eigenvalues ρk(λ) are decreasing functions of λ: the higher λ, the stronger the penalization. The smoother matrix has a banded nature, which makes the smoothing spline a local fitting method. [Figure 5.8]

Fitting smoothing splines in practice (1)

Reinsch form: Sλ = (I + λK)⁻¹ with K = QR⁻¹Qᵀ.
Theorem. If f is a natural cubic spline with values f = (f(ξ1), ..., f(ξN))ᵀ at the knots and second derivatives γ at the interior knots, then Qᵀf = Rγ, where Q and R are band matrices depending on ξ only.
Theorem. ∫ (f''(t))² dt = γᵀRγ.

Fitting smoothing splines in practice (2)

Reinsch algorithm:
1. Evaluate Qᵀy.
2. Compute R + λQᵀQ and find its Cholesky decomposition (in linear time!).
3. Solve the matrix equation (R + λQᵀQ)γ = Qᵀy (in linear time!).
4. Obtain f = y − λQγ.
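A hedged sketch of the Reinsch construction (following the standard Green–Silverman band-matrix formulas, which I am assuming here match the lecture's Q and R): build Q and R from the knot spacings, solve (R + λQᵀQ)γ = Qᵀy, and set f = y − λQγ. For brevity this uses dense solves instead of the linear-time banded Cholesky.

```python
import numpy as np

def reinsch_smooth(t, y, lam):
    """Smoothing-spline fitted values at the knots via the Reinsch algorithm."""
    n = len(t)
    h = np.diff(t)                       # knot spacings
    Q = np.zeros((n, n - 2))             # band matrix, columns = interior knots
    R = np.zeros((n - 2, n - 2))         # symmetric tridiagonal band matrix
    for j in range(1, n - 1):
        Q[j - 1, j - 1] = 1.0 / h[j - 1]
        Q[j,     j - 1] = -1.0 / h[j - 1] - 1.0 / h[j]
        Q[j + 1, j - 1] = 1.0 / h[j]
        R[j - 1, j - 1] = (h[j - 1] + h[j]) / 3.0
        if j < n - 2:
            R[j - 1, j] = R[j, j - 1] = h[j] / 6.0
    gamma = np.linalg.solve(R + lam * Q.T @ Q, Q.T @ y)
    return y - lam * Q @ gamma           # f = y - lam * Q * gamma

t = np.linspace(0, 1, 30)
y = np.sin(2 * np.pi * t)
f = reinsch_smooth(t, y, lam=1e-8)       # tiny lam: f nearly interpolates y
```

In a production implementation both solves exploit the banded structure, which is what makes the algorithm linear-time.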

Automated selection of smoothing parameters (1)

What can be selected:
Regression splines
– the degree of the spline
– the placement of the knots (→ MARS procedure)
Smoothing splines
– the penalization parameter λ

Automated selection of smoothing parameters (2)

Fixing the degrees of freedom. If we fix dfλ, we can find λ by solving the equation trace(Sλ) = dfλ numerically. One could try two different values of dfλ and choose one based on F-tests, residual plots, etc.

Automated selection of smoothing parameters (3)

The bias–variance trade-off. [Figure 5.9]
EPE = expected (integrated squared) prediction error; CV = cross-validation.

Nonparametric logistic regression

Logistic regression model (X one-dimensional):

  log[ Pr(Y = 1 | X = x) / Pr(Y = 0 | X = x) ] = f(x)

What can f be?
– Linear → ordinary logistic regression (Chapter 4)
– Sufficiently smooth → nonparametric logistic regression (splines and other smoothers)
– Other choices are possible

Nonparametric logistic regression

Problem formulation: maximize the penalized log-likelihood

  ℓ(f; λ) = Σi [ yi f(xi) − log(1 + e^f(xi)) ] − ½ λ ∫ (f''(t))² dt

Good news: the solution is still a natural cubic spline.
Bad news: there is no closed-form expression for its coefficients.

Nonparametric logistic regression

How to proceed? Use Newton–Raphson to compute the spline numerically:
1. Compute (analytically) the derivatives of the penalized log-likelihood.
2. Compute the Newton direction using the current parameter value and the derivative information.
3. Compute the new parameter value from the old value via the update formula; iterate.
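The Newton iteration can be sketched as penalized iteratively reweighted least squares (IRLS): each step solves a penalized weighted least-squares problem on the working response z. The second-difference penalty K and the simulated data are illustrative assumptions, standing in for the spline penalty.

```python
import numpy as np

def penalized_logistic(x, y, lam, iters=20):
    """Newton-Raphson (IRLS) for a penalized nonparametric logistic fit."""
    n = len(x)
    D2 = np.diff(np.eye(n), n=2, axis=0)
    K = D2.T @ D2                        # stand-in roughness penalty
    f = np.zeros(n)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-f))     # current probabilities
        W = p * (1.0 - p)                # Newton weights
        z = f + (y - p) / W              # working response
        # Newton step: minimize (z-f)' W (z-f) + lam * f' K f
        f = np.linalg.solve(np.diag(W) + lam * K, W * z)
    return f, p

rng = np.random.default_rng(4)
x = np.linspace(-3, 3, 60)
p_true = 1.0 / (1.0 + np.exp(-x))
y = (rng.uniform(size=60) < p_true).astype(float)

f, p = penalized_logistic(x, y, lam=1.0)   # fitted log-odds and probabilities
```

Each iteration is exactly a smoothing-spline-type solve on the working response, which is why the machinery of the previous slides carries over.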

Multidimensional splines

How to fit data smoothly in higher dimensions?
A) Use bases of one-dimensional functions and produce a basis by their tensor product:

  gjk(X) = h1j(X1) h2k(X2)

Problem: exponential growth of the basis with the dimension. [Figure 5.10]
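A sketch of the tensor-product construction in 2D: with M basis functions per coordinate, the product basis has M² elements (M^d in d dimensions, hence the exponential growth). The small piecewise-linear 1D basis is an illustrative choice.

```python
import numpy as np

def basis_1d(x, knots):
    """Simple 1-D basis: {1, X} plus one hinge per knot."""
    cols = [np.ones_like(x), x] + [np.maximum(x - k, 0.0) for k in knots]
    return np.column_stack(cols)

x1 = np.linspace(0, 1, 25)
x2 = np.linspace(0, 1, 25)
H1 = basis_1d(x1, [0.5])    # 3 one-dimensional basis functions
H2 = basis_1d(x2, [0.5])    # 3 one-dimensional basis functions

# tensor product: all pairwise products h1j(x1) * h2k(x2)
H = np.einsum('ij,ik->ijk', H1, H2).reshape(len(x1), -1)
print(H.shape)   # (25, 9): 3 * 3 = 9 two-dimensional basis functions
```

Going from 3 to, say, 10 functions per coordinate in 3 dimensions already yields 1000 basis functions, which illustrates the problem on the slide.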

Multidimensional splines

How to fit data smoothly in higher dimensions?
B) Formulate a new problem: minimize Σi (yi − f(xi))² + λJ[f], where J is a multidimensional roughness penalty, e.g. in two dimensions

  J[f] = ∫∫ [ (∂²f/∂x1²)² + 2(∂²f/∂x1∂x2)² + (∂²f/∂x2²)² ] dx1 dx2

The solution is a thin-plate spline, with properties analogous to the one-dimensional case (λ = 0 yields an interpolating function). In two dimensions the solution is essentially a sum of radial basis functions:

  f(x) = β0 + βᵀx + Σj αj η(‖x − xj‖),  with η(r) = r² log r

Wavelets: introduction

The idea: fit a bumpy function while removing noise.
Application areas: signal processing, compression.
How it works: the function is represented in a basis of bumpy functions, and the small coefficients are filtered out.
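A minimal sketch of "filter the small coefficients" using one level of the Haar transform (the simplest wavelet basis); the step-function signal and threshold are illustrative choices.

```python
import numpy as np

def haar_step(y):
    """One level of the orthonormal Haar transform."""
    s = (y[0::2] + y[1::2]) / np.sqrt(2)   # smooth (scaling) coefficients
    d = (y[0::2] - y[1::2]) / np.sqrt(2)   # detail (wavelet) coefficients
    return s, d

def haar_inverse(s, d):
    """Invert one level of the Haar transform."""
    y = np.empty(2 * len(s))
    y[0::2] = (s + d) / np.sqrt(2)
    y[1::2] = (s - d) / np.sqrt(2)
    return y

rng = np.random.default_rng(6)
t = np.linspace(0, 1, 128)
y = np.where(t < 0.5, 0.0, 1.0) + rng.normal(0, 0.1, 128)   # noisy step

s, d = haar_step(y)
d[np.abs(d) < 0.3] = 0.0            # hard-threshold the small detail coefficients
y_denoised = haar_inverse(s, d)

print(np.allclose(haar_inverse(*haar_step(y)), y))   # True: transform is invertible
```

A full wavelet denoiser applies the transform recursively to the smooth part and thresholds all detail levels; one level already shows the mechanism.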

Wavelets: basis functions (Haar wavelets, Symmlet-8 wavelets). [Figure 5.13]

Wavelets: example. [Figure 5.14]