Intro to Linear Methods Reading: Bishop, 3.0, 3.1, 4.0, 4.1 hip to be hyperplanar...


Administrivia Office hours today cancelled (UNM presidential candidate interview). Office hours tomorrow: normal time/place.

Tides of history Last time: performance evaluation (hold-out data, cross-validation, variance, etc.). This time: Homework 2 assigned; linear models.

Homework 2
1. Replicate Example 1.1 in your text:
(a) Generate 10 training data points according to the model given
(b) Write unregularized and regularized polynomial curve-fit programs (language of your choice; Matlab recommended)
(c) Generate plots of the Fig 1.4 type, only with M = 0, 1, 2, 4, 7, 9, and 12, with the unregularized fit
(d) Generate a test set of 1000 points according to the original model
(e) Generate Fig 1.5 for your test data

Homework 2 (cont’d)
(f) Vary the amount of training data from N = (by steps of 10) and show learning curves for M = 1, 3, and 9
(g) Repeat the plots of 1(c) using the regularized learner, with ln(λ) = -18, -10, -1, 0 (see the sketch after this list)
2) Bishop
3) Bishop 3.5
4) Bishop 4.1
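A minimal Matlab/Octave sketch of the fits needed for 1(b) and 1(g), assuming the usual Bishop Ch. 1 setup (targets are sin(2πx) plus Gaussian noise); the variable names and noise level here are illustrative, not part of the assignment:

```matlab
% Hypothetical sketch of the HW2 polynomial curve fit (Matlab/Octave); names are illustrative.
N = 10; M = 9; lnlambda = -18;        % training set size, polynomial order, ln(lambda)
x = rand(N, 1);                       % inputs drawn uniformly on [0, 1]
t = sin(2*pi*x) + 0.3*randn(N, 1);    % targets: sin(2*pi*x) plus Gaussian noise (assumed sigma = 0.3)
Phi = zeros(N, M+1);                  % design matrix of polynomial features 1, x, ..., x^M
for j = 0:M
    Phi(:, j+1) = x.^j;
end
lambda = exp(lnlambda);
w_unreg = Phi \ t;                                  % unregularized least-squares fit
w_reg = (Phi'*Phi + lambda*eye(M+1)) \ (Phi'*t);    % regularized (ridge) fit
```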

Linear regression prelims Basic idea: assume y(x,w) is a linear function of x, governed by a parameter vector w. Our job: find the best w so that y(x,w) fits the training data “as well as possible”.
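The formula itself did not survive the transcript; the standard form from Bishop Sec. 3.1, which the later slides assume, is:

```latex
y(\mathbf{x}, \mathbf{w}) = w_0 + w_1 x_1 + \dots + w_D x_D
```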

Linear regression prelims By “as well as possible”, we mean here minimum squared error: the loss function is the sum, over all training data, of the squared error between the prediction y(x_n, w) and the target t_n.
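Written out (a reconstruction; Bishop includes a factor of 1/2, which does not change the minimizer):

```latex
L(\mathbf{w}) = \sum_{n=1}^{N} \bigl( y(\mathbf{x}_n, \mathbf{w}) - t_n \bigr)^2
```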

Useful definitions Definition: A trick is a clever mathematical hack Definition: A method is a trick you use more than once

A helpful “method” Recall y(x,w) = w0 + w1 x1 + ... + wD xD. We want to be able to write this compactly as a single inner product, so introduce a “pseudo-feature” x0 = 1 of x, giving the augmented vector x~ = (1, x1, ..., xD). Now have: y(x,w) = wᵀx~. This is called homogeneous coordinates. Note: we often omit the ~ for simplicity.
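A minimal sketch of the same trick in Matlab/Octave (the toy data and names are illustrative): prepend a column of ones so the bias w0 rides along with the other weights.

```matlab
% Hypothetical sketch: homogeneous coordinates for an N-by-D data matrix X.
X = [0.2 1.5; 0.7 0.3; 1.1 2.2];    % toy N = 3, D = 2 data (illustrative)
w = [0.5; 1.0; -2.0];               % weights; w(1) plays the role of the bias w0
Xtilde = [ones(size(X,1), 1), X];   % prepend pseudo-feature x0 = 1 to every row
yhat = Xtilde * w;                  % y(x, w) = w' * xtilde, for all rows at once
```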

A helpful “method” Now have: y(x_n, w) = wᵀx_n (in homogeneous coordinates, dropping the ~). And: each training example comes with a target t_n. So: the error on example n is wᵀx_n - t_n. And our “loss function” becomes: L(w) = Σ_n (wᵀx_n - t_n)².

Minimizing loss Finally, stacking the x_nᵀ as the rows of a data matrix X and the targets t_n into a label (target) vector t, we can write: L(w) = (Xw - t)ᵀ(Xw - t). We want the “best” set of w: the weights that minimize the above. Q: how can we minimize this?
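Spelled out, consistent with the definitions above (a reconstruction; the original slide showed this as a figure):

```latex
X = \begin{pmatrix} \mathbf{x}_1^\top \\ \vdots \\ \mathbf{x}_N^\top \end{pmatrix},
\qquad
\mathbf{t} = \begin{pmatrix} t_1 \\ \vdots \\ t_N \end{pmatrix},
\qquad
L(\mathbf{w}) = (X\mathbf{w} - \mathbf{t})^\top (X\mathbf{w} - \mathbf{t}) = \| X\mathbf{w} - \mathbf{t} \|^2 .
```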

5 minutes of math... Some useful linear algebra identities: if A and B are matrices, (AB)ᵀ = BᵀAᵀ; and (for invertible square matrices) (AB)⁻¹ = B⁻¹A⁻¹.

5 minutes of math... What about derivatives of vectors/matrices? There’s more than one kind... For the moment, we’ll need derivatives with respect to a vector. If x is a vector of variables, y is a vector of constants, and A is a matrix of constants, then: ∂(yᵀx)/∂x = y and ∂(xᵀAx)/∂x = (A + Aᵀ)x.

Exercise Derive the vector derivative expressions above. Then find an expression for the minimum-squared-error weight vector w for the loss function L(w) = (Xw - t)ᵀ(Xw - t).
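A worked sketch of the minimization, using the identities above (this anticipates the answer given on the next slide):

```latex
\nabla_{\mathbf{w}} L
  = \nabla_{\mathbf{w}}\left( \mathbf{w}^\top X^\top X \mathbf{w} - 2\,\mathbf{t}^\top X \mathbf{w} + \mathbf{t}^\top \mathbf{t} \right)
  = 2 X^\top X \mathbf{w} - 2 X^\top \mathbf{t} = \mathbf{0}
  \;\Rightarrow\;
  \mathbf{w} = (X^\top X)^{-1} X^\top \mathbf{t} .
```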

The LSE method The quantity XᵀX is called a Gram matrix and is positive semidefinite and symmetric. The quantity (XᵀX)⁻¹Xᵀ is the pseudoinverse of X; it may not exist if the Gram matrix is not invertible. The complete “learning algorithm” is 2 whole lines of Matlab code.
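A plausible version of those two lines (a sketch, not necessarily the instructor's exact code; X and t are the data matrix and target vector defined above):

```matlab
% Hypothetical two-line LSE "learning algorithm" in Matlab/Octave.
w = pinv(X) * t;    % Moore-Penrose pseudoinverse; also works when the Gram matrix is singular
yhat = X * w;       % predictions for the training inputs
```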

Back to vector spaces At the end of last time we were talking about vector spaces. A complete, normed vector space with an inner product == a Hilbert space. Everything we’ve done today is written in terms of vector operations: addition (subtraction), inner product, inverse. ⇒ Can generalize this to any Hilbert space...