CS 2750: Machine Learning Linear Regression

CS 2750: Machine Learning Linear Regression Prof. Adriana Kovashka University of Pittsburgh January 31, 2017

Announcement Your TA will be back this Wednesday. Office hours: Tuesday and Thursday 4-5pm, Friday 12-2pm. Office hours start this Thursday.

Plan for Today Linear regression definition Solution via least squares Solution via gradient descent Regularized least squares Dealing with outliers

Linear Models Sometimes we want to add a bias term. We can add 1 as x0 such that x = (1, x1, …, xd), so the bias becomes just another weight. Figure from Milos Hauskrecht
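
A minimal numpy sketch (not from the slides; the data values are made up) of absorbing the bias into the weight vector by prepending a constant feature x0 = 1:

```python
import numpy as np

X = np.array([[2.0, 3.0],
              [1.0, 5.0],
              [4.0, 0.5]])                            # N x d matrix of inputs (toy values)
X_aug = np.hstack([np.ones((X.shape[0], 1)), X])      # N x (d+1): first column is x0 = 1

w = np.array([0.5, 1.0, -2.0])                        # (b, w1, w2): the bias is just another weight
y_pred = X_aug @ w                                    # f(x, w) = w^T x, bias included
```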

Regression f(x, w): x → y At training time: use the given examples {(x1, y1), …, (xN, yN)} to estimate the mapping function f Objective: minimize the sum of squared errors Σi (yi – f(xi, w))^2 over i = 1, …, N xi are the input features (d-dimensional) yi is the target continuous output label (given by a human/oracle) At test time: use f to make a prediction for some new xtest

Regression vs Classification Problem is called classification if… y is discrete Does your patient have cancer? Should your bank give this person a credit card? Is it going to rain tomorrow? What animal is in this image? Problem is called regression if … y is continuous What price should you ask for this house? What is the temperature going to be tomorrow? What score should your system give to this person’s figure-skating performance?

y = b + mx

1-d example Fit a line to the points Use the parameters of the line to predict the y-coordinate of a new data point xnew Figure from Greg Shakhnarovich
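
A small sketch of this 1-d case (the data points and x_new are made up for illustration); np.polyfit with degree 1 performs exactly this least-squares line fit:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])   # toy 1-d inputs
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])   # toy targets

m, b = np.polyfit(x, y, deg=1)            # least-squares fit of y = m*x + b
x_new = 5.0
y_new = m * x_new + b                     # predicted y-coordinate of the new point
```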

2-d example Find parameters of plane Figure from Milos Hauskrecht

(Derivation on board) The resulting least-squares solution applies the pseudoinverse of the design matrix X to the targets: w = (XᵀX)⁻¹ Xᵀ y Slide credit: Greg Shakhnarovich
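
A minimal sketch of that closed-form solution, assuming X is the N x (d+1) design matrix (with the bias column) and y the length-N target vector; np.linalg.lstsq solves the same least-squares problem via an SVD rather than forming the inverse explicitly:

```python
import numpy as np

def fit_least_squares(X, y):
    # Solves min_w ||X w - y||^2; equivalent to w = (X^T X)^{-1} X^T y when X^T X is invertible
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def predict(X, w):
    return X @ w          # f(x, w) = w^T x for each row of X
```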

Challenges Computing the pseudoinverse might be slow for large matrices: cubic in the number of features D (due to the SVD), linear in the number of samples N We might also want to adjust the solution as new examples come in, without recomputing the pseudoinverse for each new sample Another solution: gradient descent Cost: linear in both D and N If D > 10,000, use gradient descent
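
A hedged gradient-descent sketch for the sum-of-squares objective (the learning rate and iteration count are arbitrary choices, not values from the lecture); each step costs O(ND), linear in both D and N:

```python
import numpy as np

def fit_gradient_descent(X, y, lr=1e-3, n_iters=1000):
    N, D = X.shape
    w = np.zeros(D)
    for _ in range(n_iters):
        grad = (2.0 / N) * X.T @ (X @ w - y)   # gradient of the mean squared error
        w -= lr * grad
    return w
```

For examples arriving over time, the same update can be applied one (xi, yi) at a time (stochastic gradient descent) instead of recomputing the pseudoinverse.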

Solution optimality Global vs local minima Fortunately, the least squares objective is convex, so any local minimum is also a global minimum Adapted from Erik Sudderth

Solution optimality Slide credit: Subhransu Maji

Plan for Today Linear regression definition Solution via least squares Solution via gradient descent Regularized least squares Dealing with outliers

Linear Basis Function Models (1) y(x, w) = Σj wj φj(x): the model is linear in the weights w, but can be nonlinear in the input x through the basis functions φj Example: Polynomial Curve Fitting, with φj(x) = x^j Slide credit: Christopher Bishop
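
A sketch of polynomial curve fitting as linear regression in a transformed feature space (the noisy sin(2πx) data and the degree M = 3 are illustrative choices, not taken from the slides):

```python
import numpy as np

def polynomial_features(x, M=3):
    # x: length-N array of scalar inputs -> N x (M+1) design matrix (1, x, x^2, ..., x^M)
    return np.vander(x, N=M + 1, increasing=True)

x = np.linspace(0, 1, 10)
y = np.sin(2 * np.pi * x) + 0.1 * np.random.randn(10)   # toy noisy targets

Phi = polynomial_features(x, M=3)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)              # ordinary least squares on phi(x)
```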

Slide credit: Christopher Bishop

Regularized Least Squares (1) Consider the error function: ED(w) + λ EW(w) (data term + regularization term), where λ is called the regularization coefficient With the sum-of-squares error function and a quadratic regularizer, we get E(w) = ½ Σn (yn – wᵀφ(xn))² + (λ/2) wᵀw which is minimized by w = (λI + ΦᵀΦ)⁻¹ Φᵀ y Slide credit: Christopher Bishop
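
A minimal sketch of that regularized (ridge) closed form, assuming Phi is the N x D design matrix, y the targets, and lam the regularization coefficient λ:

```python
import numpy as np

def fit_ridge(Phi, y, lam=1e-2):
    D = Phi.shape[1]
    A = lam * np.eye(D) + Phi.T @ Phi
    return np.linalg.solve(A, Phi.T @ y)   # w = (lam*I + Phi^T Phi)^{-1} Phi^T y, without an explicit inverse
```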

Regularized Least Squares (2) With a more general regularizer, we have E(w) = ½ Σn (yn – wᵀφ(xn))² + (λ/2) Σj |wj|^q Isosurfaces ||w||q = constant (e.g. 1) in weight space: q = 1 gives the Lasso, q = 2 the quadratic regularizer Adapted from Christopher Bishop

Regularized Least Squares (3) Lasso (L1) tends to generate sparser solutions than a quadratic regularizer. Adapted from Christopher Bishop, figures from Alexander Ihler
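
One simple way (not covered on the slides) to see where the sparsity comes from is proximal gradient descent (ISTA) for the Lasso objective ½||Xw – y||² + λ||w||₁: the soft-thresholding step sets small weights exactly to zero. Step size and iteration count are arbitrary illustrative choices:

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fit_lasso_ista(X, y, lam=0.1, lr=1e-3, n_iters=5000):
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        grad = X.T @ (X @ w - y)                      # gradient of the data term
        w = soft_threshold(w - lr * grad, lr * lam)   # proximal step for lam * ||w||_1; zeroes small weights
    return w
```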

Plan for Today Linear regression definition Solution via least squares Solution via gradient descent Regularized least squares Dealing with outliers

Outliers affect least squares fit Kristen Grauman

Hypothesize and test Try all possible parameter combinations Repeatedly sample enough points to solve for parameters Each point votes for all consistent parameters E.g. each point votes for all possible lines on which it might lie Score the given parameters: Number of consistent points Choose the optimal parameters using the scores Noise & clutter features? They will cast votes too, but typically their votes should be inconsistent with the majority of “good” features Two methods: Hough transform and RANSAC Adapted from Derek Hoiem and Kristen Grauman

Hough transform for finding lines [figure: image space (x, y) on the left, Hough (parameter) space (m, b) on the right] Connection between image (x, y) and Hough (m, b) spaces: a line y = m0x + b0 in the image corresponds to a single point (m0, b0) in Hough space Steve Seitz

Hough transform for finding lines [figure: a point (x0, y0) in image space and the corresponding line in Hough space] Connection between image (x, y) and Hough (m, b) spaces: a line in the image corresponds to a point in Hough space What does a point (x0, y0) in image space map to? Answer: the solutions of b = –x0m + y0, which is a line in Hough space Steve Seitz

Hough transform for finding lines [figure: image space and Hough (parameter) space] How can we use this to find the most likely parameters (m, b) for the most prominent line in image space? Let each edge point in image space vote for a set of possible parameters in Hough space Accumulate votes in a discrete set of bins; the parameters with the most votes indicate the line in image space Steve Seitz
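
A toy sketch of that voting scheme in (m, b) space (the bin ranges, resolutions, and helper name are assumptions for illustration; practical implementations usually use the (ρ, θ) parameterization to avoid unbounded slopes):

```python
import numpy as np

def hough_lines_mb(points, m_range=(-5, 5), b_range=(-10, 10), n_bins=200):
    # points: iterable of (x, y) edge points
    m_vals = np.linspace(*m_range, n_bins)
    b_vals = np.linspace(*b_range, n_bins)
    acc = np.zeros((n_bins, n_bins), dtype=int)        # accumulator: rows index m, columns index b
    for (x, y) in points:
        for i, m in enumerate(m_vals):
            b = y - m * x                               # each point votes along the line b = -x*m + y
            j = np.searchsorted(b_vals, b)
            if 0 <= j < n_bins:
                acc[i, j] += 1
    i_best, j_best = np.unravel_index(np.argmax(acc), acc.shape)
    return m_vals[i_best], b_vals[j_best]               # parameters with the most votes
```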

RANdom Sample Consensus (RANSAC) RANSAC loop: Randomly select a seed group of s points on which to base the model estimate Fit the model to these s points Find the inliers to this model (i.e., points whose distance from the line is less than t) If there are d or more inliers, re-compute the estimate of the model on all of the inliers Repeat N times Keep the model with the largest number of inliers Modified from Kristen Grauman and Svetlana Lazebnik
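
A hedged sketch of this loop for line fitting (the function name, default thresholds, and use of vertical distance are illustrative assumptions, not values from the lecture):

```python
import numpy as np

def ransac_line(points, s=2, t=0.1, n_iters=100, d_min=10, seed=0):
    # points: N x 2 numpy array of (x, y) coordinates
    rng = np.random.default_rng(seed)
    best_model, best_inliers = None, None
    for _ in range(n_iters):
        sample = points[rng.choice(len(points), size=s, replace=False)]
        m, b = np.polyfit(sample[:, 0], sample[:, 1], deg=1)     # fit a line to the seed group
        resid = np.abs(points[:, 1] - (m * points[:, 0] + b))    # vertical distance to the line
        inliers = points[resid < t]
        if best_inliers is None or len(inliers) > len(best_inliers):
            best_model, best_inliers = (m, b), inliers
    if best_inliers is not None and len(best_inliers) >= d_min:
        # Re-compute the estimate on all inliers of the best model
        best_model = tuple(np.polyfit(best_inliers[:, 0], best_inliers[:, 1], deg=1))
    return best_model, best_inliers
```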

RANSAC (RANdom SAmple Consensus): Fischler & Bolles, 1981 Algorithm: Sample (randomly) the number of points required to fit the model Solve for the model parameters using the samples Score by the fraction of inliers within a preset threshold of the model Repeat 1-3 until the best model is found with high confidence Intuition (speaker notes): for line fitting in the presence of outliers, we want to partition the points into an inlier set and an outlier set, adjusting the model parameters to best fit the inliers, where "best" means each inlier's residual error falls below the threshold Silvio Savarese

RANSAC Algorithm: Line fitting example Sample (randomly) the number of points required to fit the model (# = 2) Solve for the line parameters using the samples Score by the fraction of inliers within a preset threshold of the model Repeat 1-3 until the best model is found with high confidence Silvio Savarese
