Optimization and least squares
Prof. Noah Snavely
CS1114
Administrivia
A5 Part 1 is due tomorrow by 5pm (please sign up for a demo slot); Part 2 will be due in two weeks (4/17).
Prelim 2 is on Tuesday 4/7 (in class):
–Covers everything since Prelim 1, including: polygons, convex hull, interpolation, image transformation, feature-based object recognition, solving for affine transformations
–Review session on Monday, 7pm, Upson 315
Image transformations
What happens when the image moves outside the image boundary? Previously, we tried to fix this by changing the transformation.
Image transformations
Another approach:
–We can compute where the image moves to: in particular, the minimum row and the minimum column, as well as the width and height of the output image (e.g., max_col – min_col + 1)
–Then we can “shift” the image so it fits inside the output image
–This could be done with another transformation, but we could also just add/subtract min_col and min_row
Shifting the image
We need to shift each output (row, col) by adding (min_row, min_col) before applying T^-1. A code sketch of this bookkeeping follows below.
[Figure: transformed image of height h and width w; example corner coordinates (30, -10) and (-15, -50); the minimum corner maps to matrix entry (1,1) of the output]
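Not part of the original slides: a minimal Python/NumPy sketch of this bookkeeping, assuming a 3x3 homogeneous transform T that maps (row, col, 1) source coordinates to destination coordinates; the function names and the nearest-neighbor lookup are illustrative only.

```python
import numpy as np

def output_bounds(T, h, w):
    """Map the four corners of an h-by-w image through the 3x3 transform T
    (homogeneous coordinates) and return min corner plus output size."""
    corners = np.array([[0, 0, 1],
                        [0, w - 1, 1],
                        [h - 1, 0, 1],
                        [h - 1, w - 1, 1]], dtype=float).T   # columns: (row, col, 1)
    mapped = T @ corners
    rows, cols = mapped[0], mapped[1]
    min_row, min_col = rows.min(), cols.min()
    # Output size that just contains the transformed corners (max - min + 1).
    out_h = int(np.ceil(rows.max() - min_row + 1))
    out_w = int(np.ceil(cols.max() - min_col + 1))
    return min_row, min_col, out_h, out_w

def warp(img, T):
    """Inverse warping with the shift: for each output pixel, add
    (min_row, min_col), then apply T^-1 to find the source pixel.
    Grayscale image, nearest-neighbor lookup, for simplicity."""
    h, w = img.shape[:2]
    min_row, min_col, out_h, out_w = output_bounds(T, h, w)
    out = np.zeros((out_h, out_w), dtype=img.dtype)
    T_inv = np.linalg.inv(T)
    for r in range(out_h):
        for c in range(out_w):
            src = T_inv @ np.array([r + min_row, c + min_col, 1.0])
            sr, sc = int(round(src[0])), int(round(src[1]))
            if 0 <= sr < h and 0 <= sc < w:
                out[r, c] = img[sr, sc]
    return out
```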
Convex hulls
If I have 100 points, and 10 are on the convex hull, and I add 1 more point:
–the max # of points on the convex hull is 11
–the min # of points on the convex hull is 3
[Figure: existing hull with the new point added outside it]
Object recognition
1. Detect features in two images
2. Match features between the two images
3. Select three matches at random
4. Solve for the affine transformation T
5. Count the number of inlier matches to T
6. If T has the highest number of inliers so far, save it
7. Repeat 3-6 for N rounds, return the best T
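Not from the course materials: a rough Python/NumPy sketch of steps 3-7, assuming matches is a list of ((x, y), (x2, y2)) correspondences and a hypothetical inlier threshold in pixels.

```python
import numpy as np

def fit_affine(sample):
    """Step 4: solve a 6x6 linear system for [a, b, c, d, e, f]
    from three correspondences."""
    A, rhs = [], []
    for (x, y), (x2, y2) in sample:
        A.append([x, y, 1, 0, 0, 0]); rhs.append(x2)
        A.append([0, 0, 0, x, y, 1]); rhs.append(y2)
    a, b, c, d, e, f = np.linalg.solve(np.array(A, float), np.array(rhs, float))
    return np.array([[a, b, c], [d, e, f]])   # 2x3 affine transform

def ransac_affine(matches, num_rounds=1000, inlier_thresh=5.0):
    """Steps 3-7: repeatedly fit T to random triples and keep the best."""
    best_T, best_inliers = None, -1
    for _ in range(num_rounds):
        # Step 3: pick three matches at random.
        idx = np.random.choice(len(matches), 3, replace=False)
        T = fit_affine([matches[i] for i in idx])
        # Step 5: count matches that T maps to within inlier_thresh pixels.
        count = 0
        for (x, y), (x2, y2) in matches:
            px, py = T @ np.array([x, y, 1.0])
            if np.hypot(px - x2, py - y2) < inlier_thresh:
                count += 1
        # Step 6: keep the transform with the most inliers seen so far.
        if count > best_inliers:
            best_T, best_inliers = T, count
    return best_T
```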
When this won’t work
If the percentage of inlier matches is small, then this may not work. In theory, < 50% inliers could break it:
–But all the outliers would have to fit a single transform
In practice, it often works even with fewer inliers than that.
A slightly trickier problem
What if we want to fit T to more than three points?
–For instance, all of the inliers we find?
Say we found 100 inliers. Now we have 200 equations, but still only 6 unknowns: an overdetermined system of equations. This brings us back to linear regression.
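As an illustrative sketch (jumping ahead to the least-squares machinery discussed below), here is how the 200x6 overdetermined system might be stacked and solved in Python/NumPy; the helper name is made up.

```python
import numpy as np

def fit_affine_lstsq(matches):
    """Least-squares affine fit to any number of correspondences.
    Same equations as the three-point solve, but each match
    ((x, y), (x2, y2)) adds two rows, so 100 inliers give 200x6."""
    A, rhs = [], []
    for (x, y), (x2, y2) in matches:
        A.append([x, y, 1, 0, 0, 0]); rhs.append(x2)
        A.append([0, 0, 0, x, y, 1]); rhs.append(y2)
    # np.linalg.lstsq minimizes the sum of squared residuals.
    params, *_ = np.linalg.lstsq(np.array(A, float), np.array(rhs, float), rcond=None)
    a, b, c, d, e, f = params
    return np.array([[a, b, c], [d, e, f]])
```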
Linear regression, > 2 points
The line y = mx + b won’t necessarily pass through any data point (x_i, y_i).
Some new definitions
No line is perfect; we can only find the best line out of all the imperfect ones. We’ll define a function Cost(m,b) that measures how far a line is from the data, then find the best line:
–I.e., the (m,b) that minimizes Cost(m,b)
–Such a function Cost(m,b) is called an objective function
–You learned about these in section yesterday
Line goodness
What makes a line good versus bad?
–This is actually a very subtle question
Residual errors
The difference between what the model predicts and what we observe is called a residual error (i.e., a left-over):
–Consider the data point (x, y)
–The model (m, b) predicts the point (x, mx + b)
–The residual is y – (mx + b)
For 1D regressions, residuals can be easily visualized:
–Vertical distance to the line
Least squares fitting
Adding up the residual distances gives a reasonable cost function, but we usually use something slightly different.
Least squares fitting
We prefer to make each term a squared distance, so Cost(m,b) is the sum of the squared residuals. This is called “least squares”.
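A tiny illustrative sketch of the two cost functions in Python/NumPy (variable names are assumptions, not from the slides):

```python
import numpy as np

def cost_abs(m, b, xs, ys):
    """Sum of residual magnitudes: a reasonable objective."""
    return np.sum(np.abs(ys - (m * xs + b)))

def cost_squared(m, b, xs, ys):
    """Sum of squared residuals: the least-squares objective Cost(m, b)."""
    return np.sum((ys - (m * xs + b)) ** 2)
```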
Why least squares?
There are lots of reasonable objective functions, so why do we want to use least squares? This is a very deep question:
–We will soon point out two things that are special about least squares
–The full story probably needs to wait for graduate-level courses, or at least next semester
Gradient descent
You learned about this in section. Basic strategy:
1. Start with some guess for the minimum
2. Find the direction of steepest descent (the gradient)
3. Take a step in that direction (making sure that you get lower; if not, adjust the step size)
4. Repeat until taking a step doesn’t get you much lower
Gradient descent, 1D quadratic
There is some black magic in setting the step size.
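Not the course's implementation: a minimal gradient-descent sketch for the least-squares line fit; the fixed step size and stopping test are arbitrary choices (the "black magic" mentioned above) and may need tuning for a given data set.

```python
import numpy as np

def fit_line_gd(xs, ys, step=0.01, iters=10_000):
    """Minimize Cost(m, b) = sum((y - (m*x + b))^2) by gradient descent."""
    m, b = 0.0, 0.0                      # 1. start with some guess
    for _ in range(iters):
        r = ys - (m * xs + b)            # residuals
        grad_m = -2.0 * np.sum(r * xs)   # 2. gradient of the cost
        grad_b = -2.0 * np.sum(r)
        m -= step * grad_m               # 3. step downhill
        b -= step * grad_b
        if np.hypot(grad_m, grad_b) < 1e-9:
            break                        # 4. stop when steps barely help
    return m, b
```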
Some error functions are easy
A (positive) quadratic is a convex function:
–The set of points above the curve forms an (infinite) convex set
–The previous slide shows this in 1D, but it’s true in any dimension
A sum of convex functions is convex; thus, the sum of squared errors is convex.
Convex functions are “nice”:
–They have a single global minimum
–Rolling downhill from anywhere gets you there
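A standard one-step justification (not shown on the slides) that a sum of two convex functions f and g is convex: for any points u, v and 0 ≤ t ≤ 1,

```latex
\begin{align*}
(f+g)\bigl(t u + (1-t) v\bigr)
  &= f\bigl(t u + (1-t) v\bigr) + g\bigl(t u + (1-t) v\bigr) \\
  &\le t f(u) + (1-t) f(v) + t g(u) + (1-t) g(v) \\
  &= t\,(f+g)(u) + (1-t)\,(f+g)(v).
\end{align*}
```

Applying this repeatedly gives convexity of the whole sum of squared residuals, since each squared residual is itself a convex quadratic in (m, b).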
Consequences
Our gradient descent method will always converge to the right answer:
–By slowly rolling downhill
–It might take a long time; it's hard to predict exactly how long (see CS3220 and beyond)
What else about LS?
Least squares has an even more amazing property than convexity:
–Consider the linear regression problem
There is a magic formula for the optimal choice of (m, b):
–You don’t need to roll downhill; you can “simply” compute the right answer
Closed-form solution!
This is a huge part of why everyone uses least squares. Other functions are convex but have no closed-form solution, e.g., the sum of residual distances from a few slides back.
Closed-form LS formula
The derivation requires linear algebra:
–Most books also use calculus, but it’s not required (see the “Links” section on the course web page)
–There’s a closed form for any linear least-squares problem
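For reference, the standard closed-form results (not copied from the slides): the minimizing (m, b) for simple linear regression, and the general linear least-squares solution, valid when A^T A is invertible.

```latex
% Simple linear regression: minimizer of  \sum_i \bigl(y_i - (m x_i + b)\bigr)^2
m = \frac{n \sum_i x_i y_i \;-\; \sum_i x_i \sum_i y_i}
         {n \sum_i x_i^2 \;-\; \bigl(\sum_i x_i\bigr)^2},
\qquad
b = \bar{y} - m\,\bar{x}

% General linear least squares: minimizer of  \| A\beta - y \|^2
% (assuming A^T A is invertible)
\beta = (A^{T} A)^{-1} A^{T} y
```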
Linear least squares
Any formula where the residual is linear in the variables. Examples:
1. simple linear regression: [y – (mx + b)]^2
2. finding an affine transformation: [x’ – (ax + by + c)]^2 + [y’ – (dx + ey + f)]^2
Non-example: [x’ – abc·x]^2 (variables: a, b, c)
Linear least squares
Surprisingly, fitting the coefficients of a quadratic is still linear least squares: the residual is still linear in the coefficients β_1, β_2, β_3.
[Figure from Wikipedia, “Least squares fitting”: a quadratic curve fit to data points]
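An illustrative sketch of the quadratic fit as a linear least-squares solve; the specific model y ≈ β_1 + β_2·x + β_3·x^2 is my assumption about the figure.

```python
import numpy as np

def fit_quadratic(xs, ys):
    """Fit y ~ b1 + b2*x + b3*x^2; the model is linear in the
    coefficients, so it is still ordinary linear least squares."""
    A = np.column_stack([np.ones_like(xs), xs, xs ** 2])  # design matrix
    beta, *_ = np.linalg.lstsq(A, ys, rcond=None)
    return beta  # (b1, b2, b3)

# Usage: noisy samples of y = 1 + 2x + 3x^2 recover roughly (1, 2, 3).
xs = np.linspace(-1, 1, 50)
ys = 1 + 2 * xs + 3 * xs ** 2 + 0.05 * np.random.randn(50)
print(fit_quadratic(xs, ys))
```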
Optimization
Least squares is an example of an optimization problem. Optimization: define a cost function and a set of possible solutions, then find the one with the minimum cost. Optimization is a huge field.
Optimization strategies
From worst to pretty good:
1. Guess an answer until you get it right
2. Guess an answer, then improve it until you get it right
3. Magically compute the right answer
For most problems, 2 is the best we can do. Some problems are easy enough that we can do 3; we’ve seen several examples of this.
Sorting as optimization
Set of allowed answers: permutations of the input sequence
Cost(permutation) = number of out-of-order pairs
Algorithm 1: Snailsort
Algorithm 2: Bubble sort
Algorithm 3: ???
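An illustrative Python sketch of this cost function (deliberately the simple quadratic-time version):

```python
def cost(perm):
    """Number of out-of-order pairs: i < j but perm[i] > perm[j]."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])

# A sorted list has cost 0; swapping an adjacent out-of-order pair
# (as in bubble sort) lowers the cost by exactly 1.
print(cost([3, 1, 2]))   # 2
print(cost([1, 2, 3]))   # 0
```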
Another example: “structure from motion”
[Figure: Cameras 1-3 with poses (R_1, t_1), (R_2, t_2), (R_3, t_3) observing points p_1 through p_7; minimize f(R, T, P)]
Optimization is everywhere
How can you give someone the smallest number of coins back as change?
How do you schedule your classes so that there are the fewest conflicts?
How do you manage your time to maximize the grade / work tradeoff?
Next time: Prelim 2