Slide 1: Regression
UC Berkeley Fall 2004, E77. http://jagger.me.berkeley.edu/~pack/e77
Copyright 2005, Andy Packard. This work is licensed under the Creative Commons Attribution-ShareAlike License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/2.0/ or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.
Slide 2: Info
- Midterm next Friday (11/5), 1-2 if you actually enrolled in my section. Check BlackBoard to see the room location. If you have a conflict (as before), bring a letter, schedule printout, etc. to class next Monday so that we can make arrangements.
- Review session Wednesday 11/3 evening. Check BlackBoard.
- HW and Lab due this Friday (10/29).
- Mid-course evaluation on BlackBoard: do it by Thursday, 11/4 at noon. We can't see your answers, but we know if you've done it. Get an extra point.
Slide 3: Regression: curve-fitting with minimum error
Given (x,y) data pairs (x_1,y_1), (x_2,y_2), …, (x_N,y_N), and a prespecified collection of “simple” functions (for example: all linear functions), find one that explains the data pairs with minimum error. For a given function f, the mismatch (error) is defined as
  e_k = f(x_k) - y_k,  for k = 1, 2, …, N.
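As a concrete sketch (the data values and the candidate function below are made up for illustration), the error vector can be computed in MATLAB elementwise:

    % Hypothetical example data, for illustration only
    Xdata = [0; 1; 2; 3];
    Ydata = [0.3; 1.9; 4.2; 5.8];

    f = @(x) 2*x;             % one candidate “simple” function
    e = f(Xdata) - Ydata;     % e(k) = f(x_k) - y_k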
Slide 4: Fitting data with a linear function
[Figure: scatter of data points together with a fitted linear function; the vertical mismatches e_i between the line and the data points are marked, some positive and some negative. Axes: X (horizontal), Y (vertical).]
Slide 5: Straight-line functions
How does it work if the function f is to be of the form f(x) = a*x + b for to-be-chosen parameters a and b? Given (x,y) data pairs (x_1,y_1), (x_2,y_2), …, (x_N,y_N), for fixed values of a and b the mismatch (error) is
  e_k = a*x_k + b - y_k,  for k = 1, 2, …, N.
Goal: make this mismatch small by choosing a and b.
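Continuing the hypothetical data above, the error vector for one particular (a,b) pair is:

    a = 2; b = -0.1;              % fixed trial parameters
    e = a*Xdata + b - Ydata;      % e(k) = a*x_k + b - y_k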
Slide 6: Measuring the “amount” of mismatch
There are several ways to quantify the amount of mismatch. All have the property that if one component of the mismatch is “big”, then the measure-of-mismatch is big. For convenience, we pick the sum-of-squares, i.e., the Euclidean norm of the error vector. This choice is motivated for a few reasons:
- It leads to least squares problems, to which we have already been exposed. And, it makes sense.
- By making “reasonable” assumptions about the cause of the mismatch (independent, random, zero-mean, identically distributed, Gaussian additive errors, i.e., “noise”, in measuring y), it is the best measure of how likely a candidate function is to have produced the observed data.
Slide 7: Euclidean norms of vectors
If v is an m-by-1 column (or row) vector, the norm of v, denoted ||v||, is defined as
  ||v|| = sqrt(v_1^2 + v_2^2 + … + v_m^2),
the square root of the sum of squares of the components, generalizing the Pythagorean theorem. The norm of a vector is a measure of its length. Some facts:
- ||v|| = 0 if and only if every component of v is zero
- ||v + w|| ≤ ||v|| + ||w||
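In MATLAB this is the built-in norm function. A quick check of the definition and the triangle inequality, using arbitrary illustrative vectors:

    v = [3; 4];
    w = [-1; 2];
    norm(v)                           % 5
    sqrt(sum(v.^2))                   % same value, straight from the definition
    norm(v + w) <= norm(v) + norm(w)  % triangle inequality: returns 1 (true)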
Slide 8: Straight-line functions
Given (x,y) data pairs (x_1,y_1), (x_2,y_2), …, (x_N,y_N), stack the errors into the “e” vector,
  e = [a*x_1 + b - y_1; a*x_2 + b - y_2; …; a*x_N + b - y_N],
which can be written as e = A*[a; b] - y, where A is the N-by-2 matrix whose first column holds the x_k and whose second column is all ones. This says: by choice of a and b, minimize ||e||, the Euclidean norm of the mismatch.
Slide 9: The “least squares” problem
If A is an n-by-m array and b is an n-by-1 vector, let c* be the smallest possible mismatch between Ax and b over all choices of m-by-1 vectors x (i.e., pick x to make Ax as much like b as possible):
  c* := min over all m-by-1 vectors x of ||Ax - b||,
where “:=” means “is defined as” and ||Ax - b|| is the length (i.e., norm) of the difference/mismatch between Ax and b.
Slide 10: Four cases for least squares
Recall the least squares formulation c* = min over x of ||Ax - b||. There are 4 scenarios:
c* = 0: the equation Ax = b has at least one solution, and either
- only one x vector achieves this minimum, or
- many different x vectors achieve the minimum.
c* > 0: the equation Ax = b has no solutions (in regression, this is almost always the case), and either
- only one x vector achieves this minimum, or
- many different x vectors achieve the minimum.
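A small illustration of the two main cases, with matrices chosen arbitrarily:

    % c* = 0, unique x: square, invertible A
    A1 = [1 0; 0 2];  b1 = [1; 4];
    x1 = A1\b1;  c1 = norm(A1*x1 - b1)   % c1 is 0 (up to rounding): Ax = b is solvable

    % c* > 0, unique x: tall, full-rank A (the typical regression situation)
    A2 = [1 1; 2 1; 3 1];  b2 = [1; 2; 2];
    x2 = A2\b2;  c2 = norm(A2*x2 - b2)   % c2 > 0: no exact solution exists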
Slide 11: The backslash operator
If A is an n-by-m array and b is an n-by-1 vector, then
  >> x = A\b
solves the “least squares” problem. Namely:
- If there is an x which solves Ax = b, then this x is computed.
- If there is no x which solves Ax = b, then an x which minimizes the mismatch between Ax and b is computed.
In the case where many x satisfy one of the criteria above, a smallest (in terms of vector norm) such x is computed. So, mismatch is handled first; among all equally suitable x vectors that minimize the mismatch, a smallest one is chosen.
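For example, a tall system with no exact solution (numbers chosen arbitrarily):

    A = [1 1; 2 1; 3 1];   % n = 3 equations, m = 2 unknowns
    b = [1; 2; 2];
    x = A\b;               % least squares solution
    norm(A*x - b)          % the minimal mismatch c*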
Slide 12: Straight-line functions (recap)
Recall the goal: by choice of a and b, minimize ||e||, the Euclidean norm of the mismatch, where e = A*[a; b] - y and A = [x-column, ones-column]. This is exactly the least squares problem that backslash solves.
Slide 13: Linear regression code

    function [a,b] = linreg(Xdata,Ydata)
    % LINREG  Fit a linear function Y = a*X + b
    % to the data given by Xdata, Ydata.

    % Verify Xdata and Ydata are column vectors of the same length.
    if size(Xdata,2) ~= 1 || size(Ydata,2) ~= 1 || size(Xdata,1) ~= size(Ydata,1)
        error('Xdata and Ydata must be column vectors of the same length');
    end
    N = length(Xdata);
    optpara = [Xdata ones(N,1)]\Ydata;   % least squares fit via backslash
    a = optpara(1);
    b = optpara(2);
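A quick usage check with synthetic noisy data (values chosen arbitrarily):

    Xdata = (0:0.5:5)';                            % column vector of x values
    Ydata = 2*Xdata - 1 + 0.1*randn(size(Xdata));  % noisy samples of y = 2x - 1
    [a,b] = linreg(Xdata,Ydata)                    % a should be near 2, b near -1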
Slide 14: Quadratic functions
How does it work if the function f is to be of the form f(x) = a*x^2 + b*x + c for to-be-chosen parameters a, b and c? For fixed values of a, b and c, the error at (x_k, y_k) is
  e_k = f(x_k) - y_k = a*x_k^2 + b*x_k + c - y_k.
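The same backslash recipe applies; the regressor matrix just gains a squared column. A sketch, reusing the Xdata and Ydata above:

    N = length(Xdata);
    RM = [Xdata.^2, Xdata, ones(N,1)];  % columns: x^2, x, 1
    abc = RM\Ydata;                     % abc = [a; b; c], the least squares coefficients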
Slide 15: Polynomial functions
How does it work if the function f is to be of the form
  f(x) = a_1*x^n + a_2*x^(n-1) + … + a_n*x + a_(n+1)
for to-be-chosen parameters a_1, a_2, …, a_(n+1)? For fixed values of a_1, a_2, …, a_(n+1), the error at (x_k, y_k) is
  e_k = f(x_k) - y_k = a_1*x_k^n + a_2*x_k^(n-1) + … + a_n*x_k + a_(n+1) - y_k.
Slide 16: Polynomial regression code

    function p = polyreg(Xdata,Ydata,nOrd)
    % POLYREG  Fit an nOrd'th order polynomial
    % to the data given by Xdata, Ydata.
    N = length(Xdata);
    RM = zeros(N,nOrd+1);     % regressor matrix
    RM(:,end) = ones(N,1);    % rightmost column: x^0
    for i = 1:nOrd
        RM(:,end-i) = RM(:,end-i+1).*Xdata;  % next column to the left: x^i
    end
    p = RM\Ydata;   % least squares coefficients, descending powers
    p = p.';        % return as a row vector
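Usage sketch; the coefficient ordering (descending powers) matches MATLAB's built-in polyfit, which can serve as a cross-check:

    p = polyreg(Xdata,Ydata,2)    % quadratic fit: p = [a b c]
    p2 = polyfit(Xdata,Ydata,2)   % built-in fit; should agree closely
    Yfit = polyval(p,Xdata);      % evaluate the fitted polynomial at the data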
17
General “basis” functions How does it work if the function f is to be of the form for fixed functions b i (called “basis” functions), and to-be- chosen parameters a 1, a 2,…,a n. For fixed values of a 1, a 2,…,a n, the error at (x k,y k ) is
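As before, each basis function contributes one column to the regressor matrix. A sketch with an arbitrarily chosen illustrative basis {1, sin(x), cos(x)}:

    basis = {@(x) ones(size(x)), @(x) sin(x), @(x) cos(x)};  % hypothetical basis choice
    N = length(Xdata);
    RM = zeros(N, numel(basis));
    for i = 1:numel(basis)
        RM(:,i) = basis{i}(Xdata);   % column i: b_i evaluated at the data
    end
    a = RM\Ydata;                    % least squares coefficients a_1, …, a_n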