
4.4 Application--OLS Estimation

Background
When we study data and look at the relationship between two variables, and have reason to believe that a linear fit is appropriate, we need a way to determine the model that gives the optimal linear fit (the one that best reflects the trend in the data). Example: the relationship between hits and RBIs. A perfect fit would result if every point in the data exactly satisfied some equation y = a + bx, but this is next to impossible: there is too much variability in real-world data.

So what do we do?
Assume y = a + bx is the best fit for the data. Then for each data point (x_i, y_i) we can find the point on the line, (x_i, f(x_i)), with the same x-value. (Draw a diagram on the board.) We then define d_i = y_i - f(x_i), the distance from the point in the data to the point on the line. d_i is also called the error, or residual, at x_i: it measures how far the data is from the line. To measure the fit of the line, we could add up all the errors, d_1 + d_2 + ... + d_n. However, a best-fit line will have some data above it and some below, so the positive and negative errors cancel and this sum will turn out to be 0.

So what do we do?
Therefore, we need to make all of our errors positive by taking either |d_i| or d_i^2. |d_i| gives the same weight to large and small errors, whereas d_i^2 gives more weight to larger errors. Example: compare the error sets {0, 0, 0, 0, 50} and {10, 10, 10, 10, 10}. The average of |d_i| is 10 in both cases, but the sums of squares are 2500 and 500 respectively, so squaring penalizes the single large error.

Which one is a better fit?
Which should be considered a better fit: a line that goes right through 4 of the points but nowhere near the 5th, or a line that is the same moderate distance from every point? The second one (yes). The |d_i| criterion will not show this, but the d_i^2 criterion will.
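As a quick numerical check (a minimal sketch; the two error sets are the ones from the previous slide), we can compare the two criteria directly:

```python
# Compare the absolute-error and squared-error criteria for the two
# error sets from the previous slide.
errors_concentrated = [0, 0, 0, 0, 50]   # line through 4 points, far from the 5th
errors_spread = [10, 10, 10, 10, 10]     # line a moderate distance from every point

for name, d in [("concentrated", errors_concentrated), ("spread", errors_spread)]:
    sum_abs = sum(abs(e) for e in d)
    sum_sq = sum(e**2 for e in d)
    print(f"{name}: sum |d_i| = {sum_abs}, sum d_i^2 = {sum_sq}")

# Both sets have sum |d_i| = 50, but the squared criterion gives
# 2500 vs 500, penalizing the single large error.
```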

Ordinary Least Squares Method
So we will select the model which minimizes the sum of squared residuals: S = d_1^2 + d_2^2 + ... + d_n^2 = [y_1 - f(x_1)]^2 + ... + [y_n - f(x_n)]^2. The resulting line is called the least squares approximating line. We can use vectors to help us choose y = a + bx to minimize S.
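For concreteness, here is a small sketch of computing S for one candidate line (the data and the candidate coefficients a, b are illustrative choices of mine, not from the slides):

```python
# Sum of squared residuals S for a candidate line y = a + b*x.
# The data and the candidate (a, b) below are illustrative only.
xs = [1.0, 2.0, 4.0, 6.0, 8.0]
ys = [0.0, 2.0, 5.0, 9.0, 12.0]
a, b = -1.5, 1.7  # a candidate intercept and slope

residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
S = sum(d**2 for d in residuals)
print(f"S = {S:.4f}")  # least squares chooses (a, b) to make this as small as possible
```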

Ordinary Least Squares Method
S, which we will minimize, is just the sum of the squares of the entries of the vector Y - MZ, where Y is the column vector of the y_i, M is the matrix whose i-th row is (1, x_i), and Z = (a, b). If n = 3, then Y - MZ is the vector (y_1 - a - b x_1, y_2 - a - b x_2, y_3 - a - b x_3) = (d_1, d_2, d_3), so S = ||Y - MZ||^2.

Ordinary Least Squares Method
S = ||Y - MZ||^2. Recall that Y and M are given, since we have 3 data points to fit. We simply need to select Z = (a, b) to minimize S. Let P be the set of all vectors MZ as Z varies.

Ordinary Least Squares Method
It turns out that all of the vectors in the set P lie in the same plane through the origin (we discuss why later in the book). Taking a = 1, b = 0 and then a = 0, b = 1, we find that this plane contains the vectors U = (1, 1, 1) and V = (x_1, x_2, x_3), and a normal vector for the plane is N = U x V = (x_3 - x_2, x_1 - x_3, x_2 - x_1).
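A quick numeric sanity check (a sketch with made-up x-values and a few arbitrary choices of Z) that every vector MZ is orthogonal to N = U x V, i.e. lies in that plane:

```python
# Check numerically that MZ lies in the plane with normal N = U x V,
# using made-up x-values and a few arbitrary choices of Z = (a, b).
x1, x2, x3 = 1.0, 2.0, 4.0
U = (1.0, 1.0, 1.0)
V = (x1, x2, x3)
# Cross product U x V.
N = (U[1]*V[2] - U[2]*V[1], U[2]*V[0] - U[0]*V[2], U[0]*V[1] - U[1]*V[0])

for a, b in [(1.0, 0.0), (0.0, 1.0), (2.5, -3.0)]:
    MZ = (a + b*x1, a + b*x2, a + b*x3)
    dot = sum(n*m for n, m in zip(N, MZ))
    print(f"Z = ({a}, {b}): N . MZ = {dot}")  # always 0 (up to rounding)
```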

Ordinary Least Squares Method
[Diagram: the point Y, the plane through the origin O containing all vectors MZ, the closest point MA in the plane, and the vector Y - MA.]
Recall that we are trying to minimize S = ||Y - MZ||^2. Y = (y_1, y_2, y_3) is a point in space, and MZ is some vector in the set P, which we have illustrated as a plane. S = ||Y - MZ||^2 is the squared distance from the point to the plane, so if we can find the point MA in the plane closest to Y, we will have our solution.

Ordinary Least Squares Method
Y - MA is orthogonal to all vectors MZ in the plane, so (MZ) . (Y - MA) = 0. Note the rule for dot products when vectors are written as column matrices: X . W = X^T W.

Ordinary Least Squares Method
0 = (MZ) . (Y - MA) = (MZ)^T (Y - MA) = Z^T M^T (Y - MA) = Z^T (M^T Y - M^T M A) = Z . (M^T Y - M^T M A). The last dot product is in two dimensions, and it says that (M^T Y - M^T M A) is orthogonal to every possible Z. This can only happen if M^T Y - M^T M A = 0, so M^T Y = M^T M A. These are called the normal equations for A.
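The same derivation, written out compactly in LaTeX for readability:

```latex
\[
0 = (MZ)\cdot(Y - MA) = (MZ)^{T}(Y - MA) = Z^{T}M^{T}(Y - MA)
  = Z^{T}\left(M^{T}Y - M^{T}MA\right) \quad \text{for all } Z,
\]
\[
\text{hence } M^{T}Y - M^{T}MA = 0,
\quad \text{i.e. } M^{T}MA = M^{T}Y \quad \text{(the normal equations).}
\]
```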

Ordinary Least Squares Method
With x_1, x_2, x_3 all distinct, we can show that M^T M is invertible, so from M^T Y = M^T M A we get A = (M^T M)^(-1) M^T Y. This gives us A = (a, b), which in turn gives the point MA = (a + b x_1, a + b x_2, a + b x_3) closest to Y. The best-fit line is then y = a + bx.

Ordinary Least Squares Method
Recall that this argument started by setting n = 3 so that we could use a 3-dimensional argument with vectors. The argument becomes more complex, but it does extend to any n.

Theorem 1
Suppose that n data points (x_1, y_1), ..., (x_n, y_n) are given, of which at least two have distinct x-values. Let Y be the column vector with entries y_1, ..., y_n, let M be the n x 2 matrix whose i-th row is (1, x_i), and let A = (a_0, a_1). Then the least squares approximating line has equation y = a_0 + a_1 x, where A is found by Gaussian elimination from the normal equations M^T Y = M^T M A. Since at least two x's are distinct, M^T M is invertible, so A = (M^T M)^(-1) M^T Y.

Example
Find the least squares approximating line for the following data: (1,0), (2,2), (4,5), (6,9), (8,12). See what you get with the TI-83+.
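Instead of a TI-83+, here is a minimal sketch using NumPy (my choice of tool; any linear-algebra software works) that sets up M and Y for these five points and solves the normal equations:

```python
import numpy as np

# Data from the example.
xs = np.array([1.0, 2.0, 4.0, 6.0, 8.0])
ys = np.array([0.0, 2.0, 5.0, 9.0, 12.0])

# Design matrix M with rows (1, x_i) and vector Y.
M = np.column_stack([np.ones_like(xs), xs])
Y = ys

# Solve the normal equations M^T M A = M^T Y for A = (a0, a1).
a0, a1 = np.linalg.solve(M.T @ M, M.T @ Y)
print(f"y = {a0:.4f} + {a1:.4f} x")  # approximately y = -1.6220 + 1.7195 x
```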

Example
Find an equation of the plane through P(1, 3, 2) with normal (2, 0, -1).

We extend further...
We can generalize to select the least squares approximating polynomial of degree m: f(x) = a_0 + a_1 x + a_2 x^2 + ... + a_m x^m, where we estimate the a's.

Theorem 2 (proof in Ch. 6)
Suppose n data points (x_1, y_1), ..., (x_n, y_n) are given, with at least m+1 of the x's distinct. Let Y be the column vector with entries y_1, ..., y_n, let M be the n x (m+1) matrix whose i-th row is (1, x_i, x_i^2, ..., x_i^m), and let A = (a_0, a_1, ..., a_m). Then the least squares approximating polynomial of degree m is f(x) = a_0 + a_1 x + a_2 x^2 + ... + a_m x^m, where A is found by Gaussian elimination from the normal equations M^T Y = M^T M A. Since at least m+1 of the x's are distinct, M^T M is invertible, so A = (M^T M)^(-1) M^T Y.
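A minimal sketch of Theorem 2 as code (the function name and the use of NumPy are my own choices), building the matrix M with rows (1, x_i, ..., x_i^m) and solving the normal equations:

```python
import numpy as np

def least_squares_poly(xs, ys, m):
    """Fit f(x) = a_0 + a_1 x + ... + a_m x^m by solving M^T M A = M^T Y.

    Requires at least m+1 distinct x-values so that M^T M is invertible.
    """
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    # M has rows (1, x_i, x_i^2, ..., x_i^m).
    M = np.vander(xs, N=m + 1, increasing=True)
    A = np.linalg.solve(M.T @ M, M.T @ ys)
    return A  # coefficients (a_0, a_1, ..., a_m)
```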

Note that we need at least one more data point than the degree of the polynomial we are trying to estimate. That is, with n data points we could not estimate a polynomial of degree n.

Example
Find the least squares approximating quadratic for the following data points: (-2,0), (0,-4), (2,-10), (4,-9), (6,-3).
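Using the same normal-equations approach as the sketch after Theorem 2 (again a NumPy sketch; the printed coefficients are approximate):

```python
import numpy as np

# Data from the example.
xs = np.array([-2.0, 0.0, 2.0, 4.0, 6.0])
ys = np.array([0.0, -4.0, -10.0, -9.0, -3.0])

# M has rows (1, x_i, x_i^2); solve M^T M A = M^T Y for A = (a0, a1, a2).
M = np.vander(xs, N=3, increasing=True)
a0, a1, a2 = np.linalg.solve(M.T @ M, M.T @ ys)
print(f"f(x) = {a0:.4f} + {a1:.4f} x + {a2:.4f} x^2")
# Roughly f(x) = -6.0286 - 2.4786 x + 0.4821 x^2
```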