
1 Linear Regression
J.-F. Pâris, University of Houston

2 Introduction Special case of regression analysis

3 Regression Analysis Models the relationship between
the values of a dependent variable (also called a response variable)
and the values of one or more independent variables.
The main outcome is a function y = f(x1, …, xn).

4 Linear regression Studies linear dependencies y = ax + b
And more, such as y = ax² + bx + c, which is still linear in the coefficients a, b, and c
Uses the least-squares method
Assumes that departures from the ideal line are due to random noise

5 Basic Assumptions (I) Sample is representative of the whole population
The error is assumed to be a random variable with a mean of zero conditional on the independent variables.
The independent variables are measured without error and are linearly independent.
The errors are uncorrelated.

6 Basic Assumptions (II)
The variance of the error is constant across observations.
For very small samples, the errors must be Gaussian; this requirement does not apply to large samples (n ≥ 30).

7 General Formulation
n samples of the dependent variable: y1, y2, …, yn
n samples of each of the p independent variables:
x11, x12, …, x1n
x21, x22, …, x2n
…
xp1, xp2, …, xpn

8 Objective
Finding Y = b0 + b1X1 + b2X2 + … + bpXp
minimizing the sum of squares of the deviations
Σi (yi − b0 − b1x1i − b2x2i − … − bpxpi)²
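To make the objective concrete, here is a minimal Python sketch of the quantity being minimized; the function name, argument layout, and toy indexing are my own choices and are not part of the original slides.

```python
# Minimal sketch of the least-squares objective (illustration only).
def sum_of_squares(y, x_rows, b):
    """Compute S = sum_i (y_i - b0 - b1*x_1i - ... - bp*x_pi)^2.

    y      : the n observed values of the dependent variable
    x_rows : p lists, each holding the n samples of one independent
             variable (x_rows[j][i] corresponds to x_{j+1, i})
    b      : the p + 1 coefficients [b0, b1, ..., bp]
    """
    total = 0.0
    for i, yi in enumerate(y):
        predicted = b[0] + sum(b[j + 1] * x_rows[j][i]
                               for j in range(len(x_rows)))
        total += (yi - predicted) ** 2
    return total
```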

9 Why the sum of squares? It gives more weight to big deviations
Big deviations are less likely to result from random noise than small ones.
Our objective is to estimate the function linking the dependent variable to the independent variables, assuming that the scatter of the experimental points around that function is random noise.

10 Simplest case (I) One independent variable
We must find Y = a + bX minimizing the sum of the squares of the errors Σi (yi − a − bxi)²

11 Simplest case (II) Differentiate the previous expression with respect to the parameters a and b and set the derivatives to zero:
Σi −2(yi − a − bxi) = 0, or n a + (Σi xi) b = Σi yi
Σi −2 xi(yi − a − bxi) = 0, or (Σi xi) a + (Σi xi²) b = Σi xi yi
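A small Python sketch (my own addition, with invented data) that solves these two normal equations directly for a and b:

```python
# Fit y = a + b*x by solving the two normal equations above.
def fit_line(x, y):
    n = len(x)
    sum_x  = sum(x)
    sum_y  = sum(y)
    sum_xx = sum(xi * xi for xi in x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))

    # n*a + sum_x*b = sum_y  and  sum_x*a + sum_xx*b = sum_xy
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b

# Toy data lying close to y = 1 + 2x  ->  a ≈ 1.09, b ≈ 1.94
a, b = fit_line([0, 1, 2, 3], [1.1, 2.9, 5.2, 6.8])
```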

12 Simplest case (III) We obtain
a = (Σi yi − b Σi xi) / n = ȳ − b x̄
b = (n Σi xi yi − Σi xi Σi yi) / (n Σi xi² − (Σi xi)²)
The second expression can be rewritten more compactly using the notations introduced on the next slide.

13 More notations
x̄ = (Σi xi) / n and ȳ = (Σi yi) / n are the sample means
Sxx = Σi (xi − x̄)², Syy = Σi (yi − ȳ)², Sxy = Σi (xi − x̄)(yi − ȳ)

14 Simplest case (IV) The solution can be rewritten as
b = Sxy / Sxx and a = ȳ − b x̄

15 Coefficient of correlation
r = Sxy / √(Sxx Syy)
|r| = 1 would indicate a perfect linear fit
r = 0 would indicate no linear dependency
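As an illustration (again my own sketch, reusing the toy data from the earlier example), r can be computed directly from Sxx, Syy, and Sxy:

```python
from math import sqrt

def correlation(x, y):
    # Coefficient of correlation r = Sxy / sqrt(Sxx * Syy)
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xy / sqrt(s_xx * s_yy)

print(correlation([0, 1, 2, 3], [1.1, 2.9, 5.2, 6.8]))  # close to 1: near-perfect linear fit
```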

16 More complex case (I) Use the matrix formulation Y = Xb + e
where Y is the column vector of the n observed values of the dependent variable, X is the n × (p + 1) matrix whose first column is all ones and whose remaining columns hold the samples of the p independent variables, b is the vector of coefficients, and e is the vector of errors.

17 More complex case (II) The solution to the problem is b = (XᵀX)⁻¹Xᵀy
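A hedged NumPy sketch of this formula; the data matrix and the use of numpy.linalg are my additions, since the slides do not prescribe any particular library:

```python
import numpy as np

# n = 4 observations, p = 2 independent variables;
# the first column of ones corresponds to the intercept b0.
X = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [1.0, 2.0, 3.0],
              [1.0, 3.0, 1.0]])
y = np.array([1.0, 3.1, 9.8, 8.1])

# Direct transcription of b = (X^T X)^{-1} X^T y
b = np.linalg.inv(X.T @ X) @ X.T @ y

# In practice, np.linalg.lstsq is preferred for numerical stability.
b_stable, *_ = np.linalg.lstsq(X, y, rcond=None)
```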

18 Non-linear dependencies
Can use a polynomial model Y = b0 + b1X + b2X² + … + bpX^p
Or do a logarithmic transform: replace y = K e^(at) by log y = log K + at
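Both tricks can be sketched in a few lines of NumPy (my own illustration with invented data; np.polyfit is just one convenient way to fit a model that is linear in its coefficients):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 2.6, 3.9, 5.3, 8.1])   # roughly exponential growth

# Polynomial model: fit y = b0 + b1*x + b2*x^2 (still linear in the b's)
poly_coeffs = np.polyfit(x, y, deg=2)

# Logarithmic transform: y = K*exp(a*t)  =>  log y = log K + a*t,
# which is an ordinary linear regression of log y on t.
a, log_K = np.polyfit(x, np.log(y), deg=1)
K = np.exp(log_K)
```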
