Linear regression
J.-F. Pâris, University of Houston

Introduction
- Linear regression is a special case of regression analysis

Regression Analysis
- Models the relationship between
  - values of a dependent variable (also called a response variable)
  - values of one or more independent variables
- Main outcome is a function y = f(x1, …, xp)

Linear regression
- Studies linear dependencies y = ax + b
- And more: y = ax² + bx + c is linear in a, b, and c
- Uses the least-squares method
- Assumes that departures from the ideal line are due to random noise

Basic Assumptions (I)
- The sample is representative of the whole population
- The error is a random variable with a mean of zero conditional on the independent variables
- The independent variables are error-free and linearly independent
- The errors are uncorrelated

Basic Assumptions (II)
- The variance of the error is constant across observations
- For very small samples, the errors must be Gaussian
  - Does not apply to large samples (n ≥ 30)

General Formulation
- n samples of the dependent variable: y1, y2, …, yn
- n samples of each of the p independent variables:
  x11, x12, …, x1n
  x21, x22, …, x2n
  …
  xp1, xp2, …, xpn

Objective
- Finding Y = b0 + b1X1 + b2X2 + … + bpXp
- Minimizing the sum of squares of the deviations
  Σi (yi − b0 − b1x1i − b2x2i − … − bpxpi)²
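
As a concrete illustration, here is a minimal Python sketch of that objective, assuming NumPy; the arrays X and y and the function name sum_of_squares are hypothetical, not part of the original slides:

  import numpy as np

  def sum_of_squares(b, X, y):
      # sum of squared deviations Σi (yi - b0 - b1*x1i - ... - bp*xpi)²
      # X: n-by-p array with one row per sample; b = [b0, b1, ..., bp]
      residuals = y - (b[0] + X @ b[1:])
      return float(np.sum(residuals ** 2))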

Why the sum of squares?
- It penalizes big deviations, which are less likely to result from random noise than small ones
- Our objective is to estimate the function linking the dependent variable to the independent variables, assuming that the experimental points deviate from it only through random variations

Simplest case (I)
- One independent variable
- We must find Y = a + bX
- Minimizing the sum of squares of the errors
  Σi (yi − a − bxi)²

Simplest case (II)
Differentiate the previous expression with respect to the parameters a and b and set both derivatives to zero:
  Σi −2(yi − a − bxi) = 0, or  n a + (Σi xi) b = Σi yi
  Σi −2xi(yi − a − bxi) = 0, or  (Σi xi) a + (Σi xi²) b = Σi xi yi
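
These two normal equations can also be solved directly as a 2×2 linear system; a quick sketch, assuming NumPy and hypothetical sample arrays x and y:

  import numpy as np

  x = np.array([1.0, 2.0, 3.0, 4.0])   # hypothetical sample data
  y = np.array([2.1, 3.9, 6.2, 7.8])
  n = len(x)
  # normal equations as a 2x2 linear system:
  #   [ n      Σ xi  ] [a]   [ Σ yi    ]
  #   [ Σ xi   Σ xi² ] [b] = [ Σ xi yi ]
  A = np.array([[n, x.sum()], [x.sum(), (x ** 2).sum()]])
  rhs = np.array([y.sum(), (x * y).sum()])
  a, b = np.linalg.solve(A, rhs)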

Simplest case (III)
We obtain
  a = (Σi xi² Σi yi − Σi xi Σi xiyi) / (n Σi xi² − (Σi xi)²)
  b = (n Σi xiyi − Σi xi Σi yi) / (n Σi xi² − (Σi xi)²)
The second expression can be rewritten
  b = Σi (xi − x̄)(yi − ȳ) / Σi (xi − x̄)²

More notations
  x̄ = (1/n) Σi xi   ȳ = (1/n) Σi yi
  Sxx = Σi (xi − x̄)²   Syy = Σi (yi − ȳ)²   Sxy = Σi (xi − x̄)(yi − ȳ)

Simplest case (IV)
The solution can be rewritten
  b = Sxy / Sxx
  a = ȳ − b x̄
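
The same closed form in a short Python sketch; fit_line is a hypothetical helper name and x, y are assumed to be NumPy arrays:

  def fit_line(x, y):
      # least-squares line y = a + b*x, using b = Sxy / Sxx and a = ȳ - b*x̄
      xbar, ybar = x.mean(), y.mean()
      Sxy = ((x - xbar) * (y - ybar)).sum()
      Sxx = ((x - xbar) ** 2).sum()
      b = Sxy / Sxx
      a = ybar - b * xbar
      return a, b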

Coefficient of correlation
  r = Sxy / √(Sxx Syy)
- |r| = 1 would indicate a perfect linear fit
- r = 0 would indicate no linear dependency
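
A sketch of the same formula in Python, under the notation above (correlation is a hypothetical helper name; x, y are assumed NumPy arrays):

  def correlation(x, y):
      # Pearson coefficient r = Sxy / sqrt(Sxx * Syy)
      xbar, ybar = x.mean(), y.mean()
      Sxy = ((x - xbar) * (y - ybar)).sum()
      Sxx = ((x - xbar) ** 2).sum()
      Syy = ((y - ybar) ** 2).sum()
      return Sxy / (Sxx * Syy) ** 0.5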

More complex case (I)
Use the matrix formulation
  Y = Xb + e
where Y is the column vector of the n observations of the dependent variable, b the column vector of the p + 1 coefficients, e the column vector of errors, and X the n × (p + 1) design matrix whose i-th row is (1, x1i, x2i, …, xpi).

More complex case (II)
The solution to the problem is
  b = (XᵀX)⁻¹ XᵀY
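
A minimal NumPy sketch of this solution; x1, x2 and y are hypothetical sample arrays, and solve() is used instead of forming the inverse explicitly, which is numerically safer but algebraically equivalent:

  import numpy as np

  x1 = np.array([1.0, 2.0, 3.0, 4.0])   # hypothetical sample data
  x2 = np.array([0.3, 1.1, 1.4, 2.2])
  y  = np.array([3.0, 5.1, 7.2, 8.9])
  X = np.column_stack([np.ones(len(y)), x1, x2])   # leading column of ones for b0
  b = np.linalg.solve(X.T @ X, X.T @ y)            # b = (XᵀX)⁻¹ XᵀY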

Non-linear dependencies
- Can use a polynomial model
  Y = b0 + b1X + b2X² + … + bpX^p
- Or do a logarithmic transform
  Replace y = K e^(at) by log y = log K + at
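
A sketch of the logarithmic transform, assuming NumPy; t and y are hypothetical positive data:

  import numpy as np

  t = np.array([0.0, 1.0, 2.0, 3.0])      # hypothetical data
  y = np.array([1.0, 2.6, 7.5, 20.0])     # roughly K·e^(a·t)
  # fit log y = log K + a·t with a degree-1 least-squares polynomial
  a, logK = np.polyfit(t, np.log(y), 1)   # polyfit returns [slope, intercept]
  K = np.exp(logK)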