5.4 General Linear Least-Squares

5.4 General Linear Least-Squares
- linear, non-linear, and linearizable equations
- the general linear equation
- notational simplification and the matrix least-squares solution
- errors in the coefficients
- an example with Gaussian peaks
- an alternative matrix formalism
- an example from multi-component spectrophotometry
5.4 : 1/17

What is a Linear Equation?
With respect to curve fitting, a linear equation is one that is linear in the unknown parameters. In the following examples let a represent the unknown parameters, k represent known constants, x represent the independent, error-free variable, and y represent the dependent random variable. The randomness of y is assumed to follow a normal pdf. The slide lists examples under two headings, Linear Equations and Non-Linear Equations; representative forms are sketched below.
5.4 : 2/17
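The example equations on this slide appear only as images and did not survive into the transcript; the forms below are representative illustrations (assumptions, not necessarily the slide's own examples).

```latex
% linear in the coefficients a_j (x itself may enter non-linearly):
y = a_0 + a_1 x + a_2 x^2, \qquad y = a_0 e^{-kx} + a_1 \sin(kx)
% non-linear in the coefficients (a coefficient appears inside a function,
% in an exponent, or in a denominator):
y = a_0 e^{-a_1 x}, \qquad y = \frac{a_0 x}{a_1 + x}
```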

Linearizable Equations
Some equations can be made linear in their coefficients by a mathematical transformation. The slide tabulates, for each case, the non-linear coefficient, the corresponding linear coefficient, and the coefficient transformation (marked "N.A." where no transformation is needed); a representative case is sketched below.
Two important restrictions:
- the pdf of the transformed variable z may not be normal, voiding the least-squares analysis. The CLT comes to the rescue for large numbers of measurements, though.
- the pdf of the transformed coefficients may not be normal, voiding the use of many hypothesis tests.
5.4 : 3/17
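The slide's transformation table is not legible in the transcript; the sketch below shows one common linearizable case as an illustration (an assumption, not necessarily a row from the slide's table).

```latex
y = a_0 e^{-a_1 x}
\;\xrightarrow{\;z = \ln y\;}\;
z = \ln a_0 - a_1 x = b_0 + b_1 x,
\qquad a_0 = e^{b_0}, \quad a_1 = -b_1 .
```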

The General Linear Equation
Consider a general equation linear in its n+1 coefficients,
y = a0 f0(x) + a1 f1(x) + ... + an fn(x),
where the fj(x) are functions of x that contain only known constants. For the straight-line equation, f0(x) = 1 and f1(x) = x. There are N data pairs, (xi, yi). The method of maximum likelihood produces one equation for each of the n+1 unknown equation coefficients.
5.4 : 4/17
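The slide's equations are not reproduced in the transcript; the block below writes out the standard maximum-likelihood (chi-square minimization) equations consistent with the surrounding text.

```latex
\chi^2 = \sum_{i=1}^{N} \frac{1}{\sigma_i^2}
         \Bigl( y_i - \sum_{j=0}^{n} a_j f_j(x_i) \Bigr)^2,
\qquad
\frac{\partial \chi^2}{\partial a_r} = 0
\;\Longrightarrow\;
\sum_{i=1}^{N} \frac{f_r(x_i)}{\sigma_i^2}
\Bigl( y_i - \sum_{j=0}^{n} a_j f_j(x_i) \Bigr) = 0,
\quad r = 0, \dots, n .
```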

Notational Simplification
Define two new variables to replace the summation operations. βr is an (n+1) column vector, where 0 ≤ r ≤ n. αr,c is an (n+1)×(n+1) matrix, where 0 ≤ r ≤ n and 0 ≤ c ≤ n. For all sums, 1 ≤ i ≤ N. With this change in variables, the set of equations becomes βr = Σc αr,c ac, which can be further simplified using matrix notation as β = α a, where m = n+1 is the number of rows (and the number of columns) of α.
5.4 : 5/17
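Written out, the standard definitions consistent with this description (a reconstruction in the usual weighted least-squares notation; the slide's own equations are not in the transcript) are:

```latex
\beta_r = \sum_{i=1}^{N} \frac{y_i\, f_r(x_i)}{\sigma_i^2},
\qquad
\alpha_{r,c} = \sum_{i=1}^{N} \frac{f_r(x_i)\, f_c(x_i)}{\sigma_i^2}.
```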

Matrix Least-Squares Solution
The matrix equation is solved by pre-multiplying both sides by the inverse alpha matrix, α⁻¹, and remembering that α⁻¹α = 1, so that a = α⁻¹β.
The matrix least-squares solution (a numerical sketch follows this list):
- the N (x, y) pairs of data are collected
- the n+1 functions, fj(x), are evaluated for each of the x-values
- the β terms are computed and arranged into a column vector
- the α terms are computed and arranged into a matrix
- the inverse alpha matrix, α⁻¹, is computed
- the beta vector is left-multiplied by the inverse alpha matrix to obtain the n+1 equation coefficients in the form of a column vector, a
5.4 : 6/17
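A minimal numerical sketch of this recipe, assuming NumPy; the data, basis functions, and uncertainties below are illustrative placeholders, not values from the slides.

```python
import numpy as np

def linear_lsq(x, y, sigma, basis):
    """Weighted general linear least squares: y = sum_j a_j * f_j(x).

    basis is a list of callables f_j; returns (a, cov) where cov = alpha^-1.
    """
    F = np.column_stack([f(x) for f in basis])   # N x (n+1) design matrix
    w = 1.0 / sigma**2                           # weights 1/sigma_i^2
    alpha = F.T @ (w[:, None] * F)               # (n+1) x (n+1) alpha matrix
    beta = F.T @ (w * y)                         # (n+1) beta column vector
    cov = np.linalg.inv(alpha)                   # variance/covariance matrix
    a = cov @ beta                               # coefficient vector
    return a, cov

# Illustrative use: fit a straight line, f0(x) = 1 and f1(x) = x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
sigma = np.full_like(y, 0.2)
a, cov = linear_lsq(x, y, sigma, [lambda t: np.ones_like(t), lambda t: t])
print("coefficients:", a)
print("coefficient variances:", np.diag(cov))
```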

Errors in the Coefficients
The α⁻¹ matrix is called the variance/covariance matrix. The diagonal elements are the variances of the corresponding coefficients. In the common case of an unweighted least-squares (all σi = σ), the expression is modified so that the common variance, estimated in the usual manner from the residuals, multiplies the diagonal elements. Remember, n+1 coefficients have already been computed from the N data values, so the degrees of freedom are N − (n+1).
5.4 : 7/17
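In symbols, the standard results described here are (a reconstruction consistent with the text; the slide's own equations are not in the transcript):

```latex
\sigma^2(a_j) = \left[\boldsymbol{\alpha}^{-1}\right]_{jj}
\quad\text{(weighted fit)},
\qquad
\sigma^2(a_j) = s^2 \left[\boldsymbol{\alpha}^{-1}\right]_{jj},
\quad
s^2 = \frac{\sum_{i=1}^{N} \bigl(y_i - \hat{y}_i\bigr)^2}{N - (n+1)}
\quad\text{(unweighted fit, } \boldsymbol{\alpha} \text{ computed without weights)}.
```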

Error Matrix for y = a0 + a1x
For the straight-line equation, f0(x) = 1 and f1(x) = x. For an unweighted least-squares the α matrix is given by the sums of the basis-function products. The α⁻¹ variance/covariance matrix yields the expressions derived on slide 5.3-7.
5.4 : 8/17
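For reference, the standard unweighted straight-line matrices and variances (again a reconstruction, since the slide's equations did not survive the transcript) are:

```latex
\boldsymbol{\alpha} =
\begin{pmatrix} N & \sum x_i \\ \sum x_i & \sum x_i^2 \end{pmatrix},
\qquad
\boldsymbol{\alpha}^{-1} = \frac{1}{\Delta}
\begin{pmatrix} \sum x_i^2 & -\sum x_i \\ -\sum x_i & N \end{pmatrix},
\qquad
\Delta = N \sum x_i^2 - \Bigl(\sum x_i\Bigr)^2,
\qquad
\sigma^2(a_0) = \frac{s^2 \sum x_i^2}{\Delta},
\qquad
\sigma^2(a_1) = \frac{s^2\, N}{\Delta}.
```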

Covariance
The covariance of two random variables is given by cov(x, y) = ⟨xy⟩ − μxμy. For statistically independent random variables, ⟨xy⟩ = μxμy, and cov(x, y) = 0. The least-squares solution on the previous slide showed that the slope and intercept are only statistically independent when the x-axis values are centered about zero. When the deviation in the slope is positive, the deviation in the intercept will be negative (for data with a positive mean x). Except for centered data, the two random variables a0 and a1 are NOT statistically independent: the error in one of them influences the error in the other.
5.4 : 9/17
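The off-diagonal element that quantifies this coupling is, in the same reconstruction as the previous slide,

```latex
\operatorname{cov}(a_0, a_1) = -\frac{s^2 \sum x_i}{\Delta},
```

which is negative whenever the x values have a positive sum and vanishes when they are centered about zero.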

Example Matrix Least-Squares
Consider a situation where the measured data are some combination of three known Gaussian functions. The measured data are shown in the figure on the slide (not reproduced in the transcript). The goal is to obtain the least-squares solution to
y = a0 f0(t) + a1 f1(t) + a2 f2(t)
5.4 : 10/17

Matrix Solution
(The worked numerical matrices for the Gaussian example are shown on the slide; a computational sketch follows.)
5.4 : 11/17
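A sketch of how the linear coefficients of three known Gaussian components could be obtained; the Gaussian centers, widths, true coefficients, and noise level below are illustrative assumptions, since the slide's numerical values are not reproduced in the transcript.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed Gaussian basis functions (centers and widths are placeholders).
def gaussian(t, mu, sig):
    return np.exp(-0.5 * ((t - mu) / sig) ** 2)

t = np.linspace(0.0, 10.0, 101)
basis = [lambda t, m=m: gaussian(t, m, 0.8) for m in (2.5, 5.0, 7.5)]

# Synthetic "measured" data: an assumed combination plus noise.
a_true = np.array([1.0, 2.0, 0.5])
F = np.column_stack([f(t) for f in basis])
y = F @ a_true + rng.normal(0.0, 0.05, size=t.size)

# Unweighted normal equations: alpha a = beta.
alpha = F.T @ F
beta = F.T @ y
a_fit = np.linalg.solve(alpha, beta)
print("fitted coefficients:", a_fit)
```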

Alternative Matrix Formalism
Suppose that the linear components of the equation to be fit are available as data vectors and not functions. An excellent example would be a multi-component determination using Beer's law,
Aλ = ελ,0 C0 + ελ,1 C1 + ελ,2 C2
With this method the spectrum of each component, ελ,i, is known, but doesn't have a functional form. The collected data and known molar absorptivities might be arranged in a table as follows (entries lost in the transcript are shown as …).

λ (nm)   A       ελ,0     ελ,1     ελ,2
400      0.000   10,342   105      …
405      0.001   2        10,281   108
⋯        ⋯       ⋯        ⋯        ⋯
650      536 (column placement lost in the transcript)
5.4 : 12/17

Matrix Algebra
Let A(N) be a column vector of absorbance values, where N is the number of wavelengths. Let ε(N, m) be a matrix of ε-values, where m = n+1 is the number of components. And let C(m) be a column vector of the m concentrations, so that A = εC. This cannot be solved directly using the inverse matrix since ε is not square. Instead, pre-multiply both sides by the transpose of ε. Note that the product, εᵀε, is square, m×m, and has an inverse. Finally, pre-multiply both sides by the inverse, (εᵀε)⁻¹. The term in brackets, (εᵀε)⁻¹, is the variance/covariance matrix.
5.4 : 13/17
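Written out, the steps described above are (a reconstruction from the prose; the slide's own equations are not reproduced in the transcript):

```latex
A = \varepsilon C, \qquad
\varepsilon^{\mathsf T} A = \bigl(\varepsilon^{\mathsf T}\varepsilon\bigr) C, \qquad
C = \bigl(\varepsilon^{\mathsf T}\varepsilon\bigr)^{-1} \varepsilon^{\mathsf T} A .
```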

Multi-Component Spectrophotometry
Consider the analysis of a solution containing p-biphenyl (B), p-terphenyl (T), and p-quaterphenyl (Q). For the three compounds, the molar absorptivities, ε, are given in the table and the spectra in the graph (graph not reproduced in the transcript). Blank entries on the slide presumably indicate negligible absorptivity.

λ (nm)    B         T         Q
240       13,900    5,350     2,600
250       15,500    13,100    4,960
260       10,400    24,000    10,900
270       4,440     32,900    21,300
280       2,590     33,300    32,600
290                 24,800    40,200
300                 12,100    38,500
310                 3,760     28,300
320                 400       14,400
330                           5,200
5.4 : 14/17

Data, Vectors and Matrices
Absorbance for the mixture was measured at 10 wavelengths:
(240, 0.610) (250, 0.886) (260, 1.095) (270, 1.248) (280, 1.310)
(290, 1.188) (300, 0.822) (310, 0.501) (320, 0.310) (330, 0.109)
ε is a 10×3 matrix formed by the right three columns on the last slide.
5.4 : 15/17

Solving for Concentrations
Compute the inverse alpha matrix, (εᵀε)⁻¹. Compute the concentrations, C = (εᵀε)⁻¹εᵀA. Plot the regression points (the fitted absorbances computed from the solved concentrations) against the measured values. (The slide's numerical results are not reproduced in the transcript; a computational sketch follows.)
5.4 : 16/17
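A sketch of this computation using the absorptivity table and mixture absorbances from the two preceding slides. Blank table entries are treated as zero and a 1 cm path length is assumed, both of which are assumptions not stated on the slides.

```python
import numpy as np

# Molar absorptivities (L mol^-1 cm^-1) for B, T, Q at 240-330 nm
# (blank table entries treated as zero -- an assumption).
eps = np.array([
    [13900,  5350,  2600],   # 240 nm
    [15500, 13100,  4960],   # 250 nm
    [10400, 24000, 10900],   # 260 nm
    [ 4440, 32900, 21300],   # 270 nm
    [ 2590, 33300, 32600],   # 280 nm
    [    0, 24800, 40200],   # 290 nm
    [    0, 12100, 38500],   # 300 nm
    [    0,  3760, 28300],   # 310 nm
    [    0,   400, 14400],   # 320 nm
    [    0,     0,  5200],   # 330 nm
], dtype=float)

# Measured mixture absorbances at the same wavelengths (1 cm path assumed).
A = np.array([0.610, 0.886, 1.095, 1.248, 1.310,
              1.188, 0.822, 0.501, 0.310, 0.109])

# Normal equations: C = (eps^T eps)^-1 eps^T A
alpha_inv = np.linalg.inv(eps.T @ eps)    # inverse alpha matrix
C = alpha_inv @ eps.T @ A                 # concentrations of B, T, Q
A_fit = eps @ C                           # regression (fitted) absorbances

print("concentrations (mol/L):", C)
print("fitted absorbances:", A_fit)
```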

Concentration Errors
Compute the standard error of the fit. Compute the coefficient standard deviations using the diagonal elements of the inverse alpha matrix. Compute the 95% confidence limits with 10 − 3 = 7 degrees of freedom. (The standard formulas are sketched below.)
5.4 : 17/17
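These steps correspond to the standard unweighted least-squares error formulas (a reconstruction; the slide's own equations and numerical results are not reproduced in the transcript):

```latex
s^2 = \frac{\sum_{i=1}^{N} \bigl(A_i - \hat{A}_i\bigr)^2}{N - m},
\qquad
\sigma_{C_j} = s \sqrt{\Bigl[\bigl(\varepsilon^{\mathsf T}\varepsilon\bigr)^{-1}\Bigr]_{jj}},
\qquad
C_j \pm t_{0.05,\,7}\,\sigma_{C_j}.
```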