Linear and generalised linear models

1) Purpose of linear models
2) Least-squares solution for linear models
3) Analysis of diagnostics
4) Exponential family and the generalised linear model

Reason for linear models

The purpose of regression is to reveal statistical relations between input and output variables. Statistics cannot reveal a functional relationship; that is the purpose of other scientific studies. Statistics can, however, help to validate various functional relationships (models). Let us assume that we suspect the functional relationship

y = f(x, β) + ε

where β is a vector of unknown parameters, x = (x_1, x_2, ..., x_p) is a vector of controllable variables, y is the output and ε is an error associated with the experiment. Then we can set up experiments for various values of x and get the output (or response) for each of them. If the number of experiments is n then we will have n output values; denote them as a vector y = (y_1, y_2, ..., y_n). The purpose of statistics is to estimate the parameter vector using the input and output values. If the function f is linear in the parameters and the errors are additive then we are dealing with a linear model. For this model we can write

y = β_0 + β_1 x_1 + ... + β_p x_p + ε

A linear model is linear in the parameters, but not necessarily in the input variables. For example, y = β_0 + β_1 x + β_2 x² + ε is a linear model, but y = β_0 e^(β_1 x) + ε is not.
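As a quick illustration (not part of the original slides), a minimal R sketch using the built-in cars data; the quadratic term shows a model that is nonlinear in x but still linear in the parameters:

```r
# A model linear in the parameters need not be linear in x:
# dist = b0 + b1*speed + b2*speed^2 is a linear model, so lm() fits it.
fit_quad <- lm(dist ~ speed + I(speed^2), data = cars)
coef(fit_quad)
# By contrast, dist = b0 * exp(b1*speed) is nonlinear in b1 and would
# need a nonlinear fitter such as nls() instead of lm().
```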

Assumptions

The basic assumptions for the analysis of a linear model are:
1) the model is linear in the parameters
2) the error structure is additive
3) the random errors have zero mean and equal variances and are uncorrelated

These assumptions are sufficient to deal with linear models. Assumption 3 (uncorrelated errors with equal variances) can be relaxed; the treatment then becomes a little more complicated. Note that the general solution does not use a normality assumption; that assumption is needed only to design test statistics. If it does not hold, the bootstrap can be used to design test statistics instead. The assumptions can be written in vector form:

y = Xβ + ε,  E(ε) = 0,  V(ε) = σ²I

where y, 0 and ε are vectors and X is a matrix, called the design matrix (or input matrix); I is the n×n identity matrix.
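To make the vector form concrete, a minimal sketch of building the design matrix X in R (the built-in cars data is used purely as an example):

```r
# Design matrix for the model dist = b0 + b1*speed + error:
X <- model.matrix(dist ~ speed, data = cars)  # n x 2: intercept column + speed
head(X)
```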

Solution

The least-squares solution for the linear model under the given assumptions is:

β̂ = (XᵀX)⁻¹Xᵀy

Let us show this. Writing the least-squares criterion (since we want the solution with minimum least-squares error)

S(β) = (y - Xβ)ᵀ(y - Xβ)

taking the first derivative with respect to β and setting it equal to zero gives the normal equations XᵀXβ = Xᵀy, whose solution is the expression above. If we substitute y = Xβ + ε into the formula for the solution we can write:

E(β̂) = E((XᵀX)⁻¹Xᵀ(Xβ + ε)) = β + (XᵀX)⁻¹XᵀE(ε) = β

So the solution is unbiased. The variance of the estimator is:

V(β̂) = E((β̂ - β)(β̂ - β)ᵀ) = (XᵀX)⁻¹XᵀE(εεᵀ)X(XᵀX)⁻¹ = σ²(XᵀX)⁻¹

Here we used the form of the solution and assumption 3.
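A minimal sketch, assuming the same cars example, that computes the least-squares solution directly and checks it against lm():

```r
X <- model.matrix(dist ~ speed, data = cars)
y <- cars$dist
beta_hat <- solve(t(X) %*% X, t(X) %*% y)  # solves the normal equations X'X b = X'y
beta_hat
coef(lm(dist ~ speed, data = cars))        # lm() gives the same estimates
```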

Variance

To calculate the covariance matrix we need to be able to estimate σ². Since it is the variance of the error term we can find it using the form of the solution. For the estimated errors (residuals, denoted r) we can write:

r = y - Xβ̂

If we use β̂ = (XᵀX)⁻¹Xᵀy it gives

r = (I - X(XᵀX)⁻¹Xᵀ)y = My,  where M = I - X(XᵀX)⁻¹Xᵀ

Since the matrix M is idempotent and symmetric, i.e. M² = M = Mᵀ, we can write:

E(rᵀr) = σ² tr(M) = σ²(n - p)

where n is the number of observations and p is the number of fitted parameters. Then an unbiased estimator of the variance of the residuals is:

σ̂² = rᵀr / (n - p)
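A minimal sketch of this variance estimator, again on the cars example; it agrees with the value reported by lm():

```r
fit <- lm(dist ~ speed, data = cars)
r <- residuals(fit)                    # r = y - X beta_hat
n <- nrow(cars); p <- length(coef(fit))
s2 <- sum(r^2) / (n - p)               # sigma^2 estimate: r'r / (n - p)
c(s2 = s2, from_lm = summary(fit)$sigma^2)
```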

Singular case

The form of the solution given above is valid only if the matrices X and XᵀX are non-singular, i.e. the rank of X is equal to the number of parameters. If that is not true then either singular value decomposition or eigenvalue-filtering techniques are used. Fortunately most of the good properties of the linear model remain.

Singular value decomposition (SVD): any n×p matrix can be decomposed in the form

X = UDVᵀ

where U is an n×n and V a p×p orthogonal matrix (the inverse is equal to the transpose), and D is an n×p diagonal matrix of the singular values. If X is singular then the number of non-zero diagonal elements of D is less than p. Then for XᵀX we can write:

XᵀX = V(DᵀD)Vᵀ

DᵀD is a p×p diagonal matrix. If the matrix is non-singular then we can write:

(XᵀX)⁻¹ = V(DᵀD)⁻¹Vᵀ

Since DᵀD is a diagonal matrix, its inverse is also a diagonal matrix. The main trick used in the SVD technique for equation solution is that when a diagonal element is 0, or close to 0, zero is used instead of its inverse, i.e. a pseudo-inverse is calculated:

(XᵀX)⁺ = V(DᵀD)⁺Vᵀ, where (DᵀD)⁺ has 1/d_i² on the diagonal where d_i ≠ 0 and 0 elsewhere.
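A minimal R sketch of this pseudo-inverse idea (the tolerance and the rank-deficient example are illustrative assumptions):

```r
# Least-squares via SVD, replacing 1/d_i by 0 for near-zero singular values.
pseudo_solve <- function(X, y, tol = 1e-8) {
  s <- svd(X)                                        # X = U D V^T
  d_inv <- ifelse(s$d > tol * max(s$d), 1 / s$d, 0)  # pseudo-inverse of D
  s$v %*% (d_inv * (t(s$u) %*% y))                   # beta = V D^+ U^T y
}
X <- cbind(1, cars$speed, 2 * cars$speed)  # third column duplicates the second
pseudo_solve(X, cars$dist)                 # still returns a (minimum-norm) solution
```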

Analysis of diagnostics

Residuals and the hat matrix: residuals are the differences between the observations and the fitted values:

r = y - ŷ = y - X(XᵀX)⁻¹Xᵀy = (I - H)y,  where H = X(XᵀX)⁻¹Xᵀ

H is called the hat matrix. The diagonal terms h_i are the leverages of the observations: if such a value is close to one then the fitted value is determined almost entirely by that single observation. Sometimes h_i′ = h_i/(1 - h_i) is used to enhance high leverages.

A Q-Q plot can be used to check the normality assumption. A Q-Q plot is a plot of the quantiles of two distributions against each other; if the assumption about the distribution is correct then this plot should be nearly linear. If the distribution is normal then tests designed for normal distributions can be used; otherwise the bootstrap can be used to derive the desired distributions.
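A minimal sketch of these diagnostics in R, using base functions on the cars example:

```r
fit <- lm(dist ~ speed, data = cars)
h <- hatvalues(fit)          # diagonal elements h_i of the hat matrix
h_enh <- h / (1 - h)         # enhanced leverages h_i / (1 - h_i)
qqnorm(residuals(fit))       # Q-Q plot against normal quantiles
qqline(residuals(fit))       # should be nearly linear if errors are normal
```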

Analysis of diagnostics: cont.

Other analysis tools include the standardised and studentised (deletion) residuals:

r_i(std) = r_i / (s √(1 - h_i)),  r_i(stud) = r_i / (s_i √(1 - h_i))

where h_i is the leverage, h_i′ is the enhanced leverage, s² is the unbiased estimator of σ², and s_i² is the unbiased estimator of σ² after removal of the i-th observation.
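Base R provides both scalings directly; a minimal sketch on the same example:

```r
fit <- lm(dist ~ speed, data = cars)
rs <- rstandard(fit)  # r_i / (s * sqrt(1 - h_i)), using the overall s
rt <- rstudent(fit)   # r_i / (s_i * sqrt(1 - h_i)), s_i from leave-one-out fit
head(cbind(standardised = rs, studentised = rt))
```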

Bootstrap

The simplest application of the bootstrap to this problem is as follows (an R sketch is given after the list):
1) Calculate the residuals using r = y - Xβ̂
2) Sample with replacement from the residual vector; denote the sample r_random
3) Design new “observations” using y_new = Xβ̂ + r_random
4) Estimate the parameters
5) Repeat steps 2, 3 and 4
6) From the bootstrap replicates estimate variances, the covariance matrix or the distribution of the estimator

Another technique for bootstrapping is to resample the observations and the corresponding rows of the design matrix simultaneously: (y_i, x_1i, x_2i, ..., x_pi), i = 1,...,n. It is meant to be less sensitive to misspecified models. Note that for some samples the matrix may become singular and the problem ill-defined.
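A minimal sketch of the residual bootstrap (steps 1-6) in R; the replicate count B is an arbitrary choice:

```r
set.seed(1)
fit  <- lm(dist ~ speed, data = cars)
X    <- model.matrix(fit)
yhat <- fitted(fit); r <- residuals(fit)
B <- 1000
boot_beta <- replicate(B, {
  y_new <- yhat + sample(r, replace = TRUE)  # steps 2-3: resample residuals
  coef(lm.fit(X, y_new))                     # step 4: re-estimate parameters
})
apply(boot_beta, 1, sd)  # step 6: bootstrap standard errors of beta_hat
```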

Generalised linear models

One of the main assumptions of the linear model is that the errors are additive, i.e. the observations are equal to their expected value plus an error. What happens if this assumption breaks down, e.g. if the errors are additive for some function of the expected value? In general we can of course use maximum likelihood (or Bayesian estimation) in these cases. However, there is a class of such problems widely encountered in fields such as medicine and the biosciences; they are especially important when the observations are categorical, i.e. take discrete values. This class of problems is usually dealt with using generalised linear models. Let us consider these problems, starting with the generalised exponential family.

Generalised linear model: exponential family

The natural exponential family of distributions has the form:

f(y; θ, φ) = exp{ (A(θ)y - B(θ))/S(φ) + c(y, φ) }

S(φ) is a scale parameter. We can replace A(θ) with θ by a change of variables (the natural parametrisation):

f(y; θ, φ) = exp{ (θy - b(θ))/S(φ) + c(y, φ) }

Many distributions, including the normal, binomial, Poisson and exponential distributions, belong to this family. The moment generating function is:

M(t) = exp{ (b(θ + tS(φ)) - b(θ))/S(φ) }

Then the first moment (mean value) and the second central moment are:

E(y) = b′(θ),  V(y) = b″(θ)S(φ)
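As a concrete illustration (not in the original slides), the Poisson distribution with mean μ can be written in this form:

f(y; μ) = e^(-μ)μ^y / y! = exp{ y ln μ - μ - ln y! }

so θ = ln μ, b(θ) = e^θ, S(φ) = 1 and c(y, φ) = -ln y!. The moment formulas then give E(y) = b′(θ) = e^θ = μ and V(y) = b″(θ)S(φ) = μ, the familiar Poisson mean and variance.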

Generalised linear model

If the distribution of the observations is one of the distributions from the exponential family, and some function of the expected value of the observations is a linear function of the parameters, then a generalised linear model is used:

g(μ) = Xβ, where μ = E(y)

The function g is called the link function. Here is a list of popular distributions and their corresponding link functions:

binomial - logit: g(p) = ln(p/(1 - p))
normal - identity
Gamma - inverse
Poisson - log

All good statistical packages have implementations of several generalised linear models. To fit a generalised linear model the likelihood function is written down; the most natural way is to use θ = Xβ. The optimisation of this kind of function is done iteratively.
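A minimal sketch of a logistic (binomial, logit link) fit in R; the mtcars variables are used purely for illustration:

```r
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial(link = "logit"))
summary(fit)$coefficients  # estimates, standard errors and z-tests
```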

Bootstrap

Three techniques for bootstrapping generalised linear models can be used (a sketch of technique II follows the list).

I) Resampling differences between observations and expected values:
1) Calculate the differences between the observations and the expected (fitted) values
2) Sample from these differences
3) Add them to the fitted values, making sure the resulting "observations" have the properties they are meant to have
4) Estimate the parameters
5) Repeat steps 2-4

II) Parametric resampling using the form of the distribution and the estimated parameters:
1) Build the distribution using the estimated parameters
2) Resample using these distributions; note that each observation may have a different distribution
3) Estimate the parameters
4) Repeat steps 2 and 3 and build up bootstrap estimates and distributions

III) Resampling observations and the corresponding rows of the design matrix simultaneously:
1) Resample from the vectors (y_i, x_1i, x_2i, ..., x_pi), i = 1,...,n
2) Estimate the parameters
3) Repeat steps 1 and 2
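A minimal sketch of technique II (parametric resampling) for a Poisson log-linear model; the small count data set is the standard Dobson example from R's ?glm help page:

```r
set.seed(1)
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)
fit <- glm(counts ~ outcome + treatment, family = poisson())
mu  <- fitted(fit)                     # each observation has its own Poisson mean
B <- 1000
boot_beta <- replicate(B, {
  y_new <- rpois(length(mu), mu)       # resample from the fitted distributions
  coef(glm(y_new ~ outcome + treatment, family = poisson()))
})
apply(boot_beta, 1, sd)                # bootstrap standard errors
```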

R commands

The R command for fitting a linear model is lm. The R command for fitting a generalised linear model is glm.