GENERAL LINEAR MODELS: Estimation algorithms


GENERAL LINEAR MODELS: Estimation algorithms KIM MINKALIS

GOAL OF THE THESIS Explain, in words, the estimation algorithms behind PROC MIXED and PROC LIFEREG.

THE GENERAL LINEAR MODEL The general linear model is a statistical linear model that can be written as Y = XB + U, where: Y is a matrix containing a series of multivariate measurements; X is a matrix that might be a design matrix, and its columns can be continuous or categorical; B is a matrix containing parameters that are usually to be estimated; and U is a matrix containing errors or noise. The errors are usually assumed to follow a multivariate normal distribution. The general linear model incorporates a number of different statistical models: ANOVA, ANCOVA, MANOVA, MANCOVA, ordinary linear regression, and the t-test and F-test. If there is only one column in Y (i.e., one dependent variable), the model can also be referred to as the multiple regression model (multiple linear regression).

SIMPLE LINEAR REGRESSION Simple linear model in scalar form, writing one equation for each observation: Y_i = β_0 + β_1 X_i + ε_i, i = 1, ..., n. Simple linear model in matrix form, when X is one single continuous predictor: Y = Xβ + ε, where X is called the design matrix, β is the vector of parameters, ε is the error vector, and Y is the response vector.

SIMPLE LINEAR REGRESSION Distributional assumptions in matrix form: E(ε) = 0 and Var(ε) = σ²I, i.e., ε ~ N(0, σ²I), so that Y ~ N(Xβ, σ²I).

SIMPLE LINEAR REGRESSION

SIMPLE LINEAR REGRESSION Least squares: minimizing the sum of squared residuals gives the estimator b = (X'X)⁻¹X'y. ALTERNATE METHODS: MLE, REML, GEE.
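As a concrete illustration (not the thesis's own code), here is a minimal PROC IML sketch of the least squares solution with made-up data; the names pred, xmat, and beta are ours:

   proc iml;
      /* illustrative data: a response and one continuous predictor */
      pred = {1, 2, 3, 4, 5};
      y    = {2.1, 3.9, 6.2, 7.8, 10.1};
      n    = nrow(y);
      xmat = j(n, 1, 1) || pred;            /* design matrix: intercept column plus predictor */
      beta = inv(xmat`*xmat) * xmat` * y;   /* least squares solution (X'X)^-1 X'y            */
      print beta;
   quit;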

SUMS OF SQUARES TOTAL SUM OF SQUARES = EXPLAINED (MODEL) SUM OF SQUARES + RESIDUAL (ERROR) SUM OF SQUARES, i.e., SST = SSR + SSE. SST is the sum of the squares of the differences between the dependent variable and its grand mean (the total variation in the outcome Y). SSR is the sum of the squares of the differences between the predicted values and the grand mean (the variation explained by the fitted model). SSE is a measure of the discrepancy between the data and the estimated model (the unexplained residual variation).

SUMS OF SQUARES AND MEAN SQUARES In matrix notation the sums of squares for the analysis of variance are SST = y'y − nȳ² (degrees of freedom n − 1), SSE = y'y − b'X'y (degrees of freedom n − p), and SSR = b'X'y − nȳ² (degrees of freedom p − 1). The mean squares divide each sum of squares by its degrees of freedom: MSR = SSR/(p − 1) and MSE = SSE/(n − p). A sketch in IML follows.
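Continuing the sketch above (same assumed names), the sums of squares and mean squares can be computed directly:

      ybar = sum(y) / n;
      sst  = y`*y - n # ybar##2;        /* total sum of squares              */
      sse  = y`*y - beta`*xmat`*y;      /* residual (error) sum of squares   */
      ssr  = sst - sse;                 /* model (regression) sum of squares */
      p    = ncol(xmat);
      msr  = ssr / (p - 1);             /* mean square for regression        */
      mse  = sse / (n - p);             /* mean square error                 */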

Example 1 simple linear regression READING DATA INTO IML To read from an existing SAS data set, submit a USE statement to open it. The general form of the USE statement is: USE SAS-data-set <VAR operand> <WHERE(expression)>; Transferring data from a SAS data set to a matrix is done with the READ statement: READ <range> <VAR operand> <WHERE(expression)> <INTO name>;
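For instance, a hedged sketch using the built-in SASHELP.CLASS data (the thesis reads its own dataset; the variable choices here are only for illustration):

   proc iml;
      use sashelp.class;                  /* open an existing SAS data set */
      read all var {weight} into y;       /* response vector               */
      read all var {height} into pred;    /* one continuous predictor      */
      close sashelp.class;
      xmat = j(nrow(y), 1, 1) || pred;    /* append an intercept column    */
   quit;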

Example 1 simple linear regression Number of observations: 15. Number of parameters for fixed effects: 2. Degrees of freedom: 15 − 2 = 13. The slide shows the vector of estimated regression coefficients, the variance-covariance matrix for beta, and the standard error of beta (a 2x1 vector).
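A sketch of how those quantities can be computed in IML, assuming the design matrix xmat and response y from the earlier sketches:

      n    = nrow(xmat);                  /* number of observations              */
      p    = ncol(xmat);                  /* number of fixed-effect parameters   */
      df   = n - p;                       /* error degrees of freedom            */
      beta = inv(xmat`*xmat) * xmat` * y; /* estimated regression coefficients   */
      sse  = y`*y - beta`*xmat`*y;
      mse  = sse / df;
      covb = mse # inv(xmat`*xmat);       /* variance-covariance matrix for beta */
      seb  = sqrt(vecdiag(covb));         /* standard errors of beta             */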

Example 1 simple linear regression A/B means DIVIDE COMPONENTWISE (A and B must be the same size); dividing the coefficients by their standard errors gives the t-statistics for tests of significant regression coefficients. PROBF(A, d1, d2) is Prob[F(d1, d2) ≤ A] for an F distribution. Recall that T(d)² = F(1, d), so that 1 − PROBF(T#T, 1, d) returns two-sided Student-t p-values. The slide also reports SST, SSR, and MSR.
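Putting the slide's operators together (same assumed names as above):

      tstat = beta / seb;                     /* A/B divides componentwise                */
      pval  = 1 - probf(tstat#tstat, 1, df);  /* two-sided p-values via t(df)^2 = F(1,df) */
      print beta seb tstat pval;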

Example 1 simple linear regression EXAMPLE: run the model in PROC GLM, then reproduce the results in PROC IML and compare.

MULTIPLE LINEAR REGRESSION MODEL Now, if we consider multiple continuous predictors, THE MATRIX ALGEBRA IS EXACTLY THE SAME!

Example 2 SINGLE FACTOR ANALYSIS OF VARIANCE DATA The dataset has a total of 19 observations (store-design combinations). Cases (the outcome variable) = number of cases sold. Design = 1 of 4 different package designs for a new breakfast cereal. Store = 5 stores with approximately the same sales volume. But what if X is a single categorical predictor? We need to create multiple columns to represent the levels within the categorical factor, using either a DATA step or PROC IML.

Example 2 SINGLE FACTOR ANALYSIS OF VARIANCE DESIGN FUNCTION, DESIGNF FUNCTION, DATA STEP The DESIGN function creates a design matrix of 0s and 1s from a column vector. Each unique value of the vector generates a column of the design matrix; that column contains 1s in the rows where the vector equals that value and 0s elsewhere. The DESIGNF function works similarly to the DESIGN function, but the result matrix has one fewer column and can be used to produce full-rank design matrices: the result is the same as taking the last column off the DESIGN result and subtracting it from the other columns.
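A tiny runnable illustration of the two functions:

   proc iml;
      grp  = {1, 1, 2, 2, 3};      /* a factor with three levels          */
      full = design(grp);          /* 5x3 matrix of 0/1 indicator columns */
      fr   = designf(grp);         /* 5x2 full-rank version               */
      print full, fr;
   quit;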

Example 2 SINGLE FACTOR ANALYSIS OF VARIANCE Generalized inverse. Note that column 5 of the design matrix can be written as a linear combination of the others: column 1 − column 2 − column 3 − column 4. The matrix A = X'X is therefore singular and does not have a unique inverse, so we use a generalized inverse G (a conditional inverse or pseudoinverse) satisfying AGA = A. Otherwise the mathematics is exactly the same as for the multiple linear regression model; constructing the design matrix is the only trick.
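A sketch of the generalized-inverse solution, assuming xmat is the overparameterized ANOVA design matrix (intercept plus all level indicators) and y the response:

      a     = xmat`*xmat;          /* singular: no unique inverse          */
      g     = ginv(a);             /* Moore-Penrose pseudoinverse          */
      check = a*g*a;               /* reproduces a, verifying AGA = A      */
      bhat  = g * xmat` * y;       /* one solution to the normal equations */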

Example 2 SINGLE FACTOR ANALYSIS OF VARIANCE EXAMPLE: run the model in PROC GLM, then reproduce it in PROC IML (the code is shown in SAS syntax).

Analysis OF COVARIANCE (ANCOVA) EXAMPLE ANCOVA = ANOVA + regression: categorical plus continuous predictors. ANCOVA is used to account/adjust for pre-existing conditions. In our example, the area under the curve per week is adjusted for the baseline Beck Depression Inventory score (continuous), the gender of the subject (categorical), and the type of treatment (categorical). Some models may have only one covariate, representing the baseline score, with the outcome variable representing the final score; it may be tempting to get rid of the covariate by modeling the difference, but this is problematic because it forces the covariate's slope to be 1. We also make use of partial F tests to compare two models.

DESIGN MATRIX: Building interaction terms With regression models we can make the right-hand side of the equation as complicated as we like, as long as the model stays linear in its parameters. Complexity may include interaction terms: adding a predictor X1X2, the product of continuous predictors X1 and X2, to the model is referred to as including the interaction of X1 and X2. For the ANOVA model we can do the same thing, but the formation of the interaction affects the design matrix when both terms are factors, or when one is a factor and the other is a continuous measure.
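One way to build interaction columns in IML is the HDIR (horizontal direct product) function, which multiplies the indicator blocks row by row; a small sketch with hypothetical factors:

   proc iml;
      fa  = {1, 1, 2, 2};          /* factor A                    */
      fb  = {1, 2, 1, 2};          /* factor B                    */
      xa  = design(fa);            /* indicator columns for A     */
      xb  = design(fb);            /* indicator columns for B     */
      xab = hdir(xa, xb);          /* interaction columns for A*B */
      print xab;
   quit;

For a factor-by-continuous interaction, the same call with the covariate in place of one indicator block scales each indicator column by the covariate.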

CONSTRUCTION OF LEAST SQUARES MEANS In PROC GLM, what is the difference between the MEANS and the LSMEANS statements? When the MEANS statement is used, PROC GLM computes the arithmetic means (averages) of all continuous variables in the model (both dependent and independent) for each level of the categorical variable specified in the MEANS statement. When the LSMEANS statement is used, PROC GLM computes predicted population margins, that is, estimates of marginal means over a balanced population: means corrected for imbalances in the other variables. When an experiment is balanced, MEANS and LSMEANS agree; when the data are unbalanced, however, there can be a large difference between a MEAN and an LSMEAN.

CONSTRUCTION OF LEAST SQUARES MEANS Assume A has 3 levels, B has 2 levels, and C has 2 levels, and assume that every combination of levels of A and B exists in the data. Assume also that Z is a continuous variable with an average of 12.5. Then the least squares means are computed by linear combinations of the parameter estimates, as sketched below.
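A hedged sketch of the idea: an LS-mean is a linear combination lvec*beta for a suitable coefficient row lvec. The row below is hypothetical, written for the LS-mean of the first level of A in a model whose solution vector beta is ordered intercept, A1-A3, B1-B2, C1-C2, Z; other parameterizations need different coefficients:

      /* weight 1/2 on each level of B and of C; covariate Z at its mean 12.5 */
      lvec   = {1  1 0 0  0.5 0.5  0.5 0.5  12.5};
      lsm_a1 = lvec * beta;        /* beta assumed to be the 9x1 solution vector */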

CONSTRUCTION OF LEAST SQUARE MEANS

Example LSMEANS EXAMPLE: PROC GLM versus PROC IML, highlighting the IML syntax for the LSMEANS estimate and contrast.

MAXIMUM LIKELIHOOD ESTIMATION With linear models it is possible to derive estimators that are optimal in some sense. As models become more general, optimal estimators become more difficult to obtain, and estimators that are asymptotically optimal are obtained instead. Maximum likelihood estimators (MLEs) have a number of nice asymptotic properties and are relatively easy to obtain. While the least squares approach works well within the regression modeling framework, extending to more complex GLM structures requires alternate methods of estimation. We start with the distribution of our data.

MAXIMUM LIKELIHOOD ESTIMATION

MAXIMUM LIKELIHOOD ESTIMATION The slide shows the GRADIENT (score vector), the HESSIAN (matrix of second partial derivatives), and the INFORMATION MATRIX (the negative of the expected Hessian). Regardless of the algorithm used, the MLEs of the model parameters remain the same.

MAXIMUM LIKELIHOOD ESTIMATION VERSUS ORDINARY LEAST SQUARES, FIXED EFFECTS ESTIMATION For normal data the sample mean and variance are independent, which gives independence of the estimators. For variance estimation, OLS yields an unbiased estimator while ML yields a biased one: the ML formula differs from the OLS formula by dividing by N rather than N − p.
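In IML terms the contrast is just the divisor (names as in the earlier sketches):

      sig2_ols = sse / (n - p);    /* unbiased                            */
      sig2_ml  = sse / n;          /* maximum likelihood: biased downward */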

ITERATIVE METHODS NEWTON-RAPHSON updates the current estimate by θ(t+1) = θ(t) − H⁻¹g, where g is the gradient and H the Hessian of the log likelihood. The METHOD OF SCORING is the same update with the Hessian replaced by its expected value (the negative of the information matrix).
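A self-contained toy sketch of the Newton-Raphson loop; the quadratic objective and the module names gradfun and hessfun are ours, standing in for the problem-specific score and Hessian (for the method of scoring, hessfun would return the negative expected information instead):

   proc iml;
      start gradfun(t);                   /* score of -(t1-1)^2 - (t2-2)^2 */
         return( -2 # (t - {1, 2}) );
      finish;
      start hessfun(t);                   /* constant Hessian              */
         return( -2 # I(2) );
      finish;
      theta = {0, 0};                     /* starting values               */
      crit  = 1;
      do iter = 1 to 50 while(crit > 1e-8);
         step  = solve(hessfun(theta), gradfun(theta));  /* H*step = g     */
         theta = theta - step;            /* Newton-Raphson update         */
         crit  = max(abs(step));
      end;
      print theta;                        /* converges to {1, 2}           */
   quit;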

EXTENSION OF THE GENERAL LINEAR MODEL In a linear model it is assumed that there is a single source of variability. One can extend linear models by allowing for multiple sources of variability. In the simplest case, the combined covariance matrix is a linear function of the variance components; in other cases it is a non-linear function of them. The linear form is typical of the structure encountered in various split-plot designs, and the non-linear form is typical of repeated measures designs. MIXED LINEAR MODEL EQUATION: y = Xβ + Zu + ε, with u ~ N(0, G) and ε ~ N(0, R), so that Var(y) = V = ZGZ' + R.

MIXED LINEAR MODELS Set the derivative with respect to β equal to zero and solve for β, which gives the generalized least squares form β̂ = (X'V⁻¹X)⁻¹X'V⁻¹y; then plug β̂ into the derivatives with respect to the variance components σᵢ².

MIXED LINEAR MODELS The maximum likelihood solutions come from equating the derivatives to zero: one set of equations for the fixed effects and one for the variance components. We can write the second set more simply by defining P = V⁻¹ − V⁻¹X(X'V⁻¹X)⁻¹X'V⁻¹. Note that sesq(M) represents the sum of squares of the elements of M (the quantity IML's SSQ function returns).
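A sketch of V, P, and the fixed-effect solution in IML, assuming xmat, zmat, gmat, and rmat hold X, Z, G, and R:

      vmat = zmat*gmat*zmat` + rmat;                    /* V = ZGZ' + R                 */
      vi   = inv(vmat);
      pmat = vi - vi*xmat*inv(xmat`*vi*xmat)*xmat`*vi;  /* P as defined above           */
      beta = inv(xmat`*vi*xmat) * xmat` * vi * y;       /* generalized LS fixed effects */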

MIXED LINEAR MODELS Second Partials

MIXED LINEAR MODELS FISHER SCORING – EXPECTED VALUES

RESTRICTED (RESIDUAL) MAXIMUM LIKELIHOOD (REML) Maximum likelihood does not yield the usual estimators even when the data are balanced: in estimating variance components, ML does not take into account the degrees of freedom used up in estimating the fixed effects. REML instead estimates the variance components from residuals calculated after fitting, by ordinary least squares, just the fixed-effects part of the model.

MIXED EFFECTS Example A company's actual levels of milk fat in its yogurt exceeded the labeled amount (3.0). Outcome variable = fat content of each yogurt sample. Random effect = 4 randomly chosen laboratories. Fixed effect = Government's versus Sheffield's method. Six samples were sent to each laboratory, but the Government's labs had technical difficulties and were not able to determine fat content for all 6 samples.

MIXED EFFECTS Example PROC GLMMOD The GLMMOD procedure constructs the design matrix for a general linear model; it essentially constitutes the model-building front end for the GLM procedure.

MIXED EFFECTS Example PARTIAL DATA

MIXED EFFECTS Example The slide shows the Z matrix, the G matrix, the R matrix, the ZGZ' matrix, and the combined ZGZ' + R matrix.

MIXED EFFECTS Example READ DATA INTO PROC IML Random effects: recall that columns 2-5 of the GLMMOD design matrix represent the 4 different labs and columns 6-13 represent the interaction between labs and methods; we need to get rid of column 1, which represents the intercept. Fixed effects: recall that column 1 represents the intercept and columns 2 and 3 represent the two different methods. The outcome variable fat is read into the vector y.

MIXED EFFECTS Example Get initial estimates for the variance components: use the MSE from the model containing only the fixed effects as the initial estimate. Note: we used the biased ML estimate 0.1113189 instead of the OLS estimate 0.1173361 for the initial values. G is a q x q matrix, where q is the number of random-effect parameters; G is always diagonal in a random effects model if the random effects are assumed uncorrelated. In our example, the starting value for G is the 12x12 diagonal matrix G0 = 0.0556594*I(12) (half of 0.1113189 per component), and R0 = 0.0556594*I(39).
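The starting values translate directly into IML (zmat assumed to be the 39x12 random-effects design matrix from GLMMOD):

      s0   = 0.1113189 / 2;             /* = 0.0556594, half the ML residual MSE */
      gmat = s0 # I(12);                /* 4 lab effects + 8 lab*method effects  */
      rmat = s0 # I(39);                /* one residual term per observation     */
      vmat = zmat*gmat*zmat` + rmat;    /* starting marginal covariance of y     */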

MIXED EFFECTS Example The slide shows the LOG LIKELIHOOD, GRADIENT, and RESIDUAL. NOTE: W represents a 39x4 design matrix for the levels of the factor LAB, and S represents a 39x8 design matrix for the levels of the LAB*METHOD interaction.

MIXED EFFECTS Example The HESSIAN is NOT POSITIVE DEFINITE, so ADD 2^15 TO THE MAIN DIAGONAL (ridging).
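A sketch of the ridging step, with hes standing for the current Hessian:

      if min(eigval(hes)) <= 0 then
         hes = hes + 2##15 # I(nrow(hes));   /* add 2^15 = 32768 to the main diagonal */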

MIXED EFFECTS EXAMPLE CALL NLPNRR: user-defined function modules and call routines.
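The calling convention of NLPNRR, demonstrated on a toy objective (the thesis maximizes the mixed-model log likelihood instead; f_obj and the option settings here are only illustrative):

   proc iml;
      start f_obj(parm);                       /* toy concave objective, maximum at (1, 2) */
         return( -((parm[1]-1)##2 + (parm[2]-2)##2) );
      finish;
      parm0 = {0 0};                           /* starting values                          */
      optn  = {1 0};                           /* optn[1]=1: maximize; optn[2]=0: quiet    */
      call nlpnrr(rc, parmres, "f_obj", parm0, optn);
      print rc parmres;                        /* rc > 0 signals successful termination    */
   quit;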

MIXED EFFECTS EXAMPLE HESSIAN Use the PROC IML nonlinear optimization and related subroutines.

MIXED EFFECTS Example The variance components must be positive and the Hessian must be positive definite, so use CALL NLPNRR (Newton-Raphson ridge optimization).

MIXED EFFECTS Example Powers of two used as ridge values: 2^15 = 32768, 2^14 = 16384, 2^12 = 4096, 2^8 = 256, 2^6 = 64, 2^4 = 16, 2^2 = 4.

MIXED EFFECTS Example Variance components: PROC IML versus PROC MIXED.

QUESTIONS