BIOL 582 Lecture Set 19 Matrices, Matrix calculations, Linear models using linear algebra.

Presentation transcript:


Matrix operations
Matrix algebra is a compact method of expressing mathematical operations (including statistics), and it makes linear models easier to compute.
Scalar: a number
Vector: an ordered list (array) of scalars (n rows x 1 column)
Matrix: a rectangular array of scalars (n rows x p columns)
Nomenclature: elements (or variables) are italicized; matrices are bold. Lowercase = vector; capital = matrix. Many variants exist in the scientific literature.

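Matrix operations: transpose
Exchange rows and columns: element (i, j) of A becomes element (j, i) of its transpose, so an n x p matrix becomes p x n.
Represented by $\mathbf{A}^t$ or $\mathbf{A}'$.
Vector transpose works identically.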
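As a minimal sketch of the transpose in R, the snippet below uses a small made-up matrix `A` and vector `v` (both hypothetical, chosen only for illustration) together with base R's `t()`.

```r
# Hypothetical 2 x 3 matrix, filled column by column (illustration only)
A <- matrix(1:6, nrow = 2, ncol = 3)
At <- t(A)                       # transpose: rows and columns exchanged

dim(A)                           # 2 rows, 3 columns
dim(At)                          # 3 rows, 2 columns

# Element (i, j) of A is element (j, i) of the transpose
A[1, 3] == At[3, 1]

# A column vector (3 x 1) transposes to a row vector (1 x 3)
v <- matrix(c(2, 4, 6), ncol = 1)
dim(t(v))
```

Transposing twice returns the original matrix, which is a quick sanity check when dimensions get confusing.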

Matrix operations: addition and subtraction
Matrices must have the same dimensions.
Add or subtract element-wise: $(\mathbf{A} \pm \mathbf{B})_{ij} = a_{ij} \pm b_{ij}$.
Vector addition and subtraction work identically.
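A short sketch of element-wise addition and subtraction in R, using hypothetical matrices `A`, `B`, and `C` that are not part of the lecture data. The last step illustrates the dimension requirement: R refuses to add arrays of different shapes.

```r
# Two hypothetical 2 x 2 matrices (illustration only)
A <- matrix(c(1, 2, 3, 4), nrow = 2)
B <- matrix(c(10, 20, 30, 40), nrow = 2)

S <- A + B     # element-wise sum
D <- A - B     # element-wise difference

# A 2 x 3 matrix does not conform with a 2 x 2 matrix, so addition fails
C <- matrix(1:6, nrow = 2)
res <- try(A + C, silent = TRUE)
inherits(res, "try-error")   # TRUE when the dimensions do not match
```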

Matrix operations: multiplication
Scalar multiplication: multiply the scalar by each element in the matrix or vector.
Matrix/vector multiplication is a summed multiplication: element (i, j) of AB is the sum of the products of row i of A with column j of B.
For AB, with A of dimension n1 x p1 and B of dimension n2 x p2:
The inner dimensions (p1 and n2) must agree, or multiplication cannot take place.
The outer dimensions (n1 and p2) determine the size of the result.
Order of matrices makes a difference: in general AB ≠ BA.

Matrix operations
Scalar multiplication: $(c\mathbf{A})_{ij} = c\,a_{ij}$
Matrix multiplication: $(\mathbf{A}\mathbf{B})_{ij} = \sum_{k} a_{ik} b_{kj}$
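The multiplication rules can be sketched in R with the `%*%` operator. The matrices `A`, `B`, `P`, and `Q` below are hypothetical and exist only to show the dimension rules and the fact that order matters.

```r
# Hypothetical matrices (illustration only)
A <- matrix(1:6,  nrow = 2, ncol = 3)   # 2 x 3
B <- matrix(1:12, nrow = 3, ncol = 4)   # 3 x 4

# Scalar multiplication: every element is multiplied by the scalar
A3 <- 3 * A

# Matrix multiplication: inner dimensions (3 and 3) agree,
# outer dimensions (2 and 4) give the size of the result
AB <- A %*% B
dim(AB)

# Reversing the order fails: inner dimensions are now 4 and 2
ba <- try(B %*% A, silent = TRUE)
inherits(ba, "try-error")

# Even for square matrices, order generally matters
P <- matrix(c(1, 2, 3, 4), nrow = 2)
Q <- matrix(c(0, 1, 1, 0), nrow = 2)
PQ <- P %*% Q    # swaps the columns of P
QP <- Q %*% P    # swaps the rows of P
isTRUE(all.equal(PQ, QP))   # FALSE: PQ and QP differ
```

Note that `*` between two matrices in R is element-wise multiplication, not the summed multiplication described on the slide; `%*%` is the matrix product.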

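Matrix operations: inner and outer products
Inner (scalar) product: vector multiplication resulting in a scalar (a weighted linear combination), $\mathbf{a}^t\mathbf{b} = \sum_i a_i b_i$, with dimensions (1 x n)(n x 1) = (1 x 1).
Outer (matrix) product: vector multiplication resulting in a matrix, $\mathbf{a}\mathbf{b}^t$, with dimensions (n x 1)(1 x p) = (n x p).
In both cases the inner dimensions must agree.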
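A minimal R sketch of the two vector products, assuming two hypothetical column vectors `a` and `b` of length 3:

```r
# Hypothetical column vectors, each 3 x 1 (illustration only)
a <- matrix(c(1, 2, 3), ncol = 1)
b <- matrix(c(4, 5, 6), ncol = 1)

# Inner product: (1 x 3)(3 x 1) gives a 1 x 1 result
inner <- t(a) %*% b
dim(inner)

# Outer product: (3 x 1)(1 x 3) gives a 3 x 3 matrix
outer_ab <- a %*% t(b)
dim(outer_ab)

# Base R's outer() builds the same matrix from plain vectors
isTRUE(all.equal(outer_ab, outer(c(1, 2, 3), c(4, 5, 6))))
```

The inner product here is 1*4 + 2*5 + 3*6, and element (i, j) of the outer product is the i-th element of `a` times the j-th element of `b`.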

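Special matrices
I: identity matrix (equivalent to '1' for matrices): ones on the diagonal, zeros elsewhere
1: a matrix of all ones
0: a matrix of all zeros
Diagonal: non-zero elements occur only on the diagonal; all off-diagonal elements are zero
Square: n = p
Symmetric: off-diagonal elements mirror each other across the diagonal, $a_{ij} = a_{ji}$ (equivalently $\mathbf{A} = \mathbf{A}^t$)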
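These special matrices are easy to build in base R. The sketch below uses hypothetical 3 x 3 examples; the object names are illustrative only.

```r
I3 <- diag(3)                 # 3 x 3 identity matrix
J  <- matrix(1, 3, 3)         # matrix of all ones
Z  <- matrix(0, 3, 3)         # matrix of all zeros
D  <- diag(c(2, 5, 7))        # diagonal matrix with 2, 5, 7 on the diagonal

# A hypothetical symmetric matrix: element (i, j) equals element (j, i)
A <- matrix(c(4, 1, 2,
              1, 3, 0,
              2, 0, 5), nrow = 3)
isSymmetric(A)

# Multiplying by the identity leaves a matrix unchanged
isTRUE(all.equal(A %*% I3, A))
```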

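Special matrices
Orthogonal: square matrix with the property $\mathbf{A}\mathbf{A}^t = \mathbf{A}^t\mathbf{A} = \mathbf{I}$, so that its transpose is also its inverse. Its columns are orthonormal: mutually perpendicular and of unit length.
Orthogonal matrices are very useful for statistics and other fields (e.g., morphometrics).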
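One familiar orthogonal matrix is a 2 x 2 rotation matrix. The sketch below, which assumes an arbitrary rotation angle `theta` chosen only for illustration, checks the defining property numerically.

```r
# Hypothetical rotation by 30 degrees (illustration only)
theta <- pi / 6
R <- matrix(c(cos(theta), sin(theta),
              -sin(theta), cos(theta)), nrow = 2)

# Defining property: the transpose times the matrix gives the identity
isTRUE(all.equal(t(R) %*% R, diag(2)))
isTRUE(all.equal(R %*% t(R), diag(2)))

# Consequently the inverse is simply the transpose
isTRUE(all.equal(solve(R), t(R)))
```

`all.equal()` is used instead of `==` because floating-point arithmetic leaves tiny rounding differences.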

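Matrix operations: division
Matrices cannot be divided, so calculate the inverse (the matrix analogue of a reciprocal) of the "denominator" and multiply.
Inverses have the property that $\mathbf{A}\mathbf{A}^{-1} = \mathbf{A}^{-1}\mathbf{A} = \mathbf{I}$.
Inverses are tedious to calculate, so in practice we use a computer.
Inversion only works for square matrices whose determinant ≠ 0; a matrix whose determinant is 0 is called singular and has no inverse.
Determinant: a combination of diagonal and off-diagonal elements.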

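Matrix operations: inverse
For the 2 x 2 case:
$$\mathbf{A} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \qquad |\mathbf{A}| = ad - bc, \qquad \mathbf{A}^{-1} = \frac{1}{ad - bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$$
Confirm by checking that $\mathbf{A}\mathbf{A}^{-1} = \mathbf{I}$.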
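The 2 x 2 inverse can be checked by hand and against R's `solve()` and `det()`. The matrix `A` below is a hypothetical example, and `S` is a deliberately singular matrix added to show what happens when the determinant is zero.

```r
# Hypothetical 2 x 2 matrix: rows are (4, 7) and (2, 6)
A <- matrix(c(4, 2, 7, 6), nrow = 2)

# Determinant by hand: ad - bc
det_A <- A[1, 1] * A[2, 2] - A[1, 2] * A[2, 1]

# Inverse by hand: swap the diagonal, negate the off-diagonal, divide by det
A_inv_hand <- (1 / det_A) * matrix(c(A[2, 2], -A[2, 1],
                                     -A[1, 2], A[1, 1]), nrow = 2)

# Compare with the built-in functions
isTRUE(all.equal(det(A), det_A))
isTRUE(all.equal(solve(A), A_inv_hand))

# Confirm: A times its inverse is the identity
isTRUE(all.equal(A %*% A_inv_hand, diag(2)))

# A singular matrix (second column is twice the first) cannot be inverted
S <- matrix(c(1, 2, 2, 4), nrow = 2)
sing <- try(solve(S), silent = TRUE)
inherits(sing, "try-error")
```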

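Linear model using matrix operations
The linear equation
$$y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_{k-1} x_{i,k-1} + \varepsilon_i$$
can be written in matrix form as
$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$$
where $\mathbf{y}$ is the n x 1 vector of responses, $\mathbf{X}$ is the n x k design matrix (a column of ones for the intercept plus one column per independent variable), $\boldsymbol{\beta}$ is the k x 1 vector of coefficients, and $\boldsymbol{\varepsilon}$ is the n x 1 vector of residuals.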

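Linear model using matrix operations
Why is it so simple? Consider just the $\mathbf{X}\boldsymbol{\beta}$ part for a simple example of four subjects and two independent variables:
$$\begin{bmatrix} 1 & x_{11} & x_{12} \\ 1 & x_{21} & x_{22} \\ 1 & x_{31} & x_{32} \\ 1 & x_{41} & x_{42} \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{bmatrix} = \begin{bmatrix} \beta_0 + \beta_1 x_{11} + \beta_2 x_{12} \\ \beta_0 + \beta_1 x_{21} + \beta_2 x_{22} \\ \beta_0 + \beta_1 x_{31} + \beta_2 x_{32} \\ \beta_0 + \beta_1 x_{41} + \beta_2 x_{42} \end{bmatrix}$$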
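The four-subject, two-variable case can be sketched in R. All numbers below (`x1`, `x2`, and the coefficients in `B`) are hypothetical and chosen only to show that one matrix product reproduces every subject's linear equation at once.

```r
# Hypothetical values for four subjects and two independent variables
x1 <- c(1.0, 2.0, 3.0, 4.0)
x2 <- c(0.5, 1.5, 1.0, 2.5)

# Design matrix: a column of ones for the intercept, then x1 and x2 (4 x 3)
X <- cbind(Intercept = 1, x1, x2)

# Hypothetical coefficients: intercept 2, slopes 0.5 and -1 (3 x 1)
B <- matrix(c(2, 0.5, -1), ncol = 1)

# One matrix product evaluates all four linear equations
y_hat <- X %*% B

# The same values, written out one equation per subject
by_hand <- 2 + 0.5 * x1 - 1 * x2
isTRUE(all.equal(as.vector(y_hat), by_hand))
```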

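Linear model using matrix operations
The linear model is: $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$
The estimated coefficients (parameter estimates) are solved as: $\hat{\boldsymbol{\beta}} = (\mathbf{X}^t\mathbf{X})^{-1}\mathbf{X}^t\mathbf{y}$
How/why? Try to solve $\hat{\mathbf{y}} = \mathbf{X}\hat{\boldsymbol{\beta}}$ for $\hat{\boldsymbol{\beta}}$:
Cannot divide both sides by X.
Cannot multiply by the inverse of X, because only a square (and non-singular) matrix has an inverse, and an n x k design matrix is generally not square.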

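Linear model using matrix operations
Making a square, symmetric matrix from X: $\mathbf{X}^t\mathbf{X}$, which is k x k.
This matrix can be inverted (provided its determinant ≠ 0): $(\mathbf{X}^t\mathbf{X})^{-1}$
So, multiplying both sides of $\hat{\mathbf{y}} = \mathbf{X}\hat{\boldsymbol{\beta}}$ by $\mathbf{X}^t$ will assist inverting the necessary part: $\mathbf{X}^t\hat{\mathbf{y}} = \mathbf{X}^t\mathbf{X}\hat{\boldsymbol{\beta}}$
Note the dimensions so far: (k x n)(n x 1) = (k x n)(n x k)(k x 1) → (k x 1) = (k x 1)
Now multiply both sides by the inverse above: $(\mathbf{X}^t\mathbf{X})^{-1}\mathbf{X}^t\hat{\mathbf{y}} = (\mathbf{X}^t\mathbf{X})^{-1}\mathbf{X}^t\mathbf{X}\hat{\boldsymbol{\beta}}$
Which has dimensions: (k x k)(k x n)(n x 1) = (k x k)(k x n)(n x k)(k x 1) → (k x 1) = (k x 1)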

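Linear model using matrix operations
The equation $(\mathbf{X}^t\mathbf{X})^{-1}\mathbf{X}^t\hat{\mathbf{y}} = (\mathbf{X}^t\mathbf{X})^{-1}\mathbf{X}^t\mathbf{X}\hat{\boldsymbol{\beta}}$
simplifies to $(\mathbf{X}^t\mathbf{X})^{-1}\mathbf{X}^t\hat{\mathbf{y}} = \hat{\boldsymbol{\beta}}$, because a matrix multiplied by its inverse is the identity matrix.
The dimensions of each side remain (k x 1).
One problem is that the predicted values of the response are unknown without knowing the parameter estimates. However, the best estimates of the response values are the values themselves, so the equation is written as $\hat{\boldsymbol{\beta}} = (\mathbf{X}^t\mathbf{X})^{-1}\mathbf{X}^t\mathbf{y}$
What this means is that one does not have to calculate SS for x and y and solve each coefficient independently!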
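The step of swapping predicted for observed responses can also be reached another way. The sketch below is the standard least-squares argument, not necessarily the derivation intended on the slide: choose the coefficients that minimize the sum of squared residuals. The symbol $S(\boldsymbol{\beta})$ is introduced here only for illustration.

```latex
\begin{align*}
S(\boldsymbol{\beta})
  &= (\mathbf{y} - \mathbf{X}\boldsymbol{\beta})^t(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})
   = \mathbf{y}^t\mathbf{y}
     - 2\boldsymbol{\beta}^t\mathbf{X}^t\mathbf{y}
     + \boldsymbol{\beta}^t\mathbf{X}^t\mathbf{X}\boldsymbol{\beta} \\
\frac{\partial S}{\partial \boldsymbol{\beta}}
  &= -2\mathbf{X}^t\mathbf{y} + 2\mathbf{X}^t\mathbf{X}\boldsymbol{\beta} \\
\text{Setting the derivative to } \mathbf{0}: \quad
\mathbf{X}^t\mathbf{X}\hat{\boldsymbol{\beta}}
  &= \mathbf{X}^t\mathbf{y} \\
\hat{\boldsymbol{\beta}}
  &= (\mathbf{X}^t\mathbf{X})^{-1}\mathbf{X}^t\mathbf{y}
\end{align*}
```

Both routes arrive at the same k x 1 vector of estimates, and both require that $\mathbf{X}^t\mathbf{X}$ be invertible.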

Example in R using Snake data
Done for a simple linear model of head size as a function of log SVL

    > snake<-read.csv("snake.data.csv")
    > attach(snake)
    > # number of responses
    > n<-length(HS)
    > X<-matrix(c(rep(1,n),log(SVL)), nrow=n, ncol=2)
    > X[1:10,]
          [,1] [,2]
     [1,]
     [2,]
     [3,]
     [4,]
     [5,]
     [6,]
     [7,]
     [8,]
     [9,]
    [10,]
    > dim(X)
    [1] 40  2

Example in R using Snake data
Done for a simple linear model of head size as a function of log SVL

    > y<-matrix(HS, nrow=n, ncol=1)
    > y[1:10,];dim(y)
          [,1]
     [1,]
     [2,]
     [3,] 7.16
     [4,]
     [5,]
     [6,] 8.25
     [7,] 9.74
     [8,]
     [9,]
    [10,]
    [1] 40  1

Example in R using Snake data
Done for a simple linear model of head size as a function of log SVL

    > B<-solve(t(X)%*%X)%*%t(X)%*%y
    > B
         [,1]
    [1,]
    [2,]
    >
    > # compare to canned function
    > lm.snake<-lm(HS~log(SVL),x=T)
    > lm.snake

    Call:
    lm(formula = HS ~ log(SVL), x = T)

    Coefficients:
    (Intercept)     log(SVL)

Example in R using Snake data
Done for a simple linear model of head size as a function of log SVL

    > # Predictions (fitted values)
    > y.hat<-X%*%B
    > y.hat[1:7,]
    [1]
    >
    > # Residuals
    > e<-y-y.hat
    > e[1:7,]
    [1]
    >
    > # Compare to
    > predict(lm.snake)[1:7]
    > resid(lm.snake)[1:7]
    >
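Because `snake.data.csv` is not included with this transcript, the same steps can be tried on simulated data. The sketch below is self-contained; the variables `svl` and `hs`, the seed, and the true coefficients (1 and 2) are all hypothetical stand-ins for the snake measurements.

```r
set.seed(42)
n <- 40

# Simulated stand-ins for snout-vent length and head size (hypothetical)
svl <- runif(n, min = 30, max = 80)
hs  <- 1 + 2 * log(svl) + rnorm(n, sd = 0.5)

# Design matrix and response, as on the slides
X <- cbind(1, log(svl))
y <- matrix(hs, ncol = 1)

# Coefficients from the matrix formula
B <- solve(t(X) %*% X) %*% t(X) %*% y

# Fitted values and residuals from matrix operations
y.hat <- X %*% B
e <- y - y.hat

# Compare with lm()
fit <- lm(hs ~ log(svl))
isTRUE(all.equal(as.vector(B), unname(coef(fit))))
isTRUE(all.equal(as.vector(y.hat), unname(fitted(fit))))
isTRUE(all.equal(as.vector(e), unname(resid(fit))))
```

A numerically steadier way to get the same coefficients is `solve(crossprod(X), crossprod(X, y))`, which solves the system directly instead of forming the inverse first. `lm()` itself uses a QR decomposition rather than the explicit formula.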

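Analysis of variance using matrix operations
After solving $\hat{\boldsymbol{\beta}} = (\mathbf{X}^t\mathbf{X})^{-1}\mathbf{X}^t\mathbf{y}$, how does one determine if any or all coefficients are significant?
Do the same thing for a reduced model and compare SSE.
First, how does one find SSE?
First: $\hat{\mathbf{y}} = \mathbf{X}\hat{\boldsymbol{\beta}}$
Then: $\mathbf{e} = \mathbf{y} - \hat{\mathbf{y}}$
Thus: $SSE = \mathbf{e}^t\mathbf{e}$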

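Analysis of variance using matrix operations
How is $SSE = \mathbf{e}^t\mathbf{e}$? The inner product of the residual vector with itself is the sum of the squared residuals, $\sum_i e_i^2$. Using the snake example…

    > SSE.f<-t(e)%*%e
    > SSE.f
         [,1]
    [1,]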

Analysis of variance using matrix operations
ANOVA step by step for the snake data

    > # ANOVA by hand, with matrix operations
    >
    > X.f<-matrix(c(rep(1,n),log(SVL)),nrow=n,ncol=2)
    > X.r<-matrix(rep(1,n),nrow=n,ncol=1)
    > y<-matrix(HS, nrow=n, ncol=1)
    > B.f<-solve(t(X.f)%*%X.f)%*%t(X.f)%*%y
    > B.r<-solve(t(X.r)%*%X.r)%*%t(X.r)%*%y
    > e.f<-y-X.f%*%B.f
    > e.r<-y-X.r%*%B.r
    > SSE.f<-t(e.f)%*%e.f
    > SSE.r<-t(e.r)%*%e.r
    >
    > SSE.f
         [,1]
    [1,]
    > SSE.r
         [,1]
    [1,]
    > k.f<-ncol(X.f);k.r<-ncol(X.r)
    > F.snake<-((SSE.r-SSE.f)/(k.f-k.r))/(SSE.f/(n-k.f))
    > F.snake
         [,1]
    [1,]
    > P.value<-1-pf(F.snake,(k.f-k.r),(n-k.f))
    > P.value
         [,1]
    [1,] e-08
    > R2<-(SSE.r-SSE.f)/(SSE.r) # only because X.r includes only an intercept
    > R2
         [,1]
    [1,]
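The model-comparison F test above can be reproduced without the snake file. The sketch below uses simulated data (the variables `svl` and `hs`, the seed, and the helper function `sse()` are hypothetical, introduced only for this illustration) and checks the hand calculation against `anova()`.

```r
set.seed(1)
n <- 40

# Simulated stand-ins for the snake measurements (hypothetical)
svl <- runif(n, min = 30, max = 80)
hs  <- 1 + 2 * log(svl) + rnorm(n, sd = 0.5)
y   <- matrix(hs, ncol = 1)

# Full model (intercept + slope) and reduced model (intercept only)
X.f <- cbind(1, log(svl))
X.r <- matrix(1, nrow = n, ncol = 1)

# Hypothetical helper: error sum of squares for a given design matrix
sse <- function(X, y) {
  B <- solve(t(X) %*% X) %*% t(X) %*% y   # coefficient estimates
  e <- y - X %*% B                        # residuals
  drop(t(e) %*% e)                        # e'e as a plain number
}

SSE.f <- sse(X.f, y)
SSE.r <- sse(X.r, y)
k.f <- ncol(X.f)
k.r <- ncol(X.r)

# F statistic: drop in SSE per added parameter over the full-model error variance
F.stat  <- ((SSE.r - SSE.f) / (k.f - k.r)) / (SSE.f / (n - k.f))
P.value <- pf(F.stat, k.f - k.r, n - k.f, lower.tail = FALSE)

# Same comparison with lm() and anova()
tab <- anova(lm(hs ~ 1), lm(hs ~ log(svl)))
isTRUE(all.equal(F.stat, tab$F[2]))
isTRUE(all.equal(SSE.f, tab$RSS[2]))
```

Using `lower.tail = FALSE` in `pf()` gives the same quantity as `1 - pf(...)` but keeps more precision when the P-value is very small.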

Analysis of variance using matrix operations
ANOVA for the snake data, this time relying on lm functions

    > # ANOVA first using lm, then matrix operations
    >
    > lm.f<-lm(HS~log(SVL),x=T)
    > lm.r<-lm(HS~1,x=T)
    > e.f<-resid(lm.f)
    > e.r<-resid(lm.r)
    > SSE.f<-t(e.f)%*%e.f
    > SSE.r<-t(e.r)%*%e.r
    >
    > SSE.f
         [,1]
    [1,]
    > SSE.r
         [,1]
    [1,]
    >
    > k.f<-ncol(X.f);k.r<-ncol(X.r)
    > F.snake<-((SSE.r-SSE.f)/(k.f-k.r))/(SSE.f/(n-k.f))
    > F.snake
         [,1]
    [1,]
    > P.value<-1-pf(F.snake,(k.f-k.r),(n-k.f))
    > P.value
         [,1]
    [1,] e-08
    >
    > R2<-(SSE.r-SSE.f)/(SSE.r)
    > R2
         [,1]
    [1,]

Analysis of variance using matrix operations
ANOVA for the snake data, what R does should be clear now

    > # ANOVA via model comparison method
    >
    > lm.f<-lm(HS~log(SVL),x=T)
    > lm.r<-lm(HS~1,x=T)
    >
    > anova(lm.r,lm.f)
    Analysis of Variance Table

    Model 1: HS ~ 1
    Model 2: HS ~ log(SVL)
      Res.Df RSS Df Sum of Sq F Pr(>F)
                                  e-08 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    >

Analysis of variance using matrix operations
ANOVA for the snake data, what R does should be clear now

    > # or just a model summary
    > summary(lm.f)

    Call:
    lm(formula = HS ~ log(SVL), x = T)

    Residuals:
    Min 1Q Median 3Q Max

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)                                  ***
    log(SVL)                                e-08 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: on 38 degrees of freedom
    Multiple R-squared: , Adjusted R-squared:
    F-statistic: on 1 and 38 DF, p-value: 4.488e-08

Analysis of variance using matrix operations
It is worth looking at design matrices…

    > X.f
          [,1] [,2]
    (rows [1,] to [30,] printed; values not preserved)
    > lm.f$x
       (Intercept) log(SVL)
    > X.r
          [,1]
    (rows [1,] to [30,] printed; every value is 1)
    > lm.r$x
       (Intercept)

Analysis of variance using matrix operations
Now for an example for a single factor ANOVA

    > # Single factor Anova example, relying more so on lm commands
    > lm.f<-lm(HS~Sex,x=T)
    > lm.f$x
       (Intercept) SexM
    > lm.f<-lm(HS~Sex,x=T)
    > X.f<-lm.f$x
    > lm.r<-lm(HS~1,x=T)
    > X.r<-lm.r$x
    > y<-HS
    > B.f<-solve(t(X.f)%*%X.f)%*%t(X.f)%*%y
    > B.r<-solve(t(X.r)%*%X.r)%*%t(X.r)%*%y
    > e.f<-y-X.f%*%B.f
    > e.r<-y-X.r%*%B.r
    > SSE.f<-t(e.f)%*%e.f
    > SSE.r<-t(e.r)%*%e.r
    >
    > SSE.f
         [,1]
    [1,]
    > SSE.r
         [,1]
    [1,]
    > k.f<-ncol(X.f);k.r<-ncol(X.r)
    > F.snake<-((SSE.r-SSE.f)/(k.f-k.r))/(SSE.f/(n-k.f))
    > F.snake
         [,1]
    [1,]
    > P.value<-1-pf(F.snake,(k.f-k.r),(n-k.f))
    > P.value
         [,1]
    [1,]
    >
    > R2<-(SSE.r-SSE.f)/(SSE.r)
    > R2
         [,1]
    [1,]
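For a single factor, the design matrix holds a column of ones and a 0/1 indicator column. The sketch below shows this with simulated data; the factor `sex`, the response `hs`, the seed, and the group means are hypothetical, and `model.matrix()` is used here in place of `lm(..., x=T)$x` to build the design matrix directly.

```r
set.seed(7)

# Simulated two-group data: 20 females, 20 males (hypothetical values)
sex <- factor(rep(c("F", "M"), each = 20))
hs  <- c(rnorm(20, mean = 8, sd = 1), rnorm(20, mean = 9, sd = 1))

# Design matrix: intercept column plus an indicator that is 1 for males
X.f <- model.matrix(~ sex)
colnames(X.f)

y   <- matrix(hs, ncol = 1)
B.f <- solve(t(X.f) %*% X.f) %*% t(X.f) %*% y

# With this coding the intercept is the mean of the first level (F),
# and the second coefficient is the difference between group means (M - F)
mean.F <- mean(hs[sex == "F"])
mean.M <- mean(hs[sex == "M"])
isTRUE(all.equal(unname(B.f[1, 1]), mean.F))
isTRUE(all.equal(unname(B.f[2, 1]), mean.M - mean.F))
```

This is why the coefficient labelled `SexM` in a model summary is read as a difference from the reference group rather than as a group mean.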

Analysis of variance using matrix operations
Now for an example for a single factor ANOVA

    > summary(lm.f)

    Call:
    lm(formula = HS ~ Sex, x = T)

    Residuals:
    Min 1Q Median 3Q Max

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)                             e-13 ***
    SexM
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: on 38 degrees of freedom
    Multiple R-squared: , Adjusted R-squared:
    F-statistic: on 1 and 38 DF, p-value:
    > B.f
                [,1]
    (Intercept)
    SexM
    >