Factor Analysis

Outline: Purpose of factor analysis; Maximum likelihood factor analysis; Least-squares; Factor rotation techniques; R commands for factor analysis; References.

Purpose of Factor Analysis

Factor analysis is one of the techniques used to reduce the dimension of the observed variables. Suppose that we have a p-dimensional continuous variable vector $x = (x_1, x_2, \dots, x_p)$. These are what we observe. They may not be the real independent underlying variables. Factor analysis seeks to find real underlying variables that are not observable. That is, we want to find an m-dimensional vector $y = (y_1, y_2, \dots, y_m)$, with m < p, of independent variables satisfying the condition:

$$x = \mu + \Lambda y + e$$

where e is a normal random vector with zero mean and constant dispersion. It is assumed that the elements of e are independent of each other and of y. Moreover, it is assumed that the elements of y are independent of each other and are standard normal variables. We can write:

$$y \sim N(0, I), \qquad e \sim N(0, \Psi)$$

where $\Psi$ is a diagonal p×p matrix. Its elements are called specific or unique variances. The weights $\Lambda$ (a p×m matrix) are the factor loadings. Elements of y are called common variables and elements of e are called unique or specific variables. Without loss of generality we will assume that the mean of x is 0, i.e. $\mu = 0$.

Note that in PCA we wanted to find linear combinations of the observable variables. In factor analysis we want to find independent variables whose linear combinations are the observable variables. As in many situations, the assumption of a normal distribution makes the treatment easy, although the results are applicable to a wider range of problems.
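
As a rough illustration of the model on this slide, the sketch below simulates data from a two-factor model in R. The loadings `Lambda`, the specific variances `psi` and the sample size are made-up values chosen for illustration only; they are not taken from the slides. The last lines compare the sample covariance with the covariance implied by the model, $\Lambda\Lambda^T + \Psi$.

```r
set.seed(1)
p <- 6; m <- 2; n <- 5000

# Hypothetical loadings (p x m) and specific variances (length p)
Lambda <- matrix(c(0.9, 0.8, 0.7, 0.0, 0.0, 0.0,
                   0.0, 0.0, 0.0, 0.8, 0.7, 0.6), nrow = p, ncol = m)
psi <- c(0.19, 0.36, 0.51, 0.36, 0.51, 0.64)

y <- matrix(rnorm(n * m), n, m)                      # common variables, N(0, I)
e <- matrix(rnorm(n * p), n, p) %*% diag(sqrt(psi))  # specific variables, N(0, Psi)
x <- y %*% t(Lambda) + e                             # each row is one observation

Sigma <- Lambda %*% t(Lambda) + diag(psi)            # covariance implied by the model
max_abs_diff <- max(abs(cov(x) - Sigma))             # small for large n
```

With these particular values every variable has unit variance, so `Sigma` is also a correlation matrix.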

Factor analysis model

The model defined by the linear equation given above cannot be solved directly. However, we can use the relation between the covariance matrix, the factor loadings and the specific variances. It has the form:

$$\Sigma = \Lambda \Lambda^T + \Psi$$

The objective of factor analysis is to determine m (the length of the vector y), $\Lambda$ and $\Psi$ using S, the sample estimate of the covariance matrix. It should be noted that if M is an m×m orthogonal matrix ($M^T M = I$) then for z = My we can write:

$$x = (\Lambda M^T) z + e, \qquad \Sigma = (\Lambda M^T)(\Lambda M^T)^T + \Psi = \Lambda \Lambda^T + \Psi$$

i.e. the solution to the problem is not unique. Solutions are indeterminate up to an orthogonal transformation. The only thing we can do is estimate the factor space. To find a unique solution we need to add a new condition. This condition is:

$$\Lambda^T \Psi^{-1} \Lambda = D$$

where $\Psi$ and D are diagonal matrices. If we can identify the factor space using this constraint then we can use any rotation matrix and define factors convenient for interpretation. Moreover, we can even use any non-singular matrix to redefine new factors. Under an orthogonal transformation independent variables go to independent variables. Under a non-orthogonal transformation independent variables may transform to dependent variables.

Note that if some $\psi_{ii} = 0$ then $\Psi^{-1}$ does not exist and this condition cannot be used. This is called a Heywood case.
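
The non-uniqueness described on this slide can be checked numerically. The following R sketch uses made-up loadings and an arbitrary rotation angle (both are assumptions for illustration) and shows that rotated loadings give the same covariance matrix, while the matrix $\Lambda^T \Psi^{-1} \Lambda$ is diagonal only for one of the two solutions.

```r
# Hypothetical loadings with disjoint support, and matching specific variances
Lambda <- matrix(c(0.9, 0.8, 0.7, 0.0, 0.0, 0.0,
                   0.0, 0.0, 0.0, 0.8, 0.7, 0.6), nrow = 6)
psi <- 1 - rowSums(Lambda^2)

theta <- pi / 6                                   # arbitrary rotation angle
M <- matrix(c(cos(theta), sin(theta),
              -sin(theta), cos(theta)), 2, 2)     # orthogonal matrix

Lambda_rot <- Lambda %*% t(M)                     # loadings for z = M y

Sigma     <- Lambda %*% t(Lambda) + diag(psi)
Sigma_rot <- Lambda_rot %*% t(Lambda_rot) + diag(psi)
same_cov  <- isTRUE(all.equal(Sigma, Sigma_rot))  # both fit equally well

# The identification constraint singles out one member of the family
G     <- t(Lambda) %*% diag(1 / psi) %*% Lambda          # diagonal here
G_rot <- t(Lambda_rot) %*% diag(1 / psi) %*% Lambda_rot  # not diagonal
```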

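Variance of variables and communalities

We can write the relations between the covariances of the original variables, the loadings and the unique variances:

$$\sigma_{ii} = \sum_{k=1}^{m} \lambda_{ik}^2 + \psi_{ii}, \qquad \sigma_{ij} = \sum_{k=1}^{m} \lambda_{ik} \lambda_{jk} \quad (i \ne j)$$

The term

$$h_i^2 = \sum_{k=1}^{m} \lambda_{ik}^2$$

is called the communality. It is the part of the variance of the original variable that is shared with the others via the common variables. $\psi_{ii}$ is the unique variance, which is a property of the variable $x_i$ only.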
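
A small numerical example may help here. The R sketch below uses hypothetical loadings and unique variances (not taken from the slides) and computes the communalities as row sums of squared loadings.

```r
# Hypothetical 4 x 2 loadings and unique variances
Lambda <- matrix(c(0.8, 0.7, 0.1, 0.2,
                   0.1, 0.2, 0.7, 0.6), nrow = 4)
psi <- c(0.35, 0.47, 0.50, 0.60)

communality <- rowSums(Lambda^2)            # shared part of each variance
Sigma <- Lambda %*% t(Lambda) + diag(psi)   # implied covariance matrix

# Each variance splits into communality plus unique variance
total_var <- communality + psi
```

Here `total_var` reproduces `diag(Sigma)`, and an off-diagonal element such as `Sigma[1, 2]` depends on the loadings only.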

Maximum number of factors

The number of distinct elements in the covariance matrix of p variables (the elements of S) is ½p(p+1). The number of loadings is pm and the number of specific variances is p. Thus we want to identify p(m+1) elements. The number of constraints is ½m(m-1). Taking the constraints into account, we want to identify p(m+1) - ½m(m-1) elements using ½p(p+1) elements. Then we can write the relation for the maximum number of identifiable elements:

$$p(m+1) - \tfrac{1}{2}m(m-1) \le \tfrac{1}{2}p(p+1), \qquad \text{i.e.} \qquad (p-m)^2 \ge p+m$$

For example, if we have 6 original variables we cannot define more than 3 factors. If we have 15 original variables we cannot define more than 10 factors. In practice it is hoped that a much smaller number of factors describes the whole system.
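
The counting argument above is easy to turn into a small helper. The R function below is a sketch; the name `max_factors` is made up for this illustration. It returns the largest m for which the number of free parameters does not exceed the number of distinct covariance elements.

```r
max_factors <- function(p) {
  m <- 0:p
  free_params <- p * (m + 1) - m * (m - 1) / 2   # loadings + specific variances - constraints
  cov_elements <- p * (p + 1) / 2                # distinct elements of S
  max(m[free_params <= cov_elements])
}

m6  <- max_factors(6)
m15 <- max_factors(15)
```

For p = 6 and p = 15 this reproduces the limits of 3 and 10 factors quoted in the text.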

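Factor Analysis using Maximum likelihood

If we assume that the n observations $x_i = (x_{i1}, \dots, x_{ip})$ are normally distributed then we can write the likelihood function (assuming that the mean of x is 0) as:

$$L = \prod_{i=1}^{n} (2\pi)^{-p/2} |\Sigma|^{-1/2} \exp\left(-\tfrac{1}{2} x_i^T \Sigma^{-1} x_i\right), \qquad \Sigma = \Lambda \Lambda^T + \Psi$$

The log-likelihood function is:

$$l = -\frac{n}{2}\left(p \ln 2\pi + \ln|\Sigma| + \mathrm{tr}(\Sigma^{-1} S)\right), \qquad S = \frac{1}{n}\sum_{i=1}^{n} x_i x_i^T$$

The derivatives with respect to the factor loadings and the specific variances become:

$$\frac{\partial l}{\partial \Lambda} = n\, \Sigma^{-1}(S - \Sigma)\Sigma^{-1} \Lambda = 0, \qquad \frac{\partial l}{\partial \Psi} = \frac{n}{2}\, \mathrm{diag}\left(\Sigma^{-1}(S - \Sigma)\Sigma^{-1}\right) = 0$$

Here we used the matrix notation for derivatives, the fact that the covariance matrix is symmetric, and some facts from matrix algebra:

$$d \ln|\Sigma| = \mathrm{tr}(\Sigma^{-1} d\Sigma), \qquad d\,\Sigma^{-1} = -\Sigma^{-1} (d\Sigma) \Sigma^{-1}$$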
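
To make the likelihood equations concrete, here is a sketch in R of the log-likelihood and its derivatives for given loadings and specific variances. The function names `fa_loglik` and `fa_gradient` and the numerical values are assumptions made for this illustration; this is not the code used by any particular package.

```r
# Log-likelihood of the zero-mean factor model, given S = (1/n) sum x_i x_i^T
fa_loglik <- function(Lambda, psi, S, n) {
  p <- nrow(S)
  Sigma <- Lambda %*% t(Lambda) + diag(psi)
  logdet <- as.numeric(determinant(Sigma, logarithm = TRUE)$modulus)
  -n / 2 * (p * log(2 * pi) + logdet + sum(diag(solve(Sigma, S))))
}

# Derivatives with respect to the loadings and the specific variances
fa_gradient <- function(Lambda, psi, S, n) {
  Sigma <- Lambda %*% t(Lambda) + diag(psi)
  Sinv <- solve(Sigma)
  A <- Sinv %*% (S - Sigma) %*% Sinv
  list(Lambda = n * A %*% Lambda, psi = n / 2 * diag(A))
}

# Hypothetical parameters; when S equals Sigma the gradient vanishes
Lambda <- matrix(c(0.8, 0.7, 0.1, 0.2,
                   0.1, 0.2, 0.7, 0.6), nrow = 4)
psi <- c(0.4, 0.5, 0.5, 0.6)
S <- Lambda %*% t(Lambda) + diag(psi)
g <- fa_gradient(Lambda, psi, S, n = 100)
```

A finite-difference comparison of `fa_loglik` against `fa_gradient` is a simple way to check the derivative formulas.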

Factor analysis using ML

The maximum likelihood equations are usually solved iteratively. Care should be taken in the implementation, as convergence can be slow and some of the specific variances can become negative. The equations are usually solved using the Newton-Raphson (NR) second-order method or the scoring method. The scoring method uses the Fisher information matrix instead of the second derivative matrix. It can be slower than NR but has the attractive property that the initial values of the parameters can be far from optimal. Numerical optimisation should also ensure that $\psi_{ii} > 0$, so optimisation is usually done under these constraints. Maximum likelihood can be performed in the following way: find initial values for $\psi_{ii}$, then estimate $\Lambda$, and then find new values for $\psi_{ii}$.

One of the problems in factor analysis is a common problem in multivariate analysis: it is not guaranteed that all measurements are on the same scale. For that reason it is common to use correlation matrices instead of covariance matrices. If factor analysis is done using maximum likelihood then the loadings for the correlation matrix can easily be derived. In general, maximum likelihood estimation is invariant under transformations with non-zero Jacobians. Since the transformation from the covariance matrix to the correlation matrix (and the corresponding transformation of loadings and unique variances) has a non-zero Jacobian, having found the parameters using one of them we can derive them for the other.

Least-squares for Factor analysis

Another widely used technique for factor analysis is least squares. Its simplicity makes it attractive. It is done by minimisation of:

$$\mathrm{tr}\left((S - \Sigma)^2\right) = \mathrm{tr}\left((S - \Lambda \Lambda^T - \Psi)^2\right)$$

The covariance matrix satisfies the same conditions as before. If we take derivatives and equate them to 0 we can derive the following equations:

$$(S - \Psi)\Lambda = \Lambda(\Lambda^T \Lambda), \qquad \Psi = \mathrm{diag}(S - \Lambda \Lambda^T)$$

First an initial value for $\Psi$ is taken and $\Lambda$ is found using the first equation. Eigenvalue analysis is used for this. Then $\Psi$ is updated using the second equation. This technique is called principal factor analysis. It should not be confused with principal component analysis. If the values of $\Psi$ are 0 then the first equation is very similar to principal component analysis. That is the reason why some statistical packages contain PCA as a special case of factor analysis.

Two points should be noted:

Least squares is usually used to find initial estimates for ML.

If the correlation matrix is used then the results derived using least squares will be different. Results obtained using covariance and correlation matrices cannot be converted into each other by simple scaling, as was the case for maximum likelihood estimation.
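
The alternating scheme described above can be sketched in a few lines of R. The function name `principal_factor`, the starting value for the specific variances and the fixed number of iterations are all assumptions made for this illustration. The sketch has no convergence check and no protection against Heywood cases.

```r
principal_factor <- function(S, m, iter = 50) {
  psi <- 1 / diag(solve(S))                   # a common starting value (assumed here)
  for (k in seq_len(iter)) {
    # First equation: loadings from the leading eigenpairs of S - Psi
    eig <- eigen(S - diag(psi), symmetric = TRUE)
    vals <- pmax(eig$values[1:m], 0)          # guard against negative eigenvalues
    Lambda <- eig$vectors[, 1:m, drop = FALSE] %*% diag(sqrt(vals), m)
    # Second equation: update the specific variances
    psi <- diag(S - Lambda %*% t(Lambda))
  }
  list(loadings = Lambda, uniquenesses = psi)
}

# Example on a correlation matrix that ships with R
pf <- principal_factor(Harman23.cor$cov, m = 2)
```

The signs of the loading columns are arbitrary because eigenvectors are determined only up to sign.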

Significance test and model selection

If the normality assumption holds then we can use the likelihood ratio test for a factor model with dimension m. If the null hypothesis is:

$$H_0: \Sigma = \Lambda \Lambda^T + \Psi$$

and the alternative is that the covariance is unconstrained (i.e. the null hypothesis is not true) then the likelihood ratio test reduces to:

$$-2 \ln \lambda = n\left(\mathrm{tr}(\hat{\Sigma}^{-1} S) - \ln|\hat{\Sigma}^{-1} S| - p\right), \qquad \hat{\Sigma} = \hat{\Lambda}\hat{\Lambda}^T + \hat{\Psi}$$

The distribution of this statistic is approximated by a chi-squared distribution with ½((p-m)² - (p+m)) degrees of freedom. This enables us to carry out the significance test for the null hypothesis. If the maximum number of identifiable parameters is reached, we can conclude that it is not straightforward to extract structure from the given data. Usually n is replaced by n' = n - 1 - (2p+5)/6 - 2m/3. In this case the chi-squared approximation is more accurate. This test is called a goodness-of-fit test.

For model selection the usual techniques are:

First carry out principal component analysis and then select the number of factors using one of the recommended techniques (scree plot, proportion of variance etc.). Then do factor analysis starting from this value.

The likelihood ratio test can be carried out to test the significance of the number of factors, but it should be applied with care. The likelihood ratio test does not make any adjustment for sequential application of the test.

Determining the number of factors is a trade-off between the number of parameters (we want as few as possible) and goodness-of-fit.
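
As a sketch of how the goodness-of-fit test could be computed, the R function below evaluates the test statistic with the corrected sample size n', the degrees of freedom and the chi-squared p-value. The function name `fa_lr_test` and the example values are assumptions made for this illustration.

```r
fa_lr_test <- function(S, Lambda, psi, n) {
  p <- nrow(S); m <- ncol(Lambda)
  Sigma <- Lambda %*% t(Lambda) + diag(psi)
  SiS <- solve(Sigma, S)                                   # Sigma^{-1} S
  logdet <- as.numeric(determinant(SiS, logarithm = TRUE)$modulus)
  Fval <- sum(diag(SiS)) - logdet - p                      # discrepancy function
  n_adj <- n - 1 - (2 * p + 5) / 6 - 2 * m / 3             # corrected sample size n'
  df <- ((p - m)^2 - (p + m)) / 2
  stat <- n_adj * Fval
  list(statistic = stat, df = df,
       p.value = pchisq(stat, df, lower.tail = FALSE))
}

# Hypothetical two-factor model; a perfectly fitting S gives a statistic of zero
Lambda <- matrix(c(0.9, 0.8, 0.7, 0.0, 0.0, 0.0,
                   0.0, 0.0, 0.0, 0.8, 0.7, 0.6), nrow = 6)
psi <- 1 - rowSums(Lambda^2)
res <- fa_lr_test(Lambda %*% t(Lambda) + diag(psi), Lambda, psi, n = 200)
```

When the function is applied to maximum likelihood estimates from `factanal`, the statistic is expected to be close to the `STATISTIC` component that `factanal` reports, since that value is built from the same corrected sample size.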

Factor rotations

Factor analysis does not give a unique solution. As we noted above, using an orthonormal rotation we can derive factors that fit the model with exactly the same accuracy. It is usual to rotate factors after the analysis. There are several techniques for doing that. They all attempt to minimise some loadings and maximise others so that interpretation of the results is easy. Two widely used techniques for deriving rotations are varimax and quartimax. Varimax maximises:

$$V = \sum_{k=1}^{m}\left[\sum_{i=1}^{p} (\lambda'_{ik})^4 - \frac{1}{p}\left(\sum_{i=1}^{p} (\lambda'_{ik})^2\right)^2\right]$$

where $\lambda'$ are the loadings after the rotation (in practice the rows of the loading matrix are often normalised by the communalities first). Quartimax maximises:

$$Q = \sum_{i=1}^{p}\sum_{k=1}^{m} (\lambda'_{ik})^4$$

Many statistical packages can find rotation matrices using these techniques. Of these two, base R provides varimax only. Sometimes it is useful to find non-orthogonal rotation matrices. One such technique is promax, which is available in R.

One technique for factor rotation maximises the non-normality of the unobserved (common) variables. It is a separate technique and is called independent component analysis (ICA).
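
The following R sketch evaluates the raw varimax criterion before and after rotation with the built-in `varimax` function. The helper names `varimax_criterion` and `quartimax_criterion` and the loadings are made up for this illustration. The unnormalised form of the criterion is assumed, so `normalize = FALSE` is passed to `varimax`.

```r
# Raw (unnormalised) varimax and quartimax criteria for a loading matrix L
varimax_criterion <- function(L) {
  sum(colSums(L^4) - colSums(L^2)^2 / nrow(L))
}
quartimax_criterion <- function(L) sum(L^4)

# Hypothetical loadings: a simple structure turned by 30 degrees
simple <- matrix(c(0.9, 0.8, 0.7, 0.0, 0.0, 0.0,
                   0.0, 0.0, 0.0, 0.8, 0.7, 0.6), nrow = 6)
theta <- pi / 6
M <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)
L0 <- simple %*% M

rot <- varimax(L0, normalize = FALSE)   # returns rotated loadings and rotation matrix
L1 <- unclass(rot$loadings)

before <- varimax_criterion(L0)
after  <- varimax_criterion(L1)
```

The rotation leaves `L1 %*% t(L1)` equal to `L0 %*% t(L0)`, so the fit of the model is unchanged and only the interpretation of the factors differs.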

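Factor scores

There are also techniques for finding factor scores. One technique, due to Bartlett, uses least squares and minimises:

$$(x - \Lambda y)^T \Psi^{-1} (x - \Lambda y)$$

If we take the derivative of this with respect to y and equate it to zero we get:

$$\hat{y} = (\Lambda^T \Psi^{-1} \Lambda)^{-1} \Lambda^T \Psi^{-1} x$$

Another technique, due to Thomson, uses the normality assumption and finds the conditional expected value of y given x. It turns out to be:

$$E(y \mid x) = \Lambda^T \Sigma^{-1} x = (I + \Lambda^T \Psi^{-1} \Lambda)^{-1} \Lambda^T \Psi^{-1} x$$

Here we assumed that the mean values of the x-s are 0. Both techniques give the scores as a linear combination of the initial variables, $\hat{y} = A x$. The matrix A is sometimes called the factor score estimation matrix in computer package output.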
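
Both score estimates are linear in x, so they are short to write down in R. The sketch below uses made-up function names (`bartlett_scores`, `thomson_scores`) and hypothetical loadings. It assumes that the rows of `x` are observations with zero mean.

```r
# Bartlett (weighted least-squares) scores
bartlett_scores <- function(x, Lambda, psi) {
  W <- t(Lambda) %*% diag(1 / psi)          # Lambda^T Psi^{-1}
  A <- solve(W %*% Lambda, W)               # m x p score matrix
  x %*% t(A)
}

# Thomson (conditional expectation) scores
thomson_scores <- function(x, Lambda, psi) {
  Sigma <- Lambda %*% t(Lambda) + diag(psi)
  A <- t(Lambda) %*% solve(Sigma)           # m x p score matrix
  x %*% t(A)
}

# Hypothetical loadings and two noise-free observations
Lambda <- matrix(c(0.9, 0.8, 0.7, 0.0, 0.0, 0.0,
                   0.0, 0.0, 0.0, 0.8, 0.7, 0.6), nrow = 6)
psi <- 1 - rowSums(Lambda^2)
y <- matrix(c(1, -0.5, 0.3, 2), 2, 2)
x <- y %*% t(Lambda)

yb <- bartlett_scores(x, Lambda, psi)       # recovers y when there is no noise
yt <- thomson_scores(x, Lambda, psi)        # shrunk towards zero
```

The Bartlett estimate is unbiased for y, while the Thomson estimate shrinks the scores towards the prior mean of zero.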

R commands for factor analysis

First decide what data matrix we have and prepare it. The necessary commands for factor analysis are in the package stats (in versions of R before 1.9.0 they were in the package mva, which has since been merged into stats). The stats package is loaded by default, so no library() call is needed. Now we can analyse data using factor analysis:

    data(swiss)                                        # loads the data
    fan <- factanal(swiss, 2)                          # does the actual calculations
    fan <- factanal(swiss, 2, scores = "Bartlett")     # factor analysis plus scores
    varimax(fan$loadings)                              # performs varimax rotation
    promax(fan$loadings)                               # performs promax rotation
    fan                                                # prints the result of the factor analysis

The second argument of factanal is the number of factors desired. See help(factanal) for this command; there are options for rotation and other things.

If the covariance matrix has been calculated by some other means then it can be used for factor analysis:

    data(Harman23.cor)
    fan <- factanal(covmat = Harman23.cor, factors = 3)

This will carry out factor analysis using the correlation matrix. Obviously scores cannot be calculated, because the original observations are not available.
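
As a follow-up to the commands above, this sketch shows how some components of a `factanal` fit might be extracted for further work. It uses the `Harman23.cor` data that ships with R; the choice of two factors is arbitrary and only for illustration.

```r
# Harman23.cor is a list with components cov, center and n.obs
fan <- factanal(covmat = Harman23.cor, factors = 2)

L   <- unclass(loadings(fan))   # p x m matrix of rotated loadings (varimax by default)
psi <- fan$uniquenesses         # estimated specific variances
h2  <- rowSums(L^2)             # communalities

# Goodness-of-fit test reported by factanal
gof <- c(statistic = unname(fan$STATISTIC),
         df        = fan$dof,
         p.value   = unname(fan$PVAL))
```

The degrees of freedom in `gof` follow the formula ½((p-m)² - (p+m)), which gives 13 for p = 8 and m = 2.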

References

1) Krzanowski WJ and Marriott FHC (1994) Multivariate analysis. Vol 2. Kendall's library of statistics
2) Morrison DF (1990) Multivariate statistical methods
3) Mardia KV, Kent JT and Bibby JM (2003) Multivariate analysis