Factor Analysis: An Alternative Technique for Studying Correlation and Covariance Structure


The Factor Analysis Model: Let x = (x1, x2, … , xp)′ have a p-variate normal distribution with mean vector μ = (μ1, μ2, … , μp)′. Let F1, F2, … , Fk denote independent standard normal random variables (the factors), and let e1, e2, … , ep denote independent normal random variables with mean 0 and var(ei) = ψi. Suppose that there exist constants ℓij (the loadings) such that: x1 = μ1 + ℓ11F1 + ℓ12F2 + … + ℓ1kFk + e1; x2 = μ2 + ℓ21F1 + ℓ22F2 + … + ℓ2kFk + e2; … ; xp = μp + ℓp1F1 + ℓp2F2 + … + ℓpkFk + ep.
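As an illustration, the model can be simulated directly. A minimal numpy sketch, where the loadings, specific variances, and sample size are arbitrary example values rather than anything from the slides:

```python
import numpy as np

rng = np.random.default_rng(1)
p, k, n = 5, 2, 100_000
L = rng.normal(size=(p, k))                 # loadings matrix (arbitrary example values)
psi = rng.uniform(0.5, 1.0, size=p)         # specific variances psi_i
mu = np.zeros(p)                            # mean vector mu

F = rng.normal(size=(n, k))                 # common factors: iid N(0, 1)
e = rng.normal(size=(n, p)) * np.sqrt(psi)  # specific factors: N(0, psi_i)
X = mu + F @ L.T + e                        # each row is x = mu + L F + e

Sigma = L @ L.T + np.diag(psi)              # implied covariance: Sigma = LL' + Psi
```

For large n, the sample covariance of X should approximate the implied covariance Σ = LL′ + Ψ.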

Factor Analysis Model: In matrix form, x = μ + LF + e, where L = (ℓij) is the p × k matrix of loadings, F = (F1, F2, … , Fk)′, and e = (e1, e2, … , ep)′. Then Σ = cov(x) = LL′ + Ψ, where Ψ = diag(ψ1, ψ2, … , ψp).

Note: var(xi) = σii = ℓi1² + ℓi2² + … + ℓik² + ψi. Hence hi² = ℓi1² + ℓi2² + … + ℓik² (the communality), i.e. the component of the variance of xi that is due to the common factors F1, F2, … , Fk, and ψi (the specific variance), i.e. the component of the variance of xi that is specific only to that observation.
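A small numpy sketch of this decomposition, using a hypothetical loading matrix for standardized variables (σii = 1), so that ψi = 1 − hi²:

```python
import numpy as np

# Hypothetical loadings for p = 4 standardized variables, k = 2 factors
L = np.array([[0.9, 0.1],
              [0.8, 0.3],
              [0.2, 0.7],
              [0.1, 0.8]])

h2 = np.sum(L**2, axis=1)   # communalities: h_i^2 = sum_j l_ij^2
psi = 1.0 - h2              # specific variances (since sigma_ii = 1)
var_x = h2 + psi            # variance split: sigma_ii = h_i^2 + psi_i
```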

F1, F2, … , Fk are called the common factors; e1, e2, … , ep are called the specific factors. Also cov(xi, Fj) = ℓij, so ℓij = the correlation between xi and Fj if the variables are standardized (σii = 1).

Rotating Factors: Recall the factor analysis model x = μ + LF + e. This gives rise to the vector x having covariance matrix Σ = LL′ + Ψ. Let P be any orthogonal matrix (PP′ = P′P = I); then x = μ + (LP)(P′F) + e and Σ = (LP)(LP)′ + Ψ.

Hence if x = μ + LF + e with Σ = LL′ + Ψ is a factor analysis model, then so also is x = μ + L*F* + e with Σ = L*L*′ + Ψ, where L* = LP and F* = P′F for any orthogonal matrix P.

Rotating the Factors

The process of exploring other models through orthogonal transformations of the factors is called rotating the factors. There are many techniques for rotating the factors: VARIMAX, Quartimax, and Equimax. VARIMAX rotation attempts to have each individual variable load highly on only a small subset of the factors.
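A minimal numpy sketch of Kaiser's VARIMAX algorithm. The SVD-based iteration below, with its tolerance and iteration cap, is one common formulation and is not taken from the slides:

```python
import numpy as np

def varimax(L, tol=1e-8, max_iter=100):
    """Find an orthogonal P maximizing the variance of the squared
    loadings within each column of L @ P (Kaiser's varimax criterion)."""
    p, k = L.shape
    P = np.eye(k)
    d_old = 0.0
    for _ in range(max_iter):
        Lr = L @ P
        u, s, vt = np.linalg.svd(
            L.T @ (Lr**3 - Lr @ np.diag(np.sum(Lr**2, axis=0)) / p)
        )
        P = u @ vt                 # nearest orthogonal matrix to the gradient term
        d = np.sum(s)
        if d_old != 0 and d / d_old < 1 + tol:
            break                  # criterion stopped improving
        d_old = d
    return L @ P, P
```

Because P is orthogonal, rotation leaves LL′, and hence the fitted covariance Σ = LL′ + Ψ, unchanged; only the interpretation of the loadings changes.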

Extracting the Factors

There are several methods; we consider two: the Principal Component Method and the Maximum Likelihood Method.

Principal Component Method: Recall the spectral decomposition S = λ1a1a1′ + λ2a2a2′ + … + λpapap′, where a1, a2, … , ap are eigenvectors of S of length 1 and λ1 ≥ λ2 ≥ … ≥ λp ≥ 0 are the eigenvalues of S.

Hence S = [√λ1 a1, √λ2 a2, … , √λp ap][√λ1 a1, √λ2 a2, … , √λp ap]′. Thus taking L = [√λ1 a1, √λ2 a2, … , √λp ap] gives S = LL′ exactly. This is the principal component solution with p factors. Note: the specific variances ψi are all zero.

The objective in factor analysis is to explain the correlation structure in the data vector with as few factors as necessary. It may happen that the trailing eigenvalues λk+1, … , λp of S are small.

In this case take L = [√λ1 a1, √λ2 a2, … , √λk ak] and, in addition, let ψi = sii − (ℓi1² + … + ℓik²), i.e. Ψ = diag(ψ1, … , ψp). Then S ≈ LL′ + Ψ, where the equality is exact along the diagonal.
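The principal component solution can be sketched directly from the eigendecomposition of S; the matrix S below is an arbitrary example, not from the slides:

```python
import numpy as np

def pc_factor_solution(S, k):
    """Principal component solution: L = [sqrt(lam_1) a_1, ..., sqrt(lam_k) a_k],
    Psi = diag(S - LL')."""
    lam, A = np.linalg.eigh(S)           # eigenvalues in ascending order
    idx = np.argsort(lam)[::-1][:k]      # indices of the k largest eigenvalues
    L = A[:, idx] * np.sqrt(lam[idx])    # columns sqrt(lam_j) a_j
    psi = np.diag(S) - np.sum(L**2, axis=1)
    return L, psi

S = np.array([[2.0, 0.5, 0.3],
              [0.5, 1.5, 0.4],
              [0.3, 0.4, 1.0]])
L, psi = pc_factor_solution(S, k=2)
```

With k = p the fit S = LL′ is exact and all ψi = 0; with k < p only the diagonal of LL′ + Ψ matches S exactly.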

Maximum Likelihood Estimation: Let x1, x2, … , xn denote a sample from Np(μ, Σ), where Σ = LL′ + Ψ. The joint density of x1, x2, … , xn is f(x1, … , xn) = ∏i (2π)^(−p/2) |Σ|^(−1/2) exp[−½ (xi − μ)′Σ⁻¹(xi − μ)].

The likelihood function is L(μ, L, Ψ) = f(x1, … , xn), viewed as a function of the parameters. The maximum likelihood estimates μ̂, L̂, Ψ̂ are obtained by numerical maximization of L(μ, L, Ψ) (equivalently, of its logarithm).
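One standard way to carry out this numerical maximization is the EM algorithm; statistical packages typically do this for you. A numpy-only sketch, under the assumption that the sample covariance S comes from the model Σ = LL′ + Ψ (the initialization and iteration count are arbitrary choices, not from the slides):

```python
import numpy as np

def fa_em(S, k, n_iter=3000, seed=0):
    """EM iterations for maximum likelihood factor analysis: fit S ~ LL' + Psi."""
    p = S.shape[0]
    rng = np.random.default_rng(seed)
    L = 0.1 * rng.normal(size=(p, k))     # small random starting loadings
    psi = np.diag(S).copy()               # start with all variance specific
    for _ in range(n_iter):
        Sigma = L @ L.T + np.diag(psi)
        beta = L.T @ np.linalg.inv(Sigma)                  # E-step regression, k x p
        Ezz = np.eye(k) - beta @ L + beta @ S @ beta.T     # expected factor second moments
        L = S @ beta.T @ np.linalg.inv(Ezz)                # M-step: new loadings
        psi = np.maximum(np.diag(S - L @ beta @ S), 1e-6)  # new Psi, kept positive
    return L, psi
```

Each EM iteration increases the likelihood; at convergence L̂ and Ψ̂ are maximum likelihood estimates (unique only up to an orthogonal rotation of L̂).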

Example: Olympic Decathlon Scores. Data were collected for n = 160 starts (139 athletes) in the ten decathlon events (100-m run, long jump, shot put, high jump, 400-m run, 110-m hurdles, discus, pole vault, javelin, 1500-m run). The sample correlation matrix is given on the next slide.

Correlation Matrix

Identification of the factors