
Advanced Statistics Factor Analysis, II

Last lecture

1. What causes what: ξ → Xs, or Xs → ξ?
2. Do we explore the relation of the Xs to the ξs, or do we test (try to confirm) an a priori assumption about this relation?

Ad 1. This is the difference between PCA (principal component analysis) and FA (factor analysis).
Ad 2. This is the difference between EFA (exploratory factor analysis) and CFA (confirmatory factor analysis).

PCA and FA extraction

Factor loadings (component loadings for PCA) are correlations between factors and variables. For both PCA and FA they are extracted from the eigenvectors and eigenvalues of the correlation matrix R. Eigenvectors V are linear combinations of the variables that account for the variance measured by the corresponding eigenvalues L (each eigenvalue gives the variance attributable to one factor). The basic equation of extraction is R = V L V′ (equivalently, L = V′ R V); the matrix of factor loadings is then obtained as A = V L^(1/2).
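As a minimal sketch of this extraction equation (not the lecture's own code; the correlation matrix R is made up for illustration), the loadings can be computed with NumPy, mirroring the notation above:

```python
import numpy as np

# Illustrative correlation matrix R for three observed variables.
R = np.array([[1.00, 0.60, 0.50],
              [0.60, 1.00, 0.40],
              [0.50, 0.40, 1.00]])

# Eigendecomposition: R = V L V'.  eigh returns eigenvalues in
# ascending order, so reverse to put the largest factor first.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
L = eigvals[order]        # variances accounted for by each factor
V = eigvecs[:, order]     # eigenvectors

# Loadings: A = V L^(1/2).  Column j holds the correlations of the
# variables with factor j.
A = V * np.sqrt(L)

print("eigenvalues:", np.round(L, 3))
print("loadings:\n", np.round(A, 3))

# Check the basic extraction equation: A A' reproduces R.
assert np.allclose(A @ A.T, R)
```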

[Table: unrotated and rotated loadings on Factor 1 and Factor 2, with communalities, for the variables COST, LIFT, DEPTH, and POWDER; the bottom rows report the SSL (sum of squared loadings) and % of variance for each factor. The numerical entries are missing from the transcript.]

Number of factors

Extraction methods

- Principal components
- Principal factors
- Image factoring (rescales the data; unique variances are eliminated)
- MLF, maximum likelihood factoring (provides a significance test for factors; see the sketch below)
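A sketch of how two of these methods look in practice, assuming scikit-learn (whose FactorAnalysis fits the factor model by maximum likelihood and, in recent versions, supports varimax rotation); the simulated data and all parameter values are hypothetical:

```python
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis

rng = np.random.default_rng(0)

# Simulate 200 cases on 6 variables driven by 2 common factors.
n, p, k = 200, 6, 2
Lambda = rng.uniform(0.5, 0.9, size=(p, k))   # "true" loadings
xi = rng.standard_normal((n, k))              # common factors
delta = rng.standard_normal((n, p)) * 0.5     # unique factors
X = xi @ Lambda.T + delta

# Principal components: decomposes the total variance.
pca = PCA(n_components=k).fit(X)
print("PCA components:\n", np.round(pca.components_.T, 2))

# Maximum likelihood factoring: models only the common variance and
# supports likelihood-based testing of the number of factors.
fa = FactorAnalysis(n_components=k, rotation="varimax").fit(X)
print("ML factor loadings:\n", np.round(fa.components_.T, 2))
```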

T-F: Practical issues

1. Sample size and missing data
   - N > …
   - Missing data: consider regression imputation.
2. Normality. A normal distribution is:
   - (1) unimodal,
   - (2) symmetric (zero skewness),
   - (3) mesokurtic (neither too peaked nor too flat).
   Check the distribution of each observed variable, as in regression and other analyses.
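A quick way to screen a variable for skewness and kurtosis, as a sketch with hypothetical data (the "close to zero" reading is a common rule of thumb, not from the lecture):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.standard_normal(500)   # hypothetical variable to screen

print("skewness:", round(stats.skew(x), 3))            # ~0 if symmetric
print("excess kurtosis:", round(stats.kurtosis(x), 3)) # ~0 if mesokurtic

# Omnibus normality test (D'Agostino & Pearson), combining both:
stat, pval = stats.normaltest(x)
print("normality test p-value:", round(pval, 3))
```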

T-F: Practical issues

3. Linearity. Inspect scatterplots for pairs of observed variables.
4. Absence of outliers among cases.
5. Absence of multicollinearity. Compute the SMCs: the SMC is the squared multiple correlation of a variable when it serves as the DV and the remaining variables in the analysis are the IVs. SMC > .9 indicates multicollinearity; SMC = 1 is called singularity.
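The SMCs can be read directly off the inverse correlation matrix via the standard identity SMC_i = 1 − 1/(R⁻¹)_ii; a sketch with an illustrative R:

```python
import numpy as np

# Illustrative correlation matrix (hypothetical values).
R = np.array([[1.00, 0.60, 0.50],
              [0.60, 1.00, 0.40],
              [0.50, 0.40, 1.00]])

# SMC of variable i regressed on all the others:
# SMC_i = 1 - 1 / (R^-1)_ii.
Rinv = np.linalg.inv(R)
smc = 1 - 1 / np.diag(Rinv)
print("SMCs:", np.round(smc, 3))

# Values above .9 would signal multicollinearity; a singular R
# (SMC = 1) would make the inversion fail altogether.
```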

T-F: Practical issues

6. Factorability of R. At least some r > .3. Recommendation: Kaiser's ratio, the sum of squared correlations divided by the sum of squared correlations plus the sum of squared partial correlations. Partial correlations should be small; if they are all 0, the K-ratio = 1. A K-ratio > .6 is usually required for FA.
7. Outliers among variables. Omit variables that have a low squared multiple correlation with all the other variables initially considered for FA.
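Kaiser's ratio is usually computed as the Kaiser-Meyer-Olkin (KMO) measure, using partial correlations obtained from the inverse correlation matrix; a sketch, reusing the hypothetical R from above (the formulas are standard, not taken from these slides):

```python
import numpy as np

R = np.array([[1.00, 0.60, 0.50],
              [0.60, 1.00, 0.40],
              [0.50, 0.40, 1.00]])

# Partial correlations from the inverse correlation matrix:
# p_ij = -Rinv_ij / sqrt(Rinv_ii * Rinv_jj).
Rinv = np.linalg.inv(R)
d = np.sqrt(np.diag(Rinv))
P = -Rinv / np.outer(d, d)

# Off-diagonal sums of squares for the correlations and the partials.
off = ~np.eye(R.shape[0], dtype=bool)
r2 = np.sum(R[off] ** 2)
p2 = np.sum(P[off] ** 2)

kmo = r2 / (r2 + p2)
print("Kaiser's ratio (KMO):", round(kmo, 3))  # > .6 suggests factorability
```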

Credits

This lecture is partially based on:

Kohn, Melvin L., and Kazimierz M. Slomczynski. Social Structure and Self-Direction. Blackwell; IFiS Publishers.

Albright, Jeremy J., and Hun Myoung Park. Confirmatory Factor Analysis Using Amos, LISREL, Mplus, and SAS/STAT CALIS. Working Paper. The University Information Technology Services (UITS) Center for Statistical and Mathematical Computing, Indiana University, Bloomington, IN.

Notation for Confirmatory Factor Analysis

[Path diagram: latent variables ξ1 and ξ2 (circles), observed variables x1–x6 (squares), loadings λij, factor covariance ϕ, error terms δi.]

It is common to display confirmatory factor models as path diagrams in which squares represent observed variables and circles represent latent variables. E.g.: consider two latent variables, ξ1 and ξ2, and six observed variables, x1 through x6. Factor loadings are represented by λij; the covariance between ξ1 and ξ2 is ϕ. The δi incorporate all the variance in xi that is not captured by the common factors.

Equation for x

Variables are mean-centered, i.e., expressed as deviations from their means. Under this assumption the confirmatory factor model is summarized by the equation:

x = Λξ + δ

where x is the vector of observed variables; Λ (lambda) is the matrix of factor loadings connecting the ξi to the xi; ξ is the vector of common factors; and δ is the vector of errors. The error terms have a mean of zero, E(δ) = 0, and the common factors and errors are uncorrelated, E(ξδ′) = 0.

Specific equations for x1 to x6

x1 = λ11 ξ1 + δ1
x2 = λ21 ξ1 + δ2
x3 = λ31 ξ1 + δ3
x4 = λ42 ξ2 + δ4
x5 = λ52 ξ2 + δ5
x6 = λ62 ξ2 + δ6
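In matrix form (restating the six equations above; the zero entries encode the assumption that each variable loads on only one factor):

```latex
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \end{pmatrix}
=
\begin{pmatrix}
\lambda_{11} & 0 \\
\lambda_{21} & 0 \\
\lambda_{31} & 0 \\
0 & \lambda_{42} \\
0 & \lambda_{52} \\
0 & \lambda_{62}
\end{pmatrix}
\begin{pmatrix} \xi_1 \\ \xi_2 \end{pmatrix}
+
\begin{pmatrix} \delta_1 \\ \delta_2 \\ \delta_3 \\ \delta_4 \\ \delta_5 \\ \delta_6 \end{pmatrix}
```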

Similarities with regression

Each equation for xi is a linear function of one or more common factors plus an error term. There is no intercept, since the variables are mean-centered. The primary difference between these factor equations and regression analysis is that the ξi are unobserved in CFA. Consequently, estimation proceeds in a manner distinct from the conventional approach of regressing each xi on the ξi.

Identification

One essential step in CFA is determining whether the specified model is identified. If the number of unknown parameters to be estimated exceeds the number of pieces of information provided, the model is underidentified. E.g.: 10 = 2x + 3y is not identified (two unknowns but only one piece of information, one equation); infinitely many values of x and y satisfy it: x = -10, y = 10; x = -25, y = 20; x = -40, y = 30; etc. To make the system just-identified, another independent equation must be provided; for example, adding 3 = x + y yields x = -1 and y = 4.
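The toy example can be checked mechanically; a purely illustrative sketch using SymPy:

```python
from sympy import symbols, solve

x, y = symbols("x y")

# One equation, two unknowns: underidentified.  Solving for x leaves
# y as a free parameter, so there are infinitely many solutions.
print(solve(2*x + 3*y - 10, x))   # x expressed in terms of y

# Adding an independent equation makes the system just-identified.
print(solve([2*x + 3*y - 10, x + y - 3], [x, y]))   # {x: -1, y: 4}
```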

Identification: Input information

In CFA, a model is identified if all of the unknown parameters can be rewritten in terms of the variances and covariances of the x variables. In our case, the variance/covariance matrix for variables x1…x6 has the lower-triangular form:

σ11
σ21  σ22
σ31  σ32  σ33
σ41  σ42  σ43  σ44
σ51  σ52  σ53  σ54  σ55
σ61  σ62  σ63  σ64  σ65  σ66

The number of pieces of input information is 6(6+1)/2 = 21.

Degrees of freedom

Generally, the amount of input information is computed as p(p+1)/2, where p is the number of observed variables. Unknowns: ϕ21, six λij, six error variances for the δi, and the covariance between δ6 and δ3 — 14 parameters in all. The degrees of freedom are 21 (knowns) − 14 (unknowns) = 7. The CFA model is over-identified.

Scale of latent variables

Without some constraints, no confirmatory factor model is identified. The problem lies in the fact that the latent variables are unobserved and hence their scales are unknown. To identify the model, it is therefore necessary to set the metric of the latent variables in some manner. The two most common constraints are to set either the variance of each latent variable or one of its factor loadings to one.

Basic estimation equation

When the x variables are measured as deviations from their means, it is easy to show that the covariance matrix of x implied by the model, represented by Σ (sigma), can be decomposed as follows:

Σ = Λ Φ Λ′ + Θ

where Φ (phi) represents the covariance matrix of the ξ factors and Θ (theta) represents the covariance matrix of the unique factors δ. Estimation proceeds by finding the parameters Λ, Φ, and Θ such that the predicted covariance matrix Σ is as close to the sample covariance matrix S as possible.
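A sketch of the decomposition for the six-variable, two-factor model above; all parameter values are made up for illustration (chosen so that the implied variances equal 1):

```python
import numpy as np

# Hypothetical parameter values for the 6-indicator, 2-factor model.
Lam = np.array([[0.8, 0.0],
                [0.7, 0.0],
                [0.6, 0.0],
                [0.0, 0.9],
                [0.0, 0.7],
                [0.0, 0.6]])          # Lambda: factor loadings
Phi = np.array([[1.0, 0.4],
                [0.4, 1.0]])          # Phi: factor covariance matrix
Theta = np.diag([0.36, 0.51, 0.64, 0.19, 0.51, 0.64])  # unique variances

# Model-implied covariance matrix: Sigma = Lambda Phi Lambda' + Theta.
Sigma = Lam @ Phi @ Lam.T + Theta
print(np.round(Sigma, 3))

# Fitting would adjust Lambda, Phi, and Theta until Sigma is as
# close as possible to the sample covariance matrix S.
```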

Estimation

Several different fitting functions exist for assessing the closeness of the implied covariance matrix to the sample covariance matrix, of which maximum likelihood is the most common. A full discussion of the topic in the context of CFA, including some necessary and sufficient conditions for identification, is available in Bollen (1989, chapter 7).

ML estimation

Maximum likelihood method. The method of maximum likelihood (the term first used by Fisher, 1922a) is a general method of estimating the parameters of a population by the values that maximize the likelihood (L) of the sample.
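For covariance structures such as CFA, the ML fitting function takes a standard form (following, e.g., Bollen's treatment rather than these slides); it is minimized over the free parameters θ, equals zero when Σ(θ) = S, and p is the number of observed variables:

```latex
F_{ML} = \log\lvert\Sigma(\theta)\rvert
       + \operatorname{tr}\!\bigl(S\,\Sigma(\theta)^{-1}\bigr)
       - \log\lvert S\rvert - p
```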

Fit statistics

Goodness-of-fit tests evaluate the model in terms of the fixed parameters used to specify it, and acceptance or rejection of the model in terms of its overidentifying restrictions.

Basic assessment:

Chi-square/degrees-of-freedom ratio: tests the hypothesis that the model is consistent with the pattern of covariation among the observed variables; smaller values indicate a better fit.

Goodness-of-fit index (GFI): a measure of the relative amount of variances and covariances jointly accounted for by the model; the closer the GFI is to 1.00, the better the fit of the model to the data.
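The model chi-square is obtained from the minimized fitting function; this is a standard result not stated explicitly in the slides, with N the sample size and t the number of free parameters:

```latex
\chi^2 = (N-1)\,\hat{F}_{ML}, \qquad df = \frac{p(p+1)}{2} - t
```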

Comparison of unstandardized and standardized solutions