Multivariate Statistics Confirmatory Factor Analysis I W. M. van der Veld University of Amsterdam.

1 Multivariate Statistics Confirmatory Factor Analysis I W. M. van der Veld University of Amsterdam

2 Overview: digression on the expectation; formal specification; Exercise 2; estimation (ULS, WLS, ML); the χ²-test; the general confirmatory factor analysis approach.

3 Digression: the expectation. If the variables are expressed in deviations from their means, so that E(x)=E(y)=0, then cov(x,y)=E(xy). If the variables are expressed as standard scores, so that in addition E(x²)=E(y²)=1, then cor(x,y)=E(xy).
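These identities can be checked numerically. A minimal sketch with simulated data; the coefficients 0.6 and 0.8 are arbitrary choices that make the true correlation 0.6:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)
y = 0.6 * x + 0.8 * rng.normal(size=n)  # true cov(x, y) = cor(x, y) = 0.6

# Deviations from the mean: E(x) = E(y) = 0, so cov(x, y) = E(xy).
xc, yc = x - x.mean(), y - y.mean()
cov_xy = np.mean(xc * yc)

# Standard scores: additionally E(x^2) = E(y^2) = 1, so cor(x, y) = E(xy).
xs, ys = xc / xc.std(), yc / yc.std()
cor_xy = np.mean(xs * ys)
```

Both `cov_xy` and `cor_xy` come out close to 0.6 here only because the variables were simulated with unit variances; in general the two quantities differ.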

4 Formal specification. The full model in matrix notation: x = Λξ + δ. The variables are expressed in deviations from their means, so E(x)=E(ξ)=E(δ)=0. The latent ξ-variables are uncorrelated with the unique components (δ), so E(ξδ’)=E(δξ’)=0. On the left-hand side we need the covariance (or correlation) matrix, hence E(xx’) = Σ, where Σ is the covariance matrix of the x-variables. Then:
Σ = E[(Λξ + δ)(Λξ + δ)’]
Σ = E[(Λξ + δ)(ξ’Λ’ + δ’)]
Σ = E(Λξξ’Λ’ + δξ’Λ’ + Λξδ’ + δδ’)

5 Formal specification. The factor equation: x = Λξ + δ, with E(x)=E(ξ)=E(δ)=0 and E(ξδ’)=E(δξ’)=0.
Σ = E(Λξξ’Λ’ + δξ’Λ’ + Λξδ’ + δδ’)
Σ = E(Λξξ’Λ’) + E(δξ’Λ’) + E(Λξδ’) + E(δδ’)
Σ = ΛE(ξξ’)Λ’ + E(δξ’)Λ’ + ΛE(ξδ’) + E(δδ’)
Σ = ΛΦΛ’ + 0·Λ’ + Λ·0 + Θδ
where E(ξξ’) = Φ is the variance-covariance matrix of the factors, and E(δδ’) = Θδ is the variance-covariance matrix of the unique components. This is the covariance equation: Σ = ΛΦΛ’ + Θδ. Now relax, and see the powerful possibilities of this equation.
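The covariance equation is easy to evaluate numerically. A sketch for a hypothetical two-factor model with standardized variables (the loadings 0.7, 0.6, 0.8, 0.5 and the factor correlation 0.3 are illustrative values, not estimates):

```python
import numpy as np

# Two factors: x1, x2 load on xi1; x3, x4 load on xi2.
Lambda = np.array([[0.7, 0.0],
                   [0.6, 0.0],
                   [0.0, 0.8],
                   [0.0, 0.5]])
Phi = np.array([[1.0, 0.3],
                [0.3, 1.0]])          # factor variance-covariance matrix

# For standardized x, the unique variances make the diagonal of Sigma equal 1.
Theta = np.diag(1.0 - np.diag(Lambda @ Phi @ Lambda.T))

# The covariance equation: Sigma = Lambda Phi Lambda' + Theta_delta.
Sigma = Lambda @ Phi @ Lambda.T + Theta
```

Reading off `Sigma` reproduces the decomposition rules, e.g. σ₁₂ = λ₁₁λ₂₁ = 0.42 and σ₁₃ = λ₁₁φ₂₁λ₃₂ = 0.7 · 0.3 · 0.8.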

6 Exercise 2. Formulate expressions for the variances of, and the correlations between, the x-variables in terms of the parameters of the model, now via the formal way. It is assumed that E(xᵢ)=E(ξᵢ)=0, and E(δᵢδⱼ)=E(δx)=E(δξ)=0.

7 The factor equation is: x = Λξ + δ The covariance equation then is: Σ = ΛΦΛ’ + Θ δ This provides the required expression.

8 Exercise 2

9

10 Because both matrices are symmetric, we skip the upper triangle.

11 Exercise 2 Let’s list the variances and covariances.

12 Exercise 2. The covariances between the x-variables and the variances of the x-variables follow from the covariance equation. We already assumed that E(xᵢ)=E(ξᵢ)=E(δᵢ)=0, and E(δᵢδⱼ)=E(δx)=E(δξ)=0. If we standardize the variables x and ξ so that var(xᵢ)=var(ξᵢ)=1, then the covariances can be written as correlations.

13 Results. Exercise 1:
ρ₁₂ = λ₁₁λ₂₁, ρ₁₃ = λ₁₁φ₂₁λ₃₂, ρ₁₄ = λ₁₁φ₂₁λ₄₂, ρ₂₃ = λ₂₁φ₂₁λ₃₂, ρ₂₄ = λ₂₁φ₂₁λ₄₂, ρ₃₄ = λ₃₂λ₄₂.
Exercise 2 gives the same result as the intuitive approach, but in a different notation: φᵢᵢ = var(ξᵢ) and φᵢⱼ = cov(ξᵢ, ξⱼ), or, when standardized, cor(ξᵢ, ξⱼ).

14 Estimation. The model parameters can normally be estimated if the model is identified. Let’s assume, for the sake of simplicity, that our variables are standardized, except for the unique components. The decomposition rules hold only for the population correlations, not for the sample correlations, and normally we know only the sample correlations. It is easily shown that the solution differs across models, so an efficient estimation procedure is needed.

15 Estimation. There are several general principles. We will discuss the Unweighted Least Squares (ULS) procedure and the Weighted Least Squares (WLS) procedure. Both procedures are based on the residuals between the sample correlations (S) and the expected values of the correlations. Thus estimation means minimizing the difference between S and the model-implied correlations, which are a function of the model parameters, as we found earlier: Σ = ΛΦΛ’ + Θδ.

16 ULS Estimation. The ULS procedure looks for the parameter values that minimize the unweighted sum of squared residuals:
F_ULS = Σᵢ (sᵢ − σᵢ)²,
where i runs over the unique elements of the correlation matrix. Let’s see what this does for the example used earlier with the four indicators:

      x₁    x₂    x₃    x₄
x₁  1.00
x₂   .42  1.00
x₃   .56   .48  1.00
x₄   .35   .30   .50  1.00

17 ULS Estimation.
F_ULS = (.42 − λ₁₁λ₂₁)² + (.56 − λ₁₁λ₃₁)² + (.35 − λ₁₁λ₄₁)²
      + (.48 − λ₂₁λ₃₁)² + (.30 − λ₂₁λ₄₁)²
      + (.40 − λ₃₁λ₄₁)²
      + (1 − (λ₁₁² + var(δ₁)))² + (1 − (λ₂₁² + var(δ₂)))²
      + (1 − (λ₃₁² + var(δ₃)))² + (1 − (λ₄₁² + var(δ₄)))²
The estimation procedure looks iteratively for the values of all the parameters that minimize the function F_ULS.
Advantages: consistent estimates without distributional assumptions on the x’s, so for large samples ULS is approximately unbiased.
Disadvantages: there is no statistical test associated with this procedure (only descriptive fit measures such as the RMR), and the estimators are scale dependent.
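The iterative search can be sketched with a general-purpose optimizer. This fits a one-factor model to the example correlation matrix by minimizing F_ULS; the starting values of 0.5 are arbitrary:

```python
import numpy as np
from scipy.optimize import minimize

# Sample correlation matrix for the four indicators (from the slide).
S = np.array([[1.00, 0.42, 0.56, 0.35],
              [0.42, 1.00, 0.48, 0.30],
              [0.56, 0.48, 1.00, 0.50],
              [0.35, 0.30, 0.50, 1.00]])

tril = np.tril_indices(4)  # unique elements: lower triangle incl. diagonal

def f_uls(params):
    lam, theta = params[:4], params[4:]
    # One-factor model: Sigma = lambda lambda' + Theta_delta.
    sigma = np.outer(lam, lam) + np.diag(theta)
    return np.sum((S - sigma)[tril] ** 2)

res = minimize(f_uls, x0=np.full(8, 0.5))
lam_hat = res.x[:4]
```

The loadings come out near .7, .6, .8, .5, and the minimum of F_ULS is small but not zero: no choice of parameters reproduces all six correlations exactly, which is exactly what makes a test of the model possible.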

18 WLS Estimation. The WLS procedure looks for the parameter values that minimize the weighted sum of squared residuals:
F_WLS = Σᵢ wᵢ(sᵢ − σᵢ)²,
where i runs over the unique elements of the correlation matrix. The weights wᵢ can be chosen in different ways.
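A minimal sketch of the fit function with one weight per residual (in practice WLS may use a full weight matrix across residuals; the vectors below reuse the off-diagonal correlations from the example, with model-implied values from the loadings .7, .6, .8, .5):

```python
import numpy as np

def f_wls(s, sigma, w):
    """Weighted sum of squared residuals over the unique elements.

    s, sigma: 1-D arrays of sample and model-implied correlations;
    w: one weight per residual. ULS is the special case w = 1.
    """
    return np.sum(w * (s - sigma) ** 2)

s     = np.array([0.42, 0.56, 0.35, 0.48, 0.30, 0.50])  # sample
sigma = np.array([0.42, 0.56, 0.35, 0.48, 0.30, 0.40])  # model-implied
uls_value = f_wls(s, sigma, np.ones(6))  # only the last residual is nonzero
```

With unit weights the value equals the unweighted sum of squares, here (.50 − .40)² = .01; changing `w` changes how much each residual counts.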

19 Maximum Likelihood Estimation. The most commonly used procedure, the Maximum Likelihood (ML) estimator, can be specified as a special case of the WLS estimator. The ML estimator provides standard errors for the parameters and a test statistic for the fit of the model, even for much smaller samples. But this estimator was developed under the assumption that the observed variables have a multivariate normal distribution.

20 The χ²-test. Without a statistical test we don’t know whether our theory holds. The test statistic t used is the value of the fitting function (F_ML) at its minimum. If the model is correct, t is χ²(df) distributed. Normally the model is rejected if t > Cα, where Cα is the value of the χ² for which Pr(χ²_df > Cα) = α; see the appendices in many statistics books. But the χ², like any similar test statistic, should not always be trusted. A more robust check is to look at: the residuals, and the expected parameter change (EPC).
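Looking up Cα is a one-liner rather than an appendix table. A sketch for df = 2 (which would correspond, e.g., to a one-factor model for four standardized indicators: 10 unique moments minus 8 free parameters); the statistic t = 7.3 is a made-up value for illustration:

```python
from scipy.stats import chi2

df = 2
alpha = 0.05
C_alpha = chi2.ppf(1 - alpha, df)  # critical value: Pr(chi2_df > C_alpha) = alpha

t = 7.3  # hypothetical value of the fitting function at its minimum
reject = t > C_alpha
```

For df = 2 and α = .05 the critical value is about 5.99, so the hypothetical model here would be rejected.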

21 General CFA approach. A model is specified with observed and latent variables. Correlations (covariances) between the observed variables can be expressed in the parameters of the model (decomposition rules). If the model is identified, the parameters can be estimated. A test of the model can be performed if df > 0. Possible misspecifications (an unacceptable χ²) can be detected, and corrections to the model can be introduced: adjusting the theory.

22 (Flow diagram connecting Reality, Theory, Model, and Data, with the data collection process and model modification as links.)

