1
MGMT 6971 PSYCHOMETRICS © 2014, Michael Kalsher
Factor Analysis
2
Outline
  What Are Factors?
  Representing Factors: Graphs and Equations
  Extracting Factors: Methods and Criteria
  Interpreting Factor Structures: Factor Rotation
  Reliability: Cronbach’s alpha
  Writing Results
3
When to use Factor Analysis?
Data reduction (retaining as much of the original information as possible).
Identifying underlying latent structures.
Determining whether different measures or variables are tapping aspects of a common dimension. Clusters of correlated variables are termed factors.
Example: Factor analysis could be used to identify the “core” characteristics (out of a potentially large number of characteristics) that make a person popular.
Candidate characteristics:
  Time spent talking about the other person (Talk 1, a relatively positive trait)
  Level of social skills (Social Skills)
  How interesting a person is to others (Interest)
  Time spent talking about themselves (Talk 2, a relatively negative trait)
  Selfishness (Selfish)
  The person’s propensity to lie about themselves (Liar)
4
An R-Matrix
An R-matrix is the matrix of correlations between the measured variables. In Factor Analysis and Principal Components Analysis (PCA) we look to reduce the R-matrix into a smaller set of uncorrelated dimensions.
Factor 1: The better your social skills, the more interesting and talkative you tend to be.
Factor 2: Selfish people are likely to lie and talk about themselves.
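As a rough illustration of what an R-matrix looks like, the sketch below simulates six variables driven by two latent factors and computes their correlations. The data, the sample size of 500, and the 0.8/0.6 weights are all assumptions made up for the example; they are not the lecture's data.

```python
import numpy as np

# Hypothetical data: 500 simulated respondents, not the lecture's data.
rng = np.random.default_rng(0)
n = 500
sociability = rng.normal(size=n)      # assumed latent factor 1
consideration = rng.normal(size=n)    # assumed latent factor 2

def noisy(latent):
    """Observed variable = 0.8 * latent factor + independent noise."""
    return 0.8 * latent + 0.6 * rng.normal(size=n)

names = ["Talk 1", "Social Skills", "Interest", "Talk 2", "Selfish", "Liar"]
data = np.column_stack([
    noisy(sociability), noisy(sociability), noisy(sociability),
    noisy(consideration), noisy(consideration), noisy(consideration),
])

# R-matrix: Pearson correlations between every pair of variables (columns).
R = np.corrcoef(data, rowvar=False)

for name, row in zip(names, R):
    print(f"{name:<14}" + " ".join(f"{r:6.2f}" for r in row))
```

In the printed matrix, the first three variables should correlate strongly with each other and weakly with the last three, which is the kind of clustering that factor analysis tries to summarize.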
5
Factors and Components
Factor analysis attempts to achieve parsimony by explaining the maximum amount of common variance in a correlation matrix using the smallest number of explanatory constructs. These ‘explanatory constructs’ are called factors. PCA tries to explain the maximum amount of total variance in a correlation matrix. It does this by transforming the original variables into a set of linear components.
6
What is a Factor?
Factors can be viewed as classification axes along which the individual variables can be plotted.
The greater the loading of variables onto a factor, the more the factor explains relationships among those variables.
Ideally, variables should be strongly related to (or load onto) only one factor.
7
Graphical Representation of a factor plot
Note that each variable loads primarily on only one factor. Factor loadings tell us about the relative contribution that a variable makes to a factor.
8
Mathematical Representation of a factor plot
The equation describing a linear model can be applied to the description of a factor. The b’s in the equation represent the factor loadings observed in the factor plot.

$$Y_i = b_1 X_{1i} + b_2 X_{2i} + \dots + b_n X_{ni} + \varepsilon_i$$
$$\text{Factor}_i = b_1 \text{Variable}_{1i} + b_2 \text{Variable}_{2i} + \dots + b_n \text{Variable}_{ni} + \varepsilon_i$$

Note: there is no intercept in the equation because the lines intersect at zero, and hence the intercept is also zero.
9
Mathematical Representation of a factor plot
There are two factors underlying the popularity construct: general sociability and consideration. We can construct equations that describe each factor in terms of the variables that have been measured.

$$\text{Sociability}_i = b_1 \text{Talk 1}_i + b_2 \text{Social Skills}_i + b_3 \text{Interest}_i + b_4 \text{Talk 2}_i + b_5 \text{Selfish}_i + b_6 \text{Liar}_i + \varepsilon_i$$
$$\text{Consideration}_i = b_1 \text{Talk 1}_i + b_2 \text{Social Skills}_i + b_3 \text{Interest}_i + b_4 \text{Talk 2}_i + b_5 \text{Selfish}_i + b_6 \text{Liar}_i + \varepsilon_i$$
10
Mathematical Representation of a factor plot
The values of the “b’s” in the two equations differ, depending on the relative importance of each variable to a particular factor.

$$\text{Sociability}_i = 0.87\,\text{Talk 1}_i + 0.96\,\text{Social Skills}_i + 0.92\,\text{Interest}_i + 0.00\,\text{Talk 2}_i - 0.10\,\text{Selfish}_i + 0.09\,\text{Liar}_i + \varepsilon_i$$
$$\text{Consideration}_i = 0.01\,\text{Talk 1}_i - 0.03\,\text{Social Skills}_i + 0.04\,\text{Interest}_i + 0.82\,\text{Talk 2}_i + 0.75\,\text{Selfish}_i + 0.70\,\text{Liar}_i + \varepsilon_i$$

Replace the values of b with the co-ordinate of each variable on the graph. Ideally, variables should have very high b-values for one factor and very low b-values for all other factors.
11
Factor Matrix (or Component Matrix)
Columns display the factors (underlying constructs) and rows display how each variable loads onto each factor.

                     Factors
Variables        Sociability   Consideration
Talk 1               0.87           0.01
Social Skills        0.96          -0.03
Interest             0.92           0.04
Talk 2               0.00           0.82
Selfish             -0.10           0.75
Liar                 0.09           0.70

Both factor analysis and PCA are linear models in which loadings are used as weights.
The b values represent the weights of a variable on a factor and are termed Factor Loadings.
These values can be represented as a matrix called a Factor Matrix, or a Component Matrix if doing PCA.
The assumption of factor analysis (but not PCA) is that these algebraic factors represent real-world dimensions.
12
Factor Scores
Once factors are derived, we can estimate each person’s factor scores (based on their scores for each factor’s constituent variables).
Potential uses for factor scores:
  Estimate a person’s score on one or more factors.
  Answer questions of scientific or practical interest (e.g., “Are females more sociable than males?” using the factor scores for sociability).
Methods of determining factor scores:
  Weighted Average (simplest, but scale dependent).
  Regression Method (easiest to understand, but scores can correlate with factors other than the one on which they are based and with factor scores from a different orthogonal factor).
  Bartlett Method (produces scores that are unbiased and correlate only with their own factor).
  Anderson-Rubin Method (produces scores that are uncorrelated and standardized).
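A minimal sketch of the simplest approach, in the spirit of the weighted average method: each factor score is the sum of a person's variable scores weighted by the factor loadings. The loadings are the ones from the factor matrix slide; the two people and their standardized scores are hypothetical.

```python
import numpy as np

# Loadings from the factor matrix slide (columns: Sociability, Consideration).
loadings = np.array([
    [0.87, 0.01],    # Talk 1
    [0.96, -0.03],   # Social Skills
    [0.92, 0.04],    # Interest
    [0.00, 0.82],    # Talk 2
    [-0.10, 0.75],   # Selfish
    [0.09, 0.70],    # Liar
])

# Hypothetical standardized (z) scores for two people on the six variables.
z_scores = np.array([
    [1.0, 1.0, 1.0, 0.0, 0.0, 0.0],   # high on the first three variables
    [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],   # high on the last three variables
])

# Weighted sum: multiply each score by its loading and add up per factor.
factor_scores = z_scores @ loadings
print(np.round(factor_scores, 2))
```

Because the weights act on raw scale units, the result depends on how each variable is measured, which is why standardized scores are assumed here. The regression, Bartlett, and Anderson-Rubin methods need additional matrix algebra and are not shown.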
13
Approaches to Factor Analysis
Exploratory Factor Analysis (EFA)
  Reduce a number of measurements to a smaller number of indices or factors (e.g., principal components analysis and principal axis factoring).
  Goal: Identify factors based on the data and maximize the amount of variance explained.
Confirmatory Factor Analysis (CFA)
  Test hypothetical relationships between measures and more abstract constructs.
  Goal: The researcher must hypothesize, in advance, the number of factors, whether or not these factors are correlated, and which items load onto and reflect particular factors.
  In contrast to EFA, where all loadings are free to vary, CFA allows for the explicit constraint of certain loadings to be zero.
14
Communality: Understanding variance in an R-matrix
Total variance for a particular variable has two components:
  Common Variance: variance shared with other variables.
  Unique Variance: variance specific to that variable (including error or random variance).
Communality
  The proportion of common (or shared) variance present in a variable is known as the communality.
  A variable that has no unique variance has a communality of 1; one that shares none of its variance with any other variable has a communality of 0.
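As a small worked example, and assuming the two factors are orthogonal (uncorrelated), a variable's communality can be obtained by summing its squared loadings across the factors. The sketch below applies this to the Sociability and Consideration loadings from the factor matrix slide.

```python
# Loadings from the factor matrix slide: (Sociability, Consideration).
loadings = {
    "Talk 1": (0.87, 0.01),
    "Social Skills": (0.96, -0.03),
    "Interest": (0.92, 0.04),
    "Talk 2": (0.00, 0.82),
    "Selfish": (-0.10, 0.75),
    "Liar": (0.09, 0.70),
}

# Communality = sum of squared loadings (valid for orthogonal factors).
communalities = {
    name: sum(b * b for b in row) for name, row in loadings.items()
}

for name, h2 in communalities.items():
    # 1 - communality is the unique variance under this model.
    print(f"{name:<14} communality = {h2:.3f}, unique = {1 - h2:.3f}")
```

Under these loadings, Social Skills shares most of its variance with the factors, while Liar has roughly half of its variance left as unique variance.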
15
[Figure: variance of Variables 1, 2, and 3 (labeled Communality = 1) and variance of Variable 4 (labeled Communality = 0)]
16
Factor Extraction: PCA vs. Factor Analysis
Principal Component Analysis.
  A data reduction technique that represents a set of variables by a smaller number of variables called principal components. They are uncorrelated and therefore measure different, unrelated aspects or dimensions of the data.
  Principal components are chosen such that the first one accounts for as much of the variation in the data as possible, the second one for as much of the remaining variance as possible, and so on.
  Useful for combining many variables into a smaller number of subsets.
Factor Analysis.
  Derives a mathematical model from which factors are estimated.
  Factors are linear combinations that maximize the shared portion of the variance underlying latent constructs.
  May be used to identify the structure underlying such variables and to estimate scores to measure latent factors themselves.
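The claim that principal components are uncorrelated and ordered by variance can be checked numerically. The sketch below is one way to do so, using an eigen-decomposition of the correlation matrix; the data are simulated and the weights are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical data: 200 cases, 4 variables sharing one latent source.
latent = rng.normal(size=(200, 1))
X = latent @ np.array([[0.9, 0.8, 0.3, 0.2]]) + rng.normal(size=(200, 4))

# Standardize so the analysis is carried out on the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(Z, rowvar=False)

# eigh returns eigenvalues in ascending order, so sort them descending.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Component scores: each component is a linear combination of the variables.
scores = Z @ eigvecs
score_cov = np.cov(scores, rowvar=False)

# Off-diagonal entries are ~0 (uncorrelated components); the diagonal
# holds the eigenvalues, i.e. the variance captured by each component.
print(np.round(score_cov, 3))
```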
17
Factor Extraction: Eigenvalues & Scree Plot
Eigenvalues
  Measure the amount of variation accounted for by each factor.
  The number of principal components is less than or equal to the number of original variables.
  The first principal component accounts for as much of the variability in the data as possible.
  Each succeeding component has the highest variance possible under the constraint that it be orthogonal to (i.e., uncorrelated with) the preceding components.
Scree Plots
  Plot each eigenvalue (Y-axis) against the factor with which it is associated (X-axis).
  By graphing the eigenvalues, the relative importance of each factor becomes apparent.
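A minimal sketch of how eigenvalues and a rough text-based scree plot could be obtained from an R-matrix. The 4 x 4 correlation matrix is hypothetical and was chosen to contain two clusters of correlated variables.

```python
import numpy as np

# Hypothetical R-matrix with two clusters of correlated variables.
R = np.array([
    [1.0, 0.6, 0.1, 0.1],
    [0.6, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.5],
    [0.1, 0.1, 0.5, 1.0],
])

# Eigenvalues of a symmetric matrix, sorted from largest to smallest.
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]

# For a correlation matrix the eigenvalues sum to the number of variables,
# so each eigenvalue divided by that sum is a proportion of total variance.
proportion = eigenvalues / eigenvalues.sum()

for i, (ev, prop) in enumerate(zip(eigenvalues, proportion), start=1):
    bar = "#" * int(round(ev * 10))   # crude text version of a scree plot
    print(f"Factor {i}: eigenvalue {ev:5.2f} ({prop:5.1%} of variance) {bar}")
```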
18
Factor Retention Based on Scree Plots
Cattell (1966) suggests using the ‘point of inflexion’ of the scree plot as the cut-off point for retaining factors.
19
Factor Retention based on Kaiser’s Criterion
Kaiser (1960) recommends retaining all factors with eigenvalues greater than 1. This is based on the idea that eigenvalues represent the amount of variance explained by a factor and that an eigenvalue of 1 represents a substantial amount of variation.
Jolliffe (1972; 1986) reported that Kaiser’s criterion is too strict and recommended retaining factors with eigenvalues greater than 0.7.
An eigenvalue of 1 can mean different things in different analyses (e.g., 100 variables vs. 10 variables). An eigenvalue of 1 also means that the factor explains only as much variance as a single variable, which defeats the purpose of the procedure.
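The two cut-offs can give different answers for the same analysis. The sketch below counts retained factors under each rule; the helper name `n_retained` and the list of eigenvalues are hypothetical and serve only to illustrate the comparison.

```python
def n_retained(eigenvalues, threshold):
    """Count factors whose eigenvalue exceeds the given threshold."""
    return sum(1 for ev in eigenvalues if ev > threshold)

# Hypothetical eigenvalues for a six-variable analysis (they sum to 6).
eigenvalues = [2.9, 1.4, 0.8, 0.5, 0.3, 0.1]

kaiser = n_retained(eigenvalues, 1.0)     # Kaiser (1960): eigenvalue > 1
jolliffe = n_retained(eigenvalues, 0.7)   # Jolliffe: eigenvalue > 0.7

print(f"Kaiser retains {kaiser} factors; Jolliffe retains {jolliffe}.")
```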
20
Doing Factor Analysis: An Example
Students often become stressed about statistics (SAQ) and the use of computers and/or SPSS to analyze data. Suppose we develop a questionnaire to measure this propensity (see sample items on the following slides; the data can be found in SAQ.sav). Does the questionnaire measure a single construct? Or is it possible that there are multiple aspects comprising students’ anxiety toward SPSS?
21
Initial Considerations
The quality of the data matters (GIGO: garbage in, garbage out).
Sample size is important! A sample of 300 or more will likely provide a stable factor solution, but this depends on the number of variables and factors identified.
Correlations among the items should not be too low (less than .3) or too high (greater than .8), but the pattern is what is important.
Factors that have four or more loadings greater than 0.6 are likely to be reliable regardless of sample size.
Screen the correlation matrix and eliminate any variables that obviously cause concern.
24
Step 1: Select Factor Analysis
25
Step 2: Add all variables to be included
26
Step 3: Get descriptive statistics & correlations
Produces the R-matrix.
Gives the significance of the R-matrix correlations.
Determinant: tells us whether the area, or shape, of the data is singular (determinant is 0) or whether all the variables are completely unrelated (determinant is 1).
KMO and Bartlett’s test: relate to the adequacy of the sample size. KMO varies between 0 and 1, with higher being better. A significant Bartlett’s test is evidence that the correlations between variables are, overall, significantly different from 0.
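For readers who want to see what these diagnostics compute, here is a sketch using their standard textbook formulas rather than SPSS itself. The function names, the 4 x 4 R-matrix, and the sample size of 300 are assumptions made for illustration. Bartlett's statistic has p(p − 1)/2 degrees of freedom, which for the 23 questionnaire items gives 253, the value quoted on the reporting slide.

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Return (chi-square, degrees of freedom) for Bartlett's test."""
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)          # log of the determinant
    chi2 = -(n - 1 - (2 * p + 5) / 6) * logdet
    df = p * (p - 1) // 2
    return chi2, df

def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    inv = np.linalg.inv(R)
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale                     # partial correlations
    off_diag = ~np.eye(R.shape[0], dtype=bool)
    r2 = np.sum(R[off_diag] ** 2)              # squared correlations
    q2 = np.sum(partial[off_diag] ** 2)        # squared partial correlations
    return r2 / (r2 + q2)

# Hypothetical R-matrix and sample size.
R = np.array([
    [1.0, 0.6, 0.1, 0.1],
    [0.6, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.5],
    [0.1, 0.1, 0.5, 1.0],
])
n = 300

determinant = np.linalg.det(R)
chi2, df = bartlett_sphericity(R, n)
print(f"determinant = {determinant:.3f}")
print(f"Bartlett chi-square = {chi2:.1f} on {df} df")
print(f"KMO = {kmo(R):.3f}")
```

Turning the chi-square statistic into a p-value requires the chi-square distribution (for example via SciPy), which is left out here to keep the sketch dependent on NumPy only.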
27
Step 4: Ask for Scree Plot and set extraction options
28
Step 5: Handle missing values and sort coefficients by size
Listwise exclusion: eliminates all of a participant’s data if even one value is missing.
Pairwise exclusion: eliminates only the missing value, but includes the rest of the person’s data.
Sorting by size: sorts variables by their factor loadings.
Suppressing small loadings: only displays loadings above a specified level (you pick it).
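The difference between the two missing-value options can be seen on a toy data set. The sketch below is an illustration in NumPy, not the SPSS implementation; the five cases and three variables are made up, with one missing value in the third variable.

```python
import numpy as np

# Hypothetical data: 5 cases, 3 variables, one missing value (NaN).
X = np.array([
    [1.0, 2.0, np.nan],
    [2.0, 1.0, 3.0],
    [3.0, 4.0, 2.0],
    [4.0, 3.0, 5.0],
    [5.0, 5.0, 4.0],
])

# Listwise: drop every case that has at least one missing value.
complete = ~np.isnan(X).any(axis=1)
R_listwise = np.corrcoef(X[complete], rowvar=False)

# Pairwise: for each pair of variables, use every case where both are present.
p = X.shape[1]
R_pairwise = np.eye(p)
n_pairwise = np.zeros((p, p), dtype=int)
for i in range(p):
    for j in range(p):
        both = ~np.isnan(X[:, i]) & ~np.isnan(X[:, j])
        n_pairwise[i, j] = both.sum()
        if i != j:
            R_pairwise[i, j] = np.corrcoef(X[both, i], X[both, j])[0, 1]

print("cases used listwise:", int(complete.sum()))
print("cases used per pair:\n", n_pairwise)
```

Pairwise exclusion keeps more data for the first two variables, but each correlation may then rest on a different subset of cases.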
29
Step 6: Select rotation type and set rotation iterations
Choice depends on whether you believe the underlying factors should be related. Use Varimax for independent factors; if the factors are related, use Direct Oblimin or Promax.
Varimax: loads a smaller number of variables highly onto each factor to produce more interpretable “clusters”.
Quartimax: maximizes the spread of factor loadings for a variable across all factors, leading to lots of variables loading onto a single factor.
Equimax: hybrid of varimax and quartimax.
30
Factor Rotation
To aid interpretation it is possible to maximize the loading of a variable on one factor while minimizing its loading on all other factors. This is known as Factor Rotation.
Two types:
  Orthogonal (factors are uncorrelated)
  Oblique (factors intercorrelate)
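To make orthogonal rotation concrete, the sketch below implements a commonly used SVD-based varimax iteration in NumPy. It is an illustrative implementation and may differ in detail from what SPSS does (for instance, no Kaiser normalization is applied). The "unrotated" loadings are a hypothetical construction: the loadings from the factor matrix slide, deliberately mixed by a 30-degree rotation.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonally rotate a loading matrix toward simple structure."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    T = np.eye(k)                     # rotation matrix, starts as identity
    previous = 0.0
    for _ in range(max_iter):
        rotated = L @ T
        col_ss = (rotated ** 2).sum(axis=0)
        # Gradient of the varimax criterion with respect to the rotation.
        grad = L.T @ (rotated ** 3 - rotated * col_ss / p)
        u, s, vt = np.linalg.svd(grad)
        T = u @ vt                    # nearest orthogonal matrix
        current = s.sum()
        if previous != 0.0 and current < previous * (1 + tol):
            break
        previous = current
    return L @ T, T

# Loadings from the factor matrix slide.
simple = np.array([
    [0.87, 0.01], [0.96, -0.03], [0.92, 0.04],
    [0.00, 0.82], [-0.10, 0.75], [0.09, 0.70],
])
# Hypothetical unrotated solution: mix the two factors by 30 degrees.
theta = np.radians(30)
mix = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta), np.cos(theta)]])
unrotated = simple @ mix

rotated, T = varimax(unrotated)
print(np.round(rotated, 2))
```

The rotated columns may come back in a different order or with flipped signs. Because the rotation matrix is orthogonal, each variable's communality is unchanged; only the way its variance is split between the factors is redistributed.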
31
[Figure panels: Orthogonal Rotation; Oblique Rotation]
32
Step 7: Save Factor Scores
If the goal is to ensure that the factor scores are uncorrelated, select Anderson-Rubin; if correlations between factor scores are acceptable, then choose the Regression method.
33
Variance Explained
34
Communalities / Factor Matrix
35
Scree Plot
36
Rotated Factor Matrix
37
Pattern Matrix
38
Structure Matrix
39
Factor Correlation Matrix
40
Reliability
A measure should consistently reflect the construct it is measuring.
Test-Retest Method
  What about practice effects/mood states?
Alternate Form Method
  Expensive and impractical.
Split-Half Method
  Splits the questionnaire into two random halves, calculates scores and correlates them.
Cronbach’s Alpha
  Splits the questionnaire (or sub-scales of a questionnaire) into all possible halves, calculates the scores, correlates them and averages the correlation for all splits.
  Ranges from 0 (no reliability) to 1 (complete reliability).
  Should be .7 or greater to be considered “reliable”.
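In practice Cronbach's alpha is usually computed from item and total-score variances rather than by enumerating every split. The sketch below uses that standard variance formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The function name and the item scores are hypothetical.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha from item scores.

    items: one list of scores per item, all from the same respondents
    in the same order.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # total per respondent
    item_variance = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))

# Hypothetical sub-scale: 3 items answered by 5 respondents (1-5 scale).
items = [
    [4, 5, 2, 3, 1],
    [5, 4, 2, 3, 2],
    [4, 4, 1, 3, 2],
]
alpha = cronbach_alpha(items)
print(f"Cronbach's alpha = {alpha:.2f}")
```

A sub-scale whose items rise and fall together across respondents, as in this made-up example, yields an alpha close to 1.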
41
Step 8: Reliability Analysis
42
Step 8: Reliability Analysis
43
Item-Total Statistics: Statistics sub-scale
44
Item-Total Statistics: Peer Comparison sub-scale
45
Item-Total Statistics: Fear of Computer sub-scale
46
Item-Total Statistics: Fear of Math sub-scale
47
Reporting the Results
A principal component analysis (PCA) was conducted on the 23 items with orthogonal rotation (varimax). Bartlett’s test of sphericity, χ²(253) = , p < .001, indicated that correlations between items were sufficiently large for PCA. An initial analysis was run to obtain eigenvalues for each component in the data. Four components had eigenvalues over Kaiser’s criterion of 1 and in combination explained 50.32% of the variance. The scree plot was slightly ambiguous and showed inflexions that would justify retaining either 2 or 4 components. Given the large sample size, and the convergence of the scree plot and Kaiser’s criterion on four components, four components were retained in the final analysis. Component 1 represents a fear of computers, component 2 a fear of statistics, component 3 a fear of math, and component 4 peer evaluation concerns.
The fear of computers, fear of statistics, and fear of math subscales of the SAQ all had high reliabilities, all Cronbach’s α = . However, the fear of negative peer evaluation subscale had a relatively low reliability, Cronbach’s α = .57.
48
Procedure for factor analysis & PCA