An Introduction to Factor Analysis

Reducing variables and/or detecting underlying structures

Books you’ll never see . . .

Uses: Data reduction
[Figure: 24 actual variables reduced to two latent variables, Factor 1 and Factor 2]

Uses: Create composites/scales for psychometric instruments
[Figure: Depression and Anxiety scales]

Uses: Validate composites/scales for psychometric instruments
[Figure: Depression and Anxiety scales]

Summary of uses
Factor analytic techniques are most commonly used to reduce many items to a more manageable number of factors, so that the simplified data can be used more easily in research. They are also used to develop or explore questionnaires and other psychometric instruments.

Latent variables A metaphor

An example of common variance using bivariate relationships
I measure a sample of kindergarten children’s ability to recognize the sound(s) at the beginning of words, e.g., /k/ in “cat”.
I also measure the children’s ability to segment (break apart) sounds, e.g., “cat” = /k/ /a/ /t/.
I correlate these two measures.

[Figure: Beginning letter sounds and Phoneme Segmentation]
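
The variance two measures have in common is the square of their correlation. The following is a minimal sketch, assuming NumPy and simulated scores; the sample size, noise levels, and variable names are made up for illustration and are not taken from any real study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scores for 200 children: both skills draw on one shared
# underlying ability plus skill-specific noise.
ability = rng.normal(size=200)
beginning_sounds = ability + rng.normal(scale=0.8, size=200)
segmentation = ability + rng.normal(scale=0.8, size=200)

# Bivariate correlation between the two measures.
r = np.corrcoef(beginning_sounds, segmentation)[0, 1]

# Proportion of variance the two measures have in common.
shared_variance = r ** 2

print(f"r = {r:.2f}, shared variance = {shared_variance:.2f}")
```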

Not useful when
The variables have inadequate reliability. This instability of measurement undermines the meaningfulness of the derived factors.
A vast array of variables with no theoretical association is forced into the analysis just to see what turns up.

Approaches to Factor Analytic Techniques
Exploratory
  Mathematically driven technique
  Seeks to identify the underlying structure of a set of items or variables
  Uses scholarly intuition to figure out what the factors mean
Confirmatory
  Starts with a theory of what you expect to confirm (a priori)
  Do the items load as you expected on the factors that you predicted?
  Much more involved
  Structural Equation Modeling approach, with a test of model fit

Methodological Considerations
Selection of variables
Size of sample
Reliability of measures
Appropriateness of using Factor Analytic techniques (given the goal of the research)
Choice of method (how to extract the factors)
How many factors to retain
Methods of rotation (to ease interpretability)
Hogarty, K. Y., Kromrey, J. D., Ferron, J. M., & Hines, C. V. (2004). Selection of variables in exploratory factor analysis: An empirical comparison of a stepwise and traditional approach. Psychometrika, 69(4),

Selection of variables
Hogarty, K. Y., Kromrey, J. D., Ferron, J. M., & Hines, C. V. (2004). Selection of variables in exploratory factor analysis: An empirical comparison of a stepwise and traditional approach. Psychometrika, 69(4),

Assumptions and Requirements of Factor Analytic Techniques
More than one variable involved
Sample acquired through random selection
Robust bivariate relationships among variables
Variables are measured using either interval or ratio (or ordinal, i.e., quasi-interval?) level data
Data approximate a normal distribution (multivariate normality is also nice)
Relationships among variables are linear
Variables are measured reliably
No multicollinearity (e.g., bivariate r above 0.90)
Few missing observations
“Large” number of observations
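
The multicollinearity requirement can be screened before any factoring is attempted. The sketch below assumes NumPy and a cases-by-variables array; the function name `collinear_pairs` and the simulated data are hypothetical, and the 0.90 threshold is simply the example value given in the list above.

```python
import numpy as np

def collinear_pairs(data, threshold=0.90):
    """Return (i, j, r) for every pair of columns with |r| above threshold."""
    corr = np.corrcoef(data, rowvar=False)
    n_vars = corr.shape[0]
    return [
        (i, j, float(corr[i, j]))
        for i in range(n_vars)
        for j in range(i + 1, n_vars)
        if abs(corr[i, j]) > threshold
    ]

rng = np.random.default_rng(1)
x = rng.normal(size=(300, 3))
# Add a fourth variable that is almost a copy of the first one.
data = np.column_stack([x, x[:, 0] + rng.normal(scale=0.1, size=300)])

flagged = collinear_pairs(data)
print(flagged)  # pairs of column indices that are nearly redundant
```

A flagged pair suggests dropping or combining one of the two variables before the analysis.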

Size of sample
What is a reasonable sample size? How many observations do you need?
Old school: ten observations per planned extracted factor (with a minimum of 100 recommended).
“More is better” rule: similar reasoning to other parametric statistical techniques, but fewer can be okay under some circumstances.
More recently, it has been recognized that smaller samples can reasonably be factor analyzed, but this is still hotly debated.

Reliability of measures
Factor analysis is a correlational technique (multiple regression)
Low reliabilities attenuate correlations
Low reliabilities introduce “noise” and obscure the “signal” of the factors you are trying to detect and extract
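
How much low reliability attenuates a correlation can be worked out with the classical attenuation formula: the expected observed correlation is the true correlation multiplied by the square root of the product of the two reliabilities. The sketch below is only an illustration; the function name and the numbers are hypothetical.

```python
import math

def attenuated_r(true_r, reliability_x, reliability_y):
    """Expected observed correlation given the true correlation and
    the reliabilities of the two measures (classical attenuation formula)."""
    return true_r * math.sqrt(reliability_x * reliability_y)

# Two constructs that truly correlate 0.60, each measured with reliability 0.70.
observed = attenuated_r(0.60, 0.70, 0.70)
print(round(observed, 2))  # 0.42
```

With these illustrative numbers, a true correlation of 0.60 is expected to show up as about 0.42, which is the kind of weakened "signal" that makes factors harder to detect.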

Researcher as Quality Control

Appropriateness of Factor Analysis
Test development and instrument validation Create composites/sub-scales for psychometric instruments Detect underlying structures within Construct validity Evaluation of a theory Data reduction Reduce multiple variables to a smaller group, while maintaining the diversity of information offered. Demonstrate that multiple instruments test the same thing demonstrate that items load on one factor, or no factors, or multiple factors

Partitioning Variance
Variance common to other variables
Variance specific to that variable
Random measurement error

Most common methods of extracting factors?
Common Factor Analysis (CFA)
Assumption: The factors explain the correlations among the variables (variance in common).
Finds common variance among many items and groups it; each group must then be appropriately labeled.
Goal: To find the fewest number of factors that account for the relationships among variables.
[Figure: common variance shared by the items, labeled “CFA considers this variance”, with unique variance (item) shown separately for each item]
DeCoster (1998), Overview of Factor Analysis; Kahn 2006

Principal Components Analysis (PCA)
Assumption: Components explain the variance in common among the variables and the unique variance (item & error) present.
Goal: To find the fewest components that account for the relationships among variables.
[Figure: common variance together with the unique variance (item+error) of each item]

Comparisons
Common Factor Analysis
  Seeks the factors that account for the common variance among the variables
  Used for Exploratory Factor Analysis (EFA) or Confirmatory Factor Analysis (CFA)
  Easier to generalize to other samples/populations, since the unique and error variance of items isn’t considered
  Most often used to detect underlying structures among variables
Principal Components Analysis
  Seeks components that account for all of the common and other variance among the variables
  Harder to generalize, since other sources of variance (that are item specific and not shared) are included in the model
  Most often used for data reduction to use in research

Factor Analytic Techniques
Exploratory questions:
What factors exist among the variables?
To what degree are the variables (items) related to the factors that were extracted? (FACTOR LOADINGS)
[Figure: observed variables (Items 1 to 10, each with unique variance) loading on two latent (unobserved) variables, Factor 1 and Factor 2. Kahn 2006]

Common Factor Analysis
CFA takes into account shared (common) and item specific variance and uses the squared multiple correlation (R squared) as the measure of communality. Communality is the variance in one variable that is shared with the other variables. The factors extracted by CFA, therefore, explain the shared variance common to more than one variable.

Variance common to other variables Multicultural Counseling Inventory—Item 6: “I include the facts of age, gender roles, and socioeconomic status in my understanding of different minority cultures.” The measured overlap (R square) between this item and the other items on the MCI is the communality.

Partitions the variance for that variable that is in common with other variables.
How? Uses Multiple Regression (MR).
Use each item as an outcome in MR
Use all other items as predictors
Finds the communality among all of the variables, relative to one another

Outcome: Item 1. Predictors: Items 2 to 10. The R square is the variance Item 1 shares with the other items.

Outcome: Item 2. Predictors: Items 1 and 3 to 10. The R square is the variance Item 2 shares with the other items.

Outcome: Item 3. Predictors: Items 1, 2, and 4 to 10. The R square is the variance Item 3 shares with the other items.
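
The item-by-item regressions can be sketched in a few lines. This is an illustration under stated assumptions, not the routine any particular package uses: it assumes NumPy, a cases-by-items array, and simulated data, and the function names are hypothetical. It also shows a known shortcut, in which the same squared multiple correlations come from the diagonal of the inverse correlation matrix.

```python
import numpy as np

def smc_by_regression(data):
    """R squared from regressing each item on all of the other items."""
    z = (data - data.mean(axis=0)) / data.std(axis=0)  # standardize items
    n_vars = z.shape[1]
    r_squared = np.empty(n_vars)
    for i in range(n_vars):
        outcome = z[:, i]
        predictors = np.delete(z, i, axis=1)
        coefs, *_ = np.linalg.lstsq(predictors, outcome, rcond=None)
        residual = outcome - predictors @ coefs
        r_squared[i] = 1.0 - residual.var() / outcome.var()
    return r_squared

def smc_from_correlations(corr):
    """Same quantity computed from the inverse of the correlation matrix."""
    return 1.0 - 1.0 / np.diag(np.linalg.inv(corr))

rng = np.random.default_rng(2)
factor = rng.normal(size=(500, 1))
# Five hypothetical items that all reflect one latent factor plus noise.
items = 0.7 * (factor @ np.ones((1, 5))) + rng.normal(size=(500, 5))

by_regression = smc_by_regression(items)
by_inverse = smc_from_correlations(np.corrcoef(items, rowvar=False))
print(np.round(by_regression, 3))
print(np.round(by_inverse, 3))
```

The two routes agree, which is why software can obtain all of the initial communalities from one matrix inversion instead of running a separate regression per item.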

How is communality reported with CFA?
Squared multiple correlations (R square) are on the diagonal of the correlation matrix.
[Table: correlation matrix for Items 1 to 6. Values as extracted, row and column alignment lost: .76 .60 .56 .43 .87 .34 .45 .64 .33 .32 .65 .52 .82 .81 .57 .41]

What makes a good factor?
It is consistent with the literature regarding past investigations of variable relationships
It is easy to understand and interpret
It adheres to the “simple structure” model

Principal Component Analysis
Data reduction

Principal Component Analysis
How many components are there that can account for all or most of the information contained in the original data?
[Figure: Items 1 to 10, each with unique variance, loading on Component 1 and Component 2. Kahn 2006]

How is communality reported with PCA?
[Table: correlation matrix for Items 1 to 6. Values as extracted, row and column alignment lost: 1.0 .71 .62 .76 .34 .45 .64 .33 .32 .65 .82 .81 .57]

CFA vs. PCA
Common factor analysis and principal components analysis often yield similar results when sample sizes are large and/or item communalities are large. Common factor analysis is preferred when these criteria are not met, especially when the researcher wishes to better understand the latent variables that underlie a mass of items: not just data reduction, but dimensionality.

Factor Analytic Family of Techniques
Metaphors for extraction of factors/components

With each extraction of a component, less and less variance is unaccounted for.
[Figure: components 1 to 8 extracted in turn]

Factor Analysis Metaphor
ITEM POOL: Variance-covariance matrix for an instrument.
First factor: extracts the shared variation only (i.e., plusses).
ITEM POOL: There is still shared variance left, but it is different from the first batch.
Second factor: extracts the shared variation only (i.e., plusses).
[Figure: grids of plus and minus signs representing the item pool at each extraction]

The Principle of Parsimony
Goal: We often want to use the smallest number of separate variables to convey the most information about the relationships among constructs.
“Less is more”
Kahn 2006


How many factors to retain?
If you keep letting the program extract factors, it will extract as many factors as there are items. So how do you decide how many factors to extract?
Bryant & Yarnold (1995). Principal-Components and Factor Analysis. In Grimm & Yarnold (Eds.), Reading and Understanding Multivariate Statistics.

You want the fewest factors necessary to account for the most variance.
Factor analytic techniques will give you as many factors as you want (even if they’re complete nonsense). The aim is to find the real factors that are consistent with the theoretical structure, not just factors that pop up and have no logical explanation.
Ferketich & Muller (1999). Readings in Research Methodology, Second Edition.

A priori criterion
Replication criterion
Percentage criterion
Stopping rules
  Kaiser rule
  Cattell’s scree plot
  Parallel analysis
Bryant & Yarnold (1995). Principal-Components and Factor Analysis. In Grimm & Yarnold (Eds.), Reading and Understanding Multivariate Statistics.

A priori criterion
1. When you are replicating research and you want to retain the same number of factors as previous researchers.
2. You decide on a cut-off point based on some theoretical rationale (e.g., retain factors until 80% of the variance is explained by the extracted factors).
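
The percentage cut-off in point 2 can be applied directly to the eigenvalues of the correlation matrix. The following is a minimal sketch, assuming NumPy and a components analysis on an unreduced correlation matrix; the function name and the four-item correlation matrix are hypothetical.

```python
import numpy as np

def n_components_for_variance(corr, target=0.80):
    """Smallest number of components whose cumulative share of the
    total variance reaches the target proportion."""
    eigenvalues = np.sort(np.linalg.eigvalsh(corr))[::-1]  # largest first
    cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
    # searchsorted gives the first position where cumulative >= target.
    n_keep = int(np.searchsorted(cumulative, target)) + 1
    return n_keep, cumulative

# Hypothetical items: two pairs, correlated 0.7 within a pair, 0 between pairs.
pair = np.array([[1.0, 0.7], [0.7, 1.0]])
corr = np.block([[pair, np.zeros((2, 2))], [np.zeros((2, 2)), pair]])

n_keep, cumulative = n_components_for_variance(corr, target=0.80)
print(n_keep, np.round(cumulative, 3))
```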

Eigenvalues
The eigenvalue is the amount of variance, summed across all of the variables, that is accounted for by the factor in question.
The sum of all eigenvalues = number of variables/items in component analysis.
Ferketich & Muller (1999). Readings in Research Methodology, Second Edition.
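
Both statements about eigenvalues can be checked numerically. This is a small sketch assuming NumPy and a made-up three-item correlation matrix; it treats component loadings as eigenvectors scaled by the square root of their eigenvalues, which is the usual convention in component analysis.

```python
import numpy as np

# Hypothetical correlation matrix for three items.
corr = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])

eigenvalues, eigenvectors = np.linalg.eigh(corr)
order = np.argsort(eigenvalues)[::-1]            # largest eigenvalue first
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Component loadings: each eigenvector scaled by sqrt(eigenvalue).
loadings = eigenvectors * np.sqrt(eigenvalues)

# Sum of eigenvalues equals the number of items (here 3, up to rounding).
print(eigenvalues.sum())

# Each eigenvalue equals the sum of squared loadings on that component.
print((loadings ** 2).sum(axis=0), eigenvalues)
```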

Kaiser criterion: Retain all components with an eigenvalue greater than 1.0. (For CFA, which SPSS calls principal axis factoring, this would be “factor” instead of “component”.)
This sets the limit so that a component must account for at least as much variance as a single variable to be considered useful.
Kahn 2006
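
Applied to a correlation matrix, the Kaiser criterion is a one-line count. The sketch below assumes NumPy and the unreduced correlation matrix (the component case); the function name and the six-item matrix are hypothetical.

```python
import numpy as np

def kaiser_count(corr):
    """Number of components with an eigenvalue greater than 1.0."""
    eigenvalues = np.linalg.eigvalsh(corr)
    return int((eigenvalues > 1.0).sum())

# Hypothetical items: two groups of three, correlated 0.5 within a group
# and 0 between groups.
block = np.full((3, 3), 0.5)
np.fill_diagonal(block, 1.0)
corr = np.block([[block, np.zeros((3, 3))], [np.zeros((3, 3)), block]])

n_retained = kaiser_count(corr)
print(n_retained)
```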

Cattell’s scree test: Plot the eigenvalues and retain the factors that come before the big drop levels off (the change in slope).
Can be combined with the Kaiser criterion (factors with an eigenvalue greater than 1.0). This adds the limit that a factor must account for a chunk of variance that is more than the variance of a single item.

Parallel Analysis
You generate a scree plot (with eigenvalues) based on random data that uses the same number of variables (items) and the same number of cases.
Retain the factors with eigenvalues higher than the random eigenvalues.
Not an option in SPSS.
Kahn 2006
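
Because parallel analysis is described here as unavailable in SPSS, a hand-rolled version may be useful. The following is a minimal sketch, assuming NumPy and a cases-by-items array. Comparing against the 95th percentile of the random eigenvalues is an assumption of this sketch (the mean is another common choice), and the function name and simulated data are hypothetical.

```python
import numpy as np

def parallel_analysis(data, n_sims=200, percentile=95, seed=0):
    """Count leading eigenvalues that exceed those from random data
    with the same number of cases and variables."""
    rng = np.random.default_rng(seed)
    n_obs, n_vars = data.shape

    def sorted_eigs(x):
        return np.sort(np.linalg.eigvalsh(np.corrcoef(x, rowvar=False)))[::-1]

    observed = sorted_eigs(data)
    random_eigs = np.empty((n_sims, n_vars))
    for s in range(n_sims):
        random_eigs[s] = sorted_eigs(rng.normal(size=(n_obs, n_vars)))
    threshold = np.percentile(random_eigs, percentile, axis=0)

    exceeds = observed > threshold
    # Stop at the first factor that fails to beat its random counterpart.
    n_retain = n_vars if exceeds.all() else int(np.argmin(exceeds))
    return n_retain, observed, threshold

# Simulated example: 8 items, 300 cases, two uncorrelated latent factors.
rng = np.random.default_rng(3)
factors = rng.normal(size=(300, 2))
pattern = np.zeros((2, 8))
pattern[0, :4] = 0.7   # items 1 to 4 reflect the first factor
pattern[1, 4:] = 0.7   # items 5 to 8 reflect the second factor
data = factors @ pattern + rng.normal(scale=np.sqrt(1 - 0.49), size=(300, 8))

n_retain, observed, threshold = parallel_analysis(data)
print(n_retain)
```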

Factor Rotation
Obtaining a clearer pattern of factor loadings

The Goal of Rotating Factors
To create high factor loadings for each item on one factor
And to create low factor loadings on all other factors
This combination of characteristics is referred to as the simple structure. It makes the factors more interpretable.
Ferketich & Muller (1999). Readings in Research Methodology, Second Edition.

Factor Structure Coefficients
These are correlations between the item and its associated factor. The simple structure dictates that factor coefficients are best if they are very high (in reference to their own factor) and very low (in reference to any other retained factor). Rotating factors changes their structure coefficients, thus better approximating the simple structure being sought.

Thurstone’s Rule
Good items (variables) should load onto only one factor
Items should load on that one factor at a magnitude of at least 0.30
The factor the item loads on should not have an eigenvalue of less than 1.0
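
The first two conditions of the rule can be checked mechanically against a loading matrix. The sketch below assumes NumPy; the function name `check_items`, the status labels, and the loading values are all illustrative rather than taken from a real analysis.

```python
import numpy as np

def check_items(loadings, cutoff=0.30):
    """For each item, list the factors on which |loading| >= cutoff
    and label the item accordingly."""
    salient = np.abs(loadings) >= cutoff
    report = {}
    for item, row in enumerate(salient, start=1):
        factors = [int(f) + 1 for f in np.flatnonzero(row)]
        if len(factors) == 1:
            status = "clean"
        elif len(factors) == 0:
            status = "no salient loading"
        else:
            status = "cross-loads"
        report[item] = (factors, status)
    return report

# Illustrative loadings for five items on two factors.
loadings = np.array([
    [0.65, 0.45],
    [0.62, 0.09],
    [0.05, 0.69],
    [0.02, 0.68],
    [0.10, 0.82],
])

report = check_items(loadings)
for item, (factors, status) in report.items():
    print(f"Item {item}: factors {factors} -> {status}")
```

In this illustration the first item exceeds 0.30 on both factors, so it would be flagged for review, while the remaining items each load on a single factor.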

Distillation
[Figure: Items 1, 2, and 3 distilled into Factor 1 and Factor 2]

Kirby, J. R., Parrila, R., & Pfeiffer, S. (2003). Naming speed and phonological awareness as predictors of reading development. Journal of Educational Psychology, 93(3),
[Figures: Factor 1 and Factor 2]

Common rotations
Orthogonal: factors are at 90 degree angles (i.e., uncorrelated)
  Varimax (most popular)
  Quartimax
  Equimax
Oblique: factors may be correlated with each other
Ferketich & Muller (1999). Readings in Research Methodology, Second Edition.

Factor Extraction
Because the first factor extracted accounts for the most variance among the variables, the next factor extracted will capture variance not accounted for by the first factor. This helps the latent variables be “orthogonal,” meaning that the extracted factors are generally uncorrelated with each other.

Orthogonal Rotations
Varimax: Most common. Maximizes loadings on one factor while minimizing loadings on other factors.
Quartimax: Uncommon. Maximizes factor loading on the first factor only.
Equimax: Also less common. Combines other techniques and, because of this, is more difficult to interpret than the other two options.
Ferketich & Muller (1999). Readings in Research Methodology, Second Edition.
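
To make the varimax idea concrete, here is a sketch of the widely used SVD-based iteration for the raw varimax criterion (no Kaiser row normalization). It assumes NumPy; it is one common way to compute the rotation and is not presented as what any statistics package does internally. The loading values are made up: a clean two-factor pattern is deliberately spun 30 degrees to mimic an unrotated solution.

```python
import numpy as np

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    return float((loadings ** 2).var(axis=0).sum())

def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonally rotate a loading matrix to increase the varimax criterion."""
    n_items, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the raw varimax criterion with respect to the rotation.
        gradient = loadings.T @ (
            rotated ** 3 - rotated * (rotated ** 2).sum(axis=0) / n_items
        )
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt  # orthogonal matrix closest to the gradient
        if previous != 0.0 and s.sum() / previous < 1.0 + tol:
            break
        previous = s.sum()
    return loadings @ rotation, rotation

# Hypothetical simple structure, then spun 30 degrees to blur it.
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.0, 0.6], [0.0, 0.9]])
theta = np.deg2rad(30)
spin = np.array([[np.cos(theta), -np.sin(theta)],
                 [np.sin(theta),  np.cos(theta)]])
unrotated = simple @ spin

rotated, rotation = varimax(unrotated)
print(np.round(rotated, 2))
print(varimax_criterion(unrotated), varimax_criterion(rotated))
```

Because the rotation is orthogonal, each item's communality (its row sum of squared loadings) is unchanged; only the way that variance is distributed across the factors changes.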

Oblique rotations
Not used frequently, but should be when factors are correlated.
Promax is the most popular of the oblique methods
  First rotates orthogonally
  Then follows with an oblique rotation
  Minimizes small loadings
  Simple structure is best approximated
Ferketich & Muller (1999). Readings in Research Methodology, Second Edition.

How to decide?
You want what will give you the most interpretable result, with the simplest solution, consistent with an underlying theoretical structure. You can use different rotational techniques and compare results. Similar results strengthen confidence in the outcome.
Ferketich & Muller (1999). Readings in Research Methodology, Second Edition.

How to clarify factor loadings using rotation
[Figure sequence: Items 1 to 4 plotted against the Factor 1 and Factor 2 axes, then against the rotated Factor 1 and rotated Factor 2 axes]

Factor coefficients: before and after orthogonal rotation
Factor loading coefficients define the eigenvector. The factor loading coefficient represents the correlation between the item and the eigenvector.

Before orthogonal rotation
Variable   Eigenvector 1   Eigenvector 2
1          .62             .52
2          .54             .25
3          .59 (only one value survives for this row; its column is unclear)
4          .39             .66
5          .35             .68

After orthogonal rotation
Variable   Eigenvector 1   Eigenvector 2
1          .65             .45
2          .62             .09
3          .05             .69
4          .02             .68
5          .10             .82

Uses of Factor Analytic Techniques
All of the techniques associated with creating factors from many variables are sample specific; however, the better the quality of your sample (size, representativeness, etc.), the more likely your results will generalize to other samples, and theoretically, to the population of interest.

Floyd & Widaman (1995) “Thus, common factor analysis can provide valuable insights into the multivariate structure of a measuring instrument, isolating the theoretical constructs [i.e., factors] whose effects are reflected in responses on the instrument.” (p. 287)

Cross Validation
Randomly divide your sample (2/3, 1/3)
Try to replicate factor solutions across groups
Explore with part of the sample, then confirm with the other portion
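
The random 2/3 and 1/3 division can be sketched as follows, assuming NumPy and a cases-by-items array. Using the larger portion for exploration is an assumption of this sketch, and the function name and stand-in data are hypothetical.

```python
import numpy as np

def split_sample(data, explore_fraction=2 / 3, seed=0):
    """Randomly split cases into an exploration set and a confirmation set."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(len(data))
    cut = int(round(len(data) * explore_fraction))
    return data[shuffled[:cut]], data[shuffled[cut:]]

# Stand-in data: 300 cases by 4 items (each row is uniquely identifiable).
data = np.arange(300 * 4, dtype=float).reshape(300, 4)

explore, confirm = split_sample(data)
print(explore.shape, confirm.shape)
```

The exploratory solution found in the first portion can then be tested for fit in the second portion, which was not used to derive it.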

EFA vs. CFA
Exploratory: Find and retain factors (no test of significance, per se)
Confirmatory: See how well the constructed model fits the data, using a chi-square goodness of fit test

Confirmatory Factor Analysis and Model Fit
The researcher specifies in advance (predicts) how many factors will be found and which items should load on which factors.
[Figure: Factor 1, Factor 2, Factor 3, Factor 4]
