Published by Annabelle Mitchell. Modified over 9 years ago.

Factor Analysis
Michael Kalsher, Department of Cognitive Science
PSYC 4310 / COGS 6310: Advanced Experimental Methods and Statistics
© 2013, Michael Kalsher

Outline
PSYC 4310/6310 Advanced Experimental Methods and Statistics © 2009, Michael Kalsher and James Watt
- What Are Factors?
- Representing Factors: Graphs and Equations
- Extracting Factors: Methods and Criteria
- Interpreting Factor Structures: Factor Rotation
- Reliability: Cronbach's alpha
- Writing Results

When to Use Factor Analysis?
- Data reduction
- Identification of underlying latent structures: clusters of correlated variables are termed factors.
- Example: factor analysis could be used to identify which of a large number of characteristics make a person popular. Candidate characteristics: level of social skills, selfishness, how interesting a person is to others, the amount of time they spend talking about themselves (Talk 2) versus the other person (Talk 1), and their propensity to lie about themselves.

The R-Matrix
The R-matrix is the matrix of correlations between the measured variables. Meaningful clusters of large correlation coefficients between subsets of variables suggest that those variables are measuring aspects of the same underlying dimension.
- Factor 1: The better your social skills, the more interesting and talkative you tend to be.
- Factor 2: Selfish people are likely to lie and talk about themselves.
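
As a rough illustration of what an R-matrix contains, the sketch below computes pairwise Pearson correlations in plain Python. The variable names and the scores for five people are made up for the example; they are not data from the slides.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def r_matrix(columns):
    """Correlation of every variable with every other variable."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical scores for five people (illustration only).
scores = {
    "social_skills": [1, 2, 3, 4, 5],
    "interest": [2, 1, 4, 3, 5],
    "selfish": [5, 3, 4, 2, 1],
}

R = r_matrix(scores)
for name, row in R.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

In this toy data, social skills and interest correlate positively with each other and negatively with selfishness, which is the kind of clustering the R-matrix is inspected for.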

What Is a Factor?
- Factors can be viewed as classification axes along which the individual variables can be plotted.
- The greater the loading of variables on a factor, the more the factor explains relationships among those variables.
- Ideally, variables should be strongly related to (or load on) only one factor.

Graphical Representation of a Factor Plot
- Note that each variable loads primarily on only one factor.
- Factor loadings tell us about the relative contribution that a variable makes to a factor.

Mathematical Representation of a Factor Plot
The equation describing a linear model can be applied to the description of a factor:

$Y_i = b_1 X_{1i} + b_2 X_{2i} + \dots + b_n X_{ni} + \varepsilon_i$

$\text{Factor}_i = b_1 \text{Variable}_{1i} + b_2 \text{Variable}_{2i} + \dots + b_n \text{Variable}_{ni} + \varepsilon_i$

The b's in the equation represent the factor loadings observed in the factor plot.
Note: there is no intercept in the equation because the axes intersect at zero, so the intercept is also zero.

Mathematical Representation of a Factor Plot
There are two factors underlying the popularity construct: general sociability and consideration. We can construct equations that describe each factor in terms of the variables that have been measured.

$\text{Sociability}_i = b_1 \text{Talk 1}_i + b_2 \text{Social Skills}_i + b_3 \text{Interest}_i + b_4 \text{Talk 2}_i + b_5 \text{Selfish}_i + b_6 \text{Liar}_i + \varepsilon_i$

$\text{Consideration}_i = b_1 \text{Talk 1}_i + b_2 \text{Social Skills}_i + b_3 \text{Interest}_i + b_4 \text{Talk 2}_i + b_5 \text{Selfish}_i + b_6 \text{Liar}_i + \varepsilon_i$

Mathematical Representation of a Factor Plot
Replace the values of b with the co-ordinate of each variable on the graph. The values of the b's in the two equations differ, depending on the relative importance of each variable to a particular factor.

$\text{Sociability}_i = 0.87\,\text{Talk 1}_i + 0.96\,\text{Social Skills}_i + 0.92\,\text{Interest}_i + 0.00\,\text{Talk 2}_i - 0.10\,\text{Selfish}_i + 0.09\,\text{Liar}_i + \varepsilon_i$

$\text{Consideration}_i = 0.01\,\text{Talk 1}_i - 0.03\,\text{Social Skills}_i + 0.04\,\text{Interest}_i + 0.82\,\text{Talk 2}_i + 0.75\,\text{Selfish}_i + 0.70\,\text{Liar}_i + \varepsilon_i$

Ideally, variables should have very high b-values for one factor and very low b-values for all other factors.
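
The two equations can be evaluated for one person as a weighted sum. The sketch below uses the b-values given on the slide; the person's scores are hypothetical standardized values chosen for illustration, and the error term is left out.

```python
# b-values from the slide, as (Sociability, Consideration) pairs.
LOADINGS = {
    "Talk 1": (0.87, 0.01),
    "Social Skills": (0.96, -0.03),
    "Interest": (0.92, 0.04),
    "Talk 2": (0.00, 0.82),
    "Selfish": (-0.10, 0.75),
    "Liar": (0.09, 0.70),
}

def factor_values(person):
    """Weighted sum of a person's variable scores for each factor (no error term)."""
    sociability = sum(b[0] * person[v] for v, b in LOADINGS.items())
    consideration = sum(b[1] * person[v] for v, b in LOADINGS.items())
    return sociability, consideration

# Hypothetical person: one unit above average on the three sociability
# variables, average (0) on the rest. These scores are assumptions.
person = {
    "Talk 1": 1.0, "Social Skills": 1.0, "Interest": 1.0,
    "Talk 2": 0.0, "Selfish": 0.0, "Liar": 0.0,
}

sociability, consideration = factor_values(person)
print(round(sociability, 2), round(consideration, 2))
```

Because this person differs from average only on variables that load on the first factor, the sociability value is large and the consideration value stays near zero.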

Factor Loadings
- The b values represent the weights of a variable on a factor and are termed factor loadings.
- These values are stored in a factor pattern matrix (A).
- Columns display the factors (underlying constructs) and rows display how each variable loads onto each factor.

Variable         Sociability   Consideration
Talk 1            0.87          0.01
Social Skills     0.96         -0.03
Interest          0.92          0.04
Talk 2            0.00          0.82
Selfish          -0.10          0.75
Liar              0.09          0.70

Factor Scores
Once factors are derived, we can estimate each person's factor scores (based on their scores for each factor's constituent variables).
Potential uses for factor scores:
- Estimate a person's score on one or more factors.
- Answer questions of scientific or practical interest (e.g., "Are females more sociable than males?" using the factor scores for sociability).
Methods of determining factor scores:
- Weighted average (simplest, but scale dependent)
- Regression method (easiest to understand; most typically used)
- Bartlett method (produces scores that are unbiased and correlate only with their own factor)
- Anderson-Rubin method (produces scores that are uncorrelated and standardized)
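
One common way to write the regression method is to compute score coefficients B = R⁻¹A (R is the correlation matrix, A the loadings) and multiply them by the standardized data. The sketch below follows that formulation with NumPy; the data matrix and the single column of loadings are invented for illustration, and for orthogonal factors the pattern and structure loadings are assumed to coincide.

```python
import numpy as np

def regression_factor_scores(X, loadings):
    """Regression-method factor scores: Z @ R^{-1} @ A.

    X        : (people x variables) raw scores
    loadings : (variables x factors) loading matrix A
    """
    X = np.asarray(X, dtype=float)
    A = np.asarray(loadings, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardize each variable
    R = np.corrcoef(X, rowvar=False)                  # R-matrix of the variables
    B = np.linalg.solve(R, A)                         # factor score coefficients
    return Z @ B

# Hypothetical data: 6 people, 3 variables, 1 factor (all values are assumptions).
X = [[1, 2, 5], [2, 1, 3], [3, 4, 4], [4, 3, 2], [5, 5, 1], [6, 4, 2]]
A = [[0.9], [0.8], [-0.7]]

scores = regression_factor_scores(X, A)
print(np.round(scores.ravel(), 2))
```

The result has one row per person and one column per factor, and the scores are centered on zero because the inputs were standardized.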

Approaches to Factor Analysis
Exploratory
- Reduce a number of measurements to a smaller number of indices or factors (e.g., Principal Components Analysis, or PCA).
- Goal: identify factors based on the data and maximize the amount of variance explained.
Confirmatory
- Test hypothetical relationships between measures and more abstract constructs.
- Goal: the researcher must hypothesize, in advance, the number of factors, whether or not these factors are correlated, and which items load onto and reflect particular factors. In contrast to EFA, where all loadings are free to vary, CFA allows certain loadings to be explicitly constrained to zero.

Communality
Understanding variance in an R-matrix
- Total variance for a particular variable has two components:
  - Common variance: variance shared with other variables.
  - Unique variance: variance specific to that variable (including error or random variance).
Communality
- The proportion of common (or shared) variance present in a variable is known as the communality.
- A variable that has no unique variance has a communality of 1; one that shares none of its variance with any other variable has a communality of 0.
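
When factors are uncorrelated, a variable's communality can be obtained as the sum of its squared loadings across the factors. The sketch below applies that to the two-factor loadings from the popularity example; treating those two factors as orthogonal is an assumption made for the illustration.

```python
# Loadings from the popularity example: (Sociability, Consideration).
LOADINGS = {
    "Talk 1": (0.87, 0.01),
    "Social Skills": (0.96, -0.03),
    "Interest": (0.92, 0.04),
    "Talk 2": (0.00, 0.82),
    "Selfish": (-0.10, 0.75),
    "Liar": (0.09, 0.70),
}

def communalities(loadings):
    """Sum of squared loadings per variable (assumes orthogonal factors)."""
    return {var: sum(b * b for b in row) for var, row in loadings.items()}

h2 = communalities(LOADINGS)
for var, value in h2.items():
    # Unique variance is whatever the common factors leave unexplained.
    print(f"{var:14s} communality={value:.3f} unique={1 - value:.3f}")
```

Under this assumption, Social Skills shares most of its variance with the factors, while Liar retains roughly half of its variance as unique.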

Factor Extraction: PCA vs. Factor Analysis
Principal Component Analysis
- A data reduction technique that represents a set of variables by a smaller number of variables called principal components. The components are uncorrelated and therefore measure different, unrelated aspects or dimensions of the data.
- Principal components are chosen such that the first one accounts for as much of the variation in the data as possible, the second one for as much of the remaining variance as possible, and so on.
- Useful for combining many variables into a smaller number of subsets.
Factor Analysis
- Derives a mathematical model from which factors are estimated.
- Factors are linear combinations that maximize the shared portion of the variance underlying latent constructs.
- May be used to identify the structure underlying such variables and to estimate scores to measure the latent factors themselves.

Factor Extraction: Eigenvalues & Scree Plot
Eigenvalues
- Measure the amount of variation accounted for by each factor.
- The number of principal components is less than or equal to the number of original variables. The first principal component accounts for as much of the variability in the data as possible. Each succeeding component has the highest variance possible under the constraint that it be orthogonal to (i.e., uncorrelated with) the preceding components.
Scree Plots
- Plot each eigenvalue (Y-axis) against the factor with which it is associated (X-axis).
- By graphing the eigenvalues, the relative importance of each factor becomes apparent.
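
To make the eigenvalue idea concrete, the sketch below extracts the eigenvalues of a small correlation matrix with NumPy and converts them to proportions of variance. The 3 x 3 matrix, in which every pair of variables correlates 0.5, is a made-up example; the sorted eigenvalues are the values a scree plot would put on its Y-axis.

```python
import numpy as np

# Hypothetical R-matrix: three variables, every pair correlated at 0.5.
R = np.array([
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
])

# eigvalsh is meant for symmetric matrices; sort from largest to smallest.
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]

# For a correlation matrix the eigenvalues sum to the number of variables,
# so dividing by that sum gives the proportion of variance per component.
explained = eigenvalues / eigenvalues.sum()

for k, (ev, share) in enumerate(zip(eigenvalues, explained), start=1):
    print(f"component {k}: eigenvalue={ev:.2f}, variance explained={share:.1%}")
```

Here the first component carries two thirds of the total variance, and the remaining two components split the rest equally.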

Factor Retention Based on Scree Plots

Factor Retention: Kaiser's Criterion
Kaiser (1960) recommends retaining all factors with eigenvalues greater than 1.
- Based on the idea that eigenvalues represent the amount of variance explained by a factor and that an eigenvalue of 1 represents a substantial amount of variation.
- Kaiser's criterion tends to overestimate the number of factors to be retained.
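
Kaiser's criterion amounts to a simple filter on the eigenvalues. The sketch below applies it to a hypothetical set of six eigenvalues (invented so that they sum to 6, as they would for six standardized variables) and reports how much variance the retained factors account for.

```python
def kaiser_retain(eigenvalues, threshold=1.0):
    """Return the 1-based positions of factors whose eigenvalue exceeds the threshold."""
    return [i + 1 for i, ev in enumerate(eigenvalues) if ev > threshold]

# Hypothetical eigenvalues for six variables (illustration only).
eigenvalues = [2.6, 1.9, 0.6, 0.4, 0.3, 0.2]

retained = kaiser_retain(eigenvalues)
# Share of total variance explained by the retained factors.
share = sum(eigenvalues[i - 1] for i in retained) / len(eigenvalues)

print("retained factors:", retained)
print(f"variance explained: {share:.0%}")
```

Since the criterion tends to retain too many factors, it is usually worth comparing its answer with the scree plot before settling on a number.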

Doing Factor Analysis: An Example
Students often become stressed about statistics and about using computers and/or SPSS to analyze data. Suppose we develop a questionnaire (the SAQ) to measure this propensity (see sample items on the following slides; the data can be found in SAQ.sav). Does the questionnaire measure a single construct? Or is it possible that there are multiple aspects comprising students' anxiety toward SPSS?

Doing Factor Analysis: Some Considerations
- Sample size is important! A sample of 300 or more will likely provide a stable factor solution, but this depends on the number of variables and factors identified.
- Factors that have four or more loadings greater than 0.6 are likely to be reliable regardless of sample size.
- Correlations among the items should not be too low (less than .3) or too high (greater than .8), but the pattern is what is important.
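
The correlation guideline can be screened mechanically before running the analysis. The sketch below flags item pairs whose absolute correlation falls outside the .3 to .8 range mentioned above; the item names and the correlation matrix are hypothetical, and a flag is only a prompt to look at the overall pattern, not a rule for dropping items.

```python
def flag_correlations(R, names, low=0.3, high=0.8):
    """List item pairs whose |r| is below `low` or above `high`."""
    flags = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):  # upper triangle only
            r = abs(R[i][j])
            if r < low:
                flags.append((names[i], names[j], "low"))
            elif r > high:
                flags.append((names[i], names[j], "high"))
    return flags

# Hypothetical items and correlations (illustration only).
names = ["q1", "q2", "q3"]
R = [
    [1.00, 0.85, 0.10],
    [0.85, 1.00, 0.45],
    [0.10, 0.45, 1.00],
]

flags = flag_correlations(R, names)
for a, b, kind in flags:
    print(f"{a} vs {b}: correlation is too {kind}")
```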

Factor Extraction

Scree Plot for the SAQ Data

Table of Communalities Before and After Extraction
Component Matrix Before Rotation (loadings of each variable onto each factor)
Note: Loadings less than 0.4 have been omitted.

Factor Rotation
To aid interpretation, it is possible to maximize the loading of a variable on one factor while minimizing its loading on all other factors. This is known as factor rotation.
Two types:
- Orthogonal (factors are uncorrelated)
- Oblique (factors intercorrelate)

Orthogonal Rotation vs. Oblique Rotation

Orthogonal Rotation (varimax)
Components: Fear of Computers, Fear of Statistics, Fear of Math, Peer Evaluation
Note: Varimax rotation is the most commonly used rotation. Its goal is to minimize the complexity of the components by making the large loadings larger and the small loadings smaller within each component. Quartimax rotation makes large loadings larger and small loadings smaller within each variable. Equamax rotation is a compromise that attempts to simplify both components and variables. These are all orthogonal rotations; that is, the axes remain perpendicular, so the components are not correlated.
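
For readers who want to see what a varimax rotation does numerically, here is a minimal sketch of a widely used SVD-based iteration written with NumPy. It is not the SPSS implementation, and the input is an assumption: the popularity-example loadings deliberately mixed by a 30-degree rotation so that there is something to rotate back.

```python
import numpy as np

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation; returns (rotated loadings, rotation matrix)."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    R = np.eye(k)          # start from no rotation
    d = 0.0
    for _ in range(max_iter):
        Lr = L @ R
        # Gradient-like target of the varimax criterion, projected back
        # onto the orthogonal matrices via the SVD.
        target = Lr ** 3 - (gamma / p) * Lr @ np.diag((Lr ** 2).sum(axis=0))
        u, s, vt = np.linalg.svd(L.T @ target)
        R = u @ vt
        d_new = s.sum()
        if d_new < d * (1 + tol):   # criterion stopped improving
            break
        d = d_new
    return L @ R, R

# Loadings from the popularity example (Sociability, Consideration).
simple = np.array([
    [0.87, 0.01], [0.96, -0.03], [0.92, 0.04],
    [0.00, 0.82], [-0.10, 0.75], [0.09, 0.70],
])

# Assumed starting point: the same solution turned by 30 degrees,
# so every variable loads noticeably on both axes.
theta = np.deg2rad(30)
mix = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
unrotated = simple @ mix

rotated, R = varimax(unrotated)
print(np.round(rotated, 2))
```

Because the rotation matrix is orthogonal, each variable's communality is unchanged; only the way its variance is divided between the two axes differs. The columns of the rotated solution may come back in a different order or with flipped signs.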

Oblique Rotation: Pattern Matrix
Components: Fear of Statistics, Fear of Computers, Fear of Math, Peer Evaluation

Reliability: A measure should consistently reflect the construct it is measuring
Test-Retest Method
- What about practice effects/mood states?
Alternate Form Method
- Expensive and impractical
Split-Half Method
- Splits the questionnaire into two random halves, calculates scores and correlates them.
Cronbach's Alpha
- Conceptually, splits the questionnaire (or sub-scales of a questionnaire) into all possible halves, calculates the scores, correlates them and averages the correlation for all splits.
- Usually ranges from 0 (no reliability) to 1 (complete reliability).
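
In practice Cronbach's alpha is usually computed from variances rather than by enumerating splits: alpha = k/(k-1) x (1 - sum of item variances / variance of total scores), where k is the number of items. The sketch below uses that standard formula on a made-up subscale of three items answered by four respondents.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha from a list of items, each a list of respondents' scores."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # total score per respondent
    item_variance = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))

# Hypothetical subscale: 3 items x 4 respondents (illustration only).
items = [
    [2, 4, 4, 5],
    [3, 4, 5, 5],
    [1, 3, 4, 4],
]

alpha = cronbach_alpha(items)
print(round(alpha, 3))
```

The toy items rise and fall together across respondents, so alpha comes out high; items that did not covary would push the total-score variance toward the sum of the item variances and alpha toward zero.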

Reliability: Fear of Computers Subscale

Reliability: Fear of Statistics Subscale

Reliability: Fear of Math Subscale

Reliability: Peer Evaluation Subscale

Reporting the Results
A principal component analysis (PCA) was conducted on the 23 items with orthogonal rotation (varimax). Bartlett's test of sphericity, χ²(253) = 19334.49, p < .001, indicated that correlations between items were sufficiently large for PCA. An initial analysis was run to obtain eigenvalues for each component in the data. Four components had eigenvalues over Kaiser's criterion of 1 and in combination explained 50.32% of the variance. The scree plot was slightly ambiguous and showed inflexions that would justify retaining either 2 or 4 factors. Given the large sample size, and the convergence of the scree plot and Kaiser's criterion on four components, four components were retained in the final analysis. Component 1 represents a fear of computers, component 2 a fear of statistics, component 3 a fear of math, and component 4 peer evaluation concerns. The fear of computers, fear of statistics, and fear of math subscales of the SAQ all had high reliabilities, all Cronbach's α = .82. However, the fear of negative peer evaluation subscale had a relatively low reliability, Cronbach's α = .57.
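
The degrees of freedom reported for Bartlett's test follow from the number of items: df = p(p - 1)/2, which gives 253 for 23 items. The sketch below checks that and also shows the usual form of the test statistic, -(n - 1 - (2p + 5)/6) ln|R|. The sample size and determinant passed to it are hypothetical values, not the SAQ figures.

```python
from math import log

def bartlett_df(p):
    """Degrees of freedom for Bartlett's test of sphericity with p variables."""
    return p * (p - 1) // 2

def bartlett_chi2(n, p, det_r):
    """Test statistic from sample size n, p variables, and determinant of the R-matrix."""
    return -(n - 1 - (2 * p + 5) / 6) * log(det_r)

df_saq = bartlett_df(23)
print("df for 23 items:", df_saq)

# Hypothetical inputs (assumptions, not SAQ values): n=100, p=3, |R|=0.5.
chi2_example = bartlett_chi2(100, 3, 0.5)
print(round(chi2_example, 2), "on", bartlett_df(3), "df")
```

A determinant close to 1 means the variables are nearly uncorrelated, which drives the statistic toward zero; a small determinant, as in a questionnaire with many related items, produces a large value.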

Step 1: Select Factor Analysis
Step 2: Add all variables to be included
Step 3: Get descriptive statistics & correlations
Step 4: Ask for Scree Plot and set extraction options
Step 5: Handle missing values and sort coefficients by size
Step 6: Select rotation type and set rotation iterations
Step 7: Save Factor Scores

Communalities
Variance Explained
Scree Plot
Rotated Component Matrix: Component 1
Rotated Component Matrix: Component 2
Component 1: Factor Score
Component (Factor): Score Values
Rename Components According to Interpretation