Exploratory Factor Analysis


Is the dataset suitable for FA, and on what basis?
Stages of deciding which factors to extract
Convergent validity and discriminant validity
Reliability: overall reliability and the reliability of each extracted factor
Interpretation of the factor structure: labeling the extracted factors
Conclusion

Suitable for FA? Preliminary check: correlation matrix (R-matrix)
Problem items: Q6 (r = .271), Q7 (r = .225), Q10 (r = .254), Q12 (r = .079), Q19 (r = -.095), Q20 (r = .171), Q23 (r = .281), Q25 (r = .176), Q26 (r = .151), and Q27 (r = .259).
Why? They do not meet the standard that associations among items should fall between 0.3 and 0.8. These items correlate so weakly with the remaining items that they are likely to load separately on their own specific factors.
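The screening rule on this slide can be sketched in Python. This is only an illustration under stated assumptions: the helper name `weak_items` is made up here, the 3-item correlation matrix is hypothetical (it is not the questionnaire data), and "flag an item when none of its correlations with other items reaches 0.3 in absolute value" is one possible reading of the rule.

```python
import numpy as np

def weak_items(corr, names, low=0.3):
    """Names of items whose strongest absolute correlation with any other item is below `low`."""
    r = np.abs(np.array(corr, dtype=float))
    np.fill_diagonal(r, 0.0)      # ignore each item's correlation with itself
    strongest = r.max(axis=1)     # strongest link of each item to the rest
    return [name for name, s in zip(names, strongest) if s < low]

# Hypothetical 3-item correlation matrix, for illustration only.
corr = [[1.00, 0.55, 0.10],
        [0.55, 1.00, 0.20],
        [0.10, 0.20, 1.00]]
flagged = weak_items(corr, ["A", "B", "C"])
print(flagged)
```

Items returned by such a check are candidates for closer inspection rather than automatic deletion.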

Suitable for FA? Communalities table: Q12 (value 0.297) points to a singularity problem. Determinant: 0.00000124 < 0.00001, which indicates a multicollinearity problem.
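A minimal sketch of the determinant check, assuming NumPy. The helper name `determinant_check` and the 2x2 matrix are hypothetical; only the benchmark (0.00001) and the reported determinant (0.00000124) come from the slide.

```python
import numpy as np

BENCHMARK = 1e-5  # benchmark quoted on the slide

def determinant_check(corr, benchmark=BENCHMARK):
    """Return (determinant, True if the determinant is above the benchmark)."""
    det = float(np.linalg.det(np.asarray(corr, dtype=float)))
    return det, det > benchmark

# Hypothetical 2-item correlation matrix, for illustration only.
det, ok = determinant_check([[1.0, 0.5],
                             [0.5, 1.0]])

# The determinant reported on the slide, compared with the same benchmark.
reported_ok = 0.00000124 > BENCHMARK
print(det, ok, reported_ok)
```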

Suitable for FA? Preliminary check, continued: the KMO value (.894) exceeds 0.5, and Bartlett's test of sphericity is statistically significant. In the Anti-image Correlation Matrix, the values along the diagonal are larger than 0.5 and the off-diagonal values are predominantly smaller, which meets the measure of sampling adequacy (MSA) criterion, with 0.5 as the minimum requirement.
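For readers who want to see what lies behind these two statistics, here is a hedged sketch using the textbook formulas: KMO as the ratio of squared correlations to squared correlations plus squared partial correlations, and Bartlett's chi-square statistic computed from the log-determinant. The function names and the equicorrelated 3x3 matrix are assumptions for illustration; this is not SPSS's own implementation. Only the sample size of 239 is taken from the slides.

```python
import numpy as np

def kmo(corr):
    """Overall Kaiser-Meyer-Olkin measure for a correlation matrix."""
    r = np.asarray(corr, dtype=float)
    inv = np.linalg.inv(r)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)            # partial (anti-image) correlations
    off = ~np.eye(r.shape[0], dtype=bool)      # mask selecting off-diagonal cells
    r2 = np.sum(r[off] ** 2)
    p2 = np.sum(partial[off] ** 2)
    return float(r2 / (r2 + p2))

def bartlett_statistic(corr, n):
    """Chi-square statistic and degrees of freedom for Bartlett's test of sphericity."""
    r = np.asarray(corr, dtype=float)
    p = r.shape[0]
    _, logdet = np.linalg.slogdet(r)
    chi2 = -(n - 1 - (2 * p + 5) / 6) * logdet
    df = p * (p - 1) // 2
    return float(chi2), df

# Hypothetical matrix: three items, all pairwise correlations 0.5.
corr = [[1.0, 0.5, 0.5],
        [0.5, 1.0, 0.5],
        [0.5, 0.5, 1.0]]
kmo_value = kmo(corr)
chi2, df = bartlett_statistic(corr, n=239)
print(kmo_value, chi2, df)
```

The p-value for Bartlett's test would come from a chi-square distribution with `df` degrees of freedom, which is left out here to keep the sketch dependent on NumPy only.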

Suitable for FA? Bland’s theory of research methods lecturers predicts that good research methods lecturers have four characteristics: a profound love of statistics, an enthusiasm for experimental design, a love of teaching, and a complete absence of normal interpersonal skills. Is the theory supported or refuted? Because these four characteristics are correlated to some degree, multicollinearity is understandable. We cannot conclude that the dataset is unsuitable for FA solely because of its multicollinearity problem.

Suitable for FA? Summary: the KMO value, an indicator of sampling adequacy, is satisfactory and Bartlett's test is statistically significant. The Anti-image Correlation Matrix meets the measure of sampling adequacy (MSA) criterion. Communalities: most items reach the minimum criterion of 0.5, indicating that they are adequately explained by the common factors. Conclusion: the dataset is suitable for FA, but some items should be removed.

Stages of making a decision on the factors to be extracted
Preliminary stage: Q12 (singularity problem) and Q10 (comparatively low factor loading value, 0.417 < 0.5) are deleted.
Second stage: the remaining 26 items are submitted to EFA with oblimin rotation, because the underlying factors are expected to be correlated.

Stages of making a decision on the factors to be extracted
Second stage, Pattern Matrix table: Q21 and Q27 cross-load on two components, and the loadings of Q1, Q9, and Q11 are suppressed because their coefficients fall below the threshold of 0.4.
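The pattern-matrix screening described here can be sketched as a small function. The name `screen_pattern` and the loadings below are hypothetical (the actual pattern matrix is not reproduced in the slides); only the 0.4 threshold is taken from the slide.

```python
def screen_pattern(loadings, threshold=0.4):
    """Split items into cross-loading items and items with no salient loading.

    `loadings` maps an item name to its list of loadings, one per factor.
    """
    cross_loading, suppressed = [], []
    for item, row in loadings.items():
        salient = sum(1 for value in row if abs(value) >= threshold)
        if salient >= 2:
            cross_loading.append(item)   # loads on two or more factors
        elif salient == 0:
            suppressed.append(item)      # every loading is below the threshold
    return cross_loading, suppressed

# Hypothetical loadings on two factors, for illustration only.
example = {
    "item_a": [0.72, 0.10],
    "item_b": [0.45, 0.48],
    "item_c": [0.31, 0.22],
}
cross_loading, suppressed = screen_pattern(example)
print(cross_loading, suppressed)
```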

Stages of making a decision on the factors to be extracted
Second stage: Q21, Q27, Q1, Q9, and Q11 are deleted, leaving 21 items for another EFA.
Third stage: the determinant (displayed as 0.000) is slightly larger than the benchmark of 0.00001. Pattern Matrix: no cross-loading variables.

Stages of making a decision on the factors to be extracted
Third stage: the KMO value is .868, with a statistically significant Bartlett's test. Total variance explained: the five extracted components account for nearly 62 percent of the variance after rotation. The eigenvalue of each component is greater than 1. Communalities: only one variable, Q7 (0.478), falls below the threshold of 0.5.
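As a rough illustration of the eigenvalue and variance-explained figures, the sketch below counts eigenvalues above 1 for a correlation matrix and reports the share of total variance they carry. It assumes NumPy, uses a hypothetical equicorrelated matrix, and works from unrotated eigenvalues of the full correlation matrix, so it will not reproduce the rotated principal axis factoring figures quoted on the slide.

```python
import numpy as np

def eigen_summary(corr):
    """Eigenvalues (descending), number above 1, and variance share of those retained."""
    eigvals = np.linalg.eigvalsh(np.asarray(corr, dtype=float))[::-1]
    retained = int(np.sum(eigvals > 1.0))
    explained = float(eigvals[:retained].sum() / eigvals.sum())
    return eigvals, retained, explained

# Hypothetical matrix: three items, all pairwise correlations 0.5.
corr = [[1.0, 0.5, 0.5],
        [0.5, 1.0, 0.5],
        [0.5, 0.5, 1.0]]
eigvals, retained, explained = eigen_summary(corr)
print(eigvals, retained, explained)
```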

Stages of making a decision on the factors to be extracted
Pattern Matrix: the factor loadings of two items, Q7 (.483) and Q26 (.438), are not as high as those of the other items. However, the convergent validity criteria vary with sample size: the minimum loading benchmark is 0.35 for a sample of 250 and 0.4 for a sample of 200. By that standard Q7 and Q26 still load sufficiently, so retaining them can be justified.

Stages of making a decision on the factors to be extracted
21 items decided, five factors extracted.
Kaiser’s criterion (retain all factors with eigenvalues above 1) has to be checked before the final decision on how many factors to extract. It calls for three conditions:
communalities after extraction > 0.7 (if the number of variables is less than 30)
sample size > 250
average communality > 0.6
In this case Kaiser’s criterion is not met, because the three conditions do not hold. The scree plot is then the last resort, provided the sample size is large (around 300 or more). Our sample is not that large, so the scree plot cannot settle the final decision on its own. If we suppose the sample is large enough, however, the scree plot gives grounds for settling on the five factors extracted from the 21 items.
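The three conditions listed on this slide can be checked mechanically. The sketch below is an assumption-laden illustration: the function name is made up, and the communalities are hypothetical apart from 0.478, the value reported for Q7. The item count (21) and sample size (239) are the ones given in the slides.

```python
from statistics import mean

def kaiser_conditions(communalities, n_variables, sample_size):
    """Evaluate each condition from the slide separately."""
    return {
        "communalities > 0.7 with fewer than 30 variables":
            n_variables < 30 and min(communalities) > 0.7,
        "sample size > 250": sample_size > 250,
        "average communality > 0.6": mean(communalities) > 0.6,
    }

# Hypothetical communalities (only 0.478 is a reported value); a real check
# would pass all 21 extraction communalities.
communalities = [0.478, 0.62, 0.71, 0.55, 0.58]
conditions = kaiser_conditions(communalities, n_variables=21, sample_size=239)
for name, holds in conditions.items():
    print(f"{name}: {holds}")
```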

Convergent Validity
Convergent validity refers to the extent to which the variables loaded on a factor are correlated: the higher the loadings, the better. Check the Pattern Matrix for the factor structure: with no cross-loadings between factors, the variables load cleanly on their factors. Also judge convergent validity against the sample size. Here the sample size is 239, and convergent validity is acceptable because most loadings within factors are above the 0.35 to 0.4 range.

Discriminant Validity
There are two ways to check discriminant validity: check the Pattern Matrix for the absence of cross-loadings, and check the Factor Correlation Matrix to confirm that correlations between factors do not exceed 0.7.

Discriminant Validity: Factor Correlation Matrix
Correlations between factors do not exceed 0.7.
Factor Correlation Matrix (Factors 1 to 5): 1.000 .452 .585 .480 .322 .506 .205 -.127 .351 .315
Extraction Method: Principal Axis Factoring. Rotation Method: Promax with Kaiser Normalization.
After the five factors are extracted, the variables loaded on them are statistically distinguishable from each other, because no correlation between any two factors exceeds 0.7.
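A minimal sketch of the 0.7 check, assuming NumPy. The helper `factor_pairs_above` and the 3x3 matrix are hypothetical. The `reported` list holds the nine off-diagonal values that appear in the transcript, kept as a flat list because their positions in the matrix are not recoverable from the slide.

```python
import numpy as np

def factor_pairs_above(phi, limit=0.7):
    """Return (factor_i, factor_j, r) for every factor pair with |r| above `limit`."""
    phi = np.asarray(phi, dtype=float)
    rows, cols = np.triu_indices_from(phi, k=1)   # upper triangle, no diagonal
    return [(int(i) + 1, int(j) + 1, float(phi[i, j]))
            for i, j in zip(rows, cols) if abs(phi[i, j]) > limit]

# Hypothetical 3-factor correlation matrix, for illustration only.
phi = [[1.00, 0.45, 0.75],
       [0.45, 1.00, 0.30],
       [0.75, 0.30, 1.00]]
problem_pairs = factor_pairs_above(phi)

# Off-diagonal correlations as they appear in the transcript.
reported = [.452, .585, .480, .322, .506, .205, -.127, .351, .315]
largest_reported = max(abs(r) for r in reported)
print(problem_pairs, largest_reported)
```

The largest reported correlation stays below the 0.7 limit, which matches the slide's reading.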

Reliability Statistics: overall reliability of the 21 items in the dataset (TOSSE.sav)
Cronbach's Alpha = .879; Cronbach's Alpha Based on Standardized Items = .881; N of Items = 21. The overall reliability is larger than 0.7.

Reliability Statistics: reliability of each component
Component 1: Cronbach's Alpha = .880; Cronbach's Alpha Based on Standardized Items = .886; N of Items = 6. Reliability > 0.7.
Component 2: Cronbach's Alpha = .679; N of Items = 3. Reliability approximately 0.7.
Component 3: Cronbach's Alpha = .717; Cronbach's Alpha Based on Standardized Items = .742; N of Items = 4. Reliability > 0.7.
Component 4: Cronbach's Alpha = .690; Cronbach's Alpha Based on Standardized Items = .692; N of Items = 3. Reliability approximately 0.7.
Component 5: Cronbach's Alpha = .736; Cronbach's Alpha Based on Standardized Items = .737; N of Items = 5. Reliability > 0.7.
In terms of overall reliability and individual factor reliability, the factor structure is acceptable.
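For reference, Cronbach's alpha can be computed from raw item scores with the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The sketch below uses only the Python standard library. The function name and the tiny score table are hypothetical and are not taken from TOSSE.sav.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha; `scores` is a list of respondents, each a list of item scores."""
    k = len(scores[0])                                  # number of items
    item_columns = list(zip(*scores))                   # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

# Hypothetical responses: 4 respondents x 3 items, for illustration only.
scores = [[4, 5, 4],
          [2, 3, 3],
          [5, 5, 4],
          [3, 2, 2]]
alpha = cronbach_alpha(scores)
print(round(alpha, 3), alpha > 0.7)
```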

Interpretation of the five extracted factors
Labels of the five factors:
Component 1: ‘Passion for Applying Statistics Knowledge’
Component 2: ‘Apprehension for Teaching’
Component 3: ‘Obsession with Successfully Applying Statistics to Experiment’
Component 4: ‘Preference for being alone’
Component 5: ‘Passion for teaching Statistics’

Component 1: ‘Passion for Applying Statistics Knowledge’ (item, loading)
Thinking about whether to use repeated or independent measures thrills me: .835
I'd rather think about appropriate dependent variables than go to the pub: .824
I quiver with excitement when thinking about designing my next experiment: .773
I enjoy sitting in the park contemplating whether to use participant observation in my next experiment: .752
Designing experiments is fun: .597
I like control conditions: .582

Component 2: ‘Apprehension for Teaching’ (item, loading)
Teaching others makes me want to swallow a large bottle of bleach because the pain of my burning oesophagus would be light relief in comparison: .819
If I had a big gun I'd shoot all the students I have to teach: .782
Standing in front of 300 people in no way makes me lose control of my bowels: .526

Component 3: ‘Obsession with Successfully Applying Statistics to Experiment’ (item, loading)
I tried to build myself a time machine so that I could go back to the 1930s and follow Fisher around on my hands and knees licking the floor on which he'd just trodden: .767
I memorize probability values for the F-distribution: .742
I worship at the shrine of Pearson: .570
I soil my pants with excitement at the mere mention of Factor Analysis: .530

Component 4: ‘Preference for being alone’ (item, loading)
I often spend my spare time talking to the pigeons ... and even they die of boredom: .763
My cat is my only friend: .760
I still live with my mother and have little personal hygiene: .734

Component 5: ‘Passion for teaching Statistics’ (item, loading)
Passing on knowledge is the greatest gift you can bestow an individual: .705
I like to help students: .686
I love teaching: .677
Helping others to understand Sums of Squares is a great feeling: .483
I spend lots of time helping students: .438

Conclusion
The five extracted factors refute Bland’s theory, since the EFA was meant to test a theory of four personality traits.
The label of Component 2 (Apprehension for Teaching) contradicts the label of Component 5 (Passion for teaching Statistics).
Individual factor reliability: Components 2 and 4 sit at the margin of 0.7, not above it.

Why don’t we first group the question items into four components that correspond to the four characteristics proposed by Bland, and then run a factor analysis, that is, a CFA?

Conclusion
With EFA, an extracted factor whose loaded variables form a cluster is often hard to label, so several trials seem unavoidable before a label can comprehensively interpret the variables loaded on that factor. This dataset therefore looks more like a case for CFA, because a hypothesis about the underlying constructs (the four personality traits) already exists.