1 FACTOR ANALYSIS Kazimieras Pukėnas

2 INTRODUCTION Factor analysis is used to uncover the latent (not directly observed) structure, or dimensions, of a set of variables. It reduces the attribute space from a larger number of variables to a smaller number of factors, which most often do not have a direct quantitative measure. Factor analysis can be used for any of the following purposes: to reduce a large number of variables to a smaller number of factors for modeling purposes; to validate a scale or index by demonstrating that its constituent items load on the same factor, and to drop proposed scale items that cross-load on more than one factor; to create a set of orthogonal factors to be treated as uncorrelated variables, as one approach to handling multicollinearity in procedures such as multiple regression.

3 The most common factor analysis model links the k observed variables X_1, X_2, …, X_k with m common (general) latent factors and a specific (characteristic) latent factor e_i for each variable. It is described by the system of equations

X_i = a_i1·F_1 + a_i2·F_2 + … + a_im·F_m + e_i,  i = 1, …, k,  m < k,

i.e. the common factors are smaller in number than the observed variables. The multipliers a_ij are interpreted as factor weights (loadings). Given the observed values X_i, the task of factor analysis is to find estimates of the common (general) factors, the factor weights, and the specific (unique) variances (the variance that cannot be explained by the common factors).
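The model above can be illustrated with a small simulation (a sketch in Python rather than SPSS; the loading matrix A and the noise level are made-up values): data are generated as X_i = a_i1·F_1 + … + a_im·F_m + e_i with k = 6 observed variables and m = 2 common factors.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, m = 500, 6, 2          # observations, observed variables, common factors

# Hypothetical factor weights (loadings): variables 1-3 load on factor 1,
# variables 4-6 on factor 2.
A = np.array([[0.8, 0.0],
              [0.7, 0.0],
              [0.9, 0.0],
              [0.0, 0.8],
              [0.0, 0.7],
              [0.0, 0.9]])

F = rng.standard_normal((n, m))          # common factors F_1, ..., F_m
E = 0.5 * rng.standard_normal((n, k))    # specific (unique) factors e_i

X = F @ A.T + E                          # X_i = a_i1*F_1 + ... + a_im*F_m + e_i
print(X.shape)                           # (500, 6)
```

Variables sharing a common factor come out strongly correlated, which is exactly the structure factor analysis tries to recover.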

4 ASSUMPTIONS In factor analysis some multicollinearity is necessary, because variables must be highly associated with some of the other variables so that they will load onto factors. But not all of the variables should be highly correlated with each other (except when only one factor is expected). Ideally, factor analysis has groups of variables that are highly associated with each other (those variables will load together as a factor) and only weakly correlated with the other groups of variables. An indicator of the strength of the relationship among the variables is Bartlett's test of sphericity, which tests the null hypothesis that the variables in the population correlation matrix are uncorrelated. If Bartlett's test gives p < 0.05, factor analysis can proceed for the data. The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy is an index comparing the magnitudes of the observed correlation coefficients to the magnitudes of the partial correlation coefficients. Large values of the KMO measure (KMO > 0.7, or at minimum KMO > 0.6) indicate that a factor analysis of the variables is a good idea.
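Both diagnostics can be computed directly from the correlation matrix. The sketch below (Python with NumPy/SciPy, not SPSS; the data are simulated with made-up loadings) uses the standard formulas: Bartlett's statistic is -(n - 1 - (2p + 5)/6)·ln|R| with p(p - 1)/2 degrees of freedom, and KMO is the ratio of the summed squared observed correlations to the summed squared observed plus partial correlations.

```python
import numpy as np
from scipy.stats import chi2

def bartlett_sphericity(X):
    """Test H0: the population correlation matrix is an identity matrix."""
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    statistic = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    df = p * (p - 1) / 2
    return statistic, chi2.sf(statistic, df)

def kmo(X):
    """KMO: observed correlations vs. partial correlations."""
    R = np.corrcoef(X, rowvar=False)
    R_inv = np.linalg.inv(R)
    # partial correlations from the inverse correlation matrix
    d = np.sqrt(np.outer(np.diag(R_inv), np.diag(R_inv)))
    P = -R_inv / d
    np.fill_diagonal(R, 0.0)
    np.fill_diagonal(P, 0.0)
    return (R ** 2).sum() / ((R ** 2).sum() + (P ** 2).sum())

# Simulated data with two groups of correlated variables (made-up loadings)
rng = np.random.default_rng(0)
F = rng.standard_normal((300, 2))
A = np.array([[0.8, 0], [0.7, 0], [0.9, 0], [0, 0.8], [0, 0.7], [0, 0.9]])
X = F @ A.T + 0.5 * rng.standard_normal((300, 6))

stat, p = bartlett_sphericity(X)
print(f"Bartlett chi2 = {stat:.1f}, p = {p:.4f}, KMO = {kmo(X):.3f}")
```

For grouped, strongly correlated variables like these, Bartlett's test rejects the null hypothesis and the KMO value lands well inside the (0, 1) range.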

5 EXTRACT THE FACTORS The next step in the factor analysis is to extract the factors. The most popular extraction method is Principal Component Analysis; there are other competing, and sometimes preferable, extraction methods (such as Maximum Likelihood). These analyses determine how well the factors explain the variation. The goal is to identify the linear combinations of variables that account for the greatest amount of common variance.
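Principal-component extraction amounts to an eigendecomposition of the correlation matrix; factors are retained while their eigenvalue exceeds 1 (the Kaiser criterion, which SPSS applies by default). A minimal sketch in Python (the data are simulated with made-up loadings, not the survey data):

```python
import numpy as np

# Simulated data: two groups of variables sharing two common factors
rng = np.random.default_rng(0)
F = rng.standard_normal((300, 2))
A = np.array([[0.8, 0], [0.7, 0], [0.9, 0], [0, 0.8], [0, 0.7], [0, 0.9]])
X = F @ A.T + 0.5 * rng.standard_normal((300, 6))

R = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)             # ascending order
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

n_factors = int((eigvals > 1).sum())             # Kaiser criterion
# Unrotated loadings: eigenvectors scaled by sqrt(eigenvalue)
loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
print(n_factors, eigvals.round(2))
```

The sum of the eigenvalues equals the number of variables (the trace of the correlation matrix), so each eigenvalue divided by that total gives the percentage of variance explained reported by SPSS.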

6 FACTOR ROTATION The initial solution is hard to interpret because variables tend to load on multiple factors. The so-called rotation procedure makes the output more understandable and is usually necessary to facilitate the interpretation of factors. The sum of the eigenvalues is not affected by rotation, but rotation alters the eigenvalues (and the percentage of variance explained) of particular factors and changes the factor loadings.
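Varimax, the rotation used later in this example, seeks an orthogonal matrix T that maximizes the variance of the squared loadings. Below is a sketch of the standard iterative SVD formulation (Python; the example loading matrix is made up). Because T is orthogonal, the per-factor sums of squared loadings change while their total, and each variable's communality, stay fixed, which is exactly the invariance stated in this slide.

```python
import numpy as np

def varimax(L, gamma=1.0, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation of a p x m loading matrix L."""
    p, m = L.shape
    T = np.eye(m)                 # accumulated rotation matrix
    var = 0.0
    for _ in range(max_iter):
        Lr = L @ T
        u, s, vt = np.linalg.svd(
            L.T @ (Lr ** 3 - (gamma / p) * Lr @ np.diag((Lr ** 2).sum(axis=0))))
        T = u @ vt
        if s.sum() < var * (1 + tol):   # converged: criterion stopped improving
            break
        var = s.sum()
    return L @ T, T

# Made-up unrotated loadings: 6 variables, 2 factors, heavy cross-loading
L = np.array([[0.7, 0.4], [0.6, 0.5], [0.8, 0.3],
              [0.4, 0.7], [0.3, 0.8], [0.5, 0.6]])
rotated, T = varimax(L)
print(rotated.round(2))
```

After rotation, each variable loads predominantly on a single factor, which is what makes the rotated solution easier to label.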

7 EXAMPLE Wellness complex visitors were asked to identify the elements (such as range of services, quality of services, etc.) that have a powerful effect on the choice of a specific complex. Answers were given on a five-point scale: 1 - very important, 5 - does not matter. To obtain a factor analysis: Open the file with the data to be analyzed. From the menus choose: Analyze → Data Reduction → Factor… Select the variables for the factor analysis in the Factor Analysis dialog box (Fig. 1). Categorical data (such as religion or country of origin) are not suitable for factor analysis. Click Descriptives.

8 EXAMPLE Fig. 1. Factor Analysis dialog box

9 EXAMPLE Specify KMO and Bartlett's test of sphericity in the Factor Analysis: Descriptives dialog box (Fig. 2) and click Continue. Click Rotation… and select the Varimax method of factor rotation in the Factor Analysis: Rotation dialog box (Fig. 2). Leave the default Rotated Solution option. Click Continue. Click Scores… and select the option Save as variables in the Factor Analysis: Factor Scores dialog box (Fig. 3). This creates one new variable for each factor in the final solution. Leave the default Regression method. Click Continue.

10 EXAMPLE Fig. 2. Factor Analysis: Descriptives and Factor Analysis: Rotation dialog boxes

11 EXAMPLE Click Options…, check Suppress small coefficients in the Factor Analysis: Options dialog box (Fig. 3), and set Absolute value below to 0.4. Click Continue. Click OK in the Factor Analysis dialog box.

12 EXAMPLE Fig. 3. Factor Analysis: Scores and Factor Analysis: Options dialog boxes

13 EXAMPLE The SPSS output KMO and Bartlett's Test (Fig. 4) shows the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy and Bartlett's test of sphericity. For these data KMO = 0.640, which falls into the acceptable range, and Bartlett's test is highly significant (p < 0.001); therefore factor analysis is appropriate. The SPSS output Total Variance Explained (Fig. 4) lists the eigenvalues associated with each linear component (factor) before extraction, after extraction, and after rotation. SPSS extracts all factors with eigenvalues greater than 1, as displayed in the columns labeled Extraction Sums of Squared Loadings. The eigenvalue associated with each factor represents the variance explained by that particular linear component, and SPSS also displays each eigenvalue as a percentage of the total variance explained. In the final part of the table (labeled Rotation Sums of Squared Loadings), the eigenvalues of the factors after rotation are displayed. Rotation has the effect of optimizing the factor structure, and one consequence for these data is that the relative importance of the four factors is equalized.

14 EXAMPLE Fig. 4. The main outputs of Factor Analysis

15 EXAMPLE The next step is to look at the content of the questions that load onto the same factor in the Rotated Component Matrix table (Fig. 5) and try to identify common themes. The questions that load highly on factor 1 all seem to relate to high-quality services; therefore we might label this factor wide range of high-quality services. The questions that load highly on factor 2 mainly relate to the complex's location and working hours; therefore we might label this factor convenient location and opening hours. The questions that load highly on factor 3 all seem to relate to promotion; therefore we might label this factor good promotion. Finally, the questions that load highly on factor 4 mainly relate to convenient parking and modern premises; therefore we might label this factor modern infrastructure of the center. This analysis suggests that the initial questionnaire is, in reality, composed of four subscales.

16 EXAMPLE Fig. 5. The main outputs of Factor Analysis

17 EXAMPLE After completing a factor analysis, the factors saved as variables can be used in other analyses, for example to examine the homogeneity of the factors across the groups under investigation. For this purpose the factor scores are often converted into categorical variables by splitting them into quantiles, and a chi-square test is performed for the null hypothesis that the factors are homogeneous. Continuing the example, we will test the hypothesis that the first factor (wide range of high-quality services) is equally important for visitors of all ages. From the menus choose: Transform → Rank Cases… Select the desired factor in the Rank Cases dialog box (Fig. 6) and specify Assign Rank 1 to Largest value.

18 EXAMPLE Fig.6. Rank Cases and Rank Cases: Types dialog boxes

19 EXAMPLE Click Rank Types…, check Ntiles in the Rank Cases: Types dialog box (Fig. 6), and specify the value 2. Uncheck Rank and click Continue. Click OK in the Rank Cases dialog box. The categorized version of the first factor (NFAC1_1), with two outcomes (1 - important and 2 - no matter), will appear in the data file. From the menus choose: Analyze → Descriptive Statistics → Crosstabs… Select the categorical variable NFAC1_1 for Column(s) and the categorical variable Age for Row(s) in the Crosstabs dialog box (Fig. 7).

20 EXAMPLE Fig. 7. Crosstabs dialog box

21 EXAMPLE Click Statistics… in the Crosstabs dialog box and check Chi-square in the Crosstabs: Statistics dialog box. Click Continue in the Crosstabs: Statistics dialog box and OK in the Crosstabs dialog box. The observed p-value of the chi-square test (Fig. 8), under Asymp. Sig. (2-sided), exceeds the threshold value (p > 0.05), so we are unable to reject the null hypothesis and conclude that there is no significant difference in the scores of the first factor (wide range of high-quality services) across ages.
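The same split-and-test procedure can be sketched outside SPSS (Python with SciPy; the factor scores and age groups below are randomly generated stand-ins, not the survey data): the saved factor scores are cut into two ntiles at the median, cross-tabulated against age, and tested with the chi-square test of independence.

```python
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(0)
scores = rng.standard_normal(200)        # saved factor scores (stand-in data)
age_group = rng.integers(0, 3, 200)      # three age bands (stand-in data)

# Two-ntile split at the median, as Rank Cases with Ntiles = 2 does in SPSS:
split = (scores > np.median(scores)).astype(int)   # 0 = important, 1 = no matter

# Crosstab: rows = age group, columns = split factor score
table = np.zeros((3, 2))
for a, s in zip(age_group, split):
    table[a, s] += 1

stat, p, df, expected = chi2_contingency(table)
print(f"chi2 = {stat:.2f}, df = {df}, p = {p:.3f}")
```

With independently generated scores and ages, the test typically fails to reject the null hypothesis, mirroring the conclusion drawn from the SPSS output above.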

22 EXAMPLE Fig. 8. The output of Chi-Square test

23 Thanks for your attention