
1 FACTOR ANALYSIS Kazimieras Pukėnas

2 INTRODUCTION Factor analysis is used to uncover the latent (not directly observed) structure (dimensions) of a set of variables. It reduces the attribute space from a larger number of variables to a smaller number of factors, which most often do not have a direct quantitative measure. Factor analysis can be used for any of the following purposes: To reduce a large number of variables to a smaller number of factors for modeling purposes; To validate a scale or index by demonstrating that its constituent items load on the same factor, and to drop proposed scale items which cross-load on more than one factor; To create a set of orthogonal factors to be treated as uncorrelated variables, as one approach to handling multicollinearity in such procedures as multiple regression.

3 The most common factor analysis model links the k observed variables X_1, X_2, …, X_k with m common (general) latent factors F_1, …, F_m and a specific (unique) latent factor e_i, and is described by the system of equations X_i = a_i1*F_1 + a_i2*F_2 + … + a_im*F_m + e_i, i = 1, …, k, where m < k, i.e. the common factors are fewer in number than the observed variables. The multipliers a_ij are interpreted as factor weights (loadings). Given the observed values X_1, …, X_k, the task of factor analysis is to find estimates of the common (general) factors, the factor weights, and the specific (unique) variances (the part of the variance that cannot be explained by the common factors).
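The model above can be illustrated by simulating data from it. This is a minimal sketch (not part of the original slides) with hypothetical loadings: k = 4 observed variables generated from m = 2 common factors plus specific noise. Variables sharing a factor come out highly correlated, which is exactly the structure factor analysis tries to recover.

```python
import numpy as np

# Hypothetical two-factor model: X_i = a_i1*F_1 + a_i2*F_2 + e_i
rng = np.random.default_rng(0)
n = 1000                                  # observations
A = np.array([[0.8, 0.1],                 # factor weights a_ij (illustrative)
              [0.7, 0.2],
              [0.1, 0.9],
              [0.2, 0.6]])
F = rng.standard_normal((n, 2))           # common factors F_1, F_2
e = 0.3 * rng.standard_normal((n, 4))     # specific (unique) part e_i
X = F @ A.T + e                           # observed variables X_1..X_4

# Variables loading on the same factor are strongly correlated,
# variables loading on different factors are not:
R = np.corrcoef(X, rowvar=False)
print(round(R[0, 1], 2), round(R[0, 2], 2))
```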

4 ASSUMPTIONS In factor analysis, multicollinearity is necessary because variables must be highly associated with some of the other variables so they will load onto factors. But not all of the variables should be highly correlated with each other (except when only one factor is present). It is best for factor analysis to have groups of variables highly associated with each other (which will result in those variables loading as a factor) and not correlated at all with other groups of variables. An indicator of the strength of the relationship among variables is Bartlett's test of sphericity, which tests the null hypothesis that the variables in the population correlation matrix are uncorrelated. If Bartlett's test gives p < 0.05, factor analysis can proceed for the data. The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy is an index comparing the magnitudes of the observed correlation coefficients to the magnitudes of the partial correlation coefficients. Large values of the KMO measure (KMO > 0.7, or at minimum KMO > 0.6) indicate that a factor analysis of the variables is a good idea.
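For readers outside SPSS, both checks can be computed directly. The sketch below (an assumption-laden illustration, not SPSS internals) implements Bartlett's sphericity statistic from its standard formula and the KMO index from the partial correlation matrix, applied to simulated correlated data.

```python
import numpy as np
from scipy.stats import chi2

def bartlett_sphericity(X):
    # H0: the population correlation matrix is an identity matrix
    # (variables uncorrelated). Small p-values support factoring.
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    stat = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    return stat, chi2.sf(stat, p * (p - 1) / 2)

def kmo(X):
    # Compares observed correlations with partial correlations;
    # values above 0.6-0.7 suggest the data are factorable.
    R = np.corrcoef(X, rowvar=False)
    inv = np.linalg.inv(R)
    d = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    P = -inv / d                          # partial correlation matrix
    np.fill_diagonal(R, 0)
    np.fill_diagonal(P, 0)
    return (R ** 2).sum() / ((R ** 2).sum() + (P ** 2).sum())

# Simulated correlated data for illustration:
rng = np.random.default_rng(0)
F = rng.standard_normal((300, 2))
X = F @ np.array([[0.8, 0.1], [0.7, 0.2],
                  [0.1, 0.9], [0.2, 0.8]]).T \
    + 0.3 * rng.standard_normal((300, 4))
stat, p_value = bartlett_sphericity(X)
```

With this clearly correlated data, Bartlett's p-value comes out far below 0.05, so factoring is justified.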

5 EXTRACT THE FACTORS The next step in the factor analysis is to extract the factors. The most popular extraction method is Principal Component Analysis. There are other competing, and sometimes preferable, extraction methods (such as Maximum Likelihood). These analyses determine how well the factors explain the variation. The goal is to identify the linear combinations of variables that account for the greatest amount of common variance.
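Principal-component extraction has a compact linear-algebra form: eigen-decompose the correlation matrix and scale each eigenvector by the square root of its eigenvalue to get the unrotated loadings. A minimal sketch, with simulated data and hypothetical names:

```python
import numpy as np

def extract_pca(X, n_factors):
    """Principal-component extraction: eigen-decompose the correlation
    matrix; unrotated loadings = eigenvector * sqrt(eigenvalue)."""
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)      # returned in ascending order
    order = np.argsort(eigvals)[::-1]         # re-sort: largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
    return eigvals, loadings

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
X[:, 1] = X[:, 0] + 0.3 * rng.standard_normal(200)   # induce a shared factor
eigvals, loadings = extract_pca(X, 2)
```

Note that for a correlation matrix the eigenvalues always sum to the number of variables, which is why "percent of variance explained" is eigenvalue divided by the variable count.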

6 FACTOR ROTATION The initial solutions are hard to interpret because variables tend to load on multiple factors. The so-called rotation procedure makes the output more understandable and is usually necessary to facilitate the interpretation of factors. The sum of the eigenvalues is not affected by rotation, but rotation alters the eigenvalues (and percent of variance explained) of particular factors and changes the factor loadings.
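The Varimax rotation used later in the example is also available outside SPSS; the sketch below uses scikit-learn's `FactorAnalysis` with its `rotation="varimax"` option on simulated two-factor data (an illustrative stand-in, not the SPSS procedure itself).

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Two clean factors, two variables each (hypothetical loadings):
rng = np.random.default_rng(1)
F = rng.standard_normal((500, 2))
A = np.array([[0.9, 0.0], [0.8, 0.1], [0.1, 0.9], [0.0, 0.8]])
X = F @ A.T + 0.3 * rng.standard_normal((500, 4))

# rotation=None would give the unrotated solution; "varimax"
# redistributes variance so each variable loads highly on as few
# factors as possible, easing interpretation.
fa = FactorAnalysis(n_components=2, rotation="varimax").fit(X)
loadings = fa.components_.T               # rows = variables, cols = factors
```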

7 EXAMPLE Wellness complex visitors were asked to identify the elements (such as range of services, quality of services, etc.) that have a powerful effect on the choice of a specific complex. Answers are on a five-point scale: 1 - very important, 5 - it does not matter. To obtain a factor analysis: Open the file with the data to be analyzed. From the menus choose: Analyze → Data Reduction → Factor… Select the variables for the factor analysis in the Factor Analysis dialog box (Fig. 1). Categorical data (such as religion or country of origin) are not suitable for factor analysis. Click Descriptives.

8 EXAMPLE Fig. 1. Factor Analysis dialog box

9 EXAMPLE Specify KMO and Bartlett's test of sphericity in the Factor Analysis: Descriptives dialog box (Fig. 2) and click Continue. Click Rotation… and select the Varimax method of factor rotation in the Factor Analysis: Rotation dialog box (Fig. 2). Leave the default Rotated solution option checked. Click Continue. Click Scores... and select the option Save as variables in the Factor Analysis: Factor Scores dialog box (Fig. 3). This creates one new variable for each factor in the final solution. Leave the default Regression method. Click Continue.

10 EXAMPLE Fig. 2. Factor Analysis: Descriptives and Factor Analysis: Rotation dialog boxes

11 EXAMPLE Click Options..., check Suppress small coefficients in the Factor Analysis: Options dialog box (Fig. 3), and specify an absolute value below 0.4. Click Continue. Click OK in the Factor Analysis dialog box.

12 EXAMPLE Fig. 3. Factor Analysis: Scores and Factor Analysis: Options dialog boxes

13 EXAMPLE The SPSS output KMO and Bartlett's Test (Fig. 4) shows the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy and Bartlett's test of sphericity. For these data KMO = 0.640, which falls into the acceptable range, and Bartlett's test is highly significant (p < 0.001); therefore factor analysis is appropriate. The SPSS output Total Variance Explained (Fig. 4) lists the eigenvalues associated with each linear component (factor) before extraction, after extraction, and after rotation. SPSS extracts all factors with eigenvalues greater than 1, as displayed in the columns labeled Extraction Sums of Squared Loadings. The eigenvalue associated with each factor represents the variance explained by that particular linear component, and SPSS also displays the eigenvalue in terms of the percentage of variance explained (so factor 1 explains 23.384% of total variance). The final part of the table (labeled Rotation Sums of Squared Loadings) displays the eigenvalues of the factors after rotation. Rotation has the effect of optimizing the factor structure, and one consequence for these data is that the relative importance of the four factors is equalized.
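The arithmetic behind the Total Variance Explained table is simple enough to sketch. With a correlation matrix, total variance equals the number of variables p, so each factor's share is eigenvalue / p * 100, and the eigenvalue > 1 rule (the Kaiser criterion SPSS applies by default) decides how many factors to keep. The eigenvalues below are illustrative, not the ones from the slides.

```python
import numpy as np

# Illustrative eigenvalues for p = 7 variables (they sum to p = 7):
eigvals = np.array([2.1, 1.6, 1.3, 1.1, 0.4, 0.3, 0.2])
p = eigvals.size

pct = eigvals / p * 100                   # percent of total variance
cumulative = np.cumsum(pct)               # cumulative percent
n_retained = int((eigvals > 1).sum())     # Kaiser criterion: eigenvalue > 1
print(n_retained, round(pct[0], 1))       # factors kept, share of factor 1
```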

14 EXAMPLE Fig. 4. The main outputs of Factor Analysis

15 EXAMPLE The next step is to look at the content of the questions that load onto the same factor in the Rotated Component Matrix table (Fig. 5) to try to identify common themes. The questions that load highly on factor 1 all seem to relate to high-quality services; therefore we might label this factor wide range of high-quality services. The questions that load highly on factor 2 mainly relate to the complex's location and working hours; therefore we might label this factor convenient location and opening hours. The questions that load highly on factor 3 all seem to relate to promotion; therefore we might label this factor good promotion. Finally, the questions that load highly on factor 4 mainly relate to convenient parking and modern premises; therefore we might label this factor modern infrastructure of the center. This analysis suggests that the initial questionnaire is, in reality, composed of four subscales.

16 EXAMPLE Fig. 5. The main outputs of Factor Analysis

17 EXAMPLE After completing a factor analysis, the factors saved as variables can be used in other analyses, such as testing the homogeneity of factors across the groups under investigation. For this purpose the factor scores are often converted into categorical variables by splitting them into quantiles, and a Chi-square test is performed for the null hypothesis that the factors are homogeneous. Continuing the example, we will verify the hypothesis that the first factor (wide range of high-quality services) is equally important for visitors of all ages. From the menus choose: Transform → Rank Cases... Select the desired factor in the Rank Cases dialog box (Fig. 6) and specify Assign Rank 1 to Largest value.

18 EXAMPLE Fig. 6. Rank Cases and Rank Cases: Types dialog boxes

19 EXAMPLE Click Rank Types…, check Ntiles in the Rank Cases: Types dialog box (Fig. 6), and specify the value 2. Uncheck Rank and click Continue. Click OK in the Rank Cases dialog box. The categorized version of the first factor (NFAC1_1), with two outcomes (1 - important and 2 - no matter), will appear in the data file. From the menus choose: Analyze → Descriptive Statistics → Crosstabs... Select the categorical variable NFAC1_1 for Column(s) and the categorical variable Age for Row(s) in the Crosstabs dialog box (Fig. 7).

20 EXAMPLE Fig. 7. Crosstabs dialog box

21 EXAMPLE Click Statistics... in the Crosstabs dialog box and check Chi-square in the Crosstabs: Statistics dialog box. Click Continue in the Crosstabs: Statistics dialog box and OK in the Crosstabs dialog box. The observed p-value of the Chi-square test, shown under Asymp. Sig. (2-sided) (Fig. 8), exceeds the threshold value (p > 0.05), so we are unable to reject the null hypothesis and conclude that there is no significant difference in the scores of the first factor (wide range of high-quality services) across age groups.
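The same ntile-split-plus-crosstab workflow can be sketched outside SPSS. This illustration uses simulated factor scores and age groups (not the wellness-complex data): a median split mirrors Rank Cases with Ntiles = 2, and `scipy.stats.chi2_contingency` replaces the Crosstabs Chi-square option.

```python
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(2)
scores = rng.standard_normal(200)         # saved factor scores (simulated)
age = rng.integers(0, 3, 200)             # age group 0/1/2 (simulated)

# Split factor scores into two ntiles (1 = important, 2 = no matter),
# mirroring SPSS Rank Cases with Ntiles = 2:
ntile = np.where(scores >= np.median(scores), 1, 2)

# Crosstab (rows = age groups, columns = ntiles) and Chi-square test:
table = np.array([[np.sum((age == g) & (ntile == t)) for t in (1, 2)]
                  for g in range(3)])
stat, p, dof, expected = chi2_contingency(table)
```

A p-value above 0.05 here would, as on the slide, leave the null hypothesis of homogeneity across age groups standing.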

22 EXAMPLE Fig. 8. The output of the Chi-square test

23 Thanks for your attention

