Factor Analysis.


Introduction
1. Factor Analysis is a set of techniques used for understanding variables by grouping them into “factors” consisting of similar variables.
2. It can also be used to confirm whether a hypothesized set of variables groups into a factor or not.
3. It is most useful when a large number of variables needs to be reduced to a smaller set of “factors” that contain most of the variance of the original variables.
4. Generally, Factor Analysis is done in two stages: Extraction of Factors, and Rotation of the solution obtained in the first stage.
5. Factor Analysis is best performed with interval or ratio-scaled variables.
6. Factor Analysis is based on the assumption that all variables correlate to some degree. Consequently, variables that share similar underlying dimensions should be highly correlated, and variables that measure dissimilar dimensions should yield low correlations.

Assumptions
Normality: important only to the extent that skewness or outliers affect the observed correlations, or if significance tests are performed (rare).
Independent sampling is required.
Variables should be linearly related to one another (in pairs).
Many of the variables should be correlated at a moderate level (test with Bartlett’s test of sphericity).

Application Areas/Example
The major purpose of factor analysis is the orderly simplification of a large number of inter-correlated measures into a few representative constructs or factors. In business, a common application of Factor Analysis is to understand the underlying motives of consumers who buy a product category or a brand. The worked example below clarifies the use of Factor Analysis in Marketing Research. In this example, we assume that a two-wheeler manufacturer is interested in determining which variables his potential customers think about when they consider his product. Let us assume that twenty two-wheeler owners were surveyed by this manufacturer. They were asked to indicate on a seven-point scale (1 = Completely Agree, 7 = Completely Disagree) their agreement or disagreement with a set of ten statements relating to their perceptions and some attributes of the two-wheelers. The objective of the Factor Analysis is to find underlying “factors”, fewer than 10 in number, that are linear combinations of some of the original 10 variables.

The research design for data collection can be stated as follows. Twenty 2-wheeler users were surveyed about their perceptions and image attributes of the vehicles they owned. Each respondent was asked ten questions, all answered on a scale of 1 to 7 (1 = completely agree, 7 = completely disagree).
1. I use a 2-wheeler because it is affordable.
2. It gives me a sense of freedom to own a 2-wheeler.
3. Low maintenance cost makes a 2-wheeler very economical in the long run.
4. A 2-wheeler is essentially a man’s vehicle.
5. I feel very powerful when I am on my 2-wheeler.
6. Some of my friends who don’t have their own vehicle are jealous of me.
7. I feel good whenever I see the ad for 2-wheeler on T.V., in a magazine or on a hoarding.
8. My vehicle gives me a comfortable ride.
9. I think 2-wheelers are a safe way to travel.
10. Three people should be legally allowed to travel on a 2-wheeler.


Analyze: Dimension Reduction: Factor. Variables: variables 001 to 010.
Descriptives: in the Statistics box, check Initial solution; in the Correlation Matrix box, check Coefficients, KMO & Bartlett’s test of sphericity, and Anti-image; Continue.
Extraction: for Method, select Principal axis factoring; select the Correlation matrix radio button; in the Extract box, select the Eigenvalues over 1 radio button; under Display, check both Unrotated factor solution and Scree plot; Continue.
Rotation: for Method, select the Varimax radio button; in the Display box, check Rotated solution; Continue.
Options: in the Coefficient Display Format box, select Sorted by size and Suppress absolute values less than .3; Continue.
OK.

An examination of the correlation matrix (see next slide) in the output indicates that a considerable number of correlations exceed 0.3, so the matrix is suitable for factoring. Note: if a variable does not correlate above 0.3 with any other variable in the correlation matrix, we can consider dropping it from the analysis.

Bartlett’s test of sphericity checks whether the variables are correlated enough to be factored. The null hypothesis is that the population correlation matrix is an identity matrix, and the test statistic is evaluated with a chi-square test. The hypotheses are:
H0: The factor analysis is not valid
H1: The factor analysis is valid
In this example the test is significant (chi-square = 164.098, df = 45, Sig. = .000), so H0 is rejected. The Kaiser-Meyer-Olkin measure of sampling adequacy is .618, which is greater than 0.5. This implies that the factor analysis for data reduction is effective. The measures of sampling adequacy for the individual variables are printed on the diagonal of the anti-image correlation matrix (see next slide). We can observe that most of the measures are well above the acceptable level of 0.5.

KMO and Bartlett's Test
Kaiser-Meyer-Olkin Measure of Sampling Adequacy          .618
Bartlett's Test of Sphericity     Approx. Chi-Square     164.098
                                  df                     45
                                  Sig.                   .000
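The two statistics in this table can be reproduced outside a statistics package. The following is a minimal sketch in Python, assuming NumPy and SciPy are available; the survey responses are not included in this transcript, so the data below are synthetic and the function names are illustrative only. It uses the standard textbook formulas: Bartlett's statistic is -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom, and the overall KMO compares squared correlations with squared partial correlations.

```python
import numpy as np
from scipy.stats import chi2


def bartlett_sphericity(R, n):
    """Test H0: the population correlation matrix is an identity matrix.

    R is a p x p sample correlation matrix, n the number of respondents.
    """
    p = R.shape[0]
    statistic = -(n - 1 - (2 * p + 5) / 6.0) * np.log(np.linalg.det(R))
    df = p * (p - 1) // 2
    p_value = chi2.sf(statistic, df)  # upper-tail chi-square probability
    return statistic, df, p_value


def kmo_overall(R):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    inv = np.linalg.inv(R)
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale  # partial correlations (off-diagonal entries)
    off_diag = ~np.eye(R.shape[0], dtype=bool)
    r2 = np.sum(R[off_diag] ** 2)
    q2 = np.sum(partial[off_diag] ** 2)
    return r2 / (r2 + q2)


# Synthetic stand-in for the survey: 20 respondents, 10 variables,
# generated from 3 latent factors plus noise (NOT the slide's data).
rng = np.random.default_rng(0)
latent = rng.normal(size=(20, 3))
weights = rng.normal(size=(3, 10))
data = latent @ weights + 0.5 * rng.normal(size=(20, 10))
R = np.corrcoef(data, rowvar=False)

stat, df, p_value = bartlett_sphericity(R, n=20)
kmo = kmo_overall(R)
print(f"chi-square={stat:.3f}, df={df}, p={p_value:.4f}, KMO={kmo:.3f}")
```

With ten variables the degrees of freedom are 10 x 9 / 2 = 45, which matches the df reported in the table above.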

We start factor analysis assuming that the number of factors equals the number of variables. Each standardized variable contributes one unit of variance, so the total variance to be explained equals the number of variables (here, 10). An eigenvalue, also called a characteristic root, is the amount of this variance accounted for by a factor. We then run the factor analysis model and decide the relevant number of factors. The proportion of variance in any one of the original variables that is captured by the extracted factors is known as its communality. These values are presented in the table of communalities. For example, over 95% of the variance in VAR00007 is accounted for.
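The relationship between loadings, communalities and eigenvalues can be shown with a few lines of arithmetic. This is a small sketch in Python with NumPy, assuming orthogonal factors; the loading matrix is hypothetical and chosen only to make the sums easy to check by hand. It is not taken from the example's output.

```python
import numpy as np

# Hypothetical loadings: 3 variables (rows) on 2 orthogonal factors (columns).
loadings = np.array([
    [0.9, 0.1],
    [0.8, 0.3],
    [0.2, 0.7],
])

# Communality of a variable: sum of its squared loadings across factors.
communalities = np.sum(loadings ** 2, axis=1)

# Variance accounted for by a factor: sum of squared loadings down its column.
factor_variance = np.sum(loadings ** 2, axis=0)

print("communalities:", communalities)      # 0.82, 0.73, 0.53
print("factor variance:", factor_variance)  # 1.49, 0.59
```

Both sets of sums add up to the same total (2.08 here), because they are row sums and column sums of the same table of squared loadings.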

The table of Total Variance Explained displays the total variance explained in three stages. At the initial stage, it shows the factors and their associated eigenvalues, the percentage of variance explained and the cumulative percentages. Going by the eigenvalues, we would expect three factors to be extracted, because three factors have eigenvalues greater than 1. If three factors are extracted, then about 72% of the variance is explained.

Eigenvalues: Only factors with eigenvalues of 1 or greater are considered to be significant; all factors with eigenvalues less than 1 are ignored. An eigenvalue, also called a characteristic root, is the amount of variance in the original variables that a factor accounts for. The rationale for the eigenvalue criterion is that a factor should explain at least as much variance as a single standardized variable, whose variance is 1, if that factor is to be retained for interpretation. An eigenvalue greater than 1 indicates that more common variance than unique variance is explained by that factor.
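The eigenvalue criterion described above can be sketched as follows, in Python with NumPy. The correlation matrix is hypothetical (four variables forming two correlated pairs), and the function name is illustrative. The sketch works with the initial eigenvalues of the correlation matrix, which always sum to the number of variables.

```python
import numpy as np


def eigenvalue_summary(R):
    """Initial eigenvalues, % of variance, cumulative %, and the number
    of factors with eigenvalue > 1 for a correlation matrix R."""
    eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]  # largest first
    pct = 100.0 * eigenvalues / eigenvalues.sum()
    cumulative = np.cumsum(pct)
    n_retained = int(np.sum(eigenvalues > 1.0))
    return eigenvalues, pct, cumulative, n_retained


# Hypothetical correlation matrix: variables 1-2 and 3-4 form two clusters.
R = np.array([
    [1.0, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.7],
    [0.1, 0.1, 0.7, 1.0],
])

eigenvalues, pct, cumulative, n_retained = eigenvalue_summary(R)
for i, (ev, c) in enumerate(zip(eigenvalues, cumulative), start=1):
    print(f"factor {i}: eigenvalue={ev:.3f}, cumulative %={c:.1f}")
print("factors retained:", n_retained)
```

For this matrix two eigenvalues exceed 1, and together they account for 87.5% of the total variance; the same reading applied to the example's output gives three factors and about 72%.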

The scree plot graphically displays the eigenvalues for each factor. It is a graph of the eigenvalues against all the factors. The graph is useful for determining how many factors to retain. The point of interest is where the curve starts to flatten. It can be seen that the curve begins to flatten between factors 3 and 4. Also note that from factor 4 onwards the eigenvalues are less than 1, so only three factors are retained.

The factor matrix is the matrix of loadings, or correlations, between the variables and the factors. Pure variables have loadings of 0.3 or greater on only one factor. Complex variables may have high loadings on more than one factor, and they make interpretation of the output difficult. Rotation may therefore be necessary. Varimax rotation, in which the factor axes are kept at right angles to each other, is the most frequently chosen. Ordinarily, rotation reduces the number of complex variables and improves interpretation. Factor 1 comprises 4 items, Factor 2 comprises 4 items and Factor 3 comprises 2 items. Some items have loadings greater than 0.3 on two or more factors. These items must be interpreted with caution, because simple structure is not apparent.

The final step in factor analysis involves determining how many factors to interpret and then assigning a label to these factors. The number of factors to be interpreted largely depends on the underlying purpose of the analysis. Looking at the rotated component matrix, we notice that variable nos. 4, 5, 6 and 7 have loadings of 0.97, 0.95, 0.93 and 0.97 on factor 1 (we look down the Factor 1 column, and look for high loadings close to 1.00). This suggests that Factor 1 is a combination of these four original variables. It also suggests a similar grouping. Therefore, there is no problem interpreting factor 1 as a combination of “a man’s vehicle” (statement in variable 4), “feeling of power” (variable 5), “others are jealous of me” (variable 6) and “feel good when I see my 2-wheeler ads” (variable 7).

At this point, the researcher’s task is to find a suitable phrase which captures the essence of the original variables which form the underlying concept or “factor”. In this case, factor 1 could be named “male ego”, or “machismo”, or “pride of ownership” or something similar. With the same mathematical output, interpretations of different researchers may differ.

Now we will attempt to interpret factor 2. We look in the column for Factor 2, and find that variables 8 and 9 have high loadings of 0.85 and 0.91, respectively. Variable 2 has a loading of -0.41. This indicates that factor 2 is a combination of these three variables. But if we look at the table of the unrotated factor matrix, a slightly different picture emerges. Here, variable 3 also has a high loading on factor 2, along with variables 8 and 9. However, if we look at the rotated factor matrix, variable 3 has a higher loading on factor 3 than on factor 2. It is left to the researcher to decide which interpretation to use, as there are no hard and fast rules. Assuming we decide to use the three variables, the related statements are “sense of freedom”, “comfort” and “safety” (from statements 2, 8 and 9). We may combine these variables into a factor called “utility” or “functional features” or any other similar word or phrase which captures the essence of these three statements / variables.

For interpreting Factor 3, we look at the column labelled factor 3 in the table and find that variables 1, 3 and 10 load highly on factor 3. According to the unrotated factor matrix, only variable 10 loads highly on factor 3. Supposing we stick to the rotated matrix, the combination of “affordability”, “low maintenance” and “cost saving by 3 people legally riding on a 2-wheeler” gives the impression that factor 3 could be “economy” or “low cost”. We have now completed the interpretation of the 3 factors with eigenvalues of 1 or more. We will now look at some additional issues which may be of importance in using factor analysis.
Factor Transformation Matrix: this is the matrix by which you multiply the unrotated factor matrix to get the rotated factor matrix.
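The role of the factor transformation matrix can be illustrated with a small varimax routine. This is a sketch in Python with NumPy of a commonly used SVD-based varimax iteration, without Kaiser normalization, so its numbers would not necessarily match a statistics package's output. The unrotated loadings are hypothetical, and the function name and parameters are assumptions made for illustration.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation.

    Returns (rotated_loadings, T) with rotated_loadings = loadings @ T,
    where T plays the role of the factor transformation matrix.
    """
    p, k = loadings.shape
    T = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        L = loadings @ T
        # Gradient-like target for the varimax criterion.
        target = L ** 3 - L @ np.diag(np.sum(L ** 2, axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        T = u @ vt  # nearest orthogonal matrix
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break  # no meaningful improvement
        criterion = new_criterion
    return loadings @ T, T


# Hypothetical unrotated loadings: 6 variables on 2 factors.
unrotated = np.array([
    [0.70, 0.50],
    [0.75, 0.45],
    [0.65, 0.55],
    [0.60, -0.55],
    [0.65, -0.50],
    [0.70, -0.45],
])

rotated, T = varimax(unrotated)
print(np.round(rotated, 2))
print(np.round(T, 2))
```

Because T is orthogonal, the rotation redistributes variance between the factors but leaves each variable's communality unchanged.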

Additional Issues in Interpreting Solutions
We must guard against the possibility that a variable may load highly on more than one factor. Strictly speaking, a variable should load close to 1.00 on one and only one factor, and load close to 0 on the other factors. If this is not the case, it indicates either that the respondents in the sample hold more than one opinion about the variable, or that the question/variable may be unclear in its phrasing. The other issue important in the practical use of factor analysis is the answer to the question “what should be considered a high loading and what is not a high loading?” Here, unfortunately, there is no clear-cut guideline, and many a time we must look at relative values in the factor matrix. Sometimes 0.6 may be treated as a high value, while sometimes 0.9 could be the cutoff for high values.
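Checking for variables that load on more than one factor is easy to automate once a cutoff has been chosen. The sketch below, in Python with NumPy, uses a hypothetical rotated loading matrix and a hypothetical helper name; the cutoff is passed as a parameter precisely because, as noted above, there is no clear-cut guideline for what counts as a high loading.

```python
import numpy as np


def factors_per_variable(loadings, cutoff=0.3):
    """For each variable (row), list the factors (1-based) on which the
    absolute loading is at least `cutoff`."""
    result = {}
    for i, row in enumerate(loadings, start=1):
        result[i] = [j for j, value in enumerate(row, start=1)
                     if abs(value) >= cutoff]
    return result


# Hypothetical rotated loadings: 3 variables on 3 factors.
rotated = np.array([
    [0.97, 0.05, 0.10],
    [0.12, 0.85, 0.20],
    [0.10, 0.45, 0.62],
])

assignment = factors_per_variable(rotated, cutoff=0.3)
complex_variables = [v for v, f in assignment.items() if len(f) > 1]
print(assignment)          # {1: [1], 2: [2], 3: [2, 3]}
print(complex_variables)   # [3]
```

Raising the cutoff to 0.6 leaves variable 3 on factor 3 only, which shows how strongly the choice of cutoff shapes the interpretation.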