A set of techniques for data reduction

Presentation transcript:

Factor Analysis: A set of techniques for data reduction
Rakesh Pandey, Professor of Psychology, B.H.U.

Factor Analysis
FA can be conceived of as a method for examining the interrelatedness of a set of variables in search of clusters, or subsets, of highly correlated variables.
Visual glimpse of Factor Analysis

15 balls, different colors

What is Factor Analysis (FA)?
Factor analysis is a set of analytic techniques that permits the reduction of a large number of interrelated variables to a smaller number of latent or hidden dimensions (factors) that can explain the maximum variance in the original set of variables.
Correlated variables are grouped together and separated from other variables with low or no correlation.
Variables are grouped into subsets in such a way that variables within a subset are highly correlated with one another, whereas variables in different subsets are relatively uncorrelated.
The latent variable underlying each subset or group of variables is referred to as a factor.
FA accomplishes this task by analysing the correlation matrix.

Correlation Matrix
Variables: Q1 Q2 Q3 Q4 Q5 Q6
Values: 1 .987 .801 .765 -.003 -.088 -.051 .044 .213 .968 -.190 -.111 .102 .789 .864
Q1-Q3: palpitation, dry mouth, sweating
Q4-Q6: worry, apprehension, nervousness

Example output and basic concepts
Columns: Test; Factor loadings I, II, III, IV; h square; s square; rji; eij
Test 1: .7 .0 .4 .3; h square .74; s square .12; rji .86; eij .14
Test 2: .6 .2 .40 .10 .50
Test 3: .8 .1 .5 .90 .05 .95
Test 4: ?
Eigen values

Terminology
Communality. The amount of variance a variable shares with all the other variables. This is the proportion of variance explained by the common factors.
Eigenvalue. Represents the total variance explained by each factor.
Percentage of variance. The percentage of the total variance attributed to each factor.
Factor loadings. Correlations between the variables and the factors.
Factor matrix. A factor matrix contains the factor loadings of all the variables on all the factors.
Factor scores. Composite scores estimated for each respondent on the derived factors.
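The terms above can be made concrete with a small calculation. The following is a minimal sketch, assuming an unrotated, orthogonal solution in which a variable's communality is the sum of its squared loadings across factors and a factor's eigenvalue is the sum of squared loadings down its column. The loading values are the four-variable, two-component illustration that appears later in the rotation slide; the variable names are hypothetical.

```python
import numpy as np

# Illustrative unrotated loadings (rows V1..V4, columns PC1, PC2).
loadings = np.array([
    [0.7,  0.5],
    [0.6,  0.6],
    [0.6, -0.5],
    [0.7, -0.6],
])

# Communality: variance of each variable explained by the retained factors.
communalities = (loadings ** 2).sum(axis=1)

# Eigenvalue: variance accounted for by each factor (column sum of squares).
eigenvalues = (loadings ** 2).sum(axis=0)

# Percentage of total variance; with standardized variables the total
# variance equals the number of variables.
pct_variance = 100 * eigenvalues / loadings.shape[0]

print(communalities, eigenvalues, pct_variance)
```

For V1 this gives .7 squared plus .5 squared, which is the same arithmetic behind the h square value of .74 reported for Test 1 in the example output.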

Conducting Factor Analysis
Checking the appropriateness of the data matrix:
Composition of the data matrix: all variables measured on the same sample; remove outliers; handle missing data.
Sample size adequacy: Comrey (1973) suggested that n=100 is poor, 200 is fair, 300 is good, 500 is very good and 1000 is excellent. A common rule is 5-10 subjects per variable, up to 300 respondents.
Independence of measures (component-total, common items etc.).
Constructing the correlation matrix and testing its appropriateness.
Extracting factors (choosing a method; the most common are PCA or FA).
Determining the number of factors.
Rotation of factors.
Interpretation of factors.
Validation of the factor structure.
Suggested readings

Appropriateness of the Correlation Matrix
If visual inspection reveals no substantial number of correlations greater than .30, then factor analysis is probably inappropriate.
The number of zero or near-zero correlations should be no more than 10 to 15%.
No variables should be correlated 1.0 with each other. Remove one of each problematic pair, or use their sum if appropriate.
Significance of the matrix: Bartlett's (1950) test of sphericity should be significant.
KMO measure of sampling adequacy (MSA): this index ranges from 0 to 1. KMO values of .5 to .7 are mediocre, .7 to .8 good, .8 to .9 great, and .9 and above marvelous.
Anti-image correlation matrix: the diagonal values should be greater than .50, as with KMO.
Multicollinearity: the determinant should be greater than .00001.
Very few residuals should be over .05.
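Several of these checks can be computed directly from a correlation matrix. The sketch below is illustrative only: the 6 x 6 matrix is hypothetical (two clusters of three variables, with correlations of .8 within a cluster and .1 between clusters), the sample size of 200 is assumed, and the formulas are the usual textbook ones (Bartlett's chi-square from the log determinant, KMO from the partial correlations obtained via the inverse matrix). It is not a reproduction of SPSS output.

```python
import numpy as np

# Hypothetical correlation matrix: two clusters of three variables.
R = np.full((6, 6), 0.1)
R[:3, :3] = 0.8
R[3:, 3:] = 0.8
np.fill_diagonal(R, 1.0)

n = 200                      # assumed sample size
p = R.shape[0]
off = ~np.eye(p, dtype=bool) # mask selecting off-diagonal cells

# Share of off-diagonal correlations larger than .30 in absolute value.
share_above_30 = np.mean(np.abs(R[off]) > 0.30)

# Multicollinearity check: the determinant should exceed .00001.
det = np.linalg.det(R)

# Bartlett's test of sphericity: chi-square statistic and degrees of freedom.
chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(det)
df = p * (p - 1) // 2
# For df = 15 the .05 critical value of chi-square is about 24.996.

# KMO: compare squared correlations with squared partial correlations.
R_inv = np.linalg.inv(R)
scale = np.sqrt(np.outer(np.diag(R_inv), np.diag(R_inv)))
partial = -R_inv / scale
r2 = np.where(off, R ** 2, 0.0)
a2 = np.where(off, partial ** 2, 0.0)
kmo = r2.sum() / (r2.sum() + a2.sum())

# Per-variable MSA, the quantity usually read off the diagonal of the
# anti-image correlation matrix.
msa = r2.sum(axis=0) / (r2.sum(axis=0) + a2.sum(axis=0))

print(share_above_30, det, chi2, df, kmo, msa)
```

A chi-square value far above the critical value indicates that the matrix differs from an identity matrix, which is what the slide means by a significant Bartlett's test.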

Methods of Factor Extraction
There are two main approaches, which differ in how they estimate communalities.
Principal components: simplest computationally. Assumes all variance is common variance (implausible) but gives similar results to more sophisticated methods. SPSS default.
Principal factor analysis: estimates communalities first.

PCA & FA
Principal components analysis: analyses total variance. A composite of the observed variables (a component) serves as a summary of those variables. Assumes no error in items. Unity is inserted on the diagonal of the matrix. Precise mathematical solutions are possible.
Factor (or common factor) analysis: analyses shared or common variance. Explains the relationships between observed variables in terms of latent variables or factors. Assumes error in items. Squared multiple correlations (SMC) are inserted on the diagonal of the matrix. Precise math is not possible; the solution is found by iteration.
Figure: example variables SES, education, income, IQ, reasoning, memory.
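The difference between the two approaches comes down to what sits on the diagonal of the correlation matrix. The following is a rough sketch under stated assumptions: the correlation matrix is hypothetical (two clusters of three variables), two factors are extracted, and the common-factor solution uses a simple iterated principal-axis scheme that starts from the SMCs. The function names are made up for this illustration and do not correspond to any package's API.

```python
import numpy as np

def top_loadings(M, k):
    """Loadings for the k largest eigenvalues of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(M)
    order = np.argsort(vals)[::-1][:k]
    return vecs[:, order] * np.sqrt(np.clip(vals[order], 0, None))

def pca_loadings(R, k):
    """Principal components: unities stay on the diagonal."""
    return top_loadings(R, k)

def principal_axis_loadings(R, k, max_iter=200, tol=1e-8):
    """Common factors: SMCs go on the diagonal, then iterate."""
    h2 = 1 - 1 / np.diag(np.linalg.inv(R))   # squared multiple correlations
    for _ in range(max_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)        # replace unities by communalities
        L = top_loadings(reduced, k)
        new_h2 = (L ** 2).sum(axis=1)
        converged = np.max(np.abs(new_h2 - h2)) < tol
        h2 = new_h2
        if converged:
            break
    return L, h2

# Hypothetical correlation matrix: .8 within clusters, .1 between clusters.
R = np.full((6, 6), 0.1)
R[:3, :3] = 0.8
R[3:, 3:] = 0.8
np.fill_diagonal(R, 1.0)

pca_L = pca_loadings(R, 2)
pca_h2 = (pca_L ** 2).sum(axis=1)
paf_L, paf_h2 = principal_axis_loadings(R, 2)

print(pca_h2)  # communalities when total variance is analysed
print(paf_h2)  # communalities when only common variance is analysed
```

In this example the component communalities come out higher than the common-factor ones, because PCA treats all of each item's variance, including error, as available for explanation.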

How many Factors?
The number of factors is initially unknown and needs to be specified by the investigator on the basis of preliminary analysis.
There is no 100% foolproof statistical test for the number of factors.
Several methods are available: latent root method; % variance; scree plot; Horn's parallel analysis; Velicer's MAP.

Scree Plot Example

Parallel Analysis
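Two of the methods listed under "How many Factors?" are easy to sketch: the latent root method (keep factors with eigenvalues above 1) and Horn's parallel analysis (keep factors whose eigenvalues exceed those obtained from random data of the same size). The example below is a minimal illustration on simulated data; the two-factor data set, the 95th-percentile threshold, the number of simulations and the function names are all assumptions made for this sketch.

```python
import numpy as np

def eigenvalues_of_corr(X):
    """Eigenvalues of the correlation matrix of X, largest first."""
    R = np.corrcoef(X, rowvar=False)
    return np.sort(np.linalg.eigvalsh(R))[::-1]

def parallel_analysis(X, n_sims=200, percentile=95, seed=0):
    """Count leading eigenvalues that beat random-data eigenvalues."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    observed = eigenvalues_of_corr(X)
    random_eigs = np.array([
        eigenvalues_of_corr(rng.standard_normal((n, p)))
        for _ in range(n_sims)
    ])
    threshold = np.percentile(random_eigs, percentile, axis=0)
    exceeds = observed > threshold
    # Stop at the first eigenvalue that fails to exceed its threshold.
    n_factors = p if exceeds.all() else int(np.argmin(exceeds))
    return n_factors, observed, threshold

# Simulated data: 500 cases, 6 variables driven by 2 uncorrelated factors.
rng = np.random.default_rng(42)
factors = rng.standard_normal((500, 2))
loadings = np.zeros((6, 2))
loadings[:3, 0] = 0.8
loadings[3:, 1] = 0.8
X = factors @ loadings.T + 0.6 * rng.standard_normal((500, 6))

n_parallel, observed, threshold = parallel_analysis(X)
n_kaiser = int(np.sum(observed > 1.0))   # latent root (eigenvalue > 1) rule

print(observed.round(2), threshold.round(2), n_parallel, n_kaiser)
```

Plotting `observed` against its index gives the scree plot, and overlaying `threshold` on the same axes gives the usual parallel analysis chart.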

Rotation
The initial solution is "un-rotated". In the un-rotated solution, most items have large loadings on more than one factor, several items may have negative loadings, and the grouping of variables may not be obvious.
Rotation of factors helps to address these problems.
The purpose is to obtain simple structure and positive manifold.

Simple structure .9 .8 .7

How rotation relates to "Simple Structure"
Factor rotations, which change the "viewing angle" of the factor space, have been the major approach to providing simple structure.
Structure is "simplified" if the factor vectors "spear" the variable clusters.

Unrotated     PC1    PC2
V1             .7     .5
V2             .6     .6
V3             .6    -.5
V4             .7    -.6

Rotated       PC1    PC2
V1             .7    -.1
V2             .7     .1
V3             .1     .5
V4             .2     .6

Figure: variables V1-V4 plotted against the original axes PC1 and PC2 and the rotated axes PC1' and PC2'.

Major Types of Rotation
Remember: extracted factors are orthogonal (uncorrelated).
Orthogonal rotation: the resulting factors are uncorrelated. More parsimonious and efficient, but less "natural".
Oblique rotation: the resulting factors are correlated. More "natural" and better "spearing", but more complicated.
Figure: orthogonal rotation (the angle between PC1' and PC2' is 90°) versus oblique rotation (the angle is less than 90°), each showing V1-V4 against the original and rotated axes.

Major Types of Orthogonal Rotation & their "tendencies"
Varimax: the most commonly used rotation and a common default. It "simplifies factors" by maximizing the variance of the loadings of the variables within a factor (it minimizes the number of variables with high loadings), that is, it maximizes column variance.
Quartimax: "simplifies variables" by maximizing the variance of the loadings of a variable across factors (it minimizes the number of factors a variable loads on), that is, it maximizes row variance.
Equimax: designed to "balance" the varimax and quartimax tendencies. It did not work very well: the two cannot be done simultaneously, and whichever is done first dominates the final structure.
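To show what "maximizing column variance" means in practice, here is a minimal varimax sketch. It uses the widely circulated SVD-based iteration, without Kaiser normalization, and is applied to the illustrative unrotated loadings from the earlier rotation slide. The function names, iteration limit and tolerance are assumptions for this example, and statistical packages may differ in normalization and convergence details.

```python
import numpy as np

def varimax(L, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation of a loading matrix (no normalization)."""
    p, k = L.shape
    rotation = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        rotated = L @ rotation
        # Gradient-like target whose SVD yields the next rotation matrix.
        target = rotated ** 3 - rotated @ np.diag((rotated ** 2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(L.T @ target)
        rotation = u @ vt
        current = s.sum()
        if current < previous * (1 + tol):
            break
        previous = current
    return L @ rotation, rotation

def varimax_criterion(L):
    """Sum over factors of the variance of squared loadings (column variance)."""
    return float(np.sum(np.var(L ** 2, axis=0)))

# Illustrative unrotated loadings (rows V1..V4, columns PC1, PC2).
unrotated = np.array([
    [0.7,  0.5],
    [0.6,  0.6],
    [0.6, -0.5],
    [0.7, -0.6],
])

rotated, rotation = varimax(unrotated)

print(rotated.round(2))
print(varimax_criterion(unrotated), varimax_criterion(rotated))
```

Because the rotation matrix is orthogonal, each variable's communality is unchanged; only the distribution of its loadings across the factors shifts.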

Major Types of Oblique Rotation & their "tendencies"
Promax: computes the best orthogonal solution and then "relaxes" the orthogonality constraints to better "spear" the variable clusters with factor vectors (giving simpler structure).
Direct Oblimin and others.

Purpose or Application of FA
The main applications of factor analytic techniques are:
to detect structure in the relationships between variables, that is, to classify variables and identify the latent constructs underlying them;
to reduce the number of variables and remove redundant, unclear and irrelevant variables;
test/scale construction and evaluation of the psychometric quality of a measure;
as a precursor to subsequent multivariate (MV) techniques: latent path modelling, dealing with multicollinearity, and improving the reliability of aggregate or summated scales.

Thank You

Demo for various applications of FA
Use the car_sales data of SPSS.
For reduction: vehicle type through fuel efficiency.
For exploring structure: select Long distance last month through Wireless last month, and Multiple lines through Electronic billing.