Advanced Statistics Factor Analysis, I

Introduction
Factor analysis is a statistical technique concerned with the relation between:
(a) observed variables X_i (i = 1, ..., m), and
(b) factors ξ_j (j = 1, ..., k) [ξ is pronounced "ksi"].
- It is assumed that the number of ξs is smaller than the number of Xs.
- Xs are called indicators, measurements, or data.
- ξs are called constructs, unobserved variables, or latent variables.
Factor analysis describes variability among observed variables in terms of a potentially smaller number of factors. It is assumed that variations in a number of observed variables reflect variations in fewer, unobserved variables (factors).

Basic questions
Theory of measurement:
1. What causes what: ξ → Xs, or Xs → ξ?
2. Do we explore the relation of Xs to ξs, or do we test (try to confirm) our a priori assumption about this relation?
The numerical strength of the relationship between ξ and the Xs is expressed by coefficients λ (lambda). Error terms are denoted by δ (delta).

Models I and II
Model I:
ξ_1 = λ_1*X_1 + λ_2*X_2 + λ_3*X_3 + λ_4*X_4 + λ_5*X_5 + error
Model II:
X_1 = λ_1*ξ_1 + δ_1
X_2 = λ_2*ξ_1 + δ_2
X_3 = λ_3*ξ_1 + δ_3
X_4 = λ_4*ξ_1 + δ_4
X_5 = λ_5*ξ_1 + δ_5
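
Model II can be illustrated with a small simulation. The sketch below is not part of the original slides. It assumes a standard normal ξ_1, independent normal errors δ_i scaled so that each X_i has unit variance, and hypothetical λ values. Under those assumptions the model implies that the correlation between two indicators equals the product of their loadings.

```python
import random

def simulate_model_2(loadings, n_cases, seed=0):
    """Draw data from Model II: X_i = lambda_i * xi + delta_i.

    Assumptions of this sketch (not stated on the slide):
    xi ~ N(0, 1) and delta_i ~ N(0, 1 - lambda_i^2), so that every
    X_i has unit variance.
    """
    rng = random.Random(seed)
    columns = [[] for _ in loadings]
    for _ in range(n_cases):
        xi = rng.gauss(0.0, 1.0)  # latent factor value for this case
        for column, lam in zip(columns, loadings):
            delta = rng.gauss(0.0, (1.0 - lam ** 2) ** 0.5)  # unique error
            column.append(lam * xi + delta)
    return columns

def pearson(a, b):
    """Pearson correlation of two equally long lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

loadings = [0.9, 0.8, 0.7, 0.6, 0.5]  # hypothetical lambda_1 .. lambda_5
X = simulate_model_2(loadings, n_cases=20000)

observed_r12 = pearson(X[0], X[1])        # sample correlation of X_1 and X_2
implied_r12 = loadings[0] * loadings[1]   # value implied by the model
print(round(observed_r12, 3), round(implied_r12, 3))
```

The two printed values should be close, which is the sense in which a single factor "accounts for" the correlations among its indicators.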

PCA vs. FA
Model I corresponds to Principal Component Analysis (PCA).
Model II corresponds to Factor Analysis (FA).
The answer to the first question (what causes what: ξ → Xs, or Xs → ξ?) leads to this basic distinction between PCA and FA.
The second question (do we explore the relation of Xs to ξs, or do we test it?) concerns an important division within FA.

Exploratory factor analysis, EFA
We make no prior assumption about the relationship between Xs and ξs. In particular, we do not assume how many factors should be extracted or what their meaning would be.

Confirmatory factor analysis, CFA.

Principal component analysis, PCA
PCA seeks a linear combination of the variables that extracts the maximum variance from them. If there is more than one factor, PCA then removes this variance and seeks a second linear combination that explains the maximum proportion of the remaining variance, and so on. The result is a set of orthogonal (uncorrelated) factors.
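
The extract-then-remove procedure can be sketched in a few lines. The example below is an illustration only, not the method used by any particular package: it uses power iteration with deflation on a hypothetical 3 x 3 correlation matrix R, whereas statistical software typically calls a full eigendecomposition routine. The function names are made up for this sketch.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mat_vec(matrix, vector):
    return [dot(row, vector) for row in matrix]

def leading_component(matrix, iterations=500):
    """Power iteration: unit-length weights of the linear combination
    with maximum variance, and that variance (the largest eigenvalue)."""
    v = [float(i + 1) for i in range(len(matrix))]  # arbitrary start vector
    for _ in range(iterations):
        w = mat_vec(matrix, v)
        norm = dot(w, w) ** 0.5
        v = [x / norm for x in w]
    return dot(v, mat_vec(matrix, v)), v

def extract_components(corr, n_components):
    """Extract components one at a time, removing each one's variance
    from the matrix (deflation) before looking for the next."""
    size = len(corr)
    residual = [row[:] for row in corr]
    components = []
    for _ in range(n_components):
        variance, weights = leading_component(residual)
        components.append((variance, weights))
        residual = [[residual[i][j] - variance * weights[i] * weights[j]
                     for j in range(size)] for i in range(size)]
    return components

# Hypothetical correlation matrix of three standardized variables.
R = [[1.0, 0.6, 0.3],
     [0.6, 1.0, 0.2],
     [0.3, 0.2, 1.0]]

components = extract_components(R, 3)
variances = [variance for variance, _ in components]
print([round(v, 3) for v in variances])
print(round(dot(components[0][1], components[1][1]), 6))  # orthogonality check
```

Each extracted variance is no larger than the one before it, the variances add up to the number of variables (the total variance of standardized variables), and the weight vectors of different components are orthogonal.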

Exploratory factor analysis, EFA
EFA is used to uncover the underlying structure of a relatively large set of variables. The researcher's a priori assumption is that any indicator may be associated with any factor. This is the most common form of factor analysis.

Confirmatory factor analysis, CFA
CFA seeks to determine whether the number of factors and the loadings of the measured (indicator) variables on them conform to what is expected on the basis of pre-established theory. This is a hypothesis-testing approach. Observed variables are selected on the basis of prior theory; factor analysis is used to see whether they load as predicted on the expected number of factors. The researcher's a priori assumption is that each factor (with a specified meaning) is associated with a given subset of indicators.
In short: a minimum requirement of confirmatory factor analysis is that one hypothesizes beforehand the number of factors in the model and posits expectations about which variables will load on which factors.

History of FA
1. Charles Edward Spearman (1863–1945): pioneer of FA in connection with the study of human intelligence. His theory that disparate cognitive test scores reflect a single general factor (the g factor) led to the development of factor analysis. The basic idea: test scores reflect g and the rest (the two-factor model).
2. Raymond Bernard Cattell (1905–1998) continued the work on the two-factor model of IQ. He distinguished fluid from crystallized intelligence: abstract, adaptive intellectual abilities versus applied or crystallized knowledge. However, his improvements to FA come from studying personality.

T-F: Kinds of Research Questions, 1
1. Number of factors
Goal: to reduce a large number of variables Xs to a smaller number of factors ξs.
- Minimal number of variables for a meaningful factor: 3 for PC, 4 for FA.
Examples of one-factor solutions:
(a) PC: social status measured by years of schooling, job complexity, and earnings;
(b) FA: attitude toward the free-market economy measured by opinions that the government should not intervene (i) in the flow of capital, (ii and iii) in the labor market in terms of jobs and in terms of earnings, and (iv) in the prices of goods.

T-F: Kinds of Research Questions, 2
2. Nature of factors
Factors are interpreted by the variables that correlate with them. Naming a factor requires a theoretical argument.

T-F: Kinds of Research Questions, 3
3. Importance of solutions and factors
The importance of a solution is assessed by how much variance in the data set is accounted for by the factors.

T-F: Kinds of Research Questions, 4
4. Testing theory in factor analysis
In scientific work, even the simplest exploratory analysis should be guided by some theoretical consideration.

T-F: Kinds of Research Questions, 5
5. Estimating scores on factors
The ultimate goal of factor analysis is to create new variable(s), the factor(s) ξs. This means that each unit of observation receives some value on the new variable. Is the resulting distribution reasonable?

Terminology
Factor loadings (also called component loadings in PCA) are the correlations between the variables (rows) and the factors (columns). The squared factor loading is the proportion of variance in that indicator variable explained by the factor. To get the percent of variance in all the variables accounted for by a given factor, sum the squared factor loadings for that factor (column) and divide by the number of variables. (The number of variables equals the sum of their variances, because the variance of a standardized variable is 1.) This is the same as dividing the factor's eigenvalue by the number of variables.
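
As a worked illustration of this arithmetic, the sketch below uses a hypothetical loading matrix (four variables, two factors); the numbers are invented for the example and do not come from the slides.

```python
# Hypothetical loadings: rows are variables, columns are factors.
loadings = [
    [0.8, 0.1],
    [0.7, 0.2],
    [0.2, 0.9],
    [0.1, 0.6],
]
n_variables = len(loadings)
n_factors = len(loadings[0])

# Sum of squared loadings down each column (one value per factor).
sum_sq_loadings = [sum(row[j] ** 2 for row in loadings) for j in range(n_factors)]

# Share of the total variance of the standardized variables per factor.
share_of_variance = [s / n_variables for s in sum_sq_loadings]

for j in range(n_factors):
    print(f"factor {j + 1}: sum of squared loadings = {sum_sq_loadings[j]:.2f}, "
          f"share of variance = {share_of_variance[j]:.1%}")
```

Here the first factor has a sum of squared loadings of 1.18 and accounts for 29.5% of the variance in the four variables; the second has 1.22 and accounts for 30.5%.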

Interpreting factor loadings
Rule of thumb in confirmatory factor analysis: loadings should be .7 or higher to confirm that the independent variables identified a priori are represented by a particular factor.
Rationale: the .7 level corresponds to about half of the variance in the indicator being explained by the factor (.7 squared is .49).
- The .7 standard is high, and real-life data may well not meet it, which is why some researchers, particularly for exploratory purposes, use a lower level such as .4 for the central factor and .25 for other factors.
- Others call loadings above .6 "high" and those below .4 "low".
In any event, factor loadings must be interpreted in the light of theory, not by arbitrary cutoff levels.

Terminology
Communality: the sum of the squared factor loadings over all factors for a given variable (row). It is the variance in that variable accounted for by all the factors jointly, expressed as a proportion of the variable's variance; it may be interpreted as the reliability of the indicator.
Uniqueness of a variable: the variance of the variable minus its communality (for a standardized variable, 1 minus the communality).
Factor scores (also called component scores in PCA): the scores of each case (row) on each factor (column). To compute the factor score of a given case on a given factor, take the case's standardized score on each variable, multiply it by the variable's loading on that factor, and sum these products. Computing factor scores allows one to look for factor outliers. Factor scores may also be used as variables in subsequent modeling.
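
These three quantities can be computed by hand from a loading matrix. The sketch below uses hypothetical loadings and one hypothetical case, and it follows the simple weighted-sum description of factor scores given above; statistical packages may instead estimate scores with other weights (for example regression-based coefficients), so treat this as an illustration of the definition only.

```python
# Hypothetical loadings: rows are variables, columns are factors.
loadings = [
    [0.8, 0.1],
    [0.7, 0.2],
    [0.2, 0.9],
    [0.1, 0.6],
]

# Communality: sum of squared loadings across each row (one per variable).
communalities = [sum(l ** 2 for l in row) for row in loadings]

# Uniqueness of a standardized variable: 1 minus its communality.
uniquenesses = [1.0 - h for h in communalities]

def factor_scores(z_scores, loadings):
    """Weighted sum of a case's standardized scores, one value per factor."""
    n_factors = len(loadings[0])
    return [sum(z * row[j] for z, row in zip(z_scores, loadings))
            for j in range(n_factors)]

case = [1.0, 0.5, -1.0, 0.0]  # hypothetical standardized scores of one case
scores = factor_scores(case, loadings)

print([round(h, 2) for h in communalities])
print([round(u, 2) for u in uniquenesses])
print([round(s, 2) for s in scores])
```

For these numbers the communalities are 0.65, 0.53, 0.85 and 0.37, and the case scores 0.95 on the first factor and -0.70 on the second.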

Terminology
Eigenvalues (characteristic roots): the eigenvalue for a given factor measures the variance in all the variables that is accounted for by that factor. The ratio of eigenvalues is the ratio of explanatory importance of the factors with respect to the variables. If a factor has a low eigenvalue, it contributes little to the explanation of variance in the variables and may be ignored as redundant with more important factors. Eigenvalues measure the amount of variation in the total sample accounted for by each factor.
Extraction sums of squared loadings: initial eigenvalues and eigenvalues after extraction (listed by SPSS as "Extraction Sums of Squared Loadings") are the same for PCA extraction, but for other extraction methods the eigenvalues after extraction will be lower than their initial counterparts. SPSS also prints "Rotation Sums of Squared Loadings"; even for PCA, these eigenvalues will differ from the initial and extraction eigenvalues, though their total will be the same.
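
The statement about rotation can be checked numerically. The sketch below is an illustration with hypothetical two-factor loadings and an arbitrary 45-degree orthogonal rotation (not a varimax solution and not SPSS output): the per-factor sums of squared loadings change, but their total does not.

```python
import math

def column_sums_of_squares(loadings):
    n_factors = len(loadings[0])
    return [sum(row[j] ** 2 for row in loadings) for j in range(n_factors)]

def rotate(loadings, degrees):
    """Orthogonal rotation of a two-factor loading matrix by a given angle."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [[a * c + b * s, -a * s + b * c] for a, b in loadings]

# Hypothetical unrotated loadings: rows are variables, columns are factors.
unrotated = [
    [0.7, 0.5],
    [0.7, 0.4],
    [0.6, -0.5],
    [0.5, -0.6],
]
rotated = rotate(unrotated, 45)

before = column_sums_of_squares(unrotated)
after = column_sums_of_squares(rotated)

print([round(x, 3) for x in before], round(sum(before), 3))
print([round(x, 3) for x in after], round(sum(after), 3))
```

The rotation redistributes the explained variance between the two factors while leaving the total, and each variable's communality, unchanged.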