Unit 6b: Principal Components Analysis © Andrew Ho, Harvard Graduate School of Education. Unit 6b – Slide 1

1 Unit 6b: Principal Components Analysis © Andrew Ho, Harvard Graduate School of Education. Unit 6b – Slide 1 http://xkcd.com/419/

2 Course Roadmap: Unit 6b. Today's Topic Area: Weighted Composites; Biplots: Visualizing Variables as Vectors; Principal Components Analysis. © Andrew Ho, Harvard Graduate School of Education. Unit 6b – Slide 2

The roadmap branches from Multiple Regression Analysis (MRA):
 Do your residuals meet the required assumptions? Test for residual normality; use influence statistics to detect atypical datapoints.
 If your residuals are not independent, replace OLS by GLS regression analysis, use individual growth modeling, or specify a multi-level model.
 If time is a predictor, you need discrete-time survival analysis.
 If your outcome is categorical, you need to use binomial logistic regression analysis (dichotomous outcome) or multinomial logistic regression analysis (polytomous outcome).
 If your outcome vs. predictor relationship is non-linear, transform the outcome or predictor, or use non-linear regression analysis.
 If you have more predictors than you can deal with, create taxonomies of fitted models and compare them, or form composites of the indicators of any common construct: conduct a Principal Components Analysis, use Factor Analysis (EFA or CFA?), or use Cluster Analysis.

3 Multiple Indicators of a Common Construct © Andrew Ho, Harvard Graduate School of Education. Unit 6b – Slide 3

Here's a dataset containing teachers' responses to what the investigators believed were multiple indicators/predictors of a single underlying construct of Teacher Job Satisfaction. The data are described in TSUCCESS_info.pdf.

Dataset: TSUCCESS.txt
Overview: Responses of a national sample of teachers to six questions about job satisfaction.
Source: Administrator and Teacher Survey of the High School and Beyond (HS&B) dataset, 1984 administration, National Center for Education Statistics (NCES). All NCES datasets are also available free from the EdPubs on-line supermarket.
Sample Size: 5,269 teachers (4,955 with complete data).
More Info: HS&B was established to study the educational, vocational, and personal development of young people, beginning in their elementary or high school years and following them over time as they began to take on adult responsibilities. The HS&B survey included two cohorts: (a) the 1980 senior class, and (b) the 1980 sophomore class. Both cohorts were surveyed every two years through 1986, and the 1980 sophomore class was also surveyed again in 1992.

4 Review: From OLS to Orthogonal (Perpendicular) Residuals © Andrew Ho, Harvard Graduate School of Education. Unit 6b – Slide 4

The OLS criterion minimizes the sum of vertical squared residuals. Other definitions of "best fit" are possible:
 Vertical squared residuals (OLS)
 Horizontal squared residuals (X on Y)
 Orthogonal residuals (PCA!)
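The contrast between the OLS and orthogonal criteria can be sketched numerically. This is an illustrative sketch on simulated two-variable data (not the TSUCCESS data): the OLS slope minimizes vertical squared residuals, while the principal-axis slope, taken from the first eigenvector of the covariance matrix, minimizes perpendicular squared residuals.

```python
import numpy as np

# Simulated illustrative data; the true slope of 0.6 is an assumption of this sketch.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 0.6 * x + rng.normal(scale=0.8, size=500)

# OLS slope (minimizes vertical squared residuals): cov(x, y) / var(x)
ols_slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Orthogonal (principal-axis) slope: direction of the eigenvector of the
# 2x2 covariance matrix with the largest eigenvalue (minimizes
# perpendicular squared residuals)
cov = np.cov(x, y)
vals, vecs = np.linalg.eigh(cov)   # eigh returns eigenvalues in ascending order
v = vecs[:, -1]                    # eigenvector for the largest eigenvalue
orth_slope = v[1] / v[0]

print(ols_slope, orth_slope)
```

For positively correlated data like these, the orthogonal slope is at least as steep as the OLS slope, because OLS attenuates the slope toward zero.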

5 Visualizing Correlations © Andrew Ho, Harvard Graduate School of Education. Unit 6b – Slide 5

The sample correlation between variables X1 and X2 is .5548. We can visualize that here with a scatterplot, as usual, adding in the OLS regression line (minimizing vertical residuals) and the orthogonal (principal axis) regression line (minimizing orthogonal residuals).

6 Visualizing Variables as Vectors © Andrew Ho, Harvard Graduate School of Education. Unit 6b – Slide 6

Angles of bisection (below the diagonal) and bivariate correlations (above the diagonal), with each indicator's standard deviation:

Indicator (St. Dev.)                              X1    X2    X3    X4    X5    X6
X1: Have high standards of teaching (1.09)         .   .555  .161  .213  .253  .192
X2: Continually learning on the job (1.25)        56°    .   .166  .231  .270  .222
X3: Successful in educating students (0.67)       81°   80°    .   .299  .356  .433
X4: Waste of time to do best as teacher (1.67)    78°   77°   73°    .   .448  .399
X5: Look forward to working at school (1.33)      75°   74°   69°   63°    .   .553
X6: Time satisfied with job (0.57)                79°   77°   64°   67°   56°    .

Regard the correlation between two indicators as the cosine of the angle between them, and regard the standard deviation of each indicator as its "length." Putting it all together: the correlation is visualized not in the actual observations but in the angle between the vectors representing the variables. The smaller the angle (the more parallel the vectors), the more correlated the variables.
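The angle-correlation correspondence in the table can be checked directly: the arccosine of a tabled correlation recovers the tabled angle, and vice versa. A quick sketch using the X1-X2 pair from the table:

```python
import numpy as np

# Correlation of X1 and X2 from the table; its arccosine is the angle
# between the two variable-vectors.
r_12 = 0.5548
angle = float(np.degrees(np.arccos(r_12)))
print(round(angle))                    # -> 56, the tabled angle of bisection

# Conversely, the cosine of 56 degrees recovers the correlation to rounding:
print(float(np.cos(np.radians(56))))   # ~0.559
```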

7 John Willett's "Potato Technology" © Andrew Ho, Harvard Graduate School of Education. Unit 6b – Slide 7

(This slide repeats the Slide 6 table of indicator standard deviations, angles of bisection, and bivariate correlations for X1 through X6.)

8 The biplot Command in Stata © Andrew Ho, Harvard Graduate School of Education. Unit 6b – Slide 8

"Biplot" because it can plot both observations (rows) and variables (columns), though we use it here for the latter only. It is the shadow of the multidimensional representation of the variable-vectors onto the 2D space defined by the first two "principal components." The lines here are directly proportional to the standard deviations of the variables, and the cosine of the angle between them is their correlation (for example, the 56° angle between X1 and X2 corresponds to their correlation of .5548). This is a shadow of a 3D representation onto a 2D plane: the X3 vector is shorter in part because of its smaller standard deviation, but more so because it projects into the slide.

9 Biplots for >3 Variables, Unstandardized and Standardized © Andrew Ho, Harvard Graduate School of Education. Unit 6b – Slide 9

We can visualize two variables in 2D and three variables in 3D. More than three variables can project into 3D space (with potato technology), and the shadow of the potato technology onto the 2D screen makes a biplot. Also, we can standardize: one panel shows unstandardized variables, the other standardized variables.

10 Unweighted Composites © Andrew Ho, Harvard Graduate School of Education. Unit 6b – Slide 10

If a composite is a simple sum (or straight average) of indicators, then coefficient alpha is relevant. If a composite is a simple sum of standardized indicators, then standardized alpha is relevant:

Cronbach Coefficient Alpha (Standardized): 0.735530

For an additive composite of "standardized" indicators: first, each indicator is standardized to a mean of 0 and a standard deviation of 1; then, the standardized indicator scores are summed:

XSTD = XSTD1 + XSTD2 + XSTD3 + XSTD4 + XSTD5 + XSTD6

But what biplots and potato technology show is that we might be able to do better in forming an optimal composite, by allowing particular indicators to be weighted.
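Standardized alpha depends only on the number of items k and the average inter-item correlation, via alpha = k·rbar / (1 + (k − 1)·rbar). A sketch using the 15 pairwise correlations from the Slide 6 table reproduces the reported 0.735530:

```python
import numpy as np

# The 15 pairwise correlations among X1..X6, from the Slide 6 table.
r = [0.555, 0.161, 0.213, 0.253, 0.192,   # X1 with X2..X6
     0.166, 0.231, 0.270, 0.222,          # X2 with X3..X6
     0.299, 0.356, 0.433,                 # X3 with X4..X6
     0.448, 0.399,                        # X4 with X5, X6
     0.553]                               # X5 with X6

k = 6
rbar = float(np.mean(r))
alpha_std = k * rbar / (1 + (k - 1) * rbar)
print(round(alpha_std, 4))                # -> 0.7355, matching 0.735530
```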

11 Weighted Composites © Andrew Ho, Harvard Graduate School of Education. Unit 6b – Slide 11

More generally, a weighted linear composite can be formed by weighting and adding standardized indicators together to form a composite measure of teacher job satisfaction. By choosing weights that differ from unity and differ from each other, we can create an infinite number of potential composites. We often use "normed" weights, such that the sum of the squared weights equals 1.

Among all such weighted linear composites, are there some that are "optimal"? How would we define such "optimal" composites?
 Does it make sense, for instance, to seek a composite with maximum variance, given the original standardized indicators?
 Perhaps we can also choose weights that take account of the differing inter-correlations among the indicators, and "pull" the composite "closer" to the more highly correlated indicators?
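The mechanics of a normed weighted composite can be sketched briefly. The data here are simulated stand-ins for the six standardized indicators, and the raw weights are arbitrary choices for illustration; the point is only the norming step and the matrix product that forms the composite scores:

```python
import numpy as np

# Simulated stand-in for six standardized indicators (illustration only).
rng = np.random.default_rng(1)
Z = rng.normal(size=(1000, 6))

# Arbitrary raw weights, then "normed" so the squared weights sum to 1.
w = np.array([1.0, 1.0, 1.0, 2.0, 2.0, 1.0])
w = w / np.sqrt(np.sum(w**2))

composite = Z @ w                 # one weighted composite score per "teacher"

print(round(float(np.sum(w**2)), 10))   # -> 1.0
print(composite.shape)                  # -> (1000,)
```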

12 *-------------------------------------------------------------------------------- * Carry-out a principal components analysis of teacher satisfaction *-------------------------------------------------------------------------------- * Conduct a principal components analysis of the six indicators. * By default, PCA performs a listwise deletion of cases with missing values, * and standardizes the indicators before compositing: pca X1-X6, means * Scree Plot showing the eigenvalues screeplot, ylabel(0(.5)2.5) * Output the composite scores on the first two principal components: predict PC_1 PC_2, score *-------------------------------------------------------------------------------- * Inspect properties of composite scores on the first two principal components. *-------------------------------------------------------------------------------- * List out the principal component scores for the first 35 teachers: list PC_1 PC_2 in 1/35 * Estimate univariate descriptive statistics for the composite scores on the * first two principal components: tabstat PC_1 PC_2, stat(n mean sd) columns(statistics) * Estimate the bivariate correlation between the composite scores on the first * two principal components: pwcorr PC_1 PC_2, sig obs *-------------------------------------------------------------------------------- * Carry-out a principal components analysis of teacher satisfaction *-------------------------------------------------------------------------------- * Conduct a principal components analysis of the six indicators. 
* By default, PCA performs a listwise deletion of cases with missing values, * and standardizes the indicators before compositing: pca X1-X6, means * Scree Plot showing the eigenvalues screeplot, ylabel(0(.5)2.5) * Output the composite scores on the first two principal components: predict PC_1 PC_2, score *-------------------------------------------------------------------------------- * Inspect properties of composite scores on the first two principal components. *-------------------------------------------------------------------------------- * List out the principal component scores for the first 35 teachers: list PC_1 PC_2 in 1/35 * Estimate univariate descriptive statistics for the composite scores on the * first two principal components: tabstat PC_1 PC_2, stat(n mean sd) columns(statistics) * Estimate the bivariate correlation between the composite scores on the first * two principal components: pwcorr PC_1 PC_2, sig obs STATA routine pca implements Principal Components Analysis (PCA):  By choosing sets of weights, PCA seeks out optimal weighted linear composites of the original (often standardized) indicators.  These composites are called the “Principal Components.”  The First Principal Component is that weighted linear composite that has maximum variance, among all possible composites with the same indicator-indicator correlations. STATA routine pca implements Principal Components Analysis (PCA):  By choosing sets of weights, PCA seeks out optimal weighted linear composites of the original (often standardized) indicators.  These composites are called the “Principal Components.”  The First Principal Component is that weighted linear composite that has maximum variance, among all possible composites with the same indicator-indicator correlations. The pca Command After completing a PCA, we can save individual “scores” on the first, second, etc., components (principal components). 
Provide new variable names for composites that you want to keep (here, PC_ 1, PC_2, etc.) and use them in subsequent analysis. A “scree plot” can tell us how much variance each component accounts for, and how many components might be necessary to account for sufficient variance. © Andrew Ho, Harvard Graduate School of EducationUnit 6b – Slide 12

13 Standardized Variables and Total Variance © Andrew Ho, Harvard Graduate School of Education. Unit 6b – Slide 13

Stata provides important pieces of output, including univariate descriptive statistics:

Summary statistics of the variables
-------------------------------------------------------------
Variable |     Mean   Std. Dev.   Min   Max
---------+---------------------------------------------------
      X1 | 4.329364    1.088205     1     6
      X2 | 3.873663    1.242735     1     6
      X3 | 3.154995    .6692635     1     4
      X4 | 4.227043    1.665968     1     6
      X5 | 4.42442     1.328885     1     6
      X6 | 2.836529    .5714207     1     4
-------------------------------------------------------------

As a first step, by default, PCA estimates the sample mean and standard deviation of the indicators and standardizes each of them. As before, standardizing variables before a PCA suggests that differences in variance across indicators are not interpretable. If they are interpretable, and if indicators share a common scale that transcends standard deviation units, then use the covariance option.
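The standardization step is just a z-score: subtract the indicator's sample mean and divide by its sample standard deviation. A sketch using X1's reported mean and SD from the output above (the raw response of 6 is a hypothetical teacher at the top of X1's scale):

```python
# X1's mean and SD, from the summary-statistics output.
mean_x1, sd_x1 = 4.329364, 1.088205

x1_raw = 6                               # hypothetical top-of-scale response
z1 = (x1_raw - mean_x1) / sd_x1          # standardized score: mean 0, SD 1 units
print(round(z1, 3))                      # -> 1.535
```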

14 The First Principal Component © Andrew Ho, Harvard Graduate School of Education. Unit 6b – Slide 14

Principal components (eigenvectors)
------------------------------------------------------------------------
Variable |   Comp1    Comp2    Comp3    Comp4    Comp5    Comp6
---------+--------------------------------------------------------------
      X1 |  0.3472   0.6182   0.0896   0.0264   0.6261   0.3108
      X2 |  0.3617   0.5950   0.0543  -0.0217  -0.6685  -0.2548
      X3 |  0.3778  -0.3021   0.7555   0.4028   0.0503  -0.1746
      X4 |  0.4144  -0.1807  -0.5972   0.6510  -0.0493   0.1129
      X5 |  0.4727  -0.2067  -0.2418  -0.4501   0.3022  -0.6176
      X6 |  0.4591  -0.3117   0.0558  -0.4584  -0.2548   0.6433
------------------------------------------------------------------------

This "ideal" composite is called the First Principal Component. In terms of the standardized indicators, it is approximately:

PC_1 = .35·XSTD1 + .36·XSTD2 + .38·XSTD3 + .41·XSTD4 + .47·XSTD5 + .46·XSTD6

The Comp1 column is referred to as the "First Eigenvector." It contains the weights that PCA has determined will provide the particular linear composite of the six original standardized indicators that has maximum possible variance, given the inter-correlations among the original indicators. Here is the list of optimal weights we were seeking! Note that the sum of squared weights in any component equals 1; these are "normalized" eigenvectors.
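The "normalized" claim is easy to verify from the printed output: the squared weights of the first eigenvector sum to 1, up to rounding in the printout.

```python
import numpy as np

# The first eigenvector (Comp1 column), copied from the pca output.
e1 = np.array([0.3472, 0.3617, 0.3778, 0.4144, 0.4727, 0.4591])

# Normalized eigenvector: squared weights sum to 1 (to printed precision).
print(round(float(np.sum(e1**2)), 3))    # -> 1.0
```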

15 Interpreting the First Principal Component © Andrew Ho, Harvard Graduate School of Education. Unit 6b – Slide 15

(The eigenvector table from Slide 14 is repeated here; the First Principal Component weights are .35, .36, .38, .41, .47, .46.)

But what is the First Principal Component actually measuring? Notice that each original (standardized) indicator is approximately equally weighted in the First Principal Component:
 This suggests that the first principal component we have obtained in this example is a largely equally weighted summary of the indicator variables that we have.
 Notice that teachers who score highly on the First Principal Component:
   - Have high standards of teaching performance.
   - Feel that they are continually learning on the job.
   - Believe that they are successful in educating students.
   - Feel that it is not a waste of time to be a teacher.
   - Look forward to working at school.
   - Are always satisfied on the job.
 Given this, shall we define the First Principal Component as an overall Index of Teacher Enthusiasm? Sure, why not.
 Be cautious of the "naming fallacy" or the "reification fallacy."

16 Eigenvalues and the Proportion of Variance © Andrew Ho, Harvard Graduate School of Education. Unit 6b – Slide 16

--------------------------------------------------------------
Component | Eigenvalue  Difference  Proportion  Cumulative
----------+---------------------------------------------------
    Comp1 |    2.606       1.394      0.4343      0.4343
    Comp2 |    1.212       0.499      0.2019      0.6363
    Comp3 |    0.713       0.118      0.1188      0.7551
    Comp4 |    0.595       0.147      0.0992      0.8543
    Comp5 |    0.448       0.021      0.0746      0.9289
    Comp6 |    0.427         .        0.0711      1.0000
--------------------------------------------------------------

The first column contains the "Eigenvalues." But how successful a composite of the original indicators is the First Principal Component? The eigenvalue for the First Principal Component provides its variance:
 In this example, where the original indicator-indicator correlations were low, the best that PCA has been able to do is to form an "optimal" composite that contains 2.61 units of the original 6 units of standardized variance.
 That's 43.43% of the original standardized variance. Is this sufficient to call the scale unidimensional?
 This implies that 3.39 units of the original standardized variance remain!
 Maybe there are other interesting composites still to be found that will sweep up the remaining variance. Perhaps we can form other substantively interesting composites from these same six indicators by choosing different sets of weights: maybe there are other "dimensions" of information still hidden within the data?
 We can inspect the other "principal components" that PCA has formed in these data.

The "scree plot" helps us tell whether we should include an additional principal component to account for greater variance. We sometimes use the "Rule of 1" (keep components with eigenvalues above 1; here, two principal components), but I prefer basing the decision on the "elbow" from visual inspection. The two criteria are consistent in this case.
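The bookkeeping in the eigenvalue table can be checked directly: for six standardized indicators the eigenvalues partition a total variance of 6 (one unit per indicator), and each component's proportion is its eigenvalue over that total.

```python
import numpy as np

# Eigenvalues from the pca output.
eig = np.array([2.606, 1.212, 0.713, 0.595, 0.448, 0.427])

# For standardized indicators, the eigenvalues sum to the number of
# indicators (up to rounding in the printout).
print(round(float(eig.sum()), 2))         # -> 6.0

# Proportion of variance for the First Principal Component.
print(round(float(eig[0] / eig.sum()), 4))  # -> 0.4343
```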

17 Scree Plots and Biplots © Andrew Ho, Harvard Graduate School of Education. Unit 6b – Slide 17

(The eigenvector table from Slide 14 is repeated here.)

Teachers who score high on the second component:
 Have high standards of teaching performance.
 Feel that they are continually learning on the job.
But also:
 Believe they are not successful in educating students.
 Feel that it is a waste of time to be a teacher.
 Don't look forward to working at school.
 Are never satisfied on the job.

 If the first principal component is teacher enthusiasm, the second might be teacher frustration.
 Note that, by construction, all principal components are uncorrelated with all other principal components.
 Does it make sense that enthusiasm and frustration should be uncorrelated?

18 We can save the first and second principal components. © Andrew Ho, Harvard Graduate School of Education. Unit 6b – Slide 18

 Obs      PC_1      PC_2
   1  -0.67402   1.64567
   2  -3.70420   1.25497
   3  -2.80870   1.46971
   4         .         .
   5  -0.72933   0.16173
   6         .         .
   7   0.68828  -0.66211
   8  -1.64624   1.96727
   9   1.84142   1.66606
  10  -0.11813  -0.20596
  11  -3.70653   0.85507
  12   2.11717   0.84820
  13  -0.66466  -0.47258
  14  -1.09068  -0.99362
  15  -0.89365   0.42894
  16   1.61503   0.55299
  17  -1.95180  -2.33192
  18  -1.40406  -0.25084
  19   1.18572  -0.87904
  20  -2.05647  -1.88495
  21  -0.36685  -0.09749
  22  -2.64324  -0.21207
  23   2.21446   1.10305
  24  -2.55062  -0.75701
  25  -0.03442  -2.97280
(cases deleted)

Teacher #1's scores on PC #1 and PC #2 come from applying the two weight vectors (approximately .35, .36, .38, .41, .47, .46 for PC_1 and .62, .60, -.30, -.18, -.21, -.31 for PC_2) to that teacher's standardized indicator values. Also, notice that any teacher missing on any indicator is also missing on every composite.

Pearson Correlation Coefficients
          PC_1      PC_2
PC_1   1.00000   0.00000
PC_2   0.00000   1.00000

As promised, scores on the 1st and 2nd principal components are uncorrelated: the estimated bivariate correlation between teachers' scores on the two components is exactly zero! Principal components are uncorrelated by construction.
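The zero correlation between component scores is not a coincidence of these data; it follows from scoring with eigenvectors of the correlation matrix. A sketch on simulated three-variable data (a stand-in for the six TSUCCESS items) shows the same exact-zero pattern up to floating point:

```python
import numpy as np

# Simulated stand-in data with some induced correlation (illustration only).
rng = np.random.default_rng(2)
X = rng.normal(size=(500, 3))
X[:, 1] += 0.7 * X[:, 0]

# Standardize, then score on all principal components via the eigenvectors
# of the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
vals, vecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
scores = Z @ vecs

# Correlations among component scores: identity matrix up to floating point.
r = np.corrcoef(scores, rowvar=False)
print(float(np.max(np.abs(r - np.eye(3)))))
```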

19 A Framework for Principal Components Analysis © Andrew Ho, Harvard Graduate School of Education. Unit 6b – Slide 19

 Know your variables. Read your items. Take the test.
   Think carefully about what the scale intends to measure, and how scores will be interpreted and used.
 Univariate and bivariate statistics and visualizations.
   Particularly correlation matrices and pairwise scatterplots. Transform variables to achieve linearity if necessary.
 To standardize or not to standardize?
   Do item scales share equal-interval meaning across items? Are differences in variances meaningful across items? Is it useful to set the variances of items equal?
 Reliability analyses (alpha) for unweighted composites.
   Provides a reliability statistic, Cronbach's alpha, that estimates the correlation of scores across replications of a measurement procedure (draws new items; views items as random).
 Principal Components Analysis (pca) for weighted composites.
   Decides how to best weight the items you have in order to maximize the variance as a proportion of total score variance (keeps the same items; views items as fixed).
 Use eigenvalues and scree plots to check dimensionality.
   Select the number of principal components to represent the data based on statistical and substantive criteria.
 Confirm that the weights on the variables are interpretable and consistent with the theories that motivated the design of the instrument. Beware the naming/reification fallacy.
 Save the principal components for subsequent analysis.

20 Appendix: Some Useful Eigenvalue/Eigenvector Math © Andrew Ho, Harvard Graduate School of Education. Unit 6b – Slide 20

Eigenvectors                                PC_1    PC_2
X1  Have high standards of teaching       0.3472  0.6182
X2  Continually learning on job           0.3617  0.5950
X3  Successful in educating students      0.3778  -.3021
X4  Waste of time to do best as teacher   0.4144  -.1807
X5  Look forward to working at school     0.4727  -.2067
X6  Time satisfied with job               0.4591  -.3117

The estimated correlation between any indicator and any component can be found by multiplying the corresponding component loading by the square root of the eigenvalue. This is sometimes useful in interpretation. (For unstandardized variables, you must next divide by the standard deviation of the variable to obtain the correlation.)

Correlation of X1 and PC_1 = 0.347 × √2.61 = 0.347 × 1.62 = 0.561
Correlation of X1 and PC_2 = 0.618 × √1.212 = 0.618 × 1.101 = 0.680
Correlation of X1 and PC_3 = … 

The proportion of total variation accounted for by any principal component is its eigenvalue over the sum of all eigenvalues. Eigenvectors are "normalized" such that the sum of squared weights, or loadings, is 1. Note that the dot product of any two eigenvectors is 0, reaffirming their orthogonality.
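Both appendix facts can be checked from the printed values: the loading-times-root-eigenvalue product reproduces the indicator-component correlations, and the dot product of the first two eigenvectors is zero up to rounding.

```python
import numpy as np

# First two eigenvectors, copied from the pca output.
e1 = np.array([0.3472, 0.3617, 0.3778, 0.4144, 0.4727, 0.4591])
e2 = np.array([0.6182, 0.5950, -0.3021, -0.1807, -0.2067, -0.3117])

# Correlation of standardized X1 with PC_1: loading times sqrt(eigenvalue).
print(round(float(e1[0] * np.sqrt(2.606)), 3))   # ~0.56, the slide's 0.561
# Correlation of standardized X1 with PC_2.
print(round(float(e2[0] * np.sqrt(1.212)), 3))   # ~0.68, the slide's 0.680

# Orthogonality: the dot product of any two eigenvectors is (near) zero.
print(round(float(np.dot(e1, e2)), 3))           # -> 0.0
```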

