Theoretical rationale and empirical evaluation

Presentation transcript:

Optimizing principal components analysis (PCA) methodology for ERP component identification and measurement: Theoretical rationale and empirical evaluation

Jürgen Kayser and Craig E. Tenke
Department of Biopsychology, New York State Psychiatric Institute, New York
http://nypisys.cpmc.columbia.edu/psychophysiology/index.html

Introduction

Although PCA is widely used to determine "data-driven" ERP components, it is unclear whether and how specific methodological choices affect factor extraction. We report here the effects of three variations when applying temporal PCA (tPCA) to ERP data:

1) Type of association matrix (correlation / covariance)
2) Varimax rotation (scaled / unscaled)
3) Number of components extracted and rotated

Methods

Real ERP data, collected from healthy, right-handed adults in a visual half-field study (see Figure 4), were repeatedly submitted to tPCA using BMDP statistical software (4M; Dixon, 1992). Columns of the data matrix represented time (110 sample points from -100 to 1,000 ms), and rows (cases) consisted of subjects (16), conditions (4), and electrode sites (30), i.e., 16 x 4 x 30 = 1,920 cases.

tPCAs were performed for three extraction / rotation criteria:

1) Covariance matrix / Varimax rotation on raw data
2) Correlation matrix / Varimax rotation
3) Covariance matrix / Varimax rotation on standardized variables

Annotation on the poster: "This is the default in SPSS for the covariance matrix!"

For each extraction / rotation condition, 110 tPCAs were computed by systematically increasing the number of components to be extracted from 1 to 110 (= number of variables).

BMDP 4M instructions:

    VARIABLES = 128. CASES = 1920.
    / VARIABLE USE = 11 to 120.
    / FACTOR METHOD = PCA. NUMB = {Factors to be extracted}. {Extraction Method}
    / ROTATE METHOD = VMAX.
    /

    Extraction Method: FORM = COVA.
    Extraction Method: FORM = CORR. or FORM = COVA. LOAD = CORR.

Theoretical Rationale

The usefulness of the extracted factors can be evaluated using specific knowledge about the variance distribution of ERPs, which are characterized by the removal of baseline activity. The variance should be small for sample points before and shortly after stimulus onset (across and within cases), but large near the end of the recording epoch and at ERP component peaks. A covariance matrix preserves this information. It is lost with a correlation matrix, which assigns equal weight to each sample point and thereby allows small but systematic variations to form a factor. These considerations were evaluated and confirmed with simulated ERP data (see Figures 1-3).

Results

Limiting the number of components changed the morphology of some components considerably (see Figures 5B and 6B). However, more liberal or unlimited extraction criteria did not degrade or change high-variance components. Instead, their interpretability was improved by more distinctive time courses with narrow and unambiguous peaks (i.e., low secondary loadings; see Figures 5A and 6A).

Some physiologically meaningful ERP components that are small in amplitude and/or topographically localized (e.g., P1) were found to have PCA counterparts (e.g., Factor 130; see Figure 8A) that were lost with restricted solutions because of their low overall variance contributions.

Covariance-based factors had more distinct time courses (i.e., lower secondary loadings) than the corresponding correlation-based factors (Figures 5B and 6B), thereby allowing a better interpretation of their electrophysiological relevance.

Correlation-based solutions were likely to produce artificial factors that merely reflected small but systematic variations when the ERP waveform intersected the baseline (i.e., zero; cf. Factors -70, 10, and 50 in Figures 6A and 8B).

Scaling covariance-based PCA factors before rotation approximated correlation-based solutions, and ultimately yielded the same coefficients (factor loadings) when all components were rotated (see Figure 6A).

Conclusions

Factor extractions of the unscaled covariance matrix are preferable to correlation-based or scaled covariance-based PCA solutions.

For ERP data, there is no reason to restrict the number of factors to be extracted.

Figure 1. A) Invariant waveform template (128 sample points, 100 samples/sec, 200 ms baseline) used to generate two pseudo ERP data sets for 30 electrode 'sites' and 20 'subjects.' A 'topography' was introduced by scaling the template for selected sites with a factor of 0.5 (Fp1/2), 0.8 (F7/8, F3/4, Fz), or 1.2 (C3/4, Cz). For the second data set, random noise (range ±0.25 µV, uniform distribution) was added to each sample point. B) ERP 'group' average of the noise data set.

Figure 2. A) Pseudo ERPs at four electrode sites (Fp1, Fz, Cz, Pz). A constant, low-level voltage offset (-0.01 µV) was systematically applied to the pre-stimulus baseline (-200 to -50 ms) at every other electrode (e.g., see Pz in inset). B) Pseudo ERPs as in A), but with random noise added. Note that the low-level offset at Pz is lost (see inset).

Figure 3. Time course of factor loadings for the first PCA factors extracted from the covariance or correlation matrix for pseudo ERP data without (A) and with (B) noise. The covariance-based PCA extracted a component (factor 1) that accurately reflected the introduced variance shape for both data sets. The correlation-based PCA only produced a component (factor 1) that indicated the direction, but not the size, of variations from zero (i.e., from baseline). Similarly, the constant low-level offset was disproportionately reflected in another component (factor 2) for the noise-free data.

Figure 4. Grand average ERPs for 16 healthy adults for neutral and negative visual stimuli at 30 recording sites, averaged across hemifield of presentation (250 ms exposure in visual half-field paradigm). Data from Kayser et al. (2000).

Figure 5. Sequences of factor loadings of the covariance-based solutions (A) and overlaid loadings of Factor 3 for restricted (12) and liberal (20) extraction criteria (B).

Figure 6. Sequences of factor loadings of the correlation-based solutions (A) and overlaid loadings of Factor 3 for restricted (12) and liberal (20) extraction criteria (B).

Figure 7. Plots of eigenvalues (percentage of overall variance) for the first 10 factors extracted from the unrestricted (109) covariance or correlation solution.

Figure 8. Factor score topographies and overlaid factor loadings of the first 10 covariance-based (A) or correlation-based (B) PCA components extracted from the unrestricted (109) solution, identified by peak latencies of factor loadings.

[Figure panels not reproduced. Recoverable labels: panel titles "Pseudo ERP Data" and "Pseudo ERP Data + Noise"; axis labels "Factor Loadings", "Explained variance [%]", "Number of factors extracted", and "Factors to be extracted"; annotation "P1 factor".]

References

Dixon, W.J. (Ed.) (1992). BMDP Statistical Software Manual (Vol. 2). Berkeley, CA: University of California Press.

Kayser, J., Bruder, G.E., Tenke, C.E., Stewart, J.E., & Quitkin, F.M. (2000). Event-related potentials (ERPs) to hemifield presentations of emotional stimuli: differences between depressed patients and healthy adults in P3 amplitude and asymmetry. International Journal of Psychophysiology, 36(3), 211-236.
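
The contrast between covariance-based and correlation-based extraction described in the theoretical rationale can be tried out on simulated data. The following is a minimal NumPy sketch, not the BMDP 4M procedure used on the poster. The Gaussian template, the cyclic assignment of scale factors to 'sites', and all function names (simulate_pseudo_erps, pca_loadings, varimax, orient) are assumptions made for illustration; only the sampling layout (128 points, 100 samples/sec, 200 ms baseline), the scale factors, and the ±0.25 µV uniform noise follow the Figure 1 caption.

```python
import numpy as np


def simulate_pseudo_erps(n_subjects=20, n_sites=30, noise=0.25, seed=0):
    """Return (time_ms, cases x samples array) of pseudo ERPs.

    Assumed for illustration only: the Gaussian template shape and the
    cyclic assignment of scale factors to 'sites'.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(128) * 10.0 - 200.0  # 128 points at 100 samples/s, 200 ms baseline
    peaks = (np.exp(-0.5 * ((t - 400.0) / 80.0) ** 2)
             - 0.4 * np.exp(-0.5 * ((t - 150.0) / 30.0) ** 2))
    template = np.where(t >= 0, peaks, 0.0)  # flat pre-stimulus baseline
    site_scale = np.resize([0.5, 0.8, 1.0, 1.2], n_sites)  # crude 'topography'
    case_scale = np.repeat(site_scale, n_subjects)
    data = case_scale[:, None] * template[None, :]
    data = data + rng.uniform(-noise, noise, size=data.shape)
    return t, data


def pca_loadings(data, matrix="covariance", n_factors=None):
    """Unrotated PCA loadings (eigenvector * sqrt(eigenvalue)) and all eigenvalues."""
    if matrix == "covariance":
        assoc = np.cov(data, rowvar=False)
    elif matrix == "correlation":
        assoc = np.corrcoef(data, rowvar=False)
    else:
        raise ValueError("matrix must be 'covariance' or 'correlation'")
    eigvals, eigvecs = np.linalg.eigh(assoc)
    order = np.argsort(eigvals)[::-1]            # largest eigenvalue first
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    k = eigvals.size if n_factors is None else n_factors
    return eigvecs[:, :k] * np.sqrt(eigvals[:k]), eigvals


def varimax(loadings, gamma=1.0, n_iter=100, tol=1e-8):
    """Orthogonal Varimax rotation of a loading matrix (SVD-based iteration)."""
    p, k = loadings.shape
    rotation = np.eye(k)
    d = 0.0
    for _ in range(n_iter):
        d_old = d
        lr = loadings @ rotation
        target = lr ** 3 - (gamma / p) * lr * (lr ** 2).sum(axis=0)
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        d = s.sum()
        if d_old != 0 and d / d_old < 1 + tol:
            break
    return loadings @ rotation


def orient(loadings):
    """Flip each factor so that its largest absolute loading is positive."""
    idx = np.argmax(np.abs(loadings), axis=0)
    signs = np.sign(loadings[idx, np.arange(loadings.shape[1])])
    signs[signs == 0] = 1.0
    return loadings * signs


t, erps = simulate_pseudo_erps()
cov_load, cov_eig = pca_loadings(erps, "covariance", n_factors=3)
cor_load, cor_eig = pca_loadings(erps, "correlation", n_factors=3)
cov_rot = varimax(cov_load)  # unscaled Varimax: rotate raw covariance loadings
cor_rot = varimax(cor_load)

cov_f1 = orient(cov_load)[:, 0]
cor_f1 = orient(cor_load)[:, 0]

# Scaling ALL covariance loadings by each variable's SD reproduces the
# correlation matrix, so the full scaled solution differs from the
# correlation-based one only by an orthogonal rotation.
full_cov, _ = pca_loadings(erps, "covariance")
sd = erps.std(axis=0, ddof=1)
scaled_cov = full_cov / sd[:, None]
corr = np.corrcoef(erps, rowvar=False)

print("covariance factor 1 peaks at", t[np.argmax(cov_f1)], "ms")
print("max |baseline loading|, covariance: ", np.abs(cov_f1[t < 0]).max())
print("max |baseline loading|, correlation:", np.abs(cor_f1[t < 0]).max())
print("scaled loadings reproduce R:", np.allclose(scaled_cov @ scaled_cov.T, corr))
```

In this sketch the covariance-based loadings carry the units of the data, so low-variance baseline points receive small loadings, whereas correlation-based loadings are bounded by ±1 and weight every sample point equally. The final check offers one plausible reading of why scaled covariance loadings and correlation loadings converge once all components are rotated. It does not reproduce the poster's restricted-extraction comparisons.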