1 All slides © S. J. Luck, except as indicated in the notes sections of individual slides. Slides may be used for nonprofit educational purposes if this copyright notice is included, except as noted. Permission must be obtained from the copyright holder(s) for any other use.
The ERP Boot Camp: Plotting, Measurement, & Statistics

2 Plotting - The Wrong Way
Figure annotations: Where is time zero? What is the prestimulus baseline period? Is this voltage > 0? Where exactly is 300 ms? Is rare P2 > frequent P2? (scale label: 100 ms)

3 Plotting - The Wrong Way
Nagamoto et al. (1989, Biol Psychiatry)

4 Plotting - The Right Way
Figure annotations:
-Electrode site labeled
-Baseline shows 0 µV
-Time zero marked
-Voltage calibration aligned with waveform
-Time ticks on baseline for every waveform
-Calibration size and polarity indicated
-Legend in figure
-To-be-compared waveforms overlaid

5 Plotting - Scalp Distribution
Wang et al. (2005, Brain Topography)

6 Plotting - Basic Principles
You must show the waveforms (SPR rule)
-You need to show enough sites so that experts can figure out the underlying component structure
-I often show just one site for a cognitive audience when the component can be isolated (N2pc or LRP)
-In most cases, don't show more than 6-8 sites
A prestimulus baseline must be shown
-Usually 200 ms (minimum of 100 ms for most experiments)
-If you don't see a baseline, the study is probably C.R.A.P. (Carelessly Reviewed Awful Publication)
Overlap the key waveforms
In most cases, show both the original waveforms and difference waves

7 Measuring ERP Amplitudes
Basic options
-Peak amplitude (or local peak amplitude)
-Mean/area amplitude
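The three options above can be sketched in a few lines of numpy. This is an illustrative sketch, not code from the slides: the window boundaries, the neighbor count used in the local-peak test, and the waveform values are all assumptions.

```python
import numpy as np

def simple_peak(wave, times, t0, t1):
    """Largest voltage in the measurement window (simple peak amplitude)."""
    mask = (times >= t0) & (times <= t1)
    return wave[mask].max()

def local_peak(wave, times, t0, t1, nbrs=3):
    """Largest point that also exceeds the mean of its `nbrs` neighbors
    on each side (local peak amplitude), so the rising edge of an
    adjacent component is not mistaken for a peak."""
    idx = np.where((times >= t0) & (times <= t1))[0]
    candidates = []
    for i in idx:
        left = wave[max(i - nbrs, 0):i]
        right = wave[i + 1:i + 1 + nbrs]
        if len(left) and len(right) and wave[i] > left.mean() and wave[i] > right.mean():
            candidates.append(wave[i])
    return max(candidates) if candidates else None

def mean_amplitude(wave, times, t0, t1):
    """Average voltage across the measurement window (mean amplitude)."""
    mask = (times >= t0) & (times <= t1)
    return wave[mask].mean()
```

For a clean waveform all three agree on where the component is; the slides that follow explain why they diverge once noise and overlap enter.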

8 Why Mean is Better than Peak
Rule #1: "Peaks and components are not the same thing. There is nothing special about the point at which the voltage reaches a local maximum."
-Mean amplitude better characterizes a component as being extended over time
-Peak amplitude encourages a misleading view of components
Peak may find the rising edge of an adjacent component
-Can be solved by a local peak measure
Peak is sensitive to high-frequency noise
-Can be mitigated by a low-pass filter
Time of peak depends on overlapping components
-The peak may be nowhere near the center of the experimental effect

9 Why Mean is Better than Peak
Peak amplitude is biased by the noise level
-More noise means greater peak amplitude
-Mean amplitude is unbiased by the noise level
Example
-Do 1000 simulation runs at two noise levels
-Take mean amplitude and peak amplitude on each run
-The average of the 1000 mean amplitudes will be approximately the same for high-noise and low-noise data
-The average of the 1000 peak amplitudes will be greater for high-noise data than for low-noise data
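The simulation described above is easy to reproduce. The sketch below uses a hypothetical 5-µV Gaussian component, a 200-400 ms window, and two arbitrary noise levels; all of these values are illustrative assumptions, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
times = np.linspace(0, 600, 601)                   # 1-ms samples
signal = 5 * np.exp(-((times - 300) / 50) ** 2)    # hypothetical 5-µV component
win = (times >= 200) & (times <= 400)              # measurement window

def run(noise_sd, n_runs=1000):
    """Average peak and mean amplitude across simulated noisy runs."""
    peaks, means = [], []
    for _ in range(n_runs):
        wave = signal + rng.normal(0, noise_sd, times.size)
        peaks.append(wave[win].max())
        means.append(wave[win].mean())
    return np.mean(peaks), np.mean(means)

peak_lo, mean_lo = run(noise_sd=0.5)
peak_hi, mean_hi = run(noise_sd=2.0)
# Average peak amplitude grows with the noise level;
# average mean amplitude stays essentially the same.
print(peak_lo, peak_hi)
print(mean_lo, mean_hi)
```

The asymmetry arises because the max operator only ever moves upward when noise is added, whereas positive and negative noise cancel in the mean.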

10 Peak Amplitude and Noise
Figure: clean waveform vs. waveform + 60-Hz noise

11 Why Mean is Better than Peak
Peak can occur at different time points at different electrodes
-A real effect cannot do this
A narrower measurement window can be used for mean amplitude
Mean amplitude is linear; peak amplitude is not
-Mean of peak amplitudes ≠ peak amplitude of grand average
-Mean of mean amplitudes = mean amplitude of grand average
-The same applies to single-trial data vs. the averaged waveform
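The linearity point can be verified directly: averaging mean amplitudes across subjects gives exactly the mean amplitude of the grand average, while averaging peak amplitudes does not. The simulated subjects, component, and noise level below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
times = np.linspace(0, 600, 601)
signal = 5 * np.exp(-((times - 300) / 50) ** 2)    # hypothetical component
win = (times >= 200) & (times <= 400)

# Hypothetical single-subject waveforms: same component + independent noise
subjects = np.array([signal + rng.normal(0, 1.5, times.size) for _ in range(20)])
grand = subjects.mean(axis=0)

mean_of_means = subjects[:, win].mean(axis=1).mean()
mean_of_grand = grand[win].mean()
mean_of_peaks = subjects[:, win].max(axis=1).mean()
peak_of_grand = grand[win].max()

print(mean_of_means, mean_of_grand)   # identical: mean amplitude is linear
print(mean_of_peaks, peak_of_grand)   # mean of peaks exceeds peak of grand average
```

The same inequality (average of maxima ≥ maximum of the average) is why peak measures on single trials and on averaged waveforms disagree.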

12 Shortcomings of Mean Amplitude
You will still pick up overlapping components
-A narrower window reduces this, but increases the noise level
Different measurement windows might be appropriate for different subjects
-This could be a source of measurement noise
-Patients and controls might have different latencies, leading to a systematic distortion of the results
-This is a case where peak might be better
How do you pick the measurement window?
-Using the time course of an effect biases you to find a significant effect
-Reality: People often look at the data first
-Alternative 1: Select the window based on prior results
-Alternative 2: Use a "functional localizer" condition to find an "ROI"
-The same issues arise when choosing electrode sites

13 The Baseline
Baseline correction is equivalent to subtracting the baseline voltage from your amplitude measures
-Any noise in the baseline contributes to the amplitude measure
-Short baselines are noisy
-Usual recommendation: 200 ms
-Need to look at 200+ ms to evaluate overlap and preparatory activity
The baseline can be a significant confound
-Baselines may differ across conditions due to overlap or preparatory activity, and this activity may fade over time
-A poststimulus amplitude measure may therefore vary across conditions due to differential baselines
Fading prestimulus differences can also distort scalp distributions
-The distribution of the prestimulus period contributes to the measured poststimulus distribution
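Baseline correction itself is just a subtraction, which is why baseline noise propagates directly into every later amplitude measure. A minimal sketch (the 200-ms prestimulus window follows the slide's usual recommendation; the function name is mine):

```python
import numpy as np

def baseline_correct(wave, times, t0=-200, t1=0):
    """Subtract the mean prestimulus voltage from the whole epoch.

    Because this is equivalent to subtracting the baseline voltage
    from any poststimulus amplitude measure, noise or differential
    activity in the baseline shifts the entire measured waveform.
    """
    base = wave[(times >= t0) & (times < t1)].mean()
    return wave - base
```

A positive noise blip in the baseline raises `base` and therefore shifts the whole corrected waveform downward, which is exactly the distortion illustrated on the next slide.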

14 Baseline Distortion: Noise
Entire waveform shifted down (negative) because of a positive noise blip in the baseline

15 Baseline Distortion: Overlap

16 Baseline Distortion: Differential Overlap

17 Measuring Midpoint Latency
Basic options
-Peak latency (or local peak latency)
-50% area latency

18 Better Example of 50% Area Latency
Figure: rare-minus-frequent difference wave

19 Shortcomings of Peak Latency
Peak may find the rising edge of an adjacent component
-Can be solved by a local peak measure
Peak is sensitive to high-frequency noise
-Can be mitigated by a low-pass filter
Time of peak depends on overlapping components
Terrible for broad components with no real peak
Biased by the noise level
-More noise pulls the measured peak latency toward the center of the measurement window
Not linear
Difficult to relate to reaction time

20 50% Area Latency
Uses the entire waveform in determining latency
Robust to noise
Not biased by the noise level
Works fine for broad waveforms with no real peak
Linear
Easier to relate to RT
Shortcomings
-Measurement window must include the entire component
-Strongly influenced by overlapping components
-Requires monophasic waveforms
-Works best on big components and/or difference waves
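A fractional-area latency measure is straightforward to sketch: accumulate area within the window and find the time point that splits it at the desired fraction. This is an illustrative implementation (function name and the choice to count only positive area are mine); with `fraction=0.2` it also approximates the 20% area onset measure mentioned later in the deck.

```python
import numpy as np

def fractional_area_latency(wave, times, t0, t1, fraction=0.5):
    """Time at which `fraction` of the (positive) area under the
    waveform within the window has been accumulated.
    fraction=0.5 gives the 50% area (midpoint) latency."""
    mask = (times >= t0) & (times <= t1)
    w, t = wave[mask], times[mask]
    area = np.cumsum(np.clip(w, 0, None))   # running positive area
    target = fraction * area[-1]
    i = np.searchsorted(area, target)       # first sample reaching the target
    return t[i]
```

For a symmetric monophasic component the 50% area latency lands at the component's midpoint, which is why the measure remains meaningful for broad waveforms with no distinct peak.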

21 50% Area Latency Example Luck & Hillyard (1990)

22 50% Area Latency Example Luck & Hillyard (1990)

23 Measuring Onset Latency
Basic options for the onset of a component
-20% area latency
-50% peak latency (latency at 50% of peak amplitude)
-Statistical threshold: first of N consecutive p < .05 points

24 Jackknife Approach
Miller, Patterson, & Ulrich (1998)
-Hard to measure onset latency (and other nonlinear parameters) from noisy single-subject waveforms
-Much easier to measure from the grand average
Measure from the grand average of N-1 subjects, N times (once excluding each subject)
Variance will be artificially low but can be corrected
-Fcorrected = Funcorrected ÷ (N-1)² [N per condition]
-Works for between- and within-subjects designs, main effects, and interactions
-Jackknife can also be used with Pearson r
So precise that you may need to use interpolation to measure latencies between sample points
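The two ingredients of the jackknife approach, leave-one-out grand averages and the F correction, fit in a few lines. This sketch uses my own function names; as a sanity check, the correction reproduces the LRP numbers shown later in the deck (5221.625 ÷ 20² ≈ 13.05).

```python
import numpy as np

def jackknife_measures(subject_waves, times, measure):
    """Apply `measure` (e.g., an onset-latency function) to each of the
    N leave-one-out grand averages of a subjects x timepoints array."""
    n = subject_waves.shape[0]
    total = subject_waves.sum(axis=0)
    return np.array([measure((total - subject_waves[i]) / (n - 1), times)
                     for i in range(n)])

def corrected_F(F_uncorrected, n_per_condition):
    """Jackknife correction: Fc = F / (N - 1)^2, because measuring from
    leave-one-out averages artificially shrinks the variance."""
    return F_uncorrected / (n_per_condition - 1) ** 2
```

The statistics are then run on the jackknifed measures exactly as usual, with only the F (or t, squared-root of the same factor) corrected afterward.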

25 Jackknife Approach
Figure: waveforms for Subjects 1-3 and the grand averages without each subject, with the 50% fractional peak latency marked

26 Jackknife Approach
Conventional ANOVA on LRP onset latency
-F(1, 20) = 1.315, p = .258
Jackknife ANOVA on LRP onset latency
-F(1, 20) = 5221.625, Fc = 13.05, p = .0017
Drawbacks
-Easier to have equal Ns for between-subjects ANOVAs
-Is sometimes worse than the conventional approach
-Tests a slightly different null hypothesis

27 ERP Latency vs. RT
Figure: RT probability distribution

28 ERP Latency vs. RT
Figure: response density waveform

29 Statistical Analysis
Replication is the best statistic
-The .05 threshold is arbitrary; what would happen if we decided the threshold should be .06?
-We regularly violate the assumptions of statistical tests, so the computed p-values are not correct estimates of the probability of a Type I error
-The real question is whether the effects are real or noise
-If they are real (and large enough), they will be replicable
General advice
-Collect clean data with big effects
-Run follow-up experiments that contain replications
-Use a vanilla statistical approach (with the jackknife approach for nonlinear measures), or
-Find a really good statistician who can do the most appropriate statistical tests

30 Standard Approach
First, collapse across irrelevant factors
-If target and standard are counterbalanced, collapse to avoid physical stimulus differences
-This reduces the number of ANOVA factors: fewer p-values, fewer spurious interactions, smaller experimentwise error
Do a separate ANOVA for each component
-Don't use component as a repeated-measures factor
-Separate ANOVAs for amplitude and latency
-You could do a gigantic MANOVA, but it would have a zillion p-values

31 Standard Approach
Use electrodes at which the component is present
-Otherwise your effect may get swamped by noise at other electrodes
-Interaction with electrode site has low power
Electrode site is usually two factors
-Anterior-posterior
-Left-middle-right
Usually bad to do a separate ANOVA for each site
-More p-values means a greater chance of Type I error
-Less power means a greater chance of Type II error
Overall advice: Use stats in a way that most directly tests your main hypotheses

32 Choosing Electrode Sites
Imagine you are comparing Condition A and Condition B at 128 electrode sites, and the conditions do not actually differ (zero difference with infinite power)
-If the noise is independent at each site, you would expect p < .05 for 6-7 sites (.05 x 128 = 6.4)
-If noise is correlated among nearby sites, you would expect p < .05 for at least one cluster of sites
Therefore, if you choose which sites to measure by seeing which sites (or clusters) show a difference, you will have many false positives (actual p >> .05)
Solution 1: All sites in an omnibus ANOVA (low power)
Solution 2: Bonferroni correction (even lower power)
Solution 3: Use an a priori region of interest
Solution 4: Use a "functional localizer" condition
Solution 5: Odd and even trials
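The .05 x 128 = 6.4 arithmetic can be checked by simulation. The sketch below assumes 20 hypothetical subjects, independent noise at each site, and a hard-coded two-tailed critical t for df = 19; the subject count and number of simulated experiments are arbitrary choices of mine.

```python
import numpy as np

rng = np.random.default_rng(2)
n_subjects, n_sites = 20, 128
t_crit = 2.093   # two-tailed .05 critical t for df = 19

counts = []
for _ in range(200):                     # 200 simulated null experiments
    # A-minus-B difference scores when the conditions truly do not differ
    diff = rng.normal(0, 1, size=(n_subjects, n_sites))
    t = diff.mean(axis=0) / (diff.std(axis=0, ddof=1) / np.sqrt(n_subjects))
    counts.append((np.abs(t) > t_crit).sum())

print(np.mean(counts))   # hovers around .05 x 128 = 6.4 "significant" sites
```

With spatially correlated noise the false positives additionally arrive in contiguous clusters, which is why eyeballing a cluster of significant sites is not reassuring.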

33 Example: Fishing for N2ac
2 simultaneous stimuli on each trial, selected from:
A) Pure sine wave
B) FM sweep
C) White noise burst
D) Click train
Duration = 750 ms, SOA = 1500 ± 150 ms
One stimulus defined as the target for each trial block (e.g., FM sweep)
Task: Press one button for target-present, another for target-absent
Each stimulus equally likely to be combined with each other stimulus
Locations are randomized from trial to trial
Target is present on 25% of trials
Look at contra vs. ipsi with respect to the target

34 Example: Fishing for N2ac


36 Separate ANOVAs for anterior and posterior electrode clusters
Factors: Contra/Ipsi, Hemisphere, Within-Hemisphere Site, Time

37 Example: Fishing for N2ac
Key effects (one electrode cluster):
-Contra/Ipsi: Significant
-Contra/Ipsi x Time: Significant
-Contra/Ipsi x Electrode: ns
-Contra/Ipsi x Hemisphere: ns
Key effects (other electrode cluster):
-Contra/Ipsi: ns
-Contra/Ipsi x Time: Significant
-Contra/Ipsi x Electrode: ns
-Contra/Ipsi x Hemisphere: Significant

38 Example: Fishing for N2ac
Contra/Ipsi @ each time interval (one cluster):
-200-300 ms: ns
-300-400 ms: ns
-400-500 ms: Significant
-500-600 ms: Significant
Contra/Ipsi @ each time interval (other cluster):
-200-300 ms: Significant
-300-400 ms: Significant
-400-500 ms: Significant
-500-600 ms: ns

39 Example: Fishing for N2ac
Follow-up experiment:
-Same basic paradigm to demonstrate replicability
-Slightly different stimuli to demonstrate generality
-Additional anterior electrode sites to better map the scalp distribution
-Also included unilateral stimuli to determine whether the N2ac requires competition between simultaneous stimuli
Results:
-Replicated the basic anterior and posterior patterns
-These effects were not present for unilateral stimuli

40 Electrode Interactions
Amplitudes are multiplicative across electrodes
-Fz amplitude might go from 1.0 µV to 1.5 µV, and Pz amplitude might go from 2 µV to 3 µV
This produces a condition x electrode site interaction
-Even without a change in neural generators
Figure: multiplicative vs. additive patterns

41 Electrode Interactions
McCarthy & Wood (1985): Normalize the data
-Divide by the vector length
Now the conditions have the same overall amplitude
-Main effects are eliminated; they are assessed prior to normalization
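Vector-length normalization is a one-liner. This sketch illustrates the idea with a purely multiplicative two-electrode example (the Fz/Pz values echo the ones on the previous slide; the function name is mine):

```python
import numpy as np

def normalize_vector_length(amplitudes):
    """Scale a condition's vector of electrode amplitudes to unit length,
    in the spirit of McCarthy & Wood (1985)."""
    amplitudes = np.asarray(amplitudes, dtype=float)
    return amplitudes / np.linalg.norm(amplitudes)

# Purely multiplicative scaling: same generators, stronger activity
cond_a = np.array([1.0, 2.0])     # Fz, Pz in condition A (µV)
cond_b = 1.5 * cond_a             # condition B: everything x 1.5

# After normalization the two scalp distributions are identical,
# so the spurious condition x electrode interaction disappears.
print(normalize_vector_length(cond_a))
print(normalize_vector_length(cond_b))
```

The next slide explains why this fix is weaker than it looks: Urbach & Kutas (2002) showed it fails under many realistic conditions.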

42 Electrode Interactions
Technical problem: Urbach & Kutas (2002) demonstrated that this does not actually work under many realistic conditions
-Many of these problems disappear if you measure from difference waves
Conceptual problem: The conclusions that can be drawn from an electrode site interaction are extremely weak
-Could be the same generators, but a change in relative amplitudes
-Could be the same generators, but a change in relative latencies
General advice: Don't worry about electrode interactions
-You can't draw very strong conclusions from them, so just report them

43 Heterogeneity of Covariance
Within-subjects ANOVA assumes homogeneity of variance and covariance (sphericity)
-Modest heterogeneity of variance is not a big problem
-Heterogeneity of covariance inflates the Type I error rate
What is homogeneity of covariance?
-Applies with 3 or more levels of a within-subjects factor
-Each level must be equally correlated with the other levels

44 Heterogeneity of Covariance
Table: Conditions A, B, and C measured for Subjects 1-3
Within-subjects ANOVA assumes: Covariance(A, B) = Covariance(B, C) = Covariance(A, C)

45 Heterogeneity of Covariance
Why is this a special problem for ERPs?
-Covariance is lower for more distant electrode pairs than for nearby electrode pairs
-Whenever 3 or more electrodes are used, heterogeneity of covariance is likely
SPR mandates that papers deal with this problem
Greenhouse-Geisser epsilon adjustment
-The degree of nonsphericity is computed
-An adjustment factor, epsilon, is computed
-New df are computed by multiplying epsilon by the original df
-The new df are used for computing p-values
Greenhouse-Geisser epsilon is overly conservative
-Can use the Huynh-Feldt epsilon instead
Everyone should use the epsilon adjustment for all studies, not just ERP studies
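The epsilon computation itself is compact. This sketch implements the standard Greenhouse-Geisser estimate from the double-centered covariance matrix of a subjects x conditions data matrix (the function name is mine; the new degrees of freedom are then epsilon times the original df):

```python
import numpy as np

def gg_epsilon(data):
    """Greenhouse-Geisser epsilon from a subjects x conditions matrix.

    epsilon = 1 when sphericity holds; the minimum is 1/(k-1) for
    k conditions. The adjusted df are epsilon * df1 and epsilon * df2.
    """
    k = data.shape[1]
    S = np.cov(data, rowvar=False)           # k x k covariance of conditions
    C = np.eye(k) - np.ones((k, k)) / k      # centering matrix
    S = C @ S @ C                            # double-center the covariance
    return np.trace(S) ** 2 / ((k - 1) * np.sum(S * S))
```

For example, with k = 4 conditions and F(3, 57), an epsilon of 0.6 would give adjusted df of (1.8, 34.2) for the p-value lookup; the Huynh-Feldt variant adjusts this estimate upward to reduce its conservatism.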

46 Method Sections Should Include…
Number of trials per condition
Recording sites, electrode type, amplifier gain, filters, sampling rate and resolution, impedance, reference, and offline re-referencing
-Include impulse response function details for offline filters
Artifact rejection procedure
-Include the observed mean and range of % rejected trials
-Include the number of subjects rejected and the standard for rejection
-Rejection of trials with behavioral errors
ERP measurement procedures
-Measurement windows and perhaps their justification
Greenhouse-Geisser epsilon adjustment
See Picton et al. (2000)

47 Results Sections Should Include…
Complete description of behavior
Waveforms
-From multiple sites (usually)
-With a prestimulus baseline (always!!!)
-Waveforms from which difference waves were computed (usually)
Inferential statistics
-But means and waveforms should take precedence
-Tables to show all main effects and interactions
See Picton et al. (2000)


