Handout Six: Sample Size, Effect Size, Power, and Assumptions of ANOVA EPSE 592 Experimental Designs and Analysis in Educational Research Instructor: Dr.


1 Handout Six: Sample Size, Effect Size, Power, and Assumptions of ANOVA. EPSE 592 Experimental Designs and Analysis in Educational Research. Instructor: Dr. Amery Wu

2 Where We Have Been with One-way ANOVA
We have learned how
1. ANOVA partitions the score variation into a part attributable to the factor and a part attributable to sampling error.
2. we can use the F ratio to test whether the variation due to the factor (group differences) holds in the population.
3. we can conduct multiple comparisons: we can plan our paired comparisons ahead of time or use a data-driven post hoc approach to detect the mean differences.
For these tools to work appropriately and to their best effect, several practical conditions must be considered and checked.

3 Factors Influencing the Appropriateness & Effectiveness of ANOVA
• Sample Size
• Effect Size
• Power
• Data Assumptions

4 Statistical Power
The power of a statistical test is the probability that the test will reject the null hypothesis when the null hypothesis is false. If we define "false null" as the "CASE" (what we would like to detect), power is how sensitive a statistical test is in detecting a true CASE, i.e., the true "positive" rate. It is also called sensitivity.

                      Retain (0)        Reject (1)
Null is True (0)      Specificity       Type I Error
Null is False (1)     Type II Error     Power (Sensitivity)

5 Factors Influencing Power
• Power is, in general, a function of the size of the population parameter (population effect size), the sample size, and the alpha level.
• Power (π) = F (Δ, N, α). Note. Δ, delta, denotes the population effect size.
• Take the one-sample t test as an example: t = (M - μ0) / (s / √N), where M is the sample mean, μ0 is the mean stated by the null, s is the sample standard deviation, and N is the sample size. If the null is false (CASE) and alpha is fixed (say, at 0.05), the greater the test statistic t is, the more power we have to reject the null.
• Looking at the right side of the equation, we can see that the greater the numerator (effect size Δ, i.e., the mean difference), the greater the t (hence, the power) will be.
• Also, the smaller the denominator (sampling error, i.e., the standard error) is, the greater the t (hence, the power) will be. Because the standard error shrinks as N grows, the greater the sample size N is, the greater the t (hence, the power) will be.
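The relation Power = F(Δ, N, α) can be illustrated with a short sketch. It is not the exact t-based calculation: it assumes a two-sided one-sample test and replaces the t distribution with a normal approximation, so values for small N are only rough. The function name `approx_power_one_sample` and the inputs are illustrative choices, not part of the handout.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_one_sample(delta, n, alpha=0.05):
    """Approximate two-sided power for a one-sample test of a mean.

    delta: standardized population effect size (mean difference / SD)
    n:     sample size
    alpha: Type I error rate
    Assumption: normal approximation in place of the noncentral t.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    shift = delta * sqrt(n)             # expected size of the test statistic
    # Probability of landing in either rejection region when the null is false
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Power rises with N when the effect size and alpha are held fixed
for n in (10, 30, 100):
    print(n, round(approx_power_one_sample(0.5, n), 3))
```

Holding N fixed and raising `delta`, or raising `alpha`, increases the returned power in the same way.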

6 Issues with the Method of Hypothesis Testing: Sample Size
• When the sample size is small, a true mean difference (M1 - M2) could go undetected, and the t test would fail to reject H0.
• Conversely, when the sample size is large, a trivially small mean difference could be detected as statistically significant.
• To address this issue, researchers are advised to report the magnitude of the difference using an effect size measure, rather than only the p value of the hypothesis test.

7 Observed Effect Size
• Effect size is a measure of the magnitude of the effect (how big the group difference is).
• Effect size is a standardized measure because it transforms the magnitude of the difference from the raw score scale to a common, unit-free scale. Thus, differences found in studies of the same DV but measured on different raw scales can all be compared because they are all on the same scale.
• A variety of effect size measures have been suggested. An effect size can be expressed either on the unit scale (e.g., Cohen's d, in standard deviation units) or on the squared unit scale (e.g., eta squared, a proportion between 0 and 1).
• The population effect size is unknown. Most effect sizes we compute are "observed," meaning they are computed from the sample data.

8 Effect Size: Cohen's d
• Cohen's d is the most commonly reported effect size and is expressed on the standard unit scale.
• Cohen's d = (M1 - M2) / s_pooled, where s_pooled = √[(SD1² + SD2²) / 2] (when sample sizes are equal across groups).
• The SPSS drop-down menu does not provide this measure.
Lab Activity: Using the information in the table below, calculate the observed effect sizes for all paired comparisons.
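As a minimal sketch of the calculation, the function below computes Cohen's d from two group means and SDs. It pools the SDs as the square root of the average of the two variances, which is the usual pooled SD when group sizes are equal; the means and SDs in the example are made-up numbers, not the lab data.

```python
from math import sqrt


def cohens_d(m1, sd1, m2, sd2):
    """Cohen's d for two groups of equal size.

    Assumption: equal group sizes, so the pooled SD is the square root
    of the average of the two group variances.
    """
    s_pooled = sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return (m1 - m2) / s_pooled


# Hypothetical example: means of 12 and 10, both SDs equal to 4
print(cohens_d(12, 4, 10, 4))
```

When the two SDs are equal, the pooled SD is simply that common SD, so the example is a 2-point difference expressed in units of a 4-point SD.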

9 Effect Size: Eta Squared
Eta squared is expressed on the squared unit scale. It is calculated as SS between groups / SS total.
Lab Activity: Using the information in the ANOVA table below, compute the eta squared for the factor "dose".
SPSS provides eta squared for the factor as a whole (all levels combined) as well as for paired comparisons. See the slide Computing Retrospective Power Using SPSS.
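The ratio SS between groups / SS total can be computed directly from raw scores. The sketch below assumes a one-way between-subjects layout with one list of scores per group; the function name `eta_squared` and the three small groups are illustrative, not the data from dose.sav.

```python
from statistics import mean


def eta_squared(groups):
    """Eta squared = SS_between / SS_total for a one-way design.

    groups: list of lists, one list of scores per group.
    """
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Total variation of every score around the grand mean
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    # Variation of group means around the grand mean, weighted by group size
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    return ss_between / ss_total


# Hypothetical scores for three groups
print(eta_squared([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))
```

In this made-up example SS between is 6 and SS total is 12, so half of the score variation is attributable to the factor.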

10 Prospective vs. Retrospective Power
• When the power of a statistical test is required or computed before the data are collected, it is referred to as prospective (a priori) power. When the power is computed after the data have been collected and a statistical decision has been made, it is referred to as retrospective (observed or post hoc) power (Zumbo & Hubley, 1998).
• A prospective power function is typically used in the design phase, prior to data collection, to estimate the sample size needed to achieve a required level of power. Most granting and other funding agencies will not consider funding a proposal without evidence of prospective power and its required sample size.
• Retrospective power is computed based on the "given" data. Post hoc power analysis uses the obtained sample size and sample effect size to determine what the power was in the study, assuming the effect size in the sample is equal to the effect size in the population. Journal editors, reviewers, and readers are also concerned with the retrospective power of the statistical tests reported when evaluating an existing study.
• Prospective power is not the same as retrospective power.

11 Lab Activity: Estimating Prospective Power & Sample Size
Dr. Karl L. Wuensch is studying the effect of the dose of a new drug on patients' depression. He plans to have three levels of the treatment (control, 10mg, & 20mg) that will be randomly assigned to 3 groups of patients. His hypothesis is that drug dose has a "medium" population effect on depression. He defined "medium" using Cohen's d guidelines as follows:
Small effect: from 0.2 to 0.3
Medium effect: around 0.5
Large effect: ≥0.8
Before he collects data, he would like to know how many patients he has to recruit in order to achieve a certain level of power.
Lab Activity: Go to the ANOVA power calculation site. Note that this site uses Cohen's d as an effect size measure. Help Dr. Wuensch determine how many patients he should recruit.
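For a rough sense of the numbers involved, the sketch below estimates the sample size per group needed to detect a given Cohen's d in a single two-group comparison. It is not the calculation performed by the power site used in the lab: it assumes a two-sided comparison of two independent means and a normal approximation, so it ignores the third group and slightly understates the t-based answer. The function name and the 0.80 power target are illustrative choices.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_two_means(d, alpha=0.05, power=0.80):
    """Approximate n per group to detect Cohen's d between two groups.

    Assumptions: two-sided test, equal group sizes, normal approximation
    (n = 2 * ((z_alpha/2 + z_power) / d) ** 2, rounded up).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


# "Medium" effect under Cohen's guidelines, alpha = .05, target power = .80
print(n_per_group_two_means(0.5))
```

Because the required n is proportional to 1 / d², halving the assumed effect size roughly quadruples the number of patients needed per group.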

12 Lab Activity: Computing Retrospective Power Using SPSS
Dr. Karl L. Wuensch ended up recruiting 20 patients for each group because of constraints of time and budget and the difficulty of recruiting volunteer patients. His data were compiled in the SPSS file called dose.sav.
Lab Activity: Use SPSS to compute the retrospective (observed or post hoc) power for Dr. Wuensch's study.

13 Lab Activity: Computing Retrospective Power: SPSS Output
[Dose=1]: comparing the first group (control) to the last group (20mg)
[Dose=2]: comparing the second group (10 mg) to the last group (20mg)
Question: Why does the first comparison [Dose=1] have a much higher observed power than the second comparison [Dose=2]?
Note that SPSS does not, by default, report Cohen's d. Instead, the effect size reported by SPSS is (partial) eta squared.

14 ANOVA Assumptions: One-way Between Subjects Analysis
One-way between subjects analysis makes three assumptions about the data. For this method to work well, the data should meet the assumptions "reasonably" well.
1. The observations (scores across the individuals) are independent of one another.
2. The data in each group should follow the normal distribution.
3. The variances are equal between the groups (homogeneity).

15 Checking Assumption: Independent Observations
• Observations (scores across the individuals) are independent of one another. The score observed for one individual is not influenced by that of another.
• This assumption should be checked by examining how the individual scores were collected. For example, if some of the scores are more similar to one another because of the time or location at which they were collected, then the assumption is violated.
• A typical example of a violation of independent observations is when individual students' academic achievement data are collected by sampling schools. Students' scores are influenced by the fact that they share the same teachers and/or principal and the same school climate. Students' achievement may tend to be relatively higher or lower in one school than in another.
• If this assumption is violated, a random effects ANOVA should be used.

16 Checking Assumption: Normal Distribution
• The data are assumed to be sampled from a normally distributed population and hence should be fairly normal.
• The data in each group should follow the normal distribution.
• If this assumption is violated, use non-parametric test statistics.
• This assumption can be checked using the skewness, histogram, boxplot, QQ plots, etc. Use the SPSS "Analyze" > "Descriptive" > "Explore" commands for these plots.
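One of the checks listed above, skewness, is simple enough to compute by hand. The sketch below uses the plain moment-based coefficient (third central moment divided by the 1.5 power of the second); this is an assumption about which formula to show, and SPSS reports a sample-size-adjusted version, so its values will differ somewhat in small samples. The scores are made up.

```python
from statistics import mean


def sample_skewness(scores):
    """Moment-based skewness: m3 / m2 ** 1.5.

    Values near 0 suggest a symmetric distribution; positive values
    indicate a longer right tail, negative values a longer left tail.
    """
    n = len(scores)
    m = mean(scores)
    m2 = sum((x - m) ** 2 for x in scores) / n  # second central moment
    m3 = sum((x - m) ** 3 for x in scores) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical groups: one symmetric, one with a long right tail
print(sample_skewness([1, 2, 3, 4, 5]))
print(sample_skewness([1, 1, 1, 2, 10]))
```

The check would be run separately for each group, since the assumption concerns the distribution within each group.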

17 Checking Assumption: Equal Variances (homogeneity)
• The group variances are equal. If this assumption is violated, the power is reduced, but the Type I error rate is still robust.
• One can check this assumption by visually comparing the SDs or variances of the groups. Alternatively, one can use Levene's homogeneity test; a non-significant result means there is no evidence that the variances differ. This test can be too sensitive if the sample size is large. Levene's test can be found in SPSS.
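To show what Levene's test measures, the sketch below computes its W statistic, which is a one-way ANOVA F ratio applied to each score's absolute deviation from its own group mean. It assumes the original mean-centered version of the test and stops at the statistic; obtaining a p value would require the F distribution with (k - 1, N - k) degrees of freedom, which is left to SPSS or a statistics library. The data are made up.

```python
from statistics import mean


def levene_w(groups):
    """Levene's W statistic (mean-centered version).

    groups: list of lists, one list of scores per group.
    Larger W means the groups differ more in spread.
    """
    # Absolute deviation of each score from its own group mean
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    k = len(z)
    n_total = sum(len(g) for g in z)
    z_grand = mean([v for g in z for v in g])
    # Between-group and within-group sums of squares of the deviations
    between = sum(len(g) * (mean(g) - z_grand) ** 2 for g in z)
    within = sum((v - mean(g)) ** 2 for g in z for v in g)
    return ((n_total - k) / (k - 1)) * between / within


# Hypothetical groups: the second has a much wider spread than the first
print(levene_w([[4, 5, 6], [0, 5, 10]]))
```

Two groups with identical spread, such as [1, 2, 3] and [4, 5, 6], give W = 0 regardless of how far apart their means are.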

18 Checking Assumption: Equal Variances (homogeneity)
When the equal variances assumption is violated, the power is reduced, but the Type I error rate is still robust. Thus, one can increase the sample size or use pairwise t tests that do not assume equal variances.