Issues in Inferential Statistics

1 Issues in Inferential Statistics
Chapter 8

2 Research Question What are the differences in males’ and females’ ability to make free throws (2 independent groups)? Between-Groups Design What are the differences in the mean number of free throws made during the middle of the season compared to the end of the season (one group tested twice; pretest to posttest)? Within-Groups Design

3 Between- and Within-Groups Designs

4 Between-Groups Designs - 2 x 2 and 3 x 2 x 2
2 x 2 between-groups design – 2 levels of gender (male and female) and 2 levels of treatment. 3 x 2 x 2 between-groups design – 3 grade levels (7th, 8th, & 9th grade), 2 treatment levels (experimental & control), and 2 gender levels (male & female).

5 More Expanded Designs One-way within-groups (repeated measures) design – 4 levels of time (3, 6, 9, & 12 months) Within-groups design Mixed-model design – 3 levels of time (within-groups) and 2 levels of gender (between-groups)

6 Video 8.1 : Between and Within Designs

7 Two-Tailed Tests Locating a z-ratio of ±2.99 on the normal curve
in a two-tailed test. Remember: ± 1, 2, & 3

8 One-Tailed Test Locating the critical value on a t-distribution
for a one-tailed test. Notice: instead of splitting the rejection region at about ± 2, 95% of the area lies to the left of the critical value and the remaining 5% sits in a single tail.

9 Two-Tailed vs. One-Tailed Tests
With a two-tailed test it is more difficult to reject the null hypothesis, since the rejection region is split across both ends of the curve: 2.5% on the left and 2.5% on the right (based on alpha = 0.05). A one-tailed test is used when you know or hypothesize the direction of the difference, that is, which mean will be higher. A two-tailed test assumes that you don’t know which mean will be higher or the direction of the difference. SPSS uses 2-tailed as its default. Most researchers (including me) use 2-tailed tests, although technically 1-tailed tests should be considered.

10 Two-Tailed vs. One-Tailed Tests
Hypothesis (1-tailed) H0: μE = μC and H1: μE > μC or H1: μE < μC Hypothesis (2-tailed) H0: μE = μC and H1: μE ≠ μC
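
The difference between the two sets of hypotheses can be sketched numerically. This is a minimal illustration that assumes a z-test (normal curve) rather than the SPSS t-test shown in the video; the observed z-ratio of 1.80 and the function names are made up for the example.

```python
from statistics import NormalDist

def critical_values(alpha=0.05):
    """Return (two_tailed_cutoff, one_tailed_cutoff) on the z scale."""
    std_normal = NormalDist()  # mean 0, SD 1
    two_tailed = std_normal.inv_cdf(1 - alpha / 2)  # alpha split across both tails
    one_tailed = std_normal.inv_cdf(1 - alpha)      # all of alpha in one tail
    return two_tailed, one_tailed

def z_test_p_values(z):
    """Return (two_tailed_p, upper_tailed_p) for an observed z-ratio."""
    std_normal = NormalDist()
    upper = 1.0 - std_normal.cdf(z)                     # H1: mu_E > mu_C
    two_tailed = 2.0 * (1.0 - std_normal.cdf(abs(z)))   # H1: mu_E != mu_C
    return two_tailed, upper

two_cut, one_cut = critical_values(0.05)
print(f"two-tailed cutoff: +/-{two_cut:.3f}, one-tailed cutoff: {one_cut:.3f}")

# Hypothetical observed z-ratio, chosen to fall between the two cutoffs.
p_two, p_one = z_test_p_values(1.80)
print(f"z = 1.80: two-tailed p = {p_two:.4f}, one-tailed p = {p_one:.4f}")
```

With this hypothetical z-ratio the one-tailed test rejects the null hypothesis at alpha = 0.05 and the two-tailed test does not, which is why the direction has to be hypothesized before the data are seen.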

11 Video 8.2 : SPSS t-Test with One- and Two-Tailed Tests

12 Type I and Type II Errors

13 Video 8.3 : Type I and Type II Errors

14 Power—The Probability of Rejecting the Null Hypothesis When It Is False (Eq. 8.1)
Using Eq. 8.1 and substituting 1.96 for the z-ratio, we can determine the location on curve A where the null hypothesis would be rejected.

15 Power Setting the alpha level at 0.05 theoretically means that there is a 5% probability of making a Type I error. Using an alpha level of 0.01, the probability is reduced to 1%. By reducing the alpha level (0.05 to 0.01), we increase the probability of making a Type II error and therefore lower power. Conversely, by increasing the alpha level (0.01 to 0.05), we increase the probability of making a Type I error.
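
The trade-off between alpha and power can be checked with a short calculation. This is a sketch only: it assumes a two-tailed one-sample z-test with a known population standard deviation, and the numbers (true difference 5, sigma 10, n = 25) are hypothetical, not taken from the chapter.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a two-tailed one-sample z-test.

    effect: true difference between the population mean and the null value
    sigma:  population standard deviation (assumed known)
    n:      sample size
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect / (sigma / sqrt(n))  # true difference in standard-error units
    upper_tail = 1 - std_normal.cdf(z_crit - shift)  # reject in the upper region
    lower_tail = std_normal.cdf(-z_crit - shift)     # reject in the lower region
    return upper_tail + lower_tail

power_05 = z_test_power(effect=5, sigma=10, n=25, alpha=0.05)
power_01 = z_test_power(effect=5, sigma=10, n=25, alpha=0.01)
print(f"alpha = 0.05: power = {power_05:.3f}, Type II error = {1 - power_05:.3f}")
print(f"alpha = 0.01: power = {power_01:.3f}, Type II error = {1 - power_01:.3f}")
```

When the true difference is zero, the same function returns alpha itself, which is the Type I error rate.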

16 Calculating Power – An Illustration

17 Determining the Sample Size Required for a Desired Amount of Power (Eq
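
The equation for this slide is not reproduced in the transcript. As a rough illustration, the sketch below uses a common normal-approximation formula for a two-tailed one-sample z-test, n = ((z_alpha/2 + z_power) * sigma / difference)^2, which may differ from the chapter's own equation. The function name and the input values are assumptions.

```python
from math import ceil
from statistics import NormalDist

def required_n(difference, sigma, alpha=0.05, power=0.80):
    """Approximate sample size for a two-tailed one-sample z-test.

    Ignores the negligible chance of rejecting in the wrong tail,
    so it is an approximation rather than an exact solution.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # cutoff for the chosen alpha
    z_power = std_normal.inv_cdf(power)          # z-score matching the desired power
    n = ((z_alpha + z_power) * sigma / difference) ** 2
    return ceil(n)  # round up to a whole subject

# Hypothetical inputs: detect a 5-unit difference when sigma is 10.
n_80 = required_n(difference=5, sigma=10, alpha=0.05, power=0.80)
n_90 = required_n(difference=5, sigma=10, alpha=0.05, power=0.90)
print(n_80, n_90)
```

Asking for more power, a smaller alpha, or a smaller detectable difference all increase the required sample size.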

18 Video 8.4 : Power Simulations from Rice Virtual Labs

19 Assumption of Normality
The robustness of certain statistics (e.g., t-tests) allows us to use them even when the assumption of normality is not met, as in the case of skewed or non-mesokurtic distributions. As sample size increases, the shape of the sampling distribution of the mean approaches normality. A 1-tailed test requires more than 20 subjects.
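
A small simulation can show why sample size matters here. This sketch draws repeated samples from a strongly skewed (exponential) population and measures the skewness of the resulting sample means; the population, the sample sizes, and the seed are arbitrary choices for illustration.

```python
import random
from statistics import fmean, pstdev

def skewness(values):
    """Population skewness: the mean cubed z-score (0 for a symmetric shape)."""
    m = fmean(values)
    s = pstdev(values)
    return fmean([((v - m) / s) ** 3 for v in values])

def sample_mean_skewness(n, draws=5000, seed=1):
    """Skewness of the sampling distribution of the mean for samples of size n."""
    rng = random.Random(seed)
    means = [
        fmean([rng.expovariate(1.0) for _ in range(n)])  # skewed population
        for _ in range(draws)
    ]
    return skewness(means)

results = {n: sample_mean_skewness(n) for n in (1, 5, 30)}
for n, skew in results.items():
    print(f"n = {n:2d}: skewness of sample means = {skew:.2f}")
```

The skewness shrinks toward zero as n grows, even though the population itself stays just as skewed.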

20 Sampling Distribution from Various Population Shapes
From Glass and Hopkins, Statistical Methods in Education and Psychology, 3e, © Reprinted by permission of Pearson Education, Inc.

21 Sampling, Level of Measurement, and Homogeneity of Variance
The theory behind random sampling is that the sample will be representative of the population. Random sampling is one of the assumptions of t-tests and reduces the chance of Type I error. Samples that are not randomly selected are referred to as non-probability samples; they carry a greater risk of not representing the population. In experimental research, we randomly assign subjects to groups and examine any pretest differences that may exist among the groups.

22 Sampling, Level of Measurement, and Homogeneity of Variance
Interval- or ratio-level data are one of the assumptions of t-tests and other parametric tests. Because t-tests are robust, even nominal and ordinal levels of measurement can be accommodated if the sample is reasonably large (e.g., 30 or more per group). This robustness is based on the Central Limit Theorem.

23 Sampling, Level of Measurement, and Homogeneity of Variance
Homogeneity of variance (an assumption of t-tests) means that the variances/standard deviations of the groups being compared are reasonably similar. Type I and Type II error rates are affected by unequal variances together with unequal/large sample sizes and the level at which alpha is set (0.05 vs. 0.01). See Table 8.7 (homogeneity of variance and sample size) and Figure 8.8 (relationship between sample size, variance, and alpha level).
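
The effect of unequal variances combined with unequal sample sizes can be simulated. This sketch is not a reproduction of Table 8.7 or Figure 8.8: it assumes a pooled-variance independent t-test, both populations have the same mean (so every rejection is a Type I error), and the group sizes, standard deviations, and seed are hypothetical.

```python
import random
from math import sqrt
from statistics import fmean

# Two-tailed critical t for alpha = 0.05 and df = 48 (standard table value).
# It only applies to n1 + n2 - 2 = 48, i.e. the default group sizes below.
T_CRIT_DF48 = 2.0106

def sample_variance(xs):
    m = fmean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def pooled_t(a, b):
    """Independent-groups t-ratio using the pooled variance estimate."""
    n1, n2 = len(a), len(b)
    sp2 = ((n1 - 1) * sample_variance(a) + (n2 - 1) * sample_variance(b)) / (n1 + n2 - 2)
    return (fmean(a) - fmean(b)) / sqrt(sp2 * (1 / n1 + 1 / n2))

def type_i_rate(sd1, sd2, n1=10, n2=40, trials=2000, seed=7):
    """Share of trials that reject a true null hypothesis at nominal alpha = 0.05."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        a = [rng.gauss(0, sd1) for _ in range(n1)]  # small group
        b = [rng.gauss(0, sd2) for _ in range(n2)]  # large group
        if abs(pooled_t(a, b)) > T_CRIT_DF48:
            rejections += 1
    return rejections / trials

equal_rate = type_i_rate(sd1=1.0, sd2=1.0)
unequal_rate = type_i_rate(sd1=3.0, sd2=1.0)  # small group has the larger variance
print(f"equal variances:   Type I rate = {equal_rate:.3f}")
print(f"unequal variances: Type I rate = {unequal_rate:.3f}")
```

When the smaller group has the larger variance, the actual Type I error rate rises well above the nominal 0.05, even though alpha was set at 0.05.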

24 Relationship Between Size, Variance, and α Level
From Glass and Hopkins, Statistical Methods in Education and Psychology, 3e, © Reprinted by permission of Pearson Education, Inc.

25 Statistical vs. Practical Significance
Statistical significance is affected by many factors, such as: number of subjects; size of the difference; homogeneity of variance; normality of data; alpha level (0.05 vs. 0.01); other factors.

26 Statistical vs. Practical Significance
Do the results have “real world” value or is the difference enough to actually make a difference? Do the results provide enough evidence of practical application? Results of the analysis may be significant, but hold little or no practical value. Results may indicate no statistical significance, but may be of practical value. Decisions are sometimes arbitrary and based on the researcher’s opinion or experiences. Decisions can also be made based upon theory and the research of others.

27 Research Example What is the effect of sleep deprivation on treadmill time to exhaustion, to the nearest minute?

28 Sleep Deprivation Data

29 Percent Improvement (Eq. 8.3)
This indicates a rather minimal level of improvement.
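
Eq. 8.3 and the data from the previous slide are not reproduced in this transcript. As a sketch, the code below assumes percent improvement is the difference between two means expressed as a percentage of the baseline mean; the treadmill means are invented for the example.

```python
def percent_change(baseline_mean, comparison_mean):
    """Difference between two means as a percentage of the baseline mean."""
    return (comparison_mean - baseline_mean) / baseline_mean * 100.0

# Hypothetical treadmill times to exhaustion, in minutes.
change = percent_change(baseline_mean=20.0, comparison_mean=21.0)
print(f"{change:.1f}% change")
```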

30 Effect Size In the sleep deprivation example based on Cohen’s d, the effect size is small (guidelines p. 182). In this experiment, the sleep deprivation was responsible for changing the dependent variable by only 0.15 of a standard deviation.
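
Cohen’s d expresses a mean difference in standard deviation units. The sketch below assumes two independent groups and the pooled standard deviation as the divisor; the chapter may use a different standard deviation, and the treadmill times are hypothetical, not the data from the sleep deprivation slide.

```python
from math import sqrt
from statistics import fmean, variance

def cohens_d(group_a, group_b):
    """Mean difference divided by the pooled standard deviation."""
    n1, n2 = len(group_a), len(group_b)
    pooled_var = (
        (n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)
    ) / (n1 + n2 - 2)
    return (fmean(group_a) - fmean(group_b)) / sqrt(pooled_var)

# Hypothetical treadmill times to exhaustion, in minutes.
rested = [22, 25, 19, 24, 20]
deprived = [21, 24, 18, 23, 19]
d = cohens_d(rested, deprived)
print(f"Cohen's d = {d:.2f}")
```

A d of 0.15, as reported for the sleep deprivation example, means the two means sit only 0.15 of a standard deviation apart.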

31 Omega Squared (ω²) In the sleep deprivation example, only 27% of the total variability in treadmill minutes is attributable to sleep deprivation, while the remaining 73% resulted from other factors such as measurement error, preexisting differences between subjects, and other uncontrolled variables.
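
The chapter's equation for omega squared is not included in the transcript. One widely used form for an independent-groups t-test is ω² = (t² − 1) / (t² + n1 + n2 − 1), sketched below under that assumption. The t-ratio and group sizes are hypothetical and do not reproduce the 27% reported above.

```python
def omega_squared(t, n1, n2):
    """Proportion of total variance explained, estimated from an independent t-ratio.

    The raw estimate is negative when |t| < 1; it is reported as 0 here,
    which is a common convention rather than part of the formula.
    """
    estimate = (t ** 2 - 1) / (t ** 2 + n1 + n2 - 1)
    return max(0.0, estimate)

# Hypothetical t-ratio from two groups of 10 subjects each.
w2 = omega_squared(t=2.5, n1=10, n2=10)
print(f"omega squared = {w2:.3f}; unexplained = {1 - w2:.3f}")
```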

