Issues in Inferential Statistics


Issues in Inferential Statistics Chapter 8

Research Question What are the differences in males’ and females’ ability to make free throws (2 independent groups)? This is a between-groups design. What are the differences in the mean number of free throws made during the middle of the season compared to the end of the season (one group tested twice; pretest to posttest)? This is a within-groups design.

Between- and Within-Groups Designs

Between-Groups Designs - 22 and 32 2 2 x 2 between-groups design – 2 levels of gender and 2 levels of treatment (male and female) 3 x 2 x 2 between-groups design – 3 grade levels (7th, 8th, & 9th grade), 2 treatment levels (experimental & control), and 2 gender (male & female) levels

More Expanded Designs One-way within-groups (repeated measures) design – 4 levels of time (3, 6, 9, & 12 months). Mixed-model design – 3 levels of time (within-groups) and 2 levels of gender (between-groups).

Video 8.1 : Between and Within Designs

Two-Tailed Tests Locating a z-ratio of ±2.99 on the normal curve in a two-tailed test. Remember: ± 1, 2, & 3
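To make the location of z = ±2.99 concrete, here is a minimal sketch that computes the area in both tails beyond that z-ratio. It assumes a standard normal curve and uses only Python's standard library; the function name `two_tailed_p` is illustrative and does not come from the slides.

```python
from statistics import NormalDist


def two_tailed_p(z: float) -> float:
    """Area under the standard normal curve beyond -|z| and +|z| combined."""
    upper_tail = 1.0 - NormalDist().cdf(abs(z))
    return 2.0 * upper_tail


# z-ratio from the slide; the two tails together hold well under 1% of the area.
p_value = two_tailed_p(2.99)
print(f"two-tailed p for z = ±2.99: {p_value:.4f}")
```

A z-ratio this far out sits close to 3 standard deviations from the mean, which is why the combined tail area is so small.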

One-Tailed Test Locating the critical value on a t-distribution for a one-tailed test. Notice: instead of splitting the rejection region at about ± 2, the entire 5% falls in one tail, so 95% of the distribution lies to the left of the critical value.

Two-Tailed vs. One-Tailed Tests With a two-tailed test it is more difficult to reject the null hypothesis, since the rejection region is split between both ends of the curve: 2.5% on the left and 2.5% on the right (based on α = 0.05). A one-tailed test is used when you know or hypothesize the direction of the difference, that is, which mean will be higher. A two-tailed test assumes that you don’t know the direction of the difference. SPSS uses 2-tailed as its default. Most researchers (including me) use 2-tailed tests, although technically 1-tailed tests should be considered.

Two-Tailed vs. One-Tailed Tests Hypothesis (1-tailed): H0: μE = μC and H1: μE > μC, or H1: μE < μC. Hypothesis (2-tailed): H0: μE = μC and H1: μE ≠ μC.
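The difference between the two sets of hypotheses shows up in the critical value. The sketch below is an illustration only: it uses the normal curve as a large-sample stand-in for the t-distribution, an upper-tailed alternative (H1: μE > μC), and a hypothetical observed z-ratio of 1.80 that is not taken from the slides.

```python
from statistics import NormalDist

ALPHA = 0.05
STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

# Two-tailed: alpha is split, 2.5% in each tail.
two_tailed_critical = STD_NORMAL.inv_cdf(1 - ALPHA / 2)
# One-tailed (H1: muE > muC): the full 5% sits in the upper tail.
one_tailed_critical = STD_NORMAL.inv_cdf(1 - ALPHA)


def rejects_null(z: float, two_tailed: bool) -> bool:
    """Compare an observed z-ratio with the appropriate critical value."""
    if two_tailed:
        return abs(z) > two_tailed_critical
    return z > one_tailed_critical


observed_z = 1.80  # hypothetical value chosen to fall between the two cutoffs
print(f"two-tailed critical value: {two_tailed_critical:.3f}")
print(f"one-tailed critical value: {one_tailed_critical:.3f}")
print("reject H0 (1-tailed):", rejects_null(observed_z, two_tailed=False))
print("reject H0 (2-tailed):", rejects_null(observed_z, two_tailed=True))
```

The same observed value is significant under the one-tailed test but not under the two-tailed test, which is the sense in which the two-tailed test makes the null hypothesis harder to reject.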

Video 8.2 : SPSS t-Test with One- and Two-Tailed Tests

Type I and Type II Errors

Video 8.3 : Type I and Type II Errors

Power: The Probability of Rejecting the Null Hypothesis When It Is False Using Eq. 8.1 and substituting 1.96 for the z-ratio, we can determine the location on curve A where the null hypothesis would be rejected.

Power Setting the alpha level at 0.05 theoretically would mean that there is a 5% probability of making a Type I error. Using an alpha level of 0.01, the probability is reduced to 1%. Reducing the alpha level (0.05 to 0.01), we increase the probability of making a Type II error. Conversely, by increasing the alpha level (0.01 to 0.05), we increase the probability of making a Type I error.
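The trade-off between alpha and Type II error can be checked numerically. This is a rough sketch, assuming a two-tailed one-sample z-test with a known population standard deviation; the effect, sigma, and n values are hypothetical and are not taken from the chapter.

```python
from statistics import NormalDist


def z_test_power(effect: float, sigma: float, n: int, alpha: float) -> float:
    """Power of a two-tailed one-sample z-test for a true mean shift `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # How many standard errors the true mean lies from the null mean.
    shift = effect * n ** 0.5 / sigma
    upper_rejection = 1 - std_normal.cdf(z_crit - shift)
    lower_rejection = std_normal.cdf(-z_crit - shift)
    return upper_rejection + lower_rejection


# Hypothetical scenario: true difference of 5 units, sigma = 15, n = 50.
power_05 = z_test_power(effect=5, sigma=15, n=50, alpha=0.05)
power_01 = z_test_power(effect=5, sigma=15, n=50, alpha=0.01)

print(f"alpha = 0.05: power = {power_05:.2f}, Type II error = {1 - power_05:.2f}")
print(f"alpha = 0.01: power = {power_01:.2f}, Type II error = {1 - power_01:.2f}")
```

Moving alpha from 0.05 to 0.01 lowers power for the same data, so the probability of a Type II error (1 minus power) goes up.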

Calculating Power – An Illustration

Determining the Sample Size Required for a Desired Amount of Power (Eq
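The chapter's sample-size equation is not reproduced in this transcript. As a stand-in, the sketch below uses the common normal-approximation formula n = ((z for alpha + z for power) × σ / δ)², which may differ from the textbook's version. The inputs (δ = 5, σ = 15, power = 0.80) are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(effect: float, sigma: float,
                         alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate n for a two-tailed one-sample z-test (normal approximation)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = std_normal.inv_cdf(power)          # z cutting off the desired power
    n = ((z_alpha + z_power) * sigma / effect) ** 2
    return ceil(n)  # round up so power is at least the target


n_needed = required_sample_size(effect=5, sigma=15)
print("subjects needed for 80% power:", n_needed)
```

Smaller expected differences or a stricter alpha push the required n up quickly, because the effect enters the formula squared.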

Video 8.4 : Power Simulations from Rice Virtual Labs

Assumption of Normality The robustness of certain statistics (e.g., t-tests) allows us to use them even when the assumption of normality is not met, as with skewed or non-mesokurtic distributions. As sample size increases, the shape of the sampling distribution approaches normality. A 1-tailed test requires more than 20 subjects.
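A small simulation can show why larger samples make the normality assumption less of a concern. This sketch draws repeated samples from a strongly skewed (exponential) population and measures the skewness of the resulting sample means. The population, the sample sizes, and the number of replications are all arbitrary choices for illustration.

```python
import random
from statistics import fmean, pstdev


def skewness(values: list) -> float:
    """Population skewness: mean cubed z-score (0 for a symmetric shape)."""
    mean = fmean(values)
    sd = pstdev(values)
    return fmean(((v - mean) / sd) ** 3 for v in values)


def sample_mean_skewness(n: int, replications: int = 5000, seed: int = 0) -> float:
    """Skewness of the sampling distribution of the mean for samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = [
        fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(replications)
    ]
    return skewness(means)


skew_small = sample_mean_skewness(n=2)
skew_large = sample_mean_skewness(n=30)
print(f"skewness of sample means, n = 2:  {skew_small:.2f}")
print(f"skewness of sample means, n = 30: {skew_large:.2f}")
```

The skewness of the sample means shrinks toward 0 as n grows, even though the population itself stays just as skewed.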

Sampling Distribution from Various Population Shapes From Glass and Hopkins, Statistical Methods in Education and Psychology, 3e, © 1996. Reprinted by permission of Pearson Education, Inc.

Sampling, Level of Measurement, and Homogeneity of Variance The theory behind random sampling is that the sample will be representative of the population. This is one of the assumptions of t-tests and reduces the chance of Type I error. Samples that are not randomly selected are referred to as non-probability samples, since there is a greater chance that the sample will not represent the population. In experimental research, we randomly assign subjects to groups and also examine any pretest differences that may exist among the groups.

Sampling, Level of Measurement, and Homogeneity of Variance Interval- or ratio-level data is one of the assumptions of t-tests and other parametric tests. Because t-tests are robust, even nominal and ordinal levels of measurement can be accommodated if the sample is reasonably large (e.g., 30 or more per group). This is based on the Central Limit Theorem.

Sampling, Level of Measurement, and Homogeneity of Variance Homogeneity of variance (an assumption of t-tests) means that the variances/standard deviations of the groups being compared are reasonably similar. Type I and Type II error rates are affected by unequal variances, by unequal or large sample sizes, and by the level at which alpha is set (0.05 vs. 0.01). See Table 8.7 (homogeneity of variance and sample size) and Figure 8.8 (relationship between sample size, variance, and alpha level).
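To see how unequal variances combine with unequal sample sizes, the sketch below compares the usual pooled-variance t-ratio with the unpooled (Welch) t-ratio. The Welch version is not discussed in the slides and is brought in here only as a point of comparison. The two groups are made-up numbers in which the smaller group has the much larger variance.

```python
from statistics import fmean, variance


def pooled_and_welch_t(group_a: list, group_b: list) -> tuple:
    """Return (pooled-variance t, Welch t) for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = variance(group_a), variance(group_b)  # sample variances
    diff = fmean(group_a) - fmean(group_b)

    # Pooled: assumes both groups share one common variance.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    t_pooled = diff / (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5

    # Welch: each group keeps its own variance in the standard error.
    t_welch = diff / (var_a / n_a + var_b / n_b) ** 0.5
    return t_pooled, t_welch


# Hypothetical data: small group with large spread, large group with small spread.
small_noisy = [10, 20, 30, 40]
large_tight = [18, 19, 20, 21, 22, 18, 19, 20, 21, 22, 20, 20]

t_pooled, t_welch = pooled_and_welch_t(small_noisy, large_tight)
print(f"pooled t = {t_pooled:.2f}, Welch t = {t_welch:.2f}")
```

In this arrangement the pooled t-ratio comes out noticeably larger than the unpooled one, which is how violating homogeneity of variance can inflate the chance of a Type I error. With equal group sizes the two ratios coincide.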

Relationship Between Size, Variance, and α Level From Glass and Hopkins, Statistical Methods in Education and Psychology, 3e, © 1996. Reprinted by permission of Pearson Education, Inc.

Statistical vs. Practical Significance Statistical significance is affected by many factors, such as: number of subjects; size of the difference; homogeneity of variance; normality of data; alpha level (0.05 vs. 0.01); other factors.

Statistical vs. Practical Significance Do the results have “real world” value or is the difference enough to actually make a difference? Do the results provide enough evidence of practical application? Results of the analysis may be significant, but hold little or no practical value. Results may indicate no statistical significance, but may be of practical value. Decisions are sometimes arbitrary and based on the researcher’s opinion or experiences. Decisions can also be made based upon theory and the research of others.

Research Example What is the effect of sleep deprivation on treadmill time to exhaustion, to the nearest minute?

Sleep Deprivation Data

Percent Improvement (Eq. 8.3) This indicates a rather minimal level of improvement.

Effect Size In the sleep deprivation example based on Cohen’s d, the effect size is small (guidelines p. 182). In this experiment, the sleep deprivation was responsible for changing the dependent variable by only 0.15 of a standard deviation.
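For readers who want to see how an effect size of this kind is obtained, here is a minimal sketch of Cohen's d computed as the mean difference divided by the pooled standard deviation. It assumes two independent groups. The treadmill times below are invented for illustration and are not the chapter's sleep deprivation data, so the result will not match the 0.15 reported above.

```python
from statistics import fmean, variance


def cohens_d(group_a: list, group_b: list) -> float:
    """Cohen's d: mean difference in units of the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (fmean(group_a) - fmean(group_b)) / pooled_var ** 0.5


# Hypothetical treadmill times to exhaustion, in minutes.
rested = [12, 15, 11, 14, 13]
sleep_deprived = [11, 14, 11, 13, 12]

d = cohens_d(rested, sleep_deprived)
print(f"Cohen's d = {d:.2f}")
```

Because d is expressed in standard deviation units, it can be compared across studies that measured the dependent variable on different scales.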

Omega Squared (ω²) In the sleep deprivation example, only 27% of the total variability in treadmill minutes is attributable to sleep deprivation; the remaining 73% resulted from other factors such as measurement error, preexisting differences between subjects, and other uncontrolled variables.