Confidence Intervals, Effect Size and Power Chapter 8

The Dangers of NHST >We have already discussed that statistically significant does not always mean practically important. >Be especially skeptical of studies with very large sample sizes that report significance without any of the information presented in this chapter.

Women are bad at Math. >While I joke about being bad at math, I really dislike when people tell me women are not good at math. Seriously, I was on the state math team. >Cool presentation

Practically Useful Numbers >So what do we use to judge a study's practical significance? Confidence intervals (small = good) Effect size (big = good)

Confidence Intervals >Point estimate: a summary statistic – one number used as an estimate of the population parameter, e.g., the mean. Just one number standing in for an aggregate.

Confidence Intervals >Interval estimate: based on our sample statistic, the range of sample statistics we would expect if we repeatedly sampled from the same population, e.g., a confidence interval. Tells you a lot at once!

Confidence Intervals >An interval estimate that includes the mean we would expect for the sample statistic a certain percentage of the time were we to sample from the same population repeatedly. >Typically set at 95% or 99% (hint: these match p < .05 and p < .01).

Confidence Intervals >The range around the mean when we add and subtract a margin of error. >Confirms the findings of hypothesis testing and adds more detail.

>Maybe a solution to the problems of NHST? Some argue we should just calculate means and CIs; if the CIs do not overlap, the means are different! >Only works sometimes, though…

Confidence Intervals >The calculations are not that different from what we've been doing: Take the z-scores that bound the middle 95% of the distribution. Translate those back into raw scores around the mean. (draw)

Calculating Confidence Intervals >Step 1: Draw a picture of a distribution that will include confidence intervals >Step 2: Indicate the bounds of the CI on the drawing >Step 3: Determine the z statistics that fall at each line marking the middle 95% >Step 4: Turn the z statistic back into raw means >Step 5: Check that the CIs make sense

Confidence Interval for a z Test
M_lower = −z(σ_M) + M_sample
M_upper = +z(σ_M) + M_sample
>Think about it: Why do we need two calculations?

Example >Let's compare the average calories consumed by patrons of Starbucks stores that posted calories on their menus with the population mean calories consumed by patrons of stores that did not post calories (Bollinger, Leslie, & Sorensen, 2010). >The population mean was 247 calories, and we treated 201 as the population standard deviation. The 1,000 people in the sample consumed a mean of 232 calories.
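As a quick cross-check on the slide formulas above, here is a minimal Python sketch of the 95% CI for these numbers (the variable names are ours, not from the slides):

```python
from statistics import NormalDist

# Numbers from the Starbucks example above
m_sample = 232        # sample mean (calories)
sigma = 201           # population standard deviation
n = 1000              # sample size

sigma_m = sigma / n ** 0.5                 # standard error of the mean

# z score that cuts off the middle 95% of the normal curve (about 1.96)
z_crit = NormalDist().inv_cdf(0.975)

# Slide formulas: M_lower = -z(sigma_M) + M_sample, M_upper = +z(sigma_M) + M_sample
m_lower = -z_crit * sigma_m + m_sample
m_upper = +z_crit * sigma_m + m_sample

print(f"95% CI: [{m_lower:.2f}, {m_upper:.2f}]")   # roughly [219.5, 244.5]
```

Because the population mean of 247 falls outside this interval, the interval agrees with a significant z test for this example.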

A 95% Confidence Interval, Part I

A 95% Confidence Interval, Part II

A 95% Confidence Interval, Part III

Think About It >Which do you think more accurately represents the data: a point estimate or an interval estimate? Why? >What are the advantages and disadvantages of each method?

Effect Size: Just How Big Is the Difference? >Significant = Big? >Significant = Different? >Increasing sample size will make us more likely to find a statistically significant effect

Effect Size: Just How Big Is the Difference?
>Fail to reject null, small effect size: the effect is probably not there; sample size is not the issue.
>Fail to reject null, big effect size: important! We may not have enough people in the study.
>Reject null, small effect size: the effect is probably not there; a large sample size may be causing the rejection (Type I error).
>Reject null, big effect size: important, and we found it! (power)

What is effect size? >The size of a difference that is unaffected by sample size: how much two populations do not overlap. >Allows standardization across studies.

Effect Size & Mean Differences Imagine both represent significant effects: Which effect is bigger?

Effect Size and Standard Deviation Note the spread in the two figures: Which effect size is bigger?

Increasing Effect Size >The amount of overlap between two distributions decreases when: 1. Their means are farther apart 2. The variation within each population is smaller

Calculating Effect Size >Cohen's d estimates effect size. >Assesses the difference between means using the standard deviation instead of the standard error.

Example >Recall the Starbucks calorie-posting data (Bollinger, Leslie, & Sorensen, 2010): population mean = 247 calories, population standard deviation = 201, and a sample of 1,000 people with a mean of 232 calories.
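Using the numbers above and the usual single-sample form d = (M − μ)/σ (the standard deviation, not the standard error), a minimal sketch:

```python
# Cohen's d for the Starbucks example: mean difference divided by the
# population standard deviation (not the standard error).
m_sample = 232   # sample mean (calories)
mu = 247         # population mean (calories)
sigma = 201      # population standard deviation

d = (m_sample - mu) / sigma
print(f"Cohen's d = {d:.2f}")   # about -0.07, a small effect by Cohen's conventions
```

Notice that sample size never enters the calculation, which is exactly why effect size is "unaffected by sample size."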

MOTE >Free effect size program Created by the DOOM lab! Yeah!

p_rep >PLEASE IGNORE THE p_rep NONSENSE IN YOUR BOOK. The book was published in 2011, well after the psychology community discovered that p_rep is mathematically flawed. Sheesh.

Statistical Power >The measure of our ability to reject the null hypothesis, given that the null is false The probability that we will reject the null when we should The probability that we will avoid a Type II error

Power Rangers Example >See handout!

Calculating Power >Step 1: Determine the information needed to calculate power: Population mean (μ_M) Standard error (σ_M; requires N and σ) Sample mean found (M)

Calculating Power >Step 2: Determine a critical z value (cutoff score) >Calculate the raw score for that z: M_needed = μ_M + (σ_M)(z)

Calculating Power >Step 3: Figure out the distance between M_needed and the sample M you estimated: z = (M_needed − M_sample) / σ_M >The percentage of the distribution above/below that score is the power (matching your hypothesis of higher or lower).
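Putting Steps 1–3 together, here is a minimal Python sketch of the power calculation for the Starbucks example, assuming a one-tailed test at α = .05 in the direction of fewer calories (this specific setup is our illustration, not the handout's):

```python
from statistics import NormalDist

# Step 1: information needed
mu = 247          # population mean (calories)
sigma = 201       # population standard deviation
n = 1000          # sample size
m_sample = 232    # sample mean, used as our estimate of the "true" mean

sigma_m = sigma / n ** 0.5                  # standard error

# Step 2: critical z for a one-tailed test at alpha = .05 (predicting a LOWER mean),
# converted back to a raw cutoff: M_needed = mu_M + (sigma_M)(z)
z_crit = NormalDist().inv_cdf(0.05)         # about -1.645
m_needed = mu + sigma_m * z_crit            # about 236.5 calories

# Step 3: distance between M_needed and the sample mean, in standard-error units
z = (m_needed - m_sample) / sigma_m

# Power = proportion of the assumed true distribution beyond the cutoff
# (below M_needed here, because we predicted a lower mean)
power = NormalDist().cdf(z)
print(f"power = {power:.2f}")               # roughly 0.76
```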

What increases power?

Increasing Alpha

Two-Tailed versus One-Tailed Tests

Increasing Sample Size or Decreasing Standard Deviation >Both of these decrease the standard error

Increasing the Difference Between the Means >This is the effect size

Factors Affecting Power
>Larger sample size increases power
>Higher alpha level increases power (e.g., from .05 to .10)
>One-tailed tests have more power than two-tailed tests
>Decreasing the standard deviation increases power
>Increasing the difference between the means increases power
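To see the sample-size factor concretely, this small sketch reruns the same one-tailed power calculation at several sample sizes (same assumed means, σ, and α as above; the function name is ours):

```python
from statistics import NormalDist

def one_tailed_power(mu, m_sample, sigma, n, alpha=0.05):
    """Power of a one-tailed z test that predicts a mean lower than mu."""
    sigma_m = sigma / n ** 0.5
    m_needed = mu + sigma_m * NormalDist().inv_cdf(alpha)
    return NormalDist().cdf((m_needed - m_sample) / sigma_m)

for n in (50, 200, 1000, 4000):
    print(n, round(one_tailed_power(mu=247, m_sample=232, sigma=201, n=n), 2))
# Power climbs toward 1.0 as n grows, even though the effect size never changes.
```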

Think About It >Why would it be important to know the statistical power of your study?

Meta-Analysis >Meta-analysis considers many studies simultaneously. >Allows us to think of each individual study as just one data point in a larger study.

The Logic of Meta-Analysis >STEP 1: Select the topic of interest >STEP 2: Locate every study that has been conducted and meets the criteria >STEP 3: Calculate an effect size, often Cohen’s d, for every study >STEP 4: Calculate statistics and create appropriate graphs
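For Steps 3 and 4, one simple way to combine the per-study effect sizes is a sample-size-weighted mean of Cohen's d (real meta-analyses more often weight by inverse variance); here is a minimal sketch with made-up placeholder studies:

```python
# Sample-size-weighted mean effect size; the study values are placeholders, not real data.
studies = [
    {"d": 0.20, "n": 40},
    {"d": 0.35, "n": 120},
    {"d": 0.10, "n": 80},
]

weighted_d = sum(s["d"] * s["n"] for s in studies) / sum(s["n"] for s in studies)
print(f"weighted mean d = {weighted_d:.2f}")   # 0.24 for these placeholder numbers
```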