Power analysis Chong-ho Yu, Ph.Ds.


What is Power?
Researchers always face the risk of failing to detect an effect that really exists. This is called a Type II error, and its probability is known as beta. In relation to Type II error, power is defined as 1 - beta. Statistical power is the probability of detecting a true difference as statistically significant.

Factors
Power is determined by the following: effect size, sample size, alpha level, and direction of the test (one- or two-tailed).

Effect size
The distance between the null and the alternative (how far away from zero?). In regression, no effect = zero slope (flat line). Red line: a very steep slope = stronger effect. Blue line: weaker effect → a big increase in X (run) leads to only a small increase in Y (rise).

Alpha level
The cut-off probability for deciding whether there is a significant effect. Common values are .10, .05, and .01; the most common is .05 (5%). Smaller is stricter. If I tell you I can correctly predict the gender of a randomly selected student, the probability that I am right by guessing is 50%. Big deal! If I can correctly predict the numbers of the next Powerball, the probability that I am right by guessing is tiny (roughly 1 in 292 million for the jackpot). I must have super power!

Alpha level
If most of my students increase their test scores by 50% after using my treatment, the probability of seeing such a result by luck alone might be as small as .05. If the probability (p value) yielded by a statistical test of my treatment is at or below the alpha level (.05), then I can declare that my treatment really works (the result is significant) and is not due to dumb luck.

Direction of the test
Non-directional hypothesis (two-tailed test): My treatment would make a difference on your face (it could be better or worse). Directional hypothesis (one-tailed test): My treatment would make things better.

G*Power

Larger effect size → Larger power
A bigger slope means a bigger effect (a stronger relationship between X and Y). When the effect size (ES) increases from .15 to .55, power increases from .44 to .99.

Larger sample size → Larger power
Revert the ES to .15. Change the sample size to 700. When the sample size increases from 100 to 700, power increases from .44 to .99.

Larger alpha level → Larger power
Revert the sample size to 100. Change the alpha level to .1. When the alpha level increases from .05 to .1, power increases from .44 to .59. Caution: Don’t change the alpha level unless you have a good reason.

Two-tailed → Smaller power
Revert the alpha level to .05. Change the number of tails from 1 to 2. When the test is two-tailed (non-directional) instead of one-tailed (directional), power decreases from .44 to .32.
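The four G*Power comparisons above can be roughly reproduced by hand. The sketch below is not what G*Power computes internally: it assumes the effect size behaves like a correlation between X and Y and uses the Fisher z normal approximation, so its results should only land within a couple of hundredths of the slide values. The function name approx_power is made up for this illustration.

```python
from math import atanh, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)


def approx_power(effect_size, n, alpha=0.05, tails=1):
    """Approximate power for testing a correlation-type effect against zero.

    Assumption: Fisher z normal approximation, not G*Power's exact method.
    """
    # Expected value of the test statistic when the effect really exists.
    shift = atanh(effect_size) * sqrt(n - 3)
    # Critical value; a two-tailed test splits alpha across both tails.
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / tails)
    # Probability that the statistic lands beyond the critical value.
    power = 1 - STD_NORMAL.cdf(z_crit - shift)
    if tails == 2:
        # Tiny chance of rejecting in the opposite tail.
        power += STD_NORMAL.cdf(-z_crit - shift)
    return power


scenarios = {
    "baseline: ES .15, n 100, alpha .05, one-tailed": approx_power(0.15, 100),
    "larger effect size (.55)": approx_power(0.55, 100),
    "larger sample size (700)": approx_power(0.15, 700),
    "larger alpha level (.1)": approx_power(0.15, 100, alpha=0.1),
    "two-tailed test": approx_power(0.15, 100, tails=2),
}
for label, value in scenarios.items():
    print(f"{label}: power is about {value:.2f}")
```

Changing one argument at a time, as the slides do in G*Power, shows each factor's effect on power in isolation.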

Absolute power corrupts absolutely
Absolute power corrupts (your research) absolutely: when the test is too powerful, even a trivial difference will be mistakenly reported as a significant one. You can confirm virtually anything (e.g., Chinese food can cause cancer) with a very large sample size. This type of error is called a Type I error (false claim).

Large sample → Over-power
In California the average SAT score is 1500. A superintendent wanted to know whether the mean score of his students was significantly behind the state average. With 50 students, an average SAT score of 1495, and a standard deviation of 100, a one-sample t-test yielded a non-significant result (p = .7252, above the alpha level of .05). The superintendent was relaxed and said, “We are only five points out of 1600 behind the state standard. This score difference is not a big deal.”
https://www.graphpad.com/quickcalcs/OneSampleT1.cfm?Format=SD

Performance gap
But a statistician recommended replicating the study with a sample size of 1,000. As the sample size increased, the variance decreased: while the mean remained the same (1495), the SD dropped to 50. This time the t-test showed a much smaller p value (.0016), and needless to say, this “performance gap” was considered statistically significant. Afterwards, the board called a meeting and the superintendent could not sleep. Someone should tell the superintendent that the p value is a function of the sample size and that this so-called "performance gap" may be nothing more than a false alarm.
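The arithmetic behind the two SAT results can be checked with a short sketch. It computes the one-sample t statistic and a standardized effect size (Cohen's d, which the slides do not mention and is added here for contrast). The p value uses a normal approximation as a simplifying assumption, so it comes out close to, but not exactly equal to, the t-based values quoted above. The function name one_sample_summary is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_summary(sample_mean, pop_mean, sd, n):
    """Return (t statistic, Cohen's d, approximate two-tailed p value)."""
    diff = sample_mean - pop_mean
    standard_error = sd / sqrt(n)  # shrinks as n grows
    t_stat = diff / standard_error
    cohens_d = diff / sd  # does not depend on n at all
    # Assumption: normal approximation to the t distribution.
    p_approx = 2 * NormalDist().cdf(-abs(t_stat))
    return t_stat, cohens_d, p_approx


small_study = one_sample_summary(1495, 1500, sd=100, n=50)
large_study = one_sample_summary(1495, 1500, sd=50, n=1000)

for name, (t_stat, d, p) in [("n = 50", small_study), ("n = 1000", large_study)]:
    print(f"{name}: t = {t_stat:.3f}, d = {d:.2f}, approximate p = {p:.4f}")
```

The five-point gap is the same in both studies and the standardized effect stays tiny; what changes the verdict is mainly the standard error, which falls as the sample size rises.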

A balance
Power analysis is a procedure for balancing Type I (false alarm) and Type II (miss) errors. Simon (1999) suggested following an informal rule: set alpha to .05 and beta to .2, so that power is expected to be .8. This rule implies that a Type I error is four times as costly as a Type II error.

A priori power analysis

Practical power analysis
Muller and LaVange (1992) asserted that the following should be taken into account in power analysis: money to be spent; personnel time of statisticians and subject matter specialists; and time to complete the study (deadline and opportunity cost).

Power plot to determine the sample size
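A power plot traces power against sample size so that the smallest adequate sample can be read off the curve. The sketch below tabulates a few points on such a curve and solves for the required sample size directly. It assumes a one-tailed test of a correlation-type effect under the Fisher z normal approximation, which is a simplification and not G*Power's exact routine. The function names and the example values (ES .15, target power .8) are chosen only for illustration.

```python
from math import atanh, ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def one_tailed_power(effect_size, n, alpha=0.05):
    """Approximate one-tailed power (Fisher z approximation, an assumption)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    return 1 - STD_NORMAL.cdf(z_crit - atanh(effect_size) * sqrt(n - 3))


def required_n(effect_size, target_power=0.80, alpha=0.05):
    """Smallest n whose approximate one-tailed power reaches target_power."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = STD_NORMAL.inv_cdf(target_power)
    # Solve atanh(ES) * sqrt(n - 3) = z_alpha + z_beta for n, then round up.
    return ceil(((z_alpha + z_beta) / atanh(effect_size)) ** 2 + 3)


# A few points on the power curve for ES = .15.
for n in range(100, 701, 100):
    print(f"n = {n}: power is about {one_tailed_power(0.15, n):.2f}")

n_needed = required_n(0.15)
print(f"Approximate n for power .80: {n_needed}")
```

Reading the table from top to bottom mirrors reading a power plot from left to right: power climbs quickly at first and then flattens as it approaches 1.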

Post hoc power analysis
In some situations researchers have no choice of sample size. For example, the number of participants may have been pre-determined by the project sponsor. In this case, power analysis should still be conducted to find out what the power level is given the pre-set sample size.

In-class assignment and homework
A priori power analysis: You are going to conduct a study to find out how well high school GPA predicts college test performance. Given that the desired power is .8, the effect size (slope) is .25, the alpha level is .05, and the test is two-tailed, how many subjects do you need?
Post hoc power analysis: You are given a data set. The sample size is 500 and all other settings are the same as above. What is the power level? Is it OK to proceed? Why or why not?
Download the document “discussion question of power” and follow the instructions.