Dr. Tom Kuczek, Purdue University

Power of a Statistical test

Power is the probability of detecting a difference in means under a given set of circumstances. Power, denoted 1 - β, depends on:
- n = the sample size in each group
- α = the probability of a Type I error
- σ = the standard deviation of the experimental error
- |µ1 - µ2| = the absolute difference between the means of the two treatment groups
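
As a concrete illustration (not part of the original slides), the sketch below computes 1 - β for a two-sample t-test using the statsmodels package; the means, σ, n, and α are made-up values chosen only to show the calculation.

```python
# Sketch: computing power (1 - beta) for a two-sample t-test with statsmodels.
# The scenario values (mu1, mu2, sigma, n, alpha) are illustrative, not from the slides.
from statsmodels.stats.power import TTestIndPower

mu1, mu2 = 10.0, 12.0   # treatment-group means
sigma = 4.0             # standard deviation of experimental error
n = 25                  # sample size in each group
alpha = 0.05            # probability of a Type I error

effect_size = abs(mu1 - mu2) / sigma   # |mu1 - mu2| / sigma
power = TTestIndPower().power(effect_size=effect_size, nobs1=n, alpha=alpha)
print(f"Power (1 - beta) = {power:.3f}")
```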

What affects the Power of a Statistical test?

Power increases under the following conditions (see the sketch below):
- n increases (may be costly)
- α increases (one reason α ≤ 0.05 is recommended)
- σ decreases (will be discussed shortly)
- |µ1 - µ2| increases (not an option if the treatments are fixed)
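
To make these four bullets concrete, here is a small sketch (again assuming statsmodels is available, with illustrative numbers) that varies one factor at a time and prints the resulting power.

```python
# Sketch: how power moves as each factor changes, holding the others fixed.
# All numbers are illustrative.
from statsmodels.stats.power import TTestIndPower

solver = TTestIndPower()

# Power rises with the per-group sample size n
print([round(solver.power(effect_size=0.5, nobs1=n, alpha=0.05), 2)
       for n in (10, 20, 40, 80)])

# Power rises with alpha (which is one reason alpha is capped, e.g. at 0.05)
print([round(solver.power(effect_size=0.5, nobs1=20, alpha=a), 2)
       for a in (0.01, 0.05, 0.10)])

# Power rises with |mu1 - mu2| / sigma (larger difference or smaller sigma)
print([round(solver.power(effect_size=es, nobs1=20, alpha=0.05), 2)
       for es in (0.2, 0.5, 0.8)])
```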

Experimental Error

Experimental error, quantified by σ, actually has a number of components, including:
- the variability in the experimental units
- the variability of the experimental conditions
- the variability inherent in the measurement system
- nuisance variables, which are generally unobserved
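
One way to see why each component matters (my own illustration, not from the slides): independent error sources add on the variance scale, so the total σ is the square root of the sum of the component variances. The short simulation below, with made-up component values, checks this.

```python
# Sketch: independent error sources add on the variance scale, so
# sigma_total = sqrt(sigma_units^2 + sigma_conditions^2 + sigma_measurement^2).
# The component standard deviations are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
units       = rng.normal(0, 1.0, n)   # variability in experimental units
conditions  = rng.normal(0, 0.5, n)   # variability of experimental conditions
measurement = rng.normal(0, 0.3, n)   # measurement-system variability

total = units + conditions + measurement
print(np.std(total), np.sqrt(1.0**2 + 0.5**2 + 0.3**2))   # both approximately 1.16
```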

Design issues related to increasing Power

Increasing the sample size n always works, but it may not be feasible due to financial constraints or physical constraints on the experiment (such as a limited amount of test material). If n and α are fixed, then the Power of the t-test depends only on the ratio |µ1 - µ2| / σ; as this ratio increases, so does the Power of the test. If the method of applying the treatments is fixed, then |µ1 - µ2| is likely fixed as well. If n is also fixed, then the only remaining option is to somehow reduce the experimental error σ.
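
A quick check of the ratio claim (statsmodels assumed, illustrative numbers): two scenarios with different |µ1 - µ2| and σ but the same ratio give the same power.

```python
# Sketch: with n and alpha fixed, two scenarios with the same |mu1 - mu2| / sigma
# ratio yield the same power. Numbers are illustrative.
from statsmodels.stats.power import TTestIndPower

solver = TTestIndPower()
n, alpha = 30, 0.05

# Scenario A: |mu1 - mu2| = 2, sigma = 4   -> ratio 0.5
# Scenario B: |mu1 - mu2| = 5, sigma = 10  -> ratio 0.5
for diff, sigma in [(2, 4), (5, 10)]:
    ratio = diff / sigma
    print(diff, sigma, round(solver.power(effect_size=ratio, nobs1=n, alpha=alpha), 4))
```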

How to reduce experimental error?

The best way is to use a better measurement system with lower variability; this also improves any estimates of treatment effects (example: qRT-PCR vs. microarray). Another way is to make the experimental conditions as tightly controlled as possible, but this may narrow the inference from the experiment if one wishes to claim that the treatment works well under a variety of conditions. Reducing variability in the experimental units likewise narrows inference, especially in clinical trials, since one would like to say the treatment works on as many subjects as possible.

Statistical vs. Practical Significance

If |µ1 - µ2| > 0, then for a large enough n the null hypothesis H0 will almost certainly be rejected, giving Statistical Significance. The problem is that if |µ1 - µ2| is "small", say near zero, there may be no practical difference between the responses to the two treatments. This matters when recommending which treatment to apply to a population: the treatment with the "better" mean may not make economic sense if there is no practical difference.
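
A sketch of this point (statsmodels assumed, made-up numbers): with a near-zero standardized difference, power still approaches 1 as n grows, so significance alone says nothing about practical importance.

```python
# Sketch: a "small" true difference becomes statistically significant once n is
# large enough, even though it may have no practical importance.
from statsmodels.stats.power import TTestIndPower

solver = TTestIndPower()
tiny_effect = 0.02   # |mu1 - mu2| / sigma near zero
for n in (100, 10_000, 100_000):
    print(n, round(solver.power(effect_size=tiny_effect, nobs1=n, alpha=0.05), 3))
# Power climbs toward 1 as n grows, so H0 is almost certainly rejected
# even though the difference is practically negligible.
```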

Be careful about Clinical Trials

There is a fine point here. With regard to clinical trials, a given subject may respond better to one treatment than to another. The ideal approach in this situation is to test each subject with both treatments at different time points. This will be addressed later in the semester when we discuss Crossover Designs.

How to address Practical Significance in reporting results of an experiment

In certain areas of Science, especially Biomedical Research, this is addressed by reporting the Effect Size. The Effect Size is a way to quantify practical significance for a population. The Effect Size, usually denoted ES (or d) in the literature, is defined by ES = |µ1 - µ2| / σ.
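
To estimate ES from data rather than from population values, the usual plug-in is Cohen's d with the pooled sample standard deviation; the sketch below uses simulated data (my own illustration, not from the slides).

```python
# Sketch: estimating ES from two samples with the pooled sample standard deviation
# (the usual Cohen's d plug-in estimate). The data here are simulated.
import numpy as np

rng = np.random.default_rng(1)
treatment = rng.normal(12.0, 4.0, 25)
control   = rng.normal(10.0, 4.0, 25)

n1, n2 = len(treatment), len(control)
pooled_var = ((n1 - 1) * treatment.var(ddof=1) +
              (n2 - 1) * control.var(ddof=1)) / (n1 + n2 - 2)
es = abs(treatment.mean() - control.mean()) / np.sqrt(pooled_var)
print(f"estimated ES = {es:.2f}")
```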

Historically…

The most cited paper in the clinical literature that I have seen is: Kraemer, H.C., "Reporting the Size of Effects in Research Studies to Facilitate Assessment of Practical or Clinical Significance," Psychoneuroendocrinology, Vol. 17, No. 6. (It is very readable, even for those who are not into clinical trials.)

Suggested guidelines for reporting mean differences in Clinical Data

- ES ~ 0.2 is "small"
- ES ~ 0.5 is "moderate"
- ES ~ 0.8 is "large"

The reasoning is based upon the shift in the mean of the Normal Distribution for the treatment group relative to the Control group (see the sketch below). Please note that while this is common terminology in the Clinical literature, the terminology can vary in other areas of Science or Engineering (though I don't know of other standard references).
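
One way to read these benchmarks in terms of the normal shift (my own illustration, assuming scipy and equal standard deviations in the two groups): for each value, compute how much of the treatment distribution lies above the control mean, and the probability that a treatment observation exceeds a control observation.

```python
# Sketch of the "shift in the Normal Distribution" reading of ES.
# Assumes normal responses with equal sigma in the two groups.
from scipy.stats import norm

for es in (0.2, 0.5, 0.8):
    above_control_mean = norm.cdf(es)            # P(treatment obs > control mean)
    prob_superiority   = norm.cdf(es / 2**0.5)   # P(treatment obs > control obs)
    print(f"ES={es}: {above_control_mean:.2f} of treatment above control mean, "
          f"P(trt > ctrl) = {prob_superiority:.2f}")
```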

Multiple experimental trials

Data from more than one trial can be combined. One example is a Randomized Complete Block Design (RCBD), which will be covered later. An increasingly popular technique for estimating an "overall" ES across multiple trials is Meta-Analysis, which can weight the individual Effect Sizes by the sample sizes used in the different experiments. (It is an advanced topic we will not cover in this course, and there are a number of books in print which treat it.)
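
A minimal sketch of the sample-size-weighting idea (illustrative numbers; real meta-analyses usually weight by inverse variance and distinguish fixed- vs. random-effects models):

```python
# Sketch: the simplest sample-size-weighted combination of effect sizes from
# several trials. The trial results below are hypothetical.
effect_sizes = [0.45, 0.30, 0.60]   # ES reported by three hypothetical trials
sample_sizes = [40, 120, 25]        # total n in each trial

overall_es = sum(es * n for es, n in zip(effect_sizes, sample_sizes)) / sum(sample_sizes)
print(f"weighted overall ES = {overall_es:.2f}")
```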