SJS SDI_181 Design of Statistical Investigations 18 Sample Size Determination Stephen Senn.

SJS SDI_182 A Note About Terminology The term sample is best reserved for cases where (usually) a representative subset (the sample) of a population is drawn. Nevertheless, in the context of experiments, when deciding on the size of the experiment one often refers loosely to sample size rather than experiment size.

SJS SDI_183 An Important Topic If your experiment/sample is too small you may have an inconclusive result. Consequence: you have wasted resources. If your experiment/sample is larger than necessary you could have reached an adequate result with less effort. Consequence: you have wasted resources. Hence getting it right is important.

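SJS SDI_184 Sampling Consider a simple random sample of size $n$ from a population with standard deviation $\sigma$. The standard error of the sample mean is $\mathrm{SE} = \sigma/\sqrt{n}$. Hence if we have some target precision for the standard error we can, with knowledge of $\sigma$, solve for $n$. In fact, it is often the case (but not always: practice differs) that we use a standard of precision for the 95% confidence interval instead. Since the limits are approximately 2 SE from the point estimate, we often solve $2\sigma/\sqrt{n} = w$ for $n$, where $w$ is the target half-width of the interval.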
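The precision-based calculation can be sketched in a few lines of Python. This is a minimal sketch, assuming the standard error of the mean of a simple random sample is $\sigma/\sqrt{n}$ and ignoring any finite population correction; the function names and the numbers in the usage lines are illustrative and not taken from the slides.

```python
import math


def n_for_standard_error(sigma, target_se):
    """Smallest n with sigma / sqrt(n) <= target_se (assumes sigma is known)."""
    return math.ceil((sigma / target_se) ** 2)


def n_for_ci_half_width(sigma, half_width, multiplier=2.0):
    """Smallest n with multiplier * sigma / sqrt(n) <= half_width.

    multiplier=2.0 is the 'approximately 2 SE' rule for a 95% interval;
    1.96 could be passed instead.
    """
    return math.ceil((multiplier * sigma / half_width) ** 2)


# Hypothetical inputs: sigma = 15 units, interval limits within 5 units.
n_ci = n_for_ci_half_width(sigma=15, half_width=5)
n_se = n_for_standard_error(sigma=10, target_se=2)
print(n_ci, n_se)
```

Because the required $n$ grows with the square of $\sigma/w$, halving the target half-width quadruples the sample size.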

SJS SDI_185 Sample Size and Clinical Trials Context in which sample size determination is well established, owing to ethical and commercial pressures. The usual approach is in terms of power, a concept in the Neyman-Pearson theory of testing established at UCL during the late 1920s and early 1930s. We shall now review this theory very generally before proceeding to a more formal treatment.

SJS SDI_186 Hypothesis Testing Nominate a null hypothesis: the default state of nature, the hypothesis one wishes to disprove. Example. Null: no difference between treatment and control. Alternative: treatment superior to control.

SJS SDI_187 Hypothesis Testing Continued Establish a suitable alternative hypothesis; usually we have a family of alternative hypotheses. Establish the distribution of a suitable statistic under the null. It should be salient for null and alternative: different values should be probable under null and alternative, so that a given value distinguishes between the two.

SJS SDI_188 Hypothesis Testing Choose the critical region of the test; the boundary of this is the critical value. The statistic should be unlikely to fall in this region if the null is true. The statistic should be likely to fall in this region if the alternative is true. The region should be chosen so as to fix the probability of falling in the region if the null is true.

SJS SDI_189

10 Type I and Type II Errors

SJS SDI_1811 Error rates $\alpha$ = size of test; $1-\beta$ = power of test

SJS SDI_1812

SJS SDI_1813 Power Probability of rejecting the null hypothesis given that it is false. (Probability of accepting the alternative if true.) For example, the probability of claiming a difference between treatments when they are different. This depends on the size of effect (amount of difference between treatments). Usually it is best to think of power as a function of the size of effect.
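To make the idea of power as a function of effect size concrete, here is a small Python sketch. It assumes a one-sided test based on a Normally distributed statistic with known standard error; the function name, the standard error of 71 and the list of differences are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Normal, mean 0 and sd 1


def power_one_sided(true_difference, se, alpha=0.025):
    """Probability that the statistic exceeds the critical value.

    Assumes the statistic is Normal with mean true_difference and known
    standard error se, and that large values lead to rejection.
    """
    critical_value = STD_NORMAL.inv_cdf(1 - alpha) * se
    return 1 - STD_NORMAL.cdf((critical_value - true_difference) / se)


# Hypothetical standard error; power rises with the true difference.
for diff in (0, 50, 100, 200, 300):
    print(diff, round(power_one_sided(diff, se=71.0), 3))
```

At a true difference of zero the function returns the size of the test, and it increases steadily as the difference grows, which is why a single quoted power figure only makes sense alongside the effect size it refers to.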

SJS SDI_1814

SJS SDI_1815 How to do a Power Calculation Assume the null hypothesis is true: establish the critical value as a function of the sample size. Assume the clinically relevant difference obtains: use the previously established critical value and calculate the probability of rejection. This is the power of the test. Modify the sample size to achieve the desired power.
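The steps above can be sketched as a simple search over the sample size. This is only an illustration under stated assumptions: the statistic is treated as Normal with a known standard error, the standard error is supplied as a function of $n$ (here a two-arm design with $\sigma\sqrt{2/n}$, which is an assumption at this point in the slides), and all names and numbers are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def critical_value(se, alpha):
    """Step 1: under the null, fix the rejection probability at alpha."""
    return STD_NORMAL.inv_cdf(1 - alpha) * se


def power(se, delta, alpha):
    """Step 2: probability of rejection when the true difference is delta."""
    c = critical_value(se, alpha)
    return 1 - STD_NORMAL.cdf((c - delta) / se)


def smallest_n(se_for_n, delta, alpha, target_power, n_max=100_000):
    """Step 3: increase n until the desired power is reached."""
    for n in range(2, n_max + 1):
        if power(se_for_n(n), delta, alpha) >= target_power:
            return n
    raise ValueError("target power not reached within n_max")


# Hypothetical two-arm design: sd 10, relevant difference 5, n per arm.
n_needed = smallest_n(lambda n: 10 * math.sqrt(2 / n),
                      delta=5, alpha=0.025, target_power=0.8)
print(n_needed)
```

A linear search is used for clarity; since power increases with $n$ here, a bisection search would work equally well.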

SJS SDI_1816 $d$: test statistic (assumed approximately normally distributed with expectation $\tau$); $\mathrm{SE}$: standard error (assumed known); $\alpha$: Type I error rate; $\beta$: Type II error rate; $\Delta$: clinically relevant difference (assumed positive); $c$: critical value of test; $\phi$: pdf of Normal; $\Phi$: distribution function of Normal; $\Phi^{-1}$: inverse distribution function (quantile)

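SJS SDI_1817 We assume a one-tailed test. (For a two-tailed test it is conventional to treat this as if it were a one-tailed test of size $\alpha/2$. The theory is then the same, making the necessary substitution.) First we establish the critical value. We require $c$ such that $P(d > c \mid H_0) = \alpha$, where $d$ is the test statistic. We assume that under $H_0$ its expectation is $\tau = 0$, so that $c = z_\alpha\,\mathrm{SE}$, where $\mathrm{SE}$ is the standard error of $d$ and $z_\alpha = \Phi^{-1}(1-\alpha)$.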

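SJS SDI_1818 Next we establish the power of the test. With test statistic $d$, expectation $\tau$, standard error $\mathrm{SE}$, clinically relevant difference $\Delta$ and critical value $c = z_\alpha\,\mathrm{SE}$, we have

$$1-\beta = P(d > c \mid \tau = \Delta) = 1 - \Phi\!\left(\frac{c-\Delta}{\mathrm{SE}}\right) = \Phi\!\left(\frac{\Delta}{\mathrm{SE}} - z_\alpha\right),$$

so that $\Delta/\mathrm{SE} = z_\alpha + z_\beta$ with $z_\beta = \Phi^{-1}(1-\beta)$, and hence

$$\frac{1}{\mathrm{SE}^2} = \frac{(z_\alpha + z_\beta)^2}{\Delta^2}.$$

This formula thus provides a target value for the reciprocal of the variance of the treatment estimate.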

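SJS SDI_1819 Now take a specific example, that of a parallel group trial with variance $\sigma^2$ and two groups each of size $n$. The standard error of the difference in means is $\sigma\sqrt{2/n}$, so that $\Delta/(\sigma\sqrt{2/n}) = z_\alpha + z_\beta$ and we have

$$n = \frac{2\sigma^2 (z_\alpha + z_\beta)^2}{\Delta^2} \qquad (3)$$

Note that $n$ is an increasing function of $\sigma$ and a decreasing function of $\Delta$. Also, as $\alpha$ and $\beta$ decrease, $z_\alpha$ and $z_\beta$ increase, so that $n$ is a decreasing function of $\alpha$ and $\beta$. (The more we wish our type one and two error rates to decrease, the greater our sample size must be.)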

SJS SDI_1820 An Example Placebo controlled parallel group trial in asthma. Target variable is FEV$_1$. Clinically relevant difference is 200 ml. Standard deviation is 450 ml. Two-sided significance test at the 5% level. Power is 0.8 or 80%.

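SJS SDI_1821 Solution $\Delta$ = 200 ml, $\sigma$ = 450 ml, $\alpha$ = 0.05 so $z_{\alpha/2}$ = 1.96 (NB two-sided test being used), $\beta$ = 1 − 0.8 = 0.2 so $z_\beta$ = 0.84. Substituting we have $n = 2 \times (450\ \text{ml})^2 (1.96 + 0.84)^2 / (200\ \text{ml})^2 = 79.38$. So about 80 patients per group are needed.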
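The arithmetic of this example can be checked with a short Python sketch. It assumes the Normal-approximation formula for a two-group parallel trial, $n = 2\sigma^2(z_{\alpha/2} + z_\beta)^2/\Delta^2$; the function name is illustrative, and the quantiles are computed rather than rounded to 1.96 and 0.84, so the unrounded result differs slightly from the hand calculation.

```python
import math
from statistics import NormalDist


def n_per_group_normal(delta, sigma, alpha_two_sided, power):
    """Patients per group under the Normal approximation (sigma assumed known)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha_two_sided / 2)  # two-sided test: use alpha/2
    z_beta = z.inv_cdf(power)                     # power = 1 - beta
    return 2 * sigma ** 2 * (z_alpha + z_beta) ** 2 / delta ** 2


# Asthma example: difference 200 ml, sd 450 ml, 5% two-sided, 80% power.
n_exact = n_per_group_normal(200, 450, 0.05, 0.80)

# Same formula with the rounded quantiles used in the hand calculation.
n_rounded = 2 * 450 ** 2 * (1.96 + 0.84) ** 2 / 200 ** 2

print(round(n_exact, 2), round(n_rounded, 2), math.ceil(n_exact))
```

Both versions round up to the same whole number of patients per group.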

SJS SDI_1822 Actually, in practice the standard error is not known and hence for carrying out the test an estimated standard error has to be substituted. This generally means that the test is based on the t-distribution rather than the Normal distribution and the theory needs to be adjusted accordingly. We take the specific example of the parallel group trial with two arms and $n$ patients per arm to illustrate this. For such a trial we estimate the standard error using the formula $\widehat{\mathrm{SE}} = s\sqrt{2/n}$, where $s$ is the pooled sample standard deviation.

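SJS SDI_1823 Consequently the critical value for the test statistic $d$ is not $z_\alpha\,\sigma\sqrt{2/n}$ but $t_{\alpha,\nu}\,s\sqrt{2/n}$, where $t_{\alpha,\nu}$ is the point on the integral of Student's t-distribution with $\nu = 2n-2$ degrees of freedom corresponding to a probability of $1-\alpha$. Hence we require

$$P\!\left(\frac{d}{s\sqrt{2/n}} > t_{\alpha,\nu} \,\middle|\, \tau = \Delta\right) = 1-\beta.$$

Now, given $\tau = \Delta$, the statistic on the LHS has a non-central t-distribution with $\nu_n = 2n-2$ degrees of freedom and non-centrality parameter $\delta_n = \Delta/(\sigma\sqrt{2/n})$. Hence we can solve (numerically) $P\big(T(\nu_n, \delta_n) > t_{\alpha,\nu_n}\big) = 1-\beta$ for $n$. In practice, however, this refinement usually makes little difference.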

SJS SDI_1824 Sample Size Determination in Practice There are many specialist packages now available for the statistician: nQuery, PASS, Power and Precision etc. A sample size of 81 in each group will have 80% power to detect a difference in means of 200 assuming that the common standard deviation is 450 using a two group t-test with a 0.05 two-sided significance level. This is the solution found for the previous example using nQuery. Note that use of the non-central t has led to a slightly larger sample size.

SJS SDI_1825 Practical Problems A number of practical problems remain, however. First, we should note that if the number of groups being compared is more than 2, if the design is not a parallel group trial, if the outcome is not Normally distributed, if the analysis is more complicated than that indicated above, if the trial is sequential, if the allocation ratio is not one to one, or if the purpose of the trial is to prove equivalence, a different approach will be required. These are technical problems, however, for which solutions can be found. Instead, some more practical issues are listed below.

SJS SDI_1826 Practical Issues Although the test itself does not require knowledge of $\sigma$, the formula (3) for the sample size does. In practice we use some previous estimate but this itself will be subject to sampling error and (3) does not take this into account. There is usually no agreed standard for a clinically relevant difference. The levels of $\alpha$ and $\beta$ are themselves arbitrary. An allowance must be made for drop-outs (patient withdrawals). It may be required that the results be robust to a number of analyses. This requires a larger sample. How do we trade off the interests of patients in the trial against those of future patients?
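One common way of making an allowance for drop-outs is to inflate the number of patients recruited so that the expected number completing the trial matches the calculated sample size. The sketch below assumes that simple convention, which is not stated in the slides; the function name and the drop-out rate are hypothetical.

```python
import math


def inflate_for_dropouts(n_evaluable, dropout_rate):
    """Patients to recruit so that about n_evaluable are expected to complete.

    Assumes drop-out is independent of treatment and outcome, which is a
    simplification; dropout_rate is an assumed proportion in [0, 1).
    """
    if not 0.0 <= dropout_rate < 1.0:
        raise ValueError("dropout_rate must be in [0, 1)")
    return math.ceil(n_evaluable / (1.0 - dropout_rate))


# Hypothetical: 81 evaluable patients per group, 25% expected to withdraw.
n_recruit = inflate_for_dropouts(81, 0.25)
print(n_recruit)
```

This adjustment only protects the precision of the estimate; it does nothing about bias if withdrawals are related to treatment or outcome.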

SJS SDI_1827 Criticisms of the NP Approach The criterion is very strange: fixed $\alpha$ and $\beta$. Why? There is no explicit mention of the costs of sampling: the same solution results however costly observations are to obtain.

SJS SDI_1828 Finally It must be understood that just because a sample size has been chosen which gives 80% power does not imply that there is an 80% chance that the trial will be successful. 1) The drug may not work 2) If it works it may not produce a clinically relevant difference. 3) The drug might have a greater effect than the clinically relevant difference. (Implies more power.) 4) The sample size determination depends on the assumption that the trial is run competently.

SJS SDI_1829 Questions Suppose you are estimating a sample size for a placebo controlled clinical trial in hypertension (say) but this is the first time the test drug has ever been used –How would you estimate the variance? If you had no previous information at all on which to base a variance estimate what could you do?