Statistical Power and Sample Size Calculations Drug Development Statistics & Data Management July 2014 Cathryn Lewis Professor of Genetic Epidemiology.

1 Statistical Power and Sample Size Calculations Drug Development Statistics & Data Management July 2014 Cathryn Lewis Professor of Genetic Epidemiology & Statistics Department of Medical & Molecular Genetics King’s College London With thanks to Irene Rebollo Mesa and Frühling Rijsdijk

2 Outline: Power and Sample size
1. Concepts of power
2. Power and types of error
3. Software to calculate power
4. Power for continuous outcome
5. Power for proportion, success/failure
6. Quiz!

3 Planning a Study
Question: What are the study endpoints?
Types of endpoints:
- Binary clinical outcome: death from disease
- Quantitative: creatinine, cholesterol levels, QOL
- Time to event: time to graft failure, time to death, time to recovery
Good qualities:
- Clinically meaningful
- Practical and feasible to measure
- Occur frequently enough throughout the duration of the trial

4 Planning a Study
Question: What is the expected prevalence of the outcome (discrete) or variability of the outcome (continuous)?
- Based on previous studies, a pilot study or a hospital/NHS report
- Variability and prevalence are vital for power. Both are best at intermediate levels.
Question: What is the expected difference between groups in the proportion of events (if discrete) or in the mean measure (if continuous)?
- Based on previous studies or a pilot study
- Alternatively, the minimum clinically relevant difference
- The larger the difference, the higher the power

5 Design: What is your Hypothesis?
1. Superiority
Objective: To determine whether there is evidence of a statistical difference in the comparison of interest between two Tx regimes:
A: Tx of interest
B: Placebo or active control Tx
H0: The two Txs have equal effect with respect to the mean response
H1: The two Txs are different with respect to the mean response

6 Statistical Power

7 Power
Definition: The expected proportion of samples in which we correctly reject the null hypothesis
It depends on:
1. Size of the (treatment) effect in the population (δ)
2. The significance level at which we reject the null (0.05)
3. Sample size (N)
4. Design of the study: parallel, crossover, etc.
5. Endpoint measurement (categorical, ordinal, continuous)
6. The expected dropout rate

8 Power primer
- We summarise the results of a trial in a statistical analysis with a test statistic (e.g. chi-squared, Z score)
- The test statistic provides a measure of support for a certain hypothesis
- We pre-determine a threshold on the test statistic for rejecting the null hypothesis
- YES or NO decision-making: significance testing
- This inevitably leads to two types of mistake:
  false positive (YES instead of NO) (Type I)
  false negative (NO instead of YES) (Type II)

9 Standard Case
[Figure: sampling distributions of the test statistic T if H0 were true and if HA were true, with the threshold at alpha = 0.05; shaded areas α and β. POWER: 1 - β]

10 [Table: test decision by true state]
H0 true, rejection of H0: Type I error = α (significance level)
HA true, rejection of H0: Power = 1 - type II error = 1 - β
HA true, non-rejection of H0: Type II error = β

11 Hypothesis testing
- Null hypothesis: no effect
- A ‘significant’ result means that we can reject the null hypothesis
- A ‘non-significant’ result means that we cannot reject the null hypothesis

12 Statistical significance
- The ‘p-value’: the probability of obtaining a result at least as extreme as the one observed if the null were in fact true
- Typically, we are willing to incorrectly reject the null 5% or 1% of the time (Type I error)

13 [Table: test decision by true state]
H0 true, rejection of H0: Type I error = α (significance level)
HA true, rejection of H0: Power = 1 - type II error = 1 - β
HA true, non-rejection of H0: Type II error = β

14 [Table: test decision by true state]
H0 true, rejection of H0: Type I error at rate α
H0 true, non-rejection of H0: Nonsignificant result (1 - α)
HA true, rejection of H0: Significant result (1 - β)
HA true, non-rejection of H0: Type II error at rate β
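
The four outcomes in this table can be checked by simulation. The following is a minimal sketch, not part of the original slides: it assumes a two-arm trial with a known common standard deviation analysed by a two-sided z-test, and the function name and the numbers (standard deviation 10, 100 patients per group, true difference 4) are illustrative assumptions.

```python
import random
from statistics import NormalDist


def rejection_rate(true_diff, sigma, n_per_group, alpha=0.05,
                   n_trials=4000, seed=1):
    """Fraction of simulated two-arm trials in which a two-sided z-test rejects H0.

    Hypothetical helper: each group's sample mean is drawn directly from its
    sampling distribution, N(mu, sigma / sqrt(n)), with sigma assumed known.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se_mean = sigma / n_per_group ** 0.5              # SD of one group's sample mean
    se_diff = (2 * sigma ** 2 / n_per_group) ** 0.5   # SD of the difference in means
    rejections = 0
    for _ in range(n_trials):
        mean_a = rng.gauss(true_diff, se_mean)
        mean_b = rng.gauss(0.0, se_mean)
        z = (mean_a - mean_b) / se_diff
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_trials


# H0 true: every rejection is a Type I error, so the rate should be close to alpha.
type_i_rate = rejection_rate(true_diff=0.0, sigma=10.0, n_per_group=100)
# HA true: the rejection rate is the power; the remainder is the Type II error rate.
power = rejection_rate(true_diff=4.0, sigma=10.0, n_per_group=100)

print(f"H0 true: Type I error rate = {type_i_rate:.3f}")
print(f"HA true: power = {power:.3f}, Type II error rate = {1 - power:.3f}")
```

With these assumed inputs, the simulated Type I error rate should sit near 0.05 and the simulated power near 0.8.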

15 Standard Case
[Figure: sampling distributions of the test statistic T if H0 were true and if HA were true, with the threshold at alpha = 0.05; shaded areas α and β. POWER: 1 - β]

16 Increased effect size
[Figure: same distributions with the HA distribution further from H0, alpha = 0.05. POWER: 1 - β ↑]

17 More conservative α
[Figure: same distributions with alpha = 0.01. POWER: 1 - β ↓]

18 Less conservative α
[Figure: same distributions with alpha = 0.1. POWER: 1 - β ↑]

19 Reduced variation
[Figure: narrower sampling distributions, alpha = 0.05. POWER: 1 - β ↑]
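
The direction of each change on slides 15 to 19 can be reproduced numerically. This is a rough sketch under assumptions that are not in the slides: a two-sided z-test comparing two means with equal group sizes and a known common standard deviation. The function name and the baseline values are illustrative only.

```python
from statistics import NormalDist


def power_two_means(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test for a difference in means.

    Hypothetical helper using the normal approximation with known sigma.
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Expected value of the z statistic under HA
    shift = abs(delta) / (sigma * (2 / n_per_group) ** 0.5)
    # Probability of landing in either rejection region under HA
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)


# Illustrative baseline (assumed values, not taken from the slides)
baseline = power_two_means(delta=4, sigma=10, n_per_group=100)
larger_effect = power_two_means(delta=5, sigma=10, n_per_group=100)
stricter_alpha = power_two_means(delta=4, sigma=10, n_per_group=100, alpha=0.01)
looser_alpha = power_two_means(delta=4, sigma=10, n_per_group=100, alpha=0.10)
less_variation = power_two_means(delta=4, sigma=8, n_per_group=100)

print(f"standard case:       {baseline:.3f}")
print(f"increased effect:    {larger_effect:.3f}")
print(f"alpha = 0.01:        {stricter_alpha:.3f}")
print(f"alpha = 0.10:        {looser_alpha:.3f}")
print(f"reduced variation:   {less_variation:.3f}")
```

Only the ordering matters here: power rises with a larger effect, a less conservative α or reduced variation, and falls with a more conservative α.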

20 Determining Sample Size
We need:
- Acceptable type I error rate (α), usually 0.05, or 0.025 if one-sided
- A meaningful difference δ in the response: the smallest Tx effect clinically worth detecting / that we wish to detect
- The desired power (1 - β) to detect this difference, min. 80%
- Ratio of allocation to the groups (equal sample sizes?)
- Whether to use a one-sided or two-sided test
In addition:
- The variability common to the two populations, for a continuous endpoint
- The response (event) rate of the control group, for a binary endpoint

21 Calculating power using software or Web
- PRISM StatMate ($50)
- G*Power 3 (free)
- Statistical software: SPSS, SAS, Stata, R
- PS Power and Sample size Calculation (free) (Windows)
- Web: Google “Statistical Power Calculation”
- Russell V. Lenth: http://www.stat.uiowa.edu/~rlenth/Power/
- David Schoenfeld: http://hedwig.mgh.harvard.edu/sample_size/size.html
- Perform the calculation with two methods and check that they give similar answers

22 Statistical Considerations
Russ Lenth’s Power and Sample size page http://www.stat.uiowa.edu/~rlenth/Power/

23 Statistical Considerations
http://hedwig.mgh.harvard.edu/sample_size/size.html

24 Determining Sample Size: Continuous outcome
Two anti-hypertensives:
- Testing for superiority
Endpoint: difference in diastolic BP
- Continuous variable
Relevant parameters:
- Difference in diastolic BP between drugs: δ = 2 mm Hg
- Standard deviation of diastolic BP in each group: σ = 10 mm Hg
- Significance level: 0.05
- Required power: 0.8
- Assume equal-sized groups
Calculate sample size required: 393 patients in each group
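
The figure of 393 per group can be reproduced with the standard normal-approximation formula n = 2(z_{1-α/2} + z_{1-β})² σ² / δ². The sketch below assumes a two-sided test and equal group sizes. The function name is illustrative, and the slides do not state which formula the software used, so treat this as one calculation that happens to agree with the slide.

```python
import math
from statistics import NormalDist


def n_per_group_means(delta, sigma, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided comparison of two means.

    Hypothetical helper using the normal approximation with a common,
    known standard deviation and 1:1 allocation.
    """
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = norm.inv_cdf(power)           # quantile matching the target power
    n = 2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2
    return math.ceil(n)                    # round up to whole patients


# Slide 24 inputs: difference 2 mm Hg, standard deviation 10 mm Hg
print(n_per_group_means(delta=2, sigma=10))  # 393
```

Because δ enters squared in the denominator, halving the detectable difference roughly quadruples the required sample size.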

25 Russ Lenth’s Power and Sample size page http://www.stat.uiowa.edu/~rlenth/Power/

26 Power and Sample size

27 Power, by difference between two groups

28 Continuous outcome
[Figure labels: Power; Significance level; Sample size (equal in each group? fixed ratio?); Difference in means; Standard deviation (equal in each group)]

29 Determining Sample Size: Discrete
Example: APT070 perfusion vs. cold storage of kidney
- Testing for superiority
Endpoint: delayed graft function after transplantation
- Proportion of patients experiencing delayed graft function
Relevant parameters:
- Baseline prevalence: 35%
- Minimum clinically significant difference: 10%
- p1 = 0.35, p2 = 0.25 [proportion with delayed graft function in each group]
- Significance level: α = 0.05
- Power = 80%
Calculate sample size required: 349 patients in each group
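
One calculation that reproduces 349 per group is the normal-approximation formula for two proportions with a continuity correction. The sketch below assumes a two-sided test and equal group sizes. The function name is illustrative, and whether the online calculator uses exactly this formula is an assumption. Without the correction the same inputs give 329.

```python
import math
from statistics import NormalDist


def n_per_group_proportions(p1, p2, alpha=0.05, power=0.80, continuity=True):
    """Per-group sample size for a two-sided comparison of two proportions.

    Hypothetical helper: normal approximation with 1:1 allocation and an
    optional continuity correction.
    """
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2)
    z_beta = norm.inv_cdf(power)
    p_bar = (p1 + p2) / 2                  # average proportion under H0
    diff = abs(p1 - p2)
    root = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
            + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2)))
    n = root ** 2 / diff ** 2              # uncorrected sample size
    if continuity:
        # Continuity correction applied to the uncorrected n
        n = n / 4 * (1 + math.sqrt(1 + 4 / (n * diff))) ** 2
    return math.ceil(n)                    # round up to whole patients


# Slide 29 inputs: p1 = 0.35, p2 = 0.25
print(n_per_group_proportions(0.35, 0.25))                    # 349
print(n_per_group_proportions(0.35, 0.25, continuity=False))  # 329
```

Different tools make different choices about the correction, which is one reason to run the calculation with two methods and compare.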

30 Russ Lenth’s Power and Sample size page http://www.stat.uiowa.edu/~rlenth/Power/

31 http://hedwig.mgh.harvard.edu/sample_size/size.html
With 349 patients on treatment A and 349 patients on treatment B there will be an 80% chance of detecting a significant difference at a two-sided 0.05 significance level. This assumes that the response rate of treatment A is 0.35 and the response rate of treatment B is 0.25.

32 Discrete outcome
[Figure labels: Power; Significance level; Sample size (equal in each group? fixed ratio?); Proportion responding in Group 1; Proportion responding in Group 2]

33 How to use power calculations
Use power prospectively for planning future studies
- Determine an appropriate sample size
- Evaluate a planned study: will it yield useful information?
Put science before statistics
- Use effect sizes that are clinically relevant
- Don’t get distracted by statistical considerations
Perform a pilot study
- Helps establish procedures, understand and protect against the unexpected
- Gives variance estimates needed in determining sample size

34 Design: What is your Hypothesis?
1. Superiority
2. Equivalence
Objective: To demonstrate that two treatments have no clinically meaningful difference
H0: The two Txs are different with respect to the mean response
H1: The two Txs are equal with respect to the mean response
d = largest difference clinically acceptable

35 Design: What is your Hypothesis?
3. Non-inferiority
Objective: To demonstrate that a given treatment is not clinically inferior to another
H0: A given Tx is inferior with respect to the mean response
H1: A given Tx is non-inferior with respect to the mean response

36 QUIZ
How many subjects? Which study needs the largest sample size?
Assume 80% power, α = 0.05, two-sided
                   Study A           Study B
1. Mortality       20% vs 10%        20% vs 15%
2. Mortality       20% vs 10%        40% vs 30%
3. Diastolic BP    80 vs 85 mmHg     90 vs 95 mmHg
                   St. dev 10        St. dev 10
4. Diastolic BP    80 vs 85 mmHg     80 vs 85 mmHg
                   St. dev 10        St. dev 8
For each question: (x) more with A, (y) more with B, (z) the same

37 ANSWERS
1. B: a small difference needs more subjects
2. B: bigger effect size in A (a two-fold difference in mortality); a smaller effect needs a larger sample size to detect
3. Same: only the standard deviation matters
4. A: a bigger standard deviation needs more subjects
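
The quiz answers can be checked with the same normal-approximation formulas used for the worked examples. This is a sketch under assumptions (two-sided α = 0.05, 80% power, equal groups, no continuity correction). The helper names are illustrative, and the exact numbers would shift slightly with other formulas while the ordering stays the same.

```python
import math
from statistics import NormalDist

NORM = NormalDist()
Z_ALPHA = NORM.inv_cdf(1 - 0.05 / 2)  # two-sided alpha = 0.05
Z_BETA = NORM.inv_cdf(0.80)           # 80% power


def n_means(mean_a, mean_b, sd):
    """Per-group n for comparing two means (hypothetical helper)."""
    delta = abs(mean_a - mean_b)
    return math.ceil(2 * (Z_ALPHA + Z_BETA) ** 2 * sd ** 2 / delta ** 2)


def n_props(p1, p2):
    """Per-group n for comparing two proportions, uncorrected (hypothetical helper)."""
    p_bar = (p1 + p2) / 2
    root = (Z_ALPHA * math.sqrt(2 * p_bar * (1 - p_bar))
            + Z_BETA * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2)))
    return math.ceil(root ** 2 / (p1 - p2) ** 2)


# (Study A, Study B) per-group sample sizes for each quiz question
quiz = {
    1: (n_props(0.20, 0.10), n_props(0.20, 0.15)),
    2: (n_props(0.20, 0.10), n_props(0.40, 0.30)),
    3: (n_means(80, 85, sd=10), n_means(90, 95, sd=10)),
    4: (n_means(80, 85, sd=10), n_means(80, 85, sd=8)),
}

for question, (n_a, n_b) in quiz.items():
    if n_a > n_b:
        verdict = "more with A"
    elif n_b > n_a:
        verdict = "more with B"
    else:
        verdict = "the same"
    print(f"{question}. Study A: {n_a}, Study B: {n_b} -> {verdict}")
```

Question 2 is the least intuitive: the absolute difference is 10 percentage points in both studies, but proportions nearer 50% have larger variance, so Study B needs more subjects.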

