1
542-05-#1 STATISTICS 542 Introduction to Clinical Trials SAMPLE SIZE ISSUES Ref: Lachin, Controlled Clinical Trials 2:93-113, 1981.
2
542-05-#2 Sample Size Issues
Fundamental Point: a trial must have sufficient statistical power to detect differences of clinical interest.
A high proportion of published "negative" trials do not have adequate power.
Freiman et al., NEJM (1978): 50 of 71 negative trials could have missed a 50% benefit.
3
542-05-#3 Example: How many subjects?
Compare a new treatment (T) with a control (C).
Previous data suggest a control failure rate (P_C) of about 40%.
The investigator believes treatment can reduce P_C by 25%, i.e. P_T = .30, P_C = .40.
N = number of subjects per group?
4
542-05-#4 Estimates are only approximate
– Uncertain assumptions
– Over-optimism about treatment
– Healthy screening effect
Need a series of estimates
– Try various assumptions
– Must pick the most reasonable
Be conservative, yet be reasonable.
5
542-05-#5 Statistical Considerations
Null Hypothesis (H_0): no difference in response exists between the treatment and control groups.
Alternative Hypothesis (H_A): a difference of a specified amount (δ) exists between treatment and control.
Significance Level (α) = Type I Error: the probability of rejecting H_0 given that H_0 is true.
Power = 1 − β (β = Type II Error): the probability of rejecting H_0 given that H_0 is not true.
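These definitions can be checked empirically. The following is a minimal simulation sketch in Python (scipy and numpy assumed available; the scenario P_C = .40 vs P_T = .30 with 478 per group is an illustrative choice) showing how the Type I error and power of a two-proportion z test can be estimated by Monte Carlo.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def rejection_rate(p_c, p_t, n, alpha=0.05, reps=20_000):
    """Fraction of simulated trials in which the two-sided pooled z test rejects H0."""
    x_c = rng.binomial(n, p_c, reps)           # control events
    x_t = rng.binomial(n, p_t, reps)           # treatment events
    p_hat_c, p_hat_t = x_c / n, x_t / n
    p_bar = (x_c + x_t) / (2 * n)              # pooled rate under H0
    se = np.sqrt(p_bar * (1 - p_bar) * 2 / n)
    z = (p_hat_c - p_hat_t) / se
    return np.mean(np.abs(z) > norm.ppf(1 - alpha / 2))

# Type I error: both groups truly have the same rate
print("empirical alpha:", rejection_rate(0.40, 0.40, n=478))   # ~0.05
# Power: rates differ by the clinically interesting amount
print("empirical power:", rejection_rate(0.40, 0.30, n=478))   # ~0.90
```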
6
542-05-#6 Standard Normal Distribution Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.
7
542-05-#7 Standard Normal Table Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.
8
542-05-#8 Distribution of Sample Means (1) Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.
9
542-05-#9 Distribution of Sample Means (2) Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.
10
542-05-#10 Distribution of Sample Means (3) Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.
11
542-05-#11 Distribution of Sample Means (4) Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.
12
542-05-#12 Test Statistics
13
542-05-#13 Distribution of Test Statistics
Many test statistics share a common form when testing a population parameter (e.g., a difference in means).
Let Θ = the sample estimate of the population parameter. Then
Z = [Θ − E(Θ)] / √V(Θ)
has an approximate Normal (0, 1) distribution for large sample sizes.
14
542-05-#14 If the statistic z is large enough (e.g., it falls into the red rejection region of the scale), we believe the result is too extreme to have come from a distribution with mean 0 (i.e., P_C − P_T = 0). Thus we reject H_0: P_C − P_T = 0, accepting that there is only a 5% chance that a result this extreme could have come from a distribution with no difference.
15
542-05-#15 Normal Distribution Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.
16
542-05-#16 Two Groups: test statistic for comparing two groups (shown on the slide in two alternative forms)
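The slide's formulas are not legible in this transcript. As a sketch, the standard large-sample z statistic for comparing two proportions is commonly written in two alternative forms, with a pooled or an unpooled variance estimate; the Python below illustrates both (the data values are made up for illustration).

```python
import numpy as np

def two_proportion_z(x_c, n_c, x_t, n_t, pooled=True):
    """Large-sample z statistic for H0: P_C = P_T."""
    p_c, p_t = x_c / n_c, x_t / n_t
    if pooled:
        p_bar = (x_c + x_t) / (n_c + n_t)                    # pooled variance form
        var = p_bar * (1 - p_bar) * (1 / n_c + 1 / n_t)
    else:
        var = p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t  # unpooled variance form
    return (p_c - p_t) / np.sqrt(var)

# illustrative data: 40% failures in control, 30% in treatment, 200 per group
print(two_proportion_z(80, 200, 60, 200))                 # pooled form
print(two_proportion_z(80, 200, 60, 200, pooled=False))   # unpooled form
```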
17
542-05-#17 Test of Hypothesis
                  Two-sided                          One-sided
Hypotheses        H_0: P_T = P_C                     H_0: P_T ≥ P_C (alternative H_A: P_T < P_C)
Classic test      reject H_0 if |z| > z_α            reject H_0 if z > z_α
Critical value    α = .05, z_α = 1.96                α = .05, z_α = 1.645
(z = test statistic, z_α = critical value)
Recommend z_α be the same value in both cases (e.g., 1.96): two-sided α = .05 or one-sided α = .025, z_α = 1.96.
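For reference, the critical values quoted above come from the standard normal quantile function; a minimal sketch (scipy assumed available):

```python
from scipy.stats import norm

print(norm.ppf(1 - 0.05 / 2))   # 1.96  : two-sided alpha = .05
print(norm.ppf(1 - 0.05))       # 1.645 : one-sided alpha = .05
print(norm.ppf(1 - 0.025))      # 1.96  : one-sided alpha = .025
```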
18
542-05-#18 Typical Design Assumptions (1)
1. α = .05, .025, .01
2. Power (1 − β) = .80, .90; should be at least .80 for design
3. δ = smallest difference we hope to detect, e.g. δ = P_C − P_T = .40 − .30 = .10, a 25% reduction!
19
542-05-#19 Typical Design Assumptions (2): table of standard normal constants by two-sided significance level and power
20
542-05-#20 Sample Size Exercise
"How many do I need?" The next question: "What's the question?"
The reason is that the sample size depends on the outcome being measured and on the method of analysis to be used.
21
542-05-#21 Simple Case – Binomial
1. H_0: P_C = P_T; H_A: δ = P_C − P_T
2. Test statistic (normal approximation)
3. Sample size: assume N_T = N_C = N
22
542-05-#22 Sample Size Formula (1): Two Proportions, Simple Case
Z_α = constant such that P{|Z| > Z_α} = α, two-sided! (e.g. α = .05, Z_α = 1.96)
Z_β = constant such that P{Z < Z_β} = 1 − β (e.g. 1 − β = .90, Z_β = 1.282)
Solve for N, or solve for Z_(1−β), or solve for δ.
23
542-05-#23 Sample Size Formula (2): Two Proportions
Z_α = constant such that P{|Z| > Z_α} = α, two-sided! (e.g. α = .05, Z_α = 1.96)
Z_β = constant such that P{Z < Z_β} = 1 − β (e.g. 1 − β = .90, Z_β = 1.282)
24
542-05-#24 Sample Size Formula
Power: solve for Z_(1−β).  Difference detected: solve for δ.
25
542-05-#25 Simple Example (1)
H_0: P_C = P_T; H_A: P_C = .40, P_T = .30, δ = .40 − .30 = .10
Assume α = .05, Z_α = 1.96 (two-sided); 1 − β = .90, Z_β = 1.282
p̄ = (.40 + .30)/2 = .35
26
542-05-#26 Simple Example (2)
Thus: a. N = 476 per group (2N = 952);  b. 2N = 956 (N = 478 per group)
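A minimal sketch of the two standard two-proportion sample-size formulas in Python (scipy assumed available). Which formula the slides label (a) vs (b) is inferred from the numbers above: (a) uses the pooled variance for the Z_α term and the unpooled variance for the Z_β term, (b) is the simpler fully pooled formula; both reproduce the example values 476 and 478.

```python
from math import sqrt
from scipy.stats import norm

def n_two_proportions_a(p_c, p_t, alpha=0.05, power=0.90):
    """Per-group N: Z_alpha term uses pooled variance, Z_beta term unpooled variance."""
    z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(power)
    p_bar = (p_c + p_t) / 2
    num = (z_a * sqrt(2 * p_bar * (1 - p_bar))
           + z_b * sqrt(p_c * (1 - p_c) + p_t * (1 - p_t)))
    return (num / (p_c - p_t)) ** 2

def n_two_proportions_b(p_c, p_t, alpha=0.05, power=0.90):
    """Per-group N from the simpler formula 2N = 4*pbar*qbar*(Z_a + Z_b)^2 / delta^2."""
    z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(power)
    p_bar = (p_c + p_t) / 2
    return 4 * p_bar * (1 - p_bar) * (z_a + z_b) ** 2 / (p_c - p_t) ** 2 / 2

print(round(n_two_proportions_a(0.40, 0.30)))   # -> 476 per group (2N = 952)
print(round(n_two_proportions_b(0.40, 0.30)))   # -> 478 per group (2N = 956)
```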
27
542-05-#27 Approximate* Total Sample Size for Comparing Various Proportions in Two Groups with Significance Level (α) of 0.05 and Power (1 − β) of 0.80 and 0.90
28
542-05-#28
29
542-05-#29 Comparison of Means
Some outcome variables are continuous:
– Blood pressure
– Serum chemistry
– Pulmonary function
Hypotheses are tested by comparison of mean values between groups, or comparison of mean changes.
30
542-05-#30 Comparison of Two Means
H_0: μ_C = μ_T, i.e. μ_C − μ_T = 0;  H_A: μ_C − μ_T = δ
The test statistic is based on the difference in sample means, which is approximately normally distributed.
Let N = N_C = N_T for design; the standardized statistic ~ N(0,1) under H_0.
31
542-05-#31 Comparison of Means Power Calculation
32
542-05-#32 Example
e.g. IQ: σ = 15, δ = 0.3 × 15 = 4.5
Set α = .05 (two-sided), β = 0.10, 1 − β = 0.90
H_A: δ/σ = 0.3
Sample size: N = 234 per group, 2N = 468
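A minimal Python sketch of the standard formula for comparing two means, N per group = 2(Z_α + Z_β)² / (δ/σ)², which reproduces the 234 above (scipy assumed available).

```python
from math import ceil
from scipy.stats import norm

def n_two_means(effect_size, alpha=0.05, power=0.90):
    """Per-group N for comparing two means; effect_size = delta / sigma."""
    z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(power)
    return ceil(2 * (z_a + z_b) ** 2 / effect_size ** 2)

print(n_two_means(0.3))   # -> 234 per group (2N = 468)
```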
33
542-05-#33
34
542-05-#34 Comparing Time to Event Distributions
– Primary efficacy endpoint is the time to an event
– Compare the survival distributions of the two groups
– Measure of treatment effect is the ratio of the hazard rates in the two groups (= ratio of the medians)
– Must also consider the length of follow-up
35
542-05-#35 Assuming Exponential Survival Distributions: then define the effect size by the standardized difference (formula shown on the slide)
36
542-05-#36 Time to Failure (1)
Use a parametric model for sample size. Common model – exponential:
– S(t) = e^(−λt), λ = hazard rate
– H_0: λ_I = λ_C
– Estimate N: George & Desu (1974)
Assumes all patients are followed to an event (no censoring) and all patients are entered immediately.
37
542-05-#37 Assuming Exponential Survival Distributions
Simple case: the statistical test is powered by the total number of events observed at the time of the analysis, d.
38
542-05-#38 Converting Number of Events (d) to Required Sample Size (2N)
d = 2N × P(event), so 2N = d / P(event)
P(event) is a function of the length of total follow-up at the time of analysis and the average hazard rate.
Let AR = accrual rate (patients per year)
A = period of uniform accrual (2N = AR × A)
F = period of follow-up after accrual is complete
A/2 + F = average total follow-up at the planned analysis
λ = average hazard rate
Then P(event) = 1 − P(no event).
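A minimal Python sketch of this conversion (scipy assumed available). Two pieces are assumptions for illustration, not taken from the slide: the required number of events uses the common log-hazard-ratio formula d = 4(Z_α + Z_β)² / [ln(λ_C/λ_I)]², and P(event) is approximated by 1 − e^(−λ(A/2 + F)), i.e. treating every patient as followed for the average time.

```python
from math import ceil, exp, log
from scipy.stats import norm

def required_events(lam_c, lam_t, alpha=0.05, power=0.90):
    """Total events d from the log hazard-ratio formula (assumed here)."""
    z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(power)
    return ceil(4 * (z_a + z_b) ** 2 / log(lam_c / lam_t) ** 2)

def total_sample_size(d, lam_bar, accrual_years, followup_years):
    """2N = d / P(event), with P(event) approximated as 1 - exp(-lam_bar*(A/2 + F))."""
    p_event = 1 - exp(-lam_bar * (accrual_years / 2 + followup_years))
    return ceil(d / p_event)

d = required_events(lam_c=0.3, lam_t=0.2)       # total events needed
print(d)
print(total_sample_size(d, lam_bar=0.25, accrual_years=3, followup_years=2))
```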
39
542-05-#39 Time to Failure (2)
In many clinical trials:
1. Not all patients are followed to an event (i.e., censoring)
2. Patients are recruited over some period of time (i.e., staggered entry)
More general model (Lachin, 1981), where g(λ) is defined as follows:
40
542-05-#40
1. Instant recruitment, study censored at time T
2. Continuous recruitment over (0, T), censored at T
3. Recruitment over (0, T_0), study censored at T (T > T_0)
41
542-05-#41 Example
Assume α = .05 (2-sided) and 1 − β = .90; λ_C = .3 and λ_I = .2; T = 5 years of follow-up, T_0 = 3
0. No censoring, instant recruitment: N = 128
1. Censoring at T, instant recruitment: N = 188
2. Censoring at T, continuous recruitment: N = 310
3. Censoring at T, recruitment to T_0: N = 233
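The g(λ) expressions are not legible in this transcript. The Python sketch below uses the exponential-model variance factors consistent with Lachin (1981) for the three censoring/recruitment patterns, with per-group N = (Z_α + Z_β)² [g(λ_C) + g(λ_I)] / (λ_C − λ_I)²; it reproduces the censored-case values above (188, 310, 233). The uncensored case on the slide (N = 128) appears to come from a log-hazard-ratio formulation and is not computed here.

```python
from math import exp
from scipy.stats import norm

def g_instant(lam, T):
    """Instant recruitment, censoring at time T."""
    return lam ** 2 / (1 - exp(-lam * T))

def g_continuous(lam, T):
    """Uniform recruitment over (0, T), censoring at T."""
    return lam ** 2 / (1 - (1 - exp(-lam * T)) / (lam * T))

def g_staggered(lam, T, T0):
    """Uniform recruitment over (0, T0), censoring at T (T > T0)."""
    return lam ** 2 / (1 - (exp(-lam * (T - T0)) - exp(-lam * T)) / (lam * T0))

def n_per_group(g_c, g_i, lam_c, lam_i, alpha=0.05, power=0.90):
    z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(power)
    return round((z_a + z_b) ** 2 * (g_c + g_i) / (lam_c - lam_i) ** 2)

lc, li, T, T0 = 0.3, 0.2, 5, 3
print(n_per_group(g_instant(lc, T), g_instant(li, T), lc, li))              # ~188
print(n_per_group(g_continuous(lc, T), g_continuous(li, T), lc, li))        # ~310
print(n_per_group(g_staggered(lc, T, T0), g_staggered(li, T, T0), lc, li))  # ~233
```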
42
542-05-#42 Sample Size Adjustment for Non-Compliance (1)
References:
1. Schork & Remington (1967) Journal of Chronic Diseases
2. Halperin et al. (1968) Journal of Chronic Diseases
3. Wu, Fisher & DeMets (1980) Controlled Clinical Trials
Problem: some patients may not adhere to the treatment protocol.
Impact: dilutes whatever true treatment effect exists.
43
542-05-#43 Sample Size Adjustment for Non-Compliance (2)
Fundamental principle: analyze all subjects randomized – the Intent-to-Treat (ITT) principle.
– Noncompliance will dilute the treatment effect.
A solution: adjust the sample size to compensate for the dilution effect (reduced power).
Definitions of noncompliance:
– Dropout: a patient in the treatment group stops taking therapy
– Dropin: a patient in the control group starts taking the experimental therapy
44
542-05-#44 Comparing Two Proportions
– Assumes event rates will be altered by non-compliance
– Define P_T* = adjusted treatment group rate, P_C* = adjusted control group rate
If P_T < P_C, then 0 ≤ P_T ≤ P_T* ≤ P_C* ≤ P_C ≤ 1.0
45
542-05-#45 Adjusted Sample Size – Simple Model
– Compute the unadjusted N
– Assume no dropins; assume dropout proportion R
– Thus P_C* = P_C and P_T* = (1 − R) P_T + R P_C
– Then adjust N by the factor 1/(1 − R)²
Example:
R     1/(1 − R)²    % Increase
.1    1.23          23%
.25   1.78          78%
46
542-05-#46 Sample Size Adjustment for Non-Compliance
Dropouts and dropins (R_0, R_1): adjust N by the factor 1/(1 − R_0 − R_1)²
Example:
R_0    R_1    1/(1 − R_0 − R_1)²    % Increase
.1     .1     1.56                  56%
.25    .25    4.0                   300% (4 times)
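A minimal Python sketch of this inflation rule (the unadjusted N of 478 is the earlier two-proportion example and is used purely for illustration).

```python
def adjusted_n(n_unadjusted, dropout=0.0, dropin=0.0):
    """Inflate N by 1 / (1 - R0 - R1)^2 to offset dilution from non-compliance."""
    inflation = 1.0 / (1.0 - dropout - dropin) ** 2
    return round(n_unadjusted * inflation)

print(adjusted_n(478, dropout=0.10))                # ~23% larger
print(adjusted_n(478, dropout=0.10, dropin=0.10))   # ~56% larger
print(adjusted_n(478, dropout=0.25, dropin=0.25))   # 4 times as large
```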
47
542-05-#47 More Complex Model
Ref: Wu, Fisher, DeMets (1980)
Further assumptions:
– Length of follow-up divided into intervals
– Hazard rate may vary
– Dropout rate may vary
– Dropin rate may vary
– Lag time for treatment to become fully effective
Sample size adjustments
48
542-05-#48 Example: Beta-Blocker Heart Attack Trial (BHAT) (1)
Used the complex model. Assumptions:
1. α = .05 (two-sided), 1 − β = .90
2. 3-year follow-up
3. P_C = .18 (control rate)
4. P_T = .13 (treatment assumed to give a 28% reduction)
5. Dropout 26% (12%, 8%, 6%)
6. Dropin 21% (7%, 7%, 7%)
49
542-05-#49 Example: Beta-Blocker Heart Attack Trial (BHAT) (2)
              Unadjusted       Adjusted
Control       P_C = .18        P_C* = .175
Treatment     P_T = .13        P_T* = .14
Effect        28% reduction    20% reduction
Per group     N = 1100         N* = 2000
Total         2N = 2200        2N* = 4000
50
542-05-#50 Multiple Response Variables
Many trials measure several outcomes (e.g., MILIS, NOTT).
Investigators must rank them by importance.
Do sample size calculations on a few outcomes (2-3).
If the estimates agree, fine; if not, a compromise must be sought.
51
542-05-#51 "Equivalency" or Non-Inferiority Trials
Compare a new therapy with the standard; wish to show the new one is "as good as" the standard.
Rationale may be cost, toxicity, or profit.
Examples:
– Intermittent Positive Pressure Breathing Trial: expensive IPPB vs. a cheaper treatment
– Nocturnal Oxygen Therapy Trial (NOTT): 12 hours of oxygen vs. 24 hours
Problem: cannot show H_0: δ = 0.
A solution: specify a minimum difference δ = δ_min.
52
542-05-#52 Sample Size Formula: Two Proportions, Simple Case
Z_α = constant associated with α
Z_β = constant associated with 1 − β
Solve for Z_(1−β) or δ.
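A minimal Python sketch of one common way to size a non-inferiority comparison of two proportions (scipy assumed available). The conventions here, a one-sided test at level α against the margin δ_min with both arms assumed to have the same true rate P_C, are illustrative assumptions and not stated on the slide.

```python
from math import ceil
from scipy.stats import norm

def n_noninferiority(p_c, margin, alpha=0.025, power=0.90):
    """Per-group N to rule out a true difference larger than `margin`
    (one-sided alpha), assuming both arms truly have event rate p_c."""
    z_a, z_b = norm.ppf(1 - alpha), norm.ppf(power)
    return ceil(2 * p_c * (1 - p_c) * (z_a + z_b) ** 2 / margin ** 2)

# e.g. control rate 40%, willing to tolerate at most a 10% absolute difference
print(n_noninferiority(0.40, margin=0.10))
```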
53
542-05-#53 Difference in Events Test Drug – Standard Drug
54
542-05-#54 Midstream Adjustments
Murphy's Law applies to sample size: early results may show that the event-rate assumptions were far off and that the power of the study is very inadequate.
Problem:
– Quit?
– Continue toward almost certain doom?
– Adjust the sample size?
– Extend follow-up?
Early decision: best to decide early, without looking at treatment comparisons.
55
542-05-#55 Adaptive Designs
One class allows re-estimating the sample size once the trial is underway
– Cui et al.
– Chen, Lan & DeMets
These methods have been criticized for allowing bias (e.g., Mehta & Tsiatis); thus they are still not widely used.
– The AHEFT trial is one example of their use.
Will be discussed later in the data monitoring lecture.
56
542-05-#56 Event Rate Assumptions
It is challenging to get event-rate assumptions correct:
– Inclusion/exclusion criteria effects
– Healthy volunteer effect
– Changing background therapy / standard of care
– Even when trials are conducted back to back
57
542-05-#57 PRAISE I vs PRAISE II Placebo arms
58
542-05-#58 Event-Driven Trials
For time-to-event trials, most of the information is in the events; power is a function of the number of events.
The real target is therefore the number of observed events (d), not the total sample size (2N).
Thus, target the number of events.
59
542-05-#59 Event-Driven Trials
The trial can be adjusted or adapted to hit the target number of events if the assumed event rate was too high. The steering committee can:
– Increase the sample size
– Increase follow-up
– Do a combination of both
60
542-05-#60 Examples of Event-Driven Trials
– PROMISE (based on control arm)
– PRAISE I & II
– COPERNICUS
– CARS (based on control arm)
61
542-05-#61 Response Adaptive Designs
The observed treatment effect may differ from (i.e., be less than) the assumed effect because:
– The treatment is actually less effective
– Compliance is worse than assumed
– Background therapy has changed
A smaller observed effect may still be of clinical interest if it is real.
62
542-05-#62 Response Adaptive Designs
In that case the probability of rejecting H_0 is also small:
– Power
– Conditional power
The question is whether to:
– quit and start over, or
– make a design modification and continue
63
542-05-#63 Response Adaptive Designs
Stopping and starting over is problematic:
– Waste of financial resources
– Ethical issues of wasting the contributions of patients who have already participated
We probably cannot afford a policy of designing all trials for the minimum treatment effect of interest.
64
542-05-#64 Response Adaptive Designs
Adjust/increase the sample size if the assumed treatment effect was too large.
Traditionally, this approach has been discouraged; recent methodology suggests possible approaches.
65
542-05-#65 Response Adaptive Designs
These methods are relatively new and still controversial; many leading biostatisticians are very critical (e.g., Fleming, Emerson, Turnbull, Tsiatis).
The issues often go beyond statistical control of the Type I error:
– Introducing other sources of bias
66
542-05-#66 Response Adaptive Designs
Increasing the sample size based on the observed treatment effect may inflate the false positive rate:
– By 30 to 40% (Cui et al.)
– It can even double (Proschan et al.)
Inflation of the Type I error of that magnitude is not acceptable.
67
542-05-#67 Response Adaptive Designs
Statistical adjustments to control alpha:
– Weighted z-statistic
– Adjustment to the critical value
– Enforcing rules for sample size recalculation
68
542-05-#68 Weighted Z Statistic
References:
– Cui, Hung & Wang (1999, Biometrics)
– Fisher (1998, Statistics in Medicine)
– Shen & Fisher (1999, Biometrics)
– Tsiatis & Mehta (2003, Biometrika)
69
542-05-#69 Weighted Z
Notation:
– X_i ~ N(0,1) distribution
– n = current sample size
– N_0 = initial total sample size
– a = hypothesized treatment effect
– t = n / N_0
70
542-05-#70 Weighted Z
N = proposed sample size based on the interim results.
Reject H_0 if the weighted statistic exceeds the critical value.
Note: less weight is assigned to the new/additional observations.
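The slide's formulas are not legible here. As a sketch of the general idea behind the Cui-Hung-Wang weighted statistic: the first-stage and incremental second-stage z statistics are combined with weights √t and √(1 − t) fixed by the originally planned information fraction t = n/N_0, regardless of how large the second stage actually becomes; reject H_0 if the combined statistic exceeds the usual critical value. The Python below is a minimal illustration with simulated unit-variance data (all numbers are made up for illustration).

```python
import numpy as np
from scipy.stats import norm

def weighted_z(first_stage, second_stage, t):
    """Weighted combination: weights fixed by the planned information
    fraction t = n / N0, independent of the realized second-stage size,
    which is what preserves the Type I error."""
    z1 = np.mean(first_stage) * np.sqrt(len(first_stage))    # stage-1 z (unit-variance data)
    z2 = np.mean(second_stage) * np.sqrt(len(second_stage))  # stage-2 z, new data only
    return np.sqrt(t) * z1 + np.sqrt(1 - t) * z2

rng = np.random.default_rng(1)
stage1 = rng.normal(0.2, 1, 100)    # first n = 100 observations
stage2 = rng.normal(0.2, 1, 300)    # enlarged second stage (more than planned)
z_w = weighted_z(stage1, stage2, t=100 / 200)     # originally planned N0 = 200
print(z_w, z_w > norm.ppf(0.975))   # reject H0 at two-sided alpha = .05?
```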
71
542-05-#71 Weighted Z
It is possible to modify the design and increase the sample size based on an interim analysis while controlling the Type I error.
Flexibility has a price.
72
542-05-#72 Tsiatis-Mehta Criticism
They argue that a properly designed group sequential trial is more efficient than these adaptive designs.
The challenge is to design "properly" (and that can be a bigger challenge than often realized).
73
542-05-#73 Weighted/Unweighted Modification
Both keep the Type I error below α; no real loss of power.
Ref: Chen, DeMets, Lan
74
542-05-#74 P-Value Method
Reference: Proschan & Hunsberger (1995, Biometrics)
– Requires a "promising" p-value before allowing an increase in sample size
– Requires stopping if the first-stage p-value is not promising
– Requires a larger critical value at the "second stage" to control the Type I error
75
542-05-#75 P-Value Method (one-sided alpha = 0.05)
P(1):   .10    .15    .20    .25    .50
Z(2):   1.77   1.82   1.85   1.875  1.95
The second-stage critical value Z(2) applies regardless of n_2.
76
542-05-#76 Proschan & Hunsberger Method
The simple method may make the Type I error substantially less than 0.05.
They developed another method to obtain the exact Type I error as a function of Z_1 and n_2, using a conditional-power-type calculation (details to be discussed later).
77
542-05-#77 Proschan & Hunsberger: conditional power and p-value required in stage 2 as a function of R = n_2/n_1 for the NHLBI Type II study example
78
542-05-#78 Proschan & Hunsberger Allows for sample size adjustment based on observed treatment effect Requires increasing final critical value
79
542-05-#79 Adaptive Design Remarks
– A need exists for adaptive designs (even FDA statisticians agree)
– Technical advances have been made through several new methods
– Adaptive designs are still not widely accepted and are subject to (strong) criticism
– May be useful for non-pivotal trials
– Practice precedes theory; acceptance may come in time
80
542-05-#80 Sample Size Summary
– Ethically, the size of the study must be large enough to achieve the stated goals with reasonable probability (power)
– Sample size estimates are only approximate, due to uncertainty in the assumptions
– Need to be conservative but realistic
81
542-05-#81 Demo of Sample Size Program
www.biostat.wisc.edu
The program covers comparison of proportions, means, and time to failure.
One can vary the control-group rates or responses, alpha and power, and the hypothesized differences.
The program produces a sample size table and a power curve for a particular sample size.
82
542-05-#82 Sample Size Program Output
83
542-05-#83 Union Terrace/Lakefront