Division of Biometrics III/Office of Biostatistics/OPaSS/CDER/FDA

1 Division of Biometrics III/Office of Biostatistics/OPaSS/CDER/FDA
On Some Statistical Considerations in Testing for Multiple Endpoints in Clinical Trials Mohammad Huque, Ph.D. Division of Biometrics III/Office of Biostatistics/OPaSS/CDER/FDA ASA Biopharm Section FDA/Industry Workshop, September 21-23, 2004, Washington, D.C. 11/20/2018

2 Disclaimer
The views in this presentation do not necessarily reflect those of the Food and Drug Administration.

3 Outline
Concepts: nature of the relationship between endpoints
Issue #1: Multiple primary endpoints are often highly correlated. How can this be used when adjusting for multiplicity?
Issue #2: Sequential analysis of endpoints is becoming increasingly popular. How can some of the difficulties it poses be reconciled?
Issue #3: The problem of statistical testing when more than one primary endpoint must show statistical significance for the effectiveness results to be clinically persuasive (to be presented at the PhRMA Meeting, October 2004, Washington, D.C.)

4 Triaging of multiple endpoints into meaningful families by trial objectives
Hierarchically ordered families: 1) prospectively defined; 2) familywise type I error (FWTE) rate controlled
  Primary endpoints
  Secondary endpoints
  Exploratory endpoints (often not prospectively defined)
Primary endpoints are the primary focus of the trial. Their results determine the main benefits of the clinical trial's intervention.
Secondary endpoints by themselves are generally not sufficient for characterizing treatment benefit. They are generally tested for statistical significance, for extended indication and labeling, after the primary objectives of the trial are met.

5 Nature of relationships between endpoints
Statistical independence and dependence concepts (familiar to statisticians)
Causal dependence between endpoints (related to treatment effect): if endpoint X shows an effect, then endpoint Y will also show an effect, and vice versa
Examples:
  Diabetes trials: HbA1c and fasting glucose levels
  CHF trials: CHF-related deaths and all-cause mortality
  ITT versus PP endpoints
Correlation between endpoints does not necessarily imply this causal dependence (a surrogate endpoint and a clinical endpoint may be correlated without this property).

6 Extent of multiplicity adjustments between endpoints
[Figure: 2 x 2 grid. Vertical axis: correlation (low to high). Horizontal axis: causal dependence, i.e. homogeneity of treatment effects across endpoints (low to high). Cell labels: small adjustments; practically no adjustments; large adjustments; good case for combining endpoints.]

7 Issue #1: Multiple primary endpoints are often highly correlated.
How to take advantage of this in adjusting for multiplicity?

8 Adjusting for multiplicity for moderately to highly correlated endpoints
For K = 2, 3: fairly easy to handle. Examples:
  Sidak-type adjustments (K = 2, 3)
  Hochberg's method (K = 2) with correction for correlation
  Closed testing using the Simes test (K = 2, 3) with correction for correlation
For K > 3: ad hoc procedures
  Tukey-Ciminera-Heyse's method (1985)
  Modifications of Dubey's method (1985) [Armitage-Parmar, ]
Other methods:
  Bootstrap methods (Westfall, 1992)
  O'Brien's OLS/GLS tests (1984)

9 2 Endpoint Case: Sidak type adjustments
Assumption: the test statistics Z1 and Z2 follow a bivariate normal distribution. Overall α = 0.025, 1-sided tests.
[Table: adjusted α by correlation. Columns: Corr; (1) Adj α; *(Adj α); Adj α2. The numeric entries are not preserved.]
(1) Equal adjustments for both endpoints
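A minimal sketch of how an equal Sidak-type adjustment for two correlated endpoints could be computed, assuming the bivariate normal model stated on the slide. The function names, the Simpson integration and the bisection search are illustrative choices, not taken from the presentation. The per-endpoint one-sided level is chosen so that the probability of rejecting at least one true null hypothesis equals the overall α.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def prob_both_below(c, rho, n=2000):
    """P(Z1 <= c, Z2 <= c) for a standard bivariate normal with |rho| < 1.

    Integrates pdf(z) * P(Z2 <= c | Z1 = z) over z in [-8, c] with
    Simpson's rule (n must be even).
    """
    lower = -8.0
    scale = math.sqrt(1.0 - rho * rho)
    h = (c - lower) / n
    total = 0.0
    for i in range(n + 1):
        z = lower + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * norm_pdf(z) * norm_cdf((c - rho * z) / scale)
    return total * h / 3.0


def sidak_type_alpha(rho, alpha=0.025):
    """Equal per-endpoint one-sided alpha giving overall type I error alpha."""
    lo, hi = 0.0, 6.0  # bracket for the common critical value
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if 1.0 - prob_both_below(mid, rho) > alpha:
            lo = mid  # overall error still too large: raise the critical value
        else:
            hi = mid
    return 1.0 - norm_cdf(0.5 * (lo + hi))


for rho in (0.0, 0.5, 0.9):
    print(rho, round(sidak_type_alpha(rho), 5))
```

With zero correlation this reduces to the usual Sidak level 1 − (1 − α)^(1/2), and the adjusted level moves toward the unadjusted α as the correlation increases.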

10 2 Endpoint Case: Adjustment in the Hochberg method
Test statistics Z1 and Z2 follow a bivariate normal distribution. Overall α = 0.05, 2-sided tests.
[Table columns: r; type I error rate; adjustment factor C; type I error rate; level at which to test the smaller p-value. The numeric entries are not preserved.]
If max(p1, p2) < 0.05, then both endpoints are significant.
If max(p1, p2) ≥ 0.05, then test the smaller p-value at level C × 0.05/2.
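The two-endpoint Hochberg decision rule with a correction factor can be sketched as below. The function name is hypothetical, and the factor `c` is an input: the values of C for each correlation are not derived here, and `c = 1.0` corresponds to the ordinary Hochberg method.

```python
def hochberg_two_endpoints(p1, p2, alpha=0.05, c=1.0):
    """Return (reject_H01, reject_H02) for two endpoints.

    c is the correlation correction factor C (an assumed input);
    c = 1.0 gives the unadjusted Hochberg procedure.
    """
    # Step 1: if the larger p-value is significant, both endpoints are.
    if max(p1, p2) < alpha:
        return True, True
    # Step 2: otherwise only the smaller p-value can be significant,
    # and it is tested at the corrected level C * alpha / 2.
    threshold = c * alpha / 2.0
    if p1 <= p2:
        return p1 < threshold, False
    return False, p2 < threshold


print(hochberg_two_endpoints(0.03, 0.20))          # unadjusted: 0.03 >= 0.025
print(hochberg_two_endpoints(0.03, 0.20, c=1.3))   # corrected level 0.0325
```

The second call illustrates why a factor above 1 matters: the same pair of p-values yields a rejection only when the correlation correction is applied.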

11 3 Endpoint Case: Sidak type adjustments
Test statistics Z1, Z2 and Z3 follow a trivariate normal distribution. Overall α = 0.025, 1-sided tests.
[Table columns: r12; r13; r23; (1) Adj α; *(Adj α); (2) Adj α2. The numeric entries are not preserved.]
(1) Equal adjustments for all 3 endpoints
(2) α1 = 0.02 for the 1st endpoint, and adjusted α2 = adjusted α3

12 3 Endpoint Case: closed testing using Simes test
Step 1: Simes test at level 0.05 using all endpoints Y1, Y2 and Y3, with correction factor C (with C = 1 the test is conservative for high endpoint correlation).
Step 2: if rejected, apply the Simes test with C to each pair: (Y1, Y2), (Y1, Y3), (Y2, Y3).
Step 3: if rejected, test the individual endpoints. In the example shown: endpoint Y1, P < 0.05; endpoint Y2, P > 0.05; endpoint Y3, P > 0.05.

13 Correction factor C for the Simes test, K=3
Test statistics Z1, Z2 and Z3 follow a trivariate normal distribution. α = 0.05, 2-sided tests.
[Table columns: r; type I error rate; adjustment factor C; type I error rate. The numeric entries are not preserved.]
Effectiveness in at least one endpoint if, with ordered p-values P(1) ≤ P(2) ≤ P(3):
  P(3) < 0.05, or
  {P(3) ≥ 0.05, P(2) < 0.05 × 2/3 × C}, or
  {P(3) ≥ 0.05, P(2) ≥ 0.05 × 2/3 × C, P(1) < 0.05 × 1/3 × C}.
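A sketch of the corrected Simes rule and of closed testing built on it. The function names and the `c_by_size` argument are hypothetical: since the correction factor depends on the number of endpoints in the test, it is passed in per subset size rather than computed, and it defaults to 1 (the ordinary Simes test).

```python
from itertools import combinations


def simes_reject(p_subset, alpha=0.05, c=1.0):
    """Corrected Simes test: the largest p-value is compared with alpha,
    the i-th smallest of k with alpha * (i / k) * C."""
    ordered = sorted(p_subset)
    k = len(ordered)
    if ordered[-1] < alpha:
        return True
    return any(p < alpha * i / k * c
               for i, p in enumerate(ordered[:-1], start=1))


def closed_simes(p_values, alpha=0.05, c_by_size=None):
    """Closed testing: endpoint i is significant only if every subset of
    endpoints containing i is rejected by its own Simes test."""
    c_by_size = c_by_size or {}  # assumed input: {subset size: factor C}
    k = len(p_values)
    decisions = []
    for i in range(k):
        others = [j for j in range(k) if j != i]
        significant = True
        for size in range(k):
            for extra in combinations(others, size):
                subset = [p_values[i]] + [p_values[j] for j in extra]
                c = c_by_size.get(len(subset), 1.0)
                if not simes_reject(subset, alpha, c):
                    significant = False
        decisions.append(significant)
    return decisions


# Only the first endpoint passes every test that involves it.
print(closed_simes([0.01, 0.20, 0.30]))
```

For three endpoints this enumerates the global test, the three pairwise tests and the three single-endpoint tests, which is the closed testing scheme for K = 3.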

14 Case of Dependent Event Rate Endpoints
The dependence parameter can be estimated as follows:

                         Y = hospitalization endpoint
                         y = 1      y = 0      total
X = mortality   x = 1    p11        p10        p
endpoint        x = 0    p01        p00        q
                total    p′         q′

Dependence parameter = p11 / (p q p′ q′)
Approximate test statistics for the proportions are bivariate normal in the limit, with the above dependence parameter.
The previous methods for continuous endpoints apply.

15 TCH (Tukey-Ciminera-Heyse, 1985) and Dubey (1985) tests (K >3)
TCH method (highly correlated endpoints, 1985):
  Adjusted alpha = 1 − (1 − alpha)^(1/sqrt(K))
Dubey (1985) [Armitage-Parmar ( )]:
  Adjusted alpha = 1 − (1 − alpha)^(1/m_i), with m_i = K^(1 − r.i) (i = 1, …, K)
  r.i = average of the (K − 1) correlation coefficients (ith endpoint vs. the other K − 1 endpoints)
Recent modifications of the Dubey method for proper protection of the type I error rate
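The two adjustment formulas can be written out directly. This is a small sketch with hypothetical function names; the Dubey exponent is taken as m_i = K^(1 − r.i), and the correlation matrix is assumed to be a full symmetric list of lists.

```python
import math


def tch_alpha(k, alpha=0.05):
    """Tukey-Ciminera-Heyse adjusted alpha: 1 - (1 - alpha)^(1 / sqrt(K))."""
    return 1.0 - (1.0 - alpha) ** (1.0 / math.sqrt(k))


def dubey_alphas(corr, alpha=0.05):
    """Dubey / Armitage-Parmar adjusted alpha for each of K endpoints.

    corr is the K x K correlation matrix. For endpoint i the average
    correlation with the other K - 1 endpoints gives m_i = K^(1 - r_bar).
    """
    k = len(corr)
    adjusted = []
    for i in range(k):
        r_bar = sum(corr[i][j] for j in range(k) if j != i) / (k - 1)
        m_i = k ** (1.0 - r_bar)
        adjusted.append(1.0 - (1.0 - alpha) ** (1.0 / m_i))
    return adjusted


independent = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
print(tch_alpha(4))               # about 0.0253 for K = 4
print(dubey_alphas(independent))  # Sidak level for uncorrelated endpoints
```

Two limiting cases are a quick sanity check: with zero correlations the Dubey level equals the Sidak level 1 − (1 − alpha)^(1/K), and with correlations of 1 it equals the unadjusted alpha.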

16 Modifications of Dubey's method. First step: correlation matrix conversion
Convert each correlation rij to corr(|Zi|, |Zj|), where Zi and Zj follow a standard bivariate normal distribution with correlation coefficient rij.
r = (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9) converts to ( , , , , , , , , )
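One way to carry out this conversion is the closed-form expression for the correlation of the absolute values of two standard normal variables, which follows from E|Zi Zj| = (2/π)(sqrt(1 − r²) + r·arcsin r), E|Z| = sqrt(2/π) and Var|Z| = 1 − 2/π. The sketch below uses that standard result; the function name is hypothetical, and whether the presentation used this formula or a numerical method is an assumption.

```python
import math


def abs_corr(r):
    """corr(|Zi|, |Zj|) when (Zi, Zj) is standard bivariate normal with
    correlation r."""
    e_abs_product = (2.0 / math.pi) * (math.sqrt(1.0 - r * r) + r * math.asin(r))
    mean_sq = 2.0 / math.pi          # (E|Z|)^2
    variance = 1.0 - 2.0 / math.pi   # Var|Z|
    return (e_abs_product - mean_sq) / variance


for r in (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9):
    print(r, round(abs_corr(r), 4))
```

The converted correlation is always smaller in magnitude than the original one, which matters for 2-sided tests because they depend on |Z| rather than Z.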

17 Modifications of the Dubey procedure
Modification 1 (M1): Let the new (converted) correlation matrix be R. Scale R by R′ = Rf (f = 1.5 when K = 4). Then follow the Dubey procedure with this scaled R′.
Modification 2 (M2): Using R, obtain the R-square value between endpoint i (i = 1, …, K) and the remaining (K − 1) endpoints. Multiply this R-square value by g (g = 0.75 when K = 4). Then use the result in place of the average correlation in the Dubey procedure.

18 Performance of the ad hoc procedures for K=4 for some correlation structures
R = {r12, r13, r14, r23, r24, r34}
R1 = {.9 (3), .8 (2), .3}: all v.high, one low (Avg 7.7)
R2 = {.8 (2), .5 (2), .3 (2)}: 2 v.high, 2 medium, 2 low (5.3)
R3 = {.7 (3), .5 (2), .1}: high, 2 medium, 1 v.low (5.3)
R4 = {.8, .7, .3 (2), .1 (2)}: v.high, 1 high, 2 low, 1 v.low (3.7)
R5 = {.8, .5, .3, .1 (3)}: high, 1 medium, 1 low, 1 v.low (3.2)
R6 = {.5 (2), .4, .3 (2), .1}: 3 medium, 2 low, 1 v.low (3.5)
R7 = {.5 (2), .4, .1 (3)}: medium, 3 v.low (2.8)
R8 = {.2 (3), .1 (3)}: all v.low (1.5)

19 Performance (1) of ad hoc procedures for K=4 for selected correlation structures R1-R8
Nominal alpha = 0.05, 2-sided tests using normal Z-statistics
[Table: rows R1 to R8; columns TCH, Dubey, MH (f = ), MH (f = ), MH2 (g = 1), MH2 (g = ), Simes, Sidak. The numeric entries are not preserved.]
Based on 100,000 clinical trial simulations
(2) Entry = with f = 1.7

20 Some comments on the results of the previous table
Investigations were limited to selected correlation structures for K = 4
Tukey's adjustment: for highly correlated endpoints
Dubey's: fairly stable, but liberal in protecting the alpha level
Modification M2 (g = .75) performs well
The approach is sensitive to the choice of metric and scaling factor
The Simes and Sidak methods are quite conservative for moderately to highly correlated endpoints
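A simplified illustration of how such type I error rates can be estimated by simulation. It is not the simulation behind the table: it assumes a single common correlation between all endpoints (none of R1 to R8 has that form), uses 20,000 trials and a fixed seed, and evaluates only the TCH adjustment. The function name is hypothetical.

```python
import math
import random
from statistics import NormalDist


def tch_fwer_equicorrelated(rho, k=4, alpha=0.05, n_trials=20000, seed=0):
    """Monte Carlo familywise type I error rate of the TCH adjustment.

    Assumption: all pairs of endpoints share the same correlation rho >= 0,
    generated with one shared factor: Z_i = sqrt(rho) * W + sqrt(1 - rho) * E_i.
    """
    rng = random.Random(seed)
    adjusted = 1.0 - (1.0 - alpha) ** (1.0 / math.sqrt(k))
    critical = NormalDist().inv_cdf(1.0 - adjusted / 2.0)  # 2-sided test
    a, b = math.sqrt(rho), math.sqrt(1.0 - rho)
    false_positives = 0
    for _ in range(n_trials):
        shared = rng.gauss(0.0, 1.0)
        # A trial counts as an error if any endpoint is declared significant.
        if any(abs(a * shared + b * rng.gauss(0.0, 1.0)) > critical
               for _ in range(k)):
            false_positives += 1
    return false_positives / n_trials


for rho in (0.0, 0.5, 0.9):
    print(rho, tch_fwer_equicorrelated(rho))
```

With independent endpoints the estimate is close to 1 − (1 − 0.0253)^4, roughly twice the nominal 0.05, which is consistent with the remark that Tukey's adjustment is meant for highly correlated endpoints.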

21 Properties of the Modifications M1 and M2
Under investigation:
  Type I error rate control for K in the range
  Strong control of the familywise type I error rate using the closed testing principle
  Simultaneous confidence interval properties
  Power properties

22 O’Brien’s OLS/GLS t-tests, 1984 (K > 3)
These tests are based on weighted sums of the K standardized endpoints, using weights (w1, w2, …, wK) = J^T R^(-1) for the GLS test and (w1, w2, …, wK) = J^T for the OLS test. In other words, the GLS method gives more weight to endpoints that are not highly correlated with the others, and the OLS method gives equal weight to all endpoints.
The tests are sensitive under homogeneity of treatment effects and low correlation across endpoints
They perform poorly under treatment-by-endpoint interaction
Closed testing for endpoint-specific results
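A short sketch of how the two weight vectors differ. It assumes J is the vector of ones, which is what makes the OLS weights equal; the helper names are hypothetical and a small Gauss-Jordan solver stands in for a linear algebra library. Only the weights are computed, not the full t-tests.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def obrien_weights(corr, method="GLS"):
    """Endpoint weights: J^T for OLS, J^T R^(-1) for GLS (J = vector of ones)."""
    ones = [1.0] * len(corr)
    if method == "OLS":
        return ones
    # R is symmetric, so solving R x = J gives the entries of J^T R^(-1).
    return solve_linear(corr, ones)


corr = [[1.0, 0.8, 0.0],
        [0.8, 1.0, 0.0],
        [0.0, 0.0, 1.0]]
print(obrien_weights(corr, "OLS"))
print(obrien_weights(corr, "GLS"))
```

In this example the two endpoints correlated at 0.8 each receive a GLS weight of about 0.56, while the uncorrelated third endpoint keeps a weight of 1.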

23 Issue #2
Sequential analysis of endpoints is becoming increasingly popular. How can some of the difficulties it poses be reconciled?
Suppose that the sequence breaks and the subsequent endpoint has an extremely low p-value. How can this situation be avoided?

24 An example of a sequence break when testing endpoints sequentially
Consider a heart failure trial with two endpoints, y1 = exercise tolerance and y2 = mortality rate. The trial had a predefined sequential test strategy: test y1 first at level 0.05 (2-sided). If this endpoint has a statistically significant result at this level, then and only then test y2 at the same level 0.05; otherwise declare the trial a failure.
Difficult case: p1 > 0.05, p2 = 0.001.

25 A proposed test strategy
Predefine α1 and α2 so that α = α1 + α2, e.g., α1 = 0.04 and α2 = 0.01.
Test y1 first at level α1.
(a) If p1 ≤ α1, then reject H01 and then test y2 at level α (i.e., α = .05, and not at level α2)
(b) If p1 > α1, then do not reject H01, but test y2 at level α2
This test strategy controls the familywise type I error rate at level α (e.g., α = 0.05)
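The proposed strategy is simple enough to state as a few lines of code. This is a sketch with a hypothetical function name, using the example split α1 = 0.04 and α2 = 0.01 as defaults.

```python
def sequential_with_reserve(p1, p2, alpha1=0.04, alpha2=0.01):
    """Return (reject_H01, reject_H02) for the two-endpoint strategy.

    y1 is tested at alpha1. If it is significant, y2 is tested at the full
    alpha = alpha1 + alpha2; otherwise y2 is still tested, at alpha2 only.
    """
    alpha = alpha1 + alpha2
    reject_h01 = p1 <= alpha1
    level_for_y2 = alpha if reject_h01 else alpha2
    reject_h02 = p2 <= level_for_y2
    return reject_h01, reject_h02


# The difficult case from the heart failure example: the sequence breaks
# at y1, but the mortality result can still be declared significant.
print(sequential_with_reserve(0.06, 0.001))
```

Under the original fixed-sequence strategy the same p-values would end the trial as a failure at y1; here the reserved α2 keeps a path open for the second endpoint.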

26 Concluding Remarks
Understanding the relationships between endpoints helps in selecting an efficient test strategy for multiple endpoints
Methods that account for correlation between endpoints are fairly straightforward for K = 2, 3
Ad hoc procedures such as the M1 and M2 modifications of Dubey's procedure can be helpful in testing for K > 3. Bootstrap and O'Brien's methods can also be applied
Sequential testing can be done slightly differently to accommodate sequence breaks with extreme subsequent p-values

