A Parametrized Strategy of Gatekeeping, Keeping Untouched the Probability of Having at Least One Significant Result: Analysis of Primary and Secondary Endpoints


A Parametrized Strategy of Gatekeeping, Keeping Untouched the Probability of Having at Least One Significant Result
Analysis of Primary and Secondary Endpoints in a Many-To-One Comparison of Two Treatments with a Control
Julie Perez, Eric Derobert
SFDS: Journée annuelle du groupe Biopharmacie & Santé, 17/11/2011

2 Presentation summary
- Introduction to the multiplicity issue and to Gatekeeping
- Presentation of the scenario for which our Gatekeeping strategy was developed
- Parametrization of the strategy with a single parameter
- Parallel with the closed testing procedure
- Constraints CPE and CSE
- Computation of the empirical score with Joe’s method
- Maximization of the empirical score
- Modeling the Gatekeeping parameter with a logistic regression
- Conclusion and possible extensions

Introduction to the multiplicity issue and to Gatekeeping procedures

4 Some definitions about multiplicity
- Classical unadjusted tests lead to inflation of the overall Type-I error (= rejection of at least one true null hypothesis).
- A testing procedure provides strong control of the overall Type-I error when it controls the Type-I error for any combination of true and false individual null hypotheses.
- A testing procedure provides weak control of the overall Type-I error when it controls the Type-I error only when all individual null hypotheses are true, i.e. for the combination of order k (= intersection of all k individual null hypotheses).
- Weak control is much less stringent than strong control.
- The closed testing principle allows the rejection of any individual hypothesis, say H_i, if all intersection hypotheses involving H_i can be rejected by valid local level-α tests. It controls the overall Type-I error for all k hypotheses at level α in the strong sense.
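To make the inflation of the Type-I error concrete, here is a minimal sketch. It assumes k independent tests of true null hypotheses, each run at the unadjusted level α. Independence is a simplifying assumption made for this illustration only; the endpoints and doses discussed in these slides are correlated.

```python
def familywise_error_rate(alpha: float, k: int) -> float:
    """Probability of at least one false rejection among k independent
    true null hypotheses, each tested at the unadjusted level alpha."""
    return 1.0 - (1.0 - alpha) ** k


# The overall Type-I error grows with the number of unadjusted tests.
for k in (1, 2, 4):
    print(k, familywise_error_rate(0.025, k))
```

With four hypotheses (two doses, two endpoints) the overall error is already several times the nominal level, which is why an adjusted procedure is needed.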

5 Illustration of a multiplicity problem
- Let’s consider two treatments compared to a control, with a primary and a secondary endpoint.
- Suppose we set the following rule (hierarchical principle): the hypothesis on the secondary endpoint for one treatment is tested as soon as the hypothesis on the primary endpoint has been rejected for the same treatment.
- The Dunnett test is performed on H1 and H2 at the global one-sided level […].
- If H2 is the only false null hypothesis and H2 is rejected at the Dunnett nominal level (α_D = […], balanced groups), then:
  - H1 is tested at the level […] (Dunnett closed testing procedure)
  - H4 is tested at some supplementary level
- Then the probability of rejecting H1 or H4 (true null hypotheses) is greater than […].
- This rule only controls multiplicity in the weak sense. (From the expected correlations between endpoints and the expected dose-response curve, it could be argued that this is acceptable, but this is not the point of view of Health Authorities.)
- Hypotheses:
  - Primary endpoint: H1 (Low dose), H2 (High dose)
  - Secondary endpoint: H3 (Low dose), H4 (High dose)

6 Multiplicity Issue and Gatekeeping procedures 1/2
- Gatekeeping: a procedure that deals with the multiplicity issue through a hierarchical testing procedure (with strong control)
- Classical setting: testing a single family of null hypotheses H_1, H_2, …, H_m
- Advanced setting: testing m families of null hypotheses
  - Family 1: H_1, H_2, …, H_{k_1}
  - …
  - Family m: H_{k_{m-1}+1}, H_{k_{m-1}+2}, …, H_{k_m}

7 Multiplicity Issue and Gatekeeping procedures 2/2
- Gatekeeping setting: testing m families of null hypotheses
- Family i serves as a gatekeeper for Family i+1 (i = 1 to m-1)
  - The gatekeeper family (i) is tested
  - The second family (i+1) is tested only if the gatekeeper has been successfully passed
- Diagram: Family i -> pass the gatekeeper -> Family i+1

8 Different kinds of Gatekeeping procedures
- Serial Gatekeeping: all hypotheses must be rejected within one family in order to proceed to the next family in the sequence
- Parallel Gatekeeping: at least one significant result must be observed in one family (i.e. at least one hypothesis must be rejected in the family) to proceed to the next family
- Closure-based tree Gatekeeping: combining both serial and parallel methods
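The difference between the serial and the parallel gate can be sketched in a few lines. The function name and its arguments are hypothetical, chosen for this illustration. The sketch covers only the condition for moving on to the next family, not the allocation of the α level between families.

```python
from typing import Sequence


def gate_is_passed(rejected: Sequence[bool], kind: str) -> bool:
    """Return True if the next family may be tested.

    rejected: one flag per hypothesis of the gatekeeper family
              (True = the null hypothesis was rejected).
    kind:     "serial" (all must be rejected) or
              "parallel" (at least one must be rejected).
    """
    if kind == "serial":
        return all(rejected)
    if kind == "parallel":
        return any(rejected)
    raise ValueError(f"unknown gatekeeping kind: {kind!r}")


# Only the low dose is significant on the primary endpoint.
primary_family = [True, False]
print(gate_is_passed(primary_family, "serial"))    # gate stays closed
print(gate_is_passed(primary_family, "parallel"))  # gate opens
```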

Presentation of the scenario for which our Gatekeeping strategy was developed

10 First scenario with Gatekeeping
- Considering a two-family testing problem, with 2 null hypotheses in each family
- This scenario corresponds (as an example) to:
  - Testing two treatments (Low dose and High dose) versus a placebo
  - Considering a primary endpoint (family 1) and a secondary endpoint (family 2)
- With a Gatekeeping procedure: we test the secondary endpoint only if some significant results have been obtained on the primary endpoint
- Using a graphical approach (Bretz et al., 2009: “A graphical approach to sequentially rejective multiple test procedures”)
- From this graphical approach, we can define an associated closed testing strategy, and freely choose related test procedures

11 Gatekeeping scenario depending on 3 parameters

12 Important conditions for our Gatekeeping strategy
- N°1: Constraint CPE: the probability of rejecting at least one null hypothesis on the primary endpoint remains unchanged
  - Some gatekeeping strategies already exist but do not satisfy this condition
  - With them, Gatekeeping therefore reduces the power of the test
  - This is the main reason why Gatekeeping is not often used
- N°2: Constraint CSE: rejecting a null hypothesis for one dose on the secondary endpoint, without having rejected the hypothesis on the primary endpoint for the same dose, is useless

13 Simplified scenario
- k = ½, q = 0 -> the strategy only depends on p

14 Link between the closed testing procedure and the graphical approach

15 How to satisfy the conditions?
- N°1: Constraint CPE:
  - Using the Dunnett test for hypotheses concerning a common endpoint and the Simes test for hypotheses on different endpoints
  - If min(p1, p2) is significant with the Dunnett test, then min(p1, p2) must always be significant with the Gatekeeping procedure, i.e. if min(p1, p2) ≤ α_D then min(p1, p2) ≤ α − p × α / 2
  - This is possible provided that α_D ≤ α − p × α / 2, i.e. p ≤ 2 × (α − α_D) / α
  - This defines the maximal value of p, max(p) = p_Dunnett: p_Dunnett = 2 × (α − α_D) / α
  - If α = 0.025 (one-sided), then α_D = […] and p_Dunnett = […] (for balanced groups)
  - Using only weighted Simes tests cannot fulfill CPE
- N°2: Constraint CSE:
  - For each treatment, the hypothesis concerning the secondary endpoint is tested only if a result was obtained on the primary endpoint
  - -> Based on the design of the procedure itself
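The numerical values of α_D and p_Dunnett are not preserved in this transcript. The sketch below shows one way such values could be computed, not necessarily how the authors obtained them. It assumes a large-sample (known-variance) one-sided Dunnett test for two treatments versus a control, with n1 = n2 and n0 = r × n1, so that the two test statistics are standard normal with correlation 1/(1 + r). All function names are hypothetical, and only the Python standard library is used.

```python
import math


def normal_pdf(x: float) -> float:
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_both_below(c: float, rho: float, n: int = 2000, lim: float = 8.0) -> float:
    """P(Z1 <= c, Z2 <= c) for standard normals with correlation rho in (0, 1).

    Uses Z_i = sqrt(rho) * U + sqrt(1 - rho) * E_i and integrates over U
    with Simpson's rule (n must be even).
    """
    h = 2.0 * lim / n
    total = 0.0
    for i in range(n + 1):
        u = -lim + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        inner = normal_cdf((c - math.sqrt(rho) * u) / math.sqrt(1.0 - rho))
        total += weight * normal_pdf(u) * inner ** 2
    return total * h / 3.0


def dunnett_nominal_level(alpha: float, r: float = 1.0) -> float:
    """Per-comparison one-sided level alpha_D such that the probability of
    at least one false rejection among the two comparisons equals alpha."""
    rho = 1.0 / (1.0 + r)  # assumption: n1 = n2, n0 = r * n1, common variance
    lo, hi = 0.0, 6.0
    for _ in range(60):  # bisection on the critical value
        mid = 0.5 * (lo + hi)
        if 1.0 - prob_both_below(mid, rho) > alpha:
            lo = mid
        else:
            hi = mid
    return 1.0 - normal_cdf(hi)


def p_dunnett(alpha: float, r: float = 1.0) -> float:
    """Maximal gatekeeping parameter: 2 * (alpha - alpha_D) / alpha."""
    return 2.0 * (alpha - dunnett_nominal_level(alpha, r)) / alpha


print(dunnett_nominal_level(0.025, r=1.0), p_dunnett(0.025, r=1.0))
```

Under these assumptions α_D lies between the Bonferroni level α/2 and α, so p_Dunnett lies strictly between 0 and 1.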

16 Examples for extreme values of “p”
- When p = 0:
  - No gatekeeping is performed
  - The secondary endpoint can be tested if and only if the significance of the primary criterion is proved for both treatments
  - Dunnett closed testing procedure performed on H1 and H2
  - If both H1 and H2 have been rejected, then the Dunnett closed testing procedure is performed on H3 and H4
- When p = p_Dunnett:
  - Let’s assume p1 < p2 and p1 < α_D: H1 is rejected with the Dunnett test
  - Case 1: p2 < α_D
    - H2 is rejected with the Dunnett test too
    - Dunnett closed testing procedure on H3 and H4
  - Case 2: α_D < p2 < α
    - H2 and H3 are tested with weighted Simes; the result only depends on p3
    - If p3 > α: stop (only H1 is rejected)
    - If p3 < α: H2 can be rejected too. Then, Dunnett closed testing procedure on H3 and H4
  - Case 3: p2 > α
    - H2 is not rejected
    - H3 is tested at the level w3 = p_Dunnett × α / 2 = α − α_D

17 Gatekeeping and empirical score
- In which cases does the Gatekeeping procedure improve the power?
- Definition of an empirical score to maximize:
  - Reflecting the level of success of the clinical trial
  - Used to assess the performance of the gatekeeping procedure
- Examples of empirical scores:
  - Score = 1 if only H1 (or only H2) is rejected
  - Score = 1 + d if H1 and H2 are rejected
  - Score = 1 + s if (H1 and H3) or (H2 and H4) are rejected

18 Empirical score to maximize
Score = (h1 × h2) × (1 + d) + (h1 + h2) × (1 − h1 × h2) + (h1 × h3) × s + (h2 × h4) × s
where h_i = 1 if H_i is rejected and h_i = 0 otherwise
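The score formula can be checked against the examples of the previous slide with a small function. It reads h1 to h4 as 0/1 rejection indicators, which is the reading consistent with those examples. The function name and the values of d and s below are arbitrary illustrations, not values from the presentation.

```python
def empirical_score(h1: int, h2: int, h3: int, h4: int, d: float, s: float) -> float:
    """Empirical score of a trial outcome.

    h1..h4: 1 if the corresponding null hypothesis is rejected, else 0
            (H1, H2: primary endpoint; H3, H4: secondary endpoint).
    d:      bonus for rejecting both primary hypotheses.
    s:      bonus for rejecting primary and secondary for the same dose.
    """
    return (
        (h1 * h2) * (1 + d)
        + (h1 + h2) * (1 - h1 * h2)
        + (h1 * h3) * s
        + (h2 * h4) * s
    )


d, s = 0.5, 0.25  # illustrative values only
print(empirical_score(1, 0, 0, 0, d, s))  # only H1 rejected
print(empirical_score(1, 1, 0, 0, d, s))  # H1 and H2 rejected
print(empirical_score(1, 0, 1, 0, d, s))  # H1 and H3 rejected
```

Note that a rejection on the secondary endpoint contributes nothing unless the primary hypothesis for the same dose is also rejected, which mirrors constraint CSE.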

19 The different parameters of the clinical trial involved in the procedure 1/
- Correlation coefficient between primary and secondary endpoints
  - Chosen values: ρ = 0, ρ = +0.5, ρ = +0.8, ρ = +0.9 or ρ = +1
- Randomization ratio:
  - r = 1 -> scenario with balanced groups
  - r = √2 -> scenario minimizing the variance of pairwise differences versus placebo
  - N_tot = n0 + n1 + n2 = (2 + r) × n1, with n1 = n2 and n0 = r × n1
- Effect-sizes of treatments:
  - Target effect-size ∆
  - True effect-sizes δ:
    - δ = ∆, ⅔ × ∆, ⅓ × ∆ or 0, for each dose for the primary endpoint
    - δ between 0 and 2∆, for each dose for the secondary endpoint
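The choice r = √2 can be checked with a short derivation. It assumes a common variance σ² in all groups and a fixed total sample size N_tot, neither of which is stated explicitly on the slide.

```latex
\begin{align*}
\operatorname{Var}(\bar X_i - \bar X_0)
  &= \sigma^2\left(\frac{1}{n_1} + \frac{1}{n_0}\right)
   = \frac{\sigma^2}{n_1}\left(1 + \frac{1}{r}\right),
   \qquad n_1 = \frac{N_{\text{tot}}}{2 + r},\\
  &= \frac{\sigma^2}{N_{\text{tot}}}\cdot\frac{(2 + r)(1 + r)}{r}
   = \frac{\sigma^2}{N_{\text{tot}}}\left(r + 3 + \frac{2}{r}\right),\\
\frac{d}{dr}\left(r + 3 + \frac{2}{r}\right)
  &= 1 - \frac{2}{r^2} = 0
  \quad\Longrightarrow\quad r = \sqrt{2}.
\end{align*}
```

The second derivative 4/r³ is positive for r > 0, so this stationary point is indeed a minimum.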

20 The different parameters of the clinical trial involved in the procedure 2/
- The priority ratio RP = d/s illustrates the relative importance given to the rejection of hypotheses:
  - For both doses on the primary endpoint (D)
  - For only one dose on both the primary and the secondary endpoints (S)
- For the time being:
  - FWER = 2.5% (gatekeeping)
  - Sample size depends on α = 2.5% and β = 10% (naïve)

21 Computation of the average empirical score
- We are in R^4 (one test statistic per hypothesis)
- 2 algorithms to compute the empirical score depending on all the parameters involved:
  - Bretz and Genz’s SAS macro
  - Joe’s article
- Joe’s method is a little less precise than Bretz-Genz’s method
  - But it takes less computational time, and is precise enough in terms of difference of score
  - Method based on the computation of multi-normal rectangle or orthant probabilities
- Parameters:
  - Correlation coefficient
  - Randomization ratio
  - Priority ratio
  - Effect-sizes:
    - For the second dose on the primary endpoint
    - For the first dose on the secondary endpoint

22 Computation of the empirical score in each case
- Computation of the “p-dependent” part of the score: the empirical score depends on “p” only for some specific cases and values of the four statistics, one associated with each hypothesis

23 Evolution of each score area depending on “p”
- Figure panels: “When p decreases” and “When p increases”
- Opposite evolutions of the area “Score = 1 + d” and of the area “Score = 1 + s” depending on “p”
- Therefore: important influence of the subjective priority ratio RP = d/s

24 Results obtained with Joe’s method
- Gatekeeping parametrized with a single parameter “p”
- For each combination of values of the parameters involved in the clinical trial (effect-sizes, correlation, randomization ratio, priority ratio), Joe’s method computes the optimal value of “p”, i.e. the one maximizing the empirical score we defined
- But this can only be processed one combination at a time
- Hence the need to systematize the process with a model estimating the value of “p” from all the other parameters

Modeling the Gatekeeping parameter with a logistic regression model maximizing the empirical score

26 Modeling the Gatekeeping parameter
- Idea of a logistic regression
- Logistic modeling is particularly relevant in our case:
  - In most cases, “p” is equal to either 0 or p_Dunnett
  - Intermediate values of “p” are not so frequent

27 Modeling the Gatekeeping parameter
- Additive model in the logistic regression:
  - Important influence of the priority ratio
  - Antagonistic influences of the two effect-sizes:
    - Large value of δ1(D2) -> No gatekeeping
    - Large value of δ2(D1) -> Gatekeeping
  - Very small influence of the correlation coefficient in the additive model without interactions
- Weighting of each case, based on the range of scores over all values of “p”
  - Idea of the standardized weighting: the empirical score should be adequately estimated, rather than the gatekeeping parameter itself
- Step-wise selection of variables
  - The 2 most important interaction terms: interaction between the correlation coefficient and each of the two effect-sizes

28 Modeling the Gatekeeping parameter
- log(p / (p_Dunnett − p)) = L(α, β, ρ, RP, r, δ1(D2), δ2(D1)), where L is a linear function
- Influence of the randomization ratio only through the value of p_Dunnett
- Basic model without any interactions: […]
  - For r = 1, p_Dunnett = […]
  - For r = √2, p_Dunnett = […]
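The model equation can be inverted to recover “p” from the value of the linear predictor L: p = p_Dunnett / (1 + exp(−L)), which always lies between 0 and p_Dunnett. The sketch below shows this back-transformation. The function names are hypothetical, and the value 0.9 used for p_Dunnett is an arbitrary illustration because the fitted coefficients and the actual p_Dunnett values are not preserved in this transcript.

```python
import math


def gatekeeping_parameter(linear_predictor: float, p_dunnett: float) -> float:
    """Invert log(p / (p_dunnett - p)) = L, giving p in [0, p_dunnett]."""
    # Two algebraically equal forms, chosen to avoid overflow in exp().
    if linear_predictor >= 0:
        share = 1.0 / (1.0 + math.exp(-linear_predictor))
    else:
        e = math.exp(linear_predictor)
        share = e / (1.0 + e)
    return p_dunnett * share


def scaled_logit(p: float, p_dunnett: float) -> float:
    """Left-hand side of the model, defined for 0 < p < p_dunnett."""
    return math.log(p / (p_dunnett - p))


p_max = 0.9  # illustrative value only, not taken from the slides
for L in (-5.0, 0.0, 5.0):
    print(L, gatekeeping_parameter(L, p_max))
```

Strongly negative or positive predictors push “p” towards 0 or p_Dunnett, which matches the observation that intermediate optimal values are infrequent.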

29 Modeling the Gatekeeping parameter
- Now with interactions in the model
- Most important interaction terms: between the correlation coefficient and the effect-sizes
- Parameters: […]
- But adding these interaction terms does not improve the quality of the model that much -> are they really necessary?

30 Influence of the Type I and Type II errors
- Two α levels are actually involved in the Gatekeeping procedure:
  - The level of risk α and the level of power 1 − β, on which the computation of the sample size and effect-sizes is based
  - The α-level used for the Gatekeeping procedure, through statistical tests such as Dunnett and Simes: Z_α; Z_{αp/2}; Z_{α−αp/2}; Z_Dunnett
- The classical values for these levels: (α, β) = (0.025, 0.10) and α_g = 0.025

31 Model depending on the levels of risk α, β
- In some cases, the levels α and β can take other values
- Then, we add them to the model
- But this model is not as good as the previous one for the classical values of α and β
  - It tends to over-estimate the value of “p”
  - Especially for cases with a small weight

32 Some practical examples
- To compare the model with fixed levels α, β and the one depending on α, β

Conclusions and possible extension

34 Conclusions and possible extension 1/
- Parametrization of the gatekeeping strategy:
  - For a “many-to-one” comparison of 2 treatments to a control, with one primary and one secondary endpoint
  - With a single gatekeeping parameter “p”
- A logistic regression model was obtained to compute the optimal value of “p”, depending on the clinical parameters:
  - Randomization ratio
  - Correlation between endpoints
  - Effect-sizes
  - Subjective priority ratio
  - And possibly the levels of risk

35 Conclusions and possible extension 2/
- Possible extension to more than 2 treatments?
  - Based on the same 2 constraints CPE and CSE
  - But much more difficult to compute with Joe’s method