Statistical & Design Considerations for Non-inferiority Trials
Andrew Nunn, MRC Clinical Trials Unit, London



TB Forum December
Outline
What is a non-inferiority trial?
How do they differ from superiority trials?
What do the regulators say?
How large do the trials need to be?
How should the trials be conducted and analysed?

What is a non-inferiority trial?
How is it different from an equivalence trial?
Does non-significant imply non-inferior?
Does non-inferior imply non-significant?
Are non-inferiority trials always larger than superiority trials?
Can a failed superiority trial be turned into a non-inferiority trial?
Why do we need these trials in TB?

A little bit of history: 35 years ago
Study R, the first East African/BMRC trial of short-course chemotherapy, could be regarded as a non-inferiority trial.
2STH/16TH worked well under strict trial conditions.
The main objective was to see if a six-month regimen was at least as good as the standard treatment; better would be a bonus.
S = streptomycin, T = thiacetazone, H = isoniazid

[Figure: Study R, 30-month relapse-free rates. Confidence intervals for the difference from the control 2STH/16TH regimen, shown for the 6SHR, 6SHZ, 6SHT and 6SH regimens against the line of no difference and a possible δ.]

A point to remember
“It is never correct to claim that treatments have no effect or that there is no difference in the effects of treatments. It is impossible to prove … that two treatments have the same effect. There will always be uncertainty surrounding estimates of treatment effects, and a small difference can never be excluded.”
Alderson P, Chalmers I. BMJ 2003:326:

Why non-inferiority for new TB drugs?
Under a wide variety of trial conditions the gold standard 2EHRZ/4HR regimen is at least 95% effective.
– Nomads in the Algerian Sahara
– Recently published IUATLD study in African and Asian centres
We will be very unlikely to better it.
– We would require a total of 2600 evaluable patients to demonstrate a reduction from 5% to 2.5% relapses.
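As a rough check on the scale of that superiority trial, the sketch below applies the standard normal-approximation formula for comparing two proportions. The slide does not say which significance level, power or formula was used, so the two-sided 5% level and 90% power here are assumptions, and the function name is illustrative only. Under those assumptions the result is roughly 1,200 patients per arm, about 2,400 in total, which is the same order as the 2600 quoted.

```python
from math import ceil
from statistics import NormalDist


def superiority_n_per_arm(p_control, p_new, alpha=0.05, power=0.90):
    """Patients per arm needed to detect p_control vs p_new.

    Normal approximation, two-sided test, equal allocation.
    alpha and power are assumed values, not taken from the slides.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the two-sided test
    z_beta = z.inv_cdf(power)            # quantile corresponding to the power
    variance = p_control * (1 - p_control) + p_new * (1 - p_new)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_new) ** 2)


n_per_arm = superiority_n_per_arm(0.05, 0.025)
print(n_per_arm, 2 * n_per_arm)
```

The required size grows with the inverse square of the difference to be detected, which is why halving an already low relapse rate demands thousands of patients.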

Our goal
Our goal is to reduce treatment duration to a maximum of 4 months and preferably less.
How much are we prepared to pay, if anything, for such a reduction?
– Must the new regimen be as good as the standard?
– Would we be satisfied with a regimen that was almost as good?
– If so, how good is almost?

EMEA quote
“If no degree of possible inferiority of the test [new regimen] to the reference [control] is acceptable, then the development of products with equal efficacy to a comparator by means of non-inferiority trials would become impossible.”
EMEA/CPMP/EWP/2158/99

Does non-significant = non-inferior?
No, definitely not.
Common sense tells us that a non-significant result from an under-powered study is, in the extreme case, of little value.
BUT non-inferior does not necessarily mean non-significant!

How do we do it?
We need a null hypothesis. The situation is the reverse of what is required in a superiority design.
For superiority
– H0 is that there is no difference.
For non-inferiority
– H0 is that there is a difference: the new treatment is worse than the control by at least δ.
The alternative hypothesis is also reversed.
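Written out formally, the two pairs of hypotheses might look as follows. The notation is an assumption made for illustration: Δ is the true treatment difference, oriented so that negative values mean the new treatment is worse than the control, and δ > 0 is the pre-specified margin.

```latex
\begin{aligned}
\text{Superiority:} \quad & H_0 : \Delta = 0 & \text{vs.} \quad & H_1 : \Delta \neq 0 \\
\text{Non-inferiority:} \quad & H_0 : \Delta \le -\delta & \text{vs.} \quad & H_1 : \Delta > -\delta
\end{aligned}
```

Rejecting the non-inferiority null hypothesis corresponds to the lower confidence limit for Δ lying above -δ.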

Equivalence & non-inferiority
What’s the difference?

Determining equivalence
The first step in establishing equivalence is to define ‘limits of equivalence’ (± δ).
Having conducted the trial, calculate the 95% confidence interval for the difference between the control and the new treatment.
If the confidence interval is entirely within ± δ then equivalence is established.
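The steps above can be sketched in a few lines. This is a minimal illustration, not the method used in any particular trial: it assumes a simple Wald interval for the difference in failure proportions (control minus new), and the patient counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def difference_ci(fail_control, n_control, fail_new, n_new, level=0.95):
    """Wald confidence interval for the (control - new) failure proportion."""
    p_c = fail_control / n_control
    p_n = fail_new / n_new
    se = sqrt(p_c * (1 - p_c) / n_control + p_n * (1 - p_n) / n_new)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    diff = p_c - p_n
    return diff - z * se, diff + z * se


def is_equivalent(ci, delta):
    """Equivalence is established only if the whole interval lies within +/- delta."""
    lower, upper = ci
    return -delta < lower and upper < delta


# Hypothetical data: 20/400 failures on control, 24/400 on the new treatment.
ci = difference_ci(20, 400, 24, 400)
print(ci, is_equivalent(ci, delta=0.05))
```

With these made-up counts the interval sits inside ± 5% but not inside ± 3%, so the conclusion depends entirely on the limits chosen beforehand.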

Non-inferiority
Equivalence requires that the difference (control - new intervention) is both > -δ and < δ: the new treatment must be neither worse nor better than the control by more than a fixed amount.
In contrast to equivalence, with non-inferiority we are only interested in determining whether the new treatment is no worse by an amount δ.

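Non-inferiority
[Figure: confidence intervals for the difference between the control and the intervention, plotted against -δ and 0 (no difference).]
The 95% CIs for the difference between the control and the intervention all lie above -δ, i.e. non-inferiority is demonstrated.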

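Non-inferiority
[Figure: confidence intervals for the difference between the control and the intervention, plotted against -δ and 0 (no difference).]
The lower 95% confidence limits for the difference between the control and the intervention all lie above -δ, i.e. non-inferiority is demonstrated.
Where the lower 95% confidence limit is < -δ, non-inferiority has not been demonstrated.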

Non-inferiority and superiority
[Figure: confidence intervals for the difference between the control and the intervention, plotted against -δ and 0 (no difference).]
The 95% CIs for the difference between the control and the intervention all lie above -δ, i.e. non-inferiority is demonstrated.
In this case both non-inferiority and superiority have been demonstrated.

Non-inferiority and inferiority
[Figure: confidence intervals for the difference between the control and the intervention, plotted against -δ and 0 (no difference).]
The 95% CIs for the difference between the control and the intervention all lie above -δ, i.e. non-inferiority is demonstrated.
In this case both non-inferiority and superiority have been demonstrated.
In this case both non-inferiority and inferiority have been demonstrated.
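The four situations pictured on these slides can be summarised as a small decision rule. The sketch below is illustrative only: the function name and the example intervals are hypothetical, and it assumes the difference is oriented so that negative values mean the new treatment is worse.

```python
def classify(lower, upper, delta):
    """Interpret a confidence interval (lower, upper) for the treatment difference.

    Assumes negative differences mean the new treatment is worse,
    and that delta > 0 is the pre-specified non-inferiority margin.
    """
    if lower <= -delta:
        return "non-inferiority not demonstrated"
    if lower > 0:
        return "non-inferior and superior"
    if upper < 0:
        return "non-inferior and inferior"
    return "non-inferior"


# Hypothetical intervals, with delta = 10 percentage points.
examples = [(-0.04, 0.06), (0.01, 0.08), (-0.08, -0.01), (-0.13, 0.02)]
labels = [classify(lo, hi, delta=0.10) for lo, hi in examples]
for interval, label in zip(examples, labels):
    print(interval, label)
```

Only the lower limit decides non-inferiority; the position of the interval relative to zero then says whether the new treatment is additionally better or worse.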

Choosing δ
The value of δ must be chosen before the trial begins.
Its value will depend on clinical, statistical and possibly regulatory considerations.

Example: 2NN Study
van Leth, Phanuphak et al (Lancet 2004), a study of first-line antiretroviral therapy in HIV.
Main comparison between nevirapine twice daily and efavirenz (plus stavudine and lamivudine) in terms of ‘treatment failure’ (based on virology, disease progression, therapy change).
Primary objective was to establish the non-inferiority of nevirapine twice daily (δ = 10%).

Example: 2NN Study
Confidence intervals for the difference in failure rates (E-2NN)
– All data (-12.8%, 0.9%)
– Only those starting medication (-14.6%, -0.8%)
– Concurrently randomised (-11.9%, 3.4%)
None of these intervals lies completely above the -δ value of -10%; one interval also excludes zero.
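Applying the lower-limit rule to the three intervals quoted above makes the point explicit. The intervals and the 10% margin are taken from the slide; the variable names and the dictionary layout are just one way of setting out the check.

```python
DELTA = 10.0  # non-inferiority margin in percentage points, as stated for the trial

# Confidence intervals (in %) for the difference in failure rates, as quoted above.
intervals = {
    "All data": (-12.8, 0.9),
    "Only those starting medication": (-14.6, -0.8),
    "Concurrently randomised": (-11.9, 3.4),
}

results = {}
for label, (lower, upper) in intervals.items():
    results[label] = {
        "non_inferior": lower > -DELTA,               # whole interval above -delta?
        "excludes_zero": upper < 0 or lower > 0,      # significant difference?
    }
    print(label, results[label])
```

Every lower limit falls below -10%, so non-inferiority is not demonstrated in any of the three analyses, and the second interval lies entirely below zero.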

Example: 2NN Study
BUT the authors concluded:
‘Antiviral therapy with nevirapine or efavirenz showed similar efficacy, so triple-drug regimens with either … are valid for first-line treatment’
Lancet 2004, 363:

Does it matter?
A non-inferiority trial can demonstrate significant benefit from the new treatment (cf. Study A).
But is it possible to have non-inferiority and a significantly worse outcome in the new treatment?
Yes, provided δ is acceptable to clinicians.
– If N is large enough, any difference can be shown to be significant!
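A small numerical sketch shows how both conclusions can hold at once. All the numbers are hypothetical: failure rates of 5% on control and 7% on the new treatment, a margin of 5 percentage points, and a Wald interval in which the observed rates are taken to equal these assumed rates.

```python
from math import sqrt


def wald_ci(p_control, p_new, n_per_arm, z=1.96):
    """Approximate 95% CI for (control - new) failure rates with equal arms."""
    diff = p_control - p_new
    se = sqrt((p_control * (1 - p_control) + p_new * (1 - p_new)) / n_per_arm)
    return diff - z * se, diff + z * se


DELTA = 0.05  # hypothetical margin
cis = {n: wald_ci(0.05, 0.07, n) for n in (200, 500, 5000)}
for n, (lower, upper) in cis.items():
    print(n, round(lower, 4), round(upper, 4),
          "non-inferior" if lower > -DELTA else "not demonstrated",
          "significantly worse" if upper < 0 else "not significant")
```

As the arms grow the interval narrows around the same 2-point deficit, until it lies entirely between -δ and zero: non-inferior by the pre-specified margin, yet significantly worse.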

Adverse effects
Assessment of adverse effects is particularly important in equivalence trials.
It is not enough to prove non-inferiority in terms of efficacy.
A new treatment must be as safe as, or safer than, the old one.

Choosing δ
On 27th July 2005 the European Medicines Agency (EMEA) issued a new European “Guideline on the choice of the non-inferiority margin”.
This guideline comes into effect in January
EMEA/CPMP/EWP/2158/99

Quote from EMEA document
“The lower limit of the confidence interval [of the difference between the new regimen and the control] ... represents a lower bound and is usually interpreted as the degree of inferiority to the reference that can be excluded based on the data presented …”
EMEA/CPMP/EWP/2158/99

Quote from EMEA document
“Of course this is not an actual lower bound and the magnitude of inferiority could be greater. However it is generally considered that the chance of the true difference being worse than that suggested by this bound is acceptably small.”
EMEA/CPMP/EWP/2158/99

General EMEA recommendations
If possible, three study arms should be included (test, reference and placebo); this allows validation of the non-inferiority margin.
The margin should be such that there is assurance that the test arm has a clinically relevant effect.
The primary focus is the relative effect of the test arm and the reference arm.
The choice of the margin should be justified in the protocol.
The choice of the margin should be independent of power considerations.

Design consideration
It is important to ensure that the design of equivalence trials, including definitions of a favourable response, is as similar as possible to that of earlier trials assessing the control regimen.

Internal validity
In a superiority trial there is a strong incentive to ensure high quality of conduct.
In contrast, in a non-inferiority trial the conclusion of non-inferiority could be reached because of poor discriminatory power.
In a TB trial this could occur if follow-up rates were poor and/or there was failure in the lab to detect all relapses.

But
If there are already many treatments being used interchangeably for the disease under consideration, a possible approach might be to consider the information available from all of them.
From this a delta may be constructed which summarises the information known about the relative efficacy of these products, and the new trial can be designed to provide a similar level of knowledge of the relative efficacy of the new product.

Accepting a larger δ
“In the situation where the test product is anticipated to have a safety advantage over the reference it is likely that a larger delta could be justified as some loss of efficacy might be accepted in exchange for the safety benefits”

Is there a case for a larger δ if treatment can be shortened?
“It may be possible to justify a wider non-inferiority margin for efficacy if the product has an advantage in some other aspect of its profile. This margin should not, however, be so wide that superiority to placebo is left in doubt”

How large a δ would you accept?
If treatment could be shortened from 6 to 4 months, would an increase in the failure/relapse rate from 5% to 10% be acceptable, provided that the failures and relapses could be satisfactorily retreated?

FDA position
FDA (as described in FDA’s 1992 Points to Consider document) originally used a ‘step function’:
Cure rate     δ
≥ 90%         10%
80-89%        15%
< 80%         20%
A more flexible approach has since been adopted.

Example: Pediatric Meningitis Trial
Investigational Drug vs. Active Control
                                 Sponsor’s Proposal    FDA Proposal
Projected response rate          80%                   80%
Delta                            15%                   10%
Evaluable total sample size      not shown             not shown
Projected % evaluable            70%                   70%
Total to be enrolled             not shown             not shown
Projected enrollment time        2-4 years             4-6 years
FDA proposed study considered not to be feasible.
Note: α = 5%, power = 80%

What confidence level?
Traditionally we use 95% confidence in superiority trials (thanks to RA Fisher!).
Guidelines for pharmacokinetic equivalence have traditionally used 90% CI.
In regulatory situations the choice is based on the level of risk regulators are prepared to accept.
It could be appropriate to use 90%, 95% or even 99%.
Need for flexibility.

Calculating power: an example
Given an expected range of, say, 3-6% relapse rates in the control 2EHRZ/4HR regimen, what study size would we require for a range of δ?

[Figure: study size required for a range of δ. Labelled points: 5% relapse, δ = 10%, 100 per arm; 5% relapse, δ = 5%, 400 per arm.]
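The two labelled study sizes (100 per arm for δ = 10% and 400 per arm for δ = 5%, both at a 5% relapse rate) are consistent with the usual normal-approximation formula for a non-inferiority comparison of two proportions. The sketch below assumes a one-sided 2.5% significance level (equivalent to a two-sided 95% confidence interval), 90% power, and equal true relapse rates in both arms. None of these is stated on the slides, so they should be read as assumptions that happen to reproduce the figures, not as the author's actual calculation.

```python
from math import ceil
from statistics import NormalDist


def ni_n_per_arm(p, delta, one_sided_alpha=0.025, power=0.90):
    """Patients per arm to show non-inferiority within margin delta.

    Assumes the true event rate is p in both arms (normal approximation).
    one_sided_alpha and power are assumed values, not taken from the slides.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - one_sided_alpha)
    z_beta = z.inv_cdf(power)
    return ceil((z_alpha + z_beta) ** 2 * 2 * p * (1 - p) / delta ** 2)


for delta in (0.10, 0.05):
    print(delta, ni_n_per_arm(0.05, delta))
```

Halving δ quadruples the required study size, because the margin enters the formula as a square.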

But...
These power calculations do not allow for additional numbers required for a Per Protocol analysis, or patients excluded because they do not have TB, or because they have MDR disease.
Neither do they allow for losses to follow-up.
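One simple way to allow for such exclusions and losses is to divide the number of evaluable patients required by the proportion expected to be evaluable. The figures below are hypothetical: 800 evaluable patients in total and 70% evaluable are used purely for illustration.

```python
from math import ceil


def patients_to_enrol(evaluable_needed, proportion_evaluable):
    """Inflate the evaluable sample size to allow for exclusions and losses."""
    return ceil(evaluable_needed / proportion_evaluable)


# Hypothetical: 800 evaluable patients needed, 70% of those enrolled expected to be evaluable.
total = patients_to_enrol(800, 0.70)
print(total)
```

Note that the adjustment divides by the evaluable proportion; simply adding 30% to 800 would give 1040 and leave the trial short.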

How should we analyse non-inferiority trials?
Superiority trials are analysed by intention to treat (ITT) because it is the most conservative approach and least likely to be biased.
ITT analysis of non-inferiority trials is not conservative: there is a bias towards no difference.
PP analysis is biased since not all randomised patients are included.
It is recommended that non-inferiority trials should be analysed by both ITT and per protocol (PP).

Defining ITT and PP
Definitions vary.
– For ITT, some definitions exclude patients who either do not have a confirmed diagnosis or who never received treatment.
– PP includes all patients receiving the full course of treatment with no major protocol violations.
What definitions are appropriate for TB trials?
CPMP: ‘Similar conclusions from both the ITT and PP are required in a non-inferiority trial.’ ‘Sample size computations should ensure sufficient numbers in the PP population.’
CPMP: Committee for Proprietary Medicinal Products (2000)

CAVE!
Drop-outs from the two regimens need to be carefully evaluated.
Suppose patients who were not responding dropped out early from one treatment arm, or that withdrawal rates for adverse events differed between the arms.
This would suggest there may be important differences between the treatments.

Interim analyses
Do we need them?
Probably not, if the purpose is to consider stopping early for strong evidence of non-inferiority.
Such evidence would support a case for the possible superiority of the new treatment to the control, which is a strong incentive to keep on.

Conclusions 1
A major concern among regulators in many NI trials is that the efficacy of the control is not well established.
This is NOT the case with the control regimen 2EHRZ/4HR. One advantage of no new drugs for 40 years!!
In the event of establishing a 4-month regimen to be non-inferior, it would be unwise to use that regimen as the control in the next NI trial (biocreep).
Biocreep: a slightly inferior treatment becomes the control for the next generation of NI trials.

Conclusions 2
NI trials must be conducted with rigour.
The value of δ needs to be determined before the start of the trial and should take into account both clinical and statistical considerations.
Both the value of δ and other aspects of design need to be discussed with regulators.
Non-inferiority needs to be demonstrated not only for efficacy but also for safety.

Regulatory Guidance
ICH E9 ‘Note for Guidance on Statistical Principles for Clinical Trials’, September 1998
ICH E10 ‘Note for Guidance on Choice of Control Group’, July 2000
CPMP ‘Note for Guidance on the Investigation of Bioavailability and Bioequivalence’, July 2001
CPMP ‘Points to Consider on Switching between Superiority and Non-Inferiority’, July 2000
CHMP ‘Guideline on the Choice of the Non-Inferiority Margin’, July 2005

Selected references
D’Agostino RB, Massaro JM et al: Non-inferiority trials: design concepts and issues - the encounters of academic consultants in statistics. Statist Med 2003; 22:
Altman DG, Bland JM: Absence of evidence is not evidence of absence. BMJ 1995; 311:485.
Blackwelder WC: Current issues in equivalence trials. J Dent Res 2004; 83:C
Jones B, Jarvis P et al: Trials to assess equivalence: the importance of rigorous methods. BMJ 1996; 313:36-9.
