Superiority, Non-inferiority, and Equivalence


Trial Objectives: Superiority, Non-inferiority, and Equivalence

Questions of Interest: Is the new treatment better than the control treatment that I am using now? (superiority trial) If it is not better, is the new treatment as good as (not unacceptably worse than) the control treatment that I am using now? (non-inferiority trial) Can I use the new treatment and the control treatment interchangeably? (equivalence trial) Non-inferiority and equivalence trials are usually considered when there is an active control.

Definitions (ICH Guidelines, E9): Superiority trial: a trial with the primary objective of showing that the response to the investigational product is superior to a comparative agent (active or placebo control). Equivalence trial: a trial with the primary objective of showing that the response to two or more treatments differs by an amount which is clinically unimportant (active control). Non-inferiority trial: a trial with the primary objective of showing that the response to the investigational product is not clinically inferior (or not unacceptably inferior) to a comparative agent (active or placebo control, but usually active); very common in the regulatory setting, either for a new treatment or for a new label indication.

FDA Guidance: “The objective of a non-inferiority trial is to show that any difference in the effectiveness of the two drugs is small enough to allow a conclusion that the new drug is not substantially less effective than the active control.” “FDA considers the selection of a non-inferiority margin to be the single greatest challenge in designing, conducting, and interpreting non-inferiority trials…If a non-inferiority margin is incorrectly calculated and set too large, a drug that is not effective may appear to be effective; if the margin is too small, an effective drug may appear ineffective.” Source: GAO-1-798, Evidence from Clinical Trials.

Reasons for Active Controls An active treatment (comparator) with established efficacy exists. If superiority can be established, the standard of care is improved. While a short-term study with a placebo control might be ethical, if the outcome is morbidity/mortality, a trial with use of a placebo is not ethical if an accepted standard of care treatment exists (recall papers by Temple and Ellenberg).

The Number and Type of Active Comparator Studies Vary by Sponsor (Commercial versus Non-Commercial): Among published reports of trials between June 2008 and September 2009 in major medical journals, 97/212 (46%) used an active comparator: 36/108 (33%) of trials with commercial sponsors and 61/104 (59%) of trials with non-commercial sponsors. Among active-controlled trials, 18/36 (50%) of commercial trials were non-inferiority trials versus 5/61 (8%) of non-commercial trials. JAMA 2010;303:951-958.

Examples – Non-Inferiority - 1 Safety: Is a new vaccine for pertussis (whooping cough) that has an improved safety profile as effective in preventing whooping cough as the currently licensed vaccine? Ease of use: Is a new oral anticoagulant non-inferior to warfarin for stroke and systemic embolism among patients with atrial fibrillation? (N Engl J Med 2011)

Examples – Non-Inferiority - 2 Treatment duration: Is a short course of treatment for latent TB infection (3 months of INH plus rifapentine) as effective as 9 months of INH in preventing active TB? (N Engl J Med 2011) Cost: Is an inexpensive alternative to ranibizumab called bevacizumab non-inferior for visual acuity among patients with age-related macular degeneration? (N Engl J Med 2011)

Example - HIV Trial: Abacavir-Lamivudine-Zidovudine vs Indinavir-Lamivudine-Zidovudine. JAMA 2001;285:1155-1163. “The study was powered to assess treatment equivalence for the primary endpoint (i.e., a plasma HIV RNA level <= 400 copies/mL at week 48 for the intent-to-treat population). For the primary end point, treatments were considered equivalent if the 95% confidence interval was within the bound -12% to 12%.”

Motivation for Non-Inferiority and Equivalence Trials: Evaluating New Treatments. The new treatment may cost less; be more convenient to use (e.g., short course of prophylaxis for TB, no blood tests as for warfarin); or carry a lower risk of side effects (e.g., pertussis vaccine). But is it as effective?

Active and Placebo Controls in One Trial (usually a concurrent placebo arm is absent, but it may be practical in some short-term studies). [Diagram: participants are randomized to Drug A (control), Drug B (experimental), or placebo; the comparisons among the arms are labeled "Superiority" and "Non-inferiority".]

Effect of Hypericum perforatum (St. John’s Wort) in Major Depressive Disorder. [Design: participants randomized to sertraline (active control), St. John’s Wort (experimental), or placebo (control).] Neither sertraline nor St. John’s Wort was significantly different from placebo in this 8 week study. The authors noted “without a placebo, hypericum could easily have been considered as effective as sertraline…” JAMA 2002;287:1807-1814.

In the absence of a concurrent placebo, one has to provide assurance that the active control would have been superior to placebo had it been used, and that the test treatment would also have beaten placebo had it been used (indirect inference).

Non-inferiority or Equivalence Trials: Key Features Efficacy of reference or control treatment (anchor) must be clearly established (control is better than nothing). Target population and outcome measures must be similar to the trial that established efficacy of control (constancy assumption). Margin of non-inferiority/equivalence must be a priori stated, clinically relevant, and chosen to ensure new treatment is better than “imputed” nothing (non-inferiority margin).

Assay Sensitivity and Constancy are Critical Assumptions in Interpreting Non-inferiority and Equivalence Trials. Assay Sensitivity (def.): ability to demonstrate a difference between active and inactive treatments. Can you assume that the standard treatment (active control) is effective? How do you tell the difference between a good trial that establishes two active treatments to be similarly effective and a bad trial that incorrectly claims similarity? External evidence: historical data that the control treatment is effective. Internal evidence: a high quality trial. Constancy (def.): the historical evidence that the control treatment is effective (better than placebo) holds in the setting of the current non-inferiority trial. Hung and O’Neill, Encyclopedia of Clinical Trials.

Historical Evidence Concerning Efficacy of Active Control and Defining the Non-Inferiority or Equivalence Margin. Source of evidence: one trial, or a meta-analysis or overview of trials (need to be cognizant of the “file-drawer” problem). Estimate of the control effect: the point estimate or the lower bound of the 95% CI. Margin: retention of a certain fraction of the superiority of active control over placebo (e.g., 50%). Example: the true probabilities of an event for active control and placebo are 20% and 30%; show that the probability of an event with the new treatment is smaller than 25% (a difference, or non-inferiority margin, between new treatment and active control of 5%). Would like to convince people that if you had used placebo you would have won!
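The arithmetic in the example above (20% vs 30%, 50% retention, margin of 5%) can be sketched in a few lines. This is only an illustration on the risk-difference scale; the function name and argument names are hypothetical and not part of the deck.

```python
def risk_difference_margin(p_control: float, p_placebo: float, retain: float) -> float:
    """Non-inferiority margin (risk-difference scale) that preserves a
    fraction `retain` of the control's advantage over placebo.

    Hypothetical helper for illustration; event probabilities are failure
    rates, so lower is better.
    """
    if not 0.0 <= retain < 1.0:
        raise ValueError("retain must be in [0, 1)")
    control_effect = p_placebo - p_control  # absolute reduction achieved by the control
    return (1.0 - retain) * control_effect  # portion of the effect we allow to be lost


# Slide example: control 20%, placebo 30%, retain 50% of the effect.
margin = risk_difference_margin(p_control=0.20, p_placebo=0.30, retain=0.50)
threshold = 0.20 + margin  # new treatment's event probability must be shown to be below this
print(f"margin = {margin:.3f}, threshold = {threshold:.3f}")
```

Using the lower bound of the 95% CI for the control effect instead of the point estimate would shrink `control_effect` and therefore the margin.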

General Problems in Determining Non-Inferiority Margin What is “unacceptably inferior” or an acceptable level of non-inferiority – often in the eyes of the beholder! Multiple outcomes are at play – non-inferiority margins are typically defined for the primary endpoint but many outcomes may be considered. Constancy assumption: same endpoint, duration of follow-up as trial(s) that established efficacy of active control. The margin assumes we know “true” effect of active control and often there is substantial variability. In some cases, there are multiple choices for active control.

How do you prove two treatments are equal? Cannot prove H0: Δ = 0.

“It is never correct to claim that treatments have no effect or that there is no difference in the effects of treatments. It is impossible to prove … that two treatments have the same effect. There will always be uncertainty surrounding estimates of treatment effects, and a small difference can never be excluded… An analysis of 45 reports of trials purporting to test equivalence found that only a quarter set boundaries on their equivalence.” Alderson P, Chalmers I. BMJ 2003;326:1691-8. The non-inferiority/equivalence margin must be specified in the protocol!

Relationship Between Significance Tests and Confidence Intervals. [Figure: three confidence intervals for the treatment difference on an axis running from "Control Better" to "New Agent Better": superiority strongly shown (p=0.002), superiority shown (p=0.05), and superiority not shown (p=0.20).]

Superiority Trial: ALLHAT. Lisinopril vs Chlorthalidone for CHD Incidence, CVD Composite Outcome, and ESRD. HR (Lisinopril/Chlorthalidone), where values below 1.00 mean lisinopril better and values above 1.00 mean chlorthalidone better: CHD (95% CI: 0.91-1.08); CVD composite (95% CI: 1.05-1.16); ESRD (95% CI: 0.88-1.38). In ALLHAT, 15,255 participants were randomized to chlorthalidone and 9,000+ participants were randomized to each of 3 other treatments. JAMA 2002;288:2981-2997.

Interpretation of Head to Head (Equivalence) Trials: CONVINCE and CAPPP. [Figure: hazard ratios, HR (Verapamil/SOC) for CONVINCE and HR (Captopril/SOC) for CAPPP, on an axis centered at 1.00 and running from "Calcium Channel Blocker better" to "SOC better"; shown are the CONVINCE equivalence bounds (0.86-1.16), the CONVINCE trial result, the CAPPP trial result, and an overview of 9 trials.] CAPPP = Captopril Primary Prevention Project. Authors concluded: “captopril and conventional treatment did not differ in efficacy.” See JAMA 2003;289:2073-2082 for the CONVINCE trial.

Example: 2NN Study. A study of first-line antiretroviral therapy in HIV. The main comparison was between nevirapine twice daily and efavirenz (plus stavudine and lamivudine) in terms of ‘treatment failure’ (based on virology, disease progression, therapy change). The primary objective was to establish the non-inferiority of nevirapine twice daily (δ = 10%). Lancet 2004;363:1253-63.

Results: 2NN Study. Confidence intervals for the difference in failure rates (EFV-NVP): all data (-12.8%, 0.9%); those starting medication (-14.6%, -0.8%). Neither interval is completely above the δ value of -10%; one interval also excludes zero.
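The reading of the two 2NN intervals can be checked mechanically. The sketch below assumes, as the slide states, that the difference is EFV minus NVP in failure rates, so non-inferiority of nevirapine requires the whole interval to lie above -δ; the helper names are hypothetical.

```python
def noninferiority_shown(ci_lower: float, delta: float) -> bool:
    """True if the lower confidence limit lies above -delta (difference scale)."""
    return ci_lower > -delta


def excludes_zero(ci_lower: float, ci_upper: float) -> bool:
    """True if the interval lies entirely on one side of zero."""
    return ci_lower > 0 or ci_upper < 0


DELTA = 10.0  # percentage points, from the 2NN protocol as quoted on the slide

intervals = {
    "all data": (-12.8, 0.9),
    "those starting medication": (-14.6, -0.8),
}

results = {
    name: (noninferiority_shown(lo, DELTA), excludes_zero(lo, hi))
    for name, (lo, hi) in intervals.items()
}

for name, (ni, sig) in results.items():
    print(f"{name}: non-inferiority shown = {ni}, excludes zero = {sig}")
```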

Conclusions: 2NN Study BUT, the authors concluded: ‘Antiviral therapy with nevirapine or efavirenz showed similar efficacy, so triple-drug regimens with either … are valid for first-line treatment’ Lancet 2004, 363:1253-63

Interpretation of Non-Inferiority Trials: 6 Examples (A – F): Hazard ratio (Test Drug/Standard) and 95% CI. [Figure: six confidence intervals, A to F, on a hazard-ratio axis from 0.6 to 1.4, with "Test drug better" below 1.0 and "Standard drug better" above 1.0. The figure marks the zone of noninferiority and the estimated benefit of standard drug over placebo, and groups the intervals as Superiority (A), Noninferiority, i.e., Equivalence (B, C), Inferiority (D, E), and Underpowered trial (F).] Antman EM, Circulation 2001;103:e101-e104.

A = Test drug is superior to standard.

B = Test drug is better than standard and can be considered “non-inferior” to standard.

C = Test drug is worse than standard but not that much worse, and can be considered “non-inferior” to standard.

D = Test drug is inferior to standard and non-inferiority criteria not satisfied.

E = Test drug is very inferior to standard (non-inferiority criteria not satisfied).

F = Trial is inconclusive due to small size and resultant wide CI.
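The six interpretations can be expressed as a rule on the confidence limits. The sketch below is one possible reading, assuming a hazard ratio (test/standard) where values below 1 favor the test drug and a non-inferiority margin above 1; the margin of 1.2 and the example intervals are made-up values, not read from the figure.

```python
def classify_hazard_ratio_ci(lower: float, upper: float, margin: float) -> str:
    """Classify a 95% CI for HR (test/standard) against a non-inferiority margin.

    Hypothetical helper: `margin` is the largest hazard ratio still regarded
    as acceptable (must exceed 1).
    """
    if not 0 < lower <= upper:
        raise ValueError("need 0 < lower <= upper")
    if margin <= 1.0:
        raise ValueError("margin must exceed 1")
    if upper < 1.0:
        return "superior"  # like example A
    if upper < margin:
        # like examples B and C; an interval entirely above 1 is also statistically worse
        return "non-inferior" if lower <= 1.0 else "non-inferior but statistically worse"
    if lower > margin:
        return "inferior by more than the margin"  # like example E
    if lower > 1.0:
        return "inferior, non-inferiority not shown"  # like example D
    return "inconclusive"  # like example F: interval spans both 1 and the margin


MARGIN = 1.2  # assumed value for illustration only
examples = {
    "A": (0.70, 0.90),
    "B": (0.85, 1.10),
    "C": (0.95, 1.15),
    "D": (1.05, 1.30),
    "E": (1.25, 1.40),
    "F": (0.80, 1.35),
}
labels = {k: classify_hazard_ratio_ci(lo, hi, MARGIN) for k, (lo, hi) in examples.items()}
for k, label in labels.items():
    print(k, label)
```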

Possible Reasons for Non-Significant Difference: small sample size; poor compliance with study treatments; losses to follow-up; equivalent regimens. Absence of proof of a treatment difference does not constitute proof of an absence of a treatment difference.

Non-Inferiority and Equivalence Trials: Considerations. Cannot prove Pe = Pc or µ1 = µ2; therefore Ho: δ < 0 versus HA: δ > 0 is not correct, because a small, underpowered study could incorrectly lead to a claim of equivalence (absence of evidence is not evidence of absence), and if power is too high, Ho may be rejected when the difference is not important. Since Ho cannot be accepted, either reverse the roles of type 1 and 2 errors (i.e., rejection of Ho implies equivalence) or focus on confidence intervals. The treatment difference must be chosen not only to rule out the smallest clinically meaningful difference, but also to be sure the new treatment is better than no treatment. Consensus on what equivalence means, especially in a broad sense, is hard to achieve.

1-Sided Hypothesis Testing (Non-inferiority) A = new treatment; B = standard; PA and PB = event rates (failure rate) If Ho is rejected, treatments are “equivalent” Roles of null and alternative hypotheses are reversed. In practice, this is confusing to people. Blackwelder W, Cont Clin Trials 1982
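A minimal sketch of such a one-sided test, assuming the hypotheses are H0: PA - PB ≥ δO versus HA: PA - PB < δO (failure rates, so rejecting H0 supports non-inferiority) and an unpooled Wald standard error. Both the exact form of the hypotheses and the variance estimate are assumptions here, and the counts are invented.

```python
from math import sqrt
from statistics import NormalDist


def noninferiority_z_test(fail_new: int, n_new: int, fail_std: int, n_std: int,
                          delta0: float) -> tuple[float, float]:
    """One-sided Wald test of H0: P_new - P_std >= delta0 (failure rates).

    Returns (z, p_value); a small p-value rejects H0 in favor of
    non-inferiority. Hypothetical helper for illustration.
    """
    p_new = fail_new / n_new
    p_std = fail_std / n_std
    se = sqrt(p_new * (1 - p_new) / n_new + p_std * (1 - p_std) / n_std)
    if se == 0:
        raise ValueError("standard error is zero; test is undefined")
    z = (p_new - p_std - delta0) / se
    p_value = NormalDist().cdf(z)  # lower-tail probability
    return z, p_value


# Invented data: 100 failures out of 1000 in each arm, margin of 5 percentage points.
z, p_value = noninferiority_z_test(100, 1000, 100, 1000, delta0=0.05)
print(f"z = {z:.3f}, one-sided p = {p_value:.5f}")
```

With a one-sided α of 0.025, this is the same decision as checking whether the upper limit of the two-sided 95% CI for PA - PB falls below δO.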

Parallel Group Studies with Continuous Outcomes: Sample Size Formula is the Same Except for δ0. Sample size per group: $n = 2\sigma^2 (Z_{1-\alpha} + Z_{1-\beta})^2 / (\delta_0 - \Delta)^2$. Note: If Δ = 0, then this is equivalent to a superiority trial to detect δ0 with 90% power.

Example: Non-Inferiority Trial for New BP Lowering Drug. δO = 4 mmHg; Δ = 0, -2 (A better) and +2 (B better); σ2 = 100; α = 0.025 (1-sided); 1-β = 0.90; 1:1 allocation.

δO    Δ     No. per group
4     0     132
4     +2    525
4     -2    58
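The table can be approximately reproduced with the usual normal-approximation formula, n per group = 2σ²(Z₁₋α + Z₁₋β)² / (δO - Δ)². That formula is an assumption consistent with the numbers shown, and the function name is hypothetical. Because of rounding conventions, the results may differ from the table by one participant per group.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_continuous(delta0: float, true_diff: float, sigma2: float,
                           alpha: float, power: float) -> int:
    """Per-group sample size for a non-inferiority trial with a continuous outcome.

    delta0    : non-inferiority margin
    true_diff : assumed true difference (positive = standard better)
    sigma2    : common variance
    alpha     : one-sided type I error
    """
    gap = delta0 - true_diff
    if gap <= 0:
        raise ValueError("assumed difference must be smaller than the margin")
    z = NormalDist().inv_cdf
    return ceil(2 * sigma2 * (z(1 - alpha) + z(power)) ** 2 / gap ** 2)


sizes = {
    d: n_per_group_continuous(delta0=4, true_diff=d, sigma2=100, alpha=0.025, power=0.90)
    for d in (0, 2, -2)
}
for d, n in sizes.items():
    print(f"true difference {d:+d}: n per group = {n}")
```

The pattern matters more than the exact counts: assuming the new drug is slightly worse (Δ = +2) roughly quadruples the trial, while assuming it is slightly better (Δ = -2) more than halves it.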

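Confidence Interval Approach: Example of Type I Error. [Figure: confidence interval plotted on an axis running from "A (new treatment) better" to "B (standard treatment) better".]

Confidence Interval Approach: Example of Type II Error. [Figure: confidence interval plotted on an axis running from "A (new treatment) better" to "B (standard treatment) better".]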

Sample Size for Equivalence Design Based on CI Limits. A = New Treatment; B = Standard. $\Pr(\text{upper limit of CI exceeds } \delta \text{ when } P_A - P_B < \delta) = \Pr\left[(\hat{P}_A - \hat{P}_B) + Z_{1-\alpha}\, s > \delta\right] = \beta$, where $s^2 = \left[P_A(1-P_A) + P_B(1-P_B)\right]/N$.

Sample Size for Equivalence Design Based on CI Limits (cont.). A = New Treatment; B = Standard. Makuch and Simon (Cancer Treatment Reports, 1978) suggest α = 0.10 (1-sided) and β = 0.20; I like α = .05 (and usually 2-sided).
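A sketch of the confidence-interval-based sample size, assuming the familiar form N per group = (Z₁₋α + Z₁₋β)² [PA(1-PA) + PB(1-PB)] / (δO - (PA - PB))². This expression is an assumption rather than a quotation from the slides; it reproduces the Makuch and Simon column of the deck's tables to within about 1%, with the small gaps presumably due to rounding of the normal quantiles.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_makuch_simon(p_new: float, p_std: float, delta0: float,
                             alpha: float, power: float) -> int:
    """Per-group sample size so that the upper CI limit for P_new - P_std
    falls below delta0 with probability `power` (failure rates, 1:1 allocation).

    Hypothetical helper; alpha is one-sided.
    """
    gap = delta0 - (p_new - p_std)
    if gap <= 0:
        raise ValueError("assumed difference must be smaller than the margin")
    z = NormalDist().inv_cdf
    variance = p_new * (1 - p_new) + p_std * (1 - p_std)
    return ceil((z(1 - alpha) + z(power)) ** 2 * variance / gap ** 2)


# Rows (PA, PB, deltaO) and the Makuch and Simon values reported in the deck.
reported = {
    (0.05, 0.05, 0.01): 9972,
    (0.10, 0.10, 0.05): 756,
    (0.15, 0.15, 0.05): 1071,
    (0.20, 0.20, 0.05): 1344,
    (0.20, 0.20, 0.10): 336,
}
computed = {
    row: n_per_group_makuch_simon(*row, alpha=0.025, power=0.90) for row in reported
}
for row, n in computed.items():
    print(row, "computed:", n, "reported:", reported[row])
```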

For Proportions and Relative Risks, Farrington and Manning’s Approach is Better. The problem arises because of estimation of the variance under the null hypothesis. Farrington and Manning (Stat Med 1990) have shown that their maximum likelihood approach is better, particularly for small values of pc and pe. The algorithm can be easily programmed. Stat Med 1990;9:1447-1454.
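One way the algorithm might be programmed is sketched below for the risk difference with 1:1 allocation. It assumes the commonly cited closed-form solution for the maximum likelihood estimates restricted to PA - PB = δO, and the sample size expression n = (Z₁₋α √V0 + Z₁₋β √V1)² / (δO - (PA - PB))², where V0 uses the restricted estimates and V1 the assumed true rates. Neither formula is spelled out in the slides, so treat both as assumptions; the function names are hypothetical.

```python
from math import acos, ceil, copysign, cos, pi, sqrt
from statistics import NormalDist


def restricted_mle(p_new: float, p_std: float, delta0: float) -> tuple[float, float]:
    """Estimates of (P_new, P_std) maximizing the likelihood subject to
    P_new - P_std = delta0, for equal group sizes (closed-form cubic root)."""
    theta = 1.0  # allocation ratio, fixed at 1:1 in this sketch
    a = 1 + theta
    b = -(1 + theta + p_new + theta * p_std + delta0 * (theta + 2))
    c = delta0 ** 2 + delta0 * (2 * p_new + theta + 1) + p_new + theta * p_std
    d = -p_new * delta0 * (1 + delta0)
    v = b ** 3 / (3 * a) ** 3 - b * c / (6 * a ** 2) + d / (2 * a)
    u = copysign(sqrt(b ** 2 / (3 * a) ** 2 - c / (3 * a)), v)
    ratio = max(-1.0, min(1.0, v / u ** 3))  # guard against rounding just outside [-1, 1]
    w = (pi + acos(ratio)) / 3
    p_new_tilde = 2 * u * cos(w) - b / (3 * a)
    return p_new_tilde, p_new_tilde - delta0


def n_per_group_fm(p_new: float, p_std: float, delta0: float,
                   alpha: float, power: float) -> int:
    """Per-group sample size with the null variance taken at the restricted MLEs."""
    gap = delta0 - (p_new - p_std)
    if gap <= 0:
        raise ValueError("assumed difference must be smaller than the margin")
    pn0, ps0 = restricted_mle(p_new, p_std, delta0)
    v0 = pn0 * (1 - pn0) + ps0 * (1 - ps0)          # variance under the null
    v1 = p_new * (1 - p_new) + p_std * (1 - p_std)  # variance under the alternative
    z = NormalDist().inv_cdf
    return ceil((z(1 - alpha) * sqrt(v0) + z(power) * sqrt(v1)) ** 2 / gap ** 2)


pn0, ps0 = restricted_mle(0.10, 0.10, 0.05)
n_fm = n_per_group_fm(0.10, 0.10, 0.05, alpha=0.025, power=0.90)
print(f"restricted estimates: {pn0:.4f}, {ps0:.4f}; n per group = {n_fm}")
```

For PA = PB = 0.10 and δO = 0.05 this lands very close to the 775 per group reported in the deck's comparison table.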

Sample Size for Proportions for Non-Inferiority Trial: Makuch and Simon versus Farrington and Manning (PA=PB)*

                            Sample Size per Group
PA(PE)   PB(PC)   δO     Makuch and Simon   Farrington and Manning
0.05     0.05     0.01   9,972              10,032
0.10     0.10     0.05   756                775
0.15     0.15     0.05   1,071              1,080
0.20     0.20     0.05   1,344              1,348
0.20     0.20     0.10   336                340

* α = 0.025 (1-sided); 1-β = 0.90; 1:1 allocation

Sample Size for Proportions for Non-Inferiority Trial: Makuch and Simon versus Farrington and Manning (PA = or ≠ PB)*

                            Sample Size per Group
PA(PE)   PB(PC)   δO     Makuch and Simon   Farrington and Manning
0.10     0.10     0.05   756                775
0.125    0.10     0.05   3,343              3,379
0.10     0.125    0.05   371                384

* α = 0.025 (1-sided); 1-β = 0.90; 1:1 allocation

Sample Size for Proportions: Superiority Trial with Specified Delta or Non-Inferiority with Farrington and Manning (1:1 allocation and 1-β = 0.90)

                            Sample Size per Group
                                           Farrington and Manning**
PA(PE)   PB(PC)   δO     Superiority*   α = 0.025 (1-sided)   α = 0.05 (1-sided)
0.05     0.05     0.01   9,021          10,032                8,174
0.10     0.10     0.05   581            775                   630
0.15     0.15     0.05   917            1,080                 880
0.20     0.20     0.05   1,211          1,349                 1,099
0.20     0.20     0.10   266            340                   277

* α = 0.05 (2-sided); PE = PC - δO
** α = 0.025 (1-sided) in 1st column; α = 0.05 (1-sided) in 2nd column

General Approach. [Figure: confidence interval for the relative risk (RR) on an axis running from "New Treatment Better" to "Standard Treatment Better", with the bound RRo marked.] RRo is chosen so that if the upper limit < RRo, we conclude “equivalence”. RRo usually ≠ 1.0.

CONVINCE Design Based on the findings from 17 trials with over 50,000 participants, the CVD risk reduction associated with BP lowering by diuretics and beta-blockers was estimated as 24%. Equivalence margin was set to ensure that there would be no more than a 50% loss of efficacy based on this point estimate. Upper bound = 1.16 = 0.88 (12% reduction)/ 0.76 (24% reduction). Lower bound = 1/1.16 = 0.86.
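The CONVINCE bounds follow from simple arithmetic on the ratio scale, sketched below. The function name is hypothetical; the inputs (a 24% risk reduction and retention of 50% of it) are the ones quoted on the slide.

```python
def ratio_scale_bounds(control_vs_placebo: float, retain: float) -> tuple[float, float]:
    """Equivalence bounds on the ratio scale that preserve a fraction `retain`
    of the control's risk reduction relative to placebo.

    control_vs_placebo : relative risk of control vs placebo (e.g. 0.76)
    Returns (lower, upper), with lower = 1 / upper.
    """
    if not 0 < control_vs_placebo < 1:
        raise ValueError("control must be better than placebo (ratio below 1)")
    if not 0.0 <= retain < 1.0:
        raise ValueError("retain must be in [0, 1)")
    reduction = 1 - control_vs_placebo      # 0.24 for CONVINCE
    retained_ratio = 1 - retain * reduction  # 0.88: new drug keeps a 12% reduction
    upper = retained_ratio / control_vs_placebo
    return 1 / upper, upper


lower, upper = ratio_scale_bounds(control_vs_placebo=0.76, retain=0.50)
print(f"equivalence bounds: {lower:.2f} to {upper:.2f}")
```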

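Confidence Interval Approach to Monitoring for CONVINCE. [Figure: confidence intervals on a hazard-ratio axis with 0.86 (lower limit of equivalence), 1.0 (no difference), and 1.16 (upper limit of equivalence) marked; regions are labeled "Diuretic/β-blocker Better", "Ca+ Blocker Better", "Equivalence", and "Inconclusive".]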

Non-inferiority and superiority: The 95% CI for the difference between the control and the intervention lies entirely above -δ, i.e., non-inferiority is demonstrated. In this case both non-inferiority and superiority have been demonstrated. [Figure: axis running from "Control treatment better" through "No difference" to "New treatment better", with -δ marked.]

Non-inferiority and Inferiority: The 95% CI for the difference between the control and the intervention lies entirely above -δ, i.e., non-inferiority is demonstrated. In this case both non-inferiority and inferiority have been demonstrated. [Figure: axis running from "Control treatment better" through "No difference" to "New treatment better", with -δ marked.]

Summary - Determining Equivalence: The first step in establishing equivalence is to define the ‘limits of equivalence’ (± δ). Having conducted the trial, calculate the 95% confidence interval for the difference between the control and the new treatment. If the confidence interval is entirely within ± δ, then equivalence is established.

Summary - Determining Non-inferiority: Equivalence requires that the difference (control - new intervention) is both > -δ and < δ; the new treatment must be neither worse nor better than the control by a fixed amount. In contrast to equivalence, with non-inferiority we are only interested in determining whether the new treatment is no worse by an amount δ.

Analysis of Non-inferiority/Equivalence Trials: Superiority trials are analysed by intention-to-treat (ITT) because it is the most conservative and least likely to be biased. ITT analysis of non-inferiority trials is not conservative: there is a bias towards no difference. Per protocol analysis is biased since not all randomised patients are included. Recommendation: Analyze by both ITT and per protocol (need to ensure power for both).

Testing for Superiority after Non-Inferiority In some situations it may be appropriate to test for superiority after testing for non-inferiority. Regulatory authorities do not require any multiplicity adjustment for this. In this situation, while the primary analysis for non-inferiority might be based on a “per protocol” population, the primary analysis for the superiority analysis should be intention to treat.

Equivalence/Non-Inferiority Trials: Summary. Equivalence/non-inferiority trials may be larger than, smaller than, or similar in size to superiority trials; this depends on the margin chosen and whether the new therapy is assumed to be more efficacious. Equivalence is “in the eyes of the beholder”, so select margins carefully! The absence of a significant difference in a superiority trial does not imply equivalence. Need to be sure about the efficacy of the active control treatment based on earlier trials. It is critical that the conduct of equivalence/non-inferiority trials is excellent. Because of difficulty of interpretation, equivalence and non-inferiority trials should be used cautiously. More head to head superiority comparisons of approved treatments are needed.

Quality of Reporting of Non-inferiority and Equivalence Trials (JAMA 2006;295:1147-1151) Margin of non-inferiority/equivalence defined in most trials, but rationale for margin missing in majority of studies. About 25% of reports did not give sample size justification in sufficient detail to reproduce it. Less than 50% described both intention to treat and per protocol analysis. About 15% of reports did not state confidence intervals.

Guidelines for Reporting Non-inferiority and Equivalence Trials+ (JAMA 2006;295:1152-1160): Specification of whether the trial is a non-inferiority study. Sample size details (specification and rationale for non-inferiority margin). Use of 1- or 2-sided confidence interval. Nature of analysis: intention to treat, per protocol or both. Presentation of results: confidence intervals. + Builds on CONSORT guidelines for superiority trials.