Interim Analysis in Clinical Trials: A Bayesian Approach in the Regulatory Setting Telba Z. Irony, Ph.D. and Gene Pennello, Ph.D. Division of Biostatistics.

Presentation transcript:

Interim Analysis in Clinical Trials: A Bayesian Approach in the Regulatory Setting. Telba Z. Irony, Ph.D. and Gene Pennello, Ph.D., Division of Biostatistics, Office of Surveillance and Biometrics, Center for Devices and Radiological Health, FDA. No official support or endorsement by the Food and Drug Administration of this presentation is intended or should be inferred.

2 The Frequentist Approach to Interim Analyses. Trial: 200 patients, with several interim analyses planned. If statistical significance is found at any of the looks, the trial stops and is declared successful. To obtain an overall significance level of 0.05, the nominal level at each possible stopping point must be smaller than 0.05. There are infinitely many ways of distributing the overall level of 0.05 among the possible stopping points.
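
The need for tighter nominal levels can be checked by simulation. The sketch below is illustrative and not from the presentation: it assumes five equally spaced looks, a normally distributed outcome with known unit variance, and a two-sided z-test at a nominal 0.05 at every look. The function name and its parameters are hypothetical.

```python
import random
from statistics import NormalDist


def overall_type1_error(n_looks=5, n_total=200, alpha=0.05, n_sims=20000, seed=1):
    """Estimate the chance of at least one nominally significant look under H0.

    Assumptions (not from the slides): equally spaced looks, N(0, 1) outcomes,
    two-sided z-test at the same nominal alpha at every look.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    per_look = n_total // n_looks
    rejections = 0
    for _ in range(n_sims):
        running_sum, n = 0.0, 0
        for _ in range(n_looks):
            # The sum of per_look independent N(0, 1) outcomes is N(0, per_look).
            running_sum += rng.gauss(0.0, per_look ** 0.5)
            n += per_look
            if abs(running_sum) / n ** 0.5 > z_crit:
                rejections += 1
                break  # the trial stops at the first significant look
    return rejections / n_sims


print("one look:  ", overall_type1_error(n_looks=1))
print("five looks:", overall_type1_error(n_looks=5))
```

With a single look the estimate stays near 0.05; with five unadjusted looks it is clearly larger, which is why the per-look levels have to be reduced.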

3

4 Other Ways to Penalize Multiple Looks. The alpha spending function is a version of the stopping boundary that is a continuous function of the percentage of the study completed. There are many different boundaries, techniques, and software packages (PEST, EAST) to control the type I error while performing interim looks in a clinical trial. You could create (and publish) your own boundary and develop your own software.
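
As an illustration of what a spending function looks like, the sketch below evaluates two commonly used Lan-DeMets forms, an O'Brien-Fleming-type and a Pocock-type function, at a few information fractions. The slide does not say which functions it has in mind, so these two are an assumption chosen for illustration; the function names are hypothetical.

```python
from math import e, log, sqrt
from statistics import NormalDist


def obf_type_spending(t, alpha=0.05):
    """O'Brien-Fleming-type spending: cumulative alpha spent at fraction t (0 < t <= 1)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return 2 * (1 - NormalDist().cdf(z / sqrt(t)))


def pocock_type_spending(t, alpha=0.05):
    """Pocock-type spending: cumulative alpha spent at fraction t (0 <= t <= 1)."""
    return alpha * log(1 + (e - 1) * t)


# Cumulative alpha spent after 25%, 50%, 75% and 100% of the study.
for t in (0.25, 0.5, 0.75, 1.0):
    print(f"t={t:.2f}  OBF-type={obf_type_spending(t):.5f}  "
          f"Pocock-type={pocock_type_spending(t):.5f}")
```

Both functions spend the full 0.05 by the end of the study; the O'Brien-Fleming type spends very little early, while the Pocock type spends more evenly.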

5 Two companies come with the same data on 200 patients. Both obtain the same p-value at the end (0.045). One looked once at the data during the trial with the intention of stopping, but didn't stop => does not reach significance (required p-value = 0.041) => not successful! The competitor did not look => reaches significance (required p-value = 0.05) => successful! Moreover, reaching significance or not depends on whose boundary you choose => you have to say in advance which one you want and cannot change your mind! That approach violates the Likelihood Principle!

6 Why do frequentist and Bayesian approaches differ? Frequentist inferences are based on p-values; probabilities are on the sample space. Estimation: P(data | parameter). Hypothesis testing: P(data | H). Bayesian inferences are based on posterior distributions; probabilities are on the parameter space. Estimation: P(parameter | data). Hypothesis testing: P(H | data). The Likelihood Principle prevails.

7 The Bayesian Approach to Interim Analyses. No adjustments are made for interim looks or for modifications of trials in midcourse. In fact, the decision to continue the study or not should be based on potential costs and benefits, weighted by the current posterior distribution of the unknowns.

8 Example 1: Curtailment of the trial via predictive distribution. Clinical trial; p: chance of patient success. Interim look: 190 successes out of 200 observed patients. Remaining: 80 patients. How many successes will there be among the next 80 patients? Could we stop the trial and make a decision already? Predictive distribution: P(future observation(s) | prior, data).

9 Predictive probability of success for the next 80 patients (based on the posterior distribution for p). Make sure that the remaining patients are exchangeable with the observed patients.
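
The predictive distribution for the next 80 patients can be sketched with a beta-binomial model. The slides do not state the prior, so a uniform Beta(1, 1) prior on p is assumed here, which gives a Beta(191, 11) posterior after 190 successes in 200 patients. The threshold of 72 successes is hypothetical and only shows how the predictive probability would be read off.

```python
from math import comb, exp, lgamma, log


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def predictive_pmf(k, m, a, b):
    """P(k successes among the next m patients | p ~ Beta(a, b)): beta-binomial."""
    return exp(log(comb(m, k)) + log_beta(a + k, b + m - k) - log_beta(a, b))


def prob_at_least(k_min, m, a, b):
    """Predictive probability of at least k_min successes among the next m patients."""
    return sum(predictive_pmf(k, m, a, b) for k in range(k_min, m + 1))


# Posterior after 190 successes and 10 failures, starting from the assumed Beta(1, 1) prior.
post_a, post_b = 1 + 190, 1 + 10
remaining = 80

# Hypothetical criterion: at least 72 successes among the remaining 80 patients.
print("P(at least 72 of next 80 succeed | data) =",
      round(prob_at_least(72, remaining, post_a, post_b), 4))
```

If this predictive probability is very high (or very low) for the outcome that would decide the trial, the remaining patients are unlikely to change the conclusion and curtailment can be considered.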

10 2. Interim Analyses: Multiple Looks. When we know enough we should stop the trial. We collect data to learn about an endpoint. Stop when the credible interval is small enough. Stop when there is reasonable assurance that the hypothesis is true (or false), or that the device is safe and effective (or is not).

11 Example: A totally Bayesian approach. Planned ahead => no penalty for multiple looks! New treatment. Interest: the rate of an adverse effect (endocarditis). Prior: P(rate), from a hierarchical model that used old results. Posterior: P(rate | data). We want the rate to be small. How small?

12 Target rate: 0.1. If there is a good chance that rate < target => success. If there is a good chance that rate > target => failure. Pre-defined criterion: look every 100 patient years. Stop and approve if P(rate < target | data) > 99%. Stop and don't approve if P(rate > target | data) > 80%. Minimum sample size: 300 patient years (hierarchical model). Maximum sample size: 800 patient years (practical reasons). The company could in fact go on forever (!!)

13 Start with 300 patient years (data1). If P(rate < target | data1) > 99%, stop and approve. If P(rate > target | data1) > 80%, stop and cut losses. If neither of the above, continue sampling.

14 Sample 100 patient years more (data2). If P(rate < target | data1 + data2) > 99%, stop and approve. If P(rate > target | data1 + data2) > 80%, stop and cut losses. If neither of the above, continue sampling.

15 Sample 100 patient years more (data i). If P(rate < target | data1 + data2 + ... + data i) > 99%, stop and approve. If P(rate > target | data1 + data2 + ... + data i) > 80%, stop and cut losses. Approved!
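
The stopping rule on slides 12 to 15 can be sketched as a small monitoring loop. The presentation uses a hierarchical prior that is not given, so this sketch swaps in a simple Gamma-Poisson model: events are assumed Poisson with mean rate x patient years, and the prior is a hypothetical Gamma(shape=1, rate=10). The target of 0.1 is read as events per patient year, which is also an assumption. All names are illustrative.

```python
from math import exp

TARGET = 0.1          # target rate from the slides; per-patient-year units are assumed
APPROVE_AT = 0.99     # stop and approve if P(rate < target | data) exceeds this
FUTILITY_AT = 0.80    # stop and cut losses if P(rate > target | data) exceeds this
LOOKS = [300, 400, 500, 600, 700, 800]  # cumulative patient years at each look


def prob_rate_below(target, events, exposure, prior_shape=1, prior_rate=10.0):
    """P(rate < target | data) under a Gamma prior and Poisson events.

    The posterior is Gamma(prior_shape + events, prior_rate + exposure). For an
    integer shape a and rate b, P(Gamma < x) = P(Poisson(b * x) >= a).
    """
    shape = prior_shape + events
    lam = (prior_rate + exposure) * target
    term = exp(-lam)          # Poisson probability of 0
    below_shape = 0.0
    for k in range(shape):    # accumulate P(Poisson(lam) <= shape - 1)
        below_shape += term
        term *= lam / (k + 1)
    return 1.0 - below_shape


def decide(events, exposure):
    """Apply the pre-defined criterion at one look."""
    p_below = prob_rate_below(TARGET, events, exposure)
    if p_below > APPROVE_AT:
        return "stop and approve"
    if 1.0 - p_below > FUTILITY_AT:
        return "stop and cut losses"
    return "continue sampling"


def monitor(cumulative_events):
    """Walk through the looks; return (patient years, decision) at the last look examined."""
    exposure, decision = None, "no data"
    for exposure, events in zip(LOOKS, cumulative_events):
        decision = decide(events, exposure)
        if decision != "continue sampling":
            break
    return exposure, decision


# Hypothetical cumulative event counts at 300 and 400 patient years.
print(monitor([25, 25]))
```

Because the looks and thresholds are fixed in advance, the same function is applied unchanged at every look; no adjustment for the number of looks appears anywhere in the rule.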

16 Problems. Frequentists believe one may sample to a foregone conclusion: one may stop as soon as one gets significance, or by repeatedly testing it is possible to reject Ho with probability as close to 1 as desired (probabilities of hypotheses are usually martingales; D. Berry, 1987). It takes an infinite amount of time, though. Controlling the overall type I error is a critical concern in monitoring clinical trials (regulators). Some Bayesians (perhaps inspired by O'Brien and Fleming) believe that one needs to be more restrictive in the early stages of the trial, requiring higher posterior probabilities for termination at the beginning.

17 More Problems. Normal distribution paradox (D. Rubin): two companies, one frequentist and one Bayesian, both perform interim looks. The Bayesian uses a non-informative prior and stops when P(Ho | data) > 95%. The frequentist uses a nominal significance level of 5%. In the Normal case with a non-informative prior, the posterior probability is numerically equal to 1 - (p-value). The frequentist pays a penalty for the looks and the Bayesian doesn't. The frequentist may be unsuccessful and the Bayesian may be successful with the same data!
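
The numerical coincidence behind the paradox can be checked directly. The sketch assumes a normal mean with known standard deviation, a flat prior on the mean, and a one-sided test of "mean <= 0" against "mean > 0"; under those assumptions the posterior for the mean is Normal(sample mean, sigma^2 / n). The data values and function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_p_value(sample_mean, sigma, n):
    """P(a sample mean at least this large | true mean = 0)."""
    z = sample_mean / (sigma / sqrt(n))
    return 1 - NormalDist().cdf(z)


def posterior_prob_positive(sample_mean, sigma, n):
    """P(mean > 0 | data) under a flat prior: posterior is N(sample_mean, sigma^2 / n)."""
    posterior = NormalDist(mu=sample_mean, sigma=sigma / sqrt(n))
    return 1 - posterior.cdf(0.0)


# Hypothetical interim data: mean 0.25, known sigma 1.0, 50 patients.
p = one_sided_p_value(0.25, 1.0, 50)
post = posterior_prob_positive(0.25, 1.0, 50)
print("p-value:", round(p, 4), " posterior probability:", round(post, 4),
      " sum:", round(p + post, 4))
```

The two numbers always add to 1 in this setting, so a Bayesian threshold of 95% and an unadjusted one-sided level of 5% trigger on exactly the same data; the difference between the two companies comes only from the frequentist penalty for looks.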

18 A Regulatory Solution. To illustrate what would happen in terms of type I and type II errors in a Bayesian trial, we request simulations at the design stage. If the rate were actually below the target, what would happen? How often would the trial stop for futility? (type II error) If the rate were actually above the target, what would happen? How often would the device be approved? (type I error) Whenever the type I error rate is too high, we modify the design!

19 Evaluating the experimental design: Heart Valve. For each rate, 1000 trials were simulated.
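
A design-stage simulation of this kind can be sketched as follows. It is not the presenters' simulation: the hierarchical prior is replaced by a hypothetical Gamma(shape=1, rate=10) prior, events are assumed Poisson in patient years, and the true rates in the loop are arbitrary illustrative values. The looks (300 patient years, then every 100 up to 800) and the 99% and 80% thresholds follow the slides.

```python
import random
from math import exp

TARGET = 0.1                                      # per-patient-year units are assumed
LOOK_INCREMENTS = [300, 100, 100, 100, 100, 100]  # patient years added at each look


def prob_rate_below(target, events, exposure, prior_shape=1, prior_rate=10.0):
    """P(rate < target | data) for a Gamma posterior with integer shape."""
    shape = prior_shape + events
    lam = (prior_rate + exposure) * target
    term, below_shape = exp(-lam), 0.0
    for k in range(shape):    # P(Poisson(lam) <= shape - 1)
        below_shape += term
        term *= lam / (k + 1)
    return 1.0 - below_shape


def poisson_draw(rng, mean):
    """Draw a Poisson variate with Knuth's multiplication method (fine for modest means)."""
    limit, k, prod = exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_trial(true_rate, rng):
    """Run one trial under the stopping rule and return its outcome."""
    events, exposure = 0, 0
    for added in LOOK_INCREMENTS:
        events += poisson_draw(rng, true_rate * added)
        exposure += added
        p_below = prob_rate_below(TARGET, events, exposure)
        if p_below > 0.99:
            return "approve"
        if 1.0 - p_below > 0.80:
            return "futility"
    return "no decision"      # maximum of 800 patient years reached


def operating_characteristics(true_rate, n_trials=1000, seed=0):
    """Proportion of simulated trials ending in each outcome for a given true rate."""
    rng = random.Random(seed)
    counts = {"approve": 0, "futility": 0, "no decision": 0}
    for _ in range(n_trials):
        counts[simulate_trial(true_rate, rng)] += 1
    return {outcome: n / n_trials for outcome, n in counts.items()}


for rate in (0.05, 0.08, 0.10, 0.12, 0.15):
    print(rate, operating_characteristics(rate))
```

For true rates above the target, the "approve" proportion estimates the type I error; for true rates below the target, the "futility" proportion estimates the type II error. If the former is too high, the thresholds or the look schedule can be changed and the simulation rerun.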