COMPUTER INTENSIVE AND RE-RANDOMIZATION TESTS IN CLINICAL TRIALS Thomas Hammerstrom, Ph.D. USFDA, Division of Biometrics The opinions expressed are those of the author and do not necessarily reflect those of the FDA.

OBJECTIVE OF TALK Discuss the role of randomization and deliberate balancing in experimental design. Compare standard and computer-intensive tests to examine the robustness of the level and power of common tests under deliberately balanced assignments when the assumed distribution of responses is incorrect.

OUTLINE OF TALK I. Testing with Deliberately Balanced Assignment II. Common Mistakes in Views on Randomization and Balance III. Robustness Studies on Inference in Deliberately Balanced Designs

I. TESTING WITH DYNAMIC ALLOCATION

DYNAMIC ASSIGNMENTS 1. Identify several relevant, discrete covariates, e.g., age, sex, CD4 count 2. Change randomization probabilities at each assignment so that each level of each covariate is split nearly evenly between arms

3. Assign the new subject randomly if all covariates are balanced; if they are not currently balanced, assign deterministically or with unequal probabilities to move toward marginal balance
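The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the allocation rule used in the talk: the function name `minimization_assign`, the imbalance score (sum of arm differences over the new subject's covariate levels), and the default `p_favor=0.8` are all assumptions made for the example. Setting `p_favor=1.0` gives the deterministic variant.

```python
import random

def minimization_assign(covariates, p_favor=0.8, seed=0):
    """Assign subjects to arm 0 or 1 sequentially by a simple minimization rule.

    covariates: list of tuples of discrete covariate levels, in enrollment order.
    p_favor: probability of picking the arm that reduces imbalance (assumed value).
    """
    rng = random.Random(seed)
    n_cov = len(covariates[0])
    # counts[k][level] = [number in arm 0, number in arm 1] for covariate k
    counts = [dict() for _ in range(n_cov)]
    arms = []
    for x in covariates:
        # Marginal imbalance (arm 1 minus arm 0), summed over this subject's levels.
        imbalance = 0
        for k, level in enumerate(x):
            c = counts[k].setdefault(level, [0, 0])
            imbalance += c[1] - c[0]
        if imbalance == 0:
            # Currently balanced: assign by a fair coin.
            arm = rng.randrange(2)
        else:
            # Not balanced: favor the arm that is lagging on these levels.
            lagging = 0 if imbalance > 0 else 1
            arm = lagging if rng.random() < p_favor else 1 - lagging
        arms.append(arm)
        for k, level in enumerate(x):
            counts[k][level][arm] += 1
    return arms
```

Because each assignment depends on all earlier ones, the resulting assignment vector is not a simple random sample of subjects, which is what motivates the questions about test levels on the next slide.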

ISSUES WITH DYNAMIC ASSIGNMENTS 1. Why bother with this elaborate procedure? 2. Are the levels of tests for treatment effect preserved when standard tests are used with dynamic (minimization) assignments? 3. Does the use of minimization increase power in the presence of both treatment and covariate effects?

II. COMMON MISTAKES IN ANALYSIS OF BASELINE COVARIATES

Mistake 1. Purpose of Randomization is to Create Balance in Baseline Covariates Fact: Purpose of Randomization is to Guarantee Distributional Assumptions of Test Statistics and Estimators

Mistake 2. It is good practice in a randomized trial to test for equality between arms of a baseline covariate. Fact: All observed differences between arms in baseline covariates are known with certainty to be due to chance. There is no alternative hypothesis whose truth can be supported by such a test.

Mistake 3. If a test for equality between arms of a baseline covariate is significant, then one should worry. Fact: Such test statistics are not even good descriptive statistics since p-values depend on sample size, not just the magnitude of the difference.
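The dependence on sample size is easy to see numerically. The sketch below is an illustration only: it assumes a two-arm comparison of means with known unit variance and a two-sided z-test, and the function name `two_sided_p` is hypothetical. The same observed difference of 0.2 standard deviations gives a p-value of about 0.32 with 50 subjects per arm and a p-value well below 0.01 with 500 per arm.

```python
import math

def two_sided_p(std_diff, n_per_arm):
    """Two-sided z-test p-value for a difference of std_diff SDs between two arms."""
    # Standard error of a difference in means with unit variance is sqrt(2/n).
    z = abs(std_diff) * math.sqrt(n_per_arm / 2)
    # 2 * (1 - Phi(z)) written with the complementary error function.
    return math.erfc(z / math.sqrt(2))

for n in (50, 500):
    print(n, round(two_sided_p(0.2, n), 4))
```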

Mistake 4. Observed Imbalances in Baseline Covariates cast Doubt on the Reality of Statistically Significant Findings in the Primary Analysis. Fact: The standard error of the primary statistic is large enough to ensure that such imbalances create significant treatment effects no more frequently than the nominal level of the test.

Mistake 5. Type I Errors can be Reduced by Replacing the Primary Analysis with one Based on Stratifying on Baseline Covariates Observed Post Facto to be Unbalanced. Fact: The Operating Characteristics of Procedures Selected on the Basis of Observation of the Data are not generally Quantifiable.

If the Agency approved of Post Hoc Fixing of Type I Errors by Adding New Covariates to the Analysis (or by other Adjustments to Fix Randomization Failures), Then it should also Approve of Similar Post Hoc Fixing of Type II Errors when Failure of Randomization Leads to Imbalance in Favor of the Control Arm.

Mistake 6. If the same Random Assignment Method gave more even Balance in Trial A than in Trial B, then one should place more trust in a Rejection of the Null Hypothesis from Trial A. Fact: Balance on Baseline Covariates Decreases the Variance of Test Statistics and Estimators. It Increases the Power of Tests when the Alternative Hypothesis is True. It has no Effect on Type I Error.

Mistake 7. Balance on Baseline Covariates Leads to Important Reductions in Variances. Fact: Even without Balance, the Variances of Test Statistics and Estimators are of size $O(1/N)$, where N = sample size. Balancing on p Baseline Covariates Decreases these variances by Subtracting a Term of size $O(p/N^2)$.

Typical model for Continuous Response: $Y_{ik} = m_i + g_1 x^1_{ik} + \dots + g_p x^p_{ik} + e_{ik}$, where $e_{ik} \sim N(0, s^2)$, $m_i$ = treatment effect, $X_{ik} = (x^1_{ik}, \dots, x^p_{ik})$ = vector of covariates, and $g_1, \dots, g_p$ = unknown vector of covariate effects

$s^2 \times$ (Precision of Estimate of $m_1 - m_0$) $= N/2 - Z'Z$, where N = number per arm, $Z = V^{-1}(X_{1\cdot} - X_{0\cdot})$, and $V^2$ = matrix of cross-products of $X/2N$. The randomization distribution of $(X_{1\cdot} - X_{0\cdot})$ is $N(0, V^2)$, that of $Z$ is $N(0, I_p)$, and that of $Z'Z$ is Chi-square(p). In the same $s^2$-scaled units: Precision with Balance $= N/2$, E(Precision without Balance) $= N/2 - O(p)$
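The link between this precision formula and the variance orders quoted under Mistake 7 can be made explicit by inverting the precision and expanding to first order. The following is a sketch, writing $Z'Z$ for the Chi-square(p) quadratic form on the slide; it assumes $Z'Z$ is small relative to $N/2$ and uses $E(Z'Z) = p$. The hats on $m_1$ and $m_0$ are added here to denote estimates.

```latex
\begin{align*}
\operatorname{Var}(\hat m_1 - \hat m_0)
  &= \frac{s^2}{N/2 - Z'Z}
   = \frac{2 s^2}{N}\left(1 - \frac{2\,Z'Z}{N}\right)^{-1} \\
  &\approx \frac{2 s^2}{N} + \frac{4 s^2\, Z'Z}{N^2}, \\
E\bigl[\operatorname{Var}(\hat m_1 - \hat m_0)\bigr]
  &\approx \frac{2 s^2}{N} + \frac{4 s^2\, p}{N^2}.
\end{align*}
```

The leading term is of size $O(1/N)$ whether or not the covariates are balanced, and the extra term that balance removes is of size $O(p/N^2)$.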

III. ROBUSTNESS STUDIES ON INFERENCE IN DELIBERATELY BALANCED DESIGNS A. MODELS USED TO COMPARE METHODS

METHODS COMPARED 1. Dynamic Allocation analyzed by F-statistic from ANCOVA based on arm and covariates 2. Dynamic Allocation analyzed by re-randomization test, using difference in means 3. Randomized Pairs, analyzed by F-statistic from ANCOVA using arm and covariates
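A re-randomization test keeps the responses fixed in enrollment order, regenerates the assignment vector many times with the same allocation rule, and locates the observed statistic in the resulting distribution. The sketch below is a generic illustration under stated assumptions: the function names are hypothetical, the statistic is the absolute difference in means, and "randomized pairs" is read as assigning each consecutive pair of subjects one to each arm in random order. For dynamic allocation, the `assign` argument would instead replay the allocation rule on the observed covariate sequence.

```python
import random

def randomized_pairs(n, rng):
    """Assign consecutive pairs of subjects, one to each arm, in random order."""
    arms = []
    for _ in range(n // 2):
        pair = [0, 1]
        rng.shuffle(pair)
        arms.extend(pair)
    return arms

def diff_in_means(y, arms):
    """Mean response in arm 1 minus mean response in arm 0."""
    y1 = [v for v, a in zip(y, arms) if a == 1]
    y0 = [v for v, a in zip(y, arms) if a == 0]
    return sum(y1) / len(y1) - sum(y0) / len(y0)

def rerandomization_p_value(y, observed_arms, assign, n_rerand=2000, seed=1):
    """Two-sided re-randomization p-value for the difference in means.

    assign(n, rng) must regenerate an assignment with the trial's allocation rule.
    """
    rng = random.Random(seed)
    observed = abs(diff_in_means(y, observed_arms))
    hits = 0
    for _ in range(n_rerand):
        new_arms = assign(len(y), rng)
        if abs(diff_in_means(y, new_arms)) >= observed:
            hits += 1
    # Count the observed assignment itself so the p-value is never zero.
    return (hits + 1) / (n_rerand + 1)
```

A call such as `rerandomization_p_value(y, arms, randomized_pairs)` returns the share of regenerated assignments whose difference in means is at least as extreme as the observed one.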

BASIC FORM OF SIMULATED DATA
1. Control & test arms, N subjects randomized 1:1
2. $X_{1j}, \dots, X_{7j}$ = binary covariates for subject j
3. $e_j$ = unobserved error for subject j
4. $Y_j$ = observed response for subject j
5. $I_{1j} = 1$ if subject j in arm 1, test arm
6. $Y_j = m_j I_{1j} + e_j + d \sum_{k=1}^{7} X_{kj}$

MODELS FOR ERRORS
1. $e_j \sim N(0, 1)$ (Normal)
2. $e_j \sim \exp(N(0, 1))$ (Lognormal)
3. $e_j \sim N(4j/N, 1)$ (Trend)
4. $e_j \sim 0.9\,N(0, 1) + 0.1\,N(0, 25)$ (Mixed)
5. $e_j \sim N(0, 4j/N)$ (Hetero)
6. $e_j \sim N(\cos(2\pi j/N), 1)$ (Sine wave)
7. $e_j \sim N(0, 1)$ if $j < J$, $\sim N(4, 1)$ if $j \ge J$ (Step)
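The seven error models can be generated with a short NumPy function. This is a sketch under several assumptions that the slide does not state: the second argument of N(., .) is read as a variance, the "Mixed" model is read as a 90/10 mixture of two normals, the sine-wave mean is taken to have one full period over the enrollment sequence, subjects are indexed j = 1, ..., N, and the change point J defaults to N/2. The function name `simulate_errors` and its arguments are hypothetical.

```python
import numpy as np

def simulate_errors(model, n, rng, change_point=None):
    """Draw n errors e_1..e_n under one of the seven models listed above."""
    j = np.arange(1, n + 1)          # enrollment index (assumed to start at 1)
    z = rng.standard_normal(n)       # standard normal building block
    if model == "normal":
        return z
    if model == "lognormal":
        return np.exp(z)
    if model == "trend":
        return 4 * j / n + z         # mean drifts upward with enrollment order
    if model == "mixed":
        wide = rng.random(n) < 0.1   # 10% of draws come from N(0, 25), i.e. SD 5
        return np.where(wide, 5.0 * z, z)
    if model == "hetero":
        return np.sqrt(4 * j / n) * z    # variance 4j/N, so SD is its square root
    if model == "sine":
        return np.cos(2 * np.pi * j / n) + z
    if model == "step":
        J = n // 2 if change_point is None else change_point  # assumed default
        return np.where(j < J, 0.0, 4.0) + z
    raise ValueError(f"unknown error model: {model}")
```

For example, `simulate_errors("trend", 200, np.random.default_rng(0))` returns 200 errors whose mean rises from near 0 to near 4 over the enrollment sequence.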

MODELS FOR COVARIATES $X_{1j}, \dots, X_{7j}$ are 1. independent with $p_1, \dots, p_7$ constant in j 2. correlated with $p_1, \dots, p_7$ constant 3. independent with $p_1, \dots, p_7$ monotone in j 4. independent with $p_1, \dots, p_7$ sinusoid in j. Coefficient d = 1 or 0

MODELS FOR TREATMENT 1. Treatment effect $m_j = m$, constant over j 2. Treatment effect $m_j = m \cdot (4j/N)$, increasing over j

COMPARISONS 1. Select one of the models 2. Generate 200 sets of covariates and unobserved errors 3. For each set, construct I 1j once by dynamic & once by randomized pairs 4. Compute the 200 p-values for different tests and assignment methods

SIMULATED DATA FOR COX REGRESSION
1. Control & test arms, N subjects randomized 1:1
2. $X_{1j}, \dots, X_{7j}$ = binary covariates for subject j
3. $Y_{Lj}$ = potentially observed failure time for subject j on arm L = 0 or 1
4. $Y_{Lj} / [d_L (1 + \sum_{k=1}^{7} X_{kj})] \sim F_L$, L = 0 or 1
5. $F_L$ = Exponential or Weibull
6. Censoring ~ Exp with scale large or small
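The survival data model above can be sketched as follows. Everything numeric here is an assumption for illustration: the arm scales `d=(1.0, 1.5)`, the covariate prevalence of 0.5, the censoring scale, and the use of the same baseline distribution in both arms. The sketch also generates only the realized failure time for the assigned arm rather than both potential outcomes, and it assigns arms by a simple 1:1 shuffle rather than by dynamic allocation. The function name `simulate_survival` is hypothetical.

```python
import numpy as np

def simulate_survival(n, d=(1.0, 1.5), weibull_shape=None, censor_scale=10.0, seed=0):
    """Simulate censored failure times with scale d_L * (1 + sum of 7 binary covariates)."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=(n, 7))        # seven binary covariates per subject
    arm = rng.permutation(np.arange(n) % 2)    # 1:1 assignment by a simple shuffle
    # Baseline draw from F_L: unit exponential, or Weibull with the given shape.
    if weibull_shape is None:
        base = rng.exponential(1.0, size=n)
    else:
        base = rng.weibull(weibull_shape, size=n)
    # Failure time divided by its scale follows F_L, so multiply the draw by the scale.
    scale = np.where(arm == 1, d[1], d[0]) * (1 + x.sum(axis=1))
    failure = scale * base
    censor = rng.exponential(censor_scale, size=n)   # exponential censoring times
    time = np.minimum(failure, censor)               # observed follow-up time
    event = failure <= censor                        # True if the failure was observed
    return x, arm, time, event
```

A small `censor_scale` produces heavy censoring and a large one produces almost none, which corresponds to the "scale large or small" choice in item 6.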

RESULTS WITH COX REGRESSION 1. Assign subjects by dynamic allocation. 2. Estimate treatment effect by proportional hazards regression. 3. Re-randomize and compute new proportional hazards regression estimates many times. 4. Compare the parametric p-value with the percentile of the observed estimate among all re-randomized treatment estimates.

III. ROBUSTNESS STUDIES ON INFERENCE IN DELIBERATELY BALANCED DESIGNS B. RESULTS OF SIMULATIONS

SIMULATION RESULTS 1. In most cases considered, the gold standard but computer intensive re-randomization test gave the same power curve as the standard ANCOVA F-test for the dynamic allocation. Both level, when H 0 was true, and power, otherwise, were the same.

SIMULATION RESULTS 2. In most cases considered, the ANCOVA F-test gave the same power curve whether the subjects were assigned by dynamic allocation or randomized pairs. Deliberate balance on baseline covariates gave no improvement in power.

SIMULATION RESULTS 3. There was one clear exception to the above findings. When untreated responses showed a trend with time of enrollment, the ANCOVA F-test for treatment gave incorrectly low power.

SIMULATION RESULTS 4. In most cases considered with time to event data with dynamic allocation, the re-randomization test gave the same results as the Cox regression.

SUMMARY 1. Modifying a Randomization Method to Achieve Deliberate Balance Serves Mainly Cosmetic Purposes & Should be Discouraged 2. Balance on Covariates Reduces Variance of Test Stats & Estimators but Only by Small Amounts: Var(trt effect) = $O(1/N)$ when balanced; when unbalanced, Var is larger by a term of size $O(p/N^2)$

SUMMARY 3. Rerandomization analyses based on Finite Population Models are gold standard for randomized trials 4. IID Error models are only approximations 5. Approximation is adequate for level with common minimization allocations under a wide variety of potential violations of the assumptions.

SUMMARY 6. Deliberate Balance Allocations and Simple Tests Require Belief that God is Randomizing Your Subjects' Responses. Randomization and Finite Population Based Tests Protect You if the Devil is Determining the Order of Your Subjects' Responses