Designs for Phase II Clinical Trials

Rick Chappell, Ph.D.
Professor, Department of Biostatistics and Medical Informatics
University of Wisconsin School of Medicine & Public Health
BMI 542 – Week 3, Lecture 1
(with contributions from J. Eickhoff, D. DeMets)

Role of Phase II Studies in Drug Development
Phase II study: early efficacy evaluation in a “small” number of subjects.
- Primary objective: evaluate the efficacy of the treatment regimen and determine whether it is adequate to warrant further testing (phase III).
- Secondary objective: describe associated adverse events with a larger sample size than in phase I trials.
- Some trials combine phase I and phase II and test both efficacy and toxicity.

Endpoints
Must be quick; therefore surrogates are often used, e.g.:
- Response to drug (definition of “response”?)
  - Response rate
  - Duration of response
  - Time to response
- Progression-free survival (PFS) in cancer
  - Median PFS
  - PFS rate (e.g., 6-month PFS, 12-month PFS)

Design Requirements
General requirements of a phase II study:
- Short duration: the primary endpoint should be observed early (this usually excludes overall survival as a primary endpoint).
- The number of available patients may be small (typically <60 patients).
- Strict eligibility criteria (often “second-line” or “third-line” patients, i.e., patients who have failed to respond to standard care).
- For ethical reasons, allow for early stopping if the new treatment regimen is inactive (or too toxic).

Simplest form: One-arm study
It is assumed that the primary endpoint is response rate (RR) or another “good” outcome.
Design parameters:
- p0: Maximum unacceptable probability of response
- p1: Minimum acceptable probability of response
Scientific knowledge is derived from (statistical) hypothesis testing.
Hypothesis test: H0: RR ≤ p0 vs. HA: RR ≥ p1
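As a rough illustration of how p0, p1 and the two error rates translate into a one-arm, single-stage design, the sketch below searches for the smallest sample size whose exact binomial test meets both constraints. The function names, the rule "reject H0 when at least k responses are observed", and the example error rates (α = β = 10%) are assumptions made for this sketch, not part of the lecture.

```python
from math import comb

def binom_tail(n, p, k):
    """Pr(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def single_stage_design(p0, p1, alpha, beta, n_max=200):
    """Smallest n, with critical count k, such that rejecting H0 when
    responses >= k has type I error <= alpha and power >= 1 - beta."""
    for n in range(1, n_max + 1):
        for k in range(0, n + 2):
            if binom_tail(n, p0, k) <= alpha:
                # The smallest k that controls type I error gives the best power for this n.
                if binom_tail(n, p1, k) >= 1 - beta:
                    return n, k
                break
    return None

# Hypothetical inputs: p0 = 10%, p1 = 30%, alpha = beta = 10%.
n, k = single_stage_design(0.10, 0.30, 0.10, 0.10)
print(n, k, binom_tail(n, 0.10, k), binom_tail(n, 0.30, k))
```

Because the binomial distribution is discrete, the achieved error rates are usually somewhat below the nominal α and β.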

Simplest form: One-arm study
Example: Phase II study in oncology
- Endpoint: response rate
- Response is defined by the Response Evaluation Criteria in Solid Tumors (RECIST):
  - Complete response (CR): disappearance of all target lesions
  - Partial response (PR): at least a 30% decrease in the sum of the longest diameters of target lesions (etc.)
- Overall response rate: proportion of evaluable patients with CR or PR
When might we consider stability a response?

Multi-stage Design
- Multi-stage designs allow for early stopping (if the regimen has an unacceptably low response rate).
- In a typical clinical setting it is difficult to manage more than two stages.
- One of the earliest two-stage designs is “Gehan’s Design” (Gehan, J Chron Dis 1961): a preliminary trial for screening a drug for initial evidence of efficacy.

Two-Stage Designs
Stage 1: Enroll N1 patients.
- If X1 or more respond: Stage 2, enroll an additional N2 patients.
- If fewer than X1 respond: stop the trial.

Gehan’s two-stage design

Typical Gehan Design
Let x% = 20%; that is, we want to check whether the drug is likely to work in at least 20% of patients.
1. Enter 14 patients.
2. If 0/14 respond, stop and declare the true drug response rate to be below 20%.
3. If 1 or more of 14 respond, add 15-40 more patients.
4. Estimate the response rate and its confidence interval (C.I.).

Gehan’s two-stage design:
Stage 1: n1 patients are accrued.
- If no response is observed, stop and declare lack of efficacy.
- If at least one response is observed, continue with Stage 2.
- Choice of n1: the smallest n1 with Pr(No response | RR = p1) ≤ 5%.
Stage 2: n2 additional patients are accrued.
- Choice of n2: “Large enough for estimating the RR with a specified level of precision (e.g., standard error of less than 5%)”

Example: Assume that the target response rate p1 is 20% and that the desired level of precision is a standard error of less than 5%.
Choice of n1: n1 = 14 is the smallest integer that satisfies (the old “Rule of 14”)
  Pr(No response | RR = 20%) = (0.8)^n1 ≤ 5%
Choice of n2: depends on the observed RR in stage 1:
  Number of responses in stage 1     2     4     6
  n2                                35    68    84
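The two choices in this example can be checked numerically. The sketch below is a minimal illustration with hypothetical function names. It assumes that n2 is obtained by plugging the observed stage 1 response rate into the binomial standard error, which reproduces the n2 values quoted above; a more conservative variant would plug in an upper confidence limit instead.

```python
import math

def gehan_stage1_size(p1, reject_error=0.05):
    """Smallest n1 with Pr(no response | RR = p1) = (1 - p1)^n1 <= reject_error."""
    n1 = 1
    while (1 - p1) ** n1 > reject_error:
        n1 += 1
    return n1

def gehan_stage2_size(n1, responses, se_target=0.05):
    """Additional patients so that sqrt(p(1 - p) / N) <= se_target,
    with p set to the observed stage 1 response rate (an assumption of this sketch)."""
    p_hat = responses / n1
    total = math.ceil(p_hat * (1 - p_hat) / se_target**2)
    return max(total - n1, 0)

n1 = gehan_stage1_size(0.20)
print(n1)
print([gehan_stage2_size(n1, r) for r in (2, 4, 6)])
```

With p1 = 20% this gives n1 = 14, and 2, 4 or 6 stage 1 responses give n2 = 35, 68 and 84.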

Compute probability of consecutive failures:
  Patient   Prob
  1         0.8
  2         (0.8 x 0.8)
  3         (0.8 x 0.8 x 0.8)
  ...
  14        (0.8)^14 ≈ 0.044
If the drug is 20% effective, there would be a 1 - 4.4% = 95.6% chance of at least one success among 14 patients.
If 0/14 successes are observed, reject the drug.

Stage I Sample Size
[Table I: stage I sample size by effectiveness (%) and rejection error (5%, 10%)]

Stage II Sample Size
Based on the desired precision of the effectiveness estimate.
- r1 = # of successes in Stage 1
- n1 = # of patients in Stage 1
- Now consider the precision of the total sample N = (n1 + n2). Let
  $$SE(\hat{p}) = \sqrt{\hat{p}(1 - \hat{p})/N}$$
- To be conservative, take $\hat{p}$ to be the upper 75% confidence limit from the first sample.
- Thus, we can generate a table for the size of the second stage (n2) based on the desired precision.

Additional Patients for Stage II (n2) (Rejection Rate 5% for Stage I)

Limitations of Gehan’s two-stage design:
- The second stage provides no formal rule to decide whether the treatment should be tested further.
- No control of type I error α.
- Limited flexibility (e.g., type II error is fixed at 5%).
- Sample size may be too large for stage 2.
Despite its limitations, Gehan’s two-stage design is occasionally still used in practice: of 208 phase II clinical trials published in 2000, 3.3% used a Gehan design (Thezenas et al, European Journal of Cancer 2004).

Traditional Two-Stage Designs (Simon, Control Clin Trials 1989)
Design Parameters:
- nk: Number of patients in stage k = 1,2
- n: n = n1 + n2, maximum sample size
- rk: Critical value after stage k = 1,2

Traditional Two-Stage Designs
Stage 1: n1 patients are accrued.
- If R1 ≤ r1 responses are observed (R1 = number of responses in stage 1), stop and conclude lack of efficacy.
- Otherwise, continue with Stage 2.
Stage 2: n2 additional patients are accrued.
- If R ≤ r2 (R = total number of responses over both stages), conclude lack of efficacy.
- Otherwise, conclude presence of efficacy (promising treatment, further consideration for phase III testing).

Simon’s Optimal Two-Stage Design (Simon, Control Clin Trials 1989)
How to determine (r1/n1, r2/n2)?
- Subject to fixed type I error (α) and type II error (β) rates.
- Appropriate type I/II error rates for phase II studies: 5% ≤ α ≤ 10% and 5% ≤ β ≤ 20%.
- Given fixed type I error (α) and type II error (β), many designs satisfy Pr(Reject H0 | RR = p0) ≤ α and Pr(Reject H0 | RR = p1) ≥ 1 - β.
- See the following table from this paper. “PET” = “Probability of Early Termination” (e.g., after Stage 1).

Two-Stage Clinical Trials Sample Size
Possible designs for P0 = 0.100, P1 = 0.500, Alpha = 0.050, Beta = 0.200
[Table: candidate designs, including the single-stage, minimax and optimum designs. Columns: N1, R1, PET, N, R, Ave N, Alpha, Beta, Constraints Satisfied.]

Optimal design: Minimizes the expected sample size under H0, i.e., the assumption that the treatment has insufficient efficacy (RR= p0)

Simon’s MiniMax Two-stage Design
Given fixed type I and type II error rates, and under the restrictions Pr(Reject H0 | RR = p0) ≤ α and Pr(Reject H0 | RR = p1) ≥ 1 - β, MiniMax designs minimize the maximum sample size n.
Simon’s Optimal and MiniMax designs have been widely used in practice.

Early termination for non-activity
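Calculation of PET = Pr(stopping at stage 1 | p), where B(n, p; c) is the cumulative binomial probability of up to c events of probability p out of n subjects and b(n, p; c) are the individual terms:
$$\mathrm{PET}(p) = B(n_1, p; c_1) = \sum_{i=0}^{c_1} b(n_1, p; i)$$
Evaluated at p = p0, this is the probability of early termination under H0.
Determine n1, n2, c1 and c2 (the critical values after the first and second stage) using a direct search method on exact probabilities.
Probability of rejecting H0 at a true response rate p (the power when p = p1, the type I error when p = p0):
$$\Pr(\text{reject } H_0 \mid p) = 1 - B(n_1, p; c_1) - \sum_{m=c_1+1}^{\min(n_1, c_2)} b(n_1, p; m)\, B(n_2, p; c_2 - m)$$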
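The exact-probability formulas above can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, c2 is treated as the cutoff on the total number of responses, and the expected sample size is taken as n1 + (1 - PET) × n2. The design in the usage lines (n1 = 12, n2 = 23, c2 = 5) comes from the p0 = 10%, p1 = 30% example elsewhere in the lecture; its stage 1 cutoff c1 = 1 is an assumption.

```python
from math import comb

def b(n, p, k):
    """Individual binomial term: Pr(exactly k events out of n)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def B(n, p, c):
    """Cumulative binomial: Pr(at most c events out of n)."""
    if c < 0:
        return 0.0
    return sum(b(n, p, i) for i in range(min(c, n) + 1))

def two_stage_characteristics(n1, c1, n2, c2, p):
    """PET, expected sample size and Pr(reject H0) at true response rate p."""
    pet = B(n1, p, c1)
    # Continue past stage 1 with m responses, then finish with total <= c2.
    accept_late = sum(b(n1, p, m) * B(n2, p, c2 - m)
                      for m in range(c1 + 1, min(n1, c2) + 1))
    return {
        "PET": pet,
        "expected_n": n1 + (1 - pet) * n2,
        "reject": 1 - pet - accept_late,
    }

under_h0 = two_stage_characteristics(12, 1, 23, 5, 0.10)  # c1 = 1 is an assumption
under_h1 = two_stage_characteristics(12, 1, 23, 5, 0.30)
print(under_h0)
print(under_h1["reject"])
```

Calling the function at p0 gives the type I error and the expected sample size under H0; calling it at p1 gives the power.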

Example: p0 = 10%, p1 = 30%, α = 5%, β = 15%

Simon’s Optimal/MiniMax designs have undesirable properties:
- The Optimal design may have a large maximum sample size n.
- The MiniMax design may have a large expected sample size.
Is a practical compromise possible?
- Maximum sample size n close to that of the MiniMax design
- Expected sample size close to that of the Optimal design
Approach (Jung, Carey and Kim, Control Clin Trials 2001):
- Enumerate all designs subject to fixed type I and type II error rates.
- Determine a compromise between the Optimal and MiniMax designs using a graphical search.
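The enumeration step can be sketched as a brute-force search over exact binomial probabilities. This is a simplified illustration with hypothetical function names, not the published algorithm or the graphical search itself. It lists every two-stage design up to an assumed cap n_max that meets the error constraints, then picks the design with the smallest expected sample size under p0 ("optimal") and the one with the smallest maximum sample size ("minimax"). A compromise design would be chosen from the same list.

```python
from math import comb

def pmf(n, p):
    """Binomial(n, p) probabilities for 0..n responses."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def cdf(probs):
    """Running sums of a list of probabilities."""
    out, total = [], 0.0
    for q in probs:
        total += q
        out.append(total)
    return out

def accept_prob(stage1_pmf, stage1_cdf, stage2_cdf, r1, r):
    """Pr(lack of efficacy is concluded): stop at stage 1 or total responses <= r."""
    n1, n2 = len(stage1_pmf) - 1, len(stage2_cdf) - 1
    prob = stage1_cdf[r1]
    for m in range(r1 + 1, min(n1, r) + 1):
        prob += stage1_pmf[m] * stage2_cdf[min(r - m, n2)]
    return prob

def enumerate_designs(p0, p1, alpha, beta, n_max):
    """All (n1, r1, n, r, EN0) with type I error <= alpha and power >= 1 - beta."""
    designs = []
    for n1 in range(1, n_max):
        pmf0, pmf1 = pmf(n1, p0), pmf(n1, p1)
        cdf0, cdf1 = cdf(pmf0), cdf(pmf1)
        for n2 in range(1, n_max - n1 + 1):
            cdf0_s2, cdf1_s2 = cdf(pmf(n2, p0)), cdf(pmf(n2, p1))
            for r1 in range(n1):
                if cdf1[r1] > beta:
                    break  # stage 1 stopping alone already loses too much power
                for r in range(r1, n1 + n2):
                    if accept_prob(pmf1, cdf1, cdf1_s2, r1, r) > beta:
                        break  # a larger r can only lower the power further
                    if 1 - accept_prob(pmf0, cdf0, cdf0_s2, r1, r) <= alpha:
                        en0 = n1 + (1 - cdf0[r1]) * n2
                        designs.append((n1, r1, n1 + n2, r, en0))
    return designs

# Hypothetical inputs: p0 = 10%, p1 = 30%, alpha = beta = 10%, at most 35 patients.
designs = enumerate_designs(0.10, 0.30, 0.10, 0.10, n_max=35)
optimal = min(designs, key=lambda d: d[4])           # smallest expected n under p0
minimax = min(designs, key=lambda d: (d[2], d[4]))   # smallest maximum n
print(len(designs), optimal, minimax)
```

Plotting expected sample size against maximum sample size for the enumerated designs is one way to look for a compromise between the two extremes.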

Example: p0 = 30%, p1 = 50%, α = 5%, β = 15%

Balanced Design (Ye & Shyr, 2007)
- Compromise between Simon’s Optimal and MiniMax designs
- The same number of patients is accrued in both stages (n1 = n2)

Software
- PASS sample size calculation software
- R function ph2simon() in the clinfun package:
    > ph2simon(0.2,0.4,0.05,0.1)
    Simon 2-stage Phase II design
    Unacceptable response rate: 0.2
    Desirable response rate: 0.4
    Error rates: alpha = 0.05 ; beta = 0.1
  The output then lists r1, n1, r, n, EN(p0) and PET(p0) for the Optimal and Minimax designs.

Optimal Three-Stage Designs (Chen, Stat Med 1997)
- Extension of Simon’s optimal two-stage design
- Useful when the accrual rate is “slow” (e.g., single-institution trials)
- Let PET1 denote the probability of early termination after the first stage, and PET1+2 the probability of early termination after the first or second stage.
- The optimal three-stage design minimizes the expected sample size under H0 (evaluated at RR = p0):
  E(n|p0) = n1 + {1 – PET1(p0)} × n2 + {1 – PET1+2(p0)} × n3

Example: p0 = 10%, p1 = 30%, α = 10%, β = 10%
Three-stage optimal design:
  Stage k   Sample size   Stopping boundary
  1         10            (missing)
  2         19            (missing)
  3         26            4
  E(n|p0) = 17.79
Simon’s two-stage optimal design:
  Stage k   Sample size   Stopping boundary
  1         12            (missing)
  2         35            5
  E(n|p0) = 19.84
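The expected sample size formula extends naturally to any number of stages. The sketch below uses a hypothetical helper that tracks the distribution of cumulative responses among trials still running. It treats the sample sizes in the example as cumulative (so the stage increments are 10, 9, 7 and 12, 23). The early stopping boundaries used here (0 and 2 for the three-stage design, 1 for the two-stage design) are assumptions, chosen because they reproduce the quoted E(n|p0) values of 17.79 and 19.84.

```python
from math import comb

def binom_pmf(n, p, k):
    """Pr(exactly k responses among n patients)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def expected_sample_size(stage_sizes, boundaries, p):
    """Expected number of patients for a multi-stage futility design.

    stage_sizes: patients added at each stage.
    boundaries:  cumulative cutoffs; the trial stops after a stage when the
                 cumulative number of responses is <= that stage's cutoff.
    """
    on_study = {0: 1.0}  # cumulative responses -> probability of still running
    expected = 0.0
    for n_k, r_k in zip(stage_sizes, boundaries):
        expected += sum(on_study.values()) * n_k
        survivors = {}
        for x, prob in on_study.items():
            for j in range(n_k + 1):
                if x + j > r_k:
                    survivors[x + j] = survivors.get(x + j, 0.0) + prob * binom_pmf(n_k, p, j)
        on_study = survivors
    return expected

# Boundaries 0 and 2 (three-stage) and 1 (two-stage) are assumptions of this sketch.
print(round(expected_sample_size([10, 9, 7], [0, 2, 4], 0.10), 2))
print(round(expected_sample_size([12, 23], [1, 5], 0.10), 2))
```

The final boundary does not affect the expected sample size, since no patients are enrolled after the last stage.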

Comparison between optimal three-stage design and Simon’s optimal two-stage design:
- There is no consistent pattern for the maximum sample size (it may be larger or smaller than for a two-stage design).
- The optimal three-stage design reduces the expected sample size by an average of 10% compared with a two-stage design.

Phase II Designs for Multiple Endpoints
- The selected primary endpoint is just one consideration in the decision to pursue a new treatment.
- Trade-off between efficacy and toxicity:
  - A treatment with high efficacy may not be of interest if too many patients experience life-threatening toxicities.
  - A treatment with moderate efficacy but a good toxicity profile might still be considered for future trials.

Bivariate extension to Simon’s optimal two-stage design:
Incorporating toxicity considerations into a two-stage design (Bryant and Day, Biometrics 1995)
The toxicity profile of a new treatment undergoing phase II testing might be poorly understood:
- Available phase I studies may not be directly relevant to the target patient population.
- The MTD of a new regimen might be very imprecise, due to small sample sizes (3-6 per dose level) in phase I studies.
The bivariate extension allows early termination (after the first stage) if there is:
- Insufficient efficacy, or
- An unacceptably high toxicity rate

Response and toxicity parameters:
- pR0: Maximum unacceptable probability of response
- pR1: Minimum acceptable probability of response
- pT0: Minimum unacceptable probability of toxicity
- pT1: Maximum acceptable probability of toxicity
Combined efficacy-toxicity hypothesis testing:
  H0R: pR ≤ pR0 or H0T: pT ≥ pT0
  vs.
  HAR: pR ≥ pR1 and HAT: pT ≤ pT1

Sample sizes n1 and n2 and critical values for stopping are obtained by specifying three error probabilities, α, γ and 1-β:
1. The probability (α) of incorrectly declaring the treatment promising when the response and toxicity rates for the new therapy are the same as those of the standard therapy.
2. The probability (γ) of incorrectly declaring the treatment promising when the response rate for the new therapy is no greater than that of the standard, or the toxicity rate for the new therapy is greater than that of the standard therapy.
3. The probability (1-β) of correctly declaring the treatment promising at a particular point in the alternative region.

The three error probability constraints are:
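1. $\Pr(X_R \ge c_R, X_T \le c_T \mid p_R = p_{R0}, p_T = p_{T0}, \theta) \le \alpha$
2. $\sup_{p_R \le p_{R0} \text{ or } p_T \ge p_{T0}} \Pr(X_R \ge c_R, X_T \le c_T \mid p_R, p_T, \theta) \le \gamma$
3. $\Pr(X_R \ge c_R, X_T \le c_T \mid p_R = p_{Ra}, p_T = p_{Ta}, \theta) \ge 1 - \beta$
Here XR and XT are the observed numbers of responses and toxicities, and cR and cT are the corresponding critical values.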

Design Parameters:
- nk: Number of patients in stage k = 1,2
- rk: Critical value for response after stage k = 1,2
- tk: Critical value for toxicity after stage k = 1,2

Bivariate optimal two-stage design
Example: pR0 = 10%, pR1 = 30%, pT0 = 40%, pT1 = 20%, αR = αT = 10%, β = 10%
Bivariate optimal two-stage design:
  Stage k   Sample size   Stopping boundary (response)   Stopping boundary (toxicity)
  1         21            2                              8
  2         46            7                              15
  E(n|pR0, pT0) = 29.5
Simon’s optimal two-stage design:
  Stage k   Sample size   Stopping boundary (response)
  1         12            (missing)
  2         35            5
  E(n|p0) = 19.8

Conclusion:
- Incorporating toxicity considerations into the two-stage design is useful if the toxicity profile is not fully understood.
- However, the cost of jointly considering both response and toxicity can be considerable (in terms of sample size requirements).
- The approach can be extended to multivariate efficacy endpoints (e.g., response rate and 6-month PFS rate).

Phase II Designs for Time-to-Event Endpoints
Response rate is not always a suitable primary endpoint:
- Response may not always correlate strongly with survival.
- Challenges in response evaluation.
- Some promising agents are cytostatic instead of cytotoxic.

Two-stage design for evaluating survival probabilities (Case and Morgan, BMC Med Research Meth 2003)
Assume that the primary endpoint is a survival probability, e.g., 1-year OS rate or 1-year PFS rate.
Hypothesis test: H0: p ≤ p0 vs. HA: p ≥ p1
Survival probabilities (at time x):
- p0: Maximum unacceptable survival probability (at time x)
- p1: Minimum acceptable survival probability (at time x)
A standard two-stage design may require inconvenient suspension of accrual at the interim analysis while patients are being followed.

Two-stage design (for x-year survival rate):
Stage 1: Accrue n1 patients until time t1.
- Each patient is followed until failure, for x years, or until time t1, whichever comes first.
- If Z1(x,t1) < c1, stop the study early (lack of efficacy).
- If Z1(x,t1) ≥ c1, continue with the second stage.
Stage 2: Accrue n2 additional patients between times t1 and t1+t2.
- Each patient is followed until failure or for x years, whichever comes first.
- If Z2(x,t1+t2) < c2: not a promising regimen.
- If Z2(x,t1+t2) ≥ c2: promising regimen.

Example: (Case and Morgan, BMC Med Research Meth 2003)
- Phase II study to assess the activity of a new chemoradiation combination for patients with resectable pancreatic cancer
- Primary endpoint: one-year OS rate
- Hypothesis testing: H0: one-year OS rate is at most 35% vs. HA: one-year OS rate is at least 50%
- Anticipated accrual rate: 24 patients per year
- Type I and II error rates: α = β = 10%

Design characteristics (of the design which minimizes the expected total study length):
  Stage   nk   tk        ck
  1       53   2.2 yrs   0.38
  2       29   1.2 yrs   1.17
Expected total study length (under H0): 3 years
Expected total sample size (under H0): 63.5

Comparison between traditional designs (based on the binomial distribution) and the two-stage design of survival probabilities:
  Design                                        Expected sample size (under H0)   Expected total study length   Maximum total study length
  Single stage (binomial)                       72.0                              4.0 years                     4.0 years
  Simon two-stage (no interim accrual)          53.2                              3.6 years                     5.0 years
  Simon two-stage (interim accrual)             67.4                              3.2 years                     4.4 years
  Two-stage design of survival probabilities    63.5                              3.0 years                     (missing)

Most of the designs discussed so far ONLY allow stopping if there is strong evidence that the treatment is not efficacious.
- Can also have early stopping for efficacy
- Generally not popular in single-arm studies
- Important to accumulate evidence to support a claim of efficacy
- But not stopping prolongs the time to launch of phase III

Randomized phase II
Rationale for randomized phase II study:
- Not willing to invest in phase III
- Want some “control” or “prioritization”

Two types of randomized phase II designs:
- Phase II selection design (prioritization)
- Phase II design with reference control arm (control)
Phase II designs with reference control arms are often used in practice.
Controversial issue: critics consider these trials “underpowered” phase II studies.

Phase II Selection Designs (Simon et al, Cancer Treat Report 1985, Sargent and Goldberg, Stat Med 2001)
- There are often multiple promising new therapies in a disease setting.
- Selection design: patients are randomized to treatments involving new combinations of active agents, or new agents for which activity has already been demonstrated in some setting.
- Goal: identify the “best” treatment, which should then be tested in a phase III setting (formal comparison to standard therapy).

Hypothesis tests (comparison between arms) are not performed.
- Selection theory criterion: select the treatment with the highest efficacy.
- Sample size requirements: if a superior treatment exists among the k treatment arms, it should be selected with high probability. The probability of correct selection should be at least 90%.

Example: Binary outcomes (response)
Sample size requirements (per arm) to achieve a correct selection probability of 90%:
  Response rate              Number of arms
  p1,…,pk-1     pk           k = 2        k = 3        k = 4
  10%           25%          21           31           37
  20%           35%          29           44           52
  30%           45%          35           (missing)    62
  40%           55%          (missing)    55           67
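The probability of correct selection behind such a table can be approximated with an exact binomial calculation. The sketch below is a minimal illustration with hypothetical function names. It assumes independent binomial arms with equal sample size n, with k - 1 arms at the lower response rate and one arm at the higher rate, and it assumes that ties for the highest observed response count are broken at random. The published tables may use slightly different conventions.

```python
from math import comb

def pmf(n, p):
    """Binomial(n, p) probabilities for 0..n responses."""
    return [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

def prob_correct_selection(n, p_inferior, p_best, k):
    """Pr(the truly best of k arms shows the most responses), ties split at random."""
    best = pmf(n, p_best)
    inferior = pmf(n, p_inferior)
    total = 0.0
    for x in range(n + 1):
        below = sum(inferior[:x])   # Pr(an inferior arm has fewer than x responses)
        equal = inferior[x]         # Pr(an inferior arm has exactly x responses)
        for j in range(k):
            # j inferior arms tie with the best arm, the remaining ones fall below it;
            # the best arm then wins the random tie-break with probability 1 / (j + 1).
            ways = comb(k - 1, j)
            total += best[x] * ways * equal**j * below**(k - 1 - j) / (j + 1)
    return total

# First row of the table: 10% vs 25%, two arms, 21 patients per arm.
print(prob_correct_selection(21, 0.10, 0.25, 2))
```

Increasing n raises the probability of correct selection, while adding arms at the lower rate lowers it, which is why the required sample size per arm grows with k.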

Randomized Phase II designs with reference arm
- Includes a reference or control arm
- The control arm is typically not directly compared to the experimental arm (due to small N)
- Should include early stopping rules for lack of efficacy
