Use of the False Discovery Rate for Evaluating Clinical Safety Data
Joseph F. Heyse, Devan V. Mehrotra (Clinical Biostatistics – Vaccines, Merck Research Laboratories)

1 Use of the False Discovery Rate for Evaluating Clinical Safety Data
Joseph F. Heyse, Devan V. Mehrotra
Clinical Biostatistics – Vaccines, Merck Research Laboratories, Blue Bell, PA
Third International Conference on Multiple Comparisons, Bethesda, MD, August 6, 2002

2 Heyse/MCP2002 bl 2 Acknowledgment
• This research was in collaboration with the late Professor John Tukey (Princeton University).

3 Outline
• Motivating example
• Multiplicity issues
• FWER and FDR
• Proposal for flagging AEs
• Summary of three examples
• Concluding remarks

4 Introduction
• Evaluation of safety is an important part of clinical trials of pharmaceutical and biological products.
• Adverse experiences (AEs) can be categorized into three types:
  – Tier 1: Associated with specific hypotheses
  – Tier 2: Set encountered as part of trial safety evaluation
  – Tier 3: Rare spontaneous reports of serious events that require clinical evaluation
• Our interest is primarily in Tier 2.

5 ICH Recommendations
• ICH-E9 recommends descriptive statistical methods supplemented by confidence intervals.
• p-values are useful for evaluating a specific difference of interest.
• If hypothesis tests are used, statistical adjustments for multiplicity to quantify the Type I error are appropriate, but the Type II error is usually of more concern.
• p-values are sometimes useful as a “flagging” device applied to a large number of safety variables to highlight differences worthy of further attention.

6 Illustration: Multiplicity in Safety Assessment
• A clinical trial compared the safety and immunogenicity of the combination vaccine COMVAX™* to its monovalent components.
• 1 of 92 safety comparisons revealed a higher rate of unusual high-pitched crying (UHPC) following the second dose of a three-dose series (6.7% vs. 2.3%, p=0.016).
• No medical rationale for this finding was discovered, and a larger hypothesis-driven study was designed.
• Comparable rates were observed following vaccination in this larger trial.
* COMVAX™ is a combination of HIB and HB vaccine.

7 Motivating Example (MMRV* Vaccine)
• Safety and immunogenicity vaccine trial.
• Study population: healthy toddlers, 12-18 months of age
• Group 1 = MMRV + PedvaxHIB® on Day 0
• Group 2 = MMR + PedvaxHIB® on Day 0, followed by (optional) varicella vaccine on Day 42
* MMRV is a combination measles, mumps, rubella, varicella vaccine.

8 Motivating Example (cont’d)
• Safety follow-up (local and systemic reactions)
    Group 1: Day 0-42 (N=148)
    Group 2: Day 0-42 (N=148) and Day 42-84 (N=132)
• Question: Is the safety profile different if the varicella component is given as part of a combination vaccine on Day 0 compared with giving it 6 weeks later as a monovalent vaccine?
• AEs: Group 1 (Day 0-42) vs. Group 2 (Day 42-84)

9 Clinical AE Counts (“Tier 2” AEs)
 #  BS  ADVERSE EXPERIENCE
 1  01  ASTHENIA / FATIGUE
 2  01  FEVER
 3  01  INFECTION, FUNGAL
 4  01  INFECTION, VIRAL
 5  01  MALAISE
 6  03  ANOREXIA
 7  03  CANDIDIASIS, ORAL
 8  03  CONSTIPATION
 9  03  DIARRHEA
10  03  GASTROENTERITIS, INFECTIOUS
11  03  NAUSEA
12  03  VOMITING
13  05  LYMPHADENOPATHY
Grp 1 (N1=148) X1: 57 34 2 3 27 7 2 24 3 2 19 3
Grp 2 (N2=132) X2: 40 26 0 1 20 2 0 10 1 7 19 2
DIFF (%): 8.2 3.3 1.4 1.3 3.1 3.2 1.4 8.6 1.3 -4.0 -1.6 0.5
p-value: .1673 .5606 .4998 .6248 .5248 .1791 .4998 .0289* .6248 .0889 .7295 1.0000

10 Clinical AE Counts (“Tier 2” AEs) - cont’d
 #  BS  ADVERSE EXPERIENCE              X1 (N1=148)  X2 (N2=132)  DIFF (%)  p-value
14  06  DEHYDRATION                          0            2         -1.5     .2214
15  08  CRYING                               2            0          1.4     .4998
16  08  INSOMNIA                             2            2         -0.2    1.0000
17  08  IRRITABILITY                        75           43         18.1     .0025*
18  09  BRONCHITIS                           4            1          1.9     .3746
19  09  CONGESTION, NASAL                    4            2          1.2     .6872
20  09  CONGESTION, RESPIRATORY              1            2         -0.8     .6033
21  09  COUGH                               13            8          2.7     .4969
22  09  INFECTION, RESPIRATORY, UPPER       28           20          3.8     .4308
23  09  LARYNGOTRACHEOBRONCHITIS             2            1          0.6    1.0000
24  09  PHARYNGITIS                         13            8          2.7     .4969
25  09  RHINORRHEA                          15           14         -0.5    1.0000
26  09  SINUSITIS                            3            1          1.3     .6248

11 Clinical AE Counts (“Tier 2” AEs) - cont’d
 #  BS  ADVERSE EXPERIENCE              X1 (N1=148)  X2 (N2=132)  DIFF (%)  p-value
27  09  TONSILLITIS                          2            1          0.6    1.0000
28  09  WHEEZING                             3            1          1.3     .6248
29  10  BITE/STING, NON-VENOMOUS             4            0          2.7     .1248
30  10  ECZEMA                               2            0          1.4     .4998
31  10  PRURITUS                             2            1          0.6    1.0000
32  10  RASH                                13            3          6.5     .0209*
33  10  RASH, DIAPER                         6            2          2.5     .2885
34  10  RASH, MEASLES/RUBELLA-LIKE           8            1          4.6     .0388*
35  10  RASH, VARICELLA-LIKE                 4            2          1.2     .6872
36  10  URTICARIA                            0            2         -1.5     .2214
37  10  VIRAL EXANTHEMA                      1            2         -0.8     .6033
38  11  CONJUNCTIVITIS                       0            2         -1.5     .2214
39  11  OTITIS MEDIA                        18           14          1.6     .7109
40  11  OTORRHEA                             2            1          0.6    1.0000

12 Multiplicity Issues - The Problem
• Potential for too many false positive safety findings if the multiplicity problem is ignored (for “Tier 2” AEs).
• This can muddy the interpretation of the safety profile of the vaccine/drug.
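To give a rough sense of scale, the arithmetic below takes the 92 safety comparisons mentioned in the COMVAX illustration and assumes, purely for illustration, that all 92 null hypotheses are true and that the tests are independent at the 0.05 level. Real AE data are correlated, so these figures are only indicative.

```latex
% Assumption: m = 92 independent tests, each at level 0.05, all nulls true.
E[\text{number of false positives}] = 92 \times 0.05 = 4.6
\qquad
\Pr(\text{at least one false positive}) = 1 - (1 - 0.05)^{92} \approx 0.991
```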

13 Multiplicity Issues - The Challenge
To develop a procedure for tackling multiplicity that:
• Provides a proper balance between “no adjustment” and “too much adjustment”.
• Is easy to automate/implement.

14 Familywise Error Rate (FWER)
• Let $F = \{H_1, H_2, \ldots, H_m\}$ denote a family of $m$ hypotheses.
• FWER = Pr(any true $H_i \in F$ is rejected).
  – We usually seek methods for which FWER $\le \alpha$.
• Benjamini & Hochberg (1995) argue that, in certain settings, requiring control of the FWER is often too conservative. They suggest controlling the “false discovery rate” instead, as a more powerful alternative.
Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, B, 57, 289-300.

15 False Discovery Rate (FDR) (Benjamini & Hochberg)
[Slide formulas not preserved in the transcript.]
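The formulas on this slide did not survive extraction. For reference, the definition in the cited Benjamini & Hochberg (1995) paper can be written as below. The symbols $V$, $R$, and $Q$ follow that paper and may differ from the notation originally shown on the slide.

```latex
% V = number of true null hypotheses that are rejected (false discoveries)
% R = total number of hypotheses rejected
Q = \begin{cases} V/R, & R > 0 \\ 0, & R = 0 \end{cases}
\qquad
\mathrm{FDR} = E[Q]
```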

16 False Discovery Rate (FDR) (cont’d) (Benjamini & Hochberg)
• [Procedure formula not preserved in the transcript.] {This controls FDR at [level not preserved].}
• FDR $\le$ FWER {equality holds if $m = m_0$}.
• Effect of correlations on FDR is an area of research.
[Example: figure not preserved.]
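As a sketch of the step-up procedure from the cited Benjamini & Hochberg (1995) paper: order the p-values $p_{(1)} \le \ldots \le p_{(m)}$, find the largest $k$ with $p_{(k)} \le (k/m)\alpha$, and reject the $k$ hypotheses with the smallest p-values. The equivalent "adjusted p-value" form is shown below. The function name `bh_adjust` is hypothetical, and it is an assumption that the "FDR Adjusted p-value" columns on the later slides use this form, although the Nervous System values on slide 27 are reproduced by it.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Unadjusted p-values for Irritability, Crying, Insomnia (slide 27).
print([round(p, 4) for p in bh_adjust([0.0025, 0.4998, 1.0000])])
# -> [0.0075, 0.7497, 1.0]
```

A hypothesis is then rejected at FDR level $\alpha$ when its adjusted p-value is at most $\alpha$.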

17 Proposal for Flagging AEs
• We routinely summarize AEs by body system (BS).
    $s$ body systems ($i = 1, 2, \ldots, s$)
    $k_i$ AEs associated with body system $i$
    $p_{ij}$ = between-group p-value for the $j$th AE within the $i$th BS (e.g., based on two-tailed Fisher’s exact test)
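As a minimal sketch of how one such $p_{ij}$ could be computed, the Python below implements a two-tailed Fisher's exact test with the standard library only. The function name `fisher_two_sided` is hypothetical, and the two-tailed convention used here (sum the probabilities of all tables no more likely than the observed one) is an assumption, since the slides do not say which convention was applied. It does reproduce the .4998 (2 vs. 0 events) and .2214 (0 vs. 2 events) values listed in the AE tables.

```python
from math import comb


def fisher_two_sided(x1, n1, x2, n2):
    """Two-tailed Fisher's exact p-value for x1/n1 events vs. x2/n2 events."""
    k = x1 + x2                      # total number of subjects with the AE
    n = n1 + n2                      # total number of subjects
    denom = comb(n, k)
    lo, hi = max(0, k - n2), min(k, n1)
    # Hypergeometric probability of each possible Group 1 count.
    probs = {x: comb(n1, x) * comb(n2, k - x) / denom for x in range(lo, hi + 1)}
    # Small relative tolerance so ties with the observed table are included.
    cutoff = probs[x1] * (1 + 1e-7)
    return min(1.0, sum(p for p in probs.values() if p <= cutoff))


print(round(fisher_two_sided(2, 148, 0, 132), 4))  # -> 0.4998
print(round(fisher_two_sided(0, 148, 2, 132), 4))  # -> 0.2214
```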

18 Proposal for Flagging AEs (cont’d)
• Step 1: Ignore AEs for which the total incidence is so low that a rejection even at the unadjusted 0.05 level is impossible.
• Step 2: Among the remaining AEs, flag those for which the p-value achieves statistical significance after adjusting for multiplicity using a “Double FDR” approach.
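One way to read Step 1 is sketched below: for a given total number of events and the two group sizes, find the smallest p-value that any split of those events could produce, and drop the AE if even that value is not below 0.05. This assumes a two-tailed Fisher's exact test with the "tables no more likely than observed" convention. The function names `min_attainable_p` and `can_reach_significance` are hypothetical, and the slides do not describe how the screen was actually implemented.

```python
from math import comb


def min_attainable_p(total_events, n1, n2):
    """Smallest two-tailed Fisher p-value possible for a fixed event total."""
    n = n1 + n2
    denom = comb(n, total_events)
    lo, hi = max(0, total_events - n2), min(total_events, n1)
    probs = [comb(n1, x) * comb(n2, total_events - x) / denom
             for x in range(lo, hi + 1)]
    # The least likely table has the smallest p-value: its own probability
    # plus that of any table tied with it.
    smallest = min(probs)
    return sum(p for p in probs if p <= smallest * (1 + 1e-7))


def can_reach_significance(total_events, n1, n2, level=0.05):
    """True if some split of the events would give p < level."""
    return min_attainable_p(total_events, n1, n2) < level


# Group sizes from the motivating example (N1=148, N2=132).
for k in range(1, 6):
    print(k, can_reach_significance(k, 148, 132))
```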

19 Double FDR Approach
• Define [formula not preserved]. This represents the strongest safety “signal” for body system $i$.
• 1st level FDR adjustment
  – Apply FDR adjustment to [formula not preserved]
  – Let [formula not preserved]
• 2nd level FDR adjustment
  – Within body system $i$, apply FDR adjustment to [formula not preserved]
  – Let [formula not preserved]
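The formulas on this slide were lost in extraction. A plausible reading, consistent with the first-level and second-level tables on slides 26 and 27, is sketched below. The symbols $p_i^{*}$, $\tilde{p}_i^{*}$, and $\tilde{p}_{ij}$ are assumed notation, not necessarily the authors' own, and the adjustment shown is the Benjamini & Hochberg adjusted p-value.

```latex
% Assumed notation; the slide's own symbols were not preserved.
% Strongest signal in body system i: its smallest unadjusted p-value.
p_i^{*} = \min_{1 \le j \le k_i} p_{ij}, \qquad i = 1, \ldots, s

% FDR (BH) adjustment of m ordered p-values p_{(1)} \le \ldots \le p_{(m)}:
\tilde{p}_{(r)} = \min_{t \ge r} \, \min\!\left(1, \frac{m \, p_{(t)}}{t}\right)

% 1st level: apply the adjustment to p_1^{*}, \ldots, p_s^{*} (m = s), giving \tilde{p}_i^{*}.
% 2nd level: within body system i, apply it to p_{i1}, \ldots, p_{ik_i} (m = k_i), giving \tilde{p}_{ij}.
```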

20 Double FDR Approach (cont’d)
Proposed Flagging Rule
Flag AE(i,j) if [condition not preserved]
• What values of $\alpha_1$ and $\alpha_2$ should we use?
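The flagging condition itself was not preserved. The sketch below assumes the rule is "flag AE(i,j) when the first-level adjusted p-value of body system i is at most $\alpha_1$ and the second-level adjusted p-value of the AE is at most $\alpha_2$"; whether the original used a strict or non-strict inequality is unknown. The names `bh_adjust` and `double_fdr_flags` are hypothetical, the thresholds 0.05 and 0.10 are illustrative only, and the input is just a subset of four body systems taken from the AE tables, so the output is not the authors' result.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def double_fdr_flags(pvals_by_bs, alpha1, alpha2):
    """Return (body_system, AE index) pairs flagged by the assumed rule."""
    systems = list(pvals_by_bs)
    # 1st level: adjust the smallest p-value of each body system.
    first_level = bh_adjust([min(pvals_by_bs[bs]) for bs in systems])
    flagged = []
    for bs, adj_min in zip(systems, first_level):
        if adj_min > alpha1:
            continue  # body system does not pass the first level
        # 2nd level: adjust the p-values within this body system.
        for j, adj_p in enumerate(bh_adjust(pvals_by_bs[bs])):
            if adj_p <= alpha2:
                flagged.append((bs, j))
    return flagged


# Illustrative subset of unadjusted p-values, keyed by body system code.
example = {
    "08": [0.4998, 1.0000, 0.0025],            # crying, insomnia, irritability
    "10": [0.1248, 0.4998, 1.0000, 0.0209, 0.2885,
           0.0388, 0.6872, 0.2214, 0.6033],    # skin AEs 29-37
    "06": [0.2214],                            # dehydration
    "05": [1.0000],                            # lymphadenopathy
}
print(double_fdr_flags(example, alpha1=0.05, alpha2=0.10))
# -> [('08', 2)]
```

In this subset both the nervous system and skin body systems pass the first level, but only irritability survives the within-system adjustment.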

21 Choosing $\alpha_1$ and $\alpha_2$
• Set $\alpha_2 = \alpha$ and use either (a) or (b) below for $\alpha_1$.
  (a) Use resampling (non-parametric bootstrap) to determine the largest data-dependent $\alpha_1$ ($\le \alpha_2$) that ensures FDR $\le \alpha$.
  OR
  (b) Choose $\alpha_1$ ($\le \alpha_2$) independent of the data. For example, let [value not preserved], and estimate the resulting FDR using resampling.

22 Resampling Procedure
• Purpose
  – To estimate the false discovery rates of the following:
      NOADJ: No multiplicity adjustment; flag AE if unadjusted p < .05
      FULLFDR($\alpha$): Full FDR adjustment (ignore BS grouping)
      DFDR($\alpha_1$, $\alpha_2$): Double FDR adjustment for selected ($\alpha_1$, $\alpha_2$)
  – To determine the largest $\alpha_1$ ($\le \alpha_2$) that guarantees FDR $\le \alpha$ when using DFDR($\alpha_1$, $\alpha_2$).

23 Resampling Procedure (cont’d)
• Details
  1. POOL data from both treatment groups into a common population. Sample with replacement from this common population to simulate many repetitions of the original trial. This procedure:
     a) simulates a true null situation (Group 1 = Group 2).
     b) preserves the correlation structure of the original data.
  2. Implement our proposal for flagging AEs using the NOADJ, FULLFDR($\alpha$), and DFDR($\alpha_1$, $\alpha_2$) approaches, and calculate the corresponding FDRs.
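The pooled resampling described above can be sketched as follows. Whole subject records are resampled so that the within-subject correlation among AEs is carried along, and because every resample is drawn under the null, any flag is a false discovery, so the estimated FDR is the fraction of resamples with at least one flag. Everything here is an assumption made for illustration: the function names, the synthetic 80-subject data set, the event rates, the 200 resamples, and the use of only the NOADJ and FULLFDR rules. A DFDR rule could be passed in the same way if it is also given the AE-to-body-system mapping.

```python
import random
from math import comb


def fisher_two_sided(x1, n1, x2, n2):
    """Two-tailed Fisher's exact p-value (tables no more likely than observed)."""
    k, n = x1 + x2, n1 + n2
    denom = comb(n, k)
    lo, hi = max(0, k - n2), min(k, n1)
    probs = {x: comb(n1, x) * comb(n2, k - x) / denom for x in range(lo, hi + 1)}
    cutoff = probs[x1] * (1 + 1e-7)
    return min(1.0, sum(p for p in probs.values() if p <= cutoff))


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted, running_min = [0.0] * m, 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def noadj(pvals, alpha=0.05):
    """NOADJ: flag AEs whose unadjusted p-value is below alpha."""
    return [j for j, p in enumerate(pvals) if p < alpha]


def fullfdr(pvals, alpha=0.05):
    """FULLFDR: flag AEs whose BH-adjusted p-value is below alpha."""
    return [j for j, p in enumerate(bh_adjust(pvals)) if p < alpha]


def estimate_null_fdr(subjects, n1, rules, n_sim=200, seed=0):
    """Estimate each rule's FDR under the null by pooled resampling.

    subjects: pooled list of per-subject 0/1 AE indicator tuples.
    n1: size of Group 1 in each simulated trial (Group 2 gets the rest).
    """
    rng = random.Random(seed)
    n = len(subjects)
    n2 = n - n1
    n_ae = len(subjects[0])
    hits = {name: 0 for name in rules}
    for _ in range(n_sim):
        # Draw whole subjects with replacement from the pooled population.
        sample = [subjects[rng.randrange(n)] for _ in range(n)]
        g1, g2 = sample[:n1], sample[n1:]
        pvals = [fisher_two_sided(sum(s[j] for s in g1), n1,
                                  sum(s[j] for s in g2), n2)
                 for j in range(n_ae)]
        for name, rule in rules.items():
            if rule(pvals):          # any flag is false under the null
                hits[name] += 1
    return {name: h / n_sim for name, h in hits.items()}


# Synthetic pooled data: 80 subjects, 5 AEs with made-up event rates.
_gen = random.Random(1)
_rates = [0.30, 0.20, 0.10, 0.05, 0.02]
pooled = [tuple(int(_gen.random() < r) for r in _rates) for _ in range(80)]
estimates = estimate_null_fdr(pooled, n1=40,
                              rules={"NOADJ": noadj, "FULLFDR": fullfdr})
print(estimates)
```

Applying all rules to the same resamples keeps the comparison between them paired, which is why the FULLFDR estimate can never exceed the NOADJ estimate in this sketch.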

24 MMRV Example - Resampling Results
[Results table not preserved.]
* out of 40; 2000 simulations

25 MMRV Example - Resampling Results
DFDR($\alpha_1$, $\alpha_2$): Estimated FDR (%)
[Results table not preserved.]

26 First Level FDR Adjustment
Body System ID               Number of AE Types   Unadjusted p-value   FDR Adjusted p-value
Nervous system                       3                  0.0025               0.0200
Skin                                 9                  0.0209               0.0771
Digestive system                     7                  0.0289               0.0771
Body site unspecified                5                  0.1673               0.2952
Special senses                       3                  0.2214               0.2952
Metabolic / immune                   1                  0.2214               0.2952
Respiratory                         11                  0.3746               0.4281
Hematologic and lymphatic            1                  1.0000               1.0000

27 Second Level FDR Adjustment
Body System 08: Nervous System and Psychiatric
Adverse Experience   Unadjusted p-value   FDR Adjusted p-value
Irritability               0.0025               0.0075
Crying                     0.4998               0.7497
Insomnia                   1.0000               1.0000

28 Summary of Three Examples

29 Concluding Remarks
• Current approach of flagging AEs based on unadjusted p-values (or C.I.s) can result in excessive false positive safety findings. These can cause undue concern for approval/labeling, and can affect post-marketing commitments.
• Under our proposal, the unadjusted p-values (or C.I.s) would still be reported. The Double FDR multiplicity adjustment is a method to facilitate the interpretation of the unadjusted p-values.

30 Concluding Remarks (cont’d)
• Our proposal for tackling multiplicity will:
  – substantially reduce the percentage of incorrectly flagged AEs.
  – be better accepted if described a priori in the protocol/DAP rather than on a post-hoc basis.
  – facilitate comparable interpretation of safety results across studies, with respect to Type I error.

