John W. Tukey’s Multiple Contributions to Statistics at Merck
Joseph F. Heyse
Merck Research Laboratories
Third International Conference on Multiple Comparisons
Bethesda, Maryland
August 5, 2002
Heyse/JWT-Merck 2
Overview
• Professor John W. Tukey began consulting with Merck Sharp and Dohme Research Laboratories in 1953 and continued until
• Prior to 1953, John was a consultant to Merck in the area of manufacturing.
• Through the years, John made major contributions to the statistical aspects of all major research disciplines.
• His consultations led to the establishment of Merck and industry standards for several statistical approaches.
Areas of Involvement
• Safety assessment
• Clinical trials
• Laboratory quality control
• Clinical safety analyses
• Health economics
• Gene expression and microarray data
• Use of graphics
Agenda for June 1, 2000 Meeting
1. Multiple comparisons: Applications of the False Discovery Rate to Vaccine Adverse Experience Data
2. Transformations for analyzing parasite count data with many zero counts
3. Use of TaqMan assay for gene expression
4. Error models for microarray data
Examples
• Trend testing in safety assessment
• Adjusting for multiplicity in rodent carcinogenicity studies
• Multiplicity applied to estimated variances
Trend Test for Dose Response (Tukey et al., 1985)
• Trend is defined as progressiveness of response with increasing dose
• Three sets of carriers form the candidate set
  – Arithmetic
  – Ordinal
  – Arithmetic-Logarithmic
• The statistical assessment for trend is taken as the most extreme P-value computed from the candidate set
• NOSTASOT dose (No Statistical Significance of Trend dose): the highest dose through which the test for trend is not significant (N.S.)
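The two rules on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 0.05 significance level, the function names, and the P-values in the usage example are hypothetical and do not come from the slides, and the trend-test P-values for each carrier are taken as given rather than computed.

```python
from typing import Callable, Dict

ALPHA = 0.05  # assumed significance level; the slides do not state one


def trend_p_value(p_by_carrier: Dict[str, float]) -> float:
    """Return the most extreme (smallest) P-value over the candidate carriers."""
    return min(p_by_carrier.values())


def nostasot_index(n_groups: int, trend_p: Callable[[int], float],
                   alpha: float = ALPHA) -> int:
    """Index of the NOSTASOT dose among groups 0 (control) .. n_groups - 1.

    trend_p(k) must return the trend-test P-value computed on groups 0..k.
    Testing starts with all groups and drops the highest remaining dose
    each time the trend is significant.
    """
    for k in range(n_groups - 1, 0, -1):
        if trend_p(k) > alpha:  # trend through dose k is not significant
            return k
    return 0  # every trend test was significant; only the control remains


# Hypothetical P-values (NOT the values from the slides), keyed by the
# index of the highest dose included in the trend test.
labels = ["C", "D1", "D2", "D3", "D4"]
hypothetical = {
    4: {"arithmetic": 0.020, "ordinal": 0.006, "arith-log": 0.011},
    3: {"arithmetic": 0.048, "ordinal": 0.031, "arith-log": 0.040},
    2: {"arithmetic": 0.410, "ordinal": 0.350, "arith-log": 0.380},
    1: {"arithmetic": 0.700, "ordinal": 0.700, "arith-log": 0.700},
}
idx = nostasot_index(len(labels), lambda k: trend_p_value(hypothetical[k]))
print(labels[idx])
```

Returning the control index when every test is significant is a convention chosen for this sketch; the slides do not say how that case is reported.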
Properties of Trend Test
• The trend test inflates P-values slightly, in the direction that is conservative for safety assessment
• The adjusted trend test was reported by Capizzi et al. (1992) to compare favorably with other tests against ordered alternative hypotheses
• NOSTASOT is a closed sequential procedure
• Tukey et al. also proposed an adjustment procedure for multiple safety assessment parameters with unknown correlation
Example
Summary statistics for toxicity study in dogs
[Table; columns: Dose of drug in mg/kg/day, Number of days, Mean albumin, Sample variance; first row: Control; values not recovered]
s² = [value not recovered] with 31 d.f.
Example
Trend Test Results
[Table; columns: Carrier, P-value; rows: Arithmetic, Ordinal, Arithmetic-Logarithmic; P-values not recovered]
Trend Test P = 0.006
Example
NOSTASOT Analysis
[Table; columns: Trend Analysis, P-value; rows: All Dose Groups (C to D4), 4 Dose Groups (C to D3), 3 Dose Groups (C to D2); P-values not recovered]
NOSTASOT dose is D2 = 1.0 mg/kg/day
Example
Adjusted P-value for Dunnett’s procedure
[Table; columns: Separate Dose Analyses, P-value; rows: C vs. D1, C vs. D2, C vs. D3, C vs. D4; P-values not recovered]
S-P = 0.066
Multiple Significance Testing in Rodent Carcinogenicity Experiments
• Mantel (1980) credits Tukey with a proposal to adjust multiple P-values in carcinogenicity experiments, where P1 is the smallest observed P-value and k1 is the number of tumor types that could have attained P1
• These methods have been improved by several authors and are now commonly applied
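The adjustment formula itself did not survive on this slide. The form usually quoted for it is P = 1 − (1 − P1)^k1, and the sketch below assumes that form; the function name and the input values are hypothetical, chosen only to show the arithmetic.

```python
def tukey_adjusted_p(p1: float, k1: int) -> float:
    """Adjusted P-value under the ASSUMED form 1 - (1 - p1) ** k1.

    p1: smallest observed P-value across tumor types.
    k1: number of tumor types that could have attained a P-value as small as p1.
    """
    if not 0.0 <= p1 <= 1.0:
        raise ValueError("p1 must lie in [0, 1]")
    if k1 < 1:
        raise ValueError("k1 must be at least 1")
    return 1.0 - (1.0 - p1) ** k1


# Hypothetical inputs: smallest P-value 0.01, five tumor types able to reach it.
print(round(tukey_adjusted_p(0.01, 5), 4))
```

Counting only the tumor types that could have attained P1, rather than all tumor types examined, keeps rare tumors with too few events to reach significance from diluting the adjustment.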
Grouping Based on Estimated Variances
• The naïve procedure of weighting the results of different experiments inversely to their estimated variance is unsatisfactory
• Cochran (1954) introduced the idea of partial weighting, in which the ½ to ⅔ of the studies that appear less variable are assigned equal weight
• Mosteller and Tukey (1984) treated the more realistic case with the possible presence of interaction
• Ciminera et al. (1993) applied those methods in the multicenter clinical trial setting
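To make the contrast between the naïve weights and partial weighting concrete, here is a minimal sketch. It is not taken from the cited papers: the function names, the use of a simple mean of the smaller variances as their common variance, and the example numbers are all assumptions made for illustration.

```python
from typing import List, Sequence


def inverse_variance_weights(variances: Sequence[float]) -> List[float]:
    """Naive weights: proportional to 1 / estimated variance, summing to 1."""
    raw = [1.0 / v for v in variances]
    total = sum(raw)
    return [w / total for w in raw]


def partial_weights(variances: Sequence[float], fraction: float = 0.5) -> List[float]:
    """Partial weighting sketch.

    The `fraction` of studies with the smallest estimated variances share one
    common weight (ASSUMPTION: based on the simple mean of their variances);
    the remaining studies are weighted inversely to their own variances.
    """
    n = len(variances)
    m = max(1, min(n, round(fraction * n)))  # number of equally weighted studies
    order = sorted(range(n), key=lambda i: variances[i])
    low = set(order[:m])
    common = sum(variances[i] for i in low) / m
    raw = [1.0 / common if i in low else 1.0 / variances[i] for i in range(n)]
    total = sum(raw)
    return [w / total for w in raw]


# Hypothetical estimated variances for four studies.
s2 = [1.0, 3.0, 4.0, 8.0]
print(inverse_variance_weights(s2))
print(partial_weights(s2, fraction=0.5))
```

With the naïve weights, the first study gets three times the weight of the second because its estimated variance happens to be smaller; under partial weighting the two share the same weight, which damps the influence of chance differences among the apparently precise studies.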
Grouping of Centers Based on Estimated Variances (Ciminera et al., 1993)
[Table in two side-by-side panels; columns: Center, d.f., Estimated Variance; values not recovered]
The grouping algorithm is based on the medians of order statistics, using the Wilson-Hilferty (1931) approximation.
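The grouping algorithm itself is not described on the slide, so the sketch below covers only the Wilson-Hilferty step: the cube-root normal approximation to chi-square quantiles, which gives a quick approximation to quantiles of an estimated variance relative to the true variance. The function names are hypothetical, and how the medians of order statistics are derived from it is left out.

```python
from statistics import NormalDist


def wh_chi2_quantile(p: float, df: float) -> float:
    """Wilson-Hilferty approximation to the p-th quantile of chi-square(df).

    Uses (X / df) ** (1/3) ~ Normal(1 - 2/(9 df), 2/(9 df)) approximately.
    """
    z = NormalDist().inv_cdf(p)  # standard normal quantile, 0 < p < 1
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * c ** 0.5) ** 3


def wh_variance_ratio_quantile(p: float, df: float) -> float:
    """Approximate p-th quantile of s^2 / sigma^2, where df * s^2 / sigma^2
    follows a chi-square distribution with df degrees of freedom."""
    return wh_chi2_quantile(p, df) / df


# Approximate median of chi-square with 10 d.f. (tabulated value is about 9.34).
print(round(wh_chi2_quantile(0.5, 10), 3))
```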
Insights on Statistics
• Randomization is the only thing you can safely assume when analyzing clinical trial data
• There is no such thing as a null effect
• There is always interaction
• Having only two points is the only time you should pretend that you have a linear relationship; and in these cases, you should get more data
Insights on Character
“The best thing about being a statistician is that you get to play in other people’s backyards.” (J.W.T.)
– Remember that you are a guest and need to bring your manners and respect.
– It’s all about relationships.
What would John think of these remarks?
Thank you for the kind words, but... you could have said them using fewer slides.
References
• Capizzi T, Survill TT, Heyse JF, and Malani H: An empirical and simulated comparison of some tests for detecting progressiveness of response with increasing doses of a compound. Biometrical Journal, 34, 1992.
• Ciminera JL, Heyse JF, Nguyen HH, and Tukey JW: Evaluation of multicentre clinical trial data using adaptations of the Mosteller-Tukey procedure. Statistics in Medicine, 12, 1993.
• Ciminera JL, Heyse JF, Nguyen HH, and Tukey JW: Tests for qualitative treatment-by-centre interaction using a “pushback” procedure. Statistics in Medicine, 12, 1993.
References (cont.)
• Cox JL, Heyse JF, and Tukey JW: Efficacy estimates from parasite count data that include zero counts. Experimental Parasitology, 96:1-8.
• Heyse JF and Rom D: Adjusting for multiplicity of statistical tests in the analysis of carcinogenicity studies. Biometrical Journal, 30.
• Mantel N: Assessing laboratory evidence for neoplastic activity. Biometrics, 36, 1980.
• Mantel N, Tukey JW, Ciminera JL, and Heyse JF: Tumorigenicity assays, including use of the jackknife. Biometrical Journal, 24, 1982.
References (cont.)
• Tukey JW, Ciminera JL, and Heyse JF: Testing the statistical certainty of a response to increasing doses of a drug. Biometrics, 41, 1985.