Topics in the Development and Validation of Gene Expression Profiling Based Predictive Classifiers Richard Simon, D.Sc. Chief, Biometric Research Branch.

Presentation transcript:

Topics in the Development and Validation of Gene Expression Profiling Based Predictive Classifiers Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute Linus.nci.nih.gov/brb

BRB Website Powerpoint presentations and audio files Reprints & Technical Reports BRB-ArrayTools software BRB-ArrayTools Data Archive Sample Size Planning for Targeted Clinical Trials

Simplified Description of Microarray Assay Extract mRNA from cells of interest –Each mRNA molecule was transcribed from a single gene and it has a linear structure complementary to that gene Convert mRNA to cDNA introducing a fluorescently labeled dye to each molecule Distribute the cDNA sample to a solid surface containing “probes” of DNA representing all “genes”; the probes are in known locations on the surface Let the molecules from the sample hybridize with the probes for the corresponding genes Remove excess sample and illuminate surface with laser with frequency corresponding to the dye Measure intensity of fluorescence over each probe

Resulting Data Intensity over a probe is approximately proportional to abundance of mRNA molecules in the sample for the gene corresponding to the probe 40,000 variables measured for each case –Excessive hype –Excessive skepticism –Some familiar statistical paradigms don’t work well

Good Microarray Studies Have Clear Objectives Class Comparison (Gene Finding) –Find genes whose expression differs among predetermined classes, e.g. tissue or experimental condition Class Prediction –Prediction of predetermined class (e.g. treatment outcome) using information from gene expression profile –Survival risk-group prediction Class Discovery –Discover clusters of specimens having similar expression profiles

Class Comparison and Class Prediction Not clustering problems Supervised methods

Class Prediction ≠ Class Comparison A set of genes is not a predictive model Emphasis in class comparison is often on understanding biological mechanisms –More difficult than accurate prediction and usually requires a different experiment Demonstrating statistical significance of prognostic factors is not the same as demonstrating predictive accuracy

Components of Class Prediction Feature (gene) selection –Which genes will be included in the model Select model type –E.g. Diagonal linear discriminant analysis, Nearest-Neighbor, … Fitting parameters (regression coefficients) for model –Selecting value of tuning parameters

Feature Selection Genes that are differentially expressed among the classes at a significance level α (e.g. 0.01) –The α level is a tuning parameter –Number of false discoveries is not of direct relevance for prediction For prediction it is usually more serious to exclude an informative variable than to include some noise variables
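The univariate selection step described on this slide can be sketched as follows. This is a minimal illustration, not the BRB-ArrayTools implementation: the function names `pooled_t` and `select_genes` and the parameter `t_crit` are hypothetical, and the significance level is expressed as a critical value of the t-statistic (for example about 2.878 for a two-sided 0.01 level with 18 degrees of freedom) so that the example needs only the standard library.

```python
import math
from statistics import mean, variance

def pooled_t(x1, x2):
    """Two-sample t-statistic with pooled variance for one gene."""
    n1, n2 = len(x1), len(x2)
    sp2 = ((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / (n1 + n2 - 2)
    return (mean(x1) - mean(x2)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

def select_genes(profiles, labels, t_crit):
    """Return indices of genes whose |t| exceeds t_crit.

    profiles: list of samples, each a list of expression values
    labels:   list of 0/1 class labels, one per sample
    t_crit:   critical value matching the chosen significance level
              (a tuning parameter, as the slide notes)
    """
    selected = []
    for g in range(len(profiles[0])):
        x1 = [p[g] for p, y in zip(profiles, labels) if y == 0]
        x2 = [p[g] for p, y in zip(profiles, labels) if y == 1]
        if abs(pooled_t(x1, x2)) > t_crit:
            selected.append(g)
    return selected
```

Raising `t_crit` (a stricter significance level) admits fewer noise genes but risks dropping informative ones, which the slide argues is usually the more serious error for prediction.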

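Optimal significance level cutoffs for gene selection. 50 differentially expressed genes out of 22,000 genes on the microarrays. Table columns: 2δ/σ, n=10, n=30, n= (the remaining column header and the table values were not preserved in this transcript)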

Complex Gene Selection Small subset of genes which together give most accurate predictions –Genetic algorithms Little evidence that complex feature selection is useful in microarray problems

Linear Classifiers for Two Classes

Fisher linear discriminant analysis Diagonal linear discriminant analysis (DLDA) –Ignores correlations among genes Compound covariate predictor Golub’s weighted voting method Support vector machines with inner product kernel Perceptrons
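As an illustration of one entry in this list, diagonal linear discriminant analysis can be sketched in a few lines. This is a minimal version under stated assumptions: equal class priors, a pooled within-class variance per gene, and hypothetical function names `fit_dlda` and `predict_dlda`. It assigns a sample to the class whose mean profile is nearest in variance-scaled squared distance, ignoring correlations among genes.

```python
from statistics import mean, variance

def fit_dlda(profiles, labels):
    """Estimate per-class gene means and pooled per-gene variances."""
    classes = sorted(set(labels))
    n_genes = len(profiles[0])
    means = {}
    pooled = [0.0] * n_genes
    for c in classes:
        rows = [p for p, y in zip(profiles, labels) if y == c]
        means[c] = [mean([r[g] for r in rows]) for g in range(n_genes)]
        for g in range(n_genes):
            # Accumulate within-class sums of squares (needs >= 2 samples per class).
            pooled[g] += (len(rows) - 1) * variance([r[g] for r in rows])
    dof = len(profiles) - len(classes)
    return means, [v / dof for v in pooled]

def predict_dlda(model, x):
    """Assign x to the class with the smallest variance-scaled distance."""
    means, pooled = model

    def dist(c):
        return sum((xg - mg) ** 2 / vg for xg, mg, vg in zip(x, means[c], pooled))

    return min(means, key=dist)
```

For two classes this rule is linear in the expression values, which is why it belongs with the other linear classifiers listed here.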

When p>>n It is always possible to find a set of features and a weight vector for which the classification error on the training set is zero. There is generally not sufficient information in p>>n training sets to effectively use more complex methods

Myth Complex classification algorithms such as neural networks perform better than simpler methods for class prediction.

Comparative studies have shown that simpler methods work as well or better for microarray problems because they avoid overfitting the data.

Other Simple Methods Nearest neighbor classification Nearest k-neighbors Nearest centroid classification Shrunken centroid classification

Evaluating a Classifier Most statistical methods were not developed for p>>n prediction problems Fit of a model to the same data used to develop it is no evidence of prediction accuracy for independent data Demonstrating statistical significance of prognostic factors is not the same as demonstrating predictive accuracy Testing whether analysis of independent data results in selection of the same set of genes is not an appropriate test of predictive accuracy of a classifier

Internal Validation of a Classifier Re-substitution estimate –Develop classifier on dataset, test predictions on same data –Very biased for p>>n Split-sample validation Cross-validation

Split-Sample Evaluation Training-set –Used to select features, select model type, determine parameters and cut-off thresholds Test-set –Withheld until a single model is fully specified using the training-set. –Fully specified model is applied to the expression profiles in the test-set to predict class labels. –Number of errors is counted

Leave-one-out Cross Validation Omit sample 1 –Develop multivariate classifier from scratch on training set with sample 1 omitted –Predict class for sample 1 and record whether prediction is correct

Leave-one-out Cross Validation Repeat analysis for training sets with each single sample omitted one at a time e = number of misclassifications determined by cross-validation Subdivide e for estimation of sensitivity and specificity
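Subdividing the cross-validated error count e into sensitivity and specificity can be sketched as below. The function name and the `positive` argument are hypothetical; the inputs are assumed to be the true class labels and the leave-one-out predictions, one per sample.

```python
def sensitivity_specificity(true_labels, cv_predictions, positive):
    """Split cross-validated errors into sensitivity and specificity.

    Returns (e, sensitivity, specificity), where e is the total
    number of misclassified samples.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, cv_predictions):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    e = fn + fp
    return e, tp / (tp + fn), tn / (tn + fp)
```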

With proper cross-validation, the model must be developed from scratch for each leave-one-out training set. This means that feature selection must be repeated for each leave-one-out training set. –Simon R, Radmacher MD, Dobbin K, McShane LM. Pitfalls in the analysis of DNA microarray data. Journal of the National Cancer Institute 95:14-18. The cross-validated estimate of misclassification error is an estimate of the prediction error for the model obtained by applying the specified algorithm to the full dataset

Prediction on Simulated Null Data. Generation of Gene Expression Profiles: 14 specimens (P_i is the expression profile for specimen i); log-ratio measurements on 6000 genes; P_i ~ MVN(0, I_6000). Can we distinguish between the first 7 specimens (Class 1) and the last 7 (Class 2)? Prediction Method: compound covariate prediction, with the compound covariate built from the log-ratios of the 10 most differentially expressed genes.
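The null-data experiment can be sketched in plain Python to show why gene selection has to be repeated inside every leave-one-out training set. This is an illustrative sketch, not the original simulation code: all function names are hypothetical, the compound covariate is assumed to be a t-statistic-weighted sum with a midpoint threshold, and the number of genes is a parameter so that a smaller value than 6000 can be used for speed.

```python
import math
import random

def t_stat(a, b):
    """Pooled-variance two-sample t-statistic."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ss = sum((v - ma) ** 2 for v in a) + sum((v - mb) ** 2 for v in b)
    return (ma - mb) / math.sqrt(ss / (na + nb - 2) * (1 / na + 1 / nb))

def gene_t_stats(X, y, genes):
    """t-statistic (Class 1 vs Class 2) for each gene index in genes."""
    out = {}
    for g in genes:
        a = [row[g] for row, c in zip(X, y) if c == 1]
        b = [row[g] for row, c in zip(X, y) if c == 2]
        out[g] = t_stat(a, b)
    return out

def top_genes(X, y, k):
    """Indices of the k genes with the largest |t|."""
    stats = gene_t_stats(X, y, range(len(X[0])))
    return sorted(stats, key=lambda g: -abs(stats[g]))[:k]

def fit_ccp(X, y, genes):
    """Compound covariate predictor: t-weighted sum, midpoint threshold."""
    weights = gene_t_stats(X, y, genes)

    def score(row):
        return sum(t * row[g] for g, t in weights.items())

    s1 = [score(r) for r, c in zip(X, y) if c == 1]
    s2 = [score(r) for r, c in zip(X, y) if c == 2]
    m1, m2 = sum(s1) / len(s1), sum(s2) / len(s2)
    cut = (m1 + m2) / 2
    return lambda row: 1 if (score(row) - cut) * (m1 - m2) > 0 else 2

def loocv_errors(X, y, k, reselect):
    """LOOCV error count; reselect=True repeats gene selection in each fold."""
    fixed = top_genes(X, y, k)  # selection using ALL samples (the flawed shortcut)
    errors = 0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        genes = top_genes(X_tr, y_tr, k) if reselect else fixed
        predict = fit_ccp(X_tr, y_tr, genes)
        errors += predict(X[i]) != y[i]
    return errors

def simulate_null(n_genes, rng):
    """14 specimens of independent N(0, 1) values; classes are pure labels."""
    X = [[rng.gauss(0, 1) for _ in range(n_genes)] for _ in range(14)]
    return X, [1] * 7 + [2] * 7

if __name__ == "__main__":
    rng = random.Random(1)
    X, y = simulate_null(2000, rng)  # fewer genes than the slide's 6000, for speed
    print("selection outside CV:", loocv_errors(X, y, 10, reselect=False))
    print("selection inside CV: ", loocv_errors(X, y, 10, reselect=True))
```

Because the data contain no class signal, an honest estimate should hover around 7 errors out of 14. Selecting the genes once on all 14 specimens typically reports far fewer errors, which is the optimistic bias that complete cross-validation removes.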

Major Flaws Found in 40 Studies Published in 2004 Inadequate control of multiple comparisons in gene finding –9/23 studies had unclear or inadequate methods to deal with false positives (10,000 genes × 0.05 significance level = 500 false positives) Misleading report of prediction accuracy –12/28 reports based on incomplete cross-validation Misleading use of cluster analysis –13/28 studies invalidly claimed that expression clusters based on differentially expressed genes could help distinguish clinical outcomes 50% of studies contained one or more major flaws

Myth Split sample validation is superior to LOOCV or 10-fold CV for estimating prediction error

Comparison of Internal Validation Methods (Molinaro, Pfeiffer & Simon) For small sample sizes, LOOCV is much less biased than split-sample validation For small sample sizes, LOOCV is preferable to 10-fold, 5-fold cross-validation or repeated k-fold versions For moderate sample sizes, 10-fold is preferable to LOOCV Some claims for bootstrap resampling for estimating prediction error are not valid for p>>n problems

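Simulated Data: 40 cases, 10 genes selected from 5000. Table columns: Method, Estimate, Std Deviation. Rows: True (estimate .078), Resubstitution, LOOCV, two k-fold CV rows (fold counts not preserved), two Split sample rows, bootstrap. The remaining values were not preserved in this transcript.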

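Simulated Data: 40 cases. Table columns: Method, Estimate, Std Deviation. Rows: True, k-fold (fold count not preserved), Repeated 10-fold, k-fold (fold count not preserved), Repeated 5-fold, Split, Repeated split. The table values were not preserved in this transcript.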

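DLBCL Data. Table columns: Method, Bias, Std Deviation, MSE. Rows: LOOCV, two k-fold CV rows (fold counts not preserved), two Split rows, bootstrap. The table values were not preserved in this transcript.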

Ordinary bootstrap –Training and test sets overlap
Bootstrap cross-validation (Fu, Carroll, Wang) –Perform LOOCV on bootstrap samples –Training and test sets overlap
Leave-one-out bootstrap –Predict for cases not in bootstrap sample –Training sets are too small
Out-of-bag bootstrap (Breiman) –Predict for case i based on majority rule of predictions for bootstrap samples not containing case i
.632+ bootstrap –w*LOOBS+(1-w)RSB
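For reference, the weight w in the .632+ combination can be written out as below. The slide gives only the weighted sum; the expressions for w and R follow the standard .632+ definition of Efron and Tibshirani, where LOOBS is the leave-one-out bootstrap error, RSB is the resubstitution error, and γ (a symbol not used on the slide) is the no-information error rate, with R usually truncated to the interval [0, 1].

```latex
\widehat{\mathrm{Err}}_{.632+} = w \cdot \mathrm{LOOBS} + (1 - w) \cdot \mathrm{RSB},
\qquad
w = \frac{0.632}{1 - 0.368\,R},
\qquad
R = \frac{\mathrm{LOOBS} - \mathrm{RSB}}{\gamma - \mathrm{RSB}}
```

When there is no overfitting (R = 0) the weight is 0.632; as overfitting grows (R approaching 1) the weight approaches 1 and the estimate relies entirely on the leave-one-out bootstrap.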

Permutation Distribution of Cross-validated Misclassification Rate of a Multivariate Classifier Randomly permute class labels and repeat the entire cross-validation Re-do for all (or 1000) random permutations of class labels Permutation p value is the fraction of random permutations that gave e or fewer misclassifications, where e is the number of misclassifications in the real data
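The permutation procedure can be sketched as follows. This is a toy illustration under stated assumptions: `permutation_p_value` and `loocv_nearest_centroid_errors` are hypothetical names, and the one-feature nearest-centroid rule stands in for whatever complete cross-validation procedure (including feature selection) is actually being assessed.

```python
import random

def loocv_nearest_centroid_errors(X, y):
    """LOOCV misclassification count for a one-feature nearest-centroid rule."""
    errors = 0
    for i in range(len(X)):
        centroids = {}
        for c in set(y):
            vals = [x for j, (x, lab) in enumerate(zip(X, y)) if j != i and lab == c]
            centroids[c] = sum(vals) / len(vals)
        pred = min(centroids, key=lambda c: abs(X[i] - centroids[c]))
        errors += pred != y[i]
    return errors

def permutation_p_value(cv_errors, X, y, n_perm=1000, seed=0):
    """Fraction of label permutations whose CV error count is <= the observed one.

    cv_errors must rerun the ENTIRE cross-validation for the labels it is given.
    """
    rng = random.Random(seed)
    observed = cv_errors(X, y)
    count = 0
    for _ in range(n_perm):
        y_perm = y[:]
        rng.shuffle(y_perm)
        if cv_errors(X, y_perm) <= observed:
            count += 1
    return observed, count / n_perm

if __name__ == "__main__":
    X = [0.0, 1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 13.0]
    y = [0, 0, 0, 0, 1, 1, 1, 1]
    print(permutation_p_value(loocv_nearest_centroid_errors, X, y))
```

Passing the cross-validation routine in as a function makes it explicit that every permutation repeats the whole procedure rather than reusing genes or parameters chosen with the true labels.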

Does an Expression Profile Classifier Predict More Accurately Than Standard Prognostic Variables? Not an issue of which variables are significant after adjusting for which others or which are independent predictors –Predictive accuracy, not significance The two classifiers can be compared by ROC analysis as functions of the threshold for classification The predictiveness of the expression profile classifier can be evaluated within levels of the classifier based on standard prognostic variables

Does an Expression Profile Classifier Predict More Accurately Than Standard Prognostic Variables? Some publications fit logistic model to standard covariates and the cross-validated predictions of expression profile classifiers This is valid only with split-sample analysis because the cross-validated predictions are not independent

Survival Risk Group Prediction For analyzing right censored data to develop predictive classifiers it is not necessary to make the data binary Can do cross-validation to predict high or low risk group for each case Compute Kaplan-Meier curves of predicted risk groups Permutation significance of log-rank statistic Implemented in BRB-ArrayTools BRB-ArrayTools also provides for comparing the risk group classifier based on expression profiles to one based on standard covariates and one based on a combination of both types of variables
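Once each case has a cross-validated risk-group label, the Kaplan-Meier curves for the predicted groups can be computed as sketched below. This is a minimal standard-library illustration, not the BRB-ArrayTools implementation; the names `kaplan_meier` and `km_by_group` are hypothetical, and the group labels are assumed to come from a complete cross-validation carried out elsewhere.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(time, survival)] at each distinct event time.

    events: 1 if the event was observed at that time, 0 if right censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Gather all cases sharing this time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve

def km_by_group(times, events, groups):
    """One Kaplan-Meier curve per predicted risk group."""
    curves = {}
    for g in set(groups):
        idx = [i for i, lab in enumerate(groups) if lab == g]
        curves[g] = kaplan_meier([times[i] for i in idx], [events[i] for i in idx])
    return curves
```

The survival times are never dichotomized here; only the predicted group membership is binary, which matches the slide's point that right-censored data need not be made binary.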

Myth Huge sample sizes are needed to develop effective predictive classifiers

Sample Size Planning References K Dobbin, R Simon. Sample size determination in microarray experiments for class comparison and prognostic classification. Biostatistics 6:27-38, 2005 K Dobbin, R Simon. Sample size planning for developing classifiers using high dimensional DNA microarray data. Biostatistics (2007)

Sample Size Planning for Classifier Development The expected value (over training sets) of the probability of correct classification PCC(n) should be within γ of the maximum achievable PCC(∞)

Probability Model Two classes Log expression or log ratio MVN in each class with common covariance matrix m differentially expressed genes p-m noise genes Expression of differentially expressed genes is independent of expression for noise genes All differentially expressed genes have same inter-class mean difference 2δ Common variance for differentially expressed genes and for noise genes

Classifier Feature selection based on univariate t-tests for differential expression at significance level α Simple linear classifier with equal weights (except for sign) for all selected genes. Power for selecting each of the informative genes that are differentially expressed by mean difference 2δ is 1-β(n)

For 2 classes of equal prevalence, let λ1 denote the largest eigenvalue of the covariance matrix of informative genes. Then

Sample size as a function of effect size (log-base 2 fold-change between classes divided by standard deviation). Two different tolerances shown. Each class is equally represented in the population. The number of genes on an array was not preserved in this transcript.

b) PCC(60) as a function of the proportion in the under-represented class. Parameter settings same as a), with 10 differentially expressed genes among 22,000 total genes. If the proportion in the under-represented class is small (e.g., <20%), then the PCC(60) can decline significantly.

Acknowledgements Kevin Dobbin Alain Dupuy Wenyu Jiang Annette Molinaro Ruth Pfeiffer Michael Radmacher Joanna Shih Yingdong Zhao BRB-ArrayTools Development Team

BRB-ArrayTools Contains analysis tools that I have selected as valid and useful Analysis wizard and multiple help screens for biomedical scientists Imports data from all platforms and major databases Automated import of data from NCBI Gene Expression Omnibus

Predictive Classifiers in BRB-ArrayTools
Classifiers –Diagonal linear discriminant –Compound covariate –Bayesian compound covariate –Support vector machine with inner product kernel –K-nearest neighbor –Nearest centroid –Shrunken centroid (PAM) –Random forest –Tree of binary classifiers for k classes
Survival risk-group –Supervised pc’s
Feature selection options –Univariate t/F statistic –Hierarchical variance option –Restricted by fold effect –Univariate classification power –Recursive feature elimination –Top-scoring pairs
Validation methods –Split-sample –LOOCV –Repeated k-fold CV –.632+ bootstrap

Selected Features of BRB-ArrayTools Multivariate permutation tests for class comparison to control number and proportion of false discoveries with specified confidence level –Permits blocking by another variable, pairing of data, averaging of technical replicates SAM –Fortran implementation 7X faster than R versions Extensive annotation for identified genes –Internal annotation of NetAffx, Source, Gene Ontology, Pathway information –Links to annotations in genomic databases Find genes correlated with quantitative factor while controlling number or proportion of false discoveries Find genes correlated with censored survival while controlling number or proportion of false discoveries Analysis of variance

Selected Features of BRB-ArrayTools Gene set enrichment analysis –Gene Ontology groups, signaling pathways, transcription factor targets, micro-RNA putative targets –Automatic data download from Broad Institute –KS & LS test statistics for null hypothesis that gene set is not enriched –Hotelling’s and Goeman’s Global test of null hypothesis that no genes in set are differentially expressed –Goeman’s Global test for survival data Class prediction –Multiple classifiers –Complete LOOCV, k-fold CV, repeated k-fold, .632 bootstrap –Permutation significance of cross-validated error rate

Selected Features of BRB-ArrayTools Survival risk-group prediction –Supervised principal components with and without clinical covariates –Cross-validated Kaplan Meier Curves –Permutation test of cross-validated KM curves Clustering tools for class discovery with reproducibility statistics on clusters –Internal access to Eisen’s Cluster and Treeview Visualization tools including rotating 3D principal components plot exportable to Powerpoint with rotation controls Extensible via R plug-in feature Tutorials and datasets

BRB-ArrayTools Extensive built-in gene annotation and linkage to gene annotation websites Publicly available for non-commercial use –

BRB-ArrayTools December (year not preserved): registered users (count not preserved), 1938 distinct institutions, 68 countries, 311 citations