Collin Tokheim, Rachel Karchin  Cell Systems 


CHASMplus Reveals the Scope of Somatic Missense Mutations Driving Human Cancers. Collin Tokheim, Rachel Karchin. Cell Systems, Volume 9, Issue 1, Pages 9-23.e8 (July 2019). DOI: 10.1016/j.cels.2019.05.005. Copyright © 2019 The Authors.


Figure 1. Overview of CHASMplus. See also Figure S1 and Table S1. (A) CHASMplus, a machine learning approach, was applied to somatic missense mutations in tumors from 32 different cancer types found in The Cancer Genome Atlas (TCGA). High scores predicted by CHASMplus indicate stronger evidence for a mutation being a cancer driver. (B) CHASMplus has a model for each cancer type in the TCGA to identify driver missense mutations specific to a cancer type. Putative driver missense mutations are identified at a false discovery rate (FDR) threshold of 1%. In the example, the mutation was statistically significant for the BRCA model but not for the UVM model. Abbreviations: breast invasive carcinoma (BRCA), uveal melanoma (UVM), and cumulative distribution function (CDF).
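
The 1% FDR threshold in panel (B) can be illustrated with a small sketch. It assumes the Benjamini-Hochberg procedure is used to turn per-mutation p values into q values; this caption does not say which procedure CHASMplus applies, and the function names `benjamini_hochberg` and `call_drivers` are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


def call_drivers(p_values, fdr=0.01):
    """Flag mutations whose q-value is at or below the FDR threshold (assumed 1%)."""
    return [q <= fdr for q in benjamini_hochberg(p_values)]
```

Under this sketch, the same mutation scored by two cancer-type-specific models would simply be passed through `call_drivers` once per model, which is how it could be significant for one cancer type and not another.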

Figure 2. Assessment of Pan-cancer and Cancer-Type-Specific Predictions. See also Figure S2 and Table S2. (A) Receiver operating characteristic (ROC) curve for identifying activating mutations in MCF10A cells, a breast epithelium line, with breast cancer-specific models. (B) ROC curve for discerning literature annotated oncogenic mutations from OncoKB for specific cancer types: breast invasive ductal carcinoma (BRCA), glioblastoma multiforme (GBM), high-grade serous ovarian cancer (OV), and colon adenocarcinoma (COAD). (C) Score distribution for cancer-type-specific models (labeled by title) for TCGA missense mutations in EGFR, found in either lung adenocarcinoma (LUAD; typically, kinase domain mutation) or glioblastoma multiforme (GBM; typically, extracellular domain mutations). Box plots show quartiles, with whiskers defined according to Tukey's criterion. (D) Overview of 5 pan-cancer benchmarks by scale of evaluation and type of supportive evidence. (E) Pan-cancer performance measured by the area under the receiver operating characteristic curve (auROC) on the 5 mutation-level benchmarks (shown in text). The color scale from red to blue indicates methods ranked from high to low performance. Benchmarks are categorized by in vitro (green), in vivo (yellow), and literature-based benchmarks (turquoise).
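
As a reminder of what the auROC in panel (E) measures, the following minimal sketch computes it from scores and binary labels using the pairwise (Mann-Whitney) identity. The function name `au_roc` and the toy inputs are illustrative assumptions, not taken from the paper.

```python
def au_roc(scores, labels):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    scores: predicted scores, where higher means more driver-like.
    labels: 1 for a positive (e.g. annotated oncogenic), 0 for a negative.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

An auROC of 0.5 corresponds to random ranking and 1.0 to every positive scoring above every negative.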

Figure 3. Frequency Spectrum of Driver Missense Mutations. See also Figures S3 and S4 and Tables S3, S4, and S5. (A) Proportion of the overall frequency of driver missense mutations found to be rare (<1% of samples or singleton mutations), intermediate (1%–5%), and common (>5%), correspondingly shown as light to dark blue. (B) Correlation between tumor mutation burden and overall driver prevalence (number of driver mutations per tumor) across the frequency spectrum of drivers (rare, intermediate, and common). Shaded area indicates the 95% confidence interval. (C) Analysis of genes containing predicted driver missense mutations that preferentially occurred in a subtype-specific manner (chi-square test; q < 0.1). The pie chart illustrates the percentage of genes for all cancer types with a significant subtype-specific pattern, while the heatmap illustrates significant genes for breast invasive carcinoma. (D) Structure of the phosphatase 2A holoenzyme (PDB: 2IAE). (E) Structures of the ERBB2 extracellular domain (left; PDB: 2A91) and kinase domain (right; PDB: 3PP0). (F) Lollipop plot of driver missense mutations identified by CHASMplus (yellow), and likely truncating variants (frameshift insertion or deletion: purple; nonsense mutation: red; and splice site mutation: orange) in CASP8 for head and neck squamous cell carcinoma (HNSC). TCGA acronyms for cancer types are listed in STAR Methods.
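
The frequency classes in panel (A) amount to a simple binning rule, sketched below. The function name `frequency_class` is hypothetical, and the handling of the exact 1% and 5% boundaries is an assumption, since the caption does not specify it.

```python
def frequency_class(n_mutated, n_samples):
    """Classify a driver mutation by how often it occurs in a cohort.

    rare: singleton or < 1% of samples; intermediate: 1%-5%; common: > 5%.
    Boundary values (exactly 1% or 5%) are placed in "intermediate" here,
    which is an assumption rather than something the caption states.
    """
    if n_samples <= 0 or n_mutated <= 0:
        raise ValueError("counts must be positive")
    fraction = n_mutated / n_samples
    if n_mutated == 1 or fraction < 0.01:
        return "rare"
    if fraction <= 0.05:
        return "intermediate"
    return "common"
```

Note that a singleton counts as rare even in a small cohort, where one sample can exceed 1% of the total.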

Figure 4. Inferred Immune Cell Content and Gene Expression Correlate with Presence of Predicted Driver Missense Mutations in CASP8. See also Figure S5. Twelve immune-related biomarkers are shown, as estimated in head and neck squamous cell carcinoma (HNSC) tumor samples from TCGA (Thorsson et al., 2018). Each panel compares the distribution of a marker in samples harboring a rare driver missense mutation (orange), a truncating mutation (purple), and control samples with no CASP8 mutations (green). Both tumor samples with rare driver missense mutations and truncating mutations showed a similar significantly elevated inferred immune cell infiltration. Top row: inferred immune infiltrates from DNA methylation or gene expression from TCGA HNSC samples (Thorsson et al., 2018). Bottom row: gene expression values from RNA-Seq for several important immune-related genes reported in (Thorsson et al., 2018). “mis” indicates samples with driver missense mutations identified by CHASMplus, and “lof” is likely loss-of-function variants (nonsense, frameshift insertion or deletions, splice site, translation start site, and nonstop mutations). Mann-Whitney U test, adjusted p value (Benjamini-Hochberg method): ∗ < 0.05, ∗∗ < 0.01, and ∗∗∗ < 0.001. Box plots show quartiles, with whiskers defined according to Tukey's criterion.
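
The box-plot convention used in this and other figures (quartiles with Tukey whiskers) can be sketched as follows. The quartile interpolation method is an assumption, since plotting libraries differ on it, and `tukey_box_stats` is a hypothetical name.

```python
import statistics


def tukey_box_stats(values):
    """Quartiles plus Tukey whiskers (most extreme points within 1.5 * IQR)."""
    data = sorted(values)
    # Quartiles by linear interpolation over the observed range (an assumption).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme data points inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }
```

Points beyond the whiskers are the ones a box plot would draw individually as outliers.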

Figure 5. CHASMplus Predictions Correlate with Multiplexed Functional Assays in PTEN. See also Table S6. (A) Heatmap displaying gene-weighted CHASMplus scores (gwCHASMplus) of PTEN missense mutations. (B and C) (B) Scores are negatively correlated with PTEN lipid phosphatase activity (Spearman rho = −0.520) (Mighell et al., 2018) and (C) PTEN protein abundance (Spearman rho = −0.428) (Matreyek et al., 2018). (D) Both gwCHASMplus correlations had a larger absolute value than the correlation between the two experiments (Spearman rho = 0.351). ∗∗∗ = p < 0.001. (E and F) Distribution of (E) PTEN lipid phosphatase activity or (F) protein abundance in predicted driver missense mutations from TCGA (common: >5% of tumor samples; intermediate: 1%–5%; and rare: <1%), all other missense mutations and truncating mutations. Box plots show quartiles, with whiskers defined according to Tukey's criterion. (G) Comparison of CHASMplus to the 2nd and 3rd ranked methods in Figure 2E. Left: specificity of methods at identifying PTEN missense mutations that do not lower lipid phosphatase activity. Right: sensitivity (recall), precision, and F1 score for identifying missense mutations that lower lipid phosphatase activity. CHASMplus had the highest specificity, precision, and F1 score.
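
The Spearman correlations in panels (B) to (D) are Pearson correlations computed on ranks. A minimal sketch follows, assuming ties receive their average rank; the function names are hypothetical and the inputs would be, for example, gwCHASMplus scores paired with measured phosphatase activity for the same mutations.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    # Undefined (division by zero) if either input is constant.
    return cov / (var_x * var_y) ** 0.5
```

A negative rho, as reported in the caption, means that mutations with higher scores tend to have lower measured activity or abundance.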

Figure 6. Characteristics and Trajectory of Missense Mutation Driver Discovery. See also Figure S7 and Table S7. (A) Plot displaying normalized driver diversity and driver prevalence (fraction of tumor samples mutated) for driver missense mutations in 32 cancer types. K-means clustering identified 5 clusters with centroids shown as numerically designated circles. (B) Prevalence of driver missense mutations identified by CHASMplus as a function of sample size. Lines represent LOWESS fit to different rarities of driver missense mutations. TCGA acronyms for cancer types are listed in the STAR Methods.
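
For readers unfamiliar with the clustering in panel (A), here is a minimal sketch of k-means (Lloyd's algorithm) on two-dimensional points such as (driver diversity, driver prevalence) pairs. The initialization, any feature scaling, and the function name `kmeans` are assumptions; the caption only states that k-means with 5 clusters was used.

```python
def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm on 2D points, starting from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    assignment = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(len(centroids)),
                key=lambda k: (p[0] - centroids[k][0]) ** 2
                + (p[1] - centroids[k][1]) ** 2)
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:
                updated.append((sum(p[0] for p in members) / len(members),
                                sum(p[1] for p in members) / len(members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    return centroids, assignment
```

Passing explicit starting centroids keeps the sketch deterministic; practical implementations usually run several random restarts and keep the best solution.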

Figure 7. Hotspot Detection Alone Has Limited Statistical Power to Identify Driver Mutations. (A) Statistical power to detect a significantly elevated number of non-silent mutations in an individual codon, as a function of sample size and mutation rate. Circles represent each cancer type from the TCGA and are placed by sample size and median mutation rate. Curves are colored by the frequency of driver mutations (fraction of non-silent mutated cancer samples above background). If a circle is below a curve, then hotspot detection is not yet sufficiently powerful to detect driver mutations of that frequency. (B) Bar graph comparing power (sensitivity) to detect labeled oncogenic driver missense mutations from OncoKB between CHASMplus and the cancer hotspots method (Chang et al., 2016). Stratification by TP53 suggests that the increased power provided by CHASMplus is not solely a result of high performance on oncogenic TP53 mutations.
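
The dependence of power on sample size and mutation rate in panel (A) can be illustrated with a simplified calculation. This sketch assumes a one-sided exact binomial test at a single codon with a fixed significance level and no genome-wide multiple-testing correction; it is not the paper's power analysis, and all names and parameter values are hypothetical.

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability mass, computed in log space for stability."""
    if p <= 0.0:
        return 1.0 if k == 0 else 0.0
    if p >= 1.0:
        return 1.0 if k == n else 0.0
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)


def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


def hotspot_power(n_samples, background_rate, driver_freq, alpha=0.05):
    """Power of a one-sided binomial test for excess mutations at one codon.

    background_rate: per-sample chance of a background mutation at the codon.
    driver_freq: extra fraction of samples mutated because of the driver.
    """
    # Smallest count that is significant under the background-only model.
    critical = next(k for k in range(n_samples + 2)
                    if binom_sf(k, n_samples, background_rate) <= alpha)
    # Probability of reaching that count when the driver effect is present.
    return binom_sf(critical, n_samples, background_rate + driver_freq)
```

Under these assumptions, a driver present in 1% of samples above background is detected far more reliably with 1,000 samples than with 100, which matches the qualitative message of the curves.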