Tao Wang Assistant Professor Quantitative Biomedical Research Center

Presentation transcript:

Three-component dissection of tumor cellular heterogeneity by a Bayesian Hierarchical Model
Tao Wang, Assistant Professor
Quantitative Biomedical Research Center, Department of Clinical Sciences
UT Southwestern Medical Center
Cancer Discovery, 2018, IF=24.373

Tumor cellular heterogeneity presents challenges and opportunities for understanding cancer
[Figure labels: normal parenchyma, malignant tumor, immune cells, stroma cells, tumor cells]
Challenges: the sampled tumor tissue cannot be assumed to be a pure population of a single cell type.
Opportunities: non-tumor cells provide critical information about the biology of tumors.
Robust statistical methods are needed to dissect the different cell components.

The core dissection problem
In each patient i, we need to dissect one vector of expression (of length J) into three vectors (the basic idea):

$$e_{ij}^{T}\rho_i^{T} + e_{ij}^{S}\rho_i^{S} + e_{ij}^{N}\rho_i^{N} = e_{ij}^{M}$$
$$\rho_i^{T} + \rho_i^{S} + \rho_i^{N} = 1, \quad 0 < \rho_i^{T}, \rho_i^{S}, \rho_i^{N} < 1$$

i iterates through patients
j iterates through all genes
ρ denotes the mixing proportions for each patient
e_ij denotes the expression of gene j in patient i for T, S, N, and M
T: tumor cell expression (tumorgraft sample)
N: normal cell expression (normal tissue sample)
S: stroma/immune expression (unknown component)
M: expression of the mixture of cells in bulk tumor RNA-Seq (bulk tumor sample)
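To make the forward direction of the mixing equation concrete, here is a minimal sketch in plain Python. The function name `mix_expression` and all expression values are hypothetical and chosen only for illustration; the actual dissection problem runs the other way, recovering the components from the observed mixture.

```python
def mix_expression(e_t, e_s, e_n, rho_t, rho_s, rho_n):
    """Return bulk expression e^M for one patient, gene by gene, on the raw scale."""
    # Enforce the constraints stated on the slide.
    if not all(0.0 < r < 1.0 for r in (rho_t, rho_s, rho_n)):
        raise ValueError("each proportion must lie strictly between 0 and 1")
    if abs(rho_t + rho_s + rho_n - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    return [t * rho_t + s * rho_s + n * rho_n for t, s, n in zip(e_t, e_s, e_n)]


# Hypothetical expression values for J = 3 genes in a single patient.
e_t = [100.0, 10.0, 50.0]   # tumor component
e_s = [20.0, 200.0, 50.0]   # stroma/immune component
e_n = [40.0, 30.0, 50.0]    # normal component

bulk = mix_expression(e_t, e_s, e_n, rho_t=0.5, rho_s=0.25, rho_n=0.25)
print(bulk)  # [65.0, 62.5, 50.0]
```

Note that the third gene, which has the same expression in all three components, carries no information about the proportions.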

Technical challenge #1: Dissecting gene expression on the raw scale or log scale?
Raw scale: numerical instabilities will likely crash the estimation or cause serious bias (Ahn et al, DeMix, 2013)

$$e_{ij}^{T}\rho_i^{T} + e_{ij}^{S}\rho_i^{S} + e_{ij}^{N}\rho_i^{N} = e_{ij}^{M}$$

Log scale: the estimation will be numerically stable but distorted (Zhong and Liu 2012)

$$\log(e_{ij}^{T})\rho_i^{T} + \log(e_{ij}^{S})\rho_i^{S} + \log(e_{ij}^{N})\rho_i^{N} = \log(e_{ij}^{M})$$
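One way to see why the log-scale equation distorts the result: mixing linearly in log space is the same as taking a weighted geometric mean of the components, which never exceeds the weighted arithmetic mean used on the raw scale. The sketch below uses hypothetical numbers and function names; it illustrates this one source of distortion and is not taken from the cited papers.

```python
import math


def raw_scale_mixture(e, rho):
    """Weighted arithmetic mean: the raw-scale mixing equation."""
    return sum(x * r for x, r in zip(e, rho))


def log_scale_mixture(e, rho):
    """Bulk expression implied by mixing linearly in log space.

    exp(sum(rho * log(e))) is the weighted geometric mean of e.
    """
    return math.exp(sum(math.log(x) * r for x, r in zip(e, rho)))


# Hypothetical expression of one gene in the T, S, and N components.
e = [1000.0, 10.0, 100.0]
rho = [0.5, 0.25, 0.25]

raw = raw_scale_mixture(e, rho)        # 527.5
log_based = log_scale_mixture(e, rho)  # 10 ** 2.25, roughly 177.8

print(raw, log_based)
```

The gap between the two values grows as the component expressions diverge, so genes that differ strongly between components are the ones most affected.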

Technical challenge #2: The need to model all three components of cells
Many current models and software packages conflate the stroma/immune and the normal components:
ISOpure (Anghel et al, BMC Bioinformatics, 2015)
DeMix (Ahn et al, Bioinformatics, 2013)
InfiniumPurify (Zheng et al, Genome Biology, 2017)

$$Bulk_{ij} = \rho_i \, Tumor_{ij} + (1 - \rho_i) \, Normal_{ij}$$

Here the single "Normal" term stands for normal cells, immune cells, stromal cells, etc.
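The following sketch illustrates what can go wrong when a two-component model is applied to a sample that actually contains three components. It assumes, purely for illustration, that the tumor proportion is known and that matched normal tissue is used as the only non-tumor reference; the helper name and all numbers are hypothetical and do not reproduce any of the cited tools.

```python
def two_component_tumor(bulk, normal, rho):
    """Solve Bulk = rho * Tumor + (1 - rho) * Normal for Tumor, gene by gene."""
    return [(b - (1.0 - rho) * n) / rho for b, n in zip(bulk, normal)]


# Hypothetical truth for two genes: tumor, stroma/immune, normal expression.
true_tumor = [100.0, 10.0]
stroma = [20.0, 200.0]
normal = [40.0, 30.0]
rho_t, rho_s, rho_n = 0.5, 0.25, 0.25

# Bulk sample generated by the three-component raw-scale model.
bulk = [t * rho_t + s * rho_s + n * rho_n
        for t, s, n in zip(true_tumor, stroma, normal)]

# The two-component model treats everything non-tumor as "normal".
tumor_hat = two_component_tumor(bulk, normal, rho_t)
print(tumor_hat)  # [90.0, 95.0] versus the true [100.0, 10.0]
```

The second gene is highly expressed in the stroma/immune component, so the two-component model wrongly attributes that signal to the tumor.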

DisHet: A Bayesian Hierarchical model for Dissecting Heterogeneous bulk tumors
The DisHet model is a hybrid of
a raw-scale model: the average expression of the S component across all patients
a log-scale model: individual expression variation in S
It dissects all three components of cells properly.

$$\log(e_{ij}^{M}) \sim N\left(\log\left(e_{ij}^{T}\rho_i^{T} + e_{j}^{S}\rho_i^{S} + e_{ij}^{N}\rho_i^{N}\right),\ \sigma_j^{2}\right)$$

Annotations on the formula:
Population-wise average of S (S of different patients should look similar): e_j^S carries no patient index i
Individual variation in S: modeled on the log scale
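As a rough sketch, the likelihood term on this slide can be evaluated for a single gene in a single patient as shown below. The function and argument names are hypothetical, and this covers only the sampling distribution shown on the slide; the priors and the estimation procedure of the full hierarchical model are not described in this transcript and are not reproduced here.

```python
import math


def dishet_log_density(e_m, e_t, e_s_avg, e_n, rho_t, rho_s, rho_n, sigma):
    """Log density of log(e_m) under N(log(raw-scale mixture), sigma^2).

    e_s_avg is the population-wise average stroma/immune expression of the
    gene (no patient index), matching e_j^S on the slide.
    """
    # The mean is formed on the raw scale, then moved to the log scale.
    mu = math.log(e_t * rho_t + e_s_avg * rho_s + e_n * rho_n)
    z = (math.log(e_m) - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


# Hypothetical values: the observed bulk value equals the mixture exactly,
# so the density is at its maximum for this sigma.
at_mode = dishet_log_density(65.0, 100.0, 20.0, 40.0, 0.5, 0.25, 0.25, sigma=1.0)
off_mode = dishet_log_density(130.0, 100.0, 20.0, 40.0, 0.5, 0.25, 0.25, sigma=1.0)
print(at_mode, off_mode)
```

Forming the mixture on the raw scale and placing the noise on the log scale is what makes the model a hybrid of the two approaches contrasted earlier.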

Real data analysis of DisHet agrees with biological expert knowledge: ρ
35 RCC patients, 25,000 genes
Correlate the estimated tumor component proportions with the true proportions given by pathologists
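The slide does not say which correlation measure was used. Assuming a Pearson correlation, the comparison could be sketched as follows; the proportions below are made-up placeholders, not values from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical tumor proportions for five patients.
estimated = [0.62, 0.45, 0.80, 0.30, 0.55]     # model estimates of rho^T
pathologist = [0.60, 0.50, 0.75, 0.35, 0.50]   # pathologist readings

r = pearson(estimated, pathologist)
print(round(r, 3))
```

A rank-based measure such as Spearman correlation would be a reasonable alternative if the pathologist readings are coarse.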

Real data analysis agrees with biological expert knowledge: e^S
Searching for enriched Gene Ontology pathways among genes that are highly activated in each component
Stroma/immune component: immune-regulated pathways
[Figure: P values showing enrichment of pathways]

Discovering a highly inflamed kidney cancer subtype using stroma-specific gene expression
[Figure: patients split into an inflamed subtype (IS) and a non-inflamed subtype]
IS patients have worse survival.
Wang T, Lu R, Kapur P, et al. An Empirical Approach Leveraging Tumorgrafts to Dissect the Tumor Microenvironment in Renal Cell Carcinoma Identifies Missing Link to Prognostic Inflammatory Factors. Cancer Discovery. 2018.

Acknowledgements
UTSW Kidney Cancer Program: James Brugarolas, Payal Kapur, Raquibul Hannan, Ivan Pedrosa, Qurratulain Yousuf, Mingyi Chen, Alexander Filatenkov, Jose Torrealba
UTSW QBRC/BICF: Rong Lu, Ze Zhang, He Zhang, Min Soo Kim, Danni Luo, Xin Luo, Jingxuan He
Genentech: Eric W Stawiski, Zora Modrusan, Steffen Durinck, Somasekar Seshagiri

Tao Wang 972 567 2356 Tao.wang@utsouthwestern.edu 12245 Montego Plz, Dallas, TX, 75230