Data Integration for Cancer Genomics. Personalized Medicine Tumor Board Question: given all we know about a patient, what is the “optimal” treatment?

Presentation transcript:

The Cancer Genome Atlas Project (TCGA): SNPs; structural variations; DNA methylation; gene expression; microRNA expression; paired and unpaired samples

Data Processing Challenges: contamination; subclones

Biological questions: changes in genes between cancer and normal samples; disease heterogeneity and subtypes; joint modeling and mechanisms

Integrative approach; meta-analytical approach

PARADIGM: PAthway Recognition Algorithm using Data Integration on Genomic Models

$X_{p \times n} = W_{p \times (k-1)} Z_{(k-1) \times n} + e_{p \times n}$, with $\operatorname{cov}(e) = \operatorname{diag}(\psi_1, \psi_2, \ldots, \psi_p)$
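A minimal sketch of the factor model above may help fix the dimensions: a p x (k-1) loading matrix W times a (k-1) x n factor matrix Z, plus noise whose variance differs per feature (the diagonal entries ψ). The Gaussian noise, the function names, and the toy numbers are all assumptions made for illustration; the slide specifies only the covariance structure of e.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def simulate_factor_model(W, Z, psi, seed=0):
    """Return X = W Z + e, where e[i][j] has variance psi[i].

    Assumption: the noise is Gaussian and independent across entries,
    which gives cov(e) = diag(psi_1, ..., psi_p).
    """
    rng = random.Random(seed)
    WZ = matmul(W, Z)
    return [
        [WZ[i][j] + rng.gauss(0.0, psi[i] ** 0.5) for j in range(len(WZ[0]))]
        for i in range(len(WZ))
    ]

# Toy sizes: p = 4 features, k - 1 = 2 latent factors, n = 3 samples.
W = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [2.0, -1.0]]
Z = [[1.0, 0.0, -1.0], [0.0, 1.0, 1.0]]
psi = [0.1, 0.2, 0.1, 0.3]

X = simulate_factor_model(W, Z, psi)
X_noiseless = simulate_factor_model(W, Z, [0.0, 0.0, 0.0, 0.0])
```

With all ψ set to zero the simulated X reduces to the product W Z, which is a quick way to check that the shapes line up.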

Non-negative matrix factorization: $X_{M \times N} = W_{M \times K} H_{K \times N}$. All matrix entries are nonnegative. Minimize: [objective not shown in transcript]
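The factorization above can be sketched in a few lines. The objective on the slide did not survive in the transcript, so this example assumes the common choice of the squared Frobenius error between X and W H, minimized with Lee and Seung's multiplicative updates. It is a dependency-free illustration with hypothetical function names and toy data, not the presenter's implementation.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius_error(X, W, H):
    """Squared Frobenius norm of X - W H (the assumed objective)."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry))

def nmf(X, K, n_iter=200, seed=0, eps=1e-9):
    """Factor a nonnegative M x N matrix X into W (M x K) and H (K x N)."""
    rng = random.Random(seed)
    M, N = len(X), len(X[0])
    # Strictly positive starting values keep every update well defined.
    W = [[rng.random() + 0.1 for _ in range(K)] for _ in range(M)]
    H = [[rng.random() + 0.1 for _ in range(N)] for _ in range(K)]
    for _ in range(n_iter):
        # H <- H * (W^T X) / (W^T W H), elementwise
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[H[k][j] * num[k][j] / (den[k][j] + eps) for j in range(N)]
             for k in range(K)]
        # W <- W * (X H^T) / (W H H^T), elementwise
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(K)]
             for i in range(M)]
    return W, H

# Toy nonnegative matrix (M = 4, N = 3) factored with K = 2.
X = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0], [2.0, 1.0, 5.0], [1.0, 1.0, 3.0]]
W_first, H_first = nmf(X, K=2, n_iter=1)
W_final, H_final = nmf(X, K=2, n_iter=200)
err_first = frobenius_error(X, W_first, H_first)
err_final = frobenius_error(X, W_final, H_final)
```

Because the updates only multiply by nonnegative ratios, W and H stay nonnegative throughout, and the error after many iterations should not exceed the error after one.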

$X_1$: an $M \times N_1$ matrix; $X_2$: an $M \times N_2$ matrix; $X_3$: an $M \times N_3$ matrix. $X_1 = W H_1$, $X_2 = W H_2$, $X_3 = W H_3$
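Since the three data matrices share the same M rows and the same W, the three factorizations can be written as a single one on the column-wise concatenation. The block below spells this out; the inner dimension K and the shapes of W and the H matrices are assumed to follow the single-matrix factorization on the previous slide.

```latex
\underbrace{\begin{bmatrix} X_1 & X_2 & X_3 \end{bmatrix}}_{M \times (N_1 + N_2 + N_3)}
=
\underbrace{W}_{M \times K}\,
\underbrace{\begin{bmatrix} H_1 & H_2 & H_3 \end{bmatrix}}_{K \times (N_1 + N_2 + N_3)}
```

Read this way, W carries the structure common to all three data sets, while each $H_i$ holds the coefficients specific to one of them.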

TCGA, GWAS, and ENCODE

Cancer Treatment

Examples

Gene expression data: HG-U133A chip, mapped to genes across 59 cell lines (expression data for the cell line "LC:NCI_H23" was unavailable). Genes included in two lists were used: (1) 766 cancer-related genes (Chen, et al., 2008); (2) 8919 genes from the Integrated Druggable Genome Database (IDGD) Project (Hopkins and Groom, 2002; Russ and Lampel, 2005). After this filtering, 6958 genes were retained.

Drug response data: 101 drugs annotated in the CancerResource database (Ahmed, et al., 2011), with response given as -log(GI50).

Pathway association information: retrieved from the KEGG MEDICUS database (Kanehisa, et al., 2010). 58 pathways were used, each either known to be related to cancer or containing drug targets. Among the 6958 genes selected in the gene expression step, 1863 genes are covered by these 58 pathways and constitute the final list of genes in our real data analysis.
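The filtering described above can be sketched with plain set operations. Two points are assumptions rather than facts from the slide: that the two gene lists are combined as a union before intersecting with the genes on the chip, and that the logarithm in -log(GI50) is base 10 with GI50 given in molar units. The function names, gene symbols, and pathway labels are toy placeholders.

```python
import math

def select_genes(chip_genes, cancer_genes, druggable_genes, pathways):
    """Keep chip genes found in either list, then keep those covered by a pathway.

    Assumption: the two lists are combined as a union.
    """
    candidates = set(chip_genes) & (set(cancer_genes) | set(druggable_genes))
    in_pathways = set().union(*pathways.values()) if pathways else set()
    return sorted(candidates & in_pathways)

def neg_log_gi50(gi50_molar):
    """Convert a GI50 concentration to -log10(GI50) (base 10 is assumed)."""
    return -math.log10(gi50_molar)

# Toy inputs; the real lists hold 766 and 8919 genes and 58 pathways.
chip = ["TP53", "EGFR", "KRAS", "GAPDH", "BRAF"]
cancer_list = ["TP53", "KRAS", "MYC"]
druggable_list = ["EGFR", "BRAF"]
toy_pathways = {"pathway_A": ["TP53", "EGFR", "MYC"], "pathway_B": ["KRAS"]}

final_genes = select_genes(chip, cancer_list, druggable_list, toy_pathways)
sensitivity = neg_log_gi50(1e-6)
```

On this scale a larger value means a lower GI50, that is, a drug that inhibits growth at a lower concentration.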

GI50 values

Cancer Types

Cancer type            Number of cell lines
Leukemia               6
Non-Small Cell Lung    8
Colon                  7
CNS                    6
Melanoma               9
Ovarian                7
Renal                  8
Prostate (excluded)    2
Breast                 6
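As a quick consistency check, the counts in the table can be tallied. The sketch below uses only the numbers listed on the slide; the variable names are illustrative.

```python
# Cell line counts per cancer type, as listed on the slide.
cell_lines = {
    "Leukemia": 6,
    "Non-Small Cell Lung": 8,
    "Colon": 7,
    "CNS": 6,
    "Melanoma": 9,
    "Ovarian": 7,
    "Renal": 8,
    "Prostate": 2,
    "Breast": 6,
}
excluded = {"Prostate"}  # marked "(excluded)" on the slide

total = sum(cell_lines.values())
retained = {t: n for t, n in cell_lines.items() if t not in excluded}
n_retained = sum(retained.values())
```

The total comes to 59, matching the 59 cell lines in the expression data, and 57 cell lines across 8 cancer types remain once prostate is dropped.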

Connectivity Map Data: CMap Build 02 provides public download of genome-wide transcriptional profiles of five human cancer cell lines (MCF7: human breast cancer; HL60: human promyelocytic leukemia; ssMCF7: MCF7 grown in a different vehicle; PC3: human epithelial prostate cancer; SKMEL5: human skin melanoma), both before and after treatment with 1309 distinct bioactive small molecules. The data used come from the HT_HG-U133A array platform, which consists of 4466 expression response profiles representing 1084 different compounds.
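One simple way to turn paired before/after measurements into an expression response profile is a per-gene log2 ratio, with profiles then grouped by compound. This is only a sketch under that assumption; the slide does not say how the response profiles were computed, and the function names, gene labels, and compound labels below are hypothetical.

```python
import math
from collections import defaultdict

def response_profile(treated, control):
    """Per-gene log2 ratio of treated to control expression (assumed definition)."""
    return {g: math.log2(treated[g] / control[g]) for g in treated if g in control}

def group_by_compound(instances):
    """Collect response profiles per compound from (compound, cell_line, profile) tuples."""
    grouped = defaultdict(list)
    for compound, cell_line, profile in instances:
        grouped[compound].append((cell_line, profile))
    return dict(grouped)

# Toy expression values for two genes in one cell line.
control = {"gene_A": 2.0, "gene_B": 4.0}
treated = {"gene_A": 8.0, "gene_B": 2.0}
profile = response_profile(treated, control)

instances = [
    ("compound_1", "MCF7", profile),
    ("compound_1", "PC3", profile),
    ("compound_2", "HL60", profile),
]
by_compound = group_by_compound(instances)
```

Grouping in this way reflects the fact that the 4466 profiles cover only 1084 compounds, so a compound can have several profiles.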

Integration within the same cancer type; integration across different cancer types

One individual with 188-fold coverage

Ideal Pipeline: patient diagnosis and sample collection; various types of genomic profiling; driver mutations and disease subtypes; targeted treatments, monitoring, and additional treatments

Topics of Interest: data processing; relationships among different data types; tumor heterogeneity; single cell analysis; modeling; targeted treatment; integration over different tumor types; TCGA, ENCODE, GWAS, 1000 Genomes, and others