Data Integration for Cancer Genomics. Personalized Medicine Tumor Board Question: given all we know about a patient, what is the “optimal” treatment?


1 Data Integration for Cancer Genomics

2 Personalized Medicine Tumor Board. Question: given all we know about a patient, what is the “optimal” treatment?


5 The Cancer Genome Atlas Project (TCGA): SNP; structural variations; DNA methylation; gene expression; microRNA expression; paired samples/unpaired samples

6 Data Processing Challenges: contamination; subclones


8 Biological questions: changes in genes between cancer and normal samples; disease heterogeneity and subtypes; joint modeling and mechanisms


10 Integrative approach; meta-analytical approach

11 PARADIGM: PAthway Recognition Algorithm using Data Integration on Genomic Models


14 $X_{p \times n} = W_{p \times (k-1)} Z_{(k-1) \times n} + e_{p \times n}$, with $\operatorname{cov}(e) = \operatorname{diag}(\psi_1, \psi_2, \ldots, \psi_p)$
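A short worked step may help in reading this model. The slide gives only the decomposition and the diagonal error covariance. The sketch below additionally assumes (this is not stated on the slide) that each column $z_j$ of $Z$ has mean zero and identity covariance and is independent of the error column $e_j$, and writes $\Psi$ for $\operatorname{diag}(\psi_1, \ldots, \psi_p)$. Under those assumptions the covariance of one sample column $x_j$ splits into a low-rank shared part and a feature-specific part:

```latex
\begin{aligned}
x_j &= W z_j + e_j, \qquad z_j \sim (0,\, I_{k-1}), \quad e_j \sim (0,\, \Psi), \quad z_j \perp e_j \\
\operatorname{cov}(x_j) &= W \operatorname{cov}(z_j)\, W^{\top} + \operatorname{cov}(e_j) \\
                        &= W W^{\top} + \Psi, \qquad \Psi = \operatorname{diag}(\psi_1, \psi_2, \ldots, \psi_p)
\end{aligned}
```

Read this way, $W W^{\top}$ (rank at most $k-1$) carries the structure shared across the $p$ features, while each $\psi_i$ absorbs noise specific to feature $i$.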


18 Non-negative matrix factorization: $X_{M \times N} = W_{M \times K} H_{K \times N}$. All matrix entries are nonnegative. Minimize
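The objective on this slide did not survive in the transcript, so the following is only a minimal sketch under an assumption: it minimizes the squared Frobenius error between X and W H using the widely used multiplicative update rules, which keep every entry nonnegative. The function names (`nmf`, `frobenius_error`), the toy matrix, and the choice K = 2 are illustrative and not taken from the presentation.

```python
import random

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius_error(X, W, H):
    """Squared Frobenius norm of X - W H (the assumed objective)."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry))

def nmf(X, K, n_iter=200, seed=0, eps=1e-9):
    """Factor a nonnegative M x N matrix X into W (M x K) and H (K x N)."""
    rng = random.Random(seed)
    M, N = len(X), len(X[0])
    # Strictly positive random start, so updates never divide by zero.
    W = [[rng.random() + 0.1 for _ in range(K)] for _ in range(M)]
    H = [[rng.random() + 0.1 for _ in range(N)] for _ in range(K)]
    for _ in range(n_iter):
        # H <- H * (W^T X) / (W^T W H), elementwise
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[h * n / (d + eps) for h, n, d in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # W <- W * (X H^T) / (W H H^T), elementwise
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[w * n / (d + eps) for w, n, d in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H

# Toy nonnegative data (made up for illustration): 4 features x 3 samples.
X = [[1, 2, 3], [2, 4, 6], [1, 0, 1], [3, 2, 5]]
W, H = nmf(X, K=2)
print(frobenius_error(X, W, H))
```

Because each update multiplies the current entry by a ratio of nonnegative quantities, W and H stay nonnegative without any explicit projection step.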

19 $X_1$: an $M \times N_1$ matrix; $X_2$: an $M \times N_2$ matrix; $X_3$: an $M \times N_3$ matrix. $X_1 = W H_1$, $X_2 = W H_2$, $X_3 = W H_3$
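One way to read the shared-$W$ factorization, offered as a sketch rather than as the presenter's derivation: because all three matrices have the same $M$ rows and share the same $W$, the three equations are equivalent to a single factorization of the column-wise concatenation. If the fitting criterion is assumed to be the unweighted sum of squared Frobenius errors (the slide does not state the criterion), the objectives coincide as well:

```latex
\begin{aligned}
\begin{bmatrix} X_1 & X_2 & X_3 \end{bmatrix}
  &= W \begin{bmatrix} H_1 & H_2 & H_3 \end{bmatrix},
  \qquad W \in \mathbb{R}_{\ge 0}^{M \times K},\quad H_i \in \mathbb{R}_{\ge 0}^{K \times N_i} \\
\sum_{i=1}^{3} \lVert X_i - W H_i \rVert_F^2
  &= \bigl\lVert \begin{bmatrix} X_1 & X_2 & X_3 \end{bmatrix}
     - W \begin{bmatrix} H_1 & H_2 & H_3 \end{bmatrix} \bigr\rVert_F^2
\end{aligned}
```

Here $W$ holds the $K$ basis patterns common to all data sets, and each $H_i$ holds the coefficients for the $N_i$ columns of $X_i$. If the data sets were weighted differently, the equivalence would hold only after rescaling each $X_i$ accordingly.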


21 TCGA, GWAS, and ENCODE

22 Cancer Treatment

23 Examples: http://discover.nci.nih.gov/cellminer/
Gene expression data: HG-U133A chip, mapped to 12980 genes across 59 cell lines (expression data for the cell line “LC:NCI_H23” were unavailable). Use genes included in two lists: (1) 766 cancer-related genes (Chen, et al., 2008); (2) 8919 genes from the Integrated Druggable Genome Database (IDGD) Project (Hopkins and Groom, 2002; Russ and Lampel, 2005). After this filtering, 6958 genes were retained.
Drug response data: 101 drugs annotated in the CancerResource database (Ahmed, et al., 2011), measured as -log(GI50).
Pathway association information: retrieved from the KEGG MEDICUS database (Kanehisa, et al., 2010). 58 pathways that are either known to be related to cancer or have drug targets. Among the 6958 genes selected in step (1), 1863 genes are covered by these 58 pathways and constitute the final list of genes in our real data analysis.
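The filtering steps above can be sketched with plain set operations. This is an illustration only: the gene symbols and their list memberships are toy values, the function names are hypothetical, and two points are assumptions rather than statements from the slide. First, "included in two lists" is read as the union of the lists, which is consistent with 6958 retained genes exceeding the 766 in the first list. Second, the logarithm in -log(GI50) is taken as base 10 with GI50 in molar units.

```python
import math

def select_genes(chip_genes, cancer_genes, druggable_genes, pathway_genes):
    """Return (retained, final) gene sets following the slide's two filters."""
    # Assumption: a gene qualifies if it is in either reference list (union).
    candidates = set(cancer_genes) | set(druggable_genes)
    # Keep only genes actually measured on the expression chip.
    retained = set(chip_genes) & candidates
    # Final list: retained genes that are covered by the selected pathways.
    final = retained & set(pathway_genes)
    return retained, final

def neg_log_gi50(gi50_molar):
    """Drug response score; base-10 log is an assumption (slide says only 'log')."""
    return -math.log10(gi50_molar)

# Toy inputs, invented for illustration only.
chip = {"TP53", "EGFR", "BRCA1", "GAPDH", "KRAS"}
cancer_list = {"TP53", "KRAS", "MYC"}
druggable_list = {"EGFR", "KRAS"}
pathway_genes = {"TP53", "EGFR", "ACTB"}

retained, final = select_genes(chip, cancer_list, druggable_list, pathway_genes)
print(sorted(retained), sorted(final))
print(neg_log_gi50(1e-6))  # a lower GI50 (more potent drug) gives a higher score
```

With the real data, `retained` would correspond to the 6958 genes and `final` to the 1863 genes covered by the 58 pathways.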

24 GI50 values

25 Cancer Types
Cancer type             Number of cell lines
Leukemia                6
Non-Small Cell Lung     8
Colon                   7
CNS                     6
Melanoma                9
Ovarian                 7
Renal                   8
Prostate (excluded)     2
Breast                  6

26 Connectivity Map Data: CMap Build 02 (http://www.broadinstitute.org/cmap/) provides public download of genome-wide transcriptional profiles of five human cancer cell lines (MCF7: human breast cancer; HL60: human promyelocytic leukemia; ssMCF7: MCF7 grown in a different vehicle; PC3: human epithelial prostate cancer; SKMEL5: human skin melanoma), both before and after treatment with 1309 distinct bioactive small molecules. We used the data from the HT_HG-U133A array platform, which consists of 4466 expression response profiles representing 1084 different compounds.

27 Integration within the same cancer type; integration across different cancer types

28 One individual with 188-fold coverage


30 Ideal Pipeline: patient diagnosis and sample collection; various types of genomics profiling; driver mutations, disease subtypes; targeted treatments, monitoring, and additional treatments

31 Topics of Interest: data processing; relationships among different data types; tumor heterogeneity; single cell analysis; modeling; targeted treatment; integration over different tumor types; TCGA, ENCODE, GWAS, 1000 Genomes, and others

