Published by Lee Wade. Modified over 6 years ago.
1
Designing in silico experiments for identification of the most efficient synergistic treatments of cancer
Dr. Habil Zare, PhD, Oncinfo Lab, Texas State University, 4 Dec 2014
2
Hypothesis Because cancer is a heterogeneous disease, synergistic medications can treat it better than a single drug.
3
Rationale: Treatment A → Relapse → Treatment B → Relapse
4
Rationale: Treatment A+B → Cured
5
Challenge: Which drug combination to use?
Erlotinib, Lapatinib, Vandetanib, AEW541, …, Topotecan
~100 compounds known to have some effect on cancer
6
Challenge: Which drug combination to use?
Erlotinib, Lapatinib, Vandetanib, AEW541, …, Topotecan
~100 compounds known to have some effect on cancer
~5000 combinations of 2 compounds (A+B)
7
Challenge: Which drug combination to use?
Erlotinib, Lapatinib, Vandetanib, AEW541, …, Topotecan
~100 compounds known to have some effect on cancer
~5000 combinations of 2 compounds (A+B)
~160,000 combinations of 3 compounds (A+B+C)
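The combination counts quoted on this slide can be checked directly. The short sketch below assumes exactly 100 compounds (the slide says "~100") and counts unordered pairs and triples.

```python
from math import comb

# Assumption: exactly 100 compounds; the slide quotes "~100".
n_compounds = 100

pairs = comb(n_compounds, 2)    # unordered 2-compound combinations
triples = comb(n_compounds, 3)  # unordered 3-compound combinations

print(pairs)    # 4950, i.e. the "~5000" on the slide
print(triples)  # 161700, i.e. the "~160,000" on the slide
```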
8
Challenge: Which drug combination to use?
It is not feasible to try all possible combinations of compounds in vivo or in vitro.
9
Challenge: Which drug combination to use?
It is not feasible to try all possible combinations of compounds in vivo or in vitro.
Can in silico experiments help?
10
Predicting the compound response
For an in silico experiment to work, we need to develop a reasonable framework to model the underlying biological phenomena.
11
Bayesian networks are useful in modeling gene interactions
12
CCLE provides expression data useful for learning the network
13
First application: Biological interpretation
Gene hubs, causal relationships, interaction between pathways, ….
14
Response to compounds can be incorporated into the model too.
15
Response to compounds can be incorporated into the model too.
1 Bernoulli variable per compound
16
Response to compounds can be incorporated into the model too.
1 Bernoulli variable per compound
The response variable (e.g. activity area)
17
The dependencies can be learned from CCLE data
18
Second application: Biological interpretation
Which genes and compounds interact? The response is dependent on which genes?
19
Predicting response to compounds
Objective: To identify the compounds that are most promising for further in vitro experiments.
20
Predicting response to compounds
Given: the response to a selected set of compounds.
Objective: To identify the compounds that are most promising for in vitro experiments.
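As a hedged illustration of this kind of prediction, the sketch below uses a three-node toy network: one hidden gene state that influences the response to two compounds, "A" and "B". The structure and every probability are invented for illustration; they are not learned from CCLE and are not the network from the talk. The code updates the belief about the gene state from an observed response to A, then predicts the response to B.

```python
# Toy network (assumption): hidden gene state G -> response to A, response to B.
# All probabilities below are made-up illustration values, not CCLE estimates.
p_gene_high = 0.3
p_resp_given_gene = {
    "A": {True: 0.9, False: 0.2},  # P(responds to A | G high), P(... | G low)
    "B": {True: 0.8, False: 0.1},
}

def posterior_gene_high(compound, responded):
    """P(G = high | observed response to one compound), by Bayes' rule."""
    def likelihood(gene_high):
        p = p_resp_given_gene[compound][gene_high]
        return p if responded else 1.0 - p
    numerator = p_gene_high * likelihood(True)
    denominator = numerator + (1.0 - p_gene_high) * likelihood(False)
    return numerator / denominator

def predict_response(target, observed_compound, responded):
    """P(responds to target | observed response to another compound)."""
    post = posterior_gene_high(observed_compound, responded)
    table = p_resp_given_gene[target]
    return post * table[True] + (1.0 - post) * table[False]

p_b_prior = p_gene_high * 0.8 + (1.0 - p_gene_high) * 0.1
p_b_given_a = predict_response("B", "A", True)
print(p_b_prior, p_b_given_a)
```

With these invented numbers, observing a response to A raises the predicted probability of a response to B above its prior, which is the sense in which known responses can prioritize compounds for in vitro follow-up.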
21
Third application: In silico experiments can predict the best candidates.
22
Third application: In silico experiments can predict the best candidates. In vitro experiments can be designed more efficiently. (Run on 100 compounds instead of 1000.)
23
Challenge: Which drug combination to use?
Erlotinib, Lapatinib, Vandetanib, AEW541, …, Topotecan
~100 compounds known to have some effect on cancer
~5000 combinations of 2 compounds (A+B)
~160,000 combinations of 3 compounds (A+B+C)
24
Synergistic treatment
Identifying the best compound combinations in silico. Unlike in vitro experiments, running thousands of in silico experiments is feasible.
25
Fourth and most exciting application
By inference on the Bayesian network, we can rank top compound combinations:
Erlotinib+Lapatinib, Vandetanib+Lapatinib, AEW541+Topotecan, 17-AAG+AZD6244, Erlotinib+AZD6244, Erlotinib+LBW242, …, 100. Topotecan+Paclitaxel
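A minimal sketch of how such a ranking could be produced, under loud assumptions: a hidden tumor subtype stands in for the learned network, each compound has an invented probability of working per subtype, and a combination is scored with a noisy-OR (the probability that at least one compound works). The compound names are reused from the slide, but every number is made up, and noisy-OR is a stand-in rather than the scoring used in the talk.

```python
from itertools import combinations

# Hypothetical subtype prior and per-compound response probabilities.
p_subtype = {"s1": 0.5, "s2": 0.3, "s3": 0.2}
p_response = {
    "Erlotinib": {"s1": 0.7, "s2": 0.1, "s3": 0.1},
    "Lapatinib": {"s1": 0.6, "s2": 0.2, "s3": 0.1},
    "Topotecan": {"s1": 0.1, "s2": 0.7, "s3": 0.3},
    "Paclitaxel": {"s1": 0.2, "s2": 0.2, "s3": 0.8},
}

def combo_score(combo):
    """P(at least one compound in combo works), marginalized over subtype."""
    total = 0.0
    for subtype, prior in p_subtype.items():
        p_all_fail = 1.0
        for compound in combo:
            p_all_fail *= 1.0 - p_response[compound][subtype]
        total += prior * (1.0 - p_all_fail)
    return total

# Score every 2-compound combination and sort from best to worst.
ranked = sorted(combinations(p_response, 2), key=combo_score, reverse=True)
for combo in ranked:
    print("+".join(combo), round(combo_score(combo), 3))
```

Because scoring one combination is only a few multiplications, enumerating all pairs or triples of ~100 compounds is cheap, which is the point the slide makes about in silico experiments.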
26
Preliminary Results on CCLE data
27
Learning the Bayesian network
Because learning a Bayesian network with thousands of variables is difficult, our strategy is to first identify gene modules by WGCNA, learn a network for each module, and then combine the networks in a later step.
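The first step of this strategy (grouping genes into modules) can be sketched in a few lines. This is a deliberate simplification, not WGCNA itself: it keeps the soft-thresholded adjacency |cor|^beta but replaces the topological overlap and hierarchical clustering with plain connected components. The expression values, `beta`, and `cutoff` are all hypothetical, and the per-module network learning and the later merge are not shown.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def find_modules(expr, beta=6, cutoff=0.5):
    """Group genes linked by soft-thresholded adjacency |cor|**beta > cutoff."""
    genes = list(expr)
    parent = {g: g for g in genes}

    def find(g):  # union-find root lookup with path halving
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(expr[g], expr[h])) ** beta > cutoff:
                parent[find(g)] = find(h)

    modules = {}
    for g in genes:
        modules.setdefault(find(g), []).append(g)
    return sorted(modules.values())

# Hypothetical expression profiles (gene -> values across 5 cell lines).
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [1.1, 2.1, 2.9, 4.2, 5.0],
    "g3": [5.0, 1.0, 4.0, 2.0, 3.0],
    "g4": [4.8, 1.2, 4.1, 1.9, 3.2],
}
modules = find_modules(expr)
print(modules)
```

Each resulting module is small enough that a structure-learning tool could be run on it separately, which is the motivation given on the slide.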
28
We identified a “cancer module”.
~Corrected P-value 0.001
29
Comparison with random selection
[Figure: Cancer module vs. random selection of the same size (1309 genes); both panels are labeled "~Corrected P-value 0.01"]
30
Primary analysis of the gene modules
In contrast to random networks, most identified clusters show significant enrichment in pathways, biological processes, and/or transcriptional motifs. Overall, 47 out of 71 identified modules (66%) are significantly enriched in regulatory motifs of transcription factors. Some of the learned modules are very strongly enriched in important cancer-related terms and/or targets of oncogenic microRNAs. For instance, the cancer module is significantly enriched in targets of 16 miRNAs (corrected P-value < 0.01), including several members of the let-7 family.
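The slides do not say which enrichment test or multiple-testing correction was used. As one common choice, the sketch below computes a one-sided hypergeometric P-value for the overlap between a module and an annotated gene set, followed by a Bonferroni correction. The gene counts in the example call are invented; only the 47-of-71 fraction comes from the slide.

```python
from math import comb

def hypergeom_enrichment_p(universe, annotated, module_size, overlap):
    """P(X >= overlap) when module_size genes are drawn without replacement
    from a universe containing `annotated` genes of interest."""
    total = comb(universe, module_size)
    upper = min(annotated, module_size)
    tail = sum(
        comb(annotated, k) * comb(universe - annotated, module_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

def bonferroni(p_value, n_tests):
    """Bonferroni-corrected P-value, capped at 1."""
    return min(1.0, p_value * n_tests)

# Hypothetical numbers: 20 genes, 5 annotated, module of 5 with 3 annotated.
p = hypergeom_enrichment_p(universe=20, annotated=5, module_size=5, overlap=3)
p_corrected = bonferroni(p, n_tests=71)

# Fraction of modules reported as motif-enriched on the slide.
percent_enriched = round(100 * 47 / 71)
print(p, p_corrected, percent_enriched)
```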
31
Examples of learned networks
by Banjo
32
Examples of learned networks
by Banjo
33
More results to come; work is in progress…
34
Former projects
35
Former project 1: Automatic analysis of flow cytometry data and its application in lymphoma diagnosis
Supported by NSERC and MITACS
36
A highly collaborative study
(The University of British Columbia & BC Cancer Agency)
Dr. Andrew Weng, MD, PhD, Hematopathologist
Dr. Ryan Brinkman, PhD, Bioinformatician
Dr. Arvind Gupta, PhD, Mathematician
Dr. Gabor Tardosh, PhD
Dr. Valentine Kabanets, PhD, Computer Scientist
37
Goal of Study: To reassess flow cytometry data in an unbiased fashion to discover immunophenotypes that improve MCL vs. SLL diagnostic accuracy.
[Table: CD5, CD23, and FMC7 status (+/-) for MCL, SLL, and unresolved (???) cases]
A practicing pathologist would then consider secondary (2º) criteria such as sIg intensity and CD20 intensity.
38
Methodology: Automated Computational Analysis
[Figure: MCL and SLL samples; multidimensional clustering of populations in multi-dimensional space (FSC, SSC, FL1, FL2, FL3, …, FLn); the algorithm identifies the most informative markers]
39
Methodology: Automated Computational Analysis
[Figure: MCL sample; multidimensional clustering of populations in multi-dimensional space (FSC, SSC, FL1, FL2, FL3, …, FLn)]
Reference: Data reduction for spectral clustering to analyze high throughput flow cytometry data
40
Methodology: FeaLect identifies the informative features.
[Figure: MCL and SLL samples; multidimensional clustering of populations in multi-dimensional space (FSC, SSC, FL1, FL2, FL3, …, FLn); the algorithm identifies the most informative markers]
41
Methodology: FeaLect identifies the informative features.
[Figure: MCL and SLL samples; multidimensional clustering of populations in multi-dimensional space (FSC, SSC, FL1, FL2, FL3, …, FLn)]
Reference: Scoring relevancy of features based on combinatorial analysis of Lasso with application to lymphoma diagnosis
42
Conclusion: Former project 1
Automatic, unbiased analysis of flow cytometry data reveals that the CD20/CD23 ratio is the most discriminative feature in the differential diagnosis between MCL and SLL.
Reference: Automated Analysis of Multidimensional Flow Cytometry Data Improves Diagnostic Accuracy Between Mantle Cell Lymphoma and Small Lymphocytic Lymphoma
43
Former project 2: Inferring clonal composition of a breast cancer from multiple tissue samples
Supported by NIH
44
A highly collaborative study
(The University of Washington)
Dr. Anthony Blau, MD, Oncologist
Dr. Junfeng Wang, MD
Dr. ChaoZhong Song, MD, Lab Scientist
Dr. Daniela Witten, PhD, Biostatistician
Dr. William Noble, PhD, Computational Biologist
45
Traditional concept of a tumor
Schematic figure
46
Most tumors are heterogeneous. Different clones have different genotypes and phenotypes.
Schematic figure
47
It is important to identify the clonal composition
Treatment A → Relapse → Treatment B → Relapse
48
Our approach to analyze multiple samples from a single tumor
49
Our approach to analyze multiple samples from a single tumor
50
Each sample has different information about the clonal composition
Counting the number of reads that support each mutation
[Figure: each sample goes through PCR followed by Next Gen Sequencing]
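Read counting of this kind is usually summarized as a variant allele frequency: reads supporting the mutation divided by all reads covering the locus. The sketch below assumes a simple nested dictionary of counts; the sample names, mutation names, and counts are hypothetical.

```python
def variant_allele_frequency(alt_reads, total_reads):
    """Fraction of reads at a locus that support the mutation."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads

# Hypothetical counts: sample -> mutation -> (supporting reads, total reads).
counts = {
    "sample_1": {"mut_a": (400, 2000), "mut_b": (100, 2000)},
    "sample_2": {"mut_a": (420, 2000), "mut_b": (600, 2000)},
}

vaf = {
    sample: {
        mutation: variant_allele_frequency(alt, total)
        for mutation, (alt, total) in mutations.items()
    }
    for sample, mutations in counts.items()
}
print(vaf)
```

A mutation whose frequency differs strongly between samples (like the hypothetical `mut_b`) is the kind of signal that lets each sample contribute different information about the clonal composition.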
51
How to validate? Inferring clonal composition of a breast cancer from multiple tissue samples
[Figure: Next-Gen Sequencing Data; EM; Clonal structure; Oncologists; Validate?]
52
Usefulness? Inferring clonal composition of a breast cancer from multiple tissue samples
[Figure: Next-Gen Sequencing Data; EM; Clonal structure; Validated by simulations; Oncologists (?)]
53
Experiment with real data: Study on a primary breast cancer
10 breast tumor samples
1 adjacent normal
2 samples from the metastatic lymph node
(Note: maybe move this slide and the next one, with a plot of frequencies, to earlier.)
54
Clone frequencies vary smoothly across the tumor sections
The model doesn’t know anything about the anatomic location of the samples!
55
Clone frequencies vary smoothly across the tumor sections
56
Phylogenetic analysis
tells the story of the tumor over time
57
Five clone solution
58
Six clone solution is consistent with five-clone solution
59
Overview of former project: Inferring clonal composition of a breast cancer from multiple tissue samples
[Figure: Next-Gen Sequencing Data; EM; Clonal structure; Validated by simulations; Anatomic variation of clones; Phylogenetic trees; Oncologists]
60
Software publicly available
61
Proposed project based on former experiences: Identifying clonal decomposition using sub-tissues
[Figure: Leukemia or lymphoma sample; SamSPECTRAL; Sort cell populations; Next Gen Sequencing; Clonal analysis]
62
Supplementary slides
63
Bioinformatics: Computational and statistical analysis of biological data
[Figure: Biologists; Data; Genotypes / Phenotypes; Results]
64
MCL: A rare type of B-cell lymphoma; 6% of all Non-Hodgkin lymphomas
Survival improved from 3 years to 6 years
CD5 positive
Right: MCL histology
65
MCL: A rare type of B-cell lymphoma; 6% of all Non-Hodgkin lymphomas
Survival improved from 3 years to 6 years
Aggressive
CD5 positive
t(11;14) translocation
Over-expression of Cyclin D1
66
SLL: 5% of Non-Hodgkin B-cell lymphomas, but 30% of leukemias are CLL
Mean survival of 25 years even without treatment
Indolent
CD5 positive
67
Exome sequencing followed by targeted capture
Exome sequencing (~100x) identified 281 candidate loci. Targeted capture verified 17 of these sites. Mean coverage ~2000 reads per locus per sample.
68
Methodology: SamSPECTRAL clusters data.
Reference: Data reduction for spectral clustering to analyze high throughput flow cytometry data
69
Building a generative model
Technical Generate C Parameters
70
Inference Given the observed counts, how do we infer the clonal structure?
Technical EM Inference C
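The slides name EM but do not give the generative model, so the following is only a minimal sketch under a stated assumption: read counts for each mutation are binomial, and mutations fall into a small number of clusters that share a frequency. This single-sample binomial mixture is much simpler than a multi-sample clonal model and is not the method from the talk. The function name, counts, and starting frequencies are hypothetical.

```python
from math import exp, log

def em_binomial_mixture(alt, depth, init_freqs, n_iter=50):
    """Fit cluster frequencies and weights to (alt, depth) read counts by EM."""
    k = len(init_freqs)
    freqs = list(init_freqs)
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: responsibility of each cluster for each mutation.
        # The binomial coefficient cancels across clusters, so it is omitted.
        resp = []
        for a, d in zip(alt, depth):
            logs = [log(w) + a * log(f) + (d - a) * log(1.0 - f)
                    for w, f in zip(weights, freqs)]
            top = max(logs)  # subtract the max for numerical stability
            unnorm = [exp(v - top) for v in logs]
            z = sum(unnorm)
            resp.append([u / z for u in unnorm])
        # M-step: re-estimate weights and frequencies from responsibilities.
        for j in range(k):
            rj = [r[j] for r in resp]
            weights[j] = max(sum(rj) / len(alt), 1e-12)
            num = sum(r * a for r, a in zip(rj, alt))
            den = sum(r * d for r, d in zip(rj, depth))
            if den > 0:
                freqs[j] = min(max(num / den, 1e-6), 1.0 - 1e-6)
    return freqs, weights, resp

# Hypothetical counts: six mutations at depth 2000, two underlying frequencies.
alt = [200, 210, 190, 600, 620, 580]
depth = [2000] * 6
freqs, weights, resp = em_binomial_mixture(alt, depth, init_freqs=[0.05, 0.4])
print(freqs, weights)
```

On this toy input the two estimated frequencies settle near the pooled frequencies of the two groups of mutations. As with any EM fit, the result depends on the starting values and the chosen number of clusters.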