Dr. Habil Zare, PhD Oncinfo Lab Texas State University 4 Dec 2014


1 Dr. Habil Zare, PhD Oncinfo Lab Texas State University 4 Dec 2014
Designing in silico experiments for identification of the most efficient synergistic treatments of cancer Dr. Habil Zare, PhD Oncinfo Lab Texas State University 4 Dec 2014

2 Hypothesis Because cancer is a heterogeneous disease, synergistic medications can treat it better than a single drug.

3 Rationale Treatment A Relapse Treatment B Relapse

4 Rationale Cured Treatment A+B

5 Which drug combination to use?
Challenge Which drug combination to use? Erlotinib Lapatinib Vandetanib AEW541 . .. Topotecan ~100 compounds known to have some effect on cancer

6 Which drug combination to use?
Challenge Which drug combination to use? Erlotinib Lapatinib Vandetanib AEW541 . .. Topotecan ~5000 combinations of 2 compounds A+B ~100 compounds known to have some effect on cancer

7 Which drug combination to use?
Challenge Which drug combination to use? Erlotinib Lapatinib Vandetanib AEW541 . .. Topotecan ~5000 combinations of 2 compounds A+B A+B +C ~160,000 combinations of 3 compounds ~100 compounds known to have some effect on cancer
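
The rounded counts on this slide can be checked with binomial coefficients. A minimal sketch, assuming exactly 100 candidate compounds (the slide says "~100") and unordered combinations without repetition:

```python
from math import comb

n_compounds = 100  # assumption: the slide's "~100 compounds" taken as exactly 100

pairs = comb(n_compounds, 2)    # unordered pairs A+B
triples = comb(n_compounds, 3)  # unordered triples A+B+C

print(f"2-compound combinations: {pairs}")
print(f"3-compound combinations: {triples}")
```

The exact values are close to the slide's "~5000" and "~160,000", and the count grows quickly with each added compound, which is why exhaustive in vitro screening becomes impractical.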

8 Which drug combination to use?
Challenge Which drug combination to use? It is not feasible to try all possible combinations of compounds in vivo or in vitro.

9 Which drug combination to use?
Challenge Which drug combination to use? It is not feasible to try all possible combinations of compounds in vivo or in vitro. Can in silico experiments help?

10 Predicting the compound response
For an in silico experiment to work, we need to develop a reasonable framework to model the underlying biological phenomena.

11 Bayesian networks are useful in modeling gene interactions

12 CCLE provides expression data useful for learning the network

13 First application Biological interpretation
Gene hubs, causal relationships, interaction between pathways, ….

14 Response to compounds can be
incorporated into the model too.

15 Response to compounds can be incorporated into the model too.
1 Bernoulli variable per compound

16 Response to compounds can be incorporated into the model too.
1 Bernoulli variable per compound The response variable (e.g. activity area)
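
To make the idea concrete, here is a toy sketch of a network with one gene node, one Bernoulli compound indicator, and a binary response node. Everything in it is hypothetical: the single gene, the probabilities, and the binarized response (the slide mentions activity area, which is continuous). It only illustrates how a response probability can be read off such a model by summing over the unobserved gene state.

```python
# Hypothetical toy network: Gene -> Response <- Compound.
# All numbers are made up for illustration; none come from CCLE.
P_GENE_HIGH = 0.3  # P(gene is highly expressed)

# P(response = 1 | gene state, compound given), keyed by (gene, compound)
P_RESPONSE = {
    (0, 0): 0.05,
    (0, 1): 0.20,
    (1, 0): 0.05,
    (1, 1): 0.80,
}

def prob_response(compound, gene=None):
    """Return P(response=1 | compound) or, if gene is observed,
    P(response=1 | compound, gene)."""
    if gene is not None:
        return P_RESPONSE[(gene, compound)]
    # Gene state unobserved: marginalize over it.
    total = 0.0
    for g in (0, 1):
        p_gene = P_GENE_HIGH if g == 1 else 1.0 - P_GENE_HIGH
        total += p_gene * P_RESPONSE[(g, compound)]
    return total

print(prob_response(compound=1))          # gene state unknown
print(prob_response(compound=1, gene=1))  # gene observed as high
```

In a real model the conditional probabilities would be learned from data rather than written by hand, and the response would depend on many genes.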

17 The dependencies can be learned from CCLE data

18 Second application Biological interpretation
Which genes and compounds interact? The response is dependent on which genes?

19 Predicting response to compounds
Objective To identify compounds which are most promising for further in vitro experiments.

20 Predicting response to compounds
Given the response to a selected set of compounds Objective To identify compounds which are most promising for in vitro experiments

21 Third application In silico experiments can predict the best candidates

22 Third application In silico experiments can predict the best candidates In vitro experiments can be designed more efficiently. (Run on 100 compounds instead of 1000.)

23 Which drug combination to use?
Challenge Which drug combination to use? Erlotinib Lapatinib Vandetanib AEW541 . .. Topotecan A+B ~5000 combinations of 2 compounds A+B +C ~160,000 combinations of 3 compounds ~100 compounds known to have some effect on cancer

24 Synergistic treatment
Identifying the best compound combinations in silico Unlike in vitro experiments, running thousands of in silico experiments is feasible.

25 Fourth and most exciting application
Erlotinib+Lapatinib Vandetanib+Lapatinib AEW541+Topotecan 17-AAG+AZD6244 Erlotinib+AZD6244 Erlotinib+LBW242 ... 100. Topotecan+Paclitaxel By inference on the Bayesian network, we can rank the top compound combinations.
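
A minimal sketch of what ranking combinations by inference could look like, under strong assumptions. The two tumor subtypes, their prior, the per-compound efficacies, and the noisy-OR rule for combining compounds are all hypothetical and are not taken from the talk or from CCLE. The compound names are reused from the slide only as labels.

```python
from itertools import combinations

# Hypothetical latent subtype prior (models tumor heterogeneity).
SUBTYPE_PRIOR = {"s1": 0.6, "s2": 0.4}

# Hypothetical P(response | subtype, only this compound given).
EFFICACY = {
    "Erlotinib":  {"s1": 0.7, "s2": 0.1},
    "Lapatinib":  {"s1": 0.6, "s2": 0.2},
    "Topotecan":  {"s1": 0.1, "s2": 0.8},
    "Paclitaxel": {"s1": 0.2, "s2": 0.5},
}

def response_probability(compounds):
    """P(response) for a set of compounds, marginalizing over the subtype.
    Compounds are combined with a noisy-OR (an assumption of this sketch)."""
    total = 0.0
    for subtype, prior in SUBTYPE_PRIOR.items():
        p_no_response = 1.0
        for name in compounds:
            p_no_response *= 1.0 - EFFICACY[name][subtype]
        total += prior * (1.0 - p_no_response)
    return total

def rank_combinations(k=2):
    """Score every k-compound combination and sort best first."""
    scored = [(response_probability(combo), combo)
              for combo in combinations(sorted(EFFICACY), k)]
    return sorted(scored, reverse=True)

ranking = rank_combinations(2)
for score, combo in ranking:
    print(f"{'+'.join(combo):25s} {score:.3f}")
```

With these made-up numbers, the top pair joins two compounds that act on different subtypes, which mirrors the hypothesis from the start of the talk. A noisy-OR cannot express true synergy or antagonism; a learned network would encode those in its conditional probability tables.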

26 Preliminary Results on CCLE data

27 Learning the Bayesian network
Because learning a Bayesian network with thousands of variables is difficult, our strategy is to first identify gene modules by WGCNA, learn a network for each module, and then combine the networks in a later step.
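
The module-identification step can be illustrated with a simplified stand-in. WGCNA itself is an R package that clusters genes hierarchically on a topological overlap measure. The sketch below keeps only its soft-threshold adjacency, |correlation| raised to a power beta, and replaces the clustering with connected components of a thresholded graph. The gene names, expression values, beta, and cutoff are hypothetical.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def find_modules(expr, beta=6, cutoff=0.5):
    """Group genes whose soft-threshold adjacency |r|**beta >= cutoff.
    Connected components stand in for WGCNA's hierarchical clustering."""
    genes = sorted(expr)
    neighbours = {g: set() for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(expr[g], expr[h])) ** beta >= cutoff:
                neighbours[g].add(h)
                neighbours[h].add(g)
    modules, seen = [], set()
    for g in genes:
        if g in seen:
            continue
        seen.add(g)
        stack, module = [g], []
        while stack:  # depth-first walk over one component
            current = stack.pop()
            module.append(current)
            for nb in neighbours[current] - seen:
                seen.add(nb)
                stack.append(nb)
        modules.append(sorted(module))
    return modules

# Hypothetical expression profiles across five samples.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [1.1, 2.0, 2.9, 4.2, 5.1],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
    "geneD": [4.8, 1.2, 4.1, 2.1, 2.9],
}
modules = find_modules(expr)
print(modules)
```

Each resulting module is small enough that a separate network can be learned for it, which is the point of the divide-and-combine strategy described on the slide.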

28 We identified a “cancer module”.
~Corrected P-value 0.001

29 Comparison with random selection
~Corrected P-value 0.01 Cancer module ~Corrected P-value 0.01 Random selection of the same size (1309 genes)

30 Primary analysis of the gene modules
In contrast to random networks, most identified clusters show significant enrichment in pathways, biological processes, and/or transcriptional motifs. Overall, 47 out of 71 identified modules (66%) are significantly enriched in regulatory motifs of transcription factors. Some of the learned modules are very strongly enriched in important cancer-related terms and/or targets of oncogenic microRNAs. For instance, the cancer module is significantly enriched in targets of 16 miRNAs (corrected P-value < 0.01), including several members of the let-7 family.
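
The slide does not say which enrichment test or multiple-testing correction was used. As an illustration only, the sketch below assumes a one-sided hypergeometric test with a Bonferroni correction, a common choice for this kind of analysis. The function names and the small example numbers are hypothetical.

```python
from fractions import Fraction
from math import comb

def enrichment_p_value(n_background, n_annotated, module_size, n_overlap):
    """One-sided hypergeometric tail: probability of seeing at least
    n_overlap annotated genes in a random module of the given size."""
    upper = min(n_annotated, module_size)
    tail = sum(
        comb(n_annotated, i) * comb(n_background - n_annotated, module_size - i)
        for i in range(n_overlap, upper + 1)
    )
    return float(Fraction(tail, comb(n_background, module_size)))

def bonferroni(p_value, n_tests):
    """Bonferroni-corrected P-value, capped at 1."""
    return min(1.0, p_value * n_tests)

# Tiny hypothetical example: 10 background genes, 4 annotated,
# a module of 3 genes of which 2 are annotated.
p = enrichment_p_value(10, 4, 3, 2)
print(p, bonferroni(p, 71))

# The percentage quoted on the slide: 47 of 71 modules.
print(round(100 * 47 / 71))
```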

31 Examples of learned networks
by Banjo

32 Examples of learned networks
by Banjo

33 More results to come; work is in progress.

34 Former projects

35 Former project 1 Automatic analysis of flow cytometry data and its application in lymphoma diagnosis Supported by NSERC and MITACS

36 A highly collaborative study
(The University of British Columbia & BC Cancer Agency) Dr. Andrew Weng, MD, PhD Hematopathologist Dr. Ryan Brinkman, PhD Bioinformatician Dr. Arvind Gupta, PhD Mathematician Dr. Gabor Tardosh, PhD Dr. Valentine Kabanets, PhD Computer Scientist

37 Goal of Study: To reassess flow cytometry data in an unbiased fashion to discover immunophenotypes that improve MCL vs. SLL diagnostic accuracy CD5 CD23 FMC7 MCL MCL + - SLL + - ??? + SLL practicing pathologist would then consider secondary criteria such as… ??? 2º criteria: sIg intensity CD20 intensity

38 algorithm identifies the most informative markers
Methodology: Automated Computational Analysis Multidimensional Clustering populations in multidimensional space MCL FSC SSC FL1 FL2 FL3 FLn Multidimensional Clustering algorithm identifies the most informative markers SLL

39 populations in multidimensional space
Methodology: Automated Computational Analysis Multidimensional Clustering populations in multidimensional space MCL FSC SSC FL1 FL2 FL3 FLn Data reduction for spectral clustering to analyze high throughput flow cytometry data

40 algorithm identifies the most informative markers
Methodology: FeaLect identifies the informative features. Multidimensional Clustering populations in multidimensional space MCL FSC SSC FL1 FL2 FL3 FLn Multidimensional Clustering algorithm identifies the most informative markers SLL

41 populations in multidimensional space
Methodology: FeaLect identifies the informative features. Multidimensional Clustering populations in multidimensional space MCL FSC SSC FL1 FL2 FL3 FLn Multidimensional Clustering Scoring relevancy of features based on combinatorial analysis of Lasso with application to lymphoma diagnosis SLL

42 Conclusion Former project 1
Automatic, unbiased analysis of flow cytometry data reveals that the CD20/CD23 ratio is the most discriminative feature in the differential diagnosis between MCL and SLL. Automated Analysis of Multidimensional Flow Cytometry Data Improves Diagnostic Accuracy Between Mantle Cell Lymphoma and Small Lymphocytic Lymphoma

43 Former project 2 Inferring clonal composition of a breast cancer from multiple tissue samples Supported by NIH

44 A highly collaborative study
(The University of Washington) Dr. Anthony Blau, MD Oncologist Dr. Junfeng Wang, MD Dr. ChaoZhong Song, MD Lab Scientist Dr. Daniela Witten, PhD Biostatistician Dr. William Noble, PhD Computational Biologist

45 Traditional concept of a tumor
Schematic figure

46 Most tumors are heterogeneous Different clones have different genotypes and phenotypes
Schematic figure

47 It is important to identify the clonal composition
Treatment A Relapse Treatment B Relapse

48 Our approach to analyze multiple samples from a single tumor

49 Our approach to analyze multiple samples from a single tumor

50 Each sample has different information about the clonal composition
Counting the number of reads which support each mutation PCR Next Gen Sequencing PCR Next Gen Sequencing PCR Next Gen Sequencing

51 How to validate? Inferring clonal composition of a breast cancer from multiple tissue samples
Oncologists Validate? EM Next-Gen Sequencing Data Clonal structure

52 Validated by simulations
Usefulness? Inferring clonal composition of a breast cancer from multiple tissue samples Oncologists ? EM Validated by simulations Next-Gen Sequencing Data Clonal structure

53 Experiment with real data Study on a primary breast cancer
10 breast tumor samples 1 adjacent normal 2 samples from the metastatic lymph node Maybe move this slide and the next one, with a plot of frequencies, to earlier.

54 Clone frequencies vary smoothly across the tumor sections
The model doesn’t know anything about the anatomic location of the samples!

55 Clone frequencies vary smoothly across the tumor sections

56 Phylogenetic analysis
tells the story of the tumor over time

57 Five clone solution

58 Six clone solution is consistent with five-clone solution

59 Anatomic variation of clones Validated by simulations
Overview of former project Inferring clonal composition of a breast cancer from multiple tissue samples Oncologists Anatomic variation of clones Phylogenetic trees EM Validated by simulations Next-Gen Sequencing Data Clonal structure

60 Software publicly available

61 Leukemia or lymphoma sample
Proposed project based on former experiences: Identifying clonal decomposition using sub-tissues SamSPECTRAL Leukemia or lymphoma sample Sort cell populations Next Gen Sequencing Clonal analysis

62 Supplementary slides

63 Bioinformatics: Computational and statistical analysis of biological data
Biologists Data Genotypes / Phenotypes Results

64 MCL A rare type of B-cell lymphoma; 6% of all Non-Hodgkin
Survival improved from 3 years to 6 years CD5 positive Right: MCL histology

65 MCL A rare type of B-cell lymphoma; 6% of all Non-Hodgkin
Survival improved from 3 years to 6 years. Aggressive CD5 positive t(11;14) translocation Over-expression of Cyclin D1

66 SLL 5% of Non-Hodgkin B-cell lymphoma, but 30% of leukemias are CLL
Mean survival of 25 years even without treatment. Indolent CD5 positive

67 Exome sequencing followed by targeted capture
Exome sequencing (~100x) identified 281 candidate loci. Targeted capture verified 17 of these sites. Mean coverage ~2000 reads per locus per sample.

68 Methodology: SamSPECTRAL clusters data.
Data reduction for spectral clustering to analyze high throughput flow cytometry data

69 Building a generative model
Technical Generate C Parameters

70 Inference Given the observed counts, how do we infer the clonal structure?
Technical EM Inference C
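
The slides give only the outline of the inference step (a generative model plus EM on read counts), so the following is a generic sketch and not the model from the talk. It assumes a single sample, a fixed number of clusters, and that the variant-read count of each mutation is binomial given its cluster's allele frequency. The read counts are hypothetical, with a depth of 2000 chosen to match the coverage mentioned on the exome-sequencing slide.

```python
from math import comb, exp, log

def binom_logpmf(alt, depth, freq):
    """Log binomial probability of `alt` variant reads out of `depth`."""
    return log(comb(depth, alt)) + alt * log(freq) + (depth - alt) * log(1 - freq)

def em_binomial_mixture(counts, freqs, weights, n_iter=50):
    """Fit cluster frequencies and weights to (alt, depth) pairs with EM."""
    freqs, weights = list(freqs), list(weights)
    resp = []
    for _ in range(n_iter):
        # E-step: responsibility of each cluster for each mutation.
        resp = []
        for alt, depth in counts:
            logp = [log(w) + binom_logpmf(alt, depth, f)
                    for w, f in zip(weights, freqs)]
            top = max(logp)  # subtract the max for numerical stability
            p = [exp(v - top) for v in logp]
            norm = sum(p)
            resp.append([v / norm for v in p])
        # M-step: re-estimate weights and frequencies.
        for k in range(len(freqs)):
            r_k = [r[k] for r in resp]
            weights[k] = max(sum(r_k) / len(counts), 1e-12)
            alt_k = sum(r * a for r, (a, _) in zip(r_k, counts))
            depth_k = sum(r * d for r, (_, d) in zip(r_k, counts))
            if depth_k > 0:
                freqs[k] = min(max(alt_k / depth_k, 1e-6), 1 - 1e-6)
    return freqs, weights, resp

# Hypothetical (variant reads, total reads) for six mutations.
counts = [(200, 2000), (210, 2000), (190, 2000),
          (800, 2000), (820, 2000), (780, 2000)]
freqs, weights, resp = em_binomial_mixture(counts, freqs=[0.2, 0.6],
                                           weights=[0.5, 0.5])
print([round(f, 3) for f in freqs], [round(w, 3) for w in weights])
```

On this toy input the mutations separate into two groups with different allele frequencies. Jointly modeling several samples from the same tumor, as in the study, would constrain the solution much more than a single sample does.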

