Dr. Habil Zare, PhD Oncinfo Lab Texas State University 4 Dec 2014

Designing in silico experiments for identification of the most efficient synergistic treatments of cancer

Hypothesis Because cancer is a heterogeneous disease, synergistic medications can treat it better than a single drug.

Rationale: Treatment A, relapse; Treatment B, relapse

Rationale: Treatment A+B, cured

Challenge: Which drug combination to use?
Erlotinib, Lapatinib, Vandetanib, AEW541, …, Topotecan
~100 compounds known to have some effect on cancer
~5000 combinations of 2 compounds (A+B)
~160,000 combinations of 3 compounds (A+B+C)
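
The approximate counts on this slide follow from counting unordered subsets of the ~100 compounds. A minimal check, assuming exactly 100 compounds (the slide only says "~100"):

```python
from math import comb

# Assumption: exactly 100 candidate compounds (the slide says "~100").
n_compounds = 100

pairs = comb(n_compounds, 2)    # unordered pairs A+B
triples = comb(n_compounds, 3)  # unordered triples A+B+C

print(pairs)    # 4950, i.e. ~5000
print(triples)  # 161700, i.e. ~160,000
```

Each additional compound in a combination multiplies the search space by well over an order of magnitude, which is why exhaustive in vitro screening quickly becomes impractical.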

Challenge: Which drug combination to use?
It is not feasible to try all possible combinations of compounds in vivo or in vitro. Can in silico experiments help?

Predicting the compound response
For an in silico experiment to work, we need to develop a reasonable framework to model the underlying biological phenomena.

Bayesian networks are useful in modeling gene interactions

CCLE provides expression data useful for learning the network

First application: Biological interpretation
Gene hubs, causal relationships, interaction between pathways, …

Response to compounds can be incorporated into the model too.
1 Bernoulli variable per compound
The response variable (e.g. activity area)

The dependencies can be learned from CCLE data

Second application: Biological interpretation
Which genes and compounds interact? Which genes does the response depend on?

Predicting response to compounds
Given the response to a selected set of compounds
Objective: To identify compounds which are most promising for further in vitro experiments.

Third application
In silico experiments can predict the best candidates. In vitro experiments can then be designed more efficiently. (Run on 100 compounds instead of 1000.)

Challenge: Which drug combination to use?
Erlotinib, Lapatinib, Vandetanib, AEW541, …, Topotecan
~100 compounds known to have some effect on cancer
~5000 combinations of 2 compounds (A+B)
~160,000 combinations of 3 compounds (A+B+C)

Synergistic treatment
Identifying the best compound combinations in silico Unlike in vitro experiments, running thousands of in silico experiments is feasible.

Fourth and most exciting application
By inference on the Bayesian network, we can rank top compound combinations.
Erlotinib+Lapatinib, Vandetanib+Lapatinib, AEW541+Topotecan, 17-AAG+AZD6244, Erlotinib+AZD6244, Erlotinib+LBW242, …, 100. Topotecan+Paclitaxel
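
To make the idea of ranking combinations by inference concrete, here is a toy sketch. It is not the network learned from CCLE data: the structure (one hidden gene-module state plus a noisy-OR response node), the compound names A to D, and every probability below are invented for illustration.

```python
from itertools import combinations

# Hypothetical toy network (all numbers invented for illustration):
#   G (gene module active) -> R (cell line responds) <- compounds
P_GENE_ACTIVE = 0.6

# Hypothetical probability that each compound triggers a response,
# given the gene state: (module inactive, module active).
EFFECT = {
    "A": (0.10, 0.50),
    "B": (0.20, 0.40),
    "C": (0.30, 0.30),
    "D": (0.05, 0.70),
}


def p_response(compounds, gene_active):
    """Noisy-OR: a response occurs unless every compound fails to act."""
    p_all_fail = 1.0
    for name in compounds:
        p_all_fail *= 1.0 - EFFECT[name][int(gene_active)]
    return 1.0 - p_all_fail


def marginal_response(compounds):
    """Sum out the unobserved gene state (inference by enumeration)."""
    return (P_GENE_ACTIVE * p_response(compounds, True)
            + (1.0 - P_GENE_ACTIVE) * p_response(compounds, False))


# Rank every pair of compounds by predicted probability of response.
ranking = sorted(combinations(EFFECT, 2), key=marginal_response, reverse=True)
for pair in ranking:
    print("+".join(pair), round(marginal_response(pair), 3))
```

With thousands of genes the hidden state can no longer be enumerated exhaustively, so a real implementation would rely on the learned network structure and an approximate or structure-aware inference method.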

Preliminary Results on CCLE data

Learning the Bayesian network
Because learning a Bayesian network with thousands of variables is difficult, our strategy is to first identify gene modules by WGCNA, learn a network for each module, and then combine the networks in a later step.

We identified a “cancer module”.
~Corrected P-value 0.001

Comparison with random selection
Two panels, each annotated with ~corrected P-value 0.01: the cancer module, and a random selection of the same size (1309 genes).

Primary analysis of the gene modules
In contrast to random networks, most identified clusters show significant enrichment in pathways, biological processes, and/or transcriptional motifs. Overall, 47 out of 71 identified modules (66%) are significantly enriched in regulatory motifs of transcription factors. Some of the learned modules are very strongly enriched in important cancer-related terms and/or targets of oncogenic microRNAs. For instance, the cancer module is significantly enriched in targets of 16 miRNAs (corrected P-value < 0.01), including several members of the let-7 family.
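
The slides do not say which enrichment test was used. A common choice for this kind of analysis is the one-sided hypergeometric test, sketched below under that assumption. The function name and the example counts are hypothetical, and the multiple-testing correction behind the "corrected P-value" is not shown.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, module_size, overlap):
    """P(X >= overlap) when module_size genes are drawn without replacement
    from `population` genes, of which `annotated` carry the term."""
    upper = min(annotated, module_size)
    total = comb(population, module_size)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, module_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers, for illustration only: 20 genes in total, 5 of them
# annotated with a term, and a 5-gene module containing 3 annotated genes.
p = hypergeom_enrichment_p(population=20, annotated=5, module_size=5, overlap=3)
print(p)
```

The same function can be applied to a random gene set of equal size, which is the kind of comparison the "random selection" slide makes.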

Examples of learned networks
by Banjo

More results to come; work is in progress.

Former projects

Former project 1 Automatic analysis of flow cytometry data and its application in lymphoma diagnosis Supported by NSERC and MITACS

A highly collaborative study
(The University of British Columbia & BC Cancer Agency) Dr. Andrew Weng, MD, PhD Hematopathologist Dr. Ryan Brinkman, PhD Bioinformatician Dr. Arvind Gupta, PhD Mathematician Dr. Gabor Tardosh, PhD Dr. Valentine Kabanets, PhD Computer Scientist

Goal of study: To reassess flow cytometry data in an unbiased fashion to discover immunophenotypes that improve MCL vs. SLL diagnostic accuracy.
Table: CD5, CD23, and FMC7 status (+/-) for MCL, SLL, and ambiguous (???) cases.
A practicing pathologist would then consider secondary (2º) criteria such as sIg intensity and CD20 intensity.

Methodology: Automated Computational Analysis
Multidimensional clustering finds populations in multidimensional space (FSC, SSC, FL1, FL2, FL3, …, FLn) in MCL and SLL samples; an algorithm then identifies the most informative markers.
Paper: Data reduction for spectral clustering to analyze high throughput flow cytometry data

Methodology: FeaLect identifies the informative features.
Paper: Scoring relevancy of features based on combinatorial analysis of Lasso with application to lymphoma diagnosis

Conclusion: Former project 1
Automatic, unbiased analysis of flow cytometry data reveals that the CD20/CD23 ratio is the most discriminative feature in the differential diagnosis between MCL and SLL.
Paper: Automated Analysis of Multidimensional Flow Cytometry Data Improves Diagnostic Accuracy Between Mantle Cell Lymphoma and Small Lymphocytic Lymphoma

Former project 2 Inferring clonal composition of a breast cancer from multiple tissue samples Supported by NIH

A highly collaborative study
(The University of Washington) Dr. Anthony Blau, MD Oncologist Dr. Junfeng Wang, MD Dr. ChaoZhong Song, MD Lab Scientist Dr. Daniela Witten, PhD Biostatistician Dr. William Noble, PhD Computational Biologist

Traditional concept of a tumor
Schematic figure

Most tumors are heterogeneous Different clones have different genotypes and phenotypes
Schematic figure

It is important to identify the clonal composition
Treatment A Relapse Treatment B Relapse

Our approach to analyze multiple samples from a single tumor

Each sample has different information about the clonal composition
Counting the number of reads which support each mutation PCR Next Gen Sequencing PCR Next Gen Sequencing PCR Next Gen Sequencing
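
As a rough illustration of what counting supporting reads yields, the sketch below turns per-sample read counts into variant allele frequencies. The mutation names and counts are hypothetical, not data from the study.

```python
# Hypothetical read counts per sample:
# (reads supporting the mutation, total reads at that locus)
counts = {
    "mutation_1": [(400, 2000), (100, 2000), (0, 2000)],
    "mutation_2": [(50, 2000), (600, 2000), (900, 2000)],
}


def allele_frequencies(per_sample_counts):
    """Fraction of reads supporting the mutation in each sample."""
    return [
        variant / total if total else float("nan")
        for variant, total in per_sample_counts
    ]


frequencies = {name: allele_frequencies(c) for name, c in counts.items()}
for name, freqs in frequencies.items():
    print(name, freqs)
```

Because each sample yields a different frequency profile across mutations, samples taken from different parts of the tumor carry complementary information about the clonal composition.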

How to validate? Inferring clonal composition of a breast cancer from multiple tissue samples
Diagram: Oncologists, Next-Gen Sequencing Data, EM, Clonal structure. Validate?

Validated by simulations
Usefulness? Diagram: Oncologists, Next-Gen Sequencing Data, EM, Clonal structure (validated by simulations).

Experiment with real data Study on a primary breast cancer
10 breast tumor samples, 1 adjacent normal sample, and 2 samples from the metastatic lymph node

Clone frequencies vary smoothly across the tumor sections
The model doesn’t know anything about the anatomic location of the samples!

Phylogenetic analysis
tells the story of the tumor over time

Five clone solution

Six clone solution is consistent with five-clone solution

Overview of former project: Inferring clonal composition of a breast cancer from multiple tissue samples
Diagram: Oncologists, Next-Gen Sequencing Data, EM, Clonal structure (validated by simulations), Phylogenetic trees, Anatomic variation of clones

Software publicly available

Proposed project based on former experiences: Identifying clonal decomposition using sub-tissues
Diagram: SamSPECTRAL, Leukemia or lymphoma sample, Sort cell populations, Next Gen Sequencing, Clonal analysis

Supplementary slides

Bioinformatics: Computational and statistical analysis of biological data
Biologists Data Genotypes / Phenotypes Results

MCL: A rare type of B-cell lymphoma; 6% of all Non-Hodgkin lymphomas
Survival improved from 3 years to 6 years. Aggressive. CD5 positive. t(11;14) translocation. Over-expression of Cyclin D1. Right: MCL histology

SLL: 5% of Non-Hodgkin B-cell lymphomas, but 30% of leukemias are CLL
Mean survival of 25 years even without treatment. Indolent. CD5 positive.

Exome sequencing followed by targeted capture
Exome sequencing (~100x) identified 281 candidate loci. Targeted capture verified 17 of these sites. Mean coverage ~2000 reads per locus per sample.

Methodology: SamSPECTRAL clusters data.
Data reduction for spectral clustering to analyze high throughput flow cytometry data

Building a generative model
Technical Generate C Parameters

Inference Given the observed counts, how do we infer the clonal structure?
Technical EM Inference C
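
The slides do not spell out the generative model, so the following is only a simplified stand-in: EM for a mixture of binomials that groups mutations by allele frequency within a single sample. The function name, the initial guesses, and the read counts are all hypothetical. Inferring clonal structure across multiple samples involves more structure than this sketch captures.

```python
from math import exp, log


def em_binomial_mixture(variant, total, init_thetas, n_iter=50):
    """EM for a K-component binomial mixture.

    variant[i] of total[i] reads support mutation i; init_thetas are
    initial guesses (strictly between 0 and 1) for the component frequencies.
    """
    thetas = list(init_thetas)
    k = len(thetas)
    weights = [1.0 / k] * k
    eps = 1e-9  # keeps log() finite if a weight or frequency collapses
    resp = []
    for _ in range(n_iter):
        # E-step: responsibility of each component for each mutation.
        # The binomial coefficient is the same for every component and cancels.
        resp = []
        for x, n in zip(variant, total):
            logp = [log(w) + x * log(t) + (n - x) * log(1.0 - t)
                    for w, t in zip(weights, thetas)]
            top = max(logp)
            unnorm = [exp(v - top) for v in logp]
            norm = sum(unnorm)
            resp.append([u / norm for u in unnorm])
        # M-step: re-estimate mixing weights and component frequencies.
        for j in range(k):
            mass = sum(r[j] for r in resp)
            reads = sum(r[j] * n for r, n in zip(resp, total))
            hits = sum(r[j] * x for r, x in zip(resp, variant))
            weights[j] = max(mass / len(variant), eps)
            if reads > 0:
                thetas[j] = min(max(hits / reads, eps), 1.0 - eps)
    return weights, thetas, resp


# Hypothetical counts: six mutations, 1000 reads each.
variant = [98, 105, 110, 395, 410, 402]
total = [1000] * 6
weights, thetas, resp = em_binomial_mixture(variant, total, [0.2, 0.3])
print(weights, thetas)
```

On this toy input the two components settle near allele frequencies of about 0.10 and 0.40, each claiming three mutations. In practice the number of components is itself unknown and has to be chosen, for example by comparing solutions with different numbers of clones.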
