Gene expression patterns of breast cancer phenotype revealed by molecular profiling Gabriela Alexe, IBM Research DIMACS Workshop on Detecting and Processing.


1 Gene expression patterns of breast cancer phenotype revealed by molecular profiling. Gabriela Alexe, IBM Research. DIMACS Workshop on Detecting and Processing Regularities in High Throughput Biological Data, June 20-22, 2005

2 Peter L Hammer, Sorin Alexe, David E Axelrod, Endre Boros, Gyan Bhanot, Jorge Lepre, Gustavo Stolovitzky, Ram Ramaswamy, Lillian Chiang, Babu Vengatharagavan, Arnold J Levine, Michael Reiss

3 Outline: Motivation; Finding relevant molecular profiles for breast cancer; Consensus clustering; Multi-gene biomarker selection; Robust pattern-based diagnosis models; Future work

4 Breast cancer incidence: the most commonly diagnosed cancer after nonmelanoma skin cancer and the second leading cause of cancer deaths after lung cancer. US 2005: an estimated 213,000 new breast cancer (BCA) cases will be diagnosed, with 41,000 deaths; 1.2 million worldwide. 1/8 chance of developing BCA; 1/33 chance of death; 5-10% hereditary.

5 Breast cancer: an extensively heterogeneous disease. Both genetic (5-10% BRCA1/2) and non-genetic. Highly variable with regard to pathological and clinical features at the molecular level. Pathological and molecular heterogeneity among different breast cancers and among different areas within individual neoplasms. Personalized treatment: genuine need to identify parameters that might accurately predict the effectiveness of treatment.

6 Stages of breast cancer

7 Histology; hormone receptor status (ER+/-, PR+/-, HER2/neu+/-); DNA cytometry (2/3 aneuploid (less DNA) / diploid); image cytometry (S-phase); genetic mutations. BCA with similar histopathological appearance may have divergent clinical and prognostic courses. Major need to develop specific and alternative therapies.

8 Molecular profiling of BCA: measurement of global expression patterns, towards identification of individual genes that mediate particular aspects of cellular physiology. DNA microarrays: a systematic method to study the mRNA variation between cancer and healthy cells; identification of clinically relevant tumor entities and subclasses; prognostic biomarkers, pathways, and potential therapeutic targets.

9 Molecular profiling of BCA: Perou et al., Nature 2000, "Molecular portraits of human breast tumours" (genome-www.stanford.edu/breast_cancer), identified multiple tumor classes which differ in expression of the ER. Classes: Luminal A, Luminal B, ERBB+, Basal, Normal.

10 Biomedical data: Sorlie et al., PNAS 2003. Breast cancer data (Stanford & Norway): cDNA gene expression data; 122 breast cancer samples; 552 “intrinsic genes”. Hierarchical clustering: 5 major subgroups of samples / genes. Used the same techniques to validate findings on external datasets (van’t Veer, West).

11 Biomedical problem. Sources of noise: data measurements (experimental noise, 7% missing data); data analysis techniques (hierarchical clustering is sensitive to data perturbations); selection of biomarkers (dependent on chip / data analysis technique). Goal: a robust approach to assess molecular profiles.

12 Methods: preprocessing data. Stochastic kNN imputation method, similar to kNN imputation (Troyanskaya et al., 2001). Dynamic programming: ensemble of imputations. 530 genes, 118 samples.
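The slide does not spell out the stochastic variant or the dynamic-programming ensemble, so the following is only a minimal sketch of the plain kNN imputation idea it builds on: a missing value is filled from the genes whose observed expression profiles are closest. The function name `knn_impute`, the use of `None` for missing entries, and the unweighted mean over neighbours are illustrative assumptions, not the method used in the talk.

```python
import math

def knn_impute(matrix, k=3):
    """Fill None entries of a genes x samples matrix from the k nearest genes.

    Hypothetical helper: plain (deterministic) kNN imputation with an
    unweighted mean, not the stochastic variant named on the slide.
    """
    def distance(a, b):
        # Root-mean-square difference over samples observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbours: other genes with an observed value in sample j.
            candidates = [
                (distance(row, other), other[j])
                for g, other in enumerate(matrix)
                if g != i and other[j] is not None
            ]
            candidates = [c for c in candidates if c[0] < math.inf]
            candidates.sort(key=lambda c: c[0])
            nearest = candidates[:k]
            if nearest:
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed

expression = [
    [1.0, 2.0, None],   # gene with a missing measurement
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.2],
    [9.0, 9.0, 9.0],
]
filled = knn_impute(expression, k=2)
```

Distances are computed on the original matrix, so one imputed value never influences another; an ensemble of imputations, as mentioned on the slide, would repeat a randomized version of this step several times.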

13 Consensus clustering: assesses the stability of hierarchical clustering across multiple perturbations of the data by simulated stratified re-sampling of 80% of the cases (Monti et al., 2003). Implemented in GenePattern: http://www.broad.mit.edu/cancer/software/genepattern/ Consensus (core) clusters: maximal bicliques in the agreement matrix (incremental polynomial algorithm, 2004).
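A rough sketch of how an agreement matrix can be built from repeated re-sampling may help here. It assumes simple random subsampling (the stratification mentioned on the slide is not reproduced), Euclidean distance, and a small hand-written average-linkage clustering; `average_linkage` and `agreement_matrix` are hypothetical names and this is not the GenePattern implementation. Entry (i, j) is the fraction of runs in which samples i and j fell into the same cluster, out of the runs in which both were drawn.

```python
import random

def average_linkage(points, k):
    """Agglomerative clustering (average linkage) cut at k clusters; returns labels."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Mean pairwise distance between the two clusters.
                d = sum(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

def agreement_matrix(points, k, runs=50, fraction=0.8, seed=0):
    """Fraction of subsampled runs in which each pair of samples co-clusters."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    size = max(k, round(fraction * n))
    for _ in range(runs):
        subset = rng.sample(range(n), size)
        labels = average_linkage([points[i] for i in subset], k)
        for a, i in enumerate(subset):
            for b, j in enumerate(subset):
                sampled[i][j] += 1
                if labels[a] == labels[b]:
                    together[i][j] += 1
    return [[together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
             for j in range(n)] for i in range(n)]

samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
agreement = agreement_matrix(samples, k=2, runs=30)
```

Values near 1 or 0 indicate pairs whose grouping is stable under perturbation; intermediate values flag unstable assignments. Extracting the core clusters from this matrix (the maximal bicliques step on the slide) is not covered by the sketch.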

14 Agreement matrix

15

16 Finding multi-gene biomarkers: Logical Analysis of Data (Hammer, 1988). Discretization (noise reduction); pattern extraction (efficient algorithms, 2004); model construction (weighted voting); validation; additional information (prominent classes, important features). Applied to various biomedical datasets.
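The first three steps listed above (discretization, pattern extraction, weighted voting) can be sketched in a toy form. Everything here is an illustrative assumption: one fixed cut point per gene, brute-force enumeration of pure patterns of at most two literals, and pattern prevalence used as the voting weight. The efficient pattern-generation algorithms referred to on the slide are not reproduced, and the helper names are hypothetical.

```python
from itertools import combinations

def binarize(samples, cutpoints):
    """Discretize: feature f becomes True when its value exceeds cutpoints[f]."""
    return [tuple(v > c for v, c in zip(s, cutpoints)) for s in samples]

def covers(pattern, sample):
    """A pattern is a tuple of (feature_index, required_value) literals."""
    return all(sample[f] == val for f, val in pattern)

def pure_patterns(target, other, max_degree=2):
    """Conjunctions of up to max_degree literals that cover at least one
    target sample and no sample of the other class, with their prevalence."""
    n_features = len(target[0])
    literals = [(f, v) for f in range(n_features) for v in (True, False)]
    found = []
    for degree in range(1, max_degree + 1):
        for pattern in combinations(literals, degree):
            if len({f for f, _ in pattern}) < degree:
                continue  # skip contradictory literals on the same feature
            hits = sum(covers(pattern, s) for s in target)
            if hits and not any(covers(pattern, s) for s in other):
                found.append((pattern, hits / len(target)))
    return found

def classify(sample, positive, negative):
    """Weighted voting: each covering pattern votes with its prevalence."""
    pos = sum(w for p, w in positive if covers(p, sample))
    neg = sum(w for p, w in negative if covers(p, sample))
    total_pos = sum(w for _, w in positive) or 1.0
    total_neg = sum(w for _, w in negative) or 1.0
    score = pos / total_pos - neg / total_neg
    return "positive" if score > 0 else "negative" if score < 0 else "unclassified"

cutpoints = [0.5, 0.5]  # assumed thresholds for two made-up genes
pos_samples = binarize([(0.9, 0.1), (0.8, 0.2)], cutpoints)
neg_samples = binarize([(0.1, 0.9), (0.2, 0.3)], cutpoints)
pos_patterns = pure_patterns(pos_samples, neg_samples)
neg_patterns = pure_patterns(neg_samples, pos_samples)
prediction = classify(binarize([(0.7, 0.4)], cutpoints)[0], pos_patterns, neg_patterns)
```

In this toy data the pattern "gene 0 high" covers every positive sample and no negative one, which is the kind of multi-gene (here single-gene) rule a pattern-based model votes with.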

17 Patterns, Models, Classifiers: Positive Patterns; Negative Patterns; Model

18 P N

19 Examples of patterns

20 Multi-gene biomarkers, e.g., combinations of genes highly predictive of phenotype, not identified in Sorlie et al. Luminal A: 10; Luminal B: 9; ERBB+: 9; Basal: 12; Normal: 12

21 Extensive multi-gene biomarker annotations: BIOCARTA, KEGG, DAVID, GENMAPP, GOMINER, PANTHER, I-HOP

22 Pattern-based diagnosis model: Prediction; Classification

23 Validation: classification accuracy of pattern models through leave-one-out cross-validation experiments
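Leave-one-out cross-validation holds out each sample in turn, trains on the rest, and counts how often the held-out sample is predicted correctly. The sketch below shows the procedure with a nearest-centroid classifier standing in for the pattern model, which is an assumption made only to keep the example self-contained; the function names are hypothetical.

```python
def leave_one_out_accuracy(samples, labels, train, predict):
    """Train on all samples but one, test on the held-out one, and repeat."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = train(train_x, train_y)
        if predict(model, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)

def train_centroids(xs, ys):
    """Stand-in classifier: one mean vector per class."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        acc = sums.setdefault(y, [0.0] * len(x))
        for d, v in enumerate(x):
            acc[d] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict_centroid(model, x):
    """Assign x to the class whose centroid is closest (squared Euclidean)."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(model[y], x)))

profiles = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
classes = ["A", "A", "A", "B", "B", "B"]
accuracy = leave_one_out_accuracy(profiles, classes, train_centroids, predict_centroid)
```

Any classifier with a train/predict pair can be plugged into `leave_one_out_accuracy`, including a pattern-based model; for an honest estimate, pattern extraction would have to be rerun inside each fold.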

24 Conclusions and future work: provide a robust classification which has significant overlap with previous analyses; clusters Luminal B and ERBB+ are unreliable and need further analyses; sample reproducibility; validate on novel external BCA gene expression datasets.

25 Thank you for your attention

