Gene expression patterns of breast cancer phenotype revealed by molecular profiling Gabriela Alexe, IBM Research DIMACS Workshop on Detecting and Processing.

Gene expression patterns of breast cancer phenotype revealed by molecular profiling
Gabriela Alexe, IBM Research
DIMACS Workshop on Detecting and Processing Regularities in High Throughput Biological Data, June , 2005

Peter L Hammer, Sorin Alexe, David E Axelrod, Endre Boros, Gyan Bhanot, Jorge Lepre, Gustavo Stolovitzky, Ram Ramaswamy, Lillian Chiang, Babu Vengatharagavan, Arnold J Levine, Michael Reiss

Outline
- Motivation
- Finding relevant molecular profiles for breast cancer
- Consensus clustering
- Multi-gene biomarker selection
- Robust pattern-based diagnosis models
- Future work

Breast cancer incidence
- Most commonly diagnosed cancer after nonmelanoma skin cancer
- Second leading cause of cancer deaths after lung cancer
- US 2005: estimated 213,000 new BCA cases will be diagnosed, and 41,000 deaths / 1.2 million worldwide
- 1/8 chance to develop BCA
- 1/33 chance of death
- 5-10% hereditary

Breast cancer: extensively heterogeneous disease
- both genetic (5-10% BRCA1/2) and non-genetic
- highly variable with regard to pathological and clinical features at molecular level
- pathological and molecular heterogeneity among
  - different breast cancers
  - different areas within individual neoplasms
- personalized treatment: genuine need to identify parameters that might accurately predict the effectiveness of treatment

Stages of breast cancer

- Histology
- Hormone receptor status: ER+/-, PR+/-, HER2/neu+/-
- DNA cytometry: 2/3 aneuploid (less DNA) / diploid
- Image cytometry: S-phase
- Genetic mutations
- BCA with similar histopathological appearance may have divergent clinical and prognostic courses
- Major need to develop specific and alternative therapies

Molecular profiling of BCA
- Measurement of global expression patterns towards identification of individual genes that mediate particular aspects of cellular physiology
- DNA microarrays
  - systematic method to study the mRNA variation between cancer/healthy cells
  - identification of clinically relevant tumor entities and subclasses
  - prognostic biomarkers / pathways / potential therapeutic targets

Molecular profiling of BCA
- Perou et al., Nature 2000: "Molecular portraits of human breast tumours"
- genome- identified multiple tumor classes which differ in expression of the ER
- Luminal A, Luminal B, ERBB+, Basal, Normal

Biomedical data
- Sorlie et al., PNAS 2003
- Breast cancer data (Stanford & Norway)
- cDNA gene expression data
- 122 breast cancer samples
- 552 "intrinsic genes"
- Hierarchical clustering
- 5 major subgroups of samples / genes
- Used same techniques to validate findings on external datasets (van't Veer, West)

Biomedical problem
Sources of noise
- data measurements: experimental noise, 7% missing data
- data analysis techniques: hierarchical clustering sensitive to data perturbations
- selection of biomarkers: dependent on chip / data analysis technique
Goal
- Robust approach to assess molecular profiles

Methods: Preprocessing data
- Stochastic kNN imputation method, similar to kNN imputation (Troyanskaya et al., 2001)
- Dynamic programming: ensemble of imputations
- 530 genes, 118 samples
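
The slide does not spell out the stochastic variant or the dynamic-programming ensemble, so the following is only a minimal sketch of plain kNN imputation in the spirit of Troyanskaya et al. The function name `knn_impute`, the distance measure, and the toy matrix (rows as genes, columns as samples, `None` for a missing value) are assumptions made for illustration.

```python
import math

def knn_impute(matrix, k=2):
    """Fill None entries with the mean of the k nearest rows (genes).

    Distance between two rows is the root-mean-square difference over
    the columns observed in both rows (an assumed choice).
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(row) for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbours: other rows that have a value in column j.
            candidates = [
                (distance(row, other), other[j])
                for m, other in enumerate(matrix)
                if m != i and other[j] is not None
            ]
            candidates.sort(key=lambda pair: pair[0])
            nearest = [v for d, v in candidates[:k] if d < math.inf]
            if nearest:
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed

# Hypothetical toy data: 4 genes x 3 samples, one missing measurement.
expression = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [0.9, 1.9, 2.8],
    [9.0, 8.0, 7.0],
]
filled = knn_impute(expression, k=2)
```

A stochastic variant could, for example, sample neighbours at random in proportion to their similarity and repeat the procedure to obtain an ensemble of imputed matrices; that reading is a guess, not something the slide confirms.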

Consensus clustering
- Assesses the stability of hierarchical clustering across multiple perturbations of the data by simulated stratified re-sampling of 80% of the cases (Monti et al., 2003)
- Implemented in GenePattern
- Consensus (core) clusters: maximal bicliques in agreement matrix (incremental polynomial algorithm, 2004)
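
The resampling idea can be sketched as follows. This is an illustrative toy, not the GenePattern implementation: it uses simple random subsampling rather than the stratified scheme named on the slide, a naive average-linkage clustering, and it stops at the agreement matrix without extracting maximal bicliques. All function names and the toy samples are assumptions.

```python
import random

def average_linkage(points, k):
    """Naive agglomerative clustering of points into k clusters of indices."""
    clusters = [[i] for i in range(len(points))]

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(points[a], points[b])) ** 0.5

    def linkage(c1, c2):
        return sum(dist(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        # Merge the pair of clusters with the smallest average distance.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def agreement_matrix(samples, k, n_runs=50, fraction=0.8, seed=0):
    """Fraction of runs in which two samples, when both drawn, share a cluster."""
    rng = random.Random(seed)
    n = len(samples)
    together = [[0] * n for _ in range(n)]
    drawn = [[0] * n for _ in range(n)]
    for _ in range(n_runs):
        subset = rng.sample(range(n), int(round(fraction * n)))
        for cluster in average_linkage([samples[i] for i in subset], k):
            members = [subset[c] for c in cluster]
            for a in members:
                for b in members:
                    together[a][b] += 1
        for a in subset:
            for b in subset:
                drawn[a][b] += 1
    return [
        [together[a][b] / drawn[a][b] if drawn[a][b] else 0.0 for b in range(n)]
        for a in range(n)
    ]

# Hypothetical toy samples: two well-separated groups of five.
samples = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2), (0.0, 0.4),
           (5.0, 5.1), (5.2, 5.0), (5.1, 5.3), (5.3, 5.2), (5.0, 5.4)]
agreement = agreement_matrix(samples, k=2)
```

Entries near 1 mark pairs that cluster together in almost every perturbation, entries near 0 mark pairs that almost never do, and intermediate values flag unstable assignments.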

Agreement matrix

Finding multi-gene biomarkers
Logical Analysis of Data, Hammer 1988
- Discretization (noise reduction)
- Pattern extraction (efficient algorithms, 2004)
- Model construction (weighted voting)
- Validation
- Additional information (prominent classes, important features)
- Applied to various biomedical datasets
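
The first three steps above can be illustrated with a toy sketch. It is not the authors' software: the cutpoints are supplied by hand rather than chosen from the data, patterns are found by brute-force enumeration of small conjunctions (no primality filtering), and voting weights are simply the fraction of same-class cases a pattern covers. All names and data here are hypothetical.

```python
from itertools import combinations

def binarize(rows, cutpoints):
    """Discretization: 1 if a feature exceeds its cutpoint, else 0."""
    return [tuple(int(v > c) for v, c in zip(row, cutpoints)) for row in rows]

def covers(pattern, row):
    """A pattern is a tuple of (feature_index, required_bit) literals."""
    return all(row[i] == bit for i, bit in pattern)

def pure_patterns(target, other, max_degree=2):
    """Conjunctions covering at least one target row and no row of the other class."""
    n_features = len(target[0])
    found = []
    for degree in range(1, max_degree + 1):
        for features in combinations(range(n_features), degree):
            # Only bit assignments seen in a target row can cover a target row.
            for bits in sorted({tuple(row[i] for i in features) for row in target}):
                pattern = tuple(zip(features, bits))
                if not any(covers(pattern, row) for row in other):
                    coverage = sum(covers(pattern, row) for row in target)
                    found.append((pattern, coverage / len(target)))
    return found

def classify(row, positive, negative):
    """Weighted voting over the patterns triggered by the row."""
    pos = sum(w for p, w in positive if covers(p, row))
    neg = sum(w for p, w in negative if covers(p, row))
    if pos == neg:
        return "unclassified"
    return "positive" if pos > neg else "negative"

# Hypothetical expression values for three genes.
cutpoints = (1.5, 0.5, 1.0)
pos_rows = binarize([(2.5, 0.3, 1.8), (2.9, 0.2, 0.4), (2.2, 0.8, 1.6)], cutpoints)
neg_rows = binarize([(0.5, 0.9, 1.7), (0.7, 0.1, 0.3), (0.4, 0.7, 0.2)], cutpoints)

pos_patterns = pure_patterns(pos_rows, neg_rows)
neg_patterns = pure_patterns(neg_rows, pos_rows)
new_case = binarize([(2.7, 0.6, 0.5)], cutpoints)[0]
label = classify(new_case, pos_patterns, neg_patterns)
```

In this toy data the single literal "gene 0 above its cutpoint" covers every positive case and no negative one, which is the kind of combination a multi-gene pattern generalizes to several genes at once.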

Patterns, Models, Classifiers
[Figure: positive patterns, negative patterns, model]

Examples of patterns

Multi-gene biomarkers
- E.g., combinations of genes highly predictive of phenotype, not identified in Sorlie et al.
- Luminal A: 10, Luminal B: 9, ERBB+: 9, Basal: 12, Normal: 12

Extensive multi-gene biomarker annotations
- BIOCARTA, KEGG, DAVID, GENMAPP, GOMINER, PANTHER, I-HOP

Pattern-based diagnosis model
[Figure: prediction, classification]

Validation
- Classification accuracy of pattern models through leave-one-out cross-validation experiments
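
Leave-one-out cross-validation itself is easy to sketch. The example below uses a nearest-centroid classifier purely as a stand-in for the pattern models, which the slides do not specify in enough detail to reproduce; the function names and the toy data are assumptions.

```python
def nearest_centroid_fit(rows, labels):
    """Stand-in classifier: one mean vector per class."""
    centroids = {}
    for label in set(labels):
        members = [r for r, l in zip(rows, labels) if l == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def nearest_centroid_predict(centroids, row):
    """Return the label whose centroid is closest to the row."""
    def sq_dist(c):
        return sum((x - y) ** 2 for x, y in zip(row, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def leave_one_out_accuracy(rows, labels):
    """Hold out each case once, train on the rest, and score the held-out case."""
    correct = 0
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        model = nearest_centroid_fit(train_rows, train_labels)
        correct += nearest_centroid_predict(model, rows[i]) == labels[i]
    return correct / len(rows)

# Hypothetical toy data: two well-separated classes.
rows = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 5.0), (5.1, 5.3)]
labels = ["A", "A", "A", "B", "B", "B"]
accuracy = leave_one_out_accuracy(rows, labels)
```

For the estimate to be unbiased, every data-dependent step (imputation, discretization, pattern extraction) would need to be repeated inside the loop on the training cases only.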

Conclusions and Future work
- Provide a robust classification which has significant overlap with previous analyses
- Clusters Luminal B and ERBB+ unreliable; need further analyses
- Sample reproducibility
- Validate on novel external BCA gene expression datasets

Thank you for your attention