Robust diagnosis of DLBCL from gene expression data from different laboratories (DIMACS - RUTCOR Workshop on Boolean and Pseudo-Boolean Functions in Memory of Peter L. Hammer)

Presentation transcript:

1 Robust diagnosis of DLBCL from gene expression data from different laboratories
DIMACS - RUTCOR Workshop on Boolean and Pseudo-Boolean Functions in Memory of Peter L. Hammer
January 19-22, 2009

2 Peter L Hammer, Sorin Alexe, David E Axelrod (Rutgers Univ)
Gustavo Stolovitzky (IBM TJ Watson Research)
Gyan Bhanot, Arnold J Levine (Institute for Advanced Study, Princeton)
David Weissmann (Cancer Institute of New Jersey)

3 Overview
- Motivation
- Pattern-based ensemble classifiers
- Case study: compare data from two labs for DLBCL vs FL diagnosis
References:
- Shipp et al. (2002) Nature Med. 8(1) (Whitehead Lab)
- Stolovitzky G. (2005) In Deisboeck et al., Complex Systems Science in BioMedicine (in press; preprint) (Dalla-Favera Lab)
- Alexe, Alexe, Axelrod, Hammer, Weissmann (2005) Artificial Intelligence in Medicine
- Bhanot, Alexe, Stolovitzky, Levine (2005) Genome Informatics

4 Non-Hodgkin lymphomas
FL (follicular lymphoma)
- low grade non-Hodgkin lymphoma; no cure if advanced stage
- second most frequent subtype of nodal lymphoid malignancies
- incidence has risen from 2–3 to more than 5–7 per 100,000/year (1950–2000)
- t(14;18) translocation: over-expression of anti-apoptotic bcl-2
- a fraction of FL cases evolve to DLBCL
DLBCL (diffuse large B cell lymphoma)
- high grade non-Hodgkin lymphoma; highly variable response to treatment
- most frequent subtype of NHL
- < 2 years survival if untreated
Biomarkers of FL transformation to DLBCL
- p53/MDM2 (Moller et al., 1999)
- p16 (Pinyol, 1998)
- p38MAPK (Elenitoba-Johnson et al., 2003)
- c-myc (Lossos et al., 2002)

5 Gene arrays
- Gene arrays are a way to study the variation of mRNA levels between different types of cells.
- This allows diagnosis (including early stage diagnosis) and inference of the pathways that cause disease.
- Identify molecular profiles of disease: personalized medicine

6 Lymphoma datasets
Data:
- WI (Shipp et al., 2002): Affy HuGeneFL
- CU (Dalla-Favera Lab; Stolovitzky, 2005): Affy Hu95Av2
Samples:
- WI: 58 DLBCL & 19 FL
- CU: 14 DLBCL & 7 FL
Genes:
- WI: 6817
- CU: 12581

7 Diagnosis problem
Input
- Training (biomedical) data with 2 classes: FL and DLBCL
- m samples described by N >> m features
Output
- Collection of robust biomarkers, models
- Robust, accurate classifier, tested on out-of-sample data

8

9 Patterns (Logical Analysis of Data, Hammer 1988)
Positive Patterns
Negative Patterns
Model
- Exhaustive collections of patterns
- Pattern space
- Classification / attribute analysis / new class identification

10 Data Preprocessing
- 50% P calls, UL = 16000, LL = 20
- 2/1 stratified split of WI data into train/test; CU data used as test
- Normalize data to median 1000 per array
- Generate 500 data sets using noise + k-fold stratified sampling + jackknife
- Find genes with high correlation to phenotype using t-test or SNR
- Keep genes that are in > 90% of datasets
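The clipping, per-array normalization, and SNR scoring steps listed on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: the slide does not define SNR, so the common form (mean_a - mean_b) / (sd_a + sd_b) is assumed here, and all function names and toy values are hypothetical.

```python
from statistics import mean, median, stdev

LL, UL = 20.0, 16000.0    # lower / upper limits quoted on the slide
TARGET_MEDIAN = 1000.0    # per-array median after normalization


def clip_array(values):
    """Clamp every expression value of one array into [LL, UL]."""
    return [min(max(v, LL), UL) for v in values]


def normalize_array(values):
    """Rescale one array so that its median equals TARGET_MEDIAN."""
    scale = TARGET_MEDIAN / median(values)
    return [v * scale for v in values]


def snr(class_a, class_b):
    """Assumed SNR definition: (mean_a - mean_b) / (sd_a + sd_b)."""
    return (mean(class_a) - mean(class_b)) / (stdev(class_a) + stdev(class_b))


# Hypothetical toy values, not taken from the WI or CU data.
array = clip_array([5.0, 500.0, 2000.0, 30000.0])
normalized = normalize_array(array)
score = snr([10.0, 12.0, 14.0], [4.0, 5.0, 6.0])
```

In a full pipeline the SNR (or a t-statistic) would be recomputed on each of the resampled data sets, and a gene would be kept only if it ranks highly in more than 90% of them.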

11 Choosing support sets
- Create quality patterns using small subsets of genes; validate using weighted voting with 10-fold cross validation
- Sort genes by their appearance in good patterns
- Select top genes to cover each sample by at least 10 patterns
Alexe, Alexe, Hammer, Vizvari (2005)
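One possible reading of the gene-sorting and coverage steps is a greedy selection, sketched below. The data layout (each pattern as a pair of gene set and covered-sample set), the function name `select_support_set`, and the toy patterns are all assumptions made for illustration; the slide does not specify the actual procedure.

```python
from collections import Counter


def select_support_set(patterns, samples, min_cover):
    """Greedily pick genes until every sample is covered min_cover times.

    patterns: list of (genes, covered_samples) pairs, both sets.
    A pattern is usable only when all of its genes have been selected.
    If the target is never reached, all genes are returned.
    """
    freq = Counter(g for genes, _ in patterns for g in genes)
    # Most frequent genes first; ties broken by name for determinism.
    ranked = sorted(freq, key=lambda g: (-freq[g], g))
    selected = set()
    for gene in ranked:
        selected.add(gene)
        usable = [cov for genes, cov in patterns if genes <= selected]
        cover = Counter(s for cov in usable for s in cov)
        if all(cover[s] >= min_cover for s in samples):
            break
    return selected


# Hypothetical toy patterns over genes g1..g3 and samples s1..s3.
toy_patterns = [
    ({"g1", "g2"}, {"s1", "s2"}),
    ({"g1"}, {"s1", "s3"}),
    ({"g2"}, {"s2", "s3"}),
    ({"g1", "g3"}, {"s3"}),
]
support = select_support_set(toy_patterns, ["s1", "s2", "s3"], min_cover=2)
```

The slide asks for at least 10 covering patterns per sample; the toy call uses `min_cover=2` only to keep the example small.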

12 The 30 genes that best distinguish FL from DLBCL

13 Genes identified by LAD (AIIM 2005) to distinguish DLBCL from FL

14 Examples of FL and DLBCL patterns
WI training data:
- Each DLBCL case satisfies at least one of the patterns P1 and P2
- Each FL case satisfies the pattern N1 (and none of the patterns P1 and P2)
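To make the notion of "satisfying a pattern" concrete, the sketch below treats a pattern as a conjunction of threshold conditions on gene expression values, which is the usual form in Logical Analysis of Data. The gene names, cutpoints, and the simple majority rule are hypothetical; the actual P1, P2 and N1 appear only in the slide's figure and are not reproduced in this transcript.

```python
def satisfies(sample, pattern):
    """True if the sample meets every condition of the pattern.

    pattern: list of (gene, op, cutpoint) with op either '>' or '<='.
    """
    return all(
        sample[gene] > cut if op == ">" else sample[gene] <= cut
        for gene, op, cut in pattern
    )


def classify(sample, positive, negative):
    """Vote between positive (DLBCL) and negative (FL) patterns."""
    pos = sum(satisfies(sample, p) for p in positive)
    neg = sum(satisfies(sample, p) for p in negative)
    if pos > neg:
        return "DLBCL"
    if neg > pos:
        return "FL"
    return "unclassified"


# Hypothetical patterns; genes and cutpoints are placeholders.
P1 = [("gene_a", ">", 500)]
P2 = [("gene_b", ">", 800), ("gene_c", "<=", 200)]
N1 = [("gene_a", "<=", 500), ("gene_b", "<=", 800)]

case_1 = {"gene_a": 900, "gene_b": 300, "gene_c": 100}
case_2 = {"gene_a": 200, "gene_b": 400, "gene_c": 150}
label_1 = classify(case_1, [P1, P2], [N1])
label_2 = classify(case_2, [P1, P2], [N1])
```

Here `case_1` satisfies P1 only and `case_2` satisfies N1 only, mirroring the property stated on the slide for the WI training data.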

15 Pattern data

16 Meta-classifier performance

17 Error distribution: raw and pattern data

18 Biology based method

19 p53-related genes identified by filtering procedure
FL → DLBCL progression

20 p53 pattern data

21 Examples of p53-responsive gene patterns
WI data:
- Each DLBCL case satisfies one of the patterns P1, P2, P3
- Each FL case satisfies one of the patterns N1, N2, N3

22 p53 combinatorial biomarker
- At most one gene over-expressed: 77% of FL & 21% of DLBCL cases (3.7 fold)
- At least two genes over-expressed: 79% of DLBCL & 23% of FL cases (3.4 fold)
- Each individual gene: over-expressed in about 40-70% of DLBCL & 20-40% of FL cases (specificity 50-60%, sensitivity 60-70%)
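The counting rule and the fold values on this slide can be checked with a short sketch. The gene names and over-expression thresholds below are placeholders (the transcript does not list them); only the percentages are taken from the slide.

```python
def count_over_expressed(sample, thresholds):
    """Number of biomarker genes whose value exceeds its threshold."""
    return sum(sample[gene] > cut for gene, cut in thresholds.items())


def biomarker_call(sample, thresholds):
    """At least two over-expressed genes -> DLBCL-like, else FL-like."""
    if count_over_expressed(sample, thresholds) >= 2:
        return "DLBCL-like"
    return "FL-like"


def fold(percent_in, percent_out):
    """Ratio of the two class percentages, e.g. 77% FL vs 21% DLBCL."""
    return percent_in / percent_out


# Hypothetical thresholds and samples; gene_1..gene_3 are placeholders.
thresholds = {"gene_1": 1000.0, "gene_2": 1500.0, "gene_3": 800.0}
sample_a = {"gene_1": 2000.0, "gene_2": 1800.0, "gene_3": 500.0}
sample_b = {"gene_1": 900.0, "gene_2": 1600.0, "gene_3": 700.0}

call_a = biomarker_call(sample_a, thresholds)   # two genes over threshold
call_b = biomarker_call(sample_b, thresholds)   # one gene over threshold
fold_fl = round(fold(77, 21), 1)      # "at most one gene" rule
fold_dlbcl = round(fold(79, 23), 1)   # "at least two genes" rule
```

The two rules are complementary, which is why the slide's percentages pair up: 77% and 23% of FL cases, 21% and 79% of DLBCL cases.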

23 What are these genes?
Plk1 (stpk13): polo-like kinase, serine/threonine protein kinase 13
- M-phase specific
- cell transformation, neoplastic; drives quiescent cells into mitosis
- over-expressed in various human tumors
- Takai et al., Oncogene, 2005: plk1 potential target for cancer therapy, new prognostic marker for cancer
- Mito et al., Leuk Lymph, 2005: plk1 biomarker for DLBCL
Cdk2 (p33): cyclin-dependent kinase
- G2/M transition of mitotic cell cycle
- interacts with cyclins A, B3, D, E
P53: tumor suppressor gene (Levine 1982)

24 Conclusions
- Pattern-based meta-classifier is robust against noise
- Good prediction of FL → DLBCL
- Biology-based analysis also possible; yields useful biomarker
- Should study biologically motivated sets of genes → build pathways

25 Thank you for your attention!