Logical Analysis of Diffuse Large B Cell Lymphoma Gabriela Alexe 1, Sorin Alexe 1, David Axelrod 2, Peter Hammer 1, and David Weissmann 3 of RUTCOR(1)


1 Logical Analysis of Diffuse Large B Cell Lymphoma Gabriela Alexe 1, Sorin Alexe 1, David Axelrod 2, Peter Hammer 1, and David Weissmann 3 of RUTCOR(1) and Department of Genetics(2), Rutgers University; and Robert Wood Johnson Medical School(3)

2 This Talk
- Lymphoma
- Gene Expression Level Analysis
- cDNA Microarray
- Applied to Diffuse Large B-Cell Lymphoma
- Logical Analysis of Data
- Discretization/Binarization
- Support Sets
- Pattern Generation
- Theories and Models
- Prediction

3 Lymphoma

4 Lymphoma
- Cancer of lymphoid cells
- Clonal
- Uncontrolled growth
- Metastasis
- Lymphoma
- Diagnosis
- Grade

5 Diffuse Large B Cell Lymphoma (DLBCL)
- 31% of non-Hodgkin lymphoma cases
- 50% long-term, disease-free survival
- Clinical variability
- Prognosis & therapy
- IPI
- Morphology
- Gene expression

6 Diffuse Large B Cell Lymphoma

7 Spleen with Diffuse Large B Cell Lymphoma

8 Gene Expression Level Analysis

9 DNA-RNA Hybridization

10 Gene Expression Profiling: cDNA microarray analysis (figure labels: Tumor, Standard)

11 DLBCL & cDNA Microarray Analysis
- "Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling," Alizadeh et al., Nature, Vol 403, pp 503-511
- cDNA microarray data -> unsupervised hierarchical agglomerative clustering
- Germinal center signature: 76% survival at 5 years
- Activated B cell signature: 16% survival at 5 years

12 DLBCL Clustering
Each case (patient) is a point in N-dimensional space, where N = # of genes.
(figure labels: Germinal center genes, Activated B cell genes)
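The idea of treating each case as a point in gene space and merging the closest groups can be sketched as follows. This is only a minimal illustration: the Euclidean distance, the average-linkage rule, and the toy expression values are assumptions, not details taken from the slides or from the cited study.

```python
import math

def euclidean(a, b):
    """Distance between two cases, each a list of N gene expression levels."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(c1, c2, points):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))

def agglomerate(points, n_clusters):
    """Start with one cluster per case and merge the closest pair repeatedly."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], points)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Hypothetical toy data: 4 cases, N = 3 genes.
cases = [
    [2.1, 1.9, -1.0],
    [1.8, 2.2, -0.8],
    [-1.0, -1.2, 2.0],
    [-0.9, -1.1, 2.3],
]
clusters = agglomerate(cases, 2)
```

With this toy data the first two cases end up in one cluster and the last two in the other.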

13 DLBCL Survival by Type

14 Supervised Learning Classification of DLBCL
- "Diffuse large B-cell lymphoma prediction by gene-expression profiling and supervised machine learning," Shipp et al., Nature Medicine, vol 8, p 68-74
- Prognosis of DLBCL
- Highly correlated genes -> weighted voting algorithm
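The slide does not spell out the weighted voting algorithm. The sketch below assumes a common signal-to-noise form, in which each gene votes for the class whose mean it is closer to, weighted by how well that gene separates the two classes. The function names, class labels, and expression values are hypothetical.

```python
from statistics import mean, stdev

def gene_weights(class_a, class_b):
    """For each gene, return (signal-to-noise weight, decision boundary)."""
    params = []
    for g in range(len(class_a[0])):
        a = [case[g] for case in class_a]
        b = [case[g] for case in class_b]
        weight = (mean(a) - mean(b)) / (stdev(a) + stdev(b))
        boundary = (mean(a) + mean(b)) / 2
        params.append((weight, boundary))
    return params

def vote(case, params):
    """Sum the weighted votes of all genes; a positive total favors class A."""
    total = sum(w * (x - b) for x, (w, b) in zip(case, params))
    return "A" if total > 0 else "B"

# Hypothetical training cases: 2 genes, 3 cases per class.
good_outcome = [[5.0, 1.0], [6.0, 2.0], [7.0, 1.5]]   # class A
poor_outcome = [[1.0, 4.0], [2.0, 5.0], [1.5, 6.0]]   # class B
params = gene_weights(good_outcome, poor_outcome)
```

A new case such as `[6.5, 1.0]` resembles class A on both genes, so `vote([6.5, 1.0], params)` returns `"A"`.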

15 Shipp’s 13 Gene Predictor

16 Logical Analysis of Data

17 Logical Analysis of Data (LAD)
- Non-statistical method based on:
  - Combinatorics
  - Optimization
  - Logic
- Based on dataset of cases/patients
- LAD learns patterns characteristic of classes
- Subsets of patients who are +/- for a condition
- Collections of patterns are extensible
- Predictions

18 The Problem: Approximation of Hidden Function (figure labels: Hidden Function, LAD Approximation, Dataset)

19 Main Components of LAD
- Discretization/Binarization
- Support Sets
- Pattern Generation
- Theories and Models
- Prediction

20 Discretization (figure labels: Separating Cutpoints, Minimum Set of Separating Cutpoints)

21 Cutpoints and Support Set
- Minimization is NP-hard
- Numerous powerful methods
- Support set: Cutpoints define a grid in which ideally no cell contains both + and – cases
- Cutpoints simplify data and decrease noise
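A minimal sketch of the cutpoint idea, assuming candidate cutpoints are placed midway between neighboring expression values that belong to different classes. It also checks the grid condition stated above. It does not attempt the NP-hard minimization, and the data and helper names are hypothetical.

```python
def candidate_cutpoints(values, labels):
    """Midpoints between consecutive distinct values that separate a + from a -."""
    by_value = {}
    for v, y in zip(values, labels):
        by_value.setdefault(v, set()).add(y)
    ordered = sorted(by_value)
    cuts = []
    for lo, hi in zip(ordered, ordered[1:]):
        # Keep the cut only if both classes occur among the two neighbors.
        if len(by_value[lo] | by_value[hi]) > 1:
            cuts.append((lo + hi) / 2)
    return cuts

def binarize(case, cutpoints):
    """Map a case to its grid cell; cutpoints is a list of (gene index, threshold)."""
    return tuple(int(case[g] > t) for g, t in cutpoints)

def separates(cases, labels, cutpoints):
    """True if no grid cell contains both a + and a - case."""
    cell_label = {}
    for case, y in zip(cases, labels):
        cell = binarize(case, cutpoints)
        if cell_label.setdefault(cell, y) != y:
            return False
    return True

# Hypothetical data: 4 cases, 2 genes.
cases = [[1.0, 10.0], [2.0, 30.0], [4.0, 12.0], [5.0, 28.0]]
labels = ["+", "+", "-", "-"]
gene0_cuts = candidate_cutpoints([c[0] for c in cases], labels)
gene1_cuts = candidate_cutpoints([c[1] for c in cases], labels)
```

Here a single cutpoint on gene 0 already separates the classes, while one cutpoint on gene 1 placed at 20.0 does not.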

22 Patterns
- Examples: Gene A > 34 & gene B < 24 & gene C < 2
- Positive and negative patterns
- Pattern parameters:
  - Degree (# of conditions)
  - Prevalence (# of +/- cases that satisfy it)
  - Homogeneity (proportion of +/- cases among those it covers)
- Best: low degree, large prevalence, high homogeneity
- Patterns are extensible!
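The three pattern parameters can be computed directly from a labeled dataset. The sketch below uses the example pattern from the slide. The cases are made-up toy values, and prevalence is reported as a fraction of the target class, which is an assumption.

```python
import operator

OPS = {">": operator.gt, "<": operator.lt}

# The example pattern: Gene A > 34 & gene B < 24 & gene C < 2
pattern = [("A", ">", 34), ("B", "<", 24), ("C", "<", 2)]

def covers(pattern, case):
    """A case satisfies a pattern when every condition holds."""
    return all(OPS[op](case[gene], threshold) for gene, op, threshold in pattern)

def describe(pattern, cases, labels, target="+"):
    """Degree, prevalence, and homogeneity of a pattern for the target class."""
    covered = [y for case, y in zip(cases, labels) if covers(pattern, case)]
    hits = covered.count(target)
    return {
        "degree": len(pattern),
        "prevalence": hits / labels.count(target),
        "homogeneity": hits / len(covered) if covered else 0.0,
    }

# Hypothetical cases (gene expression levels) and their classes.
cases = [
    {"A": 40, "B": 10, "C": 1},
    {"A": 50, "B": 20, "C": 0},
    {"A": 30, "B": 10, "C": 1},
    {"A": 45, "B": 12, "C": 1},
    {"A": 20, "B": 30, "C": 5},
]
labels = ["+", "+", "+", "-", "-"]
result = describe(pattern, cases, labels)
```

In this toy set the pattern has degree 3, covers two of the three positive cases, and also covers one negative case, so it is not fully homogeneous.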

23 Pattern Generation
- Generate patterns based on learning set
- Stipulate control parameters. For example:
  - Degree <= 4
  - + & - prevalences >= 70%
  - + & - homogeneities = 100%
- All 75 patterns in 1.2 seconds on Pentium IV 1 GHz PC
- Evaluate set:
  - Average # of patterns covering each observation
  - Accuracy applied to evaluation set
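Pattern generation under control parameters can be illustrated with a brute-force enumeration over binarized data. This is a sketch under assumptions: it simply tries every conjunction up to a degree bound and keeps those meeting the prevalence and homogeneity thresholds. It is not the authors' generation procedure, and it does not filter out redundant higher-degree patterns.

```python
from itertools import combinations, product

def generate_patterns(cases, labels, target, max_degree,
                      min_prevalence, min_homogeneity):
    """Enumerate conjunctions of (feature, value) literals that pass the thresholds."""
    n_features = len(cases[0])
    n_target = labels.count(target)
    found = []
    for degree in range(1, max_degree + 1):
        for features in combinations(range(n_features), degree):
            for values in product((0, 1), repeat=degree):
                literals = tuple(zip(features, values))
                covered = [y for c, y in zip(cases, labels)
                           if all(c[f] == v for f, v in literals)]
                if not covered:
                    continue
                hits = covered.count(target)
                if (hits / n_target >= min_prevalence
                        and hits / len(covered) >= min_homogeneity):
                    found.append(literals)
    return found

# Hypothetical binarized learning set: 5 cases, 3 binary features.
cases = [(1, 0, 1), (1, 1, 1), (1, 0, 0), (0, 0, 1), (0, 1, 0)]
labels = ["+", "+", "+", "-", "-"]
positive_patterns = generate_patterns(cases, labels, "+", max_degree=2,
                                      min_prevalence=0.7, min_homogeneity=1.0)
```

On this toy set the only positive pattern that passes is "feature 0 = 1", which covers all three positive cases and no negative case.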

24 Patterns: Illustration (figure labels: Positive Pattern, Negative Pattern, Negative Pattern)

25 Theories: Approximations of the 2 Regions
A theory is a set of positive (or negative) patterns such that every positive (or negative) case is covered.
(figure labels: Positive Theory, Negative Theory)
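One way to assemble a theory from a pool of patterns is a greedy cover: repeatedly pick the pattern that covers the most still-uncovered cases. The greedy strategy is an assumption made here for illustration, since the slide only defines what a theory is. The patterns and cases are hypothetical.

```python
def covers(pattern, case):
    """A pattern is a tuple of (feature index, required binary value) literals."""
    return all(case[f] == v for f, v in pattern)

def build_theory(patterns, cases):
    """Greedily select patterns until every case is covered by at least one."""
    uncovered = set(range(len(cases)))
    theory = []
    while uncovered:
        best = max(patterns,
                   key=lambda p: sum(covers(p, cases[i]) for i in uncovered))
        newly_covered = {i for i in uncovered if covers(best, cases[i])}
        if not newly_covered:
            raise ValueError("patterns do not cover every case")
        theory.append(best)
        uncovered -= newly_covered
    return theory

# Hypothetical positive cases and candidate positive patterns.
positive_cases = [(1, 0, 1), (1, 1, 0), (0, 1, 1), (1, 1, 1)]
patterns = [((0, 1),), ((1, 1), (2, 1)), ((2, 1),)]
positive_theory = build_theory(patterns, positive_cases)
```

The same function applied to negative cases and negative patterns would give a negative theory.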

26 Models
- A set of a positive and a negative theory
- A good model:
  - Small number of features (genes)
  - Patterns are high quality: low degrees, high prevalences, high homogeneities
  - Number of patterns is small
  - Maximize their biologic interpretability

27 Theories and Models (figure labels: Positive Theory, Negative Theory, Model; Positive Area, Negative Area, Unexplained Area, Discordant Area)
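The four areas named in the figure can be read as the four ways a point may relate to the two theories. The sketch below assumes that "discordant" means covered by both theories and "unexplained" means covered by neither. The theories and points are hypothetical.

```python
def covers(pattern, case):
    """A pattern is a tuple of (feature index, required binary value) literals."""
    return all(case[f] == v for f, v in pattern)

def area(case, positive_theory, negative_theory):
    """Name the region of the model in which a binarized case falls."""
    in_positive = any(covers(p, case) for p in positive_theory)
    in_negative = any(covers(p, case) for p in negative_theory)
    if in_positive and in_negative:
        return "discordant"
    if in_positive:
        return "positive"
    if in_negative:
        return "negative"
    return "unexplained"

# Hypothetical one-pattern theories over two binary features.
positive_theory = [((0, 1),)]   # feature 0 = 1
negative_theory = [((1, 1),)]   # feature 1 = 1
areas = {case: area(case, positive_theory, negative_theory)
         for case in [(1, 0), (0, 1), (1, 1), (0, 0)]}
```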

28 LAD Prediction
- A new case: a set of gene expression levels
- Satisfy some positive & no negative?
- Satisfy some negative & no positive?
- Satisfy some of both?
  - Which more?
- Does not satisfy any (rare)
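The decision rule outlined above can be sketched by counting how many positive and negative patterns a new case satisfies. Comparing raw counts is an assumption, since a weighted or normalized comparison would also fit the slide. The patterns and expression levels are hypothetical.

```python
import operator

OPS = {">": operator.gt, "<": operator.lt}

def covers(pattern, case):
    """A case satisfies a pattern when every (gene, op, threshold) condition holds."""
    return all(OPS[op](case[gene], threshold) for gene, op, threshold in pattern)

def predict(case, positive_patterns, negative_patterns):
    """Classify by which kind of pattern the case satisfies more often."""
    n_pos = sum(covers(p, case) for p in positive_patterns)
    n_neg = sum(covers(p, case) for p in negative_patterns)
    if n_pos > n_neg:
        return "positive"
    if n_neg > n_pos:
        return "negative"
    return "unclassified"  # a tie, or no pattern satisfied at all

# Hypothetical model.
positive_patterns = [[("A", ">", 34), ("B", "<", 24)], [("C", "<", 2)]]
negative_patterns = [[("A", "<", 20)], [("B", ">", 30), ("C", ">", 5)]]

new_case = {"A": 40, "B": 10, "C": 1}
prediction = predict(new_case, positive_patterns, negative_patterns)
```

A case that satisfies one pattern of each kind, or no pattern at all, is left unclassified in this sketch.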

29 8 Gene Classification Model

30 Accuracy of Prognosis

31 Conclusion
- Logical Analysis of Data (LAD): a versatile new classification method, here applied to diagnosis and prognosis of lymphoma.
- LAD genes differ almost entirely from those specified by other studies.
- Genes not individually correlated with diagnosis or prognosis but highly correlated in combinations of as few as two genes.
- Patterns suggest biologic pathways
- LAD provides highly accurate prognosis of DLBCL

32 Contacts
- Gabriela Alexe: galexe@us.ibm.com
- Sorin Alexe: salexe@rutcor.rutgers.edu
- David Axelrod: axelrod@biology.rutgers.edu
- Peter Hammer: hammer@rutcor.rutgers.edu
- David Weissmann: weissmdj@umdnj.edu

