1
Logical Analysis of Diffuse Large B Cell Lymphoma
Gabriela Alexe (1), Sorin Alexe (1), David Axelrod (2), Peter Hammer (1), and David Weissmann (3)
(1) RUTCOR, Rutgers University; (2) Department of Genetics, Rutgers University; (3) Robert Wood Johnson Medical School
2
This Talk
  Lymphoma
  Gene Expression Level Analysis
    cDNA Microarray
    Applied to Diffuse Large B-Cell Lymphoma
  Logical Analysis of Data
    Discretization/Binarization
    Support Sets
    Pattern Generation
    Theories and Models
    Prediction
3
Lymphoma
4
Lymphoma
  Cancer of lymphoid cells
    Clonal
    Uncontrolled growth
    Metastasis
  Lymphoma
    Diagnosis
    Grade
5
Diffuse Large B Cell Lymphoma (DLBCL)
  31% of non-Hodgkin lymphoma cases
  50% long-term, disease-free survival
  Clinical variability
  Prognosis & therapy
    IPI
    Morphology
    Gene expression
6
Diffuse Large B Cell Lymphoma
7
Spleen with Diffuse Large B Cell Lymphoma
8
Gene Expression Level Analysis
9
DNA-RNA Hybridization
10
Gene Expression Profiling
  cDNA microarray analysis
  [Figure labels: Tumor, Standard]
11
DLBCL & cDNA Microarray Analysis
  "Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling," Alizadeh et al., Nature, Vol 403, pp 503-511
  cDNA microarray data -> unsupervised hierarchical agglomerative clustering
  Germinal center signature: 76% survival at 5 years
  Activated B cell signature: 16% at 5 years
12
DLBCL Clustering
  Each case (patient) is a point in N-dimensional space, where N = # of genes
  [Figure labels: Germinal center genes, Activated B cell genes]
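To make the "point in N-dimensional space" view concrete, here is a minimal sketch of agglomerative clustering on a toy dataset. The Euclidean distance, the average-linkage rule, and the expression values are assumptions chosen for illustration; the slides do not say which distance or linkage the cited study used.

```python
from itertools import combinations
from math import dist


def average_linkage(points, a, b):
    """Mean pairwise Euclidean distance between clusters a and b (lists of case indices)."""
    return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))


def agglomerate(points, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]  # start with one cluster per case
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(points, clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b keeps a valid
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical expression levels: 4 cases, each a point in 3-gene space
cases = [(2.0, 0.1, 0.2), (1.8, 0.3, 0.1), (0.2, 1.9, 2.1), (0.1, 2.2, 1.8)]
groups = agglomerate(cases, 2)
```

With real microarray data N is in the thousands, so a library implementation would be the practical choice; the loop above only shows the merge rule.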
13
DLBCL Survival by Type
14
Supervised Learning Classification of DLBCL
  "Diffuse large B-cell lymphoma prediction by gene-expression profiling and supervised machine learning," Shipp et al., Nature Medicine, vol 8, p 68-74
  Prognosis of DLBCL
  Highly correlated genes -> weighted voting algorithm
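The slide names a weighted voting algorithm without giving its formula. The sketch below follows one common formulation for expression-based classifiers: each gene votes with a signal-to-noise weight, measured against the midpoint of the two class means. That formulation, the function names, and the toy data are assumptions, not details taken from the slides.

```python
from statistics import mean, stdev


def train_weighted_voting(class_a, class_b):
    """Return a (weight, boundary) pair per gene.

    weight   = (mean_a - mean_b) / (sd_a + sd_b)   (assumed signal-to-noise form)
    boundary = midpoint of the two class means
    """
    params = []
    for gene_a, gene_b in zip(zip(*class_a), zip(*class_b)):
        mu_a, mu_b = mean(gene_a), mean(gene_b)
        weight = (mu_a - mu_b) / (stdev(gene_a) + stdev(gene_b))
        params.append((weight, (mu_a + mu_b) / 2))
    return params


def vote(params, case):
    """Sum of per-gene votes: positive favors class A, negative favors class B."""
    return sum(w * (x - b) for (w, b), x in zip(params, case))


# Hypothetical training data: rows are cases, columns are two genes
class_a = [(5.0, 1.0), (6.0, 2.0), (7.0, 1.5)]
class_b = [(1.0, 4.0), (2.0, 5.0), (1.5, 6.0)]
params = train_weighted_voting(class_a, class_b)
```

A new case is assigned to class A when `vote(params, case)` is positive and to class B when it is negative.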
15
Shipp’s 13 Gene Predictor
16
Logical Analysis of Data
17
Logical Analysis of Data (LAD)
  Non-statistical method based on:
    Combinatorics
    Optimization
    Logic
  Based on dataset of cases/patients
  LAD learns patterns characteristic of classes
    Subsets of patients who are +/- for a condition
  Collections of patterns are extensible
  Predictions
18
The Problem: Approximation of Hidden Function
  [Figure labels: Hidden Function, LAD Approximation, Dataset]
19
Main Components of LAD
  Discretization/Binarization
  Support Sets
  Pattern Generation
  Theories and Models
  Prediction
20
Discretization
  [Figure labels: Separating Cutpoints, Minimum Set of Separating Cutpoints]
21
Cutpoints and Support Set
  Minimization is NP hard
  Numerous powerful methods
  Support set: cutpoints define a grid in which ideally no cell contains both + and - cases
  Cutpoints simplify data and decrease noise
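A minimal sketch of the cutpoint and support-set ideas, under stated assumptions: candidate cutpoints are taken as midpoints between adjacent expression values whose cases differ in class, and a set of cutpoints counts as a support set when no grid cell mixes + and - cases. The function names and data are hypothetical, and nothing here attempts the NP-hard minimization.

```python
def candidate_cutpoints(values, labels):
    """Midpoints between consecutive distinct values whose cases differ in class."""
    by_value = {}
    for v, y in zip(values, labels):
        by_value.setdefault(v, set()).add(y)
    ordered = sorted(by_value)
    cuts = []
    for lo, hi in zip(ordered, ordered[1:]):
        # Keep a cut only if the two adjacent values carry more than one class
        if len(by_value[lo] | by_value[hi]) > 1:
            cuts.append((lo + hi) / 2)
    return cuts


def binarize(case, cutpoints):
    """cutpoints: list of (gene_index, threshold). Returns one 0/1 flag per cutpoint."""
    return tuple(int(case[g] > t) for g, t in cutpoints)


def is_support_set(cases, labels, cutpoints):
    """True if no grid cell (binary vector) contains both a + and a - case."""
    seen = {}
    for case, y in zip(cases, labels):
        cell = binarize(case, cutpoints)
        if seen.setdefault(cell, y) != y:
            return False
    return True


# Hypothetical data: 4 cases, 2 genes
cases = [(10, 5), (20, 7), (30, 6), (40, 8)]
labels = ["-", "-", "+", "+"]
gene0_cuts = candidate_cutpoints([c[0] for c in cases], labels)
```

In this toy example a single cutpoint on the first gene separates the classes, while a single cutpoint at 6.5 on the second gene does not.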
22
Patterns
  Examples: Gene A > 34 & gene B < 24 & gene C < 2
  Positive and negative patterns
  Pattern parameters:
    Degree (# of conditions)
    Prevalence (# of +/- cases that satisfy it)
    Homogeneity (proportion of +/- cases among those it covers)
  Best: low degree, large prevalence, high homogeneity
  Patterns are extensible!
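The three pattern parameters can be computed directly, as sketched below for the example pattern on this slide. The data representation and the patient values are hypothetical. Prevalence is reported here as a fraction of the target class rather than a raw count, which is an assumption made so that it can be compared against percentage thresholds.

```python
import operator

OPS = {">": operator.gt, "<": operator.lt}

# The example pattern from the slide: gene A > 34 & gene B < 24 & gene C < 2
pattern = [("A", ">", 34), ("B", "<", 24), ("C", "<", 2)]


def covers(pattern, case):
    """True if the case satisfies every condition of the pattern."""
    return all(OPS[op](case[gene], t) for gene, op, t in pattern)


def describe(pattern, cases, labels, target="+"):
    """Degree, prevalence, and homogeneity of a pattern with respect to one class."""
    covered = [y for c, y in zip(cases, labels) if covers(pattern, c)]
    n_target = labels.count(target)
    hits = covered.count(target)
    return {
        "degree": len(pattern),  # number of conditions
        "prevalence": hits / n_target if n_target else 0.0,  # share of target cases covered
        "homogeneity": hits / len(covered) if covered else 0.0,  # share of covered cases in target
    }


# Hypothetical patients (expression levels for genes A, B, C)
cases = [
    {"A": 40, "B": 10, "C": 1},
    {"A": 50, "B": 20, "C": 0},
    {"A": 30, "B": 10, "C": 1},
    {"A": 45, "B": 12, "C": 1},
]
labels = ["+", "+", "+", "-"]
stats = describe(pattern, cases, labels)
```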
23
Pattern Generation
  Generate patterns based on learning set
  Stipulate control parameters. For example:
    Degree 4
    + & - prevalences >= 70%
    + & - homogeneities = 100%
  All 75 patterns in 1.2 seconds on Pentium IV 1 GHz PC
  Evaluate set:
    Average # of patterns covering each observation
    Accuracy applied to evaluation set
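As a rough illustration of generation under control parameters, the sketch below enumerates every conjunction up to a maximum degree over binarized attributes and keeps those that meet the prevalence and homogeneity thresholds. Exhaustive enumeration is an assumption made for clarity and is not claimed to be the procedure behind the timing quoted above. All names and data are hypothetical.

```python
from itertools import combinations, product


def generate_patterns(cases, labels, target, max_degree, min_prevalence, min_homogeneity):
    """Enumerate conjunctions of at most max_degree literals over binary attributes.

    A pattern is a tuple of (attribute_index, required_value) literals.
    Assumes at least one case carries the target label.
    """
    n_attrs = len(cases[0])
    n_target = labels.count(target)
    found = []
    for degree in range(1, max_degree + 1):
        for attrs in combinations(range(n_attrs), degree):
            for values in product((0, 1), repeat=degree):
                pattern = tuple(zip(attrs, values))
                covered = [
                    y for c, y in zip(cases, labels)
                    if all(c[a] == v for a, v in pattern)
                ]
                if not covered:
                    continue
                hits = covered.count(target)
                if (hits / n_target >= min_prevalence
                        and hits / len(covered) >= min_homogeneity):
                    found.append(pattern)
    return found


# Hypothetical binarized learning set: 5 cases, 3 binary attributes
cases = [(1, 1, 0), (1, 0, 0), (1, 1, 1), (0, 1, 1), (0, 0, 1)]
labels = ["+", "+", "+", "-", "-"]
positive = generate_patterns(cases, labels, "+", max_degree=2,
                             min_prevalence=0.7, min_homogeneity=1.0)
```

The number of candidate conjunctions grows quickly with the degree and the number of attributes, which is one reason a small support set matters before patterns are generated.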
24
Patterns: Illustration
  [Figure labels: Positive Pattern, Negative Pattern, Negative Pattern]
25
Theories: Approximations of the 2 Regions
  [Figure labels: Positive Theory, Negative Theory]
  A theory is a set of positive (or negative) patterns such that every positive (or negative) case is covered.
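One simple way to assemble a theory from generated patterns is a greedy set cover, sketched below. The greedy heuristic, the pattern names, and the coverage sets are assumptions for illustration; the slide defines a theory but does not say how its patterns are selected.

```python
def greedy_theory(patterns, coverage, cases_to_cover):
    """Pick patterns until every case is covered (greedy set-cover heuristic).

    coverage maps each pattern name to the set of case ids it covers.
    """
    uncovered = set(cases_to_cover)
    theory = []
    while uncovered:
        # Choose the pattern that covers the most still-uncovered cases
        best = max(patterns, key=lambda p: len(coverage[p] & uncovered))
        if not coverage[best] & uncovered:
            raise ValueError("no pattern covers the remaining cases")
        theory.append(best)
        uncovered -= coverage[best]
    return theory


# Hypothetical positive patterns and the positive cases each one covers
coverage = {"P1": {1, 2, 3}, "P2": {3, 4}, "P3": {4, 5}, "P4": {2}}
theory = greedy_theory(["P1", "P2", "P3", "P4"], coverage, {1, 2, 3, 4, 5})
```

The same routine applied to negative patterns and negative cases would give a negative theory.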
26
Models
  A model is a set consisting of a positive and a negative theory
  A good model:
    Small number of features (genes)
    Patterns are high quality
      Low degrees
      High prevalences
      High homogeneities
    Number of patterns is small
    Biologic interpretability is maximized
27
Theories and Models
  [Figure labels: Positive Theory, Negative Theory, Model, Positive Area, Negative Area, Unexplained Area, Discordant Area]
28
LAD Prediction
  A new case: a set of gene expression levels
  Satisfy some positive & no negative?
  Satisfy some negative & no positive?
  Satisfy some of both?
    Which more?
  Does not satisfy any (rare)
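The decision questions above can be sketched as a small function. Comparing the fraction of positive patterns satisfied with the fraction of negative patterns satisfied is an assumed way of answering "which more?". The patterns and thresholds below are hypothetical.

```python
def predict(case, positive_patterns, negative_patterns):
    """Classify a case by the share of positive vs. negative patterns it satisfies.

    Patterns are predicates taking a case (dict of gene -> expression level).
    """
    pos = sum(p(case) for p in positive_patterns) / len(positive_patterns)
    neg = sum(p(case) for p in negative_patterns) / len(negative_patterns)
    if pos == 0 and neg == 0:
        return "unexplained"  # satisfies no pattern at all (rare)
    if pos == neg:
        return "tie"  # satisfies both kinds equally; left unclassified here
    return "+" if pos > neg else "-"


# Hypothetical model: two positive and two negative patterns
positive = [
    lambda c: c["A"] > 34 and c["B"] < 24,
    lambda c: c["C"] < 2,
]
negative = [
    lambda c: c["A"] <= 34,
    lambda c: c["B"] >= 24 and c["C"] >= 5,
]

result = predict({"A": 40, "B": 10, "C": 1}, positive, negative)
```

A case that satisfies only positive patterns or only negative patterns falls out of the same comparison, since the other fraction is zero.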
29
8 Gene Classification Model
30
Accuracy of Prognosis
31
Conclusion
  Logical Analysis of Data (LAD): a versatile new classification method, here applied to diagnosis and prognosis of lymphoma.
  LAD genes differ almost entirely from those specified by other studies.
  Genes not individually correlated with diagnosis or prognosis but highly correlated in combinations of as few as two genes.
  Patterns suggest biologic pathways.
  LAD provides highly accurate prognosis of DLBCL.
32
Contacts
  Gabriela Alexe: galexe@us.ibm.com
  Sorin Alexe: salexe@rutcor.rutgers.edu
  David Axelrod: axelrod@biology.rutgers.edu
  Peter Hammer: hammer@rutcor.rutgers.edu
  David Weissmann: weissmdj@umdnj.edu