Knowledge Discovery in Biomedicine Limsoon Wong Institute for Infocomm Research.

1 Knowledge Discovery in Biomedicine Limsoon Wong Institute for Infocomm Research

2 Plan
Copyright © 2004 by Limsoon Wong
Knowledge discovery in brief
Eg 1: Optimizing treatment of childhood ALL
Eg 2: Predicting survival of patients with DLBC lymphoma
Concluding remarks

3 Knowledge Discovery in Brief

4 What is Knowledge Discovery?
Jonathan’s rules: Blue or Circle
Jessica’s rules: All the rest
[Figure: Jonathan’s blocks and Jessica’s blocks]
Whose block is this?

5 What is Knowledge Discovery?
Question: Can you explain how?

6 Steps of Knowledge Discovery
Training data gathering
Feature generation
–k-grams, colour, texture, domain know-how, ...
Feature selection
–Entropy, χ², CFS, t-test, domain know-how, ...
Feature integration (some classifiers/learning methods)
–SVM, ANN, PCL, CART, C4.5, kNN, ...
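The steps on this slide can be sketched end to end on toy data. The code below is a minimal illustration only, not the method used in the studies that follow: it assumes k-gram counts for feature generation, a Welch t-statistic ranking for feature selection, and a plain kNN vote for feature integration. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter
from statistics import mean, variance


def kgram_counts(seq, k):
    """Feature generation: count every overlapping k-gram in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def t_statistic(a, b):
    """Welch's t-statistic between two samples (each needs at least 2 values)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def select_features(X, y, top_n):
    """Feature selection: rank feature indices by |t| between the two classes."""
    labels = sorted(set(y))
    assert len(labels) == 2, "this sketch handles two classes only"
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == labels[0]]
        b = [row[j] for row, lab in zip(X, y) if lab == labels[1]]
        scores.append((abs(t_statistic(a, b)), j))
    return [j for _, j in sorted(scores, reverse=True)[:top_n]]


def knn_predict(X, y, query, features, k=3):
    """Feature integration: majority vote of the k nearest training rows,
    measured by Euclidean distance over the selected features only."""
    def dist(row):
        return math.sqrt(sum((row[j] - query[j]) ** 2 for j in features))
    nearest = sorted(zip(X, y), key=lambda pair: dist(pair[0]))[:k]
    votes = [lab for _, lab in nearest]
    return max(set(votes), key=votes.count)


# Toy training data (hypothetical): feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 3.0], [0.9, 4.0], [3.0, 4.5], [3.2, 3.5], [2.9, 4.2]]
y = ["A", "A", "A", "B", "B", "B"]

chosen = select_features(X, y, top_n=1)
prediction = knn_predict(X, y, query=[3.1, 4.0], features=chosen)
```

Here `chosen` keeps only the discriminating feature, and the query is classified from that feature alone. Any of the other selectors or classifiers named on the slide could be swapped in at the same two points.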

7 Knowledge Discovery for Optimizing Treatment of Childhood ALL
Image credit: Yeoh et al, 2002

8 Childhood ALL
Major subtypes: T-ALL, E2A-PBX, TEL-AML, BCR-ABL, MLL genome rearrangements, Hyperdiploid>50
Different subtypes respond differently to the same treatment (Tx)
Over-intensive Tx
–Development of secondary cancers
–Reduction of IQ
Under-intensive Tx
–Relapse
The subtypes look similar
Conventional diagnosis
–Immunophenotyping
–Cytogenetics
–Molecular diagnostics
Unavailable in most ASEAN countries

9 Single-Test Platform of Microarray & Knowledge Discovery
Copyright © 2004 by Jinyan Li and Limsoon Wong
[Figure: training data collection, feature generation, feature selection, feature integration]
Image credit: Affymetrix

10 Impact
Conventional Tx: intermediate intensity to all
⇒ 10% suffer relapse
⇒ 50% suffer side effects
⇒ costs US$150m/yr
Our optimized Tx:
high intensity to 10%
intermediate intensity to 40%
low intensity to 50%
costs US$100m/yr
High cure rate of 80%
Less relapse
Fewer side effects
Save US$51.6m/yr

11 Knowledge Discovery for Predicting Survival of Patients with DLBC Lymphoma
Image credit: Rosenwald et al, 2002

12 Diffuse Large B-Cell Lymphoma
DLBC lymphoma is the most common type of lymphoma in adults
Can be cured by anthracycline-based chemotherapy in 35 to 40 percent of patients
⇒ DLBC lymphoma comprises several diseases that differ in responsiveness to chemotherapy
Intl Prognostic Index (IPI)
–age, “Eastern Cooperative Oncology Group” performance status, tumor stage, lactate dehydrogenase level, sites of extranodal disease, ...
Not good for stratifying DLBC lymphoma patients for therapeutic trials
⇒ Use gene-expression profiles to predict outcome of chemotherapy?

13 Knowledge Discovery from Gene Expression of “Extreme” Samples
[Figure: workflow of “extreme” sample selection followed by knowledge discovery from gene expression. Labels: 240 samples; 80 samples; 26 long-term survivors; 47 short-term survivors; 7399 genes; 84 genes]
T is long-term if S(T) < 0.3
T is short-term if S(T) > 0.7
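The two cut-offs on this slide amount to a simple thresholding rule. The sketch below assumes S(T) is a risk score between 0 and 1 for sample T; the label for scores between the cut-offs and the sample scores themselves are hypothetical, since the slide does not give them.

```python
def risk_group(score, low=0.3, high=0.7):
    """Map a risk score S(T) to a survival group using the slide's two cut-offs."""
    if score < low:
        return "long-term"
    if score > high:
        return "short-term"
    # Scores in [low, high] are not labelled on the slide; this name is an assumption.
    return "unassigned"


# Hypothetical risk scores for three samples.
scores = {"T1": 0.12, "T2": 0.55, "T3": 0.91}
groups = {name: risk_group(s) for name, s in scores.items()}
```

Keeping only the samples that fall outside the middle band is one way to read the idea of training on "extreme" cases.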

14 Kaplan-Meier Plot for 80 Test Cases
p-value of log-rank test: < 0.0001
Risk score thresholds: 0.7, 0.5, 0.3
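For readers unfamiliar with the plot, a Kaplan-Meier curve is the running product of (1 - deaths / number at risk) over the observed event times. The following is a minimal sketch of that estimator on made-up follow-up data; it does not reproduce the slide's curves or its log-rank test, and the function name and inputs are assumptions.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where a death was observed.

    times:  follow-up duration of each patient
    events: 1 if death was observed at that time, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        # Both deaths and censored patients leave the risk set after time t.
        n_at_risk -= removed
    return curve


# Hypothetical follow-up data: five patients, two of them censored.
curve = kaplan_meier(times=[1, 2, 2, 3, 4], events=[1, 1, 0, 1, 0])
```

Computing one such curve per risk group and comparing them is what the plot on this slide summarises.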

15 Improvement Over IPI
(A) IPI low, p-value = 0.0063
(B) IPI intermediate, p-value = 0.0003

16 Merit of “Extreme” Samples
(A) W/o sample selection (p = 0.38)
(B) With sample selection (p = 0.009)
No clear difference in the overall survival of the 80 samples in the validation group of the DLBCL study if no training sample selection is conducted

17 Knowledge Discovery for A Few Other Biomedical Applications

18 Predict Epitopes, Find Vaccine Targets
Vaccines are often the only solution for viral diseases
Finding & developing effective vaccine targets (epitopes) is a slow and expensive process
Develop systems to recognize protein peptides that bind MHC molecules
Develop systems to recognize hot spots in viral antigens

19 Recognize Functional Sites, Help Scientists
Effective recognition of initiation, control, & termination of biological processes is crucial to speeding up & focusing scientific expts
Data mining of bio seqs to find rules to recognize & understand functional sites
Dragon’s 10x reduction of TSS recognition false positives

20 Understand Proteins, Fight Diseases
Understanding the function & role of a protein needs organised info on interaction pathways
Such info is often reported in scientific papers but is seldom found in structured db
Knowledge extraction system to process free text
–extract protein names
–extract interactions

21 Benefits of Bioinformatics
To the patient:
–Better drug, better treatment
To the pharma:
–Save time, save cost, make more $
To the scientist:
–Better science

22 References
A. Yeoh et al, “Classification, subtype discovery, and prediction of outcome in pediatric acute lymphoblastic leukemia by gene expression profiling”, Cancer Cell, 1:133-143, 2002
A. Rosenwald et al, “The use of molecular profiling to predict survival after chemotherapy for diffuse large B-cell lymphoma”, NEJM, 346:1937-1947, 2002
H. Liu et al, “Selection of patient samples and genes for outcome prediction”, Proc. CSB2004, pages 382-392

23 Any Questions?

24 To be presented 10/10/04, 8.30-10.00am, Raffles Convention Centre, NHG-IBM Symposium

