1 Discovery of Temporal Patterns in Course-of-Disease Medical Data Jorge C. G. Ramirez Ph.D. Candidate Lynn L. Peterson and Diane J. Cook Supervising Professors.

2 Overview Objective Contributions Approach TEMPADIS Summary and Conclusions

3 Objective Discover patterns that represent groups of patients that had a similar course of disease for a catastrophic or chronic illness Motivation –Medical –AI

4 Contributions Data Preprocessing –Normalization –Learning Missing Data –Learning Implicit Knowledge Exploratory Analysis –Event Set Sequence Approach

5 Contributions Domain Understanding –New perspective on mass of data –Identify groups of patients for further medical study

6 Approach Example Events – Laboratory Results 461 L WBC L HCT L PLT L CD4% L CD4A

7 Approach Example Events – Visits – Diagnoses – Pharmacy
468 C CV
468 D AIDS-RELATED COMPLEX, UNSPECIFIED
469 P CTM 60 CO-TRIMOXAZOLE DS
469 P AZT 200 ZIDOVUDINE 100MG

8 Approach Event Set Sequences –Events Value Event: laboratory test result, visit Duration Event: pharmacy, diagnosis –An Event Set is the set of all Events that occur within a window of time –An Event Set Sequence is the ordered series of Event Sets that occur over a long period of time
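
The definitions above can be sketched in Python. This is only an illustration: the `Event` record layout, the fixed 14-day window, and the function name `build_event_set_sequence` are assumptions, since the slides do not say how the window is chosen or how events are stored.

```python
from collections import namedtuple

# Hypothetical event record: day offset, event type code (L, C, D, P), and a label.
Event = namedtuple("Event", ["day", "kind", "label"])

def build_event_set_sequence(events, window_days=14):
    """Group events into event sets using consecutive fixed-width time windows.

    Returns a time-ordered list of event sets (lists); empty windows are skipped.
    The fixed window width is an assumption made for this sketch.
    """
    if not events:
        return []
    ordered = sorted(events, key=lambda e: e.day)
    start = ordered[0].day
    sequence = []
    current_index = None
    for ev in ordered:
        index = (ev.day - start) // window_days  # which window this event falls in
        if index != current_index:
            sequence.append([])                  # open a new event set
            current_index = index
        sequence[-1].append(ev)
    return sequence

events = [
    Event(461, "L", "WBC"),
    Event(461, "L", "HCT"),
    Event(468, "C", "CV"),
    Event(469, "P", "AZT"),
    Event(500, "C", "CV"),
]
sequence = build_event_set_sequence(events, window_days=14)
```

Here the events on days 461 to 469 fall into one event set and the visit on day 500 starts another.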

9 Approach Example Event Set 461 L WBC L HCT L PLT L CD4% L CD4A C CV 468 D AIDS-RELATED COMPLEX, UNSPECIFIED 469 P CTM 60 CO-TRIMOXAZOLE DS 469 P AZT 200 ZIDOVUDINE 100MG

10 Approach Normalization –Normal for each patient is different –Especially when affected by a catastrophic or chronic illness –Example: CD4A General Population Normal: Well HIV-positive patient: Severely immune-compromised patient:

11 Approach Normalization (continued) –Scale to -4…0…+4 0 is normal Each number represents a deviation from normal 1 and 2 are noticeable but not severe 3 is severe 4 is very severe
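
One way to picture the -4 to +4 scale is the Python sketch below. The slides do not give the actual mapping, so the linear binning, the `unit` parameter, and the function name `normalize` are all assumptions; only the clamped range and the idea of a patient-specific normal come from the slides.

```python
def normalize(value, patient_normal, unit):
    """Map a raw lab value onto the -4..+4 scale relative to this patient's normal.

    `unit` is the assumed size of one step of deviation for this test.
    Linear steps are an assumption; the real scale may be defined differently.
    """
    if unit <= 0:
        raise ValueError("unit must be positive")
    steps = int((value - patient_normal) / unit)  # truncates toward zero
    return max(-4, min(4, steps))                 # clamp to the -4..+4 range

# Hypothetical numbers, chosen only to exercise the function.
at_normal = normalize(500, patient_normal=500, unit=100)
well_below = normalize(150, patient_normal=500, unit=100)
far_above = normalize(1200, patient_normal=500, unit=100)
```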

12 Approach Replace Missing Data –Diagnosis data very incomplete –Learn severity of condition from pharmacy data –Induce decision tree to classify conditions

13 Approach Create Health Status Categories 1= HIV-positive asymptomatic 2 = Asymptomatic, on anti-HIV therapy 3 = Immune-compromised, on prophylactic therapy 4= Active illness 5 = Severe active illness
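
The five categories can be written down as an ordered enumeration. The Python below is a sketch; the member names are invented for illustration, while the numeric codes and their meanings are taken from the slide.

```python
from enum import IntEnum

class HealthStatus(IntEnum):
    """Health status categories 1..5; member names are hypothetical."""
    HIV_POSITIVE_ASYMPTOMATIC = 1
    ASYMPTOMATIC_ON_ANTI_HIV_THERAPY = 2
    IMMUNE_COMPROMISED_ON_PROPHYLAXIS = 3
    ACTIVE_ILLNESS = 4
    SEVERE_ACTIVE_ILLNESS = 5

status = HealthStatus(4)
```

Using `IntEnum` keeps the categories comparable, so a higher code can be read as a more severe condition.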

14 Approach Learn Implicit Knowledge –Need to augment explicit knowledge –Recovery time is expert’s implicit knowledge –Use neural network to learn recovery time function 0 = Nothing to recover from 1-4 = weeks to recover 5 = 5 or more weeks to recover
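
The recovery time codes above amount to a capped count of weeks. A minimal Python sketch of that encoding follows; the function name and the choice to round partial weeks up are assumptions, and the sketch covers only the target encoding, not the neural network that learns it.

```python
import math

def encode_recovery_time(weeks):
    """Encode a recovery time in weeks as a 0..5 category.

    0 = nothing to recover from, 1-4 = weeks to recover, 5 = 5 or more weeks.
    Rounding partial weeks up is an assumption of this sketch.
    """
    if weeks <= 0:
        return 0
    return min(5, math.ceil(weeks))

codes = [encode_recovery_time(w) for w in (0, 2, 2.5, 9)]
```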

15 Approach Categorize Pharmacy Data –A myriad of drugs prescribed –Need to understand significance –Categorize by use

16 Approach Categories –Nucleoside Analogs –Protease Inhibitors –Prophylaxis Therapies –Intravenous antibiotics –Anti-virals –Anti-PCP/Toxoplasmosis –Anti-mycobacterials

17 Approach Categories (continued) – Anti-wasting syndrome – Anti-fungals – Chemotherapies
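
Categorizing by use can be pictured as a lookup from drug code to category. The Python sketch below uses the two drug codes that appear in the example events (AZT and CTM); the table itself, the `Uncategorized` fallback, and the placement of co-trimoxazole under Anti-PCP/Toxoplasmosis rather than Prophylaxis Therapies are assumptions, not the authors' actual mapping.

```python
# Hypothetical lookup table; only the category names come from the slides.
DRUG_CATEGORY = {
    "AZT": "Nucleoside Analogs",      # zidovudine
    "CTM": "Anti-PCP/Toxoplasmosis",  # co-trimoxazole; assumed assignment
}

def categorize(prescriptions):
    """Return the set of drug categories for a list of drug codes.

    Codes missing from the table map to 'Uncategorized' (an assumed fallback).
    """
    return {DRUG_CATEGORY.get(code, "Uncategorized") for code in prescriptions}

categories = categorize(["AZT", "CTM", "XYZ"])
```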

18 Approach Result: Understandable representation of patient data 861 C : 30 38: H : 3 22: 1 35: H : 3 22: 1 35: : C C : 30 38: 50 39: : H : 2 22: 1 39: 12

19 Approach Result: Understandable representation of patient data 861 C – H – H – C C H –

20 Approach Result: Understandable representation of patient data
< { (EV C)(HS 3)(RT 1)(WBC -4)(HCT -3)(PLT 0)(LMPH -1)(onD ) }
  { (EV H)(HS 4)(RT 4)(WBC 0)(HCT -4)(PLT -1)(LMPH -2)(onD ) }
  { (EV H)(HS 4)(RT 1)(WBC -2)(HCT -3)(PLT -1)(LMPH -4)(onD ) }
  { (EV C)(HS 4)(RT 3)(WBC -4)(HCT -1)(onD ) }
  { (EV C)(HS 4)(RT 2)(WBC -4)(HCT -2)(PLT -1)(LMPH 2)(onD ) }
  { (EV H)(HS 4)(RT 4)(WBC 0)(HCT -4)(PLT 0)(LMPH -2)(onD ) } >
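
The notation above is regular enough to parse mechanically. The Python sketch below turns it into a list of dictionaries, one per event set. The function name `parse_pattern` is hypothetical and the parser is inferred from the notation as printed, not taken from TEMPADIS. HS and RT appear to correspond to the health status and recovery time categories from the earlier slides.

```python
import re

def parse_pattern(text):
    """Parse '< { (K V)(K V) } { ... } >' into a list of dicts, one per event set.

    Integer values become ints; an empty value, as in '(onD )', becomes None.
    """
    text = text.replace("\u2013", "-")  # the transcript mixes en dashes with minus signs
    sequence = []
    for body in re.findall(r"\{([^{}]*)\}", text):
        event_set = {}
        for key, value in re.findall(r"\(\s*([^\s()]+)\s*([^()]*?)\s*\)", body):
            if re.fullmatch(r"-?\d+", value):
                event_set[key] = int(value)
            else:
                event_set[key] = value or None
        sequence.append(event_set)
    return sequence

parsed = parse_pattern(
    "< { (EV C)(HS 3)(RT 1)(WBC -4)(onD ) } { (EV H)(HS 4)(LMPH \u20132)(onD ) } >"
)
```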

21 Approach Inexact Match –Use set difference Partial match, feature by feature Assumes default partial match for missing data –Use weakest-link/average-link Require minimum degree of match Require average degree of match
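
The inexact match idea can be sketched in Python as follows. Everything numeric here is an assumption: the distance-based feature score, the default score of 0.5 for a missing feature, and both thresholds. The slides state only that matching is feature by feature, that missing data receives a default partial match, and that both a minimum (weakest-link) and an average (average-link) degree of match are required.

```python
def feature_match(a, b, scale=8.0):
    """Degree of match in [0, 1] between two feature values.

    Integers on the -4..+4 scale are compared by distance (assumed scoring);
    anything else must be equal to match.
    """
    if isinstance(a, int) and isinstance(b, int):
        return 1.0 - min(abs(a - b), scale) / scale
    return 1.0 if a == b else 0.0

def event_set_match(x, y, missing_default=0.5):
    """Average feature match over the union of features in two event sets."""
    keys = set(x) | set(y)
    if not keys:
        return 1.0
    total = 0.0
    for key in keys:
        if key in x and key in y:
            total += feature_match(x[key], y[key])
        else:
            total += missing_default  # assumed default partial match for missing data
    return total / len(keys)

def sequences_match(p, q, min_link=0.5, avg_link=0.75):
    """Weakest-link and average-link test over aligned event set sequences."""
    if len(p) != len(q) or not p:
        return False
    scores = [event_set_match(a, b) for a, b in zip(p, q)]
    return min(scores) >= min_link and sum(scores) / len(scores) >= avg_link

a = [{"HS": 3, "WBC": -4}]
b = [{"HS": 3, "WBC": -2}]
c = [{"HS": 3, "WBC": 4}]
close_enough = sequences_match(a, b)
too_far = sequences_match(a, c)
```

Requiring both thresholds means a single badly matching event set can reject a candidate even when the average looks acceptable.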

22 TEMPADIS Raw Target Data Data Cleaning Data Normalization Normalized Database

23 TEMPADIS Normalized Database Decision Tree Neural Net Reduced, Knowledge-Added Data

24 TEMPADIS Knowledge-Added Database Sequence Builder Temporal Patterns

25 Results Validation –Results are temporal patterns showing that groups of patients had similar experiences during the course of disease –Only medical experts can assess the validity of discovered patterns –These results have been validated by the experts in the HIV Clinical Research Group

26 Results Given a database of patients followed for 4 to 9 years –Discovered interesting patterns –Interestingness has multiple dimensions Length Data that appears in the patterns Data that does not appear in the patterns

27 Results Advanced patients, subject to various opportunistic infections (OIs)
< { (EV C)(HS 3)(RT 0)(WBC 0)(HCT -1)(PLT 0)(LMPH -3)(onD ) }
  { (EV E)(HS 3)(RT 2)(WBC 3)(HCT -1)(PLT 1)(LMPH 4)(onD ) }
  { (EV C)(HS 3)(RT 0)(WBC 1)(HCT 0)(PLT 0)(CD4P -3)(CD4A -1)(LMPH 0)(onD ) }
  { (EV C)(HS 3)(RT 1)(WBC -1)(HCT -1)(PLT 1)(LMPH 2)(onD ) }
  { (EV E)(HS 3)(RT 1)(WBC 2)(HCT -1)(PLT 1)(LMPH 4)(onD ) }
  { (EV C)(HS 3)(RT 1)(WBC 1)(HCT 0)(PLT 0)(CD4P -3)(CD4A -2)(LMPH 0)(onD ) } >

28 Advanced patients, fairly stable
< { (EV C)(HS 3)(RT 0)(WBC -1)(HCT -1)(PLT 1)(CD4P -4)(CD4A -4)(LMPH 0)(onD ) }
  { (EV C)(HS 3)(RT 0)(WBC 0)(HCT 0)(PLT -1)(CD4P -4)(CD4A -4)(LMPH 0)(onD ) }
  { (EV C)(HS 3)(RT 0)(onD ) }
  { (EV C)(HS 3)(RT 0)(WBC -2)(HCT 0)(PLT -1)(CD4P -4)(CD4A -4)(LMPH 0)(onD ) }
  { (EV C)(HS 4)(RT 1)(WBC 1)(HCT -4)(PLT 0)(CD4P -4)(CD4A -4)(LMPH -4)(onD ) }
  { (EV C)(HS 3)(RT 3)(onD ) }
  { (EV )(HS 3)(RT 1)(WBC 0)(HCT 0)(PLT 0)(LMPH 0)(onD ) }
  { (EV C)(HS 3)(RT 0)(CD4A -4)(onD ) } >

29 Asymptomatic period
< { (EV C)(HS 1)(RT 0)(onD ) }
  { (EV C)(HS 1)(RT 0)(onD ) }
  { (EV C)(HS 1)(RT 1)(onD ) }
  { (EV C)(HS 1)(RT 0)(onD ) }
  { (EV E)(HS 1)(RT 0)(WBC -1)(HCT 0)(PLT 1)(CD4P -1)(CD4A -2)(LMPH 0)(onD ) }
  { (EV C)(HS 1)(RT 0)(onD ) }
  { (EV C)(HS 1)(RT 0)(CD4A 0)(onD ) }
  { (EV E)(HS 1)(RT 0)(WBC 1)(HCT 0)(PLT 0)(CD4P 0)(CD4A 0)(LMPH 0)(onD ) }
  { (EV C)(HS 1)(RT 0)(onD ) }
  { (EV C)(HS 1)(RT 0)(onD ) } >

30 Summary Nine Steps of KDD –Identify goal –Identify target data set –Data cleaning and preprocessing –Data reduction and projection –Identify data mining method

31 Summary Nine Steps of KDD –Exploratory Analysis –Data Mining –Interpretation of Mined Patterns –Acting on Discovered Knowledge

32 Conclusions Objective Met with Contributions –Patterns discovered representing groups of patients with similar experience in course of disease –This perspective on the data has not previously been produced –This kind of computation on this kind of data has not previously been performed

33 Future Work Improve discovery algorithm –Backtracking is a barrier to overcome Improve search control Develop heuristic for measuring interestingness Add ability to identify clinically identical/similar patterns

34 Future Work Move database to new Intelligent Systems in Medicine and Biology Lab Bring database up to date Include more domain data in Event Sets Explore impact of new developments in HIV treatment