Time to CARE: A collaborative engine for practical disease prediction


Time to CARE: A collaborative engine for practical disease prediction. D. Davis et al. (2009), Data Mining and Knowledge Discovery. Speaker: Sang Ho Oh, Feb. 20, 2018

Introduction Annual health care expenditure in the U.S. alone is an overwhelming sum, and the majority of this money is spent on disease treatment. Experts expect the burden on the medical system to keep increasing in the coming years. In 2001, 3.1 visits per patient were made to physicians. Researchers have shown that many conditions have recognizable indicators before onset, or preventable risk factors, so prospective medicine aimed at minimizing risk is feasible. Current situation: physicians can use family history, health history, and physical examination to approximate a patient's risk, but medical care remains reactive, stepping in only once symptoms have emerged. How to prevent? The prevailing model of prospective health care relies on the genome revolution, which has not yet matured. What is the alternative? Phenotype-based and disease-history-based approaches offer the promise of advances towards disease prediction.

Purpose of the study Aim of the study: development of a predictive system called CARE (Collaborative Assessment and Recommendation Engine). How? By examining the use of medical history. For what? To extract information about disease correlations and to assess risk inexpensively. How to predict the future diseases a patient may develop? Generate a patient's prognosis based on the experiences of other, similar patients. Method used in the study: collaborative filtering (explained on the next slide). Contributions of the study: a novel application of collaborative filtering in the medical domain that advances the field of prospective medicine, and a general system that makes predictions on all types of diseases and medical conditions (using ICD-9-CM). *ICD-9-CM: International Classification of Diseases, 9th revision, Clinical Modification codes.

Collaborative filtering It is designed to predict the preferences of one person (the active user) based on the preferences of other, similar persons (users). Assumption: people will enjoy the same items as their similar peers, so having some preferences in common is a strong predictor of additional common preferences. Predictions are based on datasets consisting of many user profiles. This is accomplished by calculating a similarity weight between the active user and every other user; the active user's opinion is then estimated as the weighted average of the others' opinions. How is it applied in the medical area? Each user is a patient whose profile is the set of diseases they have been diagnosed with. Using collaborative filtering, the authors generate predictions about other diseases based on a set of similar patients. Difference between the original and the modified version of collaborative filtering: the rating is binary, so a patient either has a disease (1) or does not (0).
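
The weighted-average idea on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's exact scoring rule: the function name `predict_risk`, the patient IDs, and the weights are hypothetical, and the similarity weights are taken as given.

```python
def predict_risk(weights, has_disease):
    """Weighted average of binary votes (1 = has the disease, 0 = does not).

    weights:     dict patient_id -> similarity weight to the active patient
    has_disease: dict patient_id -> 1 or 0 for the disease being scored
    Both inputs are hypothetical structures chosen for this sketch.
    """
    total = sum(weights.values())
    if total == 0:
        return 0.0  # no similar patients, so no evidence either way
    weighted_votes = sum(w * has_disease.get(pid, 0) for pid, w in weights.items())
    return weighted_votes / total


# Three training patients with made-up similarity weights.
weights = {"p1": 0.8, "p2": 0.2, "p3": 0.0}
votes = {"p1": 1, "p2": 0, "p3": 1}
score = predict_risk(weights, votes)
```

Because the votes are binary, the score is simply the share of total similarity weight held by patients who have the disease; a patient with zero weight (p3) has no influence on it.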

Data used The database comprises the Medicare records of 13,039,018 elderly patients in the U.S., with a total of 32,341,348 visits. The input to the methods is each patient's diagnosis history, provided per inpatient visit. Each data record consists of a hospital visit, a patient ID, and a list of up to 10 diagnosis codes for that visit. The diagnosis codes follow the International Classification of Diseases, 9th revision, Clinical Modification (ICD-9-CM). Each disease is given a unique code that can be up to 5 characters long. ICD-9-CM codes are hierarchical, so a code can be collapsed to fewer characters, which then identifies a small family of related medical conditions. In total, 18,207 unique disease codes appear in the data. *Example of collapsing a code: 40201 is malignant hypertensive heart disease with heart failure; 4020 is non-specific malignant hypertensive heart disease; 402 is the family of all hypertensive heart diseases.
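
The collapsing example above amounts to truncating the code string. A minimal sketch, assuming codes are stored as plain strings; the function name `collapse_icd9` and the sample record are hypothetical. It ignores the fact that ICD-9-CM E-codes use a four-character category (E plus three digits), so those would need separate handling.

```python
def collapse_icd9(code, length=3):
    """Truncate an ICD-9-CM code to its first `length` characters.

    Assumes the code is a string with or without the decimal point,
    e.g. "40201" or "402.01". E-codes are not handled specially.
    """
    cleaned = code.replace(".", "").strip()
    return cleaned[:length]


# A hypothetical visit record: (visit ID, patient ID, up to 10 diagnosis codes).
record = ("visit-1", "patient-7", ["40201", "4280", "25000"])
collapsed = [collapse_icd9(c) for c in record[2]]
four_char = collapse_icd9("40201", length=4)
```

Here `collapsed` holds the three-character family for each diagnosis on the visit, and `four_char` is the intermediate level from the slide's example.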

The CARE methodology The testing patient (denoted $a$) is the individual for whom predictions are made, based on the histories of the training patients (denoted $I$, with each individual denoted $i \in I$). In the diagram, dotted lines represent optional methods. All patients are represented by their medical history. The training set is restricted to patients with at least two diseases in common with the testing patient, which yields a group of patients similar to the testing patient. Collaborative filtering is then performed, generating predictions for the future visits of the testing patient. The multiple resulting predictions are combined. The output is a list of diseases for the subsequent visit of the testing patient, ranked from the highest risk to the lowest.

Vector similarity Collaborative filtering is used to make a prediction $p(a,j)$ for an active user $a$ on item $j$, based on the similarity between user $a$ and every other user $i$ who has previously given a vote $v_{i,j}$ for that item. In the prediction formula, $\bar{v}_i$ is the average vote of user $i$; $k$ is a normalizing constant that makes the weights sum to 1; $I$ is the entire training set of users; and $I_j$ is the subset of users who have voted on $j$. The similarity $w(a,i)$ is calculated by vector similarity, where $J_i$ is the set of items rated by user $i$.
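
For reference, the standard memory-based collaborative filtering equations that fit these symbol definitions can be written as follows. This is a sketch taken from the general collaborative filtering literature and may differ in detail from the form used in the paper; the summation indices $l$ and $m$ and the set $J_a$ (items rated by the active user) are introduced here for illustration.

```latex
p(a,j) = \bar{v}_a + k \sum_{i \in I_j} w(a,i)\,\bigl(v_{i,j} - \bar{v}_i\bigr)

w(a,i) = \sum_{l \in J_a \cap J_i}
         \frac{v_{a,l}}{\sqrt{\sum_{m \in J_a} v_{a,m}^{2}}}
         \cdot
         \frac{v_{i,l}}{\sqrt{\sum_{m \in J_i} v_{i,m}^{2}}}
```

With binary votes, every rated item has $v = 1$, so the similarity reduces to $|J_a \cap J_i| / \sqrt{|J_a|\,|J_i|}$: the number of shared diseases divided by the geometric mean of the two history sizes.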

Inverse frequency The authors further extended the vector similarity equation to include inverse frequency, which gives lower weights to diseases that are very common in the training set. This is based on the intuition that sharing a rare disease has more impact on similarity than sharing a common one: patients can share many medical diagnoses, but the most important contributions arise from uncommon connections. The inverse frequency of disease $j$ is defined in terms of $n$, the number of patients in the training set, and $n_j$, the number of patients who have $j$. It is incorporated into the vector similarity by multiplying each disease vote by the corresponding inverse-frequency factor, which yields the inverse-frequency-weighted similarity.
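
A minimal Python sketch of inverse-frequency weighting, assuming the usual definition $f_j = \log(n / n_j)$ from the collaborative filtering literature (the slide names $n$ and $n_j$ but the formula itself is an assumption here). The function names and the toy histories are hypothetical.

```python
import math
from collections import Counter


def inverse_frequency(histories):
    """Return f_j = log(n / n_j) for every disease code in the training set.

    histories: dict patient_id -> set of disease codes (hypothetical layout).
    """
    n = len(histories)
    counts = Counter(code for codes in histories.values() for code in set(codes))
    return {code: math.log(n / n_j) for code, n_j in counts.items()}


def if_similarity(a_codes, i_codes, freq):
    """Vector similarity on binary votes, each vote scaled by f_j."""
    a, i = set(a_codes), set(i_codes)
    shared = sum(freq.get(j, 0.0) ** 2 for j in a & i)
    norm_a = math.sqrt(sum(freq.get(j, 0.0) ** 2 for j in a))
    norm_i = math.sqrt(sum(freq.get(j, 0.0) ** 2 for j in i))
    if norm_a == 0 or norm_i == 0:
        return 0.0
    return shared / (norm_a * norm_i)


# Toy training set: 401 and 272 are common, 250 is rare.
histories = {
    "p1": {"401", "272"},
    "p2": {"250", "272"},
    "p3": {"401"},
    "p4": {"401", "272"},
}
freq = inverse_frequency(histories)
active = {"250", "401"}
sim_common = if_similarity(active, histories["p1"], freq)  # shares common code 401
sim_rare = if_similarity(active, histories["p2"], freq)    # shares rare code 250
```

Both training patients share exactly one code with the active patient, but the patient who shares the rare code receives the larger weight. A code carried by every training patient gets $f_j = 0$ and stops contributing to similarity altogether.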

Grouping of training patients Before collaborative filtering is applied, a group of relevant training patients is selected, based on the number of diagnoses they have in common with the testing patient. Why? To remove the influence of patients who have little or no similarity. Training patients with no disease in common with the active patient do not contribute to the prediction score, so removing them loses no information and reduces the runtime of the algorithm. How does it work in CARE? CARE includes all patients with 2 or more diseases in common with the testing patient. This constraint enforces stronger similarity among all patients that influence the predictions and helps to avoid noise.
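
The grouping step can be sketched as a simple filter on shared diagnoses. This is an illustrative sketch, assuming each history is a set of codes; the function name `select_training_group` and the sample data are hypothetical, while the threshold of 2 comes from the slide.

```python
def select_training_group(active_codes, histories, min_shared=2):
    """Keep training patients sharing at least `min_shared` codes with the active patient.

    histories: dict patient_id -> set of disease codes (hypothetical layout).
    """
    active = set(active_codes)
    return {
        pid: codes
        for pid, codes in histories.items()
        if len(active & set(codes)) >= min_shared
    }


histories = {
    "p1": {"250", "401", "428"},  # shares 250 and 401
    "p2": {"401"},                # shares only 401
    "p3": {"585", "272"},         # shares only 585
    "p4": {"250", "401"},         # shares 250 and 401
}
active = {"250", "401", "585"}
group = select_training_group(active, histories)
```

Only p1 and p4 survive the filter, so collaborative filtering then runs on a smaller and more similar set of patients.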

Optional methods: ICARE, ICD-9-CM code collapse, and time-sensitive CARE. ICARE stands for "Iterative CARE". This method was developed to capture the effect of each individual disease with minimal noise from other diseases, but without the loss of information that removing them would cause. ICD-9-CM code collapse: in some cases it is desirable to collapse 4- or 5-digit ICD-9-CM codes into the more general 3-digit codes, which represent small groups of related or similar diseases. There are two ways to do this: truncate to 3 digits before (pre-collapse) or after (post-collapse) applying collaborative filtering. Pre-collapse significantly reduces the runtime of the algorithm. Post-collapse makes the result simpler to evaluate and interpret. Time-sensitive CARE: CARE and ICARE do not take into account the order of, or the length of time between, disease diagnoses when computing vector similarity. However, a match on two diseases that occurred many years apart may not be relevant. For that reason, the authors modified the method to incorporate the length of time between medical events.

Experiments The evaluation measures performance on predicting diseases that occur at a later date than the data given to the collaborative filtering algorithm. Performance is determined from the overall list of predictions, ranked from the most likely to the least likely. Metrics used: Coverage: the percentage of diseases for which a prediction is made and ranked. Average rank: future diseases should appear at rank positions near the top of the list (numerically low). Half-life accuracy: measures the expected utility of the ranked list.
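
The three metrics can be sketched on a single ranked list as follows. This is a hedged illustration, not the paper's evaluation code: coverage is read here as the share of the patient's actual future diseases that appear anywhere in the list, and half-life accuracy follows the half-life utility commonly used for collaborative filtering, where an item's value halves every `half_life - 1` positions. The function name, the `half_life=5` default, and the sample codes are assumptions.

```python
def evaluate_ranking(ranked, actual, half_life=5):
    """Return (coverage, average rank, half-life accuracy) for one patient.

    ranked: list of disease codes, most likely first
    actual: set of codes the patient was later diagnosed with
    half_life must be greater than 1 in this sketch.
    """
    positions = {code: rank for rank, code in enumerate(ranked, start=1)}
    hits = sorted(positions[c] for c in actual if c in positions)

    coverage = len(hits) / len(actual) if actual else 0.0
    average_rank = sum(hits) / len(hits) if hits else None

    def discount(rank):
        # Utility of a correct prediction decays exponentially with rank.
        return 1.0 / 2 ** ((rank - 1) / (half_life - 1))

    utility = sum(discount(r) for r in hits)
    # Best case: every actual disease sits at the top of the list.
    best = sum(discount(r) for r in range(1, len(actual) + 1))
    half_life_accuracy = 100.0 * utility / best if best else 0.0
    return coverage, average_rank, half_life_accuracy


ranked = ["401", "250", "428", "585", "272"]
actual = {"250", "585", "999"}  # "999" is never predicted
coverage, average_rank, accuracy = evaluate_ranking(ranked, actual)
```

In this example two of the three future diseases are ranked (at positions 2 and 4), so coverage is two thirds and the average rank is 3; the half-life accuracy falls below 100 both because one disease is missing and because the hits are not at the very top.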

Performance trends These experiments check how performance changes with the amount of data known about the testing patient. They provide guidelines for the minimum amount of information needed for a meaningful result (better than the baseline) and a threshold for a good result. The visit and disease trends show that performance increases steadily as more information is known. In (a), just 1 visit is sufficient to outperform the baseline. (b) shows that a visit should have at least 3 diseases; beyond 35 diseases the data is too sparse to draw further conclusions. (c) shows that older diagnoses are less relevant to immediate concerns, which is an expected result.

Conclusions The goal of the paper is to build a system that can assist a medical practitioner in decision making. The authors proposed CARE, a collaborative recommendation engine for prospective and proactive healthcare. CARE, ICARE, and time-sensitive CARE can predict a patient's future diagnoses and present them to the doctor, so that appropriate medical tests can then be ordered. This improves the quality of life for the patient and can also reduce health care costs.

Thank you