1
Modeling Medical Records of Diabetes using Markov Decision Processes
H. Asoh¹, M. Shiro¹, S. Akaho¹, T. Kamishima¹, K. Hasida¹, E. Aramaki², T. Kohro³
¹National Institute of Advanced Industrial Science and Technology, ²Design School, Kyoto University, ³The University of Tokyo Hospital
Proceedings of the ICML 2013 Workshop on Role of Machine Learning in Transforming Healthcare
2
Introduction
Statement of the problem: Analyzing the long-term medical records of patients suffering from chronic diseases is beginning to be recognized as an important issue in medical data analysis.
Objective of the study: To obtain the optimal policy for the treatment of diabetes and compare it with the average policy of the doctors.
Method: They modeled the diabetes treatment data using a Markov decision process (MDP).
3
Data
Raw data: Medical records of heart disease patients treated at the University of Tokyo Hospital, covering more than 10,000 patients since 1987. The data include patient attributes, examination results, prescriptions of medicines, and surgical operations.
Data used: They restricted the data to patients who periodically attended the hospital and underwent examinations and treatment, used only records after January 1, 2000, and focused on the data related to diabetes, in particular the value of hemoglobin A1c (HbA1c).
4
MDP Model $\langle S, A, T, R \rangle$
$S$ — the set of states; $A$ — the set of actions
$T : S \times A \times S \to [0,1]$ — the state transition probability
$R : S \times A \times S \times \mathbb{R} \to [0,1]$ — the probability of immediate rewards
Policy $\pi : S \times A \to [0,1]$
Expectation of cumulative reward under policy $\pi$: $V^{\pi}(s) = E_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} r_{t} \,\middle|\, s_{0}=s\right]$, where $\gamma \in [0,1]$ is the discount factor.
The value of an action $a$ at state $s$ under the policy $\pi$ can be defined as $Q^{\pi}(s,a) = E_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} r_{t} \,\middle|\, s_{0}=s,\, a_{0}=a\right]$.
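As a concrete illustration of these definitions, here is a minimal Python sketch (not the authors' R toolbox) that represents $T$, $\pi$, and a state-dependent reward with toy numbers, and evaluates $V^{\pi}$ by solving the linear Bellman system. The reward-on-next-state convention and all numeric values are assumptions for illustration.

```python
# Minimal sketch of the MDP quantities defined above (toy numbers, not the
# paper's estimates). V^pi is obtained by solving the linear Bellman system.
import numpy as np

n_states, n_actions = 3, 2            # e.g. 3 HbA1c levels, 2 drug patterns
gamma = 0.95                          # discount factor (assumed value)
rng = np.random.default_rng(0)

# T[s, a, s'] = P(s' | s, a); each T[s, a, :] sums to 1.
T = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))

# State-dependent reward, received on entering a state (an assumed
# convention; the paper sets rewards per state on a doctor's advice).
r = np.array([1.0, 0.0, -1.0])        # favor "normal", penalize "severe"

# Stochastic policy pi[s, a] = probability of action a in state s.
pi = np.full((n_states, n_actions), 1.0 / n_actions)

# Marginal transition matrix under pi: P_pi[s, s'] = sum_a pi(s,a) T(s,a,s').
P_pi = np.einsum("sa,sax->sx", pi, T)

# Solve (I - gamma * P_pi) V = P_pi @ r  for V^pi.
V_pi = np.linalg.solve(np.eye(n_states) - gamma * P_pi, P_pi @ r)
print("V^pi =", V_pi)
```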
5
MDP Model: Optimal policy $\pi^{*}$
The optimal policy satisfies $V^{\pi^{*}}(s) \geq V^{\pi}(s)$ for every state $s \in S$ and every policy $\pi$. The state values of the optimal policy satisfy the Bellman optimality equation $V^{*}(s) = \max_{a} \sum_{s'} T(s,a,s')\left[\bar{r}(s,a,s') + \gamma V^{*}(s')\right]$, where $\bar{r}(s,a,s')$ denotes the expected immediate reward. Given the MDP and a policy, they evaluate the state values $V^{\pi}(s)$ and the action values $Q^{\pi}(s,a)$.
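A minimal value-iteration sketch for the Bellman optimality equation above, again with a toy model rather than the paper's estimated one:

```python
# Minimal value-iteration sketch for the Bellman optimality equation
# (toy model as in the previous sketch).
import numpy as np

n_states, n_actions, gamma = 3, 2, 0.95
rng = np.random.default_rng(0)
T = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
r = np.array([1.0, 0.0, -1.0])        # assumed state-dependent rewards

V = np.zeros(n_states)
for _ in range(10_000):
    # Q[s, a] = sum_s' T(s,a,s') * (r(s') + gamma * V(s'))
    Q = np.einsum("sax,x->sa", T, r + gamma * V)
    V_new = Q.max(axis=1)             # greedy backup
    if np.abs(V_new - V).max() < 1e-10:
        break
    V = V_new

pi_star = Q.argmax(axis=1)            # deterministic optimal policy
print("V* =", V_new, "pi* =", pi_star)
```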
6
State & Action
State: the value of hemoglobin A1c (HbA1c), discretized into three levels (normal, medium, severe); a hypothetical discretization is sketched below.
Action: pharmaceutical treatment. They grouped the drugs according to their functions and identified the patterns of drug-group combinations prescribed at a time. The number of combination patterns that appeared in the data was 38.
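For illustration only, a discretization function with hypothetical cut points; the thresholds actually used in the paper are not reproduced in this transcript.

```python
# Hypothetical discretization of HbA1c into the three states; the cut
# points are illustrative only and are NOT taken from the paper.
def hba1c_state(hba1c_percent: float) -> str:
    if hba1c_percent < 6.5:           # assumed "normal" threshold
        return "normal"
    if hba1c_percent < 8.0:           # assumed "medium" threshold
        return "medium"
    return "severe"

print([hba1c_state(v) for v in (5.8, 7.2, 9.1)])
# -> ['normal', 'medium', 'severe']
```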
7
Experiments with the data
To model and analyze the medical records using an MDP, they developed an MDP toolbox in R that can handle multiple episodes, estimate the parameters of an MDP, evaluate state and action values, and compute the optimal policy. From the records they estimated the MDP state transition probabilities $T$ and the doctors' policy $\pi$. For the reward, they set state-dependent values according to the opinion of a doctor.
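The estimation step can be sketched as simple transition counting (maximum likelihood). The episode format, lists of (state, action, next_state) index triples, is an assumption, and the toy episodes below stand in for the real records.

```python
# Sketch of maximum-likelihood estimation of T and the doctors' policy pi
# by counting transitions over episodes.
import numpy as np

n_states, n_actions = 3, 2
episodes = [
    [(0, 1, 0), (0, 0, 1), (1, 1, 2)],   # toy episodes, not real records
    [(2, 0, 1), (1, 0, 0)],
]

counts = np.zeros((n_states, n_actions, n_states))
for episode in episodes:
    for s, a, s_next in episode:
        counts[s, a, s_next] += 1

# T_hat[s, a, :]: normalized counts; unobserved (s, a) rows stay zero.
sa_totals = counts.sum(axis=2, keepdims=True)
T_hat = np.divide(counts, sa_totals,
                  out=np.zeros_like(counts), where=sa_totals > 0)

# pi_hat[s, a]: fraction of visits to s in which action a was chosen.
action_counts = counts.sum(axis=2)
s_totals = action_counts.sum(axis=1, keepdims=True)
pi_hat = np.divide(action_counts, s_totals,
                   out=np.zeros_like(action_counts), where=s_totals > 0)
```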
8
State & Action values under estimated MDP
They evaluated the patients' state values $V^{\pi}(s)$ based on the estimated transition probabilities $T$ and policy $\pi$, as well as the doctors' action values $Q^{\pi}(s,a)$ (see the appendix for all combinations).
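Given an estimated model, the doctors' action values follow from the state values by a one-step lookahead; a sketch continuing the toy example above:

```python
# Sketch: doctors' action values Q^pi(s, a) from an estimated model,
# continuing the toy example (T, r, pi, gamma as before).
import numpy as np

n_states, n_actions, gamma = 3, 2, 0.95
rng = np.random.default_rng(0)
T = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
r = np.array([1.0, 0.0, -1.0])
pi = np.full((n_states, n_actions), 1.0 / n_actions)

# V^pi from the linear Bellman system, then one-step lookahead for Q^pi.
P_pi = np.einsum("sa,sax->sx", pi, T)
V_pi = np.linalg.solve(np.eye(n_states) - gamma * P_pi, P_pi @ r)

# Q^pi(s, a) = sum_s' T(s,a,s') * (r(s') + gamma * V^pi(s'))
Q_pi = np.einsum("sax,x->sa", T, r + gamma * V_pi)
print(Q_pi)
```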
9
State & Action values under โoptimal policyโ
They obtained the optimal policy $\pi^{*}$ by value iteration on the MDP. The optimal action for each state is the same as the top action in Table 4. The state values under the optimal policy are larger than those under the doctors' policy. They noted that this does not mean the optimal policy would perform better for actual patients.
10
Evaluation of goodness of modeling
Evaluation of one-step prediction of patients' states: They divided the data into training data (90%) and test data (10%). Using the training data, they estimated the probabilities of the MDP. For each state transition in the test data, they evaluated the log-likelihood of the transition and averaged the values over all action steps, where $n_e$ denotes the number of action steps in episode $e$. They report the average log-likelihood achieved by this prediction.
Evaluation of doctors' action prediction: They evaluated the average log-likelihood of the actions in the test episodes, where the number of candidate actions was 38, and report the resulting value.
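The evaluation metric can be sketched as follows, reusing the assumed episode format and a model $\hat{T}$ estimated on the training split:

```python
# Sketch of the evaluation: average per-step log-likelihood of one-step
# state prediction on held-out episodes, using T_hat estimated on the
# training split. Episode format is the same assumption as above.
import numpy as np

def avg_state_loglik(test_episodes, T_hat, eps=1e-12):
    total, steps = 0.0, 0
    for episode in test_episodes:          # sum over episodes e
        for s, a, s_next in episode:       # and over the n_e steps in e
            total += np.log(T_hat[s, a, s_next] + eps)  # eps guards log(0)
            steps += 1
    return total / steps
```

The doctors' action prediction is scored the same way, with $\hat{\pi}(a \mid s)$ in place of $\hat{T}$ and 38 candidate actions.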
11
Conclusion: In this paper, they used a Markov decision process to model the long-term process of disease treatment. They estimated the parameters of the model from data extracted from patients' medical records. Using the model, they predicted the progression of patients' states and evaluated the value of treatments.
12
APPENDIX
13
Doctorsโ action values
Figure: Action values for the "normal" (left) and "medium" (right) states