An Intelligent System for Diabetes Prediction
Zhilbert Tafa
Presented at the Mediterranean Conference on Embedded Computing (MECO'2015), Budva, Montenegro
MECO'2015, Budva, June 2015, Montenegro

Outline
- Facts
- Objectives
- Materials
- Methods
- Results
- Conclusions
Facts
- Diabetes currently affects around 346 million people.
- It is also a major cause of:
  - Heart attack and stroke
  - Kidney failure
  - Lower-limb amputation
  - Blindness
- About one-third of cases go undetected in the early stage.
- Early detection and treatment bring substantial health benefits by avoiding or minimizing these complications.
Objectives
- Final objective: contribute to reducing the time between diabetes onset and clinical diagnosis.
- Show the efficiency of computer-based methods.
- Improve reliability by using a hybrid, semi-automatic system for diabetes prediction.
Existing approaches
- Machine learning is the area of artificial intelligence that relies on statistical data analysis.
- It is recognized as a promising area that can help classify patients by medical condition.
- Many methods have been used so far: SVM, k-Nearest Neighbors, Naïve Bayes, Neural Networks…
- Some studies have shown that SVM outperforms other algorithms in detecting diabetes.
Overview of the materials/approach
- The dataset was acquired from 402 patients at three different locations in Kosovo.
- The dataset contains some new features (physical activity, diet) that are recognized as important in medical examinations.
- The sample size ensures a confidence level of 95% and a confidence interval of 5%.
- Approach: individual and joint implementation of SVM and Naïve Bayes classification, although some other methods were also tested…
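The sample-size statement can be sanity-checked with Cochran's formula for estimating a proportion. The slides do not say which formula was used, so the formula, the worst-case proportion p = 0.5, and the large-population approximation are all assumptions of this sketch.

```python
import math

def required_sample_size(z: float = 1.96, margin: float = 0.05, p: float = 0.5) -> int:
    """Cochran's formula n = z^2 * p * (1 - p) / margin^2, rounded up.

    z = 1.96 corresponds to a 95% confidence level; p = 0.5 is the
    worst-case (assumed) proportion; margin is the half-width of the interval.
    """
    return math.ceil(z * z * p * (1.0 - p) / (margin * margin))

n_required = required_sample_size()
# Compare the assumed requirement with the 402 patients reported on the slide.
print(n_required, 402 >= n_required)
```

Under these assumptions, 402 patients is slightly above the required minimum.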
Dataset description
- Regular diet: patients were asked to estimate whether:
  - they take their meals at approximately the same times each day
  - they take their meals at least 3 times/day
  - their meals are not voluminous
- An adult is considered physically active if he/she performs 150-200 min of physical activity a week.

Attribute                     Value range
                              From      To
BMI                           15        40
Pre-meal glucose              3.5       19
Post-meal glucose             4.9       22.8
Diastolic blood pressure      55        110
Systolic blood pressure       90        200
Family history of diabetes    No (0)    Yes (1)
Regular diet                  No (0)    Yes (1)
Physical activities           No (0)    Yes (1)
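The two questionnaire-based features could be encoded as 0/1 values roughly as follows. This is a minimal sketch: the function names are hypothetical, and it assumes that all three diet conditions must hold and that 150 min/week is the lower bound for "physically active"; the slides do not state either rule explicitly.

```python
def encode_regular_diet(same_daily_intervals: bool,
                        meals_per_day: int,
                        voluminous_meals: bool) -> int:
    """Return 1 if all three (assumed conjunctive) diet conditions hold, else 0."""
    return int(same_daily_intervals and meals_per_day >= 3 and not voluminous_meals)

def encode_physical_activity(minutes_per_week: float,
                             threshold_minutes: float = 150.0) -> int:
    """Return 1 if weekly activity reaches the assumed 150-minute lower bound."""
    return int(minutes_per_week >= threshold_minutes)

# Hypothetical patient answers, for illustration only.
print(encode_regular_diet(True, 3, False))   # 1
print(encode_physical_activity(120))         # 0
```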
SVM and Naïve Bayes Classification
- SVM creates a classifier by maximizing the margin between the classes and placing the separating hyperplane between the support vectors.
- The Naïve Bayes classifier relies on statistical analysis: it is built upon the probabilities of each feature value belonging to either class A or class B.
- Even though it is based on unrealistic assumptions (features are independent and equally important), the idea leads to a simple scheme that works well in practice.
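To make the Naïve Bayes idea concrete, here is a small pure-Python sketch that treats each feature as independent and models it with a per-class Gaussian. The Gaussian likelihood is an assumption (the slides do not say which variant was used), the data are made up, and this is not the authors' MATLAB implementation. The SVM side is not sketched here.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate the class prior and per-feature mean/variance for each class."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)
    model = {}
    for label, rows in rows_by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (n / len(X), means, variances)
    return model

def predict_gaussian_nb(model, row):
    """Pick the class with the highest log-posterior, assuming independent features."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up toy data: [BMI, pre-meal glucose]; 0 = NO, 1 = YES.
X = [[22.0, 4.8], [24.0, 5.1], [23.0, 5.0], [33.0, 9.5], [35.0, 11.0], [31.0, 8.7]]
y = [0, 0, 0, 1, 1, 1]
model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(model, [23.5, 5.2]), predict_gaussian_nb(model, [34.0, 10.0]))
```

Summing log-probabilities across features is exactly where the independence assumption enters: each feature contributes its own term regardless of the others.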
The experiment (1)
- The application was built in MATLAB.
- The data were randomly divided into a training set and a test set.
- The classifiers (SVM and NB) were built on the training set.
- The instances of the test set were used to test the classifiers.
- To avoid the bias of a "wrong split", the procedure was repeated 100 times with different random splits.
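The repeated random-split procedure can be sketched in Python (the original application was written in MATLAB). The 70/30 split ratio, the function names, the placeholder threshold classifier, and the toy data are all assumptions made for illustration; the slides only state that the split was random and repeated 100 times.

```python
import random

def fit_threshold(X, y):
    """Placeholder classifier: threshold halfway between the class means of feature 0."""
    pos = [row[0] for row, label in zip(X, y) if label == 1]
    neg = [row[0] for row, label in zip(X, y) if label == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0

def predict_threshold(threshold, row):
    """Predict class 1 when feature 0 exceeds the learned threshold."""
    return 1 if row[0] > threshold else 0

def repeated_holdout(X, y, fit, predict, repeats=100, test_fraction=0.3, seed=0):
    """Average test accuracy over `repeats` independent random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    n_test = max(1, int(round(test_fraction * len(X))))
    accuracies = []
    for _ in range(repeats):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        accuracies.append(correct / n_test)
    return sum(accuracies) / repeats

# Made-up, cleanly separable toy data (one feature per patient).
X = [[4.0 + 0.2 * i] for i in range(10)] + [[8.0 + 0.2 * i] for i in range(10)]
y = [0] * 10 + [1] * 10
mean_accuracy = repeated_holdout(X, y, fit_threshold, predict_threshold)
print(mean_accuracy)
```

Averaging over many splits reduces the chance that a single lucky or unlucky partition dominates the reported figure.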
The experiment (2)
- Given the output results, the individual average classification measures were then estimated.
- On each iteration, the common correct and wrong answers, as well as the differing classifications of SVM and NB, were also evaluated.
- This relates to the hybrid implementation, where two conditions are considered:
  - the two algorithms point to different outputs
  - both algorithms point to the same result
- Finally, the average classification measures of the hybrid system were estimated from the output results.
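The per-iteration bookkeeping described above (common correct answers, common wrong answers, and differing classifications) might look like the following sketch. The function name and the example label vectors are hypothetical.

```python
def agreement_breakdown(y_true, pred_a, pred_b):
    """Count cases where two classifiers agree correctly, agree wrongly, or disagree."""
    counts = {"both_correct": 0, "both_wrong": 0, "disagree": 0}
    for truth, a, b in zip(y_true, pred_a, pred_b):
        if a != b:
            counts["disagree"] += 1
        elif a == truth:
            counts["both_correct"] += 1
        else:
            counts["both_wrong"] += 1
    return counts

# Made-up labels for one test split: 0 = NO, 1 = YES.
y_true   = [0, 0, 1, 1, 0, 1]
svm_pred = [0, 0, 1, 0, 1, 1]
nb_pred  = [0, 0, 1, 0, 0, 1]
counts = agreement_breakdown(y_true, svm_pred, nb_pred)
agreed = counts["both_correct"] + counts["both_wrong"]
print(counts, agreed / len(y_true), counts["both_correct"] / agreed)
```

The last two printed values are the share of cases where the classifiers agree and the accuracy restricted to those agreed cases.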
The experiment (3)
- GUI – two modes.
Results: individual implementation
- The average performance of SVM and NB individually.
- Precision: of all the X's found, how many were actually correct.
- Recall: of all the real X's, how many were found.
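The two definitions above translate directly into code. In this sketch the function name and the example labels are made up; "X" from the slide corresponds to the `positive` argument.

```python
def precision_recall(y_true, y_pred, positive):
    """Precision and recall for the class `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # found and correct
    fp = sum(t != positive and p == positive for t, p in pairs)  # found but wrong
    fn = sum(t == positive and p != positive for t, p in pairs)  # real but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Made-up example labels.
y_true = ["NO", "NO", "NO", "YES", "YES", "NO"]
y_pred = ["NO", "NO", "YES", "YES", "NO", "NO"]
print(precision_recall(y_true, y_pred, "NO"))   # (0.75, 0.75)
print(precision_recall(y_true, y_pred, "YES"))  # (0.5, 0.5)
```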
Results: hybrid implementation
- After 100 iterations, the results show that in 94.8% of cases both classifiers point to the same result.
- Under this condition:

          Precision (NO)    Overall accuracy
Hybrid    99.3%             97.6%
Discussion (1)
- We consider the overall accuracy and the class NO precision to be the more important measures: YES leads to further (periodic) examinations, while NO often means no further examinations.
- Regarding these two criteria, the algorithms can be used individually, with SVM having slightly better overall performance.
  - SVM: performance 95.52%, class NO precision 97%
  - NB: performance 94.53%, class NO precision 98%
Discussion (2)
- Regarding these two criteria, the hybrid algorithm improves performance:
  - If the two algorithms point to different results (around 5% of cases), the user is notified and the patient is referred for further medical examination.
  - If both algorithms point to the same result (around 95% of cases), the average system accuracy is around 98% and the class NO precision is 99.3%.
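The hybrid rule described on this slide can be expressed as a short function. The function name, the "REFER" marker, and the example outputs are hypothetical; how the original application notifies the user is not shown in the slides.

```python
def hybrid_decision(svm_label, nb_label, referral="REFER"):
    """Return the shared label, or a referral marker when the classifiers disagree."""
    return svm_label if svm_label == nb_label else referral

# Made-up classifier outputs for four patients.
svm_out = ["NO", "YES", "NO", "NO"]
nb_out  = ["NO", "YES", "YES", "NO"]
decisions = [hybrid_decision(s, n) for s, n in zip(svm_out, nb_out)]
print(decisions)  # ['NO', 'YES', 'REFER', 'NO']
```

Only the agreed cases receive an automatic answer; the disagreeing case is handed back for further medical examination.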
Conclusions (1)
- Both the SVM classifier and the NB classifier perform well on the given dataset.
- The hybrid classifier increases the reliability of the computer-based diagnostic process.
Conclusions (2)
- The presented approaches can contribute to timely and more precise (pre)diabetes diagnosis.
- The presented approaches are flexible: the data can be updated periodically to follow changes that occur due to various factors.
- This can dynamically improve the classifiers.
Conclusions (3)
- Therefore, the presented systems could be used to support medical decision making in health care environments.
- The system could also be adapted for use in the home environment:
  - Users can use home glucose meters to measure their blood glucose level.
  - They can calculate their BMI themselves.
  - Users can provide the other input parameters as well.
- However, the system needs further adjustments and clinical testing.
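For the home-use scenario, the BMI input can be computed with the standard formula (weight in kilograms divided by the square of height in metres). The function name and the example values are illustrative only.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Standard BMI: weight (kg) divided by height (m) squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / (height_m ** 2)

# Hypothetical user: 70 kg, 1.75 m.
print(round(body_mass_index(70.0, 1.75), 1))  # 22.9
```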
THANK YOU
Q&A
Zhilbert Tafa
tafaul@t-com.me