Decision Trees Showcase


Decision Trees Showcase, CS548 Fall 2018, by Holly Nguyen, Hongyu Pan, Lei Shi, and Muhammad Tahir. Showcasing work by Themis P. Exarchos, Alexandros T. Tzallas, Dina Baga, Dimitra Chaloglou, Dimitrios I. Fotiadis, Sofia Tsouli, Maria Diakou, and Spyros Konitsiotis on "Using partial decision trees to predict Parkinson's symptoms: A new approach for diagnosis and therapy in patients suffering from Parkinson's disease." 20 sec.

References
T. P. Exarchos, A. T. Tzallas, D. Baga, D. Chaloglou, D. I. Fotiadis, S. Tsouli, M. Diakou, S. Konitsiotis, "Using partial decision trees to predict Parkinson's symptoms: A new approach for diagnosis and therapy in patients suffering from Parkinson's disease," Computers in Biology and Medicine, vol. 42, no. 2, pp. 195-204, 2011.
E. Frank, "Pruning decision trees and lists," 2000.
E. Frank, I. H. Witten, "Generating accurate rule sets without global optimization," ICML, 1998.
J. R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann Publishers, 1993.
Partial decision trees:
https://www.cs.waikato.ac.nz/~eibe/pubs/thesis.final.pdf
https://pdfs.semanticscholar.org/3998/12d46345ad7d93f5510b1bbda30948e7a65c.pdf
https://octaviansima.wordpress.com/2011/03/25/decision-trees-c4-5/
https://www.slideshare.net/aorriols/lecture6-c45
https://en.wikipedia.org/wiki/C4.5_algorithm

30 sec: Parkinson's disease. In simple terms, Parkinson's disease affects a patient's movement, but it is far more complex than that. Initial symptoms include tremors, muscle rigidity, and slowness of movement. The aim of medication and therapy is to alleviate the symptoms.

30 sec: Decision Support System. The general concept of this study is the extraction of knowledge and rules from a set of data and the development of decision support systems for diagnosis, prediction, and classification. A decision support system is a system that supports decision-making activities.

PERFORM System: develop the prediction model and extract rules. 1 min: PERFORM system.

Methodology: data collection, data preprocessing, data mining.
Data collection: 230 Parkinson's disease patients from the University Hospital of Ioannina.
Medical history (e.g., gender, current age, age at diagnosis, smoker, diabetes, family history of PD)
Examinations (e.g., tremor, tremor severity, postural instability, falls, freezing, dementia)
Medication (i.e., 5 medications and the number of years since each was first prescribed)
Data preprocessing: "feature wrapper" (supervised feature selection).
Data mining: prediction of symptoms (partial decision trees → rules).
Holly, 1 minute: During data collection, a patient's daily motor activity is captured through sensors, and the LBU classifies it according to the UPDRS.

Wrapper for Feature Selection. This use case applied a wrapper to each symptom input to determine its "worth" (i.e., accuracy estimates), treating the target learning algorithm as a black box. Holly, 30 seconds. Wrapper approach for feature selection: https://pdfs.semanticscholar.org/0048/88621a4e4cee56b6633338a89aa036cf5ae5.pdf
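The wrapper idea above can be sketched in a few lines of Python. This is an illustrative toy, not the paper's implementation: it wraps a leave-one-out 1-nearest-neighbour classifier as the black box and greedily adds whichever feature most improves the accuracy estimate. All function names here are hypothetical.

```python
import math

def loo_accuracy(rows, labels, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    restricted to the given feature indices (the "black box")."""
    if not features:
        return 0.0
    correct = 0
    for i, row in enumerate(rows):
        best_d, best_label = math.inf, None
        for j, other in enumerate(rows):
            if i == j:
                continue  # leave the query point out of its own neighbours
            d = sum((row[f] - other[f]) ** 2 for f in features)
            if d < best_d:
                best_d, best_label = d, labels[j]
        correct += (best_label == labels[i])
    return correct / len(rows)

def wrapper_select(rows, labels, n_features):
    """Greedy forward selection: repeatedly add the feature whose
    inclusion most improves the wrapped classifier's accuracy."""
    selected, best_acc = [], 0.0
    improved = True
    while improved:
        improved = False
        for f in range(n_features):
            if f in selected:
                continue
            acc = loo_accuracy(rows, labels, selected + [f])
            if acc > best_acc:
                best_acc, best_f, improved = acc, f, True
        if improved:
            selected.append(best_f)
    return selected, best_acc
```

On a toy dataset where feature 0 separates the classes and feature 1 is noise, the wrapper keeps only feature 0. The key design point is that the feature subset is scored by the learner's own accuracy estimate rather than by a learner-independent filter statistic.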

About Partial Decision Trees. The decision tree generated by the C4.5 algorithm, developed by Ross Quinlan, is likely to overfit: it becomes too specific to correctly classify the test data set. A partial decision tree is a smaller tree obtained by pruning the original decision tree. Lei. https://www.cs.waikato.ac.nz/~eibe/pubs/thesis.final.pdf

About Partial Decision Trees. Process: expand the tree from root to leaf; backtrack; consider replacement during backtracking. Tree pruning method: post-pruning. Pruning condition: replacing a node with a leaf will not increase the validation error rate. Lei. https://pdfs.semanticscholar.org/3998/12d46345ad7d93f5510b1bbda30948e7a65c.pdf

About Partial Decision Trees. Procedure Expand Subset:
Choose the feature with the largest information gain to split the data into subsets.
While there are subsets that have not been expanded, and all subsets expanded so far are leaves: expand the unexpanded subset with the smallest entropy.
If all the expanded subsets are leaves, and the combined validation error rate of the leaves >= the validation error rate of the node: undo the expansion into subsets and make the node a leaf.
Lei. https://pdfs.semanticscholar.org/3998/12d46345ad7d93f5510b1bbda30948e7a65c.pdf https://www.slideshare.net/aorriols/lecture6-c45
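The procedure above turns on two quantities: the information gain used to pick the split feature, and the entropy used to pick which subset to expand next. A minimal sketch of both, plus the smallest-entropy expansion heuristic (helper names are illustrative, not the authors' code):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def split(rows, labels, feature):
    """Partition the labels into subsets keyed by the feature's value."""
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[feature], []).append(label)
    return subsets

def information_gain(rows, labels, feature):
    """Entropy of the parent minus the size-weighted entropy of the
    subsets produced by splitting on `feature`."""
    n = len(labels)
    subsets = split(rows, labels, feature)
    remainder = sum(len(s) / n * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder

def next_subset_to_expand(subsets):
    """Partial-tree heuristic: expand the subset with the smallest
    entropy first, since it is the most likely to become a leaf."""
    return min(subsets, key=lambda v: entropy(subsets[v]))
```

For example, with labels ['no', 'no', 'yes', 'yes'] the parent entropy is 1.0 bit, and a feature that splits the data perfectly into pure subsets achieves the maximum gain of 1.0. Expanding low-entropy subsets first lets the algorithm settle leaves early and abandon (prune) a node quickly when its expansion does not pay off on validation data.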

Stepping Through Partial Decision Trees. Gray node: unexpanded; black node: leaf; white node: expanded. Lei. https://pdfs.semanticscholar.org/3998/12d46345ad7d93f5510b1bbda30948e7a65c.pdf

STAGE 1. Gray node: unexpanded; black node: leaf; white node: expanded. (Tree diagram.) Lei. https://pdfs.semanticscholar.org/3998/12d46345ad7d93f5510b1bbda30948e7a65c.pdf

STAGE 2. Gray node: unexpanded; black node: leaf; white node: expanded. (Tree diagram.) Lei. https://pdfs.semanticscholar.org/3998/12d46345ad7d93f5510b1bbda30948e7a65c.pdf

STAGE 3. Gray node: unexpanded; black node: leaf; white node: expanded. (Tree diagram.) Lei. https://pdfs.semanticscholar.org/3998/12d46345ad7d93f5510b1bbda30948e7a65c.pdf

STAGE 4. Gray node: unexpanded; black node: leaf; white node: expanded. (Tree diagram.) Lei. https://pdfs.semanticscholar.org/3998/12d46345ad7d93f5510b1bbda30948e7a65c.pdf

STAGE 5. Gray node: unexpanded; black node: leaf; white node: expanded. (Tree diagram.) Lei. https://pdfs.semanticscholar.org/3998/12d46345ad7d93f5510b1bbda30948e7a65c.pdf

Results -- Predicted Symptoms: Tremor, Rigidity, Bradykinesia, Postural instability, Falls, Freezing, Autonomic, Hypophonia, Hypomimia, Orthostatic hypotension, RBD, LID, On/off, Sudden offs, Dementia. PREDICTOR. Hongyu: 10-fold cross-validation; accuracies range from 57.1% to 77.4%. The highest accuracy is achieved for Tremor, the most common symptom in PD and one for which there is a relative balance between patients in whom the symptom occurs and those in whom it does not. High accuracies are also achieved for Rigidity, Bradykinesia, Falls, and LID, some of the most important symptoms in PD. Lower accuracy is obtained for symptoms such as Postural instability, Autonomic, Hypophonia, Hypomimia, Orthostatic hypotension, RBD, Sudden offs, and Dementia. This can be attributed to the high imbalance between the two categories (occurrence or not) for these symptoms, which leaves relatively few records for training the algorithms and building the predictive models.
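The 10-fold cross-validation protocol the note refers to can be sketched as follows. This is an illustrative outline, not the paper's code: `train_fn` is a hypothetical callable that fits a model and returns a prediction function, and contiguous unstratified folds are used for brevity (stratified folds would be preferable under the class imbalance discussed in the note).

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_val_accuracy(rows, labels, train_fn, k=10):
    """Mean accuracy over k folds: train on k-1 folds, test on the held-out one."""
    folds = k_fold_indices(len(rows), k)
    accs = []
    for test_idx in folds:
        test_set = set(test_idx)
        train_rows = [r for i, r in enumerate(rows) if i not in test_set]
        train_labels = [l for i, l in enumerate(labels) if i not in test_set]
        predict = train_fn(train_rows, train_labels)  # fit on k-1 folds
        correct = sum(predict(rows[i]) == labels[i] for i in test_idx)
        accs.append(correct / len(test_idx))
    return sum(accs) / len(accs)
```

One caveat this sketch makes visible: on a 90/10 imbalanced label set, even a majority-class baseline scores around 90% mean accuracy, which is why raw accuracy on rare symptoms should be read with care.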

Results -- Example of generated rule 1. Hongyu.

Results -- Example of generated rule 2. Hongyu.

Results -- Accuracy of rules. Hongyu.

Results -- Why this method makes a difference. It addresses all PD symptoms and thus provides an integrated approach to treatment. The user receives an interpretation for every prediction, via the rule that matches the new patient's data. Medical experts gain a view of their patients' clinical condition at home and can make better-informed treatment changes. Clinical trials will be able to assess a newly developed drug's response and safety more objectively. Hongyu.

Results -- Why this method makes a difference (cont.). The model can be readily interpreted by medical professionals without specific knowledge of statistics, because a simple representation is used across different areas of medical decision making. The same model can be used to predict the severity of the occurring symptoms by age. It predicts future symptoms without the patient's presence in the hospital and without time-consuming physical examinations and processes, so more data will become available. Predicting symptoms using only easily obtained features is a difficult task; the paper reports promising results and anticipates further improvement with data from additional patient records. Hongyu.

Thank you! Questions? Muhammad

Short Description of Application. This article proposes a new method using partial decision trees and association rules to predict Parkinson's disease (PD) symptoms from features relating to medical history, physical examinations, and medications. The partial decision trees and corresponding rules were applied to a dataset of 230 PD patients from the University Hospital of Ioannina. The proposed method for PD symptom prediction was designed for use in the PERFORM system, which uses individual wearable monitors to capture motor information that can then be used to predict new PD symptoms for each patient.