COMPARING ASSOCIATION RULES AND DECISION TREES FOR DISEASE PREDICTION
Carlos Ordonez


MOTIVATION
Three main issues with mining association rules in medical datasets:
1. A significant fraction of the discovered association rules is irrelevant.
2. The most relevant rules with high quality metrics appear only at low support.
3. The number of discovered rules becomes extremely large at low support.
Search constraints address these issues:
- Find only medically significant association rules.
- Make the search more efficient.

MOTIVATION
Decision trees are a well-known machine learning algorithm. Association rules and decision trees are compared on:
- Accuracy
- Interpretability
- Applicability

ASSOCIATION RULES
Three quality metrics: support, confidence, and lift.
Lift quantifies the predictive power of X → Y. Rules such that lift(X → Y) > 1 are interesting!
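The slide lists the three metrics by name only; their standard definitions are:

```latex
\mathrm{supp}(X \Rightarrow Y) = P(X \cup Y), \qquad
\mathrm{conf}(X \Rightarrow Y) = \frac{P(X \cup Y)}{P(X)} = P(Y \mid X), \qquad
\mathrm{lift}(X \Rightarrow Y) = \frac{\mathrm{conf}(X \Rightarrow Y)}{P(Y)}
```

Here P(X ∪ Y) is the fraction of transactions (patients) containing all items of both X and Y. Lift > 1 means X raises the probability of Y above its baseline P(Y); lift = 1 corresponds to independence.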

CONSTRAINED ASSOCIATION RULES
Transforming the medical data set: the data must be transformed to binary dimensions.
- Numeric attributes → intervals; each interval is mapped to an item.
- Categorical attributes → each categorical value is an item.
- If an attribute has a negation, add that as an item.
Each item corresponds to the presence or absence of one categorical value or one numeric interval.
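A minimal sketch of this transformation; the attribute names and cutoffs below are illustrative examples, not the paper's actual schema:

```python
def numeric_to_items(name, value, cutoffs):
    """Map a numeric attribute to one binary item per interval.

    `cutoffs` are the interval boundaries, e.g. [50, 70] yields the
    intervals (-inf,50), [50,70), and [70,inf).
    """
    bounds = [float("-inf")] + list(cutoffs) + [float("inf")]
    items = {}
    for lo, hi in zip(bounds, bounds[1:]):
        items[f"{name}[{lo},{hi})"] = int(lo <= value < hi)
    return items

def categorical_to_items(name, value, domain):
    """Map a categorical attribute to one binary item per value."""
    return {f"{name}={v}": int(value == v) for v in domain}

# Example: a 72% LAD narrowing binned at 50% and 70%
print(numeric_to_items("LAD", 72, [50, 70]))
# {'LAD[-inf,50)': 0, 'LAD[50,70)': 0, 'LAD[70,inf)': 1}
print(categorical_to_items("SEX", "M", ["M", "F"]))
# {'SEX=M': 1, 'SEX=F': 0}
```

Exactly one item per numeric attribute is set to 1, so each transformed record encodes the presence or absence of each interval, as described above.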

CONSTRAINED ASSOCIATION RULES
Search constraints:
1. Maximum itemset size (k): reduces the combinatorial explosion of large itemsets and helps find simple rules.
2. Group: g_j > 0 → A_j belongs to a group; g_j = 0 → A_j is not group-constrained at all. This avoids finding trivial or redundant rules.
3. Antecedent/consequent: c_j = 1 → A_j is an antecedent item; c_j = 2 → A_j is a consequent item.
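A sketch of how these three constraints might be checked for a candidate rule. The representation is illustrative, and the group semantics here (two attributes in the same nonzero group may not co-occur in a rule) is one reading of the slide; in the actual algorithm the constraints prune candidates during the search rather than filtering rules afterwards:

```python
def satisfies_constraints(antecedent, consequent, k, group, side):
    """Check the three search constraints on a candidate rule.

    antecedent, consequent: sets of attribute names
    k:     maximum total itemset size
    group: attr -> group id (0 = unconstrained); attributes sharing a
           nonzero group id may not appear in the same rule
    side:  attr -> 1 (antecedent only) or 2 (consequent only)
    """
    items = antecedent | consequent
    if len(items) > k:                        # constraint 1: max itemset size
        return False
    nonzero = [g for g in (group.get(a, 0) for a in items) if g > 0]
    if len(nonzero) != len(set(nonzero)):     # constraint 2: group
        return False
    if any(side.get(a) == 2 for a in antecedent):   # constraint 3: side
        return False
    if any(side.get(a) == 1 for a in consequent):
        return False
    return True

group = {"LAD50-70": 1, "LAD70+": 1, "AGE60+": 0}
side = {"LAD70+": 2}
print(satisfies_constraints({"AGE60+"}, {"LAD70+"}, 4, group, side))   # True
print(satisfies_constraints({"LAD50-70"}, {"LAD70+"}, 4, group, side)) # False: same group
```

The group constraint prevents trivially redundant rules, such as two bins of the same attribute appearing together.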

- Patients: 655; attributes: 25.
- Percentage of vessel narrowing: LAD, LCX and RCA are binned at 50% and 70%; LM is binned at 30% and 50%.
- 9 heart regions: 2 ranges, with 0.2 as the cutoff.
- Age: binned at 40 (adult) and 60 (old).
- Cholesterol: binned at 200 and 250.

PARAMETERS
- k = 4
- Minimum support = 1% (≈ 7 patients)
- Minimum confidence = 70%
- Minimum lift = 1.2, to get rules where there is a stronger implication dependence between X and Y
Rules with confidence ≥ 90%, lift ≥ 2, and 2 or more items in the consequent were considered medically significant.
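The medical-significance criterion can be applied as a simple post-filter over mined rules. The rule representation below is an illustrative dict, not the paper's data structure:

```python
def medically_significant(rule):
    """Flag rules meeting the significance criteria from the slide:
    confidence >= 0.9, lift >= 2, and 2 or more items in the consequent."""
    return (rule["confidence"] >= 0.9
            and rule["lift"] >= 2.0
            and len(rule["consequent"]) >= 2)

# Hypothetical mined rules (items and metric values are invented)
rules = [
    {"antecedent": {"AGE60+"}, "consequent": {"LAD70+", "RCA70+"},
     "confidence": 0.93, "lift": 2.4},
    {"antecedent": {"CHOL250+"}, "consequent": {"LAD70+"},
     "confidence": 0.95, "lift": 2.1},   # only one consequent item
]
significant = [r for r in rules if medically_significant(r)]
print(len(significant))  # 1
```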

HEALTHY ARTERIES
- 9,595 associations
- 771 rules

DISEASED ARTERIES
Several unneeded items (those with values in the lower, healthy ranges) were filtered out.
- 10,218 associations
- 552 rules

PREDICTIVE RULES FROM DECISION TREES
- C4.5, using gain ratio (CART gave similar results)
- A threshold on the height of the tree, to produce simple rules
- Percentage of patients (ls): fraction of patients where the antecedent appears
- Confidence factor (cf)
- Focus on predicting LAD disease
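Gain ratio, the split criterion C4.5 uses, is information gain normalized by the split's intrinsic information. A self-contained sketch of the computation:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(labels, partitions):
    """Gain ratio of splitting `labels` into `partitions` (lists of labels)."""
    n = len(labels)
    remainder = sum(len(p) / n * entropy(p) for p in partitions)
    gain = entropy(labels) - remainder                 # information gain
    # intrinsic information of the split itself (ignores the classes)
    split_info = entropy([i for i, p in enumerate(partitions) for _ in p])
    return gain / split_info if split_info > 0 else 0.0

# A perfect binary split of a balanced two-class sample
labels = ["sick"] * 4 + ["healthy"] * 4
parts = [["sick"] * 4, ["healthy"] * 4]
print(gain_ratio(labels, parts))  # 1.0: gain = 1 bit, split info = 1 bit
```

Normalizing by split information penalizes splits into many small branches, which plain information gain (as in ID3) would favor.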

PREDICTIVE RULES FROM DECISION TREES
1. All measurements, without binning, as independent variables; numerical variables are split automatically.
Without any threshold on height: 181 nodes, 90% accuracy, height = 14.
- Most rules involve more than 5 attributes; except for 5 rules, all involve less than 2% of the patients.
- More than 80% of the rules refer to less than 1% of the patients.
- Many rules involve attributes with missing information.
- Many rules split the same variable several times.
- Few rules reach cf = 1, but their splits included borderline cases and they involve few patients.

PREDICTIVE RULES FROM DECISION TREES
With a threshold of 10 on height: 83 nodes, 77% accuracy.
- Most rules have repeated attributes and more than 5 attributes.
- Perfusion cutoffs are higher than 0.5.
- Rules have low cf and involve less than 1% of the population.
With a threshold of 3 on height: 65% accuracy, simpler rules.

RELATED WORK

PREDICTIVE RULES FROM DECISION TREES
2. Items (binary variables), like those used for association rules, as independent variables.
With a threshold of 3 on height: 10 nodes; most of the rules were much closer to the prediction requirements.
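This setup, a height-limited tree over binary items where each leaf yields one rule, can be sketched in a few lines. A crude purity criterion stands in for C4.5's gain ratio, and the item names and data are invented:

```python
from collections import Counter

def grow(rows, labels, height, path=()):
    """Grow a depth-limited tree over 0/1 item columns and return one
    rule per leaf as (path, class), where path is a tuple of
    (column, value) tests. Illustrative sketch, not the paper's C4.5."""
    majority = Counter(labels).most_common(1)[0][0]
    if height == 0 or len(set(labels)) == 1:
        return [(path, majority)]

    def impurity(j):          # misclassifications if we split on column j
        halves = [[l for r, l in zip(rows, labels) if r[j] == v] for v in (0, 1)]
        if not halves[0] or not halves[1]:
            return float("inf")
        return sum(len(h) - Counter(h).most_common(1)[0][1] for h in halves)

    j = min(range(len(rows[0])), key=impurity)
    if impurity(j) == float("inf"):           # no useful split remains
        return [(path, majority)]
    rules = []
    for v in (0, 1):
        sub = [(r, l) for r, l in zip(rows, labels) if r[j] == v]
        rules += grow([r for r, _ in sub], [l for _, l in sub],
                      height - 1, path + ((j, v),))
    return rules

# Two binary items; the class depends only on item 0
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = ["healthy", "healthy", "sick", "sick"]
print(grow(rows, labels, height=3))
```

Because the tree partitions the data, the leaf rules are mutually exclusive, which is exactly the structural contrast with association rules drawn in the discussion slides.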

DISCUSSION
Decision trees are not as powerful as association rules in this case. They:
- do not work well with combinations of several target variables;
- fail to identify many medically relevant combinations of independent numeric variable ranges and categorical values;
- tend to find complex and long rules if the height is unlimited, and in such cases find few predictive rules covering reasonably sized (> 1%) sets of patients;
- sometimes repeat the same attribute within a rule.

DISCUSSION - ALTERNATIVES
- Build many decision trees with different independent attributes: error-prone, difficult to interpret, and slow for a larger number of attributes.
- Create a family of small trees, each with a weight; each tree becomes similar to a small set of association rules.
- Constraints for association rules can be adapted to decision trees (future work).

DISCUSSION – DECISION TREE ADVANTAGES
- A decision tree partitions the data set, while association rules on the same target attributes may refer to overlapping subsets of it.
- A decision tree represents a predictive model of the data set, while association rules are disconnected from one another.
- For binary target variables, a decision tree is guaranteed at least 50% prediction accuracy and is generally above 80%, while association rules require trial and error to find the best thresholds.