Published by Annis Samantha Pierce. Modified over 8 years ago.
1
Intelligent Database Systems Lab
國立雲林科技大學 National Yunlin University of Science and Technology
Comparing Association Rules and Decision Trees for Disease Prediction
Advisor: Dr. Hsu
Presenter: Yu-San Hsieh
Author: Carlos Ordonez
Source: CIKM 2006, pp. 17-24
2
Outline
─ Motivation
─ Objective
─ Method
─ Experiments
─ Conclusions
3
Motivation
Mining association rules in a medical data set raises several problems
─ Many discovered rules are irrelevant
─ Most relevant rules appear only at low support
─ The number of discovered rules becomes large at low support
The large number of rules makes the search slow and makes interpretation by the domain expert difficult.
4
Objective
We propose search constraints to find only medically significant association rules and to make the search more efficient.
5
Method
Medical data set → Transforming → Search constraints
Search constraints
─ User-specified maximum itemset size κ
─ group: A → g, with group(A_j) = g_j
  group(AGE) = 0: AGE is not group-constrained
  group(AL) = 1: AL is constrained to belong to group 1
─ For items a and b in the same itemset: group(attribute(a)) ≠ group(attribute(b))
  Example: (-1.0 <= IL < 0.2) and (-1.0 <= LA < 0.2) cannot be in the same itemset
─ ac: A → C, with ac(A_j) = c_j
  ac(AGE) = 1: AGE is in the antecedent
  ac(LAD) = 2: LAD is in the consequent
[Figure: Phase 1 and Phase 2; labels: support, confidence, AGE, LAD]
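The three search constraints on this slide can be sketched as simple filters. This is only an illustration, not the author's implementation. The item representation (attribute, description), the function names, and the group and ac values for attributes other than AGE, AL and LAD are assumptions; in particular, IL and LA are placed in group 1 only because the slide says they cannot share an itemset, and ac = 0 is assumed to mean "may appear on either side".

```python
from itertools import combinations

KAPPA = 4  # maximum itemset size; the slides' default setting

# group(A_j) = g_j; 0 means "not group-constrained".
# Only AGE and AL come from the slide; the other entries are assumed.
GROUP = {"AGE": 0, "AL": 1, "IL": 1, "LA": 1, "LAD": 0}

# ac(A_j) = c_j; 1 = antecedent only, 2 = consequent only.
# 0 = unconstrained is an assumption, not stated on the slide.
AC = {"AGE": 1, "LAD": 2}


def itemset_allowed(items, group=GROUP, kappa=KAPPA):
    """Check the size and group constraints for a list of (attribute, description) items."""
    if len(items) > kappa:
        return False
    attrs = [attr for attr, _ in items]
    for a, b in combinations(attrs, 2):
        ga, gb = group.get(a, 0), group.get(b, 0)
        # Two attributes in the same non-zero group may not appear together.
        if ga != 0 and ga == gb:
            return False
    return True


def rule_allowed(antecedent, consequent, ac=AC):
    """Check the antecedent/consequent constraint for a candidate rule."""
    if any(ac.get(attr, 0) == 2 for attr, _ in antecedent):
        return False
    if any(ac.get(attr, 0) == 1 for attr, _ in consequent):
        return False
    return True


if __name__ == "__main__":
    same_group = [("IL", "-1.0 <= IL < 0.2"), ("LA", "-1.0 <= LA < 0.2")]
    print(itemset_allowed(same_group))  # rejected by the group constraint
    print(rule_allowed([("AGE", "40 < AGE <= 60")], [("LAD", "LAD >= 50")]))
```

Applying the filters before support counting is what lets constraints shrink the search space rather than only prune the final rule list.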
6
Experiments
The medical data set
─ 655 patients and 25 attributes (numeric and categorical)
─ Three basic elements for analysis
  Perfusion defect
  Coronary stenosis
  Risk factors
─ Default parameter settings
  Maximum itemset size κ = 4
  Minimum support = 1%
  Minimum confidence = 70%
─ Constraints: negation, ac and group
Methods compared: association rules and decision tree
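To make the default thresholds concrete, the following sketch computes support and confidence for one candidate rule and tests it against minimum support 1% and minimum confidence 70%. The five transactions and the item labels are made-up toy data, not values from the paper, and the function names are hypothetical.

```python
MIN_SUPPORT = 0.01     # 1%, the slides' default
MIN_CONFIDENCE = 0.70  # 70%, the slides' default


def count(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Fraction of transactions that contain the itemset."""
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent) / support(antecedent), computed from counts."""
    n_antecedent = count(antecedent, transactions)
    if n_antecedent == 0:
        return 0.0
    return count(set(antecedent) | set(consequent), transactions) / n_antecedent


def rule_passes(antecedent, consequent, transactions):
    """True if the rule meets both default thresholds."""
    both = set(antecedent) | set(consequent)
    return (support(both, transactions) >= MIN_SUPPORT
            and confidence(antecedent, consequent, transactions) >= MIN_CONFIDENCE)


# Toy data (hypothetical): each transaction is one patient's set of items.
TOY = [
    frozenset({"AGE>=60", "LAD>=50"}),
    frozenset({"AGE>=60", "LAD>=50"}),
    frozenset({"AGE>=60", "LAD>=50"}),
    frozenset({"AGE>=60", "LAD<50"}),
    frozenset({"AGE<60", "LAD<50"}),
]

if __name__ == "__main__":
    print(support({"AGE>=60", "LAD>=50"}, TOY))       # 3 of 5 patients
    print(confidence({"AGE>=60"}, {"LAD>=50"}, TOY))  # 3 of the 4 patients with AGE>=60
    print(rule_passes({"AGE>=60"}, {"LAD>=50"}, TOY))
```

With 655 patients, a 1% minimum support corresponds to a rule covering at least 7 patients, which is why many rules survive and additional constraints are needed.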
7
Conclusions
Decision trees are less effective than constrained association rules
─ Predicting disease with several related target attributes
─ Low confidence factor
─ Slight overfitting
─ Rule complexity
─ Data set fragmentation
8
My opinion
Advantage
─ Produces medically useful rules, reduces the number of discovered rules and improves running time
Drawback
─ Lack of quantitative evaluation
─ Most of the evaluation is analysis of rules
Application
─ Prediction
─ Classification
9
Method
Data is transformed to binary dimensions
─ Numerical data, e.g. age: 0 < age <= 40 and 40 < age <= 60
─ Categorical data, e.g. sex: sex = Male and sex = Female
First constraint
─ An attribute can have negation
  Additional items are created, one for each negated categorical value or each negated interval
  Example: not(0 <= LM < 30), not(0 <= LAD < 50), not(0 <= LCX < 50)……
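The transformation on this slide can be sketched as a function that maps one patient record to a set of binary items, adding negated items for attributes that allow negation. This is a minimal sketch under assumptions: the function name `to_items`, the label format, and the choice to negate only sex in the example are hypothetical, and intervals follow the open-left form used for age on the slide.

```python
def to_items(record, intervals, categories, negated):
    """Map a record (dict) to a set of binary items.

    intervals:  attribute -> list of (low, high) bins, matched as low < value <= high
    categories: attribute -> list of possible categorical values
    negated:    attributes for which negated items are also created
    """
    items = set()
    for attr, bins in intervals.items():
        value = record[attr]
        for low, high in bins:
            label = f"{low}<{attr}<={high}"
            if low < value <= high:
                items.add(label)
            elif attr in negated:
                # One extra item per interval the value does not fall in.
                items.add(f"not({label})")
    for attr, values in categories.items():
        for val in values:
            label = f"{attr}={val}"
            if record[attr] == val:
                items.add(label)
            elif attr in negated:
                # One extra item per categorical value the record does not have.
                items.add(f"not({label})")
    return items


if __name__ == "__main__":
    patient = {"age": 35, "sex": "Male"}  # hypothetical record
    print(sorted(to_items(
        patient,
        intervals={"age": [(0, 40), (40, 60)]},
        categories={"sex": ["Male", "Female"]},
        negated={"sex"},
    )))
```

Negation increases the number of items per record, which is one reason the other constraints (κ, group, ac) matter for keeping the search tractable.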
10
Experiments
Predictive association rules
[Figure: rules for healthy and diseased; LCX, LAD, RCA]
11
Experiments
Predictive decision tree
─ Using the CN4.5 decision tree algorithm
─ Focused on predicting LAD disease (LAD ≥ 50 as the target class)
─ Result: maximal height = 3
  Numeric dimensions and automatic splits: confidence decreases, rules not useful
  Manually binned variables: confidence decreases