
1 Ensemble with Neighbor Rules Voting
Itt Romneeyangkurn, Sukree Sinthupinyo
Faculty of Computer Science, Thammasat University

2 Outline
- Introduction and Motivation
- Preliminaries
  - Decision Tree
  - Ensemble of Decision Trees
  - Simple Majority Rule
  - Simple Majority Class
- Experiment
  - Process
  - Making Rules into Groups
  - Majority Rule+ and Majority Class+
  - Results of Experiment
- Conclusions and Future Work

3 Introduction to Decision Trees
- Decision trees are used worldwide
  - In data mining and machine learning
  - CART, C4.5, ID3, etc.
- Advantages
  - Simple to understand and interpret
  - Require little data preparation
  - Handle both numerical and categorical data
  - Use a white-box model
  - Possible to validate a model using statistical tests
  - Robust; perform well on large data sets in a short time

4 Introduction to Decision Trees
Example decision tree: each internal node tests an attribute, each branch is an attribute value, and each leaf is a class.
- Outlook = Sunny: test Humidity (High = No, Normal = Yes)
- Outlook = Overcast: Yes
- Outlook = Rain: test Wind (Strong = No, Weak = Yes)
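
Such a tree is just nested attribute tests. A minimal Python sketch of the tree above (the function name and the string encoding of attribute values are ours):

    def classify(outlook, humidity, wind):
        """Classify one instance with the tree above: 'Yes' or 'No'."""
        if outlook == "Sunny":
            # The Sunny branch tests Humidity
            return "Yes" if humidity == "Normal" else "No"
        if outlook == "Overcast":
            return "Yes"  # Overcast is a pure leaf
        # The Rain branch tests Wind
        return "No" if wind == "Strong" else "Yes"

    print(classify("Sunny", "High", "Weak"))  # -> No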

5 Decision Tree Ensemble
- From a single classifier to an ensemble of classifiers
- More accurate than an individual classifier
- AdaBoost, Bagging, Random Forest, etc.

6 Decision Tree Ensemble
Diagram: an individual classifier is a single decision tree DT with rule set R, built from the original training set. An ensemble of classifiers instead draws training sets 1..n from the original training set and builds one tree per set, giving trees DT1..DTn with rule sets R1..Rn.

7 Bootstrapping
- Also called replications
- Created by uniformly sampling m times with replacement from a dataset of size m
- Used to train the multiple classifiers
  - CART, nearest-neighbor classifiers, C4.5, etc.
- This work uses 10 bootstrap replications

8 Bootstrapping
Example of the original dataset
- Original dataset: 1, 2, 3, 4, 5, 6, 7, 8
Example of bootstrap replications
- 1st bootstrap: 2, 7, 8, 3, 7, 6, 3, 1
- 2nd bootstrap: 7, 8, 5, 6, 4, 2, 7, 1
- 3rd bootstrap: 3, 6, 2, 7, 5, 6, 2, 2
- 4th bootstrap: 4, 5, 1, 4, 6, 4, 3, 8
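
A minimal sketch of how such replications can be generated (the function name bootstrap is ours; any uniform random source works):

    import random

    def bootstrap(dataset):
        """Draw len(dataset) items uniformly with replacement."""
        return [random.choice(dataset) for _ in range(len(dataset))]

    data = [1, 2, 3, 4, 5, 6, 7, 8]
    replications = [bootstrap(data) for _ in range(10)]  # 10 replications, as used here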

9 Simple Majority Vote (Bagging)
- Advantages
  - Improves classification accuracy
  - Reduces variance
  - Helps to avoid over-fitting
- Method (see the sketch after the example below)
  - Generate T bootstrap samples
  - Generate one classifier per sample
  - The majority vote among the resulting T decision trees is the final output

10 Simple Majority Vote (Bagging)
Example: a test instance is passed to ten trees DT1..DT10. Seven trees output class A and three output class B (A = 7, B = 3), so the simple majority vote returns A.
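
A minimal sketch of bagging with majority voting, using scikit-learn's CART-style DecisionTreeClassifier as a stand-in for C4.5 (the helper names are ours):

    import random
    from collections import Counter
    from sklearn.tree import DecisionTreeClassifier

    def bagging_fit(X, y, T=10):
        """Train T trees, each on its own bootstrap replication of (X, y)."""
        m, trees = len(X), []
        for _ in range(T):
            idx = [random.randrange(m) for _ in range(m)]  # bootstrap indices
            trees.append(DecisionTreeClassifier().fit(
                [X[i] for i in idx], [y[i] for i in idx]))
        return trees

    def bagging_predict(trees, x):
        """Simple majority vote over the T tree outputs."""
        votes = Counter(tree.predict([x])[0] for tree in trees)
        return votes.most_common(1)[0][0]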

11 Simple Majority Class
- Based on Bagging
- Difference is in the voting
  - For each of the T decision trees, take the rule that classifies the test instance and count the classes of the training examples that match that rule; the majority class over all T trees is the final output

12 Simple Majority Class
Example: each rule of a tree (e.g. DT1 has Rule 1, Rule 2, Rule 3) is matched against the original training set (Data 1: A, Data 2: A, Data 3: B, Data 4: A, ..., Data n: B) to count how many training examples of each class it covers. For one test instance, the ten trees report class counts such as A: 8 / B: 3 (DT1), A: 5 / B: 2, A: 6 / B: 3, A: 4 / B: 5, A: 9 / B: 2, and so on. Summed over all ten trees the counts are A = 69 and B = 28, so the simple majority class output is A.
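
A minimal sketch of this voting step, assuming a helper leaf_counts(tree, x) that returns the training-class counts of the rule x falls into (that helper, and the Counter representation, are our assumptions):

    from collections import Counter

    def simple_majority_class(trees, x, leaf_counts):
        """Sum the training-class counts of each tree's matching rule,
        then return the class with the largest total."""
        total = Counter()
        for tree in trees:
            total += leaf_counts(tree, x)  # e.g. Counter({'A': 8, 'B': 3})
        return total.most_common(1)[0][0]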

13 Similarity Between Rules
- Continuous attribute
  - Use the overlap between the two rules' ranges
- Discrete attribute
  - Use the number of discrete attribute values the two rules have in common
For more information about similarity between rules, see "Bootstrapping Rule Induction to Achieve Rule Stability and Reduction".
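
A hedged sketch of such a measure; the exact formula is in the cited paper, so the normalisation below is our assumption. Continuous conditions are (low, high) ranges, discrete conditions are sets of allowed values:

    def attribute_similarity(a, b):
        """Similarity of one attribute's condition in two rules."""
        if isinstance(a, tuple):  # continuous: overlap of the two ranges
            overlap = min(a[1], b[1]) - max(a[0], b[0])
            span = max(a[1], b[1]) - min(a[0], b[0])
            return overlap / span if span else 1.0  # negative if ranges are disjoint
        return len(a & b) / len(a | b)  # discrete: values in common (Jaccard)

    def rule_similarity(rule_a, rule_b):
        """Average attribute similarity over the attributes both rules test."""
        shared = rule_a.keys() & rule_b.keys()
        if not shared:
            return 0.0
        return sum(attribute_similarity(rule_a[k], rule_b[k]) for k in shared) / len(shared)

Note that the range overlap can be negative for disjoint ranges, which is consistent with the negative similarity value shown on slide 16.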

14 Experiment
- Nine well-known benchmark data sets
- Ten-fold cross-validation
- Based on C4.5
- Run in default mode with pruning enabled

Data Set        Instances   Attributes   Classes
Balance-Scale   625         4            3
Bridges         105         10           6
Car             1,728       6            4
Dermatology     366         34           6
Hayes-Roth      132         4            3
Labor-Neg       40          16           2
Soybean         307         35           19
TAE             151         3            3
Zoo             101         16           7

15 Experiment
- Generate 10 bootstrap samples of the original training set.
- Generate 10 classifiers using C4.5.
- Find the similarity between rules from all classifiers.
- Group all rules by similarity (see the sketch after the example below).

16 Group of Rules with Similarity Value 0.8
Each tree DT1..DT10 has rules R(i,j), meaning rule j of tree i. Pairwise similarities are computed between every pair of rules, for example:
- Similarity between R(1,1) and R(1,2): 0.2302
- Similarity between R(1,1) and R(1,3): -0.7495
- Similarity between R(1,1) and R(2,1): 0.9454
- Similarity between R(1,1) and R(2,2): 0.7382
With threshold 0.8, the group of R(1,1) is { R(1,1), R(2,1), R(3,2), R(5,4), R(6,2), R(7,1), R(10,3) }.
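
A minimal sketch of the grouping step, reusing the rule_similarity sketch above (the rule representation and function name are ours):

    def neighbor_group(seed, all_rules, threshold=0.8):
        """All rules, from any tree, whose similarity to the seed rule
        meets the threshold; the seed itself belongs to its own group."""
        return [seed] + [r for r in all_rules
                         if r is not seed and rule_similarity(seed, r) >= threshold]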

17 Majority Rule+
- Based on Bagging
- Difference is in the voting
  - The majority vote among the classes of the member rules in each matched rule's group is the final output (see the sketch after the example below)

18 Majority Rule+
Example: the test instance matches Rule 2 of DT1. The group of Rule 2 is { R(1,2): A, R(2,2): A, R(3,1): B, R(4,2): A, R(6,2): B, R(8,2): B, R(9,1): B }, so the member rules' classes give A = 3, B = 4 for DT1. The same is done for DT2..DT10, producing per-tree tallies such as A: 3 / B: 4, A: 2 / B: 5, A: 5 / B: 5, A: 2 / B: 1, and so on. Summed over all ten trees the counts are A = 27 and B = 36, so Majority Rule+ outputs B.
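
A minimal sketch of the Majority Rule+ vote. The helpers firing_rule(tree, x) (the rule of a tree that x matches), group_of(rule) (its neighbor group), and rule_class(rule) (the class a rule predicts) are our abstractions:

    from collections import Counter

    def majority_rule_plus(trees, x, firing_rule, group_of, rule_class):
        """Each tree's firing rule brings its whole neighbor group to the
        vote; every member rule votes for its own class."""
        total = Counter()
        for tree in trees:
            rule = firing_rule(tree, x)
            total += Counter(rule_class(r) for r in group_of(rule))
        return total.most_common(1)[0][0]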

19 Majority Class+
- Based on Bagging
- Difference is in the voting
  - The majority vote among the classes of the training examples that match the member rules of each matched rule's group is the final output (see the sketch after the example below)

20 Majority Class+
Example: as before, the test instance matches Rule 2 of DT1, whose group is { R(1,2), R(2,2), R(3,1), R(4,2), R(6,2), R(8,2), R(9,1) }. Each member rule is matched against the original training set (Data 1: A, Data 2: A, Data 3: B, Data 4: A, ..., Data n: B) to obtain class counts, giving per-tree tallies such as A: 2 / B: 10, A: 2 / B: 8, A: 6 / B: 4, A: 2 / B: 12, and so on. Summed over all ten trees the counts are A = 33 and B = 92, so Majority Class+ outputs B.
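
A minimal sketch of the Majority Class+ vote, differing from Majority Rule+ only in what each member rule contributes; matched_counts(rule) (the training-class counts covered by a rule) is our abstraction:

    from collections import Counter

    def majority_class_plus(trees, x, firing_rule, group_of, matched_counts):
        """Like Majority Rule+, but each member rule contributes the class
        counts of the training examples it matches, not a single vote."""
        total = Counter()
        for tree in trees:
            for member in group_of(firing_rule(tree, x)):
                total += matched_counts(member)  # e.g. Counter({'A': 2, 'B': 10})
        return total.most_common(1)[0][0]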

21 Comparing Bagging and Majority Rule+
Accuracy (%) ± standard deviation. The Majority Rule+ columns use similarity values 0.6-0.9; ⊕ / ⊖ mark results significantly better / worse than Bagging.

                                Majority Rule+ by similarity value
Data Set        Bagging         0.6              0.7              0.8              0.9
Balance-Scale   78.58 ±4.09     78.09 ±3.43      79.21 ±3.66      79.70 ±4.79 ⊕    80.01 ±4.54 ⊕
Bridges         59.91 ±14.86    61.82 ±15.68 ⊕   59.00 ±18.20     61.73 ±16.57
Car             93.81 ±1.37     89.76 ±2.00 ⊖    90.34 ±2.33 ⊖    93.17 ±1.87      94.33 ±1.97
Dermatology     95.36 ±4.04     95.90 ±3.29 ⊕    95.91 ±3.90 ⊕    96.19 ±3.88 ⊕    95.63 ±3.69
Hayes-Roth      74.89 ±8.65     74.89 ±13.02     75.66 ±11.94     74.95 ±11.96     73.41 ±11.60
Labor-Neg       67.50 ±22.50
Soybean         85.67 ±6.39     86.62 ±5.62      85.02 ±5.45      86.62 ±5.75      86.96 ±4.85
TAE             43.00 ±15.95    45.00 ±14.08     43.00 ±15.95
Zoo             92.00 ±7.48                                       94.00 ±6.63 ⊕    ⊕

22 Comparing Simple Majority Class and Majority Class+
Accuracy (%) ± standard deviation. The Majority Class+ columns use similarity values 0.6-0.9; ⊕ / ⊖ mark results significantly better / worse than Simple Majority Class.

                                        Majority Class+ by similarity value
Data Set        Simple Majority Class   0.6              0.7              0.8              0.9
Balance-Scale   81.45 ±5.47             80.81 ±4.81 ⊖    80.81 ±4.32      81.29 ±5.68      81.45 ±5.74
Bridges         63.64 ±19.13            59.82 ±17.00 ⊖   62.64 ±19.37     63.64 ±17.78     64.55 ±18.89
Car             94.68 ±1.28             91.67 ±1.29 ⊖    93.40 ±1.19 ⊖    94.45 ±1.48      94.56 ±1.77
Dermatology     91.51 ±6.23             91.25 ±6.29      93.44 ±4.65 ⊕    93.71 ±5.37 ⊕    93.71 ±4.05 ⊕
Hayes-Roth      70.38 ±10.69            73.30 ±12.73     74.12 ±11.15 ⊕   72.64 ±10.52 ⊕   71.15 ±9.71
Labor-Neg       67.50 ±22.50
Soybean         87.28 ±5.41             63.94 ±11.39 ⊖   78.84 ±6.99 ⊖    85.00 ±3.38 ⊖    87.60 ±3.87
TAE             43.67 ±14.49            39.67 ±14.64 ⊖   44.33 ±15.06     43.67 ±14.49
Zoo             91.00 ±11.36            40.64 ±15.20 ⊖   84.09 ±13.61 ⊖   92.00 ±6.00

23 Appropriate Similarity Value
Comparing Bagging and Majority Rule+:

Similarity Value   Significantly Better   Significantly Worse
0.6                2                      1
0.7                1                      1
0.8                3                      0
0.9                2                      0

Comparing Simple Majority Class and Majority Class+:

Similarity Value   Significantly Better   Significantly Worse
0.6                0                      6
0.7                2                      3
0.8                2                      1
0.9                1                      0

The best similarity value between rules is 0.8.

24 Conclusions
- Majority vote with neighbor rules improves accuracy over the traditional simple majority vote.
- The most suitable similarity value is 0.8.

25 Future Work
- Run experiments on 10-15 more data sets from UCI
- Cluster the rules using the similarity value and derive a single classifier
  - Reduces time and resources
- The similarity value of 0.8 could be applied to other decision tree ensemble methods, such as AdaBoost

26 The End

