
1 Classifier Ensembles. Ludmila Kuncheva, School of Computer Science, Bangor University, mas00a@bangor.ac.uk. Part 2

2 Building ensembles: levels of questions. (Diagram: Data set → Features → Classifier 1, Classifier 2, ..., Classifier L → Combiner.)
A. Combination level: selection or fusion? voting or another combination method? trainable or non-trainable combiner? and why not another classifier?
B. Classifier level: same or different classifiers? decision trees, neural networks or other? how many?
C. Feature level: all features or subsets of features? random or selected subsets?
D. Data level: independent/dependent bootstrap samples? selected data sets?
(Methods placed on the diagram: Bagging, Boosting, Random Subspace, Random Forest, Rotation Forest, Linear Oracle.)

3 (The same diagram.) The combination level (A) seems under-researched...

4 Classifier combiners. Nobody talks about this...

5 Two kinds of classifier outputs: label outputs and continuous-valued outputs. (Diagram: classifiers 1, 2, 3 applied to x in both cases; the continuous-valued outputs are collected in a decision profile before reaching the combiner.)

6 Ensemble (label outputs, R, G, B). Input colour: R = 204, G = 102, B = 54. The five classifiers output: Red, Blue, Red, Green, Red. The combiner is a majority vote → Red.

7 Ensemble (label outputs, R, G, B). Input colour: R = 200, G = 219, B = 190. The five classifiers again output: Red, Blue, Red, Green, Red. The plain majority vote still gives Red, but a WEIGHTED majority vote can give Green (weights on the slide: 0.05 0.50 0.02 0.10 0.70 0.10 0.27 0.70 0.50).
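As an illustration, a minimal Python sketch of both voting rules. The weights in the second call are hypothetical, chosen only to show how weighting can overturn the plain vote; the slide's own weight layout did not survive extraction.

```python
def weighted_majority_vote(labels, weights, classes=("Red", "Green", "Blue")):
    """Sum the weight of every classifier behind its chosen label;
    with equal weights this reduces to the plain majority vote."""
    score = dict.fromkeys(classes, 0.0)
    for lab, w in zip(labels, weights):
        score[lab] += w
    return max(score, key=score.get)

votes = ["Red", "Blue", "Red", "Green", "Red"]         # label outputs from the slide
print(weighted_majority_vote(votes, [1, 1, 1, 1, 1]))  # plain vote -> 'Red'
print(weighted_majority_vote(votes, [0.05, 0.50, 0.02, 0.90, 0.10]))  # hypothetical weights -> 'Green'
```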

8 Ensemble (label outputs, R, G, B). The combiner can itself be ANOTHER CLASSIFIER: the label outputs (shown on the slide as the string "RBRRGR") are fed to it as features, and here it outputs Green.

9 Ensemble (continuous outputs, [R,G,B]) [0.6 0.3 0.1] [0.1 0.0 0.6] [0.7 0.6 0.5] [0.4 0.3 0.1] [0 1 0] [0.9 0.7 0.8] Combiner

10 Ensemble (continuous outputs, [R,G,B]) [0.6 0.3 0.1] [0.1 0.0 0.6] [0.7 0.6 0.5] [0.4 0.3 0.1] [0 1 0] [0.9 0.7 0.8] Mean R = 0.45 Combiner

11 Ensemble (continuous outputs, [R,G,B]) [0.6 0.3 0.1] [0.1 0.0 0.6] [0.7 0.6 0.5] [0.4 0.3 0.1] [0 1 0] [0.9 0.7 0.8] Mean R = 0.45 Mean G = 0.48 Combiner

12 Ensemble (continuous outputs, [R,G,B]) [0.6 0.3 0.1] [0.1 0.0 0.6] [0.7 0.6 0.5] [0.4 0.3 0.1] [0 1 0] [0.9 0.7 0.8] Mean R = 0.45 Mean G = 0.48 Mean B = 0.35 Class GREEN Combiner

13 Ensemble (continuous outputs, [R, G, B]). The six outputs form the decision profile, one row per classifier, columns R, G, B:
[0.6 0.3 0.1]
[0.1 0.0 0.6]
[0.7 0.6 0.5]
[0.4 0.3 0.1]
[0.0 1.0 0.0]
[0.9 0.7 0.8]
Column means: R = 0.45, G = 0.48, B = 0.35 → class GREEN.
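A minimal sketch of the mean combiner on exactly this decision profile; only the class ordering [R, G, B] is assumed.

```python
import numpy as np

# Decision profile from the slide: one row per classifier, columns [R, G, B].
DP = np.array([[0.6, 0.3, 0.1],
               [0.1, 0.0, 0.6],
               [0.7, 0.6, 0.5],
               [0.4, 0.3, 0.1],
               [0.0, 1.0, 0.0],
               [0.9, 0.7, 0.8]])

support = DP.mean(axis=0)   # column means: [0.45, 0.483..., 0.35] (slide rounds G to 0.48)
classes = ["Red", "Green", "Blue"]
print(classes[int(np.argmax(support))])   # -> 'Green'
```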

14 Time for an example: combiner matters

15 Data set: let's call this data "The Tropical Fish", or just the fish data. 50-by-50 = 2500 objects in 2-d; Bayes error rate = 0%. Induce label noise to make the problem more interesting. (Figures: the data with 10% and with 45% label noise.)

16 Example: 2 ensembles. Ensemble 1: train 50 linear classifiers on bootstrap samples. Ensemble 2: throw 50 "straws" (random lines) and label the "fish" side of each so that its accuracy is greater than 0.5. A sketch of the second construction follows.
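A hypothetical reconstruction of the "straws" ensemble, assuming 2-d data in the unit square with labels y in {0, 1} and 1 = "fish"; the slide gives no code, so every name here is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def throw_straws(X, y, L=50):
    """L random lines; orient each line's 'fish' side so that the
    resulting linear classifier beats chance on the training data."""
    straws = []
    for _ in range(L):
        w = rng.normal(size=2)                  # random direction
        b = -w @ rng.uniform(0.0, 1.0, size=2)  # line through a random point
        pred = (X @ w + b > 0).astype(int)
        if (pred == y).mean() < 0.5:            # accuracy must exceed 0.5:
            w, b = -w, -b                       # flip the labelled side
        straws.append((w, b))
    return straws

def straw_vote(straws, X):
    """Majority vote of the straw classifiers."""
    votes = np.stack([(X @ w + b > 0).astype(int) for w, b in straws])
    return (votes.mean(axis=0) > 0.5).astype(int)
```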

17 Example: 2 ensembles. Each classifier returns an estimate of the probability of class "fish", P(fish | x). And, of course, we also have P(not fish | x) = 1 − P(fish | x), but we will not need this.

18 Example: 2 ensembles 10% label noise

19 Example: 2 ensembles 45% label noise

20 Example: 2 ensembles 45% label noise

21 Example: 2 ensembles 45% label noise

22 What does the example show?
- The combiner matters (a lot).
- Noise helps the ensemble!
- The trained combiner for continuous outputs is best (linear, tree).
- BKS (Behaviour Knowledge Space) works because of the small number of classes and classifiers.
However, nothing is as simple as it looks...

23 The Combining Classifier: to Train or Not to Train? (Image credit: http://samcnitt.tumblr.com/)

24

25 Train the COMBINER if you have "enough" data! Otherwise, like with any classifier, we may overfit the data. Get this: almost NOBODY trains the combiner, not in the CLASSIC ensemble methods anyway. Ha-ha-ha, what is "enough" data?
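For concreteness, a minimal trained-combiner (stacked generalisation) sketch with scikit-learn; the data set and the choice of base models are stand-ins, not from the lecture.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=1)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=1)

# The combiner (final_estimator) is itself trained, on out-of-fold
# outputs of the base classifiers; cv=5 is what guards against
# overfitting the combiner to the training data.
stack = StackingClassifier(
    estimators=[("tree", DecisionTreeClassifier(random_state=1)),
                ("nb", GaussianNB()),
                ("rf", RandomForestClassifier(n_estimators=50, random_state=1))],
    final_estimator=LogisticRegression(),
    cv=5)
print(stack.fit(Xtr, ytr).score(Xte, yte))
```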

26 Diversity. Everybody talks about this...

27 Diversity. Search for "CLASSIFIER ENSEMBLE DIVERSITY" on 10 Sep 2014: 580 publications, 4594 citations.

28 Diversity: where the 580 papers appeared (venue, number of papers).
MULTIPLE CLASSIFIER SYSTEMS 30
INT JOINT CONF ON NEURAL NETWORKS (IJCNN) 22
PATTERN RECOGNITION 17
NEUROCOMPUTING 14
EXPERT SYSTEMS WITH APPLICATIONS 13
INFORMATION SCIENCES 12
APPLIED SOFT COMPUTING 11
PATTERN RECOGNITION LETTERS 10
INFORMATION FUSION 9
IEEE INT JOINT CONF ON NEURAL NETWORKS 9
KNOWLEDGE-BASED SYSTEMS 7
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING 7
INT J OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE 6
MACHINE LEARNING 5
IEEE TRANSACTIONS ON NEURAL NETWORKS 5
JOURNAL OF MACHINE LEARNING RESEARCH 5
APPLIED INTELLIGENCE 4
INTELLIGENT DATA ANALYSIS 4
IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION 4
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING 4
NEURAL INFORMATION PROCESSING 4

29 Diversity: where in the world are we? China 140, UK 68, USA 63, Spain 55, Brazil 41, Canada 32, Poland 28, Iran 23, Italy 19, ...

30 Are we still talking about diversity in classifier ensembles? Apparently yes... That elusive diversity... We want the classifiers in the ensemble to be ACCURATE and DIVERSE simultaneously. And HOW CAN THIS HAPPEN?!?

31 All ensemble methods we have seen so far strive to keep the individual accuracy high while increasing diversity. How can we measure diversity? WHAT can we do with the diversity value?

32 Measure diversity for a PAIR of classifiers. Independent outputs ≠ independent errors; hence, use ORACLE outputs (correct/wrong). The pair is summarised in a 2-by-2 table:

              C2 correct | C2 wrong
C1 correct        a      |    b
C1 wrong          c      |    d

For example, b counts the instances labelled correctly by classifier 1 and mislabelled by classifier 2.

33 From the same 2-by-2 table one can compute the pairwise diversity measures: Q, kappa, correlation (rho), disagreement, double fault...
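A minimal sketch of these measures computed from oracle outputs, using the a, b, c, d of the table above (the definitions are the standard ones, e.g. from Kuncheva and Whitaker's 2003 survey; the random inputs are only illustrative).

```python
import numpy as np

def pairwise_diversity(o1, o2):
    """o1, o2: oracle outputs (1 = correct, 0 = wrong) of two classifiers
    on the same instances. Assumes no degenerate zero denominators."""
    o1, o2 = np.asarray(o1), np.asarray(o2)
    a = np.mean((o1 == 1) & (o2 == 1))  # both correct
    b = np.mean((o1 == 1) & (o2 == 0))  # only classifier 1 correct
    c = np.mean((o1 == 0) & (o2 == 1))  # only classifier 2 correct
    d = np.mean((o1 == 0) & (o2 == 0))  # both wrong (the "double fault")
    return {"Q": (a * d - b * c) / (a * d + b * c),
            "rho": (a * d - b * c)
                   / np.sqrt((a + b) * (c + d) * (a + c) * (b + d)),
            "disagreement": b + c,
            "double_fault": d}

rng = np.random.default_rng(0)
print(pairwise_diversity(rng.integers(0, 2, 100), rng.integers(0, 2, 100)))
```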

34 SEVENTY SIX !!!

35 Do we need more "NEW" pairwise diversity measures? Looks like we don't... And the same holds for non-pairwise measures... Far too many already.

36 All ensemble methods we have seen so far strive to keep the individual accuracy high while increasing diversity. How can we measure diversity? WHAT can we do with the diversity value?
- Compare ensembles
- Explain why a certain ensemble heuristic works and others don't
- Construct ensembles by overproducing and then selecting classifiers with high accuracy and high diversity

37 Why is diversity so baffling? The problem is that diversity is NOT monotonically related to the ensemble accuracy. In other words, diverse ensembles may be good or may be bad...

38 Good diversity and bad diversity

39 Good and Bad diversity. 3 classifiers: A, B, C; 15 objects (figure: a grid of wrong/correct votes for each classifier); individual accuracy = 10/15 = 0.667. With P = majority vote ensemble accuracy:
- independent classifiers: P = 11/15 = 0.733
- identical classifiers: P = 10/15 = 0.667
- dependent classifiers 1: P = 7/15 = 0.467
- dependent classifiers 2: P = 15/15 = 1.000

40 (The same example.) The dependence that lifts P to 15/15 is good diversity; the dependence that drags P down to 7/15 is bad diversity.

41 Good and Bad diversity. Data set Z; ensemble of L = 7 classifiers (figure: the oracle outputs of the seven classifiers on one object). Are these outputs diverse?

42 Good and Bad diversity. Data set Z; ensemble of L = 7 classifiers. How about these?

43 Good and Bad diversity. Data set Z; ensemble of L = 7 classifiers. A 3-vs-4 split of the votes... Can't be more diverse, really...

44 Good and Bad diversity. Data set Z; ensemble of L = 7 classifiers. Here the MAJORITY VOTE is correct: good diversity.

45 Good and Bad diversity. Data set Z; ensemble of L = 7 classifiers. Here the MAJORITY VOTE is wrong: bad diversity.

46 Good and Bad diversity. Decomposition of the majority vote error: take the average individual error, subtract the GOOD diversity, add the BAD diversity. Brown G., L.I. Kuncheva, "Good" and "bad" diversity in majority vote ensembles, Proc. Multiple Classifier Systems (MCS'10), Cairo, Egypt, LNCS 5997, 2010, 124-133.
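A sketch of the decomposition reconstructed from the cited paper, in my notation rather than the slide's: let N be the number of objects, L the number of classifiers, l(z_j) the number of classifiers that misclassify object z_j, and \bar{E} = \frac{1}{NL}\sum_j l(\mathbf{z}_j) the average individual error. Then

\[
E_{\mathrm{maj}}
= \bar{E}
- \underbrace{\frac{1}{NL}\sum_{j:\,\text{maj correct}} l(\mathbf{z}_j)}_{\text{good diversity}}
+ \underbrace{\frac{1}{NL}\sum_{j:\,\text{maj wrong}} \bigl(L - l(\mathbf{z}_j)\bigr)}_{\text{bad diversity}}
\]

Wrong votes on objects the majority gets right are harmless disagreement (good diversity); correct votes on objects the majority gets wrong are wasted (bad diversity).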

47 Good and Bad diversity. (The same two 3-vs-4 vote patterns.) Note that the measured diversity quantity is 3 in both cases: three dissenting votes, whether the majority is correct or wrong.

48 Ensemble Margin. (Figure: votes for and against the true label.) The margin is POSITIVE when the correct label wins the vote and NEGATIVE when it loses.

49 Ensemble Margin. The good/bad distinction lives in the sign of the average margin. However, nearly all diversity measures are functions of the average ABSOLUTE margin or the average SQUARED margin, and there the margin has no sign...
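For reference, one common definition (a sketch in my notation, not copied from the slide): if v^{+}(z_j) of the L classifiers vote for the correct label of z_j and v^{-}(z_j) = L - v^{+}(z_j) vote against it, then

\[
m(\mathbf{z}_j) = \frac{v^{+}(\mathbf{z}_j) - v^{-}(\mathbf{z}_j)}{L},
\qquad
\bar{m} = \frac{1}{N}\sum_{j=1}^{N} m(\mathbf{z}_j),
\]

while the measures in question depend on \frac{1}{N}\sum_j |m(\mathbf{z}_j)| or \frac{1}{N}\sum_j m(\mathbf{z}_j)^2, which discard exactly the sign that separates good from bad diversity.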

50 Ensemble Margin

51 The bottom line is: Diversity is not MONOTONICALLY related to ensemble accuracy So, stop looking for what is not there...

52 Where next in classifier ensembles?

53 Kappa-error diagrams:
- proposed by Margineantu and Dietterich in 1997
- visualise individual accuracy and diversity in a 2-dimensional plot
- have been used to decide which ensemble members can be pruned without much harm to the overall performance

54 Example: sonar data (UCI), 260 instances, 60 features, 2 classes; ensemble size L = 11 classifiers; base model: C4.5 decision tree. Accuracies: AdaBoost 75.0%, Bagging 77.0%, Random Subspace 80.9%, Random Oracle 83.3%, Rotation Forest 84.7%. Kuncheva L.I., A bound on kappa-error diagrams for analysis of classifier ensembles, IEEE Transactions on Knowledge and Data Engineering, 2013, 25 (3), 494-501 (DOI: 10.1109/TKDE.2011.234).

55 Kappa-error diagrams. For a pair of classifiers, build the 2-by-2 correct/wrong table:

              C2 correct | C2 wrong
C1 correct        a      |    b
C1 wrong          c      |    d

kappa = (observed agreement − chance agreement) / (1 − chance agreement); the error axis carries the pair's averaged individual error.
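A minimal sketch of computing one point of the diagram from a pair's oracle outputs, following the 2-by-2 table above (with a, b, c, d as proportions).

```python
import numpy as np

def kappa_error_point(o1, o2):
    """Kappa and averaged error for one pair of classifiers, from their
    oracle outputs (1 = correct, 0 = wrong) on the same instances."""
    o1, o2 = np.asarray(o1), np.asarray(o2)
    a = np.mean((o1 == 1) & (o2 == 1))
    b = np.mean((o1 == 1) & (o2 == 0))
    c = np.mean((o1 == 0) & (o2 == 1))
    d = np.mean((o1 == 0) & (o2 == 0))
    observed = a + d                                # agreement of the pair
    chance = (a + b) * (a + c) + (c + d) * (b + d)  # agreement by chance
    kappa = (observed - chance) / (1.0 - chance)
    error = ((c + d) + (b + d)) / 2.0               # mean individual error
    return kappa, error   # one (x, y) point; low kappa and low error is best
```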

56 Kappa-error diagrams. (Figure: the kappa-error plane with the tight bound on the region that classifier pairs can occupy; axes: kappa and error.)

57 Kappa-error diagrams: simulated ensembles, L = 3. (Figure: scatter of classifier pairs in the kappa-error plane.)

58 Real data: 77,422,500 pairs of classifiers in the kappa-error plane. The unoccupied space shows room for improvement.

59 Is there space for new classifier ensembles? Looks like yes...

60 (Diagram: strength of classifiers against number of classifiers L.)
- L = 1: the perfect classifier.
- 3-8 classifiers: heterogeneous, trained combiner (stacked generalisation).
- 100+ classifiers: same model, non-trained combiner (bagging, boosting, etc.).
Large ensembles of nearly identical classifiers suffer REDUNDANCY; small ensembles of weak classifiers suffer INSUFFICIENCY. Must engineer diversity...
How about here? 30-50 classifiers: same or different models? trained or non-trained combiner? selection or fusion?

61 MathWorks recommendations: AdaBoost and... wait for it... wait for iiiiit... AdaBoost

62 MathWorks recommendations (continued): plus, it is quite expensive.

63 One final play instead of conclusions...

64 For the winner: a drawing by my favourite illustrator, Marcello Barenghi. Well, I'll give you a less crinkled one :)

65 A guessing game. Time for you now... Recall our digit example; the data is a small part of MNIST, and a single decision tree scores 68.2%. The competitors are: Bagging, AdaBoost, Random Forest, Random Subspace and Rotation Forest, ALL with 10 decision trees. YOUR TASK: rank the competitors and predict the ensemble accuracy for each one. The WINNER needs a correct ranking and predictions within 3% of the true accuracies (MSE for a tie-break). The judge is WEKA.

66 Ensembles of 10 (single decision tree: 68.2%):
1. Rotation Forest 85.0%
2. AdaBoost 82.9%
3. Random Subspace 79.1%
4. Random Forest 78.7%
5. Bagging 75.6%

67 (The same results.) But you know what the funny thing is?...

68 Rotation Forest 85.0%, AdaBoost 82.9%, Random Subspace 79.1%, Random Forest 78.7%, Bagging 75.6%, decision tree 68.2%... and yet: 1-nn 87.4%, SVM 89.5%.

69 The moral of the story... 1. There may be a simpler solution. Don't overlook it! 2. The most acclaimed methods are not always the best. Heeeeey, this proves the fallibility of my classifier ensemble theory, Marcello Pelillo! (who left already...) :(

70 Everyone, WAKE UP! And thank you for still being here :) 1. Classifier combiners. Nobody talks about this... 2. Time for an example: combiner matters 3. Diversity. Everybody talks about this... 4. Good diversity and bad diversity 5. Where next in classifier ensembles? 6. One final play instead of conclusions...

