Are we still talking about diversity in classifier ensembles? Ludmila I Kuncheva School of Computer Science Bangor University, UK.

Presentation transcript:

Are we still talking about diversity in classifier ensembles? Ludmila I Kuncheva School of Computer Science Bangor University, UK

Search for “CLASSIFIER ENSEMBLE DIVERSITY” on 10 Sep 2014: Publications 580, Citations 4594.

Publication venues (number of papers):
MULTIPLE CLASSIFIER SYSTEMS 30
INT JOINT CONF ON NEURAL NETWORKS (IJCNN) 22
PATTERN RECOGNITION 17
NEUROCOMPUTING 14
EXPERT SYSTEMS WITH APPLICATIONS 13
INFORMATION SCIENCES 12
APPLIED SOFT COMPUTING 11
PATTERN RECOGNITION LETTERS 10
INFORMATION FUSION 9
IEEE INT JOINT CONF ON NEURAL NETWORKS 9
KNOWLEDGE-BASED SYSTEMS 7
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING 7
INT J OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE 6
MACHINE LEARNING 5
IEEE TRANSACTIONS ON NEURAL NETWORKS 5
JOURNAL OF MACHINE LEARNING RESEARCH 5
APPLIED INTELLIGENCE 4
INTELLIGENT DATA ANALYSIS 4
IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION 4
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING 4
NEURAL INFORMATION PROCESSING
Total: … papers, 335 journals / conferences.


Where in the world are we? China 140, UK 68, USA 63, Spain 55, Brazil 41, Canada 32, Poland 28, Iran 23, Italy, France 11.

Where in the world are we? China 140, UK 68, USA 63, Spain 55, Brazil 41, Canada 32, Poland 28, Iran 23, Italy, France 11. Laurent HEUTTE, Professor of Computer Science, University of Rouen, France.

Are we still talking about diversity in classifier ensembles? Apparently yes...

That elusive diversity...

[Figure: a classifier ensemble: feature values (the object description) are fed to each classifier; a “combiner” fuses the classifiers' outputs into a single class label.]

That elusive diversity...

For a pair of classifiers, cross-tabulate their ORACLE outputs (correct / wrong) over the data:

                        Classifier 2: correct    Classifier 2: wrong
Classifier 1: correct
Classifier 1: wrong

Independent outputs ≠ independent errors; hence, use ORACLE outputs. The top-right cell, for example, is the number of instances labelled correctly by classifier 1 and mislabelled by classifier 2.

That elusive diversity...

Pairwise diversity measures defined on this 2×2 table: Q statistic, kappa, correlation (rho), disagreement, double fault, ...
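To make the 2×2 oracle table concrete, here is a minimal Python sketch (my own code, not from the slides; the function and example data are illustrative) that counts the four cells for a pair of classifiers and computes several of the pairwise measures listed above, using their standard definitions: the Q statistic, the correlation rho, the disagreement measure and the double fault.

```python
# Minimal sketch (not from the slides): pairwise diversity measures computed
# from "oracle" outputs, i.e. binary vectors with 1 = correct, 0 = wrong.
import numpy as np

def pairwise_diversity(o1, o2):
    """o1, o2: 1-D arrays of 0/1 oracle outputs for two classifiers."""
    o1, o2 = np.asarray(o1), np.asarray(o2)
    n = len(o1)
    a = np.sum((o1 == 1) & (o2 == 1))   # both correct
    b = np.sum((o1 == 1) & (o2 == 0))   # classifier 1 correct, classifier 2 wrong
    c = np.sum((o1 == 0) & (o2 == 1))   # classifier 1 wrong, classifier 2 correct
    d = np.sum((o1 == 0) & (o2 == 0))   # both wrong
    # (guards for degenerate tables, e.g. a*d + b*c == 0, omitted for brevity)
    q = (a * d - b * c) / (a * d + b * c)                               # Q statistic
    rho = (a * d - b * c) / np.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    disagreement = (b + c) / n
    double_fault = d / n
    return {"Q": q, "rho": rho, "disagreement": disagreement,
            "double_fault": double_fault}

# Example with two hypothetical classifiers on 10 instances
o1 = [1, 1, 1, 1, 0, 1, 0, 1, 1, 0]
o2 = [1, 0, 1, 1, 1, 1, 0, 0, 1, 1]
print(pairwise_diversity(o1, o2))
```

Kappa, the remaining measure on the list, is computed in the kappa-error sketch further below.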

That elusive diversity... SEVENTY-SIX!!!

Do we need more “NEW” pairwise diversity measures? Looks like we don’t...

Kappa-error diagrams
- proposed by Margineantu and Dietterich in 1997
- visualise individual accuracy and diversity in a 2-dimensional plot
- have been used to decide which ensemble members can be pruned without much harm to the overall performance

Example: sonar data (UCI): 260 instances, 60 features, 2 classes; ensemble size L = 11 classifiers; base model: C4.5 tree.
Adaboost 75.0%, Bagging 77.0%, Random subspace 80.9%, Random oracle 83.3%, Rotation Forest 84.7%.
Kuncheva L.I., A bound on kappa-error diagrams for analysis of classifier ensembles, IEEE Transactions on Knowledge and Data Engineering, 2013, 25(3) (DOI: /TKDE).

Kappa-error diagrams

                        C2 correct    C2 wrong
C1 correct                  a             b
C1 wrong                    c             d

kappa = (observed agreement − chance agreement) / (1 − chance agreement)

Each pair of classifiers is plotted as one point: kappa between the pair on the x-axis and the pair's average individual error on the y-axis.
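As a sketch of how such a diagram is populated (my own code; the convention of plotting kappa on the x-axis against the pair's mean individual error on the y-axis follows the usual construction of kappa-error diagrams), every pair of ensemble members contributes one point:

```python
# Sketch (my own code, not from the slides): pairwise kappa from the 2x2
# oracle table above, and the (kappa, mean pair error) coordinates used in
# a kappa-error diagram.
import numpy as np
from itertools import combinations

def pair_kappa(o1, o2):
    o1, o2 = np.asarray(o1), np.asarray(o2)
    n = len(o1)
    a = np.sum((o1 == 1) & (o2 == 1))
    b = np.sum((o1 == 1) & (o2 == 0))
    c = np.sum((o1 == 0) & (o2 == 1))
    d = np.sum((o1 == 0) & (o2 == 0))
    observed = (a + d) / n                                   # observed agreement
    chance = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2  # chance agreement
    # (guard for chance == 1 omitted for brevity)
    return (observed - chance) / (1 - chance)

def kappa_error_points(oracle):
    """oracle: L x N array of 0/1 oracle outputs (rows = classifiers)."""
    errors = 1.0 - oracle.mean(axis=1)            # individual error rates
    points = []
    for i, j in combinations(range(oracle.shape[0]), 2):
        k = pair_kappa(oracle[i], oracle[j])      # x-coordinate: kappa
        e = (errors[i] + errors[j]) / 2.0         # y-coordinate: mean pair error
        points.append((k, e))
    return np.array(points)

# Example: a random 'ensemble' of L = 11 classifiers on N = 200 instances
rng = np.random.default_rng(0)
oracle = (rng.random((11, 200)) < 0.8).astype(int)   # roughly 80% individual accuracy
print(kappa_error_points(oracle)[:5])
```

Plotting the resulting array gives clouds like the ones sketched on the following slides, and pruning heuristics of the kind mentioned above can rank ensemble members by where their pairs sit in this plot.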

Kappa-error diagrams

[Figure: the (kappa, error) plane, with the tight bound on the feasible region marked as "bound (tight)".]

Kappa-error diagrams – simulated ensembles, L = 3

[Figure: kappa (x-axis) versus error (y-axis).]

Kappa-error diagrams – real data, L = 11

[Figure: kappa (x-axis) versus error (y-axis).]

Real data: 77,422,500 pairs of classifiers

[Figure: kappa (x-axis) versus error (y-axis); the gap between the cloud of real pairs and the bound is marked "room for improvement".]

Is there space for new classifier ensembles? Looks like yes...

Good and Bad diversity Diversity is not MONOTONICALLY related to ensemble accuracy

Good and Bad diversity

3 classifiers: A, B, C; 15 objects; each vote is either wrong or correct; individual accuracy = 10/15 for each classifier. P = MAJORITY VOTE ensemble accuracy:
- independent classifiers: P = 11/15
- identical classifiers: P = 10/15
- dependent classifiers 1: P = 7/15
- dependent classifiers 2: P = 15/15
[Figure: the voting patterns of A, B and C over the 15 objects for each of the four cases.]
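The arithmetic behind these numbers can be checked with a few lines of code. The vote patterns below are my own illustrative constructions, not the ones drawn on the slide, so they do not reproduce the slide's exact figures (e.g. the 7/15 case); they only show that the same individual accuracy of 10/15 can lead to majority-vote accuracies ranging from well below it to 15/15.

```python
# Sketch (my own illustrative patterns, not the slide's): majority vote of
# three classifiers, each correct on 10 of 15 objects (oracle output 1 = correct).
import numpy as np

def majority_accuracy(oracle):
    """oracle: L x N array of 0/1 oracle outputs; majority vote per column."""
    L = oracle.shape[0]
    return np.mean(oracle.sum(axis=0) > L / 2)

def pattern(wrong_sets, n_objects=15, n_classifiers=3):
    """Build an oracle matrix; wrong_sets[i] lists the objects classifier i gets wrong."""
    oracle = np.ones((n_classifiers, n_objects), dtype=int)
    for i, wrong in enumerate(wrong_sets):
        oracle[i, list(wrong)] = 0
    return oracle

identical = pattern([{0, 1, 2, 3, 4}] * 3)                                      # same 5 mistakes
spread    = pattern([{0, 1, 2, 3, 4}, {5, 6, 7, 8, 9}, {10, 11, 12, 13, 14}])   # errors never overlap
clustered = pattern([{0, 1, 2, 3, 4}, {0, 1, 4, 5, 6}, {2, 3, 5, 6, 7}])        # errors pile up

for name, o in [("identical", identical), ("spread", spread), ("clustered", clustered)]:
    print(name, "individual acc:", o.mean(axis=1), "majority acc:", majority_accuracy(o))
# identical -> 10/15, spread -> 15/15, clustered -> 8/15
```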

Good and Bad diversity

Same example as above (individual accuracy 10/15; P = 11/15, 10/15, 7/15 and 15/15 for the four dependence patterns). [Figure: the voting patterns, with the disagreements marked as Good diversity or Bad diversity.]

Good and Bad diversity

Data set Z; ensemble of L = 7 classifiers. [Figure: the classifiers' outputs on Z.] Are these outputs diverse?

Good and Bad diversity

Data set Z; ensemble of L = 7 classifiers. [Figure: a different set of outputs on Z.] How about these?

Good and Bad diversity

Data set Z; ensemble of L = 7 classifiers. [Figure: on each object the votes split 3 vs 4.] Can't be more diverse, really...

Good and Bad diversity

Data set Z; ensemble of L = 7 classifiers. MAJORITY VOTE. [Figure: the votes and the majority-vote output.] Good diversity.

Good and Bad diversity

Data set Z; ensemble of L = 7 classifiers. MAJORITY VOTE. [Figure: the votes and the majority-vote output.] Bad diversity.

Good and Bad diversity

Decomposition of the Majority Vote Error:
E_maj = (average individual error) − (GOOD diversity) + (BAD diversity)

Brown G., L.I. Kuncheva, "Good" and "bad" diversity in majority vote ensembles, Proc. Multiple Classifier Systems (MCS'10), Cairo, Egypt, LNCS 5997, 2010.
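A small numerical check of this decomposition (my own sketch; the per-object bookkeeping is my reading of the decomposition, with good diversity counting the wrong votes on objects the ensemble gets right and bad diversity counting the correct votes on objects the ensemble gets wrong, both averaged over the N·L votes):

```python
# Sketch (my own): numerical check of E_maj = E_ind - good_diversity + bad_diversity
# for majority voting over 0/1 oracle outputs. The formulas are my reading of the
# decomposition, not copied from the cited paper.
import numpy as np

def decompose(oracle):
    """oracle: L x N array, 1 = classifier correct on that object."""
    L, N = oracle.shape
    votes = oracle.sum(axis=0)                        # correct votes per object
    maj_correct = votes > L / 2                       # majority-vote outcome (odd L, 2 classes)
    e_maj = 1.0 - maj_correct.mean()                  # majority vote error
    e_ind = 1.0 - oracle.mean()                       # average individual error
    good = (L - votes)[maj_correct].sum() / (N * L)   # wrong votes where the ensemble is right
    bad = votes[~maj_correct].sum() / (N * L)         # correct votes where the ensemble is wrong
    return e_maj, e_ind, good, bad

rng = np.random.default_rng(1)
oracle = (rng.random((7, 100)) < 0.7).astype(int)     # L = 7 classifiers, roughly 70% accuracy
e_maj, e_ind, good, bad = decompose(oracle)
print(e_maj, e_ind - good + bad)                      # the two numbers should coincide
```

For any 0/1 oracle matrix and the tie handling implicit in `votes > L / 2`, the two printed numbers coincide, which is the point of the decomposition.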

Good and Bad diversity

[Figure: the two L = 7 examples side by side.] Note that the diversity quantity is 3 in both cases.

Ensemble Margin

[Figure: examples of a POSITIVE and a NEGATIVE ensemble margin.]

Ensemble Margin

The average margin keeps the sign. However, nearly all diversity measures are functions of the average absolute margin or the average squared margin, so the sign of the margin is lost...
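A short sketch of my own makes the point for a 2-class majority vote with L = 7, echoing the 3-vs-4 splits above: a split in favour of the true class and a split against it have margins +1/7 and −1/7, which the absolute and squared margins cannot tell apart. The margin used here (fraction of correct votes minus fraction of incorrect votes) is the usual voting margin, stated as an assumption about what the slides intend.

```python
# Sketch (my own): signed voting margin vs. its absolute/squared versions
# for a 2-class majority vote over oracle outputs (1 = correct vote).
import numpy as np

def margins(oracle):
    """oracle: L x N array of 0/1 oracle outputs; returns the margin per object."""
    L = oracle.shape[0]
    correct_votes = oracle.sum(axis=0)
    return (2 * correct_votes - L) / L      # in [-1, 1]; positive iff the majority is correct

# Two L = 7 ensembles on the same 4 objects: votes split 4-vs-3 either
# for the true class (good diversity) or against it (bad diversity).
good = np.array([[1, 1, 1, 1]] * 4 + [[0, 0, 0, 0]] * 3)   # 4 correct, 3 wrong per object
bad  = np.array([[1, 1, 1, 1]] * 3 + [[0, 0, 0, 0]] * 4)   # 3 correct, 4 wrong per object

for name, o in [("good", good), ("bad", bad)]:
    m = margins(o)
    print(name, "avg margin:", m.mean(), "avg |margin|:", np.abs(m).mean(),
          "avg margin^2:", (m ** 2).mean())
# The averages of |margin| and margin^2 are identical; only the signed average differs.
```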

Ensemble Margin

Diversity is not MONOTONICALLY related to ensemble accuracy. So, STOP LOOKING for a monotonic relationship!!!

Conclusions
1. Beware! Overflow of diversity measures!
2. In theory, there is some room for better classifier ensembles.
3. Diversity is not monotonically related to ensemble accuracy; hence larger diversity does not necessarily mean better accuracy.
Directly engineered or heuristic? Up to you.
