Are we still talking about diversity in classifier ensembles? Ludmila I Kuncheva, School of Computer Science, Bangor University, UK.

Are we still talking about diversity in classifier ensembles? ... Completely irrelevant to your Workshop...

Let’s talk instead about: multi-view and classifier ensembles.

A classifier ensemble: the feature values (object description) are fed to each classifier, and the classifiers’ outputs go to a “combiner”, which produces the class label.
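For readers who want the diagram in running code, here is a minimal sketch of such an ensemble with a majority-vote combiner. It is not from the talk; the data set, the three base models, and all parameters are illustrative assumptions using scikit-learn.

```python
# A minimal classifier ensemble: several base classifiers and a majority-vote "combiner".
# Illustrative sketch only; the data and model choices are assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import VotingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# Three different base classifiers, all seeing the same feature values (one view).
ensemble = VotingClassifier(
    estimators=[("tree", DecisionTreeClassifier(random_state=1)),
                ("nb", GaussianNB()),
                ("knn", KNeighborsClassifier())],
    voting="hard",  # majority vote plays the role of the "combiner"
)
ensemble.fit(X_train, y_train)
print("ensemble accuracy:", round(ensemble.score(X_test, y_test), 3))
```

Swapping the voting rule or the base models changes only the “combiner” and “classifier” boxes of the diagram; the overall picture stays the same.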

A classifier ensemble? A neural network maps the feature values (object description) to a class label; inside it, parts of the network play the roles of “classifier” and “combiner”.

A classifier ensemble? Or just a fancy combiner? [The same diagram: feature values (object description), classifiers, class label.]

A classifier? The bank of base classifiers can be viewed as a fancy feature extractor for the “combiner”, which assigns the class label; the whole construction maps feature values (object description) to a class label like any single classifier.

Why classifier ensembles then?
a. because we like to complicate entities beyond necessity (anti-Occam’s razor)
b. because we are lazy and stupid and can’t be bothered to design and train one single sophisticated classifier
c. because democracy is so important to our society, it must be important to classification

Many names for essentially the same thing (some fancy, some old):
combination of multiple classifiers [Lam95, Woods97, Xu92, Kittler98]
classifier fusion [Cho95, Gader96, Grabisch92, Keller94, Bloch96]
mixture of experts [Jacobs91, Jacobs95, Jordan95, Nowlan91]
committees of neural networks [Bishop95, Drucker94]
consensus aggregation [Benediktsson92, Ng92, Benediktsson97]
voting pool of classifiers [Battiti94]
dynamic classifier selection [Woods97]
composite classifier systems [Dasarathy78]
classifier ensembles [Drucker94, Filippi94, Sharkey99]
bagging, boosting, arcing, wagging [Sharkey99]
modular systems [Sharkey99]
collective recognition [Rastrigin81, Barabash83]
stacked generalization [Wolpert92]
divide-and-conquer classifiers [Chiang94]
pandemonium system of reflective agents [Smieja96]
change-glasses approach to classifier selection [KunchevaPRL93]
etc.

(The same list again, this time with annotations marking terms that are now out of fashion and terms that have been subsumed by others.)

The Netflix Prize announcement: “Congratulations! The Netflix Prize sought to substantially improve the accuracy of predictions about how much someone is going to enjoy a movie based on their movie preferences. On September 21, 2009 we awarded the $1M Grand Prize to team ‘BellKor’s Pragmatic Chaos’. Read about their algorithm, check out team scores on the Leaderboard, and join the discussions on the Forum. We applaud all the contributors to this quest, which improves our ability to connect people to the movies they love.” [Alongside it, the familiar diagram: feature values (object description), classifiers, combiner, class label; a classifier ensemble.]

Cited 7194 times by 28 July 2013 (Google Scholar). [The same diagram again: feature values (object description), classifiers, combiner, class label; a classifier ensemble.]

Saso Dzeroski and David Hand:
S. Dzeroski and B. Zenko (2004) Is combining classifiers better than selecting the best one? Machine Learning, 54.
David J. Hand (2006) Classifier technology and the illusion of progress, Statist. Sci. 21 (1).
Classifier combination? Hmmmm... We are kidding ourselves; there is no real progress in spite of ensemble methods. Chances are that the single best classifier will be better than the ensemble.

Quo Vadis? A literature-search query for the field:
"combining classifiers" OR "classifier combination" OR "classifier ensembles" OR "ensemble of classifiers" OR "combining multiple classifiers" OR "committee of classifiers" OR "classifier committee" OR "committees of neural networks" OR "consensus aggregation" OR "mixture of experts" OR "bagging predictors" OR adaboost OR (("random subspace" OR "random forest" OR "rotation forest" OR boosting) AND "machine learning")

Gartner’s Hype Cycle: a typical evolution pattern of a new technology. Where are we?...

[Figure: journals of the most-cited papers matching the query. Legend:]
(6) IEEE TPAMI = IEEE Transactions on Pattern Analysis and Machine Intelligence
IEEE TSMC = IEEE Transactions on Systems, Man and Cybernetics
JASA = Journal of the American Statistical Association
IJCV = International Journal of Computer Vision
JTB = Journal of Theoretical Biology
(2) PPL = Protein and Peptide Letters
JAE = Journal of Animal Ecology
PR = Pattern Recognition
(4) ML = Machine Learning
NN = Neural Networks
CC = Cerebral Cortex
The top cited paper is from… an application paper.

International Workshop on Multiple Classifier Systems 2000 – continuing

Levels of questions (Data set, features, Classifier 1, Classifier 2, …, Classifier L, Combiner):
A. Combination level: selection or fusion? voting or another combination method? trainable or non-trainable combiner?
B. Classifier level: same or different classifiers? decision trees, neural networks or other? how many?
C. Feature level: all features or subsets of features? random or selected subsets?
D. Data level: independent/dependent bootstrap samples? selected data sets?
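As a hedged illustration of how these four levels map onto concrete knobs, the sketch below uses scikit-learn’s bagging (an assumed choice, not the talk’s): the data level is varied with bootstrap samples, the feature level with random feature subsets, the classifier level by the base model and ensemble size, and the combination level is a fixed, non-trainable vote.

```python
# Sketch: the four levels of design choices in one configurable ensemble.
# Assumed data and parameters; not code from the talk.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

ensemble = BaggingClassifier(
    n_estimators=50,    # B: classifier level - same base model (a decision tree by default), 50 of them
    bootstrap=True,     # D: data level - independent bootstrap samples
    max_features=0.5,   # C: feature level - a random half of the features for each classifier
    random_state=0,
)
# A: combination level - a fixed, non-trainable vote over the members' outputs.
ensemble.fit(X, y)
print("training accuracy:", round(ensemble.score(X, y), 3))
```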

[Figure: 50 diverse linear classifiers versus 50 non-diverse linear classifiers.]
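Because the talk is about diversity, here is a hedged sketch of one simple way to quantify it: the average pairwise disagreement between ensemble members. The data, the bagging ensemble, and the choice of measure are assumptions; the Q statistic, kappa, or double-fault measures could be computed analogously.

```python
# Sketch: average pairwise disagreement as a simple measure of ensemble diversity.
# Assumed data and models; not code from the talk.
import itertools
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

X, y = make_classification(n_samples=600, n_features=10, random_state=2)
ensemble = BaggingClassifier(n_estimators=15, random_state=2).fit(X, y)

# Each member's label for every object.
preds = np.array([member.predict(X) for member in ensemble.estimators_])

# Disagreement of a pair = fraction of objects the two classifiers label differently.
pairs = itertools.combinations(range(len(preds)), 2)
disagreement = np.mean([np.mean(preds[i] != preds[j]) for i, j in pairs])
print("average pairwise disagreement:", round(disagreement, 3))
```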

Number of classifiers L versus strength of classifiers:
L = 1: the perfect classifier.
3-8 classifiers: heterogeneous models, trained combiner (stacked generalisation).
100+ classifiers: same model, non-trained combiner (bagging, boosting, etc.).
A large ensemble of nearly identical classifiers: REDUNDANCY. A small ensemble of weak classifiers: INSUFFICIENCY. Must engineer diversity... And how about the region in between? Same or different models? Trained or non-trained combiner? Selection or fusion? IS IT WORTH IT?
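The two ends of this spectrum can be sketched directly (an illustration under assumed data and models, not the talk’s experiments): a handful of heterogeneous classifiers with a trained combiner (stacked generalisation) versus a hundred copies of one model with a non-trained combiner (bagging).

```python
# Sketch: the two ends of the ensemble-size spectrum (illustrative assumptions only).
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import StackingClassifier, BaggingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=15, random_state=3)

# 3-8 heterogeneous classifiers + trained combiner (stacked generalisation).
stack = StackingClassifier(
    estimators=[("tree", DecisionTreeClassifier(random_state=3)),
                ("nb", GaussianNB()),
                ("lr", LogisticRegression(max_iter=1000))],
    final_estimator=LogisticRegression(max_iter=1000),  # the trained combiner
)

# 100+ classifiers of the same model + non-trained combiner (bagging of decision trees).
bag = BaggingClassifier(n_estimators=100, random_state=3)

for name, model in [("stacking (few, heterogeneous)", stack),
                    ("bagging (many, same model)", bag)]:
    print(name, round(cross_val_score(model, X, y, cv=5).mean(), 3))
```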

A classifier ensemble, one view: all classifiers receive the same feature values (object description), and their outputs go to the “combiner”, which produces the class label.

A classifier ensemble, multiple views: each classifier receives its own feature values (its own object description), and the “combiner” produces the class label.

1998

“Distinct” is what you call “late fusion”; “shared” is what you call “early fusion”.

EXPRESSION OF EMOTION - MODALITIES
Behavioural: facial expression, posture, speech, gesture, interaction with the computer (pressure on mouse, drag-click speed, eye tracking, dialogue with tutor).
Physiological, peripheral nervous system: galvanic skin response, blood pressure, skin temperature, respiration, EMG, pulse rate, pulse variation.
Physiological, central nervous system: EEG, fMRI, fNIRS.

Data classification strategies for multiple modalities (modality 1, modality 2, modality 3):
(1) “Early fusion”: concatenate the features from all modalities.
(2) “Mid-fusion”: feature extraction per modality, then concatenation.
(3) “Late fusion”: straight ensemble classification, one classifier per modality combined by an ensemble.
And many combinations thereof...

The trade-off: with early fusion we capture all the dependencies between modalities but may not be able to handle the complexity; with late fusion we lose the dependencies but can handle the complexity.
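To make the trade-off concrete, here is a hedged sketch comparing the two extremes on synthetic data, with the feature vector arbitrarily split into three pretend modalities (everything here, from the data to the logistic-regression base models, is an assumption): early fusion trains one classifier on the concatenated features, late fusion trains one classifier per modality and soft-votes their outputs.

```python
# Sketch: early fusion (concatenate features) vs late fusion (one classifier per modality + combiner).
# All data and model choices are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=600, n_features=12, n_informative=6, random_state=4)
views = [slice(0, 4), slice(4, 8), slice(8, 12)]   # pretend these are three modalities

# Early fusion: one classifier on the concatenated feature vector.
early = LogisticRegression(max_iter=1000)

# Late fusion: one classifier per modality, soft-vote combiner.
late = VotingClassifier(
    estimators=[(f"view{i}",
                 make_pipeline(FunctionTransformer(lambda Z, s=s: Z[:, s]),
                               LogisticRegression(max_iter=1000)))
                for i, s in enumerate(views)],
    voting="soft",
)

print("early fusion:", round(cross_val_score(early, X, y, cv=5).mean(), 3))
print("late fusion: ", round(cross_val_score(late, X, y, cv=5).mean(), 3))
```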

[Taxonomy diagram relating multi-view fusion to feature selection for ensembles.]
Multi-view early and mid-fusion: feature selection.
Ensemble feature selection:
By the ensemble (RANKERS): decision tree ensembles, ensembles of different rankers, bootstrap ensembles of rankers.
For the ensemble: random approach, either uniform (random subspace) or non-uniform (GA); or a systematic approach, incremental or iterative, or greedy.
Multi-view late fusion.
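As one concrete instance of the bootstrap-ensembles-of-rankers branch, here is a hedged sketch: each bootstrap replicate of the data produces a feature ranking (here taken from random-forest importances, an assumed choice), and the rankings are averaged into a single aggregated ranking.

```python
# Sketch: ensemble feature selection by a bootstrap ensemble of rankers (illustrative assumptions).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=400, n_features=10, n_informative=3, random_state=5)
rng = np.random.default_rng(5)

rankings = []
for _ in range(20):                                  # 20 bootstrap replicates
    idx = rng.integers(0, len(X), size=len(X))       # bootstrap sample of the data
    rf = RandomForestClassifier(n_estimators=50, random_state=5).fit(X[idx], y[idx])
    order = np.argsort(-rf.feature_importances_)     # rank features by importance
    rank = np.empty_like(order)
    rank[order] = np.arange(1, X.shape[1] + 1)       # rank 1 = most important
    rankings.append(rank)

mean_rank = np.mean(rankings, axis=0)
print("features ordered by aggregated rank:", np.argsort(mean_rank))
```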

This is what I think:
1. Deciding which approach to take is rather an art than a science.
2. This choice is, crucially, CONTEXT-SPECIFIC.

Where does diversity come into this? Hmm... Nowhere...