Combining Multiple Learners
Lecture Slides for Introduction to Machine Learning 2e
Ethem Alpaydın © The MIT Press, 2010

Rationale
No Free Lunch Theorem: there is no single algorithm that is always the most accurate. Instead, generate a group of base-learners which, when combined, achieves higher accuracy. Learners can be made to differ in their:
- Algorithms
- Hyperparameters
- Representations / modalities / views
- Training sets
- Subproblems
The key trade-off is diversity vs. accuracy.

Voting
The simplest way to combine learners is voting, a linear combination of their outputs:
y = Σ_j w_j d_j,  with w_j ≥ 0 and Σ_j w_j = 1
For classification, each learner j outputs a support d_ji for class C_i, and the combined support is
y_i = Σ_j w_j d_ji
Simple (plurality) voting is the special case w_j = 1/L.
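As an illustration of this linear-combination rule (a minimal NumPy sketch, not from the slides), where `d[j, i]` is learner j's support for class i:

```python
import numpy as np

def weighted_vote(d, w):
    """Combine base-learner outputs by a convex linear combination.

    d : shape (L, K) -- d[j, i] is learner j's support for class i.
    w : shape (L,)   -- non-negative weights summing to 1.
    Returns the combined support y_i = sum_j w_j * d_ji for each class.
    """
    d = np.asarray(d, dtype=float)
    w = np.asarray(w, dtype=float)
    assert np.all(w >= 0) and np.isclose(w.sum(), 1.0)
    return w @ d  # shape (K,)

# Three learners, two classes; simple voting uses equal weights w_j = 1/L.
d = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
y = weighted_vote(d, [1 / 3, 1 / 3, 1 / 3])
print(y.argmax())  # class chosen by the ensemble
```

Since each learner's supports sum to 1 and the weights are convex, the combined supports also sum to 1.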

Bayesian perspective: consider y = (1/L) Σ_j d_j. If the d_j are iid:
E[y] = E[d_j], so the bias does not change;
Var(y) = (1/L²) · L · Var(d_j) = Var(d_j)/L, so the variance decreases by a factor of L.
If the learners are dependent,
Var(y) = (1/L²) [ Σ_j Var(d_j) + 2 Σ_j Σ_{i<j} Cov(d_j, d_i) ],
so the error increases with positive correlation between learners.
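A quick numerical check of the iid case (an illustrative simulation, not from the slides): averaging L independent outputs leaves the mean unchanged and shrinks the variance by a factor of about L:

```python
import numpy as np

rng = np.random.default_rng(0)
L, n = 10, 200_000

# L iid base-learner outputs, each with mean 0 and variance 1.
d = rng.normal(loc=0.0, scale=1.0, size=(L, n))
y = d.mean(axis=0)  # ensemble output: average of the L learners

print(round(y.var(), 3))   # close to 1/L = 0.1 (variance shrinks by L)
print(round(y.mean(), 3))  # close to 0.0     (bias unchanged)
```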

Fixed Combination Rules
With no trained weights, common fixed rules for computing y_i from the d_ji are:
- Sum: y_i = (1/L) Σ_j d_ji
- Weighted sum: y_i = Σ_j w_j d_ji  (w_j ≥ 0, Σ_j w_j = 1)
- Median: y_i = median_j d_ji
- Minimum: y_i = min_j d_ji
- Maximum: y_i = max_j d_ji
- Product: y_i = Π_j d_ji

Error-Correcting Output Codes (Dietterich and Bakiri, 1995)
With K classes, define L binary problems. The code matrix W (K × L, entries ±1) codes the classes in terms of the base-learners:
- One-per-class: L = K
- Pairwise: L = K(K−1)/2

The full code has L = 2^(K−1) − 1 columns. For a reasonable L, find a W such that the Hamming distance between rows, and between columns, is maximized. Decisions are then combined by a voting scheme. Note that the binary subproblems may be more difficult than the original one-per-class problems.
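A minimal sketch of ECOC decoding (illustrative; the helper names are my own): with ±1 codes, picking the class whose code row maximizes the dot product with the learner outputs is equivalent to minimizing Hamming distance:

```python
import numpy as np

def one_per_class_code(K):
    """K x K code matrix: +1 on the diagonal, -1 elsewhere (L = K learners)."""
    return 2 * np.eye(K, dtype=int) - 1

def ecoc_decode(W, d):
    """Pick the class whose code row is closest to the learner outputs d.

    For +/-1 codes, maximizing the dot product W @ d is equivalent to
    minimizing the Hamming distance between d and a row of W.
    """
    return int(np.argmax(W @ np.asarray(d, dtype=float)))

W = one_per_class_code(4)   # 4 classes, 4 dichotomizers
d = [-1, +1, -1, -1]        # learner 1 fires, the others do not
print(ecoc_decode(W, d))    # -> 1
```

The same decoder works unchanged for pairwise or full codes; only the matrix W differs.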

Bagging (Breiman, 1996)
Use bootstrapping to generate L training sets and train one base-learner on each. Combine by voting (average or median for regression). Unstable algorithms, whose output changes considerably with small changes in the training set, profit most from bagging.
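A bagging sketch under stated assumptions (not from the slides): `train` stands in for any unstable base-learner; the toy one-dimensional stump below, which thresholds at the bootstrap sample's mean, is purely for illustration:

```python
import random
from collections import Counter

def bagging_fit(data, labels, train, L=11, seed=0):
    """Train L base learners, each on a bootstrap sample of the data."""
    rng = random.Random(seed)
    n = len(data)
    models = []
    for _ in range(L):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(train([data[i] for i in idx], [labels[i] for i in idx]))
    return models

def bagging_predict(models, x):
    """Majority vote over the base learners."""
    return Counter(m(x) for m in models).most_common(1)[0][0]

def train_stump(xs, ys):
    """Toy 1-D learner: threshold at the sample mean, label each side by
    the majority class of the training points falling on that side."""
    t = sum(xs) / len(xs)
    left = [y for x, y in zip(xs, ys) if x < t]
    right = [y for x, y in zip(xs, ys) if x >= t]
    rmaj = Counter(right).most_common(1)[0][0]
    lmaj = Counter(left).most_common(1)[0][0] if left else rmaj
    return lambda x: lmaj if x < t else rmaj

data = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 1, 1]
models = bagging_fit(data, labels, train_stump)
print(bagging_predict(models, 0.15), bagging_predict(models, 0.85))
```

Because each stump sees a different bootstrap sample, its threshold varies; the vote smooths out that instability.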

AdaBoost (Freund and Schapire, 1996)
Generate a sequence of base-learners, each focusing on the errors of the previous ones: instances misclassified by earlier learners are given higher weight when training the next learner, and the learners' decisions are combined by weighted voting.
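A compact AdaBoost.M1-style sketch with one-dimensional threshold stumps (illustrative; the function names are my own and the clipping constant is an implementation convenience):

```python
import math

def adaboost(X, y, T=10):
    """AdaBoost sketch with 1-D threshold stumps; labels are in {-1, +1}.

    Each round fits the stump with the lowest weighted error, then
    increases the weights of the points that stump misclassifies.
    """
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []  # list of (alpha, threshold, polarity)
    for _ in range(T):
        # stump h(x) = polarity if x >= threshold else -polarity
        best = None
        for t in sorted(set(X)):
            for pol in (+1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if (pol if xi >= t else -pol) != yi)
                if best is None or err < best[0]:
                    best = (err, t, pol)
        err, t, pol = best
        if err >= 0.5:
            break  # no better-than-chance stump left
        alpha = 0.5 * math.log((1 - err) / max(err, 1e-10))
        ensemble.append((alpha, t, pol))
        # up-weight mistakes, down-weight correct answers, renormalize
        w = [wi * math.exp(-alpha * yi * (pol if xi >= t else -pol))
             for xi, yi, wi in zip(X, y, w)]
        s = sum(w)
        w = [wi / s for wi in w]
    return ensemble

def predict(ensemble, x):
    """Weighted vote of the stumps."""
    score = sum(a * (pol if x >= t else -pol) for a, t, pol in ensemble)
    return 1 if score >= 0 else -1

X = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9]
y = [-1, -1, 1, -1, 1, 1]  # not separable by any single stump
H = adaboost(X, y)
print([predict(H, x) for x in X])  # all six training points recovered
```

No single stump classifies this set perfectly, but the weighted vote of boosted stumps does, which is the point of the reweighting.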

Mixture of Experts (Jacobs et al., 1991)
Voting where the weights are input-dependent, computed by a gating model. Both the experts and the gating can be nonlinear.
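A minimal mixture-of-experts sketch (illustrative; the linear softmax gate is an assumption here, since any gating model would do):

```python
import numpy as np

def softmax(z):
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

def mixture_of_experts(x, experts, gating_params):
    """Input-dependent voting: the gate maps x to weights via softmax,
    then blends the expert outputs with those weights."""
    weights = softmax([m @ x for m in gating_params])
    outputs = np.array([f(x) for f in experts])
    return weights @ outputs

# Two toy experts on 1-D input (with a bias term appended to x): one
# answers -1, one +1; the gate shifts the weight between them with x.
experts = [lambda x: -1.0, lambda x: +1.0]
gate = [np.array([-5.0, 0.0]), np.array([5.0, 0.0])]  # expert 0 for x < 0

print(mixture_of_experts(np.array([-1.0, 1.0]), experts, gate))  # close to -1
print(mixture_of_experts(np.array([+1.0, 1.0]), experts, gate))  # close to +1
```

In a trained mixture, both the expert parameters and the gate are learned jointly; here they are fixed only to show the input-dependent weighting.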

Stacking (Wolpert, 1992)
The combiner f() is itself another learner, trained on the outputs of the base-learners for data unused in training them.
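A stacking sketch (illustrative; `stack_fit`/`stack_predict` are names of my own): the combiner is a least-squares linear layer fit on base-learner outputs for held-out data:

```python
import numpy as np

def stack_fit(base_preds, y):
    """Fit the combiner f() -- here a least-squares linear layer -- on the
    base learners' outputs for held-out data (f must not be trained on the
    base learners' own training set, or it rewards overfit learners)."""
    D = np.column_stack([base_preds, np.ones(len(y))])  # add bias column
    coef, *_ = np.linalg.lstsq(D, y, rcond=None)
    return coef

def stack_predict(coef, d):
    d = np.append(np.asarray(d, dtype=float), 1.0)
    return float(d @ coef)

# Two base learners on a validation set; learner 0 tracks the target and
# learner 1 is constant noise -- stacking learns to rely on learner 0.
base = np.array([[0.0, 0.5], [1.0, 0.5], [2.0, 0.5], [3.0, 0.5]])
y = np.array([0.0, 1.0, 2.0, 3.0])
coef = stack_fit(base, y)
print(stack_predict(coef, [1.5, 0.5]))  # close to 1.5
```

Unlike a fixed rule, the trained combiner can discover which base learners to trust and by how much.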

Fine-Tuning an Ensemble
Given an ensemble of dependent classifiers, do not use it as is; try to obtain independence:
1. Subset selection: forward (growing) / backward (pruning) approaches to improve accuracy, diversity, and independence.
2. Train metaclassifiers: from the outputs of correlated classifiers, extract new combinations that are uncorrelated; using PCA, we get "eigenlearners."
This is analogous to the distinction between feature selection and feature extraction.

Cascading
Use learner d_j only if the preceding learners are not confident enough. Cascade the learners in order of increasing complexity, so that cheap learners handle the easy, frequent cases and costly learners are consulted only for the rest.
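A cascading sketch (illustrative, my own naming): each stage returns a label and a confidence, and control passes down the cascade only when confidence is low:

```python
def cascade_predict(x, learners, threshold=0.9):
    """Apply learners in order of increasing complexity; stop at the first
    one that is confident enough, letting the last learner decide otherwise.

    Each learner returns (label, confidence) with confidence in [0, 1].
    """
    for learner in learners[:-1]:
        label, conf = learner(x)
        if conf >= threshold:
            return label          # early, cheap decision
    return learners[-1](x)[0]     # the costly last stage always decides

# Toy cascade: a cheap rule that is only sure far from the boundary at
# 0.5, followed by a costlier fallback that always answers.
cheap = lambda x: (int(x >= 0.5), abs(x - 0.5) * 2)
costly = lambda x: (int(x >= 0.5), 1.0)

print(cascade_predict(0.05, [cheap, costly]))  # cheap learner decides: 0
print(cascade_predict(0.52, [cheap, costly]))  # deferred to costly stage: 1
```

The average cost stays low because most inputs never reach the expensive stages.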

Combining Multiple Sources
- Early integration: concatenate all features and train a single learner.
- Late integration: train one learner per feature set, then combine their decisions with a fixed rule or by stacking.
- Intermediate integration: compute one kernel per feature set, then use a single SVM with multiple kernels.
In short: combining features vs. decisions vs. kernels.