Lecture Slides for INTRODUCTION TO Machine Learning, 3rd Edition
ETHEM ALPAYDIN © The MIT Press, 2014
alpaydin@boun.edu.tr
http://www.cmpe.boun.edu.tr/~ethem/i2ml3e

CHAPTER 17: Combining Multiple Learners

Rationale
No Free Lunch Theorem: there is no single algorithm that is always the most accurate.
Generate a group of base-learners which, when combined, has higher accuracy than any one alone.
Different learners can differ in: algorithms, hyperparameters, representations (modalities/views), training sets, or the subproblems they solve.
The design trade-off is diversity vs. accuracy.

Voting
Linear combination: y = Σ_j w_j d_j, with w_j ≥ 0 and Σ_j w_j = 1.
Classification: y_i = Σ_j w_j d_ji for each class i; choose the class with the maximum y_i.
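
A minimal sketch of the classification rule above; the array values and variable names are illustrative, not from the slides:

    import numpy as np

    # Each row is one base-learner's class posteriors d_j (L=3 learners, K=2 classes).
    posteriors = np.array([[0.7, 0.3],
                           [0.6, 0.4],
                           [0.2, 0.8]])
    w = np.array([0.5, 0.3, 0.2])   # convex weights: w_j >= 0, sum_j w_j = 1

    y = w @ posteriors              # y_i = sum_j w_j * d_ji
    print(y, y.argmax())            # combined posteriors and the chosen class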

Bayesian perspective: P(C_i | x) = Σ_j P(C_i | x, M_j) P(M_j); the voting weights approximate the model probabilities P(M_j).
If the d_j are iid, the simple average y = (1/L) Σ_j d_j satisfies E[y] = E[d_j] and Var(y) = Var(d_j)/L: bias does not change, variance decreases by a factor of L.
If the d_j are dependent, Var(y) = (1/L²)(Σ_j Var(d_j) + 2 Σ_{i<j} Cov(d_i, d_j)), so error increases with positive correlation.
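
A quick numerical check of the iid case (the distribution parameters below are arbitrary toy values):

    import numpy as np

    rng = np.random.default_rng(0)
    L = 10
    d = rng.normal(loc=1.0, scale=2.0, size=(100_000, L))  # L iid learner outputs
    y = d.mean(axis=1)                                     # averaged ensemble output
    print(y.mean())   # ~1.0: bias unchanged
    print(y.var())    # ~4/L = 0.4: variance divided by L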

Fixed Combination Rules
Combine the outputs d_ji with a fixed function rather than trained weights: sum (simple average), weighted sum, median, minimum, maximum, or product.
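
The rules applied to one hypothetical stack of posteriors; in practice these would be predict_proba outputs of trained learners:

    import numpy as np

    posteriors = np.array([[0.7, 0.3],   # d_1
                           [0.6, 0.4],   # d_2
                           [0.2, 0.8]])  # d_3

    rules = {
        "sum":     posteriors.mean(axis=0),
        "median":  np.median(posteriors, axis=0),
        "minimum": posteriors.min(axis=0),
        "maximum": posteriors.max(axis=0),
        "product": posteriors.prod(axis=0),
    }
    for name, y in rules.items():
        print(f"{name:8s} y = {y}, choose class {y.argmax()}")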

Error-Correcting Output Codes
K classes; L binary problems (Dietterich and Bakiri, 1995).
The code matrix W codes classes in terms of learners:
One per class: L = K.
Pairwise: L = K(K−1)/2.

Full code: L = 2^(K−1) − 1.
With a reasonable L, find W such that the Hamming distance between rows (and between columns) is maximized.
Decisions are combined by a voting scheme.
The subproblems may be more difficult than the one-per-class problems.
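
For a runnable sketch, scikit-learn's OutputCodeClassifier wraps this decomposition; note it draws a random code matrix rather than maximizing Hamming distance, so it only illustrates the idea (the dataset and base-learner are our choices):

    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.multiclass import OutputCodeClassifier

    X, y = load_iris(return_X_y=True)
    # code_size=2 gives L = 2*K binary subproblems for K classes
    ecoc = OutputCodeClassifier(LogisticRegression(max_iter=1000),
                                code_size=2, random_state=0)
    print(ecoc.fit(X, y).score(X, y))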

Bagging
Use bootstrapping to generate L training sets and train one base-learner on each (Breiman, 1996).
Combine by voting (average or median for regression).
Unstable algorithms, whose fitted model changes a lot with small changes in the training set, profit most from bagging.
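
A minimal sketch of the procedure with an unstable base-learner (a fully grown decision tree); function and variable names are ours:

    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)
    rng = np.random.default_rng(0)
    L, n = 25, len(X)

    learners = []
    for _ in range(L):
        idx = rng.integers(0, n, size=n)   # bootstrap sample: draw n with replacement
        learners.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

    def bagged_predict(X_new):
        votes = np.stack([m.predict(X_new) for m in learners])  # shape (L, n_samples)
        return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)

    print((bagged_predict(X) == y).mean())  # resubstitution accuracy of the vote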

AdaBoost
Generate a sequence of base-learners, each focusing on the previous ones' errors (Freund and Schapire, 1996).
Misclassified instances have their sampling probabilities increased so the next learner concentrates on them; the final decision is a weighted vote of all the learners.
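
A minimal binary AdaBoost sketch (labels in {-1, +1}); it follows the standard Freund-Schapire weighting rather than reproducing the book's exact pseudocode, and the choice of stumps is ours:

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def adaboost_fit(X, y, L=20):           # y must be in {-1, +1}
        p = np.full(len(X), 1.0 / len(X))   # instance probabilities, initially uniform
        stumps, alphas = [], []
        for _ in range(L):
            stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=p)
            pred = stump.predict(X)
            eps = p[pred != y].sum()        # weighted training error
            if eps >= 0.5:                  # no better than chance: stop
                break
            eps = max(eps, 1e-12)           # guard against a perfect stump
            alpha = 0.5 * np.log((1 - eps) / eps)
            p *= np.exp(-alpha * y * pred)  # raise weight of misclassified instances
            p /= p.sum()
            stumps.append(stump)
            alphas.append(alpha)
        return stumps, alphas

    def adaboost_predict(stumps, alphas, X_new):
        scores = sum(a * s.predict(X_new) for s, a in zip(stumps, alphas))
        return np.sign(scores)              # weighted vote of the base-learners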

Mixture of Experts
Voting where the weights are input-dependent, computed by a gating model (Jacobs et al., 1991): y = Σ_j w_j(x) d_j(x).
Experts or gating can be nonlinear.
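
A sketch of just the combination rule y = Σ_j w_j(x) d_j(x) with softmax gating over linear experts; the parameter values are toy numbers and joint training (e.g., by gradient descent) is omitted:

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    x = np.array([1.0, -0.5])
    V = np.array([[0.3, 0.1], [-0.2, 0.4], [0.05, -0.3]])  # gating weights, one row per expert
    W = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])     # linear experts

    w = softmax(V @ x)   # input-dependent weights w_j(x)
    d = W @ x            # expert outputs d_j(x)
    y = w @ d            # combined output
    print(w, y)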

Stacking
The combiner f() is another learner (Wolpert, 1992): y = f(d_1, ..., d_L | Φ).
The combiner should be trained on data not used to train the base-learners, e.g., held-out or cross-validated predictions.
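
A hedged scikit-learn sketch (our choice of base-learners and combiner); cv=5 makes the combiner train on out-of-fold base-learner outputs, i.e., data the base-learners did not see:

    from sklearn.datasets import load_iris
    from sklearn.ensemble import StackingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)
    stack = StackingClassifier(
        estimators=[("tree", DecisionTreeClassifier()),
                    ("knn", KNeighborsClassifier())],
        final_estimator=LogisticRegression(max_iter=1000),  # the combiner f()
        cv=5,  # combiner trains on out-of-fold predictions
    )
    print(stack.fit(X, y).score(X, y))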

Fine-Tuning an Ensemble
Given an ensemble of dependent classifiers, do not use it as is; try to obtain independence.
Subset selection: forward (growing) / backward (pruning) approaches to improve accuracy, diversity, and independence.
Train metaclassifiers: from the outputs of correlated classifiers, extract new combinations that are uncorrelated; using PCA, we get "eigenlearners."
This parallels feature selection vs. feature extraction.
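
A sketch of the forward (growing) approach: greedily add whichever trained base-learner most improves the ensemble's majority-vote accuracy on a validation set (all names are ours; `candidates` is consumed in place):

    import numpy as np

    def vote_accuracy(members, X_val, y_val):
        votes = np.stack([m.predict(X_val) for m in members])
        pred = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
        return (pred == y_val).mean()

    def forward_select(candidates, X_val, y_val):
        chosen, best = [], 0.0
        improved = True
        while improved and candidates:
            improved = False
            for c in list(candidates):          # try adding each remaining learner
                score = vote_accuracy(chosen + [c], X_val, y_val)
                if score > best:
                    best, pick, improved = score, c, True
            if improved:                        # keep the best addition, if any
                chosen.append(pick)
                candidates.remove(pick)
        return chosen, best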

Cascading
Use learner d_j only if the preceding learners are not confident in their decisions.
Cascade learners in order of increasing complexity, so cheap learners handle the easy cases and costly ones see only the instances passed on to them.
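
A two-stage sketch: the cheap learner answers when its highest posterior clears a confidence threshold theta, and only the remaining hard cases reach the complex learner (the names and the 0.9 threshold are ours; classes are assumed to be 0..K-1):

    import numpy as np

    def cascade_predict(cheap, complex_, X, theta=0.9):
        proba = cheap.predict_proba(X)          # posteriors from the simple learner
        confident = proba.max(axis=1) >= theta
        pred = proba.argmax(axis=1)             # assumes class labels are 0..K-1
        if (~confident).any():                  # pass the hard cases on
            pred[~confident] = complex_.predict(X[~confident])
        return pred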

Combining Multiple Sources/Views
Early integration: concatenate all features and train a single learner.
Late integration: train one learner per feature set, then combine their decisions with a fixed rule or stacking.
Intermediate integration: calculate one kernel per feature set, then use a single SVM with multiple kernels.
That is: combine features vs. decisions vs. kernels.
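
A sketch of intermediate integration: one kernel per view, summed with equal weights (an assumption; multiple kernel learning would learn the weights), then a single SVM on the precomputed Gram matrix:

    from sklearn.metrics.pairwise import linear_kernel, rbf_kernel
    from sklearn.svm import SVC

    def fit_multikernel(X_view1, X_view2, y):
        # Combined training Gram matrix; at test time, predict() takes
        # rbf_kernel(X1_test, X_view1) + linear_kernel(X2_test, X_view2).
        K = rbf_kernel(X_view1) + linear_kernel(X_view2)
        return SVC(kernel="precomputed").fit(K, y)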