Ensemble Learning – Bagging, Boosting, and Stacking, and other topics


[Diagram: base learners Algo 1, Algo 2, Algo 3, …, Algo N combined by a meta-learning algorithm]

No Free Lunch theorem: there is no single algorithm that is the most accurate on every problem. Instead, generate a group of base learners that, when combined, attain higher accuracy than any one of them alone. (Lecture Notes for E. Alpaydın, Introduction to Machine Learning, © 2004 The MIT Press, V1.1)

[Diagram: combining learners by training different algorithms on different datasets]

Bagging

Bagging extends easily to regression: average the base learners' outputs instead of taking a majority vote. Bagging is most effective when the base learner is unstable, i.e., when small changes in the training sample cause large changes in the learned model (decision trees are the classic example). It typically increases accuracy, but the interpretability of a single model is lost.
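A minimal sketch of bagging for classification, assuming NumPy and scikit-learn are available; the function names and the choice of decision trees as the base learner are illustrative, not prescribed by the slides:

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, n_estimators=25, seed=0):
    # Train one tree per bootstrap sample (n rows drawn with replacement).
    rng = np.random.RandomState(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = rng.randint(0, n, size=n)
        models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return models

def bagging_predict(models, X):
    # Majority vote over the base learners (assumes integer class labels).
    # For regression, replace the vote with preds.mean(axis=0).
    preds = np.array([m.predict(X) for m in models])
    return np.array([np.bincount(col).argmax() for col in preds.T])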

“Breiman's work bridged the gap between statisticians and computer scientists, particularly in the field of machine learning. Perhaps his most important contributions were his work on classification and regression trees and ensembles of trees fit to bootstrap samples. Bootstrap aggregation was given the name bagging by Breiman. Another of Breiman's ensemble approaches is the random forest.” (extracted from Wikipedia)

Boosting

Boosting combines weak learners into a strong learner. Initially all examples carry the same weight; in each subsequent iteration, the examples that were wrongly classified have their weights increased, so that the next learner concentrates on them. Boosting can be applied to any base learner.

Boosting (AdaBoost)
Initialize all weights w_i to 1/N (N: number of examples).
Repeat (until error > 0.5 or the maximum number of iterations is reached):
  Train a classifier on the weighted data and get hypothesis h_t(x).
  Compute the error as the sum of the weights of the misclassified examples: error_t = Σ w_i over all i with h_t(x_i) ≠ y_i.
  Set α_t = ½ log((1 − error_t) / error_t).
  Update the weights: w_i ← w_i e^(−α_t y_i h_t(x_i)) / Z_t, where Z_t normalizes the weights to sum to 1.
Output f(x) = sign(Σ_t α_t h_t(x)).
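A compact AdaBoost sketch following the steps above, with depth-1 decision trees (stumps) as the weak learner; it assumes labels in {−1, +1}, and the names adaboost_fit/adaboost_predict are illustrative:

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, T=50):
    # y must be in {-1, +1}.
    n = len(X)
    w = np.full(n, 1.0 / n)                  # w_i = 1/N
    hyps, alphas = [], []
    for _ in range(T):
        h = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        pred = h.predict(X)
        err = w[pred != y].sum()             # sum of weights of misclassified examples
        if err >= 0.5:                       # weak learner no better than chance: stop
            break
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))
        w = w * np.exp(-alpha * y * pred)    # raise weights of mistakes, lower the rest
        w /= w.sum()                         # the normalizer Z_t
        hyps.append(h)
        alphas.append(alpha)
    return hyps, alphas

def adaboost_predict(hyps, alphas, X):
    # f(x) = sign( sum_t alpha_t * h_t(x) )
    return np.sign(sum(a * h.predict(X) for h, a in zip(hyps, alphas)))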

[Diagram: boosting increases the weights of misclassified examples for the next round]


Mixture of Experts

Voting where the weights are input-dependent (gating) (Jacobs et al., 1991). Both the experts and the gating model can be nonlinear.
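A minimal forward-pass sketch of input-dependent voting: a softmax gating function assigns each input its own expert weights. The gating parameters V are illustrative and untrained here; a real mixture of experts learns them jointly with the experts:

import numpy as np

def softmax(z):
    z = z - z.max()                          # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def mixture_predict(x, experts, V):
    # x: input vector of shape (d,); experts: list of callables x -> scalar prediction;
    # V: gating weights of shape (n_experts, d).
    g = softmax(V @ x)                       # input-dependent weights, summing to 1
    preds = np.array([f(x) for f in experts])
    return g @ preds                         # weighted vote; the weights depend on x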

[Slide: photographs of Robert Jacobs, University of Rochester, and Michael Jordan, UC Berkeley]

Stacking

Here the variation is among the learning algorithms. The predictions of the base learners form a new meta-dataset, on which a meta-learner is trained. A test example is first transformed into a meta-example (the vector of base-learner predictions) and then classified by the meta-learner. Several variations on stacking have been proposed.
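A stacking sketch that builds the meta-dataset from out-of-fold predictions, so the meta-learner never sees predictions made on data the base learner was trained on; scikit-learn is assumed, and the particular base and meta learners are illustrative choices:

import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

def stacking_fit(X, y):
    base = [DecisionTreeClassifier(max_depth=3), GaussianNB()]
    # Meta-dataset: one column per base learner, holding its out-of-fold predictions.
    Z = np.column_stack([cross_val_predict(b, X, y, cv=5) for b in base])
    meta = LogisticRegression().fit(Z, y)
    base = [b.fit(X, y) for b in base]       # refit the base learners on all the data
    return base, meta

def stacking_predict(base, meta, X):
    # Transform each test example into a meta-example, then classify it.
    Z = np.column_stack([b.predict(X) for b in base])
    return meta.predict(Z)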

Cascade Generalization

Again the variation is among the learning algorithms, but the classifiers are used in sequence rather than in parallel as in stacking. The prediction of the first classifier is appended to the example's feature vector to form an extended dataset, on which the next classifier is trained; the process can continue through many iterations.
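A two-level cascade generalization sketch: the first classifier's predicted class probabilities are appended to the features before the second classifier is trained. The choice of Naive Bayes followed by a decision tree is an illustrative assumption:

import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

def cascade_gen_fit(X, y):
    c1 = GaussianNB().fit(X, y)
    # Extend the dataset with the first classifier's class probabilities.
    X_ext = np.hstack([X, c1.predict_proba(X)])
    c2 = DecisionTreeClassifier().fit(X_ext, y)
    return c1, c2

def cascade_gen_predict(c1, c2, X):
    X_ext = np.hstack([X, c1.predict_proba(X)])
    return c2.predict(X_ext)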

Cascading

As in boosting, the distribution changes across the stages' datasets; unlike boosting, the classifiers themselves vary, typically ordered from simple to complex. Whether an example is passed on depends on the current classifier's prediction confidence. Cascading thus creates rules that account for most instances early and catches the exceptions at the final step.
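A two-stage cascading sketch: a cheap classifier answers whenever its confidence clears a threshold, and only the remaining exceptions are sent to a more expensive second stage. The 0.9 threshold and the two learners are illustrative assumptions:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

def cascading_fit(X, y, threshold=0.9):
    s1 = LogisticRegression(max_iter=1000).fit(X, y)
    conf = s1.predict_proba(X).max(axis=1)
    hard = conf < threshold                  # instances stage 1 is unsure about
    # Sketch assumption: the low-confidence subset is nonempty and contains >1 class.
    s2 = RandomForestClassifier().fit(X[hard], y[hard])
    return s1, s2

def cascading_predict(s1, s2, X, threshold=0.9):
    proba = s1.predict_proba(X)
    pred = s1.classes_[proba.argmax(axis=1)]
    unsure = proba.max(axis=1) < threshold   # defer the exceptions to stage 2
    if unsure.any():
        pred[unsure] = s2.predict(X[unsure])
    return pred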

Error-Correcting Output Codes

K classes, L binary problems (Dietterich and Bakiri, 1995). A code matrix W codes the classes in terms of the learners:
One per class: L = K
Pairwise: L = K(K − 1)/2

Full code: L = 2^(K−1) − 1. With a reasonable L, find W such that the Hamming distance between the rows and between the columns is maximized.
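A small sketch of the pairwise code matrix and a Hamming-style decoding, assuming the base learners output ±1 and that a 0 entry means the learner was not trained on that class; the function names are illustrative:

import numpy as np
from itertools import combinations

def pairwise_code_matrix(K):
    # One column per pairwise problem: +1 for class i, -1 for class j,
    # 0 for the classes that learner never sees. L = K(K-1)/2 columns.
    pairs = list(combinations(range(K), 2))
    W = np.zeros((K, len(pairs)), dtype=int)
    for l, (i, j) in enumerate(pairs):
        W[i, l], W[j, l] = 1, -1
    return W

def ecoc_decode(W, outputs):
    # outputs: the L learners' +/-1 predictions for one example.
    # Pick the class whose code word (row of W) agrees most with the outputs;
    # zero entries contribute nothing, so they are ignored, as in Hamming decoding.
    return int((W * outputs).sum(axis=1).argmax())

W = pairwise_code_matrix(4)                  # K = 4 classes -> L = 6 learners
print(W)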