Ensemble Learning
Which of the two options increases your chances of getting a good grade on the exam?
– Solving the test individually
– Solving the test in groups
Why?

Ensemble Learning
Weak classifier A

Ensemble Learning
Weak classifier B

Ensemble Learning
Weak classifier C

Ensemble Learning
Ensemble of A, B, and C

Ensemble Learning
For an ensemble to work, the following conditions must hold:
– The errors of the individual classifiers must not be strongly correlated (think about the exam example: if everyone knows by heart exactly the same chapters, will it help to solve the test in groups?)
– The error rate of each individual classifier making up the ensemble must be less than 0.5 (i.e., each is at least better than chance)

Ensemble Learning
Suppose we have a set of binary classifiers, each with probability of error 1/3, and suppose the errors of any two classifiers are independent. (Two events A and B are independent if p(A and B) = p(A) p(B).)
What is the probability of error of an ensemble of 3 classifiers? 7 classifiers? 21 classifiers?

Ensemble Learning
For 3 classifiers, each with probability of error 1/3, combined by simple voting, the probability of error of the ensemble is the probability that exactly two classifiers make an error plus the probability that all three make an error:
pe(ens) = C(3,2) pe^2 (1 - pe) + C(3,3) pe^3 = 3 (1/3)^2 (2/3) + (1/3)^3 = 2/9 + 1/27 = 7/27 ≈ 0.26
(down from 0.33 for a single classifier)
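A small sketch (my own, not from the slides) that evaluates this majority-vote formula for any odd number of independent classifiers and answers the 3 / 7 / 21 question above; the function name ensemble_error is made up for illustration.

```python
from math import comb

def ensemble_error(n, pe):
    """Probability that a majority of n independent classifiers,
    each with error rate pe, are wrong under simple voting (n odd)."""
    k_min = n // 2 + 1  # smallest number of wrong votes that overturns the majority
    return sum(comb(n, k) * pe**k * (1 - pe)**(n - k) for k in range(k_min, n + 1))

for n in (3, 7, 21):
    print(n, round(ensemble_error(n, 1/3), 4))
# 3 -> 0.2593, 7 -> 0.1733, 21 -> 0.0557 (the error keeps shrinking as classifiers are added)
```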

Ensemble Learning
For 3 classifiers, each with probability of error 1/2, combined by simple voting, the same calculation gives:
pe(ens) = C(3,2) pe^2 (1 - pe) + C(3,3) pe^3 = 3 (1/2)^2 (1/2) + (1/2)^3 = 3/8 + 1/8 = 1/2 = 0.5
(the same as a single classifier)

Ensemble Learning
For 3 classifiers, each with probability of error 2/3, combined by simple voting:
pe(ens) = C(3,2) pe^2 (1 - pe) + C(3,3) pe^3 = 3 (2/3)^2 (1/3) + (2/3)^3 = 4/9 + 8/27 = 20/27 ≈ 0.74
(up from 0.67 for a single classifier: voting makes classifiers that are worse than chance even worse)

Ensemble Learning
How to build ensembles:
1. Heterogeneous ensembles (same training data, different learning algorithms)
2. Manipulate the training data (same learning algorithm, different training data)
3. Manipulate the input features (use different subsets of the attribute set)
4. Manipulate the output targets (same data, same algorithm; e.g., convert a multiclass problem into many two-class problems)
5. Inject randomness into the learning algorithm

Ensemble Learning
How to build ensembles: the two dominant approaches belong to category 2, manipulating the training data (same learning algorithm, different training data). They are bagging and boosting.
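As a quick illustration (my own example, not part of the slides), scikit-learn ships off-the-shelf implementations of both approaches, each defaulting to decision-tree base learners; the dataset and hyperparameter values below are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier
from sklearn.model_selection import cross_val_score

# Synthetic binary classification problem, just to have something to fit.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Bagging: each tree is trained on a bootstrap sample of the training set.
bagging = BaggingClassifier(n_estimators=50, random_state=0)

# Boosting: each tree focuses on the examples the previous ones got wrong.
boosting = AdaBoostClassifier(n_estimators=50, random_state=0)

for name, model in [("bagging", bagging), ("boosting", boosting)]:
    print(name, cross_val_score(model, X, y, cv=5).mean())
```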

Ensemble Learning
Bagging – Training
1. k = 1
2. p_i = 1/m, for i = 1, ..., m
3. While k < EnsSize
  3.1 Create training set T_k (normally of size m) by sampling from T with replacement according to probability distribution p
  3.2 Build classifier C_k using learning algorithm L and training set T_k
  3.3 If error_T(C_k) < threshold, then k = k + 1
  3.4 Goto 3.1
Output C_1, C_2, ..., C_k
Classification: classify new examples by voting among C_1, C_2, ..., C_k
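A minimal sketch of this bagging loop in plain Python (my own code, assuming a fit_classifier callable that stands in for the base learning algorithm L; the error-threshold check from the slide is omitted, since standard bagging keeps every sampled classifier).

```python
import random

def bagging_train(T, fit_classifier, ens_size):
    """Train an ensemble by bagging.

    T              -- list of (x, y) training examples
    fit_classifier -- base learning algorithm L: takes a list of (x, y) pairs,
                      returns a predict(x) function
    ens_size       -- number of classifiers to build
    """
    m = len(T)
    ensemble = []
    while len(ensemble) < ens_size:
        # 3.1 Bootstrap sample: m examples drawn with replacement (uniform p_i = 1/m).
        T_k = [random.choice(T) for _ in range(m)]
        # 3.2 Build classifier C_k on the sampled data.
        C_k = fit_classifier(T_k)
        ensemble.append(C_k)
    return ensemble

def bagging_predict(ensemble, x):
    """Classify x by simple voting among the ensemble members."""
    votes = [C(x) for C in ensemble]
    return max(set(votes), key=votes.count)
```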

Ensemble Learning
Boosting – Training
1. k = 1; w_i = 1/m, for i = 1, ..., m
2. While k < EnsSize
  2.1 For i = 1, ..., m: p_i = w_i / sum_j(w_j)
  2.2 Create training set T_k (normally of size m) by sampling from T with replacement according to probability distribution p
  2.3 Build classifier C_k using learning algorithm L and training set T_k
  2.4 Classify the examples in T with C_k
  2.5 If error_T(C_k) < threshold, then
      k = k + 1
      For i = 1, ..., m: if C_k(x_i) != y_i then w_i = w_i * Beta (Beta > 1: increase the weights of the misclassified examples)
  2.6 Goto 2.1
Output C_1, C_2, ..., C_k
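A matching sketch of this boosting loop (again assuming the hypothetical fit_classifier helper; the fixed Beta > 1 and the error threshold follow the slide's simplified scheme, whereas AdaBoost derives Beta from each classifier's weighted error).

```python
import random

def boosting_train(T, fit_classifier, ens_size, beta=2.0, threshold=0.5):
    """Train an ensemble with the weight-update boosting scheme above.

    T              -- list of (x, y) training examples
    fit_classifier -- base learning algorithm L: takes a list of (x, y) pairs,
                      returns a predict(x) function
    beta           -- factor (> 1) used to up-weight misclassified examples
    threshold      -- keep a classifier only if its training error is below this
    """
    m = len(T)
    w = [1.0 / m] * m
    ensemble = []
    while len(ensemble) < ens_size:
        # 2.1-2.2 Normalize the weights and sample a training set accordingly.
        total = sum(w)
        p = [wi / total for wi in w]
        T_k = random.choices(T, weights=p, k=m)
        # 2.3 Build classifier C_k with the base learner.
        C_k = fit_classifier(T_k)
        # 2.4 Classify every example in T and measure the training error.
        wrong = [C_k(x) != y for (x, y) in T]
        error = sum(wrong) / m
        # 2.5 Keep C_k only if it beats the threshold, and up-weight its mistakes.
        if error < threshold:
            ensemble.append(C_k)
            for i, missed in enumerate(wrong):
                if missed:
                    w[i] *= beta
    return ensemble
```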