
ISQS 6347, Data & Text Mining 1 Ensemble Methods

ISQS 6347, Data & Text Mining 2 Ensemble Methods
- Construct a set of classifiers from the training data
- Predict the class label of previously unseen records by aggregating the predictions made by multiple classifiers
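As an illustration of the aggregation step (not part of the original slides), here is a minimal sketch of combining hard class labels by majority vote; the three prediction vectors are hypothetical:

```python
import numpy as np

# Hypothetical hard-label predictions from three classifiers on five records.
predictions = np.array([
    [0, 1, 1, 0, 1],   # classifier 1
    [0, 1, 0, 0, 1],   # classifier 2
    [1, 1, 1, 0, 0],   # classifier 3
])

# Majority vote per record: count the votes for each class and take the argmax.
n_classes = 2
votes = np.apply_along_axis(lambda col: np.bincount(col, minlength=n_classes),
                            axis=0, arr=predictions)
ensemble_prediction = votes.argmax(axis=0)
print(ensemble_prediction)   # -> [0 1 1 0 1]
```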

ISQS 6347, Data & Text Mining 3 General Idea

ISQS 6347, Data & Text Mining 4 Why does it work? Suppose there are 25 base classifiers
- Each classifier has error rate ε = 0.35
- Assume the classifiers are independent
- Probability that the ensemble classifier makes a wrong prediction:
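The formula that followed on the original slide did not survive the transcript. Under the stated assumptions, a majority vote of 25 independent classifiers is wrong only when 13 or more of them err, which gives the standard binomial calculation:

```latex
P(\text{ensemble is wrong})
  = \sum_{i=13}^{25} \binom{25}{i}\,\varepsilon^{i}(1-\varepsilon)^{25-i}
  = \sum_{i=13}^{25} \binom{25}{i}\,(0.35)^{i}(0.65)^{25-i}
  \approx 0.06
```

This is far below the 0.35 error rate of any single base classifier.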

ISQS 6347, Data & Text Mining 5 Combined Ensemble Models [Diagram: the training data is drawn into Sample 1, Sample 2, and Sample 3; one modeling method fits Model 1, Model 2, and Model 3 on these samples, and their predictions on the score data are averaged into a single ensemble model.]

ISQS 6347, Data & Text Mining 6 Combined Ensemble Models [Diagram: Modeling Method A and Modeling Method B are each fit on the same training data, producing Model A and Model B; their predictions on the score data are averaged into a single ensemble model.]
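A minimal sketch of the slide-6 idea, assuming scikit-learn and a synthetic dataset (neither is specified in the slides): two different modeling methods are fit on the same training data and their predicted probabilities are averaged before scoring.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-ins for the training data and the score data.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_score, y_train, y_score = train_test_split(X, y, random_state=0)

# Modeling Method A and Modeling Method B, each fit on the full training data.
model_a = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
model_b = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Ensemble Model (Average): average the two posterior probability estimates.
avg_proba = (model_a.predict_proba(X_score) + model_b.predict_proba(X_score)) / 2
ensemble_pred = avg_proba.argmax(axis=1)
print("ensemble accuracy:", (ensemble_pred == y_score).mean())
```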

ISQS 6347, Data & Text Mining 7 Examples of Ensemble Methods How to generate an ensemble of classifiers?
- Bagging
- Boosting

ISQS 6347, Data & Text Mining 8 Bagging
- Sampling with replacement: build a classifier on each bootstrap sample
- In a bootstrap sample of size n, the probability that a particular record is never selected is (1 − 1/n)^n, so the probability that it is selected at least once is 1 − (1 − 1/n)^n
- When n is large enough, (1 − 1/n)^n ≈ 1/e, so each record is selected with probability about 1 − 1/e ≈ 0.632
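A minimal sketch of bagging, assuming scikit-learn decision trees as the base classifier and a synthetic dataset (the slide does not fix either); it also checks the ≈0.632 coverage figure empirically.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

rng = np.random.default_rng(1)
n = len(X_train)
models, coverage = [], []

for _ in range(25):
    # Sampling with replacement: a bootstrap sample the same size as the training data.
    idx = rng.integers(0, n, size=n)
    coverage.append(len(np.unique(idx)) / n)          # fraction of records selected
    models.append(DecisionTreeClassifier().fit(X_train[idx], y_train[idx]))

print("mean fraction of records in a bootstrap sample:", np.mean(coverage))  # ~0.632

# Aggregate the 25 trees by majority vote (binary labels 0/1).
all_preds = np.stack([m.predict(X_test) for m in models])     # shape (25, n_test)
bagged_pred = (all_preds.mean(axis=0) > 0.5).astype(int)
print("bagged accuracy:", (bagged_pred == y_test).mean())
```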

ISQS 6347, Data & Text Mining 9 Boosting An iterative procedure that adaptively changes the distribution of the training data to focus more on previously misclassified records
- Initially, all N records are assigned equal weights
- Unlike bagging, the weights may change at the end of each boosting round

ISQS 6347, Data & Text Mining 10 Boosting
- Records that are wrongly classified will have their weights increased
- Records that are classified correctly will have their weights decreased
- Example 4 is hard to classify: its weight is increased, so it is more likely to be chosen again in subsequent rounds
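A minimal sketch of the reweighting step, assuming the AdaBoost update rule (the slides describe boosting generically; AdaBoost is one standard instantiation). The labels and one round's predictions are hypothetical, with the fourth record misclassified to mirror the slide's "Example 4":

```python
import numpy as np

# Hypothetical labels and one round's predictions; the fourth record is misclassified.
y_true = np.array([0, 1, 1, 0, 1])
y_pred = np.array([0, 1, 1, 1, 1])

w = np.full(len(y_true), 1 / len(y_true))        # initially, equal weights

miss = (y_pred != y_true)
err = np.sum(w[miss])                            # weighted error of this round's classifier
alpha = 0.5 * np.log((1 - err) / err)            # importance of this round's classifier

# Increase the weights of misclassified records, decrease the others, then renormalize.
w = w * np.exp(np.where(miss, alpha, -alpha))
w = w / w.sum()
print(np.round(w, 3))   # the misclassified record now carries more weight: [0.125 0.125 0.125 0.5 0.125]
```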