Rotation Forest: A New Classifier Ensemble Method. Presented by Hsiao Ching-Chun (蕭晴駿), Institute of Electronics, National Chiao Tung University, 2007.3.7. Paper by Juan J. Rodríguez and Ludmila I. Kuncheva.

Presentation transcript:

Rotation Forest: A New Classifier Ensemble Method. Hsiao Ching-Chun (蕭晴駿), Institute of Electronics, National Chiao Tung University. Paper by Juan J. Rodríguez and Ludmila I. Kuncheva

2 Outline
- Introduction
- Rotation forests
- Experimental results
- Conclusions

3 Outline
- Introduction
- Rotation forests
- Experimental results
- Conclusions

4 Introduction(1)
Why a classifier ensemble? Combine the predictions of multiple classifiers instead of relying on a single classifier.
Motivation:
- reduce variance: less dependent on the peculiarities of a single training set
- reduce bias: learn a more expressive concept class than a single classifier

5 Introduction(2)
Key step: formation of an ensemble of diverse classifiers from a single training set.
It is necessary to modify the data set (Bagging, Boosting) or the learning method (Random Forest) to create different classifiers.
Performance evaluation: diversity and accuracy.

6 Bagging(1)

7 Bagging(2)
Bootstrap samples:
- the individual classifiers have high classification accuracy
- low diversity
1. for m = 1 to M // M = number of iterations
   a) draw (with replacement) a bootstrap sample S_m of the data
   b) learn a classifier C_m from S_m
2. for each test example
   a) try all classifiers C_m
   b) predict the class that receives the highest number of votes
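
A minimal sketch of this pseudocode in Python. This is an illustrative stand-in, not the deck's setup: scikit-learn's DecisionTreeClassifier replaces WEKA's J48, and class labels are assumed to be integer-coded.

```python
# Bagging sketch following the slide's pseudocode.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, M=10, seed=0):
    """Learn M trees, each on a bootstrap sample of (X, y)."""
    rng = np.random.default_rng(seed)
    classifiers = []
    for _ in range(M):
        idx = rng.integers(0, len(X), size=len(X))  # draw with replacement
        classifiers.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return classifiers

def bagging_predict(classifiers, X):
    """Predict the class receiving the highest number of votes."""
    votes = np.stack([clf.predict(X) for clf in classifiers])  # shape (M, N)
    return np.array([np.bincount(col.astype(int)).argmax() for col in votes.T])
```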

8 Boosting
Basic idea:
- later classifiers focus on examples that were misclassified by earlier classifiers
- weight the predictions of the classifiers with their error
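
The reweighting idea can be made concrete with discrete AdaBoost. The sketch below assumes a two-class problem with labels coded as -1/+1 and uses decision stumps; it is a simplified illustration, not the multi-class variant used in the paper's experiments.

```python
# Discrete AdaBoost sketch: up-weight misclassified examples, weight each
# classifier by its (weighted) training error. Labels assumed in {-1, +1}.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, M=10):
    w = np.full(len(X), 1.0 / len(X))        # start with uniform weights
    stages = []
    for _ in range(M):
        clf = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        pred = clf.predict(X)
        err = w[pred != y].sum()              # weighted training error
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-10))  # classifier weight
        w *= np.exp(-alpha * y * pred)        # misclassified examples grow
        w /= w.sum()
        stages.append((alpha, clf))
    return stages

def adaboost_predict(stages, X):
    return np.sign(sum(alpha * clf.predict(X) for alpha, clf in stages))
```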

9 Bagging vs. Boosting
Making the classifiers diverse tends to reduce their individual accuracy → the accuracy-diversity dilemma.
AdaBoost creates inaccurate classifiers by forcing them to concentrate on difficult objects and ignore the rest of the data → large diversity that boosts the ensemble performance.

10 Outline
- Introduction
- Rotation forests
- Experimental results
- Conclusions

11 Rotation Forest(1)
Rotation Forest transforms the data set while preserving all information.
PCA is used to transform the data, applied to:
- a subset of the instances
- a subset of the classes
- a subset of the features: low computation, low storage

12 [figure-only slide]

13 Rotation Forest(2)
Base classifiers are decision trees → "Forest".
PCA is a simple rotation of the coordinate axes → "Rotation Forest".
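
The "simple rotation" claim can be checked numerically: when no principal components are discarded, the PCA transform is an orthogonal change of basis, so it is exactly invertible and preserves pairwise distances. A numpy-only sketch (synthetic data, illustration only):

```python
# With all components kept, PCA is just a rotation of the axes:
# invertible, distance-preserving, hence information-preserving.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Xc = X - X.mean(axis=0)                      # center the data
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
R = Vt.T                                     # n x n orthogonal matrix
Z = Xc @ R                                   # rotated data

d = lambda A: np.linalg.norm(A[:, None] - A[None, :], axis=-1)
assert np.allclose(d(Xc), d(Z))              # pairwise distances preserved
assert np.allclose(Z @ R.T, Xc)              # exactly invertible: no loss
```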

14 Method(1)
X: the N×n matrix of objects in the training data set; x = [x_1, x_2, …, x_n]^T is a data point with n features.
Y = [y_1, y_2, …, y_N]^T: the class labels, taking values from c classes.

15 Method(2)
Given:
- L: the number of classifiers in the ensemble (D_1, D_2, …, D_L)
- F: the feature set
- X, Y
All classifiers can be trained in parallel.

16 Method(3)
For i = 1 … L (to construct the training set for classifier D_i):
Split the feature set F randomly into K subsets F_{i,1}, F_{i,2}, F_{i,3}, …, F_{i,K} (F_{i,j}, j = 1 … K), each with M = n/K features.

17 Method(3)
For j = 1 … K:
- let X_{1,1} be the data set X for the features in F_{1,1}
- eliminate a random subset of classes
- select a bootstrap sample from X_{1,1} to obtain X'_{1,1}
- run PCA on X'_{1,1} using only the M features
- obtain the principal components a^{(1)}_{1,1}, …, a^{(M_1)}_{1,1}

18 Method(4)
Arrange the principal components for all j into a (block-diagonal) rotation matrix R_1.
Rearrange the rows of R_1 so as to match the order of features in F → obtain R_1^a.
Build classifier D_1 using X R_1^a as its training set.
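
As a concrete illustration of slides 16-18, here is a condensed sketch of training one ensemble member. It is a hedged approximation, not the paper's implementation: numpy and scikit-learn stand in for WEKA/J48, the class elimination is simplified to dropping one class, and each feature subset is assumed to have more bootstrap rows than features so PCA returns a full M×M block.

```python
# Train one Rotation Forest member D_i: partition features into K subsets,
# run PCA on a class/instance-perturbed slice per subset, assemble the
# rotation matrix, then fit a tree on the rotated data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier

def train_rotation_tree(X, y, K=3, seed=0):
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    subsets = np.array_split(rng.permutation(n), K)  # K disjoint feature subsets
    R = np.zeros((n, n))                             # rotation matrix R_i
    for F_ij in subsets:
        labels = np.unique(y)
        # simplified class elimination: keep all but one random class
        keep_classes = rng.choice(labels, size=max(1, len(labels) - 1),
                                  replace=False)
        keep = np.isin(y, keep_classes)
        # bootstrap sample of the surviving instances, subset features only
        rows = rng.choice(np.flatnonzero(keep), size=keep.sum(), replace=True)
        pca = PCA().fit(X[np.ix_(rows, F_ij)])       # all components kept
        # write the component axes into the rows/columns of F_ij, so the
        # assembled R already matches the original feature order (R_i^a)
        R[np.ix_(F_ij, F_ij)] = pca.components_.T
    tree = DecisionTreeClassifier().fit(X @ R, y)    # D_i trained on X R_i^a
    return tree, R

# Prediction for this member rotates first, then asks the tree:
#   tree.predict(X_test @ R)
# The ensemble decision is a majority vote over the L members.
```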

19 How It Works?
Diversity:
- each decision tree uses a different set of axes
- trees are sensitive to rotation of the axes
Accuracy:
- no principal components are discarded
- the whole data set is used to train each classifier (with different extracted features)

20 Outline
- Introduction
- Rotation forests
- Experimental results
- Conclusions

21 Experimental Results(1)
Experimental settings:
1. Bagging, AdaBoost, and Random Forest were kept at their default values in WEKA (the Waikato Environment for Knowledge Analysis)
2. for Rotation Forest, M is fixed to 3
3. all ensemble methods use the same ensemble size L
4. base classifier: the tree classifier J48 (WEKA)
5. databases: UCI Machine Learning Repository
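
For readers without WEKA, a rough scikit-learn analogue of this protocol is sketched below. It is only an approximation of the setup above: sklearn's ensemble classes stand in for the WEKA implementations, DecisionTreeClassifier for J48, and the iris data set for the UCI collection, so the numbers will not match the paper's tables.

```python
# Compare ensembles of the same size L with 10-fold cross-validation.
from sklearn.datasets import load_iris
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
L = 10
models = {
    "bagging":     BaggingClassifier(DecisionTreeClassifier(), n_estimators=L),
    "adaboost":    AdaBoostClassifier(n_estimators=L),
    "rand_forest": RandomForestClassifier(n_estimators=L),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=10)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```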

22 Database

23 Experimental Results(2)
Table 2: classification accuracy and standard deviation of J48 and the ensemble methods without pruning, estimated by 10-fold cross-validation. [table omitted]

24 Experimental Results(3)
Fig. 1. Percentage diagram for the four studied ensemble methods with unpruned J48 trees. [chart omitted]

25 Experimental Results (4) Fig. 2. Comparison of accuracy of Rotation Forest ensemble (RF) and the best accuracy from any of a single tree, Bagging, Boosting, and Random Forest ensembles.

26 Diversity-Error Diagram
Pairwise diversity measures were chosen.
Kappa (κ) evaluates the level of agreement between two classifier outputs.
Diversity-error diagram:
- x-axis: κ for the pair
- y-axis: averaged individual error of D_i and D_j, E_{i,j} = (E_i + E_j)/2
- small values of κ indicate better diversity; small values of E_{i,j} indicate better accuracy
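
The deck gives no code, but both axes are easy to compute from predictions. The sketch below follows the standard pairwise-kappa definition (observed vs. chance agreement from the contingency table of two classifiers' outputs) and assumes numpy arrays of predictions and a list of class labels:

```python
# Axes of a kappa-error (diversity-error) diagram for one classifier pair.
import numpy as np

def pairwise_kappa(pred_i, pred_j, classes):
    """kappa = (p_observed - p_chance) / (1 - p_chance)."""
    N = len(pred_i)
    table = np.zeros((len(classes), len(classes)))
    for a, b in zip(pred_i, pred_j):
        table[classes.index(a), classes.index(b)] += 1
    p_observed = np.trace(table) / N                      # agreement rate
    p_chance = (table.sum(axis=1) / N) @ (table.sum(axis=0) / N)
    return (p_observed - p_chance) / (1 - p_chance)

def pair_error(pred_i, pred_j, y_true):
    """E_ij: average of the two individual error rates."""
    return 0.5 * ((pred_i != y_true).mean() + (pred_j != y_true).mean())
```

Plotting (kappa, E_ij) for every pair in the ensemble produces one cloud per method, as in Figs. 3 and 4.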

27 Experimental Results (5)
Rotation Forest has the potential to improve diversity significantly without compromising individual accuracy.
Fig. 3. Kappa-error diagrams for the vowel-n data set.

28 Experimental Results (6)
Rotation Forest is not as diverse as the other ensembles but clearly has the most accurate classifiers.
Rotation Forest is similar to Bagging, but more accurate and diverse.
Fig. 4. Kappa-error diagrams for the waveform data set.

29 Conclusions
Rotation Forest transforms the data with different axes while preserving the information completely → achieves both diversity and accuracy.
Rotation Forest gives a scope for ensemble methods "on the side of Bagging".

30 References
J.J. Rodríguez, L.I. Kuncheva, and C.J. Alonso, "Rotation Forest: A New Classifier Ensemble Method," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 10, pp. 1619-1630, Oct. 2006.
J.J. Rodríguez and C.J. Alonso, "Rotation-based ensembles," Current Topics in Artificial Intelligence: 10th Conference of the Spanish Association for Artificial Intelligence, LNAI 3040, Springer, 2004.
J. Fürnkranz, "Ensemble Classifiers" (class notes).