Ensemble methods with Data Streams


Ensemble methods with Data Streams Jungbeom Lee CS240B

Outline: Intro; Ensemble in machine learning; Online ensemble algorithms; Future work

Intro. Previous class: Data Streams. Classifiers. Ensemble methods. Online algorithm.

Classifiers (supervised learning)
The batch classification problem: given a finite training set D = {(x, y)} with |D| = n, where y ∈ {y1, y2, …, yk}, find a function y = f(x) that can predict the y value for an unseen instance x.
The data stream classification problem: given an infinite sequence of pairs of the form (x, y), where y ∈ {y1, y2, …, yk}, find a function y = f(x) that can predict the y value for an unseen instance x.
Example applications: fraud detection in credit card transactions; topic classification on a news aggregation site, e.g. Google News; a translator for foreign languages.
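One way to picture the data stream setting is a test-then-train loop: each arriving pair (x, y) is first used to check the current prediction and only then used for learning. The sketch below is illustrative only; `MajorityClassLearner` and `prequential_accuracy` are hypothetical names that do not come from the slides, and the learner is a deliberately trivial stand-in for a real classifier.

```python
from collections import Counter


class MajorityClassLearner:
    """Toy incremental learner (assumption): predicts the most frequent label so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # Nothing has been learned before the first label arrives.
        if not self.counts:
            return None
        return self.counts.most_common(1)[0][0]

    def learn_one(self, x, y):
        # Constant work per example; memory grows only with the number of classes.
        self.counts[y] += 1


def prequential_accuracy(stream, learner):
    """Test-then-train: predict each example first, then learn from it."""
    correct = total = 0
    for x, y in stream:
        correct += int(learner.predict(x) == y)
        learner.learn_one(x, y)
        total += 1
    return correct / total if total else 0.0


# A tiny made-up stream of (features, label) pairs.
stream = [((0,), "ok"), ((1,), "ok"), ((2,), "fraud"), ((3,), "ok")]
acc = prequential_accuracy(iter(stream), MajorityClassLearner())
```

The stream is consumed once and never stored, which is the property that separates this setting from batch training on a finite set D.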

Motivations: online mining is different from static mining.
Data volume: it is impossible to mine the entire data at one time; we can only afford constant memory per data sample.
Changing data characteristics: previously learned models become invalid.
Cost of learning: model updates can be costly; we can only afford constant time per data sample.

Ensemble: a set of classifiers whose individual decisions are combined in some way to classify new examples. For an ensemble of classifiers to be more accurate than any of its individual members, one key to success is to use individual classifiers with error rates below 0.5.
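The 0.5 threshold can be illustrated with a small calculation. Under the strong assumption that the base classifiers make independent errors (an assumption made here for illustration, not a claim from the slides), the error of a majority vote over n classifiers is a binomial tail. The function name `majority_vote_error` is hypothetical.

```python
from math import comb


def majority_vote_error(n, p):
    """Probability that more than half of n classifiers are wrong.

    Assumes n is odd and that each classifier errs independently
    with probability p (an idealised assumption).
    """
    k_min = n // 2 + 1  # smallest number of wrong votes that flips the majority
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


err_good = majority_vote_error(21, 0.3)  # base error below 0.5
err_bad = majority_vote_error(21, 0.7)   # base error above 0.5
```

With base error below 0.5 the vote is more accurate than a single member, and with base error above 0.5 voting makes things worse; correlated errors in practice weaken both effects.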

Reasons

Ensemble methods. Manipulating the training examples: Bagging, AdaBoost. Injecting randomness: C4.5 decision tree algorithm.

Bagging algorithm

Bagging algorithm
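The pseudocode from the bagging slides is not preserved in this transcript. As a rough reminder of the standard batch procedure, here is a minimal sketch: train each base model on a bootstrap sample (n draws with replacement) and combine predictions by majority vote. The base learner `fit_stump`, the helper names, and the 1-D toy data are all assumptions made for illustration.

```python
import random
from collections import Counter


def fit_stump(data):
    """Hypothetical base learner: best single threshold on a 1-D feature."""
    best = None
    for thr, _ in data:
        for sign in (1, -1):
            # The stump predicts 1 on one side of thr and 0 on the other.
            errors = sum((1 if sign * (x - thr) >= 0 else 0) != y for x, y in data)
            if best is None or errors < best[0]:
                best = (errors, thr, sign)
    _, best_thr, best_sign = best
    return lambda x: 1 if best_sign * (x - best_thr) >= 0 else 0


def bagging(data, n_models, fit, rng):
    """Train n_models base models, each on its own bootstrap sample."""
    models = []
    for _ in range(n_models):
        sample = [rng.choice(data) for _ in data]  # n draws with replacement
        models.append(fit(sample))
    return models


def vote(models, x):
    """Unweighted majority vote over the base models."""
    return Counter(m(x) for m in models).most_common(1)[0][0]


data = [(0.0, 0), (1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1), (5.0, 1)]
models = bagging(data, n_models=25, fit=fit_stump, rng=random.Random(0))
```

Calling `vote(models, 0.0)` and `vote(models, 5.0)` then returns the ensemble's label for points at either end of the toy data.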

Online bagging algorithm
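The slide's own algorithm is not preserved here. A widely used formulation of online bagging, usually attributed to Oza and Russell, replaces bootstrap sampling with a Poisson(1) draw: each arriving example is shown to each base model k times, with k drawn independently per model. The sketch below assumes that formulation; `NearestMeanLearner`, `OnlineBagging`, and `poisson1` are hypothetical names, and the base learner is a toy.

```python
import math
import random
from collections import Counter


def poisson1(rng):
    """Draw from Poisson(1) using Knuth's multiplication method."""
    limit, k, prod = math.exp(-1.0), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


class NearestMeanLearner:
    """Hypothetical incremental base learner: nearest class mean on a 1-D feature."""

    def __init__(self):
        self.sums = {}
        self.counts = {}

    def learn_one(self, x, y):
        self.sums[y] = self.sums.get(y, 0.0) + x
        self.counts[y] = self.counts.get(y, 0) + 1

    def predict(self, x):
        if not self.counts:
            return None
        return min(self.counts, key=lambda y: abs(x - self.sums[y] / self.counts[y]))


class OnlineBagging:
    def __init__(self, n_models, seed=0):
        self.models = [NearestMeanLearner() for _ in range(n_models)]
        self.rng = random.Random(seed)

    def learn_one(self, x, y):
        # Each model sees the example k ~ Poisson(1) times (possibly zero).
        for model in self.models:
            for _ in range(poisson1(self.rng)):
                model.learn_one(x, y)

    def predict(self, x):
        votes = Counter(p for p in (m.predict(x) for m in self.models) if p is not None)
        return votes.most_common(1)[0][0] if votes else None


ensemble = OnlineBagging(n_models=10)
for i in range(200):  # made-up stream: class 0 near 0, class 1 near 10
    label = i % 2
    ensemble.learn_one(10.0 * label + (i % 5) * 0.1, label)
```

Each example is processed once and then discarded, so the per-example cost depends on the number of base models rather than on how much of the stream has been seen.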

Online weighted bagging algorithm

AdaBoost algorithm

AdaBoost algorithm
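The AdaBoost pseudocode from these slides is not preserved in the transcript. The following is a minimal sketch of the standard binary AdaBoost loop with labels in {-1, +1}: fit a weak learner on weighted data, give it a vote alpha = 0.5 * ln((1 - err) / err), then increase the weights of misclassified examples. The weighted threshold stump and the toy data are assumptions chosen for illustration.

```python
import math


def stump_predict(x, thr, sign):
    """Threshold stump: returns sign for x >= thr, otherwise -sign."""
    return sign if x >= thr else -sign


def fit_weighted_stump(xs, ys, w):
    """Hypothetical weak learner: the stump with the lowest weighted error."""
    best = None
    for thr in xs:
        for sign in (1, -1):
            err = sum(wi for x, y, wi in zip(xs, ys, w) if stump_predict(x, thr, sign) != y)
            if best is None or err < best[0]:
                best = (err, thr, sign)
    return best


def adaboost(xs, ys, rounds):
    n = len(xs)
    w = [1.0 / n] * n  # start with uniform example weights
    ensemble = []      # list of (alpha, thr, sign)
    for _ in range(rounds):
        err, thr, sign = fit_weighted_stump(xs, ys, w)
        if err >= 0.5:  # weak learner is no better than chance: stop
            break
        err = max(err, 1e-10)  # guard against division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, thr, sign))
        # Misclassified examples gain weight, correctly classified ones lose it.
        w = [wi * math.exp(-alpha * y * stump_predict(x, thr, sign))
             for x, y, wi in zip(xs, ys, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote."""
    score = sum(alpha * stump_predict(x, thr, sign) for alpha, thr, sign in ensemble)
    return 1 if score >= 0 else -1


# Toy data that no single stump can separate (+ + + - - - + + +).
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1]
ens = adaboost(xs, ys, rounds=3)
```

On this toy data no single stump gets below three training errors, while the weighted vote of three stumps classifies every training point correctly.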

Adaptive boosting algorithm

Experimental Results

Type of Data

Experimental Results

Experimental Results

Experimental Results

Future work: a better online algorithm for bagging; dealing with multiple data types.

References
http://web.engr.oregonstate.edu/~tgd/publications/mcs-ensembles.pdf
http://pages.bangor.ac.uk/~mas00a/papers/lkSUEMA2008.pdf
http://web.cs.ucla.edu/~zaniolo/papers/NBCAJMW77MW0J8CP.pdf
https://ti.arc.nasa.gov/m/pub-archive/archive/0962.pdf
https://engineering.purdue.edu/~givan/papers/bp.pdf
http://hanj.cs.illinois.edu/pdf/kdd03_emsemble.pdf