Mining Concept-Drifting Data Streams Using Ensemble Classifiers
Haixun Wang, Wei Fan, Philip S. Yu, Jiawei Han
Proc. 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 226-235, 2003
Reporter: 侯佩廷, 2011/06/08

Outline
– Introduction
– Concept Drift
– Data Expiration
– Ensemble Classifiers
– Instance-Based Pruning
– Experiments
– Conclusion

Introduction
The problem of mining streaming data:
– The tremendous amount of data is constantly evolving
– Concept drift
Proposal: use a weighted classifier ensemble to solve the problem.

Concept Drift
Concept drift occurs when the underlying concept that generates the data is updated or changes over time.
Figure 1: Concept drift (a stream of daily data, 5/11 through 5/20, whose concept changes as time passes)

Data Expiration
The fundamental problem: how do we identify data that is no longer useful?
A straightforward solution: discard old data after a fixed time period T.

Data Expiration
Figure 2: Data distributions and optimum boundaries for chunks S0, S1, S2 arriving over the intervals [t0, t1], [t1, t2], [t2, t3]; the figure contrasts the optimum boundary with an overfitted one, with positive and negative examples marked.

Data Expiration
Figure 3: Which training dataset to use? The optimum boundary under (a) S1 + S2, (b) S0 + S1 + S2, (c) S2 + S0.

Data Expiration
Instead of discarding data using criteria based on their arrival time, we shall make decisions based on their class distribution.

Ensemble Classifiers
– y: a test example
– $f_c(y)$: the probability of y being an instance of class c
The probability output of the ensemble (via averaging):
$f_c^E(y) = \frac{1}{k} \sum_{i=1}^{k} f_c^i(y)$
where $f_c^i(y)$ is the probability output of the i-th classifier in the ensemble.

Example: for a test example y, three classifiers output $f_c^1(y) = 0.4$, $f_c^2(y) = 0.6$, and $f_c^3(y) = 0.8$, so the ensemble outputs $(0.4 + 0.6 + 0.8)/3 = 0.6$.
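To make the averaging step concrete, here is a minimal Python sketch (an assumed interface for illustration, not the authors' code), in which each classifier is modeled as a callable returning $f_c(y)$:

```python
import numpy as np

# A minimal sketch of the ensemble's averaged probability output.
# Each "classifier" is assumed to be any callable mapping an example y
# to f_c(y), its estimated probability that y belongs to class c.
def ensemble_probability(classifiers, y):
    return np.mean([f(y) for f in classifiers])

# The slide's example: outputs 0.4, 0.6 and 0.8 average to 0.6.
classifiers = [lambda y: 0.4, lambda y: 0.6, lambda y: 0.8]
print(ensemble_probability(classifiers, y=None))  # 0.6
```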

Ensemble Classifiers
Figure: the stream is divided into sequential chunks S1, S2, S3, …, S10 arriving at times t1, t2, …, t_{i+1}, and a classifier C_i is trained on each chunk S_i. G_k denotes a single classifier trained on the k most recent chunks, while E_k denotes an ensemble of the k most recently trained classifiers (e.g., G9 and E9 in the figure).

Ensemble Classifiers
S_n consists of records of the form (x, c), where c is the true label of the record. C_i's classification error on example (x, c) is $1 - f_c^i(x)$. The mean square error of classifier C_i is
$MSE_i = \frac{1}{|S_n|} \sum_{(x,c) \in S_n} \left(1 - f_c^i(x)\right)^2$

Ensemble Classifiers
A classifier that predicts randomly will have mean square error
$MSE_r = \sum_c p(c) \left(1 - p(c)\right)^2$
Example: for two classes with $p(c) = 0.5$ each, $MSE_r = 2 \times 0.5 \times (1 - 0.5)^2 = 0.25$.

Ensemble Classifiers
We discard classifiers whose error is equal to or larger than MSE_r. The weight w_i for classifier C_i is
$w_i = MSE_r - MSE_i$
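Putting the last three slides together, the following Python sketch shows how these weights could drive chunk-based ensemble maintenance. The interfaces (predict_proba, train) are assumptions for illustration, not the authors' code:

```python
import numpy as np

def mse_of(clf, chunk):
    """MSE_i of one classifier over the latest labeled chunk S_n, where
    chunk is a list of (x, c) pairs and clf.predict_proba(x, c) is an
    assumed interface returning f_c(x)."""
    return np.mean([(1.0 - clf.predict_proba(x, c)) ** 2 for x, c in chunk])

def random_mse(class_priors):
    """MSE_r of a classifier that predicts randomly according to the
    class priors p(c); for two equiprobable classes this is 0.25."""
    return sum(p * (1.0 - p) ** 2 for p in class_priors)

def update_ensemble(classifiers, new_chunk, class_priors, K, train):
    """On each incoming chunk: train a new classifier, re-weight every
    classifier on the new chunk with w_i = MSE_r - MSE_i, drop those no
    better than random (w_i <= 0), and keep the top K by weight."""
    candidates = classifiers + [train(new_chunk)]
    mse_r = random_mse(class_priors)
    weighted = [(mse_r - mse_of(c, new_chunk), c) for c in candidates]
    weighted = [(w, c) for w, c in weighted if w > 0]
    weighted.sort(key=lambda wc: wc[0], reverse=True)
    return weighted[:K]  # list of (weight, classifier) pairs
```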

Ensemble Classifiers
For cost-sensitive applications such as credit card fraud detection, classifiers are weighted by benefits instead of accuracy. With transaction amount t(x) and a fixed investigation cost, the benefit matrix is:

                    Predict fraud    Predict not fraud
Actual fraud        t(x) − cost      0
Actual not fraud    −cost            0

For example, with cost = $90:

                    Predict fraud    Predict not fraud
Actual fraud        t(x) − 90        0
Actual not fraud    −90              0
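One hedged reading of this benefit matrix (a sketch, not necessarily the paper's exact decision procedure): predicting fraud has expected benefit $p \cdot (t(x) - cost) + (1 - p)(-cost) = p \cdot t(x) - cost$, while predicting non-fraud yields 0, so a transaction is flagged whenever $p \cdot t(x) > cost$:

```python
def flag_as_fraud(p_fraud, t_x, cost=90.0):
    """Flag a transaction when the expected benefit of predicting fraud,
    p_fraud * t_x - cost, exceeds the benefit (0) of predicting non-fraud.
    p_fraud is the ensemble's probability estimate, t_x the amount."""
    return p_fraud * t_x > cost

print(flag_as_fraud(p_fraud=0.10, t_x=1500.0))  # True: 150 > 90
print(flag_as_fraud(p_fraud=0.10, t_x=500.0))   # False: 50 < 90
```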

Instance-Based Pruning
Goal:
– Use the first k classifiers with the highest weights to reach the same decision as when all K classifiers are used.

Instance-Based Pruning
The pipeline procedure stops when either:
– a confident prediction can be made, or
– there are no more classifiers in the pipeline.
Figure: classifiers are consulted in decreasing order of weight, C1, C2, …, Ck, …, CK (from high weight to low).

Instance-Based Pruning
After consulting the first k classifiers, we derive the current weighted probability:
$F_k(x) = \frac{\sum_{i=1}^{k} w_i f_c^i(x)}{\sum_{i=1}^{k} w_i}$

Instance-Based Pruning
Let $\varepsilon_k(x) = F_k(x) - F_K(x)$ be the error at stage k. We compute the mean $\mu_k$ and the variance $\sigma_k^2$ of $\varepsilon_k(x)$ on the training data, which tell us how far the stage-k output typically is from the full ensemble's output.
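A minimal sketch of how these statistics could drive early stopping (assumed details: classifiers pre-sorted by weight, a decision threshold of 0.5, and a confidence multiplier t; the paper's exact confidence test may differ):

```python
def predict_with_pruning(classifiers, weights, mu, sigma, x,
                         threshold=0.5, t=3.0):
    """Consult classifiers in decreasing weight order; stop as soon as the
    bias-corrected estimate F_k(x) - mu[k] is more than t * sigma[k] away
    from the decision threshold, i.e., once the remaining classifiers are
    unlikely to change the decision."""
    num = den = 0.0
    for k, (f, w) in enumerate(zip(classifiers, weights)):
        num += w * f(x)
        den += w
        F_k = num / den                  # current weighted probability
        estimate = F_k - mu[k]           # estimate of the full-ensemble F_K
        if abs(estimate - threshold) > t * sigma[k]:
            return estimate > threshold  # confident: prune the rest
    return (num / den) > threshold       # consulted all K classifiers
```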

Experiments
Two kinds of data:
Synthetic data
– Synthetic data with drifting concepts based on a moving hyperplane.
Credit card fraud data
– One year of data with 5 million transactions.
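For readers who want to reproduce the setup, here is a sketch of the standard moving-hyperplane generator (the weight-drift schedule and the choice $a_0 = \frac{1}{2}\sum_i a_i$ are assumptions; the paper's exact parameters may differ):

```python
import numpy as np

def hyperplane_chunk(n, a, a0, rng):
    """One chunk of moving-hyperplane data: x is uniform in [0, 1]^d and
    the label is positive iff sum_i a_i * x_i >= a0."""
    X = rng.random((n, len(a)))
    y = (X @ a >= a0).astype(int)
    return X, y

rng = np.random.default_rng(0)
a = rng.random(10)                       # d = 10 hyperplane weights
for chunk_id in range(5):
    X, y = hyperplane_chunk(1000, a, a0=a.sum() / 2, rng=rng)
    a += 0.01 * rng.choice([-1.0, 1.0], size=a.size)  # drift the concept
```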

Experiments
Figure 4: Training Time, ChunkSize, and Error Rate

Experiments
Figure 5: Effects of Instance-Based Pruning

Experiments
Figure 6: Average Error Rate of Single and Ensemble Decision Tree Classifiers

Experiments
Figure 7: Averaged Benefits Using Single Classifiers and Classifier Ensembles
The benefits are averaged over multiple runs with different chunk sizes (3000 to … transactions per chunk), averaging the benefits of E_k and G_k (K = 2, …, 8) for each fixed chunk size.

Conclusion
The problem of mining streaming data:
– The tremendous amount of data is constantly evolving
– Concept drift
Weighted ensemble classifiers are more efficient than single classifiers.

Q & A

THANKS FOR YOUR ATTENTION.