Learning under concept drift: an overview Zhimin He iTechs – ISCAS 2013-03-21.

Agenda
- What is Concept Drift
- Causes of Concept Drift
- Types of Concept Drift
- Detecting and Handling Concept Drift
- Implications for Software Engineering Research

Definitions
- Prediction task: X_t is a vector in a p-dimensional feature space observed at time t, and y_t is the corresponding label. We call X_t an instance and the pair (X_t, y_t) a labeled instance. We refer to the instances (X_1, ..., X_t) as historical data and to the instance X_{t+1} as the target (or testing) instance. The task is to predict a label y_{t+1} for the target instance X_{t+1}.
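A minimal sketch of this predict-then-reveal protocol, not from the original slides: the model is fit on the historical data up to time t and then asked to label the target instance X_{t+1}. The function name prequential_error and the use of scikit-learn's GaussianNB as the base learner are illustrative assumptions.

```python
# Sketch of the prediction task: train on (X_1, y_1)..(X_t, y_t), predict X_{t+1}.
import numpy as np
from sklearn.naive_bayes import GaussianNB


def prequential_error(X, y, warmup=50):
    """Return the average error of predict-then-reveal over the stream."""
    model = GaussianNB()
    errors = []
    for t in range(warmup, len(X) - 1):
        model.fit(X[: t + 1], y[: t + 1])                   # historical data up to time t
        y_hat = model.predict(X[t + 1].reshape(1, -1))[0]   # target instance X_{t+1}
        errors.append(int(y_hat != y[t + 1]))               # label revealed afterwards
    return float(np.mean(errors))
```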

Definitions (cont.)
- Concept drift: every instance X_t is generated by a source S_t. If all the data is sampled from the same source, i.e. S_1 = S_2 = ... = S_{t+1} = S, we say that the concept is stable. If for two time points i and j we have S_i ≠ S_j, we say that there is a concept drift.

Causes of Concept Drift
- Let X be an instance in a p-dimensional feature space with class label c ∈ {c_1, c_2, ..., c_k}, where {c_1, ..., c_k} is the set of class labels.
- The optimal classifier for X is determined by the prior probabilities of the classes P(c_i) and the class-conditional probability density functions p(X | c_i), i = 1, ..., k.
- Concept (data source): a set of prior probabilities of the classes and class-conditional pdf's: S = {(P(c_1), p(X | c_1)), ..., (P(c_k), p(X | c_k))}.

Causes of Concept Drift (cont.)
- Concept drift may occur in three ways (a toy illustration of how these quantities interact is sketched below):
  - The class priors P(c_i) might change over time.
  - The class-conditional distributions p(X | c_i) of one or several classes might change (virtual drift).
  - The posterior distributions of the class memberships p(c_i | X) might change (real drift).
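The decomposition above can be made concrete with a toy two-class, one-feature example. This is an illustrative sketch with assumed Gaussian class-conditional densities and made-up numbers, showing how the posterior p(c_1 | X) = P(c_1) p(X | c_1) / p(X) moves when either the priors or a class-conditional density changes.

```python
# Toy illustration (assumed numbers): two classes with Gaussian
# class-conditional densities p(x | c_i) and priors P(c_i).
from scipy.stats import norm


def posterior_c1(x, prior_c1, mu1, mu2, sigma=1.0):
    """p(c_1 | x) computed via Bayes' rule for two Gaussian classes."""
    joint1 = prior_c1 * norm.pdf(x, mu1, sigma)
    joint2 = (1.0 - prior_c1) * norm.pdf(x, mu2, sigma)
    return joint1 / (joint1 + joint2)


x = 1.0
print(posterior_c1(x, prior_c1=0.5, mu1=0.0, mu2=2.0))  # original concept: 0.5
print(posterior_c1(x, prior_c1=0.8, mu1=0.0, mu2=2.0))  # class priors changed
print(posterior_c1(x, prior_c1=0.5, mu1=0.5, mu2=2.0))  # class-conditional density of c_1 changed
# In both modified cases the posterior p(c_1 | x), and hence the optimal
# decision at x, can shift even though the feature value itself is unchanged.
```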

Types of Concept Drift
- Sudden drift
- Gradual drift
- Incremental drift
- Reoccurring contexts
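One common way to picture these four patterns (an assumption of this write-up, not part of the original slides) is as different time profiles of a single concept parameter, here a probability p_t that controls which of two concepts generates the data at time t. All constants are illustrative.

```python
# Hedged sketch: drift-type profiles for a single concept parameter p_t.
import numpy as np


def drift_profile(kind, n=1000, change_at=500, width=200, period=250, seed=0):
    t = np.arange(n)
    if kind == "sudden":
        # Abrupt switch from the old concept (0.2) to the new one (0.8).
        return np.where(t < change_at, 0.2, 0.8)
    if kind == "incremental":
        # Steady, smooth movement from the old value to the new one.
        return np.clip(0.2 + 0.6 * (t - change_at + width) / (2 * width), 0.2, 0.8)
    if kind == "gradual":
        # Old and new concepts alternate, with the new one sampled more and more often.
        p_new = np.clip((t - change_at + width) / (2 * width), 0.0, 1.0)
        rng = np.random.default_rng(seed)
        return np.where(rng.random(n) < p_new, 0.8, 0.2)
    if kind == "reoccurring":
        # A previously seen concept returns periodically.
        return np.where((t // period) % 2 == 0, 0.2, 0.8)
    raise ValueError(f"unknown drift type: {kind}")
```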

Detecting and Handling Concept Drift
- Detecting
  - Monitoring the raw data
  - Monitoring parameters of the learners
  - Monitoring prediction errors of the learners (see the sketch below)
- Handling
  - Ensemble learning
  - Instance selection
  - Instance weighting
  - Training windows
- Training windows are naturally suitable for sudden concept drift, while ensembles are more flexible with respect to the type of change.
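As a concrete example of the "monitoring prediction errors" strategy, the sketch below is a simplified detector in the spirit of DDM (the drift detection method of Gama et al.); the thresholds, class name, and the retraining policy in the closing comment are illustrative assumptions rather than the method shown on the slides.

```python
# Simplified error-rate monitor in the spirit of DDM; constants are assumptions.
import math


class ErrorRateDriftDetector:
    def __init__(self, drift_level=3.0, min_samples=30):
        self.n = 0
        self.errors = 0
        self.best = float("inf")          # lowest observed p + s so far
        self.drift_level = drift_level
        self.min_samples = min_samples

    def update(self, error):
        """Feed 1 for a misclassification, 0 for a correct prediction.
        Returns True when the error rate suggests a concept drift."""
        self.n += 1
        self.errors += int(error)
        p = self.errors / self.n                  # running error rate
        s = math.sqrt(p * (1.0 - p) / self.n)     # its standard deviation
        if self.n < self.min_samples:
            return False
        self.best = min(self.best, p + s)
        # Flag drift when the error rate rises well above its best observed level.
        return p + s > self.best + self.drift_level * s


# Typical use with the handling strategies above: when the detector fires,
# drop the current model and retrain it on a recent training window.
```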

Detecting and Handling Concept Drift (cont.)
- Overall solution for learning under concept drift

Implications for SE Research
- Concept drift is a fundamental issue for SE predictions
  - Cost estimation, defect prediction, ...
  - Especially in the cross-company/cross-project context
  - It can harm the performance of prediction models
- Detecting and handling concept drift is a challenging task!
  - Quality problems of SE data, e.g., insufficient data
  - The data generation context is highly unstable
- It has become an increasingly popular research topic in the SE field!
  - E.g., Burak Turhan [JESE 2012], Jayalath Ekanayake [MSR 2009, JESE 2011]

References
1. Indre Zliobaite, "Learning under Concept Drift: an Overview," technical report, 2009.
2. A. Dries and U. Rückert, "Adaptive Concept Drift Detection," Statistical Analysis and Data Mining, 2009.
3. L. Minku, A. White, and X. Yao, "The Impact of Diversity on On-line Ensemble Learning in the Presence of Concept Drift," IEEE Transactions on Knowledge and Data Engineering, 2010.
4. M. Kelly, D. Hand, and N. Adams, "The Impact of Changing Populations on Classifier Performance," KDD, 1999.