University of Waikato, New Zealand


Data Stream Mining, Lesson 2
Bernhard Pfahringer, University of Waikato, New Zealand

Overview
- Drift and adaptation
- Change detection: CUSUM / Page-Hinkley, DDM, Adwin
- Evaluation: holdout, prequential, multiple runs (cross-validation, …)
- Pitfalls

Many dimensions for model management
- Data: fixed-size window, adaptive window, weighting
- Detection: monitor some performance measure, or compare distributions over time windows
- Adaptation: implicit/blind (e.g. based on windows), or explicit (use a change detector)
- Model: restart from scratch, or replace parts (a tree branch, an ensemble member)
Properties of a change detector: true detection rate, false alarm rate, detection delay

CUSUM: cumulative sum
Monitor the residuals and raise an alarm when their mean is significantly different from 0. (Page-Hinkley is a more sophisticated variant.)
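The detector fits in a few lines. This is a minimal one-sided CUSUM sketch, assuming residuals with mean 0 under no change; the `drift` slack and `threshold` values are illustrative, not taken from the slides.

```python
class Cusum:
    """Minimal one-sided CUSUM detector on prediction residuals."""

    def __init__(self, drift=0.05, threshold=1.0):
        self.drift = drift          # slack subtracted per observation
        self.threshold = threshold  # alarm level
        self.cum = 0.0

    def update(self, residual):
        # Accumulate deviations that exceed the drift allowance;
        # the sum is clamped at 0 so it only grows under sustained shift.
        self.cum = max(0.0, self.cum + residual - self.drift)
        if self.cum > self.threshold:
            self.cum = 0.0  # reset after raising the alarm
            return True
        return False
```

Page-Hinkley works in the same spirit but compares a cumulative deviation against its running minimum, making it less sensitive to a single outlier.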

DDM [Gama et al. '04]
Drift Detection Method: monitors the prediction error rate and its estimated standard deviation.
Three states: normal, warning, alarm/change.
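A sketch of the idea, assuming a 0/1 error stream: track the error rate p and s = sqrt(p(1-p)/n), remember the point where p + s was minimal, and flag warning above p_min + 2·s_min and change above p_min + 3·s_min (the thresholds from the Gama et al. paper; the 30-sample warm-up is a common convention).

```python
import math

class DDM:
    """Drift Detection Method sketch: monitors a 0/1 error stream."""

    def __init__(self, min_samples=30):
        self.min_samples = min_samples
        self._reset()

    def _reset(self):
        self.n = 0
        self.p = 0.0               # running error rate
        self.p_min = float("inf")  # p at the best point seen so far
        self.s_min = float("inf")  # s at the best point seen so far

    def update(self, error):
        """error is 1 for a misclassification, 0 otherwise."""
        self.n += 1
        self.p += (error - self.p) / self.n
        s = math.sqrt(self.p * (1.0 - self.p) / self.n)
        if self.n < self.min_samples:
            return "normal"
        if self.p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = self.p, s
        if self.p + s > self.p_min + 3.0 * self.s_min:
            self._reset()          # change detected: restart monitoring
            return "change"
        if self.p + s > self.p_min + 2.0 * self.s_min:
            return "warning"
        return "normal"
```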

Adwin [Bifet & Gavaldà '07]
Invariant: maintain the maximal-size window whose contents have the same mean (distribution); the window shrinks whenever two sub-windows differ significantly. (Uses the exponential histogram idea to save space and time.)
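The exponential-histogram machinery is what makes Adwin practical; the core invariant, though, can be illustrated with a naive version that stores the window explicitly and tests every split with a Hoeffding-style bound. The cut condition below is simplified from the paper and assumes values in [0, 1].

```python
import math

class SimpleAdwin:
    """Naive Adwin sketch: keeps an explicit window and drops old
    elements while any split shows significantly different means.
    The real algorithm uses exponential histograms to avoid storing
    the full window; delta is the confidence parameter."""

    def __init__(self, delta=0.01):
        self.delta = delta
        self.window = []

    def update(self, x):
        self.window.append(x)
        changed = False
        while self._cut_needed():
            self.window.pop(0)  # shrink from the old end
            changed = True
        return changed

    def _cut_needed(self):
        n = len(self.window)
        for i in range(5, n - 5):  # require a few points on each side
            w0, w1 = self.window[:i], self.window[i:]
            m0 = sum(w0) / len(w0)
            m1 = sum(w1) / len(w1)
            # Hoeffding-style bound on the allowed mean difference
            m = 1.0 / (1.0 / len(w0) + 1.0 / len(w1))
            eps = math.sqrt((1.0 / (2.0 * m)) * math.log(4.0 / self.delta))
            if abs(m0 - m1) > eps:
                return True
        return False
```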

Evaluation: Holdout
Keep a separate test (holdout) set and evaluate the current model on it after every k examples.
Open questions: where does the holdout set come from, and what about drift/change?

Prequential
Also called "test then train": use every new example first to test the current model, then to train it.
Simple and elegant, and it tracks change and drift naturally.
But the average can suffer from a model's poor initial performance; use fading factors (e.g. alpha = 0.99) or a sliding window to discount old mistakes.
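The test-then-train loop with a fading factor fits in a few lines. `model` here is any classifier with a `predict`/`learn` pair (a hypothetical minimal interface, not from the slides); alpha = 0.99 is the fading factor mentioned above.

```python
def prequential_accuracy(model, stream, alpha=0.99):
    """Test-then-train evaluation with exponentially fading accuracy.

    `stream` yields (x, y) pairs; `model` needs predict(x) and
    learn(x, y). Returns the accuracy estimate after each example,
    so early mistakes fade away at rate alpha.
    """
    num, den = 0.0, 0.0
    history = []
    for x, y in stream:
        correct = 1.0 if model.predict(x) == y else 0.0  # test first...
        model.learn(x, y)                                # ...then train
        num = alpha * num + correct
        den = alpha * den + 1.0
        history.append(num / den)
    return history
```

With alpha = 1.0 this reduces to plain prequential accuracy over the whole stream; a sliding window would replace the fading sums with sums over the last w examples.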

Comparison (no drift)

K-fold: Cross-validation

K-fold: split-validation

K-fold: bootstrap validation

K-fold: who wins? [Bifet et al. 2015]
Cross-validation is strongest, but most expensive; split-validation is weakest, but cheapest; bootstrap validation sits in between, closer to cross-validation.

Evaluation can be misleading

“Magic” classifier

Published results

“Magic” = no-change classifier
The problem is auto-correlation: consecutive labels are highly correlated, so always predicting the previous label scores well.
For evaluation: use Kappa-plus, which corrects for this baseline.
For prediction: exploit the auto-correlation to predict better.
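Kappa-plus (Kappa Temporal) replaces the chance baseline of ordinary Kappa with the no-change classifier: κ+ = (p0 − p'e) / (1 − p'e), where p0 is the classifier's accuracy and p'e the accuracy of always predicting the previous label. A minimal sketch:

```python
def kappa_plus(y_true, y_pred):
    """Kappa-plus: accuracy relative to the no-change baseline.

    Returns 1 for a perfect classifier, 0 when it is no better than
    predicting the previous label, and a negative value when worse.
    """
    n = len(y_true)
    p0 = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Accuracy of the no-change classifier: predict the previous true label.
    pe = sum(y_true[i] == y_true[i - 1] for i in range(1, n)) / (n - 1)
    if pe == 1.0:
        return 0.0  # baseline is already perfect; kappa-plus is undefined
    return (p0 - pe) / (1.0 - pe)
```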

“Magic” = no-change classifier

SWT: Temporally Augmented Classifier
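The details of this slide are lost; in the Bifet et al. 2015 paper the temporally augmented classifier appends the most recent true labels to each incoming feature vector, so any standard stream classifier can pick up the auto-correlation itself. A hedged sketch of that augmentation step:

```python
from collections import deque

def temporally_augment(stream, num_labels=1):
    """SWT-style augmentation: append the `num_labels` most recent true
    labels to each feature vector (padding with None before any labels
    have been seen). `stream` yields (features, label) pairs."""
    recent = deque([None] * num_labels, maxlen=num_labels)
    for x, y in stream:
        yield list(x) + list(recent), y
        recent.append(y)  # the true label becomes available after testing
```

Note the ordering: the example is emitted (for test-then-train processing) before its own label enters the history, so no label leaks into its own features.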

SWT: Accuracy and Kappa Plus, Electricity

SWT: Accuracy and Kappa Plus, Forest Cover

Forest Cover? Its "time" dimension is not real time: the examples are sorted by elevation.

Can we exploit spatial correlation?
Deep learning for image processing does it: convolutional layers (@Yann LeCun).
Video encoding does it: MPEG (@IBM).

Rain radar image prediction
- NZ rain radar images from metservice.com, collected automatically every 7.5 minutes
- Images are 601x728, ~450,000 pixels; each pixel represents a ~7 km² area
- Task: predict the next picture, or 1 hour ahead, …
http://www.metservice.com/maps-radar/rain-radar/all-new-zealand

Rain radar image prediction
Predict every single pixel, using information from a neighbourhood of that pixel in past images.
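One way to read "a neighbourhood, in past images": the feature vector for a target pixel is the stack of small patches around it from the most recent frames. A NumPy sketch under that assumption (patch radius and padding scheme are illustrative, not from the slides):

```python
import numpy as np

def pixel_features(frames, row, col, radius=1):
    """Feature vector for pixel (row, col): the (2*radius+1)^2 patch
    around it in every past frame, flattened and concatenated.

    `frames` is a list of 2-D arrays (oldest first); frames are
    zero-padded so that edge pixels still get full patches.
    """
    feats = []
    for frame in frames:
        padded = np.pad(frame, radius)       # zero padding at the borders
        r, c = row + radius, col + radius    # coordinates in padded frame
        patch = padded[r - radius:r + radius + 1, c - radius:c + radius + 1]
        feats.append(patch.ravel())
    return np.concatenate(feats)
```

Prediction then amounts to a per-pixel regression (or one model shared across pixels) from these features to the pixel's value in the next frame.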

Results Actual (left) vs Predicted (right)

Big open question: how to exploit spatio-temporal relationships in data with rich features?
- Algorithm choice: hidden Markov models? Conditional random fields? Deep learning?
- Feature representation: include information from "neighbouring" examples? An explicit relational representation?