Time Series Anomaly Detection Experiments: This file contains full-color, large-scale versions of the experiments shown in the paper, and additional experiments.


Time Series Anomaly Detection Experiments. This file contains full-color, large-scale versions of the experiments shown in the paper, and additional experiments which were omitted because of space constraints. Note that in every case, all the data is freely available.

Figure 1 Expanded. Here is a Premature Ventricular Contraction (PVC). Where the time series is normal, the bitmaps are almost the same. At the PVC, the bitmaps are very different: this is the most unusual section of the time series, and it coincides with the PVC.
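The comparison of neighboring bitmaps described for Figure 1 can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the four-symbol alphabet, the breakpoints, the omission of any smoothing step, and the names `symbolize`, `bitmap`, `bitmap_distance`, `anomaly_scores`, `window`, and `level` are all assumptions made for the example.

```python
import math
from collections import Counter

# Assumed breakpoints: quartiles of the standard normal, giving 4 symbols.
BREAKPOINTS = (-0.6745, 0.0, 0.6745)

def symbolize(values):
    """Z-normalize the series and map each value to one of 'a'..'d'."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values)) or 1.0
    return "".join(
        "abcd"[sum((v - mean) / std > b for b in BREAKPOINTS)] for v in values
    )

def bitmap(symbols, level):
    """Count subwords of length `level`, scaled so the largest count is 1."""
    counts = Counter(symbols[i:i + level] for i in range(len(symbols) - level + 1))
    peak = max(counts.values(), default=1)
    return {word: c / peak for word, c in counts.items()}

def bitmap_distance(a, b):
    """Squared Euclidean distance between two bitmaps."""
    return sum((a.get(w, 0.0) - b.get(w, 0.0)) ** 2 for w in set(a) | set(b))

def anomaly_scores(series, window, level=2):
    """Score each point by how much the bitmap just before it (lag window)
    differs from the bitmap just after it (lead window)."""
    symbols = symbolize(series)
    scores = [0.0] * len(series)
    for t in range(window, len(series) - window + 1):
        lag = bitmap(symbols[t - window:t], level)
        lead = bitmap(symbols[t:t + window], level)
        scores[t] = bitmap_distance(lag, lead)
    return scores
```

On a perfectly periodic signal whose period divides `window`, the two bitmaps are identical and the score is zero; the score rises only where one of the windows overlaps an unusual stretch.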

Figure 3 Expanded. The gene sequences of the mitochondrial DNA of four animals, used to create their own file icons with a chaos game representation. Note that Pan troglodytes is the familiar Chimpanzee, and Loxodonta africana and Elephas maximus are the African and Indian Elephants, respectively. The file icons show that humans and chimpanzees have similar genomes, as do the African and Indian elephants.
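As a rough illustration of how a DNA string can be turned into such an icon, the sketch below counts every subsequence of length `level` and places each count in a grid cell chosen by recursively subdividing the square into four quadrants, one per base. The quadrant layout in `QUADRANT`, the function name `cgr_bitmap`, and the scaling to the largest count are assumptions for the example, not details taken from the slides.

```python
from collections import Counter

# Assumed layout: each base owns one quadrant, given as (row bit, column bit).
QUADRANT = {"C": (0, 0), "A": (0, 1), "T": (1, 0), "G": (1, 1)}

def cgr_bitmap(seq, level=2):
    """Return a 2**level x 2**level grid of subsequence frequencies,
    scaled so that the most frequent subsequence has value 1.0."""
    size = 2 ** level
    counts = Counter(seq[i:i + level] for i in range(len(seq) - level + 1))
    grid = [[0.0] * size for _ in range(size)]
    for kmer, count in counts.items():
        if any(base not in QUADRANT for base in kmer):
            continue  # skip ambiguous bases such as N
        row = col = 0
        for base in kmer:
            r, c = QUADRANT[base]
            row = row * 2 + r  # each base refines the cell by one quadrant
            col = col * 2 + c
        grid[row][col] += count
    peak = max(max(r) for r in grid)
    if peak > 0:
        grid = [[v / peak for v in r] for r in grid]
    return grid
```

Mapping the cell values to colors would then give a thumbnail; two sequences with similar subsequence frequencies produce similar grids.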

Figure 6 Expanded. Annotations by a cardiologist: premature ventricular contraction; supraventricular escape beat.

Figure 7 Expanded. A very complex and noisy ECG, but according to a cardiologist there is only one abnormal heartbeat. The algorithm easily finds it.

Below are some examples of classification and clustering with our bitmap approach. These examples did not make it into the paper because of space limitations.

Time Series Thumbnails. A snapshot of a folder containing cardiograms when its files are arranged by the “Cluster” option. Five cardiograms have been grouped into two different clusters based on their similarity.
Cluster 1 (eeg 1 ~ 3): BIDMC Congestive Heart Failure Database (chfdb), record chf02. Start times at 0, 82, 150, respectively.
Cluster 2 (eeg 6 ~ 7): BIDMC Congestive Heart Failure Database (chfdb), record chf15. Start times at 0, 82, respectively.

Clustering with Time Series Thumbnail Approach. Data Key:
Cluster 1 (datasets 1 ~ 5): BIDMC Congestive Heart Failure Database (chfdb), record chf02. Start times at 0, 82, 150, 200, 250, respectively.
Cluster 2 (datasets 6 ~ 10): BIDMC Congestive Heart Failure Database (chfdb), record chf15. Start times at 0, 82, 150, 200, 250, respectively.
Cluster 3 (datasets 11 ~ 15): Long Term ST Database (ltstdb), record. Start times at 0, 50, 100, 150, 200, respectively.
Cluster 4 (datasets 16 ~ 20): MIT-BIH Noise Stress Test Database (nstdb), record 118e6. Start times at 0, 50, 100, 150, 200, respectively.

Clustering Extended. In Ge and Smyth 2000, this dataset was explored with segmental hidden Markov models. After carefully adjusting the parameters, they reported 98% classification accuracy. Using time series bitmaps with virtually any parameter settings, we get perfect classification and clustering. We can get perfect classification using one-nearest-neighbor classification, or we can project the data into 2-dimensional space (see next slide) and get perfect accuracy using a simple linear classifier, a decision tree or SVD. (Dataset donated by Padhraic Smyth and Seyoung Kim.)
Parameters: Level 1, N = 60, n =
Segmental Markov model [1]
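One-nearest-neighbor classification over bitmaps can be sketched as below. The sketch assumes each bitmap has already been flattened into a list of numbers of equal length and compared with squared Euclidean distance; the function names and the leave-one-out evaluation are illustrative choices, not the evaluation protocol used for the results above.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two flattened bitmaps."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def one_nn_predict(train_bitmaps, train_labels, query_bitmap):
    """Return the label of the training bitmap closest to the query."""
    best = min(
        range(len(train_bitmaps)),
        key=lambda i: squared_distance(train_bitmaps[i], query_bitmap),
    )
    return train_labels[best]

def leave_one_out_accuracy(bitmaps, labels):
    """Classify each bitmap using all the others as the training set."""
    hits = 0
    for i in range(len(bitmaps)):
        rest = bitmaps[:i] + bitmaps[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        hits += one_nn_predict(rest, rest_labels, bitmaps[i]) == labels[i]
    return hits / len(bitmaps)

# Toy, made-up bitmaps (not data from the paper) to show the call pattern.
toy_bitmaps = [[1.0, 0.0, 0.0, 0.0], [0.9, 0.1, 0.0, 0.0],
               [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.9, 0.1]]
toy_labels = ["A", "A", "B", "B"]
accuracy = leave_one_out_accuracy(toy_bitmaps, toy_labels)
```

Because the classifier has no parameters of its own, any sensitivity of the result comes from the bitmap settings alone.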

Classification. Parameters: Level 1, N = 60, n = 12. The MIT ECG Arrhythmia dataset projected into 2D space using only the information from a level-2 time series bitmap. The two classes are easily separated by a simple linear classifier (gray line).