Time Series Epenthesis: Clustering Time Series Streams Requires Ignoring Some Data Thanawin Rakthanmanon Eamonn Keogh Stefano Lonardi Scott Evans.

Slides:



Advertisements
Similar presentations
A Fast PTAS for k-Means Clustering
Advertisements

You have been given a mission and a code. Use the code to complete the mission and you will save the world from obliteration…
Advanced Piloting Cruise Plot.
Copyright © 2011, Elsevier Inc. All rights reserved. Chapter 5 Author: Julia Richards and R. Scott Hawley.
1 Copyright © 2010, Elsevier Inc. All rights Reserved Fig 2.1 Chapter 2.
By D. Fisher Geometric Transformations. Reflection, Rotation, or Translation 1.
Dynamic Programming Introduction Prof. Muhammad Saeed.
Business Transaction Management Software for Application Coordination 1 Business Processes and Coordination.
SMA 6304 / MIT / MIT Manufacturing Systems Lecture 11: Forecasting Lecturer: Prof. Duane S. Boning Copyright 2003 © Duane S. Boning. 1.
We need a common denominator to add these fractions.
XP New Perspectives on Microsoft Office Word 2003 Tutorial 6 1 Microsoft Office Word 2003 Tutorial 6 – Creating Form Letters and Mailing Labels.
Jeopardy Q 1 Q 6 Q 11 Q 16 Q 21 Q 2 Q 7 Q 12 Q 17 Q 22 Q 3 Q 8 Q 13
Jeopardy Q 1 Q 6 Q 11 Q 16 Q 21 Q 2 Q 7 Q 12 Q 17 Q 22 Q 3 Q 8 Q 13
Title Subtitle.
0 - 0.
DIVIDING INTEGERS 1. IF THE SIGNS ARE THE SAME THE ANSWER IS POSITIVE 2. IF THE SIGNS ARE DIFFERENT THE ANSWER IS NEGATIVE.
Addition Facts
Database Design: ER Modelling (Continued)
1/23 Learning from positive examples Main ideas and the particular case of CProgol4.2 Daniel Fredouille, CIG talk,11/2005.
ZMQS ZMQS
TCP Sliding Windows, Flow Control, and Congestion Control Lecture material taken from Computer Networks A Systems Approach, Fourth Ed.,Peterson and Davie,
1 Clustering of location- based data Mohammad Rezaei May 2013.
Lecture 11: Algorithms and Time Complexity I Discrete Mathematical Structures: Theory and Applications.
Parsons Brinckerhoff Chicago, Illinois GIS Estimation of Transit Access Parameters for Mode Choice Models GIS in Transit Conference October 16-17, 2013.
Ken C. K. Lee, Baihua Zheng, Huajing Li, Wang-Chien Lee VLDB 07 Approaching the Skyline in Z Order 1.
Re-examining Instruction Reuse in Pre-execution Approaches By Sonya R. Wolff Prof. Ronald D. Barnes June 5, 2011.
Report Card P Only 4 files are exported in SAMS, but there are at least 7 tables could be exported in WebSAMS. Report Card P contains 4 functions: Extract,
Acceleration of Cooley-Tukey algorithm using Maxeler machine
SAX: a Novel Symbolic Representation of Time Series
Randomized Algorithms Randomized Algorithms CS648 1.
Chapter 4 Memory Management Basic memory management Swapping
Data Structures Using C++
Arrays Dr. Jey Veerasamy July 31 st – August 23 rd 9:30 am to 12 noon 1.
ABC Technology Project
Maps, tables, flow charts and diagrams in Qualitative Data Analysis
ITP © Ron Poet Lecture 4 1 Program Development. ITP © Ron Poet Lecture 4 2 Preparation Cannot just start programming, must prepare first. Decide what.
1 Undirected Breadth First Search F A BCG DE H 2 F A BCG DE H Queue: A get Undiscovered Fringe Finished Active 0 distance from A visit(A)
© Charles van Marrewijk, An Introduction to Geographical Economics Brakman, Garretsen, and Van Marrewijk.
Progam.-(6)* Write a program to Display series of Leaner, Even and odd using by LOOP command and Direct Offset address. Design by : sir Masood.
Linking Verb? Action Verb or. Question 1 Define the term: action verb.
Squares and Square Root WALK. Solve each problem REVIEW:
YO-YO Leader Election Lijie Wang
Traditional IR models Jian-Yun Nie.
© 2012 National Heart Foundation of Australia. Slide 2.
ITEC200 Week10 Sorting. pdp 2 Learning Objectives – Week10 Sorting (Chapter10) By working through this chapter, students should: Learn.
GG Consulting, LLC I-SUITE. Source: TEA SHARS Frequently asked questions 2.
Addition 1’s to 20.
CS 240 Computer Programming 1
25 seconds left…...
11 = This is the fact family. You say: 8+3=11 and 3+8=11
Arithmetic of random variables: adding constants to random variables, multiplying random variables by constants, and adding two random variables together.
Week 1.
Systems Analysis and Design in a Changing World, Fifth Edition
We will resume in: 25 Minutes.
Less Than Matching Orgad Keller.
Introduction to Recursion and Recursive Algorithms
O(N 1.5 ) divide-and-conquer technique for Minimum Spanning Tree problem Step 1: Divide the graph into  N sub-graph by clustering. Step 2: Solve each.
PARTITIONAL CLUSTERING
Mining Time Series.
Jessica Lin, Eamonn Keogh, Stefano Loardi
1 Dot Plots For Time Series Analysis Dragomir Yankov, Eamonn Keogh, Stefano Lonardi Dept. of Computer Science & Eng. University of California Riverside.
Discovering the Intrinsic Cardinality and Dimensionality of Time Series using MDL BING HU THANAWIN RAKTHANMANON YUAN HAO SCOTT EVANS1 STEFANO LONARDI EAMONN.
CLUSTERING. Overview Definition of Clustering Existing clustering methods Clustering examples.
CURE: EFFICIENT CLUSTERING ALGORITHM FOR LARGE DATASETS VULAVALA VAMSHI PRIYA.
Ch. Eick: Introduction to Hierarchical Clustering and DBSCAN 1 Remaining Lectures in Advanced Clustering and Outlier Detection 2.Advanced Classification.
Discovery of Meaningful Rules in Time Series
Figures in High Resolution. Hamming distance for all sliding words using Average Link Three clusters of equal diameter when K = 20 { whoid, davud, njoin,
Basic machine learning background with Python scikit-learn
Time Series Filtering Time Series
Time Series Filtering Time Series
Presentation transcript:

Time Series Epenthesis: Clustering Time Series Streams Requires Ignoring Some Data Thanawin Rakthanmanon Eamonn Keogh Stefano Lonardi Scott Evans

Subsequence Clustering Problem Given a time series, individual subsequences are extracted with a sliding window. Main task is to cluster those subsequences. 2 Sliding window

All subsequences Average subsequence Subsequence Clustering Problem Sliding window Keogh and Lin in ICDM Subsequence clustering is meaningless. Centers of 3 clusters

All data also contains.. Transitions (the connections between words) Some transitions has good meaning and worth to be discovered – The connection inside a group of words Some transitions has no meaning/structure – ASL: hand movement between two words – Speech: (un)expected sound like um.., ah.., er.. – Motion Capture: unexpected movement – Hand Writing: size of space between words 4

How to Deal with them? Possible approaches are Learn it! – Separate noise/unexpected data from the dataset. Use a very clean dataset – dataset contains only atomic words. Simple approach (our choice) – Just ignore some data. – Hope that we will ignore unimportant data. 5

Concepts in Our Algorithm Our clustering algorithm.. is a hierarchical clustering is parameter-lite – approx. length of subsequence (size of sliding window) ignores some data – the algorithm considers only non-overlapped data uses MDL-based distance, bitsave terminates if.. – no choice can save any bit (bitsave ≤ 0) – all data has been used 6

Minimum Description Length (MDL) The shortest code to output the data by Jorma Rissanen in 1978 Intractable complexity (Kolmogorov complexity) Basic concepts of MDL which we use: The better choice uses the smaller number of bits to represent the data – Compare between different operators – Compare between different lengths 7

A H A' denoted as A' is A given H A' = A|H = A-H How to use Description Length? If >, we will store A as A' and H DL( A ) DL( A' ) + DL( A ) is the number of bits to store A DL( H )

Clustering Algorithm 9 Current Clusters Add to cluster Create new cluster Merge clusters Create a cluster by 2 subsequences which are the most similar. Add the closet sub- sequence to a cluster. Merge 2 closet clusters.

What is the best choice? bitsave = DL(Before) - DL(After) 1)Create bitsave = DL(A) + DL(B) - DLC(C') - a new cluster C' from subsequences A and B 2)Add bitsave = DL(A) + DLC(C) - DLC(C') - a subsequence A to an existing cluster C - C' is the cluster C after including subsequence A. 3)Merge bitsave = DLC(C 1 ) + DLC(C 2 ) - DLC(C') - cluster C 1 and C 2 merge to a new cluster C'. The bigger save, the better choice. 10

Clustering Algorithm 11 Current Clusters Add to cluster Create new cluster Merge clusters Create a cluster by 2 subsequences which are the most similar. Add the closet sub- sequence to a cluster. Merge 2 closet clusters.

Bird Calls x Two interwoven calls from the Elf Owl, and Pied-billed Grebe. A time series extracted by using MFCC technique. 12

Current Clusters 13 Create Add Merge Create Motif Discovery Input Final Clusters Add Nearest Nighbor

Bird Calls: Clustering Result Step 1: Create Step 2: Create Step 3: Add Step 4: Merge Subsequences Center of cluster (or Hypothesis) Step of the clustering process bitsave per unit Clustering stops here Create Add Merge 14

Poem The Bells In a sort of Runic rhyme, To the throbbing of the bells-- Of the bells, bells, bells, To the sobbing of the bells; Keeping time, time, time, As he knells, knells, knells, In a happy Runic rhyme, To the rolling of the bells,-- Of the bells, bells, bells-- To the tolling of the bells, Of the bells, bells, bells, bells, Bells, bells, bells,-- To the moaning and the groan- ing of the bells. Edgar Allen Poe (Wikipedia)

The Bells: Clustering Result == Group by Clusters == bells, bells, bells, Bells, bells, bells, Of the bells, bells, bells, Of the bells, bells, bells— To the throbbing of the bells-- To the sobbing of the bells; To the tolling of the bells, To the rolling of the bells,-- To the moaning and the groan- time, time, time, knells, knells, knells, sort of Runic rhyme, groaning of the bells. == Original Order == In a sort of Runic rhyme, To the throbbing of the bells-- Of the bells, bells, bells, To the sobbing of the bells; Keeping time, time, time, As he knells, knells, knells, In a happy Runic rhyme, To the rolling of the bells,-- Of the bells, bells, bells-- To the tolling of the bells, Of the bells, bells, bells, bells, Bells, bells, bells,-- To the moaning and the groan- ing of the bells.

Summary Clustering time series algorithm using MDL. Some data must be ignored or not appeared in any cluster. MDL is used to.. – select the best choice among different operators. – select the best choice among the different lengths. Final clusters can contain subsequences of different length. To speed up, Euclidean is used instead of MDL in core modules, e.g., motif discovery. 17

18

How to calculate DL? A is a subsequence. DL(A) = entropy(A) – Similar result if use Shanon-Fano or Huffman coding. H is a hypothesis, which can be any subsequence. *DL(A) = DL(H) + DL(A - H) – Compression idea; never use – directly in algorithm Cluster C contains subsequence A and B DLC(C) = DL(center) + min(DL(A-center), DL(B-center)) 20 0 A H A'

Running Time 21 motif length s = Cluster plotted Stacked, Dithered Koshi-ECG time series O(m3/s)O(m3/s)

ED vs MDL in Random Walk 22 ED calculated in original continuous space MDL calculated in discrete space (64 cardinality)

Discretization vs Accuracy 23 Classification Accuracy of 18 data sets. The reduction from original continuous space to different discretization does not hurt much, at least in classification accuracy.