Matrix Profile I: All Pairs Similarity Joins for Time Series: A Unifying View that Includes Motifs, Discords and Shapelets Chin-Chia Michael Yeh, Yan.

Slides:

Advertisements

Similar presentations

Advertisements

Algorithms and applications

Nearest Neighbor Search

Clustering (2). Hierarchical Clustering Produces a set of nested clusters organized as a hierarchical tree Can be visualized as a dendrogram –A tree like.

1 CSE 980: Data Mining Lecture 16: Hierarchical Clustering.

Hierarchical Clustering. Produces a set of nested clusters organized as a hierarchical tree Can be visualized as a dendrogram – A tree-like diagram that.

Hierarchical Clustering, DBSCAN The EM Algorithm

Longest Common Subsequence

Multimedia Database Systems

Clustering Basic Concepts and Algorithms

CMPT 225 Sorting Algorithms Algorithm Analysis: Big O Notation.

Mining Time Series.

Metrics, Algorithms & Follow-ups Profile Similarity Measures Cluster combination procedures Hierarchical vs. Non-hierarchical Clustering Statistical follow-up.

Uncertainty Representation. Gaussian Distribution variance Standard deviation.

Content Based Image Clustering and Image Retrieval Using Multiple Instance Learning Using Multiple Instance Learning Xin Chen Advisor: Chengcui Zhang Department.

Today Unsupervised Learning Clustering K-means. EE3J2 Data Mining Lecture 18 K-means and Agglomerative Algorithms Ali Al-Shahib.

Classification Dr Eamonn Keogh Computer Science & Engineering Department University of California - Riverside Riverside,CA Who.

Jessica Lin, Eamonn Keogh, Stefano Loardi

Slide 1 EE3J2 Data Mining Lecture 16 Unsupervised Learning Ali Al-Shahib.

Visual Querying By Color Perceptive Regions Alberto del Bimbo, M. Mugnaini, P. Pala, and F. Turco University of Florence, Italy Pattern Recognition, 1998.

Parallel Computation of the Minimum Separation Distance of Bezier Curves and Surfaces Lauren Bissett, Nicholas Woodfield,

Finding Time Series Motifs on Disk-Resident Data

© University of Minnesota Data Mining for the Discovery of Ocean Climate Indices 1 CSci 8980: Data Mining (Fall 2002) Vipin Kumar Army High Performance.

1 Dot Plots For Time Series Analysis Dragomir Yankov, Eamonn Keogh, Stefano Lonardi Dept. of Computer Science & Eng. University of California Riverside.

October 8, 2013Computer Vision Lecture 11: The Hough Transform 1 Fitting Curve Models to Edges Most contours can be well described by combining several.

SPANISH CRYPTOGRAPHY DAYS (SCD 2011) A Search Algorithm Based on Syndrome Computation to Get Efficient Shortened Cyclic Codes Correcting either Random.

RNA Secondary Structure Prediction Spring Objectives  Can we predict the structure of an RNA?  Can we predict the structure of a protein?

Time Series Data Analysis - I Yaji Sripada. Dept. of Computing Science, University of Aberdeen2 In this lecture you learn What are Time Series? How to.

MINING FREQUENT ITEMSETS IN A STREAM TOON CALDERS, NELE DEXTERS, BART GOETHALS ICDM2007 Date: 5 June 2008 Speaker: Li, Huei-Jyun Advisor: Dr. Koh, Jia-Ling.

Abdullah Mueen Eamonn Keogh University of California, Riverside.

Discovering Deformable Motifs in Time Series Data Jin Chen CSE Fall 1.

Wenqi Zhu 3D Reconstruction From Multiple Views Based on Scale-Invariant Feature Transform.

Semi-Supervised Time Series Classification Li Wei Eamonn Keogh University of California, Riverside {wli,

UNIT 5.  The related activities of sorting, searching and merging are central to many computer applications.  Sorting and merging provide us with a.

Course 8 Contours. Def: edge list ---- ordered set of edge point or fragments. Def: contour ---- an edge list or expression that is used to represent.

LogTree: A Framework for Generating System Events from Raw Textual Logs Liang Tang and Tao Li School of Computing and Information Sciences Florida International.

Hierarchical Clustering Produces a set of nested clusters organized as a hierarchical tree Can be visualized as a dendrogram – A tree like diagram that.

Vector Quantization Vector quantization is used in many applications such as image and voice compression, voice recognition (in general statistical pattern.

Course14 Dynamic Vision. Biological vision can cope with changing world Moving and changing objects Change illumination Change View-point.

Example Apply hierarchical clustering with d min to below data where c=3. Nearest neighbor clustering d min d max will form elongated clusters!

CHAN Siu Lung, Daniel CHAN Wai Kin, Ken CHOW Chin Hung, Victor KOON Ping Yin, Bob Fast Algorithms for Projected Clustering.

Parameter Reduction for Density-based Clustering on Large Data Sets Elizabeth Wang.

Color Image Segmentation Mentor : Dr. Rajeev Srivastava Students: Achit Kumar Ojha Aseem Kumar Akshay Tyagi.

Matrix Profile Examples

Image Representation and Description – Representation Schemes

Hierarchical Clustering

Clustering CSC 600: Data Mining Class 21.

Matrix Profile II: Exploiting a Novel Algorithm and GPUs to break the one Hundred Million Barrier for Time Series Motifs and Joins Yan Zhu, Zachary Zimmerman,

Efficient Image Classification on Vertically Decomposed Data

Supervised Time Series Pattern Discovery through Local Importance

Dimension reduction : PCA and Clustering by Agnieszka S. Juncker

Hierarchical Clustering

Mean Shift Segmentation

At Last! Time Series Joins, Motifs, Discords and Shapelets at Interactive Speeds Eamonn Keogh With Yan Zhu, Chin-Chia Michael Yeh, Abdullah Mueen with.

A Time Series Representation Framework Based on Learned Patterns

Convolutional Networks

K-means and Hierarchical Clustering

Time Series Chains: A New Primitive for Time Series Data Mining

Fitting Curve Models to Edges

Efficient Image Classification on Vertically Decomposed Data

Image Retrieval Longin Jan Latecki.

Lecture 7: Dynamic sampling Dimension Reduction

Motion Segmentation at Any Speed

Multivariate Statistical Methods

Introduction to Grasshopper for rhino

Vectors and Matrices In MATLAB a vector can be defined as row vector or as a column vector. A vector of length n can be visualized as matrix of size 1xn.

Nearest Neighbors CSC 576: Data Mining.

K-Medoid May 5, 2019.

Hierarchical Clustering

Simple Case Studies Using MASS

Presentation transcript:

Matrix Profile I: All Pairs Similarity Joins for Time Series: A Unifying View that Includes Motifs, Discords and Shapelets Chin-Chia Michael Yeh, Yan Zhu, Liudmila Ulanova Nurjahan Begum, Yifei Ding, Hoang Anh Dau Diego Furtado Silva, Abdullah Mueen, Eamonn Keogh http://www.cs.ucr.edu/~eamonn/MatrixProfile.html

Motivation Given a time series, T and a desired subsequence length, m

Motivation Given a time series, T and a desired subsequence length, m We can use sliding window of length m to extract all subsequences of length m |T|-m+1 …

Motivation Given a time series, T and a desired subsequence length, m We can then compute the pairwise distance among these subsequences and store them to a matrix 7.6952 7.7399 … 7.7106 |T|-m+1 … 4

Motivation Given a time series, T and a desired subsequence length, m We can visualize the matrix by plot the matrix as a image where blue is more similar subsequences and red is more dissimilar subsequences |T|-m+1 … 5

Motivation Given a time series, T and a desired subsequence length, m Different information about the time series can be extract from the matrix (e.g., pair with smallest distance is the motif, checkerboard kernel can be used to segment the time series [a]) |T|-m+1 … 6 [a] J Foote, Automatic Audio Segmentation Using A Measure of Audio Novelty

Motivation Given a time series, T and a desired subsequence length, m However, since the space complexity is O( T −m+1 2 ), which is quadratic in T , the corresponding matrix for a time series with 1 million doubles requires 4,000 GB to store |T|-m+1 … 7

Problem statement Given a time series, T and a desired subsequence length, m m Most part of the matrix contains information that is repeated or irrelevant for tasks in time series data mining What kind of information should we extract from the matrix? How do we compactly storing it? |T|-m+1

What kind of information should be extracted from the matrix? set of all subsequences set of corresponding nearest neighbor ⁞ ⁞ Who is each subsequence’s nearest neighbor |T|-m+1

What kind of information should be extracted from the matrix? set of all subsequences set of corresponding nearest neighbor 2.88 0.90 1.61 6.04 5.69 1.23 1.40 ⁞ ⁞ Who is each subsequence’s nearest neighbor How far away from their corresponding nearest neighbor |T|-m+1

How to store the information? (1 of 2) The distance to the corresponding nearest neighbor of each subsequence can be stored in a vector called matrix profile time series, T matrix profile, P

How to store the information? (1 of 2) The distance to the corresponding nearest neighbor of each subsequence can be stored in a vector called matrix profile Ti time series, T matrix profile, P m The matrix profile value at this location i is the distance between Ti and its nearest neighbor

How to store the information? (1 of 2) The distance to the corresponding nearest neighbor of each subsequence can be stored in a vector called matrix profile time series, T matrix profile, P Local minimums are corresponding to motifs

How to store the information? (2 of 2) The index of corresponding nearest neighbor of each subsequence is also stored in a vector called matrix profile index Ti time series, T matrix profile index, I m 236 252 166 171 176 181 186 191 196 201 206 211 216 220 148 10 15 256 261 266 271 276 281 286 291 296 301 304 306 69 222 227 232 11 16 21 26 31 36 41 46 51 56 61 150 155 160 220 86 91 73 86 91 96 101 106 111 116 121 126 131 135 176 4 241 … 192 193 194 195 196 The matrix profile index value at this location i is the index of Ti‘s nearest neighbor

How to store the information? (2 of 2) The index of corresponding nearest neighbor of each subsequence is also stored in a vector called matrix profile index Ti T194 time series, T matrix profile index, I m 236 252 166 171 176 181 186 191 196 201 206 211 216 220 148 10 15 256 261 266 271 276 281 286 291 296 301 304 306 69 222 227 232 11 16 21 26 31 36 41 46 51 56 61 150 155 160 220 86 91 73 86 91 96 101 106 111 116 121 126 131 135 176 4 241 … 192 193 194 195 196 It turns out that Ti‘s nearest neighbor is T194

How to compute matrix profile? (1 of 2) Given a time series, T and a desired subsequence length, m m inf Matrix profile is initialized as inf vector This is just a toy example, so the values and the vector length does not fit the time series shown above

How to compute matrix profile? (1 of 2) Given a time series, T and a desired subsequence length, m Ti m inf At the first iteration, a subsequence T i is randomly selected from T

How to compute matrix profile? (1 of 2) Given a time series, T and a desired subsequence length, m Ti m inf We compute the distances between T i and every subsequences from T (time complexity = O(|T|log(|T|))) We them put the distances in a vector based on the position of the subsequences 3 2 5 4 1 9 8 6 The distance between T i and T 1 (first subsequence) is 3

How to compute matrix profile? (1 of 2) Given a time series, T and a desired subsequence length, m Ti m inf We compute the distances between T i and every subsequences from T (time complexity = O(|T|log(|T|))) We them put the distances in a vector based on the position of the subsequences 3 2 5 4 1 9 8 6 Let say T i happen to be the third subsequences, therefore the third value in the distance vector is 0

How to compute matrix profile? (1 of 2) Given a time series, T and a desired subsequence length, m Ti m inf Matrix profile is updated by apply elementwise minimum to these two vectors min 3 2 5 4 1 9 8 6

How to compute matrix profile? (1 of 2) Given a time series, T and a desired subsequence length, m Ti m 3 inf Matrix profile is updated by apply elementwise minimum to these two vectors min 3 2 5 4 1 9 8 6

How to compute matrix profile? (1 of 2) Given a time series, T and a desired subsequence length, m Ti m 3 2 inf Matrix profile is updated by apply elementwise minimum to these two vectors min 3 2 5 4 1 9 8 6 Because T i is the third subsequences and the distance between oneself is unimportant, we skip it

How to compute matrix profile? (1 of 2) Given a time series, T and a desired subsequence length, m Ti m 3 2 inf 5 4 1 9 8 6 After we finish update matrix profile for the first iteration 3 2 5 4 1 9 8 6

How to compute matrix profile? (1 of 2) Given a time series, T and a desired subsequence length, m Tj m 3 2 inf 5 4 1 9 8 6 In the second iteration, we randomly select another subsequence Tj and it happens to be the 12th subsequences

How to compute matrix profile? (1 of 2) Given a time series, T and a desired subsequence length, m Tj m 3 2 inf 5 4 1 9 8 6 Once again, we compute the distance between Tj and every subsequences of T 2 3 1 4 6 5 8 9

How to compute matrix profile? (1 of 2) Given a time series, T and a desired subsequence length, m Tj m 3 2 inf 5 4 1 9 8 6 min The same elementwise minimum 2 3 1 4 6 5 8 9

How to compute matrix profile? (1 of 2) Given a time series, T and a desired subsequence length, m Tj m 2 inf 5 3 4 1 9 8 6 min The same elementwise minimum 2 3 1 4 6 5 8 9

How to compute matrix profile? (1 of 2) Given a time series, T and a desired subsequence length, m Tj m 2 inf 5 3 4 1 9 8 6 min The same elementwise minimum 2 3 1 4 6 5 8 9

How to compute matrix profile? (1 of 2) Given a time series, T and a desired subsequence length, m Tj m 2 1 5 3 4 9 8 6 min The same elementwise minimum 2 3 1 4 6 5 8 9

How to compute matrix profile? (1 of 2) Given a time series, T and a desired subsequence length, m Tj m 2 1 5 3 4 9 8 6 min 2 3 1 4 6 5 8 9 We repeat the two steps (distance computation and update) until we have used every subsequences

How to compute matrix profile? (1 of 2) Given a time series, T and a desired subsequence length, m Tj m 2 1 5 3 4 9 8 6 min 2 3 1 4 6 5 8 9 There are |T| subsequences and the distance computation is O(|T|log(|T|)) The overall time complexity is O(|T|2log(|T|))

How to compute matrix profile? (2 of 2) It is exact It is simple and parameter-free It is fast O(n2 log n) or O(n2) if not anytime It is space efficient O(n) It allows anytime algorithms (the quality of matrix profile improves over iteration) It can leverage hardware (embarrassingly parallelizable) It is incrementally maintainable http://www.cs.ucr.edu/~eamonn/MatrixProfile.html

Motif mining with matrix profile: repeated earthquakes Seismologists are interested in finding repeated earthquakes in long sequence of seismometer readings as the repeated earthquakes could be years apart Excerpt of a seismic time series from Mammoth Lakes, California

Motif mining with matrix profile: repeated earthquakes Seismologists are interested in finding repeated earthquakes in long sequence of seismometer readings as the repeated earthquakes could be years apart

Discord mining with matrix profile: abnormal heartbeat detection Identifying abnormal heartbeat from ECG reading is important for diagnosis patients Maximum value in matrix profile indicates discord

Discord mining with matrix profile: abnormal heartbeat detection Identifying abnormal heartbeat from ECG reading is important for diagnosis patients Premature ventricular contraction

Common subsequence mining with matrix profile: music sampling detection (1 of 2) Sometimes a musician “borrows” section of other music to generate new music The borrowed section is the common subsequence in both music and can be discovered using matrix profile Given two time series T a and T b , matrix profile can be compute by either finding the nearest neighbor of T a ’s subsequences in T b or vise versa

Common subsequence mining with matrix profile: music sampling detection (2 of 2)

Common subsequence mining with matrix profile: music sampling detection (2 of 2) Most similar 5 secs subsequences

Common subsequence mining with matrix profile: music sampling detection (2 of 2) Music 1: Under Pressure by Queen-David Bowie Music 2: Ice Ice Baby by Vanilla Ice Most similar 5 secs subsequences Under Pressure Ice Ice Baby

Segmentation with matrix profile: human activity (1 of 2) Matrix profile and index can also be used for segmentation Given a time series, by examining its matrix profile index, the neighboring information can be extracted, and similar pattern will be neighbors The neighboring information can be visualized with arcs walkwalkwalkwalkwalkwalkwalkwalkrunrunrunrunrunrunrunrunrun

Segmentation with matrix profile: human activity (1 of 2) Matrix profile and index can also be used for segmentation Given a time series, by examining its matrix profile index, the neighboring information can be extracted, and similar pattern will be neighbors The neighboring information can be visualized with arcs walkwalkwalkwalkwalkwalkwalkwalkrunrunrunrunrunrunrunrunrun 1 arc 3 arc 2 arc

Segmentation with matrix profile: human activity (1 of 2) Matrix profile and index can also be used for segmentation Given a time series, by examining its matrix profile index, the neighboring information can be extracted, and similar pattern will be neighbors The neighboring information can be visualized with arcs walkwalkwalkwalkwalkwalkwalkwalkrunrunrunrunrunrunrunrunrun Arc counting curve

Segmentation with matrix profile: human activity (1 of 2) Matrix profile and index can also be used for segmentation Given a time series, by examining its matrix profile index, the neighboring information can be extracted, and similar pattern will be neighbors The neighboring information can be visualized with arcs walkwalkwalkwalkwalkwalkwalkwalkrunrunrunrunrunrunrunrunrun Local minimum is a good split point Arc counting curve

Segmentation with matrix profile: human activity (2 of 2) CMU Motion Capture database Local minimum is found with MATLAB’s peak finding function 5,000 10,000 Matrix Profile Arc Counting Curve One-dimension of multi-d time series: Subject 86, recording 4, dimension 30 W S P N W C T D P W Ground Truth

Conclusion http://www.cs.ucr.edu/~eamonn/MatrixProfile.html Given a time series, T and a desired subsequence length, m m pairwise distance matrix Important (neighboring) information are extracted from O( T 2 ) pairwise distance matrix and stored in O( T ) matrix profile and matrix profile index matrix profile matrix profile index 236 252 166 171 176 181 186 191 196 201 206 211 216 220 148 10 15 256 261 266 271 276 281 286 291 296 301 304 306 69 222 227 232 11 16 21 26 31 36 41 46 51 56 61 150 155 160 220 86 91 73 86 91 96 101 106 111 116 121 126 131 135 176 4 241 Tasks such as motif mining, discord mining, or segmentation can be easily performed with matrix profile and matrix profile index http://www.cs.ucr.edu/~eamonn/MatrixProfile.html

Questions?