FastDTW: Toward Accurate Dynamic Time Warping in Linear Time and Space


Stan Salvador and Philip Chan, Department of Computer Sciences, Florida Institute of Technology

Outline: Dynamic Time Warping (DTW); Problem Statement; Related Work for Speeding Up DTW; FastDTW Algorithm; Evaluation of FastDTW; Contributions; Limitations and Future Work

Dynamic Time Warping (DTW): Aligns two time series by warping the time dimension. Warping means expanding or contracting the time dimension.

The Dynamic Time Warping Algorithm: A dynamic programming approach. Solutions to slightly smaller problems are used to find solutions to larger ones.

The DTW Cost Matrix

Distance of Min-Cost Warp Path

Finding Min-Cost Warp Path
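
The three slides above (cost matrix, warp-path distance, path recovery) were figures in the original deck. A minimal Python sketch of what they describe is given here. It assumes one-dimensional numeric series and absolute difference as the local distance, neither of which the transcript states, and the function name `dtw` is illustrative.

```python
import math

def dtw(x, y):
    """Full DTW between two numeric sequences; returns (distance, warp_path)."""
    n, m = len(x), len(y)
    # cost[i][j] = minimum cumulative distance aligning x[:i+1] with y[:j+1]
    cost = [[math.inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = abs(x[i] - y[j])  # local distance (assumed: absolute difference)
            if i == 0 and j == 0:
                cost[i][j] = d
            else:
                cost[i][j] = d + min(
                    cost[i - 1][j] if i > 0 else math.inf,
                    cost[i][j - 1] if j > 0 else math.inf,
                    cost[i - 1][j - 1] if i > 0 and j > 0 else math.inf,
                )
    # Backtrack from the last cell to (0, 0), always moving to the cheapest predecessor.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((cost[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((cost[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((cost[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return cost[n - 1][m - 1], path
```

For example, `dtw([0, 0, 1, 2], [0, 1, 2])` returns a distance of 0 with the path `[(0, 0), (1, 0), (2, 1), (3, 2)]`: the repeated leading 0 in the first series is absorbed by warping. Both the matrix and the double loop are quadratic in the series length, which is the cost the rest of the talk tries to avoid.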

Advantages of DTW: DTW is optimal. An intuitive distance measurement. Local variation in the time axis is common: handwriting; speech; “events” that start after varying delays.

Disadvantages of DTW: O(N^2) time and space complexity. Only practical for small data sets (<3,000). Time series are often very long. Data mining requires a scalable DTW algorithm.

Problem Statement: We desire an efficient Dynamic Time Warping algorithm with linear time complexity and linear space complexity. The warp path is needed in addition to the warp distance. The warp path must be nearly optimal.

Does DTW Need to Be Faster? “Myth 3: There is a need (and room) for improvements in the speed of DTW for data mining applications.” (Keogh, today, 9:45am). Keogh: many time series. FastDTW: long time series.

Existing Methods to Speed Up DTW: Constraints – only fill in part of the cost matrix. Abstraction – sample the data before time warping.

Constraints: Sakoe-Chiba Band (Sakoe & Chiba 1978); Itakura Parallelogram (Itakura 1975). Still O(N^2) if the window width is a function of input size (linear if the width is constant). Assumes a near-optimal warp path stays near the i=j diagonal. Accuracy depends on the domain.
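
A band constraint can be sketched in Python as below. This is an illustrative version under stated assumptions: absolute difference as the local distance, a fixed band half-width `width` measured in cells, and a hypothetical function name `dtw_band`. For readability it still allocates the full matrix, although only cells inside the band are evaluated.

```python
import math

def dtw_band(x, y, width):
    """DTW distance restricted to cells with |i - j| <= width (Sakoe-Chiba style band)."""
    n, m = len(x), len(y)
    if abs(n - m) > width:
        raise ValueError("band too narrow to reach the final cell")
    # Row 0 and column 0 are padding so that predecessors always exist.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        lo = max(1, i - width)
        hi = min(m, i + width)
        for j in range(lo, hi + 1):  # only cells inside the band are filled in
            d = abs(x[i - 1] - y[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

With `x = [0, 0, 1, 2]` and `y = [0, 1, 2, 2]`, a width of 0 forces the diagonal and gives a distance of 2, while a width of 1 already admits a zero-cost warp path. This illustrates the slide's caveat: the band is only accurate when a good path stays close to the diagonal.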

Abstraction (Keogh & Pazzani 2000), (Chu et al. 2002): O(N) if N points are sampled down to ≤ √N points. Assumptions: sampling preserves time series structure; small deviations from the optimal path cause little increase in warp-path distance.
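
The sampling step can be illustrated with a short Python sketch. Averaging equal-width segments is an assumption here (the slide only says the data are sampled), and the helper names `sample_down` and `abstraction_length` are hypothetical. Reducing to about √N points is what keeps a quadratic DTW on the reduced series within O(N) work.

```python
import math

def sample_down(series, target_len):
    """Reduce a series to target_len points by averaging equal-width segments."""
    n = len(series)
    if not 1 <= target_len <= n:
        raise ValueError("target_len must be between 1 and len(series)")
    reduced = []
    for k in range(target_len):
        start = k * n // target_len
        end = (k + 1) * n // target_len
        chunk = series[start:end]  # never empty because target_len <= n
        reduced.append(sum(chunk) / len(chunk))
    return reduced

def abstraction_length(n):
    """Largest reduced length whose square does not exceed n."""
    return max(1, math.isqrt(n))
```

For instance, `sample_down([1, 2, 3, 4, 5, 6, 7, 8, 9], 3)` yields `[2.0, 5.0, 8.0]`, and a series of 10,000 points would be reduced to 100 points before running DTW.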

Our FastDTW Algorithm: A multi-resolution approach inspired by a multi-level graph bisection algorithm (Karypis 1997). 3 key operations: Coarsening – reduce the resolution of a time series; Projection – use a low-res warp path as an initial solution at a higher resolution; Refinement – refine a projected warp path by locally adjusting the path.
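
Coarsening and projection can be made concrete with a small Python sketch. It assumes that coarsening averages adjacent pairs of points and that each low-resolution cell maps to a 2x2 block of cells at the next resolution; the names `coarsen` and `project` are illustrative, not taken from the slides.

```python
def coarsen(series):
    """Halve the resolution by averaging adjacent pairs (a lone trailing point is kept)."""
    return [sum(series[i:i + 2]) / len(series[i:i + 2])
            for i in range(0, len(series), 2)]

def project(low_res_path, n, m):
    """Map each low-res cell (i, j) to its 2x2 block in an n-by-m high-res matrix."""
    cells = set()
    for i, j in low_res_path:
        for hi in (2 * i, 2 * i + 1):
            for hj in (2 * j, 2 * j + 1):
                if hi < n and hj < m:  # clip blocks that overhang odd lengths
                    cells.add((hi, hj))
    return cells
```

Here `coarsen([1, 3, 5, 7, 9])` gives `[2.0, 6.0, 9.0]`, and projecting the low-res path `[(0, 0), (1, 1)]` onto a 4x4 matrix gives the eight cells of the two diagonal 2x2 blocks. Refinement then runs DTW only over those cells, optionally padded by a radius.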

Sample Run of FastDTW

FastDTW Algorithm: Set the resolution to the coarsest. Find the initial path using regular DTW. Repeat: double the resolution; project the path onto the finer resolution; find a path through the projected area (plus a small radius around the projected area). Until the original resolution is reached.

Complexity: O(N) time; O(N) space. Details in the paper.
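
The slide defers the details to the paper, but the shape of the argument can be sketched. Assume each resolution evaluates at most c(r) cells per time-series point, where c(r) depends only on the radius r (the exact constant is not given in the transcript), and that series lengths halve at each coarser level. The total work is then bounded by a geometric series:

```latex
\[
\sum_{k=0}^{\lfloor \log_2 N \rfloor} c(r)\,\frac{N}{2^{k}}
\;<\; c(r)\,N \sum_{k=0}^{\infty} 2^{-k}
\;=\; 2\,c(r)\,N
\;=\; O(N) \quad \text{for constant } r .
\]
```

The same bound applies to space under these assumptions, since the search window at any one resolution holds at most c(r) cells per point.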

Evaluation Criteria: Accuracy and Efficiency. Accuracy: the error of an approximate time warping algorithm is % error = (approxDist - optimalDist) / optimalDist × 100, where approxDist is the warp path distance of the approximate algorithm and optimalDist is the warp path distance of the DTW algorithm. Efficiency: runtime (measured in seconds).
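
The accuracy criterion can be written as a small Python helper. This sketch assumes the error is the relative increase of the approximate warp-path distance over the optimal one, expressed as a percentage; the function name `percent_error` is illustrative.

```python
def percent_error(approx_dist, optimal_dist):
    """Relative excess of an approximate warp-path distance over the optimal one, in %."""
    if optimal_dist <= 0:
        raise ValueError("optimal_dist must be positive for a relative error")
    return (approx_dist - optimal_dist) / optimal_dist * 100.0
```

An approximate distance of 110 against an optimal distance of 100 gives an error of 10%. Because an approximate path can never beat the optimal one, the value is never negative, and it can exceed 100% when the approximate distance is more than double the optimal distance.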

Evaluation Procedure (Accuracy). Data Sets: UCR Time Series Data Mining Archive (Keogh & Folias 2002), 3 groups used: Random – 45 unrelated time series (earthquakes, random walk, EEG, speech, etc.); Trace – 200 time series simulating nuclear power plant failure (4 classes); Gun – 200 time series of a gun being drawn and pointed (2 classes). Procedure: Run FastDTW, Constraints (Sakoe-Chiba Band), and Data Abstraction on all pairs within a data set group, varying the radius. Record the average error of all three methods for each group of data and each radius.

Average % Error (Accuracy)
Radius              0         1        10        20        30
FastDTW         19.2%      8.6%      1.5%      0.8%      0.6%
Abstraction    983.3%    547.9%      6.5%      2.8%      1.8%
Band          2749.2%   2385.7%    794.1%    136.8%      9.3%

Error in Different Data Sets

Evaluation Procedure (Execution Time). Data Sets: synthetic sine waves with Gaussian noise, 10 to 180,000 data points. Procedure: Run FastDTW and DTW on each data set, varying the radius for FastDTW. Compare the execution times.
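
Test data of this kind could be generated along the following lines. The period, noise level, and seed are assumptions chosen for illustration; the transcript does not give the parameters used in the actual experiments, and `noisy_sine` is a hypothetical helper name.

```python
import math
import random

def noisy_sine(n, period=100.0, noise_sigma=0.1, seed=0):
    """Return n samples of a sine wave with additive Gaussian noise (seeded for repeatability)."""
    rng = random.Random(seed)
    return [math.sin(2.0 * math.pi * t / period) + rng.gauss(0.0, noise_sigma)
            for t in range(n)]
```

Generating two series with different seeds, for example `noisy_sine(1000, seed=1)` and `noisy_sine(1000, seed=2)`, gives a pair that can be timed under both algorithms at increasing lengths.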

Execution Time

Summary of Contributions: FastDTW is an approximation of DTW with O(N) time and space complexity. Scales well to long time series. Accurate: 8.6% error if radius=1, 0.8% error if radius=20.

Limitations and Future Work: FastDTW does not always find an optimal solution. Future Work: examine using different step sizes between resolutions; investigate search algorithms to help improve refinement; examine the number of cells evaluated vs. accuracy for the FastDTW, Abstraction, and Band algorithms.

Questions? Thanks to those who helped with this research: Matt Mahoney (Florida Institute of Technology), Brian Buckley, Walter Schiefele (Interface & Control Systems). This research is partially supported by NASA.

FastDTW Pseudocode
Input: X, Y, radius
Output: 1) A minimum distance warp path between X and Y
        2) The warp path distance between X and Y

    // The min size of the coarsest resolution.
    Integer minTSsize = radius+2

    IF (|X| ≤ minTSsize OR |Y| ≤ minTSsize)
    {
        // Base Case: for a very small time series run the full DTW algorithm
        RETURN DTW(X, Y)
    }
    ELSE
    {
        // Recursive Case: Project the warp path from a coarser resolution onto the current resolution.
        // Run DTW only along the projected path (and also radius cells from the projected path).
        TimeSeries shrunkX = X.reduceByHalf()   // Coarsening
        TimeSeries shrunkY = Y.reduceByHalf()   // Coarsening

        WarpPath lowResPath = FastDTW(shrunkX, shrunkY, radius)

        SearchWindow window = ExpandedResWindow(lowResPath, X, Y, radius)   // Projection

        RETURN DTW(X, Y, window)   // Refinement
    }
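
The pseudocode can be turned into a runnable Python sketch, shown below. Several details are assumptions rather than the authors' implementation: series are one-dimensional with absolute difference as the local distance, coarsening averages adjacent pairs, and the window pads each projected 2x2 block by `radius` cells in a square neighbourhood. Sorting the window adds a logarithmic factor that a production version would avoid.

```python
import math

def reduce_by_half(series):
    """Coarsening: average adjacent pairs; a lone trailing point is kept as is."""
    return [sum(series[i:i + 2]) / len(series[i:i + 2])
            for i in range(0, len(series), 2)]

def expanded_res_window(low_res_path, n, m, radius):
    """Projection: map each low-res cell to its 2x2 block, padded by `radius` cells."""
    window = set()
    for i, j in low_res_path:
        for hi in range(2 * i - radius, 2 * i + 2 + radius):
            for hj in range(2 * j - radius, 2 * j + 2 + radius):
                if 0 <= hi < n and 0 <= hj < m:
                    window.add((hi, hj))
    return window

def dtw(x, y, window=None):
    """DTW restricted to `window` (all cells if None); returns (distance, path)."""
    n, m = len(x), len(y)
    if window is None:
        window = {(i, j) for i in range(n) for j in range(m)}
    best = {}  # cell -> (cumulative cost, predecessor cell)
    for i, j in sorted(window):  # row-major order visits predecessors first
        d = abs(x[i] - y[j])
        if (i, j) == (0, 0):
            best[(i, j)] = (d, None)
            continue
        prev = min(((i - 1, j - 1), (i - 1, j), (i, j - 1)),
                   key=lambda c: best.get(c, (math.inf, None))[0])
        prev_cost = best.get(prev, (math.inf, None))[0]
        best[(i, j)] = (d + prev_cost, prev)
    path, cell = [], (n - 1, m - 1)
    while cell is not None:  # follow predecessors back to (0, 0)
        path.append(cell)
        cell = best[cell][1]
    path.reverse()
    return best[(n - 1, m - 1)][0], path

def fastdtw(x, y, radius=1):
    """Recursive multi-resolution approximation of DTW, following the pseudocode."""
    min_ts_size = radius + 2
    if len(x) <= min_ts_size or len(y) <= min_ts_size:
        return dtw(x, y)  # base case: full DTW at the coarsest resolution
    _, low_res_path = fastdtw(reduce_by_half(x), reduce_by_half(y), radius)
    window = expanded_res_window(low_res_path, len(x), len(y), radius)
    return dtw(x, y, window)  # refinement within the projected window
```

Calling `fastdtw(x, y, radius=1)` returns a `(distance, path)` pair. The distance is never smaller than the one from `dtw(x, y)`, because the windowed search considers a subset of the warp paths, and a larger radius widens the window at the cost of more cells evaluated.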