Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Why does subsequence time-series clustering produce sine waves? IBM.

Slides:



Advertisements
Similar presentations
Matrix Representation
Advertisements

Ch 7.7: Fundamental Matrices
Lect.3 Modeling in The Time Domain Basil Hamed
Leo Lam © Signals and Systems EE235 Leo Lam.
Lecture 7: Basis Functions & Fourier Series
DFT/FFT and Wavelets ● Additive Synthesis demonstration (wave addition) ● Standard Definitions ● Computing the DFT and FFT ● Sine and cosine wave multiplication.
L2:Non-equilibrium theory: Keldish formalism
Time-domain Crosstalk The equations that describe crosstalk in time-domain are derived from those obtained in the frequency-domain under the following.
Object Orie’d Data Analysis, Last Time Finished NCI 60 Data Started detailed look at PCA Reviewed linear algebra Today: More linear algebra Multivariate.
Slides by Olga Sorkine, Tel Aviv University. 2 The plan today Singular Value Decomposition  Basic intuition  Formal definition  Applications.
Lecture 21: Spectral Clustering
Motion Analysis Slides are from RPI Registration Class.
3D Geometry for Computer Graphics
Tokyo Research Laboratory © Copyright IBM Corporation 2009 | 2009/04/ | SDM 2009 / Anomaly-Detection Proximity-Based Anomaly Detection using Sparse Structure.
Ch 7.9: Nonhomogeneous Linear Systems
Prénom Nom Document Analysis: Data Analysis and Clustering Prof. Rolf Ingold, University of Fribourg Master course, spring semester 2008.
The Terms that You Have to Know! Basis, Linear independent, Orthogonal Column space, Row space, Rank Linear combination Linear transformation Inner product.
© 2003 by Davi GeigerComputer Vision October 2003 L1.1 Structure-from-EgoMotion (based on notes from David Jacobs, CS-Maryland) Determining the 3-D structure.
Previously Two view geometry: epipolar geometry Stereo vision: 3D reconstruction epipolar lines Baseline O O’ epipolar plane.
3D Geometry for Computer Graphics
Pattern Recognition. Introduction. Definitions.. Recognition process. Recognition process relates input signal to the stored concepts about the object.
Orthogonal Transforms
Chapter 15 Fourier Series and Fourier Transform
Radial Basis Function Networks
Chapter 25 Nonsinusoidal Waveforms. 2 Waveforms Used in electronics except for sinusoidal Any periodic waveform may be expressed as –Sum of a series of.
Summarized by Soo-Jin Kim
Machine Learning CUNY Graduate Center Lecture 3: Linear Regression.
Motivation Music as a combination of sounds at different frequencies
Chapter 9 Superposition and Dynamic Programming 1 Chapter 9 Superposition and dynamic programming Most methods for comparing structures use some sorts.
Principles of Pattern Recognition
FOURIER SERIES §Jean-Baptiste Fourier (France, ) proved that almost any period function can be represented as the sum of sinusoids with integrally.
Lecture 2 Signals and Systems (I)
1 Chapter 5 Image Transforms. 2 Image Processing for Pattern Recognition Feature Extraction Acquisition Preprocessing Classification Post Processing Scaling.
Tokyo Research Laboratory © Copyright IBM Corporation 2005 | 2005/11/30 | ICDM 05 IBM Research, Tokyo Research Lab. Tsuyoshi Idé Pairwise Symmetry Decomposition.
Quantum One: Lecture Representation Independent Properties of Linear Operators 3.
HP-PURDUE-CONFIDENTIAL Final Exam May 16th 2008 Slide No.1 Outline Motivations Analytical Model of Skew Effect and its Compensation in Banding and MTF.
Chem Patterson Methods In 1935, Patterson showed that the unknown phase information in the equation for electron density:  (xyz) = 1/V ∑ h ∑ k.
ECE 8443 – Pattern Recognition LECTURE 07: MAXIMUM LIKELIHOOD AND BAYESIAN ESTIMATION Objectives: Class-Conditional Density The Multivariate Case General.
ECE 8443 – Pattern Recognition LECTURE 10: HETEROSCEDASTIC LINEAR DISCRIMINANT ANALYSIS AND INDEPENDENT COMPONENT ANALYSIS Objectives: Generalization of.
SINGULAR VALUE DECOMPOSITION (SVD)
Integral Transform Method CHAPTER 15. Ch15_2 Contents  15.1 Error Function 15.1 Error Function  15.2Applications of the Laplace Transform 15.2Applications.
Fourier series: Eigenfunction Approach
CHEE825 Fall 2005J. McLellan1 Spectral Analysis and Input Signal Design.
1 ECE 3336 Introduction to Circuits & Electronics Note Set #8 Phasors Spring 2013 TUE&TH 5:30-7:00 pm Dr. Wanda Wosik.
Modern Navigation Thomas Herring MW 11:00-12:30 Room
ECE 8443 – Pattern Recognition ECE 8527 – Introduction to Machine Learning and Pattern Recognition LECTURE 07: BAYESIAN ESTIMATION (Cont.) Objectives:
Ch11: Normal Modes 1. Review diagonalization and eigenvalue problem 2. Normal modes of Coupled oscillators (3 springs, 2 masses) 3. Case of Weakly coupled.
Tokyo Research Laboratory © Copyright IBM Corporation 2005SDM 05 | 2005/04/21 | IBM Research, Tokyo Research Lab Tsuyoshi Ide Knowledge Discovery from.
Fourier series, Discrete Time Fourier Transform and Characteristic functions.
Advanced Engineering Mathematics, 7 th Edition Peter V. O’Neil © 2012 Cengage Learning Engineering. All Rights Reserved. CHAPTER 4 Series Solutions.
5. Quantum Theory 5.0. Wave Mechanics
Principal Component Analysis (PCA)
Tokyo Research Laboratory © Copyright IBM Corporation 2003KDD 04 | 2004/08/24 | Eigenspace-based anomaly detection in computer systems IBM Research, Tokyo.
Algebraic Techniques for Analysis of Large Discrete-Valued Datasets 
Chapter 13 Discrete Image Transforms
Camera Calibration Course web page: vision.cis.udel.edu/cv March 24, 2003  Lecture 17.
ECE 8443 – Pattern Recognition ECE 8527 – Introduction to Machine Learning and Pattern Recognition Objectives: Mixture Densities Maximum Likelihood Estimates.
Learning from the Past, Looking to the Future James R. (Jim) Beaty, PhD - NASA Langley Research Center Vehicle Analysis Branch, Systems Analysis & Concepts.
CLASSIFICATION OF ECG SIGNAL USING WAVELET ANALYSIS
Solid State Physics Lecture 15 HW 8 Due March 29 Kittel Chapter 7: 3,4,6 The free electron standing wave and the traveling wave are not eigenstates.
Digital Image Processing Lecture 8: Fourier Transform Prof. Charlene Tsai.
Chapter 4 Dynamical Behavior of Processes Homework 6 Construct an s-Function model of the interacting tank-in-series system and compare its simulation.
Chapter 4 Dynamical Behavior of Processes Homework 6 Construct an s-Function model of the interacting tank-in-series system and compare its simulation.
Boyce/DiPrima 10th ed, Ch 7.7: Fundamental Matrices Elementary Differential Equations and Boundary Value Problems, 10th edition, by William E. Boyce and.
Quantum One.
SVD: Physical Interpretation and Applications
Parallelization of Sparse Coding & Dictionary Learning
X.1 Principal component analysis
Chapter 3 Modeling in the Time Domain
Presentation transcript:

Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Why does subsequence time-series clustering produce sine waves? IBM Tokyo Research Lab. Tsuyoshi Idé

Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Page 2 Contents  What is subsequence time-series clustering?  Describing the dependence between subsequences  Reducing k-means to eigen problem  Deriving sine waves  Experiments  beat waves in k-means !  Summary

Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Page 3 What is subsequence time-series clustering (STSC)?

Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Page 4 What’s STSC : k-means clustering of subsequences generated from a time series. Cluster centers are patterns discovered.  subsequences generated by sliding window techniques  subsequences are treated as independent data objects in k-means clustering Cluster centers (centroids) (the average of cluster members) extracted pattern

Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Page 5 What’s sinusoid effect : unexpectedly, cluster centers in STSC become sinusoids. The reason is unknown.  Shocking report  Keogh-Lin-Truppel, “Clustering of time series subsequences is meaningless”, ICDM ’03  k-means STSC almost always produces sinusoid cluster centers  almost independent of the input time series  almost no relation to the original patterns concatenate to produce a long time series k-means STSC Sinusoid cluster centers ! Explaining why is an open problem We focus on explaining why. Example

Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Page 6 Describing the dependence between subsequences

Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Page 7 In reality, the subsequences are NOT independent at all. We need to describe the dependence.  subsequences generated by sliding window techniques  subsequences are treated as independent data objects in clustering Let us study how the subsequences are dependent. ?

Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Page 8 Theoretical model for time series: Think of a time series as a “state” on a periodic ring. Artificially assume the periodic boundary condition (PBC) Assign the value x l on each lattice points (sites) l. Attach the orthonormal basis to the sites. Think of the time series as an n-D vector. Whole time series

Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Page 9 Each subsequence s p is concisely expressed using the translation operator 1n Definition of the translation operator Ex. shifts e 2 with l steps “Make p steps backward and take the sites from 1 st thru w-th” w-dim vector w: window size

Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Page 10 Reducing k-means to eigen problem Easer to analyze theoretically Simple but theoretically a little difficult to handle

Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Page 11 Rewriting the objective of k-means using the indicator and the density matrix. centroid The objective function of k-means clustering “density matrix” We finally get objective to find m (j) Inserting the def of the centroid, and introducing an indicator u (j) as objective to find u (j).

Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Page 12 So what? Minimizing E is equivalent to eigen equation. Our goal is to solve it and to show the solution to be sinusoidal. Minimizing E is equivalent to the eigen equation: Let us study the sinusoid effect as ’s eigen equation. From the definition, it can be shown centroideigenvector H = …

Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Page 13 Deriving sine waves By solving the eigen equation

Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Page 14 Mathematical feature of. The expression based on implies a translational symmetry. Fourier basis will simplify the problem. By using, the rho matrix can be written as This form suggests a (pseudo-) translational symmetry of the problem. wave number Translational invariant basis would be more natural So, use the Fourier representation instead of the site representation orthogonal transform Window size Summation of shifted ones

Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Page 15 When represented in the Fourier basis, is almost diagonal. Thus, the eigenstate is almost pure sinusoid. Will be small when a f q is dominant. power of a Fourier component f q Theorem When a |f q | is dominant, the eigen state is well approximated by the sine waves with the wavelength of w/|q|, irrespective of the details of the input time series data. If we take { f q } as the basis, it follows (after straightforward calculations)

Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Page 16 Experiments

Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Page 17 Let us see the correspondence between the two formulations using standard data sets. k-means STSC spectral STSC k-means STSC 1.eigenvectors minimize the SoS objective 2.eigenvector  centroid 3.dominant F.c. governs the eigenvectors

Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Page 18 [1/2] For data with no particular periodicities, the power concentrates at the longest wavelength w. Only this peak does matter in the centroids. DFT K-means STSC centroids (k=3, w=128) Spectral STSC centroids The resulting centroids are almost independent of the tail of the spectrum.

Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Page 19 [2/2] STSC centroids become beat waves when a few neighboring |q| are dominant (k=2, w=60). Concatenate 100 instances for each to make a long time series |q| = 4, 5, 6 are dominant Resulting sine waves exhibit beat wave by interference K-means STSC centroids (k=2, w=60) Spectral STSC centroids

Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Page 20 Summary

Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Page 21 Summary  The sinusoid effect is an important open problem in data mining.  The pseudo-translational symmetry introduced by the sliding window technique is the origin of the sinusoid effect.  In particular, if there is no particular periodicities within the window size, the clustering centers will be the sine waves of wavelength of w, irrespective of the details of the data. Thank you.

Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Page 22 Appendix

Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Page 23 STSC can produce useful results IF some of the conditions of the sinusoid effect are NOT satisfied. For example, if STSC is done locally… time Cluster center at present Cluster centers in the past comparison to get the CP score A STSC-based change-point detection method (singular spectrum transformation [Ide, SDM’05])

Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Page 24 In this case, the locality of STSC leads to the loss of (pseudo) translational symmetry, resulting non-meaningless results. SST can produce useful CP detection results, which do depend on the input signal. “Break the pseudo- translational symmetry” One general rule could be…

Tokyo Research Laboratory © Copyright IBM Corporation 2006 | 2006/09/19 | PKDD 2006 Page 25 Even for the random data, centroids become sinusoid when w = n (k=2, w=60 and 6000). k=2, w=n=6000 (subseq = total) The k-means STSC introduces a mathematical artifact. It is so strong that the resulting centroids are dominated by it. Implication