Parsimonious Linear Fingerprinting for Time Series. Lei Li, joint work with B. Aditya Prakash and Christos Faloutsos. School of Computer Science, Carnegie Mellon University. Machine Learning Lunch, 2010/11/29. © L. Li, 2010.

Motion capture: understanding human motion; robots to assist the disabled.

Network security: anomaly detection in computer network traffic; BGP updates in the network.

Data stream monitoring: monitoring a datacenter with 5,000 servers means 1 TB of data per day and 55 million streams ([Reeves+ 2009]); e.g., temperature in the datacenter.

Find similar motions: SELECT * FROM … WHERE data LIKE …

Motivation: answering similarity queries in Time Series Databases (TSDB), e.g., SELECT * FROM TSDB WHERE data LIKE "…".

Research philosophy: Database + Machine Learning = "Databased Learning". Statistically effective: deeper patterns / functional relations (regression, Bayesian networks, SVM/kernels, clustering). Efficient and scalable: indexing, hashing, query optimization, buffering/caching.

Database + Machine Learning ("Databased Learning"). Example: find similar motions / time series.

Beyond similarity queries: feature extraction; querying & indexing; similarity functions, clustering/classification; visualization; forecasting; compression.

Outline: Motivation; Proposed Method: Intuition & Example; Experiments & Results; PLiF: Insights & Details; Conclusion.

Intuition: Goals. G1: good features / similarity function; G2: good compression; G3: ability to forecast; G4: scalability.

Intuition: Goals. G1: good features / similarity function, with (1a) lag independence, (1b) frequency proximity, (1c) grouping of harmonics; G2: good compression; G3: ability to forecast; G4: scalability.

Example: five synthetic signals. Equations: (a) sin(2πt/100); (b) cos(2πt/100); (c) sin(2πt/98 + π/6); (d) sin(2πt/110) + 0.2 sin(2πt/30); (e) cos(2πt/110) + 0.2 sin(2πt/30 + π/4).
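
Below is a minimal sketch (NumPy assumed; the sequence length n = 500 is an assumption, not from the slides) that generates the five synthetic signals above as a 5 x n matrix, one row per sequence:

    import numpy as np

    n = 500                      # assumed number of time ticks
    t = np.arange(n)
    X = np.vstack([
        np.sin(2 * np.pi * t / 100),                                                 # (a)
        np.cos(2 * np.pi * t / 100),                                                 # (b)
        np.sin(2 * np.pi * t / 98 + np.pi / 6),                                      # (c)
        np.sin(2 * np.pi * t / 110) + 0.2 * np.sin(2 * np.pi * t / 30),              # (d)
        np.cos(2 * np.pi * t / 110) + 0.2 * np.sin(2 * np.pi * t / 30 + np.pi / 4),  # (e)
    ])                           # X: 5 sequences x n ticks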

Intuition (1a): time shift. Among the example equations, (a) sin(2πt/100) and (b) cos(2πt/100) are the same signal up to a time shift, e.g., left-foot-start walking vs. right-foot-start walking.

Intuition (1b): nearby frequency & time shift. Signal (c) sin(2πt/98 + π/6) has a frequency close to (a)'s plus a phase shift, e.g., running vs. fast running.

Intuition (1c): groups of harmonics. Signals (d) and (e) each mix the 1/110 and 1/30 frequencies, which co-occur, much like the groups of harmonics in human voices.

Proposed PLiF: only two numbers to represent each signal!

Intuition: how it works. Find hidden variables/patterns: HV1 (f = 1/100), HV2 (f = 1/110), HV3 (f = 1/30).

Intuition: how it works. HV2 (f = 1/110) and HV3 (f = 1/30) co-occur, so group them: HV2' = {HV2, HV3}.

(Figures: the resulting hidden variables HV1 and HV2' for the example signals.)

Why it works / how to interpret: the two PLiF fingerprint dimensions correspond to the harmonic at 1/100 and to the group of harmonics at 1/110 & 1/30.

Basic idea: project each sequence onto its harmonic patterns (i.e., frequencies), e.g., the 1/100 harmonic vs. the group of 1/110 & 1/30, which separates "walking" from "running".

Why not SVD/PCA? The PCA projection shows no clear grouping of the signals (confused!), whereas PLiF separates them.

Outline: Motivation; Proposed Method: Intuition & Example; Experiments & Results; PLiF: Insights & Details; Conclusion.

Experiments: goals to verify. G1: good features (low dimensional); G2: good compression; G3: ability to forecast; G4: scalability.

Experiments, datasets: BGP (10 × 103k), Chlorine (166 × 4k), Mocap (49 × …).

Result - Visualization (Mocap): plotting the first two PLiF "fingerprints". With PLiF, we can now visualize very high-dimensional time sequences.

Result - Clustering (Mocap, walking vs. running), using the first two PLiF "fingerprints": PLiF + thresholding reaches accuracy 46/49, versus 25/49 for PCA + k-means (confusion tables: predicted walk/run).

Result - Clustering (BGP data): PLiF + hierarchical clustering.

Goals (revisited). G1: good features / similarity function; G2: good compression; G3: ability to forecast; G4: scalability.

Result - Compression (Chlorine, 166 × 4k): store only the PLiF features & a sampling of the hidden variables. (Plot: error vs. compression ratio, with the ideal curve for reference.)

Result - Compression (Mocap, 93 × 300): store only the PLiF features & a sampling of the hidden variables. (Plot: error vs. compression ratio, with the ideal curve for reference.)

Goals (revisited). G1: good features / similarity function; G2: good compression; G3: ability to forecast (later); G4: scalability.

Scalability: wall-clock time grows linearly with the sequence length. (Plots: wall clock time (s) vs. sequence length.)

Scalability of the optimized algorithm (details later). (Plot: wall clock time of PLiF-basic vs. PLiF, annotated SLOPE = 1/3.)

Goals (revisited). G1: good features / similarity function; G2: good compression; G3: ability to forecast (later); G4: scalability.

Outline: Motivation; Proposed Method: Intuition & Example; Experiments & Results; PLiF: Insights & Details; Conclusion.

Proposed Method: PLiF. Steps: S1 Learning Dynamics; S2 Finding Canonical Form; S3 Handling the Lag; S4 Grouping Harmonics.

Step 1. Learning Dynamics. Use machine learning to find: the "transition" of the hidden variables (HVs) from one time tick to the next, and the "mixing" weights that map the HVs to the observed data. (Plot: time series of the hidden variables.)

Underlying Model: Linear Dynamical Systems (details).

Linear Dynamical Systems: parameters (details).
µ0: initial state of the hidden variable, e.g., initial position, velocity & acceleration.
A: transition matrix; how the states move forward, e.g., a soccer ball flying in the air.
C: transmission / projection / output matrix; hidden state → observation, e.g., a camera taking a picture of the ball.
Q0: initial covariance.
Q: transition covariance; how precise the ball's motion is.
R: transmission / projection covariance, i.e., observation noise; e.g., how accurate the camera is.
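
The model equations on this slide appeared only as a figure; with the parameters in the table above, the standard Linear Dynamical System can be written as follows (a reconstruction of the usual formulation, not copied from the slide):

    z_1 = \mu_0 + w_0,        w_0 \sim \mathcal{N}(0, Q_0)
    z_{t+1} = A z_t + w_t,    w_t \sim \mathcal{N}(0, Q)
    x_t = C z_t + v_t,        v_t \sim \mathcal{N}(0, R)

Here z_t is the hidden state at time t and x_t is the observed value of the sequences at time t.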

Dynamics / transition in the hidden variables: HV(t+1) = (transition matrix) · HV(t); this is what enables forecasting.

Mixing weights: the mixing/output matrix C maps the hidden variables to the observed sequences.

Learning the Parameters (details): Expectation-Maximization, maximizing the expected log likelihood. Standard EM is expensive! Further speed optimization in PLiF: matrix inversion via the Woodbury matrix identity.
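
For reference, the Woodbury matrix identity mentioned above is the standard one below (how exactly PLiF plugs it into the Kalman filtering/smoothing covariance updates is not spelled out on this slide; A, C, U, V here are generic matrices, not the LDS parameters):

    (A + U C V)^{-1} = A^{-1} - A^{-1} U \left( C^{-1} + V A^{-1} U \right)^{-1} V A^{-1}

It lets one expensive inversion of a large matrix be replaced by inversions of much smaller matrices when the update has low-rank structure.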

Step 2: Canonicalization. But the hidden variables are hard to interpret and non-unique: many combinations are essentially the same. Intuition: make the hidden variables compact and "uniquely" identified.

Canonicalization adds interpretability. (Plot: time series of the HVs before, and after, canonicalization (real part); the canonical HVs are "harmonics" at f = 1/110, 1/100, and 1/30, up to a subtle frequency scaling.)

Step 2: Canonicalization. Again, estimate how each signal is composed of "harmonics"/patterns, but now in complex space: the mixing matrix is complex valued.

Step 3: Handling Lag. Intuition: groups emerge, reducing redundancy and eliminating the phase shift; the complex-valued mixing matrix contains conjugate pairs.

Step 3: Handling Lag. Idea: only the magnitude counts; remove the (conjugate) duplicates.
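
A minimal sketch of this idea (assumed NumPy code with hypothetical names; not the paper's exact procedure): take magnitudes of the complex mixing matrix so the phase, i.e. the lag, disappears, and keep only one column of each conjugate pair.

    import numpy as np

    def handle_lag(C_h, eigvals):
        """C_h: m x k complex mixing matrix; eigvals: the k eigenvalues of A.
        Returns a real m x k' magnitude matrix with one column per conjugate pair."""
        mags = np.abs(C_h)                  # magnitude only: the phase (time shift) is dropped
        keep = [i for i, lam in enumerate(eigvals)
                if lam.imag >= 0]           # keep one member of each conjugate pair
        return mags[:, keep]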

Step 3: Handling Lag, interpretability: the remaining components correspond to the harmonics at 1/100, 1/110, and 1/30.

Step 4: Grouping Harmonics. Intuition: there is still a little redundancy among the harmonics at 1/100, 1/110, and 1/30; think Minimum Description Length.

Step 4: Grouping Harmonics, via dimensionality reduction (SVD/PCA): find U, S, V minimizing ||X - U S V^T||^2.
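
A sketch of the grouping step under the objective above (an assumption: run a low-rank SVD on the lag-free magnitude matrix F from the previous step and keep the leading directions as the fingerprints):

    import numpy as np

    def group_harmonics(F, r):
        """F: m x k' matrix of per-harmonic magnitudes; r: fingerprint dimension."""
        U, S, Vt = np.linalg.svd(F, full_matrices=False)  # minimizes ||F - U S V^T||^2 at rank r
        return U[:, :r] * S[:r]                           # r numbers ("fingerprints") per sequence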

Step 4: Grouping Harmonics, result: the harmonic at 1/100, and the group of harmonics at 1/110 & 1/30.

Parsimonious Linear Fingerprinting: goals → steps. PLiF goals: G1 good features / similarity function ((1a) lag independent, (1b) frequency proximity, (1c) grouping harmonics); G2 good compression; G3 ability to forecast; G4 scalability. PLiF algorithm steps: S1 Learning Dynamics; S2 Canonical Form; S3 Handling Lag; S4 Grouping Harmonics.

Outline: Motivation; Proposed Method: Intuition & Example; Experiments & Results; PLiF: Insights & Details; Conclusion.

Conclusion: the need for compact representations of time series data; the intuition & insights of PLiF; the interpretation of PLiF & how it works; experiments on a diverse set of data: it really works, and it is fast & scalable.

Take-away message: we need good features for time series (for similarity functions, compression, and forecasting); design the method to match the characteristics of time series, e.g., phase shift / lag correlation. When to use PLiF: near-periodic & relatively smooth signals.

References:
Lei Li, B. Aditya Prakash, Christos Faloutsos. Parsimonious Linear Fingerprinting for Time Series. VLDB 2010.
Lei Li, Jim McCann, Christos Faloutsos, Nancy Pollard. DynaMMo: Mining and Summarization of Coevolving Sequences with Missing Values. ACM KDD 2009.

Questions? Thanks! Lei Li, B. Aditya Prakash, Christos Faloutsos.

BACKUP / Appendix.

Why not Fourier (DFT)? 1. FT cannot do forecasting.

Why not Fourier (DFT)? 1. FT cannot do forecasting. 2. No arbitrary frequencies. (Plot: FT spectrum vs. frequency, with the true frequency falling between the DFT bins.)

Why not Fourier (DFT)? 1. FT cannot do forecasting. 2. No arbitrary frequencies. 3. Nearby frequencies (e.g., freq. = 5 vs. freq. = 5.1) are treated as different coefficients, so DFT is not suited for comparisons across signals.
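
A tiny illustration of points 2 and 3 (an assumed NumPy snippet, not from the slides): two signals with nearby frequencies put their energy into different DFT bins, so comparing raw spectra makes them look dissimilar.

    import numpy as np

    n = 1000
    t = np.arange(n)
    a = np.sin(2 * np.pi * 5.0 * t / n)   # exactly 5 cycles over the window
    b = np.sin(2 * np.pi * 5.1 * t / n)   # 5.1 cycles: falls between DFT bins
    Sa, Sb = np.abs(np.fft.rfft(a)), np.abs(np.fft.rfft(b))
    # Sa concentrates in bin 5; Sb leaks across bins 5 and 6, so a naive
    # coefficient-by-coefficient comparison treats a and b as quite different.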

Handling Missing Values:
Lei Li, Jim McCann, Nancy Pollard, Christos Faloutsos. BoLeRO: A Principled Technique for Including Bone Length Constraints in Motion Capture Occlusion Filling. ACM SIGGRAPH / Eurographics Symposium on Computer Animation.
Lei Li, Jim McCann, Christos Faloutsos, Nancy Pollard. DynaMMo: Mining and Summarization of Coevolving Sequences with Missing Values. ACM KDD 2009.
Lei Li, Jim McCann, Christos Faloutsos, Nancy Pollard. Laziness is a virtue: Motion stitching using effort minimization. Eurographics 2008.

Details for Implementation (read this only if you want to implement it).

Modelling the data: Linear Dynamical Systems (details).

Linear Dynamical Systems: parameters (details).
µ0: initial state of the hidden variable, e.g., initial position, velocity & acceleration.
A: transition matrix; how the states move forward, e.g., a soccer ball flying in the air.
C: transmission / projection / output matrix; hidden state → observation, e.g., a camera taking a picture of the ball.
Q0: initial covariance.
Q: transition covariance; how precise the ball's motion is.
R: transmission / projection covariance, i.e., observation noise; e.g., how accurate the camera is.

Learning the Dynamics: Expectation-Maximization, maximizing the expected log likelihood (details).

Finding Canonical Form (details). Intuition: find the canonical dynamics by taking the eigenvalue decomposition of the transition matrix A, and compensate C accordingly; the compensated matrix C_h is a projection of the data onto the dynamics, but...
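
A minimal implementation sketch of this step (assumed NumPy code; the paper's exact scaling and ordering conventions for the eigenvectors are not reproduced here):

    import numpy as np

    def canonicalize(A, C):
        """Diagonalize the learned dynamics: A = U diag(lam) U^{-1}.
        In the canonical coordinates the hidden variables evolve as independent
        (complex) harmonics, and C_h maps them back to the observed sequences."""
        lam, U = np.linalg.eig(A)   # complex eigenvalues ~ frequencies / decay rates
        C_h = C @ U                 # compensate the mixing matrix: x = C z = (C U) h
        return lam, C_h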