Parsimonious Linear Fingerprinting for Time Series Lei Li joint work with B. Aditya Prakash, Christos Faloutsos School of Computer Science Carnegie Mellon University © L. Li, 2010 Machine Learning Lunch 2010/11/29
Motion Capture Understanding human motion Robots to assist the disabled
Network security Anomaly detection in computer network traffic BGP updates in network
DataStream monitoring Monitoring a datacenter with 5000 servers: 1TB data per day, 55 million streams ([Reeves+ 2009]) Temperature in datacenter
Find similar motions SELECT * FROM WHERE data LIKE
Motivation Answering similarity queries in Time Series Databases © L. Li, SELECT * FROM TSDB WHERE data LIKE “ ” TSDB
Database + Machine learning “Databased Learning ” Statistical effective Deeper pattern/functional relation (regression, Bayesian network, SVM/kernels, Clustering) Efficient Scalable (indexing, hashing, query optimization, Buffering/caching) Research Philosophy
Database + Machine learning “Databased Learning ” Example find similar motions/time series
Beyond similarity queries 9 Featureextraction Query & Indexing Similarity function, clustering/classification Visualization Forecasting Compression
CMU SCS Outline Motivation Proposed Method: Intuition & Example Experiments & Results PLiF: Insight Details Conclusion © L. Li,
CMU SCS Intuition: Goals © L. Li, Good features/similarity function G1 Good compression G2 Ability to forecast G3 Scalability G4
CMU SCS Intuition: Goals © L. Li, Good features/similarity function G1 Good compression G2 Ability to forecast G3 Scalability G4 (1a) lag independent (1b) frequency proximity (1c) grouping harmonics
CMU SCS 14 Example: synthetic signals Equations (a)sin(2πt/100) (b)cos(2πt/100) (c)sin(2πt/98 + π/6) (d) sin(2πt/110) + 0.2sin(2πt/30) (e) cos(2πt/110) + 0.2sin(2πt/30 + π/4) © L. Li, 2010
CMU SCS 15 Intuition (1a) Equations (a)sin(2πt/100) (b)cos(2πt/100) (c)sin(2πt/98 + π/6) (d) sin(2πt/110) + 0.2sin(2πt/30) (e) cos(2πt/110) + 0.2sin(2πt/30 + π/4) Time shift © L. Li, 2010 e.g. left-foot-start walking v.s. right-foot-start walking
CMU SCS 16 Intuition (1b) Equations (a)sin(2πt/100) (b)cos(2πt/100) (c)sin(2πt/98 + π/6) (d) sin(2πt/110) + 0.2sin(2πt/30) (e) cos(2πt/110) + 0.2sin(2πt/30 + π/4) nearby frequency & time shift © L. Li, 2010 e.g. running v.s. fast running
CMU SCS 17 Intuition (1c) Equations (a)sin(2πt/100) (b)cos(2πt/100) (c)sin(2πt/98 + π/6) (d) sin(2πt/110) + 0.2sin(2πt/30) (e) cos(2πt/110) + 0.2sin(2πt/30 + π/4) groups of harmonics © L. Li, 2010 ~ human voices
CMU SCS 18 Q: only two numbers to represent each! Proposed PLiF - + © L. Li,
CMU SCS 19 Intuition: how it works © L. Li, find hidden variable/pattern f=1/100 f=1/110 f=1/30 HV1HV2HV3
CMU SCS 20 Intuition: how it works © L. Li, 2010 find hidden variable/pattern f=1/110 f=1/30 HV2HV3 Co-occur HV2’ = HV2 HV3
CMU SCS 21© L. Li, 2010 HV1HV2’
CMU SCS 22© L. Li, 2010 HV HV2’
CMU SCS 23 Why it works? / How to interpret? Proposed PLiF - + harmonics.1 /100 Group of harmonics 1/110 & 1/30 © L. Li, 2010
CMU SCS 24 Basic Idea pattern/harmonics 1/100 pattern/harmonics 1/110 & 1/30 “walking” “running” projection to harmonics (aka. frequency) © L. Li, 2010
CMU SCS 25 Why not SVD/PCA? PCA PLiF - + no clear grouping Confused! © L. Li, 2010
CMU SCS Outline Motivation Proposed Method: Intuition & Example Experiments & Results PLiF: Insight Details Conclusion © L. Li,
CMU SCS Experiment: Goals to Verify © L. Li, Good features (low dimensional) G1 Good compression G2 Ability to forecast G3 Scalability G4
CMU SCS Experiments Datasets: © L. Li, BGP: 10 * 103kChlorine:166 * 4k Mocap 49 *
CMU SCS Result – Visualization © L. Li, Mocap PLiF first two “fingerprints” With PLiF, now able to visualize very high dimensional time sequences
CMU SCS Result – Clustering Pred.walkrun © L. Li, Mocap PLiF first two “fingerprints” walking running PLiF + thresholding Pred.walkrun PCA + kmeans Accuracy = 46/49 Accuracy = 25/49
CMU SCS Result – Clustering © L. Li, BGP data: PLiF + hierarchical clustering
CMU SCS Intuition: Goals © L. Li, Good features/similarity function G1 Good compression G2 Ability to forecast G3 Scalability G4
CMU SCS Result - Compression © L. Li, Chlorine 166 * 4k Storing only the PLiF features & sampling of hidden variables Ideal compression ratio error
CMU SCS Result - Compression © L. Li, Mocap: 93 * 300 Storing only the PLiF features & sampling of hidden variables Ideal compression ratio error
CMU SCS Intuition: Goals © L. Li, Good features/similarity function G1 Good compression G2 Ability to forecast G3 Scalability G4 later
CMU SCS Scalability © L. Li, Linear ~ sequence length sequence length wall clock time (s) sequence length wall clock time (s)
CMU SCS Scalability Optimized algorithm Details later © L. Li, PLiF-basic PLiF wall clock time SLOPE=1/3
CMU SCS Intuition: Goals © L. Li, Good features/similarity function G1 Good compression G2 Ability to forecast G3 Scalability G4 later
CMU SCS Outline Motivation Proposed Method: Intuition & Example Experiments & Results PLiF: Insight Details Conclusion © L. Li,
CMU SCS Proposed Method: PLiF © L. Li, S4 S1 S2 S3 Learning Dynamics Finding Canonical Form Handling the Lag Grouping Harmonics
CMU SCS Step 1. Learning Dynamics Use machine learning to find: –“Transition” of Hidden Variables (HV): one time- tick to other –“Mixing” weights: HVs observed data © L. Li, Time series of hidden variables
CMU SCS 42 Underlying Model: Linear Dynamical Systems Details © L. Li, 2010
CMU SCS 43 Linear Dynamical Systems: parameters Details © L. Li, 2010 namemeaning & example µ0µ0 initial state for hidden variable e.g. initial position, velocity & acceleration Atransition matrixhow the states move forward, e.g. soccer flying in the air Ctransmission/ projection/ output matrix hidden state observation, e.g. camera taking picture of the soccer Q0Q0 Initial covariance Qtransition covariancehow precision is the soccer motion Rtransmission/ projection covariance i.e. observation noise; e.g. how accurate is the camera
CMU SCS Dynamics/Transition in Hidden Variables © L. Li, HV(t+1) transition matrix HV(t) - enables forecasting
CMU SCS Mixing Weights © L. Li, mixing/output matrix C - +
CMU SCS 46 Learning the Parameters Expectation-Maximization maximizing the expected log likelihood: Details © L. Li, 2010 Standard EM: expensive! Further speed optimization in our PLiF: matrix inversion using Woodbury matrix identity
CMU SCS Step 2: Canonicalization But, hidden variables –hard to interpret –non-unique: many combinations are essentially the same Intuition: –To make hidden variables compact and “uniquely” identified © L. Li,
CMU SCS Canonicalization adds Interpretability © L. Li, Time series of HV after canonicalization (real part) frequency scaling (subtle) “Harmonics” HV before f=1/110 f=1/100 f=1/30
CMU SCS Step 2: Canonicalization Again, Estimating how each signal is composed of “harmonics”/patterns but, in complex space © L. Li, Mixing matrix (complex valued)
CMU SCS Step 3:Handling Lag Intuition: –Groups emerge.. –reducing redundancy –eliminating phase shift © L. Li, Conjugate! Mixing matrix (complex valued)
CMU SCS Step 3:Handling Lag Idea: –only magnitude counts –removing duplicates © L. Li,
CMU SCS Step 3:Handling Lag interpretability © L. Li, harmonics.1/100 harmonics 1/110 harmonics 1/30
CMU SCS Step 4:Grouping Harmonics Intuition: –Still a little redundancy © L. Li, harmonics.1/100 harmonics 1/110 harmonics 1/30 Think Minimum Description Length
CMU SCS Step 4: Grouping Harmonics © L. Li, Dimensional Reduction - + SVD/PCA U,S,V min |X-U*S*V T | 2
CMU SCS Step 4: Grouping Harmonics © L. Li, Group of harmonics 1/110 & 1/30 harmonics.1/100
CMU SCS Parsimonious Linear Fingerprinting Goals steps © L. Li, Good features/similarity function G1 Good compression G2 Ability to forecast G3 Scalability G4 (1a) lag independent (1b) frequency proximity (1c) grouping harmonics S4 S1 S2 S3 Learning Dynamics Canonical Form Handling Lag Grouping Harmonics PLiF alg. steps PLiF Goals
CMU SCS Outline Motivation Proposed Method: Intuition & Example Experiments & Results PLiF: Insight Details Conclusion © L. Li,
CMU SCS Conclusion Need for finding compact representation of time series data Intuition & Insights of PLiF Interpretation of PLiF & How it works Experiments on a diverse set of data –It really works! –It is fast & scalable. © L. Li,
CMU SCS Take away message Need to find Good feature for time series: Similarity func., Compression, Forecasting Design the method meets TS characteristics –e.g. Phase shift/lag correlation When to use PLiF –near periodic & relatively smooth signals © L. Li,
CMU SCS References Lei Li, B. Aditya Prakash, Christos Faloutsos. Parsimonious Linear Fingerprinting for Time Series. VLDB Lei Li, Jim McCann, Christos Faloutsos, Nancy Pollard. DynaMMo: Mining and Summarization of Coevolving Sequences with Missing Values. ACM KDD © L. Li,
CMU SCS Question? Thanks! © L. Li, Christos F aloutsos Lei Li B. Aditya P rakash
CMU SCS BACKUP appendix © L. Li,
CMU SCS 63 Why not Fourier (DFT)? 1. FT cannot do forecasting © L. Li, 2010
CMU SCS 64 Why not Fourier (DFT)? 1. FT cannot do forecasting © L. Li, 2010
CMU SCS 65 Why not Fourier (DFT)? FT spectrum 1. FT cannot do forecasting 2. No arbitrary frequency true freq. frequency © L. Li, 2010
CMU SCS 66 Why not Fourier (DFT)? 1. FT cannot do forecasting 2. No arbitrary frequency 3. nearby frequency treated differently, not suited for across signals freq.=5 freq.=5.1 © L. Li, 2010
CMU SCS Handling Missing Values Lei Li, Jim McCann, Nancy Pollard, Christos Faloutsos. BoLeRO: A Principled Technique for Including Bone Length Constraints in Motion Capture Occlusion Filling, ACM SIGGRAPH / Eurographics Symposium on Computer Animation, Lei Li, Jim McCann, Christos Faloutsos, Nancy Pollard. DynaMMo: Mining and Summarization of Coevolving Sequences with Missing Values. ACM KDD Lei Li, Jim McCann, Christos Faloutsos, Nancy Pollard. Laziness is a virtue: Motion stitching using effort minimization. Eurographics 2008, © L. Li,
CMU SCS Details for Implementation © L. Li, Read this only if you want to implement it
CMU SCS 69 Modelling the data: Linear Dynamical Systems Details © L. Li, 2010
CMU SCS 70 Linear Dynamical Systems: parameters namemeaning & example µ0µ0 initial state for hidden variable e.g. initial position, velocity & acceleration Atransition matrixhow the states move forward, e.g. soccer flying in the air Ctransmission/ projection/ output matrix hidden state observation, e.g. camera taking picture of the soccer Q0Q0 Initial covariance Qtransition covariancehow precision is the soccer motion Rtransmission/ projection covariance i.e. observation noise; e.g. how accurate is the camera Details © L. Li, 2010
CMU SCS 71 Learning the Dynamics Expectation-Maximization maximizing the expected log likelihood Details © L. Li, 2010
CMU SCS 72 Finding Canonical Form Intuition: find the canonical dynamics taking eigenvalue decomposition of the transition matrix A compensate C with C h is a projection of the data to the dynamics but... Details © L. Li, 2010