Facets: Fast Comprehensive Mining of Coevolving High-order Time Series
Joint work by: Hanghang Tong, Ping Ji, Yongjie Cai, Wei Fan, Qing He
Presenter: Wei Fan

Arizona State University
Ubiquitous Coevolving* Time Series
a) Room condition monitoring in a smart building
b) Intelligent transportation systems
c) Stock market
d) Climate monitoring
*a.k.a. multivariate in statistics

Challenges
C1. High-order
C2. Contextual constraints
C3. Temporal smoothness

Challenges: C1. High-order
Multiple sources, multiple types, e.g.,
– sensor, humidity, temperature, light, …
– vehicle, trace location (x, y), speed, …
– stock, max price, min price, volume, …
– latitude, longitude, temperature, wind, …
[Figure: a sensor × type × time tensor (types: humidity, light, temperature, voltage), and a sequence of latitude × longitude × metrics tensors for t = 1, 2, …, T]

Challenges: C2. Contextual constraints
The time series are interconnected with each other through their embedded network.
[Figure: (a) a simplified sensor network; (b) measured temperature time series. Tensor axes: sensor, type (humidity, light, temperature, voltage), time]

Challenges: C2. Contextual constraints (cont.)
The time series are interconnected with each other through their embedded network.
[Figure: (a) network of types (humidity, light, voltage, temperature); (b) time series of room conditions. Tensor axes: sensor, type, time]

Challenges: C3. Temporal smoothness
Intuition: || X_{t+1} - X_t || is expected to be small
[Figure: a time series annotated with "correlated adjacent values" and "anomaly"]

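The smoothness intuition can be checked numerically. The following is a minimal sketch, not part of the talk: the function names, the toy readings, and the threshold are all assumptions chosen for illustration. It computes the Euclidean norm of each consecutive difference and flags the steps where that norm is large.

```python
import math

def step_norms(series):
    """Return the Euclidean norm ||x_{t+1} - x_t|| for consecutive snapshots."""
    norms = []
    for prev, curr in zip(series, series[1:]):
        norms.append(math.sqrt(sum((c - p) ** 2 for p, c in zip(prev, curr))))
    return norms

def flag_jumps(series, threshold):
    """Return each index t+1 whose jump from snapshot t exceeds the threshold."""
    return [t + 1 for t, d in enumerate(step_norms(series)) if d > threshold]

# Hypothetical data: two sensors, smooth except for a spike at t = 3.
readings = [[20.0, 21.0], [20.1, 21.1], [20.2, 21.0], [26.0, 27.0], [20.3, 21.1]]
print(step_norms(readings))
print(flag_jumps(readings, threshold=1.0))
```

A single spike produces two large steps (entering and leaving it), so both the spike index and the following index are flagged.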
Singular Value Decomposition (SVD)
Coevolving time series → matrix representation X
[Figure: traffic volume over time for series TS 1 to TS 7, with morning and afternoon rush hours marked; the series are stacked as the rows of matrix X]

SVD (cont.)
Singular vectors for correlation detection: X ≈ U × Σ × Z
[Figure: patterns P1 and P2 over TS 1 to TS 7, with the strength of P1 and the strength of P2; MR: morning rush hours, AR: afternoon rush hours; pattern labels "AR" and "MR + AR"]
Limitations: SVD does not address
– C1. High-order
– C2. Contextual information
– C3. Temporal smoothness

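A small numerical sketch of the SVD slide, under stated assumptions: the seven series, the two Gaussian-shaped "rush hour" patterns, and the noise level are synthetic and invented here, not taken from the talk. It stacks the series as rows of X, takes a rank-2 truncated SVD with `numpy.linalg.svd`, and reports how well two patterns explain the data.

```python
import numpy as np

rng = np.random.default_rng(0)
hours = np.arange(24)

# Two assumed latent patterns: a morning bump near 8h and an afternoon bump near 18h.
morning = np.exp(-0.5 * ((hours - 8) / 1.5) ** 2)
afternoon = np.exp(-0.5 * ((hours - 18) / 1.5) ** 2)
patterns = np.vstack([morning, afternoon])           # shape (2, 24)

# Seven series, each a non-negative mix of the two patterns plus small noise.
weights = rng.uniform(0.0, 1.0, size=(7, 2))
X = weights @ patterns + 0.01 * rng.standard_normal((7, 24))

# Thin SVD: X = U @ diag(s) @ Zt, singular values in descending order.
U, s, Zt = np.linalg.svd(X, full_matrices=False)
k = 2
X_hat = U[:, :k] @ np.diag(s[:k]) @ Zt[:k, :]        # rank-2 approximation

rel_err = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
print("leading singular values:", np.round(s[:4], 3))
print("relative error of rank-2 fit:", round(float(rel_err), 4))
```

Because singular vectors are orthogonal, the recovered rows of `Zt` are in general mixtures of the two bumps rather than the bumps themselves, which may be why one pattern on the slide is labeled "MR + AR".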
Related Work
Tensor decomposition
– Tucker, CANDECOMP/PARAFAC, HOSVD, [Sun2006], [Xiong2010], …
Low-rank matrix factorization
– SVD, PCA, [Mnih2007], [Ma2008], [Yao2014], …
– Dynamic matrix factorization with temporal factor: [Chua2013], [Sun2012], [Li2009], [Cai2015], …
Time series mining
– [Shieh2008], [Matsubara2014], …
Limitation: none of these tackles all three challenges comprehensively.

Outline
– Motivation
– Facets: address all the three challenges
– Experiments
– Conclusion

C1. High-order: Tucker Decomposition (#1)
M-order time series tensors: [equation]
Define: [equation]
[Figure: a time series tensor of size N1 × N2 × N3 expressed through a latent factor of size L1 × L2 × L3 and one coefficient matrix per mode]
Intuition: grouping effect on each mode

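The Tucker form in the figure can be sketched for a 3-way tensor as follows. This is an illustration under assumptions: the sizes, the random core and factor matrices, and the function names are hypothetical, and the slide's own equations are not reproduced. It builds an N1 × N2 × N3 tensor from an L1 × L2 × L3 latent factor and one coefficient matrix per mode, once with a single `einsum` and once as a chain of mode products.

```python
import numpy as np

def tucker_reconstruct(core, factors):
    """Multiply a 3-way core by one factor matrix per mode.

    core:    shape (L1, L2, L3)
    factors: [U1, U2, U3] with U_m of shape (N_m, L_m)
    returns: tensor of shape (N1, N2, N3)
    """
    U1, U2, U3 = factors
    return np.einsum("abc,ia,jb,kc->ijk", core, U1, U2, U3)

def mode_product(tensor, matrix, mode):
    """Mode product: contract matrix (N_m x L_m) with axis `mode` of tensor."""
    moved = np.moveaxis(tensor, mode, 0)             # bring the mode to the front
    out = np.tensordot(matrix, moved, axes=(1, 0))   # shape (N_m, ...rest)
    return np.moveaxis(out, 0, mode)                 # put the mode back in place

rng = np.random.default_rng(0)
core = rng.standard_normal((2, 3, 2))                # latent factor, L = (2, 3, 2)
factors = [rng.standard_normal((n, l)) for n, l in zip((5, 4, 6), (2, 3, 2))]

X = tucker_reconstruct(core, factors)
print(X.shape)

# The same tensor, built one mode at a time.
step = core
for m, U in enumerate(factors):
    step = mode_product(step, U, m)
print(np.allclose(X, step))
```

Each row of a coefficient matrix says how one index of that mode (one sensor, one type, one location) mixes the latent groups, which is one way to read the "grouping effect on each mode".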
C2. Contextual Constraints (#2)
Embedded contextual networks; for each [equation]:
contextual network ≈ coefficient matrix × contextual latent factor
Intuition: well-connected nodes are more likely to share similar low-rank factors

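The intuition on this slide can be illustrated with a toy network. The sketch below is an assumption-laden example, not the Facets update rule: the 6-node network `S`, the rank `L = 2`, the use of a truncated SVD as the solver, and the name `contextual_loss` are all chosen here for illustration. It factorizes S ≈ U × V and shows that nodes in the same well-connected group get the same row of U.

```python
import numpy as np

# Assumed toy contextual network: two groups of three mutually connected nodes.
# The diagonal is set to 1 as well, purely to keep the example exactly rank 2.
S = np.zeros((6, 6))
S[:3, :3] = 1.0
S[3:, 3:] = 1.0

# Rank-2 factorization S ≈ U @ V via truncated SVD (one of many possible solvers).
P, s, Qt = np.linalg.svd(S)
L = 2
U = P[:, :L] * np.sqrt(s[:L])             # coefficient matrix, shape (6, 2)
V = np.sqrt(s[:L])[:, None] * Qt[:L]      # contextual latent factor, shape (2, 6)

def contextual_loss(S, U, V):
    """Squared Frobenius error ||S - U V||_F^2 of the network factorization."""
    return float(np.sum((S - U @ V) ** 2))

print(np.round(U, 3))
print("loss:", round(contextual_loss(S, U, V), 6))
print("same group, same row:", np.allclose(U[0], U[1]), np.allclose(U[3], U[4]))
```

In a joint model, a loss of this kind would typically be added to the time series fit with a weight, so that the shared coefficient matrix is pulled toward the network structure.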
Address C1 and C2
[Figure: time series tensor (N1 × N2 × N3) and latent factor (L1 × L2 × L3)]

C3. Temporal Smoothness (#3)
Define [equation]: multilinear [...] to [...]
Intuition: successive observations are highly correlated.

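One way to picture a multilinear temporal model is sketched below. It rests on an assumption the slide does not spell out: that the latent factor at time t+1 is obtained from the one at time t by applying one transition matrix per mode (named `B` here, echoing the B(m) that appears on the optimization slide), plus optional noise. The sizes, the near-identity transitions, and the function names are hypothetical.

```python
import numpy as np

def multilinear(Z, mats):
    """Apply one matrix per mode of a 3-way tensor: Z x_1 A1 x_2 A2 x_3 A3."""
    A1, A2, A3 = mats
    return np.einsum("abc,ia,jb,kc->ijk", Z, A1, A2, A3)

def simulate(Z1, B, steps, noise=0.0, seed=0):
    """Roll the latent tensor forward `steps - 1` times from the start Z1."""
    rng = np.random.default_rng(seed)
    Zs = [Z1]
    for _ in range(steps - 1):
        nxt = multilinear(Zs[-1], B) + noise * rng.standard_normal(Z1.shape)
        Zs.append(nxt)
    return Zs

rng = np.random.default_rng(1)
dims = (2, 3, 2)                                    # assumed latent sizes L1, L2, L3
Z1 = rng.standard_normal(dims)
# Near-identity transitions, so successive latent tensors stay close.
B = [np.eye(n) + 0.01 * rng.standard_normal((n, n)) for n in dims]

Zs = simulate(Z1, B, steps=5)
rel_change = [np.linalg.norm(b - a) / np.linalg.norm(a) for a, b in zip(Zs, Zs[1:])]
print(np.round(rel_change, 4))
```

With transitions close to the identity, the relative change between successive latent tensors stays small, which matches the stated intuition that successive observations are highly correlated.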
Put It All Together: Facets
#1 high-order time series
#2 contextual constraints
#3 temporal smoothness
[Figure: the full model, with its components labeled #1, #2, and #3]

Proposed Optimization Algorithm
Key idea: EM algorithm
Infer latent factors [...] and [...]
– Vectorization and matricization
– Forward and backward algorithms
[Figure: "vectorize": an N1 × N2 × N3 tensor becomes a vector of length N1N2N3; the figure is also labeled with T]

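Why vectorization helps can be seen from a standard identity: a multilinear map applied to a tensor equals a single Kronecker-product matrix applied to the vectorized tensor, which puts the model in the shape that forward and backward recursions for linear models expect. The sketch below checks this identity numerically; the sizes and random matrices are assumptions, and the ordering of the Kronecker factors shown is the one that matches NumPy's default row-major flattening.

```python
import numpy as np

rng = np.random.default_rng(0)
L, N = (2, 3, 2), (4, 5, 3)                         # assumed latent and observed sizes
Z = rng.standard_normal(L)
U = [rng.standard_normal((n, l)) for n, l in zip(N, L)]

# Tensor form: X = Z x_1 U1 x_2 U2 x_3 U3, shape (4, 5, 3).
X = np.einsum("abc,ia,jb,kc->ijk", Z, *U)

# Vector form: vec(X) = (U1 kron U2 kron U3) vec(Z) under row-major (C-order) vec.
big = np.kron(U[0], np.kron(U[1], U[2]))            # shape (60, 12)
x_vec = big @ Z.reshape(-1)
print(big.shape, np.allclose(x_vec, X.reshape(-1)))

# Stacking T vectorized snapshots column by column gives an (N1*N2*N3) x T matrix.
T = 4
snapshots = np.stack([(X * (t + 1)).reshape(-1) for t in range(T)], axis=1)
print(snapshots.shape)
```

With column-major vectorization, as used in much of the tensor literature, the Kronecker factors appear in the reverse order.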
Proposed Optimization Algorithm (cont.)
Update parameters
– At each iteration, keep the other U fixed and update U(m) and the other parameters
– Same for B(m)
Properties
– Converges to a local optimum
– Time complexity: linear in T
[Figure: mode-m matricizing; mode-1 matricizing turns an N1 × N2 × N3 tensor into an N1 × N2N3 matrix]

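Mode-m matricizing can be sketched in a few lines. The helper names `unfold` and `fold` are assumptions made here, and the column ordering follows NumPy's row-major reshape, which differs from some textbook conventions but is self-consistent. The example also checks the property that makes per-mode updates convenient: a mode product becomes an ordinary matrix product on the unfolded tensor.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-`mode` matricization: rows index that mode, columns index the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def fold(matrix, mode, shape):
    """Inverse of `unfold` for a tensor with the given full shape."""
    rest = [n for i, n in enumerate(shape) if i != mode]
    return np.moveaxis(matrix.reshape([shape[mode]] + rest), 0, mode)

X = np.arange(24, dtype=float).reshape(2, 3, 4)     # N1, N2, N3 = 2, 3, 4
print(unfold(X, 0).shape)                           # N1 x (N2*N3)
print(unfold(X, 1).shape)                           # N2 x (N1*N3)

# A mode-2 product (axis 1) is a plain matrix product after unfolding.
rng = np.random.default_rng(0)
U = rng.standard_normal((5, 3))
Y = fold(U @ unfold(X, 1), 1, (2, 5, 4))
Y_direct = np.einsum("jb,abc->ajc", U, X)
print(np.allclose(Y, Y_direct))
```

Holding the other modes fixed, the problem for a single U(m) is then a matrix problem in the unfolded tensor, which is the usual reason alternating updates are taken one mode at a time.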
Special case: M = 1
[Figure: coevolving time series (TS 1 to TS 7; traffic volume over time, with morning rush hours MR and afternoon rush hours AR) and their contextual network, factorized with patterns P1 and P2; matrix labels X, U, Z for the time series and S, U, V for the contextual network]
Y. Cai, H. Tong, W. Fan, and P. Ji. Fast mining of a network of coevolving time series. In SDM.
Unable to deal with high-order (C1) time series

Outline
– Motivation
– Facets: address all the three challenges
– Experiments
– Conclusion

Experimental Evaluations
Parameter sensitivity
– How robust is the Facets algorithm?
Effectiveness
– How accurate is the Facets algorithm in terms of imputation and prediction?
Efficiency
– How does the Facets algorithm scale w.r.t. T?
– What is its computational cost compared with other methods?

Parameter Sensitivity
[Figure: (a) impact of L; (b) impact of λ]
L: dimensions of latent factors.
– RMSE stabilizes after L reaches [15, 3].
λ: weight that controls the contribution of the contextual network

Effectiveness Results I
Evaluation of missing value recovery. Lower is better.
[Figure: (a) SST; (b) SPMD; the proposed method is marked "Ours"]

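A missing value recovery score of this kind is typically computed only over the entries that were hidden from the model. The sketch below assumes the metric is RMSE (the sensitivity slide reports RMSE, but this slide does not name its metric); the function name and the toy numbers are hypothetical.

```python
import math

def masked_rmse(truth, estimate, held_out):
    """RMSE over the entries flagged True in `held_out` only."""
    sq = [(t - e) ** 2 for t, e, h in zip(truth, estimate, held_out) if h]
    if not sq:
        raise ValueError("no held-out entries to evaluate")
    return math.sqrt(sum(sq) / len(sq))

# Hypothetical flattened data: entries 1 and 3 were hidden and then imputed.
truth = [1.0, 2.0, 3.0, 4.0]
estimate = [1.0, 2.5, 3.0, 3.0]
held_out = [False, True, False, True]
print(masked_rmse(truth, estimate, held_out))
```

Scoring only the held-out entries keeps the observed values, which the model has already seen, from flattering the result.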
A Case Study: One Trip Instance
[Figure: (a) 90% training data; (b) 50% training data; (c) 10% training data; (d) 1% training data]

Effectiveness Results II
Evaluation of prediction. Lower is better.
[Figure: (a) SST; (b) SPMD; the proposed method is marked "Ours"]

Scalability
T: the length of the time series

Efficiency
[Figure: (a) imputation on SST; (b) prediction on SPMD]

Conclusion
Net-HiTs: a network of high-order time series
Main contributions
– Model formulation: high-order time series + contextual constraints + temporal smoothness
– Algorithm: EM algorithm; linear scalability in the length of the time series
– Empirical evaluation: outperforms all the competitors evaluated

Thank you!
Q & A