Facets: Fast Comprehensive Mining of Coevolving High-order Time Series
Joint work by Hanghang Tong, Ping Ji, Yongjie Cai, Wei Fan, Qing He
Presenter: Wei Fan

Presentation transcript:


Arizona State University
Ubiquitous Coevolving* Time Series
a) Room condition monitoring in a smart building
b) Intelligent transportation systems
c) Stock market
d) Climate monitoring
*a.k.a. multivariate in statistics

Challenges
• C1. High-order
• C2. Contextual constraints
• C3. Temporal smoothness

Challenges
• C1. High-order
Multiple sources, multiple types, e.g.,
  – sensor, humidity, temperature, light, …
  – vehicle, trace location (x, y), speed, …
  – stock, max price, min price, volume, …
  – latitude, longitude, temperature, wind, …
[Figure: a sensor × type (humidity, light, temperature, voltage) × time tensor, and a sequence of latitude × longitude × metrics tensors for t = 1, 2, …, T]

Challenges
• C2. Contextual constraints
The time series are interconnected through their embedded network.
[Figure: (a) A simplified sensor network. (b) Measured temperature time series.]

Challenges
• C2. Contextual constraints
The time series are interconnected through their embedded network.
[Figure: (a) Network of types. (b) Time series of room conditions: humidity, light, voltage, and temperature over time.]

Challenges
• C3. Temporal smoothness
[Figure: time series plot annotated with "correlated adjacent values" and "anomaly"]
Intuition: ||X_{t+1} - X_t|| is expected to be small.
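The intuition on this slide can be illustrated with a small sketch: compute the norm of each successive difference and flag steps that are much larger than usual. The sample readings and the median-based threshold are assumptions made for illustration; they are not part of the Facets model.

```python
import math

def step_norms(series):
    """Euclidean norm of X_{t+1} - X_t for each pair of successive observations."""
    return [
        math.sqrt(sum((b - a) ** 2 for a, b in zip(x_t, x_next)))
        for x_t, x_next in zip(series, series[1:])
    ]

def flag_jumps(series, factor=3.0):
    """Indices t whose step into t+1 exceeds `factor` times the median step.

    The median-based threshold is an illustrative assumption, not from the slides.
    """
    norms = step_norms(series)
    median = sorted(norms)[len(norms) // 2]
    return [t for t, n in enumerate(norms) if n > factor * median]

# Hypothetical readings: two sensors, six time ticks, one spike at t = 3.
readings = [
    [20.0, 30.0], [20.1, 30.1], [20.2, 30.0],
    [25.0, 36.0], [20.3, 30.1], [20.4, 30.2],
]
print(step_norms(readings))
print(flag_jumps(readings))  # steps into and out of the spike
```

Both the step into the spike and the step out of it are large, which is why a smoothness assumption makes isolated anomalies stand out.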

Singular Value Decomposition (SVD)
[Figure: seven coevolving traffic-volume time series (TS 1 to TS 7) plotted over time, with morning and afternoon rush hours marked, next to their matrix representation X]

SVD (cont.)
• Singular vectors for correlation detection
[Figure: X ≈ U × Σ × Z for TS 1 to TS 7, with two patterns P1 and P2; Σ holds the strength of P1 and the strength of P2. MR: morning rush hours. AR: afternoon rush hours.]
Limitations:
  – C1. High-order
  – C2. Contextual information
  – C3. Temporal smoothness
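As a rough illustration of the decomposition on this slide, the sketch below builds seven synthetic series from two rush-hour patterns and recovers a rank-2 approximation with NumPy's SVD. The data, the weights, and the choice of rows for series and columns for time are assumptions; the slide does not specify them.

```python
import numpy as np

# Hypothetical data: 7 time series observed at 24 hourly ticks.
hours = np.arange(24)
morning = np.exp(-0.5 * ((hours - 8) / 1.5) ** 2)     # morning rush-hour bump
afternoon = np.exp(-0.5 * ((hours - 17) / 1.5) ** 2)  # afternoon rush-hour bump

# Each series mixes the two bumps with its own weights (rows = series, columns = time).
weights = np.array([[1.0, 0.0], [0.9, 0.1], [0.8, 0.0], [0.0, 1.0],
                    [0.1, 0.9], [0.0, 0.8], [0.5, 0.5]])
X = weights @ np.vstack([morning, afternoon])

# Thin SVD: X = U @ diag(sigma) @ Zt
U, sigma, Zt = np.linalg.svd(X, full_matrices=False)

# Keep the two strongest patterns.
rank = 2
X_hat = U[:, :rank] @ np.diag(sigma[:rank]) @ Zt[:rank, :]
print("leading singular values:", sigma[:3])
print("rank-2 reconstruction error:", np.linalg.norm(X - X_hat))
```

Because the synthetic matrix is built from exactly two patterns, two singular values carry essentially all of the energy. Real traffic data would only be approximately low-rank.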

Related Work
• Tensor decomposition
  – Tucker, CANDECOMP/PARAFAC, HOSVD, [Sun2006], [Xiong2010], …
• Low-rank matrix factorization
  – SVD, PCA, [Mnih2007], [Ma2008], [Yao2014], …
  – Dynamic matrix factorization with temporal factor: [Chua2013], [Sun2012], [Li2009], [Cai2015], …
• Time series mining
  – [Shieh2008], [Matsubara2014], …
Lack of comprehensiveness in tackling all three challenges

Outline
• Motivation
• Facets: address all three challenges
• Experiments
• Conclusion

C1. High-order: Tucker Decomposition
• M-order time series tensors: [equation missing]
• Define: [equation missing]
[Figure: Tucker decomposition with labels "time series" (N1 × N2 × N3), "latent factor" (L1 × L2 × L3), and "coefficient matrix"]
Intuition: grouping effect on each mode (#1)
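Since the equations on this slide did not survive, here is a minimal sketch of the standard Tucker form, in which a small core tensor is multiplied along each mode by a factor matrix. The sizes and the names `mode_product` and `tucker_reconstruct` are hypothetical, and this is the textbook construction rather than the exact Facets formulation.

```python
import numpy as np

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` along axis `mode` by `matrix` (shape: new_dim x old_dim)."""
    moved = np.moveaxis(tensor, mode, -1)   # bring the chosen mode to the last axis
    result = moved @ matrix.T               # contract that axis with the matrix
    return np.moveaxis(result, -1, mode)    # put the new axis back in place

def tucker_reconstruct(core, factors):
    """Rebuild a tensor from a core and one factor matrix per mode."""
    out = core
    for mode, factor in enumerate(factors):
        out = mode_product(out, factor, mode)
    return out

rng = np.random.default_rng(0)
N, L = (6, 5, 4), (3, 2, 2)                 # hypothetical sizes N_m and L_m
core = rng.standard_normal(L)               # latent factor, L1 x L2 x L3
factors = [rng.standard_normal((n, l)) for n, l in zip(N, L)]  # each N_m x L_m
X = tucker_reconstruct(core, factors)
print(X.shape)
```

Each factor matrix maps the L_m latent groups of a mode onto its N_m observed entries, which is one way to read the "grouping effect on each mode" intuition.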

C2. Contextual Constraints
• Embedded contextual networks
• For each [symbol missing]: contextual network ≈ coefficient matrix × contextual latent factor
Intuition: well-connected → more likely to share similar low-rank factors (#2)

Address C1 and C2
[Figure: tensor with dimensions N1, N2, N3 and latent factor with dimensions L1, L2, L3]

C3. Temporal Smoothness
• Define [symbol missing]: multilinear to […]
Intuition: successive observations are highly correlated (#3)

Put It All Together - Facets
[Figure: the Facets model with its three components marked #1, #2, #3]
• #1: high-order time series
• #2: contextual constraints
• #3: temporal smoothness

Proposed Optimization Algorithm
• Key idea: EM algorithm
• Infer latent factors and [symbols missing]
  – Vectorization and matricization
  – Forward and backward algorithms
[Figure: vectorizing the N1 × N2 × N3 tensors into an N1N2N3 × T matrix]

Proposed Optimization Algorithm (cont.)
• Update parameters
  – At each iteration, keep the other U fixed; update U^(m) and the other parameters
  – Same for B^(m)
• Properties
  – Converges to a local optimum
  – Time complexity: linear in T
[Figure: mode-m matricizing of an N1 × N2 × N3 tensor; mode-1 matricizing gives an N1 × N2N3 matrix]
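The vectorization and matricization steps named on these two slides can be sketched with NumPy reshapes. The sizes and the helper names `unfold` and `vectorize_series` are hypothetical. The column ordering below follows NumPy's default row-major layout; some tensor texts use a different ordering, which permutes columns but leaves the shapes unchanged.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-m matricization: rows index the chosen mode, columns index all other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def vectorize_series(tensors):
    """Stack T tensors as the columns of an (N1*N2*N3) x T matrix."""
    return np.stack([x.reshape(-1) for x in tensors], axis=1)

# Hypothetical sizes: N1 = 2, N2 = 3, N3 = 4, T = 5.
series = [np.arange(24).reshape(2, 3, 4) + t for t in range(5)]

print(unfold(series[0], 0).shape)      # mode-1 matricizing: N1 x (N2*N3)
print(unfold(series[0], 1).shape)      # mode-2 matricizing: N2 x (N1*N3)
print(vectorize_series(series).shape)  # (N1*N2*N3) x T
```

Once each tensor is a column vector, standard forward and backward recursions over the T columns apply, and each pass visits every time tick once.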

Special case: M=1
[Figure: coevolving traffic-volume time series (TS 1 to TS 7) with patterns P1 and P2 (MR: morning rush hours, AR: afternoon rush hours), shown with matrices labeled X, U, Z for the coevolving time series and S, U, V for the contextual network]
Y. Cai, H. Tong, W. Fan, and P. Ji. Fast mining of a network of coevolving time series. In SDM.
Unable to deal with high-order (C1) time series

Outline
• Motivation
• Facets: address all three challenges
• Experiments
• Conclusion

Experimental Evaluations
• Parameter sensitivity
  – How robust is our Facets algorithm?
• Effectiveness
  – How accurate is our Facets algorithm in terms of imputation and prediction?
• Efficiency
  – How does our Facets algorithm scale w.r.t. T?
  – What is the computational cost compared to other methods?

Parameter Sensitivity
[Figure: (a) Impact of L. (b) Impact of λ.]
• L: dimensions of the latent factors
  – RMSE stabilizes after L reaches [15, 3].
• λ: weight controlling the contribution of the contextual network

Effectiveness Results I
Evaluation of missing value recovery. Lower is better.
[Figure: (a) SST. (b) SPMD. The proposed method is labeled "Ours".]
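A "lower is better" score for missing value recovery can be computed as in the sketch below, assuming the error is RMSE over the entries that were hidden from the model. The earlier parameter-sensitivity slide mentions RMSE, but the exact protocol behind these plots is not given, so the function name `masked_rmse` and the toy data are hypothetical.

```python
import math

def masked_rmse(truth, estimate, held_out):
    """RMSE over the (row, col) positions listed in `held_out`."""
    squared = [(truth[i][j] - estimate[i][j]) ** 2 for i, j in held_out]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical ground truth, imputed values, and hidden positions.
truth = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
estimate = [[1.0, 2.5, 3.0], [4.0, 5.0, 5.0]]
held_out = [(0, 1), (1, 2)]

print(masked_rmse(truth, estimate, held_out))
```

Only the hidden positions enter the score, so entries the model saw during training do not make the result look better than it is.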

A Case Study – One Trip Instance
[Figure: (a) 90% training data. (b) 50% training data. (c) 10% training data. (d) 1% training data.]

Effectiveness Results II
Evaluation of prediction. Lower is better.
[Figure: (a) SST. (b) SPMD. The proposed method is labeled "Ours".]

Scalability
• T: the length of the time series

Efficiency
[Figure: (a) Imputation on SST. (b) Prediction on SPMD.]

Conclusion
• Net-HiTs: a network of high-order time series
• Main contributions
  – Model formulation: high-order time series + contextual constraints + temporal smoothness
  – Algorithm: EM algorithm; linear scalability in the length of the time series
  – Empirical evaluation: outperforms all the existing competitors

Thank you!
Q & A