ST-MVL: Filling Missing Values in Geo-sensory Time Series Data

Slides:



Advertisements
Similar presentations
Spatial point patterns and Geostatistics an introduction
Advertisements

An Interactive-Voting Based Map Matching Algorithm
+ Multi-label Classification using Adaptive Neighborhoods Tanwistha Saha, Huzefa Rangwala and Carlotta Domeniconi Department of Computer Science George.
Presented by: Su Yingbin. Outline Introduction SocialSwam Design Notations Algorithms Evaluation Conclusion.
1 Alberta Agriculture and Food (AF) Surface Meteorological Stations and Data Quality Control Procedures.
Hongliang Li, Senior Member, IEEE, Linfeng Xu, Member, IEEE, and Guanghui Liu Face Hallucination via Similarity Constraints.
Learning Location Correlation From GPS Trajectories Yu Zheng Microsoft Research Asia March 16, 2010.
Detecting Nearly Duplicated Records in Location Datasets Microsoft Research Asia Search Technology Center Yu Zheng Xing Xie, Shuang Peng, James Fu.
Constructing Popular Routes from Uncertain Trajectories Authors of Paper: Ling-Yin Wei (National Chiao Tung University, Hsinchu) Yu Zheng (Microsoft Research.
Constructing Popular Routes from Uncertain Trajectories Ling-Yin Wei 1, Yu Zheng 2, Wen-Chih Peng 1 1 National Chiao Tung University, Taiwan 2 Microsoft.
Travel Time Estimation of a Path using Sparse Trajectories
Validation and Monitoring Measures of Accuracy Combining Forecasts Managing the Forecasting Process Monitoring & Control.
T-Drive : Driving Directions Based on Taxi Trajectories Microsoft Research Asia University of North Texas Jing Yuan, Yu Zheng, Chengyang Zhang, Xing Xie,
Yu Zheng, Lizhu Zhang, Xing Xie, Wei-Ying Ma Microsoft Research Asia
Computing Trust in Social Networks
Geo-statistical Analysis
Applications in GIS (Kriging Interpolation)
Title: Spatial Data Mining in Geo-Business. Overview  Twisting the Perspective of Map Surfaces — describes the character of spatial distributions through.
ANALYSIS OF ESTIMATED RAINFALL DATA USING SPATIAL INTERPOLATION. Preethi Raj GEOG 5650 (Environmental Applications of GIS)
Wancai Zhang, Hailong Sun, Xudong Liu, Xiaohui Guo.
Wang-Chien Lee i Pervasive Data Access ( i PDA) Group Pennsylvania State University Mining Social Network Big Data Intelligent.
What is cache memory?. Cache Cache is faster type of memory than is found in main memory. In other words, it takes less time to access something in cache.
GEOSTATISICAL ANALYSIS Course: Special Topics in Remote Sensing & GIS Mirza Muhammad Waqar Contact: EXT:2257.
Spatial Interpolation III
Christoph F. Eick University of Houston Organization 1. What are Ontologies? 2. What are they good for? 3. Ontologies and.
WSP: A Network Coordinate based Web Service Positioning Framework for Response Time Prediction Jieming Zhu, Yu Kang, Zibin Zheng and Michael R. Lyu The.
Analyzing wireless sensor network data under suppression and failure in transmission Alan E. Gelfand Institute of Statistics and Decision Sciences Duke.
Yu Zheng Microsoft Research, Beijing, China
Lecture 6: Point Interpolation
Trajectory Data Mining Dr. Yu Zheng Lead Researcher, Microsoft Research Chair Professor at Shanghai Jiao Tong University Editor-in-Chief of ACM Trans.
Trajectory Data Mining Dr. Yu Zheng Lead Researcher, Microsoft Research Chair Professor at Shanghai Jiao Tong University Editor-in-Chief of ACM Trans.
Trajectory Data Mining Dr. Yu Zheng Lead Researcher, Microsoft Research Chair Professor at Shanghai Jiao Tong University Editor-in-Chief of ACM Trans.
Ran TAO Missing Spatial Data. Examples Places cannot be reached E.g. Mountainous area Sample points E.g. Air pollution Damage of data E.g.
Trajectory Data Mining
Forecasting Fine-Grained Air Quality Based on Big Data Date: 2015/10/15 Author: Yu Zheng, Xiuwen Yi, Ming Li1, Ruiyuan Li1, Zhangqing Shan, Eric Chang,
Trajectory Data Mining
Geo479/579: Geostatistics Ch12. Ordinary Kriging (2)
Graduate School of Information Science, Nara Institute of Science and Technology - Wed. 7 April 2004Profes 2004 Effort Estimation Based on Collaborative.
Trajectory Data Mining Dr. Yu Zheng Lead Researcher, Microsoft Research Chair Professor at Shanghai Jiao Tong University Editor-in-Chief of ACM Trans.
Adaptive Motion Compensation Without Blocking Artifacts
SCALECycle and Crowd Augmented Urban Sensing
Managing Massive Trajectories on the Cloud
Objective Analysis and Data Assimilation
Cohesive Subgraph Computation over Large Graphs
Using satellite data and data fusion techniques
T-Share: A Large-Scale Dynamic Taxi Ridesharing Service
Scalable Load-Distance Balancing
Urban Sensing Based on Human Mobility
DNN-Based Urban Flow Prediction
Saliency-guided Video Classification via Adaptively weighted learning
Urban Water Quality Prediction based on Multi-task Multi-view Learning
Asymmetric Correlation Regularized Matrix Factorization for Web Service Recommendation Qi Xie1, Shenglin Zhao2, Zibin Zheng3, Jieming Zhu2 and Michael.
Quality-aware Aggregation & Predictive Analytics at the Edge
DASH Background Server provides multiple qualities of the same video
Mining Spatio-Temporal Reachable Regions over Massive Trajectory Data
Week 01 Comp 7780 – Class Overview.
University of Houston, USA
DISTRIBUTED CLUSTERING OF UBIQUITOUS DATA STREAMS
Chao Zhang1, Yu Zheng2, Xiuli Ma3, Jiawei Han1
Passenger Demand Prediction with Cellular Footprints
CSC 558 – Data Analytics II, Spring, 2018
Filtering and State Estimation: Basic Concepts
Priority geospatial datasets for the European Commission
Screening for Abnormal Values in AirBase Datasets
Housam Babiker, Randy Goebel and Irene Cheng
Creating Surfaces with 3D Analyst
Overview: Chapter 2 Localization and Tracking
9. Spatial Interpolation
Automated traffic congestion estimation via public video feeds
The WISE public website First protoype WISE viewer Stefan Jensen EEA project manager on WISE WISE end user workshop Brussels The access.
Presentation transcript:

ST-MVL: Filling Missing Values in Geo-sensory Time Series Data Xiuwen Yi, Yu Zheng, Junbo Zhang, Tianrui Li Microsoft Research Asia Southwest Jiaotong University

Filling Missing Values in Spatio-Temporal Data Data missing is a very common phenomenon in IOT data Lost data that is supposed to have Due to Communication or device errors   PM2.5 NO2 Humidity Wind Speed Missing rate 13.3% 16.0% 21.5% 30.3% Xiuwen Yi, Yu Zheng, et al. ST-MVL: Filling Missing Values in Geo-sensory Time Series Data. IJCAI 2016

Filling Missing Values in Spatio-Temporal Data Goal: Inferring the values of those missing entries using collective information: Data of a sensor and its neighborhoods A very fundamental problem Important for monitoring and further data analytics Xiuwen Yi, Yu Zheng, et al. ST-MVL: Filling Missing Values in Geo-sensory Time Series Data. IJCAI 2016

Filling Missing Values in Spatio-Temporal Data Difficulties Random missing and block missing Not handled by fixed learning models Readings changing over time and location non-linearly Not handled by simple interpolations

Fill Missing Values in Spatio-Temporal Datasets Achieve this goal from different perspectives Spatial and Temporal perspectives Spatial neighbors Temporally adjacent time intervals Global and temporal perspectives Local: Recent context Global: Long-term patterns Temporal Spatial X Xiuwen Yi, Yu Zheng, et al. ST-MVL: Filling Missing Values in Geo-sensory Time Series Data. IJCAI 2016

Global: long-term knowledge Spatial Inverse Distance Weighting (IDW) 𝑣 𝑔𝑠 = 𝑖=1 𝑚 𝑣 𝑖 ∗𝑑 𝑖 −𝛼 𝑖=1 𝑚 𝑑 𝑖 −𝛼 Beijing from May 2014 to May 2015 Temporal Simple Exponential Smoothing (SES) 𝑑 𝑖 −𝛼 assigns a bigger weight to closer sensors’ readings; a bigger 𝛼 denotes a faster decay of weight 𝛽∗(1−𝛽) 𝑡−1 gives a bigger weight to recent readings; a smaller 𝛽 denotes a slower decay of weight. t Time interval 𝑣 𝑔𝑡 = 𝛽𝑣 𝑗 +𝛽 1−𝛽 𝑣 𝑗−1 +…+ 𝛽 1−𝛽 𝑡 𝑗 −1 𝑣 1

Local: Recent Context Some situations break long-term patterns Spatial Temporal Window size Collaborative Filtering Sensors  users Time intervals  items

Fill Missing Values in Spatio-Temporal Datasets A multi-view-based method IDW: Inverse Distance Weighting SES: Simple Exponential Smoothing UCF: User-based Collaborative filtering ICF: Item-based Collaborative filtering 𝑣 𝑚𝑣𝑙 = 𝑤 1 ∗ 𝑣 𝑔𝑠 + 𝑤 2 ∗ 𝑣 𝑔𝑡 + 𝑤 3 ∗ 𝑣 𝑙𝑠 + 𝑤 4 ∗ 𝑣 𝑙𝑡 +𝑏 Spatial Temporal

Temporal Block Missing Experiments Baselines Method Spatial Temporal Spatial + Temporal Global IDW SES IDW+SES Lobal UCF ICF, ARMA CF, NMF, stKNN Global+Local Kriging SARIMA AKE, DESM, NMF-MVL   PM2.5 NO2 Humidity Wind Speed Block missing Spatial 2.2% 3.9% 9.8% 11.8% Temporal 3.5% 6.5% 9.6% 19.5% General missing 8.2% 6.8% 4.6% 4.0% Overall 13.3% 16.0% 21.5% 30.3% Comparison among different methods (based on PM2.5) Method General Missing Spatial Block Missing Temporal Block Missing Sudden Change Overall MAE MRE ARMA 22.61 0.331 29.26 0.369 \ 51.11 0.567 27.47 0.394 Kriging 15.53 0.221 15.62 0.222 42.32 0.407 16.59 0.234 SARIMA 14.69 0.220 23.92 0.319 31.20 0.561 52.80 0.586 18.76 0.278 stKNN 12.84 0.188 19.91 0.235 12.72 0.226 35.13 0.390 14.00 0.201 DESM 13.65 0.191 19.24 0.233 12.66 0.224 42.87 0.425 15.59 0.228 AKE 13.34 0.195 19.08 0.229 12.14 0.22 41.54 0.403 14.27 0.211 IDW+SES 11.64 0.171 18.25 0.215 11.95 0.213 34.33 0.381 12.70 0.183 CF 12.20 0.178 19.27 12.25 0.218 34.91 0.388 13.40 0.193 NMF 11.21 0.163 18.98 0.239 12.73 0.217 34.37 13.08 NMF-MVL 11.16 0.162 18.97 0.238 0.380 13.06 0.187 ST-MVL 10.81 0.158 17.85 11.71 0.208 33.15 0.368 12.12 0.174 Xiuwen Yi, Yu Zheng, et al. ST-MVL: Filling Missing Values in Geo-sensory Time Series Data. IJCAI 2016

Download Urban Air Apps Search for “Urban Computing” 搜索“城市计算” Thanks! Yu Zheng yuzheng@microsoft.com Download Urban Air Apps Homepage Zheng, Y., et al. Urban Computing: concepts, methodologies, and applications. ACM transactions on Intelligent Systems and Technology. 郑宇. 城市计算概述,武汉大学学报. 2015年1月,40卷第一期 Yu Zheng. Methodologies for Cross-Domain Data Fusion: An Overview. IEEE Transactions on Big Data, 1, 1, 2015.