Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases Jimeng Sun, Dimitris Papadias, Yufei Tao, Bin Liu.

Presentation transcript:

Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases Jimeng Sun, Dimitris Papadias, Yufei Tao, Bin Liu

2 Motivation
Spatio-temporal databases vs. data streams
Monitoring applications:
– Traffic supervision
– Mobile user monitoring
– Weather forecasting
Example:
– Find the number of vehicles in the city center now
The challenge is to provide fast query responses in a highly update-intensive environment

3 Problems and methods
Problems:
– How can the spatio-temporal information be stored and summarized efficiently?
– How can queries about the past, the present, and the future be answered approximately?
Methods:
– Adaptive multi-dimensional histogram (AMH)
– Historical synopsis
– Stochastic prediction method

4 Related work
Histograms
– Static multi-dimensional histograms: Equi-depth, Mhist, Minskew, Genhist, SQ
– Query-adaptive multi-dimensional histograms: STGrid, STHoles, SASH
Other approximation methods
– DCT, Wavelet, Sketch
Spatio-temporal databases
– Historical retrieval
– Future prediction

5 Outline
Introduction
Problem and proposed methods
– Adaptive multi-dimensional histogram
– Historical synopsis
– Prediction model
Experiment
Conclusion

6 Query types
– Present Time (PT) queries
– Historical Time (HT) queries
– Future Time (FT) queries
[Figure: queries plotted by location against time, with the time axis marked past, current, and future]

7 System Overview
[Figure: system architecture. Labels: spatio-temporal updates; AMH; past index; historical synopsis; prediction model; PT, HT, and FT queries]

8 Histogram
Partition the space into buckets
Data within a bucket are summarized by the mean
Properties of a good histogram:
– Uniformity within each bucket
– Incrementally updatable
[Figure: a bad partitioning and a good partitioning]
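
A minimal sketch of why uniformity within a bucket matters, assuming a range-count query is answered by scaling each bucket's count by the fraction of its area that the query covers. The `Bucket` class, the `(x_lo, x_hi, y_lo, y_hi)` query tuple, and the function names are illustrative and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    """Axis-aligned bucket holding only a count (hypothetical layout)."""
    x_lo: float
    x_hi: float
    y_lo: float
    y_hi: float
    count: float  # number of objects currently inside the bucket

    @property
    def area(self) -> float:
        return (self.x_hi - self.x_lo) * (self.y_hi - self.y_lo)


def overlap_area(bucket, query):
    """Area of the intersection between a bucket and a query rectangle."""
    q_x_lo, q_x_hi, q_y_lo, q_y_hi = query
    width = min(bucket.x_hi, q_x_hi) - max(bucket.x_lo, q_x_lo)
    height = min(bucket.y_hi, q_y_hi) - max(bucket.y_lo, q_y_lo)
    return max(width, 0.0) * max(height, 0.0)


def estimate_count(buckets, query):
    """Estimate the objects in `query`, assuming uniformity inside each bucket."""
    total = 0.0
    for bucket in buckets:
        if bucket.area > 0:
            total += bucket.count * overlap_area(bucket, query) / bucket.area
    return total


buckets = [Bucket(0, 10, 0, 10, count=100), Bucket(10, 20, 0, 10, count=40)]
estimate = estimate_count(buckets, (5, 15, 0, 10))
```

The estimate is only as good as the uniformity assumption: a bucket whose objects cluster in one corner gives a poor answer for any query that covers only part of it.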

9 Adaptive Multi-dimensional Histogram (AMH)
Regular cells
Objective: minimize $\mathrm{WVS} = \sum_i (\mathrm{area}_i \cdot \mathrm{var}_i)$ (Minskew [Acharya, Poosala, Ramaswamy 99])
[Figure: the BPT with nodes n1 to n5 and leaves b1 to b6, next to the corresponding buckets b1 to b6]
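
As a hedged illustration of the WVS objective, the sketch below assumes each bucket is a group of unit-area regular cells, that area is the number of cells, and that var is the population variance of the cell frequencies in the bucket. The function names and the sample frequencies are made up for the example.

```python
def bucket_variance(cell_freqs):
    """Population variance of the cell frequencies inside one bucket."""
    n = len(cell_freqs)
    mean = sum(cell_freqs) / n
    return sum((f - mean) ** 2 for f in cell_freqs) / n


def wvs(buckets, cell_area=1.0):
    """Weighted variance sum: sum over buckets of area_i * var_i."""
    return sum(len(cells) * cell_area * bucket_variance(cells) for cells in buckets)


# Two uniform buckets versus one bucket that mixes both densities.
good_partition = [[1, 1], [5, 5]]
bad_partition = [[1, 1, 5, 5]]
wvs_good = wvs(good_partition)
wvs_bad = wvs(bad_partition)
```

A partition that keeps each bucket uniform drives WVS to zero, while lumping dense and sparse cells together raises it.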

10 Dynamic Maintenance of AMH
Our scheme: record the information during construction and modify the structure as needed.
1. Information update
– Update the bucket count
2. Bucket reorganization
– Merge: to claim buckets
– Split: to reduce WVS

11 Information update of AMH
[Figure: the BPT with nodes n1 to n5 and leaves b1 to b6, the buckets b1 to b6, and a mapping that leads through n1 and n2 to b1]

12 Bucket reorganization: Merge
Bucket info:
1. Region: [x-, x+] [y-, y+]
2. Frequency: count/area
3. Second moment (for variance calculation)
Merge the subtree that leads to the minimal WVS increase
[Figure: the BPT before and after a merge; the subtree under n4 (b3, b4, b6) becomes the single bucket b*, leaving buckets b1, b2, b*, b5]
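
The bucket info above suggests that a merge can be costed from summaries alone. The sketch below is one way to do that, assuming each bucket keeps the number of unit-area cells it covers, the sum of their frequencies, and the sum of squared frequencies (the second moment). `BucketStats`, `merge_cost`, and `cheapest_merge` are hypothetical names, and the candidate pairs are simply passed in rather than read from the BPT.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BucketStats:
    """Summary kept per bucket (hypothetical layout)."""
    n_cells: int     # bucket area in unit cells
    total: float     # sum of cell frequencies
    total_sq: float  # sum of squared cell frequencies (second moment)

    @classmethod
    def from_cells(cls, freqs):
        return cls(len(freqs), sum(freqs), sum(f * f for f in freqs))

    def merged(self, other):
        """Summaries are additive, so merging needs no access to raw cells."""
        return BucketStats(self.n_cells + other.n_cells,
                           self.total + other.total,
                           self.total_sq + other.total_sq)

    def weighted_variance(self):
        """area * var = sum(f^2) - (sum f)^2 / n, the bucket's share of WVS."""
        return self.total_sq - self.total ** 2 / self.n_cells


def merge_cost(a, b):
    """WVS increase caused by replacing buckets a and b with their union."""
    return a.merged(b).weighted_variance() - a.weighted_variance() - b.weighted_variance()


def cheapest_merge(candidate_pairs):
    """Pick the candidate pair whose merge increases WVS the least."""
    return min(candidate_pairs, key=lambda pair: merge_cost(*pair))


sparse = BucketStats.from_cells([1, 1])
dense = BucketStats.from_cells([5, 5])
nearly_sparse = BucketStats.from_cells([1, 2])
choice = cheapest_merge([(sparse, dense), (sparse, nearly_sparse)])
```

Keeping the second moment is what makes the variance of a merged bucket computable in constant time.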

13 Bucket reorganization: Split
Split the bucket that leads to the maximal WVS decrease
[Figure: the BPT before and after a split; splitting introduces new nodes n4 and n5 and new buckets b*1, b*2, b*3, b*4]
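
A rough sketch of split selection, assuming a bucket is a small grid of unit-area cells and that only axis-parallel cuts along cell boundaries are tried. The slides do not state how candidate cuts are enumerated, so the exhaustive search and the names `best_split` and `bucket_to_split` are assumptions.

```python
def weighted_variance(freqs):
    """area * var for a group of unit-area cells (0 for an empty group)."""
    n = len(freqs)
    if n == 0:
        return 0.0
    total = sum(freqs)
    return sum(f * f for f in freqs) - total * total / n


def best_split(grid):
    """Return (wvs_decrease, axis, position) for the best axis-parallel cut.

    `grid` is a list of rows of cell frequencies. The result is
    (0.0, None, None) when no cut reduces WVS.
    """
    rows, cols = len(grid), len(grid[0])
    before = weighted_variance([f for row in grid for f in row])
    best = (0.0, None, None)
    for r in range(1, rows):  # horizontal cuts between rows
        top = [f for row in grid[:r] for f in row]
        bottom = [f for row in grid[r:] for f in row]
        decrease = before - weighted_variance(top) - weighted_variance(bottom)
        if decrease > best[0]:
            best = (decrease, "row", r)
    for c in range(1, cols):  # vertical cuts between columns
        left = [f for row in grid for f in row[:c]]
        right = [f for row in grid for f in row[c:]]
        decrease = before - weighted_variance(left) - weighted_variance(right)
        if decrease > best[0]:
            best = (decrease, "col", c)
    return best


def bucket_to_split(grids):
    """Index of the bucket whose best cut yields the largest WVS decrease."""
    return max(range(len(grids)), key=lambda i: best_split(grids[i])[0])


uniform = [[2, 2], [2, 2]]
skewed = [[1, 1, 5, 5], [1, 1, 5, 5]]
decrease, axis, position = best_split(skewed)
target = bucket_to_split([uniform, skewed])
```

Here the skewed bucket is chosen, and the cut falls between the sparse and dense columns.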

14 Features of AMH
Bucket information is updated as new data arrive
Bucket extents continuously adapt to changes in the data distribution
Maintenance does not affect normal query processing
– It can be interrupted at any moment
– It is performed during CPU idle time

15 Outline
Introduction
Problem and proposed methods
– Adaptive multi-dimensional histogram
– Historical synopsis
– Prediction model
Experiment
Conclusion

16 Historical Synopsis
AMH maintains the current buckets.
The past index stores the obsolete buckets.
Past index:
– Packed B-tree
– 3D R-tree

17 Prediction Model
Prediction based on velocity doesn't work!
– It is not realistic to assume that velocity remains constant between the current time and the query time
– Velocity is highly dynamic
We suggest using only past and present location information for prediction.

18 Prediction Model (cont.)
Forecast the future using any time-series prediction method; we use AR
[Figure: data flow among FT queries, the prediction model, HT and PT queries, the historical synopsis, and the results; one step is labeled "Parse"]
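
To make the idea of forecasting from past and present values concrete, here is a small sketch that fits a first-order autoregressive model by ordinary least squares and iterates it forward. The slides say only that AR is used; the order of 1, the intercept term, the function names, and the sample counts are assumptions made for illustration.

```python
def fit_ar1(series):
    """Fit x_t = c + phi * x_{t-1} by ordinary least squares.

    `series` holds past and present values, oldest first (for instance the
    count in one region at successive timestamps).
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    sxx = sum((x - mean_prev) ** 2 for x in prev)
    if sxx == 0:
        # No variation in the lagged values: fall back to predicting the mean.
        return mean_curr, 0.0
    sxy = sum((x - mean_prev) * (y - mean_curr) for x, y in zip(prev, curr))
    phi = sxy / sxx
    c = mean_curr - phi * mean_prev
    return c, phi


def forecast(series, steps):
    """Iterate the fitted model `steps` timestamps beyond the last value."""
    c, phi = fit_ar1(series)
    value = series[-1]
    predictions = []
    for _ in range(steps):
        value = c + phi * value
        predictions.append(value)
    return predictions


# Hypothetical history of counts that follows x_t = 10 + 0.5 * x_{t-1}.
history = [4, 12, 16, 18, 19]
c, phi = fit_ar1(history)
future = forecast(history, steps=2)
```

Nothing here depends on object velocities: the model sees only the sequence of past and present values.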

19 Outline
Introduction
Related work
Problem and proposed methods
– Adaptive multi-dimensional histogram
– Historical synopsis
– Prediction model
Experiment
Conclusion

20 Experiment settings
Datasets
– 2.5M updates for each dataset
– spatial: 50K mobile objects from 2 spatial datasets
– road: from a spatio-temporal generator (described in [Brinkhoff 2002])
[Figure: the road network and the data distribution at the initial, median, and final timestamps]

21 Robustness with time
Query: qlength = 6% of the data space; 25K queries uniformly distributed over space and time
[Figure panels: spatial, road]

22 Comparison with conventional histogram
Minskew (a static spatial histogram) is rebuilt every 50K location updates
tp is the ratio of the cost of AMH to that of Minskew
The reorganization operations of AMH are uniformly distributed among the 50K location updates.
[Figure panels: spatial, road; each compares Minskew and AMH]

23 The effect of update intensity
The B-tree performs better at high update rates.
The R-tree provides much faster query responses.
In general, when the query/update ratio is large (>30%), the R-tree performs better.
[Figure panels: spatial, road; each compares the 3D R-tree and the B-tree by query type]

24 Conclusion
We present a comprehensive approach for processing queries that refer to any time in history.
The proposed architecture maintains
– an incremental multi-dimensional histogram;
– a past index structure for storing the outdated buckets.
Future queries are answered by a stochastic method that uses the recent history to predict the future.

25 Q+A

26 Summary
Components: AMH, past index, historical synopsis, prediction model
AMH:
0. Goal: min(WVS)
1. Info update
2. Reorganization happens when the CPU is idle
Past index:
1. Recent buckets in memory
2. Old buckets dumped to disk
Prediction model: forecast based on the present and the past.
[Figure label: old buckets]

27 Related work
– Static multi-dimensional histograms
– Query-adaptive multi-dimensional histograms
– Other multi-dimensional approximation methods
– Spatio-temporal prediction methods
– Spatio-temporal aggregation methods

28 Evaluation over different query types
[Figure panels: spatial, road]

29 Motivation (cont.)
Spatio-temporal database (STDB) research:
– Historical retrieval
– Future prediction

30 Bucket reorganization: Split
[Figure: the BPT and bucket list before the split (b1, b2, b*, b5) and after it, with new nodes n4 and n5 and new buckets b*1, b*2, b*3, b*4]