Query Processing of Massive Trajectory Data based on MapReduce Qiang Ma, Bin Yang (Fudan University) Weining Qian, Aoying Zhou (ECNU) Presented By: Xin.

Query Processing of Massive Trajectory Data based on MapReduce Qiang Ma, Bin Yang (Fudan University) Weining Qian, Aoying Zhou (ECNU) Presented By: Xin Cao (Aalborg University)

Outline Introduction Preliminary Trajectory Processing – Execution Overview – Storage – Indexing Methods – Query Processing Experimental Study Future Works

Introduction Location-based services are playing important roles. Large volumes of diverse formats of trajectory data have been accumulated. Traditional centralized technologies may not deal with the large amount of trajectories. Cloud computing, such as GFS and MapReduce, provides a promising paradigm to conquer the explosion of trajectory data.

Challenge Huge volume, updates frequently, rapidly increasing. Trajectory data is “continuous”, i.e. ordered sequentially. Highly skewed. MapReduce is good at offline data analysis, but not efficient for online query.

Our Contributions Extend the MapReduce framework to manage massive sequential data, such as trajectories of moving objects. Study what kind of query processing methods are appropriate for large clusters. Provide two scalable indexing methods to facilitate query processing efficiently.

Preliminary Data Model - line segments model – A polyline in three-dimensional space. Query Types – Spatio-temporal Range Query: – Q(E s, E t ) → {S k } – Trajectory-based Query: – Q(O, E t ) → {S k }

Trajectory Processing Execution Overviews

Storage Data are grouped with key and organized in data chunks in GFS-style storage. The whole data set is divided into several parts, and each part is called a partition and assigned to one data chunk to store. Each trajectory data is assigned to at least one partition according to spatio-temporal information

Storage A good spatio-temporal partitioning makes the size of data per chunk is fairly uniform. Static partitioning strategies are easy to control and suitable for distributed scheduling, but may lead to load imbalance. Dynamic strategies can resolve load imbalance, but re-split data can cause distantly migration of large volume of data in clusters. Appropriate strategies should be trained

PMI (Partition based Multilevel Index) Aim to speed up spatio-temporal range queries. Generate all candidate partitions by invoking space partition strategy. Store together as key/value. – Each data chunk only contains trajectory segments that belong to the same partition. Multilevel index for each node can be built local. (using traditional centralized methods)

OII (Object Inverted Index) Aim to speed up trajectory based queries. Collect each object's all historical trajectories. Store together as key/value. – – Access according to key(object identifier).

Data Insertion

Query Processing Trajectory based Queries – Given any object ID, the system can locate the object's trajectory according to OII. Range Queries

Experimental Study Settings – Hadoop version 0.19.0 – 8 PC nodes Ubuntu Linux version 8.04 Pentium IV 1.7GHz CPU 512M memory – Java SDK 1.42 – Experiment data: Network-based Generator

Experiments – Load Balance Standard Deviation of Partitioning Load Balance of PRADASE

Experiments – Data Importing and Index Creating Data Importing with PMI Data Importing with OII

Experiments – Query Processing Spatio-temporal Range Query Processing with PMITrajectory Base Query Processing with OII

Future Works More heuristic partitioning methods. Reducing data migration between nodes. Efficient real-time query processing on Cloud infrastructure.

Thanks!

Query Processing of Massive Trajectory Data based on MapReduce Qiang Ma, Bin Yang (Fudan University) Weining Qian, Aoying Zhou (ECNU) Presented By: Xin.

Similar presentations

Presentation on theme: "Query Processing of Massive Trajectory Data based on MapReduce Qiang Ma, Bin Yang (Fudan University) Weining Qian, Aoying Zhou (ECNU) Presented By: Xin."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Query Processing of Massive Trajectory Data based on MapReduce Qiang Ma, Bin Yang (Fudan University) Weining Qian, Aoying Zhou (ECNU) Presented By: Xin.

Similar presentations

Presentation on theme: "Query Processing of Massive Trajectory Data based on MapReduce Qiang Ma, Bin Yang (Fudan University) Weining Qian, Aoying Zhou (ECNU) Presented By: Xin."— Presentation transcript:

Similar presentations

About project

Feedback