1
SpatialHadoop: A MapReduce Framework
for Spatial Data. Presenter: Zhao Yuliang (赵郁亮). ICDE 2015
2
Executive Summary: The paper proposes a full-fledged MapReduce framework with native support for spatial data. It proposes a new system architecture with four layers: language, operations, MapReduce, and storage. SpatialHadoop achieves orders of magnitude better performance than Hadoop for spatial data processing.
3
Outline: Introduction, Related Work, SpatialHadoop Architecture, Experiments
4
Introduction: The amount of spatial data produced by devices such as smartphones, satellites, and medical devices has exploded. Hadoop has been adopted as a solution for scalable processing of huge datasets in many applications, e.g., machine learning, graph processing, and behavioral simulations. ESRI has released ‘GIS Tools on Hadoop’.
5
Introduction Parallel-Secondo MD-HBase Hadoop-GIS SpatialHadoop
6
Related work: Specific spatial operations: R-tree construction, range query, kNN query, all-NN query. Systems: Hadoop-GIS, MD-HBase, Parallel-Secondo.
7
SpatialHadoop Architecture
8
SpatialHadoop Architecture: Language Layer (Pigeon): data types, spatial functions, kNN query.
9
SpatialHadoop Architecture: Storage Layer (Indexing). Existing techniques for spatial indexing in Hadoop: 1) Build only; 2) Custom on-the-fly indexing; 3) Indexing in HDFS.
10
SpatialHadoop Architecture: Storage Layer (Indexing): Overview of Indexing in SpatialHadoop
11
SpatialHadoop Architecture: Index Building. 1) Partitioning: Step 1: number of partitions; Step 2: partition boundaries; Step 3: physical partitioning. 2) Local indexing. 3) Global indexing.
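The three partitioning steps can be sketched roughly as follows. This is a minimal illustration, not SpatialHadoop's actual code: the function names, the 20% `overhead` ratio, the uniform grid layout, and the choice to replicate a record into every cell it overlaps are all assumptions made for the example. Rectangles are `(x1, y1, x2, y2)` tuples.

```python
import math

def num_partitions(input_bytes, block_bytes, overhead=0.2):
    """Step 1: estimate a partition count so that each partition fits one block.
    The overhead ratio (assumed 0.2 here) leaves room for replication and index data."""
    return math.ceil(input_bytes * (1 + overhead) / block_bytes)

def grid_boundaries(mbr, n):
    """Step 2: cut the input MBR into a uniform grid with at least n cells."""
    x1, y1, x2, y2 = mbr
    cols = math.ceil(math.sqrt(n))
    rows = math.ceil(n / cols)
    w = (x2 - x1) / cols
    h = (y2 - y1) / rows
    return [(x1 + c * w, y1 + r * h, x1 + (c + 1) * w, y1 + (r + 1) * h)
            for r in range(rows) for c in range(cols)]

def overlaps(a, b):
    """True if rectangles a and b share interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def assign(records, cells):
    """Step 3: physical partitioning; a record is copied to every cell it overlaps."""
    parts = {i: [] for i in range(len(cells))}
    for rec in records:
        for i, cell in enumerate(cells):
            if overlaps(rec, cell):
                parts[i].append(rec)
    return parts

cells = grid_boundaries((0, 0, 10, 10), 4)
parts = assign([(4, 4, 6, 6), (1, 1, 2, 2)], cells)
```

In this sketch the rectangle `(4, 4, 6, 6)` straddles all four cells and is stored four times, which is why a join over such partitions needs a duplicate avoidance step.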
12
SpatialHadoop Architecture
Grid file
13
SpatialHadoop Architecture
R-tree
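For R-tree partitioning, one common way to derive partition boundaries is to bulk-load a sample of the data with Sort-Tile-Recursive (STR) packing and use the resulting leaf rectangles as partitions. The sketch below shows that idea on 2D points; it assumes STR is the packing method and uses hypothetical names (`str_partition`, `mbr`, `capacity`) that do not come from the slides.

```python
import math

def str_partition(points, capacity):
    """Group points into tiles of at most `capacity` points using STR packing:
    sort by x, cut into vertical slices, then sort each slice by y and cut again."""
    n = len(points)
    if n == 0:
        return []
    leaves = math.ceil(n / capacity)          # number of tiles needed
    slices = math.ceil(math.sqrt(leaves))     # number of vertical slices
    per_slice = slices * capacity             # points per vertical slice
    by_x = sorted(points)
    tiles = []
    for i in range(0, n, per_slice):
        strip = sorted(by_x[i:i + per_slice], key=lambda p: p[1])
        for j in range(0, len(strip), capacity):
            tiles.append(strip[j:j + capacity])
    return tiles

def mbr(tile):
    """Minimum bounding rectangle (x1, y1, x2, y2) of a tile of points."""
    xs = [p[0] for p in tile]
    ys = [p[1] for p in tile]
    return (min(xs), min(ys), max(xs), max(ys))

sample = [(x, y) for x in range(4) for y in range(4)]
tiles = str_partition(sample, capacity=4)
boundaries = [mbr(t) for t in tiles]
```

Unlike a uniform grid, tiles built this way follow the data distribution, so dense regions get smaller partitions.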
14
SpatialHadoop Architecture
R+-tree
15
SpatialHadoop Architecture
MapReduce Layer
16
SpatialHadoop Architecture
Operations Layer: Range Query, kNN
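Both operations benefit from the global index, which lets a job skip partitions that cannot contribute to the answer. The sketch below illustrates that pruning idea only. The in-memory `global_index` list of `(partition_mbr, points)` pairs, the linear scan standing in for a local index, and the best-first partition order used for kNN are all simplifying assumptions, not the system's actual algorithms.

```python
import heapq
import math

def intersects(a, b):
    """True if rectangles a and b (x1, y1, x2, y2) touch or overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def range_query(global_index, query):
    """Return all points inside `query`, scanning only partitions that intersect it."""
    results = []
    for part_mbr, points in global_index:
        if not intersects(part_mbr, query):
            continue  # global filter: this partition cannot contain answers
        for (x, y) in points:  # a plain scan stands in for the local index
            if query[0] <= x <= query[2] and query[1] <= y <= query[3]:
                results.append((x, y))
    return results

def min_dist(mbr, q):
    """Smallest possible distance from point q to any point inside mbr."""
    dx = max(mbr[0] - q[0], 0, q[0] - mbr[2])
    dy = max(mbr[1] - q[1], 0, q[1] - mbr[3])
    return math.hypot(dx, dy)

def knn(global_index, q, k):
    """Visit partitions nearest-first; stop once no partition can beat the k-th best."""
    ordered = sorted(global_index, key=lambda e: min_dist(e[0], q))
    best = []  # max-heap of (-distance, point) holding the k closest so far
    for part_mbr, points in ordered:
        if len(best) == k and min_dist(part_mbr, q) >= -best[0][0]:
            break
        for p in points:
            d = math.hypot(p[0] - q[0], p[1] - q[1])
            if len(best) < k:
                heapq.heappush(best, (-d, p))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, p))
    return [p for _, p in sorted(best, key=lambda t: -t[0])]

global_index = [
    ((0, 0, 5, 5), [(1, 1), (4, 4)]),
    ((5, 0, 10, 5), [(6, 1), (9, 4)]),
    ((0, 5, 5, 10), [(1, 9)]),
]
in_range = range_query(global_index, (3, 0, 7, 5))
nearest = knn(global_index, (5, 2), 2)
```

In the kNN call, the third partition is never scanned: its minimum distance to the query point already exceeds the distance of the second-nearest point found so far.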
17
SpatialHadoop Architecture: Operations Layer: Spatial Join. Step 1: Global join. Step 2: Local join. Step 3: Duplicate avoidance.
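When records are replicated across partitions, the same overlapping pair can be found by more than one local join. A standard remedy is the reference-point technique: a pair is reported only by the cell that contains one designated corner of the pair's intersection. The sketch below assumes that technique, picks the lower-left corner as the reference point, and treats cells as half-open so each point belongs to exactly one cell. The names `intersection` and `local_join` are hypothetical.

```python
def intersection(a, b):
    """Intersection of rectangles a and b (x1, y1, x2, y2), or None if disjoint."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    return (x1, y1, x2, y2) if x1 <= x2 and y1 <= y2 else None

def local_join(cell, rs, ss):
    """Join two record lists within one cell, reporting a pair only if the
    lower-left corner of its intersection falls inside this cell."""
    out = []
    for r in rs:
        for s in ss:
            inter = intersection(r, s)
            if inter is None:
                continue
            rx, ry = inter[0], inter[1]  # reference point of this pair
            if cell[0] <= rx < cell[2] and cell[1] <= ry < cell[3]:
                out.append((r, s))
    return out

r = (4, 1, 6, 3)
s = (3, 2, 7, 4)
left_cell, right_cell = (0, 0, 5, 5), (5, 0, 10, 5)
pairs = local_join(left_cell, [r], [s]) + local_join(right_cell, [r], [s])
```

Both rectangles span both cells, so each local join sees the pair, but only the left cell reports it because the reference point `(4, 2)` lies there. A fuller version would also need to handle reference points that fall exactly on the outer upper edge of the space, which the half-open test here excludes.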
18
Experiments: Datasets: TIGER: spatial features in the US such as streets and rivers (60 GB). OSM: OpenStreetMap (60 GB). NASA: 120 billion (4.6 TB). SYNTH: 2 billion (128 GB, uniform distribution). Experiment environment: Amazon EC2 cluster of up to 100 nodes; Hadoop on Java 1.6.
19
Experiments Evaluation Range Query
20
Experiments Evaluation Range Query
21
Experiments Evaluation KNN
22
Experiments Evaluation Spatial Join
23
Experiments Evaluation Index Creation
24
Thanks!