Presentation transcript: "ISP-GPU: Scaling out Geospatial Data Processing to GPU Clusters"

1 Outline
Introduction: Big Spatial Data, GPU Computing and Distributed Platforms
Spatial Query Processing on GPUs
ISP: System Architecture, Implementations
Experiments: Setup, Single-Node Performance, Scalability on Amazon EC2 Clusters
Summary and Future Work

2 Taxi trip data in NYC
Taxicabs: 13,000 medallion taxi cabs; a license is priced at over $1M; car services and taxi services are separate.
Taxi trip records: ~170 million trips (~300 million passengers) in 2009, roughly 1/5 of subway ridership and 1/3 of bus ridership in NYC.

3 Taxi trip data in NYC
Overall distributions of trip distance, time, speed and fare (2009).

4 Taxi trip data in NYC
How to manage taxi trip data? Geographical Information Systems (GIS), Spatial Databases (SDB), Moving Object Databases (MOD).
How good are they? Pretty good for small amounts of data, but rather poor for large-scale data.

5 Taxi trip data in NYC
Can we do better?
Example 1: loading 170 million taxi pickup locations into PostgreSQL:
UPDATE t SET PUGeo = ST_SetSRID(ST_Point("PULong","PuLat"),4326);
105.8 hours!
Example 2: finding the nearest tax blocks for 170 million taxi pickup locations using the open-source libspatialindex+GDAL: 30.5 hours!
(Intel Xeon 2.26 GHz processors with 48 GB memory.) I do not have time to wait...

6 Cloud computing+MapReduce+Hadoop
[Figure: single-node hardware architecture. A multi-core CPU host (CMP, with per-core local caches, a shared cache, DRAM, and HDD/SSD storage) connects over a PCI-E ring bus to a GPU (SIMD thread blocks, GDRAM) and a MIC coprocessor (in-order cores, 4 threads each: T0-T3).]
16 Intel Sandy Bridge CPU cores + 128 GB RAM + 8 TB disk + GTX TITAN + Xeon Phi 3120A: ~$9,994.

7 ISP-GPU: Scaling out Geospatial Data Processing to GPU Clusters
Attractive features of Cloudera Impala:
SQL frontend: translates SQL queries into execution plans
C/C++ backend with SSE4 support (for string operations)
Efficient implementations of hash joins (partitioned and non-partitioned)
LLVM-based JIT
...
But extension is challenging!

8 Nvidia GTX TITAN (Feb. 2013)
7.1 billion transistors (551 mm²)
2,688 processors (CUDA cores)
4.5 TFLOPS SP and 1.3 TFLOPS DP
Max memory bandwidth: 288.4 GB/s
PCI-E peripheral device
250 W (17.98 GFLOPS/W, SP)
Suggested retail price: $999
ASCI Red (1997): the first system to sustain 1 TFLOPS, built from Intel Pentium II Xeon processors in 72 cabinets.
What can we do today using a device that is more powerful than ASCI Red was 16 years ago?

9 Outline
Introduction: Big Spatial Data, GPU Computing and Distributed Platforms
Spatial Query Processing on GPUs
ISP: System Architecture, Implementations
Experiments: Setup, Single-Node Performance, Scalability on Amazon EC2 Clusters
Summary and Future Work

10 Spatial query processing on GPUs
Single-level grid-file based spatial filtering over vertices (polygons/polylines) and points
Perfectly coalesced memory accesses
Utilizing GPU floating-point computing power
Nested-loop based refinement (a hedged CUDA sketch follows)
J. Zhang, S. You and L. Gruenwald, "Parallel Online Spatial and Temporal Aggregations on Multi-core CPUs and Many-Core GPUs," Information Systems, vol. 44, pp. 134-154, 2014.
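To make the two phases concrete, here is a minimal, self-contained CUDA sketch of the refinement step. It is an illustration only, not ISP's actual kernel: every name in it is invented for this example, and the grid-file filter phase that would produce the candidate (point, polygon) pairs is elided; the toy main() supplies one hand-made pair per point instead.

#include <cstdio>
#include <cuda_runtime.h>

// Ray-crossing (even-odd) test of point (px, py) against one polygon ring
// whose n vertices are stored contiguously in vx/vy starting at 'first'.
__device__ bool pointInRing(float px, float py,
                            const float* vx, const float* vy,
                            int first, int n) {
    bool inside = false;
    for (int i = 0, j = n - 1; i < n; j = i++) {
        float xi = vx[first + i], yi = vy[first + i];
        float xj = vx[first + j], yj = vy[first + j];
        // Edges parallel to the ray are skipped by the first condition.
        if (((yi > py) != (yj > py)) &&
            (px < (xj - xi) * (py - yi) / (yj - yi) + xi))
            inside = !inside;
    }
    return inside;
}

// Nested-loop refinement: one thread per candidate (point, polygon) pair
// produced by the filter phase; result[t] is the matched polygon id or -1.
// Neighboring threads read neighboring candidate entries, keeping the
// per-pair accesses coalesced.
__global__ void refine(const float* px, const float* py,
                       const float* vx, const float* vy,
                       const int* candPt, const int* candPoly,
                       const int* polyFirst, const int* polyLen,
                       int nCand, int* result) {
    int t = blockIdx.x * blockDim.x + threadIdx.x;
    if (t >= nCand) return;
    int p = candPt[t], g = candPoly[t];
    result[t] = pointInRing(px[p], py[p], vx, vy,
                            polyFirst[g], polyLen[g]) ? g : -1;
}

int main() {
    // Toy data: one unit-square polygon and two points, each paired with
    // polygon 0 by a (skipped) filter phase.
    float hPx[] = {0.5f, 2.0f}, hPy[] = {0.5f, 2.0f};
    float hVx[] = {0, 1, 1, 0}, hVy[] = {0, 0, 1, 1};
    int hCandPt[] = {0, 1}, hCandPoly[] = {0, 0};
    int hFirst[] = {0}, hLen[] = {4}, hRes[2];

    float *px, *py, *vx, *vy; int *cp, *cg, *pf, *pl, *res;
    cudaMalloc(&px, sizeof(hPx)); cudaMalloc(&py, sizeof(hPy));
    cudaMalloc(&vx, sizeof(hVx)); cudaMalloc(&vy, sizeof(hVy));
    cudaMalloc(&cp, sizeof(hCandPt)); cudaMalloc(&cg, sizeof(hCandPoly));
    cudaMalloc(&pf, sizeof(hFirst)); cudaMalloc(&pl, sizeof(hLen));
    cudaMalloc(&res, sizeof(hRes));
    cudaMemcpy(px, hPx, sizeof(hPx), cudaMemcpyHostToDevice);
    cudaMemcpy(py, hPy, sizeof(hPy), cudaMemcpyHostToDevice);
    cudaMemcpy(vx, hVx, sizeof(hVx), cudaMemcpyHostToDevice);
    cudaMemcpy(vy, hVy, sizeof(hVy), cudaMemcpyHostToDevice);
    cudaMemcpy(cp, hCandPt, sizeof(hCandPt), cudaMemcpyHostToDevice);
    cudaMemcpy(cg, hCandPoly, sizeof(hCandPoly), cudaMemcpyHostToDevice);
    cudaMemcpy(pf, hFirst, sizeof(hFirst), cudaMemcpyHostToDevice);
    cudaMemcpy(pl, hLen, sizeof(hLen), cudaMemcpyHostToDevice);

    refine<<<1, 32>>>(px, py, vx, vy, cp, cg, pf, pl, 2, res);
    cudaMemcpy(hRes, res, sizeof(hRes), cudaMemcpyDeviceToHost);
    printf("point 0 -> %d, point 1 -> %d\n", hRes[0], hRes[1]);
    return 0;
}

Compiled with nvcc, this prints point 0 -> 0 and point 1 -> -1 (inside and outside the unit square, respectively).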

11 Spatial query processing on GPUs
Datasets: 38,794 census blocks (470,941 points); 735,488 tax blocks (4,698,986 points); 147,011 street segments.

            P2N-D    P2P-T    P2P-D
CPU time    -        15.2 h   30.5 h
GPU time    10.9 s   11.2 s   33.1 s
Speedup     -        4,900X   3,200X

Sources of speedup: algorithmic improvement 3.7X, main-memory data structures 37.4X, GPU acceleration 24.3X. These factors compound to roughly 3.7 × 37.4 × 24.3 ≈ 3,400X, consistent with the end-to-end speedups above.

12 Outline
Introduction: Big Spatial Data, GPU Computing and Distributed Platforms
Spatial Query Processing on GPUs
ISP-MC+ and ISP-GPU: System Architecture, Implementations
Experiments: Setup, Single-Node Performance, Scalability on Amazon EC2 Clusters
Summary and Future Work

13 pip_join(...), nearest_join(...), create_rtree(...)

// Impala plan node implementing spatial joins by extending BlockingJoinNode.
class SpatialJoinNode : public BlockingJoinNode {
 public:
  SpatialJoinNode(ObjectPool* pool, const TPlanNode& tnode,
                  const DescriptorTbl& descs);
  // Standard Impala ExecNode lifecycle: prepare state, stream out result
  // batches, then release resources.
  virtual Status Prepare(RuntimeState* state);
  virtual Status GetNext(RuntimeState* state, RowBatch* row_batch, bool* eos);
  virtual void Close(RuntimeState* state);

 protected:
  virtual Status InitGetNext(TupleRow* first_left_row);
  // Blocking join: fully consume and index the build side before probing
  // it with the other input.
  virtual Status ConstructBuildSide(RuntimeState* state);

 private:
  boost::scoped_ptr<TPlanNode> thrift_plan_node_;
  RuntimeState* runtime_state_;
};

14 ISP-GPU: Scaling out Geospatial Data Processing to GPU Clusters

15 Outline
Introduction: Big Spatial Data, GPU Computing and Distributed Platforms
Spatial Query Processing on GPUs
ISP: System Architecture, Implementations
Experiments: Setup, Single-Node Performance, Scalability on Amazon EC2 Clusters
Summary and Future Work

16 Taxi trip data in NYC
Taxicabs: 13,000 medallion taxi cabs; a license is priced at over $1M; car services and taxi services are separate.
Taxi trip records: ~170 million trips (~300 million passengers) in 2009, roughly 1/5 of subway ridership and 1/3 of bus ridership in NYC.

17 Global Biodiversity Data at GBIF
SELECT aoi_id, sp_id, sum(ST_area(inter_geom))
FROM (
    SELECT aoi_id, sp_id,
           ST_Intersection(sp_geom, qw_geom) AS inter_geom
    FROM SP_TB, QW_TB
    WHERE ST_Intersects(sp_geom, qw_geom)
) AS sub
GROUP BY aoi_id, sp_id
HAVING sum(ST_area(inter_geom)) > T;
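Read as a formula (my restatement of the SQL above, with $g_s$ a species range geometry from SP_TB and $g_a$ an area-of-interest/query-window geometry from QW_TB), the query reports, for each pair $(a, s)$,

$$A(a, s) = \sum_{g_s \cap g_a \neq \emptyset} \operatorname{area}(g_s \cap g_a) \quad \text{whenever } A(a, s) > T.$$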

18 Single-node results: 16-core CPU/128 GB RAM, GTX Titan

                  ISP-GPU   ISP-MC+   GPU-Standalone   MC-Standalone
taxi-nycb (s)     96        130       50               89
GBIF-WWF (s)      1822      2816      1498             2664

taxi-nycb: ~170 million points, ~40 thousand polygons (9 vertices/polygon)
GBIF-WWF: ~375 million points, ~15 thousand polygons (279 vertices/polygon)
Cluster results: 2-10 Amazon EC2 nodes, each with 8 vCPU cores/15 GB RAM and a 4 GB CUDA-enabled GPU (50 million species locations used due to a memory constraint).

19 Outline
Introduction: Big Spatial Data, GPU Computing and Distributed Platforms
Spatial Query Processing on GPUs
ISP: System Architecture, Implementations
Experiments: Setup, Single-Node Performance, Scalability on Amazon EC2 Clusters
Summary and Future Work

20 Summary and Future Work
We presented the design and implementation of an in-memory spatial data management system on multi-core CPU and many-core GPU clusters, built by extending Cloudera Impala for distributed spatial join query processing. Experiments on the initial implementations have revealed both advantages and disadvantages of extending a tightly coupled big data system to support spatial data types and their operations. Alternative techniques are being developed to further improve efficiency, scalability, extensibility and portability.

21 Alternative Techniques: SpatialSpark (just open-sourced)

// Imports added for context; the spatialspark package paths below are
// assumptions based on the open-source project layout, not taken from
// the slide.
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
import com.vividsolutions.jts.io.WKTReader
import scala.util.Try
import spatialspark.operator.SpatialOperator
import spatialspark.join.BroadcastSpatialJoin

val sc = new SparkContext(conf)
// Read left-side data from HDFS and perform pre-processing.
val leftData = sc.textFile(leftFile, numPartitions)
  .map(x => x.split(SEPARATOR)).zipWithIndex()
// Parse WKT geometries, keeping only rows that parse successfully.
val leftGeometryById = leftData
  .map(x => (x._2, Try(new WKTReader().read(x._1.apply(leftGeometryIndex)))))
  .filter(_._2.isSuccess).map(x => (x._1, x._2.get))
// Similarly for right-side data...

// Ready for a spatial query (broadcast-based).
val joinPredicate = SpatialOperator.Within // NearestD can be applied similarly
var matchedPairs: RDD[(Long, Long)] =
  BroadcastSpatialJoin(sc, leftGeometryById, rightGeometryById, joinPredicate)

22 Alternative Techniques
Lightweight Distributed Execution Engine for Large-Scale Spatial Join Query Processing

