Tree-Based Density Clustering using Graphics Processors


1 Tree-Based Density Clustering using Graphics Processors
A First Marriage of MRNet and GPUs
Evan Samanas and Ben Welton

2 The Tweet Stream
Source: Twitter, Map: About.com

3 Tree-Based Overlay Networks (TBONs)
Scalable multicast
Scalable gather
Scalable data aggregation
[Diagram: tree with a front-end (FE) at the root, communication processes (CP) at the internal nodes, and back-end (BE) application processes at the leaves]

4 MRNet – Multicast / Reduction Network
General-purpose TBON API
  Network: user-defined topology
  Stream: logical data channel to a set of back-ends
    multicast, gather, and custom reduction
  Packet: collection of data
  Filter: stream data operator
    synchronization
    transformation
Widely adopted by HPC tools
  CEPBA toolkit
  Cray ATP & CCDB
  Open|SpeedShop & CBTF
  STAT
  TAU
[Diagram: FE applying filter F(x1,…,xn) over a tree of CPs and BE app processes]
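The stream and filter roles listed above can be illustrated with a minimal sketch in plain Python. This is not the MRNet API: `Node`, `reduce_up`, and `sum_filter` are hypothetical names, and the tree lives in one process rather than across hosts. It only shows how a filter applied at each internal node reduces back-end packets on their way up to the front-end.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """One tree node: leaves (BEs) hold a packet, internal nodes hold children."""
    packet: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def reduce_up(node: Node, filt: Callable[[List[List[int]]], List[int]]) -> List[int]:
    """Gather packets from the leaves and apply `filt` at every internal node."""
    if not node.children:
        return node.packet  # a back-end simply emits its packet
    return filt([reduce_up(child, filt) for child in node.children])


def sum_filter(packets: List[List[int]]) -> List[int]:
    """Hypothetical reduction filter: output is one value, whatever the fan-in."""
    return [sum(sum(p) for p in packets)]


# Four back-ends, two communication processes, one front-end.
backends = [Node(packet=[v]) for v in (1, 2, 3, 4)]
fe = Node(children=[Node(children=backends[:2]), Node(children=backends[2:])])
result = reduce_up(fe, sum_filter)
```

Because `sum_filter` emits a single value regardless of how many children feed it, the packet size stays constant at every level of the tree.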

5 TBON Computation
Data to process: e.g. 40 MB
Ideal characteristics:
  Filter output size constant or decreasing
  Computation rate similar across levels
  Adjustable for load balance
[Diagram labels: FE ~10 sec; CP ~10 sec; packet size ≤10 MB between levels; data size 10 MB per BE, ~10 sec; 4x BE app; ~40 sec; total time ~60 sec, then ~30 sec]
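One reading of the timing labels on this slide is that a single back-end needs ~40 sec for 40 MB, while four back-ends working in parallel need ~10 sec each, with the CP and FE levels adding ~10 sec apiece in both cases. The sketch below encodes that reading; the function name, the 1 MB/s processing rate, and the fixed per-level cost are assumptions made for illustration, not figures stated by the authors.

```python
def total_time_s(data_mb, num_backends, be_rate_mb_per_s=1.0,
                 upper_levels=2, upper_level_time_s=10.0):
    """Estimate end-to-end time for a tree with one BE level plus fixed-cost upper levels.

    Back-ends split the data evenly and run in parallel, so the BE level costs
    the time of one share. Each upper level (CP, FE) is assumed to cost a fixed
    amount because filter output size does not grow.
    """
    be_time = (data_mb / num_backends) / be_rate_mb_per_s
    return be_time + upper_levels * upper_level_time_s


one_backend = total_time_s(40, 1)    # 40 + 10 + 10
four_backends = total_time_s(40, 4)  # 10 + 10 + 10
print(one_backend, four_backends)    # 60.0 30.0
```

Under these assumptions the upper levels contribute a constant 20 sec, so adding back-ends only shrinks the BE term.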

6 Why GPUs?
Natural fit
Increase compute power
Trade computation for bandwidth
  Derived summaries
  Compute and send ∆ data
  Compression algorithms (e.g. LZO, zlib)
[Diagram: FE applying filter F(x1,…,xn) over a tree of CPs and BE app processes]

7 Clustering Example (DBSCAN[1])
Goal: Find regions that meet minimum density and spatial distance characteristics
The two parameters that determine whether a point is in a cluster are Epsilon (Eps) and MinPts
If the number of points within Eps of a point is > MinPts, the point is a core point
For every discovered point, this same calculation is performed until the cluster is fully expanded
[Diagram: Eps radius around a point, MinPts: 3]
[1] M. Ester et al., A density-based algorithm for discovering clusters in large spatial databases with noise (1996)
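The expansion rule on this slide can be sketched as a small brute-force DBSCAN in Python. It is a minimal illustration, not the authors' implementation: `region_query` and `dbscan` are names chosen here, the neighbor search is a linear scan, and the core-point test follows the slide's wording (count > MinPts), whereas Ester et al. state the condition as count ≥ MinPts.

```python
from math import dist  # Python 3.8+

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) <= min_pts:      # not a core point (slide's rule)
            labels[i] = NOISE
            continue
        labels[i] = cluster
        frontier = list(neighbors)
        while frontier:                    # expand until no new points appear
            j = frontier.pop()
            if labels[j] == NOISE:
                labels[j] = cluster        # noise reached from a core point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) > min_pts: # only core points keep expanding
                frontier.extend(j_neighbors)
        cluster += 1
    return labels


sample = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
sample_labels = dbscan(sample, eps=1.5, min_pts=3)
```

With these inputs the four points of the unit square form one cluster and the distant point is labeled noise.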

8 Scaling DBSCAN
PDBSCAN[2]
  Quality equivalent to single-node DBSCAN
  Linear speedup up to 8 nodes
DBDC[3]
  Sacrifices quality
  ~30x speedup on 15 nodes
CUDA-DClust[4] (GPU, single node)
  Quality equivalent to DBSCAN
  ~15x faster on 1 node
[2] X. Xu et al., A Fast Parallel Clustering Algorithm for Large Spatial Databases (1999)
[3] E. Januzaj et al., DBDC: Density Based Distributed Clustering (2004)
[4] C. Böhm et al., Density-based clustering using graphics processors (2009)

9 Tree-Based Clustering: Mr. Scan
Algorithm steps
  SpatialDecomp (FE)
  DBSCAN: CPU or GPU (BE)
  DrawBoundBox: CPU or GPU
  MergeCluster: CPU (x #levels)
[Diagram: MergeCluster runs at the FE and CP levels; DBSCAN runs on the BEs]

10 Spatial Decomposition
1. Start with an input of spatially referenced points
2. Partition the region into equal-sized density regions across one dimension
3. Add a shadow region one Epsilon (Eps) wide to all density regions
[Diagram: Partition #1, Partition #2, Partition #3, with Eps-wide shadow regions]
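The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `decompose_1d` is a hypothetical name, the split is along the x coordinate into equal-width slices, and the shadow region is added on both sides of every slice.

```python
def decompose_1d(points, num_parts, eps):
    """Split points into equal-width x slices, each widened by an eps shadow.

    Points that fall inside a neighbor's shadow region are duplicated into
    both partitions, so clusters that straddle a boundary can be matched later.
    """
    xs = [p[0] for p in points]
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / num_parts
    parts = []
    for k in range(num_parts):
        left = lo + k * width
        # Pin the last slice to hi so rounding never drops the rightmost point.
        right = hi if k == num_parts - 1 else left + width
        parts.append([p for p in points if left - eps <= p[0] <= right + eps])
    return parts


pts = [(0, 0), (1, 0), (4.5, 0), (5.5, 0), (10, 0)]
slices = decompose_1d(pts, num_parts=2, eps=1)
```

Here the boundary sits at x = 5, so the points at x = 4.5 and x = 5.5 land in both slices.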

11 DBSCAN - CPU
Run on local slice for each BE
R* tree
Start at random point
Cluster w/ respect to Eps, MinPts
Complexity: O(n log n)
[Diagram: R* tree example with nodes A, B, C, D, E]

12 GPU DBSCAN Filter
GPU DBSCAN operates similarly to the CPU version with two exceptions
  Candidate clusters (chains) are potential clusters being explored concurrently in the GPU
    Number of candidate clusters is limited by GPU characteristics
    Chain collisions cause cluster merges
  State array stores the current state of points in the search space
    Coarse-grain search space
Reference: CUDA-DClust (Böhm et al., 2009)
[Diagram: Candidate #1, Candidate #2, ..., Candidate #N exploring the search space alongside the state array]
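When two concurrently explored chains collide, they have to end up in the same cluster. One common way to record that is a union-find (disjoint-set) structure, sketched below in Python on the host side. This is an assumption about bookkeeping chosen for illustration; the slide does not say how CUDA-DClust or the GPU filter tracks collisions, and `ChainSets` is a hypothetical name.

```python
class ChainSets:
    """Union-find over chain ids; chains in the same set belong to one cluster."""

    def __init__(self, num_chains):
        self.parent = list(range(num_chains))

    def find(self, chain):
        """Return the representative chain id, compressing the path on the way."""
        while self.parent[chain] != chain:
            self.parent[chain] = self.parent[self.parent[chain]]
            chain = self.parent[chain]
        return chain

    def collide(self, a, b):
        """Record that chains a and b reached a common point and must merge."""
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            # Keep the smaller id as representative so results are deterministic.
            self.parent[max(ra, rb)] = min(ra, rb)


chains = ChainSets(4)
chains.collide(0, 2)
chains.collide(2, 3)
final_ids = [chains.find(c) for c in range(4)]
```

After both collisions, chains 0, 2, and 3 resolve to the same cluster id while chain 1 stays separate.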

13 DrawBoundBox - CPU or GPU

14 MergeBoundBox - CPU
Checks for a merge only if a cluster's bounding box lies within the shadow region
At least one point MUST be in common
Iterates through ALL points in the right cluster
[Diagram: two clusters sharing a point, labeled "Match!"]
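The merge test described on this slide can be sketched in Python. The function names are hypothetical, and the cheap pre-check used here is a plain bounding-box overlap test standing in for the slide's "box within shadow" check, which is not specified in more detail. The second stage follows the slide: iterate through all points of the right cluster and merge only if at least one point is shared.

```python
def bounding_box(cluster):
    """Axis-aligned box (min_x, min_y, max_x, max_y) around a list of 2D points."""
    xs = [p[0] for p in cluster]
    ys = [p[1] for p in cluster]
    return (min(xs), min(ys), max(xs), max(ys))


def boxes_overlap(a, b):
    """True if two (min_x, min_y, max_x, max_y) boxes intersect or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def merge_if_shared(left, right):
    """Return the merged cluster if left and right share a point, else None."""
    if not boxes_overlap(bounding_box(left), bounding_box(right)):
        return None                      # boxes apart: skip the point scan
    left_set = set(left)
    for p in right:                      # iterate through the right cluster
        if p in left_set:                # one common point is enough
            return sorted(left_set | set(right))
    return None


merged = merge_if_shared([(0, 0), (1, 0), (2, 0)], [(2, 0), (3, 0)])
```

The box test only limits how often the point scan runs; a pair of overlapping boxes with no shared point is still rejected by the scan.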

15 The Tweet Stream
Source: Twitter, Map: About.com

16 Evaluation
Dataset: 1-3 “Tweet Days”
Measuring:
  Time to completion
  Quality compared to single-threaded DBSCAN
Algorithms:
  Single-threaded DBSCAN
  DBDC
  MRNet w/ DBSCAN filter
  MRNet w/ DBSCAN GPU filter

17 Results

18 Results

19 Results

20 Quality

21 Future Work
Scaling Issues
  Spatial Decomposition
  Merging Algorithm

22 2D Spatial Decomposition
1D Spatial Decomposition has some severe limitations
  Splits can have wildly differing point counts
  Number of splits limited by Epsilon
2D Spatial Decomposition would allow for more splits with more equal point counts
[Diagram: example splits containing 7 pts and 11 pts, with Eps marked]

23 Merging Algorithm
Bounding Box
  No quality degradation
  Limits iteration, but still iterates
Possible Alternative: Concave Hull
  DBSCAN on border points
  Avoids iteration
  Constant output size
  O(n^2) on BEs

24 Use Cases
Twitter Data
  “Flu Tweets”
  Mood/Topic clustering
  Riot prediction
Any Spatial data
  Currently limited to 2D

25 Conclusion
DBSCAN performance scales
Quality can be maintained
Goal: Scale to O(100,000) nodes

