Speeding up k-Means by GPUs
You Li
Supervisor: Dr. Chu Xiaowen
Co-supervisor: Prof. Liu Jiming
Thursday, March 11, 2010
Outline
Introduction: efficiency of data mining -> GPGPU -> k-means on GPU
Related work
Method
Research Plan
Efficiency of Data Mining
Data mining faces an efficiency challenge as data volumes keep growing; parallel data mining is one answer. (Fig. 1, Fig. 2)
The efficiency of data mining
GPGPU
General-purpose, high-performance parallel hardware.
Provides another platform for parallelizing data mining algorithms.
(Fig. 3: CPU vs. GPU architecture: control logic, cache, ALUs, DRAM.)
k-Means on GPU
Programming on GPU: CUDA, an integrated CPU+GPU programming model using C.
k-Means: widely used in statistical data analysis, pattern recognition, etc.; easy to implement on the CPU and well suited to the GPU.
Outline
Introduction
Related work: UV_k-Means, GPUMiner, and HP_k-Means
Method
Research Plan
Related Work
Running time of k-Means on low-dimensional data, in seconds (NVIDIA GTX 280 GPU; Intel(R) Core(TM) i5 CPU):

n          k    d  MineBench on CPU  HP k-Means  UV k-Means  GPUMiner
2 million  100  2             19.36        1.45        2.84     61.39
2 million  400  2             70.93        2.16        5.96     63.46
2 million  100  8             39.81        2.48        6.07    192.05
2 million  400  8            152.25        4.53       16.32    226.79
4 million  100  2             38.74        2.88        5.64    130.36
4 million  400  2            141.84        4.38       11.94    126.38
4 million  100  8             79.60        4.95       12.85    383.41
4 million  400  8            304.46        9.03       34.54    474.83
Outline
Introduction
Related work
Method and Results: k-Means (three steps) -> step 1 -> step 2 -> step 3; experiments
Research Plan
k-Means Algorithm
Input: n data points and k initial centroids.
Step 1, O(nkd): compute the distance between each data point and each centroid.
Step 2, O(nk): assign each data point to its closest centroid.
Step 3, O(nd): compute the new centroids.
If any centroid changed, repeat from Step 1; otherwise, end.
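As a baseline, steps 1 and 2 (distance computation and nearest-centroid assignment) can be sketched as a naive CUDA kernel with one thread per data point. This is an illustrative sketch, not the presentation's actual code; all names (assignLabels, data, centroids, labels) are assumptions. Note that every operand here is fetched from global memory, which is exactly the cost the optimizations on the later slides reduce.

```cuda
// Naive labeling kernel (steps 1 and 2): one thread per data point.
// data: n x d points, row-major; centroids: k x d, row-major.
__global__ void assignLabels(const float *data, const float *centroids,
                             int *labels, int n, int k, int d)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float bestDist = 3.402823e38f;   // FLT_MAX
    int bestC = 0;
    for (int c = 0; c < k; ++c) {            // Step 1: O(kd) work per point
        float dist = 0.0f;
        for (int j = 0; j < d; ++j) {
            float diff = data[i * d + j] - centroids[c * d + j];
            dist += diff * diff;             // squared Euclidean distance
        }
        if (dist < bestDist) {               // Step 2: track closest centroid
            bestDist = dist;
            bestC = c;
        }
    }
    labels[i] = bestC;
}
```

Step 3 (recomputing the centroids from the labeled points) is a reduction and can run on either the GPU or the CPU.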
Memory Mechanism of GPU
Global memory: large size, long latency.
Registers: small size, short latency; not under user control.
Shared memory: medium size, short latency; under user control.
k-Means on GPU
Key idea: increase the number of computing operations per global memory access; adopt the tiling methods from matrix multiplication and reduction.
Dimension is a key parameter: for low dimension, use registers; for high dimension, use shared memory.
k-Means on GPU: Low Dimension
Read each data value from global memory only once.
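One way to realize the single-read property for low-dimensional data is to cache each point's coordinates in registers. The sketch below uses the dimension as a compile-time template parameter; this is an assumption about how the register caching might be arranged, not the authors' actual implementation.

```cuda
// Low-dimension sketch: each point is held in registers, so each of its
// D values is read from global memory once and reused for all k centroids.
template <int D>
__global__ void assignLabelsLowDim(const float *data, const float *centroids,
                                   int *labels, int n, int k)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float p[D];                          // point cached in registers
    for (int j = 0; j < D; ++j)
        p[j] = data[i * D + j];          // single global read per value

    float bestDist = 3.402823e38f;       // FLT_MAX
    int bestC = 0;
    for (int c = 0; c < k; ++c) {
        float dist = 0.0f;
        for (int j = 0; j < D; ++j) {
            float diff = p[j] - centroids[c * D + j];
            dist += diff * diff;
        }
        if (dist < bestDist) { bestDist = dist; bestC = c; }
    }
    labels[i] = bestC;
}
// Launch example for d = 2:
//   assignLabelsLowDim<2><<<(n + 255) / 256, 256>>>(data, centroids, labels, n, k);
```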
k-Means on GPU: High Dimension
Read each data value from global memory only once.
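For high-dimensional data the point no longer fits in registers, so a block of points can instead be staged in shared memory, the tiling idea borrowed from matrix multiplication: each data value is fetched from global memory once and then reused from shared memory for all k centroids. The tile size, names, and layout below are illustrative assumptions; the sketch also assumes TILE_POINTS * d floats fit in the GTX 280's 16 KB of shared memory per block.

```cuda
#define TILE_POINTS 16   // points staged per block (assumed tile size)

// Launch with blockDim.x == TILE_POINTS and dynamic shared memory of
// TILE_POINTS * d * sizeof(float) bytes.
__global__ void assignLabelsHighDim(const float *data, const float *centroids,
                                    int *labels, int n, int k, int d)
{
    extern __shared__ float tile[];          // TILE_POINTS x d point tile
    int base = blockIdx.x * TILE_POINTS;     // first point owned by this block

    // Cooperative load: one global-memory read per data value.
    for (int idx = threadIdx.x; idx < TILE_POINTS * d; idx += blockDim.x) {
        int row = base + idx / d;
        if (row < n) tile[idx] = data[row * d + idx % d];
    }
    __syncthreads();

    int i = base + threadIdx.x;              // this thread's point
    if (i >= n) return;

    float bestDist = 3.402823e38f;           // FLT_MAX
    int bestC = 0;
    for (int c = 0; c < k; ++c) {
        float dist = 0.0f;
        for (int j = 0; j < d; ++j) {        // point reread from shared memory
            float diff = tile[threadIdx.x * d + j] - centroids[c * d + j];
            dist += diff * diff;
        }
        if (dist < bestDist) { bestDist = dist; bestC = c; }
    }
    labels[i] = bestC;
}
```

Centroids are still read from global memory in this sketch; a further refinement could tile them through shared memory as well.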
Experiments
The experiments were conducted on a PC with an NVIDIA GTX 280 GPU and an Intel(R) Core(TM) i5 CPU. The GTX 280 has 30 SIMD multiprocessors, each containing eight processors running at 1.29 GHz. The GPU has 1 GB of memory with a peak bandwidth of 141.7 GB/s. The CPU has four cores running at 2.67 GHz; main memory is 8 GB with a peak bandwidth of 5.6 GB/s. All source code was written and compiled with Visual Studio 2008, using CUDA 2.3. We measure the running time of each application after file I/O, to show the speedup effect more clearly.
Experiments: Low-Dimensional Data
Compared with HP, UV, and GPUMiner on randomly generated data.
Our implementation is four to ten times faster than HP.
Experiments: High-Dimensional Data
Compared with UV and GPUMiner on data from KDD 1999.
Our implementation is four to eight times faster than UV.
Experiments: Comparison with CPU
The results show that our algorithm compares very favorably with existing algorithms: it is forty to two hundred times faster than the CPU version.
Outline
Introduction
Related work
Method
Research Plan
Research Plan
Detailed analysis of k-Means on GPU (GFLOPS).
Deal with even larger data sets.
Other data mining algorithms on GPU: k-NN; SDP (widely used in protein identification).
Q & A
Thank you very much.