1
Weekly Report – Start Learning GPU
Ph.D. Student: Leo Lee
Date: Sep. 18, 2009
2
Outline
–References
–CUDA
–Work plan
3
Outline
–References
–CUDA
–Work plan
4
References
5
Frequent itemset mining on graphics processors – Introduction
–Two representative algorithms: Apriori and FP-growth; FP-growth is generally faster than Apriori, while Apriori-Borgelt is slightly faster when the support is high.
–No prior work focuses on GPU acceleration of FIM algorithms.
–Challenge: the data structures are not aligned and the access patterns are not regular (pointer-chasing).
6
Frequent itemset mining on graphics processors – Background and related work: GPGPU
–The parallel primitives [19] are a small set of common operations that exploit the architectural features of GPUs; the map, reduce, and prefix-sum primitives are used in the two FIM implementations.
–Improvement – memory optimizations:
 Local memory optimization for temporal locality;
 Coalesced access to device memory for spatial locality;
 Built-in vector data types to reduce the number of memory accesses (see the sketch below).
–Difference: this work studies the GPU acceleration of Apriori for FIM, which incurs much more complex control flows and memory accesses than performing database joins or maintaining quantiles over data streams.
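To make the coalesced-access and vector-type bullets concrete, here is a minimal CUDA sketch (not the paper's code): each thread reads two transaction bitmaps through the built-in int4 type, so one 16-byte load replaces four 4-byte loads, and neighbouring threads touch consecutive 16-byte chunks, keeping device-memory accesses coalesced. The kernel name and parameters (and_popcount, rowA, rowB, partialCounts, nVec) are illustrative assumptions.

// Hypothetical support-counting kernel: AND two bitmap rows and count the
// matching transactions. int4 gives 16-byte vectorized loads, issued by
// neighbouring threads to consecutive addresses (coalesced access);
// __popc counts the set bits of each 32-bit word.
__global__ void and_popcount(const int4* rowA, const int4* rowB,
                             int* partialCounts, int nVec)
{
    int tid   = blockIdx.x * blockDim.x + threadIdx.x;
    int count = 0;
    for (int i = tid; i < nVec; i += gridDim.x * blockDim.x) {
        int4 a = rowA[i];                      // one 16-byte load
        int4 b = rowB[i];
        count += __popc(a.x & b.x) + __popc(a.y & b.y)
               + __popc(a.z & b.z) + __popc(a.w & b.w);
    }
    partialCounts[tid] = count;                // one partial sum per thread
}

A reduce (or prefix-sum) primitive of the kind cited in [19] would then sum partialCounts on the GPU to obtain the support of the itemset.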
8
Frequent itemset mining on graphics processors – Implementation
9
Frequent itemset mining on graphics processors – Implementation
10
Frequent itemset mining on graphics processors – Implementation: Pure Bitmap Implementation (PBI)
11
Frequent itemset mining on graphics processors – Implementation: PBI
Given m frequent (K-1)-itemsets and n items, checking whether one (K-1)-itemset is frequent requires accessing about log2(m) * (n/128) * 16 bytes of data, where log2(m) is the cost of the binary search and (n/128) * 16 is the size of a row (in bytes) in the bitmap of (K-1)-itemsets. Typically, if m = 10,000 and n = 10,000, about 16 KB must be accessed to check only one (K-1)-subset. This problem in the pure bitmap-based solution motivates adopting another data structure for the Candidate Generation procedure in the presence of a large number of items.
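As a quick sanity check on the 16 KB figure, the cost formula can be evaluated directly. Below is a small, hypothetical host-side helper (plain C++ that compiles as CUDA host code); the function name is invented, and the constants follow the slide, not the paper's implementation.

// Hypothetical helper reproducing the PBI cost estimate: a binary search over
// m itemset rows touches about log2(m) rows, and each row of the (K-1)-itemset
// bitmap holds n bits, i.e. (n/128)*16 bytes.
#include <cmath>
#include <cstdio>

static double bytesPerSubsetCheck(int m, int n)
{
    double rowBytes = (n / 128.0) * 16.0;                 // n bits expressed as bytes
    return std::log2(static_cast<double>(m)) * rowBytes;  // rows read by the binary search
}

int main()
{
    // The slide's example: m = 10,000 frequent (K-1)-itemsets, n = 10,000 items.
    std::printf("%.1f KB per (K-1)-subset check\n",
                bytesPerSubsetCheck(10000, 10000) / 1024.0);   // prints roughly 16 KB
    return 0;
}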
12
Frequent itemset mining on graphics processors – Implementation: Trie-Based Implementation (TBI)
Candidate generation based on trie traversal is implemented on the CPU. This decision is based on the fact that the trie is an irregular structure and is difficult to share among SIMD threads. Thus, the trie representing itemsets is stored in CPU memory, and the bitmap representation of transactions is stored in GPU device memory.
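A minimal sketch of what that split could look like, assuming (this is not the paper's code) a simple pointer-based trie on the host; TrieNode and its field names are illustrative.

// Hypothetical CPU-side trie for TBI: the irregular, pointer-chasing trie of
// frequent itemsets stays in host memory, while the GPU only holds the regular
// bitmap of transactions used for support counting.
#include <map>

struct TrieNode {
    int item    = -1;                      // item id that extends the parent's itemset
    int support = 0;                       // filled in after GPU support counting
    std::map<int, TrieNode*> children;     // ordered children, walked during candidate generation
};

// Candidate generation (CPU): two sibling nodes sharing a parent are joined into
// a K-candidate; its item ids are sent to the GPU, which ANDs the corresponding
// transaction bitmaps and popcounts the result to obtain the candidate's support.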
13
Frequent itemset mining on graphics processors – Implementation: TBI
14
Frequent itemset mining on graphics processors – Experiments
15
Frequent itemset mining on graphics processors – Experiments
16
Frequent itemset mining on graphics processors – Results
17
Frequent itemset mining on graphics processors – Results
18
Frequent itemset mining on graphics processors – Results
19
Frequent itemset mining on graphics processors – Results
20
Outline
–References
–CUDA
–Work plan
21
CUDA – Review of the K-means code
–CPU: 1101 s (10 s)
–GPU: still needs debugging; no results yet.
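Since the GPU version is still under debugging, here is a minimal sketch of the usual first kernel, the assignment step with one thread per point; it is not the code under test, and the names (assign_clusters, d_points, d_centroids, d_labels) and the row-major layout are assumptions.

// Hypothetical k-means assignment kernel: each thread labels one point with the
// index of its nearest centroid (squared Euclidean distance).
__global__ void assign_clusters(const float* d_points,    // n x dim, row-major
                                const float* d_centroids, // k x dim, row-major
                                int* d_labels, int n, int k, int dim)
{
    int p = blockIdx.x * blockDim.x + threadIdx.x;
    if (p >= n) return;

    int best = 0;
    float bestDist = 3.402823466e38f;      // ~FLT_MAX
    for (int c = 0; c < k; ++c) {
        float dist = 0.0f;
        for (int d = 0; d < dim; ++d) {
            float diff = d_points[p * dim + d] - d_centroids[c * dim + d];
            dist += diff * diff;
        }
        if (dist < bestDist) { bestDist = dist; best = c; }
    }
    d_labels[p] = best;
}

// Host loop (sketch): launch assign_clusters<<<(n + 255) / 256, 256>>>(...),
// recompute centroids from the labels (on the CPU at first), and repeat until
// the labels stop changing; this is the piece to compare against the CPU run.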
22
Outline
–References
–CUDA
–Work plan
23
Work Plan
–Summarize this month
–Make a plan for next month
–Try to implement a data mining algorithm
–Homework
24
References
Key words                        Google Scholar   ACM Portal
GPU decision tree                2,230            222
GPU k-means                      388              184
GPU SVM                          416              27
GPU Apriori                      1,980            11
GPU Expectation Maximization     266              24
GPU PageRank                     4,260            5
GPU AdaBoost                     113              9
GPU k-nn                         314              20
GPU Naive Bayes                  104              2 (false positive)
GPU CART                         1,040            3 (false positive)
25
Thanks for listening.