
1 Copyright 2005, Data Mining Research Lab, The Ohio State University Cache-conscious Frequent Pattern Mining on a Modern Processor Amol Ghoting, Gregory Buehrer, and Srinivasan Parthasarathy Data Mining Research Laboratory, CSE, The Ohio State University Daehyun Kim, Anthony Nguyen, Yen-Kuang Chen, and Pradeep Dubey Intel Corporation

2 Copyright 2005, Data Mining Research Lab, The Ohio State University Roadmap Motivation and Contributions Background Performance characterization Cache-conscious optimizations Related work Conclusions

3 Copyright 2005, Data Mining Research Lab, The Ohio State University Motivation Data mining applications –Rapidly growing segment in commerce and science –Interactive, so response time is important –Compute- and memory-intensive Modern architectures –Memory wall –Instruction level parallelism (ILP) [Figure: FP-Growth performance chart showing saturation, with 2.4x and 1.6x annotations. Note: experiment conducted on specialized hardware]

4 Copyright 2005, Data Mining Research Lab, The Ohio State University Contributions We characterize the performance and memory access behavior of three state-of-the-art frequent pattern mining algorithms We improve the performance of the three frequent pattern mining algorithms –Cache-conscious prefix tree Spatial locality + hardware pre-fetching Path tiling to improve temporal locality Co-scheduling to improve ILP on a simultaneous multi-threaded (SMT) processor

5 Copyright 2005, Data Mining Research Lab, The Ohio State University Roadmap Motivation and Contributions Background Performance characterization Cache-conscious optimizations Related work Conclusions

6 Copyright 2005, Data Mining Research Lab, The Ohio State University Frequent pattern mining (1)
Finds groups of items that co-occur frequently in a transactional data set
Example (minimum support = 2):
Customer 1: milk bread cereal
Customer 2: milk bread eggs sugar
Customer 3: milk bread butter
Customer 4: eggs sugar
Frequent patterns:
–Size 1: (milk), (bread), (eggs), (sugar)
–Size 2: (milk & bread), (eggs & sugar)
–Size 3: none
Search space traversal: breadth-first or depth-first
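
To make the definition concrete, here is a brute-force sketch over the toy transactions above (this is not how the algorithms studied here work; exhaustive enumeration is only feasible for toy data). It prints every itemset whose support reaches the minimum support of 2.

```cpp
// Brute-force frequent-itemset check for the toy example: count every
// non-empty subset of every transaction, then keep those whose support
// (number of containing transactions) meets the minimum support.
#include <iostream>
#include <map>
#include <set>
#include <string>
#include <vector>

int main() {
    std::vector<std::set<std::string>> transactions = {
        {"milk", "bread", "cereal"},
        {"milk", "bread", "eggs", "sugar"},
        {"milk", "bread", "butter"},
        {"eggs", "sugar"}};
    const int min_support = 2;

    // Count every non-empty subset of every transaction
    // (exhaustive enumeration is feasible only for toy data).
    std::map<std::set<std::string>, int> support;
    for (const auto& t : transactions) {
        std::vector<std::string> items(t.begin(), t.end());
        for (unsigned mask = 1; mask < (1u << items.size()); ++mask) {
            std::set<std::string> subset;
            for (std::size_t i = 0; i < items.size(); ++i)
                if (mask & (1u << i)) subset.insert(items[i]);
            ++support[subset];
        }
    }
    // Report the itemsets meeting the minimum support.
    for (const auto& entry : support)
        if (entry.second >= min_support) {
            for (const auto& item : entry.first) std::cout << item << ' ';
            std::cout << "(support " << entry.second << ")\n";
        }
}
```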

7 Copyright 2005, Data Mining Research Lab, The Ohio State University Frequent pattern mining (2) Algorithms under study –FP-Growth (based on the pattern-growth method) Winner of the 2003 Frequent Itemset Mining Implementations (FIMI) evaluation –Genmax (depth-first search space traversal) Maximal pattern miner –Apriori (breadth-first search space traversal) All algorithms use a prefix tree as a data set representation

8 Copyright 2005, Data Mining Research Lab, The Ohio State University Roadmap Motivation and Contributions Background Performance characterization Cache-conscious optimizations Related work Conclusions

9 Copyright 2005, Data Mining Research Lab, The Ohio State University Setup Intel Xeon processor –2 GHz with 4 GB of main memory –4-way 8 KB L1 data cache –8-way 512 KB L2 cache on chip –8-way 2 MB L3 cache Intel VTune Performance Analyzer used to collect performance data Implementations –FIMI repository FP-Growth (Gosta Grahne and Jianfei Zhu) Genmax (Gosta Grahne and Jianfei Zhu) Apriori (Christian Borgelt) –Custom memory managers

10 Copyright 2005, Data Mining Research Lab, The Ohio State University Execution time breakdown
FP-Growth: Count-FPGrowth() – 61%, Project-FPGrowth() – 31%, Other – 8%
Genmax:    Count-GM() – 91%, Other – 9%
Apriori:   Count-Apriori() – 70%, Candidate-Gen() – 25%, Other – 5%
Support counting in a prefix tree dominates: Count-GM() and Count-Apriori() are similar to Count-FPGrowth()

11 Copyright 2005, Data Mining Research Lab, The Ohio State University Operation mix
                                        Count-FPGrowth()  Count-GM()  Count-Apriori()
Integer ALU operations per instruction  0.65              0.64        0.34
Memory operations per instruction       0.72              0.69        0.66
Note: Each column need not sum to 1

12 Copyright 2005, Data Mining Research Lab, The Ohio State University Memory access behavior
             Count-FPGrowth()  Count-GM()  Count-Apriori()
L1 hit rate  89%               87%         86%
L2 hit rate  43%               42%         49%
L3 hit rate  39%               40%         27%
L3 misses per instruction are 0.03–0.04 and CPU utilization is 8–9% for these kernels

13 Copyright 2005, Data Mining Research Lab, The Ohio State University Roadmap Motivation and Contributions Background Performance characterization Cache-conscious optimizations Related work Conclusions

14 Copyright 2005, Data Mining Research Lab, The Ohio State University FP-tree (minimum support = 3)
[Figure: an FP-tree built up transaction by transaction; each node stores COUNT, ITEM, NODE POINTER, CHILD POINTERS, and PARENT POINTER]
Index based on largest common prefix
Typically results in a compressed data set representation
Node pointers allow for fast searching of items
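
For illustration, a minimal sketch of a conventional FP-tree node with the five fields named on the slide, plus an insert routine showing how transactions that share a prefix share nodes. Names and types are assumptions, not taken from the FIMI implementations used in the study.

```cpp
// Conventional FP-tree node: large, pointer-heavy, heap-allocated per node.
#include <cstdint>
#include <vector>

struct FPNode {
    uint32_t item = 0;              // item id (ITEM)
    uint32_t count = 0;             // support of the prefix ending here (COUNT)
    FPNode* parent = nullptr;       // PARENT POINTER, used for bottom-up walks
    FPNode* node_link = nullptr;    // NODE POINTER: next node with the same item
    std::vector<FPNode*> children;  // CHILD POINTERS, used during insertion
};

// Insert one transaction whose items are already sorted by descending
// frequency; the shared prefix reuses existing nodes, compressing the data set.
void insert(FPNode* root, const std::vector<uint32_t>& transaction) {
    FPNode* cur = root;
    for (uint32_t item : transaction) {
        FPNode* next = nullptr;
        for (FPNode* child : cur->children)
            if (child->item == item) { next = child; break; }
        if (next == nullptr) {
            next = new FPNode();
            next->item = item;
            next->parent = cur;
            cur->children.push_back(next);
            // A full implementation would also thread `next` onto the
            // header table's node-link chain for this item.
        }
        ++next->count;
        cur = next;
    }
}
```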

15 Copyright 2005, Data Mining Research Lab, The Ohio State University FP-Growth
Basic step:
–For each item in the FP-tree, build its conditional FP-tree
–Recursively mine the conditional FP-tree
[Figure: the FP-tree and the conditional FP-tree for item m (f:3, c:3, a:3); items p, f, c, a, b are processed similarly]
Dynamic data structure, and only two node fields are used during the bottom-up traversal: poor spatial locality
Large data structure: poor temporal locality
Pointer de-referencing: poor instruction level parallelism (ILP)
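
The bottleneck this slide points to can be made concrete with a sketch of the bottom-up walk used to build an item's conditional pattern base: it chases node-link and parent pointers, reading only a couple of fields per node. The layout repeats the FPNode sketch above; the helper name is hypothetical, not the authors' code.

```cpp
// Bottom-up walk that dominates the mining step: for one item, follow its
// node-link chain and climb parent pointers to collect the conditional
// pattern base from which the conditional FP-tree is built.
#include <cstdint>
#include <utility>
#include <vector>

struct FPNode {
    uint32_t item = 0;
    uint32_t count = 0;
    FPNode* parent = nullptr;
    FPNode* node_link = nullptr;
    std::vector<FPNode*> children;
};

// Returns (prefix path toward the root, path count) for every occurrence of
// the item. Only a couple of fields of each visited node are read, yet every
// pointer dereference may pull in a whole, scattered cache line: this is the
// poor spatial locality and pointer chasing called out on the slide.
std::vector<std::pair<std::vector<uint32_t>, uint32_t>>
conditional_pattern_base(FPNode* first_node_of_item) {
    std::vector<std::pair<std::vector<uint32_t>, uint32_t>> base;
    for (FPNode* n = first_node_of_item; n != nullptr; n = n->node_link) {
        std::vector<uint32_t> path;
        for (FPNode* p = n->parent; p != nullptr && p->parent != nullptr;
             p = p->parent)
            path.push_back(p->item);   // climb toward the root (root excluded)
        base.emplace_back(std::move(path), n->count);
    }
    return base;
}
```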

16 Copyright 2005, Data Mining Research Lab, The Ohio State University Cache-conscious prefix tree
Improves cache line utilization: smaller node size + DFS allocation
Allows the use of hardware cache line pre-fetching for bottom-up traversals
[Figure: the tree re-laid out with DFS allocation; nodes are reduced essentially to the ITEM field, with node counts moved into separate count lists and node pointers into per-item header lists in place of the per-node COUNT, CHILD POINTERS, NODE POINTER, and PARENT POINTER fields]
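
A hedged sketch of this layout idea, assuming nodes keep only an item id and a parent index while counts and per-item node lists live in separate arrays; the exact fields and all names are illustrative, not the paper's definitive data structure.

```cpp
// Cache-conscious layout: small fixed-size nodes allocated contiguously in
// DFS order, with counts and per-item node lists pulled out into side arrays.
#include <cstdint>
#include <vector>

struct CCNode {
    uint32_t item;    // item id
    uint32_t parent;  // index of the parent node in the DFS-ordered array
};

struct CCPrefixTree {
    std::vector<CCNode> nodes;     // DFS allocation: a bottom-up walk touches
                                   // nearby, mostly decreasing addresses, so
                                   // cache lines and hardware prefetching help
    std::vector<uint32_t> counts;  // count list: counts[i] belongs to nodes[i]
    std::vector<std::vector<uint32_t>> header_lists;  // header_lists[item] =
                                   // indices of all nodes carrying that item
};

// Bottom-up traversal becomes index arithmetic over a contiguous array.
void climb_to_root(const CCPrefixTree& tree, uint32_t node,
                   std::vector<uint32_t>& path_out) {
    for (uint32_t n = node; n != 0; n = tree.nodes[n].parent)  // index 0 = root
        path_out.push_back(tree.nodes[n].item);
}
```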

17 Copyright 2005, Data Mining Research Lab, The Ohio State University Path tiling to improve temporal locality
DFS order enables breaking the tree into tiles based on node addresses
–These tiles can partially overlap
Maximize tree reuse
–All tree accesses are restructured to operate on a tile-by-tile basis
[Figure: the DFS-ordered tree partitioned into Tile 1 through Tile N]
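
An illustrative sketch of path tiling under the DFS layout above: a tile is a contiguous index range, and each pending bottom-up walk is advanced only while it stays inside the current, cache-resident tile. For simplicity the tiles here are disjoint, whereas the slide notes they may partially overlap; all structures and names are assumptions.

```cpp
// Path tiling sketch: advance every in-flight bottom-up walk as far as the
// current tile allows, then move to the next tile, so each tile is brought
// into cache once and reused by many traversals.
#include <algorithm>
#include <cstdint>
#include <vector>

struct CCNode { uint32_t item; uint32_t parent; };

struct Walk {                    // one in-flight bottom-up traversal
    uint32_t node;               // next node to visit (0 means finished at root)
    std::vector<uint32_t> path;  // items collected so far
};

void tiled_bottom_up(const std::vector<CCNode>& nodes,
                     std::vector<Walk>& walks, uint32_t tile_size /* > 0 */) {
    // Parents have lower DFS indices than their descendants, so tiles are
    // processed from the highest addresses down toward the root.
    uint32_t tile_end = static_cast<uint32_t>(nodes.size());
    while (tile_end > 0) {
        uint32_t tile_begin = tile_end - std::min(tile_size, tile_end);
        for (Walk& w : walks) {
            // Advance this walk only while its next node lies in the tile.
            while (w.node != 0 && w.node >= tile_begin && w.node < tile_end) {
                w.path.push_back(nodes[w.node].item);
                w.node = nodes[w.node].parent;
            }
        }
        tile_end = tile_begin;
    }
}
```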

18 Copyright 2005, Data Mining Research Lab, The Ohio State University Improving ILP using SMT
Simultaneous multi-threading (SMT)
–Intel hyper-threading (HT) maintains two hardware contexts on chip
–Improves instruction level parallelism (ILP): when one thread waits, the other thread can use CPU resources
Identifying independent threads is not good enough
–Unlikely to hide long cache miss latency
–Can lead to cache interference (conflicts)
Solution: restructure the multi-threaded computation so both threads reuse the cache on a tile-by-tile basis (same data, but different computation)
[Figure: two threads co-scheduled on the same tile of the DFS-ordered tree]
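
A hedged sketch of the co-scheduling idea, assuming the tiled layout above: two software threads, intended for the two hyper-threading contexts, process the same tile at the same time but on disjoint sets of pending traversals, so they share the cache rather than thrash it. Thread pinning and cross-tile synchronization are omitted; names are illustrative.

```cpp
// Co-scheduling sketch: same data (the tile), different computation (disjoint
// walk sets). When one hardware context stalls on a miss, the other can keep
// issuing instructions against lines already brought into the shared cache.
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

struct CCNode { uint32_t item; uint32_t parent; };
struct Walk   { uint32_t node; std::vector<uint32_t> path; };

void advance_in_tile(const std::vector<CCNode>& nodes, std::vector<Walk>& walks,
                     uint32_t tile_begin, uint32_t tile_end) {
    for (Walk& w : walks)
        while (w.node != 0 && w.node >= tile_begin && w.node < tile_end) {
            w.path.push_back(nodes[w.node].item);
            w.node = nodes[w.node].parent;
        }
}

void co_scheduled_tile(const std::vector<CCNode>& nodes,
                       std::vector<Walk>& walks_a, std::vector<Walk>& walks_b,
                       uint32_t tile_begin, uint32_t tile_end) {
    // Both threads of one HT core work on the same cache-resident tile.
    std::thread t1(advance_in_tile, std::cref(nodes), std::ref(walks_a),
                   tile_begin, tile_end);
    std::thread t2(advance_in_tile, std::cref(nodes), std::ref(walks_b),
                   tile_begin, tile_end);
    t1.join();
    t2.join();
}
```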

19 Copyright 2005, Data Mining Research Lab, The Ohio State University Speedup for FP-Growth (synthetic data set)
[Figure: speedup chart; axis values 4000, 4500, 5000, 5500]

20 Copyright 2005, Data Mining Research Lab, The Ohio State University Speedup for FP-Growth (real data set)
[Figure: speedup chart; axis values 50000, 58350, 66650, 75000]
For FP-Growth, the L1 hit rate improves from 89% to 94% and the L2 hit rate from 43% to 98%
Speedup: Genmax – up to 4.5x, Apriori – up to 3.5x

21 Copyright 2005, Data Mining Research Lab, The Ohio State University Roadmap Motivation and Contributions Background Performance characterization Cache-conscious optimizations Related work Conclusions

22 Copyright 2005, Data Mining Research Lab, The Ohio State University Related work (1) Data mining algorithms –Characterizations Self-organizing maps –Kim et al. [WWC99] C4.5 –Bradford and Fortes [WWC98] Sequence mining, graph mining, outlier detection, clustering, and decision tree induction –Ghoting et al. [DAMON05] –Memory placement techniques for association rule mining Considered the effects of memory pooling and coarse-grained spatial locality on association rule mining algorithms in a serial and parallel setting –Parthasarathy et al. [SIGKDD98,KAIS01]

23 Copyright 2005, Data Mining Research Lab, The Ohio State University Related work (2) Database algorithms –DBMS on modern hardware Ailamaki et al. [VLDB99,VLDB2001] –Cache-sensitive search trees and B+-trees Rao and Ross [VLDB99,SIGMOD00] –Prefetching for B+-trees and Hash-Join Chen et al. [SIGMOD00,ICDE04]

24 Copyright 2005, Data Mining Research Lab, The Ohio State University Ongoing and future work Algorithm re-design for next generation architectures –e.g. graph mining on multi-core architectures Cache-conscious optimizations for other data mining and bioinformatics applications on modern architectures –e.g. classification algorithms, graph mining Out-of-core algorithm designs Microprocessor design targeted at data mining algorithms

25 Copyright 2005, Data Mining Research Lab, The Ohio State University Conclusions Characterized the performance of three popular frequent pattern mining algorithms Proposed a tile-able cache-conscious prefix tree –Improves spatial locality and allows for cache line pre-fetching –Path tiling improves temporal locality Proposed a novel thread-based decomposition for improving ILP by utilizing SMT –Overall, up to 4.8-fold speedup Effective algorithm design in data mining needs to take into account modern architectural designs.

26 Copyright 2005, Data Mining Research Lab, The Ohio State University Thanks We would like to acknowledge the following grants –NSF: CAREER-IIS-0347662 –NSF: NGS-CNS-0406386 –NSF: RI-CNS-0403342 –DOE: ECPI-DE-FG02-04ER25611

