Sequential Pattern Mining Using A Bitmap Representation


1 Sequential Pattern Mining Using A Bitmap Representation
Authors: Jay Ayres, Johannes Gehrke, Tomi Yiu, and Jason Flannick
Source: The Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Edmonton, Alberta, Canada, 2002.

2 Outline
Introduction
SPAM (Sequential PAttern Mining) algorithm
Lexicographic tree for sequences
Depth-first tree traversal
Pruning
S-step
I-step
Data representation: bitmap

4 S = ({a}, {b, c}) is a sequence
The support of S in database D, Sup_D(S), is the number of customer sequences in D that contain S.
Frequent sequential pattern: Sup_D(S) >= minimum support
Example: Sup_D(S) = Sup_D(({a}, {b, c})) = 2
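The support definition above can be sketched as a simple subsequence test. This is only an illustration: the database `D` below is an assumed toy example (the transcript does not include the slide's data), and the helper names `contains` and `support` are hypothetical.

```python
def contains(customer_sequence, pattern):
    """Return True if `pattern` is contained in `customer_sequence`.

    Both arguments are lists of sets (itemsets in time order). Each itemset
    of the pattern must be a subset of a distinct, later transaction.
    """
    pos = 0
    for itemset in pattern:
        # Greedily skip transactions that do not contain this itemset.
        while pos < len(customer_sequence) and not itemset <= customer_sequence[pos]:
            pos += 1
        if pos == len(customer_sequence):
            return False
        pos += 1  # the next itemset must match a strictly later transaction
    return True


def support(database, pattern):
    """Count the customers whose sequence contains `pattern`."""
    return sum(1 for seq in database.values() if contains(seq, pattern))


# Assumed toy database: customer id -> ordered list of transactions.
D = {
    1: [{"a", "b", "d"}, {"b", "c", "d"}, {"b", "c", "d"}],
    2: [{"b"}, {"a", "b", "c"}],
    3: [{"a", "b"}, {"b", "c", "d"}],
}
S = [{"a"}, {"b", "c"}]
print(support(D, S))  # 2
```

With a minimum support of 2, S would count as a frequent sequential pattern in this assumed database.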

5 SPAM (Sequential pattern mining)
S = ({a, b, c}, {a, b})
Sequence length: Length(S) = 5 (number of items)
Sequence size: Size(S) = 2 (number of itemsets)
Sequence-extended sequence: S' = ({a, b, c}, {a, b}, {a})
Itemset-extended sequence: S' = ({a, b, c}, {a, b, d})
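The definitions on this slide can be checked with a few lines of code. This is a minimal sketch that represents a sequence as a list of sets; the function names are hypothetical and chosen only for illustration.

```python
def length(seq):
    """Length of a sequence: total number of items over all itemsets."""
    return sum(len(itemset) for itemset in seq)


def size(seq):
    """Size of a sequence: number of itemsets."""
    return len(seq)


def s_extend(seq, item):
    """Sequence-extension: append a new itemset holding only `item`."""
    return seq + [{item}]


def i_extend(seq, item):
    """Itemset-extension: add `item` to the last itemset."""
    return seq[:-1] + [seq[-1] | {item}]


S = [{"a", "b", "c"}, {"a", "b"}]
print(length(S), size(S))  # 5 2
print(s_extend(S, "a") == [{"a", "b", "c"}, {"a", "b"}, {"a"}])  # True
print(i_extend(S, "d") == [{"a", "b", "c"}, {"a", "b", "d"}])    # True
```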

6 SPAM (Sequential pattern mining)
[Figure: lexicographic tree for Max Size = 3, Items = {a, b}, Levels 1 to 6; legend: sequence-extended, itemset-extended]

7 SPAM (Sequential pattern mining)
[Figure: lexicographic tree for Max Size = 3, Items = {a, b}, Levels 1 to 6]
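A depth-first walk over such a tree can be sketched as follows. This is an illustration under assumptions: sequences are tuples of item tuples, sequence-extended children are visited before itemset-extended ones (the order in the slide's figure is not recoverable from the transcript), and `enumerate_tree` is a hypothetical name.

```python
def enumerate_tree(items, max_size):
    """Yield every sequence of the lexicographic tree in depth-first order.

    `items` must be sorted; `max_size` bounds the number of itemsets.
    """
    def visit(seq):
        yield seq
        # Sequence-extended children: append a new single-item itemset.
        if len(seq) < max_size:
            for item in items:
                yield from visit(seq + ((item,),))
        # Itemset-extended children: add an item larger than the last item
        # of the last itemset, so each sequence is generated exactly once.
        for item in items:
            if item > seq[-1][-1]:
                yield from visit(seq[:-1] + (seq[-1] + (item,),))

    for item in items:
        yield from visit(((item,),))


nodes = list(enumerate_tree(("a", "b"), 3))
print(len(nodes))                               # 39
print(max(sum(len(s) for s in n) for n in nodes))  # 6
```

For Items = {a, b} and Max Size = 3, the deepest nodes have length 6, which matches the six levels named on the slide.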

8 SPAM (Sequential pattern mining)
Pruning Items = {a, b, c, d}
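One way to picture the pruning is an Apriori-style rule: an item whose extension is infrequent at a node is not tried again in that node's subtree. The sketch below is a simplified illustration, not the authors' implementation. It counts support naively instead of using bitmaps, and the database `D` and all function names are assumptions.

```python
def contains(customer_sequence, pattern):
    """True if each itemset of `pattern` fits into a distinct later transaction."""
    pos = 0
    for itemset in pattern:
        while pos < len(customer_sequence) and not itemset <= customer_sequence[pos]:
            pos += 1
        if pos == len(customer_sequence):
            return False
        pos += 1
    return True


def support(db, pattern):
    return sum(1 for seq in db.values() if contains(seq, pattern))


def dfs(db, node, s_items, i_items, min_sup, out):
    """Visit `node`, then recurse only into frequent extensions."""
    out.append(node)
    # S-step: items whose sequence-extension is infrequent are pruned here
    # and are not passed down to the children.
    s_temp = [i for i in s_items if support(db, node + [{i}]) >= min_sup]
    for i in s_temp:
        dfs(db, node + [{i}], s_temp, [j for j in s_temp if j > i], min_sup, out)
    # I-step: same idea for itemset-extensions of the last itemset.
    i_temp = [i for i in i_items
              if support(db, node[:-1] + [node[-1] | {i}]) >= min_sup]
    for i in i_temp:
        dfs(db, node[:-1] + [node[-1] | {i}], s_temp,
            [j for j in i_temp if j > i], min_sup, out)


def mine(db, min_sup):
    """Return all frequent sequences found by the pruned depth-first search."""
    items = sorted(set().union(*(t for seq in db.values() for t in seq)))
    frequent = [i for i in items if support(db, [{i}]) >= min_sup]
    out = []
    for i in frequent:
        dfs(db, [{i}], frequent, [j for j in frequent if j > i], min_sup, out)
    return out


# Assumed toy database over Items = {a, b, c, d}.
D = {
    1: [{"a", "b", "d"}, {"b", "c", "d"}, {"b", "c", "d"}],
    2: [{"b"}, {"a", "b", "c"}],
    3: [{"a", "b"}, {"b", "c", "d"}],
}
patterns = mine(D, min_sup=2)
print([{"a"}, {"b", "c"}] in patterns)  # True
```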

9 Data Representation – BitMap
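2^k + 1 <= 3 <= 2^(k+1)
A sequence of size 3 falls in this range for k = 1, so it is stored as a 2^(k+1) = 4-bit sequence.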
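A vertical bitmap layout of this kind can be sketched as below: one bitmap per item, one section per customer, one bit per transaction, with each section padded to a power of two (at least 4 bits). The database `D` and the names `section_width` and `build_bitmaps` are assumptions for illustration. Bits are kept as Python lists for readability, not packed into machine words.

```python
def section_width(n_transactions):
    """Smallest power of two >= n_transactions, with a minimum of 4 bits."""
    width = 4
    while width < n_transactions:
        width *= 2
    return width


def build_bitmaps(db):
    """Return item -> list of per-customer bit sections (lists of 0/1)."""
    items = sorted(set().union(*(t for seq in db.values() for t in seq)))
    bitmaps = {item: [] for item in items}
    for cid in sorted(db):
        seq = db[cid]
        width = section_width(len(seq))
        for item in items:
            # Bit j is 1 if the item occurs in the customer's j-th transaction.
            bits = [1 if item in itemset else 0 for itemset in seq]
            bitmaps[item].append(bits + [0] * (width - len(bits)))
    return bitmaps


# Assumed toy database: customer id -> ordered list of transactions.
D = {
    1: [{"a", "b", "d"}, {"b", "c", "d"}, {"b", "c", "d"}],
    2: [{"b"}, {"a", "b", "c"}],
    3: [{"a", "b"}, {"b", "c", "d"}],
}
bitmaps = build_bitmaps(D)
print(bitmaps["a"])  # [[1, 0, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
print(bitmaps["b"])  # [[1, 1, 1, 0], [1, 1, 0, 0], [1, 1, 0, 0]]
```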

10 S-type
S = ({a})
S' = ({a}, {b})
S' = ({a}, {c})
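On bitmaps, an S-type extension can be sketched in two moves: transform the bitmap of S so that, in each customer section, every bit after the first 1 is set and all earlier bits are cleared, then AND the result with the item's bitmap. The bitmaps below are assumed example data and the function names are hypothetical.

```python
def s_transform(section):
    """Clear bits up to and including the first 1; set all later bits."""
    if 1 not in section:
        return [0] * len(section)
    k = section.index(1)
    return [0] * (k + 1) + [1] * (len(section) - k - 1)


def s_step(seq_bitmap, item_bitmap):
    """Bitmap of the sequence-extended sequence, section by section."""
    return [[x & y for x, y in zip(s_transform(s), i)]
            for s, i in zip(seq_bitmap, item_bitmap)]


def support(bitmap):
    """Number of customer sections containing at least one 1 bit."""
    return sum(1 for section in bitmap if any(section))


# Assumed bitmaps for three customers, 4 bits per section.
a = [[1, 0, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
b = [[1, 1, 1, 0], [1, 1, 0, 0], [1, 1, 0, 0]]

ab = s_step(a, b)       # bitmap of ({a}, {b})
print(ab)               # [[0, 1, 1, 0], [0, 0, 0, 0], [0, 1, 0, 0]]
print(support(ab))      # 2
```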

11 I-type
S = ({a})
S' = ({a, b})
S' = ({a, c})
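An I-type extension needs no transformation: the item must occur in the same transaction, so a plain bitwise AND of the two bitmaps is enough. The bitmaps below are assumed example data and the function names are hypothetical.

```python
def i_step(seq_bitmap, item_bitmap):
    """Bitmap of the itemset-extended sequence: AND each customer section."""
    return [[x & y for x, y in zip(s, i)]
            for s, i in zip(seq_bitmap, item_bitmap)]


def support(bitmap):
    """Number of customer sections containing at least one 1 bit."""
    return sum(1 for section in bitmap if any(section))


# Assumed bitmaps for three customers, 4 bits per section.
a = [[1, 0, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
b = [[1, 1, 1, 0], [1, 1, 0, 0], [1, 1, 0, 0]]
c = [[0, 1, 1, 0], [0, 1, 0, 0], [0, 1, 0, 0]]

print(support(i_step(a, b)))  # 3, support of ({a, b})
print(support(i_step(a, c)))  # 1, support of ({a, c})
```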

12 Experiments and results
Dataset parameters: D3 C2.5 T3
Algorithms: SPAM, SPADE, PrefixSpan

13 Small database and medium database
[Figures: SPADE, SPAM, and PrefixSpan on the small and medium databases]

14 Large database

16 Conclusions
SPAM: depth-first (DFS) traversal of the search space, with S-type and I-type extensions
Efficient on large databases but inefficient on small databases
Space-inefficient in comparison to SPADE

