1 The Strategies for Mining Fault-Tolerant Patterns Jia-Ling Koh Department of Information and Computer Education National Taiwan Normal University.

Presentation transcript:

1 The Strategies for Mining Fault-Tolerant Patterns Jia-Ling Koh Department of Information and Computer Education National Taiwan Normal University

2 Two Topics Introduced in This Talk: (1) the strategies for mining fault-tolerant frequent itemsets (patterns) from a transaction database; (2) the strategies for mining fault-tolerant repeating patterns from a data sequence.

3 An Efficient Approach for Mining Fault-Tolerant Frequent Patterns based on Bit Vector Representations Jia-Ling Koh and Pei-Wy Yo DASFAA 2005

4 Motivation; Related works; Problem Definition; Appearing Bit Vectors; VB_FT_Mine algorithm (Vector-Based Fault Tolerant frequent patterns Mining); Experiments; Conclusion and future works

5 Sample database:
TID  Items
T1   B D E F
T2   A C D E
T3   B E F G
T4   C F G
T5   A B D E G
With min-sup = 4, the only frequent pattern is E. With min-sup = 3, the frequent patterns are B, D, E, F, G, BE and DE.
At the expected minimum support, few frequent patterns are discovered; at a low minimum support, the patterns returned carry no general information and are not representative.

6 Fault tolerance asks whether a transaction contains a pattern approximately, e.g. whether it contains 4 out of the 5 items {B, D, E, F, G}. This yields a longer "approximate" pattern (BDEFG) with support count 3 (T1, T3 and T5 each contain 4 of the 5 items). Sample database: T1 = B D E F, T2 = A C D E, T3 = B E F G, T4 = C F G, T5 = A B D E G.

7 FT-Apriori algorithm (ACM-SIGMOD, 2001): follows the Apriori approach and applies the "downward closure" property, but suffers from generating a large number of candidates and from repeatedly scanning the database.

8 When the fault tolerance is set to 1, a transaction FT-contains BDE if it contains any (|BDE| - 1) items of BDE, that is, BD, BE or DE.
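The containment test on this slide can be sketched in a few lines of Python. The function name `ft_contains` and the encoding of transactions as strings of single-letter items are assumptions made for illustration; the data is the sample database from the slides.

```python
# Illustrative sketch, not the authors' code: a transaction FT-contains a
# pattern when it misses at most `delta` of the pattern's items.
SAMPLE_DB = {
    "T1": "BDEF",
    "T2": "ACDE",
    "T3": "BEFG",
    "T4": "CFG",
    "T5": "ABDEG",
}

def ft_contains(transaction, pattern, delta):
    """Return True if at most `delta` items of `pattern` are absent."""
    missing = set(pattern) - set(transaction)
    return len(missing) <= delta

for tid, items in SAMPLE_DB.items():
    print(tid, ft_contains(items, "BDE", delta=1))
```

With δ = 1 every transaction except T4 passes the test for BDE.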

9 δ (fault tolerance) = 1. Itemset P = {B, D, E}: FT-body_1(P) = {T1, T2, T3, T5}, FT-sup_1(P) = 4. Item supports, counted over the transactions in FT-body_1(P): Item-Sup(B) = 3, Item-Sup(D) = 3, Item-Sup(E) = 4. Sample database: T1 = B D E F, T2 = A C D E, T3 = B E F G, T4 = C F G, T5 = A B D E G.

10 A fault-tolerant frequent pattern P satisfies: (1) FT-sup_δ(P) ≥ min-sup_FT; (2) ∀ p ∈ P, Item-Sup(p) ≥ min-sup_item.

11 δ = 1, min-sup_FT = 4, min-sup_item = 3. Itemset P = {B, D, E}: FT-sup_1(P) = 4; Item-Sup(B) = 3, Item-Sup(D) = 3, Item-Sup(E) = 4. BDE is therefore an FT-frequent pattern. Matched items per transaction: T1 (B D E F) matches B D E; T2 (A C D E) matches D E; T3 (B E F G) matches B E; T4 (C F G) does not FT-contain BDE; T5 (A B D E G) matches B D E.
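The two conditions of the definition can be checked directly against the sample database. The following is a minimal sketch; the helper names (`ft_body`, `ft_sup`, `item_sup`, `is_ft_frequent`) are hypothetical, and it scans the transactions rather than using the bit vectors introduced on the later slides.

```python
# Illustrative sketch, not the authors' implementation.
SAMPLE_DB = {
    "T1": "BDEF",
    "T2": "ACDE",
    "T3": "BEFG",
    "T4": "CFG",
    "T5": "ABDEG",
}

def ft_body(db, pattern, delta):
    """Transactions that miss at most `delta` items of the pattern."""
    return [tid for tid, items in db.items()
            if len(set(pattern) - set(items)) <= delta]

def ft_sup(db, pattern, delta):
    """FT-support: size of the FT-body."""
    return len(ft_body(db, pattern, delta))

def item_sup(db, pattern, delta, item):
    """Item support: FT-body transactions that actually contain the item."""
    return sum(1 for tid in ft_body(db, pattern, delta) if item in db[tid])

def is_ft_frequent(db, pattern, delta, min_sup_ft, min_sup_item):
    """Check both conditions of the FT-frequent pattern definition."""
    if ft_sup(db, pattern, delta) < min_sup_ft:
        return False
    return all(item_sup(db, pattern, delta, p) >= min_sup_item for p in pattern)

print(ft_body(SAMPLE_DB, "BDE", 1))
print(is_ft_frequent(SAMPLE_DB, "BDE", 1, min_sup_ft=4, min_sup_item=3))
```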

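12 Appearing vector table for the sample database (T1 = B D E F, T2 = A C D E, T3 = B E F G, T4 = C F G, T5 = A B D E G); bit i, read from the left, corresponds to transaction Ti:
Item  Appearing vector (Appear_item)
A     01001
B     10101
C     01010
D     11001
E     11101
F     10110
G     00111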

13 Appearing vector of an item A: bit i is 1 if transaction Ti contains A. The support count of an item is obtained by counting the number of bits set to 1.

14 Appearing vector: Appear_A = 01001, I_5 = 11111, Vector(Appear_A) ․ I_5 = 2.

15 Appear_A = 01001, Appear_D = 11001, Appear_AD = Appear_A ∧ Appear_D = 01001. In the sample database, T2 (A C D E) and T5 (A B D E G) contain AD.
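A small sketch of the appearing vectors, assuming Python integers as bit vectors with the bit for T1 as the most significant bit (an assumed ordering; the helper names are hypothetical). The AND of two appearing vectors gives the transactions containing both items, and counting 1-bits gives the support.

```python
# Illustrative sketch, not the authors' code.
SAMPLE_DB = {
    "T1": "BDEF",
    "T2": "ACDE",
    "T3": "BEFG",
    "T4": "CFG",
    "T5": "ABDEG",
}
TIDS = ["T1", "T2", "T3", "T4", "T5"]

def appearing_vector(item):
    """Integer whose bits mark the transactions that contain `item`."""
    bits = "".join("1" if item in SAMPLE_DB[t] else "0" for t in TIDS)
    return int(bits, 2)

def as_bits(vector):
    """Render a vector as a fixed-width bit string (T1 leftmost)."""
    return format(vector, f"0{len(TIDS)}b")

def support(vector):
    """Support count: number of 1-bits, i.e. the dot product with I_5."""
    return bin(vector).count("1")

appear_a = appearing_vector("A")
appear_d = appearing_vector("D")
appear_ad = appear_a & appear_d  # bitwise AND of the two vectors

print(as_bits(appear_a), as_bits(appear_d), as_bits(appear_ad), support(appear_a))
```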

16 Pattern P = AD, δ = 1: T1, T2 and T5 FT-contain AD, so FT-Appear_AD(1) = 11001. T1 (B D E F) contains only D; T2 (A C D E) and T5 (A B D E G) contain AD.

17 FT-Appear_AD(1) = 11001. FT-sup_1(AD) = 11001 ․ 11111 = 3; Item-Sup(A) = 11001 ․ 01001 = 2; Item-Sup(D) = 11001 ․ 11001 = 3.

18 Itemset AB: FT-Appear_AB(1) = Appear_A ∨ Appear_B. Itemset ABC: FT-Appear_ABC(1) = Appear_AB ∨ Appear_BC ∨ Appear_AC; FT-Appear_ABC(2) = Appear_A ∨ Appear_B ∨ Appear_C. Computing FT-Appear_P(δ) this way requires C(|P|, |P| - δ) - 1 OR operations.
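The direct computation on this slide, one AND-term per sub-pattern of size |P| - δ, can be sketched as follows. The appearing vectors are hard-coded from the sample database with T1 as the leftmost bit (an assumed ordering), and `ft_appear_by_subsets` is a hypothetical name.

```python
# Illustrative sketch of the subset-enumeration approach, not the authors' code.
from functools import reduce
from itertools import combinations

APPEAR = {
    "A": 0b01001, "B": 0b10101, "C": 0b01010, "D": 0b11001,
    "E": 0b11101, "F": 0b10110, "G": 0b00111,
}
ALL_ONES = 0b11111  # I_5: every transaction

def ft_appear_by_subsets(pattern, delta):
    """OR together the appearing vectors of all (|P| - delta)-item sub-patterns."""
    keep = max(len(pattern) - delta, 0)
    result = 0
    for subset in combinations(pattern, keep):
        # AND of the item vectors; the empty sub-pattern matches everything.
        result |= reduce(lambda acc, item: acc & APPEAR[item], subset, ALL_ONES)
    return result

print(format(ft_appear_by_subsets("AD", 1), "05b"))
print(format(ft_appear_by_subsets("BDE", 1), "05b"))
```

The number of sub-patterns grows combinatorially with |P|, which is the cost the recursive formulation on the following slides avoids.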

19 【 Theorem 】 Let P´ = P ∪ {x}. If transaction T FT-contains P´ with fault tolerance δ, then either T FT-contains P with fault tolerance δ-1, or T FT-contains P with fault tolerance δ and contains x.

20 Example: P = ABD, P´ = P ∪ {G} = ABDG; T FT-contains P´ with fault tolerance 2. Case 1: T1 = BDEF FT-contains ABD with fault tolerance 1. Case 2: T3 = BEFG FT-contains ABD with fault tolerance 2 and contains G. Sample database: T1 = B D E F, T2 = A C D E, T3 = B E F G, T4 = C F G, T5 = A B D E G.

21 If δ = 0, FT-Appear_P´(δ) = Appear_P´.
If |P´| ≤ δ, FT-Appear_P´(δ) = I_|DB|.
Otherwise, FT-Appear_P´(δ) = FT-Appear_P(δ-1) ∨ (FT-Appear_P(δ) ∧ Appear_x).

22 δ = 1
Itemset A: FT-Appear_A(1) = I_|DB|; FT-Appear_A(0) = Appear_A.
Itemset AB: FT-Appear_AB(1) = FT-Appear_A(0) ∨ (FT-Appear_A(1) ∧ Appear_B) = Appear_A ∨ (I_|DB| ∧ Appear_B) = Appear_A ∨ Appear_B; FT-Appear_AB(0) = Appear_AB = FT-Appear_A(0) ∧ Appear_B = Appear_A ∧ Appear_B.

23 δ = 2
Itemset AB: FT-Appear_AB(2) = I_|DB|; FT-Appear_AB(1) = FT-Appear_A(0) ∨ (FT-Appear_A(1) ∧ Appear_B); FT-Appear_AB(0) = Appear_AB.
Itemset ABC: FT-Appear_ABC(2) = FT-Appear_AB(1) ∨ (FT-Appear_AB(2) ∧ Appear_C); FT-Appear_ABC(1) = FT-Appear_AB(0) ∨ (FT-Appear_AB(1) ∧ Appear_C); FT-Appear_ABC(0) = Appear_ABC = FT-Appear_AB(0) ∧ Appear_C.
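The three-case recurrence for FT-Appear can be written as a short recursive function. This is a sketch under the assumption that appearing vectors are Python integers with T1 as the leftmost bit; the function name `ft_appear` and the string encoding of itemsets are illustrative choices.

```python
# Illustrative sketch of the recurrence, not the authors' implementation.
APPEAR = {
    "A": 0b01001, "B": 0b10101, "C": 0b01010, "D": 0b11001,
    "E": 0b11101, "F": 0b10110, "G": 0b00111,
}
ALL_ONES = 0b11111  # I_|DB| for the five-transaction sample database

def ft_appear(pattern, delta):
    """FT-appearing vector of `pattern` under fault tolerance `delta`."""
    if len(pattern) <= delta:
        return ALL_ONES  # every transaction FT-contains the pattern
    if delta == 0:
        vector = ALL_ONES
        for item in pattern:
            vector &= APPEAR[item]  # exact containment: AND of item vectors
        return vector
    prefix, x = pattern[:-1], pattern[-1]
    # Either the prefix already fits with delta - 1 faults,
    # or it fits with delta faults and the transaction contains x.
    return ft_appear(prefix, delta - 1) | (ft_appear(prefix, delta) & APPEAR[x])

for p in ("AD", "BD", "BDE", "BDEF"):
    print(p, format(ft_appear(p, 1), "05b"))
```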

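24 Construct the appearing vector table from the sample database (T1 = B D E F, T2 = A C D E, T3 = B E F G, T4 = C F G, T5 = A B D E G); bit i, read from the left, corresponds to transaction Ti:
Item  Appear_item
A     01001
B     10101
C     01010
D     11001
E     11101
F     10110
G     00111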

25 Check the item supports against min-sup_item, with min-sup_item = 3, min-sup_FT = 4 and δ = 1. Item supports: A: 2, B: 3, C: 2, D: 3, E: 4, F: 3, G: 3. Candidate items for constructing frequent FT-patterns: B, D, E, F, G.
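The candidate-item filter on this slide amounts to a support count followed by a threshold test. A minimal sketch, with `candidate_items` as a hypothetical helper name:

```python
# Illustrative sketch, not the authors' code.
from collections import Counter

SAMPLE_DB = {
    "T1": "BDEF",
    "T2": "ACDE",
    "T3": "BEFG",
    "T4": "CFG",
    "T5": "ABDEG",
}

def candidate_items(db, min_sup_item):
    """Items whose plain support reaches min_sup_item, in alphabetical order."""
    counts = Counter(item for items in db.values() for item in items)
    return sorted(item for item, count in counts.items() if count >= min_sup_item)

print(candidate_items(SAMPLE_DB, min_sup_item=3))
```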


27 FT-appearing vector: FT-Appear_BD(1) = FT-Appear_B(0) ∨ (FT-Appear_B(1) ∧ Appear_D) = 10101 ∨ (11111 ∧ 11001) = 11101; FT-Appear_BD(0) = FT-Appear_B(0) ∧ Appear_D = 10101 ∧ 11001 = 10001.
FT-support: FT-sup_1(BD) = Vector(FT-Appear_BD(1)) ․ I_5 = 11101 ․ 11111 = 4.
Item-support: Item-Sup(B) = Vector(FT-Appear_BD(1)) ․ Vector(Appear_B) = 11101 ․ 10101 = 3; Item-Sup(D) = Vector(FT-Appear_BD(1)) ․ Vector(Appear_D) = 11101 ․ 11001 = 3.


29 FT-appearing vector: FT-Appear_BDE(1) = FT-Appear_BD(0) ∨ (FT-Appear_BD(1) ∧ Appear_E) = 10001 ∨ (11101 ∧ 11101) = 11101; FT-Appear_BDE(0) = FT-Appear_BD(0) ∧ Appear_E = 10001 ∧ 11101 = 10001.
FT-support: FT-sup_1(BDE) = Vector(FT-Appear_BDE(1)) ․ I_5 = 11101 ․ 11111 = 4.
Item-support: Item-Sup(B) = 11101 ․ 10101 = 3; Item-Sup(D) = 11101 ․ 11001 = 3; Item-Sup(E) = 11101 ․ 11101 = 4.


31 FT-appearing vector: FT-Appear_BDEF(1) = 10001 ∨ (11101 ∧ 10110) = 10101; FT-Appear_BDEF(0) = 10001 ∧ 10110 = 10000.
FT-support: FT-sup_1(BDEF) = Vector(FT-Appear_BDEF(1)) ․ I_5 = 10101 ․ 11111 = 3, so BDEF is not an FT-frequent pattern.


33 FT-appearing vector: FT-Appear_BDEG(1) = 10101; FT-Appear_BDEG(0) = 00001.
FT-support: FT-sup_1(BDEG) = Vector(FT-Appear_BDEG(1)) ․ I_5 = 3, so BDEG is not an FT-frequent pattern.
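The worked steps B → BD → BDE → BDEF / BDEG reuse the parent's vectors at every extension. The sketch below mirrors that incrementally: it keeps one vector per tolerance level 0..δ and extends all of them by one item. The names `extend` and `walk`, and the idea of starting from the empty pattern, are assumptions for illustration; the slides do not show the algorithm's actual control flow.

```python
# Illustrative sketch of incremental extension, not the VB_FT_Mine algorithm itself.
APPEAR = {
    "A": 0b01001, "B": 0b10101, "C": 0b01010, "D": 0b11001,
    "E": 0b11101, "F": 0b10110, "G": 0b00111,
}
ALL_ONES = 0b11111

def popcount(vector):
    return bin(vector).count("1")

def extend(levels, size, item):
    """levels[d] holds FT-Appear_P(d); return the levels of P extended by `item`."""
    new_size = size + 1
    new_levels = []
    for d in range(len(levels)):
        if new_size <= d:
            new_levels.append(ALL_ONES)
        elif d == 0:
            new_levels.append(levels[0] & APPEAR[item])
        else:
            new_levels.append(levels[d - 1] | (levels[d] & APPEAR[item]))
    return new_levels

def walk(pattern, delta):
    """Build the levels of `pattern` one item at a time, starting from the empty pattern."""
    levels = [ALL_ONES] * (delta + 1)
    for size, item in enumerate(pattern):
        levels = extend(levels, size, item)
    return levels

for p in ("BD", "BDE", "BDEF", "BDEG"):
    vec = walk(p, 1)[1]
    item_sups = {x: popcount(vec & APPEAR[x]) for x in p}
    print(p, format(vec, "05b"), popcount(vec), item_sups)
```

Each extension costs a constant number of bitwise operations per tolerance level, in contrast to enumerating sub-patterns.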


35 Experimental environment: Visual C; P4 2.4 GHz CPU; 256 MB main memory; OS: Windows XP Professional. Synthetic data generator: IBM website.

36 Experiment 1: min-sup_item is changed; T10I8D100kN450 (δ = 1)

37 Experiment 2: min-sup_FT is changed; T10I8D10kN1k (δ = 1)

38 Experiment 3: fault tolerance δ is changed; T10I8D100kN450

39 Experiment 4: database size is changed; T10I8N450 (δ = 1)

40 Experiment 5: the number of various items in the database is changed; T10I8D100k (δ = 1)

41 Conclusion: the VB-FT-Mine algorithm is proposed. It constructs the FT-appearing vectors of candidates and computes FT-support and Item-support efficiently, giving a significant improvement in execution time over the FT-Apriori algorithm. Future work: extend the VB-FT-Mine algorithm to mine frequent patterns in data streams.

42 An Efficient Approach for Mining Top-K Fault-Tolerant Repeating Patterns Jia-Ling Koh and Yu-Ting Kung DASFAA 2006

43 Outline: Introduction; Basic Terms; Bit Sequence Representation; Mining Top-K non-trivial FT-RPs with min_len Constraints; Performance Study; Conclusion & Future Works

44 Introduction: repeating patterns are the sub-patterns appearing in a data sequence repeatedly. Applications: music feature extraction, user behavior monitoring. In most studies, only exact matching was considered.

45 Introduction (Cont.) For example: data sequence = ACDE……ACEDE…. Using the exact matching approach, the frequency of "ACDE" is 1. Allowing an insertion error finds the implicit repeating pattern "ACDE".

46 Introduction (Cont.) Idea: 1. Discover fault-tolerant repeating patterns (FT-RPs in short), and 2. avoid finding "duplicated" information and "short" patterns. Goal: mining "top-K non-trivial FT-RPs with length no less than min_len".

47 Outline: Introduction; Basic Terms; Bit Sequence Representation; Mining Top-K non-trivial FT-RPs with min_len Constraints; Performance Study; Conclusion & Future Works

48 Data Sequence: E = {A, B, …, Z} is the set of data items. DSeq = D1 D2 … Dn is a data sequence, where Di ∈ E (i = 1…n). E.g. DSeq = ABCDABCACDEE, with |DSeq| = 12 the length of DSeq.

49 Contain & Appear: DSeq = ABCDABCDA, P = CDA. DSeq contains P at position 3 (P appears at position 3), and again at position 7, so freq(P) = 2.
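Exact-match appearance and frequency can be sketched with a sliding window. Positions are 1-based as on the slide; the function names are hypothetical.

```python
# Illustrative sketch of exact matching, not the authors' code.
def appear_positions(dseq, pattern):
    """1-based positions at which `pattern` appears in `dseq` exactly."""
    m = len(pattern)
    return [i + 1 for i in range(len(dseq) - m + 1) if dseq[i:i + m] == pattern]

def freq(dseq, pattern):
    """Number of exact appearances of `pattern` in `dseq`."""
    return len(appear_positions(dseq, pattern))

print(appear_positions("ABCDABCDA", "CDA"), freq("ABCDABCDA", "CDA"))
```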

50 FT-contain: insertion error. DSeq = ABCDABCA, P = ABCA. DSeq FT-contains P at position 1 with 1 insertion error (ABCDA), and at position 5 with 0 insertion errors (ABCA).

51 FT-contain: deletion error. DSeq = ABCBCA, P = BCD. DSeq FT-contains P at positions 2 and 4 with 1 deletion error (BC is matched).

52 IFT-contain & IFT-appear, with 0, 1 or 2 insertion errors allowed: DSeq = ABCDABCA, P = ABCA. DSeq IFT-contains P, and P IFT-appears in DSeq, at positions 1 (ABCDA) and 5 (ABCA).
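A brute-force reading of IFT-containment can be sketched as follows. It assumes, based on the insertion layouts shown later in the deck (A B x x C, A x B x C, A x x B C), that inserted items fall strictly between the first and last items of the pattern; that reading and all function names are assumptions, not the authors' definition.

```python
# Illustrative brute-force sketch under the stated assumption.
def is_subsequence(small, big):
    """True if `small` can be read off `big` left to right, gaps allowed."""
    remaining = iter(big)
    return all(ch in remaining for ch in small)

def ift_contains_at(dseq, pattern, pos, errors):
    """Does `pattern` match at 1-based `pos` with exactly `errors` inserted items?"""
    window = dseq[pos - 1:pos - 1 + len(pattern) + errors]
    if len(window) != len(pattern) + errors:
        return False  # the window runs past the end of the sequence
    if len(pattern) == 1:
        return errors == 0 and window == pattern
    return (window[0] == pattern[0] and window[-1] == pattern[-1]
            and is_subsequence(pattern[1:-1], window[1:-1]))

def ift_positions(dseq, pattern, delta):
    """Map each matching position to the smallest number of insertion errors <= delta."""
    found = {}
    for pos in range(1, len(dseq) + 1):
        for errors in range(delta + 1):
            if ift_contains_at(dseq, pattern, pos, errors):
                found[pos] = errors
                break
    return found

print(ift_positions("ABCDABCA", "ABCA", delta=2))
```

Under this reading, counting the matching positions of P = CA in ABCDABCAECDAA with one insertion error allowed gives 3.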

53 DFT-contain & DFT-appear, with 0, 1 or 2 deletion errors allowed: DSeq = ABCBCA, P = BCD. DSeq DFT-contains P, and P DFT-appears in DSeq, where BC is matched.

54 Fault-Tolerant Frequency: DSeq = ABCDABCAECDAA, P = CA. FT-freq_DSeq(P) = 3.

55 Fault-tolerant Repeating Patterns (FT-RPs): given DSeq and P, if FT-freq_DSeq(P) ≥ min_freq, then P is an FT-RP.

56 Outline: Introduction; Basic Terms; Bit Sequence Representation; Mining Top-K non-trivial FT-RPs with min_len Constraints; Performance Study; Conclusion & Future Works

57 Appearing Bit Sequence: DSeq = ABCDABCACDEEABCCDEAC. Appear_A = 10001001000010000010 (bit i is 1 if A occurs at position i). Initially freq("A") = 5.

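58 Bit Index Table for DSeq = ABCDABCACDEEABCCDEAC:
Data Item  Appearing Bit Sequence (Appear_N)
A          10001001000010000010
B          01000100000001000000
C          00100010100000110001
D          00010000010000001000
E          00000000001100000100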
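Building the bit index table takes one scan of the data sequence. A minimal sketch, with bit strings standing in for the bit sequences and `bit_index_table` as a hypothetical name:

```python
# Illustrative sketch, not the authors' code.
def bit_index_table(dseq):
    """Map each data item to a bit string marking the positions where it occurs."""
    return {
        item: "".join("1" if ch == item else "0" for ch in dseq)
        for item in sorted(set(dseq))
    }

table = bit_index_table("ABCDABCACDEEABCCDEAC")
for item, bits in table.items():
    print(item, bits, bits.count("1"))
```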

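59 Appearing Bit Sequence of longer patterns: how to obtain Appear_AB for DSeq = ABCDABCACDEEABCCDEAC? Appear_A = 10001001000010000010; l_shift(Appear_B, 1) = 10001000000010000000; Appear_AB = Appear_A ∧ l_shift(Appear_B, 1) = 10001000000010000000; freq("AB") = 3.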

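60 Appearing Bit Sequence of longer patterns (Cont.): how to obtain Appear_ABC for DSeq = ABCDABCACDEEABCCDEAC? Appear_AB = 10001000000010000000; l_shift(Appear_C, 2) = 10001010000011000100; Appear_ABC = Appear_AB ∧ l_shift(Appear_C, 2) = 10001000000010000000; freq("ABC") = 3.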

61 Recursive Function for Appear_P: let P = P1 P2 … Pm-1 Pm, P’ = P1 … Pm-1 and X = Pm. Then Appear_P = Appear_P’ ∧ l_shift(Appear_X, m-1).
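The shift-and-AND construction can be sketched with Python integers, taking position 1 of the sequence as the most significant bit so that a left shift moves a later item's bits back onto the pattern's starting position. The iterative loop below unrolls the recursion; the function names are hypothetical.

```python
# Illustrative sketch, not the authors' code.
def item_bits(dseq, item):
    """Appearing bit sequence of one item as an integer (position 1 = leftmost bit)."""
    return int("".join("1" if ch == item else "0" for ch in dseq), 2)

def appear(dseq, pattern):
    """Appearing bit sequence of `pattern`: AND of item sequences shifted left by offset."""
    mask = (1 << len(dseq)) - 1  # discard bits shifted past position 1
    result = item_bits(dseq, pattern[0])
    for offset, item in enumerate(pattern[1:], start=1):
        result &= (item_bits(dseq, item) << offset) & mask
    return result

DSEQ = "ABCDABCACDEEABCCDEAC"
for p in ("A", "AB", "ABC"):
    bits = format(appear(DSEQ, p), f"0{len(DSEQ)}b")
    print(p, bits, bits.count("1"))
```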

62 Fault-Tolerant Appearing Bit Sequences: represent the positions where the data sequence IFT/DFT-contains P under the given fault tolerance. Two cases: insertion fault tolerance and deletion fault tolerance.

63 Appearing Bit Sequence of Insertion Fault Tolerance: for E = 0, 1, … up to the insertion fault tolerance, the appearing bit sequence of P with E insertion errors.

64 How to get it when |P| > 1 and E > 0? Write P = P’X. Example: P = ABC, E = 2:
1) A B x x C (0 insertion errors in P’)
2) A x B x C (1 insertion error in P’)
3) A x x B C (2 insertion errors in P’)
In every case X is shifted by 4 = |P| + E - 1 bit positions.
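The recursive formula itself did not survive in this transcript. One reading that is consistent with the three layouts above is: P matches with exactly E insertions at a position if P’ matches there with 0..E insertions and X sits |P| + E - 1 positions further on. The sketch below implements that reading as an assumption; it should not be taken as the authors' exact formula, and the function names are hypothetical.

```python
# Illustrative sketch of one possible reading, not the authors' formula.
def item_bits(dseq, item):
    """Appearing bit sequence of one item as an integer (position 1 = leftmost bit)."""
    return int("".join("1" if ch == item else "0" for ch in dseq), 2)

def ift_appear(dseq, pattern, errors):
    """Start positions where `pattern` matches with exactly `errors` inserted items."""
    mask = (1 << len(dseq)) - 1
    if len(pattern) == 1:
        # A single item has no interior, so it cannot absorb insertions.
        return item_bits(dseq, pattern) if errors == 0 else 0
    prefix, x = pattern[:-1], pattern[-1]
    prefix_any = 0
    for e in range(errors + 1):  # 0..errors insertions inside the prefix
        prefix_any |= ift_appear(dseq, prefix, e)
    shift = len(pattern) + errors - 1  # fixed offset of X from the start
    return prefix_any & ((item_bits(dseq, x) << shift) & mask)

def ift_freq(dseq, pattern, delta):
    """Count positions matched with at most `delta` insertion errors."""
    combined = 0
    for e in range(delta + 1):
        combined |= ift_appear(dseq, pattern, e)
    return bin(combined).count("1")

print(format(ift_appear("ABCDABCA", "ABCA", 1), "08b"))
print(ift_freq("ABCDABCAECDAA", "CA", 1))
```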

65 Recursive function for P = P1 P2 … Pm-1 Pm, for E = 0 up to the insertion fault tolerance, with P’ = P1 … Pm-1 and X = Pm. (Formula not preserved in the transcript.)

66 Example (figure not preserved in the transcript)

67 Fault-Tolerant Frequency FT-freq_DSeq(P): obtained by counting the number of bits with value "1" in the fault-tolerant appearing bit sequence of P. Whether a pattern P is an FT-RP can therefore be evaluated efficiently.

68 Appearing Bit Sequence of Deletion Fault Tolerance: for P = P1 P2 … Pm, the appearing bit sequence of P with E deletion errors. Write P = Y P’’; P’’ may carry D = 0, 1, … deletion errors, and its bit sequence is shifted by 1 bit position.

69 Recursive function with P’’ = P2 … Pm, for E = 0 up to the deletion fault tolerance, Q = P2 … Pm-1 and X = Pm. (Formula not preserved in the transcript.)

70 Example (figure not preserved in the transcript)

71 Outline: Introduction; Basic Terms; Bit Sequence Representation; Mining Top-K non-trivial FT-RPs with min_len Constraints; Performance Study; Conclusion & Future Works

72 TFTRP-Mine Algorithm: DSeq = ABCDABCACDEEABCCDEAC, min_len = 2 and K = 3; min_freq is given an initial value. 1. Scan DSeq once to construct the bit index table.

73 TFTRP-Mine (Cont.) 2. Generate candidate patterns in depth-first order. [Figure: candidate tree below the root; each node is annotated with its frequency and compared against min_freq = 3 and min_len = 2, and patterns are checked for being non-trivial before entering the Temporal Output Set.] Patterns shown: AB (3), ABC (3), ABCD (3), AC (5), ACD (3). Temporal Output Set: initially empty, then ABCD (3), AC (5), ACD (3).

74 TFTRP-Mine (Cont.) min_freq = 3, K = 3, min_len = 2. [Figure: candidate tree expanded from B (frequency 3), with Minlen_Set and Temporal Output Set.] Patterns shown: AB (3), ABC (3), ABCD (3), BC (3), BCD (3), AC (5), ACD (3).

75 TFTRP-Mine (Cont.) Temporal Output Set: ABCD (3), CAC (3), CDAC (3), CDA (4), CDEAC (3), CDE (4), CEA (3), AC (5), ACD (3), CD (5). Sorted Temporal Output Set: AC (5), CD (5), CDA (4), CDE (4), CAC (3), ACD (3), CEA (3), ABCD (3), CDEAC (3), CDAC (3). Results: AC (5), CD (5), CDA (4), CDE (4).
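The final selection step can be sketched as a sort followed by a cut at the K-th frequency. Keeping every pattern that ties with the K-th one is an assumption, chosen because the slide reports four results for K = 3; the function name is hypothetical.

```python
# Illustrative sketch of the final top-K selection, not the authors' code.
def top_k_with_ties(patterns, k):
    """Patterns whose frequency is at least that of the k-th most frequent pattern."""
    ranked = sorted(patterns, key=lambda pf: -pf[1])  # stable: keeps input order on ties
    if len(ranked) <= k:
        return ranked
    threshold = ranked[k - 1][1]
    return [pf for pf in ranked if pf[1] >= threshold]

TEMPORAL_OUTPUT_SET = [
    ("ABCD", 3), ("CAC", 3), ("CDAC", 3), ("CDA", 4), ("CDEAC", 3),
    ("CDE", 4), ("CEA", 3), ("AC", 5), ("ACD", 3), ("CD", 5),
]

print(top_k_with_ties(TEMPORAL_OUTPUT_SET, k=3))
```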

76 RE-TFTRP-Mine Algorithm: min_len = 2, K = 3. Minlen_Set: AC (5), CD (5), AB (3), BC (3), CA (3), CE (3), DA (3), EA (3). Temporal Top-K Set: AC (5), CD (5), AB (3). min_freq = 3.

77 RE-TFTRP-Mine (Cont.) min_freq = 3. Minlen_Set: AC (5), CD (5), AB (3), BC (3), CA (3), CE (3), DA (3), EA (3). Temporal Top-K Set: AC (5), CD (5), AB (3). Candidates generated: ACD (3), then CDA (4) and CDE (4); each is checked for being non-trivial.

78 RE-TFTRP-Mine (Cont.) min_freq = 3. Minlen_Set: AB (3), BC (3), CA (3), CE (3), DA (3), EA (3), ACD (3), CDA (4), CDE (4). Temporal Top-K Set: AC (5), CD (5), AB (3). Next candidate: CDA (4), extended to CDAC (3), which is checked for being non-trivial.

79 Performance Study. Implementation environment: Borland C++ Builder 5.0; 2.4 GHz Intel Pentium IV PC machine; 512 MB; Microsoft XP Professional. Data sequence generator parameters: L = length of the generated data sequence; E = number of various data items in the generated data sequence.

80 Performance Study (Cont.) Experiment 1: performance evaluation on efficiency, varying one of the five parameters (data parameters: L, E; runtime parameters: the fault tolerance, min_len, K). Experiment 2: performance evaluation on effectiveness on music objects.

81 Experiment 1: changing the size L of a data sequence; min_len = 8, K = 5 and E = 5. Candidate patterns (unit: numbers) for TFTRP-Mine and RE-TFTRP-Mine per value of L; digits partly lost in extraction: ,2958, ,73024, ,07524, ,61036, ,77041,280

82 Experiment 1 (Cont.) Changing the insertion fault tolerance; L2000.E5, min_len = 8, K = 5. Candidate patterns (unit: numbers) for TFTRP-Mine and RE-TFTRP-Mine; digits partly lost in extraction: 15,7605, ,73024, ,51536,600 43,434,80576,195

83 Experiment 1 (Cont.) Changing the setting of min_len; L2000.E5 and K = 5. Candidate patterns (unit: numbers) for TFTRP and RE-TFTRP per value of min_len; digits partly lost in extraction: 541,7303, ,73025, ,73026, ,73027, ,73029, ,73030, ,73038, ,73039, , ,730

84 Experiment 1 (Cont.) Changing the setting of K: K = max_K x 1%, max_K x 20%, …, max_K x 100%; L2000.E5 and min_len = 8. Candidate patterns (unit: numbers):
K/max_K  TFTRP-Mine  RE-TFTRP-Mine
1%       41,730      14,320
20%      41,730      24,300
40%      41,730      31,735
60%      41,730      37,325
80%      41,…        …
100%     41,730      …

85 Experiment 2 (min_freq = 3, K = 2 and min_len = 8)
Columns: Music Object; Found FT-RPs under insertion fault tolerance 0; Found FT-RPs under insertion fault tolerance 1; Found FT-RPs under insertion fault tolerance 2; Motif in Music Object
1 (252 seconds): None 1. Ecgcegcdcgebbdgbbcaeaecaecba aegegeegecbbggdgacaffbfacfcgee ccgeb 2. None ecgcegcdcgebbdgbbcaeaecaecba aegegeegecbbggdgacaffbfacfcgee ccgebceaadfd
2 (270 seconds): None 1. dcbgddebdefgaabbadgcfbge dcdbaccaaccffaaccccddddcca afdd 2. None ddcbgddebdefgaabbadgcfbg edcdbaccaaccffaaccccddddc caaffddgend
3 (256 seconds): None 1. gggbbgfffdcbeeedccfeeeffgf 2. gggbbgfffdcbfeedccfeeeffgf cbcgggbbgfffdcbfeedccfeeef fgfccc
4 (288 seconds): None 1. deededacededaceded aagggagg 2. deecedacededaceded aagggagg 1. gedeededacededacededaagg gaggedgggcbbaaagabeeegaaa agedeededacedegedcaageedd eccgacaagegdegedcacdegedc dagagedddedcacccccbbaaaga beegaaaagedeededacededace dedaagggaggedgggc 2. None baaagabeeegaaaagedeededa cededacededaagggaggedggg cbbaaagabeeegaaaagedeede dacedegedcaageeddeccgaca agegdegedcacdegedcdagage dddedcacccccbbaaagabeeeg aaaagedeededacededaceded aagggaggedgggcbbaaagab
5 (291 seconds): None 1. aegcfaecaholdonfdbfebgfba holdonbgfeebfbgefdabfddbfd holdonbfcaccaecegcholdoncd fbdf 2. None ecgacaegcfaecaholdonfdbfe bgfbaholdonbgfeebfbgefdab fddbfdholdonbfccaccaecegc holdoncdfbdf

86 Conclusion and Future Works. Conclusion: fault-tolerant appearing bit sequences and the TFTRP-Mine and RE-TFTRP-Mine algorithms are proposed for efficiently mining top-K non-trivial FT-RPs with length no less than min_len in data sequences. Future works: partition the bit index table into several parts to perform parallel mining.