Download presentation
Presentation is loading. Please wait.
Published byDerek Hood Modified over 9 years ago
1
Intelligent Database Systems Lab Advisor : Dr. Hsu Graduate : Keng-Wei Chang Author : Anthony K.H. Tung Hongjun Lu Jiawei Han Ling Feng 國立雲林科技大學 National Yunlin University of Science and Technology Efficient Mining of Intertransaction Association Rules IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING VOL. 15, NO. 1, JANUARY/FEBRUARY 2003
2
Intelligent Database Systems Lab Outline Motivation Objective Introduction PROBLEM DESCRIPTION PRINCIPLES OF INTERTRANSACTION ASSOCIATION MINNING METHODS THE FITI ALGORITHM PERFORMANCE STUDY CONCLUSION PERSONAL OPINION N.Y.U.S.T. I.M.
3
Intelligent Database Systems Lab Intratransaction v.s. Intertransaction Intratransaction same transaction same customer same day Intertransaction different transaction
4
Intelligent Database Systems Lab Motivation Most of the previous studies on mining association rules are on mining intratransaction associations. same transaction same customer same day
5
Intelligent Database Systems Lab Objective we break the barrier of transactions and extend the scope of mining association rules from traditional single-dimensional, intratransaction associations to multidimensional, intertransaction associations.
6
Intelligent Database Systems Lab Introduction Most of the pervious studies … There is an important form of association rules which are useful but could not discovered with existing association rule mining framework.
7
Intelligent Database Systems Lab Introduction Example : stock R1: When the prices of IBM and SUN go up, at 80 percent of probability the price of Microsoft goes up on the same day. R2: If the prices of IBM and SUN go up, Microsoft’s will most likely(80 percent of probability) go up the next day and then drop four days later.
8
Intelligent Database Systems Lab Introduction It is different from mining sequential patterns in transaction data. Contribution an association mining problem that is more general than what has been discussed in the literature. Developed an efficient algorithm for mining intertransaction association rule from large databases. (FITI)
9
Intelligent Database Systems Lab PROBLEM DESCRIPTION Definition 1. Let be a set of literals, called items. Let D be an attribute and Dom(D) be the domain of D. A transaction database is a database containing transactions in the form of (d, E), where Definition 2. A sliding window W in a transaction database T is a block of w continuous intervals along domain D, starting from interval such that T contains a transaction at interval. Each interval in W is called a subwindow of W denoted as W[j], where. We call j the subwindow number of within W.
10
Intelligent Database Systems Lab PROBLEM DESCRIPTION Definition 3. Let W be a sliding window with w intervals and u be the number of literals in ∑. We define a megatransaction M contained within W to be To distinguish the items in a megatransaction from the items in a traditional transaction, the items in a megatransaction are called extended-items.
11
Intelligent Database Systems Lab PROBLEM DESCRIPTION Definition 4. intratransaction itemset is a set of items An intertransaction itemset is a set of extended items such that Definition 5. An intertransaction association rule is an implication of the form, where
12
Intelligent Database Systems Lab PROBLEM DESCRIPTION Definition 6. Let S be the number of transactions in the transaction database. Let be the set of megatransactions that contain a set of extended-items and be the set of megatransactions that contain X. Then, the support and confidence of an intertransaction association rule are defined as
13
Intelligent Database Systems Lab PRINCIPLES OF INTERTRANSACTION ASSOCIATION MINNING METHODS Decomposed into two subproblems:
14
Intelligent Database Systems Lab THE FITI ALGORITHM FITI consists of the following three phasses.
15
Intelligent Database Systems Lab Phase 1 : Mining and Storing Frequent IntraTransaction Itemsets Frequent intratransaction itemsets are first mined using the Apriori algorithm Frequent-Itemsets Linked Table, FILT ItemSet Hash Table 1. Lookup Links 2. Generator and Extension Links 3. Subset Links 4. Descendant Links
16
Intelligent Database Systems Lab Phase 1 : Mining and Storing Frequent IntraTransaction Itemsets
17
Intelligent Database Systems Lab Phase 2 : Database Transformation FITI is to transform the database into a set of Encoded Frequent-Itemset Tables, FIT
18
Intelligent Database Systems Lab
19
Phase 3 : Mining frequent Intertrasaction Itemsets
20
Intelligent Database Systems Lab PERFORMANCE STUDY Generation of Synthetic Data Relative Performance Effect of Increasing Maxspan Effect of Increasing the Number of Transactions Effect of Increasing the Average Transaction Size Experiment on Real-Life Data
21
Intelligent Database Systems Lab Generation of Synthetic Data TABLE4 Parameter Settings
22
Intelligent Database Systems Lab Relative Performance Compare the performance of EH-Apriori and FITI vary the support level on data sets one and two maxspan equal to four and six
23
Intelligent Database Systems Lab Relative Performance
24
Intelligent Database Systems Lab Effect of Increasing Maxspan vary maxspan from two to eight fro both data sets
25
Intelligent Database Systems Lab Effect of Increasing the Number of Transactions Vary the number of transactions in dataset one from 10k to 1,000k increase linearly
26
Intelligent Database Systems Lab Effect of Increasing the Average Transaction Size Increase the average transaction size of data set one from five to 20
27
Intelligent Database Systems Lab Experiment on Real-Life Data
28
Intelligent Database Systems Lab CONCLUSION Intorduce a new problem of mining intertransaction association rules and provided an extensive study of it. EH-Apriori and FITI FITI proves to be much faster than EH-Apriori Performance is acceptable for real life application Useful in providing prediction capability along a single dimension.
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.