Mining Progressive Confident Rules M. Zhang, W. Hsu and M.L. Lee Int'l Conf on Data Mining (ICDM),2006 IEEE Advisor : Jia-Ling Koh Speaker : Tsui-Feng.

Slides:



Advertisements
Similar presentations
Recap: Mining association rules from large datasets
Advertisements

Association Analysis (2). Example TIDList of item ID’s T1I1, I2, I5 T2I2, I4 T3I2, I3 T4I1, I2, I4 T5I1, I3 T6I2, I3 T7I1, I3 T8I1, I2, I3, I5 T9I1, I2,
Mining Multiple-level Association Rules in Large Databases
A Phrase Mining Framework for Recursive Construction of a Topical Hierarchy Date : 2014/04/15 Source : KDD’13 Authors : Chi Wang, Marina Danilevsky, Nihit.
Association Rules Spring Data Mining: What is it?  Two definitions:  The first one, classic and well-known, says that data mining is the nontrivial.
Mining in Anticipation for Concept Change: Proactive-Reactive Prediction in Data Streams YING YANG, XINDONG WU, XINGQUAN ZHU Data Mining and Knowledge.
Sampling Large Databases for Association Rules ( Toivenon’s Approach, 1996) Farzaneh Mirzazadeh Fall 2007.
Data Mining Association Analysis: Basic Concepts and Algorithms
Association Analysis. Association Rule Mining: Definition Given a set of records each of which contain some number of items from a given collection; –Produce.
Learning Fuzzy Association Rules and Associative Classification Rules Jianchao Han Computer Science Department California State University Dominguez Hills.
Data Mining Techniques So Far: Cluster analysis K-means Classification Decision Trees J48 (C4.5) Rule-based classification JRIP (RIPPER) Logistic Regression.
Data Mining Association Analysis: Basic Concepts and Algorithms Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach, Kumar Introduction.
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach,
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach,
Ex. 11 (pp.409) Given the lattice structure shown in Figure 6.33 and the transactions given in Table 6.24, label each node with the following letter(s):
Association Analysis (2). Example TIDList of item ID’s T1I1, I2, I5 T2I2, I4 T3I2, I3 T4I1, I2, I4 T5I1, I3 T6I2, I3 T7I1, I3 T8I1, I2, I3, I5 T9I1, I2,
Data Mining Association Analysis: Basic Concepts and Algorithms
Temporal Pattern Matching of Moving Objects for Location-Based Service GDM Ronald Treur14 October 2003.
Association Analysis: Basic Concepts and Algorithms.
1 Mining Quantitative Association Rules in Large Relational Database Presented by Jin Jin April 1, 2004.
Data Mining Association Analysis: Basic Concepts and Algorithms
6/23/2015CSE591: Data Mining by H. Liu1 Association Rules Transactional data Algorithm Applications.
Association Rule Mining (Some material adapted from: Mining Sequential Patterns by Karuna Pande Joshi)‏
Fast Algorithms for Association Rule Mining
1 Fast Algorithms for Mining Association Rules Rakesh Agrawal Ramakrishnan Srikant Slides from Ofer Pasternak.
Mining Association Rules
1 Synthesizing High-Frequency Rules from Different Data Sources Xindong Wu and Shichao Zhang IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL.
Data Mining : Introduction Chapter 1. 2 Index 1. What is Data Mining? 2. Data Mining Functionalities 1. Characterization and Discrimination 2. MIning.
03/23/09 AI Week 15 Machine Learning: Data Mining : Association Rule Mining, Associative Classification, Applications Lee McCluskey, room 3/10
Presented by Tienwei Tsai July, 2005
1 Verifying and Mining Frequent Patterns from Large Windows ICDE2008 Barzan Mozafari, Hetal Thakkar, Carlo Zaniolo Date: 2008/9/25 Speaker: Li, HueiJyun.
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach,
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining By Tan, Steinbach, Kumar Lecture.
Modul 7: Association Analysis. 2 Association Rule Mining  Given a set of transactions, find rules that will predict the occurrence of an item based on.
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach,
Intelligent Database Systems Lab Advisor : Dr. Hsu Graduate : Keng-Wei Chang Author : Anthony K.H. Tung Hongjun Lu Jiawei Han Ling Feng 國立雲林科技大學 National.
Association Rules. CS583, Bing Liu, UIC 2 Association rule mining Proposed by Agrawal et al in Initially used for Market Basket Analysis to find.
Incident Threading for News Passages (CIKM 09) Speaker: Yi-lin,Hsu Advisor: Dr. Koh, Jia-ling. Date:2010/06/14.
MINING FREQUENT ITEMSETS IN A STREAM TOON CALDERS, NELE DEXTERS, BART GOETHALS ICDM2007 Date: 5 June 2008 Speaker: Li, Huei-Jyun Advisor: Dr. Koh, Jia-Ling.
Adding Semantics to Clustering Hua Li, Dou Shen, Benyu Zhang, Zheng Chen, Qiang Yang Microsoft Research Asia, Beijing, P.R.China Department of Computer.
Expert Systems with Applications 34 (2008) 459–468 Multi-level fuzzy mining with multiple minimum supports Yeong-Chyi Lee, Tzung-Pei Hong, Tien-Chin Wang.
CS685 : Special Topics in Data Mining, UKY The UNIVERSITY of KENTUCKY Classification - SVM CS 685: Special Topics in Data Mining Spring 2008 Jinze Liu.
Fast Algorithms for Mining Association Rules Rakesh Agrawal and Ramakrishnan Srikant VLDB '94 presented by kurt partridge cse 590db oct 4, 1999.
Towards Using Grid Services for Mining Fuzzy Association Rules Mihai Gabroveanu, Ion Iancu, Mirel Cosulschi, Nicolae Constantinescu Faculty of Mathematics.
Mining Quantitative Association Rules in Large Relational Tables ACM SIGMOD Conference 1996 Authors: R. Srikant, and R. Agrawal Presented by: Sasi Sekhar.
Association Rule Mining
© Tan,Steinbach, Kumar Introduction to Data Mining 4/18/ Data Mining: Association Analysis This lecture node is modified based on Lecture Notes for.
1 AC-Close: Efficiently Mining Approximate Closed Itemsets by Core Pattern Recovery Advisor : Dr. Koh Jia-Ling Speaker : Tu Yi-Lang Date : Hong.
Mining Frequent Patterns, Associations, and Correlations Compiled By: Umair Yaqub Lecturer Govt. Murray College Sialkot.
1 Mining Sequential Patterns with Constraints in Large Database Jian Pei, Jiawei Han,Wei Wang Proc. of the 2002 IEEE International Conference on Data Mining.
SeqStream: Mining Closed Sequential Pattern over Stream Sliding Windows Lei Chang Tengjiao Wang Dongqing Yang Hua Luan ICDM’08 Lei Chang Tengjiao Wang.
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach,
On Reducing Classifier Granularity in Mining Concept-Drifting Data Streams Peng Wang, H. Wang, X. Wu, W. Wang, and B. Shi Proc. of the Fifth IEEE International.
1 Mining the Smallest Association Rule Set for Predictions Jiuyong Li, Hong Shen, and Rodney Topor Proceedings of the 2001 IEEE International Conference.
Sorting by placement and Shift Sergi Elizalde Peter Winkler By 資工四 B 周于荃.
Searching for Pattern Rules Guichong Li and Howard J. Hamilton Int'l Conf on Data Mining (ICDM),2006 IEEE Advisor : Jia-Ling Koh Speaker : Tsui-Feng Yen.
A new Informative Generic Base of Association Rules Gh. Gasmi, S. Ben Yahia, E. Mephu Nguifo, and Y. Slimani PAKDD’05 Advisor : Jia-Ling Koh Speaker :
Mining General Temporal Association Rules for Items with Different Exhibition Cheng-Yue Chang, Ming-Syan Chen, Chang-Hung Lee, Proc. of the 2002 IEEE international.
Mining Sequential Patterns With Item Constraints
Data Mining Association Analysis: Basic Concepts and Algorithms
Association rule mining
Frequent Pattern Mining
Big Data Analytics: HW#2
Data Mining Association Rules: Advanced Concepts and Algorithms
Mining Association Rules from Stars
Amer Zaheer PC Mohammad Ali Jinnah University, Islamabad
Farzaneh Mirzazadeh Fall 2007
Discriminative Frequent Pattern Analysis for Effective Classification
Market Basket Analysis and Association Rules
DENSE ITEMSETS JOUNI K. SEPPANEN, HEIKKI MANNILA SIGKDD2004
Presentation transcript:

Mining Progressive Confident Rules M. Zhang, W. Hsu and M.L. Lee Int'l Conf on Data Mining (ICDM),2006 IEEE Advisor : Jia-Ling Koh Speaker : Tsui-Feng Yen

Outline Introduction Problem Definition Mining Concise set of PCR Algorithm Experiments Conclusion

Introduction - Real life objects can be described by their attribute values - If we denote the set of attribute values of an object as its state, then the state of an object changes as the attribute values change with time The states of an object at different time stamps form a state sequence The state sequence of an object also captures more information than the current state for classification

Introduction 3 people with anemia have CLL (IDs 1–3), while the other 3 with anemia do not (IDs 6–8), conf (anemia, CLL) = 0.5 conf (night sweat, CLL) = 0.5 we see that most CLL patients (IDs 1–3) have night sweat → fever → achroacytosis conf(night sweat → fever → achroacytosis,CLL) = 0.8

Introduction The novelties of this paper : -a new kind of pattern called progressive confident rules that describes the changing states of objects with increasing probabilities that lead to some end state. -general form of a progressive confident rule is X1 → X2 →... → Xn, Xi is a state By progressive confident, we mean the probability of an object achieving state C in the future increases as state Xi(1 ≤ i ≤ n) appears one after another

Introduction three additional criteria. : - there is a minimum support of the rule to ensure that the rule is general - the probability of the first state X 1 leading to the end state C should be no less than some threshold min_conf 1 -, the probability of the entire rule leading to C should be no less than a threshold min_conf 2

Problem Definition - I = {i 1, i 2,..., i m } be a set of literals called items. - A state or an itemset X ⊆ I is a set of items For example, if item a represents the symptom “sick”, item b represents the symptom “fever”, then state {a, b} means patient is sick and has fever. - A pattern P = X1 → X2 →... → Xn is an ordered set of states The length of P is defined as the number of states in P. We use |P| to represent the length of P. For example, if P = {a} →{b, c} → {d}, then |P| = 3.

Problem Definition An observation o is an itemset T with a time stamp (tid), i.e., o = (tid, T). If a state X ⊆ T, we say X is contained in the observation o. A sequence s is composed of a sequence id (sid) and an ordered list of observations, i.e. s = (sid, o1, o2,..., on.) Ex.

Problem Definition

A database D is composed of a number of sequences whose end states are known. We use D C to represent the set of sequences in D that end with state C, and D NC to represent the set of sequences in D that does not end with state C. Thus, D = D C ∪ D NC. The confidence of a pattern P resulting in class C is

Problem Definition

A base-pattern of a pattern Q is obtained by removing the last few states (itemsets) fromQ. For example, if Q = {a} → {b, c} →{d}, then {a} → {b, c} is Q’s base-pattern, and {a} → {b} is not.If P is a base-pattern of Q, we also say Q is an extender of P. Given a pattern Q, if conf(Q,C) = 1 and none of its base pattern P satisfies the condition conf(P, C) = 1, we say Q is a terminator pattern. Given a set of patterns V and a pattern P ∈ V, if P is not a base-pattern of any other pattern P in V, we say P is a maximal pattern in V.

Problem Definition Example 1. Suppose P = {a} → {b, c} and Q = {a} →{b, c} → {d} are two progressive confident rules. If conf(P,C) = 1, by definition of confidence, conf(Q,C) must also be 1, and the information captured in Q is redundant. In Example 1, all the extenders of a terminator pattern can be ignored without loss of information.

Problem Definition Example 2. Suppose P = {a} → {b, c} and Q = {a} →{b, c} → {d} are two progressive confident rules. If conf(P,C) <1, then all information in P is already contained in Q. if a pattern is neither a terminator pattern nor a maximal pattern (such as P in Example 2), then that pattern can be excluded from the concise set. Stragegy (1) delete all extenders of terminator patterns in R; (2) remove non-maximal patterns.

b

Mining concise set of PCR

Mining concise set of PCR If sup(P,C) or sup(X,C) is not large enough (compared to then conf(P → X,C) <conf(P,C)

Algorithm

Experiments

Experiments