Association Rules Olson Yanhong Li. Fuzzy Association Rules Association rules mining provides information to assess significant correlations in large.

Association Rules Hawaii International Conference on System Sciences (HICSS-40) January 2007 David L. Olson Yanhong Li.
Presentation transcript:

Association Rules Olson Yanhong Li

Fuzzy Association Rules
Association rule mining provides information to assess significant correlations in large databases
Rule form: IF X THEN Y
SUPPORT: degree to which the relationship appears in the data
CONFIDENCE: probability that if X, then Y
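For reference, the standard crisp (non-fuzzy) definitions of these two measures can be written as below. The symbols D (the set of transactions) and t (a single transaction) are notation assumed here, not taken from the slides; the fuzzy variants discussed later replace the counts with membership degrees.

```latex
\[
\operatorname{supp}(X \Rightarrow Y) = \frac{\lvert \{\, t \in D : X \cup Y \subseteq t \,\} \rvert}{\lvert D \rvert},
\qquad
\operatorname{conf}(X \Rightarrow Y) = \frac{\operatorname{supp}(X \cup Y)}{\operatorname{supp}(X)}
\]
```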

Association Rule Algorithms
APriori (Agrawal et al., 1993; Agrawal & Srikant, 1994)
– Find correlations among transactions, binary values
Weighted association rules (Cai et al., 1998; Lu et al.)
Cardinal data (Srikant & Agrawal, 1996)
– Partitions attribute domain, combines adjacent partitions until binary

Fuzzy Association Rules
Most based on APriori algorithm
Treat all attributes as uniform
Can increase number of rules by decreasing minimum support, decreasing minimum confidence
– Generates many uninteresting rules
– Software takes a lot longer

Gyenesei (2000)
Studied weighted quantitative association rules in fuzzy domain
– With & without normalization
– NONNORMALIZED
  Used product operator to define combined weight and fuzzy value
  If weight small, support level small, tends to have data overflow
– NORMALIZED
  Used geometric mean of item weights as combined weight
  Support then very small

Algorithm
1. Get membership functions, minimum support, minimum confidence
2. Assign weight to each fuzzy membership for each attribute (categorical)
3. Calculate support for each fuzzy region
4. If support > minimum, OK
5. If confidence > minimum, OK
6. If both OK, generate rules

Demo Model: Loan App
Columns: Case, Age, Income, Risk, Credit, Result
Color-coded entries as listed: Red, Green, Green, Amber, Green, Green, Green, Green, Green, Red

Fuzzified Age
Figure 2: The membership functions of attribute Age (x-axis: Age; y-axis: Membership value; fuzzy sets: Young, Middle, Old)

Fuzzify Age
Columns: Case, Age, Young, Middle, Old
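A minimal sketch of how an Age value could be fuzzified into Young, Middle, and Old memberships. The breakpoints (25, 35, 50, 60), the piecewise-linear shape, and the function names are all hypothetical choices made for illustration; the membership functions actually used in the presentation are those of Figure 2.

```python
def falling(x, a, b):
    """Return 1.0 at or below a, 0.0 at or above b, and a linear ramp in between."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    return (b - x) / (b - a)


def rising(x, a, b):
    """Mirror image of falling: 0.0 at or below a, 1.0 at or above b."""
    return 1.0 - falling(x, a, b)


def fuzzify_age(age):
    """Map a crisp age to membership degrees (breakpoints are assumed, not from the slides)."""
    return {
        "Young": falling(age, 25, 35),
        "Middle": min(rising(age, 25, 35), falling(age, 50, 60)),
        "Old": rising(age, 50, 60),
    }


row = fuzzify_age(30)
```

With these assumed breakpoints, an age of 30 sits halfway down the Young ramp and halfway up the Middle ramp, so it belongs partly to both categories and not at all to Old.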

Calculate Support for Each Pair of Fuzzy Categories
Membership value
– Identify weights for each attribute
– Identify highest fuzzy membership category for each case
– Membership value = minimum weight associated with highest fuzzy membership category
Support
– Average membership value for all cases
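The slide's description is terse, so the following is one possible reading of it, not a confirmed reproduction of the authors' procedure. It assumes one weight per attribute, takes a case's value for a pair to be the smaller of the two weighted memberships when both categories are that case's highest-membership categories (and zero otherwise), and averages over all cases. The data, weights, and function names are hypothetical.

```python
def top_category(memberships):
    """Category with the highest membership degree (first one wins on ties)."""
    return max(memberships, key=memberships.get)


def pair_value(case, pair, weights):
    """Per-case value for a pair of (attribute, category) entries.

    Assumed rule: zero unless each category is the case's top category for
    its attribute; otherwise the minimum of weight * membership.
    """
    values = []
    for attr, cat in pair:
        if top_category(case[attr]) != cat:
            return 0.0
        values.append(weights[attr] * case[attr][cat])
    return min(values)


def fuzzy_support(cases, pair, weights):
    """Average per-case value over all cases."""
    return sum(pair_value(c, pair, weights) for c in cases) / len(cases)


# Hypothetical fuzzified cases and attribute weights.
cases = [
    {"Age": {"Young": 0.8, "Middle": 0.2, "Old": 0.0}, "Income": {"Low": 0.6, "High": 0.4}},
    {"Age": {"Young": 0.6, "Middle": 0.4, "Old": 0.0}, "Income": {"Low": 0.1, "High": 0.9}},
    {"Age": {"Young": 0.0, "Middle": 0.3, "Old": 0.7}, "Income": {"Low": 0.7, "High": 0.3}},
    {"Age": {"Young": 1.0, "Middle": 0.0, "Old": 0.0}, "Income": {"Low": 1.0, "High": 0.0}},
]
weights = {"Age": 1.0, "Income": 0.5}
pair = (("Age", "Young"), ("Income", "Low"))

sup = fuzzy_support(cases, pair, weights)
retained = sup >= 0.25  # compare against a minimum support threshold
```

Only the first and last cases contribute here, so the support comes out at roughly 0.2 and this pair would not be retained at a minimum support of 0.25.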

Support
If support for a pair of categories is above minimum support, retain it
Identifies all pairs of fuzzy categories with a sufficiently strong relationship

Pairs: minsup 0.25
[Table of fuzzy-region pairs meeting minimum support 0.25; regions shown include R11, R22, R31, R41, R42]

Confidence
Identify direction
For those training set cases involving the pair of attributes, what proportion came out as predicted?
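Direction matters because the same pair of categories can give very different confidence depending on which side is the antecedent. The sketch below counts, among cases matching the antecedent, the proportion that also match the consequent. It works on crisp category labels for simplicity; the cases, the "OnTime"/"Late" labels, and the function name are hypothetical.

```python
def rule_confidence(cases, antecedent, consequent):
    """Proportion of cases matching the antecedent that also match the consequent."""
    a_attr, a_cat = antecedent
    c_attr, c_cat = consequent
    matching = [c for c in cases if c[a_attr] == a_cat]
    if not matching:
        return 0.0
    correct = sum(1 for c in matching if c[c_attr] == c_cat)
    return correct / len(matching)


# Hypothetical training cases, each reduced to its category labels.
cases = [
    {"Credit": "Green", "Result": "OnTime"},
    {"Credit": "Green", "Result": "OnTime"},
    {"Credit": "Green", "Result": "OnTime"},
    {"Credit": "Red", "Result": "OnTime"},
    {"Credit": "Red", "Result": "Late"},
]

forward = rule_confidence(cases, ("Credit", "Green"), ("Result", "OnTime"))
backward = rule_confidence(cases, ("Result", "OnTime"), ("Credit", "Green"))
min_confidence = 0.9
kept = [name for name, value in (("forward", forward), ("backward", backward))
        if value >= min_confidence]
```

In this toy data the forward rule (Green credit implies on-time) holds for every matching case, while the reverse holds for only three of four, so only the forward direction clears a 0.9 threshold.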

Confidence Values: Pairs
Minimum confidence 0.9
[Table of directed fuzzy-region pairs with their confidence values; regions shown include R22, R31, R41, R51]

Rules vs. Support

Rules vs. Confidence

Higher order combinations
Try triplets
– If ambitious, sets of 4, and beyond
Problem:
– Computational complexity explodes
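To give a rough sense of the growth, the sketch below counts how many candidate combinations exist at each size. The figure of 15 fuzzy categories is an assumption for illustration (for example, five attributes with three categories each), and the count is an upper bound because it does not exclude combinations drawn from the same attribute.

```python
from math import comb


def candidate_counts(n_categories, max_size):
    """Number of unordered combinations of each size from 2 up to max_size."""
    return {k: comb(n_categories, k) for k in range(2, max_size + 1)}


# Assumed: 15 fuzzy categories in total.
counts = candidate_counts(15, 4)
```

Under this assumption there are 105 pairs, 455 triplets, and 1365 sets of four, and each candidate needs its own pass over the cases to compute support and confidence.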

Research
The higher the minimum support, the fewer rules you get
The higher the minimum confidence, the fewer rules you get
Weights can yield more rules
Greatest accuracy seemed to be at intermediate levels of support
– Higher levels of confidence