Data Mining II: Association Rule mining & Classification

Presentation transcript:

Data Mining II: Association Rule Mining & Classification
Jagdish Gangolly
State University of New York at Albany
Acc 522 Fall 2001 Jagdish S. Gangolly 11/15/2018

Data Mining II
- Attribute-oriented induction
- Mining association rules
- Mining single-dimensional Boolean association rules
- Classification

Attribute-Oriented Induction I
Steps:
- Original query (in DMQL):
  - specify the database to be mined
  - specify relevant attributes
  - specify the relation to be mined
  - specify the concept in the hierarchy
- Transformation of the DMQL query into a relational query whose execution yields the initial working relation

Attribute-Oriented Induction II
Attribute removal/generalisation:
- Removal rule: remove an attribute if it has a large set of attribute values and either
  - there is no generalisation operator on the attribute, or
  - its higher-level concepts in the hierarchy are expressed in terms of other attributes (address example)
- Generalisation rule: if there are many attribute values and there are generalisation operators, use them
- Attribute generalisation threshold control

Basic Algorithm for Attribute-Oriented Induction
Input: relational database, DMQL query, a list of attributes, a set of concept hierarchies, attribute generalisation thresholds
Output: a prime generalised relation
Method:
- Collect task-relevant data into a working relation W
- Collect statistics on the working relation
- Derive the prime relation P
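The method above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm as given in the course text: the function name, the dictionary-based concept hierarchies, the sample rows and the threshold values are all hypothetical, and each attribute is generalised by one hierarchy level only.

```python
from collections import Counter

def attribute_oriented_induction(rows, attributes, hierarchies, thresholds):
    """Generalise rows (a list of dicts) into a prime relation with tuple counts.

    hierarchies: hypothetical one-level concept hierarchies, {attribute: {value: parent}}
    thresholds:  maximum number of distinct values allowed per attribute
    """
    # Step 1: collect task-relevant data into the working relation W.
    working = [{a: r[a] for a in attributes} for r in rows]
    kept = []
    for a in attributes:
        # Step 2: collect statistics (distinct values) on the working relation.
        distinct = {r[a] for r in working}
        if len(distinct) <= thresholds.get(a, len(distinct)):
            kept.append(a)          # already general enough
            continue
        mapping = hierarchies.get(a)
        if mapping is None:
            continue                # removal rule: many values, no generalisation operator
        for r in working:           # generalisation rule: climb one level
            r[a] = mapping.get(r[a], r[a])
        kept.append(a)
    # Step 3: derive the prime relation P by merging identical tuples and counting them.
    prime = Counter(tuple(r[a] for a in kept) for r in working)
    return kept, prime

rows = [
    {"name": "Ann", "city": "Albany", "major": "Accounting"},
    {"name": "Bob", "city": "Buffalo", "major": "Accounting"},
    {"name": "Cy", "city": "Boston", "major": "Finance"},
    {"name": "Di", "city": "Boston", "major": "Finance"},
]
hierarchies = {"city": {"Albany": "NY", "Buffalo": "NY", "Boston": "MA"}}
thresholds = {"name": 2, "city": 2, "major": 2}

kept, prime = attribute_oriented_induction(
    rows, ["name", "city", "major"], hierarchies, thresholds
)
```

In this toy run, name is dropped by the removal rule, city is generalised to state level, and the four rows collapse into two generalised tuples with a count of 2 each.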

Mining association rules I
Some examples:
- Market basket analysis: analysing customer buying habits
- Intrusion detection by analysing user habits

Mining association rules II
Basic concepts:
- Set of items I
- Task-relevant data D consisting of database transactions T ⊆ I
- An association rule is an implication of the form A ⇒ B, where A ⊂ I, B ⊂ I, A ∩ B = ∅
- support(A ⇒ B) = P(A ∪ B), the probability that a transaction contains every item in both A and B
- confidence(A ⇒ B) = P(B | A)
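As a worked illustration of the two measures, the sketch below computes support and confidence over a tiny, made-up set of transactions. The function names and the data are assumptions for illustration only; confidence is undefined (division by zero) when the antecedent never occurs.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """P(B | A): support of A and B together divided by support of A."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)

# Hypothetical transaction database D; each transaction T is a set of items.
transactions = [
    {"computer", "financial-management-software"},
    {"computer", "printer"},
    {"computer", "financial-management-software", "printer"},
    {"printer"},
]

sup = support(transactions, {"computer", "financial-management-software"})
conf = confidence(transactions, {"computer"}, {"financial-management-software"})
```

Here two of the four transactions contain both items, so the support is 0.5; three transactions contain computer, so the confidence is 2/3.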

Mining association rules II
Classification of association rules:
- Based on types of values:
  - Boolean association rule: computer ⇒ financial-management-software
  - Quantitative association rule: age(X, “30..39”) ∧ income(X, “42K..48K”) ⇒ buys(X, “financial-management-software”)
- Based on dimensions of data involved in the rule:
  - buys(X, “computer”) ⇒ buys(X, “financial-management-software”)

Mining association rules III
Based on levels of abstraction:
- age(X, “30..39”) ⇒ buys(X, “laptop”)
- age(X, “30..39”) ⇒ buys(X, “computer”)

Mining single-dimensional Boolean association rules I
Apriori algorithm for finding frequent itemsets
Apriori property: all nonempty subsets of a frequent itemset must also be frequent. If P(I) < min_sup, then for any item A, P(I ∪ A) < min_sup.
Steps:
- Join step: a set of candidate k-itemsets, denoted C_k, is generated by joining L_(k-1) with itself
- Prune step: prune C_k by removing any candidate that has a (k-1)-subset not in L_(k-1)
Example 6-1 (p. 232)
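The join and prune steps can be sketched as follows. This is a small, unoptimised illustration, not the textbook's Example 6-1: the transactions are made up, the function name is hypothetical, and min_sup is taken here as an absolute transaction count rather than a probability.

```python
from itertools import combinations

def apriori(transactions, min_sup):
    """Return {itemset: count} for every itemset occurring in at least min_sup transactions."""
    transactions = [frozenset(t) for t in transactions]

    def count_frequent(candidates):
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_sup}

    # L_1: frequent 1-itemsets.
    level = count_frequent({frozenset([i]) for t in transactions for i in t})
    frequent = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join step: build C_k by joining L_(k-1) with itself.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: drop candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = count_frequent(candidates)
        frequent.update(level)
        k += 1
    return frequent

# Hypothetical transactions over items A, B, C.
transactions = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "C"},
    {"A", "B", "C"},
]
frequent = apriori(transactions, min_sup=3)
```

With these transactions every single item and every pair is frequent, while {A, B, C} appears only twice and is rejected when its support is counted.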

Classification I
- Supervised learning
  - Training data
  - Test data
  - Training data are analysed to derive classification rules; the test data are used to estimate the accuracy of those rules
- Unsupervised learning or clustering
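To make the training/test distinction concrete, the sketch below derives very simple classification rules from training data and then estimates their accuracy on separate test data. The one-attribute majority-class rule learner, the function names and the data are all assumptions chosen for brevity, not a method taken from the slides.

```python
from collections import Counter, defaultdict

def train_one_rule(training, attribute):
    """For each value of attribute, predict the majority class seen in the training data."""
    by_value = defaultdict(Counter)
    for features, label in training:
        by_value[features[attribute]][label] += 1
    # Fallback class for attribute values never seen during training.
    default = Counter(label for _, label in training).most_common(1)[0][0]
    rules = {value: c.most_common(1)[0][0] for value, c in by_value.items()}
    return rules, default

def accuracy(rules, default, attribute, test):
    """Fraction of test examples whose predicted class matches the true class."""
    correct = sum(rules.get(f[attribute], default) == label for f, label in test)
    return correct / len(test)

# Hypothetical labelled examples: (features, class).
training = [
    ({"age": "30..39"}, "buys"),
    ({"age": "30..39"}, "buys"),
    ({"age": "20..29"}, "no"),
    ({"age": "20..29"}, "no"),
    ({"age": "20..29"}, "buys"),
]
test = [
    ({"age": "30..39"}, "buys"),
    ({"age": "20..29"}, "no"),
    ({"age": "20..29"}, "buys"),
    ({"age": "40..49"}, "buys"),
]

rules, default = train_one_rule(training, "age")
acc = accuracy(rules, default, "age", test)
```

The rules are learned only from the training list; the test list is held back so that the accuracy figure (0.75 here) is not inflated by examples the rules were fitted to.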

Classification II
Preliminary steps:
- Data cleaning (reduction of noise, missing values, etc.)
- Relevance analysis (feature selection)
- Data transformation (generalisation, normalisation)
Comparison/evaluation of methods:
- Predictive accuracy
- Speed
- Robustness
- Scalability
- Interpretability
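As one example of the data transformation step, the sketch below applies min-max normalisation, which rescales a numeric attribute to a fixed range. The function name and the income figures are hypothetical, and min-max scaling is only one of several normalisation schemes.

```python
def min_max_normalise(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: map everything to the lower bound.
        return [new_min for _ in values]
    return [new_min + (v - lo) / (hi - lo) * (new_max - new_min) for v in values]

incomes = [42000, 45000, 48000]
scaled = min_max_normalise(incomes)
```

For these three incomes the result is 0.0, 0.5 and 1.0, which keeps a large-valued attribute such as income from dominating attributes measured on a smaller scale.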