ICS 421 Spring 2010 Data Mining 1 Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 4/6/20101Lipyeow Lim.

Slides:



Advertisements
Similar presentations
Huffman Codes and Asssociation Rules (II) Prof. Sin-Min Lee Department of Computer Science.
Advertisements

Data Mining Techniques Association Rule
Association Analysis (Data Engineering). Type of attributes in assoc. analysis Association rule mining assumes the input data consists of binary attributes.
Data Mining (Apriori Algorithm)DCS 802, Spring DCS 802 Data Mining Apriori Algorithm Spring of 2002 Prof. Sung-Hyuk Cha School of Computer Science.
Association Rules Spring Data Mining: What is it?  Two definitions:  The first one, classic and well-known, says that data mining is the nontrivial.
IT 433 Data Warehousing and Data Mining Association Rules Assist.Prof.Songül Albayrak Yıldız Technical University Computer Engineering Department
Association Rule Mining. 2 The Task Two ways of defining the task General –Input: A collection of instances –Output: rules to predict the values of any.
10 -1 Lecture 10 Association Rules Mining Topics –Basics –Mining Frequent Patterns –Mining Frequent Sequential Patterns –Applications.
Association rules The goal of mining association rules is to generate all possible rules that exceed some minimum user-specified support and confidence.
Association Analysis. Association Rule Mining: Definition Given a set of records each of which contain some number of items from a given collection; –Produce.
Data Mining Techniques So Far: Cluster analysis K-means Classification Decision Trees J48 (C4.5) Rule-based classification JRIP (RIPPER) Logistic Regression.
Data Mining Association Analysis: Basic Concepts and Algorithms Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach, Kumar Introduction.
Data Mining 198:541. Ramakrishnan and Gehrke. Database Management Systems, 3 rd Edition. Definition Data mining is the exploration and analysis of large.
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach,
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach,
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach,
Data Mining Association Analysis: Basic Concepts and Algorithms
1 Data Warehousing. 2 Data Warehouse A data warehouse is a huge database that stores historical data Example: Store information about all sales of products.
Data Mining. Jim Jim ’ s cows Which cows should I breed??
Data Mining Association Analysis: Basic Concepts and Algorithms
Association Rules Presented by: Anilkumar Panicker Presented by: Anilkumar Panicker.
6/23/2015CSE591: Data Mining by H. Liu1 Association Rules Transactional data Algorithm Applications.
Normalization and Data Mining R&G Chapter 19 Lecture 27 Science is the knowledge of consequences, and dependence of one fact upon another. Thomas Hobbes.
1 ACCTG 6910 Building Enterprise & Business Intelligence Systems (e.bis) Association Rule Mining Olivia R. Liu Sheng, Ph.D. Emma Eccles Jones Presidential.
Mining Association Rules
Data Mining. Jim Which cow should I buy?? Jim ’ s cows RatingAGE Milk Avg. (MA) Name Good56Mona Bad64Lisa Good38Mary Bad56Quirri Good62Paula Bad710Abdul.
1 Fast Algorithms for Mining Association Rules Rakesh Agrawal Ramakrishnan Srikant Slides from Ofer Pasternak.
Pattern Recognition Lecture 20: Data Mining 3 Dr. Richard Spillman Pacific Lutheran University.
1 Data Mining, Database Tuning Tuesday, Feb. 27, 2007.
Association Discovery from Databases Association rules are a simple formalism for expressing positive connections between columns in a 0/1 matrix. A classical.
Association Rules. 2 Customer buying habits by finding associations and correlations between the different items that customers place in their “shopping.
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach,
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining By Tan, Steinbach, Kumar Lecture.
Modul 7: Association Analysis. 2 Association Rule Mining  Given a set of transactions, find rules that will predict the occurrence of an item based on.
Association Rules. CS583, Bing Liu, UIC 2 Association rule mining Proposed by Agrawal et al in Initially used for Market Basket Analysis to find.
Data Mining By Fu-Chun (Tracy) Juang. What is Data Mining? ► The process of analyzing LARGE databases to find useful patterns. ► Attempts to discover.
ASSOCIATION RULE DISCOVERY (MARKET BASKET-ANALYSIS) MIS2502 Data Analytics Adapted from Tan, Steinbach, and Kumar (2004). Introduction to Data Mining.
3.Mining Association Rules in Large Database 3.1 Market Basket Analysis:Example for Association Rule Mining 1.A typical example of association rule mining.
Outline Knowledge discovery in databases. Data warehousing. Data mining. Different types of data mining. The Apriori algorithm for generating association.
CSE4334/5334 DATA MINING CSE4334/5334 Data Mining, Fall 2014 Department of Computer Science and Engineering, University of Texas at Arlington Chengkai.
Instructor : Prof. Marina Gavrilova. Goal Goal of this presentation is to discuss in detail how data mining methods are used in market analysis.
Association Rule Mining Data Mining and Knowledge Discovery Prof. Carolina Ruiz and Weiyang Lin Department of Computer Science Worcester Polytechnic Institute.
Association Rule.. Association rule mining  It is an important data mining model studied extensively by the database and data mining community.  Assume.
Association rule mining Goal: Find all rules that satisfy the user-specified minimum support (minsup) and minimum confidence (minconf). Assume all data.
Association rule mining Goal: Find all rules that satisfy the user-specified minimum support (minsup) and minimum confidence (minconf). Assume all data.
1 What is Association Analysis: l Association analysis uses a set of transactions to discover rules that indicate the likely occurrence of an item based.
Association Rule Mining
DATA MINING By Cecilia Parng CS 157B.
ASSOCIATION RULES (MARKET BASKET-ANALYSIS) MIS2502 Data Analytics Adapted from Tan, Steinbach, and Kumar (2004). Introduction to Data Mining.
Association Rules presented by Zbigniew W. Ras *,#) *) University of North Carolina – Charlotte #) ICS, Polish Academy of Sciences.
DATA MINING Using Association Rules by Andrew Williamson.
Ramakrishnan and Gehrke. Database Management Systems, 3 rd Edition. Definition Data mining is the exploration and analysis of large quantities of data.
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach,
Chapter 8 Association Rules. Data Warehouse and Data Mining Chapter 10 2 Content Association rule mining Mining single-dimensional Boolean association.
Association Rules Carissa Wang February 23, 2010.
Chap 6: Association Rules. Rule Rules!  Motivation ~ recent progress in data mining + warehousing have made it possible to collect HUGE amount of data.
1 Data Mining Lecture 6: Association Analysis. 2 Association Rule Mining l Given a set of transactions, find rules that will predict the occurrence of.
Chapter 26: Data Mining Prepared by Assoc. Professor Bela Stantic.
Chapter 3 Data Mining: Classification & Association Chapter 4 in the text box Section: 4.3 (4.3.1),
Introduction to Data Mining Mining Association Rules Reference: Tan et al: Introduction to data mining. Some slides are adopted from Tan et al.
Data Mining – Introduction (contd…) Compiled By: Umair Yaqub Lecturer Govt. Murray College Sialkot.
Data Mining Association Analysis: Basic Concepts and Algorithms
A Research Oriented Study Report By :- Akash Saxena
Data Mining.
Market Basket Analysis and Association Rules
Data Mining II: Association Rule mining & Classification
Data Mining Association Rules Assoc.Prof.Songül Varlı Albayrak
Data Mining Association Rules: Advanced Concepts and Algorithms
Association Rule Mining
Market Basket Analysis and Association Rules
Presentation transcript:

ICS 421 Spring 2010 Data Mining 1 Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 4/6/20101Lipyeow Lim -- University of Hawaii at Manoa

Definition Data mining is the exploration and analysis of large quantities of data in order to discover valid, novel, potentially useful, and ultimately understandable patterns in data. Example pattern (Census Bureau Data): If (relationship = husband), then (gender = male). 99.6% 4/6/2010Lipyeow Lim -- University of Hawaii at Manoa2 The patterns hold in general. We did not know the pattern beforehand We can devise actions from the patterns We can interpret and comprehend the patterns.

Why Use Data Mining Today? Human analysis skills are inadequate: – Volume and dimensionality of the data – High data growth rate Availability of: – Data – Storage – Computational power – Off-the-shelf software – Expertise 4/6/2010Lipyeow Lim -- University of Hawaii at Manoa3

Compelling Reason Competitive pressure! “The secret of success is to know something that nobody else knows.” Aristotle Onassis Competition on service, not only on price (Banks, phone companies, hotel chains, rental car companies) Personalization, CRM The real-time enterprise “Systemic listening” Security, homeland defense 4/6/2010Lipyeow Lim -- University of Hawaii at Manoa4

The Knowledge Discovery Process 4/6/2010Lipyeow Lim -- University of Hawaii at Manoa5 Original Data Target Data Preprocessed Data Patterns Data Integration and Selection Preprocessing Model Construction Interpretation Knowledge

Market Basket Analysis Consider shopping cart filled with several items Market basket analysis tries to answer the following questions: – Who makes purchases? – What do customers buy together? – In what order do customers purchase items? 4/6/2010Lipyeow Lim -- University of Hawaii at Manoa6

Market Basket Analysis: Data Given: A database of customer transactions Each transaction is a set of items Example: Transaction with TID 111 contains items {Pen, Ink, Milk, Juice} 4/6/2010Lipyeow Lim -- University of Hawaii at Manoa7 TIDCIDDateItemQty /1/99Pen /1/99Ink /1/99Milk /1/99Juice /3/99Pen /3/99Ink /3/99Milk /10/99Pen /10/99Milk /1/99Pen /1/99Ink /1/99Juice /1/99Water1

Market Basket Analysis: “Queries” Co-occurrences – 80% of all customers purchase items X, Y and Z together. Association rules – 60% of all customers who purchase X and Y also buy Z. Sequential patterns – 60% of customers who first buy X also purchase Y within three weeks. 4/6/2010Lipyeow Lim -- University of Hawaii at Manoa8 “Itemset”

Frequent Itemsets An itemset (aka co-occurence) is a set of items The support of an itemset {A,B,...} is the fraction of transactions that contain {A,B,...} – {X,Y} has support s if P(XY) = s Frequent itemsets are itemsets whose support is higher than a user specified minimum support minsup. The a priori property: Every subset of a frequent itemset is also a frequent itemset. 4/6/2010Lipyeow Lim -- University of Hawaii at Manoa9

Frequent Itemset Examples {Pen, Ink, Milk} – Support: 50% {Pen,Ink} – Support: 75% {Ink, Milk} – Support: 50% {Pen, Milk} – Support: 75% {Milk, Juice} – support: ? 4/6/2010Lipyeow Lim -- University of Hawaii at Manoa10 TIDCIDDateItemQty /1/99Pen /1/99Ink /1/99Milk /1/99Juice /3/99Pen /3/99Ink /3/99Milk /10/99Pen /10/99Milk /1/99Pen /1/99Ink /1/99Juice /1/99Water1

Finding Frequent Itemsets Find all itemsets with support > 75% 4/6/2010Lipyeow Lim -- University of Hawaii at Manoa11 TIDCIDDateItemQty /1/99Pen /1/99Ink /1/99Milk /1/99Juice /3/99Pen /3/99Ink /3/99Milk /10/99Pen /10/99Milk /1/99Pen /1/99Ink /1/99Juice /1/99Water1

A Priori Algorithm Foreach item – Check if it is a frequent itemset k= 1 Repeat – Foreach new frequent itemset I k with k items Generate all itemsets I k+1 with k+1 items, I k  I k+1 – Scan all transactions once and check if the generated (k+1)-itemsets are frequent – k=k+1 Until no new frequent itemsets are identified 4/6/2010Lipyeow Lim -- University of Hawaii at Manoa12

Association Rules Rules of the form: LHS => RHS Example: {Pen} => {Ink} – “if pen is purchased in a transaction, it is likely that ink is also purchased in the same transaction” Confidence of a rule: – X  Y has confidence c if P(Y|X) = c Support of a rule: – X  Y has support s if P(XY) = s 4/6/2010Lipyeow Lim -- University of Hawaii at Manoa13

Example {Pen} => {Milk} – Support: 75% – Confidence: 75% {Ink} => {Pen} – Support: 75% – Confidence: 100% {Milk}=>{Juice} – support: ? – Confidence: ? 4/6/2010Lipyeow Lim -- University of Hawaii at Manoa14 TIDCIDDateItemQty /1/99Pen /1/99Ink /1/99Milk /1/99Juice /3/99Pen /3/99Ink /3/99Milk /10/99Pen /10/99Milk /1/99Pen /1/99Ink /1/99Juice /1/99Water1

Finding Association Rules Can you find all association rules with support >= 50%? 4/6/2010Lipyeow Lim -- University of Hawaii at Manoa15 TIDCIDDateItemQty /1/99Pen /1/99Ink /1/99Milk /1/99Juice /3/99Pen /3/99Ink /3/99Milk /10/99Pen /10/99Milk /1/99Pen /1/99Ink /1/99Juice /1/99Water1

Association Rule Algorithm Goal: find association rule with given support minsup and given confidence minconf Step 1: Find frequent itemsets with support minsup Step 2: Foreach frequent itemset, – Foreach possible split into LHS=>RHS Compute the confidence as support(LHS,RHS)/support(LHS) and compare with minconf 4/6/2010Lipyeow Lim -- University of Hawaii at Manoa16

Variations Association rules with isa hierarchies – Items in transactions can be grouped into subsumption hierarchies (like dimension hierarchies) – Items in itemsets can be any node in the hierarchy – Example: Support( {Ink,Juice} ) = 50% Support( {Ink,Beverage} ) = 75% Association rules on time slices – Eg. Find association rules on transactions occurring on the first of the month – Confidence and support within these “slices” will be different than over the entire data set. 4/6/2010Lipyeow Lim -- University of Hawaii at Manoa17