Introduction to Data Mining and Association Rules
CS157, Spring 2009
Instructor: Dr. Sin-Min Lee
Student: Dongyi Jia


What is data mining?
● The automated extraction of hidden predictive information from databases.
● Allows users to analyze large databases to solve business decision problems.
● An extension of statistics, with a few artificial intelligence and machine learning twists thrown in.
● Attempts to discover rules and patterns in data.

Data Mining: On What Kind of Data?
In principle, data mining should be applicable to any kind of information repository:
● relational databases
● data warehouses
● transactional and advanced databases
● flat files
● World Wide Web

Data Mining Functionalities: What Kinds of Patterns Can Be Mined?
● Association Analysis
● Classification and Prediction
● Cluster Analysis
● Evolution Analysis

Applications of data mining
● Some applications require prediction. For example, when a person applies for a credit card, the credit-card company wants to predict whether the person is a good credit risk.
● Other applications look for associations. For example, if a customer buys a book, an online bookstore may suggest other associated books.

Association Rule Discovery
● Task: discover association rules among items in a transaction database.
● How are association rules mined from a large database?
  1. Find all frequent itemsets: each of these itemsets must occur at least as often as a predetermined minimum support count.
  2. Generate strong association rules from the frequent itemsets: these rules must satisfy minimum support and minimum confidence.
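The two mining steps can be sketched in Python as follows. This is a minimal illustration, not the method used in the lecture: it enumerates every candidate itemset by brute force (exponential in the number of items) rather than using an optimized algorithm. The function names, the `baskets` data, and the thresholds are hypothetical.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_count):
    """Step 1: every itemset contained in at least min_count transactions."""
    items = sorted(set().union(*transactions))
    frequent = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            count = sum(1 for t in transactions if itemset <= t)
            if count >= min_count:
                frequent[itemset] = count
    return frequent

def strong_rules(frequent, min_conf):
    """Step 2: rules X => Y built from frequent itemsets, kept if confidence >= min_conf."""
    rules = []
    for itemset, count in frequent.items():
        for size in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), size):
                x = frozenset(lhs)
                # Every subset of a frequent itemset is itself frequent,
                # so frequent[x] is always present.
                conf = count / frequent[x]
                if conf >= min_conf:
                    rules.append((x, itemset - x, conf))
    return rules

# Hypothetical shopping baskets, for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread"},
    {"milk", "eggs"},
]
frequent = frequent_itemsets(baskets, min_count=2)
rules = strong_rules(frequent, min_conf=0.7)
for x, y, conf in rules:
    print(sorted(x), "=>", sorted(y), round(conf, 2))
```

Minimum support is enforced in step 1 (only frequent itemsets are kept), so step 2 only needs to check confidence.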

Association Rules (cont.)
● Retail shops are often interested in associations between items that people buy.
● Someone who buys bread is quite likely also to buy milk.
  association rule: bread => milk
● A person who bought the book Database System Concepts is quite likely also to buy the book Operating System Concepts.
  association rule: DSC => OSC

Association Rules (cont.)
Two numbers characterize an association rule:
● Support: a measure of what fraction of the population satisfies both the antecedent and the consequent of the rule.
● Confidence: a measure of how often the consequent is true when the antecedent is true.

Association Rules (cont.)
● Let $I = \{i_1, i_2, \ldots, i_m\}$ be the total set of items.
● $D$ is a set of transactions; each transaction $d$ consists of a set of items, $d \subseteq I$.
● Association rule: $X \Rightarrow Y$, where $X \subset I$, $Y \subset I$, and $X \cap Y = \emptyset$.
● $\text{support} = \dfrac{\#\,\text{of transactions containing } X \cup Y}{|D|}$
● $\text{confidence} = \dfrac{\#\,\text{of transactions containing } X \cup Y}{\#\,\text{of transactions containing } X}$

Example
● Example of transaction data:
  1. CD player, music CD, music book
  2. CD player, music CD
  3. music CD, music book
  4. CD player
● I = {CD player, music CD, music book}
● |D| = 4
● # of transactions containing both CD player and music CD = 2
● # of transactions containing CD player = 3
● CD player => music CD (sup = 2/4, conf = 2/3)
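The arithmetic in this example can be checked with a short Python sketch. The function names are hypothetical, and the item names are written as "music CD" and "music book" for brevity; the four transactions are the ones listed on the slide.

```python
from fractions import Fraction

# The four transactions from the example, as sets of items.
transactions = [
    {"CD player", "music CD", "music book"},
    {"CD player", "music CD"},
    {"music CD", "music book"},
    {"CD player"},
]

def count_containing(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)

def support(x, y, transactions):
    """Fraction of all transactions that contain both X and Y."""
    return Fraction(count_containing(x | y, transactions), len(transactions))

def confidence(x, y, transactions):
    """Fraction of the transactions containing X that also contain Y."""
    return Fraction(count_containing(x | y, transactions),
                    count_containing(x, transactions))

x, y = {"CD player"}, {"music CD"}
sup = support(x, y, transactions)
conf = confidence(x, y, transactions)
print(sup, conf)  # Fraction reduces 2/4 to 1/2
```

Exact fractions are used instead of floats so the results can be compared directly with the 2/4 and 2/3 stated in the example.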

Association Rules (cont.)
● Rule support and confidence reflect the usefulness and certainty of discovered rules.
● A support of 50% for the association rule means that a CD player and a music CD are purchased together in 50% of all the transactions under analysis.
● A confidence of 67% means that 67% of the customers who purchased a CD player also bought a music CD.

Strong Association Rule
● User sets support and confidence thresholds.
● Rules above the support threshold have LARGE support.
● Rules above the confidence threshold have HIGH confidence.
● Rules satisfying both are said to be STRONG.
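As a small sketch of how these thresholds might be applied, the Python function below labels a single rule. The function name and the threshold values are hypothetical, and treating a value exactly equal to a threshold as passing is an assumption (the slide only says "above").

```python
def classify_rule(support, confidence, min_support, min_confidence):
    """Label a rule against user-set thresholds (>= is an assumed convention)."""
    large = support >= min_support
    high = confidence >= min_confidence
    return {
        "large_support": large,
        "high_confidence": high,
        "strong": large and high,
    }

# CD player => music CD has support 2/4 and confidence 2/3.
# The thresholds 0.4 and 0.7 are illustrative values, not from the slides.
labels = classify_rule(2 / 4, 2 / 3, min_support=0.4, min_confidence=0.7)
print(labels)
```

With these illustrative thresholds the rule has large support but not high confidence, so it is not strong; lowering `min_confidence` to 0.6 would make it strong.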


Thank you !