Decision Trees and Association Rules Prof. Sin-Min Lee Department of Computer Science.

Data Mining: A KDD Process
–Data mining: the core of the knowledge discovery process.
(Diagram: Databases → Data Cleaning → Data Integration → Data Warehouse → Selection → Task-relevant Data → Data Mining → Pattern Evaluation)

Data Mining process model -DM

Search in State Spaces

Decision Trees A decision tree is a special case of a state-space graph: a rooted tree in which each internal node corresponds to a decision, with a subtree rooted at that node for each possible outcome of the decision. Decision trees can be used to model problems in which a series of decisions leads to a solution. The possible solutions of the problem correspond to the paths from the root to the leaves of the decision tree.

Decision Trees Example: The n-queens problem How can we place n queens on an n × n chessboard so that no two queens can capture each other? A queen can move any number of squares horizontally, vertically, and diagonally. Here, the possible target squares of the queen Q are marked with an x.

Let us consider the 4-queens problem. Question: How many possible configurations of 4 × 4 chessboards containing 4 queens are there? Answer: There are 16!/(12! × 4!) = (13 × 14 × 15 × 16)/(2 × 3 × 4) = 13 × 7 × 5 × 4 = 1820 possible configurations. Shall we simply try them out one by one until we encounter a solution? No, it is generally useful to think about a search problem more carefully and discover constraints on the problem’s solutions. Such constraints can dramatically reduce the size of the relevant state space.

Obviously, in any solution of the n-queens problem, there must be exactly one queen in each column of the board. Otherwise, the two queens in the same column could capture each other. Therefore, we can describe the solution of this problem as a sequence of n decisions: Decision 1: Place a queen in the first column. Decision 2: Place a queen in the second column.... Decision n: Place a queen in the n-th column.

Backtracking in Decision Trees
(Diagram: the decision tree for the 4-queens problem — starting from the empty board, place the 1st, 2nd, 3rd, and 4th queen, backtracking whenever a placement leaves no safe square.)
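The column-by-column placement with backtracking can be sketched in a few lines (a minimal Python illustration; the function names are my own, not from the slides):

```python
def solve_n_queens(n):
    """Place one queen per column, backtracking on conflicts."""
    solutions = []
    placement = []  # placement[c] = row of the queen in column c

    def safe(row, col):
        # A new queen conflicts with an earlier one if it shares a row
        # or a diagonal (|row difference| == |column difference|).
        return all(r != row and abs(r - row) != col - c
                   for c, r in enumerate(placement))

    def place(col):
        if col == n:                   # all n decisions made: a solution
            solutions.append(tuple(placement))
            return
        for row in range(n):           # try every square in this column
            if safe(row, col):
                placement.append(row)  # make the decision
                place(col + 1)
                placement.pop()        # backtrack: undo, try the next row

    place(0)
    return solutions

# Of the 1820 raw configurations, only 2 survive as 4-queens solutions.
print(len(solve_n_queens(4)))  # 2
```

The one-queen-per-column constraint is what shrinks the search: the tree has at most n^n leaves instead of C(n², n) board configurations.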

Neural Network
–Many inputs and a single output
–Trained on signal and background samples
–Well understood and mostly accepted in HEP (high-energy physics)
Decision Tree
–Many inputs and a single output
–Trained on signal and background samples
–Used mostly in the life sciences and business

Decision Tree: Basic Algorithm
Initialize the top node to all examples.
While impure leaves are available:
–select the next impure leaf L
–find the splitting attribute A with maximal information gain
–for each value of A, add a child to L

Decision Tree: Find a Good Split
Sufficient statistics to compute information gain: the count matrix for each candidate attribute.
–outlook: gain 0.25 bits
–humidity: gain 0.16 bits
–temperature: gain 0.03 bits
–windy: gain 0.14 bits
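Information gain can be computed directly from the class counts at each child of a candidate split. A minimal Python sketch (the `entropy` and `information_gain` helper names are my own; the example counts assume the familiar 14-example weather data, which reproduces the 0.25-bit outlook gain quoted above):

```python
from math import log2
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(labels, groups):
    """Parent entropy minus the size-weighted entropy of the children."""
    n = len(labels)
    return entropy(labels) - sum(len(g) / n * entropy(g) for g in groups)

# Assumed counts: 9 yes / 5 no at the parent, split three ways by outlook.
parent = ["yes"] * 9 + ["no"] * 5
children = [["yes"] * 2 + ["no"] * 3,   # outlook = sunny
            ["yes"] * 4,                # outlook = overcast (pure)
            ["yes"] * 3 + ["no"] * 2]   # outlook = rain
print(round(information_gain(parent, children), 3))  # 0.247
```

The algorithm picks the attribute with the largest such gain at every impure leaf.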

Decision Trees
–Simple depth-first construction
–Needs the entire dataset to fit in memory
–Unsuitable for large data sets
–Need to “scale up”

Decision Trees

Planning Tool

Decision Trees
–Enable a business to quantify decision making
–Useful when the outcomes are uncertain
–Place a numerical value on likely or potential outcomes
–Allow different possible decisions to be compared

Decision Trees Limitations:
–How accurate is the data used in the construction of the tree?
–How reliable are the estimates of the probabilities?
–The data may be historical – does it relate to real time?
–The need to factor in qualitative factors – human resources, motivation, reaction, relations with suppliers and other stakeholders

Process

Advantages

Disadvantages

Trained Decision Tree
(Figures: binned likelihood fit; limit)

Decision Trees from a Database

Ex Num  Size   Colour  Shape   Concept Satisfied
1       med    blue    brick   yes
2       small  red     wedge   no
3       small  red     sphere  yes
4       large  red     wedge   no
5       large  green   pillar  yes
6       large  red     pillar  no
7       large  green   sphere  yes

Choose target: Concept Satisfied. Use all attributes except Ex Num.

Rules from the Tree
IF (SIZE = large AND ((SHAPE = wedge) OR (SHAPE = pillar AND COLOUR = red)))
   OR (SIZE = small AND SHAPE = wedge)
THEN NO
IF (SIZE = large AND ((SHAPE = pillar AND COLOUR = green) OR (SHAPE = sphere)))
   OR (SIZE = small AND SHAPE = sphere)
   OR (SIZE = medium)
THEN YES
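As a quick check, the seven examples from the table and the extracted rules can be encoded directly (Python sketch; only the NO rule needs testing, since the YES rule covers every remaining case):

```python
examples = [
    # (size, colour, shape) -> concept satisfied
    (("med",   "blue",  "brick"),  "yes"),
    (("small", "red",   "wedge"),  "no"),
    (("small", "red",   "sphere"), "yes"),
    (("large", "red",   "wedge"),  "no"),
    (("large", "green", "pillar"), "yes"),
    (("large", "red",   "pillar"), "no"),
    (("large", "green", "sphere"), "yes"),
]

def predict(size, colour, shape):
    """Apply the NO rule from the tree; everything else is YES."""
    if (size == "large" and (shape == "wedge" or
                             (shape == "pillar" and colour == "red"))) \
            or (size == "small" and shape == "wedge"):
        return "no"
    return "yes"

# The extracted rules reproduce the Concept column for all seven examples.
print(all(predict(*attrs) == label for attrs, label in examples))  # True
```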

Association Rules
–Used to find all rules in basket data (also called transaction data)
–Analyze how items purchased by customers in a shop are related
–Discover all rules that have:
  –support greater than minsup, specified by the user
  –confidence greater than minconf, specified by the user
Example transaction data:
–CD player, music CD, music book
–CD player, music CD
–music CD, music book
–CD player

Association Rules: Definitions
Let I = {i1, i2, …, im} be the total set of items and D a set of transactions, where each transaction d consists of a set of items, d ⊆ I.
An association rule has the form X ⇒ Y, where X ⊆ I, Y ⊆ I, and X ∩ Y = ∅.
–support = (# of transactions containing X ∪ Y) / |D|
–confidence = (# of transactions containing X ∪ Y) / (# of transactions containing X)

Association Rules: Example
Transaction data:
–CD player, music CD, music book
–CD player, music CD
–music CD, music book
–CD player
I = {CD player, music CD, music book}, |D| = 4
# of transactions containing both CD player and music CD = 2
# of transactions containing CD player = 3
CD player ⇒ music CD (sup = 2/4, conf = 2/3)
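These numbers can be verified mechanically (a small Python sketch; `support_count` is an illustrative helper name, not from the slides):

```python
from fractions import Fraction

transactions = [
    {"CD player", "music CD", "music book"},
    {"CD player", "music CD"},
    {"music CD", "music book"},
    {"CD player"},
]

def support_count(itemset):
    """Number of transactions containing every item in the itemset."""
    return sum(itemset <= t for t in transactions)

X, Y = {"CD player"}, {"music CD"}
support = Fraction(support_count(X | Y), len(transactions))
confidence = Fraction(support_count(X | Y), support_count(X))
print(support, confidence)  # 1/2 2/3
```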

Association Rules
How are association rules mined from large databases? A two-step process:
–find all frequent itemsets
–generate strong association rules from the frequent itemsets
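Step one can be sketched as a levelwise search in the spirit of Apriori — grow candidate k-itemsets from the frequent (k−1)-itemsets, then prune by support. A simplified Python illustration (my own variable names), not the full algorithm:

```python
from itertools import combinations

def frequent_itemsets(transactions, minsup):
    """Levelwise search: every subset of a frequent itemset is itself
    frequent, so candidates are grown only from the previous level."""
    def support(itemset):
        return sum(set(itemset) <= t for t in transactions)

    items = sorted({i for t in transactions for i in t})
    frequent = {}
    level = [(i,) for i in items if support((i,)) >= minsup]
    while level:
        frequent.update((s, support(s)) for s in level)
        prev = set(level)
        # Join step: merge two k-itemsets sharing their first k-1 items.
        candidates = {a + (b[-1],) for a in level for b in level
                      if a[:-1] == b[:-1] and a[-1] < b[-1]}
        # Prune step: keep candidates whose every k-subset is frequent,
        # then check actual support against the transactions.
        level = sorted(c for c in candidates
                       if all(s in prev for s in combinations(c, len(c) - 1))
                       and support(c) >= minsup)
    return frequent

# The four example transactions from the previous slide, minsup = 2.
baskets = [{"CD player", "music CD", "music book"},
           {"CD player", "music CD"},
           {"music CD", "music book"},
           {"CD player"}]
result = frequent_itemsets(baskets, minsup=2)
print(sorted(result.items()))
```

Step two then enumerates, for each frequent itemset, the rules X ⇒ Y whose confidence support(X ∪ Y)/support(X) meets minconf.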

Association Rules
antecedent ⇒ consequent
–if ⇒ then
–beer ⇒ diaper (Walmart)
–economy bad ⇒ higher unemployment
–higher unemployment ⇒ higher unemployment-benefits cost
Rules are associated with a population, a support, and a confidence.

Association Rules
Population: instances, such as grocery-store purchases
–Support: % of the population satisfying both antecedent and consequent
–Confidence: % of instances satisfying the antecedent in which the consequent is also true

2. Association Rules: Support
Every association rule has a support and a confidence.
“The support is the percentage of transactions that demonstrate the rule.”
Example database of transactions (customer #: items):
1: 1, 3, 5.
2: 1, 8, 14, 17, 12.
3: 4, 6, 8, 12, 9, …
4: 2, 1, 8.
support {8, 12} = 2 (or 50%: 2 of 4 customers)
support {1, 5} = 1 (or 25%: 1 of 4 customers)
support {1} = 3 (or 75%: 3 of 4 customers)

2. Association Rules: Support
An itemset is called frequent if its support is equal to or greater than an agreed-upon minimal value – the support threshold.
Adding to the previous example: with a threshold of 50%, the itemsets {8, 12} and {1} are called frequent.

2. Association Rules: Confidence
Every association rule has a support and a confidence.
An association rule is of the form X ⇒ Y. X ⇒ Y: if someone buys X, he also buys Y.
The confidence is the conditional probability that, given X present in a transaction, Y will also be present.
Confidence measure, by definition: confidence(X ⇒ Y) = support(X, Y) / support(X)

2. Association Rules: Confidence
We should only consider rules derived from itemsets with high support that also have high confidence.
“A rule with low confidence is not meaningful.”
Rules don’t explain anything; they just point out hard facts in data volumes.

3. Example
Database of transactions (customer #: items):
1: 3, 5, 8.
2: 2, 6, 8.
3: 1, 4, 7, 10.
4: 3, 8, 10.
5: 2, 5, 8.
6: 1, 5, 6.
7: 4, 5, 6, 8.
8: 2, 3, 4.
9: 1, 5, 7, 8.
10: 3, 8, 9, 10.
conf({5} ⇒ {8})?
supp({5}) = 5, supp({8}) = 7, supp({5, 8}) = 4, so conf({5} ⇒ {8}) = 4/5 = 0.8, or 80%.

3. Example (same database as above)
conf({5} ⇒ {8})? 80%. Done.
conf({8} ⇒ {5})?
supp({5}) = 5, supp({8}) = 7, supp({5, 8}) = 4, so conf({8} ⇒ {5}) = 4/7 ≈ 0.57, or 57%.

3. Example (same database as above)
conf({5} ⇒ {8})? 80%. Done.
conf({8} ⇒ {5})? 57%. Done.
The rule {5} ⇒ {8} is more meaningful than the rule {8} ⇒ {5}.

3. Example (same database as above)
conf({9} ⇒ {3})?
supp({9}) = 1, supp({3}) = 4, supp({3, 9}) = 1, so conf({9} ⇒ {3}) = 1/1 = 1.0, or 100%. OK?

3. Example
conf({9} ⇒ {3}) = 100%. Done.
Notice: high confidence, low support → the rule {9} ⇒ {3} is not meaningful.
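All three confidences worked out in this example section can be reproduced directly (a Python sketch over the ten transactions above; `supp` and `conf` are my own helper names):

```python
# Customer # -> set of items purchased, from the example database.
db = {1: {3, 5, 8}, 2: {2, 6, 8}, 3: {1, 4, 7, 10}, 4: {3, 8, 10},
      5: {2, 5, 8}, 6: {1, 5, 6}, 7: {4, 5, 6, 8}, 8: {2, 3, 4},
      9: {1, 5, 7, 8}, 10: {3, 8, 9, 10}}

def supp(itemset):
    """Number of transactions containing the whole itemset."""
    return sum(itemset <= t for t in db.values())

def conf(x, y):
    """conf(X => Y) = supp(X union Y) / supp(X)."""
    return supp(x | y) / supp(x)

print(conf({5}, {8}))            # 0.8
print(round(conf({8}, {5}), 2))  # 0.57
print(conf({9}, {3}))            # 1.0, but supp({3, 9}) is only 1
```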

Association Rules
Population: MS, MSA, MSB, MA, MB, BA
–M = Milk, S = Soda, A = Apple, B = Beer
Support(M ⇒ S) = 3/6
–(MS, MSA, MSB) out of (MS, MSA, MSB, MA, MB, BA)
Confidence(M ⇒ S) = 3/5
–(MS, MSA, MSB) out of (MS, MSA, MSB, MA, MB)
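This last slide's numbers check the same way (Python sketch; the single-letter items follow the key above):

```python
from fractions import Fraction

# The six baskets: MS, MSA, MSB, MA, MB, BA.
baskets = [{"M", "S"}, {"M", "S", "A"}, {"M", "S", "B"},
           {"M", "A"}, {"M", "B"}, {"B", "A"}]

both = sum({"M", "S"} <= b for b in baskets)  # baskets with Milk and Soda
has_m = sum("M" in b for b in baskets)        # baskets with Milk

print(Fraction(both, len(baskets)))  # 1/2  (support = 3/6)
print(Fraction(both, has_m))         # 3/5  (confidence)
```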