Decision Trees Prof. Carolina Ruiz Dept. of Computer Science WPI

Constructing a decision tree

Which attribute should be the root node? That is, which attribute should we check first when making a prediction? Pick the attribute that brings us closest to a decision: the attribute that splits the data most homogeneously.

Which attribute splits the data more homogeneously?

credit history: bad [0,1,3], unknown [2,1,2], good [3,1,1]
debt: low [3,2,2], high [2,1,4]
collateral: none [3,2,6], adequate [2,1,0]
income: 0–15 [0,0,4], 15–35 [0,2,2], >35 [5,1,0]

Each triple gives the counts [low, moderate, high] of the target attribute (risk) among the instances in that branch. Goal: assign a number to each attribute that represents how well it "splits" the dataset according to the target attribute.

For example, for credit history (bad [0,1,3], unknown [2,1,2], good [3,1,1]): what function f should we use, so that f([0,1,3],[2,1,2],[3,1,1]) = a single number? Possible f functions:

Gini index: a measure of impurity
Entropy: from information theory
Misclassification error: the metric used by OneR
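To make the three candidates concrete, here is a minimal Python sketch (not from the original slides; all function and variable names are mine) that applies each impurity function per branch and then averages, weighted by branch size:

```python
import math

def entropy(counts):
    """Entropy (in bits) of a class-count vector such as [0, 1, 3]."""
    m = sum(counts)
    return -sum((c / m) * math.log2(c / m) for c in counts if c > 0)

def gini(counts):
    """Gini impurity of a class-count vector."""
    m = sum(counts)
    return 1 - sum((c / m) ** 2 for c in counts)

def misclassification_error(counts):
    """Fraction of instances not in the majority class."""
    m = sum(counts)
    return 1 - max(counts) / m

def weighted(f, partitions):
    """Score a split: average f over its partitions, weighted by size."""
    n = sum(sum(p) for p in partitions)
    return sum(sum(p) / n * f(p) for p in partitions if sum(p) > 0)

# The credit-history split from the slide:
split = [[0, 1, 3], [2, 1, 2], [3, 1, 1]]
print(weighted(entropy, split))                  # ≈ 1.265
print(weighted(gini, split))                     # ≈ 0.536
print(weighted(misclassification_error, split))  # ≈ 0.429
```

Any of the three yields a single number per attribute, so any of them can play the role of f; the slides continue with entropy.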

Using entropy as the f metric, for credit history (bad [0,1,3], unknown [2,1,2], good [3,1,1]):

f([0,1,3],[2,1,2],[3,1,1]) = Entropy([0,1,3],[2,1,2],[3,1,1])
= (4/14)·Entropy([0,1,3]) + (5/14)·Entropy([2,1,2]) + (5/14)·Entropy([3,1,1])
= (4/14)·[-(1/4)·log2(1/4) - (3/4)·log2(3/4)]
+ (5/14)·[-(2/5)·log2(2/5) - (1/5)·log2(1/5) - (2/5)·log2(2/5)]
+ (5/14)·[-(3/5)·log2(3/5) - (1/5)·log2(1/5) - (1/5)·log2(1/5)]
≈ 1.265

In general: Entropy([p, q, …, z]) = -(p/m)·log2(p/m) - (q/m)·log2(q/m) - … - (z/m)·log2(z/m), where m = p + q + … + z and any zero count contributes 0 to the sum.

Which attribute splits the data more homogeneously?

credit history: bad [0,1,3], unknown [2,1,2], good [3,1,1] (weighted entropy ≈ 1.265)
debt: low [3,2,2], high [2,1,4] (weighted entropy ≈ 1.468)
collateral: none [3,2,6], adequate [2,1,0] (weighted entropy ≈ 1.325)
income: 0–15 [0,0,4], 15–35 [0,2,2], >35 [5,1,0] (weighted entropy ≈ 0.564)

The attribute with the lowest entropy is chosen: income.
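Reusing the entropy and weighted helpers from the sketch above, the root-attribute choice can be reproduced directly (attribute names and counts as reconstructed from the slide):

```python
splits = {
    "credit history": [[0, 1, 3], [2, 1, 2], [3, 1, 1]],
    "debt":           [[3, 2, 2], [2, 1, 4]],
    "collateral":     [[3, 2, 6], [2, 1, 0]],
    "income":         [[0, 0, 4], [0, 2, 2], [5, 1, 0]],
}

# Weighted entropy of each candidate split; lower means more homogeneous.
scores = {name: weighted(entropy, parts) for name, parts in splits.items()}
print(min(scores, key=scores.get))  # 'income' (≈ 0.564, lowest of the four)
```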

Constructing a decision tree

Recall the rule: which attribute should be the root node? Pick the attribute that brings us closest to a decision, i.e., the one that splits the data most homogeneously. Here, that attribute is income.

Constructing a decision tree

The partial tree: income at the root, with one branch per income range. The 0–15 branch is already pure ([0,0,4]: every instance has risk = high), so it becomes a leaf with prediction: high. The other branches are still undecided (?) and must be split further.

Splitting the instances with income = 15–35 (class counts [0,2,2]): the candidate attributes for this branch and their counts are:

credit history: bad [0,0,1], unknown [0,1,1], good [0,1,0] (weighted entropy ≈ 0.5; bad is pure and predicts high, good is pure and predicts moderate)
debt: low [0,1,0], high [0,1,2] (weighted entropy ≈ 0.689)
collateral: none [0,2,2], adequate [0,0,0] (weighted entropy = 1.0)

Credit history is the attribute with the lowest entropy, so it is chosen for this branch.
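The same helpers verify this branch-level choice (counts as reconstructed from the slide):

```python
branch_splits = {
    "credit history": [[0, 0, 1], [0, 1, 1], [0, 1, 0]],
    "debt":           [[0, 1, 0], [0, 1, 2]],
    "collateral":     [[0, 2, 2], [0, 0, 0]],
}
scores = {name: weighted(entropy, parts) for name, parts in branch_splits.items()}
print(scores)  # credit history: 0.5, debt: ≈ 0.689, collateral: 1.0
print(min(scores, key=scores.get))  # 'credit history'
```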

Constructing a decision tree

The growing tree: income at the root; the 0–15 branch is a leaf with prediction: high; the 15–35 branch continues with credit history as its splitting attribute; the remaining branches are expanded the same way, recursively, until every branch ends in a (nearly) pure leaf.
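Putting the whole procedure together: below is a minimal ID3-style sketch of the recursive construction just described (my own illustration, not code from the slides), reusing the entropy helper from the first sketch and assuming categorical attributes stored as dicts:

```python
from collections import Counter

def build_tree(instances, attributes, target):
    """Recursively build a decision tree.
    instances: list of dicts mapping attribute name -> value;
    attributes: names of the candidate splitting attributes;
    target: name of the class attribute.
    Returns a class label (leaf) or a dict of branches."""
    labels = [inst[target] for inst in instances]
    # Base case 1: all instances agree on the class -> pure leaf.
    if len(set(labels)) == 1:
        return labels[0]
    # Base case 2: no attributes left -> majority-class leaf.
    if not attributes:
        return Counter(labels).most_common(1)[0][0]

    # Score each candidate by the weighted entropy of its split.
    def split_entropy(attr):
        total = len(instances)
        score = 0.0
        for value in set(inst[attr] for inst in instances):
            subset = [inst[target] for inst in instances if inst[attr] == value]
            score += len(subset) / total * entropy(list(Counter(subset).values()))
        return score

    best = min(attributes, key=split_entropy)
    # One subtree per observed value of the chosen attribute.
    remaining = [a for a in attributes if a != best]
    return {
        (best, value): build_tree(
            [inst for inst in instances if inst[best] == value],
            remaining, target)
        for value in set(inst[best] for inst in instances)
    }
```

On the 14-instance credit-risk table from these slides, a call such as build_tree(data, ["credit history", "debt", "collateral", "income"], "risk") would be expected to reproduce the tree being built here: income at the root, a high-risk leaf on the 0–15 branch, and credit history on the 15–35 branch.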