SEEM4630 2015-2016 Tutorial 1 Classification: Decision Tree (Siyuan Zhang)



Classification: Definition Given a collection of records (the training set), each record contains a set of attributes, one of which is the class. The task is to find a model that predicts the class attribute as a function of the values of the other attributes; common models include the decision tree, Naïve Bayes, and k-NN. Goal: previously unseen records should be assigned a class as accurately as possible.

Decision Tree Goal: construct a tree so that instances belonging to different classes are separated. Basic algorithm (a greedy algorithm): the tree is constructed in a top-down recursive manner; at the start, all the training examples are at the root; test attributes are selected on the basis of a heuristic or statistical measure (e.g., information gain); examples are partitioned recursively based on the selected attributes.
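A minimal Python sketch of this greedy, top-down recursion may help; it assumes each training example is a dictionary mapping attribute names to values, and `select_attribute` stands for any scoring heuristic (for instance the information-gain function sketched after the next slide). It illustrates the recursion only, not the exact ID3/C4.5 implementation.

```python
from collections import Counter

def build_tree(records, attributes, target, select_attribute):
    """Greedy top-down induction: pick the best-scoring attribute, partition
    the examples by its values, and recurse on each partition."""
    labels = [r[target] for r in records]
    # Stop when the node is pure or no attributes remain: return a leaf
    # labelled with the majority class.
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Greedy step: the attribute that scores best on the current examples.
    best = max(attributes, key=lambda a: select_attribute(records, a, target))
    remaining = [a for a in attributes if a != best]
    tree = {best: {}}
    # Partition the examples by the chosen attribute and recurse.
    for value in set(r[best] for r in records):
        subset = [r for r in records if r[best] == value]
        tree[best][value] = build_tree(subset, remaining, target, select_attribute)
    return tree

# Example call (with the dataset and info_gain defined later in this tutorial):
#   build_tree(DATA, ["Outlook", "Temperature", "Humidity", "Wind"], "Play", info_gain)
```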

Attribute Selection Measure 1: Information Gain
Let p_i be the probability that a tuple in D belongs to class C_i, estimated by |C_i,D| / |D|.
Expected information (entropy) needed to classify a tuple in D:
  Info(D) = -Σ_{i=1..m} p_i log2(p_i)
Information needed (after using A to split D into v partitions) to classify D:
  Info_A(D) = Σ_{j=1..v} (|D_j| / |D|) × Info(D_j)
Information gained by branching on attribute A:
  Gain(A) = Info(D) - Info_A(D)
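A small Python sketch of these two quantities, under the same record-as-dictionary assumption as the earlier block (`info` computes Info(D), `info_gain` computes Gain(A)); the function names are this tutorial's own shorthand, not a library API.

```python
import math
from collections import Counter

def info(records, target):
    """Info(D) = -sum_i p_i * log2(p_i), the entropy of the class labels."""
    counts = Counter(r[target] for r in records)
    total = len(records)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def info_gain(records, attribute, target):
    """Gain(A) = Info(D) - Info_A(D), where Info_A(D) is the size-weighted
    entropy of the partitions induced by `attribute`."""
    total = len(records)
    info_a = 0.0
    for value in set(r[attribute] for r in records):
        subset = [r for r in records if r[attribute] == value]
        info_a += len(subset) / total * info(subset, target)
    return info(records, target) - info_a
```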

Attribute Selection Measure 2: Gain Ratio
The information gain measure is biased towards attributes with a large number of values. C4.5 (a successor of ID3) uses the gain ratio to overcome the problem (a normalization of information gain):
  SplitInfo_A(D) = -Σ_{j=1..v} (|D_j| / |D|) × log2(|D_j| / |D|)
  GainRatio(A) = Gain(A) / SplitInfo_A(D)
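A corresponding sketch that reuses `info_gain` from the previous block; guarding against a zero SplitInfo (an attribute with a single value) is a common convention rather than something stated on the slide.

```python
import math
from collections import Counter

def split_info(records, attribute):
    """SplitInfo_A(D) = -sum_j (|D_j|/|D|) * log2(|D_j|/|D|)."""
    counts = Counter(r[attribute] for r in records)
    total = len(records)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def gain_ratio(records, attribute, target):
    """GainRatio(A) = Gain(A) / SplitInfo_A(D); reuses info_gain from above."""
    si = split_info(records, attribute)
    return info_gain(records, attribute, target) / si if si > 0 else 0.0
```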

Attribute Selection Measure 3: Gini Index
If a data set D contains examples from n classes, the gini index gini(D) is defined as
  gini(D) = 1 - Σ_{j=1..n} p_j^2
where p_j is the relative frequency of class j in D. If D is split on A into two subsets D_1 and D_2, the gini index of the split, gini_A(D), is defined as
  gini_A(D) = (|D_1| / |D|) gini(D_1) + (|D_2| / |D|) gini(D_2)
Reduction in impurity:
  Δgini(A) = gini(D) - gini_A(D)
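A sketch of the gini computations for a two-way split, again on dictionary records; the caller supplies the two subsets D_1 and D_2, and `gini_reduction` returns Δgini(A) for that split.

```python
from collections import Counter

def gini(records, target):
    """gini(D) = 1 - sum_j p_j^2, with p_j the relative class frequencies."""
    counts = Counter(r[target] for r in records)
    total = len(records)
    return 1.0 - sum((c / total) ** 2 for c in counts.values())

def gini_reduction(records, subset1, subset2, target):
    """Delta gini(A) = gini(D) - gini_A(D) for the binary split D_1, D_2."""
    total = len(records)
    gini_a = (len(subset1) / total) * gini(subset1, target) \
           + (len(subset2) / total) * gini(subset2, target)
    return gini(records, target) - gini_a
```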

Example
Outlook   Temperature  Humidity  Wind    Play Tennis
Sunny     >25          High      Weak    No
Sunny     >25          High      Strong  No
Overcast  >25          High      Weak    Yes
Rain      15-25        High      Weak    Yes
Rain      <15          Normal    Weak    Yes
Rain      <15          Normal    Strong  No
Overcast  <15          Normal    Strong  Yes
Sunny     15-25        High      Weak    No
Sunny     <15          Normal    Weak    Yes
Rain      15-25        Normal    Weak    Yes
Sunny     15-25        Normal    Strong  Yes
Overcast  15-25        High      Strong  Yes
Overcast  >25          Normal    Weak    Yes
Rain      15-25        High      Strong  No
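The same 14 records as a Python list of dictionaries, so the functions sketched above can be run against the example; the key `Play` is shorthand for the Play Tennis column.

```python
DATA = [
    {"Outlook": "Sunny",    "Temperature": ">25",   "Humidity": "High",   "Wind": "Weak",   "Play": "No"},
    {"Outlook": "Sunny",    "Temperature": ">25",   "Humidity": "High",   "Wind": "Strong", "Play": "No"},
    {"Outlook": "Overcast", "Temperature": ">25",   "Humidity": "High",   "Wind": "Weak",   "Play": "Yes"},
    {"Outlook": "Rain",     "Temperature": "15-25", "Humidity": "High",   "Wind": "Weak",   "Play": "Yes"},
    {"Outlook": "Rain",     "Temperature": "<15",   "Humidity": "Normal", "Wind": "Weak",   "Play": "Yes"},
    {"Outlook": "Rain",     "Temperature": "<15",   "Humidity": "Normal", "Wind": "Strong", "Play": "No"},
    {"Outlook": "Overcast", "Temperature": "<15",   "Humidity": "Normal", "Wind": "Strong", "Play": "Yes"},
    {"Outlook": "Sunny",    "Temperature": "15-25", "Humidity": "High",   "Wind": "Weak",   "Play": "No"},
    {"Outlook": "Sunny",    "Temperature": "<15",   "Humidity": "Normal", "Wind": "Weak",   "Play": "Yes"},
    {"Outlook": "Rain",     "Temperature": "15-25", "Humidity": "Normal", "Wind": "Weak",   "Play": "Yes"},
    {"Outlook": "Sunny",    "Temperature": "15-25", "Humidity": "Normal", "Wind": "Strong", "Play": "Yes"},
    {"Outlook": "Overcast", "Temperature": "15-25", "Humidity": "High",   "Wind": "Strong", "Play": "Yes"},
    {"Outlook": "Overcast", "Temperature": ">25",   "Humidity": "Normal", "Wind": "Weak",   "Play": "Yes"},
    {"Outlook": "Rain",     "Temperature": "15-25", "Humidity": "High",   "Wind": "Strong", "Play": "No"},
]
```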

Tree induction example: entropy of data S and split by attribute Outlook
S [9+, 5-]:
  Info(S) = -9/14 log2(9/14) - 5/14 log2(5/14) = 0.94
Split data by attribute Outlook: Sunny [2+, 3-], Overcast [4+, 0-], Rain [3+, 2-]
  Gain(Outlook) = 0.94 - 5/14[-2/5 log2(2/5) - 3/5 log2(3/5)]
                       - 4/14[-4/4 log2(4/4) - 0/4 log2(0/4)]
                       - 5/14[-3/5 log2(3/5) - 2/5 log2(2/5)]
                = 0.94 - 0.69 = 0.25
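A quick check of these two numbers, assuming the `DATA`, `info`, and `info_gain` sketches above:

```python
print(round(info(DATA, "Play"), 2))                  # 0.94  (Info(S))
print(round(info_gain(DATA, "Outlook", "Play"), 2))  # 0.25  (Gain(Outlook))
```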

Tree induction example: split by attribute Temperature
Split data by attribute Temperature: <15 [3+, 1-], 15-25 [4+, 2-], >25 [2+, 2-]
  Gain(Temperature) = 0.94 - 4/14[-3/4 log2(3/4) - 1/4 log2(1/4)]
                           - 6/14[-4/6 log2(4/6) - 2/6 log2(2/6)]
                           - 4/14[-2/4 log2(2/4) - 2/4 log2(2/4)]
                    = 0.94 - 0.91 = 0.03

Tree induction example: split by attributes Humidity and Wind
Split data by attribute Humidity: High [3+, 4-], Normal [6+, 1-]
  Gain(Humidity) = 0.94 - 7/14[-3/7 log2(3/7) - 4/7 log2(4/7)]
                        - 7/14[-6/7 log2(6/7) - 1/7 log2(1/7)]
                 = 0.94 - 0.79 = 0.15
Split data by attribute Wind: Weak [6+, 2-], Strong [3+, 3-]
  Gain(Wind) = 0.94 - 8/14[-6/8 log2(6/8) - 2/8 log2(2/8)]
                    - 6/14[-3/6 log2(3/6) - 3/6 log2(3/6)]
             = 0.94 - 0.89 = 0.05
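Repeating the computation for all four candidate attributes (again assuming `DATA` and `info_gain` from the earlier sketches) reproduces the gains compared on the next slide:

```python
# Gains for every candidate attribute at the root:
# Outlook 0.25, Temperature 0.03, Humidity 0.15, Wind 0.05
for attr in ("Outlook", "Temperature", "Humidity", "Wind"):
    print(attr, round(info_gain(DATA, attr, "Play"), 2))
```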

Tree induction example: choosing the root
Gain(Outlook) = 0.25, Gain(Temperature) = 0.03, Gain(Humidity) = 0.15, Gain(Wind) = 0.05.
Outlook gives the largest information gain, so it becomes the root of the tree (the full training table is repeated on the slide). The Overcast branch is pure (all Yes) and becomes a leaf; the Sunny and Rain branches still contain both classes and must be split further.
  Outlook
    Overcast: Yes
    Sunny: ??
    Rain: ??

Tree induction example: the Sunny branch
Entropy of branch Sunny [2+, 3-]:
  Info(Sunny) = -2/5 log2(2/5) - 3/5 log2(3/5) = 0.97
Split Sunny branch by attribute Temperature: <15 [1+, 0-], 15-25 [1+, 1-], >25 [0+, 2-]
  Gain(Temperature) = 0.97 - 1/5[-1/1 log2(1/1) - 0/1 log2(0/1)]
                           - 2/5[-1/2 log2(1/2) - 1/2 log2(1/2)]
                           - 2/5[-0/2 log2(0/2) - 2/2 log2(2/2)]
                    = 0.97 - 0.4 = 0.57
Split Sunny branch by attribute Humidity: High [0+, 3-], Normal [2+, 0-]
  Gain(Humidity) = 0.97 - 3/5[-0/3 log2(0/3) - 3/3 log2(3/3)]
                        - 2/5[-2/2 log2(2/2) - 0/2 log2(0/2)]
                 = 0.97 - 0 = 0.97
Split Sunny branch by attribute Wind: Weak [1+, 2-], Strong [1+, 1-]
  Gain(Wind) = 0.97 - 3/5[-1/3 log2(1/3) - 2/3 log2(2/3)]
                    - 2/5[-1/2 log2(1/2) - 1/2 log2(1/2)]
             = 0.97 - 0.95 = 0.02
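The same gain computation applied to just the Sunny subset reproduces these values (assuming `DATA` and `info_gain` from the earlier sketches):

```python
sunny = [r for r in DATA if r["Outlook"] == "Sunny"]
for attr in ("Temperature", "Humidity", "Wind"):
    print(attr, round(info_gain(sunny, attr, "Play"), 2))
# Temperature 0.57, Humidity 0.97, Wind 0.02
```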

Tree induction example: tree so far
Humidity gives the largest gain on the Sunny branch, so it is chosen there, and both of its children are pure:
  Outlook
    Overcast: Yes
    Sunny: Humidity
      High: No
      Normal: Yes
    Rain: ??

Tree induction example: the Rain branch
Entropy of branch Rain [3+, 2-]:
  Info(Rain) = -3/5 log2(3/5) - 2/5 log2(2/5) = 0.97
Split Rain branch by attribute Temperature: <15 [1+, 1-], 15-25 [2+, 1-], >25 [0+, 0-] (the empty >25 partition contributes nothing)
  Gain(Temperature) = 0.97 - 2/5[-1/2 log2(1/2) - 1/2 log2(1/2)]
                           - 3/5[-2/3 log2(2/3) - 1/3 log2(1/3)]
                    = 0.97 - 0.95 = 0.02
Split Rain branch by attribute Humidity: High [1+, 1-], Normal [2+, 1-]
  Gain(Humidity) = 0.97 - 2/5[-1/2 log2(1/2) - 1/2 log2(1/2)]
                        - 3/5[-2/3 log2(2/3) - 1/3 log2(1/3)]
                 = 0.97 - 0.95 = 0.02
Split Rain branch by attribute Wind: Weak [3+, 0-], Strong [0+, 2-]
  Gain(Wind) = 0.97 - 3/5[-3/3 log2(3/3) - 0/3 log2(0/3)]
                    - 2/5[-0/2 log2(0/2) - 2/2 log2(2/2)]
             = 0.97 - 0 = 0.97
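And likewise for the Rain subset (same assumptions as the previous check), confirming that Wind is the clear winner on this branch:

```python
rain = [r for r in DATA if r["Outlook"] == "Rain"]
for attr in ("Temperature", "Humidity", "Wind"):
    print(attr, round(info_gain(rain, attr, "Play"), 2))
# Temperature 0.02, Humidity 0.02, Wind 0.97
```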

Tree induction example: final tree
Wind gives the largest gain on the Rain branch, so it is chosen there and the tree is complete:
  Outlook
    Overcast: Yes
    Sunny: Humidity
      High: No
      Normal: Yes
    Rain: Wind
      Weak: Yes
      Strong: No
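One possible rendering of the finished tree as an explicit rule function (attribute names follow the `DATA` sketch; this is an illustration, not code from the tutorial):

```python
def predict(record):
    """Classify a record with the decision tree induced above."""
    if record["Outlook"] == "Overcast":
        return "Yes"
    if record["Outlook"] == "Sunny":
        return "Yes" if record["Humidity"] == "Normal" else "No"
    # Outlook == "Rain"
    return "Yes" if record["Wind"] == "Weak" else "No"

# The rules reproduce every label in the training table:
# all(predict(r) == r["Play"] for r in DATA) evaluates to True.
```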

Thank you & Questions?