1er. Escuela Red ProTIC - Tandil, April 18-28, 2006

3. Decision Tree Learning


3.1 Introduction
–Method for approximating discrete-valued target functions (classification)
–One of the most widely used methods for inductive inference
–Capable of learning disjunctive hypotheses (searches a completely expressive hypothesis space)
–Learned trees can be represented as sets of if-then rules
–Inductive bias: a preference for small trees

3.2 Decision Tree Representation
–Each node tests some attribute of the instance
–Decision trees represent a disjunction of conjunctions of constraints on the attribute values
Example: (Outlook = Sunny ∧ Humidity = Normal) ∨ (Outlook = Overcast) ∨ (Outlook = Rain ∧ Wind = Weak)
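Because the tree is equivalent to this disjunction, it can be transcribed directly as if-then rules. A minimal Python sketch (the function name and the dict encoding of instances are illustrative choices, not from the slides):

```python
def play_tennis(instance):
    """PlayTennis decision tree written as nested if-then rules.

    Each root-to-'Yes' path is one conjunct of the disjunction:
    (Outlook=Sunny ∧ Humidity=Normal) ∨ (Outlook=Overcast)
    ∨ (Outlook=Rain ∧ Wind=Weak).
    """
    if instance["Outlook"] == "Sunny":
        return "Yes" if instance["Humidity"] == "Normal" else "No"
    if instance["Outlook"] == "Overcast":
        return "Yes"
    # remaining case: Outlook == "Rain"
    return "Yes" if instance["Wind"] == "Weak" else "No"

print(play_tennis({"Outlook": "Sunny", "Humidity": "High", "Wind": "Strong"}))  # No
```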

Example: PlayTennis

Decision Tree for PlayTennis

3.3 Appropriate Problems for Decision Tree Learning
–Instances are represented by attribute-value pairs
–The target function has discrete output values
–Disjunctive descriptions may be required
–The training data may contain errors
–The training data may contain missing attribute values

3.4 The Basic Decision Tree Learning Algorithm
–Top-down, greedy search through the space of possible decision trees (ID3 and C4.5)
–Root: the best attribute for classification
Which attribute is the best classifier? ⇒ answer based on information gain
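A compact sketch of this top-down greedy search, assuming examples are represented as dicts mapping attribute names to values; the entropy and information-gain helpers anticipate the definitions on the next slides, and all names here are illustrative:

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a collection of class labels (defined on a later slide)."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(examples, attr, target):
    """Expected reduction in entropy from splitting on attr."""
    gain = entropy([ex[target] for ex in examples])
    for value in {ex[attr] for ex in examples}:
        subset = [ex[target] for ex in examples if ex[attr] == value]
        gain -= len(subset) / len(examples) * entropy(subset)
    return gain

def id3(examples, attributes, target):
    """Top-down greedy induction: choose the best attribute, split, recurse."""
    labels = [ex[target] for ex in examples]
    if len(set(labels)) == 1:            # pure node: return the class label
        return labels[0]
    if not attributes:                   # attributes exhausted: majority vote
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes,
               key=lambda a: information_gain(examples, a, target))
    return {best: {value: id3([ex for ex in examples if ex[best] == value],
                              [a for a in attributes if a != best],
                              target)
                   for value in {ex[best] for ex in examples}}}
```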

Entropy

Entropy(S) ≡ −p₊ log₂ p₊ − p₋ log₂ p₋

p₊ (p₋): proportion of positive (negative) examples in S
–Entropy specifies the minimum number of bits of information needed to encode the classification of an arbitrary member of S
–In general, for c classes: Entropy(S) = −Σ_{i=1..c} p_i log₂ p_i
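A quick numeric sketch of the two-class case; the [9+, 5−] split assumed here is the standard PlayTennis count that also underlies the gains quoted on a later slide:

```python
import math

def entropy_binary(p_pos):
    """Entropy of a boolean-classified sample given the positive proportion."""
    if p_pos in (0.0, 1.0):   # convention: 0 * log2(0) = 0
        return 0.0
    p_neg = 1.0 - p_pos
    return -p_pos * math.log2(p_pos) - p_neg * math.log2(p_neg)

print(entropy_binary(9 / 14))  # ~0.940 bits for a [9+, 5-] sample
```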


Information Gain
–Measures the expected reduction in entropy given the value of some attribute A

Gain(S, A) ≡ Entropy(S) − Σ_{v ∈ Values(A)} (|S_v| / |S|) Entropy(S_v)

Values(A): set of all possible values for attribute A
S_v: subset of S for which attribute A has value v
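A direct transcription of the formula, under the same dict-based example encoding assumed in the earlier ID3 sketch:

```python
import math
from collections import Counter

def entropy(labels):
    return -sum((n / len(labels)) * math.log2(n / len(labels))
                for n in Counter(labels).values())

def gain(examples, attr, target):
    """Gain(S, A) = Entropy(S) - sum over v of |S_v|/|S| * Entropy(S_v)."""
    g = entropy([ex[target] for ex in examples])
    for v in {ex[attr] for ex in examples}:             # v in Values(A)
        s_v = [ex[target] for ex in examples if ex[attr] == v]
        g -= len(s_v) / len(examples) * entropy(s_v)    # weighted subset entropy
    return g
```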

Example
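The transcript does not preserve the slide's worked example; the arithmetic below reconstructs the textbook's Wind computation (the [6+, 2−] and [3+, 3−] splits for Weak and Strong are Mitchell's figures, and the result matches Gain(S, Wind) = 0.048 quoted on the next slide):

```python
import math

def H(pos, neg):
    """Entropy of a sample with pos positive and neg negative examples."""
    h = 0.0
    for n in (pos, neg):
        if n:
            p = n / (pos + neg)
            h -= p * math.log2(p)
    return h

# S = [9+,5-]; Wind=Weak -> [6+,2-]; Wind=Strong -> [3+,3-]
gain_wind = H(9, 5) - (8 / 14) * H(6, 2) - (6 / 14) * H(3, 3)
print(round(gain_wind, 3))  # -> 0.048
```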

Selecting the Next Attribute

PlayTennis Problem
–Gain(S, Outlook) = 0.246
–Gain(S, Humidity) = 0.151
–Gain(S, Wind) = 0.048
–Gain(S, Temperature) = 0.029
⇒ Outlook is the attribute of the root node
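These four gains can be reproduced from the 14-example training set. The table below is Mitchell's Table 3.2, reproduced here as an assumption since the transcript omits it; note that exact arithmetic yields 0.247 and 0.152 for Outlook and Humidity, which the slides round to 0.246 and 0.151:

```python
import math
from collections import Counter

# PlayTennis training set (Mitchell, Table 3.2):
# (Outlook, Temperature, Humidity, Wind, PlayTennis)
DATA = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]

def entropy(labels):
    return -sum((n / len(labels)) * math.log2(n / len(labels))
                for n in Counter(labels).values())

def gain(rows, col):
    g = entropy([r[-1] for r in rows])
    for v in {r[col] for r in rows}:
        sub = [r[-1] for r in rows if r[col] == v]
        g -= len(sub) / len(rows) * entropy(sub)
    return g

for name, col in [("Outlook", 0), ("Humidity", 2), ("Wind", 3), ("Temperature", 1)]:
    print(f"Gain(S, {name}) = {gain(DATA, col):.3f}")
# Outlook has the largest gain, so it becomes the root attribute.
```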


3.5 Hypothesis Space Search in Decision Tree Learning
–ID3's hypothesis space, the set of all decision trees, is a complete space of finite discrete-valued functions
–ID3 maintains only a single current hypothesis as it searches through the space of trees
–ID3 in its pure form performs no backtracking in its search
–ID3 uses all training examples at each step of the search (statistically based decisions)

Hypothesis Space Search

3.6 Inductive Bias in Decision Tree Learning

Approximate inductive bias of ID3: shorter trees are preferred over larger trees, and trees that place high-information-gain attributes close to the root are preferred.
–ID3 incompletely searches a complete hypothesis space (preference bias)
–Candidate-Elimination completely searches an incomplete hypothesis space (language bias)

Why Prefer Short Hypotheses?

Occam's Razor: "Prefer the simplest hypothesis that fits the data."
"Entities must not be multiplied beyond necessity." (William of Ockham, 14th century)

3.7 Issues in Decision Tree Learning

Avoiding Overfitting the Data
–Stop growing the tree earlier, or
–Post-prune the tree
How?
–Use a separate set of validation examples
–Use statistical tests
–Minimize a measure of the complexity of the training examples plus the decision tree

Reduced-Error Pruning
–Nodes are pruned iteratively, always choosing the node whose removal most increases the decision tree's accuracy over the validation set (a sketch follows below)

Rule Post-Pruning
Example: IF (Outlook = Sunny) ∧ (Humidity = High) THEN PlayTennis = No
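A sketch of reduced-error pruning over the nested-dict trees produced by the earlier ID3 sketch; the path-based node addressing and the majority-label replacement of pruned subtrees are implementation assumptions:

```python
import copy
from collections import Counter

def classify(tree, instance):
    """Walk a nested-dict tree {attr: {value: subtree}} down to a leaf."""
    while isinstance(tree, dict):
        attr = next(iter(tree))
        tree = tree[attr].get(instance[attr])
    return tree

def accuracy(tree, examples, target):
    return sum(classify(tree, ex) == ex[target] for ex in examples) / len(examples)

def internal_nodes(tree, path=()):
    """Yield the (attribute, value) path to every internal node."""
    if isinstance(tree, dict):
        attr = next(iter(tree))
        yield path
        for value, child in tree[attr].items():
            yield from internal_nodes(child, path + ((attr, value),))

def prune_at(tree, path, label):
    """Return a copy of tree with the node at path replaced by a leaf label."""
    if not path:
        return label                      # pruning the root collapses the tree
    new = copy.deepcopy(tree)
    node = new
    for attr, value in path[:-1]:
        node = node[attr][value]
    attr, value = path[-1]
    node[attr][value] = label
    return new

def reduced_error_prune(tree, train, validation, target):
    """Repeatedly replace the node whose removal most improves validation
    accuracy with a majority-label leaf; stop when every prune would hurt."""
    current = accuracy(tree, validation, target)
    while isinstance(tree, dict):
        candidates = []
        for path in internal_nodes(tree):
            # training examples that reach this node determine the leaf label
            reaching = [ex for ex in train if all(ex[a] == v for a, v in path)]
            if not reaching:
                continue
            label = Counter(ex[target] for ex in reaching).most_common(1)[0][0]
            pruned = prune_at(tree, path, label)
            candidates.append((accuracy(pruned, validation, target), pruned))
        best_acc, best_tree = max(candidates, key=lambda c: c[0])
        if best_acc < current:
            break
        current, tree = best_acc, best_tree
    return tree
```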

Advanced Material
–Incorporating continuous-valued attributes (see the sketch below)
–Alternative measures for selecting attributes
–Handling missing attribute values
–Handling attributes with different costs
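For the first of these extensions, the usual C4.5-style approach sorts the examples by the continuous attribute and evaluates candidate thresholds at midpoints where the class label changes. A sketch; the six-value Temperature sequence is Mitchell's §3.7 illustration, not something shown in the transcript:

```python
import math
from collections import Counter

def entropy(labels):
    return -sum((n / len(labels)) * math.log2(n / len(labels))
                for n in Counter(labels).values())

def best_threshold(examples, attr, target):
    """Best boolean split 'attr <= t', trying midpoints between adjacent
    sorted values whose class labels differ."""
    rows = sorted(examples, key=lambda ex: ex[attr])
    base = entropy([ex[target] for ex in rows])
    best_gain, best_t = 0.0, None
    for a, b in zip(rows, rows[1:]):
        if a[target] == b[target] or a[attr] == b[attr]:
            continue                      # only label boundaries are candidates
        t = (a[attr] + b[attr]) / 2
        left = [ex[target] for ex in rows if ex[attr] <= t]
        right = [ex[target] for ex in rows if ex[attr] > t]
        g = (base - len(left) / len(rows) * entropy(left)
                  - len(right) / len(rows) * entropy(right))
        if g > best_gain:
            best_gain, best_t = g, t
    return best_gain, best_t

temps = [(40, "No"), (48, "No"), (60, "Yes"), (72, "Yes"), (80, "Yes"), (90, "No")]
data = [{"Temperature": v, "PlayTennis": l} for v, l in temps]
print(best_threshold(data, "Temperature", "PlayTennis"))  # threshold 54 beats 85
```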