Machine Learning Group, University College Dublin
Decision Trees: What is a Decision Tree? How to build a good one…

D Trees 2 Classifying Apples & Pears

D Trees 3 A Decision Tree
[Figure: a decision tree that first tests Width (>55 vs <55), then Height (>59 vs <59), with Apple and Pear at the leaves]

D Trees 4 A Decision Tree
[Figure: the Width/Height tree from slide 3 alongside a simpler one-node tree that tests the ratio Height/Width (<1.2 vs >1.2) to separate Apple from Pear]

D Trees 5 Decision Trees
Each internal node tests an attribute
Each branch corresponds to an attribute value
Each leaf node assigns a classification
Cannot readily represent: ∧, ∨, XOR, (A ∧ B) ∨ (C ∧ ¬D ∧ E), M of N
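Concretely, a decision tree is just a nested sequence of attribute tests. A minimal sketch of the apple/pear ratio tree from slide 4 (the branch orientation, pears being tall and narrow, is our assumption, and the function name is ours):

```python
def classify(width, height):
    # The root node tests the Height/Width attribute; the two branches are
    # the outcomes <1.2 and >1.2; each leaf node assigns a classification.
    if height / width > 1.2:
        return "Pear"   # tall, narrow fruit (assumed orientation)
    return "Apple"      # squat, round fruit
```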

D Trees 6 When to consider D-Trees
Instances described by attribute-value pairs
Target function is discrete-valued
Disjunctive hypothesis may be required
Possibly noisy training data
Classification can be done using a few features
Examples: equipment or medical diagnosis, credit risk analysis

D Trees 7 D-Tree Example
Alternative: whether there is a suitable alternative restaurant nearby.
Bar: is there a comfortable bar area?
Fri/Sat: true on Friday or Saturday nights.
Hungry: how hungry is the subject?
Patrons: how many people are in the restaurant?
Price: price range.
Raining: is it raining outside?
Reservation: does the subject have a reservation?
Type: type of restaurant.
Stay?: the target, stay or go.
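Each training example is then an assignment of values to these attributes plus a class label. A hypothetical row for this problem might look like the following (the concrete values and the dict representation are illustrative, not from the slides):

```python
# One hypothetical training example for the restaurant problem.
example = {
    "Alternative": True,    # a suitable alternative restaurant nearby
    "Bar": False,           # no comfortable bar area
    "Fri/Sat": True,        # it is a Friday or Saturday night
    "Hungry": True,
    "Patrons": "Full",      # how many people are in the restaurant
    "Price": "$$$",
    "Raining": False,
    "Reservation": True,
    "Type": "French",
}
label = "Stay"              # the target attribute: Stay or Go
```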

D Trees 8 D-Tree Example

D Trees 9 D-Tree Example
A very good D-Tree:
Classifies all examples correctly
Has very few nodes
The objective in building a decision tree is to choose attributes so as to minimise the depth of the tree.

D Trees 10 Top-down induction of D-Trees
1. A ← the "best" decision attribute for the next node
2. Assign A as the decision attribute for the node
3. For each value of A, create a new descendant of the node
4. Sort the training examples to the leaf nodes
5. If the training examples are perfectly classified, then stop; else repeat recursively over the leaf nodes
Which attribute is best?
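A minimal runnable sketch of this recursion (the function names and the dict-per-example representation are our assumptions; "best" is taken to mean lowest post-split entropy, i.e. highest information gain, anticipating the slides that follow):

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    # Impurity of a list of class labels: -sum_i p_i * log2(p_i) (slide 12).
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_attribute(examples, labels, attributes):
    # Step 1: the "best" attribute leaves the lowest weighted entropy
    # after the split (equivalently, it has the highest information gain).
    def remainder(a):
        subsets = defaultdict(list)
        for ex, lab in zip(examples, labels):
            subsets[ex[a]].append(lab)
        return sum(len(ls) / len(labels) * entropy(ls) for ls in subsets.values())
    return min(attributes, key=remainder)

def id3(examples, labels, attributes):
    # Step 5: stop when the examples are perfectly classified
    # (or no attributes remain: fall back to the majority class).
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]
    a = best_attribute(examples, labels, attributes)      # steps 1-2
    remaining = [x for x in attributes if x != a]
    branches = {}
    for value in {ex[a] for ex in examples}:              # step 3: one branch per value
        pairs = [(ex, lab) for ex, lab in zip(examples, labels) if ex[a] == value]
        sub_ex, sub_lab = zip(*pairs)                     # step 4: sort examples down
        branches[value] = id3(list(sub_ex), list(sub_lab), remaining)
    return (a, branches)  # an internal node: tested attribute plus subtrees
```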

D Trees 11 Good and Bad Attributes
A perfect attribute divides examples into categories of one type (e.g. Patrons).
A poor attribute produces categories of mixed type (e.g. Type).
How can we measure this?

D Trees 12 Entropy
S is a sample of training examples.
p is the proportion of positive examples in S.
q is the proportion of negative examples in S.
Entropy measures the impurity of S:
Entropy(S) = -p·log2(p) - q·log2(q)
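As a quick sketch of the formula (the function name and test values are ours):

```python
import math

def binary_entropy(p):
    # Entropy of a Boolean sample: p positive, q = 1 - p negative.
    # A pure sample (p = 0 or 1) has entropy 0 by convention,
    # since 0 * log2(0) is taken to be 0.
    if p in (0.0, 1.0):
        return 0.0
    q = 1.0 - p
    return -p * math.log2(p) - q * math.log2(q)

print(binary_entropy(0.5))   # 1.0    -- a 50/50 split is maximally impure
print(binary_entropy(0.9))   # ~0.469 -- a lopsided sample is nearly pure
print(binary_entropy(1.0))   # 0.0    -- one class only: no impurity
```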

D Trees 13 Entropy
Entropy(S) = the expected number of bits needed to encode the class (positive or negative) of a randomly drawn member of S, under the optimal shortest-length code.
Why? Information theory: the optimal-length code assigns -log2(p) bits to a message that occurs with probability p.
So the expected number of bits to encode the classes of random members of S, whose classes occur in the ratio p:q, is
-p·log2(p) - q·log2(q)
i.e. Entropy(S) = -p·log2(p) - q·log2(q)

D Trees 14 Information Gain
Gain(S,A) = the expected reduction in entropy due to sorting on A:
Gain(S,A) = Entropy(S) - Σ_{v ∈ Values(A)} (|S_v| / |S|) · Entropy(S_v)
where S_v is the subset of S for which attribute A takes value v.
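A sketch of Gain(S,A) under the same assumed representation as before (examples as dicts, labels in a parallel list; the names are ours):

```python
import math
from collections import Counter, defaultdict

def sample_entropy(labels):
    # Entropy of a list of class labels; reduces to -p*log2(p) - q*log2(q)
    # (slide 12) when there are exactly two classes.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain(examples, labels, attribute):
    # Gain(S, A) = Entropy(S) - sum_v (|S_v| / |S|) * Entropy(S_v),
    # where S_v holds the examples with value v for attribute A.
    n = len(labels)
    subsets = defaultdict(list)
    for example, label in zip(examples, labels):
        subsets[example[attribute]].append(label)
    remainder = sum((len(ls) / n) * sample_entropy(ls) for ls in subsets.values())
    return sample_entropy(labels) - remainder

# A perfect attribute (slide 11) has gain equal to Entropy(S);
# an uninformative one has gain 0.
```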

D Trees 15 D-Tree Example

D Trees 16 Minimal D-Tree

D Trees 17 Summary
ML avoids some knowledge-engineering (KE) effort.
A recursive algorithm for building D-Trees.
Using information gain (entropy) to select the discriminating attribute.
Worked example: the restaurant data.
Important people: Claude Shannon, William of Ockham.