LEARNING DECISION TREES Yılmaz KILIÇASLAN

Definition - I Decision tree induction is one of the simplest and yet most successful forms of learning algorithm. A decision tree takes as input an object or situation described by a set of attributes and returns a "decision" – the predicted output value for the input. The input attributes and the output value can be discrete or continuous.

Definition - II Learning a discrete-valued function is called classification learning. Learning a continuous function is called regression. We will concentrate on Boolean classification, wherein each example is classified as true (positive) or false (negative).

Decision Trees as Performance Elements A decision tree reaches its decision by performing a sequence of tests. Each internal node in the tree corresponds to a test of the value of one of the properties, and the branches from the node are labeled with the possible values of the test. Each leaf node in the tree specifies the value to be returned if that leaf is reached.
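To make the performance element concrete, here is a minimal sketch (not part of the original slides; the names Node, Leaf, and classify are illustrative) that represents a tree as nested nodes and classifies an input by performing the sequence of tests from the root down to a leaf.

```python
# A minimal, illustrative decision-tree representation (names are hypothetical).
# An internal node tests one attribute; each branch is labeled with a value;
# a leaf stores the value to return.

class Leaf:
    def __init__(self, value):
        self.value = value          # e.g. True ("wait") or False ("leave")

class Node:
    def __init__(self, attribute, branches):
        self.attribute = attribute  # attribute tested at this node
        self.branches = branches    # dict: attribute value -> subtree (Node or Leaf)

def classify(tree, example):
    """Follow the sequence of tests from the root until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.branches[example[tree.attribute]]
    return tree.value

# A tiny made-up fragment in the spirit of the restaurant example (not Figure 1 itself):
tree = Node("Patrons", {
    "None": Leaf(False),
    "Some": Leaf(True),
    "Full": Node("Hungry", {True: Leaf(True), False: Leaf(False)}),
})

print(classify(tree, {"Patrons": "Full", "Hungry": True}))  # -> True
```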

Example Aim: To decide whether to wait for a table at a restaurant. Attributes:
1. Alternate: whether there is a suitable alternative restaurant nearby.
2. Bar: whether the restaurant has a comfortable bar area to wait in.
3. Fri/Sat: true on Fridays and Saturdays.
4. Hungry: whether we are hungry.
5. Patrons: how many people are in the restaurant (values are None, Some, and Full).
6. Price: the restaurant's price range ($, $$, $$$).
7. Raining: whether it is raining outside.
8. Reservation: whether we made a reservation.
9. Type: the kind of restaurant (French, Italian, Thai, or burger).
10. WaitEstimate: the wait estimated by the host (0-10, 10-30, 30-60, or >60 minutes).
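For concreteness, a single restaurant situation described by these ten attributes could be encoded as follows (a hypothetical encoding, consistent with the value ranges listed above):

```python
# One hypothetical example: the input is an assignment to the ten attributes,
# the output is the Boolean classification WillWait.
example = {
    "Alternate": True, "Bar": False, "Fri/Sat": False, "Hungry": True,
    "Patrons": "Some", "Price": "$", "Raining": False, "Reservation": True,
    "Type": "Thai", "WaitEstimate": "0-10",
}
will_wait = True   # the classification (a positive example)
```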

A Decision Tree to Decide Whether to Wait for a Table Figure 1.

Expressiveness of Decision Trees - I Logically speaking, any particular decision tree hypothesis for the WillWait goal predicate can be seen as an assertion of the form ∀s WillWait(s) ⇔ (P₁(s) ∨ P₂(s) ∨ … ∨ Pₙ(s)), where each condition Pᵢ(s) is a conjunction of tests corresponding to a path from the root of the tree to a leaf with a positive outcome. Although this looks like a first-order sentence, it is, in a sense, propositional.

Expressiveness of Decision Trees - II Decision trees are fully expressive within the class of propositional languages; that is, any Boolean function can be written as a decision tree. This can be done trivially by having each row in the truth table for the function correspond to a path in the tree. This would yield an exponentially large decision tree representation because the truth table has exponentially many rows. Decision trees can represent many (but not all) functions with much smaller trees.
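The truth-table construction can be made explicit. The sketch below (illustrative, not from the slides) builds a decision tree, represented as nested dicts, for an arbitrary Boolean function by creating one root-to-leaf path per truth-table row; for n attributes the result has 2^n leaves, which is exactly the exponential blow-up mentioned above.

```python
def tree_from_truth_table(attributes, f):
    """Build a decision tree (nested dicts) that tests every attribute in turn,
    one root-to-leaf path per row of f's truth table (2**n leaves for n attributes)."""
    def build(assigned, remaining):
        if not remaining:
            return f(**assigned)                      # leaf: the row's output value
        attr, rest = remaining[0], remaining[1:]
        return {attr: {value: build({**assigned, attr: value}, rest)
                       for value in (False, True)}}
    return build({}, list(attributes))

# Example: XOR of two Boolean attributes.
xor_tree = tree_from_truth_table(["A", "B"], lambda A, B: A != B)
print(xor_tree)
# {'A': {False: {'B': {False: False, True: True}},
#        True:  {'B': {False: True,  True: False}}}}
```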

Inducing decision trees from examples – I An example for a Boolean decision tree consists of a vector of input attributes, X, and a single Boolean output value y. The complete set of examples is called the training set. To find a decision tree that agrees with the training set, we could simply construct a tree with one root-to-leaf path per example, where the path tests each attribute in turn, follows the branch matching the example's value, and ends in a leaf carrying the example's classification. When given the same example again, the decision tree will come up with the right classification. Unfortunately, it will not have much to say about any other cases!
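The "one path per example" tree described above amounts to memorizing the training set. A minimal sketch (with made-up data) shows why it says nothing about unseen cases:

```python
def memorize(training_set):
    """The trivial 'tree': one root-to-leaf path (here, one dict key)
    per training example, keyed by the full tuple of attribute values."""
    table = {}
    for attributes, label in training_set:
        table[tuple(sorted(attributes.items()))] = label
    return table

def lookup(table, attributes):
    # Correct on every training example, silent on everything else.
    return table.get(tuple(sorted(attributes.items())))  # None if unseen

training = [({"Patrons": "Some", "Hungry": False}, True),
            ({"Patrons": "Full", "Hungry": True}, False)]
table = memorize(training)
print(lookup(table, {"Patrons": "Some", "Hungry": False}))  # True (seen before)
print(lookup(table, {"Patrons": "Full", "Hungry": False}))  # None: nothing to say
```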

Inducing decision trees from examples – II Figure 2.

Decision-Tree Learning Algorithm - I The basic idea behind the Decision-Tree Learning Algorithm is to test the most important attribute first. By "most important," we mean the one that makes the most difference to the classification of an example. That way, we hope to get to the correct classification with a small number of tests, meaning that all paths in the tree will be short and the tree as a whole will be small.
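The slides leave "most important" informal. One standard way to make it precise (the measure used by ID3, and by Russell and Norvig later in the chapter; assumed here rather than stated by these slides) is information gain: the expected reduction in the entropy of the classification after splitting on the attribute. A minimal sketch:

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy (in bits) of a list of classifications, e.g. [True, False, True]."""
    counts = Counter(labels).values()
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts)

def information_gain(attribute, examples, labels):
    """Expected reduction in entropy from splitting the examples on `attribute`.
    `examples` is a list of attribute dicts, `labels` the matching classifications."""
    remainder = 0.0
    for value in set(ex[attribute] for ex in examples):
        subset = [lab for ex, lab in zip(examples, labels) if ex[attribute] == value]
        remainder += len(subset) / len(labels) * entropy(subset)
    return entropy(labels) - remainder

# Toy illustration (made-up data): attribute A separates the classes perfectly,
# attribute B does not, so A gets the higher score.
examples = [{"A": 1, "B": 0}, {"A": 1, "B": 1}, {"A": 0, "B": 0}, {"A": 0, "B": 1}]
labels = [True, True, False, False]
print(information_gain("A", examples, labels))  # 1.0
print(information_gain("B", examples, labels))  # 0.0
```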

Decision-Tree Learning Algorithm - II Splitting the examples by testing on attributes (i): (a) Splitting on Type brings us no nearer to distinguishing between positive and negative examples. Figure 3.

Decision-Tree Learning Algorithm - III Splitting the examples by testing on attributes (ii): (b) Splitting on Patrons does a good job of separating positive and negative examples. After splitting on Patrons, Hungry is a fairly good second test. Figure 3.

Decision-Tree Learning Algorithm - IV In general, after the first attribute test splits up the examples, each outcome is a new decision tree learning problem in itself, with fewer examples and one fewer attribute. There are four cases to consider for these recursive problems: 1. If there are some positive and some negative examples, then choose the best attribute to split them. Figure 3(b) shows Hungry being used to split the remaining examples. 2. If all the remaining examples are positive (or all negative), then we are done: we can answer Yes or No. Figure 3(b) shows examples of this in the None and Some cases.

Decision-Tree Learning Algorithm - V 3. If there are no examples left, it means that no such example has been observed, and we return a default value calculated from the majority classification at the node's parent. 4. If there are no attributes left, but both positive and negative examples remain, we have a problem. It means that these examples have exactly the same description, but different classifications. This happens when some of the data are incorrect; we say there is noise in the data. It also happens either when the attributes do not give enough information to describe the situation fully, or when the domain is truly nondeterministic. One simple way out of the problem is to use a majority vote. A sketch of this recursive procedure is given below.
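The four cases translate almost directly into code. The following is a minimal sketch of the recursion, not the slides' own pseudocode: function names are illustrative, and information gain (as in the earlier sketch) is assumed as the way to "choose the best attribute".

```python
import math
from collections import Counter

def plurality_value(labels):
    """Majority classification (ties broken arbitrarily)."""
    return Counter(labels).most_common(1)[0][0]

def entropy(labels):
    counts = Counter(labels).values()
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts)

def importance(attr, examples, labels):
    """Information gain of splitting on attr (one common choice; the slides
    only say 'the attribute that makes the most difference')."""
    remainder = 0.0
    for value in set(ex[attr] for ex in examples):
        subset = [lab for ex, lab in zip(examples, labels) if ex[attr] == value]
        remainder += len(subset) / len(labels) * entropy(subset)
    return entropy(labels) - remainder

def decision_tree_learning(examples, labels, attributes, parent_labels=None):
    # Case 3: no examples left -> default from the parent's majority.
    # (Can only occur if we branch on every possible attribute value,
    # including values not present in the current examples.)
    if not examples:
        return plurality_value(parent_labels)
    # Case 2: all examples have the same classification -> return it.
    if len(set(labels)) == 1:
        return labels[0]
    # Case 4: no attributes left but mixed classifications -> majority vote.
    if not attributes:
        return plurality_value(labels)
    # Case 1: choose the best attribute and recurse on each of its values.
    best = max(attributes, key=lambda a: importance(a, examples, labels))
    tree = {best: {}}
    for value in set(ex[best] for ex in examples):
        sub = [(ex, lab) for ex, lab in zip(examples, labels) if ex[best] == value]
        sub_ex, sub_lab = zip(*sub)
        remaining = [a for a in attributes if a != best]
        tree[best][value] = decision_tree_learning(list(sub_ex), list(sub_lab),
                                                   remaining, labels)
    return tree
```

Run on the twelve restaurant examples with the ten attribute names, this recursion should produce a compact nested-dict tree in the spirit of the one shown in Figure 5.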

Decision-Tree Learning Algorithm - VI Figure 4.

A Decision Tree Induced From Examples Figure 5. The decision tree induced from the 12-example training set.

References Russell, S. and P. Norvig (2003). Artificial Intelligence: A Modern Approach. Prentice Hall.