Decision Trees & the Iterative Dichotomiser 3 (ID3) Algorithm David Ramos CS 157B, Section 1 May 4, 2006
Review of Basics ● What exactly is a Decision Tree? A tree in which each branching node represents a choice among two or more alternatives, and every branching node lies on a path to a leaf node (the bottom of the tree). The leaf node represents the decision the tree derives for the given input. ● How can Decision Trees be used to classify instances of data? Instead of representing decisions, the leaf nodes represent a particular classification of a data instance, based on the given set of attributes (and their discrete values) that define the instance; for illustration, think of a data instance as a relational tuple.
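As a minimal sketch of that idea, a decision tree can be represented as nested dictionaries and an instance classified by walking from root to leaf. The attribute names and values here are hypothetical, chosen only for illustration:

```python
# A branching node is a dict mapping one attribute name to {value: subtree};
# a leaf is simply a classification label (a string here). All names invented.
tree = {
    "outlook": {
        "sunny": {"humidity": {"high": "no", "normal": "yes"}},
        "overcast": "yes",
        "rain": "no",
    }
}

def classify(node, instance):
    """Walk from the root to a leaf, branching on the instance's attribute values."""
    while isinstance(node, dict):
        attribute = next(iter(node))               # attribute tested at this node
        node = node[attribute][instance[attribute]]
    return node                                    # leaf: the classification

print(classify(tree, {"outlook": "sunny", "humidity": "normal"}))  # -> yes
```

Each call follows exactly one root-to-leaf path, which is why classification with a finished tree is cheap even when building the tree was not.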
Review of Basics (cont’d) ● It is important that data instances have boolean or discrete attribute values for a basic understanding of ID3, although there are extensions of ID3 that deal with continuous data. ● Because Decision Trees can classify data instances into different types, they can be “interpreted” as a “good generalization of unobserved instances” of data, appealing to people because “it makes the classification process self-evident” [1]. They represent knowledge about these data instances.
How does ID3 relate to Decision Trees, then? ● ID3, the Iterative Dichotomiser 3 algorithm, is a Decision Tree learning algorithm. The name is apt: it creates Decision Trees for “dichotomizing” data instances, classifying them discretely through branching nodes until a classification “bucket” (a leaf node) is reached. ● By using ID3 and other machine-learning algorithms from Artificial Intelligence, expert systems can engage in tasks usually done by human experts, such as doctors diagnosing diseases by examining various symptoms (the attributes) of patients (the data instances) in a complex Decision Tree. ● Of course, accurate Decision Trees are fundamental to Data Mining and Databases.
ID3 relates to Decision Trees (cont’d) ● Decision tree learning is a method for approximating discrete-valued target functions, in which the learned function is represented by a decision tree. Decision tree learning is one of the most widely used and practical methods for inductive inference. [2] ● The input to ID3 is a set of “training” or “learning” data instances, which the algorithm uses to generate the Decision Tree. The machine “learns” from this preliminary data. For future exams, remember that ID3 was developed by Ross Quinlan in 1983.
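A hypothetical learning set might look like the following; the attributes and values are invented for illustration, in the style of the classic "play tennis" example. Each instance carries discrete attribute values plus its known classification under a target attribute (here "play"):

```python
# Each data instance is a dict of discrete attribute values plus its
# classification under the target attribute "play". Data is hypothetical.
training_data = [
    {"outlook": "sunny",    "windy": False, "play": "no"},
    {"outlook": "sunny",    "windy": True,  "play": "no"},
    {"outlook": "overcast", "windy": False, "play": "yes"},
    {"outlook": "rain",     "windy": False, "play": "yes"},
    {"outlook": "rain",     "windy": True,  "play": "no"},
]
```

The algorithm sees only these labeled instances; the tree it emits should reproduce their classifications and, ideally, generalize to unobserved instances.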
Description of ID3 ● The ID3 algorithm generates a Decision Tree by using a greedy search through the input sets of data instances to determine the nodes and the attributes on which they branch. The emerging tree is built in a top-down (root to leaf) manner, node by node. This occurs RECURSIVELY, reminding you of those “pointless” tree traversal strategies in CS 146 that you hated doing. ● At each emerging node, the algorithm attempts to determine whether the decision “attribute” on which branching will be based is the most ideal branching attribute (by using the input sets of data). One metric that can be used to decide whether a branching attribute is adequate is INFORMATION GAIN, which is defined in terms of ENTROPY (information gain is the expected reduction in entropy, so the two are, abstractly, inversely related).
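The greedy, recursive, top-down construction described above can be sketched in Python as follows. This is a simplified illustration, not the canonical implementation: the function names, the dict-based tree representation, and breaking ties with a majority vote when attributes run out are our own assumptions.

```python
import math
from collections import Counter

def entropy(rows, target):
    """Entropy of the target attribute over a list of data instances."""
    counts = Counter(row[target] for row in rows)
    total = len(rows)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(rows, attribute, target):
    """Expected reduction in entropy from branching on `attribute`."""
    total = len(rows)
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row for row in rows if row[attribute] == value]
        remainder += len(subset) / total * entropy(subset, target)
    return entropy(rows, target) - remainder

def id3(rows, attributes, target):
    """Greedily pick the highest-gain attribute, branch, and recurse."""
    labels = {row[target] for row in rows}
    if len(labels) == 1:                  # pure "bucket": make a leaf node
        return labels.pop()
    if not attributes:                    # nothing left to branch on: majority vote
        return Counter(row[target] for row in rows).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    remaining = [a for a in attributes if a != best]
    return {best: {
        value: id3([row for row in rows if row[best] == value], remaining, target)
        for value in {row[best] for row in rows}
    }}
```

Note the greedy character: once an attribute is chosen for a node it is never reconsidered, so the search never backtracks even when a different early split would have yielded a smaller tree.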
Why do we use Entropy as THE metric? ● ID3 uses Entropy to determine whether, based on the input set of data, the selected branching attribute for a particular node of the emerging tree is adequate. Specifically, the attribute that results in the greatest reduction of Entropy over the learning sets is the best. ● GOAL: Find a way to optimally classify a learning set, such that the resulting Decision Tree is not too deep and the number of branches (internal nodes) of the Tree is minimized. ● SOLUTION: The more Entropy, a measure of impurity in a collection of data sets, remains in the system, the more branches and depth the tree will have. FIND entropy-reducing attributes in the learning sets and use them for branching!
How is Entropy related to Information Gain, the other metric for determining whether ID3 is choosing appropriate branching attributes from the training sets? ● Information Gain measures the expected reduction in Entropy: the higher the Information Gain, the greater the expected reduction in Entropy. ● It turns out that Entropy, a measure of non-homogeneity within a set of learning sets, can be calculated in a straightforward manner. For a more detailed presentation of the definition of Entropy and its calculation, see Prof. Lee’s Lecture Notes.
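For reference, the standard textbook definitions are as follows (this is the usual formulation, not a reproduction of Prof. Lee's notes): for a learning set S with c classes, where p_i is the proportion of instances in class i and S_v is the subset of S in which attribute A has value v,

```latex
Entropy(S) = -\sum_{i=1}^{c} p_i \log_2 p_i
\qquad
Gain(S, A) = Entropy(S) \;-\; \sum_{v \in Values(A)} \frac{|S_v|}{|S|}\, Entropy(S_v)
```

Entropy is 0 for a pure set and maximal (log2 c) when all classes are equally represented, so ID3's choice of the highest-gain attribute is exactly the choice that pushes each branch's subset toward purity.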