Decision Trees & the Iterative Dichotomiser 3 (ID3) Algorithm
David Ramos, CS 157B, Section 1, May 4, 2006


Review of Basics
● What exactly is a Decision Tree? A tree in which each branching node represents a choice between two or more alternatives, and every branching node lies on a path to a leaf node (the bottom of the tree). The leaf node represents a decision, derived from the tree for the given input.
● How can Decision Trees be used to classify instances of data? Instead of representing decisions, leaf nodes represent a particular classification of a data instance, based on the given set of attributes (and their discrete values) that define the instance; for illustration purposes, an instance is much like a relational tuple.
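To make this concrete, here is a minimal sketch, not from the original slides, of a Decision Tree as a data structure along with a classification walk from root to leaf. The language (Python) and all class and function names are choices of this illustration, not part of the ID3 presentation:

    # A leaf carries the decision/classification; an internal node tests
    # one attribute and has one child subtree per discrete value.
    class Leaf:
        def __init__(self, label):
            self.label = label

    class Node:
        def __init__(self, attribute, children):
            self.attribute = attribute   # attribute tested at this node
            self.children = children     # dict: attribute value -> subtree

    def classify(tree, instance):
        # Follow the branch matching the instance's value for each tested
        # attribute until a leaf (classification "bucket") is reached.
        while isinstance(tree, Node):
            tree = tree.children[instance[tree.attribute]]
        return tree.label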

Review of Basics (cont’d)
● For a basic understanding of ID3, it is important that data instances have boolean or discrete values for their attributes, although there are extensions of ID3 (such as C4.5) that deal with continuous data. A minimal representation of such instances is sketched below.
● Because Decision Trees can classify data instances into different types, they can be “interpreted” as a “good generalization of unobserved instances” of data, appealing to people because “it makes the classification process self-evident” [1]. They represent knowledge about these data instances.
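As an illustration of such instances, here is a hypothetical (made-up) training set in the same Python sketch, with each instance as a record of discrete attribute values plus its classification, loosely in the spirit of a relational tuple:

    # Hypothetical training instances; "class" is the target classification.
    training_set = [
        {"outlook": "sunny",    "windy": "false", "class": "yes"},
        {"outlook": "sunny",    "windy": "true",  "class": "no"},
        {"outlook": "overcast", "windy": "false", "class": "yes"},
        {"outlook": "rain",     "windy": "true",  "class": "no"},
    ]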

How does ID3 relate to Decision Trees, then?
● ID3, the Iterative Dichotomiser 3 Algorithm, is a Decision Tree learning algorithm. The name is apt: it creates Decision Trees for “dichotomizing” data instances, i.e., classifying them discretely through branching nodes until a classification “bucket” (a leaf node) is reached.
● By using ID3 and other machine-learning algorithms from Artificial Intelligence, expert systems can engage in tasks usually done by human experts, such as doctors diagnosing diseases by examining various symptoms (the attributes) of patients (the data instances) in a complex Decision Tree.
● Of course, accurate Decision Trees are fundamental to Data Mining and Databases.

ID3 relates to Decision Trees (cont’d)
● “Decision tree learning is a method for approximating discrete-valued target functions, in which the learned function is represented by a decision tree. Decision tree learning is one of the most widely used and practical methods for inductive inference.” [2]
● The input to ID3 is a set of “training” (or “learning”) data instances, which the algorithm uses to generate the Decision Tree; the machine is “learning” from this preliminary data. For future exams, remember that ID3 was developed by Ross Quinlan in 1983.

Description of ID3
● The ID3 algorithm generates a Decision Tree using a greedy search through the input sets of data instances to determine the nodes and the attributes on which they branch. The emerging tree is traversed in a top-down (root to leaf) manner through each of its nodes. This occurs RECURSIVELY, recalling those “pointless” tree traversal strategies from CS 146 that you hated doing; a sketch of the recursion follows below.
● The traversal attempts to determine whether the decision attribute on which a particular emerging node branches is the most ideal branching attribute (judged by the input sets of data). One metric for deciding whether a branching attribute is adequate is INFORMATION GAIN, the expected reduction in ENTROPY (so Entropy is, abstractly, inversely related to Information Gain).
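A minimal sketch of that recursion, assuming instances are records like the hypothetical training set above. The helper best_attribute is a placeholder here; it stands for the Information Gain computation described on the following slides, where a matching sketch is given:

    # Sketch of ID3's greedy, top-down recursion (illustrative, not
    # Quinlan's original formulation).
    def id3(instances, attributes):
        labels = [inst["class"] for inst in instances]
        if len(set(labels)) == 1:
            return Leaf(labels[0])               # pure bucket: make a leaf
        if not attributes:
            return Leaf(max(set(labels), key=labels.count))  # majority vote
        best = best_attribute(instances, attributes)   # greedy choice
        children = {}
        for value in {inst[best] for inst in instances}:
            subset = [inst for inst in instances if inst[best] == value]
            remaining = [a for a in attributes if a != best]
            children[value] = id3(subset, remaining)   # recurse per branch
        return Node(best, children)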

Why do we use Entropy as THE metric?
● ID3 uses Entropy to determine whether, based on the input set of data, the branching attribute selected for a particular node of the emerging tree is adequate. Specifically, the attribute that yields the greatest reduction of Entropy over the learning sets is the best.
● GOAL: Find a way to optimally classify a learning set, such that the resulting Decision Tree is not too deep and the number of branches (internal nodes) is minimized.
● SOLUTION: The more Entropy in a system (Entropy being a measure of impurity in a collection of data), the more branches and depth the tree will have. FIND the entropy-reducing attributes in the learning sets and use them for branching!
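A sketch of the Entropy calculation over a set of instances, using the standard formula Entropy(S) = -Σ p_i · log2(p_i), where p_i is the proportion of instances in S belonging to class i (the standard definition, not specific to these slides); names follow the earlier sketches:

    import math

    def entropy(instances):
        # Entropy(S) = -sum_i p_i * log2(p_i); 0.0 for a pure
        # (homogeneous) set, higher for more impurity.
        labels = [inst["class"] for inst in instances]
        total = len(labels)
        result = 0.0
        for label in set(labels):
            p = labels.count(label) / total
            result -= p * math.log2(p)
        return result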

How is Entropy related to Information Gain, the other metric for determining whether ID3 is choosing appropriate branching attributes from the training sets?
● Information Gain measures the expected reduction in Entropy: the higher the Information Gain of an attribute, the greater the expected reduction in Entropy from branching on it.
● It turns out that Entropy, a measure of non-homogeneity within a collection of learning sets, can be calculated in a straightforward manner; sketches of both calculations follow below. For a more detailed presentation of the definition of Entropy and its calculation, see Prof. Lee’s Lecture Notes.
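A matching sketch of Information Gain, using the standard formula Gain(S, A) = Entropy(S) - Σ_v (|S_v| / |S|) · Entropy(S_v), where S_v is the subset of S whose value for attribute A is v. This also fills in the hypothetical best_attribute used in the ID3 sketch above, followed by a small usage example on the made-up training set:

    def information_gain(instances, attribute):
        # Gain(S, A) = Entropy(S) - sum_v (|S_v|/|S|) * Entropy(S_v):
        # the expected reduction in Entropy from branching on A.
        total = len(instances)
        remainder = 0.0
        for value in {inst[attribute] for inst in instances}:
            subset = [inst for inst in instances if inst[attribute] == value]
            remainder += (len(subset) / total) * entropy(subset)
        return entropy(instances) - remainder

    def best_attribute(instances, attributes):
        # Greedy choice: the attribute with the highest Information Gain.
        return max(attributes, key=lambda a: information_gain(instances, a))

    # Usage on the hypothetical data: "windy" has the higher gain, so the
    # tree branches on it first.
    tree = id3(training_set, ["outlook", "windy"])
    print(classify(tree, {"outlook": "sunny", "windy": "true"}))  # -> "no"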