Artificial Intelligence 7. Decision trees

Presentation transcript:

Artificial Intelligence
7. Decision trees
Japan Advanced Institute of Science and Technology (JAIST)
Yoshimasa Tsuruoka

Outline
- What is a decision tree?
- How to build a decision tree
- Entropy
- Information gain
- Overfitting
- Generalization performance
- Pruning
Lecture slides: http://www.jaist.ac.jp/~tsuruoka/lectures/

Decision trees
- Chapter 3 of Mitchell, T., Machine Learning (1997)
- A decision tree represents a disjunction of conjunctions
- Successfully applied to a broad range of tasks
  - Diagnosing medical cases
  - Assessing credit risk of loan applications
- Nice characteristics
  - Understandable to humans
  - Robust to noise

A decision tree
Concept: PlayTennis

Outlook
  Sunny    -> Humidity
                High   -> No
                Normal -> Yes
  Overcast -> Yes
  Rain     -> Wind
                Strong -> No
                Weak   -> Yes

Classification by a decision tree
Instance: <Outlook = Sunny, Temperature = Hot, Humidity = High, Wind = Strong>
The instance is sorted down the tree from the root: Outlook = Sunny, then Humidity = High, which reaches the leaf No (PlayTennis = No).

Disjunction of conjunctions
Each path from the root to a Yes leaf is a conjunction of attribute tests; the tree as a whole is the disjunction of these paths:
(Outlook = Sunny ^ Humidity = Normal) v (Outlook = Overcast) v (Outlook = Rain ^ Wind = Weak)
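As a rough sketch of how this expression follows from the tree, the code below walks every root-to-leaf path and keeps those ending in Yes. The dictionary encoding and the function names `yes_paths` and `to_dnf` are assumptions for illustration; the ASCII `^` and `v` stand for AND and OR as in the slide.

```python
# Hypothetical encoding of the PlayTennis tree: dict = internal node, str = leaf.
TREE = {
    "attribute": "Outlook",
    "branches": {
        "Sunny": {"attribute": "Humidity",
                  "branches": {"High": "No", "Normal": "Yes"}},
        "Overcast": "Yes",
        "Rain": {"attribute": "Wind",
                 "branches": {"Strong": "No", "Weak": "Yes"}},
    },
}


def yes_paths(node, path=()):
    """Yield each root-to-leaf path, as (attribute, value) tests, that ends in Yes."""
    if not isinstance(node, dict):
        if node == "Yes":
            yield path
        return
    for value, child in node["branches"].items():
        yield from yes_paths(child, path + ((node["attribute"], value),))


def to_dnf(tree):
    """Join the Yes paths into a disjunction of conjunctions."""
    conjunctions = [
        "(" + " ^ ".join(f"{attr} = {val}" for attr, val in path) + ")"
        for path in yes_paths(tree)
    ]
    return " v ".join(conjunctions)


print(to_dnf(TREE))
```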

Problems suited to decision trees
- Instances are represented by attribute-value pairs
- The target function has discrete target values
- Disjunctive descriptions may be required
- The training data may contain errors
- The training data may contain missing attribute values

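Training data

Day  Outlook   Temperature  Humidity  Wind    PlayTennis
D1   Sunny     Hot          High      Weak    No
D2   Sunny     Hot          High      Strong  No
D3   Overcast  Hot          High      Weak    Yes
D4   Rain      Mild         High      Weak    Yes
D5   Rain      Cool         Normal    Weak    Yes
D6   Rain      Cool         Normal    Strong  No
D7   Overcast  Cool         Normal    Strong  Yes
D8   Sunny     Mild         High      Weak    No
D9   Sunny     Cool         Normal    Weak    Yes
D10  Rain      Mild         Normal    Weak    Yes
D11  Sunny     Mild         Normal    Strong  Yes
D12  Overcast  Mild         High      Strong  Yes
D13  Overcast  Hot          Normal    Weak    Yes
D14  Rain      Mild         High      Strong  No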

Which attribute should be tested at each node?
- We want to build a small decision tree
- Information gain
  - How well a given attribute separates the training examples according to their target classification
  - Reduction in entropy
- Entropy
  - (Im)purity of an arbitrary collection of examples

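Entropy
If there are only two classes:
  Entropy(S) = - p_{+} \log_2 p_{+} - p_{-} \log_2 p_{-}
  where p_{+} and p_{-} are the proportions of positive and negative examples in S.
In general, with c classes:
  Entropy(S) = \sum_{i=1}^{c} - p_i \log_2 p_i
  where p_i is the proportion of examples in S that belong to class i.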
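A minimal sketch of the entropy computation, assuming the collection is summarized by its per-class example counts (the function name `entropy` and this count-based interface are choices made for the example). The term 0 log 0 is taken to be 0, as is conventional.

```python
import math


def entropy(class_counts):
    """Entropy in bits of a collection given the number of examples per class."""
    total = sum(class_counts)
    result = 0.0
    for count in class_counts:
        if count > 0:  # 0 * log2(0) is treated as 0
            p = count / total
            result -= p * math.log2(p)
    return result


print(round(entropy([9, 5]), 3))  # 0.94 for the 9 positive / 5 negative PlayTennis examples
print(entropy([7, 7]))            # 1.0: an evenly mixed collection is maximally impure
print(entropy([4, 0]))            # 0.0: a pure collection
```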

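Information Gain
The expected reduction in entropy achieved by splitting the training examples S on an attribute A:
  Gain(S, A) = Entropy(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|} Entropy(S_v)
  where S_v is the subset of S for which attribute A has value v.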
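The following sketch computes information gain from class counts alone. The function names and the count-based interface are assumptions for illustration; the numbers are the class counts given in these slides for the Outlook split, [9+,5-] divided into [2+,3-], [4+,0-] and [3+,2-].

```python
import math


def entropy(class_counts):
    """Entropy in bits of a collection given the number of examples per class."""
    total = sum(class_counts)
    return -sum(c / total * math.log2(c / total) for c in class_counts if c > 0)


def information_gain(parent_counts, subset_counts):
    """Entropy of the parent minus the size-weighted entropy of the subsets."""
    total = sum(parent_counts)
    remainder = sum(sum(s) / total * entropy(s) for s in subset_counts)
    return entropy(parent_counts) - remainder


# Outlook: Sunny [2+,3-], Overcast [4+,0-], Rain [3+,2-]
gain = information_gain([9, 5], [[2, 3], [4, 0], [3, 2]])
print(round(gain, 3))  # 0.247
```

A split that leaves every subset as mixed as the parent has a gain of 0, while a split into pure subsets has a gain equal to the parent's entropy.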

Example

Computing Information Gain
Candidate splits compared: Humidity (High / Normal) and Wind (Weak / Strong)

Which attribute is the best classifier? Information gain
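To make the comparison concrete, the sketch below computes the gain of each attribute on the 14 PlayTennis examples and picks the largest. The rows are assumed to be the standard PlayTennis data from Mitchell (1997), Chapter 3, which the lecture cites; the names `DATA`, `ATTRIBUTES`, `entropy` and `gain` are illustrative.

```python
import math
from collections import Counter

ATTRIBUTES = ["Outlook", "Temperature", "Humidity", "Wind"]
# Assumed rows: (Outlook, Temperature, Humidity, Wind, PlayTennis) for D1..D14.
DATA = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]


def entropy(labels):
    """Entropy in bits of a list of class labels."""
    total = len(labels)
    return -sum(n / total * math.log2(n / total) for n in Counter(labels).values())


def gain(rows, column):
    """Information gain of splitting rows on the attribute stored in `column`."""
    remainder = 0.0
    for value in set(row[column] for row in rows):
        subset = [row[-1] for row in rows if row[column] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([row[-1] for row in rows]) - remainder


gains = {name: gain(DATA, i) for i, name in enumerate(ATTRIBUTES)}
best = max(gains, key=gains.get)
for name, value in gains.items():
    print(f"{name}: {value:.3f}")
print("best:", best)  # best: Outlook
```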

Splitting training data with Outlook
{D1,D2,…,D14} [9+,5-]

Outlook
  Sunny    -> {D1,D2,D8,D9,D11}   [2+,3-]  -> ?
  Overcast -> {D3,D7,D12,D13}     [4+,0-]  -> Yes
  Rain     -> {D4,D5,D6,D10,D14}  [3+,2-]  -> ?
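The nodes marked "?" are filled in by repeating the same procedure on each subset. Below is a simplified ID3-style sketch of that recursion, not a full implementation (no pruning, no handling of missing values). The data rows are assumed to be the standard PlayTennis examples from Mitchell (1997), and all names are illustrative.

```python
import math
from collections import Counter

ATTRIBUTES = ["Outlook", "Temperature", "Humidity", "Wind"]
# Assumed rows: (Outlook, Temperature, Humidity, Wind, PlayTennis) for D1..D14.
DATA = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]


def entropy(labels):
    total = len(labels)
    return -sum(n / total * math.log2(n / total) for n in Counter(labels).values())


def gain(rows, column):
    remainder = 0.0
    for value in set(row[column] for row in rows):
        subset = [row[-1] for row in rows if row[column] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([row[-1] for row in rows]) - remainder


def id3(rows, columns):
    """Grow a tree top-down, splitting on the highest-gain attribute at each node."""
    labels = [row[-1] for row in rows]
    if len(set(labels)) == 1:   # all examples agree: make a leaf
        return labels[0]
    if not columns:             # no attributes left: use the majority label
        return Counter(labels).most_common(1)[0][0]
    best = max(columns, key=lambda c: gain(rows, c))
    remaining = [c for c in columns if c != best]
    branches = {}
    for value in sorted(set(row[best] for row in rows)):
        subset = [row for row in rows if row[best] == value]
        branches[value] = id3(subset, remaining)
    return {"attribute": ATTRIBUTES[best], "branches": branches}


tree = id3(DATA, list(range(len(ATTRIBUTES))))
print(tree)
```

On this data the Sunny branch is split on Humidity and the Rain branch on Wind, which reproduces the PlayTennis tree from the earlier slide.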

Overfitting
- Growing each branch of the tree deeply enough to perfectly classify the training examples is not a good strategy.
  - The resulting tree may overfit the training data.
- Overfitting
  - The tree can explain the training data very well but performs poorly on new data.

Alleviating the overfitting problem
- Several approaches
  - Stop growing the tree earlier
  - Post-prune the tree
- How can we evaluate the classification performance of the tree on new data?
  - The available data are separated into two sets of examples: a training set and a validation (development) set.

Validation (development) set
Use a portion of the original training data to estimate the generalization performance.
[Figure: the original training set is divided into a training set and a validation set; the test set is kept separate.]
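One simple way to carve out such sets is a random split. This is a sketch under assumed settings: the function name `split_dataset`, the 20% fractions, and the fixed seed are illustrative choices, not values from the lecture.

```python
import random


def split_dataset(examples, validation_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and cut it into training, validation and test lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = int(len(shuffled) * test_fraction)
    n_validation = int(len(shuffled) * validation_fraction)
    test_set = shuffled[:n_test]
    validation_set = shuffled[n_test:n_test + n_validation]
    training_set = shuffled[n_test + n_validation:]
    return training_set, validation_set, test_set


training_set, validation_set, test_set = split_dataset(range(100))
print(len(training_set), len(validation_set), len(test_set))  # 60 20 20
```

The tree is grown on the training set only. The validation set is used for decisions such as when to stop growing or which branches to prune, and the test set is held back for the final performance estimate.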