COMP61011: Machine Learning - Decision Trees
Recap: Decision Stumps
[figure: labelled data points on a number line from roughly 1 to 60]
Q. Where is a good threshold?
From Decision Stumps, to Decision Trees
- A new type of non-linear model
- Copes naturally with continuous and categorical data
- Fast to both train and test (highly parallelizable)
- Generates a set of interpretable rules
Recap: Decision Stumps
[figure: number line with data points 10.5, 14.1, 17.0, 21.0, 23.2, 27.1, 30.1, 42.0, 47.0, 57.3, 59.9, divided by a threshold into a “no” side and a “yes” side]
The stump “splits” the dataset. Here we have 4 classification errors.
A modified stump
[figure: the same data points, with the threshold moved so that 57.3 and 59.9 are split off from the rest]
Here we have 3 classification errors.
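A stump’s threshold can be found by exhaustive search over candidate split points. A minimal sketch, using the data values from the slides but with made-up class labels (the slides’ actual labels are not recoverable from the figures, so the error counts here need not match):

```python
# Exhaustive threshold search for a 1-D decision stump.
# Data values are from the slide; the 0/1 labels are illustrative placeholders.
xs = [10.5, 14.1, 17.0, 21.0, 23.2, 27.1, 30.1, 42.0, 47.0, 57.3, 59.9]
ys = [1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1]  # hypothetical class labels

def fit_stump(xs, ys):
    """Try a threshold between each adjacent pair of sorted values;
    return (best_threshold, training_errors)."""
    best_t, best_err = None, len(ys) + 1
    svals = sorted(xs)
    candidates = [(a + b) / 2 for a, b in zip(svals, svals[1:])]
    for t in candidates:
        # Predict the majority class on each side of the threshold;
        # the minority count on each side is that side's error count.
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        err = min(left.count(0), left.count(1)) + min(right.count(0), right.count(1))
        if err < best_err:
            best_t, best_err = t, err
    return best_t, best_err
```

Trying every midpoint between adjacent sorted values is sufficient: moving a threshold between two such midpoints never changes which points fall on each side.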
Recursion…
Each side of the split is just another dataset, so build a stump on it! And each of those splits produces yet another dataset… so build another stump. Repeating this gives nested yes/no splits.
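The recursion above can be sketched as a tiny recursive tree-builder. The function names and the stopping rules (`max_depth`, `min_split`) are illustrative choices for this sketch, not the course’s reference implementation; labels are assumed to be 0/1:

```python
def majority(ys):
    # Most common label (ties broken arbitrarily).
    return max(set(ys), key=ys.count)

def best_split(xs, ys):
    """Best threshold by training error; returns (threshold, errors) or None."""
    svals = sorted(set(xs))
    best = None
    for a, b in zip(svals, svals[1:]):
        t = (a + b) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        err = min(left.count(0), left.count(1)) + min(right.count(0), right.count(1))
        if best is None or err < best[1]:
            best = (t, err)
    return best

def build_tree(xs, ys, depth=0, max_depth=3, min_split=2):
    # Make a leaf when the node is pure, too small, or too deep.
    if depth >= max_depth or len(set(ys)) == 1 or len(ys) < min_split:
        return ("leaf", majority(ys))
    split = best_split(xs, ys)
    if split is None:  # all x values identical: cannot split further
        return ("leaf", majority(ys))
    t, _ = split
    left = [(x, y) for x, y in zip(xs, ys) if x <= t]
    right = [(x, y) for x, y in zip(xs, ys) if x > t]
    # Each side is "just another dataset": recurse on it.
    return ("node", t,
            build_tree([x for x, _ in left], [y for _, y in left],
                       depth + 1, max_depth, min_split),
            build_tree([x for x, _ in right], [y for _, y in right],
                       depth + 1, max_depth, min_split))

def predict(tree, x):
    # Walk the nested rules down to a leaf.
    if tree[0] == "leaf":
        return tree[1]
    _, t, left, right = tree
    return predict(left if x <= t else right, x)
```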
Decision Trees = nested rules

if x > 25 then
    if x > 50 then y = 0 else y = 1; endif
else
    if x > 16 then y = 0 else y = 1; endif
endif
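The nested rules translate directly into code, for example:

```python
# The slide's nested rules, written as a Python function.
def predict(x):
    if x > 25:
        return 0 if x > 50 else 1
    else:
        return 0 if x > 16 else 1
```

So a trained tree is nothing more than a set of interpretable if/else rules, which is one of its main attractions.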
Here’s one I made earlier…
‘pop’ is the number of training points that arrived at that node. ‘err’ is the fraction of those examples incorrectly classified.
Increasing the maximum depth (10)
Decreasing the minimum number of examples required to make a split (5)
Overfitting…
[figure: testing error vs. tree depth (1–9); the optimal depth is around the middle of the range, and beyond it the tree starts to overfit]
A little break. Talk to your neighbours.
We’ve been assuming continuous variables!
[figure: data plotted on continuous axes]
The Tennis Problem
Outlook
  SUNNY → Humidity
    HIGH → NO
    NORMAL → YES
  OVERCAST → YES
  RAIN → Wind
    STRONG → NO
    WEAK → YES
The Tennis Problem Note: 9 examples say “YES”, while 5 say “NO”.
Partitioning the data…
Thinking in Probabilities…
The “Information” in a feature More uncertainty = less information H(X) = 1
The “Information” in a feature Less uncertainty = more information H(X) = 0.72193
Entropy
Calculating Entropy
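Entropy can be computed directly from class counts. A small sketch: the uniform binary case gives the H(X) = 1 from the earlier slide, and probabilities (0.2, 0.8) give exactly the H(X) = 0.72193 shown there; the 9-YES / 5-NO tennis labels give about 0.940 bits:

```python
from math import log2

def entropy(counts):
    """Shannon entropy (in bits) from a list of class counts."""
    total = sum(counts)
    # Skip zero counts: lim p->0 of p*log2(p) is 0.
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

entropy([7, 7])   # uniform binary variable: 1.0 bit
entropy([1, 4])   # probabilities (0.2, 0.8): ~0.72193 bits
entropy([9, 5])   # the tennis data's class label: ~0.940 bits
```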
Information Gain, also known as “Mutual Information”
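The standard definition, in the notation used on these slides: the information gain of a feature X about the label Y is the drop in label entropy once the feature’s value is known,

```latex
I(X;Y) = H(Y) - H(Y \mid X) = H(Y) - \sum_{x} p(x)\, H(Y \mid X = x)
```

which is exactly the mutual information between X and Y.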
Why don’t we just measure the number of errors?! Errors = 0.25 … for both! But mutual information tells the two splits apart: I(X1;Y) = 0.1887, while I(X2;Y) = 0.3113.
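The slide’s numbers can be reproduced in code. The branch counts below are my reconstruction (8 examples, 4 positive / 4 negative), chosen because they give exactly the slide’s figures; they are not necessarily the lecturer’s original example:

```python
from math import log2

#   X1 splits the data into (3+,1-) and (1+,3-): 2/8 errors, I ~ 0.1887
#   X2 splits the data into (2+,0-) and (2+,4-): 2/8 errors, I ~ 0.3113
# Same error rate, different information gain.

def entropy(counts):
    """Shannon entropy (bits) from a list of class counts."""
    n = sum(counts)
    return -sum(c / n * log2(c / n) for c in counts if c > 0)

def info_gain(branches):
    """branches: list of (pos, neg) counts after the split."""
    total = sum(p + q for p, q in branches)
    pos = sum(p for p, _ in branches)
    h_y = entropy([pos, total - pos])                 # entropy before the split
    h_y_given_x = sum((p + q) / total * entropy([p, q])
                      for p, q in branches)           # weighted entropy after
    return h_y - h_y_given_x

i1 = info_gain([(3, 1), (1, 3)])   # ~0.1887
i2 = info_gain([(2, 0), (2, 4)])   # ~0.3113
```

Information gain rewards X2 because one of its branches is completely pure, even though a raw error count cannot distinguish the two splits.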
Choose the split with the maximum information gain.
[figure: a larger decision tree using Outlook (SUNNY / OVERCAST / RAIN), Temp (HOT / MILD / COOL), Humidity (HIGH / NORMAL), and Wind (WEAK / STRONG)]
Decision Trees, done. Now, grab some lunch. Then, get to the labs. Start to think about your projects…
Reminder: this is the halfway point!
- Project suggestions in handbook.
- Feedback opportunity next week.
- Final project deadline: 4pm Friday of week six (that’s the end of reading week; submit to SSO).
Team Projects
- A deep study of any topic in ML that you want!
- 6-page report (maximum), written as a research paper. Minimum font size 11, margins 2cm. NO EXCEPTIONS.
- Does not have to be “academic”. You could build something, but you must thoroughly analyze and evaluate it according to the same principles.
- 50 marks, based on your 6-page report. Grades divided equally unless a case is made.
- Graded according to:
  - Technical understanding / achievements
  - Experimental methodologies
  - Depth of critical analysis
  - Writing style and exposition