COMP24111: Machine Learning Decision Trees Gavin Brown www.cs.man.ac.uk/~gbrown



Recap: threshold classifiers. [Figure: scatter plot of height vs. weight.]

Q. Where is a good threshold? Also known as “decision stump”

From Decision Stumps to Decision Trees
- A new type of non-linear model
- Copes naturally with continuous and categorical data
- Fast to both train and test (highly parallelizable)
- Generates a set of interpretable rules

Recap: Decision Stumps. The stump "splits" the dataset. Here we have 4 classification errors. [Figure: a stump asking a yes/no question; one branch predicts 0, the other predicts 1.]
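Fitting a stump amounts to trying candidate thresholds and keeping the one with fewest errors. A minimal sketch — the data here is hypothetical, and for brevity only one orientation is tried (predict class 1 below the threshold, class 0 above), whereas a full stump would also try the reverse:

```python
# Hypothetical 1-D data: (feature value, class label)
data = [(12, 1), (18, 1), (22, 0), (30, 0), (41, 1), (55, 0)]

def stump_errors(data, threshold):
    """Count errors when predicting class 1 at or below the
    threshold and class 0 above it (one orientation only)."""
    return sum(1 for x, y in data if (y == 1) != (x <= threshold))

def fit_stump(data):
    """Try every observed value as a threshold; keep the best."""
    candidates = sorted(set(x for x, _ in data))
    return min(candidates, key=lambda t: stump_errors(data, t))
```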

A modified stump. Here we have 3 classification errors. [Figure: the same stump with its threshold shifted.]

Recursion… Each branch is just another dataset, so build a stump on it! [Figure: a stump whose yes/no branches are each split again.]

Decision Trees = nested rules

if x>25 then
    if x>50 then y=0 else y=1; endif
else
    if x>16 then y=0 else y=1; endif
endif
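The nested rules translate directly into code. A minimal Python rendering of the rule on the slide:

```python
def predict(x):
    # Direct translation of the nested if/else rules on the slide
    if x > 25:
        if x > 50:
            return 0
        else:
            return 1
    else:
        if x > 16:
            return 0
        else:
            return 1
```

For example, predict(30) follows the x>25 branch, fails x>50, and returns 1.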

Trees build “orthogonal” decision boundaries. Boundary is piecewise, and at 90 degrees to feature axes.

The most important concept in Machine Learning

Looks good so far… The most important concept in Machine Learning

Looks good so far… Oh no! Mistakes! What happened? The most important concept in Machine Learning

Looks good so far… Oh no! Mistakes! What happened? We didn’t have all the data. We can never assume that we do. This is called “OVER-FITTING” to the small dataset. The most important concept in Machine Learning

The number of possible paths down the tree tells you the number of rules. More rules = more complicated. We could have N rules, where N is the number of examples in the training data. … the more rules, the more chance of OVERFITTING.

Overfitting…. [Figure: testing error plotted against tree depth / length of rules. Error falls to a minimum at the optimal depth, then rises again as the tree starts to overfit.]

Take a short break. Talk to your neighbours. Make sure you understand the recursive algorithm.
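The recursive algorithm can be sketched in a few lines. This is a toy version for 1-D data — the dataset, the depth limit, and the error-count splitting criterion are illustrative choices, not the lecture's exact formulation:

```python
# Each node is either a predicted label (a leaf) or a tuple
# (threshold, left_subtree, right_subtree).

def majority(labels):
    """Most common label in a list."""
    return max(set(labels), key=labels.count)

def build_tree(data, depth=0, max_depth=4):
    labels = [y for _, y in data]
    # Stop when the node is pure or the depth limit is reached
    if len(set(labels)) == 1 or depth == max_depth:
        return majority(labels)
    # Build a stump: pick the threshold with fewest errors
    best_t, best_err = None, len(data) + 1
    for t in sorted(set(x for x, _ in data))[:-1]:
        left = [y for x, y in data if x <= t]
        right = [y for x, y in data if x > t]
        err = (len(left) - left.count(majority(left))) + \
              (len(right) - right.count(majority(right)))
        if err < best_err:
            best_t, best_err = t, err
    if best_t is None:          # all feature values identical
        return majority(labels)
    # Recurse: each branch is "just another dataset"
    left = [(x, y) for x, y in data if x <= best_t]
    right = [(x, y) for x, y in data if x > best_t]
    return (best_t,
            build_tree(left, depth + 1, max_depth),
            build_tree(right, depth + 1, max_depth))

def tree_predict(tree, x):
    """Walk down the tree until a leaf label is reached."""
    while isinstance(tree, tuple):
        t, left, right = tree
        tree = left if x <= t else right
    return tree

# Hypothetical data matching the nested rules in these slides
data = [(10, 1), (14, 1), (20, 0), (24, 0),
        (30, 1), (45, 1), (60, 0), (70, 0)]
tree = build_tree(data)
```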

Decision trees can be seen as nested rules. Nested rules are FAST, and highly parallelizable.

if x>25 then
    if x>50 then y=0 else y=1; endif
else
    if x>16 then y=0 else y=1; endif
endif

Features for pose recognition:
- x,y,z coordinates per joint (~60 total)
- x,y,z velocities per joint (~60 total)
- joint angles (~35 total)
- joint angular velocities (~35 total)

"Real-time human pose recognition in parts from single depth images", Shotton et al., Microsoft Research, Computer Vision and Pattern Recognition 2011. Trees are the basis of the Kinect controller. Features are simple image properties. Test phase: 200 frames per second on GPU. Train phase is more complex, but still parallel.

We’ve been assuming continuous variables!

The Tennis Problem

[Figure: the Tennis decision tree]
Outlook?
- Sunny → Humidity? (High → NO; Normal → YES)
- Overcast → YES
- Rain → Wind? (Strong → NO; Weak → YES)

The Tennis Problem Note: 9 examples say “YES”, while 5 say “NO”.

Partitioning the data…

Thinking in Probabilities…

The “Information” in a feature H(X) = 1 More uncertainty = less information

The “Information” in a feature H(X) = Less uncertainty = more information

Entropy

Calculating Entropy
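Entropy is H(X) = −Σ p(x) log₂ p(x), summed over the outcomes of X. A small sketch computing it from class counts — using the slide's Tennis labels (9 YES, 5 NO), which give about 0.940 bits:

```python
from math import log2

def entropy(counts):
    """H(X) = -sum over outcomes of p(x) * log2 p(x).
    Zero-count outcomes contribute nothing (0 * log 0 = 0)."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)
```

A 50/50 split gives the maximum of 1 bit; a pure node gives 0.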

Information Gain, also known as “Mutual Information”

Choose the feature with maximum information gain. [Figure: information gain computed for each feature.]
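Information gain is Gain(S, A) = H(S) − Σ_v (|S_v|/|S|) · H(S_v): parent entropy minus the weighted average entropy after splitting on feature A. A sketch using the classic PlayTennis counts for Outlook (Sunny: 2 yes/3 no, Overcast: 4/0, Rain: 3/2) — these per-branch counts are the standard textbook figures, assumed here rather than read off the slide:

```python
from math import log2

def entropy(counts):
    """H(X) over a list of class counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

def information_gain(parent_counts, child_counts):
    """Gain = H(parent) - weighted average of the children's entropies."""
    total = sum(parent_counts)
    remainder = sum(sum(child) / total * entropy(child)
                    for child in child_counts)
    return entropy(parent_counts) - remainder

# PlayTennis: 9 yes / 5 no overall; (yes, no) per Outlook value
gain_outlook = information_gain([9, 5], [[2, 3], [4, 0], [3, 2]])
```

This comes out at about 0.247 bits, the largest gain of any feature, which is why Outlook sits at the root of the tree.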

[Figure: the full decision tree grown on the Tennis data — Outlook at the root, with Humidity, Wind, and Temp tested at deeper nodes, leaves predicting YES or NO.]

Decision trees
- Trees are learnt by a recursive algorithm, ID3 (but there are many variants of this)
- Trees draw a particular type of decision boundary (look it up in these slides!)
- Highly efficient, highly parallelizable, lots of possible "splitting" criteria
- Powerful methodology, used in lots of industry applications