DECISION TREES David Kauchak CS 451 – Fall 2013

Admin
Assignment 1 is available and due on Friday (printed out at the beginning of class)
Door code for MBH632
Keep up with the reading
Videos

Quick refresher from last time…

Representing examples
What is an example? How is it represented?

Features
Each example is represented by a feature vector: f1, f2, f3, …, fn
How our algorithms actually "view" the data
Features are the questions we can ask about the examples

Features
examples:
red, round, leaf, 3oz, …
green, round, no leaf, 4oz, …
yellow, curved, no leaf, 4oz, …
green, curved, no leaf, 5oz, …
How our algorithms actually "view" the data
Features are the questions we can ask about the examples

Classification revisited
examples                          label
red, round, leaf, 3oz, …       → apple
green, round, no leaf, 4oz, …  → apple
yellow, curved, no leaf, 4oz, … → banana
green, curved, no leaf, 5oz, …  → banana
During learning/training/induction, we learn a model/classifier of what distinguishes apples and bananas based on the features

Classification revisited
new example: red, round, no leaf, 4oz, … → model/classifier → predict
The model can then classify a new example based on the features: apple or banana?

Classification revisited
new example: red, round, no leaf, 4oz, … → model/classifier → predicts apple
Why?

Classification revisited
Training data (examples with labels):
red, round, leaf, 3oz, …       → apple
green, round, no leaf, 4oz, …  → apple
yellow, curved, no leaf, 4oz, … → banana
green, curved, no leaf, 5oz, …  → banana
Test set (labels unknown):
red, round, no leaf, 4oz, …    → ?

Learning is about generalizing from the training data
What does this assume about the training and test set?

Past predicts future
Training data → Test set
Not always the case, but we'll often assume it is!

More technically…
We are going to use the probabilistic model of learning
There is some probability distribution over example/label pairs called the data generating distribution
Both the training data and the test set are generated based on this distribution
What is a probability distribution?

Probability distribution
Describes how likely (i.e., probable) certain events are

Probability distribution (over the training data)
High probability: round apples, curved bananas, apples with leaves, …
Low probability: curved apples, red bananas, yellow apples, …

Data generating distribution
Training data and test set are both drawn from the data generating distribution
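To make this setup concrete, here is a toy sketch in Python. The particular distribution (fruit colors and shapes, and all the probabilities) is invented purely for illustration and is not from the slides:

```python
import random

# A toy data generating distribution (invented for illustration):
# a joint distribution over (features, label) pairs.
def sample():
    if random.random() < 0.5:                      # half the fruit are bananas
        shape = "curved" if random.random() < 0.9 else "round"
        return ({"color": "yellow", "shape": shape}, "banana")
    else:                                          # the rest are apples
        shape = "round" if random.random() < 0.9 else "curved"
        color = random.choice(["red", "green"])
        return ({"color": color, "shape": shape}, "apple")

# Training data and test set are both drawn from the SAME distribution;
# that shared origin is what makes "past predicts future" plausible.
training_data = [sample() for _ in range(100)]
test_set = [sample() for _ in range(20)]
```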

A sample data set
Hour  | Weather | Accident | Stall | Commute (label)
8 AM  | Sunny   | No       | No    | Long
8 AM  | Cloudy  | No       | Yes   | Long
10 AM | Sunny   | No       | No    | Short
9 AM  | Rainy   | Yes      | No    | Long
9 AM  | Sunny   | Yes      | Yes   | Long
10 AM | Sunny   | No       | No    | Short
10 AM | Cloudy  | No       | No    | Short
9 AM  | Sunny   | Yes      | No    | Long
10 AM | Cloudy  | Yes      | Yes   | Long
10 AM | Rainy   | No       | No    | Short
8 AM  | Cloudy  | Yes      | No    | Long
9 AM  | Rainy   | No       | No    | Short
New examples to classify: 8 AM, Rainy, Yes, No → ?   10 AM, Rainy, No, No → ?
Can you describe a "model" that could be used to make decisions in general?
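One plausible way to hold this data set in code (the dict-of-features representation and the name commute_data are my choices, not the course's): each example pairs a feature dict with its label.

```python
# The commute data set: a list of (features, label) pairs
commute_data = [
    ({"Hour": "8 AM",  "Weather": "Sunny",  "Accident": "No",  "Stall": "No"},  "Long"),
    ({"Hour": "8 AM",  "Weather": "Cloudy", "Accident": "No",  "Stall": "Yes"}, "Long"),
    ({"Hour": "10 AM", "Weather": "Sunny",  "Accident": "No",  "Stall": "No"},  "Short"),
    ({"Hour": "9 AM",  "Weather": "Rainy",  "Accident": "Yes", "Stall": "No"},  "Long"),
    ({"Hour": "9 AM",  "Weather": "Sunny",  "Accident": "Yes", "Stall": "Yes"}, "Long"),
    ({"Hour": "10 AM", "Weather": "Sunny",  "Accident": "No",  "Stall": "No"},  "Short"),
    ({"Hour": "10 AM", "Weather": "Cloudy", "Accident": "No",  "Stall": "No"},  "Short"),
    ({"Hour": "9 AM",  "Weather": "Sunny",  "Accident": "Yes", "Stall": "No"},  "Long"),
    ({"Hour": "10 AM", "Weather": "Cloudy", "Accident": "Yes", "Stall": "Yes"}, "Long"),
    ({"Hour": "10 AM", "Weather": "Rainy",  "Accident": "No",  "Stall": "No"},  "Short"),
    ({"Hour": "8 AM",  "Weather": "Cloudy", "Accident": "Yes", "Stall": "No"},  "Long"),
    ({"Hour": "9 AM",  "Weather": "Rainy",  "Accident": "No",  "Stall": "No"},  "Short"),
]
```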

Decision trees
Leave At?
  10 AM → Stall?    (No → Short, Yes → Long)
  9 AM  → Accident? (No → Short, Yes → Long)
  8 AM  → Long
A tree with internal nodes labeled by features, branches labeled by tests on that feature, and leaves labeled with classes
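A sketch of one possible encoding of this tree and of the classification procedure; the nested-tuple representation is my choice, and "Hour" plays the role of "Leave At" so the feature names match commute_data above:

```python
# Internal node: (feature, {value: subtree, ...}); leaf: a class label.
commute_tree = ("Hour", {
    "10 AM": ("Stall", {"No": "Short", "Yes": "Long"}),
    "9 AM":  ("Accident", {"No": "Short", "Yes": "Long"}),
    "8 AM":  "Long",
})

def classify(tree, example):
    # Follow the branch matching the example's value for each
    # internal node's feature until we reach a leaf.
    while isinstance(tree, tuple):
        feature, branches = tree
        tree = branches[example[feature]]
    return tree

print(classify(commute_tree, {"Hour": "8 AM", "Weather": "Rainy",
                              "Accident": "Yes", "Stall": "No"}))  # Long
print(classify(commute_tree, {"Hour": "10 AM", "Weather": "Rainy",
                              "Accident": "No", "Stall": "No"}))   # Short
```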

Decision trees
Classify: Leave = 8 AM, Weather = Rainy, Accident = Yes, Stall = No
Leave At = 8 AM → Long

Decision trees
Classify: Leave = 10 AM, Weather = Rainy, Accident = No, Stall = No
Leave At = 10 AM → Stall? = No → Short

To ride or not to ride, that is the question… Build a decision tree
Terrain | Unicycle-type | Weather | Go-For-Ride?
Trail   | Normal   | Rainy | NO
Road    | Normal   | Sunny | YES
Trail   | Mountain | Sunny | YES
Road    | Mountain | Rainy | YES
Trail   | Normal   | Snowy | NO
Road    | Normal   | Rainy | YES
Road    | Mountain | Snowy | YES
Trail   | Normal   | Sunny | NO
Road    | Normal   | Snowy | NO
Trail   | Mountain | Snowy | YES

Recursive approach
Base case: if all data belong to the same class, create a leaf node with that label
Otherwise:
- calculate the "score" for each feature if we used it to split the data
- pick the feature with the highest score, partition the data based on that feature's values, and call recursively (see the sketch below)
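A minimal sketch of this recursion. The slides leave the "score" open at this point, so the choice here, how many training examples a split would classify correctly with majority-label leaves, is an assumption; information gain is another common choice.

```python
from collections import Counter

def split_score(data, feature):
    # Number of training examples a one-level split on `feature`
    # would get right if each branch predicted its majority label.
    correct = 0
    for value in set(ex[feature] for ex, _ in data):
        branch = Counter(lab for ex, lab in data if ex[feature] == value)
        correct += branch.most_common(1)[0][1]
    return correct

def build_tree(data, features):
    labels = [lab for _, lab in data]
    # Base case: all examples share a class (or nothing left to split on)
    if len(set(labels)) == 1 or not features:
        return Counter(labels).most_common(1)[0][0]
    # Pick the highest-scoring feature, partition, and recurse
    best = max(features, key=lambda f: split_score(data, f))
    rest = [f for f in features if f != best]
    return (best, {
        value: build_tree([(ex, lab) for ex, lab in data
                           if ex[best] == value], rest)
        for value in set(ex[best] for ex, _ in data)
    })
```

With the commute data from earlier, build_tree(commute_data, ["Hour", "Weather", "Accident", "Stall"]) produces a tree consistent with the one shown above, though ties between equally good features may be broken differently.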

Partitioning the data (the ride data from the previous slide)
Terrain:  Road → YES: 4, NO: 1   Trail → YES: 2, NO: 3
Unicycle: Mountain → YES: 4, NO: 0   Normal → YES: 2, NO: 4
Weather:  Rainy → YES: 2, NO: 1   Sunny → YES: 2, NO: 1   Snowy → YES: 2, NO: 2
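These counts can be reproduced mechanically. A sketch, encoding the ride data the same way as the commute data; the names ride_data and partition_counts are my own:

```python
from collections import Counter

ride_data = [
    ({"Terrain": "Trail", "Unicycle": "Normal",   "Weather": "Rainy"}, "NO"),
    ({"Terrain": "Road",  "Unicycle": "Normal",   "Weather": "Sunny"}, "YES"),
    ({"Terrain": "Trail", "Unicycle": "Mountain", "Weather": "Sunny"}, "YES"),
    ({"Terrain": "Road",  "Unicycle": "Mountain", "Weather": "Rainy"}, "YES"),
    ({"Terrain": "Trail", "Unicycle": "Normal",   "Weather": "Snowy"}, "NO"),
    ({"Terrain": "Road",  "Unicycle": "Normal",   "Weather": "Rainy"}, "YES"),
    ({"Terrain": "Road",  "Unicycle": "Mountain", "Weather": "Snowy"}, "YES"),
    ({"Terrain": "Trail", "Unicycle": "Normal",   "Weather": "Sunny"}, "NO"),
    ({"Terrain": "Road",  "Unicycle": "Normal",   "Weather": "Snowy"}, "NO"),
    ({"Terrain": "Trail", "Unicycle": "Mountain", "Weather": "Snowy"}, "YES"),
]

def partition_counts(data, feature):
    # Label counts for each value of `feature`
    counts = {}
    for ex, lab in data:
        counts.setdefault(ex[feature], Counter())[lab] += 1
    return counts

for f in ["Terrain", "Unicycle", "Weather"]:
    print(f, partition_counts(ride_data, f))
# Prints the counts from the slide, e.g.
# Terrain {'Trail': Counter({'NO': 3, 'YES': 2}), 'Road': Counter({'YES': 4, 'NO': 1})}
```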

Partitioning the data
Calculate the "score" for each feature if we used it to split the data (counts above)
What score should we use? If we just stopped here, which tree would be best?
How could we make these into decision trees?

Decision trees
Each split becomes a one-level decision tree: predict the majority label at each leaf
Terrain:  Road → YES (4/1)   Trail → NO (2/3)
Unicycle: Mountain → YES (4/0)   Normal → NO (2/4)
Weather:  Rainy → YES (2/1)   Sunny → YES (2/1)   Snowy → tie (2/2)

Decision trees
Training error: the average error over the training set
For classification, the most common "error" is the number of mistakes
Training error for each of these one-level trees?

Decision trees
Training error: the average error over the training set
Terrain: 3/10   Unicycle: 2/10   Weather: 4/10
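A sketch of that computation, reusing ride_data from the earlier snippet: the training error of a one-level tree is the fraction of examples whose branch's majority label disagrees with them.

```python
from collections import Counter

def stump_training_error(data, feature):
    # Mistakes made by a one-level tree: in each branch, everything
    # that is not the majority label is an error.
    mistakes = 0
    for value in set(ex[feature] for ex, _ in data):
        branch = Counter(lab for ex, lab in data if ex[feature] == value)
        mistakes += sum(branch.values()) - branch.most_common(1)[0][1]
    return mistakes / len(data)

for f in ["Terrain", "Unicycle", "Weather"]:
    print(f, stump_training_error(ride_data, f))
# Terrain 0.3   Unicycle 0.2   Weather 0.4   (i.e. 3/10, 2/10, 4/10)
```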

Recurse
Split on Unicycle-type: Mountain → YES: 4, NO: 0; Normal → YES: 2, NO: 4
Normal partition:
Trail | Normal | Rainy | NO
Road  | Normal | Sunny | YES
Trail | Normal | Snowy | NO
Road  | Normal | Rainy | YES
Trail | Normal | Sunny | NO
Road  | Normal | Snowy | NO
Mountain partition:
Trail | Mountain | Sunny | YES
Road  | Mountain | Rainy | YES
Road  | Mountain | Snowy | YES
Trail | Mountain | Snowy | YES

Recurse
The Mountain branch is pure (YES: 4, NO: 0), so it becomes a YES leaf
Split the Normal partition on each remaining feature:
Terrain: Road → YES: 2, NO: 1   Trail → YES: 0, NO: 3
Weather: Rainy → YES: 1, NO: 1   Sunny → YES: 1, NO: 1   Snowy → YES: 0, NO: 2

Recurse
Training error on the Normal partition: Terrain: 1/6   Weather: 2/6
Terrain has the lower training error, so split on Terrain

Recurse
Tree so far: Unicycle-type? Mountain → YES (4/0); Normal → Terrain? Road → YES: 2, NO: 1; Trail → YES: 0, NO: 3
Remaining data to split (Normal ∧ Road):
Road | Normal | Sunny | YES
Road | Normal | Rainy | YES
Road | Normal | Snowy | NO

Recurse
The Trail branch is pure (YES: 0, NO: 3), so it becomes a NO leaf
Split the remaining Normal ∧ Road examples on Weather:
Rainy → YES: 1, NO: 0   Sunny → YES: 1, NO: 0   Snowy → YES: 0, NO: 1
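The transcript stops here, but the last step is forced by the data: the Weather split on the Normal ∧ Road examples is pure in every branch. In the nested encoding used earlier, the finished tree would be:

```python
ride_tree = ("Unicycle", {
    "Mountain": "YES",                       # 4 YES, 0 NO: pure leaf
    "Normal": ("Terrain", {
        "Trail": "NO",                       # 0 YES, 3 NO: pure leaf
        "Road": ("Weather", {
            "Rainy": "YES",
            "Sunny": "YES",
            "Snowy": "NO",
        }),
    }),
})

# This tree makes zero mistakes on the ten training examples, e.g.
# classify(ride_tree, {"Terrain": "Road", "Unicycle": "Normal",
#                      "Weather": "Snowy"}) == "NO"
```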