Presentation on Decision Trees. Presented to: Sir Maroof Pasha.


Group members: Kiran Shakoor, Nazish Yaqoob, Razeena Ameen, Ayesha Yaseen, Nudrat Rehman (07-47)

Decision Trees
Rules for classifying data using attributes.
The tree consists of decision nodes and leaf nodes.
A decision node has two or more branches, each branch representing a value of the attribute tested.
A leaf node represents a classification and does not require additional testing.

The Process of Constructing a Decision Tree
Select an attribute to place at the root of the decision tree and make one branch for every possible value.
Repeat the process recursively for each branch.
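A minimal Python sketch of this recursive process (illustrative code, not from the slides); the attribute-selection step is left as a parameter here, and ID3's information-gain criterion, introduced later in this presentation, is one way to fill it in:

    def build_tree(rows, attributes, target, choose_attribute):
        # Grow a tree from a list of example dicts: pick an attribute for the root,
        # make one branch per value, and repeat recursively on each branch.
        labels = [row[target] for row in rows]
        if len(set(labels)) == 1 or not attributes:      # nothing left to split on
            return max(set(labels), key=labels.count)    # leaf: (majority) class
        best = choose_attribute(rows, attributes, target)
        tree = {best: {}}
        for value in set(row[best] for row in rows):     # one branch for every possible value
            subset = [row for row in rows if row[best] == value]
            rest = [a for a in attributes if a != best]
            tree[best][value] = build_tree(subset, rest, target, choose_attribute)
        return tree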

Decision Trees Example
The example tree tests Car Type at the root: the Minivan branch leads to YES, while the Sports/Truck branch tests Age, giving YES for Age < 30 and NO for Age >= 30. (The original slide also showed the same split as a plot of Age, from 0 to 60, against Car Type.)

ID code  Outlook   Temperature  Humidity  Windy  Play
a        Sunny     Hot          High      False  No
b        Sunny     Hot          High      True   No
c        Overcast  Hot          High      False  Yes
d        Rainy     Mild         High      False  Yes
e        Rainy     Cool         Normal    False  Yes
f        Rainy     Cool         Normal    True   No
g        Overcast  Cool         Normal    True   Yes
h        Sunny     Mild         High      False  No
i        Sunny     Cool         Normal    False  Yes
j        Rainy     Mild         Normal    False  Yes
k        Sunny     Mild         Normal    True   Yes
l        Overcast  Mild         High      True   Yes
m        Overcast  Hot          Normal    False  Yes
n        Rainy     Mild         High      True   No
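For use in the Python sketches later in this transcript, the same table can be written as a hypothetical list of dicts (weather_rows is an illustrative name, not from the slides):

    # The weather table above as Python data.
    ids      = list("abcdefghijklmn")
    outlook  = ["Sunny", "Sunny", "Overcast", "Rainy", "Rainy", "Rainy", "Overcast",
                "Sunny", "Sunny", "Rainy", "Sunny", "Overcast", "Overcast", "Rainy"]
    temp     = ["Hot", "Hot", "Hot", "Mild", "Cool", "Cool", "Cool",
                "Mild", "Cool", "Mild", "Mild", "Mild", "Hot", "Mild"]
    humidity = ["High", "High", "High", "High", "Normal", "Normal", "Normal",
                "High", "Normal", "Normal", "Normal", "High", "Normal", "High"]
    windy    = [False, True, False, False, False, True, True,
                False, False, False, True, True, False, True]
    play     = ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
                "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"]

    columns = ["ID code", "Outlook", "Temperature", "Humidity", "Windy", "Play"]
    weather_rows = [dict(zip(columns, row))
                    for row in zip(ids, outlook, temp, humidity, windy, play)]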

Decision tree Example

Advantages of decision trees
Simple to understand and interpret.
Able to handle both numerical and categorical data.
Possible to validate a model using statistical tests.

Nudrat Rehman 07-47

What is ID3?
A mathematical algorithm for building the decision tree.
Invented by J. Ross Quinlan.
Uses the information theory introduced by Shannon in 1948.
Builds the tree from the top down, with no backtracking.
Information gain is used to select the most useful attribute for classification.

Entropy
A formula to calculate the homogeneity of a sample.
A completely homogeneous sample has entropy of 0.
An equally divided sample has entropy of 1.
For a sample of positive and negative elements: Entropy(S) = -p+ log2(p+) - p- log2(p-).
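A small Python helper for this two-class formula (the function and variable names are mine, for illustration):

    import math

    def entropy(p_pos, p_neg):
        # Two-class entropy from the proportions of positive and negative examples.
        return sum(-p * math.log2(p) for p in (p_pos, p_neg) if p > 0)  # 0 * log2(0) taken as 0

    print(entropy(1.0, 0.0))              # completely homogeneous sample -> 0.0
    print(entropy(0.5, 0.5))              # equally divided sample -> 1.0
    print(round(entropy(9/14, 5/14), 3))  # the 9-Yes / 5-No weather data -> 0.94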

The formula for entropy in the general case, for c classes with proportions p1, ..., pc: Entropy(S) = -Σ pi log2(pi).

Entropy Example
For the weather data above (9 Yes, 5 No):
Entropy(S) = -(9/14) log2(9/14) - (5/14) log2(5/14) = 0.940

Information Gain (IG)
Information gain is based on the decrease in entropy after a dataset is split on an attribute: which attribute creates the most homogeneous branches?
First the entropy of the total dataset is calculated.
The dataset is then split on the different attributes.
The entropy for each branch is calculated, then added proportionally to get the total entropy for the split.
The resulting entropy is subtracted from the entropy before the split.
The result is the information gain, or decrease in entropy.
The attribute that yields the largest IG is chosen for the decision node.
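The same procedure as a short Python sketch, assuming each example is a dict mapping attribute names to values (the names are illustrative):

    import math
    from collections import Counter

    def entropy(labels):
        # Entropy of a list of class labels.
        n = len(labels)
        return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

    def information_gain(rows, attribute, target):
        # Entropy of the whole dataset minus the weighted entropy of each branch.
        before = entropy([row[target] for row in rows])
        after = 0.0
        for value in set(row[attribute] for row in rows):
            branch = [row[target] for row in rows if row[attribute] == value]
            after += len(branch) / len(rows) * entropy(branch)   # weight by branch size
        return before - after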

Information Gain (cont'd)
A branch set with entropy of 0 is a leaf node.
Otherwise, the branch needs further splitting to classify its dataset.
The ID3 algorithm is run recursively on the non-leaf branches, until all data is classified.
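In code terms, ID3 is the generic construction sketch from earlier with information gain as the attribute selector. The snippet below is only a sketch and assumes build_tree, information_gain and weather_rows from the previous sketches have already been defined:

    def choose_by_gain(rows, attributes, target):
        # ID3's selection rule: pick the attribute with the largest information gain.
        return max(attributes, key=lambda a: information_gain(rows, a, target))

    # ID3 on the weather data ("ID code" is left out of the candidate attributes).
    tree = build_tree(weather_rows, ["Outlook", "Temperature", "Humidity", "Windy"],
                      "Play", choose_by_gain)
    print(tree)   # on this data, Outlook yields the largest gain and becomes the root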

Nazish Yaqoob 07-11

Example: The Simpsons

Person  Hair Length  Weight  Age  Class
Homer   0"           250     36   M
Marge   10"          150     34   F
Bart    2"           90      10   M
Lisa    6"           78      8    F
Maggie  4"           20      1    F
Abe     1"           170     70   M
Selma   8"           160     41   F
Otto    10"          180     38   M
Krusty  6"           200     45   M
Comic   8"           290     38   ?

Let us try splitting on Hair Length.
Hair Length <= 5? (yes: 1F, 3M; no: 3F, 2M)
Entropy(4F, 5M) = -(4/9) log2(4/9) - (5/9) log2(5/9) = 0.9911
Entropy(1F, 3M) = -(1/4) log2(1/4) - (3/4) log2(3/4) = 0.8113
Entropy(3F, 2M) = -(3/5) log2(3/5) - (2/5) log2(2/5) = 0.9710
Gain(Hair Length <= 5) = 0.9911 - (4/9 * 0.8113 + 5/9 * 0.9710) = 0.0911

Let us try splitting on Weight.
Weight <= 160? (yes: 4F, 1M; no: 0F, 4M)
Entropy(4F, 5M) = -(4/9) log2(4/9) - (5/9) log2(5/9) = 0.9911
Entropy(4F, 1M) = -(4/5) log2(4/5) - (1/5) log2(1/5) = 0.7219
Entropy(0F, 4M) = -(0/4) log2(0/4) - (4/4) log2(4/4) = 0
Gain(Weight <= 160) = 0.9911 - (5/9 * 0.7219 + 4/9 * 0) = 0.5900

Let us try splitting on Age.
Age <= 40? (yes: 3F, 3M; no: 1F, 2M)
Entropy(4F, 5M) = -(4/9) log2(4/9) - (5/9) log2(5/9) = 0.9911
Entropy(3F, 3M) = -(3/6) log2(3/6) - (3/6) log2(3/6) = 1
Entropy(1F, 2M) = -(1/3) log2(1/3) - (2/3) log2(2/3) = 0.9183
Gain(Age <= 40) = 0.9911 - (6/9 * 1 + 3/9 * 0.9183) = 0.0183
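These three gains can be checked with a few lines of Python; the dataset is hard-coded from the table above and the helper names are mine:

    import math

    def entropy(counts):
        # Entropy from class counts, e.g. [4, 5] for 4 F and 5 M.
        n = sum(counts)
        return -sum(c / n * math.log2(c / n) for c in counts if c > 0)

    # (hair length in inches, weight, age, class) for the nine labelled people
    data = [(0, 250, 36, "M"), (10, 150, 34, "F"), (2, 90, 10, "M"),
            (6, 78, 8, "F"),   (4, 20, 1, "F"),    (1, 170, 70, "M"),
            (8, 160, 41, "F"), (10, 180, 38, "M"), (6, 200, 45, "M")]

    def gain(test):
        # Information gain of a yes/no test over the whole sample.
        def ent(rows):
            females = sum(1 for r in rows if r[3] == "F")
            return entropy([females, len(rows) - females])
        yes = [r for r in data if test(r)]
        no  = [r for r in data if not test(r)]
        return ent(data) - (len(yes) / len(data) * ent(yes) + len(no) / len(data) * ent(no))

    print(round(gain(lambda r: r[0] <= 5), 4))    # Hair Length <= 5 -> 0.0911
    print(round(gain(lambda r: r[1] <= 160), 4))  # Weight <= 160    -> 0.59 (0.5900 on the slide)
    print(round(gain(lambda r: r[2] <= 40), 4))   # Age <= 40        -> 0.0183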

Of the 3 features we had, Weight was best. But while people who weigh over 160 are perfectly classified (as males), the under-160 people are not perfectly classified. So we simply recurse: within the Weight <= 160 branch we find we can split on Hair Length <= 2, and we are done!

The resulting tree: Weight <= 160? If no, Male; if yes, test Hair Length <= 2 (yes: Male, no: Female). We don't need to keep the data around, just the test conditions. How would these people be classified?

It is trivial to convert decision trees to rules. Reading the tree above (Weight <= 160?, then Hair Length <= 2?) gives:
Rules to Classify Males/Females
If Weight greater than 160, classify as Male.
Elseif Hair Length less than or equal to 2, classify as Male.
Else classify as Female.
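Translated literally into Python (a sketch; the function and argument names are mine):

    def classify(weight, hair_length):
        # The decision tree above, read off as if/elif/else rules.
        if weight > 160:
            return "Male"
        elif hair_length <= 2:
            return "Male"
        else:
            return "Female"

    print(classify(weight=290, hair_length=8))   # Comic from the table -> Male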