Decision trees MARIO REGIN

What is a decision tree?
A general-purpose prediction and classification mechanism. Decision trees emerged alongside the nascent fields of artificial intelligence and statistical computing.

Nomenclature
[Tree diagram labeling the parts of a tree: the root node (node 1: 1,309 cases, 38.2% / 61.8%), a branch on the input Gender, and the next level of nodes (node 23, Female: 466 cases, 72.7% / 27.3%; node 24, Male: 843 cases, 19.1% / 80.9%).]

Main Characteristic
Decision trees recursively subset a target field of data according to the values of associated input fields (predictors). Each partition creates descendant data subsets whose target values are progressively more similar within a node and progressively more dissimilar between nodes at any given level of the tree. In other words, decision trees reduce the disorder of the data as we move from higher to lower levels of the tree.

Creating a Decision Tree
1. Create the root node.
2. Search the data set for the best partitioning input for the current node.
3. Use the best input to partition the current node into branches.
4. Repeat steps 2 and 3 until one or more stop conditions are met.
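
A minimal sketch of this loop in Python, under assumptions of my own: rows are tuples whose last element is the target, inputs are column indices, and Gini impurity stands in for the disorder measure (the entropy used later in the deck would work equally well):

```python
from collections import Counter, defaultdict

def gini(rows):
    """Disorder of a node: 1 minus the sum of squared class proportions."""
    counts = Counter(row[-1] for row in rows)
    return 1.0 - sum((c / len(rows)) ** 2 for c in counts.values())

def partition(rows, col):
    """Group the rows of a node by their value in one input column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[col]].append(row)
    return groups

def best_input(rows, cols):
    """Step 2: the input whose partition leaves the least weighted disorder."""
    def remaining_disorder(col):
        groups = partition(rows, col).values()
        return sum(len(g) / len(rows) * gini(g) for g in groups)
    return min(cols, key=remaining_disorder)

def build_tree(rows, cols, min_rows=2):
    """Steps 1-4: create a node, branch on the best input, recurse."""
    if len(rows) < min_rows or not cols or gini(rows) == 0.0:  # stop conditions
        return Counter(row[-1] for row in rows).most_common(1)[0][0]
    col = best_input(rows, cols)                               # step 2
    return (col, {value: build_tree(group, [c for c in cols if c != col], min_rows)
                  for value, group in partition(rows, col).items()})  # steps 3-4
```

Called as `build_tree(rows, cols=[0, 1])`, this returns a nested (input, branches) structure whose leaves hold the majority class of the cases that reached them.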

Best Input
The selection of the "best" input field is an open subject of active research, and decision trees allow a variety of computational approaches to input selection. Two possible approaches:
- High-performance predictive model approach.
- Relationship analysis approach.

High-Performance Predictive Model Approach
Choose the input that produces the most separation in the variability among the descendant nodes. The emphasis is on selecting high-quality partitions that collectively produce the best overall model.
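
One concrete way to score that separation is information gain: the entropy of the parent node minus the case-weighted entropy of its children. A minimal sketch; the 8-case, 4/3/1 split is taken from the construction example later in the deck, while the function names are my own:

```python
import math

def entropy(labels):
    """H = -sum(p_i * log2(p_i)) over the class proportions."""
    probs = (labels.count(c) / len(labels) for c in set(labels))
    return -sum(p * math.log2(p) for p in probs)

def information_gain(parent, children):
    """Drop in entropy from a parent node to its child partitions."""
    weighted = sum(len(c) / len(parent) * entropy(c) for c in children)
    return entropy(parent) - weighted

parent = [1, 1, 1, 0, 0, 0, 0, 0]          # 8 cases, 37.5% class 1
children = [[1, 1, 0, 0], [0, 0, 0], [1]]  # the 4 / 3 / 1 three-way split
print(information_gain(parent, children))  # ~0.454 bits
```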

Relationship Analysis Approach
Guide the branching to analyze, discover, support, or confirm the conditional relations that are assumed to exist among the various inputs and the component nodes they produce. The emphasis is on analyzing the interactions that form the tree.

Stopping Rule
Generally, stopping rules consist of thresholds on diminishing returns (in terms of test statistics) or on a diminishing supply of training cases (a minimum acceptable number of observations in a node).
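
In practice these thresholds are usually exposed as hyperparameters. A minimal sketch using scikit-learn (an assumption: the slides name no library, and the toy data is invented):

```python
from sklearn.tree import DecisionTreeClassifier

X = [[25, 0], [8, 1], [40, 0], [3, 1], [52, 0], [16, 1]]  # toy inputs
y = [0, 1, 0, 1, 0, 1]                                    # toy target

clf = DecisionTreeClassifier(
    max_depth=3,                 # stop once the tree is 3 levels deep
    min_samples_split=4,         # do not split nodes with fewer than 4 cases
    min_samples_leaf=2,          # every leaf must keep at least 2 cases
    min_impurity_decrease=0.01,  # demand a minimum return from each split
)
clf.fit(X, y)
```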

Tree Construction

Tree Construction
[Diagram: eight training cases (37.5% / 62.5% at the root) with three candidate inputs, Cast Shadow, Likes Garlic, and Complexion, for the target Vampire; each candidate split is scored (.5, .6, .68) from the Yes/No counts of its branches (?, Yes, No for the first two; Pale, Average, Healthy for Complexion).]

Tree Construction
[Diagram: the root node (node 1: 8 cases, 37.5% / 62.5%) is partitioned three ways into branches ?, N, and Y, yielding node 2 (?: 4 cases, 50.0% / 50.0%), node 3 (N: 3 cases, 0.0% / 100%), and node 4 (Y: 1 case, 100% / 0.0%).]

Tree Construction
[Diagram: within the remaining impure node, the surviving candidate inputs Likes Garlic and Complexion are scored from the Yes/No counts of their branches; one candidate scores .5.]

Tree Construction
[Diagram: the finished tree. The root (8 cases, 37.5% / 62.5%) splits into nodes 2 (4 cases, 50.0% / 50.0%), 3 (3 cases, 0.0% / 100%), and 4 (1 case, 100% / 0.0%); node 2 is then split on Garlic into node 5 (Y: 2 cases, 0.0% / 100%) and node 6 (N: 2 cases, 100% / 0.0%), leaving every leaf pure.]

Evaluation

Survival in the Titanic Sinking
[Tree diagram, reconstructed:]
Node 1 (root): 1,309 passengers, 38.2% survived / 61.8% did not
- Gender = Female -> node 23: 466 passengers, 72.7% survived
  - Age < 30.75 or missing -> node 30: 314 passengers, 67.8% survived
  - Age ≥ 30.75 -> node 31: 152 passengers, 82.9% survived
- Gender = Male -> node 24: 843 passengers, 19.1% survived
  - Age < 9.5 -> node 33: 43 passengers, 58.1% survived
  - Age ≥ 9.5 or missing -> node 34: 800 passengers, 17.0% survived
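
Read as prediction rules, the tree above reduces to a few nested comparisons. A minimal sketch (the function name and the use of None for a missing age are my own):

```python
def titanic_survival_rate(gender, age=None):
    """Leaf survival rates from the tree above; age=None means missing."""
    if gender == "female":
        if age is None or age < 30.75:
            return 0.678          # node 30
        return 0.829              # node 31
    if age is not None and age < 9.5:
        return 0.581              # node 33
    return 0.170                  # node 34

print(titanic_survival_rate("female", 25))  # 0.678
print(titanic_survival_rate("male"))        # 0.170 (missing age)
```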

Relationship
[The Titanic tree annotated: the Gender split reflects "women first" and the Age < 9.5 split on the male branch reflects "children first"; note the unequal partition in age, with cut-points of 30.75 on the female branch and 9.5 on the male branch.]

Characteristics of Decision Trees
- Successive partitioning results in a tree-like visual display with a top node and descendant branches.
- Branch partitions may be two-way or multiway.
- Partitioning fields may be at nominal, ordinal, or interval measurement levels.
- The final result can be a class or a number.

Characteristics of Decision Trees
- Missing values: can be grouped with other values or given their own partition.
- Symmetry: descendant nodes can be balanced and symmetrical, employing a matching set of predictors at each level of the subtree.
- Asymmetry: descendant nodes can be unbalanced, in that subnode partitions may be based on the most powerful predictor for each individual node.

Advantages of Decision Trees
- The created trees can detect and visually present contextual effects.
- They are easy to understand: the resulting model is a white box.
- Flexibility: cut-points for the same input can differ from node to node.
- Missing values are allowed.
- Both numerical and nominal data can be used as input.
- Output can be nominal or numerical.

Disadvantages of Decision Trees
- Deep trees tend to overfit the training data, which leads to poor generalization.

Multiple Trees
Construct a multitude of decision trees at training time and output the class that is the mode of the individual trees' classes (classification) or their mean prediction (regression).

Creation Method
The idea is to create multiple uncorrelated trees:
1. Select a random subset of the training set.
2. Create a tree Ti with this random subset.
3. When creating the partitions of this tree, use only a random subset of the inputs when searching for the best input.
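
This recipe (bootstrap samples plus random input subsets per split) is what scikit-learn's RandomForestClassifier implements, so a minimal sketch only needs its constructor arguments (the toy data is invented):

```python
from sklearn.ensemble import RandomForestClassifier

X = [[25, 0], [8, 1], [40, 0], [3, 1], [52, 0], [16, 1]]
y = [0, 1, 0, 1, 0, 1]

forest = RandomForestClassifier(
    n_estimators=100,     # create 100 trees T_i
    bootstrap=True,       # each tree sees a random subset of the training set
    max_features="sqrt",  # each split searches a random subset of the inputs
)
forest.fit(X, y)
```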

Evaluation
If the output is a class (classification):
- Evaluate the sample in all of the trees.
- Each tree votes for one class.
- The selected class is the one with the most votes.
If the output is a number (regression):
- The final result is the average of the individual results.
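
A minimal sketch of the two combination rules; each "tree" is stood in for by a plain function so the voting and averaging stay visible:

```python
from statistics import mean, mode

def predict_class(trees, x):
    """Classification: every tree votes; return the most voted class."""
    return mode(tree(x) for tree in trees)

def predict_value(trees, x):
    """Regression: the final result is the average of the tree outputs."""
    return mean(tree(x) for tree in trees)

trees = [lambda x: int(x > 2), lambda x: int(x > 4), lambda x: int(x > 3)]
print(predict_class(trees, 5))  # 1 (three votes out of three)
print(predict_value(trees, 3))  # 0.333... (average of 1, 0, 0)
```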

Extra. Computation of Entropy
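
The transcript ends here; as a minimal worked example of the entropy the earlier splits were scored with, using the construction example's root node (8 cases, 37.5% / 62.5%):

```latex
H(S) = -\sum_{i=1}^{k} p_i \log_2 p_i
     = -\left(0.375 \log_2 0.375 + 0.625 \log_2 0.625\right) \approx 0.954 \text{ bits}
```

A pure node has H = 0 and an even 50/50 node has H = 1 bit, so each split chosen during construction drives H toward 0.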