Decision Tree Concept of Decision Tree


Decision Tree: Concept of Decision Tree. A decision tree is a tree-like graph used for classification. It is built through recursive partitioning of the data and consists of a root node, internal nodes, links, and leaves.
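The structure described above can be sketched in Python. This is a minimal illustration, assuming each internal node tests one attribute and each link carries one attribute value. The names `Node`, `Leaf`, and `classify`, and the two-level toy tree, are hypothetical and not taken from the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Leaf:
    label: str  # class prediction stored at the leaf


@dataclass
class Node:
    attribute: str  # attribute tested at this root or internal node
    # One link per attribute value, each leading to a subtree or a leaf.
    children: Dict[str, Union["Node", Leaf]] = field(default_factory=dict)


def classify(tree: Union[Node, Leaf], record: Dict[str, str]) -> str:
    """Follow links from the root until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.children[record[tree.attribute]]
    return tree.label


# Hypothetical two-level tree, for illustration only.
toy_tree = Node("windy", {
    "true": Leaf("do not play golf"),
    "false": Node("outlook", {
        "sunny": Leaf("play golf"),
        "overcast": Leaf("play golf"),
        "rain": Leaf("do not play golf"),
    }),
})

print(classify(toy_tree, {"windy": "false", "outlook": "sunny"}))
```

Classification is a single walk from the root to one leaf, so its cost grows with the depth of the tree rather than with the number of training records.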

An Example of ‘Play Golf’ or ‘Not’
Input variables
- Outlook: rain, overcast, sunny
- Temperature: number
- Humidity: number
- Windy: true, false
Decision
- play golf
- do not play golf

Decision Tree from the Data

1st round: Group data roughly

The final grouping of data with rules

Training Dataset This follows an example from Quinlan’s ID3
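ID3 chooses the attribute to split on by information gain, which is the reduction in entropy of the class labels after partitioning on that attribute. The following is a minimal sketch, assuming records are stored as dictionaries. The function names and the four toy records are hypothetical and are not the training dataset from the slide.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(records, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting."""
    base = entropy([r[target] for r in records])
    groups = {}
    for r in records:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(records) * entropy(g)
                    for g in groups.values())
    return base - remainder


# Hypothetical toy records, for illustration only.
toy = [
    {"student": "yes", "credit": "fair", "buys": "yes"},
    {"student": "yes", "credit": "excellent", "buys": "yes"},
    {"student": "no", "credit": "fair", "buys": "no"},
    {"student": "no", "credit": "excellent", "buys": "no"},
]

print(information_gain(toy, "student", "buys"))  # split separates the classes
print(information_gain(toy, "credit", "buys"))   # split leaves classes mixed
```

In this toy data `student` separates the two classes completely while `credit` does not separate them at all, so ID3 would pick `student` for the root test.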

Output: A Decision Tree for Credit Approval
age?
  <=30: student?
    no: no
    yes: yes
  31..40: yes
  >40: credit rating?
    excellent: yes
    fair: no

Extracting Classification Rules from Trees
- Represent the knowledge in the form of IF-THEN rules
- One rule is created for each path from the root to a leaf
- Each attribute-value pair along a path forms a conjunction
- The leaf node holds the class prediction
- Rules are easier for humans to understand
Example
IF age = “<=30” AND student = “no” THEN buys_computer = “no”
IF age = “<=30” AND student = “yes” THEN buys_computer = “yes”
IF age = “31…40” THEN buys_computer = “yes”
IF age = “>40” AND credit_rating = “excellent” THEN buys_computer = “yes”
IF age = “>40” AND credit_rating = “fair” THEN buys_computer = “no”
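The path-to-rule procedure above can be sketched as a short recursive walk. This is an illustration under stated assumptions: the tree is encoded as a nested dictionary (an assumed layout, not a See5 format), the function name `extract_rules` is hypothetical, and the middle age range is written `31..40` with plain ASCII characters.

```python
def extract_rules(tree, target, conditions=()):
    """Yield one IF-THEN rule string per root-to-leaf path."""
    if not isinstance(tree, dict):  # leaf: holds the class prediction
        clause = " AND ".join(f'{a} = "{v}"' for a, v in conditions)
        yield f'IF {clause} THEN {target} = "{tree}"'
        return
    (attribute, branches), = tree.items()  # one tested attribute per node
    for value, subtree in branches.items():
        # Each attribute-value pair on the path joins the conjunction.
        yield from extract_rules(subtree, target,
                                 conditions + ((attribute, value),))


# Tree encoded from the five rules on the slide (nested-dict layout assumed).
tree = {"age": {
    "<=30": {"student": {"no": "no", "yes": "yes"}},
    "31..40": "yes",
    ">40": {"credit_rating": {"excellent": "yes", "fair": "no"}},
}}

for rule in extract_rules(tree, "buys_computer"):
    print(rule)
```

The walk produces one rule per leaf, so a tree with five leaves gives exactly five rules.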

An Example of ‘Car Buyers’
Columns: no, Job, M/F, Area, Age, Y/N
1 NJ M N 35 2 F 51 3 OW 31 Y 4 EM 38 5 S 33 6 54 7 49 8 32 9 10 11 12 50 13 36 14
* (a,b,c) means a: total # of records, b: ‘N’ counts, c: ‘Y’ counts

Lab on Decision Tree(1) Tools: SPSS Clementine, SAS Enterprise Miner, See5/C5.0. Download the See5/C5.0 2.02 Evaluation version from http://www.rulequest.com

Lab on Decision Tree(2) From the initial screen, choose File – Locate Data.

Lab on Decision Tree(3) Select housing.data from the Samples folder and click Open.

Lab on Decision Tree(4) This data set is about deciding house prices in the Boston area. It has 350 cases and 13 variables.

Lab on Decision Tree (5)
Input variables
- crime rate
- proportion large lots: residential space
- proportion industrial: ratio of commercial area
- CHAS: dummy variable
- nitric oxides ppm: pollution rate in ppm
- av rooms per dwelling: number of rooms per dwelling
- proportion pre-1940
- distance to employment centers: distance to the center of the city
- accessibility to radial highways: accessibility to highways
- property tax rate per $10,000
- pupil-teacher ratio: teachers’ rate
- B: racial statistics
- percentage low income earners: ratio of low income people
Decision variable
- Top 20%, Bottom 80%
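The decision variable divides cases into the top 20% and the bottom 80%. One way such a label could be derived from a numeric house price is sketched below. The function name `label_top_20`, the tie handling, and the ten sample prices are assumptions for illustration and are not taken from the See5 sample data.

```python
def label_top_20(prices):
    """Label the highest fifth of prices 'Top 20%' and the rest 'Bottom 80%'."""
    n_top = max(1, round(len(prices) * 0.2))
    # Indices of the n_top largest prices; ties keep their original order.
    ranked = sorted(range(len(prices)), key=lambda i: prices[i], reverse=True)
    top = set(ranked[:n_top])
    return ["Top 20%" if i in top else "Bottom 80%"
            for i in range(len(prices))]


# Hypothetical prices, for illustration only.
prices = [24.0, 21.6, 34.7, 33.4, 36.2, 28.7, 22.9, 27.1, 16.5, 18.9]
for price, label in zip(prices, label_top_20(prices)):
    print(price, label)
```

Turning the numeric price into two classes is what lets a classification tree, rather than a regression model, be applied to this data.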

Lab on Decision Tree(6) For the analysis, click Construct Classifier, or choose Construct Classifier from the File menu.

Lab on Decision Tree(7) Check the Global pruning option (✓). Then click OK.

Lab on Decision Tree(8) The output shows the decision tree, the evaluation with training data, and the evaluation with test data.
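Evaluating with both training and test data means measuring how often the classifier is wrong on the cases it was built from and on held-out cases. The following is a minimal sketch under stated assumptions: `error_rate`, the stand-in classifier `predict`, its threshold of 7 rooms, and the sample cases are all hypothetical and are not See5 output.

```python
def error_rate(predict, cases):
    """Fraction of (features, label) cases that `predict` misclassifies."""
    errors = sum(1 for features, label in cases if predict(features) != label)
    return errors / len(cases)


def predict(features):
    # Hypothetical one-split classifier on average rooms per dwelling.
    return "Top 20%" if features["rooms"] > 7.0 else "Bottom 80%"


# Hypothetical cases, for illustration only.
train = [({"rooms": 7.5}, "Top 20%"), ({"rooms": 6.0}, "Bottom 80%"),
         ({"rooms": 5.8}, "Bottom 80%"), ({"rooms": 8.1}, "Top 20%")]
test = [({"rooms": 6.9}, "Top 20%"), ({"rooms": 6.2}, "Bottom 80%")]

print("training error:", error_rate(predict, train))
print("test error:", error_rate(predict, test))
```

A test error well above the training error suggests the tree has overfitted the training cases, which is the problem pruning is meant to reduce.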

Lab on Decision Tree(9) Understanding the picture: we can see that av rooms per dwelling is the most important variable in deciding house price.

Lab on Decision Tree(11) It is hard to read the rules from the decision tree picture. To view the rules, close the current screen and click Construct Classifier again, or choose Construct Classifier from the File menu.

Lab on Decision Tree(12) Choose/click Rulesets. Then click OK.

Lab on Decision Tree(13)