Decision Tree Lab

Load in the iris data.
Display the iris data as a sanity check: iris
Load the rpart package: library(rpart). Install it first if necessary: install.packages("rpart")
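A minimal sketch of this setup step. The iris data frame ships with R's built-in datasets package. The requireNamespace() check is one common way to install a package only when it is missing; it is an assumption here, not something the slides prescribe.

```r
# iris ships with R's built-in datasets package
data(iris)

# Sanity check: first rows, column types, and class counts
head(iris)
str(iris)
table(iris$Species)

# Install rpart only if it is missing, then load it
if (!requireNamespace("rpart", quietly = TRUE)) {
  install.packages("rpart")
}
library(rpart)
```

Typing iris on its own prints all 150 rows; head() and str() give a shorter check that the columns and types look right.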

We will use rpart() to build the tree. First, understand the arguments to rpart():
– rpart(formula, data=, method=, control=)
– formula: outcome ~ predictor1 + predictor2 + …
– data: specifies the data frame
– method: "class" for a classification tree
– control: optional parameters for controlling tree growth

In the case of the iris dataset:
– formula: Species ~ Petal.Length + Petal.Width + Sepal.Length + Sepal.Width
– data = iris
– method = "class"
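As a side note, R formulas also accept `.` as shorthand for "every other column in the data frame". The sketch below only checks that the shorthand expands to the same four predictors as the explicit formula; the names f_explicit and f_dot are illustrative, not from the slides.

```r
# Explicit formula, as written in the lab
f_explicit <- Species ~ Petal.Length + Petal.Width + Sepal.Length + Sepal.Width

# Shorthand: "." stands for all columns of the data frame other than Species
f_dot <- Species ~ .

# Expand both formulas into their predictor names
labels_explicit <- attr(terms(f_explicit), "term.labels")
labels_dot <- attr(terms(f_dot, data = iris), "term.labels")

# Same set of predictors, although possibly in a different order
setequal(labels_explicit, labels_dot)
```

The predictor order differs between the two forms, which can matter when two candidate splits are tied, so the lab's explicit formula is the safer choice for reproducing its output.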

In the case of the iris dataset:
– control = rpart.control(minsplit=2, cp=0.001)
i.e. a node must contain at least 2 observations before a split is attempted, and a split must improve the overall fit by a factor of at least 0.001 (the cost complexity parameter, cp).
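To see what these control settings do, the following sketch fits the tree once with rpart's defaults (minsplit = 20, cp = 0.01) and once with the lab's looser settings, then counts the leaves of each. The helper n_leaves() is a hypothetical convenience function written for this illustration.

```r
library(rpart)

iris_formula <- Species ~ Petal.Length + Petal.Width + Sepal.Length + Sepal.Width

# Default growth settings: minsplit = 20, cp = 0.01
fit_default <- rpart(iris_formula, data = iris, method = "class")

# The lab's settings allow splits on very small nodes and tiny improvements
ctrl <- rpart.control(minsplit = 2, cp = 0.001)
fit_deep <- rpart(iris_formula, data = iris, method = "class", control = ctrl)

# Hypothetical helper: count terminal nodes (leaves) in a fitted tree
n_leaves <- function(fit) sum(fit$frame$var == "<leaf>")

n_leaves(fit_default)
n_leaves(fit_deep)
```

The looser settings should produce a tree at least as large as the default one, at the cost of fitting the training data more closely.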

Altogether:
fit = rpart(Species ~ Petal.Length + Petal.Width + Sepal.Length + Sepal.Width, method = "class", data = iris, control = rpart.control(minsplit=2, cp=0.001))
Examine the decision tree: print(fit)
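Beyond print(fit), a quick way to check the fitted tree is to look at its complexity table and to cross-tabulate its predictions against the true species. This is a sketch that goes one small step past the slides; the names pred and confusion are illustrative.

```r
library(rpart)

fit <- rpart(Species ~ Petal.Length + Petal.Width + Sepal.Length + Sepal.Width,
             method = "class", data = iris,
             control = rpart.control(minsplit = 2, cp = 0.001))

print(fit)    # node-by-node listing of the splits
printcp(fit)  # complexity parameter table

# Predicted class for each training row, cross-tabulated against the truth
pred <- predict(fit, newdata = iris, type = "class")
confusion <- table(actual = iris$Species, predicted = pred)
confusion
```

Because the predictions are made on the same rows used to grow the tree, this table is an optimistic picture of accuracy, especially with settings as loose as minsplit = 2.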

Plot the decision tree: plot(fit, uniform=TRUE, main="Classification Tree for Iris Dataset")
Label the tree: text(fit, use.n=TRUE, all=TRUE, cex=.7)
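The plotting calls can be run end to end as below. This sketch writes the figure to a PDF in a temporary directory so it also works without an interactive graphics window. The file name and the margin = 0.1 setting (extra white space so the node labels are less likely to be clipped) are assumptions, not part of the lab.

```r
library(rpart)

fit <- rpart(Species ~ Petal.Length + Petal.Width + Sepal.Length + Sepal.Width,
             method = "class", data = iris,
             control = rpart.control(minsplit = 2, cp = 0.001))

# Hypothetical output location; drop pdf() and dev.off() to draw on screen
out_file <- file.path(tempdir(), "iris_tree.pdf")
pdf(out_file, width = 9, height = 7)

# uniform = TRUE spaces the levels evenly; margin leaves room for labels
plot(fit, uniform = TRUE, margin = 0.1,
     main = "Classification Tree for Iris Dataset")

# use.n = TRUE adds class counts; all = TRUE labels interior nodes too
text(fit, use.n = TRUE, all = TRUE, cex = 0.7)

dev.off()
```

Note that text() must be called after plot() and before the device is closed, since it draws on top of the existing tree.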