CART: Classification and Regression Trees Chris Franck LISA Short Course March 26, 2013.

Outline
- Overview of LISA
- Overview of CART
- Classification tree description
  - Examples: iris and skull data
- Regression tree description
  - Examples: simulated and car data
- Going further: mention cross-validation, pruning, cost-complexity

In addition to CART, these statistical and practical principles will be discussed:
- R programming
- Importance of exploratory data analysis
- Using trees to predict outcomes for newly collected data
- Graphical comparison with regression
- Performance assessment on simulated data
- Importance of model validation (brief)

Laboratory for Interdisciplinary Statistical Analysis
Collaboration: Visit our website to request personalized statistical advice and assistance with:
- Experimental Design
- Data Analysis
- Interpreting Results
- Grant Proposals
- Software (R, SAS, JMP, SPSS...)
LISA statistical collaborators aim to explain concepts in ways useful for your research.
Great advice right now: Meet with LISA before collecting your data.
All services are FREE for VT researchers. We assist with research, not class projects or homework. LISA helps VT researchers benefit from the use of statistics.
LISA also offers:
- Educational Short Courses: designed to help graduate students apply statistics in their research
- Walk-In Consulting: M-F 1-3 PM, GLC Video Conference Room, for questions requiring <30 minutes; also 3-5 PM Port (Library/Torg Bridge) and 9-11 AM ICTAS Café X

Tree-based methods

The above idea is simple, although some of the language surrounding CART can sound technical.

How can CART divide the data space?

Example 1: iris data
- In RStudio, type ?iris (no quotes) to open the help file on the iris data. Preceding a built-in data object or function with ? in R opens its help file.
- Install the ‘tree’ package: Tools -> Install Packages… -> type ‘tree’ -> click Install.
- Open ‘CART course code’.

Iris data review

Tree splits are chosen to minimize deviance at each step
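To make the criterion concrete: for a classification node, deviance is the multinomial quantity -2 * sum_k n_k * log(n_k / n), where n_k is the count of class k in the node. The course uses R's tree package; the sketch below is an illustrative pure-Python helper of my own (node_deviance is a hypothetical name), not the package's implementation.

```python
from math import log

def node_deviance(class_counts):
    """Multinomial deviance of a node: -2 * sum(n_k * log(n_k / n)).

    class_counts: list of per-class counts in the node.
    Empty classes contribute nothing (the n*log(n) limit at 0 is 0).
    """
    n = sum(class_counts)
    return -2 * sum(n_k * log(n_k / n) for n_k in class_counts if n_k > 0)

# A pure node (one class only) has deviance 0; mixing classes raises it.
print(node_deviance([50, 0, 0]))  # 0.0
print(node_deviance([25, 25]))    # 100*log(2), about 69.31
```

Splits that separate the classes well produce child nodes with low deviance, which is exactly what the splitting rule rewards.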

Another example – Tibetan skulls. Description from Hand et al. (1996).

Skull data review
- We grew another classification tree
- Predicted an outcome based on new data
- Looked at the deviance calculation

Under the hood
CART uses a greedy algorithm: at each step, the chosen split is the one that best improves classification (minimizes error). This is similar to forward variable selection in regression. The splitting continues until nodes become “too small” or until the deviance explained by a new split is small relative to the starting deviance (see ?tree.control for more details).
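The greedy step can be sketched in a few lines. This is an illustrative Python toy of my own (best_split and the data are hypothetical), not the R tree package's code; a real CART implementation would also search over every predictor, not just one:

```python
from math import log

def deviance(labels):
    """Multinomial deviance of a node's labels."""
    n = len(labels)
    dev = 0.0
    for c in set(labels):
        n_k = labels.count(c)
        dev -= 2 * n_k * log(n_k / n)
    return dev

def best_split(x, y):
    """Greedy search on one numeric predictor: try every threshold,
    keep the one minimizing the summed deviance of the two children."""
    best = (None, float("inf"))
    for t in sorted(set(x))[:-1]:  # candidate cut points
        left  = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        score = deviance(left) + deviance(right)
        if score < best[1]:
            best = (t, score)
    return best

# Toy data: classes separate cleanly at x <= 2
print(best_split([1, 2, 3, 4], ["a", "a", "b", "b"]))  # (2, 0.0)
```

The recursion then repeats this search within each child node, which is why the procedure is greedy: it never revisits an earlier split.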

Final two examples: regression trees
Similar to classification trees, but for continuous outcomes.
- Simulated example: when we know the correct answer, does the method work?
- Motor Trend car data
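For a regression tree the only change to the greedy search above is the criterion: instead of deviance, each candidate split minimizes the residual sum of squares around the child means. Again a hypothetical Python sketch of my own, not the R tree package:

```python
def sse(values):
    """Sum of squared errors around the node mean (the regression
    analogue of node deviance)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_regression_split(x, y):
    """Pick the threshold minimizing total child SSE."""
    best = (None, float("inf"))
    for t in sorted(set(x))[:-1]:
        left  = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        score = sse(left) + sse(right)
        if score < best[1]:
            best = (t, score)
    return best

# Simulated step function: the mean jumps from 0 to 10 at x = 5,
# so the best split should land at the last x before the jump.
x = list(range(10))
y = [0.0] * 5 + [10.0] * 5
print(best_regression_split(x, y))  # (4, 0.0)
```

This is the sense in which "classification is highly similar to regression in CART": the machinery is identical, only the node impurity measure changes.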

Advantages of CART
- Can be used to characterize outcomes as a function of many predictors
- Simple, yet powerful
- The tree can be visualized easily, even in high dimension
- Classification is highly similar to regression in CART (more similar than ordinary least squares is to logistic regression, in my opinion)

Caveats of CART
Trees tend to overfit data:
- We saw low classification error rates and good deviance performance for the data used to construct the tree.
- Would the trees we built necessarily predict new irises, skulls, or cars as well?
Small changes to the input data can result in major changes to tree structure (homework).

Cross validation is typically used to assess overfitting
K-fold cross validation assesses the predictive value of a model (here, a tree) on new data:
- Split the data into k (say 10) parts.
- Withhold one part (the validation set) and grow the tree using the other 9 parts (the training set).
- Assess predictive accuracy on the validation part using the tree.
- Repeat, holding each of the 10 parts out in turn.
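The steps above can be sketched as an index-splitting helper. This is an illustrative Python version of my own (kfold_indices is a hypothetical name); in R one would typically assign fold labels with sample():

```python
import random

def kfold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and cut into k roughly equal folds; each fold
    is held out once as the validation set, the rest train the model."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for j in range(k):
        valid = folds[j]
        train = [i for f, fold in enumerate(folds) if f != j for i in fold]
        yield train, valid

# 10 observations, 5 folds: each pass trains on 8 and validates on 2.
for train, valid in kfold_indices(10, 5):
    print(len(train), len(valid))  # 8 2
```

Averaging the validation-set error over the k passes gives an honest estimate of how the tree would predict new data.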

How big should the tree be?
Too big will overfit the data; too small might miss important structure. Generally, cost-complexity pruning is used:
- Grow a big tree (say, until no node has more than 5 observations).
- Consider all subtrees that can be obtained by pruning the big tree.
- Choose the subtree that minimizes a cost-complexity criterion (see ?prune.tree in R and the references for more detail).
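The cost-complexity criterion trades training error against tree size: R_alpha(T) = R(T) + alpha * |T|, where R(T) is the tree's deviance and |T| its number of terminal nodes. A small illustrative sketch, with entirely hypothetical subtree numbers, shows how larger alpha selects smaller trees:

```python
def cost_complexity(deviance, n_leaves, alpha):
    """R_alpha(T) = R(T) + alpha * |T|: training deviance plus a
    penalty of alpha per terminal node."""
    return deviance + alpha * n_leaves

# Hypothetical subtrees obtained by pruning one large tree:
# (training deviance, number of leaves)
subtrees = [(120.0, 2), (80.0, 4), (70.0, 8)]

for alpha in (0.0, 5.0, 30.0):
    best = min(subtrees, key=lambda s: cost_complexity(s[0], s[1], alpha))
    print(alpha, best)
# alpha = 0 picks the biggest tree (70.0, 8);
# alpha = 30 picks the smallest (120.0, 2).
```

In practice alpha itself is chosen by cross-validation, tying this slide to the previous one.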

Random Forests – another tree-based technique
- The basic idea is to sample the data with replacement (i.e., take bootstrap samples).
- Each bootstrap sample leaves out about 1/3 of the observations; a tree is built on the roughly 2/3 that are in the sample, and its performance is assessed on the held-out (out-of-bag) data.
- Grow a large number of trees, each of which “votes” for a certain classification.
See Forests/cc_home.htm
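The "about 1/3 left out" comes from the bootstrap itself: drawing n observations with replacement leaves each observation out with probability (1 - 1/n)^n, which approaches 1/e (about 0.368). An illustrative Python sketch of my own (bootstrap_oob is a hypothetical helper):

```python
import random

def bootstrap_oob(n, seed=0):
    """Draw a bootstrap sample of size n (with replacement) and return
    the out-of-bag indices: observations never drawn into the sample."""
    rng = random.Random(seed)
    in_bag = [rng.randrange(n) for _ in range(n)]
    return sorted(set(range(n)) - set(in_bag))

# The out-of-bag fraction should be close to 1/e, about 0.368.
n = 10000
oob = bootstrap_oob(n)
print(len(oob) / n)  # roughly 0.37
```

Each tree in the forest gets its own bootstrap sample, so each tree also gets its own out-of-bag set for an honest error estimate without a separate validation split.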

References
- Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Second Edition.
- Morant GM. A First Study of the Tibetan Skull. Biometrika, Vol. 14, No. 3/4 (Mar., 1923), pp.
- Discussion of “deviance”: is-deviance-specifically-in-cart-rpart
- Hand DJ, Daly F, McConway K, Lunn D, Ostrowski E. A Handbook of Small Data Sets. Chapman and Hall, 1996.