
Random Forests Ujjwol Subedi

Introduction What is a random tree? ◦ A random tree is a tree constructed randomly from the set of possible trees that consider K randomly chosen features at each node. ◦ The possible trees are sampled with a uniform distribution, i.e. each tree has an equal chance of being generated. ◦ Random trees can be generated efficiently, and combining large sets of random trees generally leads to accurate models.

Decision trees Decision trees are predictive models that use a set of binary rules to calculate a target value. There are two types of decision trees: ◦ Classification  Classification trees predict a categorical target (a class label). ◦ Regression  Regression trees predict a continuous (numeric) target.

Here is a simple example of a decision tree.
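For illustration, a minimal decision-tree sketch assuming scikit-learn and a small hypothetical weather data set (outlook, humidity, wind -> play ball?):

# Decision tree on a tiny hypothetical weather data set (scikit-learn assumed).
from sklearn.tree import DecisionTreeClassifier, export_text

# Encoded features: outlook (0=sunny, 1=overcast, 2=rain), humidity (0=normal, 1=high), wind (0=weak, 1=strong)
X = [[0, 1, 0], [0, 1, 1], [1, 1, 0], [2, 0, 0], [2, 0, 1], [1, 0, 1], [0, 0, 0]]
y = ["no", "no", "yes", "yes", "no", "yes", "yes"]           # target: play ball?

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(export_text(tree, feature_names=["outlook", "humidity", "wind"]))   # the learned binary rules
print(tree.predict([[2, 1, 0]]))                             # classify a new day: rain, high humidity, weak wind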

Definition Random forests were first developed by Leo Breiman.  A random forest is a group of un-pruned classification or regression trees built from random selections of samples of the training data.  Random forests are a way of averaging multiple deep decision trees, trained on different parts of the same training set, with the goal of reducing the over-fitting of individual decision trees.  In other words, random forests are an ensemble learning method for classification and regression that operates by constructing many decision trees at training time and outputting the class that is the mode of the classes output by the individual trees.

According to Breiman, random forests do not over fit as more trees are added: you can run as many trees as you want. The method is also fast: running on a data set with 50,000 cases and 100 variables, it produced 100 trees in 11 minutes on an 800 MHz machine. For large data sets the major memory requirement is the storage of the data itself, plus three integer arrays with the same dimensions as the data. If proximities are calculated, storage requirements grow as the number of cases times the number of trees.

How does a random forest work? Each tree is grown as follows: 1. Random record selection: each tree is trained on roughly two-thirds of the distinct cases in the training data. Cases are drawn at random with replacement from the original data; this bootstrap sample becomes the training set for growing the tree. 2. Random variable selection: some number of predictor variables, say m, are selected at random out of all the predictor variables, and the best split on these m variables is used to split the node. 3. For each tree, the left-over (out-of-bag) data are used to calculate its misclassification rate – the out-of-bag (OOB) error rate – and the errors from all trees are aggregated to determine the overall OOB error rate for the classification.

4. Each tree gives a classification, and we say that the tree “votes” for that class. The forest chooses the classification having the most votes. For example, if 500 trees are grown and 400 of them predict that a particular pixel is forest while 100 predict grass, then the predicted output for that pixel is forest, as in the sketch below.
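This voting step can be sketched in plain Python; the per-tree votes below are hypothetical, mirroring the 500-tree example above:

# Majority vote over per-tree predictions for one pixel (hypothetical vote counts).
from collections import Counter

tree_votes = ["forest"] * 400 + ["grass"] * 100              # 500 trees: 400 vote forest, 100 vote grass
prediction, n_votes = Counter(tree_votes).most_common(1)[0]  # class with the most votes
print(prediction, n_votes)                                   # -> forest 400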

Algorithm Let the number of training cases be N and the number of variables in the classifier be M. Let m be the number of input variables used to determine the decision at a node of the tree; m << M. Choose the training set for this tree by sampling n times with replacement from all N available training cases (i.e., take a bootstrap sample). Use the remaining cases to estimate the error of the tree by predicting their classes. For each node of the tree, randomly choose m variables on which to base the decision at that node, and calculate the best split based on these m variables in the training set. Each tree is fully grown and not pruned.
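A from-scratch sketch of this loop, assuming numpy arrays and scikit-learn's decision trees (the function names grow_forest and forest_predict are illustrative, not part of any library):

# Illustrative random-forest training loop: one bootstrap sample per tree, m << M variables per split.
import numpy as np
from collections import Counter
from sklearn.tree import DecisionTreeClassifier

def grow_forest(X, y, n_trees=100, m=None, random_state=0):
    rng = np.random.RandomState(random_state)
    n, M = X.shape
    m = m or max(1, int(np.sqrt(M)))             # a common choice for m
    forest = []
    for _ in range(n_trees):
        idx = rng.randint(0, n, size=n)          # choose n cases with replacement
        tree = DecisionTreeClassifier(max_features=m, random_state=rng)  # m random variables per node
        tree.fit(X[idx], y[idx])                 # fully grown, not pruned (no depth limit)
        forest.append(tree)
    return forest

def forest_predict(forest, X_new):
    votes = np.array([t.predict(X_new) for t in forest])     # one row of votes per tree
    return [Counter(col).most_common(1)[0][0] for col in votes.T]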

Pros and Cons The advantages of random forests: ◦ It is one of the most accurate learning algorithms available. For many data sets, it produces a highly accurate classifier. ◦ It runs efficiently on large data sets. ◦ It can handle thousands of input variables without variable deletion. ◦ It gives estimates of which variables are important in the classification. ◦ It has an effective method for estimating missing data and maintains accuracy when a large proportion of the data is missing. ◦ It computes proximities between pairs of cases that can be used for clustering and for locating outliers.

Pros and Cons contd…. Disadvantages: ◦ Random forests have been observed to over fit on some datasets with noisy classification/regression tasks. ◦ For data including categorical variables with different numbers of levels, random forests are biased in favor of the attributes with more levels.

Parameters When running random forests there are a number of parameters that need to be specified. The most common parameters are: ◦ The input training data, including the predictor variables. ◦ The number of trees that should be built. ◦ The number of predictor variables to be used to create the binary rule for each split. ◦ Parameters controlling how error and variable-significance information is calculated.
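As one concrete illustration (not the only implementation), these parameters map roughly onto scikit-learn's RandomForestClassifier as follows; X_train and y_train are assumed to hold the training data:

# Rough mapping of the parameters above onto scikit-learn's RandomForestClassifier.
from sklearn.ensemble import RandomForestClassifier

rf = RandomForestClassifier(
    n_estimators=500,        # the number of trees that should be built
    max_features="sqrt",     # predictor variables considered for each binary split
    oob_score=True,          # compute the out-of-bag error estimate
    random_state=0,
)
# rf.fit(X_train, y_train)                       # input training data with predictor variables
# rf.oob_score_, rf.feature_importances_         # error and variable-significance information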

Terminologies related to the random forest algorithm Bagging (Bootstrap Aggregating) ◦ Bagging generates m new training data sets, each a sample of observations drawn with replacement from the original data. Then m models are fitted on these m bootstrap samples and combined by averaging the outputs (for regression) or by voting (for classification). The training algorithm for random forests applies this general technique of bootstrap aggregating, or bagging, to tree learners. Given a training set X = x_1, ..., x_n with responses Y = y_1, ..., y_n, bagging repeatedly (B times) selects a random sample with replacement from the training set and fits trees to these samples:  For b = 1, ..., B: sample, with replacement, n training examples from X, Y; call these X_b, Y_b. Train a decision or regression tree f_b on X_b, Y_b.  After training, the prediction for an unseen sample x' is made by averaging the predictions of the individual regression trees on x', i.e. f̂(x') = (1/B) Σ_{b=1..B} f_b(x'), or by taking the majority vote in the case of classification trees.
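A minimal sketch of bagging alone (without the per-node random variable selection), assuming scikit-learn and training arrays X, Y as above:

# Plain bagging: B bootstrap samples, one decision tree per sample, majority vote.
from sklearn.ensemble import BaggingClassifier

bagger = BaggingClassifier(
    n_estimators=100,     # B bootstrap rounds; the default base learner is a decision tree
    bootstrap=True,       # sample n training examples with replacement for each round
    random_state=0,
)
# bagger.fit(X, Y); bagger.predict(X_new)        # majority vote (or averaging, for regression)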

Terminologies contd.. Out-of-bag error rate ◦ As the forest is built, each tree is tested on the roughly one-third of the samples not used in building that tree. This gives the out-of-bag error estimate – an internal error estimate of a random forest. Bootstrap sample ◦ A sample drawn at random with replacement from the training data. Proximities ◦ These are one of the most useful tools in random forests. The proximities form an N x N matrix. After a tree is grown, put all of the data, both in-bag and out-of-bag, down the tree; if cases k and n land in the same terminal node, increase their proximity by one. At the end, normalize the proximities by dividing by the number of trees.
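A sketch of the proximity computation, assuming a fitted scikit-learn forest whose apply() method returns the terminal-node index of each case in each tree:

# Proximity matrix sketch: count how often two cases share a terminal node, then normalize.
import numpy as np

def proximities(fitted_forest, X):
    leaves = fitted_forest.apply(X)                           # shape (n_cases, n_trees)
    n_cases, n_trees = leaves.shape
    prox = np.zeros((n_cases, n_cases))
    for t in range(n_trees):
        same = leaves[:, t][:, None] == leaves[None, :, t]    # cases in the same terminal node of tree t
        prox += same
    return prox / n_trees                                     # divide by the number of trees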

Missing values Missing data imputation. Fast way: replace missing values for a given variable using the median of the non-missing values (or the most frequent value, if categorical). Better way (using proximities): 1. Start with the fast way. 2. Get proximities. 3. Replace missing values in case i by a weighted average of the non-missing values, with weights proportional to the proximity between case i and the cases with non-missing values. Repeat steps 2 and 3 a few times (5 or 6).
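A sketch of the proximity-weighted replacement for one numeric variable, assuming numpy, NaN as the missing-value marker, and a proximity matrix prox as computed above:

# Proximity-weighted imputation for one numeric column x.
import numpy as np

def impute_column(x, prox):
    x = np.asarray(x, dtype=float).copy()
    missing = np.isnan(x)
    for i in np.where(missing)[0]:
        w = prox[i, ~missing]                    # proximity of case i to cases with known values
        x[i] = np.average(x[~missing], weights=w) if w.sum() > 0 else np.median(x[~missing])
    return x
# Repeating the forest / proximity / imputation cycle a few times refines the filled-in values.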

Variable importance RF computes two measures of variable importance: one based on a rough-and-ready impurity measure (Gini for classification) and the other based on permutations of a variable's values.
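With scikit-learn, the two measures look roughly like this (a sketch, assuming a fitted forest rf and held-out X_test, y_test):

# Two importance measures: impurity-based (Gini) and permutation-based.
from sklearn.inspection import permutation_importance

gini_importance = rf.feature_importances_                     # mean decrease in impurity per variable
perm = permutation_importance(rf, X_test, y_test, n_repeats=10, random_state=0)
perm_importance = perm.importances_mean                       # accuracy drop when a variable is permuted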

Example This tree advises us, based on weather conditions, whether to play ball.

Example contd… The random forest takes this idea to the next level by combining many such trees into an ensemble.

Results and Discussions Here the classification results of J48 (Weka's C4.5 decision tree) and the random forest are compared.

Results and discussion contd.. The table shows the precision, recall, and F-measure of the random forest and J48 on the 20 data sets.
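For reference, these metrics can be computed per data set roughly as follows (scikit-learn assumed; y_true, rf_predictions, and j48_predictions are hypothetical label arrays, not the presentation's actual data):

# Precision, recall and F-measure for the two classifiers on one data set.
from sklearn.metrics import precision_recall_fscore_support

for name, y_pred in [("Random forest", rf_predictions), ("J48", j48_predictions)]:
    p, r, f, _ = precision_recall_fscore_support(y_true, y_pred, average="weighted")
    print(f"{name}: precision={p:.3f}  recall={r:.3f}  F-measure={f:.3f}")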

References andomForests/cc_home.htm ndom-forests-ensembles-and-performance-metrics/ Random Forests for land cover classification by Pall Oskar Gislason