Biophysical Gradient Modeling

Presentation transcript:

Biophysical Gradient Modeling

Management Needs
Decision Support Tools
– Baseline Information: vegetation characteristics, forest stand structure, fuel loads
– Predictive Mapping: vegetation maps, fuels maps

Research Questions
– What are the different vegetation types in the Sky Island systems of the Chihuahuan Desert Borderlands?
– Are the local- and landscape-scale abundance and distribution patterns of vegetation related to variation in the biophysical environment and the spectral characteristics of the vegetation?
– Can those species-environment relationships be used in a predictive manner to map vegetation across the landscape?

Species-Environment Relationships in SW North America Niering and Lowe (1984)

Sky Island Forests
– Sierra Madre Oriental and Occidental
– Post-Pleistocene refugia
– High vascular plant diversity and endemism

Integrated Approach
Merge extensive field sampling with image classification of vegetation/fuel characteristics and biophysical gradient modeling.
[Photo: Davis Mountain alterna (Lampropeltis alterna)]

Vegetation Sampling
600 permanent plots
– Systematic sampling grid
– Captured topographic variability
– Circular, fixed-area plots
Tree attributes
– Species ID, DBH, height, live crown height, spatial location

Topographic Data: derived from a Digital Elevation Model

Analysis
– Species data for each plot (basal area, density) → species importance value (IV = relative basal area + relative density) → cluster analysis → vegetation types
– Vegetation types + topographic data for each plot → CART → species-environment relationships
– Species-environment relationships → ENVI Decision Tree → vegetation and fuels maps
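The importance value on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: relative basal area and relative density are taken as percentages of the plot totals (so IV ranges from 0 to 200), and the function name, the plot dictionary layout, and the example numbers are hypothetical, not taken from the study.

```python
def importance_values(plot):
    """Return species -> importance value (IV) for one plot.

    plot maps species name -> (basal_area, stem_count).
    IV = relative basal area (%) + relative density (%), so IVs sum to 200.
    """
    total_ba = sum(ba for ba, _ in plot.values())
    total_stems = sum(n for _, n in plot.values())
    return {
        species: 100.0 * ba / total_ba + 100.0 * n / total_stems
        for species, (ba, n) in plot.items()
    }


# Hypothetical plot data (basal area in arbitrary units, stem counts).
example_plot = {
    "Pinyon Pine": (1.2, 30),
    "Gray Oak": (0.6, 10),
    "Alligator Juniper": (0.2, 10),
}
example_iv = importance_values(example_plot)
```

A plot-by-species table of these values is the kind of input a cluster analysis could then group into vegetation types.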

9 Dominant Forest Types
– Pinyon Pine Forest
– Oak-Pinyon-Juniper Forest
– Alligator Juniper Forest
– Gray Oak Forest
– Emory Oak Forest
– Cypress-Fir Forest
– Ponderosa-SW White Pine Forest
– Gallery Forest
– Graves Oak Forest
[Figure: forest types arranged along environmental gradients. Labels: dry sites (high solar radiation, upper topographic positions) versus mesic sites (low solar radiation, valley bottoms); tolerant species versus good competitors; elevation, high to low.]

CART Basics: How do you parse these data into homogeneous groups?

Classification
– Given a collection of records.
– Each record contains a set of attributes; one of the attributes is the class.
– Find a model for the class attribute as a function of the values of the other attributes.

Development of CART
Leo Breiman pioneered tree-based classification methods that later became part of machine learning, a field closely related to data mining. He wrote Classification and Regression Trees (CART) with Jerome Friedman, Richard Olshen, and Charles Stone, and later developed Random Forests.

Classification and Regression Trees
– A supervised learning algorithm that recursively partitions heterogeneous data into successively more homogeneous subsets using binary splits
– Non-parametric and non-linear
– Can handle numerical or categorical data
– Results are easy to interpret
– Output can be fed directly into ENVI Decision Tree to classify your image

Steps for Producing a CART Model
1. Determine the vegetation/fuel types using field-generated data or prior knowledge of the site.
2. Extract spectral and landform metric data from imagery and DEMs.
3. Inspect the training data and check for an extremely unbalanced dataset.
4. Grow the CART model to its full size and prune it using the 1-SE rule.
5. Use 10-fold cross-validation and bootstrapping to validate the model accuracy using misclassification % and the Kappa statistic.
6. Code the maps using ENVI Decision Tree and visually assess the “look” of the map.
7. Validate the maps in the field to produce misclassification % and the Kappa statistic.
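Steps 5 and 7 report two accuracy measures. A minimal sketch of how both could be computed from a confusion matrix follows; it assumes the standard Cohen's Kappa definition (observed agreement corrected for chance agreement), and the function name and example counts are hypothetical.

```python
def misclassification_and_kappa(confusion):
    """Return (misclassification rate, Cohen's Kappa).

    confusion[i][j] = number of plots of true class i mapped as class j.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: share of plots on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    # Chance agreement from the row and column totals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return 1.0 - observed, kappa


# Hypothetical two-class example: 80 of 100 plots classified correctly.
example_error, example_kappa = misclassification_and_kappa([[40, 10], [10, 40]])
```

In this made-up example the misclassification rate is 20% and Kappa is 0.6, because half of the agreement would be expected by chance alone.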

Impurity of a Node
– We need a measure of the impurity of a node to help decide how to split a node, or which node to split.
– The measure should be at a maximum when a node is equally divided amongst all classes.
– The impurity should be zero if the node is all one class.

Measures of Impurity
– Misclassification Rate
– Gini Index
In practice the first is not used, for the following reasons:
– Situations can occur where no split improves the misclassification rate.
– The misclassification rate can be equal for two splits when one option is clearly better for the next step.
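The second reason can be checked numerically. The sketch below assumes the usual definitions (Gini index = 1 minus the sum of squared class proportions; misclassification rate = 1 minus the largest class proportion). The class counts are a made-up two-class illustration, not data from the study.

```python
def proportions(counts):
    """Class proportions for a node given its per-class counts."""
    n = sum(counts)
    return [c / n for c in counts]


def gini(counts):
    """Gini index: 0 for a pure node, largest for an even class mix."""
    return 1.0 - sum(p * p for p in proportions(counts))


def misclassification(counts):
    """Share of cases that are not in the node's majority class."""
    return 1.0 - max(proportions(counts))


def weighted_impurity(children, impurity):
    """Impurity of a split: child impurities weighted by child size."""
    n = sum(sum(c) for c in children)
    return sum(sum(c) / n * impurity(c) for c in children)


# Parent node: 400 cases of each class. Two candidate splits (hypothetical).
split_a = [(300, 100), (100, 300)]
split_b = [(200, 400), (200, 0)]   # second child is pure

error_a = weighted_impurity(split_a, misclassification)
error_b = weighted_impurity(split_b, misclassification)
gini_a = weighted_impurity(split_a, gini)
gini_b = weighted_impurity(split_b, gini)
```

Both splits leave the same misclassification rate, but the Gini index is lower for the second split, which produces a pure node that needs no further splitting.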

Visual Example

Selection of Splits
We select the split that most decreases the Gini Index. The search covers all possible split points on all possible variables. We keep splitting until the terminal nodes have very few cases or are all pure. This is an unsatisfactory rule for when to stop growing the tree; the better approach is to grow a larger tree than required and then prune it.
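The exhaustive search described above can be sketched as follows. This is a simplified illustration, not the CART implementation used in the study: it assumes numeric predictors only, tries midpoints between consecutive observed values as thresholds, and uses hypothetical variable names and plot values.

```python
def gini(labels):
    """Gini index of a list of class labels (0 for an empty or pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(rows, labels):
    """Return (variable index, threshold, Gini decrease) of the best binary split.

    rows is a list of equal-length tuples of numeric predictor values.
    """
    n = len(rows)
    parent = gini(labels)
    best = (None, None, 0.0)
    for j in range(len(rows[0])):                      # every variable
        values = sorted(set(r[j] for r in rows))
        for lo, hi in zip(values, values[1:]):         # every split point
            threshold = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[j] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[j] > threshold]
            children = len(left) / n * gini(left) + len(right) / n * gini(right)
            decrease = parent - children
            if decrease > best[2]:
                best = (j, threshold, decrease)
    return best


# Hypothetical plots: (elevation in m, relative solar radiation).
example_rows = [(1500, 0.9), (1600, 0.8), (1700, 0.85),
                (2100, 0.4), (2200, 0.5), (2300, 0.45)]
example_labels = ["Pinyon Pine"] * 3 + ["Ponderosa"] * 3
example_split = best_split(example_rows, example_labels)
```

Growing a full tree would repeat this search recursively on the left and right subsets until the nodes are pure or very small.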

Pruning the Tree I
The best way to arrive at a suitable tree size is to grow an overly complex tree and then prune it back. The pruning is based on the misclassification rate. However, the error rate on the training data will always drop (or at least not increase) with every split. This does not mean that the error rate on test data will improve.

Pruning the Tree II
The solution to this problem is cross-validation. One version of the method carries out a 10-fold cross-validation: the data are divided at random into 10 subsets of equal size, the tree is grown with one subset left out, and its performance is assessed on the left-out subset. This is repeated for each of the 10 subsets, and the performance is then averaged.
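A rough sketch of the fold assignment, together with the 1-SE rule named in the modeling steps, is given below. It assumes the common form of the rule (choose the smallest tree whose mean cross-validated error is within one standard error of the minimum). The function names and the per-fold error rates are hypothetical, and growing the trees themselves is left out.

```python
import random
import statistics


def k_fold_indices(n, k=10, seed=0):
    """Randomly divide record indices 0..n-1 into k folds of near-equal size."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def one_se_choice(fold_errors_by_size):
    """Pick a tree size with the 1-SE rule.

    fold_errors_by_size maps tree size (number of terminal nodes) to the
    list of error rates measured on each left-out fold.
    """
    summary = {}
    for size, errors in fold_errors_by_size.items():
        mean = statistics.mean(errors)
        std_error = statistics.stdev(errors) / len(errors) ** 0.5
        summary[size] = (mean, std_error)
    best_size = min(summary, key=lambda s: summary[s][0])
    limit = summary[best_size][0] + summary[best_size][1]
    # Smallest tree whose mean error is within one SE of the best mean.
    return min(s for s, (mean, _) in summary.items() if mean <= limit)


example_folds = k_fold_indices(600, k=10)

# Hypothetical left-out-fold error rates for three candidate tree sizes.
example_errors = {
    2: [0.40, 0.42, 0.38, 0.41],
    5: [0.30, 0.33, 0.29, 0.32],
    9: [0.29, 0.33, 0.28, 0.32],
}
example_size = one_se_choice(example_errors)
```

Here the nine-node tree has the lowest mean error, but the five-node tree is within one standard error of it, so the simpler tree is chosen.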

Advantages and Disadvantages
Advantages
– Handles data with any structure
– Robust to outliers
– Machine learning: little input needed from the analyst
– Final results can be summarized in logical if-then conditions
Disadvantages
– Knowing when to stop splitting
– Does not use combinations of variables
– Computations are complex in determining best split conditions
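The if-then advantage can be illustrated with a short sketch that walks a fitted tree and writes one rule per terminal node. The nested-dictionary tree layout, the variable names, and the thresholds are all hypothetical; the forest type names are borrowed from the slides only as labels, and this is not the format used by ENVI Decision Tree.

```python
def tree_to_rules(node, conditions=()):
    """Return one IF ... THEN ... rule per terminal node of a binary tree.

    A terminal node is {"label": ...}; an internal node is
    {"feature": ..., "threshold": ..., "left": ..., "right": ...},
    where the left branch holds cases with feature <= threshold.
    """
    if "label" in node:
        rule = " AND ".join(conditions) or "TRUE"
        return [f"IF {rule} THEN {node['label']}"]
    test = f"{node['feature']} <= {node['threshold']}"
    negated = f"{node['feature']} > {node['threshold']}"
    return (tree_to_rules(node["left"], conditions + (test,))
            + tree_to_rules(node["right"], conditions + (negated,)))


# Hypothetical pruned tree with made-up thresholds.
example_tree = {
    "feature": "elevation_m", "threshold": 1900,
    "left": {
        "feature": "solar_radiation", "threshold": 0.6,
        "left": {"label": "Gray Oak Forest"},
        "right": {"label": "Pinyon Pine Forest"},
    },
    "right": {"label": "Ponderosa-SW White Pine Forest"},
}
example_rules = tree_to_rules(example_tree)
```

Each rule reads as a chain of threshold tests on the predictor layers, which is what makes a tree straightforward to apply cell by cell to raster data.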

…back to Mapping fuels and Vegetation in the Chihuahuan Desert Borderlands

Misclassification = 29.1%; Kappa = 0.57

Once map is generated… perform field validation