CHAPTER 29 Classification and Regression Trees
Dean L. Urban
From: McCune, B. & J. B. Grace. 2002. Analysis of Ecological Communities. MjM Software Design, Gleneden Beach, Oregon. http://www.pcord.com
Tables, Figures, and Equations
Table 29.1. A matrix matching statistical techniques to various applications that require group classification or discrimination. Applications are discussed in the Introduction, coded here as groups defined on species composition (SPP) or environmental variables (ENV). Techniques are discriminant analysis (DA), group-contrast Mantel test (GC-Mantel), multivariate analysis of variance (MANOVA), nonparametric MANOVA (NPMANOVA), multi-response permutation procedures (MRPP), classification and regression trees (CART), generalized linear models (GLM), and generalized additive models (GAM).

Application                     Appropriate Techniques

Exploratory data analysis:
  1a. Do SPP groups differ?     CART, DA, GC-Mantel, MANOVA, NPMANOVA, MRPP
  1b. On which ENV variable(s)? CART, DA, partial GC-Mantel
  2a. Do ENV groups differ?     ISA, CART, GC-Mantel, MRPP
  2b. On which SPP?             ISA, CART, partial GC-Mantel
  3a. Do habitats differ?       DA, CART, MANOVA, NPMANOVA, MRPP, logistic regression, GLM, GAM, etc.
  3b. On which variable(s)?     CART, DA, partial GC-Mantel, logistic regression, etc.

Predict group membership:
  1c. on SPP                    ISA (with some modification)
  2c. on ENV                    CART, DA, (multinomial) logistic regression
  3c. habitat variables         CART, DA, logistic regression
Table 29.3. Indicator Species Analysis for the seven forest types identified via hierarchical clustering. Indicator values (IV) are percentage of perfect fidelity. Indicator values were tested for statistical significance based on 1000 permutations (**, p < 0.001; *, p < 0.005). Sequence = order of groups in data, Identifier = group identifier, Avg = Average IV, Max = Maximum IV, MaxGrp = Group with highest IV.
Figure 29.1. Upper: Classification tree for 7 forest types on 15 environmental variables (function rpart, complexity parameter (cp) = 0.000001, minsplit = 10, split = information).
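The caption's "split = information" refers to choosing splits by an information (entropy) criterion rather than the Gini index. As a rough illustration of that idea only, and not of rpart's internals, the sketch below scores one candidate threshold by the drop in entropy it produces. The variable names, the elevation values, the forest-type labels, and the use of log base 2 are all assumptions made for the example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels, threshold):
    """Entropy reduction from splitting samples at values < threshold."""
    left = [y for x, y in zip(values, labels) if x < threshold]
    right = [y for x, y in zip(values, labels) if x >= threshold]
    n = len(labels)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted

# Hypothetical data: one environmental variable and two forest types.
elevation = [300, 450, 600, 900, 1200, 1500]
forest_type = ["A", "A", "A", "B", "B", "B"]

# A threshold that separates the two types perfectly recovers all of the
# parent node's entropy; a poorer threshold recovers less.
gain_good = information_gain(elevation, forest_type, 750)
gain_poor = information_gain(elevation, forest_type, 400)
```

A tree-growing routine would evaluate many such thresholds across all predictors and keep the one with the largest gain at each node.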
Figure 29.1. (Lower): Pruned classification tree, simplified by stopping the tree at the number of nodes corresponding to the point where the pruning curve crosses the minimum (1 S.E.) line (Fig. 29.2).
Table 29.4. Misclassification table for the 7 forest types, based on a pruned CART model with 11 nodes (Fig. 29.3). Rows are actual forest types, columns are predicted forest types. Row totals are indexed as number correct/number misclassified. Total misclassification rate based on jack-knifing is 39/98 (39.8%).
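The arithmetic behind a table of this kind can be sketched as follows. The 3 x 3 matrix is a made-up toy example, not the data from Table 29.4, and the function names are hypothetical. Only the last line uses figures from the caption (39 misclassified out of 98). The sketch computes the rate from a finished table and does not perform the jack-knifing itself.

```python
def misclassification_rate(confusion):
    """Fraction of samples off the diagonal (rows = actual, columns = predicted)."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return (total - correct) / total

def row_summary(confusion):
    """Per-row (number correct, number misclassified), as in the row totals."""
    return [(row[i], sum(row) - row[i]) for i, row in enumerate(confusion)]

# Hypothetical confusion matrix for three groups of ten samples each.
toy = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 3, 7],
]

toy_rate = misclassification_rate(toy)   # 9 of 30 misclassified
toy_rows = row_summary(toy)

# The overall rate quoted in the caption: 39 of 98 samples.
reported_percent = round(100 * 39 / 98, 1)
```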
Figure 29.2. Cost-complexity pruning curve for the classification tree in Figure 29.1. Error bars are estimated from 10 cross-validation subsets of the samples. The horizontal line is one standard error above the minimum error rate. “Inf” = infinite. Relative error is calculated by cross-validation.
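The rule implied by the horizontal line (pick the simplest tree whose cross-validated error is within one standard error of the minimum) can be sketched as below. The table values are invented for illustration and are not read from Figure 29.2. The field names cp, nsplit, xerror, and xstd are assumed here to mirror the columns of a typical complexity table.

```python
def one_se_choice(cp_table):
    """Return the smallest tree whose xerror is within 1 SE of the minimum."""
    best = min(cp_table, key=lambda r: r["xerror"])
    threshold = best["xerror"] + best["xstd"]   # the horizontal line
    candidates = [r for r in cp_table if r["xerror"] <= threshold]
    return min(candidates, key=lambda r: r["nsplit"]), threshold

# Hypothetical pruning table: cross-validated relative error by tree size.
cp_table = [
    {"cp": 0.20, "nsplit": 0,  "xerror": 1.00, "xstd": 0.05},
    {"cp": 0.10, "nsplit": 1,  "xerror": 0.80, "xstd": 0.05},
    {"cp": 0.05, "nsplit": 3,  "xerror": 0.62, "xstd": 0.05},
    {"cp": 0.02, "nsplit": 6,  "xerror": 0.58, "xstd": 0.05},
    {"cp": 0.01, "nsplit": 10, "xerror": 0.60, "xstd": 0.05},
]

chosen, threshold = one_se_choice(cp_table)
```

In this toy table the minimum error occurs at 6 splits, but the 3-split tree already falls under the one-standard-error line, so it is the one selected.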