Imputating snag data to forest inventory for wildlife habitat modeling Kevin Ceder College of Forest Resources University of Washington GMUG – 11 February.


1 Imputating snag data to forest inventory for wildlife habitat modeling Kevin Ceder College of Forest Resources University of Washington GMUG – 11 February 2008

2 Why impute snag data? Snags are an important habitat element and are needed for habitat assessments. These data are often not collected in forest inventories. The Large-Landscape Wildlife Assessment (LLWA) models will need these data.

3 Why use Nearest-Neighbor? Non-parametric, requiring no assumptions about the underlying functional form. Retains the variance/covariance structure of the input data in the output data.

4 The Questions
1) Can snag data be imputed using kNN techniques with stand and site data?
2) How well do the results fit observed data?
3) Which distance measure performs best?
4) What is the effect of increasing neighborhood size?
5) How do the results compare with random sampling?

5 The Process
The database
–FIA integrated database version 2.1
–Data for private forests in western Washington (1510 plots)
–Both tree and snag data collected between 1989 and 1991
–Representative of the forest targeted for the LLWA project

6 The Process
The tool
–The yaImpute package for kNN imputation
  Raw, Euclidean, Mahalanobis, MSN, MSN2, ICA, and randomForest distance measures
  k = 1, 2, 3, 4, 5, 10
  For k > 1, imputed data are distance-weighted means of neighbors
–9999 permutations of the data for comparisons with random sampling
  k = 1, 2, 3, 4, 5, 10
  For k > 1, imputed data are distance-weighted means of neighbors using Euclidean distance
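The imputation step on this slide can be sketched in a few lines. This is a minimal illustration in Python with NumPy, not the yaImpute implementation used in the talk: the function name `knn_impute`, the standardization of predictors, and the weighting 1/(1 + d) are all assumptions made for the example, since the slides only say that imputed values for k > 1 are distance-weighted means of neighbors.

```python
import numpy as np

def knn_impute(x_ref, y_ref, x_target, k=3):
    """Impute y for each target plot from its k nearest reference plots.

    x_ref:    (n_ref, p) stand and site predictors for plots with snag data
    y_ref:    (n_ref, q) snag variables for those plots
    x_target: (n_target, p) predictors for plots that need snag data
    """
    x_ref = np.asarray(x_ref, dtype=float)
    y_ref = np.asarray(y_ref, dtype=float)
    x_target = np.asarray(x_target, dtype=float)

    # Standardize predictors with reference means/SDs (an assumed choice)
    # so that no single variable dominates the Euclidean distance.
    mu = x_ref.mean(axis=0)
    sd = x_ref.std(axis=0)
    sd[sd == 0] = 1.0
    z_ref = (x_ref - mu) / sd
    z_target = (x_target - mu) / sd

    # Pairwise Euclidean distances, shape (n_target, n_ref).
    diff = z_target[:, None, :] - z_ref[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    # Indices and distances of the k nearest reference plots per target.
    idx = np.argsort(dist, axis=1)[:, :k]
    d_k = np.take_along_axis(dist, idx, axis=1)

    # Assumed weighting: 1 / (1 + d), normalized to sum to 1 per target.
    w = 1.0 / (1.0 + d_k)
    w /= w.sum(axis=1, keepdims=True)

    # Weighted mean of the neighbors' snag values; k = 1 copies the neighbor.
    return (w[:, :, None] * y_ref[idx]).sum(axis=1)
```

With k = 1 the result is a direct copy of the nearest plot's snag record; larger k averages several records, which smooths the imputed values.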

7 The Statistics Goodness of fit. Comparison with random.
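The formulas on this slide did not survive in the transcript. The sketch below shows common definitions of the three fit statistics reported later (RMSD, bias, MAD). It assumes that bias is imputed minus observed, so negative values mean under-prediction, and that MAD is the mean absolute deviation; the author's exact definitions may differ, and the function names are illustrative only.

```python
import numpy as np

def rmsd(observed, imputed):
    """Root mean squared difference between imputed and observed values."""
    observed = np.asarray(observed, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    return float(np.sqrt(np.mean((imputed - observed) ** 2)))

def bias(observed, imputed):
    """Mean signed difference (assumed: imputed minus observed)."""
    observed = np.asarray(observed, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    return float(np.mean(imputed - observed))

def mad(observed, imputed):
    """Mean absolute deviation (assumed meaning of MAD here)."""
    observed = np.asarray(observed, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    return float(np.mean(np.abs(imputed - observed)))
```

Each function takes one snag variable at a time, for example observed and imputed SNAG_TPA_TOTAL across all plots.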

8 The Input Data – Tree and site data (xData), N = 1510
Variable                                      Min      Max     Mean
Trees per Acre (TOT_TPA)                      6.7   2920.5    475.8
Basal Area per Acre (TOT_BA, sqft/ac)         0.0    397.8    119.8
Quadratic Mean Diameter (QMD, in)             0.1     28.5      7.6
Mean Height (MEAN_HT, ft)                     1.0    147.9     43.7
Stand Age (AGE, yr)                             5      215       37
Site Index (SITE_INDEX_FIA, feet @ 50 yr)      44      180      112
Slope (SLOPE, %)                                0       99       24
Aspect (ASPECT_DEG, deg)                  0130155
Elevation (ELEV_FT, ft)                         3     4724      869

9 The Input Data – Snag data (yData), N = 1510
Variable                                      Min      Max     Mean
Snags per Acre (SNAG_TPA_TOTAL)               0.0     96.8      4.8
Basal Area (SNAG_BA, sqft/ac)                 0.0      9.7      0.5
Quadratic Mean Diameter (SNAG_QMD, in)        0.0     10.5      2.7
Mean Height (SNAG_MEAN_HT, ft)                0.0    161.0     14.9
695 of 1510 plots did not have snags present.

10 Results 1)Can snag data be imputed using kNN techniques with stand and site data? Yes!

11 Results 2) How well do the results fit observed data?
RMSD      SPA     BA    QMD   Mean Ht
Min       7.0    0.6    2.5      18.9
Max      11.2    1.0    3.7      28.4
Mean      9.0    0.8    3.0      23.4

12 Results 2) How well do the results fit observed data?
BIAS      SPA     BA    QMD   Mean Ht
Min      -1.7   -0.2   -0.6      -4.0
Max      -0.1    0.0    0.1       0.3
Mean     -0.5    0.0   -0.1      -0.8

13 Results 2) How well do the results fit observed data?
MAD       SPA     BA    QMD   Mean Ht
Min       3.5    0.3    1.8      11.3
Max       6.2    0.6    2.7      18.1
Mean      5.1    0.5    2.3      15.1

14

15 Results 2) How well do the results fit observed data? Marginally… High RMSD and MAD relative to mean snag measures in the data. Observed vs. imputed plots show poor patterning.
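The phrase "high relative to mean snag measures" can be made concrete with the numbers already on the slides. The short sketch below divides the mean RMSD and mean MAD (slides 11 and 13) by the mean observed snag values (slide 9). It assumes that the SPA column corresponds to SNAG_TPA_TOTAL and that the two sets of means are directly comparable; the dictionary names are illustrative.

```python
# Mean observed snag values from slide 9.
mean_observed = {"SPA": 4.8, "BA": 0.5, "QMD": 2.7, "Mean Ht": 14.9}
# Mean RMSD and mean MAD from slides 11 and 13.
mean_rmsd = {"SPA": 9.0, "BA": 0.8, "QMD": 3.0, "Mean Ht": 23.4}
mean_mad = {"SPA": 5.1, "BA": 0.5, "QMD": 2.3, "Mean Ht": 15.1}

# Ratio of each error statistic to the mean of the variable it describes.
relative = {
    name: (mean_rmsd[name] / mean_observed[name],
           mean_mad[name] / mean_observed[name])
    for name in mean_observed
}

for name, (rel_rmsd, rel_mad) in relative.items():
    print(f"{name}: RMSD/mean = {rel_rmsd:.2f}, MAD/mean = {rel_mad:.2f}")
```

Under these assumptions the mean RMSD exceeds the mean observed value for every snag variable, ranging from about 1.1 times (QMD) to about 1.9 times (snags per acre).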

16 Results 3) Which distance measure performs best? 4) What is the effect of increasing neighborhood size?

17

18

19

20

21 Results
3) Which distance measure performs best?
–All are generally similar
–randomForest imputations provide lower RMSD and MAD but under-predict more than others
4) What is the effect of increasing neighborhood size?
–Increasing k reduces RMSD and MAD
–Little effect on bias
–Slightly decreased range in imputed values with k = 10

22 Results 5) How do the results compare with random sampling? RMSD
                       k = 1   k = 2   k = 3   k = 4   k = 5  k = 10
SNAG_TPA_TOTAL   NN    11.24    9.96    9.15    8.70    8.62    8.35
                 p    0.0001
SNAG_BA_TOTAL    NN     0.95    0.84    0.77    0.73            0.71
                 p    0.0001
SNAG_QMD         NN     3.55    3.10    2.93    2.85    2.79    2.69
                 p    0.0001
SNAG_MEAN_HT     NN    27.89   24.38   22.78   22.02   21.69   20.87
                 p    0.0001

23 Results 5) How do the results compare with random sampling? Bias
                       k = 1   k = 2   k = 3   k = 4   k = 5  k = 10
SNAG_TPA_TOTAL   NN    -0.16   -0.18   -0.29   -0.19   -0.24   -0.33
                 p    0.0001
SNAG_BA_TOTAL    NN    -0.02   -0.03           -0.02   -0.03   -0.04
                 p    0.0001
SNAG_QMD         NN    -0.01    0.00                           -0.03
                 p    0.0001  0.0727  0.0286  0.0348  0.0054  0.0001
SNAG_MEAN_HT     NN    -0.46   -0.37   -0.56   -0.37   -0.44   -0.52
                 p    0.0001

24 Results 5) How do the results compare with random sampling? MAD
                       k = 1   k = 2   k = 3   k = 4   k = 5  k = 10
SNAG_TPA_TOTAL   NN     6.03    5.59    5.27    5.16    5.13    4.97
                 p    0.0001
SNAG_BA_TOTAL    NN     0.53    0.49    0.46    0.45    0.44    0.43
                 p    0.0001
SNAG_QMD         NN     2.51    2.38    2.30    2.26    2.23
                 p    0.0001
SNAG_MEAN_HT     NN    17.46   15.86   14.96   14.67   14.51   14.13
                 p    0.0001

25 Results 5) How do the results compare with random sampling? p-values of 0.0001 suggest that there is some underlying, very weak relationship between snags and overstory. Imputation is better than just randomly assigning snags to stands.
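A generic permutation test shows how such a comparison with random assignment can be computed. This is a sketch under stated assumptions, not the author's procedure: it shuffles the imputed values among plots, uses RMSD as the test statistic, and counts how often a random assignment fits at least as well as the real one. The function name `permutation_p_value` and the (count + 1) / (n_perm + 1) convention are choices made for the example. Under that convention, 9999 permutations give a smallest possible p-value of 1/10000 = 0.0001.

```python
import numpy as np

def permutation_p_value(observed, imputed, n_perm=9999, seed=0):
    """One-sided permutation p-value for RMSD versus random assignment."""
    rng = np.random.default_rng(seed)
    observed = np.asarray(observed, dtype=float)
    imputed = np.asarray(imputed, dtype=float)

    def rmsd(a, b):
        return np.sqrt(np.mean((a - b) ** 2))

    # Fit of the actual imputation.
    actual = rmsd(observed, imputed)

    # Count random reassignments that fit at least as well as the imputation.
    as_good = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(imputed)
        if rmsd(observed, shuffled) <= actual:
            as_good += 1

    # Include the actual assignment itself in numerator and denominator.
    return (as_good + 1) / (n_perm + 1)
```

A small p-value here only says that the imputation beats random assignment; it says nothing about whether the fit is good in absolute terms.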

26 Why didn’t it work better?
Very weak correlations between overstory and snags
–Snags are from prior stand
  Many of the snags in the FIA database have advanced decay classes
  Often snags are larger than QMD
–Management history
  Snags were removed at harvest
  Thinning captures mortality

27 Future Direction Assessing the effects of imputed data on habitat model outputs –If there are big differences then what?

