Multi-scale Analysis: Options for Modeling Presence/Absence of Bird Species
Kathryn M. Georgitis (1), Alix I. Gitelman (1), and Nick Danz (2)
(1) Statistics Department, Oregon State University
(2) Natural Resources Research Institute, University of Minnesota-Duluth
The research described in this presentation has been funded by the U.S. Environmental Protection Agency through the STAR Cooperative Agreement CR Program on Designs and Models for Aquatic Resource Surveys at Oregon State University. It has not been subjected to the Agency's review and therefore does not necessarily reflect the views of the Agency, and no official endorsement should be inferred.
Talk Overview
– Ecological Question of Interest
– Western Great Lakes Breeding Bird Study
– Interesting Features of our Example
– Options for Modeling Species Presence/Absence
  (1) Separate Models for Each Spatial Extent
  (2) One Model for all Spatial Extents
  (3) Model using Functionals of Explanatory Variables
  (4) Graphical Model
Ecological Question of Interest
– How does the relationship between landscape characteristics and presence of a bird species change with scale?
– What scale is the most useful in terms of understanding bird presence/absence?
Concentric Circle Sampling Design
[Figure: concentric circles with radii 100m, 500m, and 1000m]
Western Great Lakes Breeding Bird Study
Response Variable:
– Presence/absence of Pine Warbler
Explanatory Variables:
– % land cover within 4 different spatial extents
– Ten land cover types
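As a rough illustration of how "% land cover within a spatial extent" could be computed, the sketch below counts classified cells that fall inside a circle of a given radius. The cell coordinates, the helper `percent_cover`, and the toy landscape are all hypothetical; the study's actual GIS processing is not described in the talk.

```python
import math
from collections import Counter


def percent_cover(cells, radius):
    """Percent of each cover type among cells within `radius` metres of the origin."""
    inside = [cover for x, y, cover in cells if math.hypot(x, y) <= radius]
    if not inside:
        return {}
    counts = Counter(inside)
    return {cover: 100.0 * n / len(inside) for cover, n in counts.items()}


# Hypothetical toy landscape: cell centres (x, y) in metres and a cover label.
cells = [
    (10, 0, "pine and oak-pine"),
    (0, 50, "pine and oak-pine"),
    (-80, 0, "aspen-birch"),
    (0, -90, "spruce-fir"),
    (300, 0, "spruce-fir"),
    (0, 400, "spruce-fir"),
    (-450, 0, "aspen-birch"),
    (0, -480, "n. hardwoods"),
]

# The 500m circle contains every cell of the 100m circle, so the two
# summaries are computed from overlapping data.
for radius in (100, 500):
    print(radius, percent_cover(cells, radius))
```

Because each larger circle contains the smaller ones, the same cells contribute to several extents, which is one reason the measurements at different scales tend to be correlated.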
Interesting Features of the Data
Correlation between Explanatory Variables

Spatial Extent | pine and oak-pine / spruce-fir | lowland non-forest / n. hardwoods | n. hardwoods / aspen-birch
100m           | (0.08)                         | (0.08)                            | (0.08)
500m           | 0.03 (0.08)                    | (0.08)                            | (0.08)
1000m          | 0.11 (0.08)                    | (0.08)                            | (0.08)
5000m          | 0.21 (0.08)                    | (0.06)                            | (0.06)
Correlation Between Pine and Oak-Pine Measured at Different Scales

Spatial Extent | 100m | 500m        | 1000m       | 5000m
100m           | 1    | 0.81 (0.05) | 0.70 (0.06) | 0.45 (0.07)
500m           |      | 1           | 0.95 (0.03) | 0.70 (0.06)
1000m          |      |             | 1           | 0.79 (0.05)
Relationship between Land Cover Variables and Spatial Extent
Options for Modeling Presence/Absence of Pine Warbler
(1) Separate Models for Each Spatial Extent
(2) One Model for all Spatial Extents
(3) Model using Functionals of Explanatory Variables
(4) Bayesian Network (Graphical) Model
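Option 1: Separate Models Approach
(100m)  $M_1$: $\log(\pi(1-\pi)^{-1}) = X_1\beta_1$
(500m)  $M_5$: $\log(\pi(1-\pi)^{-1}) = X_5\beta_5$
(1000m) $M_{10}$: $\log(\pi(1-\pi)^{-1}) = X_{10}\beta_{10}$
(5000m) $M_{50}$: $\log(\pi(1-\pi)^{-1}) = X_{50}\beta_{50}$
where $Y$ denotes the n-length vector of binary responses with $\Pr(Y_i = 1) = \pi_i$, and $X_1$ denotes the matrix of explanatory variables at the 100m scale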
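A minimal sketch of the separate-models idea, assuming a single covariate per extent instead of the full matrix of land cover types. The fitting routine `fit_logistic` (Newton's method written out by hand), the simulated covariates, and the "true" coefficients are all hypothetical and only illustrate fitting one logistic regression per spatial extent.

```python
import math
import random


def fit_logistic(x, y, iterations=25):
    """Fit logit(pi_i) = b0 + b1 * x_i by Newton's method (one covariate)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            pi = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = pi * (1.0 - pi)
            g0 += yi - pi            # score for the intercept
            g1 += (yi - pi) * xi     # score for the slope
            h00 += w                 # entries of the information matrix X'WX
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


rng = random.Random(1)
n = 200
extents = ("100m", "500m", "1000m", "5000m")

# Hypothetical covariate: proportion of one land cover type at each extent.
cover = {e: [rng.random() for _ in range(n)] for e in extents}

# Hypothetical truth: presence depends on cover at the 100m extent only.
y = [1 if rng.random() < 1.0 / (1.0 + math.exp(-(-1.0 + 2.0 * x))) else 0
     for x in cover["100m"]]

# One model per extent, each fitted without reference to the others.
fits = {e: fit_logistic(cover[e], y) for e in extents}
for e, (b0, b1) in fits.items():
    print(f"{e}: intercept={b0:.2f}, slope={b1:.2f}")
```

Each fit uses only its own extent's covariate, so nothing in this setup can express a relationship between extents.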
Option 1: Separate Models Approach
Disadvantages:
– does not account for possible relationships between spatial extents
– multi-collinearity of explanatory variables
– $2^{10}$ possible models for each spatial extent
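The count of $2^{10}$ follows from each of the ten land cover types being either in or out of a main-effects model. The short sketch below enumerates the subsets to confirm the number; the variable names are placeholders, and it assumes main effects only (no interactions).

```python
from itertools import combinations

# Placeholder names for the ten land cover types at one spatial extent.
cover_types = [f"cover_{k}" for k in range(1, 11)]


def all_subsets(variables):
    """Yield every subset of the candidate variables, including the empty set."""
    for size in range(len(variables) + 1):
        yield from combinations(variables, size)


def model_count(n_variables):
    """Number of main-effects models: each variable is either in or out."""
    return 2 ** n_variables


n_models = sum(1 for _ in all_subsets(cover_types))
print(n_models, model_count(10))
```

Exhaustive enumeration is feasible at this size, but the count doubles with every added candidate variable.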
Option 2: One Model for all Spatial Extents
$M_{all}$: $\log(\pi(1-\pi)^{-1}) = X_{all}\beta_{all}$
where $Y$ denotes the n-length vector of binary responses with $\Pr(Y_i = 1) = \pi_i$, and $X_{all} = [\,\ldots\,]$
Option 2: One Model for all Spatial Extents
Advantages:
– allows for interactions between scales
Disadvantages:
– serious multi-collinearity problems
– $2^{30}$ possible models
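To give a feel for how serious the collinearity can be, the sketch below converts the reported between-scale correlations for pine and oak-pine cover into variance inflation factors. This uses the textbook two-predictor formula VIF = 1 / (1 - r^2) from linear regression as a heuristic; it is an illustrative assumption, not a diagnostic used in the talk.

```python
def vif_two_predictors(r):
    """Variance inflation factor when a model holds exactly two predictors with correlation r."""
    return 1.0 / (1.0 - r * r)


# Correlations of pine and oak-pine cover between pairs of extents, as reported in the talk.
reported = {
    "100m vs 500m": 0.81,
    "100m vs 1000m": 0.70,
    "100m vs 5000m": 0.45,
    "500m vs 1000m": 0.95,
    "1000m vs 5000m": 0.79,
}

for pair, r in reported.items():
    print(f"{pair}: r = {r:.2f}, VIF = {vif_two_predictors(r):.2f}")
```

Under this heuristic, putting the 500m and 1000m measurements of the same cover type in one model inflates coefficient variances roughly tenfold.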
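Option 3: Model using Functionals of Explanatory Variables
Difference Model
$M_{diff}$: $\log(\pi(1-\pi)^{-1}) = X_{diff}\beta_{diff}$, where $X_{diff}$ is the element-wise difference of the explanatory variables at two spatial extents
Proportional Model
$M_{prop}$: $\log(\pi(1-\pi)^{-1}) = X_{prop}\beta_{prop}$, where $X_{prop}$ is the element-wise ratio of the explanatory variables at two spatial extents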
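A small sketch of what element-wise functionals of two extents might look like. The matrices, the helper `elementwise`, and the direction of the operation (larger extent relative to smaller) are assumptions for illustration; the talk does not say which pair of extents or which ordering was used.

```python
def elementwise(op, a, b):
    """Apply op to matching entries of two equally shaped matrices (lists of rows)."""
    return [[op(u, v) for u, v in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# Hypothetical % land cover for 2 sites x 3 cover types at two extents.
x_small = [[40.0, 10.0, 50.0], [20.0, 30.0, 50.0]]
x_large = [[30.0, 20.0, 50.0], [25.0, 25.0, 50.0]]

# Difference functional: change in % cover from the smaller to the larger extent.
x_diff = elementwise(lambda u, v: u - v, x_large, x_small)

# Proportional functional: % cover at the larger extent relative to the smaller.
x_prop = elementwise(lambda u, v: u / v, x_large, x_small)

print(x_diff)
print(x_prop)
```

The ratio is undefined wherever a cover type is absent at the extent used as the denominator, which is a practical issue any proportional model would need to handle.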
Option 3: Model using Functionals of Explanatory Variables
Advantages:
– incorporates two spatial extents
Disadvantages:
– biologically meaningful?
– multi-collinearity
– model selection
Option 4: Graphical Model
– think of explanatory variables and response holistically (i.e., as a single multivariate observation)
[Figure: two diagrams on the nodes $X_1$, $X_2$, $X_3$, $X_4$, and $Y$, one labeled "Logistic Regression Model" and the other "Bayesian Network (Graphical) Model"]
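One way to read the contrast between the two diagrams is through what each model assigns a probability to. The expressions below are a generic sketch, not taken from the talk: $\mathrm{pa}(\cdot)$ denotes the parents of a node in whatever directed acyclic graph is chosen, and the actual edges of the authors' network are not recoverable from the slide.

```latex
% Logistic regression: only the response is modeled; X_1,...,X_4 are treated as fixed.
p(y \mid x_1, x_2, x_3, x_4)

% Bayesian network: every node is modeled given its parents in a directed acyclic graph.
p(y, x_1, x_2, x_3, x_4)
  = p\bigl(y \mid \mathrm{pa}(y)\bigr) \prod_{j=1}^{4} p\bigl(x_j \mid \mathrm{pa}(x_j)\bigr)
```

In the second form, a variable that is not a parent of $y$ can still influence it indirectly through the variables it does point to.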
Option 4: Graphical Model
For comparison with $M_{ALL}$, we use the same “explanatory” variables
[Figure: network diagram with nodes aspen-birch 100m, pine & oak-pine 100m, spruce-fir 1000m, spruce-fir 100m, n. hardwoods 100m, and Pine Warbler]
Option 4: Graphical Model

Diagram of $M_{ALL}$
[Figure: nodes spruce-fir 100m, pine & oak-pine 100m, spruce-fir 1000m, aspen-birch 100m, N. hardwoods 100m, and Pine Warbler]
$\log(\pi(1-\pi)^{-1}) = Z\beta_{all}$; $Z$ fixed

Diagram of Bayesian $M_{ALL}$
[Figure: the same nodes]
$Z \sim \text{Multinomial}(P, 100)$
$\log(\text{spruce-fir}_{1000}) \sim N(\cdot,\cdot)$
$\log(\pi(1-\pi)^{-1}) = Z\beta + \gamma\,\log(\text{spruce-fir}_{1000})$

where $Z$ = variables in $M_{ALL}$
Option 4: Graphical Model
Comparison of $M_{ALL}$ and Bayesian $M_{ALL}$
Option 4: Graphical Model

Bayesian $M_{ALL}$
[Figure: nodes spruce-fir 100m, pine & oak-pine 100m, spruce-fir 1000m, aspen-birch 100m, N. hardwoods 100m, and Pine Warbler]
$Z \sim \text{Multinomial}(P, 100)$
$\log(\text{spruce-fir}_{1000}) \sim N(\cdot,\cdot)$
$\log(\pi(1-\pi)^{-1}) = Z\beta + \gamma\,\log(\text{spruce-fir}_{1000})$
where $Z$ = variables in $M_{ALL}$

Bayesian Network Model
[Figure: the same nodes]
$Z_i \sim \text{Multinomial}(P_i, 100)$, $P_i = (P_{i,1}, P_{i,2}, P_{i,3}, P_{i,4}, P_{i,5})$
$\log(P_{i,1}/(1-P_{i,1})) = \delta\,\log(\text{spruce-fir}_{1000})$
$\log(\text{spruce-fir}_{1000}) \sim N(\cdot,\cdot)$
$\log(\pi(1-\pi)^{-1}) = \beta_0 + \beta_1\,(\text{pine \& oak-pine}_{100})$
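To make the structure of the Bayesian Network Model more concrete, the sketch below simulates one site by sampling each node given its parents. Everything numeric is hypothetical: the coefficient names and values (`delta`, `mu`, `sigma`, `beta0`, `beta1`), the equal split of the remaining multinomial probability, and the ordering of the five 100m cover categories are assumptions, since the slide does not preserve them.

```python
import math
import random


def inv_logit(t):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-t))


def simulate_site(rng, delta=0.5, mu=0.0, sigma=1.0, beta0=-1.0, beta1=0.05):
    """Draw one site from a hypothetical version of the network (all values assumed)."""
    # log(spruce-fir at 1000m) is normally distributed.
    log_sf1000 = rng.gauss(mu, sigma)
    # The first multinomial probability depends on log(spruce-fir at 1000m) via a logit link.
    p1 = inv_logit(delta * log_sf1000)
    # Assumption: the other four categories share the remaining probability equally.
    probs = [p1] + [(1.0 - p1) / 4.0] * 4
    # 100m cover composition: a multinomial draw with 100 trials over 5 categories.
    draws = rng.choices(range(5), weights=probs, k=100)
    z = [draws.count(k) for k in range(5)]
    # Assumption: category index 1 is pine & oak-pine at 100m.
    pine_oak_100 = z[1]
    # Presence of Pine Warbler depends only on pine & oak-pine at 100m.
    y = 1 if rng.random() < inv_logit(beta0 + beta1 * pine_oak_100) else 0
    return z, y


rng = random.Random(42)
sites = [simulate_site(rng) for _ in range(5)]
for z, y in sites:
    print(z, y)
```

In this sketch spruce-fir at 1000m never enters the presence equation directly; it affects presence only by shifting the 100m cover composition, which is the kind of indirect pathway a network model can represent.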
Option 4: Graphical Model Comparison of two Bayesian Network Models
Option 4: Graphical Model
Advantages:
– considers ecological system holistically
– can eliminate multi-collinearity
– biologically meaningful
Disadvantages:
– model selection
– implementation issues
Acknowledgements
– Don Stevens, OSU
– Jerry Niemi, N.R.R.I., Univ. of Minn., Duluth
– JoAnn Hanowski, N.R.R.I., Univ. of Minn., Duluth