1 Accounting for Spatial Dependence in Bayesian Belief Networks. Alix I. Gitelman, Statistics Department, Oregon State University. August 2003, JSM, San Francisco
2 Project Funding This research is funded by U.S. EPA – Science To Achieve Results (STAR) Program Cooperative Agreement #CR. The work reported here was developed under the STAR Research Assistance Agreement CR awarded by the U.S. Environmental Protection Agency (EPA) to Colorado State University. This presentation has not been formally reviewed by EPA. The views expressed here are solely those of the presenter and of STARMAP, the program she represents. EPA does not endorse any products or commercial services mentioned in this presentation.
3 Specific Points
–Bayesian Belief Networks
–Modeling Ecological Systems
–Macro-invertebrate Example
–Adding Spatial Correlation Structure
–Results
4 Bayesian Belief Networks
Graphical models (Lauritzen 1982; Pearl 1985, 1988, 2000).
–Joint probability distributions
–Nodes are random variables
–Edges are “influences”
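To make the "joint probability distribution" point concrete: in a belief network the joint distribution factors into one conditional distribution per node given its parents. The following is a minimal sketch on a hypothetical three-node chain A → B → C. The network, the variable names, and every probability in it are illustrative assumptions, not taken from the talk.

```python
from itertools import product

# Hypothetical binary network A -> B -> C; all numbers are illustrative only.
p_a = {0: 0.7, 1: 0.3}                     # P(A)
p_b_given_a = {0: {0: 0.9, 1: 0.1},        # P(B | A=0)
               1: {0: 0.4, 1: 0.6}}        # P(B | A=1)
p_c_given_b = {0: {0: 0.8, 1: 0.2},        # P(C | B=0)
               1: {0: 0.25, 1: 0.75}}      # P(C | B=1)

def joint(a, b, c):
    """P(A=a, B=b, C=c) as the product of each node's conditional given its parent."""
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]

# Summing the factored joint over every configuration should give 1.
total = sum(joint(a, b, c) for a, b, c in product((0, 1), repeat=3))
print(round(total, 10))
```

The edges determine which conditional tables are needed: here C depends on A only through B, so no table for P(C | A) appears.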
5 An Example
6 Example (cont’d) To estimate this model we assume that the n samples are independent. But ecological data are often collected across space and time, so this assumption may not hold.
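The role of the independence assumption can be written out explicitly. The notation here is assumed, since the slide's own symbols did not survive: x_i is the i-th sample, x_{ij} its value at node j, pa(x_{ij}) the values of that node's parents, and θ_j the parameters of node j. Under independence the likelihood is a product over samples, and each sample's contribution factors over the nodes of the network:

```latex
L(\theta) \;=\; \prod_{i=1}^{n} p(x_i \mid \theta)
        \;=\; \prod_{i=1}^{n} \prod_{j=1}^{J} p\bigl(x_{ij} \mid \mathrm{pa}(x_{ij}),\, \theta_j\bigr)
```

When samples are spatially or temporally dependent, the first equality no longer holds, which is what motivates adding a correlation structure.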
7 Understanding Mechanisms of Ecosystem Health Mid-Atlantic Integrated Assessment (MAIA) Program. Program to provide information on conditions of surface water resources in the Mid-Atlantic region. Focus on the condition of macro-invertebrates (BUGIBI).
8 One Piece of the Puzzle
9 The BUGIBI Data The MAIA data were collected (relatively) close together in space. Some species of macro-invertebrates can travel distances in the tens of kilometers. How can we account for spatial proximity?
10 Options for Dealing with Spatial Correlation
–Include location in the model
–Allow additional nodes based on location (i.e., spatial auto-correlation)
–Account for spatial dependence in the residuals (and only in the “response”)
–Some combination of these
11 Some Notation
–channel sediment (poor, medium, good)
–acid deposit (low, moderate, high)
–BUG index of biotic integrity
12 Model Specification continued…
13 Model Specification if these sites are within 30km
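The "within 30km" condition implies a neighborhood set for each site. The following is a minimal sketch of how such sets could be built. It assumes great-circle (haversine) distance and a strict less-than cutoff, neither of which is stated on the slide. The function names and the coordinates are hypothetical, not MAIA sites.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def neighbor_sets(sites, cutoff_km=30.0):
    """Map each site index to the set of other site indices closer than cutoff_km."""
    nbrs = {i: set() for i in range(len(sites))}
    for i in range(len(sites)):
        for j in range(i + 1, len(sites)):
            if haversine_km(*sites[i], *sites[j]) < cutoff_km:
                nbrs[i].add(j)
                nbrs[j].add(i)
    return nbrs

# Hypothetical coordinates: the first two points are about 14 km apart,
# and the third is well over 100 km from both.
sites = [(39.00, -78.00), (39.10, -78.10), (40.50, -76.00)]
print(neighbor_sets(sites))
```

A site with an empty neighbor set contributes no spatial term. Swapping `haversine_km` for another metric (for example, distance along the stream network) changes only the distance function.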
14 Prior Specification Regression coefficients are given diffuse Normal priors
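As an illustration of what "diffuse" means here, the sketch below evaluates the log-density of a zero-mean Normal prior with a large standard deviation. The value sd = 1000 and the function names are assumptions chosen for illustration. The slide does not give the prior variance.

```python
import math

def normal_logpdf(x, mean=0.0, sd=1000.0):
    """Log-density of a Normal(mean, sd^2) distribution at x."""
    return -0.5 * math.log(2 * math.pi * sd ** 2) - (x - mean) ** 2 / (2 * sd ** 2)

def log_prior(coefs, sd=1000.0):
    """Sum of independent zero-mean Normal log-priors over regression coefficients."""
    return sum(normal_logpdf(b, 0.0, sd) for b in coefs)

# With sd = 1000 (assumed), the prior is nearly flat over plausible coefficient values:
# the log-density at 0 and at 5 differs by only 25 / (2 * 1000**2).
print(normal_logpdf(0.0) - normal_logpdf(5.0))
```

Because the prior barely distinguishes between coefficient values on the scale of the data, the posterior is driven almost entirely by the likelihood.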
15 Results There are 206 sites. The largest neighborhood set has 5 sites in it. Roughly 2% of the pairwise distances are less than 30km.
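For a sense of scale, the figures above can be turned into an approximate count of close pairs. This is a back-of-the-envelope sketch. Since the slide says "roughly 2%", the resulting count is approximate as well.

```python
n_sites = 206
n_pairs = n_sites * (n_sites - 1) // 2        # number of distinct site pairs
approx_close_pairs = round(0.02 * n_pairs)    # "roughly 2%" of pairs within 30 km

print(n_pairs, approx_close_pairs)  # 21115 422
```

So only on the order of a few hundred of the 21,115 site pairs fall within the 30 km cutoff, consistent with small neighborhood sets.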
16 Results
17 Final Words Important additional information can be obtained by incorporating the spatial correlation component. This approach can be extended to other nodes of the BBN using a different spatial dependence structure, and/or a different distance metric for each node.
18 Acknowledgements Tom Dietterich, Steve Jensen, Scott Urquhart