1
Predicting Water Quality Impaired Stream Segments using Landscape-scale Data and a Regional Geostatistical Model
Erin E. Peterson, Postdoctoral Research Fellow
CSIRO Mathematical and Information Sciences Division
June 5, 2006
www.csiro.au
2
Space-Time Aquatic Resources Modeling and Analysis Program
This research is funded by the U.S. EPA Science To Achieve Results (STAR) Program, Cooperative Agreement # CR-829095.
The work reported here was developed under STAR Research Assistance Agreement CR-829095 awarded by the U.S. Environmental Protection Agency (EPA) to Colorado State University. This presentation has not been formally reviewed by EPA. EPA does not endorse any products or commercial services mentioned in this presentation.
3
Collaborators
Dr. David M. Theobald, Natural Resource Ecology Lab, Department of Recreation & Tourism, Colorado State University, USA
Dr. N. Scott Urquhart, Department of Statistics, Colorado State University, USA
Dr. Jay M. Ver Hoef, National Marine Mammal Laboratory, Seattle, USA
Andrew A. Merton, Department of Statistics, Colorado State University, USA
4
Overview
Introduction
Background
Patterns of spatial autocorrelation in stream water chemistry
Visualizing model predictions
Current and future research in SEQ
5
Purpose of Our Research
Water Quality Monitoring Goals
  Create a regional water quality assessment
  Identify water quality impaired stream segments
Purpose
  Demonstrate a geostatistical methodology based on coarse-scale GIS data and field surveys
  Predict water quality characteristics of stream segments throughout a region
6
How are geostatistical models different from traditional statistical models?
Traditional statistical models (non-spatial)
  Residual error (ε) is assumed to be uncorrelated
  ε = unexplained variability in the data
Geostatistical models
  Residual errors are correlated through space
  Spatial patterns in residual error result from unidentified process(es)
  Model the spatial structure in the residual error
  Explain additional variability in the data
  Generate predictions at unobserved sites
7
Geostatistical Modelling
Fit an autocovariance function to data
  Describes relationship between observations based on separation distance
[Figure: semivariance plotted against separation distance, with the nugget, sill, and range marked]
Autocovariance Parameters
1) Nugget: variation between sites as separation distance approaches zero
2) Sill: the value at which the semivariance levels off (its asymptote)
3) Range: distance within which spatial autocorrelation occurs
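The three autocovariance parameters can be illustrated with a small sketch. It assumes an exponential semivariogram, and it assumes the "practical range" convention, in which the curve reaches about 95% of the way from nugget to sill at the range. The function name and the parameter values are hypothetical and are not taken from the presentation.

```python
import numpy as np

def exponential_semivariogram(h, nugget, sill, practical_range):
    """Semivariance at separation distance(s) h under an exponential model.

    nugget: semivariance as h approaches zero
    sill: value at which the semivariance levels off (nugget included)
    practical_range: distance at which about 95% of the rise from nugget
        to sill has occurred (an assumed convention)
    """
    h = np.asarray(h, dtype=float)
    partial_sill = sill - nugget
    gamma = nugget + partial_sill * (1.0 - np.exp(-3.0 * h / practical_range))
    # The semivariance is exactly zero at zero separation.
    return np.where(h > 0, gamma, 0.0)

# Hypothetical parameter values, chosen only for illustration.
distances = np.array([0.0, 1.0, 5.0, 10.0, 50.0])
print(exponential_semivariogram(distances, nugget=1.0, sill=3.0, practical_range=10.0))
```

The jump from 0 to the nugget just above zero separation is what the nugget parameter describes; beyond the range, pairs of sites are effectively uncorrelated.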
8
Distance Measures and Spatial Relationships
Straight-line distance (SLD): as the crow flies
[Figure: sites A, B, and C]
9
Distance Measures and Spatial Relationships
Symmetric hydrologic distance (SHD): as the fish swims
[Figure: sites A, B, and C]
10
Distance Measures and Spatial Relationships
Weighted asymmetric hydrologic distance (WAHD): as the water flows
  Incorporates flow direction and flow volume
[Figure: sites A, B, and C]
Ver Hoef, J.M., Peterson, E.E., and Theobald, D.M. (2006) Spatial Statistical Models that Use Flow and Stream Distance, Environmental and Ecological Statistics, to appear.
11
Distance Measures and Spatial Relationships
Challenge: spatial autocovariance models developed for SLD may not be valid for hydrologic distances
- The covariance matrix is not guaranteed to be positive definite
[Figure: sites A, B, and C]
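Whether a candidate covariance matrix is positive definite can be checked numerically. The sketch below is one way to do that, assuming a symmetric matrix and an eigenvalue test; the function name, the tolerance, and the example coordinates are illustrative choices, not part of the presentation.

```python
import numpy as np

def is_positive_definite(cov, tol=1e-10):
    """Return True if cov is symmetric and all eigenvalues exceed tol."""
    cov = np.asarray(cov, dtype=float)
    if cov.ndim != 2 or cov.shape[0] != cov.shape[1]:
        return False
    if not np.allclose(cov, cov.T):
        return False
    # eigvalsh is intended for symmetric matrices and returns real eigenvalues.
    return bool(np.linalg.eigvalsh(cov).min() > tol)

# Exponential covariance built from straight-line distances between
# three hypothetical sites; this combination is a valid model.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
sld = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
print(is_positive_definite(np.exp(-sld)))
```

The same check can be applied to a matrix built from hydrologic distances to see whether a given autocovariance function remains valid for that distance measure.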
12
Asymmetric Autocovariance Models for Stream Networks
Weighted asymmetric hydrologic distance (WAHD)
  Developed by Jay Ver Hoef, National Marine Mammal Laboratory, Seattle, WA, USA
  Moving average models
  Incorporate flow volume, flow direction, and use hydrologic distance
  Positive definite covariance matrices
[Figure: flow direction along a stream network]
Ver Hoef, J.M., Peterson, E.E., and Theobald, D.M., Spatial Statistical Models that Use Flow and Stream Distance, Environmental and Ecological Statistics. In press.
13
Objectives
Evaluate 8 chemical response variables
1. pH measured in the lab (PHLAB)
2. Conductivity (COND) measured in the lab, μmho/cm
3. Dissolved oxygen (DO), mg/l
4. Dissolved organic carbon (DOC), mg/l
5. Nitrate-nitrogen (NO3), mg/l
6. Sulfate (SO4), mg/l
7. Acid neutralizing capacity (ANC), μeq/l
8. Temperature (TEMP), °C
Determine which distance measure is most appropriate
  SLD, SHD, WAHD? More than one?
Find the range of spatial autocorrelation
14
Maryland Biological Stream Survey (MBSS) Data
Maryland Department of Natural Resources, Maryland, USA
1995, 1996, 1997
Stratified probability-based random survey design
1st, 2nd, and 3rd order non-tidal streams
955 sites; 881 sites after pre-processing
17 interbasins
15
Study Area
[Map: Maryland, USA, showing Baltimore, Annapolis, Washington D.C., and Chesapeake Bay]
16
Spatial Distribution of MBSS Data N
17
Functional Linkage of Watersheds and Streams (FLoWS)
Create data for geostatistical modelling
1. Calculate watershed covariates for each stream segment
2. Calculate separation distances between sites
   SLD, SHD, asymmetric hydrologic distance (AHD)
3. Calculate the spatial weights for the WAHD
4. Convert GIS data to a format compatible with statistics software
[Figure: three sites (1, 2, 3) illustrating SLD, SHD, and AHD]
FLoWS website: http://www.nrel.colostate.edu/projects/starmap
18
Spatial Weights for WAHD
Proportional influence (PI): influence of each neighboring survey site on a downstream survey site
  Weighted by catchment area: surrogate for flow volume
1. Calculate the PI of each upstream segment on the segment directly downstream
2. Calculate the PI of one survey site on another site
   Flow-connected sites
   Multiply the segment PIs
[Figure: sites A, B, and C; watershed segments A and B]
Segment PI of A = Watershed Area A / (Watershed Area A + Watershed Area B)
19
Spatial Weights for WAHD
[Figure: stream network with segments labeled A-H; legend: survey sites, stream segment]
20
Spatial Weights for WAHD
[Figure: stream network with segments labeled A-H]
Site PI = B * D * F * G
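The two-step PI calculation can be sketched as follows. The functions follow the formulas on the slides (segment PI as a share of combined watershed area at a confluence, site PI as the product of segment PIs along the flow path); the function names and the watershed areas are hypothetical.

```python
from math import prod

def segment_pi(area_upstream, area_other):
    """PI of an upstream segment on the segment directly downstream.

    Computed as the segment's watershed area divided by the combined
    watershed area of the two segments meeting at the confluence.
    """
    return area_upstream / (area_upstream + area_other)

def site_pi(segment_pis):
    """PI of one survey site on a flow-connected downstream site.

    Product of the PIs of the segments along the path between the sites.
    """
    return prod(segment_pis)

# Hypothetical watershed areas (km^2) at two successive confluences.
pi_first = segment_pi(30.0, 10.0)    # 0.75
pi_second = segment_pi(40.0, 40.0)   # 0.5
print(site_pi([pi_first, pi_second]))  # 0.375
```

Because the two segment PIs at a confluence sum to one, the weights behave like shares of flow volume, with catchment area standing in for discharge.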
21
Data for Geostatistical Modelling
Distance matrices
  SLD, SHD, AHD
Spatial weights matrix
  Contains flow-dependent weights for WAHD
Watershed covariates
  Lumped watershed covariates
  Mean elevation, % urban
Observations
  MBSS survey sites
22
Geostatistical Modeling Methods
Validation set
  Unique for each chemical response variable
Initial covariate selection
  5 covariates
Model development
  Restricted model space to all possible linear models
  4 model sets
23
Geostatistical Modelling Methods
Geostatistical model parameter estimation
Maximize the profile log-likelihood function
  Log-likelihood function of the parameters (the regression coefficients β, the variance σ², and the autocorrelation parameters) given the observed data Z: [equation]
  Maximizing the log-likelihood with respect to β and σ² yields the maximum likelihood estimators (MLEs) of β and σ²: [equations]
  Both maximum likelihood estimators can be written as functions of the autocorrelation parameters alone
  Derive the profile log-likelihood function by substituting the MLEs back into the log-likelihood function
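The equations on this slide did not survive extraction. For orientation, the standard Gaussian form of this derivation is sketched below. It is the textbook result, not a copy of the slide: the notation is assumed, with X the design matrix of covariates, θ the vector of autocorrelation parameters, and the covariance written as σ²Σ(θ).

```latex
% Log-likelihood of (theta, beta, sigma^2) given the n observations Z
\ell(\theta, \beta, \sigma^2; Z) =
  -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log\sigma^2
  - \frac{1}{2}\log\lvert\Sigma(\theta)\rvert
  - \frac{1}{2\sigma^2}(Z - X\beta)^{\top}\Sigma(\theta)^{-1}(Z - X\beta)

% Maximizing over beta and sigma^2 for fixed theta
\hat{\beta}(\theta) =
  \bigl(X^{\top}\Sigma(\theta)^{-1}X\bigr)^{-1}X^{\top}\Sigma(\theta)^{-1}Z,
\qquad
\hat{\sigma}^2(\theta) =
  \frac{1}{n}\bigl(Z - X\hat{\beta}(\theta)\bigr)^{\top}
  \Sigma(\theta)^{-1}\bigl(Z - X\hat{\beta}(\theta)\bigr)

% Substituting both estimators back gives the profile log-likelihood
\ell_{p}(\theta; Z) =
  -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log\hat{\sigma}^2(\theta)
  - \frac{1}{2}\log\lvert\Sigma(\theta)\rvert - \frac{n}{2}
```

The last term follows because the quadratic form evaluated at the estimators equals n times the estimated variance, so only θ remains to be optimized numerically.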
24
Geostatistical Modeling Methods
Correlation matrix for SLD and SHD models
  Fit exponential autocorrelation function [equation]
  where C1 is the correlation based on the distance between two sites, h, given the autocorrelation parameter estimates: nugget, sill, and range.
Correlation matrix for WAHD model
  Fit exponential autocorrelation function (C1)
  Take the Hadamard (element-wise) product of C1 and the square root of the spatial weights matrix, which is forced into symmetry
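The WAHD construction can be sketched in a few lines of NumPy. Several details here are assumptions rather than the presenters' method: the exponential correlation is written with a single range parameter and no nugget, the weights matrix is assumed to hold each flow-connected pair in one triangle only with a zero diagonal, and symmetry is forced by adding the transpose and setting the diagonal to one. All names and values are hypothetical.

```python
import numpy as np

def exponential_correlation(dist, range_param):
    """Exponential correlation exp(-h / range) for a distance matrix."""
    return np.exp(-np.asarray(dist, dtype=float) / range_param)

def symmetrize_weights(weights):
    """Force an asymmetric weights matrix into symmetry (assumed scheme).

    Assumes a zero diagonal and at most one nonzero entry per site pair.
    """
    w = np.asarray(weights, dtype=float)
    sym = w + w.T
    np.fill_diagonal(sym, 1.0)
    return sym

def wahd_correlation(dist, weights, range_param):
    """Hadamard product of C1 and the square root of the symmetric weights."""
    c1 = exponential_correlation(dist, range_param)
    return c1 * np.sqrt(symmetrize_weights(weights))

# Sites 0 and 1 both flow to site 2 but are not flow-connected to each other.
dist = np.array([[0.0, 6.0, 4.0], [6.0, 0.0, 2.0], [4.0, 2.0, 0.0]])
weights = np.array([[0.0, 0.0, 0.75], [0.0, 0.0, 0.25], [0.0, 0.0, 0.0]])
print(wahd_correlation(dist, weights, range_param=5.0))
```

A zero weight between sites that are not flow-connected sets their correlation to zero regardless of how close they are along the network.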
25
Geostatistical Modeling Methods
Model selection within model set
  GLM: Akaike Information Corrected Criterion (AICC)
  Geostatistical models: Spatial AICC (Hoeting et al., in press) [equation]
  where n is the number of observations, p-1 is the number of covariates, and k is the number of autocorrelation parameters.
  http://www.stat.colostate.edu/~jah/papers/spavarsel.pdf
Model selection between model types
  100 predictions: universal kriging algorithm
  Mean square prediction error (MSPE)
  Cannot use AICC to compare models based on different distance measures
Model comparison
  r² for observed vs. predicted values
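The three criteria can be sketched as small functions. The spatial AICC expression is written as recalled from Hoeting et al. and should be checked against that paper; r² is computed here as the squared Pearson correlation between observed and predicted values, which is an assumption about how the slide's r² was obtained. Function names and example numbers are hypothetical.

```python
import numpy as np

def mspe(observed, predicted):
    """Mean square prediction error over a validation set."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean((observed - predicted) ** 2))

def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    r = np.corrcoef(observed, predicted)[0, 1]
    return float(r ** 2)

def spatial_aicc(log_likelihood, n, p, k):
    """Spatial AICC: -2 log L + 2n (p + k + 1) / (n - p - k - 2).

    n: number of observations; p - 1: number of covariates;
    k: number of autocorrelation parameters.
    """
    return -2.0 * log_likelihood + 2.0 * n * (p + k + 1) / (n - p - k - 2)

# Hypothetical validation values, for illustration only.
obs = [1.0, 2.0, 3.0]
pred = [1.0, 2.0, 5.0]
print(mspe(obs, pred), r_squared(obs, pred))
```

MSPE is computed on held-out validation sites, which is why it can compare models built on different distance measures when the AICC cannot.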
26
Results
Summary statistics for distance measures
  Spatial neighborhood differs
  Affects number of neighboring sites
  Affects median, mean, and maximum separation distance
[Table: summary statistics for distance measures in kilometers using DO (n=826)]
* Asymmetric hydrologic distance is not weighted here
27
Results
Range of spatial autocorrelation differs
  Shortest for SLD
  TEMP = shortest range values
  DO = largest range values
Mean range values
  SLD = 28.2 km
  SHD = 88.03 km
  WAHD = 57.8 km
[Figure: range values for the SLD, SHD, and WAHD models; labeled values 180.79 and 301.76]
28
Results
[Figure: MSPE by distance measure (GLM, SLD, SHD, WAHD)]
GLM always has less predictive ability
More than one distance measure usually performed well
- SLD, SHD, WAHD: PHLAB & DOC
- SLD and SHD: ANC, DO, NO3
- WAHD & SHD: COND, TEMP
SLD distance: SO4
29
Predictive ability of models
[Figure: r² by model type (GLM, SLD, SHD, WAHD)]
Strong: ANC, COND, DOC, NO3, PHLAB
Weak: DO, TEMP, SO4
30
Discussion
Distance measure influences how spatial relationships are represented in a stream network
  Site's relative influence on other sites
  Dictates form and size of spatial neighborhood
Important because...
  Impacts accuracy of the geostatistical model predictions
[Figure: panels for SLD, SHD, and WAHD]
31
Discussion
Geostatistical models describe more variability than GLM
Patterns of spatial autocorrelation found at relatively coarse scale
More than one distance measure performed well
  SLD never substantially inferior
  Do not represent movement through network
Different range of spatial autocorrelation?
  Larger SHD and WAHD range values
  Separation distance larger when restricted to network
SLD, SHD, and WAHD represent spatial autocorrelation in continuous coarse-scale variables
32
Discussion
Probability-based random survey design negatively affected WAHD
  Maximize spatial independence of sites
  Does not represent spatial relationships in networks
  Validation sites randomly selected
[Figure: frequency of sites by number of neighboring sites]
244 sites did not have neighbors
Sample size = 881
Number of sites with ≤1 neighbor: 393
Mean number of neighbors per site: 2.81
33
Discussion
WAHD models explained more variability as neighboring sites increased
Not when neighbors had:
  Similar watershed conditions
  Significantly different chemical response values
[Figure: difference (O - E) versus number of neighboring sites (0 to 17) for the WAHD and GLM models]
34
GLM predictions improved as number of neighbors increased
Clusters of sites in space have similar watershed conditions
- Statistical regression pulled towards the cluster
GLM contained hidden spatial information
- Explained additional variability in data with more neighbors
[Figure: difference (O - E) versus number of neighboring sites (0 to 17) for the WAHD and GLM models]
35
Predictive Ability of Geostatistical Models
[Figure: r² (0 to 1.0) for PH, ANC, NO3, COND, DOC, SO4, DO, and TEMP, shown against the scale (coarse to fine) of unknown influential processes]
36
Conclusions
1) Spatial autocorrelation exists in stream chemistry data at a relatively coarse scale
2) Geostatistical models improve the accuracy of water chemistry predictions
3) Patterns of spatial autocorrelation differ between chemical response variables
   Ecological processes acting at different spatial scales affect conditions at the survey site
4) SLD is the most suitable distance measure in Maryland for these chemical response variables at this time
   Unsuitable survey designs
   SHD: GIS processing time is prohibitive
37
Conclusions
5) Results are scale specific
   Spatial patterns change with survey scale
   Other patterns may emerge at shorter separation distances
6) Further research is needed at finer scales
   Watershed or small stream network
38
Visualization of Model Predictions
Demonstrate how a geostatistical methodology can be used to complement regional water quality monitoring efforts
1) Predict regional water quality conditions
2) Identify the spatial location of potentially impaired stream segments
39
MBSS 1996 DOC
40
Spatial Patterns in Model Fit Squared Prediction Error (SPE)
41
Generate Model Predictions
Prediction sites
  Study area
  - 1st, 2nd, and 3rd order non-tidal streams
  - 3083 segments = 5973 stream km
  ID downstream node of each segment
  - Create prediction site
  More than one site at each confluence
Generate predictions and prediction variances
  SLD Mariah model
  Universal kriging algorithm
Assigned predictions and prediction variances back to stream segments in GIS
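The prediction step can be illustrated with the textbook universal kriging equations. This is a minimal sketch, not the presenters' code: it assumes the covariance matrices have already been built from a fitted autocovariance model, and the function name, the argument names, and the example data (which use an exponential covariance rather than the Mariah model) are hypothetical.

```python
import numpy as np

def universal_kriging(X, z, cov, X0, cov0, var0):
    """Universal kriging predictions and prediction variances.

    X: (n, p) covariates at observed sites (first column of ones)
    z: (n,) observed values
    cov: (n, n) covariance among observed sites
    X0: (m, p) covariates at prediction sites
    cov0: (n, m) covariance between observed and prediction sites
    var0: scalar or (m,) variance at the prediction sites
    """
    X, z, cov = (np.asarray(a, dtype=float) for a in (X, z, cov))
    X0, cov0 = np.asarray(X0, dtype=float), np.asarray(cov0, dtype=float)
    ci_X = np.linalg.solve(cov, X)        # cov^-1 X
    ci_c0 = np.linalg.solve(cov, cov0)    # cov^-1 cov0
    xtcix = X.T @ ci_X
    # Generalized least squares estimate of the regression coefficients.
    beta = np.linalg.solve(xtcix, ci_X.T @ z)
    resid = z - X @ beta
    pred = X0 @ beta + ci_c0.T @ resid
    # The last term accounts for uncertainty in the estimated coefficients.
    d = X0.T - X.T @ ci_c0
    var = (np.asarray(var0, dtype=float) - np.sum(cov0 * ci_c0, axis=0)
           + np.sum(d * np.linalg.solve(xtcix, d), axis=0))
    return pred, var

# Hypothetical one-dimensional example with an exponential covariance.
locs = np.array([0.0, 1.0, 2.5, 4.0])
cov = 2.0 * np.exp(-np.abs(locs[:, None] - locs[None, :]) / 2.0)
X = np.column_stack([np.ones(4), locs])
z = np.array([1.0, 2.0, 0.5, 3.0])
x_new = np.array([[1.0, 3.0]])
c_new = 2.0 * np.exp(-np.abs(locs - 3.0) / 2.0)[:, None]
print(universal_kriging(X, z, cov, x_new, c_new, 2.0))
```

With no nugget, predicting at an observed site returns the observed value with zero prediction variance, which is a convenient sanity check on an implementation.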
42
DOC Predictions (mg/l)
43
Weak Model Fit
44
Strong Model Fit
45
Water Quality Attainment by Stream Kilometres
Threshold values for DOC
  Set by Maryland Department of Natural Resources
  High DOC values may indicate biological or ecological stress
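Summarizing attainment by stream kilometres amounts to classifying each segment's predicted value against the thresholds and summing segment lengths per class. The sketch below shows one way to do this. The threshold values, predictions, and segment lengths are invented for illustration and are not the Maryland Department of Natural Resources thresholds.

```python
import numpy as np

def attainment_by_km(predictions, lengths_km, thresholds):
    """Total stream kilometres in each class defined by the thresholds.

    Class 0 holds predictions below thresholds[0], class 1 holds values
    from thresholds[0] up to (not including) thresholds[1], and so on.
    """
    predictions = np.asarray(predictions, dtype=float)
    lengths_km = np.asarray(lengths_km, dtype=float)
    classes = np.digitize(predictions, thresholds)
    return np.bincount(classes, weights=lengths_km, minlength=len(thresholds) + 1)

# Hypothetical DOC predictions (mg/l), segment lengths (km), and thresholds.
doc_pred = [2.0, 6.0, 9.0, 4.0]
seg_km = [1.0, 2.0, 3.0, 4.0]
print(attainment_by_km(doc_pred, seg_km, thresholds=[5.0, 8.0]))
```

The same summary could be repeated using the prediction variances to flag segments whose class assignment is uncertain.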
46
Current and Future Research in SEQ
Different ways to capture spatial information
1) Geostatistical models
   Attempt to explain spatial relationship between response variables
   May represent another ecological process that is affecting them
2) Spatial location of covariates
   Does the spatial location of landuse within the watershed affect the response?
   Does the spatial configuration of landuse affect the response?
3) Stream network configuration and connectivity
   How does the configuration of the network affect the response?
   Are stream segments within one network really connected?
47
Geostatistical Models
Covariance Matched Constrained Kriging (CMCK)
[Equation, with its terms annotated as follows]
  Mean: constant here but might incorporate other covariates
  Weight function for relative stream orders or watershed areas
  Independent Gaussian process
  Kernel function: governs spatial dependence
  |u - s| = river distance d
Cressie, N., Frey, J., Harch, B., and Smith, M.: 2006, 'Spatial Prediction on a River Network', Journal of Agricultural, Biological, and Environmental Statistics, to appear.
48
Geostatistical Models
Covariance Matched Constrained Kriging (CMCK)
Combination of distance measures
[Figure: sites A, B, and C]
Cressie, N., Frey, J., Harch, B., and Smith, M.: 2006, 'Spatial Prediction on a River Network', Journal of Agricultural, Biological, and Environmental Statistics, to appear.
49
Geostatistical Models and the EHMP
Develop geostatistical models
  Individual indices and multivariate indicators
  Fish, Invertebrates, Physical/Chemical, Nutrients, Ecosystem Processes
Determine which distance measure(s) to use
  One distance measure: SLD, SHD, WAHD
  More than one distance measure: CMCK (covariance matched constrained kriging)
  Based on statistical evidence, ecological expertise, and survey design
Make model predictions
50
Spatial Location of Watershed Attributes Lumped non-spatial watershed attributes
51
Spatial Location of Watershed Attributes
Buffer streams using straight-line distance
Straight-line distance from stream outlet
Overland hydrologic distance + instream distance to stream outlet
Overland hydrologic distance to stream
52
Spatial Configuration of Watershed Attributes
How large or small are patches of landuse?
How complex is the shape?
Is landuse clumped or dissected?
Is landuse adjacent to stream?
53
Network Configuration
54
Network Connectivity
[Figure legend: marker = survey site]
55
Network Connectivity
Represent connectivity on a regional scale
[Figure legend: barrier; marker = survey site]
56
Network Connectivity
Define individual networks
57
Network Configuration and Connectivity
Measure network size and complexity
58
Questions? Comments?
Erin E. Peterson
Phone: +61 7 3214 2914
Email: Erin.Peterson@csiro.au
www.csiro.au