WFM 6202: Remote Sensing and GIS in Water Management [Part-B: Geographic Information System (GIS)] Lecture-6: Geo-statistical Analysis Akm Saiful Islam Institute of Water and Flood Management (IWFM) Bangladesh University of Engineering and Technology (BUET) January, 2009
Geo-statistical Analyst of ArcGIS This training will be on: Histogram; Normal QQ plot; Trend Analysis; Creating a prediction map using the geo-statistical wizard; Semivariogram / covariance modeling; Searching neighbor; Creating a prediction standard error map; Display Formats. Input Data: Groundwater well data of Dinajpur district of Bangladesh.
Steps in Geo-statistical Analyst 1. Representation of the Data 2. Explore the Data 3. Fit a model (create surface) 4. Perform Diagnostics
Representation of Data Representing the data is a vital first step in assessing the validity of the data and identifying external factors that may ultimately play a role in the distribution of data.
Explore the Data Exploring the data means examining the distribution of the data, looking for data trends, looking for global and local outliers, examining spatial autocorrelation, and understanding the co-variation among multiple data sets.
Explore data Tools: Histogram; Q-Q plot; Trend Analysis; Semivariogram; Voronoi map; Cross covariance
Histogram Shows the frequency distribution as a bar graph that displays how often observed values fall within certain intervals or classes.
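The class counts behind such a bar graph can be sketched in a few lines. This is only an illustration: the function name `histogram`, the equal-width classes, and the depth values are assumptions made for the example, not part of the exercise data.

```python
def histogram(values, n_bins):
    """Count how many values fall in each of n_bins equal-width classes."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins  # assumes the values are not all identical
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last class.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts

# Hypothetical groundwater depths in metres (illustrative values only).
depths = [2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 9.0, 10.0]
edges, counts = histogram(depths, 4)
print(edges, counts)
```

Most of the hypothetical values fall in the first two classes and a few in the last, which is the shape of a positively skewed data set.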
Normal distribution Skewness is zero for a normal distribution. Figure panels: normally distributed; positively skewed.
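Skewness can be checked numerically before looking at the histogram. The sketch below uses the moment-based coefficient (third central moment divided by the cube of the standard deviation); the function name and the sample values are assumptions for illustration.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]            # symmetric about its mean
right_skewed = [1, 1, 2, 2, 3, 10]     # long tail toward high values
print(skewness(symmetric), skewness(right_skewed))
```

A value near zero is consistent with a symmetric distribution such as the normal; a clearly positive value indicates a tail toward high values.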
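Q-Q Plot A Normal QQ Plot is created by plotting each data value against the value of a standard normal distribution at which the two cumulative distributions are equal. If the data are normally distributed, the points fall close to a straight line.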
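The points of a normal QQ plot can be computed as follows. This is a minimal sketch: the plotting position (i + 0.5) / n for the cumulative probability of the i-th ordered value is an assumption, and other conventions exist.

```python
from statistics import NormalDist

def normal_qq_points(values):
    """Return (standard normal quantile, data value) pairs for a QQ plot."""
    n = len(values)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    ordered = sorted(values)
    # Assumed plotting position: cumulative probability (i + 0.5) / n.
    return [(std_normal.inv_cdf((i + 0.5) / n), v)
            for i, v in enumerate(ordered)]

points = normal_qq_points([3.0, 1.0, 2.0])
for q, v in points:
    print(round(q, 3), v)
```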
Trend Analysis The Trend Analysis tool provides a three-dimensional perspective of the data. The locations of sample points are plotted on the x,y plane. Above each sample point, the value is given by the height of a stick in the z dimension. The unique feature of the Trend Analysis tool is that the values are then projected onto the x,z plane and the y,z plane as scatter plots. This can be thought of as sideways views through the three-dimensional data. Polynomials are then fit through the scatter plots on the projected planes.
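The projection idea can be sketched by fitting a polynomial to each side view. The example below fits only a first-order polynomial (a straight line) by least squares to the x,z and y,z projections; the function name and the sample points are assumptions, and the tool itself may fit higher orders.

```python
def fit_line(u, z):
    """Least-squares line z = slope * u + intercept."""
    n = len(u)
    mean_u, mean_z = sum(u) / n, sum(z) / n
    suu = sum((a - mean_u) ** 2 for a in u)
    suz = sum((a - mean_u) * (b - mean_z) for a, b in zip(u, z))
    slope = suz / suu
    return slope, mean_z - slope * mean_u

# Hypothetical samples (x, y, z) with a trend in x only: z = 2x + 1.
samples = [(0, 0, 1), (1, 0, 3), (0, 1, 1), (1, 1, 3), (2, 0, 5), (2, 1, 5)]
xs, ys, zs = zip(*samples)

trend_xz = fit_line(xs, zs)  # projection onto the x,z plane
trend_yz = fit_line(ys, zs)  # projection onto the y,z plane
print(trend_xz, trend_yz)
```

A non-zero slope in one projection and a flat line in the other suggests a trend along a single direction.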
Voronoi map Voronoi maps are constructed from a series of polygons formed around the location of a sample point. Voronoi polygons are created so that every location within a polygon is closer to the sample point in that polygon than any other sample point. After the polygons are created, neighbors of a sample point are defined as any other sample point whose polygon shares a border with the chosen sample point. For example, in the following figure, the bright green sample point is enclosed by a polygon, given as red. Every location within the red polygon is closer to the bright green sample point than any other sample point (given as small dark blue dots). The blue polygons all share a border with the red polygon, so the sample points within the blue polygons are neighbors of the bright green sample point.
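The neighbor definition can be illustrated without constructing exact polygons. The sketch below is a raster approximation, not the method used by the software: each cell of a fine grid is assigned to its closest sample point, and two sample points are treated as neighbors when their cells touch along an edge. The function name, grid size, and sample layout are assumptions.

```python
def voronoi_neighbors(points, xmin, xmax, ymin, ymax, n=60):
    """Approximate Voronoi neighbours on an n x n raster of cell centres."""
    dx, dy = (xmax - xmin) / n, (ymax - ymin) / n

    def nearest(cx, cy):
        # Index of the sample point closest to the cell centre.
        return min(range(len(points)),
                   key=lambda k: (points[k][0] - cx) ** 2 + (points[k][1] - cy) ** 2)

    owner = [[nearest(xmin + (i + 0.5) * dx, ymin + (j + 0.5) * dy)
              for i in range(n)] for j in range(n)]
    neighbors = {k: set() for k in range(len(points))}
    for j in range(n):
        for i in range(n):
            a = owner[j][i]
            # Compare with the cell to the right and the cell above.
            for jj, ii in ((j, i + 1), (j + 1, i)):
                if jj < n and ii < n and owner[jj][ii] != a:
                    b = owner[jj][ii]
                    neighbors[a].add(b)
                    neighbors[b].add(a)
    return neighbors

# Hypothetical 3 x 3 lattice of sample points; index 4 is the centre point.
lattice = [(x, y) for y in (1, 3, 5) for x in (1, 3, 5)]
nbrs = voronoi_neighbors(lattice, 0, 6, 0, 6)
print(sorted(nbrs[4]))
```

For the centre point only the four points sharing a polygon edge are returned; the diagonal points touch its polygon at a single corner and are not counted.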
Cross covariance The Crosscovariance cloud shows the empirical crosscovariance for all pairs of locations between two datasets and plots them as a function of the distance between the two locations.
Fit a Model A wide variety of interpolation methods is available to create a surface. Two main groups of interpolation techniques: 1. Deterministic 2. Geo-statistical
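The points of such a cloud can be sketched as follows, assuming the empirical crosscovariance of a pair is the product of each value's deviation from its own dataset mean. The function name and the two small datasets are hypothetical.

```python
from math import hypot

def crosscovariance_cloud(ds1, ds2):
    """Return (distance, crosscovariance) for every pair between two datasets.

    Each dataset is a list of (x, y, value) tuples.
    """
    mean1 = sum(z for _, _, z in ds1) / len(ds1)
    mean2 = sum(z for _, _, z in ds2) / len(ds2)
    cloud = []
    for x1, y1, z1 in ds1:
        for x2, y2, z2 in ds2:
            distance = hypot(x2 - x1, y2 - y1)
            cloud.append((distance, (z1 - mean1) * (z2 - mean2)))
    return cloud

# Hypothetical datasets, for example water level and rainfall.
levels = [(0, 0, 1.0), (1, 0, 3.0)]
rainfall = [(0, 0, 10.0), (0, 1, 20.0)]
cloud = crosscovariance_cloud(levels, rainfall)
print(cloud)
```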
Interpolation techniques 1. Deterministic: used for creating surfaces from measured points, based either on the extent of similarity (Inverse Distance Weighted (IDW)) or on the degree of smoothing (radial basis functions and polynomials). 2. Geo-statistical: based on statistics and used for more advanced prediction surface modeling that also includes the errors or uncertainty of the predictions.
Deterministic Methods Four types: Inverse Distance Weighted (IDW), Global Polynomial, Local Polynomial, Radial Basis Functions. They can be classified into two groups: Global: uses the entire data set (global polynomial). Local: calculates predictions from the measured points within specified neighborhoods (IDW, local polynomials, radial basis functions).
Inverse Distance Weighted (IDW) A circular window of radius dmax is drawn around the point to be interpolated, sized so that it includes six to eight surrounding observed points. Observed points closer to the interpolated point receive larger weights.
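A minimal sketch of IDW with a circular search window follows. The weight 1 / d^power with power = 2 is a common choice but an assumption here, as are the function name and the sample values.

```python
from math import hypot

def idw(samples, x0, y0, dmax, power=2):
    """Inverse distance weighted estimate at (x0, y0).

    samples: list of (x, y, value); only points within dmax are used.
    """
    num = den = 0.0
    for x, y, z in samples:
        d = hypot(x - x0, y - y0)
        if d == 0:
            return z  # the estimate at a sample location is the sample value
        if d <= dmax:
            w = 1.0 / d ** power  # closer points get larger weights
            num += w * z
            den += w
    if den == 0:
        raise ValueError("no samples inside the search window")
    return num / den

# Hypothetical observations; the third point lies outside the window.
wells = [(0, 0, 10.0), (2, 0, 20.0), (100, 100, 999.0)]
estimate = idw(wells, 1, 0, dmax=5)
print(estimate)
```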
Global polynomial interpolation Global Polynomial interpolation fits a smooth surface that is defined by a mathematical function (a polynomial) to the input sample points. The Global Polynomial surface changes gradually and captures coarse-scale pattern in the data. Conceptually, Global Polynomial interpolation is like taking a piece of paper and fitting it between the raised points (raised to the height of value). This is demonstrated in the diagram below for a set of sample points of elevation taken on a gently sloping hill (the piece of paper is magenta).
Local Polynomial interpolation While Global Polynomial interpolation fits a polynomial to the entire surface, Local Polynomial interpolation fits many polynomials, each within specified overlapping neighborhoods. The search neighborhood can be defined using the search neighborhood dialog.
Radial Basis Functions (RBF) RBF methods are a series of exact interpolation techniques; that is, the surface must go through each measured sample value. There are five different basis functions: thin-plate spline, spline with tension, completely regularized spline, multi-quadric function, and inverse multi-quadric function. RBF methods are a form of artificial neural networks.
Geo-statistical Methods Kriging and Co-kriging. Algorithms: Ordinary: a variety of kriging which assumes that local means are not necessarily closely related to the population mean, and which therefore uses only the samples in the local neighborhood for the estimate. Ordinary kriging is the most commonly used method for environmental situations. Simple: a variety of kriging which assumes that local means are relatively constant and equal to the population mean, which is assumed to be known. The population mean is used as a factor in each local estimate, along with the samples in the local neighborhood. This is not usually the most appropriate method for environmental situations. Other algorithms: Universal, Indicator, Probability, Disjunctive. Output surfaces: prediction and prediction standard error; quantile; probability and standard errors of indicators.
Kriging Kriging is a geostatistical method for spatial interpolation. It can assess the quality of prediction with estimated prediction errors. It uses statistical models that allow a variety of map outputs including predictions, prediction standard errors, probability, etc.
Interpolation using Kriging Kriging weights
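How kriging weights arise can be sketched for ordinary kriging: the weights solve a linear system built from semivariogram values between the samples and between each sample and the prediction location, with the constraint that the weights sum to one. The sketch below is an illustration under assumptions, not the software's implementation: the exponential model, its sill and range, the helper names, and the four sample values are all hypothetical.

```python
from math import exp, hypot

def exponential(h, sill=1.0, rng=3.0):
    """Assumed semivariogram model with hypothetical sill and range."""
    return sill * (1.0 - exp(-3.0 * h / rng))

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def ordinary_kriging(samples, x0, y0, gamma):
    """Return (prediction, kriging variance, weights) at (x0, y0)."""
    n = len(samples)
    # Semivariogram between samples, plus a column of ones for the constraint.
    a = [[gamma(hypot(xi - xj, yi - yj)) for xj, yj, _ in samples] + [1.0]
         for xi, yi, _ in samples]
    a.append([1.0] * n + [0.0])  # weights must sum to one
    b = [gamma(hypot(xi - x0, yi - y0)) for xi, yi, _ in samples] + [1.0]
    sol = solve(a, b)
    weights, mu = sol[:n], sol[n]  # mu is the Lagrange multiplier
    prediction = sum(w * z for w, (_, _, z) in zip(weights, samples))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return prediction, variance, weights

# Hypothetical samples at the corners of a unit square.
corners = [(0, 0, 1.0), (1, 0, 2.0), (0, 1, 3.0), (1, 1, 4.0)]
pred, var, weights = ordinary_kriging(corners, 0.5, 0.5, exponential)
print(pred, var, weights)
```

The square root of the kriging variance is the prediction standard error. At a sample location the weight of that sample becomes one and the variance drops to zero, because the assumed model has no nugget.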
Semivariogram The semivariogram function quantifies the assumption that things nearby tend to be more similar than things that are farther apart. The semivariogram measures the strength of statistical correlation as a function of distance. Semivariance: $\gamma(h) = \tfrac{1}{2}\,[Z(x_i) - Z(x_j)]^2$, where $x_i$ and $x_j$ are two locations separated by distance $h$. Covariance: $C(h) = \text{Sill} - \gamma(h)$.
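An empirical semivariogram can be sketched by computing half the squared difference for every pair of samples and averaging the results within distance classes (lags). The lag width, the number of lags, the function name, and the sample values are assumptions for illustration.

```python
from math import hypot

def empirical_semivariogram(samples, lag, n_lags):
    """Average 0.5 * (zi - zj)**2 over sample pairs grouped by distance.

    Returns (lag centre, semivariance) for every lag that contains pairs.
    """
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            xi, yi, zi = samples[i]
            xj, yj, zj = samples[j]
            k = int(hypot(xi - xj, yi - yj) / lag)  # index of the lag class
            if k < n_lags:
                sums[k] += 0.5 * (zi - zj) ** 2
                counts[k] += 1
    return [((k + 0.5) * lag, sums[k] / counts[k])
            for k in range(n_lags) if counts[k]]

# Hypothetical samples along a line.
line = [(0, 0, 1.0), (1, 0, 2.0), (2, 0, 4.0)]
gamma_hat = empirical_semivariogram(line, lag=1.0, n_lags=3)
print(gamma_hat)
```

Semivariance increasing with distance, as in this small example, is the pattern expected when nearby values are more alike than distant ones.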
Types of semivariogram models Geostatistical Analyst provides the following functions to choose from to model the empirical semivariogram: Circular, Spherical, Tetraspherical, Pentaspherical, Exponential, Gaussian, Rational Quadratic, Hole Effect, K-Bessel, J-Bessel, Stable.
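Three of the listed models are sketched below in a commonly used parameterization with nugget, partial sill, and range. The factor 3 in the exponential and Gaussian models follows the practical-range convention, under which the model reaches about 95% of the sill at the range; treat this convention and the parameter values as assumptions rather than the exact forms used by the software.

```python
from math import exp

def spherical(h, nugget, psill, rng):
    """Spherical model: reaches the sill exactly at the range."""
    if h == 0:
        return 0.0
    if h >= rng:
        return nugget + psill
    r = h / rng
    return nugget + psill * (1.5 * r - 0.5 * r ** 3)

def exponential(h, nugget, psill, rng):
    """Exponential model: approaches the sill asymptotically."""
    if h == 0:
        return 0.0
    return nugget + psill * (1.0 - exp(-3.0 * h / rng))

def gaussian(h, nugget, psill, rng):
    """Gaussian model: flat near the origin, then a steeper rise."""
    if h == 0:
        return 0.0
    return nugget + psill * (1.0 - exp(-3.0 * (h / rng) ** 2))

# Hypothetical parameters: nugget 0.1, partial sill 0.9, range 10.
for h in (0, 2, 5, 10, 20):
    print(h, [round(f(h, 0.1, 0.9, 10.0), 3)
              for f in (spherical, exponential, gaussian)])
```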
Semi-variogram Models
Trend An example of a global trend can be seen in the effects of the prevailing winds on a smoke stack at a factory (below). In the image, the higher concentrations of pollution are depicted in the warm colors (reds and yellows) and the lower concentrations in the cool colors (greens and blues). Notice that the values of the pollutant change more slowly in the east–west direction than in the north–south direction. This is because east–west is aligned with the wind while north–south is perpendicular to the wind.
Detrending tool
Anisotropy Anisotropy is a characteristic of a random process that shows higher autocorrelation in one direction than another. The following image shows conceptually how the process might look. Once again, the higher concentrations of pollution are depicted in the warm colors (reds and yellows) and the lower concentrations in the cool colors (greens and blues). The random process shows undulations that are shorter in one direction than another.
Accounting for Anisotropy
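One common way to account for anisotropy, sketched here as an assumption rather than as the software's procedure, is to rotate the separation vector onto the axis of greatest continuity and stretch the perpendicular component, so that an isotropic semivariogram can be applied to the transformed distance. The angle convention (counter-clockwise from the x axis) and the function name are hypothetical.

```python
from math import cos, sin, radians, hypot

def anisotropic_distance(dx, dy, angle_deg, ratio):
    """Equivalent isotropic distance for a separation (dx, dy).

    angle_deg: direction of the major axis (longest range), measured
               counter-clockwise from the x axis (assumed convention).
    ratio:     minor range divided by major range, between 0 and 1.
    """
    t = radians(angle_deg)
    along = dx * cos(t) + dy * sin(t)     # component along the major axis
    across = -dx * sin(t) + dy * cos(t)   # component along the minor axis
    return hypot(along, across / ratio)   # stretch the minor direction

# Major axis east-west, minor range half of the major range.
print(anisotropic_distance(1, 0, 0, 0.5))  # along the major axis
print(anisotropic_distance(0, 1, 0, 0.5))  # across it: counts as twice as far
```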
Searching Neighbor The points highlighted in the data view give an indication of the weights (absolute value in percent) associated with each point located in the moving window. The weights are used to estimate the value at the unknown location, which is at the center of the cross hair.
Data transformation
Declustering method There are two ways to decluster your data: by the cell method and by Voronoi polygons. Samples should be taken so they are representative of the entire surface. However, many times the samples are taken where the concentration is most severe, thus skewing the view of the surface. Declustering accounts for skewed representation of the samples by weighting them appropriately so that a more accurate surface can be created.
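The cell method can be sketched as follows: a grid is laid over the samples, every occupied cell receives the same total weight, and that weight is shared equally among the samples in the cell. The cell size, the function name, and the sample values are assumptions for illustration.

```python
from collections import Counter
from math import floor

def cell_declustering_weights(points, cell_size):
    """Weights that sum to one; clustered samples share their cell's weight."""
    cells = [(floor(x / cell_size), floor(y / cell_size)) for x, y in points]
    per_cell = Counter(cells)      # number of samples in each occupied cell
    n_cells = len(per_cell)        # number of occupied cells
    return [1.0 / (n_cells * per_cell[c]) for c in cells]

# Hypothetical samples: three clustered where concentration is high, one apart.
locations = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.2), (5.5, 5.5)]
values = [10.0, 10.0, 10.0, 2.0]

weights = cell_declustering_weights(locations, cell_size=1.0)
naive_mean = sum(values) / len(values)
declustered_mean = sum(w * v for w, v in zip(weights, values))
print(weights, naive_mean, declustered_mean)
```

The declustered mean is lower than the naive mean because the three clustered high values no longer dominate the estimate.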
Bi-variate normal distribution
Output Surface
Cross Validation Cross-validation uses all of the data to estimate the trend and autocorrelation models. It removes each data location, one at a time, and predicts the associated data value.
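The remove-and-predict loop can be sketched as below. To keep the example short it uses a plain inverse distance weighted predictor in place of a fitted kriging model, so no trend or autocorrelation model is estimated here; that substitution, the function names, and the sample values are assumptions.

```python
from math import hypot, sqrt

def idw_predict(samples, x0, y0, power=2):
    """Stand-in predictor (inverse distance weighting over all samples)."""
    num = den = 0.0
    for x, y, z in samples:
        d = hypot(x - x0, y - y0)
        if d == 0:
            return z
        w = d ** -power
        num += w * z
        den += w
    return num / den

def leave_one_out(samples, predict):
    """Remove each sample in turn, predict it from the rest, collect errors."""
    errors = []
    for i, (x, y, z) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        errors.append(predict(rest, x, y) - z)
    mean_error = sum(errors) / len(errors)
    rmse = sqrt(sum(e * e for e in errors) / len(errors))
    return mean_error, rmse

# Hypothetical samples along a line.
obs = [(0, 0, 0.0), (1, 0, 1.0), (2, 0, 2.0)]
mean_error, rmse = leave_one_out(obs, idw_predict)
print(mean_error, rmse)
```

A mean error close to zero suggests unbiased predictions, and a smaller root-mean-square error indicates predictions closer to the measured values; running the same loop with two predictors gives a simple basis for comparing models.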
Various surfaces produced using ordinary kriging
Model comparison Comparison helps you determine how good the model that created a geostatistical layer is relative to another model.
Display Format Filled contour; Contours; Grids; Combination of contours, filled contour and hill shade; Hill shade
Exercise on Geo-statistical Analyst Data: 21 groundwater observation wells as shape file “gwowell_bwdb.shp”, with weekly data from December to May for 1994 to 2003; upazila shape file “upazila.shp”. Tasks: Represent data; Explore data; Fit model; Diagnostic output; Create output maps