EXPLORING SPATIAL CORRELATION IN RIVERS by Joshua French.

Introduction A city is required to extend its sewage pipelines farther into its bay to meet EPA requirements. How far should the pipelines be extended? The city doesn’t want to spend any more money than it needs to on the extension. It needs a way to predict the waste levels at different sites in the bay.

Ordinarily we might try to interpolate the data using a linear model, which assumes that observations are independent. For spatial data, however, we intuitively expect response values at points close together to be more similar than those at points separated by a great distance. We can use the correlation between sampling sites to make better predictions with our model.

The Road Ahead
- Methods
- Introduction to the Variogram
- Exploratory Analysis
- Sample Variogram
- Modeling the Variogram
- Analysis
- 3 types of results
- Conclusions
- Future Work

Introduction to the Variogram Spatial data is often viewed as a stochastic process. For each point x, a specific property Z(x) is viewed as a random variable with mean µ, variance σ², higher-order moments, and a cumulative distribution function.

Each individual Z(x_i) is assumed to have its own distribution, and the set {Z(x_1), Z(x_2), …} is a stochastic process. The data values in a given data set are simply a realization of the stochastic process. For a spatial process, second-order stationarity is often assumed.

Second-order stationarity implies that the mean is the same everywhere, i.e. E[Z(x_j)] = µ for all points x_j. It also implies that Cov(Z(x_j), Z(x_k)) depends only on the separation between x_j and x_k, not on their absolute positions.

Thus, Cov(Z(x_j), Z(x_k)) = Cov(Z(x), Z(x+h)) = Cov(h), where h measures the distance between two points.

Looking at the variance of differences: Var[Z(x) - Z(x+h)] = E[(Z(x) - Z(x+h))²] = 2γ(h). Assuming second-order stationarity, γ(h) = Cov(0) - Cov(h). γ(h) is known as the semi-variogram. The plot of γ(h) against h is known as the variogram.
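The identity γ(h) = Cov(0) - Cov(h) is stated without proof on the slide. A short derivation, using only the second-order stationarity assumptions given above (constant mean, covariance depending only on h), could run as follows:

```latex
\begin{align*}
E[Z(x) - Z(x+h)] &= \mu - \mu = 0
  \quad\Rightarrow\quad
  \operatorname{Var}[Z(x) - Z(x+h)] = E\big[(Z(x) - Z(x+h))^2\big] \\
\operatorname{Var}[Z(x) - Z(x+h)]
  &= \operatorname{Var}[Z(x)] + \operatorname{Var}[Z(x+h)]
     - 2\operatorname{Cov}\big(Z(x), Z(x+h)\big) \\
  &= 2\operatorname{Cov}(0) - 2\operatorname{Cov}(h) \\
\gamma(h) &= \tfrac{1}{2}\operatorname{Var}[Z(x) - Z(x+h)]
  = \operatorname{Cov}(0) - \operatorname{Cov}(h)
\end{align*}
```

Here Cov(0) is simply the variance σ² of the process, so γ(h) rises toward σ² as Cov(h) decays toward zero.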

Things to know about variograms:
1. γ(h) = γ(-h). Because it is an even function, usually only positive lag distances are shown.
2. Nugget effect: by definition, γ(0) = 0. In practice, however, sample variograms often have a positive value as the lag approaches 0. This is called the “nugget effect”.
3. Variograms tend to increase monotonically with lag distance.
4. Sill: the maximum value of the variogram, where it levels off.
5. Range: the lag distance at which the sill is reached. Observations are not correlated past this distance.

The following figure shows these features.

Variogram Example

Exploratory Analysis The data studied is the longitudinal profile of the Ohio River. Instead of worrying about the river network with streams, tributaries, and other factors, we simply look at the Ohio River as a one-dimensional object.

The Ohio River

Longitudinal Profile of the Ohio River Sampling Sites

Before we model variograms, we should explore the data. We need to make sure that the data analyzed satisfies second-order stationarity. If there is an obvious trend in the data, we should remove it and analyze the residuals. If the variance increases or decreases along the river, then we should transform the variable to correct this.
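As an illustration of the trend-removal step, here is a minimal sketch that fits a straight-line trend by ordinary least squares and returns the residuals for further analysis. The linear form of the trend, the function name, and the toy data are assumptions made for this example; the presentation does not say which trend model was used.

```python
def detrend_linear(positions, values):
    """Fit values = intercept + slope * position by least squares.

    Returns (intercept, slope, residuals).
    """
    n = len(positions)
    mean_x = sum(positions) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in positions)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(positions, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(positions, values)]
    return intercept, slope, residuals


# Hypothetical positions along the river and a response with an exact linear trend.
river_position = [0.0, 10.0, 20.0, 30.0, 40.0]
response = [2.0, 4.0, 6.0, 8.0, 10.0]
intercept, slope, residuals = detrend_linear(river_position, response)
```

The residuals, rather than the raw values, would then be passed to the variogram analysis.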

It is fairly easy to check for stationarity of this data set using a scatter plot.

If the data contains outliers, we should do the analysis both with and without the outliers present. If G_1 > 1, then we should transform the data to approximate normality if possible.
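The skewness check can be sketched in a few lines. This example assumes that G_1 denotes the sample skewness coefficient and uses the moment-based definition m3 / m2^1.5; the presentation does not state which definition was used, and the counts below are made up for illustration.

```python
import math


def skewness(values):
    """Moment-based skewness: m3 / m2**1.5, with m_k the k-th central moment."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed counts (not the ORSANCO data).
counts = [1, 2, 2, 3, 3, 4, 5, 8, 20, 150]
raw_skew = skewness(counts)
log_skew = skewness([math.log(c) for c in counts])
```

For data like these, the log transformation pulls in the long right tail, so `log_skew` comes out smaller than `raw_skew`.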

The Sample Variogram One of the previous definitions of semivariance is: γ(h) = (1/2) E[(Z(x) - Z(x+h))²]. The logical estimator is: γ̂(h) = (1 / (2N(h))) Σ_{i=1}^{N(h)} [z(x_i) - z(x_i + h)]², where N(h) is the number of pairs of observations associated with that lag.
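A minimal sketch of how a sample variogram could be computed for one-dimensional data such as a river profile. It assumes the classical (Matheron) estimator, that is, half the mean squared difference over the N(h) pairs falling in each lag bin. The function name, the binning scheme, and the toy data are illustrative and are not taken from the presentation.

```python
def sample_variogram(positions, values, lag_width, max_lag):
    """Return a list of (mean lag, semivariance, pair count), one per non-empty bin.

    Bin k collects pairs whose separation h satisfies
    k * lag_width <= h < (k + 1) * lag_width.
    """
    n_bins = int(max_lag / lag_width)
    sum_sq = [0.0] * n_bins
    sum_lag = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            h = abs(positions[i] - positions[j])
            k = int(h / lag_width)
            if k < n_bins:
                sum_sq[k] += (values[i] - values[j]) ** 2
                sum_lag[k] += h
                counts[k] += 1
    return [
        (sum_lag[k] / counts[k], sum_sq[k] / (2 * counts[k]), counts[k])
        for k in range(n_bins)
        if counts[k] > 0
    ]


# Hypothetical evenly spaced sites along a river.
sites = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
z = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
variogram = sample_variogram(sites, z, lag_width=1.0, max_lag=3.0)
```

Returning the pair count alongside each estimate makes it easy to see which lags are supported by few pairs and should be read with caution.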

Sample Variogram Example

Modeling the Variogram Our goal is to estimate the true variogram of the data. Four variogram models were used to model the sample variogram: the spherical, Gaussian, exponential, and Matérn models.
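For reference, three of the four models can be written as short functions of the lag h. This is a sketch under one common parameterization (nugget, partial sill, and a range-like parameter a); other texts scale the exponential and Gaussian models differently, and the presentation does not say which form was used. The Matérn model is omitted because it needs a modified Bessel function that is not in the Python standard library.

```python
import math


def spherical(h, nugget, partial_sill, a):
    """Spherical model: reaches the sill (nugget + partial_sill) exactly at h = a."""
    if h == 0:
        return 0.0
    if h >= a:
        return nugget + partial_sill
    r = h / a
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)


def exponential(h, nugget, partial_sill, a):
    """Exponential model: approaches the sill asymptotically."""
    if h == 0:
        return 0.0
    return nugget + partial_sill * (1.0 - math.exp(-h / a))


def gaussian(h, nugget, partial_sill, a):
    """Gaussian model: parabolic near the origin, approaches the sill asymptotically."""
    if h == 0:
        return 0.0
    return nugget + partial_sill * (1.0 - math.exp(-((h / a) ** 2)))


# Hypothetical parameter values, chosen only to exercise the functions.
gamma_spherical = spherical(5.0, nugget=0.5, partial_sill=2.0, a=10.0)
gamma_exponential = exponential(10.0, nugget=0.5, partial_sill=2.0, a=10.0)
gamma_gaussian = gaussian(10.0, nugget=0.5, partial_sill=2.0, a=10.0)
```

All three return 0 at h = 0 and jump to the nugget for any positive lag, which is how the nugget effect appears in a fitted model.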

Variogram Models

Analysis The data analyzed is a set of particle size and biological variables for the Ohio River. The data was collected by the Ohio River Valley Sanitation Commission, better known as ORSANCO. There were between 190 and 235 unique sampling sites, depending on the variable.

ORSANCO data collection

The results of the analysis fell into three main groups:
- Able to fit the sample variogram well
- Not able to fit the sample variogram well
- Analysis not reasonable

Good Results: Number of Individuals at a site After correcting for skewness by doing a log transformation, there are a number of outliers. We analyze the data both with and without the outliers.

log(Num Individuals) Sample Variogram with outliers

log(Num Individuals) Sample Variogram without outliers

We were not able to model the sample variogram perfectly, but we were able to detect some amount of spatial correlation in the data, especially when the outliers were removed. We were able to obtain reasonable estimates of the nugget, sill, and range.

Poor Results: Percent Sand After doing exploratory spatial analysis and removing a trend, we fit the sample variogram of the percent sand residuals.

Sample Variogram of percent sand residuals

The sample variogram does not really increase monotonically with distance. Our variogram models cannot fit this very well. Though we can obtain estimates of the nugget, sill, and range, the estimates cannot be trusted.

No results: Percent Hardpan This variable was so badly skewed that analysis was not reasonable. The skewness coefficient is extremely high.

QQplot of Percent Hardpan

Scatter plot of Percent Hardpan

The data is nearly all zeros! There is also an erroneous data value. A percentage cannot be greater than 100%. Data analysis does not seem reasonable. Our data does not meet the conditions necessary to use the spatial methods discussed.
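Both problems noted here, a variable that is almost entirely zeros and a percentage outside the valid range, can be screened for before any spatial analysis. The following is a minimal sketch; the function name and the sample values are hypothetical and do not reproduce the ORSANCO data.

```python
def check_percent(values):
    """Return (fraction of zeros, list of (index, value) outside 0..100)."""
    n = len(values)
    zero_fraction = sum(1 for v in values if v == 0) / n
    invalid = [(i, v) for i, v in enumerate(values) if v < 0 or v > 100]
    return zero_fraction, invalid


# Hypothetical percent-hardpan readings: mostly zeros, one impossible value.
hardpan = [0, 0, 0, 0, 5, 0, 0, 120, 0, 0]
zero_fraction, invalid = check_percent(hardpan)
```

A high `zero_fraction` or a non-empty `invalid` list would flag the variable for review before a variogram is attempted.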

Conclusions
- Able to fit sample variogram reasonably well: percent gravel, number of individuals, number of species
- Not able to fit sample variogram well: percent sand, percent detritivore, percent simple lithophilic individuals, percent invertivore
- No results: remaining variables

Summary of Results

Future Work Things to consider in future analysis:
- The water flows in only one direction. A point downstream cannot affect a point upstream
- Natural features such as tributaries may impact spatial correlation
- Manmade features such as dams may impact spatial correlation

Concluding Thought Before you criticize someone, you should walk a mile in their shoes. That way, when you criticize them, you’re a mile away and you have their shoes. - Jack Handey