Rodolphe Devillers (Almost) everything you always wanted to know (or maybe not…) about Geographically Weighted Regressions JCU Stats Group, March 2012.


Outline Background Spatial autocorrelation Spatial non-stationarity Geographically Weighted Regressions (GWR)

Background

Decrease in cod populations, 1984 to 1994 (one slide per year)

GeoCod Project (2006-…) Goal: Get a better understanding of the spatial and temporal dynamics of some fish/shellfish species in the NW Atlantic region, and their relationship with the physical environment. Biological data: scientific surveys, fisheries observers, 4 species, > records. Environmental data: temperature, salinity, remote sensing, > 300 GB.

GeoCod project (1) Collection: fisheries data, environmental data, other data (bathymetry, etc.); (2) Integration: normalized database; (3) Analysis; (4) Visualization

Context A number of statistical methods can be used. Testing spatial statistics. Species / Environment?

Outline Background Spatial autocorrelation Spatial non-stationarity Geographically Weighted Regressions (GWR)

Spatial autocorrelation “ …the property of random variables taking values, at pairs of locations a certain distance apart, that are more similar (positive autocorrelation) or less similar (negative autocorrelation) than expected for randomly associated pairs of observations. ” (Legendre, 1993)

Spatial autocorrelation - Basics Positive (Neighbours more similar) Neutral (Random) Negative (Neighbours less similar)

Spatial autocorrelation – is it common? Elevation Air/water temperature Air humidity Disease distribution Species abundance Housing value Etc.

Spatial autocorrelation – why bother? Spatial autocorrelation in the data leads to spatial autocorrelation in the residuals

Spatial autocorrelation – why bother? Most statistics assume that the values of observations in each sample are independent of one another. Consequence: spatial autocorrelation violates the assumption of independent residuals and calls into question the validity of hypothesis testing. Main effect: standard errors are underestimated and t-scores are overestimated, which increases the chance of a Type I error (incorrect rejection of a null hypothesis). It sometimes inverts the slope of relationships.

Spatial autocorrelation – how to measure it? Measures of spatial autocorrelation: Moran’s I Geary’s C Others (e.g. Getis’ G)
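
As an illustration of the first measure listed above, here is a minimal sketch of global Moran's I in plain Python. The site layout (six sites on a line with binary neighbour weights) and both data vectors are hypothetical, chosen only to show the sign of the statistic; real analyses would normally rely on a dedicated package and a significance test.

```python
def morans_i(values, weights):
    """Global Moran's I for observations x_i and spatial weights w_ij."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]                      # deviations from the mean
    s0 = sum(sum(row) for row in weights)               # sum of all weights
    num = sum(weights[i][j] * z[i] * z[j]
              for i in range(n) for j in range(n))      # weighted cross-products
    den = sum(d * d for d in z)                         # total sum of squares
    return (n / s0) * num / den


def chain_weights(n):
    """Binary weights for n sites on a line: 1 for adjacent sites, else 0 (hypothetical layout)."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]


w = chain_weights(6)
smooth = [1, 2, 3, 4, 5, 6]        # neighbours are similar
alternating = [1, 6, 1, 6, 1, 6]   # neighbours are dissimilar

i_smooth = morans_i(smooth, w)            # positive autocorrelation
i_alternating = morans_i(alternating, w)  # negative autocorrelation
print(i_smooth, i_alternating)
```

A positive value indicates that neighbouring sites tend to be similar, a negative value that they tend to differ.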

Spatial autocorrelation – How can I deal with it? Many ways to handle this: subsampling, adjusting the type I error, adjusting the effective sample size, etc. (Dale and Fortin (2002) Ecoscience 9(2)); autocovariate regressions, spatial eigenvector mapping (SEVM), generalised least squares (GLS), conditional autoregressive models (CAR), simultaneous autoregressive models (SAR), generalised linear mixed models (GLMM), generalised estimating equations (GEE), etc. (more details: Dormann et al. (2007) Ecography 30). If spatial autocorrelation is not stationary: GWR.

Outline Background Spatial autocorrelation Spatial non-stationarity Geographically Weighted Regressions (GWR)

Stationarity Classical regression models are valid under the assumption that phenomena are stationary temporally and spatially (= statistical parameters such as the mean, the variance or the spatial autocorrelation do not vary with geographic position). E.g. Coral bleaching = 0.55 Temperature Nutrients + … - … Studies (in various fields, including terrestrial ecology) have shown that such phenomena are rarely stationary.

Global vs Local Statistics Simpson's paradox
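
To make the contrast between global and local statistics concrete, the sketch below fits an ordinary least-squares slope to two hypothetical regions separately and then to the pooled data. The numbers are invented for illustration; they are not from the talk.

```python
def ols_slope(xs, ys):
    """Slope of the least-squares line of y on x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Two hypothetical regions: within each, y decreases as x increases.
region_a_x, region_a_y = [1, 2, 3, 4], [8, 7, 6, 5]
region_b_x, region_b_y = [6, 7, 8, 9], [13, 12, 11, 10]

slope_a = ols_slope(region_a_x, region_a_y)   # local relationship, region A
slope_b = ols_slope(region_b_x, region_b_y)   # local relationship, region B
slope_global = ols_slope(region_a_x + region_b_x,
                         region_a_y + region_b_y)  # pooled (global) relationship

print(slope_a, slope_b, slope_global)
```

Both local slopes are negative while the pooled slope is positive, which is the kind of reversal a single global model can hide.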

Local spatial statistics Local Indicators of Spatial Association (LISA): Local Moran’s I (used to detect clustering), Getis-Ord Gi* (hotspot analysis). Look at GeoDa (free software from Luc Anselin). Local regressions: GWR
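
A minimal sketch of local Moran's I follows, assuming binary neighbour weights on a line of eight hypothetical sites (row-standardised weights are also common in practice). It only computes the statistic; tools such as GeoDa add permutation-based significance testing on top.

```python
def local_morans_i(values, weights):
    """Local Moran's I_i = (z_i / m2) * sum_j w_ij * z_j, with m2 = sum(z^2) / n."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n
    return [z[i] / m2 * sum(weights[i][j] * z[j] for j in range(n))
            for i in range(n)]


# Hypothetical transect: five low values followed by a cluster of three high values.
values = [1, 1, 1, 1, 1, 9, 9, 9]
n_sites = len(values)
weights = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n_sites)]
           for i in range(n_sites)]

local_i = local_morans_i(values, weights)
for site, stat in enumerate(local_i):
    print(site, round(stat, 3))
```

Sites inside a cluster of similar values get positive local statistics, while the site at the boundary between low and high values gets a negative one.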

Outline Background Spatial autocorrelation Spatial non-stationarity Geographically Weighted Regressions (GWR)

GWR Brunsdon, Fotheringham and Charlton

Increasingly used in various fields (mostly since 2006, and even more since integrated into ArcGIS) Sally: yes, it is also available in R… (spgwr)

GWR Criticized by some authors (e.g. Wheeler 2005, Cho et al. 2009) when using collinear data, potentially leading to: occasional inflation of the variance; rare inversion of the sign of the regression.

Windle, M., Rose, G., Devillers, R. and Fortin, M.-J. Exploring spatial non-stationarity of fisheries survey data using geographically weighted regression (GWR): an example from the Northwest Atlantic. ICES Journal of Marine Science, 67:

GWR Multiple regression model (global): y: dependent variable, x_1 to x_p: independent variables, β_0: intercept, β_1 to β_p: coefficients, ε: error. Geographically Weighted Regression (GWR): (μ,ν): geographic coordinates of the samples.
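
The equations on this slide did not survive transcription. A reconstruction from the symbol definitions on the slide, following the standard GWR formulation (an assumption about what the slide showed), would be as follows. The estimator and the Gaussian kernel with bandwidth h are the usual textbook choices and are added here for context; d_ij denotes the distance between samples i and j.

```latex
% Global multiple regression
y = \beta_0 + \sum_{k=1}^{p} \beta_k x_k + \varepsilon

% Geographically weighted regression: coefficients depend on location (\mu_i, \nu_i)
y_i = \beta_0(\mu_i,\nu_i) + \sum_{k=1}^{p} \beta_k(\mu_i,\nu_i)\, x_{ik} + \varepsilon_i

% Local weighted least-squares estimate at sample i
\hat{\boldsymbol\beta}(\mu_i,\nu_i) =
  \left( X^{\top} W(\mu_i,\nu_i)\, X \right)^{-1} X^{\top} W(\mu_i,\nu_i)\, \mathbf{y}

% One common (Gaussian) kernel for the diagonal weights of W(\mu_i,\nu_i)
w_{ij} = \exp\!\left( -\tfrac{1}{2} \left( d_{ij} / h \right)^{2} \right)
```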

Method Government fisheries scientific survey data (Fisheries and Oceans Canada). Cod presence/absence (threshold at 5 kg) for Fall 2001.

Method – Data interpolation

Method

Method Combining data (temperature, cod, crab, shrimp; year 2001) in a single point data file. Exporting data points to a file (.dbf).

GWR software (version 3.0). 200 km used for tests. About 25 minutes per file of 5500 points.
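
What a GWR run computes at each sample location can be sketched in a few lines of plain Python. This is not the GWR 3.0 software or its algorithm as such: it is a toy version with one predictor, a fixed Gaussian kernel, and a synthetic west-to-east transect in which the true slope flips from -1 to +1. The function names, the data and the bandwidth value are all assumptions made for illustration.

```python
import math


def weighted_line(w, x, y):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return my - b * mx, b


def gwr(coords, x, y, bandwidth):
    """Fit one local regression per sample, weighting the other samples by distance."""
    fits = []
    for (u0, v0) in coords:
        # Gaussian kernel: nearby samples get weights near 1, distant ones near 0.
        w = [math.exp(-0.5 * (math.hypot(u - u0, v - v0) / bandwidth) ** 2)
             for (u, v) in coords]
        fits.append(weighted_line(w, x, y))
    return fits


# Synthetic transect: 40 sites, 50 km apart; slope is -1 in the west, +1 in the east.
coords = [(50.0 * i, 0.0) for i in range(40)]
x = [float((7 * i) % 5) for i in range(40)]
y = [(-1.0 if i < 20 else 1.0) * x[i] for i in range(40)]

global_fit = weighted_line([1.0] * 40, x, y)          # ordinary (global) regression
local_fits = gwr(coords, x, y, bandwidth=200.0)       # illustrative bandwidth in km

print("global slope:", round(global_fit[1], 3))
print("local slope, west end:", round(local_fits[0][1], 3))
print("local slope, east end:", round(local_fits[-1][1], 3))
```

In this toy case the global slope is essentially zero because the two halves cancel out, while the local fits recover the opposite relationships at the two ends of the transect.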

Fixed Variable

Results Test of spatial stationarity of independent variables used in the regression: spatial stationarity vs spatial non-stationarity

Results – spatial stationarity Stationarity of bottom temperature used to model shrimp biomass (Windle et al. (accepted), MEPS)

Results Comparison of regression models

Results Test of the spatial autocorrelation of the residuals

Results

K-means clustering of the t values of the GWR coefficients. Clusters: positive relationship between crab and shrimp, weak relationship with the coast; negative relationship with crab and distance, positive with shrimp; stronger negative relationship with crab.

Results GAM systematically has lower AIC values (AIC: Akaike Information Criterion), suggesting a non-linear relationship between cod and the variables used in the analysis. Legend: Strong / Weak
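
For readers unfamiliar with how AIC ranks models, here is a minimal sketch using the general definition AIC = 2k - 2 ln(L) with a Gaussian least-squares likelihood. The residual sums of squares, sample size and parameter counts are hypothetical, not values from the study, and GWR software typically reports a corrected variant (AICc) based on an effective number of parameters.

```python
import math


def gaussian_log_likelihood(rss, n):
    """Maximised log-likelihood of a least-squares fit with Gaussian errors."""
    return -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)


def aic(log_likelihood, k):
    """Akaike Information Criterion: lower values indicate a better trade-off."""
    return 2 * k - 2 * log_likelihood


n = 100  # hypothetical number of observations

# Hypothetical fits: a linear model and a more flexible smoother on the same data.
aic_linear = aic(gaussian_log_likelihood(rss=120.0, n=n), k=3)
aic_smooth = aic(gaussian_log_likelihood(rss=80.0, n=n), k=6)

print(round(aic_linear, 2), round(aic_smooth, 2))
```

Here the more flexible model has the lower AIC despite its extra parameters, because the drop in residual error outweighs the penalty.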

Results Min and max GWR coefficients (R²). Model explanatory power decreases over the years.

GWR coefficients – Capelin

GWR coefficients – Catch per Unit Effort

Conclusions The spatial structure of data matters. Ecology (and mostly marine ecology) is still in the process of adopting such methods. GWR is an interesting method but can be hard to interpret and should be used together with other methods.

Questions? Technical questions beyond my knowledge: Matt Windle. Technical questions beyond Matt’s knowledge: (allow for several months for an answer)