Stata in Space: An example for the econometric analysis of spatially explicit raster data --- Daniel Müller

Presentation transcript:

Stata in Space: An example for the econometric analysis of spatially explicit raster data
Daniel Müller
Institute of Agricultural Economics and Social Sciences, Humboldt University Berlin
Berlin, August 12th, 2003

Outline
- Introduction
- Spatial data analysis
- Data preparation
- The empirical example
- Econometric estimation
- Export of results and geovisualization

Introduction
- Socioeconomic data usually exist for (discrete) social entities and are rarely linked explicitly to location (georeferenced).
- 'Natural' data are often continuous (rainfall, slope, elevation) and georeferenced.
- Integrating both data sources can provide additional insights.
- It allows us to understand spatial patterns and processes.
- Knowing the where can help us infer the why.

Spatial data analysis
- Spatial analysis is the analysis of data linked to location (spatial data).
- Why analyze spatial data? Variables of interest vary in space: location matters!
- Spatial analysis can provide important insights into:
  - geographical targeting of investments
  - diffusion of technologies
  - causes and consequences of land-use change

Spatial data analysis: What's special about spatial data?
- Location matters!
- Tobler's first law of geography (1970): "Everything is related to everything else, but near things are more related than distant things."
- Spatial effects:
  - spatial autocorrelation
  - spatial heterogeneity
Notes:
1. Location matters, both absolute and relative. Physical measurements are often an explicit spatial sample; economic data seldom are.
2. Dependence = spatial structure.

Spatial data analysis: Peculiarities in space: spatial effects
1. Spatial autocorrelation
   - Coincidence of value similarity with locational similarity
   - The second dimension adds mathematical complexity (multiple directions)
2. Spatial heterogeneity
   - Each location is unique
   - Units of observation are not homogeneous across space
   - Structural instability over space, e.g. heteroskedasticity: non-constant error variances due to, e.g., unequal population densities or varying technological development
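
To make "value similarity coinciding with locational similarity" concrete, the sketch below computes Moran's I, a standard measure of spatial autocorrelation that the slides themselves do not mention. It assumes a complete rectangular grid, binary weights for cells that share an edge (rook contiguity), and values that are not all identical; the function name and the two example grids are hypothetical.

```python
def morans_i(grid):
    """Moran's I for a rectangular grid with rook (edge-sharing) neighbours."""
    nrows, ncols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    num = 0.0
    weight_sum = 0  # sum of the binary neighbour weights w_ij
    for r in range(nrows):
        for c in range(ncols):
            # Visit the four edge-sharing neighbours of cell (r, c).
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < nrows and 0 <= cc < ncols:
                    num += (grid[r][c] - mean) * (grid[rr][cc] - mean)
                    weight_sum += 1
    return (n / weight_sum) * num / denom


# Similar values sit next to each other: positive autocorrelation.
clustered = [[1, 1, 0, 0] for _ in range(4)]
# Every neighbour has the opposite value: negative autocorrelation.
checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]

print(morans_i(clustered), morans_i(checkerboard))
```

A clearly positive value indicates that neighbouring cells resemble each other, which is the situation that motivates the corrections discussed later in the talk.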

Spatial data analysis: Peculiarities in space: spatial effects [2]
Spatial effects are due to:
- interactions among neighboring agents
- data from different sources
- different sample designs
- varying aggregation rules
"Spatial relationships among observations can result in unreliable estimates and misguided statistical inference of the parameters." (Anselin 1988)
=> Corrections are necessary.

Spatial data analysis: Geographic Information Systems (GIS)
- Compile, store, manipulate, analyse and visualize spatial data
- Consist of hardware, software, data and procedures
- Data models: vector & raster

Spatial data analysis: Raster data model
- Arrangement of regularly shaped, contiguous cells
- Continuous data layers that fit together edge-to-edge
- Typically consists of square cells
- Each cell represents a location in a raster GIS
- Cells are arranged in layers
- The values of a cell indicate the characteristics of that location
- Data are composed of many layers covering the same geographical area

Spatial data analysis: Raster data model --- file structure
Header: contains the spatial information!
[Figure: example raster file, labelled 1 to 6]
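
As an illustration of a header that carries the spatial information, the sketch below parses a small text grid. It assumes the Esri ASCII grid layout (six header lines: ncols, nrows, xllcorner, yllcorner, cellsize, NODATA_value); the slides do not name the file format, although the options of -dta2ras- (cellsize, xllcorner, yllcorner, nodata) are consistent with it. The example file content and the function name are hypothetical.

```python
# Hypothetical example grid in the Esri ASCII layout (an assumption; the
# slides do not name the file format).
EXAMPLE_GRID = """\
ncols 3
nrows 2
xllcorner 500000
yllcorner 1400000
cellsize 50
NODATA_value -9999
1 2 -9999
3 4 5
"""


def parse_ascii_grid(text, header_lines=6):
    """Split a grid file into a header dict and a list of cell rows."""
    lines = text.strip().splitlines()
    header = {}
    for line in lines[:header_lines]:
        key, value = line.split()
        header[key.lower()] = float(value)
    # Everything after the header is the cell matrix, one text line per grid row.
    rows = [[float(v) for v in line.split()] for line in lines[header_lines:]]
    return header, rows


header, rows = parse_ascii_grid(EXAMPLE_GRID)
print(header["cellsize"], rows)
```

The header alone is enough to place every cell on the map: the lower-left corner and the cell size fix the location of each row and column.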

Spatial data analysis: Raster data model --- land use map
[Figure: land use map]

Spatial data analysis: From data layers to resulting map
data layers => overlays => analyses => output

Data preparation: Importing grids into Stata
ras2dta , files(filelist) [ idcell(varname) nodata(#) dropmiss xcoord(#) ycoord(#) genxcoor(varname) genycoor(varname) header(filename) saving(filelist) replace clear ]
-ras2dta- -infile-s the grids (filelist) into Stata and:
- -generate-s an ID code for each cell (= observation)
- reads the information from the header (if present)
- sets missing values to a specified number
- -drop-s unnecessary empty cells
- -generate-s X and Y coordinates
- -save-s the header information in a file
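
The reshaping that this import step implies (one observation per cell, with an ID and coordinates) can be sketched in Python. This is not the code of -ras2dta-; the field names only mirror its option names, and the coordinate convention (X counted from the west edge, Y counted from the south edge, both starting at 1) is an assumption.

```python
def grid_to_records(rows, nodata=-9999, dropmiss=True):
    """Turn a cell matrix into one record per cell (long format)."""
    records = []
    nrows = len(rows)
    cell_id = 0
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            cell_id += 1  # IDs follow file order, so they match across layers
            missing = value == nodata
            if missing and dropmiss:
                continue  # skip empty cells, like the dropmiss option
            records.append({
                "idcell": cell_id,
                "x": c + 1,        # column index, west to east
                "y": nrows - r,    # row index, south to north
                "value": None if missing else value,
            })
    return records


landuse = [[1, 2, -9999],
           [3, 4, 5]]
records = grid_to_records(landuse)
for rec in records:
    print(rec)
```

Because the ID is assigned before empty cells are dropped, the same cell keeps the same ID in every layer that covers the same area.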

Data preparation: Integration of data layers
- Import of raster grids (-ras2dta-)
- Combination of raster layers in Stata (-joinby-, -merge-) based on a spatial identifier (ID code of cells)
- Socioeconomic (survey, census) data can be joined to the grids based on, e.g., administrative boundaries
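
The logic of combining layers on a cell ID, and then attaching village-level data, can be sketched as follows. This is a plain Python illustration of an inner join, not the behaviour of -merge- or -joinby- in every detail; all layer names, IDs and values are hypothetical.

```python
def join_layers(layers):
    """Inner-join {layer name: {idcell: value}} into one record per cell."""
    common_ids = set.intersection(*(set(cells) for cells in layers.values()))
    return [
        {"idcell": i, **{name: cells[i] for name, cells in layers.items()}}
        for i in sorted(common_ids)
    ]


# Two raster layers keyed by cell ID; cell 3 is missing from the land-use layer.
layers = {
    "landuse": {1: 2, 2: 3, 4: 1},
    "slope": {1: 12.5, 2: 3.0, 3: 7.1, 4: 20.0},
}
table = join_layers(layers)

# Socioeconomic data enter through a cell-to-village lookup, e.g. derived
# from administrative boundaries.
cell_village = {1: 101, 2: 101, 4: 102}
village_population = {101: 850, 102: 430}
for row in table:
    row["population"] = village_population[cell_village[row["idcell"]]]

print(table)
```

Every cell in a village receives the same village-level values, which is one reason why observations within villages are unlikely to be independent.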

Data preparation: Corrections of spatial effects
1. Spatial lag variables with index values for latitude (Y) and longitude (X)
2. Spatially lagged variables
3. Regular sampling from a grid
=> 1. can be done with -ras2dta-
=> 2. is ignored here
=> 3. is easy in Stata, e.g. with -spatsam-

Data preparation: -spatsam-
spatsam , gap(#) xcoord(varname) ycoord(varname) [ saving(filename) norestore noseed replace ]
Basically, that is:
keep if (xcoord / gap) == int(xcoord / gap) & (ycoord / gap) == int(ycoord / gap)
Therefore, only every #-th observation in the X and Y directions is kept in the sample.
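
The same selection rule can be sketched in Python. The sketch assumes that the coordinates are integer cell indices, so that "divisible by the gap" is well defined; the function name and the 10 x 10 example grid are hypothetical.

```python
def spatial_sample(records, gap):
    """Keep only cells whose X and Y indices are both multiples of gap."""
    return [
        rec for rec in records
        if rec["x"] % gap == 0 and rec["y"] % gap == 0
    ]


# A 10 x 10 grid of cells with integer indices starting at 1.
cells = [{"x": x, "y": y} for y in range(1, 11) for x in range(1, 11)]
sample = spatial_sample(cells, gap=5)
print(len(cells), len(sample))
print(sample)
```

With a gap of 5, roughly one cell in 25 survives, and the retained cells are never direct neighbours, which weakens the dependence between sampled observations.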

The empirical example: Land use change in Vietnam
- Land use as an inherently spatial process
- Returns to land use are (spatially) affected by:
  - market accessibility (von Thünen)
  - land rent (Ricardo)
- Possible factors to consider: soil quality, topography, climate, market locations, population density, technology
- Limited dependent variable problem (-mlogit-)

The empirical example: Data
- Satellite image interpretation: land cover => land use (change)
- GIS, maps, point measurements: geophysical indicators => topography, soil, climate
- Socioeconomic & policy variables: village survey, secondary statistics => technology, population, education, market access
- Data integration based on the spatial identifier and (approximated) village areas

Econometric estimation
- Observations: 964,000 pixels (50 x 50 m)
- Spatial sample: every 5th cell in the X and Y directions
- Estimation: 35,000 observations
=> Dependent: five land cover classes (1, 2, ..., 5)
=> Independent: a) geophysical, b) socioeconomic, c) policy, d) spatial effects

Econometric estimation
1. Estimation of the influence of hypothesized determinants on land use.
2. What is the probability that a certain pixel falls into one of the five land-use categories?
=> -mlogit- (reduced form, clustered by village)
=> -mlogtest, iia-, -fitstat- (Long & Freese)
The category with the highest predicted probability is then taken as the predicted land use.
Note: observations within villages are likely not independent, so standard errors are underestimated; hence robust standard errors clustered by village.
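
The step from estimated coefficients to a predicted land use can be sketched as below. It assumes the usual multinomial logit form, in which the base category has a linear predictor of zero and each probability is the exponentiated predictor divided by the sum over all categories. The predictor values for the example pixel are made up for illustration and are not results from the talk.

```python
import math


def mlogit_probabilities(linear_predictors):
    """Multinomial logit probabilities; the base category has predictor 0."""
    shift = max(linear_predictors)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in linear_predictors]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(probabilities):
    """Return (category, probability) for the most likely category (1-based)."""
    best = max(range(len(probabilities)), key=lambda k: probabilities[k])
    return best + 1, probabilities[best]


# Hypothetical linear predictors for one pixel and five land-use categories;
# category 1 is treated as the base category.
xb = [0.0, 1.2, -0.5, 0.3, -1.0]
probs = mlogit_probabilities(xb)
category, max_prob = predict_class(probs)
print(probs)
print(category, max_prob)
```

The category feeds the prediction map, while the maximum probability itself can be mapped to show where the prediction is more or less certain.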

Export of results: Outputting results from Stata
dta2ras [varlist], xcoord(#) ycoord(#) cellsize(#) [ header(filename) idcell(varname) nodata(#) xllcorner(#) yllcorner(#) saving(varlist) replace ]
-dta2ras- writes a header at the top of each file, using the information from xcoord(#), ycoord(#) and cellsize(#) or from header(), and optionally from nodata(#), xllcorner(#) and yllcorner(#). The results can then be mapped in the GIS.
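
The reverse step, from one record per cell back to a grid file with a header, can be sketched in Python. As before, this is not the code of -dta2ras-: the Esri ASCII header layout is an assumption, the coordinates are taken to be 1-based cell indices with Y counted from the south edge, and all values are hypothetical.

```python
def records_to_grid(records, ncols, nrows, xllcorner, yllcorner, cellsize,
                    nodata=-9999):
    """Write per-cell records into a text grid with a six-line header."""
    # Start from an all-missing grid so that dropped cells become NODATA.
    grid = [[nodata] * ncols for _ in range(nrows)]
    for rec in records:
        grid[nrows - rec["y"]][rec["x"] - 1] = rec["value"]
    header = [
        f"ncols {ncols}",
        f"nrows {nrows}",
        f"xllcorner {xllcorner}",
        f"yllcorner {yllcorner}",
        f"cellsize {cellsize}",
        f"NODATA_value {nodata}",
    ]
    body = [" ".join(str(v) for v in row) for row in grid]
    return "\n".join(header + body) + "\n"


# Hypothetical predicted land-use categories; the cell at x=3, y=2 is missing.
predicted = [
    {"x": 1, "y": 2, "value": 1},
    {"x": 2, "y": 2, "value": 2},
    {"x": 1, "y": 1, "value": 3},
    {"x": 2, "y": 1, "value": 4},
    {"x": 3, "y": 1, "value": 5},
]
text = records_to_grid(predicted, ncols=3, nrows=2, xllcorner=500000,
                       yllcorner=1400000, cellsize=50)
print(text)
```

Cells that were dropped before estimation reappear as NODATA, so the exported grid keeps the same extent as the input layers.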

Geovisualization of results: Prediction map
[Figure: prediction map]

Geovisualization of results: Maximum predicted probabilities
[Figure: map of maximum predicted probabilities]

Thank you! Questions, comments and critique welcome!
© Daniel Müller (danielix@gmx.net)
Institute of Agricultural Economics and Social Sciences, Humboldt University Berlin
Stata ados available for download at: http://amor.cms.hu-berlin.de/~muelleda