Presentation is loading. Please wait.

Presentation is loading. Please wait.

Interfacing R with GIS Xiaogang (Marshall) Ma School of Science Rensselaer Polytechnic Institute Tuesday, Apr 02, 2013 GIS in the Sciences ERTH 4750 (38031)

Similar presentations


Presentation on theme: "Interfacing R with GIS Xiaogang (Marshall) Ma School of Science Rensselaer Polytechnic Institute Tuesday, Apr 02, 2013 GIS in the Sciences ERTH 4750 (38031)"— Presentation transcript:

1 Interfacing R with GIS Xiaogang (Marshall) Ma School of Science Rensselaer Polytechnic Institute Tuesday, Apr 02, 2013 GIS in the Sciences ERTH 4750 (38031)

2 MapInfo R Mashup Week 1 (Jan. 22/25): Introduction to GIS Week 2 (Jan. 29/Feb. 1): Geographic information and spatial data types Week 3 (Feb. 5/8): Spatial referencing Week 4 (Feb 12/15): Geostatistical computing Week 5 (Feb. 19/22): Exploring and visualizing spatial data Week 6 (Feb. 26/Mar. 1): Modeling spatial structure from point samples Week 7 (Mar. 5/8): Spatial prediction from point samples (Part 1) Week 8 (Mar. 12/15: no classes - spring break) Week 9 (Mar. 19/22): Spatial prediction from point samples (Part 2) Week 10 (Mar. 26/29): Assessing the quality of spatial predictions Week 11 (Apr. 2/5): Interfacing R spatial statistics with GIS Week 12 (Apr. 9/12: no class - Grand Marshal week) Week 13 (Apr. 16/19): Efficient and effective result presentation with GIS Week 14 (Apr. 23/26): Tuesday: Guest Lecture Week 15 (Apr. 30): Tuesday: Short final project presentations A review of our course outline 2

3 Acknowledgements This lecture is partly based on: –Bivand, R.S., Pebesma, E.J., Gómez-Rubio, V., 2008. Applied Spatial Data Analysis with R. Springer, New York, NY, 374pp. –Rossiter, D.G., 2012. R and GIS. Exercise in distance course Applied Geostatistics. ITC, University of Twente 3

4 Contents 1.GIS and R: Overview 2.Classes for spatial data in R 3.Handling coordinate reference systems in R 4.Spatial data import and export in R 5.Examples 4

5 1 GIS and R: Overview GIS: a computerized system that facilitates the phases of data entry, data analysis and data presentation especially in cases when we are dealing with georeferenced data. –Geographic phenomena: continuous field, discrete field, collection of objects –Computer representations: raster and vector –Spatial autocorrelation and topology –Spatial reference: definitions, physical/geometric constructs and tools required to describe the geometry and motion of objects near and on the Earth’s surface –Datum (vertical and horizontal): a surface that represents the shape of the earth –Map projection: map the curved surface of the Earth (as modeled by an ellipsoid) onto a flat plane 5

6 So, data analysis is an essential part of a GIS Statistics is a primary way of data analysis –Descriptive and Inferential statistics –Geostatistics: statistics on a population with known location R: an open-source environment for statistical computing and visualization –Exploring and visualizing spatial data: spatial structure, regional trends, local spatial dependence and anisotropy –Modeling spatial structure from point samples –Spatial prediction from point samples –Assessing the quality of spatial predictions 6

7 GIS and R Users approaching R with experience of GIS will want to see ‘layers’, ‘vectors’, ‘rasters’, and ‘georeferences’, etc. –They desire that R accommodates familiar concepts and well- known data models and formats For statistician users of R, ‘everything’ is a data.frame, a rectangular table with rows of observations on columns of variables –They want their geo-data analysis result can be visualized and presented in main-stream GIS software programs, and thus extend the usage of the result. 7

8 In R, classes have grown that look like GIS data models to GIS people, and look and behave like data frames from the point of view of applied statisticians. –We have used the R packages sp, gstat and lattice quite often in the past few weeks –sp: provides classes for importing, manipulating and exporting spatial data in R, and for methods including print/show, plot, subset, names, dim, summary, and a number of methods specific to spatial data handling –gstat: creates gstat objects that hold all the information necessary for univariate or multivariate geostatistical prediction (simple, ordinary or universal (co)kriging), or its conditional or unconditional Gaussian or indicator simulation equivalents –lattice: a high-level data visualization system with an emphasis on multivariate data 8

9 A simplified view 9 Data import Data analysis Data export GIS (MapInfo, ArcGIS, Google Earth, etc.) R GIS (MapInfo, ArcGIS, Google Earth, etc.)

10 The real world is much more complicated –Combining spatial data from different sources often means that much more insight is needed into the data formats and data models involved Methods are needed for handling and combining multisource datasets –Transformation of coordinate reference systems –Data import and export –Mash-up for themed studies 10

11 2 Classes for spatial data in R Data frames are containers for data used everywhere in S –view data as a rectangle of rows of observations on columns of variables New-style (S4) classes were introduced in the S language at release 4 –The central advantage of new-style classes is that they have formal definitions that specify the name and type of the components, called slots, that they contain. 11

12 For spatial objects the foundation class is the Spatial class, with just two slots. –The first is a bounding box, a matrix of numerical coordinates with column names c(‘min’, ‘max’), and at least two rows, with the first row eastings (x-axis) and the second northings (y-axis). –The second is a CRS class object defining the coordinate reference system, and may be set to ‘missing’, represented by NA in R, by CRS(as.character(NA)), its default value. Operations on Spatial* objects should update or copy these values to the new Spatial* objects being created. 12

13 > m <- matrix(c(0, 0, 1, 1), ncol = 2, dimnames = list(NULL, + c("min", "max"))) > crs <- CRS(projargs = as.character(NA)) > S <- Spatial(bbox = m, proj4string = crs) 13 Build a simple Spatial object from a bounding box matrix, and a missing coordinate reference system

14 SpatialPoints The SpatialPoints class is the first subclass of Spatial in S > getClass("SpatialPoints") Slots: Name:coordsbboxproj4string Class:matrixmatrixCRS Extends: "Spatial" Known Subclasses: Class "SpatialPointsDataFrame", directly Class "SpatialPixels", directly Class "SpatialGrid", by class "SpatialPixels", distance 2 Class "SpatialPixelsDataFrame", by class "SpatialPixels", distance 2 Class "SpatialGridDataFrame", by class "SpatialGrid", distance 3 14

15 SpatialLines SpatialLines object contains the bounding box and projection information for the list of Lines objects stored in its lines slot. > getClass("SpatialLines") Slots: Name: linesbboxproj4string Class:listmatrixCRS Extends: "Spatial" Known Subclasses: "SpatialLinesDataFrame" 15

16 SpatialPolygons A SpatialPolygons object is a set of Polygons objects with the additional slots of a Spatial object to contain the bounding box and projection information of the set as a whole > getClass("SpatialPolygons") Slots: Name: polygonsplotOrderbboxproj4string Class:listintegermatrixCRS Extends: "Spatial" Known Subclasses: "SpatialPolygonsDataFrame" 16

17 3 Handling coordinate reference systems in R Datum: a surface that represents the shape of the earth Projection: render the surface of an ellipsoid as a plane Project and datum together are referred to as a coordinate reference systems (CRS) –Most countries have multiple CRS, often for very good reasons. This section can be covered in the general topic of ‘Spatial data import and export in R’ (next section), but is worthy to be discussed separately 17

18 The EPSG List The European Petroleum Survey Group (EPSG; now Oil & Gas Producers (OGP) Surveying & Positioning Committee) began collecting a geodetic parameter data set starting in 1986, based on earlier work in member companies. The EPSG list is under continuous development and/or update. A copy of the list is provided in the rgdal package, because it permits the conversion of a large number of CRS into the PROJ.4 style description. –Datum transformation is based on transformation to the World Geodetic System of 1984 (WGS84), or inverse transformation from it to an alternative specified datum. 18

19 The PROJ.4 CRS Specification PROJ.4 is an open-source of library of projection functions; it is not part of R or the sp package, and may be used stand-alone or linked in another program. The PROJ.4 library uses a ‘tag=value’ representation of coordinate reference systems, with the tag and value pairs enclosed in a single character string –The only values used autonomously in CRS class objects are whether the string is a character NA (missing) value for an unknown CRS, and whether it contains the string longlat, in which case the CRS contains geographical coordinates 19

20 > data(meuse) > coordinates(meuse) = ~x + y > str(meuse) We can browse the CRS information either by read the @proj4string slot in the result of str(meuse) or by using the proj4string function > proj4string(meuse) For the meuse data the result is NA 20 Browse CRS information of a spatial object

21 > ?meuse The metadata of meuse dataset shows that the coordinates are in the RDH (Rijksdriehook = Dutch triangulation) CRS We use the make_EPSG utility function of the rgdal package to load the denitions into a list, and then search with grep for the string Amersfoort, which is the name of the origin of the Rijksdriehook (Dutch triangulation) coordinate system > EPSG <- make_EPSG() > (EPSG[grep("Amersfoort", fixed = T, EPSG$note), ]) > rm(EPSG) There are two systems ‘RD old’ and ‘RD new’. meuse data uses ‘RD new’, which is corresponding to reference 28992 in the EPSG database > proj4string(meuse) <- CRS("+init=epsg:28992") We can now browse the WRS information of meuse > proj4string(meuse) 21 Specify a CRS with the EPSG database (1)

22 Unfortunately, the EPSG is not (as of June 2012) up-to-date; in particular it is missing a key issue: the offset of the center of the Earth in the RD system to that in the “WGS84” CRS The Netherlands Geodetic Commission [5] has published the following adjustment +towgs84=565.237,50.0087,465.658,-0.406857,0.350733,- 1.87035,4.0812 Add the WGS84 displacement to the CRS definition for the Meuse dataset > proj4string(meuse) <- CRS(paste(proj4string(meuse), + "+towgs84=565.237,50.0087,465.658,- 0.406857,0.350733,-1.87035,4.0812")) We can now browse the updated WRS information of meuse > proj4string(meuse) 22 Specify a CRS with the EPSG database (2)

23 Result of proj4string(meuse) [1] " +init=epsg:28992 +proj=sterea +lat_0=52.15616055555555 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs +towgs84=565.237,50.0087,465.658,-0.406857,0.350733,-1.87035,4.0812" Now browse the records of the coordinates > head(coordinates(meuse)) Result x y [1,] 181072 333611 [2,] 181025 333558 [3,] 181165 333537 [4,] 181298 333484 [5,] 181307 333330 [6,] 181390 333260 23

24 Transformation of CRS The rgdal package provides methods to convert between projections and datums; this dual process we can call “transformation” between CRS –The work is done by the spTransform method, taking any Spatial* object, and returning an object with coordinates transformed to the target CRS. 24

25 Transform the Meuse dataset from the Rijksdriehook CRS to geographic coordinates on the WGS84 ellipsoid > meuse.wgs84 <- spTransform(meuse, CRS("+proj=longlat +datum=WGS84")) We can now browse the updated WRS information of meuse > proj4string(meuse.wgs84) Result [1] " +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0" Now browse the records of the coordinates > head(coordinates(meuse)) Result x y [1,] 5.758560 50.99153 [2,] 5.757887 50.99105 [3,] 5.759880 50.99086 [4,] 5.761770 50.99037 [5,] 5.761887 50.98899 [6,] 5.763064 50.98836 25 An example of CRS transformation

26 4 Spatial data import and export in R Transformation between sp objects in R and formats of external GIS programs –A number of open source projects made the transformation possible, such as: –Geospatial Data Abstraction Library ( http://www.gdal.org )http://www.gdal.org –PROJ.4 Cartographic Projections Library ( http://proj.maptools.org )http://proj.maptools.org 26

27 Vector File Formats Spatial vector data are points, lines, polygons, and fit the equivalent sp classes. –GIS vector data can be either topological or simple. Legacy GIS were topological, desktop GIS were simple –The sp vector classes are simple, meaning that for each polygon all coordinates are stored without checking that boundaries have corresponding points 27

28 Using OGR Drivers in rgdal Using the OGR vector functions of the Geospatial Data Abstraction Library, interfaced in rgdal, lets us read spatial vector data for which drivers are available. –A driver is a software component plugged-in on demand – here the OGR library tries to read the data using all the formats that it knows, using the appropriate driver if available. OGR also supports the handling of coordinate reference systems directly, so that if the imported data have a specification, it will be read. 28

29 The availability of OGR drivers differs from platform to platform, and can be listed using the ogrDrivers function. The function also lists whether the driver supports the creation of output files. The readOGR function: reads an OGR data source and layer into a suitable Spatial vector object – It takes at least two arguments – they are the data source name (dsn) and the layer (layer), and may take different forms for different drivers. The writeOGR function: allowing data to be written out using supported drivers –Like readOGR, it uses drivers to handle different data formats 29

30 Other vector import/export functions (only with shapefiles) –The shapefiles package is written without external libraries, using file connections. It can be very useful when a shapefile is malformed, because it gives access to the raw numbers. –The maptools package contains a local copy of the library used in OGR for reading shapefiles (the DBF reader is in the foreign package), and provides a low-level import read.shape function, a helper function getinfo.shape to identify whether the shapefile contains points, lines, or polygons. 30

31 Raster File Formats There are now a number of R packages that support image import and export –Such as the rimage and biOps packages and the EBImage package in the Bioconductor project –The requirements for spatial raster data handling include respecting the coordinate reference system of the image, so that specific solutions are needed Using arguments to readGDAL, subregions or bands may be selected, and the data may be decimated, which helps handle large rasters. The writeGDAL function can be used directly for drivers that support file creation. 31

32 Other raster import/export functions –There is a simple readAsciiGrid function in maptools that reads ESRI™ Arc ASCII grids into SpatialGridDataFrame objects; it does not handle CRS and has a single band. The companion writeAsciiGrid is for writing Arc ASCII grids. –It is also possible to use connections to read and write arbitrary binary files, provided that the content is not compressed. 32

33 5 Examples 1.Read and write MapInfo data 2.Export meuse data to Google Earth 33

34 With this example we only show the data import and export First, load the libraries needed > require(sp) > require(gstat) > require(rgdal) > require(maptools) > require(lattice) Set the working directory, where the MapInfo datasets ("CITY_125.DAT“, "CITY_125.ID“, "CITY_125.IND“, "CITY_125.MAP“, and "CITY_125.TAB") located > setwd("E:/RnGIS") 34 Read and write MapInfo data

35 Show the metadata of the layer CITY_125, including CRS information > ogrInfo("CITY_125.TAB", layer="CITY_125") Result Source: "CITY_125.TAB", layer: "CITY_125" Driver: MapInfo File number of rows 125 Feature type: wkbPoint with 2 dimensions Extent: (-157.8583 18.40056) - (-66.10611 61.21806) CRS: +proj=longlat +ellps=GRS80 +towgs84=0,0,0,-0,-0,-0,0 +no_defs Number of fields: 4 name type length typeName 1 City 4 50 String 2 State 4 2 String 3 Tot_hu 0 0 Integer 4 Tot_Pop 0 0 Integer 35

36 Read this source and layer into a suitable Spatial vector object > CITY125 <- readOGR("CITY_125.TAB", layer="CITY_125") Returned notice OGR data source with driver: MapInfo File Source: "CITY_125.TAB", layer: "CITY_125" with 125 features and 4 fields Feature type: wkbPoint with 2 dimensions 36

37 Show the class of CITY125 > class(CITY125) Result [1] "SpatialPointsDataFrame" attr(,"package") [1] "sp" Show the CRS information of CITY125 > proj4string(CITY125) Result "+proj=longlat +ellps=GRS80 +towgs84=0,0,0,-0,-0,-0,0 +no_defs“ 37

38 Plot CITY125 in a diagram > plot(CITY125, pch=1, col="red", border="grey60", + axes=TRUE, asp=1, + main="Big cities in USA") Result 38

39 Add name of cities in the diagram > text(coordinates(CITY125), + labels=as.character(CITY125$City), + font=2, cex=0.25, pos=4) Result 39

40 Export the spatial object to a MapInfo file. dsn="." means exporting to the working folder > writeOGR(CITY125, dsn=".", + layer="CITY125RwriteToMapInfo", + driver="MapInfo File", overwrite_layer=TRUE) Result: in the working folder there will be a list of generaged MapInfo files 40

41 Here we show a simple example of exporting meuse data Zn values as a KML file. In coming lab there will be a few enriched examples with exporting result of kriging predication into Google Earth Load the meuse data > data(meuse) > coordinates(meuse) = ~ x + y Specify the CRS infromation > EPSG <- make_EPSG() > (EPSG[grep("Amersfoort", fixed=T, EPSG$note), ]) > proj4string(meuse) <- CRS("+init=epsg:28992") > proj4string(meuse) <- CRS(paste(proj4string(meuse), + "+towgs84=565.237,50.0087,465.658, + -0.406857,0.350733,-1.87035,4.0812")) 41 Export meuse data to Google Earth

42 Transform the Meuse dataset from the Rijksdriehook CRS to geographic coordinates on the WGS84 ellipsoid > meuse.wgs84 <- spTransform(meuse, + CRS("+proj=longlat +datum=WGS84")) Export the point coverage of Meuse Zn values as a KML file. > writeOGR(meuse.wgs84["zinc"], "meuseZnPoints.kml", + "zinc", driver = "KML", overwrite_layer = TRUE) In the working folder there will be a KML file generated, which can be opened in Google Earth 42

43 43 We can also export the result of predication to Google Earth (details will be in the coming Friday lab)

44 Reading assignments for this week –Handout: Chapter 4 Spatial Data Import and Export. In: Bivand, R.S., Pebesma, E.J., Gómez-Rubio, V., 2008. Applied Spatial Data Analysis with R. Springer, New York, NY, 374pp. –Chapter 9 Selecting and Querying Data. In: MapInfo Professional 11.0 User Guide 44

45 Next classes Friday class: –Interfacing R and GIS – practices Next week –no class – Grand Marshal week Term assignment 45


Download ppt "Interfacing R with GIS Xiaogang (Marshall) Ma School of Science Rensselaer Polytechnic Institute Tuesday, Apr 02, 2013 GIS in the Sciences ERTH 4750 (38031)"

Similar presentations


Ads by Google