Data Analysis using Python-I Baburao Kamble and Ayse Kilic University of Nebraska-Lincoln University of Nebraska-Lincoln GIS in Water Resources Lecture, 2014
Objectives To be able to understand and write Python scripts 9/17/2018 Objectives To be able to understand and write Python scripts Numerical, Text (Weather) &Geospatial (Environmental) data manipulation
Agenda Part I: Basic Python Programming 9/17/2018 Agenda Part I: Basic Python Programming Python Syntax, Strings, Array, Conditional and Control flow, Functions File Input/Output and Text Data Processing Part II: Geospatial Data Analysis Installation of GDAL bindings on windows operating system Reading Raster (Landsat and DEM) data Using GDAL function from command line (interpolation, translation) Raster Processing. NDVI calculation DN2Rad2Ref
Why use GDAL/NumPy instead of canned GIS software? Not advisable if what you want to do is easily handled within ArcGIS/Imagine/etc. – there is a lot of programming overhead Well suited for process model applications where the logic at a cell based is too complex Example: Grid algebra : grid1 + grid2 (probably use GIS) Finding NN in multidimensional space (maybe use GDAL/Numpy) Also useful if your spatial data is NOT standard GIS formats (JPEG, PNG, etc.)
Geoprocessing with GDAL and Numpy in Python GDAL - Geospatial Data Abstraction Library Numpy - the N-dimensional array package for scientific computing with Python. Both of them are open source software Read raster dataset using GDAL Do some calculation using Numpy Output to geospatial dataset using GDAL
Python Libraries for Geospatial Development There are two popular Python libraries for raster and vector data: GDAL (Geospatial Data Abstraction Library) Is for raster-based geospatial data; aailable for download at: http://gdal.org Windows user use Frank Warmerdam’s “FWTools” open source GIS binaries for Windows (32 bit); available at: http://fwtools.maptools.org # includes gdal and ogr OGR Is for vector-based geospatial data and is available for download at: http://gdal.org/ogr NOTE: These are now merged together and are downloaded together under the common name GDAL
GDAL (Geospatial Data Abstraction Library) Presents an “abstract data model” for processing spatial data Can be used directly from C/C++ and can be “wrapped” for use with Python, Perl, VB, C#, R, Java … Early developers have chosen Python as their scripting language and documentation is relatively good for this.
GDAL Geospatial Data Abstraction Library Raster data access UGIC 2009 GDAL Geospatial Data Abstraction Library Raster data access Supports about 100 different formats ArcInfo grids, ArcSDE raster, Imagine, Idrisi, ENVI, GRASS, GeoTIFF HDF4, HDF5 USGS DOQ, USGS DEM ECW, MrSID TIFF, JPEG, JPEG2000, PNG, GIF, BMP
NumPy (Numerical Python) An array/matrix package for Python Well suited for image processing – i.e. one function can operate on the entire array Slicing by dimensions and applying functions to these slices is concise and straightforward Nearly 400 methods defined for use with NumPy arrays (e.g. type conversions, mathematical, logical, etc.)
GDAL and NumPy Since GDAL 1.3(?), GDAL has implemented NG (New Generation) Python bindings which includes NumPy Process: Get raster band(s) Convert the raster band(s) to a NumPy array using ReadAsArray() Open GDALDataset Write out GDALDataset Process the raster band(s) using NumPy functionality Convert the NumPy array to GDAL raster bands using WriteAsArray()
GDAL GDAL installation 9/17/2018 GDAL GDAL installation Make sure NumPy is installed (http://sourceforge.net/projects/numpy/files/) Download gdalwin32exe160.zip from http://download.osgeo.org/gdal/win32/1.6/ Unzip to C:\gdalwin32-1.6
Test Your GDAL in Python 9/17/2018 Test Your GDAL in Python
GDAL Supports more than 100 raster formats 9/17/2018 GDAL Supports more than 100 raster formats http://www.gdal.org/formats_list.html Finding available formats gdalinfo --formats
GDAL Data Model GDAL’s data model includes: Dataset, which hold the raster data in a collection of raster bands Raster Band: represents a band, channel, or layer within an image (e.g., RGB image has red, green, and blue components of the image) Raster size: specifies the width of the image in pixels and overall height of the image in lines Georeferencing transform converts from x,y coordinates into georeferenced coordinates (on the surface of Earth) Affine transformation mathematical formula allowing operations such as X offset, Y offset, X scale, Y scale, horizontal shear, vertical shear Coordinate system includes the projection and datum Metadata provide additional information about the dataset
GDAL example Python code Use GDAL to calculate the average of the height values contained in a DEM: from osgeo import gdal, gdalconst dataset = gdal.Open(”DEM.dat”) band = dataset.GetRAsterBand(1) fmt= “<“ + (“h” * band.XSize) totHeight = 0 for y in range (band.Ysize): scanline = band.ReadRaster (0, y, band.Xsize, 1, band.Xsize, 1, band.DataType) values = struct.unpack(fmt, scanline) for value in values: totHeight = totHeight + value average = totHeight / (band.xSize * band.YSize) print “Average height = “, average
OGR Datasource: represents the file (e.g., a country), has many: Layers: sets of related data, e.g., terrain, contour lines, roads layers Spatial reference: specifies datum and projection Attributes: additional metadata about the feature Geometry: specifies the shape and location of the feature. Are recursive, i.e., can have sub-geometries
OGR example Python code import osgeo.ogr from osgeo import ogr shapefile = ogr.Open(“TM_WORLD_BORDERS-0.3.shp”) layer = shapefile.GetLayer (0) for i in range (layer.GetFeatureCount()): feature = layer.GetFeature(i) name = feature.GetField(“NAME”) geometry = feature.GetGeometryRef() print i, name, geometry.GetGeometryName()
Sample 1 Read two tif files (red band and nir band) Calculate Output NDVI in same projection and georeference as the input file. Numpy example
Raster File input/output 9/17/2018 Raster File input/output read_raster.py read_Write_raster_in_Block.py NDVI.py DN2rad2ref.py
GDAL Command Line Utilities 20202020 GDAL Command Line Utilities gdalinfo - report information about a file. gdal_translate - Copy a raster file, with control of output format. gdaladdo - Add overviews to a file…pyramids gdalwarp - Warp an image into a new coordinate system. gdal_contour - Contours from DEM. gdaldem - Tools to analyze and visualize DEMs. rgb2pct.py - Convert a 24bit RGB image to 8bit paletted. pct2rgb.py - Convert an 8bit paletted image to 24bit RGB. dal_merge.py - Build a quick mosaic from a set of images. gdal_rasterize - Rasterize vectors into raster file. gdaltransform - Transform coordinates. nearblack - Convert nearly black/white borders to exact value. gdal_grid - Create raster from the scattered data. gdal_polygonize.py - Generate polygons from raster. gdal_sieve.py - Raster Sieve filter. gdal_fillnodata.py - Interpolate in nodata regions. gdal-config - Get options required to build software using GDAL.