Introduction to GIS and Data Francisco Olivera, Ph.D., P.E. Department of Civil Engineering Texas A&M University
GIS: Geographic Information Systems Geographic Information Systems: Database management systems in which the databases include geographic information. A key characteristic of GIS is the explicit linkage between the geographic features represented on a map and the attribute data that describe those features. Overview
Early GIS The term GIS was first used by Roger Tomlinson in the 1960s during his work with the Canada Land Inventory. A GIS was developed to analyze the data collected and to support the development of land management plans for rural areas. Work accomplished at the Harvard Laboratory for Computer Graphics and Spatial Analysis in the 1970s and early 1980s had a major influence on the development of GIS. In 1969, the Environmental Systems Research Institute (ESRI) was founded by Jack Dangermond, a Harvard Lab graduate.
ESRI Software History Toolbox GIS provides a command line interface, while desktop GIS provides a point-and-click graphical user interface (GUI). ArcInfo up to 7.x was a toolbox GIS used for spatial data development and analysis. ArcView 1.x was a desktop GIS used for displaying and printing data only. ArcView 2.x and 3.x, in contrast, had some limited data development, analysis and programming capabilities (compared to ArcInfo) without giving up their desktop character.
ESRI Software History The ESRI software packages ArcInfo 8.x and ArcView 8.x are desktop GIS with strong data development, analysis and display capabilities. Both ArcInfo 8.x and ArcView 8.x consist of three components: ArcMap, ArcCatalog and ArcToolbox, each of which performs specific functions. The two packages differ in the number of commands available, but their interfaces are identical. ArcInfo 8.x also includes ArcInfo Workstation, which is identical to the toolbox GIS available in previous versions of ArcInfo.
Programming Languages ArcInfo up to version 7.x and the current ArcInfo Workstation use Arc Macro Language (AML) as their programming language. ArcView 3.x uses Avenue, an object-oriented programming language developed specifically for ArcView. ArcInfo and ArcView 8.x use Visual Basic for Applications (VBA), a standard programming language in the Windows environment.
Transition The transition from ArcInfo 7.x and ArcView 3.x to ArcInfo 8.x and ArcView 8.x is slower than observed for other software packages. Lack of backward compatibility keeps users from running Avenue applications with ArcInfo 8.x and ArcView 8.x. Lack of GIS applications in VBA for ArcInfo 8.x and ArcView 8.x also keeps users from switching to the new software.
Introduction to ArcGIS ArcGIS is a software program used to create, display and analyze geospatial data. It was developed by Environmental Systems Research Institute (ESRI) of Redlands, California.
Variants of ArcGIS ArcGIS comes in three different versions based on the capabilities provided by the software: ArcView, ArcEditor and ArcInfo. ArcView provides data visualization, query, analysis and integration capabilities along with the ability to create and edit simple geographic features. ArcEditor includes all the functionalities of ArcView and extends these to a multi-user environment. ArcInfo includes all the functionalities of ArcEditor and adds advanced geoprocessing capabilities.
Components of ArcGIS ArcCatalog is used for browsing for maps and spatial data, managing spatial data, and viewing and creating metadata. ArcMap is used for visualizing spatial data, performing spatial analysis and creating maps to show the results.
Definitions Digital Spatial Data: Synthesis – in electronic format – of geographic (map) and tabular (table) information. Data models: Formats in which geographic data is stored and managed.
Data Models Vector Data Models (Features): Points, Lines, Polygons. Raster Data Models (Surfaces). TIN Data Models (Surfaces).
Features Geographic objects that have different shapes are represented as features
Features A point is a pair of x,y coordinates. One-to-one relation between features in the map and records in the table.
Features Lines are sets of coordinates that define a shape. One-to-one relation between features in the map and records in the table.
Features Polygons are sets of coordinates defining boundaries that enclose areas. One-to-one relation between features in the map and records in the table.
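The one-to-one relation between map features and table records can be illustrated with a small sketch. The class name, the identifier field `fid` and the sample attribute values below are hypothetical and chosen only for illustration; they are not how ArcGIS stores features internally.

```python
from dataclasses import dataclass


@dataclass
class PointFeature:
    fid: int    # hypothetical identifier shared with the table record
    x: float
    y: float


# Geometry side: features drawn on the map (made-up sample data).
features = [
    PointFeature(1, 10.0, 20.0),
    PointFeature(2, 15.5, 22.1),
]

# Attribute side: one table record per feature, keyed by the same identifier.
table = {
    1: {"name": "Well A", "depth_m": 35.0},
    2: {"name": "Well B", "depth_m": 48.5},
}


def is_one_to_one(features, table):
    """Return True if every feature has exactly one record and vice versa."""
    ids = [f.fid for f in features]
    return len(ids) == len(set(ids)) and set(ids) == set(table)


def attributes_of(feature, table):
    """Look up the attribute record linked to a feature."""
    return table[feature.fid]
```

Selecting a feature on the map then amounts to looking up its record by the shared identifier, for example `attributes_of(features[0], table)`.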
Data Structures of Features A line is an open sequence of points in which the first and last points are called nodes, and the remaining intermediate points are called vertices. Nodes Vertices
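As a minimal sketch of this definition, assuming a line is held as a plain Python list of (x, y) tuples (an illustrative choice, not an ArcGIS storage format), the nodes and vertices can be separated as follows. The function name is hypothetical.

```python
def nodes_and_vertices(line):
    """Split an open line, given as a list of (x, y) points, into nodes and vertices."""
    if len(line) < 2:
        raise ValueError("a line needs at least two points")
    nodes = (line[0], line[-1])   # first and last points of the sequence
    vertices = line[1:-1]         # all intermediate points
    return nodes, vertices


# Sample line with four points: two nodes and two vertices.
line = [(0.0, 0.0), (1.0, 0.5), (2.0, 0.5), (3.0, 1.0)]
nodes, vertices = nodes_and_vertices(line)
```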
Vector Data Implementations ArcGIS uses three different implementations of vector data: coverages, shapefiles, and feature classes in geodatabases. These three types of storage have to do with the type of data structure chosen to store the data. Coverages and shapefiles are file-based models, whereas geodatabase models are database management system (DBMS) feature models.
Data Structures of Features Simple lines Complex lines
Simple polygons Complex polygons Data Structures of Features
Space-filling polygons Not space-filling polygons
Data Structures of Surfaces Grid datasets: Cellular-based data structure composed of square cells of equal size arranged in rows and columns. Grid definition requires: (1) the coordinates of the upper-left corner, (2) the cell size, (3) the number of rows, (4) the number of columns, and (5) the value at each cell. Cells that do not store any value are called NODATA cells. [Figure: grid layout labeled with the upper-left corner (x, y), the cell size, the number of rows and the number of columns]
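The five items in the grid definition are enough to locate any cell. The sketch below is a simplified in-memory illustration, not the ESRI Grid file format. It assumes that y increases northward, so rows are counted downward from the upper-left corner, and it uses `None` as a stand-in for NODATA. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

NODATA = None  # stand-in for a cell that stores no value


@dataclass
class Grid:
    x_upper_left: float                   # (1) upper-left corner, x
    y_upper_left: float                   # (1) upper-left corner, y
    cell_size: float                      # (2) side length of each square cell
    n_rows: int                           # (3) number of rows
    n_cols: int                           # (4) number of columns
    values: List[List[Optional[float]]]   # (5) values[row][col], row 0 at the top

    def cell_center(self, row, col):
        """Return the map coordinates of the center of a cell."""
        x = self.x_upper_left + (col + 0.5) * self.cell_size
        y = self.y_upper_left - (row + 0.5) * self.cell_size
        return x, y

    def value_at(self, x, y):
        """Return the value of the cell that contains the point (x, y)."""
        col = int((x - self.x_upper_left) // self.cell_size)
        row = int((self.y_upper_left - y) // self.cell_size)
        if not (0 <= row < self.n_rows and 0 <= col < self.n_cols):
            raise ValueError("point lies outside the grid")
        return self.values[row][col]


# A 2-row by 3-column grid with 10-unit cells and one NODATA cell.
grid = Grid(100.0, 200.0, 10.0, 2, 3,
            [[1.0, 2.0, NODATA],
             [4.0, 5.0, 6.0]])
```

Because the cells are square and regularly spaced, no coordinates need to be stored per cell; position follows from the row and column indices alone.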
Surfaces Grid datasets
TIN Datasets Surfaces
Data Structures of Surfaces Triangular Irregular Network (TIN) Datasets: Dataset constructed by connecting points, at which the value of the TIN parameter is known, to form triangles. Triangle sides are constructed by connecting adjacent points so that the minimum angle of each triangle is maximized. Triangle sides cannot cross breaklines. The TIN format stores data efficiently because its resolution adjusts to the spatial variability of the parameter.
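The "maximize the minimum angle" rule can be seen in its smallest case: four points forming a convex quadrilateral can be split into two triangles along either diagonal, and the rule picks the diagonal whose triangles have the larger smallest angle. The sketch below illustrates only that local choice. It is not a full triangulation algorithm, it ignores breaklines, and its function names are hypothetical.

```python
import math


def angle_at(p, q, r):
    """Interior angle, in degrees, at vertex p of the triangle p, q, r."""
    v1 = (q[0] - p[0], q[1] - p[1])
    v2 = (r[0] - p[0], r[1] - p[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    return math.degrees(math.atan2(abs(cross), dot))


def min_angle(tri):
    """Smallest interior angle of a triangle given as three (x, y) points."""
    a, b, c = tri
    return min(angle_at(a, b, c), angle_at(b, c, a), angle_at(c, a, b))


def best_diagonal(a, b, c, d):
    """Split the convex quadrilateral a-b-c-d (vertices in order) into two
    triangles, choosing the diagonal that maximizes the smallest angle."""
    option_ac = [(a, b, c), (a, c, d)]
    option_bd = [(a, b, d), (b, c, d)]
    score_ac = min(min_angle(t) for t in option_ac)
    score_bd = min(min_angle(t) for t in option_bd)
    return option_ac if score_ac >= score_bd else option_bd


# A flat rhombus: the long diagonal a-c would create two sliver triangles,
# so the short diagonal b-d is preferred.
a, b, c, d = (0.0, 0.0), (5.0, -1.0), (10.0, 0.0), (5.0, 1.0)
triangles = best_diagonal(a, b, c, d)
```

Avoiding sliver triangles in this way keeps interpolation within each triangle between points that are actually close to one another.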
Triangular Irregular Network (TIN) Datasets Data Structures of Surfaces
Image Datasets Surfaces
Data Structures of Surfaces Image datasets: ARC Digitized Raster Graphics (ADRG); Windows bitmap images (BMP) [.bmp]; Multiband (BSQ, BIL and BIP) and single band images [.bsq, .bil and .bip]; ERDAS [.lan and .gis]; ESRI Grid datasets; IMAGINE [.img]; IMPELL Bitmaps [.rlc]; Image catalogs; JPEG [.jpg]; MrSID [.sid]; National Image Transfer Format (NITF); Sun rasterfiles [.rs, .ras and .sun]; Tag Image File Format (TIFF) [.tiff, .tif and .tff]; TIFF/LZW.
Storing Datasets Features: Coverages are stored partially in their own folder and partially in the common INFO folder. Shapefiles are stored in at least three files (with extensions .shp, .shx and .dbf) and up to seven files (the additional ones with extensions .sbx, .sbn, .ain and .aih). Feature classes are stored inside geodatabases, which are single files with extension .mdb.
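Because a shapefile is really a set of files, it can be useful to check that the three required components are present before using it. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and lower-case extensions are assumed.

```python
from pathlib import Path

REQUIRED = (".shp", ".shx", ".dbf")          # always present
OPTIONAL = (".sbn", ".sbx", ".ain", ".aih")  # index files, present only sometimes


def shapefile_components(shp_path):
    """Map each known extension to True/False depending on whether
    the corresponding component file exists next to the .shp file."""
    shp = Path(shp_path)
    return {ext: shp.with_suffix(ext).exists() for ext in REQUIRED + OPTIONAL}


def missing_required(shp_path):
    """List the required extensions whose files are missing."""
    found = shapefile_components(shp_path)
    return [ext for ext in REQUIRED if not found[ext]]
```

For a hypothetical file `roads.shp`, an empty list from `missing_required("roads.shp")` would indicate that the .shp, .shx and .dbf files are all in place.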
Storing Datasets Surfaces: Grid and TIN datasets are stored partially in their own folder and partially in the common INFO folder. Image datasets are stored in different ways depending on the image format. Structure of a folder containing different types of digital spatial data: Coverage, Grid, TIN, Image.tif, Shapefile.shp, Shapefile.shx, Shapefile.dbf, Info.
Managing Datasets Renaming: Always use ArcGIS utilities to rename coverages, shapefiles, feature classes, grids and TINs because some information is internally stored with the dataset name. Images can be renamed using the operating system utilities. Copying and Moving: Always use ArcGIS utilities to copy and move coverages, grids and TINs to make sure the information stored in the INFO folder is included. Shapefiles, geodatabases and images can be moved or copied using the operating system utilities, making sure all the files are included. ArcGIS utilities should be used to copy and move feature classes or feature datasets from geodatabases.
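When a shapefile is copied with operating system utilities, the point is to take every component file along. A minimal sketch of that idea, using only the Python standard library, is shown below. The function name is hypothetical, and the sketch assumes that all components share the base name of the .shp file and carry a single extension. It applies to shapefiles only; coverages, grids and TINs should still be copied with ArcGIS utilities.

```python
import shutil
from pathlib import Path


def copy_shapefile(shp_path, dest_dir):
    """Copy every file that shares the shapefile's base name into dest_dir
    and return the names of the files copied, in sorted order."""
    src = Path(shp_path)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for part in sorted(src.parent.iterdir()):
        # Same base name, any single extension (.shp, .shx, .dbf, .sbn, ...).
        if part.is_file() and part.stem == src.stem:
            shutil.copy2(part, dest / part.name)
            copied.append(part.name)
    return copied
```

Copying by base name rather than by a fixed list of extensions means that optional index files are carried along when they exist.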
Sharing Datasets Interchange files: Coverages, grids and TINs are shared as interchange files. An interchange file is a single file – with extension E00 – that includes all information stored in the dataset folder and its share of information contained in the INFO folder. If a limit is set on the size of the interchange file, then several smaller files (i.e., E00, E01, E02, …) are generated rather than one single file. This option was common when storage media had limited capacity. An interchange file is obtained by exporting a coverage, grid or TIN. In turn, a coverage, grid or TIN is obtained by importing an interchange file.