Lecture 10 Creating and Maintaining Geographic Databases Longley et al., Ch. 10, through section 10.4
Outline Definitions Characteristics of DBMS Types of databases Relational model SQL Queries as a gateway to spatial analysis
cell towers +/- 500 m Google db of tower locations Graphic courtesy of Wired, Feb Wi-Fi +/- 30 m Skyhook servers and db iPhone GPS +/- 10 m iPhone uses reference network
Definitions Database – an integrated set of data (attributes) on a particular subject Geographic (=geospatial) database - database containing geographic data of a particular subject for a particular area Database Management System (DBMS) – software to create, maintain and access databases
A GIS can answer the question: What is where? WHAT: Characteristics of features (= attributes). WHERE: In geographic space.
A GIS links attribute and spatial data Attribute Data Flat File or DBMS Relationships Topology Table Map Data Point File Line File Area File Topology Type
Flat File or DBMS RecordValue Attribute RecordValue RecordValue
Ancient DBMS From Clarke, Getting Started with GIS
Types of DBMS Models Hierarchical Network Relational - RDBMS Object-oriented - OODBMS Object-relational - ORDBMS
Historically, databases were structured hierarchically in flat files...
Relational Databases rule now
Characteristics of DBMS (1) Support for multiple data types e.g MS Access: Text, Memo, Number, Date/Time, Currency, AutoNumber, Yes/No, OLE Object, Hyperlink, Lookup Wizard Load data from files, databases and other applications Index for rapid retrieval
Characteristics of DBMS (2) Query language – e.g., SQL Security – controlled access to data Multi-level groups Controlled update using a transaction manager Backup and recovery
Characteristics of DBMS (3) Applications Forms builder Reportwriter Internet Application Server CASE tools Programmable API
Geographic Information System Database Management System Data loading Editing Visualization Mapping Analysis Storage Indexing Security Query Data System Task Role of DBMS
Relational DBMS (1) Data stored as tuples (tup-el), conceptualized as tables Table – data about a class of objects Two-dimensional list (array) Rows = objects Columns = object states (properties, attributes)
Table Row = object Vector feature Column = attribute
Relational DBMS (2) Most popular type of DBMS Over 95% of data in DBMS is in RDBMS Commercial systems IBM DB2 Informix Microsoft Access Microsoft SQL Server Oracle Sybase
Relational Join Fundamental query operation Occurs because Data created/maintained by different users, but integration needed for queries Table joins use common keys (column values) Table (attribute) join concept has been extended to geographic case
Relational Databases
SQL Structured (Standard) Query Language – (pronounced SEQUEL) Developed by IBM in 1970s Now de facto and de jure standard for accessing relational databases Three types of usage Stand alone queries High level programming Embedded in other applications
Types of SQL Statements Data Definition Language (DDL) Create, alter and delete data CREATE TABLE, CREATE INDEX Data Manipulation Language (DML) Retrieve and manipulate data SELECT, UPDATE, DELETE, INSERT Data Control Languages (DCL) Control security of data GRANT, CREATE USER, DROP USER
Spatial Search: Gateway to Spatial Analysis Overlay is a spatial retrieval operation that is equivalent to an attribute join. Buffering is a spatial retrieval around points, lines, or areas based on distance.
Overlay Image courtesy of K. Foote/M. Lynch, UT-Austin
Overlay 0 1
Overlay like an attribute join
Types of overlay operations Union Intersect Identity Max Min Etc.
Union computes the geometric intersection of two polygon coverages. All polygons from both coverages will be split at their intersections and preserved in the output coverage.
Union within 25 miles of a city OR within 25 miles of a major river.
Intersect computes the geometric intersection of two coverages. Only those features in the area common to both coverages will be preserved in the output coverage.
Intersect within 25 miles of a city AND within 25 miles of a major river.
Identity computes the geometric intersection of two coverages. All features of the input coverage, as well as those features of the identity coverage that overlap the input coverage, are preserved in the output coverage.
Identity within 25 miles of a city OR within 25 miles of a major river. within 25 miles of a city AND within 25 miles of a major river. Portion of the major city buffer WITHIN the major river buffer
Identity Intersect
Buffer
Complex Retrieval: Map Algebra Combinations of spatial and attribute queries can build some complex and powerful GIS operations, such as weighting. Weighted overlay analysis really just complex retrieval.
Map Algebra Compared with RAINFALL 1990 RAINFALL 1991 MAX RAINFALL 1990-’91
Recode OR
A-B = AGRICULTURAL C-E = NON-AGRICULTURAL