Published by Randolph Murphy. Modified over 9 years ago.
1
Guofeng Cao
CyberInfrastructure and Geospatial Information Laboratory
Department of Geography
National Center for Supercomputing Applications (NCSA)
University of Illinois at Urbana-Champaign
Geog 480: Principles of GIS
2
What we have learned
The Nature of Geographic Data:
  o Tobler’s first law of geography (spatial dependence)
  o Spatial heterogeneity
  o Fractal behavior
GIS functionalities (spatial analysis)
  o Geometric, topological, and set-oriented analysis
  o Field-based analysis
  o Network analysis
  o Overlay analysis
Data and database
  o Spatial data: vector vs. raster
Hardware
3
Database Fundamentals
4
What is a database?
A database is a collection of data organized in such a way that a computer can efficiently store and retrieve it
  o A repository of data that is logically related
A database is created and maintained using a general-purpose piece of software called a database management system (DBMS)
5
The database approach
Before databases, computers were primarily used to convert data between different formats
  o “The computer as a giant calculator”
Databases treat computers as useful repositories of data
  o “The computer as data repository”
Most applications (including GIS) require a balance of processing and storage
6
Databases in a nutshell
In order to be effective, databases must offer the following functions:
  – Reliability
  – Integrity
  – Security
  – User views
  – User interface
  – Data independence
  – Self-describing
  – Concurrency
  – Distributed capabilities
  – High performance
All these functions are managed by the DBMS
7
Nutty Nuggets #1
We might write a program to organize the stock for the “Nutty Nuggets” restaurant
As time goes on, this program will become more complex, offering more functions
[Figure: Stage 1 and Stage 2 of the program]
8
Nutty Nuggets #2
Key problems with the previous approach are:
  o Loss of integrity
  o Loss of independence
  o Loss of security
Stage 3, the database, solves these problems
[Figure: Stage 3]
9
Common database applications
Home/office database
  o Simple applications (e.g., Nutty Nuggets)
Commercial database
  o Stores the information for businesses (e.g., customers, employees)
Engineering database
  o Used to store engineering designs (e.g., CAD)
Image and multimedia database
  o Stores image, audio, and video data
Geodatabase
  o Stores a combination of spatial and non-spatial data
10
Elements of a DBMS
Query language
Query compiler
Runtime database processor
Constraint enforcer
Stored data manager
System catalog/data dictionary
11
Transaction management
A transaction is an atomic unit of interaction between user and database
  o Insertion of data
  o Modification of data
  o Deletion of data
  o Retrieval of data
Transaction management must support
  o Concurrency (multiple users accessing the same data at the same time)
  o Recovery management (retrieval of a valid database state following system failure)
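A minimal sketch of atomicity, using Python's built-in sqlite3 module as a stand-in DBMS. The `stock` table, its items, and the `transfer` helper are hypothetical, loosely in the spirit of the Nutty Nuggets stock example. Either both updates in a transfer take effect or neither does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: quantities may never go negative.
conn.execute(
    "CREATE TABLE stock (item TEXT PRIMARY KEY, "
    "quantity INTEGER CHECK (quantity >= 0))"
)
conn.execute("INSERT INTO stock VALUES ('peanuts', 10), ('cashews', 5)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` units from src to dst as one atomic transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE stock SET quantity = quantity + ? WHERE item = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE stock SET quantity = quantity - ? WHERE item = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "cashews", "peanuts", 3)        # succeeds
failed = transfer(conn, "cashews", "peanuts", 100)  # violates CHECK, rolled back
quantities = dict(conn.execute("SELECT item, quantity FROM stock"))
print(ok, failed, quantities)
```

The second transfer fails on its second statement, and the rollback also undoes its first statement, so the database stays in a valid state.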
12
Concurrency: Lost update
A lost update can occur when the operations of concurrent transactions are incorrectly interleaved, so that one transaction overwrites the result of another
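The lost update problem can be reproduced with a small deterministic simulation. This is only an illustrative sketch: the transactions T1 and T2, the starting balance, and the deposit amounts are assumptions, not taken from the original slide.

```python
def run(schedule):
    """Execute (transaction, action) steps in order against a shared value."""
    db = {"balance": 100}             # the stored value (hypothetical)
    local = {}                        # each transaction's private copy
    deposits = {"T1": 50, "T2": 30}   # hypothetical deposits
    for txn, action in schedule:
        if action == "read":
            local[txn] = db["balance"]
        elif action == "write":
            db["balance"] = local[txn] + deposits[txn]
    return db["balance"]


# Serial schedule: T1 finishes before T2 starts.
serial = [("T1", "read"), ("T1", "write"), ("T2", "read"), ("T2", "write")]
# Bad interleaving: both read the old value before either writes.
interleaved = [("T1", "read"), ("T2", "read"), ("T1", "write"), ("T2", "write")]

serial_result = run(serial)            # 180: both deposits applied
interleaved_result = run(interleaved)  # 130: T1's deposit is lost
print(serial_result, interleaved_result)
```

In the interleaved schedule T2 writes a value computed from a stale read, which silently discards T1's update.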
13
Relational databases
14
Database architectures
Most databases today are either:
  o Relational; or
  o Object-oriented (especially useful for spatial data)
Early database systems were based on the hierarchical model
  o Efficient storage, but limited expressiveness
The network model was used to overcome the lack of expressiveness in hierarchical databases
  o But led to highly complex database systems
The deductive model is an active research area today
  o Stores rules in addition to facts
15
The relational model
A relational database is a collection of relations, often just called tables
Each relation has a set of attributes
The data in the relation is structured as a set of rows, often called tuples
Each tuple consists of data items for each attribute
Each cell in a tuple contains a single value
A relational database management system (RDBMS) is the software that manages a relational database
16
Example relation
[Figure: an example relation with its parts labeled: relation, attribute, tuple, data item]
17
Relations
A relation is basically a “table”
A relation scheme is the set of attribute names and the domain (data type) for each attribute name
A database scheme is a set of relation schemes
In a relation:
  o Each tuple contains as many values as there are attributes in the relation scheme
  o Each data item is drawn from the domain for its attribute
  o The order of tuples is not significant
  o Tuples in a relation are all distinct from each other
In most relational systems, data items are atomic
  o A relation that contains only atomic items is said to be in first normal form (1NF)
The degree of a relation is its number of columns
The cardinality of a relation is the number of tuples
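These properties map naturally onto a set of typed tuples. The following is a sketch, not how an RDBMS stores data; the `Bar` scheme is a hypothetical name for a relation with the attributes bar and address.

```python
from typing import NamedTuple


class Bar(NamedTuple):
    """Relation scheme: attribute names with their domains (types)."""
    bar: str
    address: str


# A relation modeled as a set of tuples: duplicates collapse and
# the order of tuples carries no meaning.
bars = {
    Bar("Murphy's", "604 Green St."),
    Bar("Legend's", "522 Green St."),
    Bar("Murphy's", "604 Green St."),  # duplicate tuple, stored only once
}

degree = len(Bar._fields)   # number of attributes (columns)
cardinality = len(bars)     # number of distinct tuples
print(degree, cardinality)
```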
18
Relation scheme
A candidate key is an attribute or minimal set of attributes that will uniquely identify each tuple in a relation
One candidate key is usually chosen as the primary key
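As a rough illustration, the minimal attribute sets that are unique within one particular set of tuples can be found by brute force. The helper names `is_unique` and `candidate_keys` are hypothetical. Note the caveat: a true candidate key must hold for every valid instance of the scheme, so a check against one instance can only rule keys out, not confirm them.

```python
from itertools import combinations


def is_unique(rows, attrs):
    """True if no two rows agree on all of the given attributes."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def candidate_keys(rows, attributes):
    """Minimal attribute sets that are unique in this instance."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(k) <= set(attrs) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_unique(rows, attrs):
                keys.append(attrs)
    return keys


sells = [
    {"bar": "Murphy's", "beer": "Bud", "price": 2.50},
    {"bar": "Murphy's", "beer": "Miller", "price": 2.75},
    {"bar": "Legend's", "beer": "Bud", "price": 2.50},
    {"bar": "Legend's", "beer": "Miller", "price": 3.00},
]
keys = candidate_keys(sells, ("bar", "beer", "price"))
print(keys)
```

For this instance the search reports both (bar, beer) and (bar, price). The second is unique only by coincidence of the sample prices, which is why keys are declared from the meaning of the data rather than inferred from one table.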
19
Operations on relations
Fundamental relational operators:
  o Union, intersection, and difference: the usual set operations, but both operands must have the same schema
  o Selection: picking certain rows
  o Projection: picking certain columns
  o Products and joins: compositions of relations
Together, these operations and the ways they are combined are called relational algebra
  o An algebra whose operands are relations or variables that represent relations
The relational model is said to be closed, because relational operators take one or more relations as input and return a relation
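The set operators and the closure property can be sketched in a few lines. The `Relation` type and the two sample menus are assumptions made for illustration; every operator returns another `Relation`, so results can be fed straight into further operators.

```python
from typing import FrozenSet, NamedTuple, Tuple


class Relation(NamedTuple):
    schema: Tuple[str, ...]
    rows: FrozenSet[tuple]


def _require_same_schema(r, s):
    if r.schema != s.schema:
        raise ValueError("operands must have the same schema")


def union(r, s):
    _require_same_schema(r, s)
    return Relation(r.schema, r.rows | s.rows)


def intersection(r, s):
    _require_same_schema(r, s)
    return Relation(r.schema, r.rows & s.rows)


def difference(r, s):
    _require_same_schema(r, s)
    return Relation(r.schema, r.rows - s.rows)


# Hypothetical sample data: two menus with the same schema.
murphys = Relation(("beer", "price"), frozenset({("Bud", 2.50), ("Miller", 2.75)}))
legends = Relation(("beer", "price"), frozenset({("Bud", 2.50), ("Miller", 3.00)}))

all_offers = union(murphys, legends)
shared = intersection(murphys, legends)
only_murphys = difference(murphys, legends)
# Closure: the output of one operator is a valid input to another.
composed = difference(all_offers, shared)
print(len(all_offers.rows), shared.rows, only_murphys.rows)
```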
20
Project operator
The project operator is unary
  o It outputs a new relation that has a subset of attributes
  o Identical tuples in the output relation are coalesced

Relation Sells:
  bar        beer     price
  Murphy’s   Bud      2.50
  Murphy’s   Miller   2.75
  Legend’s   Bud      2.50
  Legend’s   Miller   3.00

Prices := PROJ beer,price (Sells):
  beer     price
  Bud      2.50
  Miller   2.75
  Miller   3.00
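A minimal sketch of projection over the Sells relation, with rows held as dictionaries. The function name `project` is an assumption; building a set is what coalesces the identical (Bud, 2.50) tuples.

```python
def project(rows, attrs):
    """Keep only the given attributes; the set removes duplicate tuples."""
    return {tuple(row[a] for a in attrs) for row in rows}


sells = [
    {"bar": "Murphy's", "beer": "Bud", "price": 2.50},
    {"bar": "Murphy's", "beer": "Miller", "price": 2.75},
    {"bar": "Legend's", "beer": "Bud", "price": 2.50},
    {"bar": "Legend's", "beer": "Miller", "price": 3.00},
]

prices = project(sells, ("beer", "price"))
print(sorted(prices))
```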
21
Select operator
The select operator is unary
  o It outputs a new relation that has a subset of tuples
  o A condition specifies those tuples that are required

Relation Sells:
  bar        beer     price
  Murphy’s   Bud      2.50
  Murphy’s   Miller   2.75
  Legend’s   Bud      2.50
  Legend’s   Miller   3.00

MurphyMenu := SELECT bar=“Murphy’s” (Sells):
  bar        beer     price
  Murphy’s   Bud      2.50
  Murphy’s   Miller   2.75
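Selection can be sketched as a filter with a caller-supplied condition. The function name `select` and the dictionary row representation are assumptions for illustration.

```python
def select(rows, condition):
    """Keep only the tuples for which the condition holds."""
    return [row for row in rows if condition(row)]


sells = [
    {"bar": "Murphy's", "beer": "Bud", "price": 2.50},
    {"bar": "Murphy's", "beer": "Miller", "price": 2.75},
    {"bar": "Legend's", "beer": "Bud", "price": 2.50},
    {"bar": "Legend's", "beer": "Miller", "price": 3.00},
]

murphy_menu = select(sells, lambda row: row["bar"] == "Murphy's")
print(murphy_menu)
```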
22
Join operator
The join operator is binary
  o It outputs the combined relation where tuples agree on a specified attribute (natural join)

Sells(bar, beer, price):
  bar        beer     price
  Murphy’s   Bud      2.50
  Murphy’s   Miller   2.75
  Legend’s   Bud      2.50
  Legend’s   Coors    3.00

Bars(bar, address):
  bar        address
  Murphy’s   604 Green St.
  Legend’s   522 Green St.

BarInfo := Sells JOIN Bars
Note: Bars.name has become Bars.bar to make the natural join “work.”

BarInfo(bar, beer, price, address):
  bar        beer     price   address
  Murphy’s   Bud      2.50    604 Green St.
  Murphy’s   Miller   2.75    604 Green St.
  Legend’s   Bud      2.50    522 Green St.
  Legend’s   Coors    3.00    522 Green St.
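A natural join can be sketched as a nested loop that pairs tuples agreeing on every attribute name the two relations share. This is an illustrative implementation only (real systems use faster join algorithms), and the name `natural_join` is an assumption.

```python
def natural_join(left, right):
    """Combine tuples that agree on all attributes common to both relations."""
    result = []
    for l in left:
        for r in right:
            shared = l.keys() & r.keys()
            if all(l[a] == r[a] for a in shared):
                result.append({**l, **r})
    return result


sells = [
    {"bar": "Murphy's", "beer": "Bud", "price": 2.50},
    {"bar": "Murphy's", "beer": "Miller", "price": 2.75},
    {"bar": "Legend's", "beer": "Bud", "price": 2.50},
    {"bar": "Legend's", "beer": "Coors", "price": 3.00},
]
bars = [
    {"bar": "Murphy's", "address": "604 Green St."},
    {"bar": "Legend's", "address": "522 Green St."},
]

bar_info = natural_join(sells, bars)
for row in bar_info:
    print(row)
```

Because the shared attribute is found by name, the join only works once both relations call it `bar`, which is the point of the renaming note on the slide.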
23
Join operator
Join is the most time-consuming of all relational operators to compute
  o In general, relational operators may not be arbitrarily reordered (e.g., left join, right join)
  o Query optimization aims to find an efficient way of processing queries, for example by reordering operators to produce equivalent but more efficient queries
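One common reordering is to apply a selection before an inner join rather than after it. The sketch below counts tuple comparisons in a naive nested-loop join to show that both plans give the same answer at different cost. The counting scheme and function names are assumptions, and real optimizers rely on much richer cost models.

```python
def join_on_bar(left, right, counter):
    """Nested-loop inner join on 'bar'; counter[0] tallies comparisons."""
    result = []
    for l in left:
        for r in right:
            counter[0] += 1
            if l["bar"] == r["bar"]:
                result.append({**l, **r})
    return result


def only_murphys(rows):
    return [row for row in rows if row["bar"] == "Murphy's"]


sells = [
    {"bar": "Murphy's", "beer": "Bud", "price": 2.50},
    {"bar": "Murphy's", "beer": "Miller", "price": 2.75},
    {"bar": "Legend's", "beer": "Bud", "price": 2.50},
    {"bar": "Legend's", "beer": "Coors", "price": 3.00},
]
bars = [
    {"bar": "Murphy's", "address": "604 Green St."},
    {"bar": "Legend's", "address": "522 Green St."},
]

# Plan A: join everything, then select.
cost_a = [0]
plan_a = only_murphys(join_on_bar(sells, bars, cost_a))

# Plan B: select first, then join the smaller input.
cost_b = [0]
plan_b = join_on_bar(only_murphys(sells), bars, cost_b)

print(plan_a == plan_b, cost_a[0], cost_b[0])
```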
24
Complex relational operator example
Join relations SHOW and FILM using FILM_NAME and TITLE
Select using CINEMA_ID=1
Project TITLE, DIRECTOR, CINEMA_ID, and SCREEN_NO
For full database see book web site: http://worboys.duckham.org
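The three steps above correspond to a single SQL query, sketched here with Python's built-in sqlite3 module. The table layouts are reduced to the attributes named on the slide, and all rows are made-up placeholders, not the contents of the book's database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE FILM (TITLE TEXT PRIMARY KEY, DIRECTOR TEXT);
    CREATE TABLE "SHOW" (CINEMA_ID INTEGER, SCREEN_NO INTEGER, FILM_NAME TEXT);
    -- Placeholder rows (hypothetical data)
    INSERT INTO FILM VALUES ('Film A', 'Director X'), ('Film B', 'Director Y');
    INSERT INTO "SHOW" VALUES (1, 1, 'Film A'), (1, 2, 'Film B'), (2, 1, 'Film A');
    """
)

rows = conn.execute(
    """
    SELECT DISTINCT f.TITLE, f.DIRECTOR, s.CINEMA_ID, s.SCREEN_NO  -- project
    FROM "SHOW" AS s
    JOIN FILM AS f ON s.FILM_NAME = f.TITLE                        -- join
    WHERE s.CINEMA_ID = 1                                          -- select
    ORDER BY s.SCREEN_NO
    """
).fetchall()
print(rows)
```

`DISTINCT` mirrors the coalescing of identical tuples in the project operator; the query planner is free to evaluate the join and the selection in whichever order it finds cheaper.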
25
Relational databases and spatial data
Several issues prevent unmodified databases from being useful for spatial data
  o The structure of spatial data does not naturally fit with tables
  o Performance is impaired by the need to perform multiple joins with spatial data
  o Indexes are non-spatial in a conventional relational database
An extensible RDBMS offers some solutions to these problems with
  o user-defined data types
  o user-defined operations
  o user-defined indexes and access methods
  o active database functions (e.g., triggers)
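As a small taste of a user-defined operation, Python's sqlite3 module lets an application register its own SQL function via `create_function`. The sketch below adds a hypothetical `distance` function and uses it in a query; the `place` table and its points are made up. It is a stand-in for the idea only, since a full spatial extension also supplies spatial data types and indexes.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")

# Register a user-defined SQL function with four arguments.
conn.create_function(
    "distance", 4, lambda x1, y1, x2, y2: math.hypot(x2 - x1, y2 - y1)
)

# Hypothetical table of named points.
conn.execute("CREATE TABLE place (name TEXT, x REAL, y REAL)")
conn.executemany(
    "INSERT INTO place VALUES (?, ?, ?)",
    [("A", 0, 0), ("B", 3, 4), ("C", 10, 10)],
)

# Find places within 5 units of the origin.
near = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM place WHERE distance(x, y, 0, 0) <= 5 ORDER BY name"
    )
]
print(near)
```

Without a spatial index the database must call `distance` on every row, which is the performance issue that user-defined indexes and access methods are meant to address.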
26
End of this topic