GIS Table Analysis. 2121 Geographic Data Tabular Data Standard GISDataModel Linked spatial and attribute data ID AgeTypePavement 1124LnBitumen 222LnConcrete.

Slides:



Advertisements
Similar presentations
Three-Step Database Design
Advertisements

Organisation Of Data (1) Database Theory
Chapter 10: Designing Databases
Relational Algebra, Join and QBE Yong Choi School of Business CSUB, Bakersfield.
Introduction to Databases
Concepts of Database Management Seventh Edition
Concepts of Database Management Sixth Edition
Concepts of Database Management Seventh Edition
The Relational Database Model
Introduction to Structured Query Language (SQL)
File Systems and Databases
Normal Forms in Relational Databases 1 1 Bolstad, P. pp
The Relational Database Model:
Databases Week 7 LBSC 690 Information Technology.
Chapter 11 Data Management Layer Design
Chapter 4 Relational Databases Copyright © 2012 Pearson Education, Inc. publishing as Prentice Hall 4-1.
BUSINESS DRIVEN TECHNOLOGY
1 Chapter 2 Reviewing Tables and Queries. 2 Chapter Objectives Identify the steps required to develop an Access application Specify the characteristics.
Chapter 4 Relational Databases Copyright © 2012 Pearson Education 4-1.
Page 1 ISMT E-120 Introduction to Microsoft Access & Relational Databases The Influence of Software and Hardware Technologies on Business Productivity.
IST Databases and DBMSs Todd S. Bacastow January 2005.
Concepts of Database Management, Fifth Edition
GIS Concepts ‣ What is a table? What is a table? ‣ Queries on tables Queries on tables ‣ Joining and relating tables Joining and relating tables ‣ Summary.
DBMS By Narinder Singh Computer Sc. Deptt. Topics What is DBMS What is DBMS File System Approach: its limitations File System Approach: its limitations.
DATABASE MANAGEMENT SYSTEMS BASIC CONCEPTS 1. What is a database? A database is a collection of data which can be used: alone, or alone, or combined /
Attribute Data in GIS Data in GIS are stored as features AND tabular info Tabular information can be associated with features OR Tabular data may NOT be.
Databases C HAPTER Chapter 10: Databases2 Databases and Structured Fields  A database is a collection of information –Typically stored as computer.
ASP.NET Programming with C# and SQL Server First Edition
Chapter 1 Overview of Database Concepts Oracle 10g: SQL
Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS.
Chapter 15: Using LINQ to Access Data in C# Programs.
Concepts and Terminology Introduction to Database.
How do we represent the world in a GIS database?
Normalization (Codd, 1972) Practical Information For Real World Database Design.
In this chapter, you learn about the following: ❑ Anomalies ❑ Dependency and determinants ❑ Normalization ❑ A layman’s method of understanding normalization.
GUS: 0262 Fundamentals of GIS Lecture Presentation 3: Relational Data Model Jeremy Mennis Department of Geography and Urban Studies Temple University.
Discovering Computers Fundamentals Fifth Edition Chapter 9 Database Management.
Concepts of Database Management Seventh Edition
Chapter 12: Designing Databases
What's a Database A Database Primer Let’s discuss databases n Why they are hard n Why we need them.
DataBase Management System What is DBMS Purpose of DBMS Data Abstraction Data Definition Language Data Manipulation Language Data Models Data Keys Relationships.
Microsoft Office XP Illustrated Introductory, Enhanced Tables and Queries Using.
GIS Data Structures How do we represent the world in a GIS database?
Creating and Maintaining Geographic Databases. Outline Definitions Characteristics of DBMS Types of database Relational model SQL Spatial databases.
Query and Reasoning. Types of Queries Most GIS queries will select spatial features Query by Attribute (Select by Attribute) –Structured Query Language.
GIS Data Models GEOG 370 Christine Erlien, Instructor.
AL-MAAREFA COLLEGE FOR SCIENCE AND TECHNOLOGY INFO 232: DATABASE SYSTEMS CHAPTER 7 (Part II) INTRODUCTION TO STRUCTURED QUERY LANGUAGE (SQL) Instructor.
Programming Logic and Design Fourth Edition, Comprehensive Chapter 16 Using Relational Databases.
A337 - Reed Smith1 Structure What is a database? –Table of information Rows are referred to as records Columns are referred to as fields Record identifier.
Benjamin Post Cole Kelleher.  Availability  Data must maintain a specified level of availability to the users  Performance  Database requests must.
A table is a set of data elements (values) that is organized using a model of vertical columns (which are identified by their name) and horizontal rows.
DATA Spatial Data – where things are Non Spatial Data or Attribute Data – What things are Data in a computer database are managed and accessed through.
Flat Files Relational Databases
9-1 © Prentice Hall, 2007 Topic 9: Physical Database Design Object-Oriented Systems Analysis and Design Joey F. George, Dinesh Batra, Joseph S. Valacich,
Important Database Concepts Introduction to GIS. How is Data Stored? People use number system with base 10 –decimal number Each digit corresponds to 10.
11-1 © Prentice Hall, 2004 Chapter 11: Physical Database Design Object-Oriented Systems Analysis and Design Joey F. George, Dinesh Batra, Joseph S. Valacich,
1 10 Systems Analysis and Design in a Changing World, 2 nd Edition, Satzinger, Jackson, & Burd Chapter 10 Designing Databases.
Week 2 Lecture The Relational Database Model Samuel ConnSamuel Conn, Faculty Suggestions for using the Lecture Slides.
Databases Databases are collections of information; our study repeats a theme: Tell the computer the structure, and it can help you! © 2004, Lawrence Snyder.
Rationale Databases are an integral part of an organization. Aspiring Database Developers should be able to efficiently design and implement databases.
1 Section 1 - Introduction to SQL u SQL is an abbreviation for Structured Query Language. u It is generally pronounced “Sequel” u SQL is a unified language.
Geog. 314 Working with tables.
Databases and DBMSs Todd S. Bacastow January
Chapter 4 Relational Databases
Announcements Project 2’s due date is moved to Tuesday 8/3/04
Spreadsheets, Modelling & Databases
Relational Database Design
ESRM 250/CFR 520 Autumn 2009 Phil Hurvitz
Presentation transcript:

GIS Table Analysis

2121 Geographic Data Tabular Data Standard GISDataModel Linked spatial and attribute data ID AgeTypePavement 1124LnBitumen 222LnConcrete 3951LnDirt etc. IDLake SizeQuality etc.

For each layer there is typically a one-to-one relationship between geographic features (point, line, or polygon) and records in a table

Terminology Database – an organized collection of data Table – data organized in rows and columns Attribute – a variable or item Record – a collection of attributes Domain – the range of values an attribute may take Index/key – attribute(s) used to identify, organize, or order records in a database

integer domain real domain alpha- numeric domain (a string) R e cord (o r tuple) Attribute (or item or field) Commoncomponentsofadatabase: IDAREAAREAPerimClassCode a11z a119f

A DBMS provides Translation (many views on the data) Protection (e.g., against errors due to simultaneous updates)

Multi-tiered architecture Isolates end user (upper tier) from changes at lower tiers Allows integration of multiple divergent data sources

Common Database Models: Flat File Hierarchical Network Relational

Flat File Data in a “text” or other lightly formatted file. Little structure, cross- referencing or linking among entries. Often in a row/column format Advantages: Transparent, easily transportable Disadvantages: Little structure, few error safeguards

Relational Database Model Minimal row- column structure Items/records with specified domains (possible values) Advantages: Minimum structure, easy programming, flexible Disadvantages: Relatively slow, a few restrictions on attribute content

Relational Databases Are Most Common Flexible Relatively easy to create and maintain Computer speeds have overcome slow response in most applications Low training costs Inertia – many tools are available for RDBMS, large personnel pool

Hybrid Database Coordinate data as a connected structure Attribute data as a relational table

Eight Fundamental Operations Restrict (query) – subset by rows Project – subset by columns Product – all possible combinations Divide – inverse of product

Eight Fundamental Operations Union – combine top to bottom Intersect – row overlap Difference – row non-overlap Join (relate) – combine by a key column

Main OperationswithRelational Tables Query/Restrict Conditional selection Calculation and Assignment Sort rank based on attributes Relate/Join Temporarily combine two tables by an index

Query / Restrict Operations with Relational Tables Set Algebra Uses operations less than ( ), equal to (=), and not equal to (<>). Boolean Algebra uses the conditions OR, AND, and NOT to select features. Boolean expressions are evaluated by assigning an outcome, True or False, to each condition.

Query / Restrict Operations with Relational Tables Each record is inspected andis added to the selected set if it meets one to several conditions AND, OR and NOT may be applied alone or in combinations AND typically decreases the number of records selected ORtypically increases the number of records selected NOT Is the negation operation and is interpreted as meaning select those that do not meet the condition following the NOT.

Query / Restrict – simple, AND

Query / Restrict – OR, NOT

Operation Order is Important in Query (D OR E) AND Fmay not be the same asD OR (E AND F) NOT (A and B) may not be the same as [ NOT (A) AND NOT (B)] Typically need to clarify order with delimiters

NAME

StructuredQueryLanguage (SQL) A standard system for query syntax Uniformly interpreted set of operations, e.g., CREATE INSERT SELECT Anybody can (and it appear everybody has) create a database “engine” that fits under SQL – then we can switch vendors and upgrade whenever we want….in theory.

CREATETABLEinformix.culverts ( se_row_idserialprimarykey, shapeST_Point); CREATEINDEXculverts_ix2 ONculverts(shapeST_Geometry_Ops) USINGRTREE; INSERTINTOgeometry_columns VALUES( "hamlet", "informix", "culverts", "shape", NULL, 1, NULL, 0 ); --ThecodeforaPoint Source: SQL Example (Creating a file of points where roads and streams intersect)

Recap Query / Restrict Set Algebra Uses operations less than (<), greater than (>), equal to (=), and not equal to (<>). Boolean Algebra uses the conditions OR, AND, and NOT to select features.Boolean expressions are evaluated by assigning an outcome, True or False, to each condition. SQL - Structured Query Language

SELECT AREA > COLOR = blue OR ID = F NOT (COLOR = blue)NOT (COLOR = blue) ID < cID < c COLOR = blue AND AREA > 10 OR ID = g Practice Do not turn in

SELECT AREA > COLOR = blue OR ID = F NOT (COLOR = blue)NOT (COLOR = blue) ID < CID < C COLOR = blue AND AREA > 10 OR ID = g

OUTER JOIN

INNER JOIN

Main OperationswithRelational Tables Query / Selection Conditional selection Calculation and Assignment Sort rank based on attributes Relate/Join Temporarily combine two tables by an index

Calculation and Assignment Slope = “steep” Aspect = 45.2 Cost =[ M * U + cos (distance) ] / (F – P/R*T)

Main OperationswithRelational Tables Query/Selection Conditional selection Calculation and Assignment Sort rank based on attributes Relate/Join Temporarily combine two tables by an index

Sort – ordering by attribute values Simple sort – ascending AREA Compound sort – ascending Type, then descending AREA within Type NameNameAREAAREAclassclassTypeType Emily, Lake52,222.61Limnetic zone Emily, Lake58,662.21Limnetic zone 60,826.62Shallow lakes 64,588.52Shallow lakes 70,590.32Shallow lakes Long Lake88,259.51Limnetic zone 143,285.32Littoral zone Sleepy Eye Lake170,797.12Littoral zone Mud Lake193,318.52Shallow lakes Goldsmith Lake201,127.12Littoral zone Emily, Lake336,343.22Littoral zone 349,528.71Limnetic zone 384,160.12Littoral zone Emily, Lake420,798.41Limnetic zone Savidge Lake479,709.72Littoral zone Emily, Lake545,381.81Limnetic zone Dog Lake635,537.02Littoral zone Duck LakeDuck Lake1,126,331.91Limnetic zone Wita Lake1,354,583.22Littoral zone 1,418,133.31Limnetic zone Ballantyne Lake1,428,331.51Limnetic zone Washington, LakeWashington, Lake1,914,835.31Limnetic zone 1,937,698.61Limnetic zone 4,040,675.71Limnetic zone NameNameAREAAREAclassclassTypeType 4,040,675.71Limnetic zone 1,937,698.61Limnetic zone Washington, Lake1,914,835.31Limnetic zone Ballantyne Lake1,428,331.51Limnetic zone 1,418,133.31Limnetic zone Duck Lake1,126,331.91Limnetic zone Emily, Lake545,381.81Limnetic zone Emily, Lake420,798.41Limnetic zone 349,528.71Limnetic zone Long Lake88,259.51Limnetic zone Emily, Lake58,662.21Limnetic zone Emily, Lake52,222.61Limnetic zone Dog Lake635,537.02Littoral zone Wita Lake1,354,583.22Littoral zone Savidge Lake479,709.72Littoral zone 384,160.12Littoral zone Emily, Lake336,343.22Littoral zone Goldsmith Lake201,127.12Littoral zone Sleepy Eye Lake170,797.12Littoral zone 143,285.32Littoral zone Mud Lake193,318.52Shallow lakes 70,590.32Shallow lakes 64,588.52Shallow lakes 60,826.62Shallow lakes

Main OperationswithRelational Tables Query/Selection Conditional selection Calculation and Assignment Sort rank based on attributes Relate/Join Temporarily combine two tables by an index

Tables in GIS Attribute tables are often huge We have to maintain our tables (change values, remove, add records or items) Different people/applications are interested in different subsets of attributes (columns) We often break our tables up into pieces (many tables), and use relational joins as needed to combine them back together

A Relational Join

Relational Tables This is common because most users do not understand the concepts Normal Forms in relational tables. Relational tables have many advantages, but If improperly structured, they may suffer from: Poor performance Inconsistency Redundancy Difficult maintenance

Relational Tables This is common because most users do not understand the concepts Normal Forms in relational tables. Relational tables have many advantages, but If improperly structured, table may suffer from: Poor performance Inconsistency Redundancy Difficult maintenance

Tables in Non-normal Form repeat columns, “dependent” data, empty cells by design

Normal Forms AreGood Because: It reduces total data storage Changing values in the database is easier It “insulates” information – it is easier to retain important data Many operations are easier to code

1 st Normal Forms in Relational Tables Tables are in first normal form when there are no repeat columns Advantages: easy to code queries (can look in only one column) Disadvantages: slow searches, excess storage, cumbersome maintenance

2 nd Normal Forms in Relational Tables 2NF if: it is in 1NF and if every non-key attribute is functionally dependent on the primary key What is a key? An item or set of items that may be used to uniquely identify every row What is functional dependency? If you know an item (or items) for a row, then you automatically know a second set of items for the row – this means the second set of items is functionally dependent on the item (or items)

Keys Item(s) that uniquely identify a row STATE can be a key, but not REGION, SIZE, or POPULATION

Sometimes we need >1 column to form a key, e.g., Parcel-ID and Own-ID together may form a key Keys Item(s) that uniquely identify a row

Functional Dependency Knowing the value of an item (or items) means you know the values of other items in the row e.g., if we know the person’s name, then we know the address In our example, if we know the Parcel-ID, we know the Alderman, Township name, and other Township attributes: Parcel-ID - > AldermanParcel-ID - > Thall_add Parcel-ID - > Tship-ID Parcel-ID - > Tship_name

Moving from First Normal Form (1NF to Second Normal Form (2NF), we need to: Identify functional dependencies Place in separate tables, one key per table

3 rd Normal Forms in Relational Tables Remove transitive functional dependencies A transitive functional dependency is when A -> B (if we know A, then we know B) and B -> C (if we know B, then we know C) So A -> C (if we know A, then we know C). To be in 3NF, we must identify all transitive functional dependencies, and remove them, typically by splitting the table(s) that contain them

In our example, one transitive functional dependency: Parcel-ID -> Tship-ID, Alderman Tship-ID -> Tship_name, Thall_add

Bad Things in (unormalized) Relational Tables: Repeat (or similar) variables e.g., parcel #, owner 1, owner2, owner3, owner 4 Multiple dependencies per record e.g., owner name, house#, street, city, county, zipcode, state, country Repeat records Many blank cells

Normal Forms Summary No repeat columns (create new records such that there are multiple records per entry) Split the tables, so that all non-key attributes depend on a primary key. Split tables further, if there are transitive functional dependencies. This results in tables with a single, primary key per table.

Normal Forms AreGood Because: It reduces total data storage Changing values in the database is easier It “insulates” information – it is easier to retain important data Many operations are easier to code

In ArcGIS 10 Per capita energy use 40 million) and (car theft <1) The order of operations is assumed to be this When you don’t include parentheses Per capita energy use < 4,000 OR ( (population > 40 million) and (car theft <1) )

YOU MUST TURN IN THE ANSWERS TO THIS SHEET WITH YOUR LAB ASSIGNMENT