Lecture 3: Spatial Data Management Dr. Taysir Hassan Abdel Hamid Associate Professor, Information Systems Dept., Faculty of Computers and Information Assiut University March 6, 2016
Outline Geodatabases Spatial relationships Spatial queries
A Spatial Database A spatial database is a database that is optimized to store and query data that is related to objects in space, including points, lines and polygons.database While typical databases can understand various numeric and character types of data, additional functionality needs to be added for databases to process spatial data types. These are typically called geometry or feature. The Open Geospatial Consortium created the Simple Features specification and sets standards for adding spatial functionality to database systems Open Geospatial ConsortiumSimple Features
Geodatabases Definition: The geodatabase is a collection of geographic datasets of various types. An ArcGIS geodatabase is a collection of geographic datasets of various types held in a common file system folder, a Microsoft Access database, or a multiuser relational database (such as Oracle, Microsoft SQL Server, PostgreSQL, Informix, or IBM DB2).
Fundamental Datasets in the Geodatabase A key geodatabase concept is the dataset. It is the primary mechanism used to organize and use geographic information in ArcGIS. The geodatabase contains three primary dataset types: – Feature classes – Raster datasets – Tables
Datasets Types in GeoDatabases File Geodatabase - Stored as folders in a file system. Each dataset is held as a file that can scale up to 1 TB in size. The file geodatabase is recommended over personal geodatabases. Personal geodatabases - All datasets are stored within a Microsoft Access data file, which is limited in size to 2 GB Enterprise geodatabases – unlimited number of users, Stored in a relational database using Oracle, Microsoft SQL Server, IBM DB2, IBM Informix, or PostgreSQL.
Spatial relationships logical connections between spatial objects represented by points, lines and polygons e.g., - point-in-polygon - line-line - polygon-polygon
Spatial relationships “adjacent to” “connected to” “near to” “intersects with” “within” “overlaps” etc.
Spatial relationships Some relationships are stored in a topological data model - “adjacent to” right poly and left poly in the line attribute table - “connected to” list of lines that share the same node in the node attribute table others need to be computed
“is nearest to” point/point - which family planning clinic is closest to the village? point/line - which road is nearest to the village same with other combinations of spatial features
“is nearest to”: Thiessen polygons
“is near to”: buffer operations point buffer - affected area around a polluting facility - catchment area of a water source
buffer operations line buffer - how many people live near the polluted river? - what is the area impacted by highway noise?
buffer operations polygon buffer - area around a reservoir where development should not be permitted
“crosses”: line intersection when traveling to the dispensary, do farmers have to cross this river?
“is within”: point in polygon which of the cholera cases are within the contaminated water catchment area?
General overlay types Identity – spatial join or point-in-polygon Clip – similar to set extent when using raster data Intersection Buffer ( for all of the above, operations are on layers, not single polygons )
Spatial Overlay Analysis: A type of analysis in which data is extracted from one layer (such as block groups) to an overlay layer (such as a trade area). Overlay is one of the most common and powerful GIS functions. It investigates the spatial association of features by “vertically stacking” feature layers to investigate geographic patterns and determine locations that meet specific criteria.
Spatial Join Point in polygon operation – which points are in the Polygon? Polygon ID (id_1) is added to the point layer’s attribute table.
Clip A B ABC Two polygons, A and B, Overlap. Clip A using B as a cookie cutter. Clip operation creates a new polygon, C, which is the intersect, or overlap, of A and B. Attributes of A do not appear in C.
Creating subsets create a subset of a data set using another incompatible set “cookie-cutting” A B C A A B C inputclip coveroutput
Intersect A B ABC Two polygons, A nd B, Overlap. Find the Intersection of A using B. Intersect operation creates a new polygon, C, which is the intersection, or overlap, of A and B. Attributes of A and B do appear in C.
Buffer Buffers are polygon shapes that surround a feature by a uniform distance. Buffers can be created around points, lines, and polygons. Buffers don’t share the attributes of the feature that they surround. Use spatial Joins to add the attributes.
Point in a Polygon
An ER Example World Database
Conceptual Model –3 Entities: Country, City, River –2 Relationships: capital-of, originates-in 3 Relations –Country(Name, Cont, Pop, GDP, Life-Exp, Shape) –City(Name, Country, Pop,Capital, Shape) –River(Name, Origin, Length, Shape) Keys Primary keys are Country.Name, City.Name, River.Name Foreign keys are River.Origin, City.Country Data for 3 tables
World database data tables
Populating Tables in SQL Adding a row to an existing table “INSERT INTO” statement Specifies table name, attribute names and values Example: INSERT INTO River(Name, Origin, Length) VALUES(‘Nile River’, ‘Habasha’, 6,6500) Related statements SELECT statement with INTO clause can insert multiple rows in a table DELETE statement removes rows UPDATE statement can change values within selected rows
World database data tables
SELECT Example 1. Simplest Query has SELECT and FROM clauses Query: List all the cities and the country they belong to. SELECT Name, Country FROM CITY Result
SELECT Example 2. Query: List the names of the capital cities in the CITY table. SELECT * FROM CITY WHERE CAPITAL=‘Y ’ Result
SQL Example 3 Query: List the attributes of countries in the Country relation where the life-expectancy is less than seventy years. SELECT Co.Name,Co.Life-Exp FROM Country Co WHERE Co.Life-Exp <70 Note: use of alias ‘Co’ for Table ‘Country’ Result
SQL Example 4: Multi-table Query Query: List the capital cities and populations of countries whose GDP exceeds one trillion dollars. Note:Tables City and Country are joined by matching “City.Country = Country.Name”. SELECT Ci.Name,Co.Pop FROM City Ci,Country Co WHERE Ci.Country =Co.Name AND Co.GDP > AND Ci.Capital=‘Y ’
SQL Example 5: Aggregate Staistics Query: What is the average population of the noncapital cities listed in the City table? SELECT AVG(Ci.Pop) FROM City Ci WHERE Ci.Capital=‘N ’ Query: For each continent, find the average GDP. SELECT Co.Cont,Avg(Co.GDP)AS Continent-GDP FROM Country Co GROUP BY Co.Cont
SQL Example 6 Query: For each country in which at least two rivers originate, find the length of the smallest river. SELECT R.Origin, MIN (R.length) AS Min-length FROM River GROUP BY R.Origin HAVING COUNT (*) > 1
Spatial Operations with SQL Topological and Set comparison Operations Equal: True if the interior and boundary of two geometries are equal. Disjoint: True if the boundaries and interiors do not intersect Intersection: true if geometries are not disjoint Touch: true if the boundaries of two surfaces intersect but their interiors do not Cross: true if the interior of a surface intersects with a curve. Within: returns true if the interior of a given geometry does not intersect with the exterior of another geometry Contains: tests whether the given geometry contains another given geometry. Overlap: returns true if the interiors of two geometries have non-empty intersecrtions.
Spatial Operations with SQL Spatial Analysis Distance: return the shortest distance between to geometries Buffer: returns zone around some geometries Intersection: Returns the intersection of two geometries Union: Returns the union of two geometries Difference: returns the portion of a geometry that does not intersect with another given geometry. SymmDif: returns the portions of two geometry that do not intersect with each other.
World database data tables
Using Spatial Operation in WHERE clause Query: Find the names of all countries which are neighbors of the United States (USA) in the Country table. SELECT C1.Name AS "Neighbors of USA" FROM Country C1,Country C2 WHERE Touch (C1.Shape,C2.Shape)=1 AND C2.Name =‘USA ’ Note: Spatial operator Touch() is used in WHERE clause to join Country table with itself. This query is an example of spatial self join operation.
Using spatial operator Distance Query: List the GDP and the distance of a country’s capital city to the equator for all countries. SELECT Co.GDP, Distance (Point(0,Ci.Shape.y),Ci.Shape) AS "Distance" FROM Country Co,City Ci WHERE Co.Name = Ci.Country AND Ci.Capital =‘Y ’
Spatial Query with multiple tables Query: For all the rivers listed in the River table, find the countries through which they pass. SELECT R.Name, C.Name FROM River R, Country C WHERE Cross (R.Shape,C.Shape)=1 Note: Spatial operation “Cross” is used to join River and Country tables. This query represents a spatial join operation.
buffer operations line buffer - how many people live near the polluted river? - what is the area impacted by highway noise?
Using spatial operation in an aggregate query Query: List all countries, ordered by number of neighboring countries. SELECT Co.Name, Count(Co1.Name) FROM Country Co, Country Co1 WHERE Touch (Co.Shape,Co1.Shape) GROUP BY Co.Name ORDER BY Count (Co1.Name) Notes: This query can be used to differentiate querying capabilities of simple GIS software (e.g. Arc/View) and a spatial database. It is quite tedious to carry out this query in GIS.
Nested Spatial Query Query: List the countries with only one neighboring country. A country is a neighbor of another country if their land masses share a boundary. According to this definition, island countries, like Iceland, have no neighbors. SELECT Co.Name FROM Country Co WHERE Co.Name IN (SELECT Co.Name FROM Country Co,Country Co1 WHERE Touch (Co.Shape,Co1.Shape) GROUP BY Co.Name HAVING Count(*)=1) Note: It shows a complex nested query with aggregate operations. Such queries can be written into two expression, namely a view definition, and a query on the view. The inner query becomes a view and outer query is run on the view. This is illustrated in the next slide.
Basic Functions