Spatial Databases: Lecture 3


1 Spatial Databases: Lecture 3
DT249 Semester 2 Pat Browne

2 Outline
We will look in more detail at what happens when a spatial table is constructed in PostgreSQL/PostGIS.
We will describe how to construct a table for a subset of the historical data set [1].
We take a closer look at a range of OGC queries that can be used in PostGIS [2].
A reminder of viewing and querying using OpenJump.
We will look at map accuracy.
[1] The historical data set is publicly available from: See also
[2] For further details of the operations you should consult the OGC’s Simple Features for SQL and the PostGIS manual (local copies on the course web page).

3 PostgreSQL-PostGIS
An object-relational DBMS with PostGIS spatial extensions.
Is completely open source.
Compliant with the OGC’s Simple Features for SQL.
Has a spaghetti-like spatial data model.
Spatial indexing.
Supports OGC types and PostgreSQL’s ‘native types’: point, line, box, path, polygon, and circle geometric types.
Topology is available in PostGIS 2.
Can perform overlay functions.
Simple features are based on 2D geometry with linear interpolation between vertices.

4 PostGIS levels of representation

5 PostGIS S/W components
PostGIS provides Open Database Connectivity (ODBC). PostGIS includes extensions to the underlying PostgreSQL ODBC drivers which allow transparent access to GIS objects from PostGIS via the ODBC protocol. ODBC connectivity is part of the OGC standard. PostGIS also provides Java Database Connectivity (JDBC), which is not part of the OGC standard.
GiST (Generalized Search Tree) provides high speed spatial indexing.
PROJ.4 is an open source library that provides coordinate reprojection to convert between geographic coordinate systems.
GEOS (Geometry Engine, Open Source) is a library used by PostGIS to perform all the operations in the OpenGIS Simple Features for SQL Specification. The GEOS library is used to provide geometry tests (ST_Touches(), ST_Contains(), ST_Intersects()) and operations (ST_Buffer(), ST_Union(), ST_Intersection(), ST_Difference()) within PostGIS.

6 PostGIS and OGC standard
PostGIS implements and is compliant with the OGC’s Simple Features for SQL standard.
PostGIS supports all OGC types (Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon, GeometryCollection) and operations on those types.
PostGIS uses the OGC well-known text format on the SQL command line to represent GIS features.

7 Creating a table
The basic steps to create a new spatially enabled table are:
1. Create a table with the desired non-spatial attributes.
2. Add a spatial column with the PostGIS/OGC function AddGeometryColumn.
3. Insert the geometry with SQL INSERT and SELECT statements.

8 Creating a spatial table
We can also create a spatial table from an existing table. On the following slides we will describe how to make a table containing a subset of the historical data set from the National Monuments Service [1]. We will make a table with the historical information for Dublin. We assume that the county [2] table exists and that Dublin is a single region in the county table (Dublin actually consists of four regions). Note that a system-generated identifier (gid) is used as the primary key. It is possible to include a geometry column at table creation time, but the system would not generate integrity constraints, so we will stick with this method.
[1], [2] The data set is available in STUDENT-DISTRIB X:\PBrowne\SPATIAL-DATABASES-SOFTWARE-DATA\Data

9 Creating a spatial table, step 1
CREATE TABLE "public"."dublin_historical" (
  gid serial PRIMARY KEY,
  "rmp_prop" int8,
  "map_symbol" int8,
  "entity_id" varchar(7),
  "co_id" int8,
  "smr_val0" numeric,
  "nat_grid_e" numeric,
  "class_desc" varchar(255),
  "nat_grid_n" numeric,
  "objectid" int8,
  "townlands" varchar(255),
  "scope_n1" varchar(255),
  "smrs" varchar(255)
);

10 Creating a spatial table, step 1
PostgreSQL/PostGIS will respond: NOTICE: CREATE TABLE will create implicit sequence "dublin_historical_gid_seq" for serial column "dublin_historical.gid" NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "dublin_historical_pkey" for table "dublin_historical"

11 Creating a spatial table, step 1
Examine the table (\d):
Table "public.dublin_historical"
Column | Type | Modifiers
gid | integer | not null default nextval('dublin_historical_gid_seq'::regclass)
(rest of columns omitted)
Indexes: "dublin_historical_pkey" PRIMARY KEY, btree (gid)

12 Creating a spatial table, step 2
Here we add the geometry column; PostGIS will automatically generate integrity constraints.
SELECT AddGeometryColumn('public','dublin_historical','the_geom','29900','POINT',2);
Do not execute this command in the lab; this table already exists. We use a version of the Irish National Grid with SRID = 29900. The type of the geometry to be stored in the column must be included; in this case it is POINT. This command accesses the geometry_columns system table (details later).

13 Creating a spatial table, step 2
First system-generated constraint:
ALTER TABLE dublin_historical DROP CONSTRAINT enforce_dims_the_geom;
ALTER TABLE dublin_historical ADD CONSTRAINT enforce_dims_the_geom CHECK (ndims(the_geom) = 2);
DELETE FROM table WHERE (geometrytype(the_geom) = 'MULTILINESTRING'::text OR the_geom IS NULL);

14 Creating a spatial table, step 2
Second system-generated constraint:
ALTER TABLE dublin_historical ADD CONSTRAINT enforce_geotype_the_geom CHECK (geometrytype(the_geom) = 'POINT'::text OR the_geom IS NULL);

15 Creating a spatial table, step 2
Third system generated constraint ALTER TABLE dublin_historical ADD CONSTRAINT enforce_srid_the_geom CHECK (srid(the_geom) = 29900);

16 Creating a spatial table, step 2
The primary key constraint was created in step 1:
CONSTRAINT dublin_historical_pkey PRIMARY KEY(gid);

17 Creating a spatial table, step 3
Next we insert the data from the all-Ireland historical table into the newly created dublin_historical table. Only data contained in Dublin is inserted into the new table. Check the OGC and PostGIS documentation on the contains predicate, also known as ‘a spatial relationship function’.

18 Creating a spatial table, step 3
INSERT INTO dublin_historical (rmp_prop, map_symbol, entity_id, co_id, smr_val0, nat_grid_e, class_desc, nat_grid_n, objectid, townlands, scope_n1, smrs, the_geom)
SELECT rmp_prop, map_symbol, entity_id, co_id, smr_val0, nat_grid_e, class_desc, nat_grid_n, objectid, townlands, scope_n1, smrs, h.the_geom
FROM county c, historical h
WHERE contains(c.the_geom, h.the_geom) AND c.name = 'Dublin';

19 PostGIS system tables The next few slides describe the built-in PostGIS meta-tables that provide the spatial functionality. We only outline the main features. For further details, please see the PostGIS manual.

20 geometry_columns table
Column | Type | Modifiers
f_table_catalog | character varying(256) | not null
f_table_schema | character varying(256) | not null
f_table_name | character varying(256) | not null
f_geometry_column | character varying(256) | not null
coord_dimension | integer | not null
srid | integer | not null
type | character varying(30) | not null
Indexes: "geometry_columns_pk" PRIMARY KEY, btree (f_table_catalog, f_table_schema, f_table_name, f_geometry_column)
This table allows PostgreSQL/PostGIS to keep track of actual user spatial tables. The columns are as follows:
F_TABLE_CATALOG, F_TABLE_SCHEMA, F_TABLE_NAME: The fully qualified name of the feature table containing the geometry column. Note that the terms "catalog" and "schema" are Oracle-ish. There is no PostgreSQL analogue of "catalog", so that column is left blank; for "schema" the PostgreSQL schema name is used (public is the default).
F_GEOMETRY_COLUMN: The name of the geometry column in the feature table.
COORD_DIMENSION: The spatial dimension (2, 3 or 4 dimensional) of the column.
SRID: The ID of the spatial reference system used for the coordinate geometry in this table. It is a foreign key reference to SPATIAL_REF_SYS.
TYPE: The type of the spatial object. To restrict the spatial column to a single type, use one of: POINT, LINESTRING, POLYGON, MULTIPOINT, MULTILINESTRING, MULTIPOLYGON, GEOMETRYCOLLECTION or the corresponding XYM versions POINTM, LINESTRINGM, POLYGONM, MULTIPOINTM, MULTILINESTRINGM, MULTIPOLYGONM, GEOMETRYCOLLECTIONM. For heterogeneous (mixed-type) collections, you can use "GEOMETRY" as the type.

21 spatial_ref_sys table
Displaying a spherical earth on a flat surface requires a projection. This table uses a standard numbering, called the EPSG [1], to describe various projections. Using PostgreSQL’s expanded display we can examine the details for a particular projection representing the Irish National Grid:
\x
select * from spatial_ref_sys where srid=29900;
[1] EPSG stands for the now-defunct European Petroleum Survey Group.
postgis=# select * from spatial_ref_sys where srid = 29900;
-[ RECORD 1 ]----
srid | 29900
auth_name | EPSG
auth_srid | 29900
srtext | PROJCS["TM65 / Irish National Grid (deprecated)",GEOGCS["TM65",DATUM["TM65",SPHEROID["Airy Modified 1849", , ,AUTHORITY["EPSG","7002"]],AUTHORITY["EPSG","6299"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree", ,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4299"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",53.5],PARAMETER["central_meridian",-8],PARAMETER["scale_factor", ],PARAMETER["false_easting",200000],PARAMETER["false_northing",250000],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AUTHORITY["EPSG","29900"]]
proj4text | +proj=tmerc +lat_0=53.5 +lon_0=-8 +k= x_0= y_0=250000 +a= b= units=m +no_defs

22 spatial_ref_sys table
\d spatial_ref_sys
Column | Type | Modifiers
srid | integer | not null
auth_name | character varying(256) |
auth_srid | integer |
srtext | character varying(2048) |
proj4text | character varying(2048) |
Indexes: "spatial_ref_sys_pkey" PRIMARY KEY, btree (srid)

23 OGC – Metadata tables FOSS Relational Database and GeoDatabase Part III Marco Ciolli, Fabio Zottele :

24 Valid Geometry? select IsValid(the_geom) from lakes;
NOTICE: Self-intersection at or near point
NOTICE: Self-intersection at or near point
NOTICE: Self-intersection at or near point
isvalid
t
f
IsValid checks for self-intersection. It does not check if the projection is correct. It is useful during geometry creation, e.g. you cannot create a line with only one point.

25 Finding the centre point of a county
To find the centre point of each county: select name, asText(Centroid(the_geom)) from county;

26 Finding a bounding box To find the bounding box of each county:
select name, asText(envelope(the_geom)) from county;
Or
select name, extent(the_geom) from county group by name;
The second form must use ‘group by’, which lets extent handle counties made up of multiple polygons. extent also returns its result in a form that prints as text, so asText is not needed.
In OpenJump:
select name, asBinary(extent(the_geom)) from county group by name;
The ST_AsBinary function converts the PostGIS geometry to an OGC standard binary format.

27 Finding the dimensions of a bounding box
select name, extent(the_geom) from county where name = 'Dublin County Borough' group by name;
Returns
Dublin County Borough | BOX( ,325443…)
Using the above, we can measure the dimensions of the bounding box:
select distance(geomFromText('Point….)'), geomFromText('Point….)'));
If the points were stored as degrees, we would need to use ‘transform’ and a pair of SRIDs to project them. For spherical coordinates we can use distance_spheroid.
In OpenJump:
select name, asbinary(extent(the_geom)) from county where name = 'Dublin County Borough' or name = 'Dublin Belgard' group by name;

28 Point in Bounding Box
We can use the non-OGC operator && to find objects whose bounding box overlaps the bounding box of a given geometry.
SELECT name FROM county where ST_GeomFromText( 'POINT( )', 29900) && the_geom;
See result on next slide. When constructing a query it is important to remember that only the bounding-box-based operators such as && can take advantage of PostgreSQL’s GiST spatial index.
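The && test compares bounding boxes only, so a match is a candidate rather than a confirmed hit. The idea can be sketched in Python as follows. This is a minimal illustration, not PostGIS internals: the `Box` type, the helper names and the coordinates are all hypothetical.

```python
from typing import NamedTuple, Sequence, Tuple

Point = Tuple[float, float]


class Box(NamedTuple):
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def bbox_of(points: Sequence[Point]) -> Box:
    """Smallest axis-aligned box enclosing all the points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return Box(min(xs), min(ys), max(xs), max(ys))


def boxes_overlap(a: Box, b: Box) -> bool:
    """True when the two boxes share at least one point (touching edges count)."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax
            and a.ymin <= b.ymax and b.ymin <= a.ymax)


# Hypothetical L-shaped region; its bounding box is (0, 0)-(4, 4).
region = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
region_box = bbox_of(region)

# A point has a degenerate box: both corners are the point itself.
in_notch = bbox_of([(3, 3)])   # inside the box, but outside the L shape
far_away = bbox_of([(9, 9)])

print(boxes_overlap(in_notch, region_box))  # True: a candidate only
print(boxes_overlap(far_away, region_box))  # False: rejected cheaply
```

The point (3, 3) passes the box test even though it lies outside the region itself, which is why a bounding-box match usually needs a follow-up test on the exact geometry.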

29 Result Point in Bounding Box
SELECT name, AsBinary(the_geom) FROM county WHERE GeomFromText( 'POINT( )', 29900) && the_geom; The AsBinary function is used to convert the PostGIS geometry to an OGC standard binary format.

30 Point in Bounding Box

31 Overlapping Bounding Boxes
SELECT asbinary(a.the_geom), a.name FROM county a, county b WHERE (b.the_geom && a.the_geom ) and b.name like 'Meath';

32 Find things near a point
select townlands, class_desc, scope_n1, asbinary(the_geom) from dublin_historical WHERE distance(the_geom, GeomFromText('POINT( )', 29900)) < 100;
When we submit such a query we should have some idea of the type of data in the result set. To find out the expected return type, consult the PostGIS manual or the OGC standard for SQL. You can issue this command from the shell as follows:
select townlands, class_desc, scope_n1 from dublin_historical WHERE distance(the_geom, GeomFromText('POINT( )', 29900)) < 100;

33 Find things near a point
SELECT c.name, h.townlands FROM county AS c, dublin_historical AS h WHERE distance(h.the_geom, GeomFromText('POINT( )', 29900)) < 1000 and c.name = 'Dublin County Borough';
Do we know the type of data in the result set here?

34 Find things near a point
SELECT c.name, h.townlands FROM county AS c, dublin_historical AS h WHERE st_dwithin(h.the_geom, PointFromText('POINT( )', 29900), 1000) and c.name = 'Dublin County Borough';
Similar to the previous query but using st_dwithin.
ST_DWithin(geometry, geometry, float) is a predicate that returns true if the geometries are within the specified distance of one another.
ST_Within(geometry A, geometry B) returns 1 (TRUE) if Geometry A is "spatially within" Geometry B. A has to be completely inside B.

35 Find things near a point
SELECT gid,townlands, distance(the_geom, GeomFromText('POINT( )',29900)) FROM dublin_historical WHERE st_dwithin(the_geom, GeomFromText('POINT( )',29900),500) ORDER BY distance(the_geom, GeomFromText('POINT( )',29900)) LIMIT 100;
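The pattern in this query (keep only rows within a radius, sort by distance, cap the row count) can be sketched in plain Python for point data. This is a rough illustration that assumes projected coordinates in metres, so straight-line distance is meaningful; the site records and the function name are hypothetical.

```python
import math
from typing import List, Tuple

# (gid, townland, easting, northing) in metres; made-up sample rows.
Site = Tuple[int, str, float, float]


def sites_near(sites: List[Site], x: float, y: float,
               radius: float, limit: int) -> List[Tuple[int, str, float]]:
    """Return up to `limit` sites within `radius` of (x, y), nearest first."""
    hits = []
    for gid, townland, sx, sy in sites:
        d = math.hypot(sx - x, sy - y)   # planar distance, like distance()
        if d <= radius:                  # the radius filter, like st_dwithin
            hits.append((gid, townland, d))
    hits.sort(key=lambda h: h[2])        # ORDER BY distance
    return hits[:limit]                  # LIMIT


sites = [
    (1, "Townland A", 120.0, 160.0),   # about 200 m from the origin
    (2, "Townland B", 30.0, 40.0),     # about 50 m
    (3, "Townland C", 600.0, 800.0),   # about 1000 m, outside the radius
]

nearest = sites_near(sites, 0.0, 0.0, radius=500.0, limit=100)
for gid, townland, d in nearest:
    print(gid, townland, round(d, 1))
```

In the SQL version the radius filter matters for speed as well as for the result: it lets the database discard most rows before the distances are sorted.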

36 Finding the largest county in Ireland
SELECT name, area(the_geom)/10000 AS hectares FROM county ORDER BY hectares DESC LIMIT 1;
Leaving out Northern Ireland:
SELECT name, area(the_geom)/10000 AS hectares FROM county WHERE name != 'Northern Ireland' ORDER BY hectares DESC LIMIT 1;

37 Stored and calculated areas
Area in square KMs SELECT name, area(the_geom)/ AS Calculated, area_km2 AreaStored FROM county ;
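For a simple polygon in a projected coordinate system measured in metres, a calculated area comes out in square metres and is then scaled to hectares or square kilometres. The sketch below uses the shoelace formula on a made-up square; it is an illustration of the unit handling, not the routine PostGIS itself uses.

```python
from typing import Sequence, Tuple

M2_PER_HECTARE = 10_000      # 100 m x 100 m
M2_PER_KM2 = 1_000_000       # 1000 m x 1000 m


def ring_area_m2(ring: Sequence[Tuple[float, float]]) -> float:
    """Area of a simple polygon ring via the shoelace formula.

    The ring is a list of (x, y) vertices in metres; the closing
    vertex does not need to be repeated.
    """
    twice_area = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical 1 km x 1 km square.
square = [(0, 0), (1000, 0), (1000, 1000), (0, 1000)]
area = ring_area_m2(square)

print(area / M2_PER_HECTARE)  # 100.0 hectares
print(area / M2_PER_KM2)      # 1.0 square kilometre
```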

38 What is the length of roads fully contained within Dublin County Borough?
SELECT c.name, sum(length(r.the_geom))/1000 as roads_km FROM roads AS r, county AS c WHERE c.the_geom && r.the_geom AND contains(c.the_geom, r.the_geom) AND c.name = 'Dublin County Borough' GROUP BY c.name ORDER BY roads_km;
The OpenJump version:
SELECT c.name, r.class, asbinary(r.the_geom) FROM roads AS r, county AS c WHERE c.the_geom && r.the_geom AND contains(c.the_geom, r.the_geom) AND c.name = 'Dublin County Borough';

39 What historical objects are near Dublin roads?
SELECT townlands FROM dublin_historical h, roads r WHERE distance(h.the_geom,r.the_geom) < 200; SELECT townlands, asbinary(h.the_geom) FROM dublin_historical h, roads r WHERE distance(h.the_geom,r.the_geom) < 200; SELECT count(townlands) FROM dublin_historical h, roads r, WHERE

40 Overlays
We should distinguish the overlay operation from tests for overlap. The OGC SFSQL uses two similar keywords. Table-on-table overlays are possible with the ST_Intersection() function.
ST_Intersects(a,b) returns BOOLEAN (TRUE | FALSE)
ST_Intersection(a,b) returns GEOMETRY

41 Efficiency of Search [1]
It is expensive to process the exact geometry of an object. Therefore approximations such as bounding boxes (BB) or convex hulls (CH) are used to examine candidate objects and decide whether a candidate can fulfil the query or not. They are used as a ‘geometric filter’.
[1] Efficient Spatial Query Processing in Geographic Database Systems, Hans-Peter Kriegel, Thomas Brinkhoff, Ralf Schneider
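The two-step idea (a cheap approximate filter followed by an exact test on the survivors) can be sketched as follows. This is a simplified illustration for point queries against polygons, assuming simple rings without holes; the region data and function names are hypothetical, and a real system would precompute and index the boxes rather than rebuild them per query.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]
Ring = Sequence[Point]


def bbox(ring: Ring) -> Tuple[float, float, float, float]:
    """(xmin, ymin, xmax, ymax) of the ring."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def in_bbox(pt: Point, box: Tuple[float, float, float, float]) -> bool:
    """Cheap filter step: four comparisons."""
    x, y = pt
    xmin, ymin, xmax, ymax = box
    return xmin <= x <= xmax and ymin <= y <= ymax


def point_in_polygon(pt: Point, ring: Ring) -> bool:
    """Exact refine step: ray casting, cost grows with the vertex count."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):                       # edge spans the ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def filter_and_refine(pt: Point, regions: Dict[str, Ring]) -> Tuple[List[str], List[str]]:
    """Return (candidates after the box filter, hits after the exact test)."""
    candidates = [name for name, ring in regions.items()
                  if in_bbox(pt, bbox(ring))]
    hits = [name for name in candidates
            if point_in_polygon(pt, regions[name])]
    return candidates, hits


regions = {
    "L-shape": [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)],
    "square": [(10, 10), (12, 10), (12, 12), (10, 12)],
}

print(filter_and_refine((3, 3), regions))  # (['L-shape'], []): false candidate
print(filter_and_refine((1, 1), regions))  # (['L-shape'], ['L-shape'])
```

The square is discarded by the filter without its vertices ever being examined, while the point (3, 3) shows why the filter alone is not enough.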

42 Efficiency of Search1 1

43 Efficiency of Search1 1 This is a graphical representation of what the operations Buffer, Expand, Extent look like when applied to geometries.

44 Network Queries
In order to execute network queries, we need to augment the spatial information used in the OGC Simple Features for SQL standard. We will use pgRouting [1]. We added this feature to PostgreSQL in lab 1.

45 Shortest Path from Dublin to Waterford
Network Queries1 Shortest Path from Dublin to Waterford

46 What two houses within 500 meters of the Chester Beatty Library have the most residents?
SELECT b1.residents FROM buildings_geodir b1, buildings_geodir b2 WHERE b2.name = 'CHESTER BEATTY LIBRARY' and ST_DWithin(b2.the_geom,b1.the_geom, 500) ORDER BY residents DESC LIMIT 2; Note residents column will need to be added to the buildings_geodir table

47 Counties that have exactly 1 neighbour
SELECT c1.name FROM county c1, county c2 WHERE touches(c1.the_geom, c2.the_geom) = 'TRUE' GROUP BY c1.name HAVING count(c2.name) = 1;
Where aggregate functions are used (sum, max, min and count), the HAVING clause plays a role similar to WHERE, but it is applied to the groups rather than to the individual rows.
name
Galway County Borough
Donegal
Cork County Borough
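The GROUP BY / HAVING step can be tried without PostGIS by replacing the touches() join with a plain adjacency table. The sketch below uses Python's built-in sqlite3 module; the table `neighbours` and the region names A to D are hypothetical stand-ins for the result of the spatial join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE neighbours (name TEXT, touches_name TEXT)")

# One row per ordered pair of touching regions (made-up data).
# A touches B; B touches A, C and D; C and D touch B and each other.
pairs = [
    ("A", "B"),
    ("B", "A"), ("B", "C"), ("B", "D"),
    ("C", "B"), ("C", "D"),
    ("D", "B"), ("D", "C"),
]
conn.executemany("INSERT INTO neighbours VALUES (?, ?)", pairs)

# WHERE would filter rows before grouping; HAVING filters the groups
# after count() has been computed for each name.
rows = conn.execute(
    """
    SELECT name
    FROM neighbours
    GROUP BY name
    HAVING count(touches_name) = 1
    ORDER BY name
    """
).fetchall()

print(rows)  # [('A',)]
conn.close()
```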

48 Proportions
SELECT saps_label, primary_degree10_4, (male1_1 + female1_1) as Pop, ((cast (primary_degree10_4 as float)) / (cast ((male1_1 + female1_1) as float)) * 100) as Grad_percent FROM dublin_eds WHERE primary_degree10_4 IS NOT NULL ORDER BY Grad_percent DESC LIMIT 1;
saps_label | primary_degree10_4 | pop | grad_percent
130 Pembroke West A | | 4262 |
(1 row)
To calculate a percentage, the basic formula is (A/B)*100 = C%, where A is a subset, B is the total, and C is the resulting percentage.
SELECT saps_label, primary_de, (male1_1 + female1_1) as Pop, ((cast (primary_de as float)) / (cast ((male1_1 + female1_1) as float)) * 100) as Grad_percent FROM dublin_eds WHERE primary_de IS NOT NULL ORDER BY Grad_percent DESC LIMIT 1;
saps_label | primary_de | pop | grad_percent
117 Mansion House A | | 3802 |
Note: table and column names may differ. Check with \d tableName
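The casts to float in these queries are presumably there because dividing one integer by another in PostgreSQL discards the fractional part, which would turn every proportion below 1 into 0. A small Python sketch of the same formula, with made-up figures and a hypothetical helper name:

```python
def percent(part: float, total: float) -> float:
    """(A / B) * 100, using floating-point division."""
    if total == 0:
        raise ValueError("total must be non-zero")
    return part / total * 100.0


# Hypothetical figures: 250 graduates in a population of 1000.
graduates, population = 250, 1000

print(percent(graduates, population))   # 25.0
# Integer division, the behaviour the SQL casts avoid:
print(graduates // population * 100)    # 0
```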

49 Accuracy
You should calculate the area of county Meath and compare the result to the area stored in the county table. You should get two different figures for the area of Meath. Which area is correct? To answer this question we would need to know the accuracy of the area stored in the database and the accuracy of the map that we used to calculate the area.

50 Accuracy Obviously no spatial database or GIS can increase the positional accuracy of a spatial dataset. The accuracy does not change as the viewing scale changes. So we need to know the scale of the original survey and the expected accuracy at that scale.

51 Accuracy The accuracy of a map is dependent on the differences between the true position of features and their representative position in the map. To find the true position requires highly accurate devices such as industrial grade Global Positioning Systems (GPS) and sophisticated mathematical software.

52 Accuracy
A common way of defining positional accuracy for maps is to place limits on the root mean square error (RMSE) for individual position components (X, Y and possibly Z, i.e. height). The RMSE is the square root of the average of the squared discrepancies when compared to a higher-level independent survey. The RMSE is normally defined in terms of ground scale errors (e.g. +- one metre).
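The RMSE definition above can be worked through for a single coordinate component. The sketch below is a minimal illustration; the easting values are invented, with the second list standing in for the higher-level independent survey.

```python
import math
from typing import Sequence


def rmse(measured: Sequence[float], reference: Sequence[float]) -> float:
    """Square root of the mean of the squared discrepancies."""
    if len(measured) != len(reference) or len(measured) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(m - r) ** 2 for m, r in zip(measured, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical eastings in metres for four check points.
map_eastings = [1001.0, 1999.0, 3002.0, 3998.0]      # read from the map
survey_eastings = [1000.0, 2000.0, 3000.0, 4000.0]   # independent survey

# Discrepancies are +1, -1, +2, -2, so the mean squared error is 10 / 4 = 2.5.
error = rmse(map_eastings, survey_eastings)
print(round(error, 3))  # 1.581
```

The same calculation would be repeated for the Y component, and for Z where heights are recorded.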

53 A Rough Guide to Accuracy
Scale | Expected RMSE
1:1000 | RMSE < 0.5 metres
1:2500 | 0.5 < RMSE < 2
1:10,000 | 4 < RMSE < 5
1:250,000 | 100 < RMSE < 120

54 A Rough Guide to Accuracy

55 A Rough Guide to Accuracy 1:2,500

56 A Rough Guide to Accuracy 1:50,000

57 Accuracy
RMSE can be viewed as one criterion for positional accuracy. It is measured with respect to some more precise truth (often GPS). RMSE is a measure of absolute accuracy, with respect to a more precise framework. Together with an acceptable RMSE, a map must be consistent within itself; that is, the various components must fit together. This is a measure of relative accuracy.

58 Accuracy If data has a nominal scale of say 1:250,000, it still may be more precise than the rough guidelines on slide 51 would indicate. For example it could be derived from more accurate maps e.g. 1:50,000. Without meta-data it is difficult for humans or computers to know a map’s accuracy.

59 Accuracy The following table shows the expected absolute and relative accuracy values for well defined points within each accuracy category. The relative values apply up to the stated maximum measured distances quoted in the table.

60 From The British Ordnance Survey (OSGB)
Accuracy table from the British Ordnance Survey (OSGB). Original link seems to be broken.

61 SQL
create table staff (
  employee text,
  dept text,
  salary int4,
  PRIMARY KEY (employee, dept)
);

62 SQL SELECT dept,employee,salary FROM staff AS p1 WHERE salary > (SELECT avg(salary) FROM staff as p2 WHERE p2.dept = p1.dept) ORDER by dept;

63 SQL SELECT employee, dept, salary FROM staff AS p1 WHERE salary > ANY (SELECT salary FROM staff as p2 WHERE p2.dept = p1.dept) ORDER BY dept;
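The two correlated subqueries can be tried on a small staff table with Python's built-in sqlite3 module. The rows below are invented. SQLite does not support the ANY keyword, so the second query is rewritten here with min(): a salary is greater than some salary in the department exactly when it is greater than the lowest one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE staff (
        employee TEXT,
        dept TEXT,
        salary INTEGER,
        PRIMARY KEY (employee, dept)
    )
    """
)
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [
        ("ann", "sales", 100), ("bob", "sales", 200), ("cat", "sales", 300),
        ("dan", "it", 500), ("eve", "it", 700),
    ],
)

# Staff paid more than the average of their own department.
# The inner query is re-evaluated for each outer row (p1.dept).
above_average = conn.execute(
    """
    SELECT dept, employee, salary FROM staff AS p1
    WHERE salary > (SELECT avg(salary) FROM staff AS p2
                    WHERE p2.dept = p1.dept)
    ORDER BY dept
    """
).fetchall()

# Staff paid more than at least one colleague in their department
# (the "> ANY" query, expressed with min() for SQLite).
above_lowest = conn.execute(
    """
    SELECT employee, dept, salary FROM staff AS p1
    WHERE salary > (SELECT min(salary) FROM staff AS p2
                    WHERE p2.dept = p1.dept)
    ORDER BY dept, employee
    """
).fetchall()

print(above_average)  # [('it', 'eve', 700), ('sales', 'cat', 300)]
print(above_lowest)   # [('eve', 'it', 700), ('bob', 'sales', 200), ('cat', 'sales', 300)]
conn.close()
```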

