Spatial Databases: Building Spatial DB Spring, 2015 Ki-Joune Li.

STEMPNU
Importance of Database
- Application of Spatial Databases (e.g. GIS)
- Garbage-In, Garbage-Out
- About 70% of GIS development cost is DB cost

Comparison with Software Lifecycle

Software Life Cycle (Waterfall Model)    DB Life Cycle
Requirement Analysis                     Requirement Analysis
Functional Specification                 Modeling
Design                                   Schema Design
Development Environments                 DB Environments
Coding                                   Data Collection and Input
Test                                     Quality Control
Maintenance                              Maintenance

Requirement Analysis
- Analysis of status
  - As it is (current state)
  - As it shall be
- Output of analysis
  - Use-Case Diagram of UML: workflow analysis
  - Data items that have been maintained and are to be maintained
  - Description of each item: Data Dictionary
  - Relationships and constraints on items
  - Required accuracy: spatial precision, temporal precision

Data Dictionary
- Definitions and representation of data items, such as
  - Precise definition of data elements
  - Integrity constraints
  - Stored procedures and trigger rules
  - Specification of the producer and consumer of each data element
- Why is it so important?
  - Common understanding of data items
  - Consistency of databases
  - Important input to data modeling
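As a minimal sketch of what one data dictionary entry might look like in practice, the record below groups the items listed on the slide. The class `DataItem`, its field names, and the sample `Pole` entry are hypothetical and chosen only for illustration; a real project would follow its own template.

```python
from dataclasses import dataclass, field

@dataclass
class DataItem:
    """One data dictionary entry (illustrative field names, not a standard)."""
    name: str
    definition: str
    geometry_type: str
    constraints: list[str] = field(default_factory=list)
    producer: str = ""
    consumers: list[str] = field(default_factory=list)

def missing_fields(item: DataItem) -> list[str]:
    """Return the names of fields that are still empty for this entry."""
    return [key for key, value in vars(item).items() if not value]

# Hypothetical sample entry; the consumer of the data has not been recorded yet.
pole = DataItem(
    name="Pole",
    definition="Utility pole carrying power lines",
    geometry_type="Point",
    constraints=["More than 50 meters between two poles"],
    producer="Survey team",
)
print(missing_fields(pole))  # ['consumers']
```

A completeness check of this kind is one simple way to keep producers and consumers of a data item from being left unspecified.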

Data Modeling
- Understanding the real world and the application
  - A very small piece of the real world
  - According to viewpoint
  - Determined by applications
- Drawing what you have understood in a formal method
  - Class Diagram in UML
- 4 steps
  - Definition of Entities
  - Attributes of each Entity
  - Relationships
  - Constraints

Class Diagram: Basic
[Figure: example UML class diagram. Classes: DVD Movie, VHS Movie, Video Game, Rental Item {abstract}, Rental Invoice, Customer, Checkout Screen. Multiplicities shown: 1..*, 1, 0..1, 1. Notation labels: Simple Association, Class, Abstract Class, Simple Aggregation, Generalization, Composition (Dependency), Multiplicity.]
Class box layout:
  MyClassName
  +SomePublicAttribute : SomeType
  -SomePrivateAttribute : SomeType
  #SomeProtectedAttribute : SomeType
  +ClassMethodOne()
  +ClassMethodTwo()
  Responsibilities (can optionally be described here)

Definition of Entities
- Extract nouns from
  - Problem statement
  - Use-Case Diagram
- Delete unnecessary entities
  - Duplication
  - Attributes rather than entities (e.g. loan amount)
- Definition of Features
  - Geographic Entity
  - Granularity

Definition of Features
- Feature
  - Meaningful object of GIS in the real world
  - Must have a geometry: Point, Line, Polygon, etc.
- How to define the granularity of features
  - Example: how to define "a" coastal line?
  - Is the highway from Pusan to Seoul one long feature? How should this long road be separated?

Definition of Attributes
- Attributes of a Feature
  - Geometric type: Spatial Attribute
  - Non-Spatial Attributes
- Geometric Type: different Levels of Detail (LOD)
  - Building: Polygon at 1/1,000 scale, Point at 1/1,000,000 scale
  - Road: Polygon at 1/1,000 scale, Polyline at 1/1,000,000 scale
Class box: MyClassName with +SomePublicAttribute : SomeType, -SomePrivateAttribute : SomeType, #SomeProtectedAttribute : SomeType, +GeometricAttribute
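The level-of-detail idea can be sketched as a lookup from feature class and map scale to a geometry type. The table below holds only the two scales named on the slide; choosing the nearest tabulated scale on a logarithmic axis for in-between scales is an assumption made for this example, as are the names `LOD_TABLE` and `geometry_type`.

```python
import math

# Geometry type per feature class at the two scale denominators from the slide.
LOD_TABLE = {
    "Building": {1_000: "Polygon", 1_000_000: "Point"},
    "Road": {1_000: "Polygon", 1_000_000: "Polyline"},
}

def geometry_type(feature_class: str, scale_denominator: int) -> str:
    """Return the geometry type of the nearest tabulated scale.

    'Nearest' is measured on a log10 axis, which is an assumption of this sketch.
    """
    levels = LOD_TABLE[feature_class]
    target = math.log10(scale_denominator)
    nearest = min(levels, key=lambda s: abs(math.log10(s) - target))
    return levels[nearest]

print(geometry_type("Building", 1_000))      # Polygon
print(geometry_type("Building", 250_000))    # Point
print(geometry_type("Road", 1_000_000))      # Polyline
```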

Relationship
- Non-Spatial Relationship
- Spatial Relationship: Topology

Constraints
- Example
  - No building on road surface
  - More than 50 meters between two poles
- Implementation
  - Internal functions for checking constraints (or constructor)
  - Spatial OCL (Object Constraint Language)
- More detailed and complete constraints lead to better DB quality
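The "internal function" approach can be illustrated with the pole-spacing example. This is a minimal sketch assuming planar coordinates in meters; the class `PoleLayer` and its method `add_pole` are hypothetical names, and a linear scan over existing poles stands in for whatever lookup a real system would use.

```python
import math

MIN_POLE_DISTANCE_M = 50.0  # from the slide: more than 50 meters between two poles

class PoleLayer:
    """Holds pole positions and enforces the spacing constraint on insertion."""

    def __init__(self):
        self.poles = []  # list of (x, y) positions in meters

    def add_pole(self, x: float, y: float) -> None:
        # Reject the new pole if any existing pole is 50 m away or closer.
        for px, py in self.poles:
            if math.hypot(x - px, y - py) <= MIN_POLE_DISTANCE_M:
                raise ValueError(f"pole at ({x}, {y}) is too close to ({px}, {py})")
        self.poles.append((x, y))

layer = PoleLayer()
layer.add_pole(0.0, 0.0)
layer.add_pole(60.0, 0.0)
try:
    layer.add_pole(30.0, 0.0)  # 30 m from both existing poles
except ValueError as err:
    print("rejected:", err)
```

Checking at insertion time keeps invalid data out of the database, at the price of extra work per insert.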

Quality Control for Data Modeling
- For quality control: a simulation with a pre-defined test scenario

Schema Design
- Automatic conversion from data model to schema
- Check points: performance issues
  - Materialization
  - Index
  - Geographic distribution of DB: Clustering
- Based on workload analysis
  - Distribution of operations
  - Distribution of values

Materialization
- In SQL, a view is a virtual table derived from a SELECT statement
- Example
    CREATE VIEW ExcellentStudents AS
      SELECT Name, Department, Score
      FROM Students
      WHERE Score > 4.0

    SELECT Name FROM ExcellentStudents WHERE Department = 'CS'
- The second query invokes the ExcellentStudents view. Materialization stores the view's result as a table instead of re-evaluating it on each query.
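The difference between a virtual view and a materialized one can be tried out with Python's built-in `sqlite3` module. SQLite has no materialized-view statement, so this sketch imitates one with `CREATE TABLE ... AS SELECT`; the table name `ExcellentStudentsMat` and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (Name TEXT, Department TEXT, Score REAL);
INSERT INTO Students VALUES ('Kim', 'CS', 4.3), ('Lee', 'CS', 3.5), ('Park', 'EE', 4.2);

-- Virtual view: the SELECT is re-evaluated every time the view is queried.
CREATE VIEW ExcellentStudents AS
    SELECT Name, Department, Score FROM Students WHERE Score > 4.0;

-- Hand-made materialization: the view's current result is stored as a table.
CREATE TABLE ExcellentStudentsMat AS SELECT * FROM ExcellentStudents;
""")

# An update to the base table after materialization.
conn.execute("INSERT INTO Students VALUES ('Choi', 'CS', 4.5)")

query = "SELECT Name FROM {} WHERE Department = 'CS' ORDER BY Name"
view_rows = conn.execute(query.format("ExcellentStudents")).fetchall()
mat_rows = conn.execute(query.format("ExcellentStudentsMat")).fetchall()
print(view_rows)  # [('Choi',), ('Kim',)]
print(mat_rows)   # [('Kim',)]
```

The stored copy misses the new row until the update is propagated to it, which is the consistency cost that comes with materialization.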

Materialize or Not?
- Materialization
  - Duplication
    - Not in 3NF (BCNF)
    - May cause inconsistency between the original and derived tables
    - Update: overhead due to update propagation
  - Extra space requirements
- Should be determined depending on the WORKLOAD
  - Frequency of updates
  - Cost of update propagation, especially when the materialized view is geographically distributed
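The workload-dependent decision can be made concrete with a deliberately simple cost comparison. The linear cost model, the function name `should_materialize`, and all parameter names and values are assumptions for illustration; the slides do not prescribe a formula, and the model ignores the extra space requirement.

```python
def should_materialize(queries: int, updates: int,
                       recompute_cost: float, read_cost: float,
                       propagation_cost: float) -> bool:
    """Compare total workload cost with and without materialization.

    Virtual view: every query re-evaluates the defining SELECT.
    Materialized view: queries read the stored result, but every update
    must be propagated to the stored copy.
    """
    virtual_total = queries * recompute_cost
    materialized_total = queries * read_cost + updates * propagation_cost
    return materialized_total < virtual_total

# Read-heavy workload: materialization pays off.
print(should_materialize(queries=1000, updates=10,
                         recompute_cost=5.0, read_cost=1.0, propagation_cost=20.0))  # True
# Update-heavy workload: propagation dominates.
print(should_materialize(queries=10, updates=1000,
                         recompute_cost=5.0, read_cost=1.0, propagation_cost=20.0))  # False
```

For a geographically distributed materialized view, `propagation_cost` would be the parameter that grows.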

Spatial Index
- Index: accelerates search
- Spatial Index
  - Spatial predicates: contain, overlap, k-NN
  - Greatly improves query processing performance
  - Has a performance overhead for insertion/deletion
[Figure: two-phase search. 1st phase: the search condition is applied to the index, which returns a set of block numbers { Block# }. 2nd phase: those block numbers are searched in the database on disk.]
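The two-phase search can be sketched with a uniform grid, one of the simplest spatial index structures. This is an illustrative stand-in, not the index used in the lecture: the class `GridIndex`, the cell size, and the sample points are hypothetical, and record ids in an in-memory dictionary play the role of block numbers on disk.

```python
from collections import defaultdict

CELL = 10.0  # grid cell size (hypothetical value)

class GridIndex:
    def __init__(self):
        self.cells = defaultdict(list)  # (col, row) -> record ids in that cell
        self.records = {}               # record id -> (x, y); stands in for disk blocks

    def insert(self, rid, x, y):
        # Insertion overhead: the index must be updated along with the data.
        self.records[rid] = (x, y)
        self.cells[(int(x // CELL), int(y // CELL))].append(rid)

    def range_query(self, xmin, ymin, xmax, ymax):
        # 1st phase: use the index to collect candidates from overlapping cells.
        candidates = []
        for cx in range(int(xmin // CELL), int(xmax // CELL) + 1):
            for cy in range(int(ymin // CELL), int(ymax // CELL) + 1):
                candidates.extend(self.cells.get((cx, cy), []))
        # 2nd phase: fetch each candidate and test the exact predicate.
        return sorted(
            rid for rid in candidates
            if xmin <= self.records[rid][0] <= xmax
            and ymin <= self.records[rid][1] <= ymax
        )

index = GridIndex()
for rid, (x, y) in enumerate([(2, 3), (12, 4), (18, 19), (55, 60)], start=1):
    index.insert(rid, x, y)

# Record 2 is a candidate in the 1st phase but fails the exact test in the 2nd.
print(index.range_query(0, 0, 11, 9))  # [1]
```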

Clustering: Placement of records
- Vertical Fragmentation vs. Horizontal Fragmentation
  - Vertical Fragmentation: decomposition of table
  - Horizontal Fragmentation: placement of objects
  - Consideration of workload
[Figure: Vertical Fragmentation, Horizontal Fragmentation]
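A small sketch may help separate the two kinds of fragmentation: vertical fragmentation splits a table by columns (each fragment keeps the key), while horizontal fragmentation splits it by rows. The `roads` table, its column names, and both helper functions are hypothetical.

```python
# Hypothetical road table; each dict is one record.
roads = [
    {"id": 1, "name": "R1", "region": "Pusan", "geometry": "LINESTRING(0 0, 1 1)"},
    {"id": 2, "name": "R2", "region": "Seoul", "geometry": "LINESTRING(5 5, 6 6)"},
    {"id": 3, "name": "R3", "region": "Pusan", "geometry": "LINESTRING(1 1, 2 0)"},
]

def vertical_fragments(rows, key, column_groups):
    """Decompose the table by columns; every fragment keeps the key column."""
    return [
        [{col: row[col] for col in (key, *cols)} for row in rows]
        for cols in column_groups
    ]

def horizontal_fragments(rows, column):
    """Place whole records into groups according to one column's value."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[column], []).append(row)
    return fragments

attribute_part, geometry_part = vertical_fragments(
    roads, "id", [("name", "region"), ("geometry",)]
)
by_region = horizontal_fragments(roads, "region")
print(attribute_part[0])            # {'id': 1, 'name': 'R1', 'region': 'Pusan'}
print(sorted(by_region))            # ['Pusan', 'Seoul']
```

Which split is preferable depends on the workload: queries that touch few columns favor the vertical split, while queries that touch nearby objects favor a horizontal split by location.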

Clustering
- Clustering: grouping objects so as to maximize Prob(a ∈ C, b ∈ C) when O_K = a and O_{K+1} = b, for any two objects a and b of the same group C, where O_K and O_{K+1} are two consecutive accesses
- Spatial Clustering
  - Basic assumption: if dist(a,b) < dist(a,c), then Prob(O_K = a, O_{K+1} = b) > Prob(O_K = a, O_{K+1} = c)
[Figure: two consecutive accesses among objects a, b, c]

Spatial Clustering Methods
- k-Means
- CLARANS, in IEEE TKDE 2002, 14(5)
- BIRCH, in Proc. VLDB 1996
- DBSCAN, in Proc. KDD 1996
- SMTIN, in Proc. ACM-GIS 1997
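Of the methods listed, k-Means is the easiest to sketch. The version below is a plain Lloyd-style iteration over 2D points, written as an illustration only: taking the first k points as initial centers is a simplifying assumption, and the function name `k_means` and the sample points are hypothetical.

```python
import math

def k_means(points, k, iterations=20):
    """Group points into k clusters by repeatedly reassigning and re-centering."""
    centers = list(points[:k])  # naive initialization (assumption of this sketch)
    groups = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins the group of its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group.
        new_centers = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return centers, groups

points = [(0, 0), (0, 1), (10, 10), (10, 11), (1, 0), (11, 10)]
centers, groups = k_means(points, k=2)
print(sorted(groups[0]))  # [(0, 0), (0, 1), (1, 0)]
print(sorted(groups[1]))  # [(10, 10), (10, 11), (11, 10)]
```

Storing the objects of each resulting group close together on disk is the link back to record placement: objects that are near each other in space are likely to be accessed consecutively.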

