SPATIO-TEMPORAL DATABASES Spatial Databases
Spatial Data. Representing Spatial Objects
Spatial Notions and Semantics
The workspace dimensionality: the space and the spatial objects are usually represented as 1D, 2D, or 3D spatial information.
Partition: let $S$ be a non-empty set. Then $P = \{P_1, P_2, \ldots, P_p\}$ is a partition of $S$ if and only if the following conditions are fulfilled: (1) $\forall i = 1..p$: $P_i \neq \emptyset$; (2) $\forall i, j = 1..p$, $i \neq j$: $P_i \cap P_j = \emptyset$; (3) $\bigcup_{i=1}^{p} P_i = S$.
In general, a partition is used to decompose the workspace [Sa90]. For example, a planar decomposition into polygonal or non-polygonal shapes can be applied to a 2D region of space. A partition of the space is called regular if its polygonal components are regular; otherwise, it is called non-regular. If a partition allows the recursive decomposition of its elements, it is considered unlimited (non-limited); otherwise it is limited.
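The three conditions of the partition definition can be checked mechanically on finite sets. The following is a minimal sketch, assuming the workspace is modeled as a finite Python set of grid cells; the function name `is_partition` and the sample data are hypothetical and only serve as illustration.

```python
from itertools import combinations


def is_partition(s, parts):
    """Return True if `parts` is a partition of the non-empty set `s`."""
    parts = [set(p) for p in parts]
    if not s or not parts:
        return False
    # Condition 1: every component is non-empty.
    if any(len(p) == 0 for p in parts):
        return False
    # Condition 2: the components are pairwise disjoint.
    if any(a & b for a, b in combinations(parts, 2)):
        return False
    # Condition 3: the union of the components is S.
    return set().union(*parts) == set(s)


# Hypothetical workspace: the four cells of a 2 x 2 grid.
cells = {(r, c) for r in range(2) for c in range(2)}
left = {(0, 0), (1, 0)}
right = {(0, 1), (1, 1)}

print(is_partition(cells, [left, right]))              # True
print(is_partition(cells, [left, right | {(0, 0)}]))   # False: overlap at (0, 0)
```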
Figure 1.1. Three types of partitions: regular non-limited, regular limited, non-regular non-limited.
Spatial objects: the most frequently encountered are point, line, and region (simple, individual objects). Besides these, there are special data types, such as segment, half-segment (used in the dual representation of a segment, where each half-segment corresponds to one end of the segment) [FG+00], circle, and region with holes [FG+00, GB+00]. In addition, some applications require modeling and managing collections of spatial objects correlated in space: partition (e.g. a map of regions), network or graph (e.g. a transportation network, rivers, an electricity network).
Centroid: the center of gravity (center of mass) of a geometric figure.
The structure of space
There are two techniques for modeling spatial data in a computer system: raster (grid) and vector. In a vector model, spatial information indicates where something is or where it occurs; in a raster model, it describes something that exists or occurs everywhere in the workspace.
Raster (grid) models represent the space as covered by a set of cells and handle information such as temperature, pressure, or altitude. A raster model maps each cell to a value from a given attribute domain. Neighboring cells that are associated with the same value form a region, and these regions form a partition of the workspace. This kind of modeling provides a continuous view of space [CZ00]. It is generally materialized as a grid in which each cell is rectangular. The parameters that define a raster model are: grid size, grid resolution, and information about geography (georeferencing). The information about geography associates a cell with a specific location in the modeled reality. A cell may have the values of one or more thematic attributes associated with its central point. The raster representation of spatial objects can capture continuous numeric values (quantitative information, e.g. temperature) or categories (qualitative information, e.g. types of climate).
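As an illustration of the raster parameters listed above, here is a minimal sketch of a grid with a size, a resolution, and a georeference. The class `Raster`, its field names, the choice of square cells, and the lower-left origin are all assumptions made for this example; neighboring cells are taken to be the four edge-adjacent ones.

```python
from dataclasses import dataclass


@dataclass
class Raster:
    rows: int          # grid size: number of rows
    cols: int          # grid size: number of columns
    cell_size: float   # grid resolution: side of a square cell, in map units
    x0: float          # georeference: x of the lower-left corner of the grid
    y0: float          # georeference: y of the lower-left corner of the grid
    values: list       # values[row][col], one thematic attribute per cell

    def cell_of(self, x, y):
        """Map a location to the (row, col) of the cell that contains it."""
        col = int((x - self.x0) // self.cell_size)
        row = int((y - self.y0) // self.cell_size)
        if not (0 <= row < self.rows and 0 <= col < self.cols):
            raise ValueError("location outside the grid")
        return row, col

    def center_of(self, row, col):
        """Return the location of the central point of a cell."""
        return (self.x0 + (col + 0.5) * self.cell_size,
                self.y0 + (row + 0.5) * self.cell_size)

    def regions(self):
        """Group edge-adjacent cells that hold the same value into regions."""
        seen, result = set(), []
        for r in range(self.rows):
            for c in range(self.cols):
                if (r, c) in seen:
                    continue
                value, stack, region = self.values[r][c], [(r, c)], set()
                seen.add((r, c))
                while stack:
                    i, j = stack.pop()
                    region.add((i, j))
                    for ni, nj in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                        if (0 <= ni < self.rows and 0 <= nj < self.cols
                                and (ni, nj) not in seen
                                and self.values[ni][nj] == value):
                            seen.add((ni, nj))
                            stack.append((ni, nj))
                result.append((value, region))
        return result


# Hypothetical thematic map with two climate categories.
climate = Raster(rows=2, cols=3, cell_size=10.0, x0=100.0, y0=200.0,
                 values=[["dry", "dry", "wet"],
                         ["dry", "wet", "wet"]])

print(climate.cell_of(125.0, 205.0))   # (0, 2)
print(climate.center_of(0, 0))         # (105.0, 205.0)
print(len(climate.regions()))          # 2
```

The regions returned by `regions()` are pairwise disjoint and together cover every cell, so they form a partition of the grid.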
Raster models are usually used to represent thematic maps (e.g. a map of temperature levels or of vegetation types). When more than one such map is defined on the same spatial region for different features, the maps are superposed (overlaid). Individual spatial objects can also be represented in a space partitioned as a grid, by the set of cells (pixels) that intersect the object's shape; spatial data is then not represented as a continuous geometric shape, but is divided into discrete units of information.
Advantages: simple data structures and simple procedures of spatial analysis.
Disadvantages: relatively large storage space (which depends on the granularity of the grid cells and on the associated information) and loss of accuracy in the visual result when the granularity is coarse.
Vector-based models: any point is represented by coordinates relative to a reference point belonging to the workspace. The spatial data types mentioned above are easily represented and managed. Spatial data modeled as vector data represents discrete features, and a vector model captures only the relevant information (about the represented objects), not information about the whole workspace. Thus, it provides a discrete view of space [CZ00].
Advantages: less storage space than raster models; topological relations are easily determined; data display is much more realistic; and the resolution is not a parameter of the system, but is simply given by the data received by the system.
Disadvantages: more complex data structures, and more expensive equipment and applications.
The discrete space domain
The spatial domain of most applications is seen (at least theoretically) as the Euclidean space. However, because a computer system cannot represent the infinite set of real numbers, spatial data is modeled using either a discrete domain (predefined data types) or a redefined discrete domain (a custom domain, i.e. a user-defined type, UDT).
Case to be discussed: the intersection of two line segments whose intersection point has coordinates that do not belong to the chosen domain. There are two strategies, depending on the spatial model and the required accuracy: (1) it is accepted that the final result is an approximation of the real result; (2) the result is corrected by moving the real intersection point to a point that is situated in the working space [GS93]. (See the intersection problems for realms.)
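The case above can be made concrete with a small sketch. It assumes an integer grid as the discrete domain, computes the exact intersection with rational arithmetic, and then moves it to the nearest grid point. The function names `intersect` and `snap` are hypothetical, and plain nearest-point rounding stands in for the correction step; it is not the procedure of [GS93].

```python
import math
from fractions import Fraction


def intersect(p1, p2, p3, p4):
    """Exact intersection point of segments p1p2 and p3p4, or None.

    End points are integer grid points. Parallel and collinear segments
    are reported as having no single intersection point.
    """
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    d = (x2 - x1) * (y4 - y3) - (y2 - y1) * (x4 - x3)
    if d == 0:
        return None
    # Parameters of the intersection point along each segment.
    t = Fraction((x3 - x1) * (y4 - y3) - (y3 - y1) * (x4 - x3), d)
    u = Fraction((x3 - x1) * (y2 - y1) - (y3 - y1) * (x2 - x1), d)
    if not (0 <= t <= 1 and 0 <= u <= 1):
        return None
    return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))


def snap(point):
    """Move an exact point to the nearest grid point (halves round up)."""
    x, y = point
    return (math.floor(x + Fraction(1, 2)), math.floor(y + Fraction(1, 2)))


exact = intersect((0, 0), (3, 2), (0, 2), (3, 0))
print(exact == (Fraction(3, 2), 1))   # True: x = 3/2 is not a grid coordinate
print(snap(exact))                    # (2, 1)
```

Under the first strategy the snapped point is simply accepted as an approximation; the second strategy additionally has to keep the segments that pass through the moved point consistent with it.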
Spatial Databases
Definition: a spatial database is a database optimized for storing, managing, and querying spatial data. It provides spatial data types in its data model and query language, and it supports spatial indexing and spatial join [Gu94].
Remark. The difference between image databases and spatial databases: image databases manage data that is acquired as digital images produced by different equipment (cameras, satellites, and so on); these images contain a set of objects, which can then be analyzed by different applications. Spatial databases do not record images of objects, but values of the spatial characteristics of a set of objects.
Modeling Spatial Data
Realms
The realm approach uses a discrete domain for representing space and spatial objects [GS93, GS95, Sc95]. The user can define a finite domain of numerical values that is used as the basis for defining a set of spatial data types. A realm is a finite set of points and line segments defined over a finite domain of type grid, such that: (1) each point is a point of the grid; (2) each segment end is a grid point; (3) no point of the realm belongs to the interior of a segment; (4) no two distinct segments intersect or overlap.
The spatial objects considered in the design process using realms are points, lines, and regions. These can be represented using only points and segments of the realm. Basically, a spatial object is not created directly on the realm; instead, it has construction elements of the realm (points and segments) associated with it.
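The four realm conditions can be sketched as a validity check. This is an illustrative sketch only: the names `is_realm`, `in_interior`, and `conflict` are hypothetical, the grid is assumed to be the integer points 0..size-1 on each axis, and two segments are assumed to be allowed to meet at a common end point (any other shared point counts as a violation).

```python
from itertools import combinations


def cross(o, a, b):
    """Cross product of vectors oa and ob (integer arithmetic, exact)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def in_interior(p, seg):
    """True if p lies on segment seg but is not one of its end points."""
    a, b = seg
    return (p not in seg
            and cross(a, b, p) == 0
            and min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= p[1] <= max(a[1], b[1]))


def conflict(s, t):
    """True if s and t share a point other than a common end point."""
    (a, b), (c, d) = s, t
    # Proper crossing: each segment separates the end points of the other.
    if (cross(c, d, a) * cross(c, d, b) < 0
            and cross(a, b, c) * cross(a, b, d) < 0):
        return True
    # Touching or overlapping: an end point inside the other segment.
    if any(in_interior(p, t) for p in s) or any(in_interior(p, s) for p in t):
        return True
    return set(s) == set(t)   # the same segment given twice


def is_realm(points, segments, size):
    def on_grid(p):
        return all(isinstance(v, int) and 0 <= v < size for v in p)

    return (all(on_grid(p) for p in points)                       # condition 1
            and all(on_grid(p) for s in segments for p in s)      # condition 2
            and not any(in_interior(p, s)
                        for p in points for s in segments)        # condition 3
            and not any(conflict(s, t)
                        for s, t in combinations(segments, 2)))   # condition 4


points = [(1, 1), (6, 5)]
segments = [((0, 0), (4, 0)), ((4, 0), (4, 4))]
print(is_realm(points, segments, size=8))                        # True
print(is_realm(points, segments + [((2, 2), (6, 2))], size=8))   # False: crossing at (4, 2)
```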
Updates are not performed on the object, but on the elements of the realm, and are thereby propagated to the objects built from those elements.
Advantages of modeling spatial data on a discrete domain using realms: different types of spatial data can be defined over the same domain; closure is guaranteed for spatial operations; and the consistency of geometric objects in spatial relations is enforced (e.g. adjacency of two objects of type region).
Disadvantages: realms are relatively difficult to integrate into a DBMS, and reconstructing the spatial objects from realm elements has a cost.
Figure 1.2. Discrete spatial domain of type grid and spatial objects of type point (P), line (L), and region (R).
Simplicial Complexes [Sc95]
This model considers the workspace as continuous (theoretically) or discrete, and uses a collection of non-regular geometric shapes. The space and the spatial objects are modeled by joining such basic shapes, called k-simplexes.
Definition: let $v_0, v_1, \ldots, v_k$ be $k+1$ points of $\mathbb{R}^n$ such that the vectors $v_1 - v_0, \ldots, v_k - v_0$ are linearly independent. The set $\{v_0, v_1, \ldots, v_k\}$ is called geometrically independent, and the set of points $\sigma_k = \{x \in \mathbb{R}^n \mid x = \sum_{i=0}^{k} \lambda_i v_i,\ \sum_{i=0}^{k} \lambda_i = 1,\ \lambda_i \geq 0\} \subset \mathbb{R}^n$ is called a simplex of dimension k (k-simplex) with vertices $v_0, v_1, \ldots, v_k$. A k-simplex is the convex closure (convex hull) of the $k+1$ points, in a space of dimension at least k. The boundary of any k-simplex consists of $k+1$ simplexes of dimension $k-1$, called the faces of the k-simplex.
Figure: simplexes of different dimensions
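The definition can be exercised with a short sketch that tests geometric independence and enumerates the faces of a simplex. The helper names `rank`, `is_simplex`, and `faces` are hypothetical; the rank is computed by exact Gaussian elimination over rationals so that no rounding affects the independence test.

```python
from fractions import Fraction
from itertools import combinations


def rank(rows):
    """Rank of a matrix (list of rows), by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][col] / m[r][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def is_simplex(vertices):
    """True if v1 - v0, ..., vk - v0 are linearly independent."""
    v0, rest = vertices[0], vertices[1:]
    diffs = [[a - b for a, b in zip(v, v0)] for v in rest]
    return rank(diffs) == len(rest)


def faces(vertices):
    """The k + 1 faces of dimension k - 1 of a k-simplex (k >= 1)."""
    k = len(vertices) - 1
    return [tuple(f) for f in combinations(vertices, k)]


triangle = [(0, 0), (4, 0), (0, 3)]                       # a 2-simplex in R^2
tetrahedron = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]  # a 3-simplex in R^3

print(is_simplex(triangle))                    # True
print(is_simplex([(0, 0), (1, 1), (2, 2)]))    # False: collinear points
print(len(faces(triangle)))                    # 3 edges
print(len(faces(tetrahedron)))                 # 4 triangles
```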
k-complex: a finite set of simplexes in which the greatest dimension of a simplex is k. k-complex restriction: the intersection of two simplexes is either the empty set or a common face of both. The spatial objects are located in this space, and any such object is built by aggregating simplexes of the partition.
Advantages: preservation of topological consistency between spatial objects, and easy implementation of the data structures and algorithms for managing simplicial complexes.
Disadvantages: high cost of triangulating the workspace and of numerical operations, such as computing distances.
Geo-Relational Algebra
A data model defined on the (discretized) Euclidean space [Gu88], proposed for implementation on top of a relational DBMS. It offers spatial data types and operators for spatial data (hence the name geo-relational algebra).
Spatial data types:
Point
Line: a chained list of segments; simple (non-self-intersecting) lines
Pgon: a closed chained list of segments; simple polygons, convex or concave
Area: similar to Pgon; represents a region of a partition
One spatial object is represented by one tuple of a table, and a table contains only objects of the same type (a set of points, a set of lines, etc.). The model does not decompose objects and does not borrow objects from the underlying space, but represents them as they exist in reality. It uses simple data structures. It does not allow data of different types to be stored in the same table of the database (the structure of the database depends on the application's characteristics).
Spatial Model with Linear Constraints
The constraint database model [KK+90] has been readily applied to the representation of spatial objects [BB+97, GR+98a]. Each geometric object is represented as a (possibly infinite) set of points, described by a first-order formula. These formulas are given in disjunctive normal form, and their terms are linear constraints of the form $\sum_{i=1}^{p} a_i x_i \;\theta\; a_0$, where $\theta \in \{=, \leq\}$, $a_i \in \mathbb{Z}$, $p \geq 1$.
The geometric objects that can be represented using linear constraints are: point, line segment, half-line, line, polygon, or any kind of region (finite or infinite) of the space.
Structure of space: this model corresponds to the vector one.
Limitation of the model: a conjunction of linear constraints (two or more linear constraints) can represent only convex polygons. In order to store a non-convex polygon, it is decomposed into convex polygons; it is therefore stored as a union of geometric shapes, i.e. a disjunction of conjunctions of linear constraints.
Advantages: the model allows the representation of objects in an n-dimensional space, for any $n \geq 1$, however large, even if such a space is physically hard to imagine. In addition, it can represent infinite subspaces in a finite (limited) manner.
Figure 1.3. point (P), line segment (S), non-convex polygon (Pg), infinite region (Ri)
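Table 1.1. The linear constraints for the example from Figure 1.3.
Geometric object | Linear constraints
P | $x = 2 \land y = 5$
S | $3x - y = 2 \land -x \leq -1 \land x \leq 2$
Pg | $(-x \leq -5 \land x \leq 7 \land -y \leq -2 \land y \leq 6) \lor (-y \leq -1 \land y \leq 2 \land -x \leq -5 \land x + y \leq 8)$
Ri | $-x + y \leq 1 \land 2x + y \leq 18$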
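A formula in disjunctive normal form can be evaluated directly as "any conjunction holds, where all of its constraints hold". The sketch below encodes the objects of Table 1.1 under the assumption that every inequality is of the form "less than or equal"; the encoding of a constraint as a `(coefficients, operator, bound)` triple and the function names `holds` and `contains` are hypothetical.

```python
def holds(constraint, point):
    """Evaluate one linear constraint a1*x1 + ... + ap*xp (= or <=) bound."""
    coeffs, op, bound = constraint
    value = sum(a * x for a, x in zip(coeffs, point))
    return value == bound if op == "=" else value <= bound


def contains(dnf, point):
    """dnf is a list (disjunction) of lists (conjunctions) of constraints."""
    return any(all(holds(c, point) for c in conj) for conj in dnf)


# Coefficients are given in the order (x, y).
P = [[((1, 0), "=", 2), ((0, 1), "=", 5)]]
S = [[((3, -1), "=", 2), ((-1, 0), "<=", -1), ((1, 0), "<=", 2)]]
Pg = [
    # First convex part: 5 <= x <= 7, 2 <= y <= 6.
    [((-1, 0), "<=", -5), ((1, 0), "<=", 7), ((0, -1), "<=", -2), ((0, 1), "<=", 6)],
    # Second convex part: 1 <= y <= 2, x >= 5, x + y <= 8.
    [((0, -1), "<=", -1), ((0, 1), "<=", 2), ((-1, 0), "<=", -5), ((1, 1), "<=", 8)],
]
Ri = [[((-1, 1), "<=", 1), ((2, 1), "<=", 18)]]

print(contains(P, (2, 5)))        # True
print(contains(S, (2, 4)))        # True: an end point of the segment
print(contains(Pg, (6, 1)))       # True: inside the second convex part
print(contains(Pg, (7, 1.5)))     # False: in the notch of the polygon
print(contains(Ri, (0, -100)))    # True: the region is unbounded downward
```

The non-convex polygon needs two conjunctions, one per convex part, which is the decomposition described in the limitation of the model.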
K-Spaghetti
The k-Spaghetti model [LT92] is used for the representation of spatial objects in a k-dimensional vector space. It is frequently encountered in applications where the dimension of the workspace is 2 or 3. The purpose of the k-Spaghetti model is to provide a general way to represent geometric objects in a relational tuple or a set of tuples. Each spatial object is triangulated, and each resulting triangle is represented by a single tuple in the relation that stores the spatial objects. The model can represent objects of type point, line segment, or polygon (points and line segments as degenerate triangles).
Figure 1.4. point (P), line segment (S), non-convex polygon (Pg)
Table 1.2. Records for example from figure 1.4. OID x1 y1 x2 y2 x3 y3 P 2 6 S 1 4 3 Pg 8 7 5
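The storage scheme can be sketched as a relation of triangles with the attributes OID, x1, y1, x2, y2, x3, y3. The coordinates below are hypothetical and are not taken from Table 1.2; the convention of repeating a vertex to encode a point or a segment as a degenerate triangle is likewise an assumption of this sketch, as are the function names.

```python
def cross(o, a, b):
    """Cross product of vectors oa and ob."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def in_triangle(p, a, b, c):
    """True if p lies in the closed triangle abc (degenerate ones included)."""
    d = (cross(a, b, p), cross(b, c, p), cross(c, a, p))
    if any(v < 0 for v in d) and any(v > 0 for v in d):
        return False
    # The bounding-box test settles the degenerate (collinear) cases.
    xs, ys = (a[0], b[0], c[0]), (a[1], b[1], c[1])
    return min(xs) <= p[0] <= max(xs) and min(ys) <= p[1] <= max(ys)


# Relation of tuples (OID, x1, y1, x2, y2, x3, y3); one tuple per triangle.
relation = [
    ("P", 1, 5, 1, 5, 1, 5),     # point: three identical vertices
    ("S", 5, 5, 7, 6, 7, 6),     # segment: the second vertex is repeated
    ("Pg", 0, 0, 4, 0, 4, 2),    # L-shaped polygon, fan triangulation
    ("Pg", 0, 0, 4, 2, 2, 2),
    ("Pg", 0, 0, 2, 2, 2, 4),
    ("Pg", 0, 0, 2, 4, 0, 4),
]


def contains(oid, p):
    """True if p belongs to at least one triangle stored for object oid."""
    return any(in_triangle(p, (x1, y1), (x2, y2), (x3, y3))
               for o, x1, y1, x2, y2, x3, y3 in relation if o == oid)


print(contains("P", (1, 5)))     # True
print(contains("S", (6, 5.5)))   # True: midpoint of the segment
print(contains("S", (9, 7)))     # False: on the same line, beyond the end point
print(contains("Pg", (1, 3)))    # True
print(contains("Pg", (3, 3)))    # False: in the notch of the L shape
```

A non-convex polygon thus costs one tuple per triangle, while a point or a segment needs a single tuple.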