Published by Marcus Ryan. Modified over 5 years ago.
1
A Hierarchical Vector Data Model for Distributed Geospatial Processing
Session: 4118 Distributed Geospatial Information Processing: Cyberinfrastructure
2/23/2019
Eric B. Wolf, Barbara P. Buttenfield
University of Colorado at Boulder
NSF BCS
Notes: Breathe! Just read this script. Make eye contact. Don’t even look at the overhead!
Presentation for the AAG National Conference 2007
AAG Imperial Ballroom A, SF Hilton - Friday, 4/20/07 at 8:00 AM
2
The Problem
Spatial data is compiled by NMAs (national mapping agencies) at fixed scales, e.g., 1:24,000, 1:100,000, 1:1,000,000.
Mixing fixed-scale data corrupts topology.
Incorrect generalization leads to incorrect modeling output.
Generalized representations lack persistent storage.
NMAs have to manage multiple data sets representing the same features.
Notes: People routinely bring fixed-scale data into GISystems and integrate them, which corrupts topology: the river and the road might no longer cross at the bridge. Generalization is difficult: incorrect generalization leads to incorrect vector lengths, which in turn degrade network analyses. Generalized data is not persistent: if the data does not hang around, how can anyone repeat your results?
3
Our Project: MRVIN
MRVIN: Multiple Representations for Vector INformation.
A hierarchical architecture for vector geospatial data.
Provides multiple representations across a range of resolutions.
Preserves topology within each data theme and between data themes.
Sustains distributed data retrieval through standard interfaces.
Notes: That’s “Marvin”, not “M’sieur Vin”. Multiple Representations for Vector Information: a distributed hierarchical architecture for vector data that provides multiple representations across scale while preserving topology within each data theme and between data themes. (Pause. Do not read the slide.)
4
Previous Work
Ramer 1972, Douglas-Peucker 1973
Ballard 1981
Hershberger and Snoeyink 1992
Saalfeld 1999
Bertolotto and Egenhofer 1999
Buttenfield 2002
Notes: Representations are created from a large-scale dataset by applying the Ramer-Douglas-Peucker (RDP) algorithm, which was developed independently by Ramer in 1972 and by Douglas and Peucker in 1973 (and by a few others as well). Ballard in 1981 proposed building a hierarchical data structure, the strip-tree, to retain the levels of deconstruction from the RDP algorithm. Hershberger and Snoeyink in 1992 optimized the RDP algorithm using convex hulls: the farthest point from a baseline will lie on the convex hull. Saalfeld in 1999 took advantage of these convex hulls and showed how to resolve some topological inconsistencies. Bertolotto and Egenhofer in 1999 used strip-trees for progressive transmission of vector features. Buttenfield in 2002 used strip-trees for progressive transmission of vector features while using convex hulls to maintain internal topology.
5
Ballard Strip Tree (1981)
Hierarchical MBR
Efficient storage search
Complete RDP order
Notes: While RDP is recognized as one of the best ways to generalize vector representations, recomputing the generalization from source data on each use is a potential source of error. The strip-tree provides a persistent hierarchy for retaining and retrieving generalized features.
6
Creation of Strip-Trees
Notes: A quick look at RDP and the strip-tree hierarchy. A baseline between the first and last vertices of the original feature makes up the first strip, or root node, in the tree. The vertex in the original feature most distant from this baseline is used as an end-point for two new strips. This process continues until all vertices are exhausted. Strips are linked in a binary-tree hierarchy. Note that the tree is generally not a balanced binary tree.
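The construction described above can be sketched in a few lines of Python. This is only an illustration, not the MRVIN code (which the talk says is written in C++): the names `Strip`, `build_strip_tree` and `simplify` are hypothetical, and the strip "width" is assumed to be the perpendicular distance of the farthest vertex from the baseline.

```python
from dataclasses import dataclass
from math import hypot
from typing import Optional


@dataclass
class Strip:
    start: int                      # index of the strip's first vertex
    end: int                        # index of the strip's last vertex
    width: float = 0.0              # farthest vertex distance from the baseline
    left: Optional["Strip"] = None
    right: Optional["Strip"] = None


def distance_to_baseline(p, a, b):
    """Perpendicular distance from p to the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = hypot(dx, dy)
    if length == 0.0:               # degenerate baseline (closed ring)
        return hypot(p[0] - a[0], p[1] - a[1])
    return abs(dx * (a[1] - p[1]) - dy * (a[0] - p[0])) / length


def build_strip_tree(pts, start=0, end=None):
    """Split recursively at the vertex farthest from the baseline (RDP order)."""
    if end is None:
        end = len(pts) - 1
    node = Strip(start, end)
    if end - start < 2:             # a single segment: leaf strip
        return node
    split, width = start + 1, -1.0
    for i in range(start + 1, end):
        d = distance_to_baseline(pts[i], pts[start], pts[end])
        if d > width:
            split, width = i, d
    node.width = width
    node.left = build_strip_tree(pts, start, split)
    node.right = build_strip_tree(pts, split, end)
    return node


def simplify(tree, tolerance):
    """Vertex indices of the representation at a given tolerance."""
    out = [tree.start]

    def walk(node):
        if node.left is None or node.width <= tolerance:
            out.append(node.end)    # this strip is good enough: stop descending
        else:
            walk(node.left)
            walk(node.right)

    walk(tree)
    return out


line = [(0, 0), (1, 1), (2, 0), (3, 0.1), (4, 0)]
tree = build_strip_tree(line)
print(simplify(tree, 0.5))          # [0, 1, 2, 4]
```

Because the tree is built once and only read afterwards, a coarser or finer representation is a traversal rather than a fresh run of RDP, which is the persistence argument made for strip-trees earlier in the talk.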
7
MRVIN Data Structure
Notes: In the MRVIN data structure: (click) The end-points of each strip are stored in the relational database with convex hulls. (click) Strips are stored in hierarchical trees. (click) Representational levels are pre-calculated and stored as a sequence of strips. A relational data structure mirrors the basic structure of a multi-part feature class. (click) Trees correspond to the parts of multi-part features. (click) Groves correspond to features, storing the parts that make up a feature. (click) Forests correspond to feature classes, storing the features that make up a thematic type. Multiple forests can be stored to handle different thematic types.
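One way to picture the forest / grove / tree / strip hierarchy is as four linked relational tables. The sketch below uses Python's built-in `sqlite3` module purely for illustration. Every table name, column name and the `add_feature` helper is an assumption; the talk does not give the actual MRVIN schema or say which RDBMS is used.

```python
import sqlite3

# Hypothetical schema: forest = feature class, grove = feature,
# tree = one part of a multi-part feature, strip = one node of a strip-tree.
SCHEMA = """
CREATE TABLE forest (id INTEGER PRIMARY KEY, theme TEXT NOT NULL);
CREATE TABLE grove  (id INTEGER PRIMARY KEY,
                     forest_id INTEGER NOT NULL REFERENCES forest(id));
CREATE TABLE tree   (id INTEGER PRIMARY KEY,
                     grove_id INTEGER NOT NULL REFERENCES grove(id));
CREATE TABLE strip  (id INTEGER PRIMARY KEY,
                     tree_id INTEGER NOT NULL REFERENCES tree(id),
                     parent_id INTEGER REFERENCES strip(id),  -- NULL for the root
                     x1 REAL, y1 REAL, x2 REAL, y2 REAL,      -- strip end-points
                     hull_wkt TEXT,                           -- stored convex hull
                     dirty INTEGER NOT NULL DEFAULT 0);       -- overlap flag
"""


def create_store():
    """Open an in-memory database holding the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_feature(conn, forest_id, parts):
    """Store one feature as a grove; each part becomes a tree with a root strip."""
    grove_id = conn.execute(
        "INSERT INTO grove (forest_id) VALUES (?)", (forest_id,)
    ).lastrowid
    for part in parts:
        tree_id = conn.execute(
            "INSERT INTO tree (grove_id) VALUES (?)", (grove_id,)
        ).lastrowid
        (x1, y1), (x2, y2) = part[0], part[-1]   # baseline of the root strip
        conn.execute(
            "INSERT INTO strip (tree_id, parent_id, x1, y1, x2, y2) "
            "VALUES (?, NULL, ?, ?, ?, ?)",
            (tree_id, x1, y1, x2, y2),
        )
    return grove_id


conn = create_store()
hydro = conn.execute("INSERT INTO forest (theme) VALUES ('hydrography')").lastrowid
stream = add_feature(conn, hydro, [[(0, 0), (1, 1), (2, 0)], [(2, 0), (3, 1)]])
```

Only the root strip of each tree is stored here; child strips would be added with `parent_id` pointing at their parent row.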
8
Mathematical Topology
A topology on a set X is a collection T of subsets of X having the following properties:
∅ and X are in T.
The union of the elements of any subcollection of T is in T.
The intersection of the elements of any finite subcollection of T is in T.
Munkres, J. R. Topology. Second ed. Upper Saddle River, NJ: Prentice Hall. P. 76.
Notes: Since topology is so significant to my work, I wanted to see how mathematicians define it: “A topology on a set X is a collection T of subsets of X having the three properties shown.” Example: take the set of boundaries of the 48 contiguous states, which is actually a set of lines and vertices. An example of an empty set would be the intersection of New York and New Mexico. You can take any part of the collection and combine it with any other part, and the result will be in the collection: say, the Four Corners states. You can take any part of the collection and look at its overlap with any other part, and that will be in the collection too: for example, the intersection of the Four Corners states and the “Red States”.
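For a finite set, the three properties can be checked mechanically. The sketch below is an illustration only; `is_topology` is a hypothetical helper, and it relies on the fact that for a finite collection, closure under pairwise unions and intersections implies closure under all of them.

```python
from itertools import combinations


def is_topology(X, T):
    """Check the three topology axioms for a finite set X and collection T."""
    X = frozenset(X)
    T = {frozenset(s) for s in T}
    # Property 1: the empty set and X itself are in T.
    if frozenset() not in T or X not in T:
        return False
    # Every member of T must be a subset of X.
    if any(not s <= X for s in T):
        return False
    # Properties 2 and 3: for finite T, pairwise closure is enough.
    for a, b in combinations(T, 2):
        if (a | b) not in T or (a & b) not in T:
            return False
    return True


X = {1, 2, 3}
print(is_topology(X, [set(), {1}, {1, 2}, X]))   # True
print(is_topology(X, [set(), {1}, {2}, X]))      # False: {1} | {2} is missing
```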
9
Examples of Topologies
Figure labels: Equivalent Topologies; Comparable Topologies
Notes: Mathematicians like to see how topologies compare. The topologies on the left are equivalent: even though the polygons differ in size and shape, the set of vertices and line segments keeps the same order. (click) The topologies on the right are comparable. The top topology (states) is strictly coarser than the topology below (counties). The set of state boundaries is actually a subset of the set of county boundaries but is still a topology in itself. In mathematics, topologies are relations on sets. The set elements might be spatial or they might not. In our project, we are dealing with spatial topology.
10
Internal & Relative Topology
Figure label: Self-Crossing Feature
Notes: This figure demonstrates an internal topology problem. RDP generalization can result in features that self-cross. (click) The figure on the right is an example of a potential relative topology problem. Generalized to smaller scales, Boulder Creek might be represented as crossing Broadway at the Arapahoe intersection when the bridges are actually some distance away.
11
Preserving Topology
“If two convex hulls do not overlap, the contents of those hulls will not overlap.” (Saalfeld 1999)
Notes: During RDP generalization, the convex hulls at each strip level can be checked for overlaps (on the left). If two hulls intersect, a “dirty flag” can be set in the database, indicating that re-composition should go further down the tree to get the topologically correct representation (on the right).
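A minimal sketch of such an overlap check follows. It is not taken from MRVIN: the hull is built with the standard monotone-chain method and the overlap test uses separating axes, both swapped in here as convenient textbook techniques. The function names are hypothetical, hulls that merely touch are treated as overlapping, and each strip is assumed to contain at least two distinct vertices.

```python
def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Monotone-chain convex hull, counter-clockwise, no repeated end-point."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def hulls_overlap(h1, h2):
    """Separating-axis test: False only if some axis separates the hulls."""
    for hull in (h1, h2):
        n = len(hull)
        for i in range(n):
            ex = hull[(i + 1) % n][0] - hull[i][0]
            ey = hull[(i + 1) % n][1] - hull[i][1]
            # Edge normal, plus edge direction so degenerate (segment) hulls work.
            for ax, ay in ((-ey, ex), (ex, ey)):
                p1 = [ax * x + ay * y for x, y in h1]
                p2 = [ax * x + ay * y for x, y in h2]
                if max(p1) < min(p2) or max(p2) < min(p1):
                    return False
    return True


def dirty_flag(strip_a, strip_b):
    """True when the hulls overlap, i.e. the strips MAY conflict at this level."""
    return hulls_overlap(convex_hull(strip_a), convex_hull(strip_b))


corner = [(0, 0), (0, 4), (4, 4)]       # an L-shaped strip
nearby = [(1, 2), (2, 3)]               # sits inside the corner's hull
far_away = [(10, 10), (11, 12), (12, 10)]
print(dirty_flag(corner, nearby))       # True: descend further down the tree
print(dirty_flag(corner, far_away))     # False: contents cannot overlap
```

The test is conservative in the way Saalfeld's statement suggests: `corner` and `nearby` never actually cross, but their hulls overlap, so the flag only says that a finer level has to be examined.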
12
Preserving Topology in Multi-Part Features
Compound vectors are stored as “groves” of “trees”.
The convex hull for each tree is calculated and stored.
A convex hull for the grove is calculated and stored.
Saalfeld’s test is applied among all trees in each grove and among groves.
Notes: Unlike prior work on progressive vector transmission, MRVIN’s architecture stores compound vectors, or multi-part features. The stream network on the right consists of five strip-trees. These five trees make up a grove. (click) Convex hulls are stored for each strip-tree and for each grove of trees. Saalfeld’s test is applied among all trees and among all groves, flagging all internal topology problems introduced by RDP. (pause)
13
Preserving Relative Topology
Notes: This is the “road and river cross at the bridge” problem. (click) We again apply Saalfeld’s test to find overlapping convex hulls. We can now descend the tree hierarchy in each theme, looking for either the highest level where there is no overlap or the lowest level in the tree. This determines the anchor points between the two themes. By integrating these points back into the representation, relative topology is preserved.
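The descent described above might look roughly like the following sketch for two single-part features. It is an illustration under several assumptions, not the MRVIN routine: hulls are recomputed on the fly rather than read from a database, strips are split at the RDP vertex, touching hulls count as overlapping, and `anchor_candidates` is a hypothetical name. It returns the pairs of lowest-level strips (as vertex index ranges) whose hulls still overlap, which is where anchor points would be placed.

```python
def cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Monotone-chain convex hull."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def hulls_overlap(h1, h2):
    """Separating-axis test on edge normals and edge directions."""
    for hull in (h1, h2):
        n = len(hull)
        for i in range(n):
            ex = hull[(i + 1) % n][0] - hull[i][0]
            ey = hull[(i + 1) % n][1] - hull[i][1]
            for ax, ay in ((-ey, ex), (ex, ey)):
                p1 = [ax * x + ay * y for x, y in h1]
                p2 = [ax * x + ay * y for x, y in h2]
                if max(p1) < min(p2) or max(p2) < min(p1):
                    return False
    return True


def split_index(pts, start, end):
    """Vertex farthest from the baseline pts[start]-pts[end] (the RDP split)."""
    (ax, ay), (bx, by) = pts[start], pts[end]
    return max(range(start + 1, end),
               key=lambda i: abs((bx - ax) * (ay - pts[i][1])
                                 - (by - ay) * (ax - pts[i][0])))


def children(pts, rng):
    """Child strips of a strip, or the strip itself if it is a leaf."""
    start, end = rng
    if end - start == 1:
        return [rng]
    s = split_index(pts, start, end)
    return [(start, s), (s, end)]


def anchor_candidates(a, b, ra=None, rb=None):
    """Descend both trees; return leaf-strip pairs whose hulls still overlap."""
    ra = ra or (0, len(a) - 1)
    rb = rb or (0, len(b) - 1)
    hull_a = convex_hull(a[ra[0]:ra[1] + 1])
    hull_b = convex_hull(b[rb[0]:rb[1] + 1])
    if not hulls_overlap(hull_a, hull_b):
        return []                       # no overlap at this level: stop here
    if ra[1] - ra[0] == 1 and rb[1] - rb[0] == 1:
        return [(ra, rb)]               # lowest level in both trees
    found = []
    for sa in children(a, ra):
        for sb in children(b, rb):
            found += anchor_candidates(a, b, sa, sb)
    return found


road = [(0, 0), (2, 2), (4, 0), (6, 2)]
river = [(0, 1), (2, 1.2), (6, 1)]
print(anchor_candidates(road, river))
```

For leaf strips the hull is the segment itself, so the surviving pairs are segments that actually meet; computing the crossing point of each pair would then give the anchor point to integrate back into both themes.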
14
MRVIN Architecture
Notes: The basic idea behind MRVIN is to decompose, or deconstruct, large-scale features into multiple representations stored in a relational database. Looking at this diagram from left to right: Shapefiles and text files of vertex pairs are the primary input for data interoperability. A program using a C++ class structure that mimics the relational database structure decomposes each feature into strip-trees and stores them on an RDBMS server. The relational database server stores data in the MRVIN data structure and allows for scalability. Automated routines on the RDBMS server apply Saalfeld’s test between themes and groves. Another automated routine calculates geometric structure signatures to be stored in the RDBMS. A representation extraction routine implements a CGI interface in C++ that accepts as inputs the theme selection, spatial extent, and desired scale, and outputs transparent PNG files or shapefiles for the features or the convex hulls.
15
Future Research
Extending MRVIN to points and polygons.
Managing dimensional collapse (when a polygon becomes a line or a point).
ArcGIS script like “TerraServer Download”.
Merge MRVIN into PostGIS and extend Minnesota Map Server to create tiles from MRVIN for WMS access.
Keyhole Markup Language output for Google Earth.
Notes: Future research directions, in addition to enhancing MRVIN to support more data types: For an undergraduate senior project, an ArcGIS script similar to the “TerraServer Download” tool might be created, or MRVIN could be extended to generate KML for Google Earth. For a master’s thesis, the MRVIN data structure could be merged into the PostGIS data model, and Minnesota Map Server could be extended to generate tiles from MRVIN that can be accessed by any software able to consume WMS services. Of course, Babs and I have some loftier goals for MRVIN, which we’ll report on at a later date.