An Open Source GIS Architecture for Connected and Linked Data
Jerry Hayes
Frank Hardisty (Advisor)
Connected Data in GIS Today
- Many use cases in GIS.
- GIS creates a logical model to represent topology.
- Storage of the logical model is not optimal.
- Problem: Relational database abstraction impedes performance and scalability.
Linked Data in GIS Today
- Links are described by semantic relationships.
- Semantics enable data discovery.
- Early stages of adoption in GIS.
- Problem: Vast potential of the Semantic Web is unrealized in GIS!
How GIS Stores Connected Data
- Uses relational database tables.
- Having trouble “visualizing” the network? … so do machines!
- Abstraction introduces unnecessary overhead.
- Bad for large datasets!
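
To make the overhead concrete, here is a minimal sketch of a network stored as a relational edge table. It uses Python's built-in `sqlite3` module as a stand-in for any RDBMS; the `edges` table, its columns, the node names, and the `reachable` helper are all hypothetical and not taken from the system described in these slides. Each hop of the traversal costs another query against the table.

```python
import sqlite3

def reachable(conn, start, max_hops):
    """Return node ids reachable from start within max_hops (one SQL query per hop)."""
    seen = {start}
    frontier = {start}
    for _ in range(max_hops):
        if not frontier:
            break
        # Every hop goes back to the edge table to look up the next neighbors.
        placeholders = ",".join("?" * len(frontier))
        rows = conn.execute(
            f"SELECT to_node FROM edges WHERE from_node IN ({placeholders})",
            sorted(frontier),
        ).fetchall()
        frontier = {row[0] for row in rows} - seen
        seen |= frontier
    return seen

# Hypothetical toy network: A -> B -> C -> D, plus A -> E.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (from_node TEXT, to_node TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")],
)
```

Calling `reachable(conn, "A", 3)` walks the whole toy network, but only by issuing three separate lookups against the table, which is the kind of indirection the slide refers to.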
Graph Databases for Connected Data
- Stores connected data in its native format.
- Much easier to visualize the network … machines are happier too!
- Removes unnecessary overhead.
- Good for large datasets!
Database Performance Comparisons
- Performance comparisons are difficult.
  … how “connected” is the connected data?
- Preprocessing data helps mitigate issues.
  … ESRI’s preprocessed logical network model.
- In general …
  i) RDBMSs are optimized for aggregation queries.
  ii) Graph databases are optimized for traversals.
Graph Database Characteristics
- Two basic properties define graph databases:
  - Native Graph Processing
  - Native Graph Storage
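
Native graph processing is commonly described as "index-free adjacency": each node holds direct references to its neighbors, so a traversal follows pointers instead of looking edges up in a table or index. The sketch below illustrates that idea in plain Python. The `Node` class, `connect`, `traverse`, and the toy network are hypothetical illustrations, not the internals of Neo4j or any other product.

```python
from collections import deque

class Node:
    """A graph node that stores direct references to its neighbors."""
    def __init__(self, name):
        self.name = name
        self.neighbors = []  # adjacent Node objects, no lookup table involved

def connect(a, b):
    """Add a directed edge from node a to node b."""
    a.neighbors.append(b)

def traverse(start, max_hops):
    """Breadth-first traversal returning names reachable within max_hops."""
    seen = {start.name}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue
        # Moving to a neighbor is a pointer dereference, not a query.
        for nxt in node.neighbors:
            if nxt.name not in seen:
                seen.add(nxt.name)
                queue.append((nxt, depth + 1))
    return seen

# Hypothetical toy network: A -> B -> C -> D, plus A -> E.
a, b, c, d, e = (Node(n) for n in "ABCDE")
connect(a, b)
connect(b, c)
connect(c, d)
connect(a, e)
```

Here the cost of each step depends on the number of neighbors of the current node rather than on the total size of the dataset, which is the usual argument for why traversals favor this layout.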
Linked Data … the Next Frontier
- Connects data to data on the Web.
- Uses the Resource Description Framework (RDF).
- Creating quality linked data is challenging!
- Only useful in sufficient quality and quantity.
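
RDF expresses every statement as a subject, predicate, object triple, with resources identified by URIs. As a minimal sketch of that data model, the example below keeps triples as plain Python tuples and matches them against a pattern in which `None` acts as a wildcard. The `example.org` URIs, the station and line resources, and the `match` helper are all made up for illustration; a real application would normally use an RDF library and a published vocabulary.

```python
# Hypothetical URIs under example.org, used only for illustration.
EX = "http://example.org/"
CONNECTS_TO = EX + "connectsTo"
PART_OF = EX + "partOf"

triples = [
    (EX + "StationA", CONNECTS_TO, EX + "StationB"),
    (EX + "StationB", CONNECTS_TO, EX + "StationC"),
    (EX + "StationA", PART_OF, EX + "LineOne"),
]

def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None matches any value."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

# Everything stated about StationA, regardless of predicate.
about_station_a = match(triples, s=EX + "StationA")
# Every "connectsTo" link in the dataset.
links = match(triples, p=CONNECTS_TO)
```

Because the predicate is itself a URI with a shared meaning, a client can discover what is related to a resource without knowing a table schema in advance.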
Accessing Linked Data
- Many RDF datasets are now available.
- LinkedGeoData for GIS applications.
- Data quality, availability and stability concerns.
- Tools are available for accessing RDF models.
Open Source System Architecture
- Server side is stateless.
- PostGIS used for …
  - Storing the physical model.
  - Data visualization.
- Neo4j used for …
  - Storing the logical model.
  - Graph traversals.
- Implemented in the IBM Cloud.
Servlet Architecture
- Provides a RESTful API.
- Enables spatial analytics.
- Enables “data” discovery.
- Integrates physical and logical model processing.
- Implemented in the IBM Cloud.
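
The following is a rough sketch of what "integrating physical and logical model processing" behind a stateless endpoint could look like. It is not the servlet from this project: the `/route` URL, the `from` and `to` parameters, the in-memory `LOGICAL` and `PHYSICAL` dictionaries (standing in for Neo4j and PostGIS), and the coordinates are all assumptions made for illustration. The handler finds a path in the logical model, then attaches geometry from the physical model.

```python
from collections import deque
from urllib.parse import urlparse, parse_qs

# Stand-in for the logical model (connectivity); hypothetical data.
LOGICAL = {"A": ["B"], "B": ["C"], "C": []}
# Stand-in for the physical model (coordinates); made-up values.
PHYSICAL = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (1.0, 1.0)}

def shortest_path(graph, start, goal):
    """Breadth-first search; returns a list of node ids, or None if unreachable."""
    prev = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in graph.get(node, []):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None

def handle_request(url):
    """Stateless handler: everything needed is carried in the request URL."""
    query = parse_qs(urlparse(url).query)
    start, goal = query["from"][0], query["to"][0]
    path = shortest_path(LOGICAL, start, goal)  # logical model step
    if path is None:
        return {"status": 404, "body": None}
    coords = [list(PHYSICAL[n]) for n in path]  # physical model step
    return {"status": 200, "body": {"type": "LineString", "coordinates": coords}}
```

For example, `handle_request("/route?from=A&to=C")` returns a GeoJSON-style `LineString` through A, B and C. Since no session data is kept between calls, any server instance could answer any request, which is what a stateless server side implies.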