Scaling Bathymetry: Data Handling for Large Volumes
Mark Masry, CARIS R&D
Fredericton – Canada • Heeswijk – The Netherlands • Ellicott City – United States
Trends
- Sensors are getting better
- Processors are more powerful
- Hard drives are getting bigger
Hard Drive Trends
Tech Summary
- Current storage formats for grids and points are good, but limited:
  - Single resolution
  - Lots of data required in memory
- Goals:
  - Rebuild data storage mechanisms from the ground up
  - Use multiple resolutions
  - Structure data so that not all of it is required in memory
New Technology Stack
Where does the CSAR framework fit in the technology stack?
[Diagram: information flow (DataFlow) through the layers, top to bottom]
- Applications
- Grid, Point Cloud
- CSAR Framework
- Storage Device
CSAR Framework
- A framework for managing chunks of data
- Data chunks have a flexible structure
- Storage device independent
- The basis for the new Grids and Point Clouds
- A platform for data management in the coming years
CSAR Block Diagram
[Diagram: DataFlow through the layers, top to bottom]
- Set Layer
- Cache Layer
- Storage Layer
- Storage Device
CSAR Set Layer
- Primary point of interaction with the CSAR framework
- Associates data chunks with unique keys
- Keys can be anything
- Used to index a collection of chunks
- Retrieves chunks of data from the Cache layer or from the Storage layer
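The key-to-chunk lookup described on this slide can be sketched in a few lines of Python. This is an illustration only, not the CSAR API: the class name `ChunkSet`, its methods, and the plain dictionaries standing in for the cache and storage layers are all assumptions made for the example.

```python
from typing import Dict, Hashable, Optional


class ChunkSet:
    """Hypothetical set layer: indexes data chunks by arbitrary hashable keys."""

    def __init__(self, storage: Dict[Hashable, bytes]) -> None:
        self._storage = storage                   # stand-in for the storage layer
        self._cache: Dict[Hashable, bytes] = {}   # stand-in for the cache layer
        self.storage_reads = 0                    # counts trips to storage

    def get(self, key: Hashable) -> Optional[bytes]:
        # Serve the chunk from the cache when it is already in memory.
        if key in self._cache:
            return self._cache[key]
        # Otherwise fall back to storage and keep the chunk for next time.
        if key in self._storage:
            self.storage_reads += 1
            chunk = self._storage[key]
            self._cache[key] = chunk
            return chunk
        return None  # unknown key


# Keys can be anything hashable, for example a (name, row, column) tuple.
backing = {("tile", 3, 7): b"depths"}
chunks = ChunkSet(backing)
first = chunks.get(("tile", 3, 7))    # read from storage
second = chunks.get(("tile", 3, 7))   # served from the cache
```

The second request never reaches storage, which is the behaviour the set layer relies on for chunks that are already resident.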
CSAR Caching Layer
- Stores chunks in memory in a common pool
- Cache size can be modified dynamically
- Layer can have a single cache or split data up into multiple independent caches
- Chunks are swapped in and out of the cache from storage on request
- Lazy writeback
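A resizable cache with lazy writeback might look roughly like the sketch below. The slide does not say how CSAR chooses which chunk to evict, so the least-recently-used policy here is an assumption, as are the class name `WriteBackCache` and the dictionary standing in for the storage layer.

```python
from collections import OrderedDict
from typing import Dict, Hashable, Set


class WriteBackCache:
    """Hypothetical chunk cache: LRU eviction (assumed) with lazy writeback."""

    def __init__(self, storage: Dict[Hashable, bytes], capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._storage = storage   # stand-in for the storage layer
        self._capacity = capacity
        self._chunks: "OrderedDict[Hashable, bytes]" = OrderedDict()
        self._dirty: Set[Hashable] = set()  # modified but not yet written

    def get(self, key: Hashable) -> bytes:
        if key in self._chunks:
            self._chunks.move_to_end(key)  # mark as most recently used
            return self._chunks[key]
        chunk = self._storage[key]         # swap in from storage on request
        self._chunks[key] = chunk
        self._evict()
        return chunk

    def put(self, key: Hashable, chunk: bytes) -> None:
        self._chunks[key] = chunk
        self._chunks.move_to_end(key)
        self._dirty.add(key)               # defer the write to storage
        self._evict()

    def resize(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity          # cache size changes at run time
        self._evict()

    def flush(self) -> None:
        for key in list(self._dirty):
            self._storage[key] = self._chunks[key]
        self._dirty.clear()

    def _evict(self) -> None:
        while len(self._chunks) > self._capacity:
            key, chunk = self._chunks.popitem(last=False)  # oldest entry
            if key in self._dirty:         # write back only when evicted
                self._storage[key] = chunk
                self._dirty.discard(key)


storage: Dict[Hashable, bytes] = {}
cache = WriteBackCache(storage, capacity=2)
cache.put("a", b"1")
cache.put("b", b"2")
untouched = dict(storage)   # nothing has been written yet
cache.put("c", b"3")        # evicts "a", which is written back
after_evict = dict(storage)
cache.resize(1)             # shrinking evicts "b" as well
cache.flush()               # forces the remaining dirty chunk out
```

Writes reach storage only on eviction or an explicit flush, so repeated edits to a resident chunk cost a single write.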
CSAR Storage Layer
- Communicates with the Storage Device
- Device could be a proprietary file, a database or a network
- Translates chunks of data from storage into the internal format
- Writes and reads can happen without blocking processing
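One way to get reads and writes that do not block processing is to hand them to a background worker and return a future. The sketch below does that with Python's standard `concurrent.futures`; the class name `AsyncStorage`, the in-memory dictionary standing in for the device, and the choice of packed little-endian doubles as the stored format are all assumptions for illustration, not a description of how CSAR does it.

```python
import struct
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Dict, Hashable, List


class AsyncStorage:
    """Hypothetical storage layer: translates chunks and does I/O off-thread."""

    def __init__(self) -> None:
        self._device: Dict[Hashable, bytes] = {}  # stand-in for a file or database
        # A single worker keeps operations in submission order.
        self._pool = ThreadPoolExecutor(max_workers=1)

    def _encode(self, values: List[float]) -> bytes:
        # Internal format (list of floats) to stored format (packed doubles).
        return struct.pack(f"<{len(values)}d", *values)

    def _decode(self, raw: bytes) -> List[float]:
        # Stored format back to the internal format.
        return list(struct.unpack(f"<{len(raw) // 8}d", raw))

    def _write(self, key: Hashable, values: List[float]) -> None:
        self._device[key] = self._encode(values)

    def _read(self, key: Hashable) -> List[float]:
        return self._decode(self._device[key])

    def write(self, key: Hashable, values: List[float]) -> Future:
        # Returns immediately; the caller can keep processing.
        return self._pool.submit(self._write, key, values)

    def read(self, key: Hashable) -> Future:
        return self._pool.submit(self._read, key)

    def close(self) -> None:
        self._pool.shutdown(wait=True)


store = AsyncStorage()
pending_write = store.write("chunk-0", [1.5, -2.25])
pending_read = store.read("chunk-0")   # queued behind the write
depths = pending_read.result()         # block only when the data is needed
store.close()
```

The caller waits only at `result()`, so work that does not depend on the chunk can proceed while the device is busy.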
Storage: CSAR File
- First implementation of a storage backend
- Designed to store large chunks of data
- Can have multiple Grids or Clouds in a single file
- Based on a lightweight open source database
Storage: Oracle Spatial
- Write a new Storage Layer implementation
- Supports the high volume Grid and Cloud types
- Use the native Oracle Spatial data representation
- Translate data chunks to and from the Oracle Spatial representation
- Store chunks in the cache layer, then write them back
Data Structures
- Grid and Cloud
- High volume
- Built using CSAR
- Multi-resolution
- Multi-band with many data types
Georeferenced Cloud
- Storage for high volume (X,Y,Z) points
- Tested with 300,000,000 points, with multiple attribute bands and flags on each point
- Has both high level and low level structure for points
- Can view and interact with the entire cloud in 2D and 3D
- Intend to edit directly on the cloud without extracting subsections
Georeferenced Grid
- Stores high volume gridded data
- More than 40 billion grid nodes with multiple attribute bands
- Programmable updaters to create multiple levels of resolution
- Inter-band dependencies are handled
- Also connects to GDAL
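The idea of a programmable updater that builds coarser resolution levels can be illustrated with a small pyramid builder. This is a sketch under assumptions: the slide does not describe the updater interface, so the 2x2 block reduction, the function names `downsample` and `build_pyramid`, and the nested-list grid are all hypothetical.

```python
from statistics import mean
from typing import Callable, List, Sequence

Grid = List[List[float]]
Updater = Callable[[Sequence[float]], float]  # reduces one block to one node


def downsample(grid: Grid, updater: Updater) -> Grid:
    """Build the next coarser level by reducing each 2x2 block of nodes."""
    rows, cols = len(grid), len(grid[0])
    coarse: Grid = []
    for r in range(0, rows, 2):
        coarse_row: List[float] = []
        for c in range(0, cols, 2):
            # Edge blocks are smaller when a dimension is odd.
            block = [
                grid[i][j]
                for i in range(r, min(r + 2, rows))
                for j in range(c, min(c + 2, cols))
            ]
            coarse_row.append(updater(block))
        coarse.append(coarse_row)
    return coarse


def build_pyramid(grid: Grid, updater: Updater) -> List[Grid]:
    """Return all levels, finest first, down to a single node."""
    levels = [grid]
    while len(levels[-1]) > 1 or len(levels[-1][0]) > 1:
        levels.append(downsample(levels[-1], updater))
    return levels


depth = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
    [13.0, 14.0, 15.0, 16.0],
]
shoalest = build_pyramid(depth, min)   # keep the smallest value per block
average = build_pyramid(depth, mean)   # or average the block instead
```

Passing the reduction as a function is what makes the updater programmable here: each band can use a rule suited to its meaning, such as a minimum for one band and a mean for another.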
Visualization
- 2D/3D support for the new Grid and Cloud
- Fast loads and zooms
- Smooth even for large data sets
- Rebuilt 3D visualization engine
- Dynamic lighting and colour mapping
- System brings in data in the background while moving
Remote Visualization
- Raster and Clouds structured for remote visualization
- Visualization over the web using Spatial Fusion
- Visualization over the network using applications
- Remote visualization from Bathy DataBASE
- Fast load times facilitated by the data structures
3D Point Cloud Viz
3D Raster Viz
Conclusions
- CSAR provides a new platform for all our applications for the coming years
- Organizes, loads and caches data partitioned into chunks
- New data structures for gridded and point data, designed for high volumes
- New visualization engine