Database Server Extension for managing and querying 4D gridded spatiotemporal data Presented at the Edinburgh e-Science Institute Nov 1-2, 2005 conference.

Database Server Extension for managing and querying 4D gridded spatiotemporal data Presented at the Edinburgh e-Science Institute Nov 1-2, 2005 conference on “Spatiotemporal Databases” by Ian Barrodale Barrodale Computing Services Ltd. (BCS)

Barrodale Computing Services Ltd. (BCS) “At BCS we let the actual tasks that our clients are trying to accomplish guide our solutions, rather than producing software that dictates how clients can perform their work.” Provides customized software and R&D services to technical clients Successfully completed 450+ software development projects since incorporation in 1978 Long-term professional staff IBM Business Partner Major clients include: Canada - Province of BC (Elections BC, Ministry of Forests), DND,... USA - US Navy, NOAA, IBM, SPAWAR, Univ. of Mississippi,...

Barrodale Computing Services Ltd. (BCS) Some Application Areas: Defense Sciences (ASW, MCM, METOC) Elections/Census (Geo-Spatial Database) Forestry (Spatial Timber Supply Models) Terrain Modeling (Watershed Delineation) Seabed Monitoring (Gas Hydrates) Some Skill Sets: Mathematical Analysis Algorithm Development Signal & Image Processing Modeling & Simulation Software Engineering Spatial Data Analysis Spatial Database Design Database Server Extensions Large Dataset Management Graphical User Interfaces Data Visualization Web Map Services

(Simplistic) Database Classification Matrix
            Simple Data        Complex Data
Query       Relational DBMS    Object Relational DBMS
No Query    File System        Object Oriented DBMS

File Server vs. RDBMS + File Server
Files for both metadata and data vs. RDBMS for metadata & files for data.
File Server alone:
+ Simpler.
+ Less expensive.
± Metadata stored in data file name/directory or inside gridded data file.
RDBMS + File Server:
+ Integrity checking of metadata - can be performed by built-in RDBMS features (check constraints, triggers, etc.).
+ Efficient access to metadata - e.g., indices can be used.
+ Easier to locate gridded data of interest - e.g., complicated queries on metadata can be performed.
− Metadata separated from gridded data - data inconsistencies possible.
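The "RDBMS for metadata, files for data" arrangement can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table name `grid_files`, its columns, and the file names are hypothetical and do not come from the presentation. The check constraints reject inconsistent metadata, the index supports lookups by variable, and the grid values themselves stay in external files.

```python
import sqlite3


def open_catalog():
    """Create an in-memory metadata catalog; grid values stay in external files."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE grid_files (
            path     TEXT PRIMARY KEY,   -- location of the gridded data file
            variable TEXT NOT NULL,
            min_lat  REAL NOT NULL CHECK (min_lat BETWEEN -90 AND 90),
            max_lat  REAL NOT NULL CHECK (max_lat BETWEEN -90 AND 90),
            CHECK (min_lat <= max_lat)
        )
        """
    )
    # An index lets the database locate rows by variable without a full scan.
    conn.execute("CREATE INDEX idx_variable ON grid_files (variable)")
    return conn


def register(conn, path, variable, min_lat, max_lat):
    """Insert one metadata row; return False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO grid_files VALUES (?, ?, ?, ?)",
                (path, variable, min_lat, max_lat),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def find_files(conn, variable, lat):
    """Return paths of files whose latitude range covers the given latitude."""
    rows = conn.execute(
        "SELECT path FROM grid_files "
        "WHERE variable = ? AND ? BETWEEN min_lat AND max_lat ORDER BY path",
        (variable, lat),
    )
    return [row[0] for row in rows]
```

Nothing in this sketch stops a file from being moved or overwritten behind the catalog's back, which is the inconsistency risk listed as the drawback of this arrangement.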

Object Relational DBMS
RDBMS for metadata & fileserver for data vs. ORDBMS (metadata & data integrated).
ORDBMS:
+ Improved concurrency - concurrent users can safely query the same gridded data.
+ Composite data types - gridded data bundled with their metadata.
+ Improved integrity - ability to reject bad gridded data before it is stored in ORDBMS.
+ Database extensibility - easy addition of data types and operations.
+ Uniform treatment of data items - SQL interface can perform complex queries based on any of these data items, e.g., metadata as well as gridded data; less need for custom 3GL programming.
+ Custom data access methods - e.g., R-tree indexes.
+ Point-in-time recovery of gridded data possible.
+ Built-in complex SQL functions for gridded data operations - e.g., aggregating, slicing, subsetting, reprojecting, subsampling,...

BCS specializes in ORDBMS Applications
The current main platforms for BCS database applications are IBM Informix Dynamic Server and PostgreSQL.
Object Relational Database Management Systems (ORDBMSs) have four features that set them apart from traditional DBMSs:
1. User-defined abstract data types (ADTs). ADTs allow new data types with structures suited to particular applications to be defined.
2. User-defined routines (UDRs). UDRs provide the means for writing customized server functions that have much of the power and functionality expressible in C.
3. “SmartBLOBs”. These are disk-based objects that have the functionality of random access files. ADTs use them to store any data that does not fit into a table row.
4. Flexible spatial indexing. R-tree indexing for multi-dimensional data enables fast searching of particular ADTs in a table.
Example: Sum the “area” (UDR) of all “lakes” (ADT) contained (R-tree) in “British Columbia” (ADT)
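The idea of a user-defined routine called from SQL can be imitated with Python's sqlite3 module, which lets an application register its own SQL functions. The sketch below is an analogy, not the Informix or PostgreSQL mechanism: shapes are reduced to hypothetical bounding-box strings of the form "xmin,ymin,xmax,ymax", the functions `area` and `contained` are illustrative stand-ins for UDRs, and the containment test is a full table scan rather than an R-tree search.

```python
import sqlite3


def parse_box(text):
    """Parse a hypothetical 'xmin,ymin,xmax,ymax' bounding-box string."""
    xmin, ymin, xmax, ymax = (float(value) for value in text.split(","))
    return xmin, ymin, xmax, ymax


def area(text):
    """Stand-in for an AREA routine: area of a bounding box."""
    xmin, ymin, xmax, ymax = parse_box(text)
    return (xmax - xmin) * (ymax - ymin)


def contained(inner, outer):
    """Stand-in for a containment test: 1 if inner lies inside outer, else 0."""
    a, b = parse_box(inner), parse_box(outer)
    return int(a[0] >= b[0] and a[1] >= b[1] and a[2] <= b[2] and a[3] <= b[3])


def total_lake_area(lakes, region):
    """Sum the area of all lakes contained in a region, entirely in SQL."""
    conn = sqlite3.connect(":memory:")
    # Register the Python functions so that SQL statements can call them.
    conn.create_function("area", 1, area)
    conn.create_function("contained", 2, contained)
    conn.execute("CREATE TABLE lakes (name TEXT, shape TEXT)")
    conn.executemany("INSERT INTO lakes VALUES (?, ?)", lakes)
    row = conn.execute(
        "SELECT COALESCE(SUM(area(shape)), 0.0) FROM lakes WHERE contained(shape, ?)",
        (region,),
    ).fetchone()
    return row[0]
```

For example, `total_lake_area([("a", "0,0,2,3"), ("c", "9,9,12,12")], "0,0,10,10")` counts only lake "a", because lake "c" extends beyond the region.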

Query: From a given point on a stream, what is the entire area from which drainage is received?

“SQL” Example 1: Find the area of the watershed that is upstream from where a given road crosses a given stream.
SELECT Area(Watershed(streamElement, (Intersection(streamElement, roadElement))))
FROM streamNetwork, roadNetwork
WHERE Overlap(Intersection(Box(streamElement), Box(roadElement)), userDefinedArea);
Note: userDefinedArea is, say, a string provided by the user.
UDRs:
BOX - rectangle enclosing object
INTERSECTION - common area
OVERLAP - T or F
WATERSHED - calculates watershed upstream from a point
AREA - calculates area
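The presentation does not say how the WATERSHED routine works internally. One common way to answer "what area drains to this point" is to follow a flow-direction grid upstream, and the sketch below shows that approach under explicit assumptions: every cell drains into exactly one neighbouring cell, the mapping `downstream` is supplied by the caller, and all cells have the same area. All names here are hypothetical.

```python
from collections import deque


def watershed_cells(downstream, outlet):
    """Return every cell whose flow path reaches the outlet (outlet included).

    downstream maps each cell to the cell it drains into, or None at a sink.
    """
    # Invert the drainage mapping so that we can walk upstream.
    upstream = {}
    for cell, target in downstream.items():
        if target is not None:
            upstream.setdefault(target, []).append(cell)

    seen = {outlet}
    queue = deque([outlet])
    while queue:
        cell = queue.popleft()
        for contributor in upstream.get(cell, []):
            if contributor not in seen:
                seen.add(contributor)
                queue.append(contributor)
    return seen


def watershed_area(downstream, outlet, cell_area):
    """Area of the watershed, assuming every cell covers the same area."""
    return len(watershed_cells(downstream, outlet)) * cell_area
```

In the query above, the outlet would correspond to the point where the road crosses the stream.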

“SQL” Example 2: Find all side-scan sonar images, that are in a user-defined area, with a heading within one degree of degrees and with an average slant range of less than 50 m.
SELECT image
FROM sonarImageArchive
WHERE Overlap(Box(image), userDefinedArea)
AND ABS(Heading(image) ) < 1.0
AND Average(SlantRange(image)) < 50.0;
Note: userDefinedArea is, say, a string provided by the user.
UDRs:
SLANTRANGE - calculates slant range
AVERAGE - calculates average
HEADING - supplies heading of object
ABS - absolute value
BOX - rectangle enclosing object
OVERLAP - T or F
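A plain absolute difference between two headings misbehaves near north, where 359.5 degrees and 0.2 degrees are less than one degree apart. Whether the HEADING and ABS routines account for this is not stated in the slides. The sketch below shows one wrap-aware comparison together with the slant-range filter; the dictionary keys and the `target_heading` parameter are hypothetical, since the target value is not given in the source.

```python
def heading_difference(a, b):
    """Smallest absolute angle in degrees between two compass headings."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def select_images(images, target_heading, tolerance=1.0, max_mean_range=50.0):
    """Names of images near the target heading with a small mean slant range.

    Each image is a dict with hypothetical keys 'name', 'heading' (degrees)
    and 'slant_ranges' (a list of ranges in metres).
    """
    selected = []
    for image in images:
        ranges = image["slant_ranges"]
        if not ranges:
            continue  # no slant ranges, so no average to test
        mean_range = sum(ranges) / len(ranges)
        close = heading_difference(image["heading"], target_heading) < tolerance
        if close and mean_range < max_mean_range:
            selected.append(image["name"])
    return selected
```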

“SQL” Example 3: In a user-defined area, overlay on a sea floor map all “West”-looking side scan sonar images of “sandy” sea floor bottom type.
SELECT Overlay(image, map)
FROM sonarImageArchive, seaFloorMapping
WHERE Overlap(Box(image), userDefinedArea)
AND Overlap(Box(map), Box(image))
AND SlantDirection(image) = “West”
AND surfaceType = “sandy”;
Note: userDefinedArea is, say, a string provided by the user, and surfaceType is a column in seaFloorMapping.
UDRs:
SLANTDIRECTION - calculates slant direction
BOX - rectangle enclosing object
OVERLAP - T or F
OVERLAY - overlays one image on another

Gridded Data in Databases
Gridded data occurs in meteorology, oceanography, the life sciences, non-destructive testing, exploration for oil, natural gas, coal & diamonds,… These datasets range from simple, uniformly spaced grid points along a single dimension (e.g., time series) to multidimensional grids containing several types of values (e.g., 4D cubes of meteorological attributes). Grids have typically been stored in simple files and then manipulated by programs that operated on these files. Nowadays there is increasing justification for storing and manipulating gridded data in DBMSs: the principal advantages are their ability to (i) ensure data integrity and consistency, and (ii) provide diverse users with independent and effective query-based access to these data across multiple applications and systems.

Gridded Data in Databases
However, implementing an efficient gridded DBMS can be very challenging, particularly when it involves Binary Large Objects (BLOBs), user-defined abstract datatypes (ADTs) that encapsulate grid data structures and attributes, and user-defined routines (UDRs) with which applications can create, manipulate and access the gridded data stored in these new datatypes. BCS has developed an efficient technology that supports database storage, update, and fast retrieval of gridded data; it uses BLOBs, ADTs, and UDRs. Our first implementation of this technology was a Grid DataBlade for IBM Informix, and then a Grid Extension for PostgreSQL; we are currently developing an analogous Grid Cartridge for use with Oracle.

The BCS Grid DataBlade/Extension
- is designed to handle 1D, 2D, 3D, 4D (and “5D”) grids.
- stores grids using SmartBLOBs and a (user-controlled) tiling scheme that together permit very efficient generation of products (e.g., oblique slices or 1D sticks from 4D grids).
- sometimes provides more than 50-fold increases in speed of data product generation compared to the conventional approach that does not involve tiling or SmartBLOBs.
- can store the data in, and convert it between, hundreds of mapping projections.
- can handle irregularly spaced grids in any/all grid dimensions.
- can handle the presence of multiple vector and/or scalar values.
- provides several interpolation options.
- provides for convenient database loading and extraction of grid files via one form of the commonly used NetCDF format.
- provides C, Java, and SQL application programming interfaces.
- is supplied with full user/programmer documentation.
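Why tiling helps when extracting a 1D stick from a 4D grid can be seen by counting how many stored values must be read for different tile shapes. The sketch below is a back-of-the-envelope model, not the DataBlade's storage layout: it assumes whole tiles are read, and the grid dimensions and tile shapes are made up for illustration.

```python
from math import prod


def tiles_touched(start, stop, tile_shape):
    """Count tiles intersecting the half-open index box [start, stop) on each axis."""
    count = 1
    for lo, hi, size in zip(start, stop, tile_shape):
        first_tile = lo // size
        last_tile = (hi - 1) // size
        count *= last_tile - first_tile + 1
    return count


def values_read(start, stop, tile_shape):
    """Stored values read if every touched tile must be read in full."""
    return tiles_touched(start, stop, tile_shape) * prod(tile_shape)


# A hypothetical grid of 100 times x 20 levels x 180 rows x 360 columns.
# The requested product is a 1D "stick": all 100 times at one grid point.
stick_start = (0, 5, 90, 180)
stick_stop = (100, 6, 91, 181)

one_tile_per_time_step = (1, 20, 180, 360)  # like one file per time step
compact_tiles = (25, 5, 30, 30)             # small blocks spanning several times

print(values_read(stick_start, stick_stop, one_tile_per_time_step))  # 129600000
print(values_read(stick_start, stick_stop, compact_tiles))           # 450000
```

The ratio in this toy case depends entirely on the shapes chosen and says nothing about the measured speed-ups quoted above; it only illustrates why tile shape matters for this kind of product.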

U.S. Navy Solution
[Diagram: a worldwide weather grid is held in an ORDBMS with the BCS Grid DataBlade, whose API was used to develop grid types, functions & indexes; a user query in SQL gets the grid sample of interest.]

U.S. Navy: Tactical Environmental Data Services Humidity- Refractive Effects Air Temperature Aerosols Dust Trafficability Fog Soil Moisture Beach Profile Waves Reefs, Bars, Channels Sediment Transport Shelf / Internal Waves Swell / Wave Refraction Hydrography - Fine Scales Ice Biologics Slope (Sea Floor) Coastal Configuration Tidal Pulse Sensible and Latent Heat Wind Speed / Direction Land Cover Terrain Surf Turbidity Rain Rate Straits Island Flow Wind - Driven Circulation Wrecks

Medical Application Demo

Grid Fusing: Visualized through IDV

Grid Fusing: SQL for this example SELECT GRDFuse( GRDFuseCollect(GRDPriorityGrid(image,1.0)), '((grdspec (translation ) (affine_transformation ) (dim_sizes )) (rules(weight)) )') FROM images i, places_of_interest p WHERE i.imageType = 'aerialPhoto' AND overlap(grdbox(i.image),grdbox(p.loc)) AND p.name = 'New Orleans';

SQL driving the Grid Fusion SELECT GRDFuse( GRDFuseCollect(GRDPriorityGrid(image,1.0)), '((grdspec (translation ) (affine_transformation ) (dim_sizes )) (rules(weight)) )') FROM images i, places_of_interest p WHERE i.imageType = 'aerialPhoto' AND overlap(grdbox(i.image),grdbox(p.loc)) AND p.name = 'New Orleans'; UDR to resample a set of grids into a single grid.

SQL driving the Grid Fusion SELECT GRDFuse( GRDFuseCollect(GRDPriorityGrid(image,1.0)), '((grdspec (translation ) (affine_transformation ) (dim_sizes )) (rules(weight)) )') FROM images i, places_of_interest p WHERE i.imageType = 'aerialPhoto' AND overlap(grdbox(i.image),grdbox(p.loc)) AND p.name = 'New Orleans'; Two UDRs to build a set of transient grids, associating a floating-point value with each of these grids. This floating-point value is later used to establish the relative weight of each grid’s elements in producing the fused grid. We’ve chosen each grid to have equal weight.

SQL driving the Grid Fusion SELECT GRDFuse( GRDFuseCollect(GRDPriorityGrid(image,1.0)), '((grdspec (translation ) (affine_transformation ) (dim_sizes )) (rules(weight)) )') FROM images i, places_of_interest p WHERE i.imageType = 'aerialPhoto' AND overlap(grdbox(i.image),grdbox(p.loc)) AND p.name = 'New Orleans'; Each source grid is resampled at the same locations, using the source images’ spatial reference system, which is a Lat-Lon grid. The fused grid’s horizontal resolution is degrees.

SQL driving the Grid Fusion SELECT GRDFuse( GRDFuseCollect(GRDPriorityGrid(image,1.0)), '((grdspec (translation ) (affine_transformation ) (dim_sizes )) (rules(weight)) )') FROM images i, places_of_interest p WHERE i.imageType = 'aerialPhoto' AND overlap(grdbox(i.image),grdbox(p.loc)) AND p.name = 'New Orleans'; The source of the grids is a table called “images”.

SQL driving the Grid Fusion SELECT GRDFuse( GRDFuseCollect(GRDPriorityGrid(image,1.0)), '((grdspec (translation ) (affine_transformation ) (dim_sizes )) (rules(weight)) )') FROM images i, places_of_interest p WHERE i.imageType = 'aerialPhoto' AND overlap(grdbox(i.image),grdbox(p.loc)) AND p.name = 'New Orleans'; We use metadata stored in another column to pick only those images derived from aerial photographs.

SQL driving the Grid Fusion SELECT GRDFuse( GRDFuseCollect(GRDPriorityGrid(image,1.0)), '((grdspec (translation ) (affine_transformation ) (dim_sizes )) (rules(weight)) )') FROM images i, places_of_interest p WHERE i.imageType = 'aerialPhoto' AND overlap(grdbox(i.image),grdbox(p.loc)) AND p.name = 'New Orleans'; A second table called “places_of_interest” is used to include only source grids that overlap a region called “New Orleans”.
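The weighting step described in the fusion slides can be pictured as a per-cell weighted average over grids that have already been resampled to the same locations. The sketch below is one plausible reading of the "weight" rule, not the documented behaviour of GRDFuse: the function name, the use of None for cells a source grid does not cover, and the choice to average only over covering grids are all assumptions.

```python
def fuse(grids, weights):
    """Weighted per-cell average of same-shaped 2D grids.

    None marks a cell that a source grid does not cover; such cells are
    left out of the average. A cell covered by no grid stays None.
    """
    rows, cols = len(grids[0]), len(grids[0][0])
    fused = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            weight_sum = 0.0
            for grid, weight in zip(grids, weights):
                value = grid[r][c]
                if value is not None:
                    total += weight * value
                    weight_sum += weight
            if weight_sum > 0:
                fused[r][c] = total / weight_sum
    return fused
```

With equal weights, as in the query above where every grid is given 1.0, overlapping cells are simply averaged.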

Barrodale Grid Datablade for IBM Informix select GRDExtract(grid, "((dim_sizes )(dim_names time level row column)(translation )(affine_transformation )(srtext from grdImages where g_keytext = "world2k"; Resample and reproject a 386 MB raster image of the World projectionDemo/ ProjectionApplet.html

Barrodale Grid DataBlade for IBM Informix: Sample a 4D grid along a flight path
[Figure panels: Head Wind, Humidity, Cross Wind]
Ocean Applications?
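Sampling a 4D grid along a path amounts to interpolating the grid at a sequence of (time, level, row, column) positions. The sketch below uses multilinear interpolation on nested Python lists with positions given as fractional indices. It is an illustration only: the slides mention several interpolation options without naming them, and the function names and the demo grid are hypothetical.

```python
def interpolate(grid, point):
    """Multilinear interpolation of a nested-list grid at fractional indices.

    point holds one fractional index per remaining dimension, outermost first.
    Positions outside the grid are clamped to its edge.
    """
    if not point:
        return grid  # all dimensions consumed, so this is a single value
    size = len(grid)
    if size == 1:
        return interpolate(grid[0], point[1:])
    position = min(max(point[0], 0.0), size - 1.0)
    lower = min(int(position), size - 2)
    fraction = position - lower
    below = interpolate(grid[lower], point[1:])
    above = interpolate(grid[lower + 1], point[1:])
    return below + fraction * (above - below)


def sample_path(grid, path):
    """Interpolate the grid at every (time, level, row, column) point of a path."""
    return [interpolate(grid, point) for point in path]


def make_demo_grid(n=3):
    """A small 4D grid whose value encodes its own indices, for checking."""
    return [[[[t + 10 * z + 100 * y + 1000 * x
               for x in range(n)]
              for y in range(n)]
             for z in range(n)]
            for t in range(n)]
```

Because the demo grid is linear in each index, `sample_path(make_demo_grid(), [(0.5, 0.5, 0.5, 0.5)])` returns `[555.5]`, which makes the interpolation easy to verify by hand.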

BCS Grid DataBlade/Extension: Grids of Data
Grids can have 1, 2, 3, or 4 dimensions.
Each grid point can store several variables.
Some grid point values can be NULL.
Grid spacing along axes can be non-uniform.
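These four properties can be pictured with a small in-memory structure: one sorted coordinate list per axis (so spacing may be uneven), one array per variable, and None standing in for NULL. The layout and names below are hypothetical and say nothing about how the DataBlade stores grids internally.

```python
from bisect import bisect_left


def nearest_index(axis, coord):
    """Index of the axis coordinate closest to coord (axis sorted ascending)."""
    i = bisect_left(axis, coord)
    if i == 0:
        return 0
    if i == len(axis):
        return len(axis) - 1
    # Pick whichever neighbour is closer; ties go to the lower index.
    return i if axis[i] - coord < coord - axis[i - 1] else i - 1


def value_at(grid, variable, row_coord, col_coord):
    """Nearest-neighbour lookup of one variable; returns None for NULL cells."""
    r = nearest_index(grid["rows"], row_coord)
    c = nearest_index(grid["cols"], col_coord)
    return grid["variables"][variable][r][c]


# A 2D grid with uneven spacing, two variables and one NULL value.
demo_grid = {
    "rows": [0.0, 1.0, 5.0],
    "cols": [10.0, 11.0, 20.0],
    "variables": {
        "temperature": [[1, 2, 3], [4, None, 6], [7, 8, 9]],
        "salinity": [[30, 31, 32], [33, 34, 35], [36, 37, 38]],
    },
}
```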

BCS Grid DataBlade/Extension: Types of Extraction
Orthogonal, Oblique, Radial

BCS Grid DataBlade/Extension: Orthogonal Extraction

BCS Grid DataBlade/Extension: Oblique Extraction

BCS Grid DataBlade/Extension: Radial Extraction

BCS Grid DataBlade/Extension: Types of Updates
Individual Points, Appending, Replacing Slices

BCS Grid DataBlade/Extension: Updating Points

BCS Grid DataBlade/Extension: Appending a Grid

BCS Grid DataBlade/Extension: Replacing a Slice

BCS Grid DataBlade/Extension: Types of Aggregation
JoinNew, JoinExisting, Union

BCS Grid DataBlade/Extension: JoinNew

BCS Grid DataBlade/Extension: JoinExisting

BCS Grid DataBlade/Extension: Union
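The slides name the three aggregation types without defining them. The names match the aggregation types used for NetCDF datasets in NcML, so the sketch below assumes the same meanings: JoinExisting concatenates grids along a dimension they already share, JoinNew stacks grids along a newly created dimension, and Union merges grids that hold different variables. Whether the DataBlade follows these semantics exactly is an assumption, and the nested-list representation is purely illustrative.

```python
def shape(grid):
    """Dimension sizes of a non-empty nested-list grid, outermost first."""
    dims = []
    while isinstance(grid, list):
        dims.append(len(grid))
        grid = grid[0]
    return tuple(dims)


def join_existing(first, second):
    """Concatenate two grids along their existing outermost dimension."""
    if shape(first)[1:] != shape(second)[1:]:
        raise ValueError("grids must agree on all inner dimensions")
    return first + second


def join_new(grids):
    """Stack same-shaped grids along a new outermost dimension."""
    if len({shape(grid) for grid in grids}) != 1:
        raise ValueError("grids must all have the same shape")
    return list(grids)


def union(first, second):
    """Merge two variable-name -> grid mappings; the first one wins on clashes."""
    merged = dict(first)
    for name, grid in second.items():
        merged.setdefault(name, grid)
    return merged
```

For two 2 x 3 grids, `join_existing` produces a 4 x 3 grid, while `join_new` produces a 2 x 2 x 3 grid.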

How much memory does the server need when extracting a large gridded derived product using the BCS Grid DataBlade (Informix) or the BCS Grid Extension (PostgreSQL)?

BCS Grid DataBlade: The Effect of Tile Size A good choice of tile size allows larger grids to be extracted.

SUMMARY
The BCS Grid DataBlade/Extension is designed for applications where:
1. The data volumes are such that they can’t be kept in memory.
2. The amount of data extracted, in a particular query, is small relative to the amount stored.
3. The data needs some form of resampling.

CONTACT INFORMATION : Dr. Ian Barrodale, President Barrodale Computing Services Ltd. (BCS) P.O. Box 3075 STN CSC Victoria BC V8W 3W2 Canada (250) voice, (250) fax For more information about BCS projects, experience, and capabilities, please visit: