Spatial Database Engine


Spatial Database Engine Keith T. Weber, GISP GIS Director Idaho State University

Today’s Topics What is SDE? Why use SDE? SDE Data Structure How is data stored within SDE? DEMO: Meet ArcSDE Professional GDB Enterprise workflow: Versioning and Replication

What is SDE? A spatial database engine that works on an RDBMS. Helps to serve geospatial data to clients via a network. Is SDE a database? Does SDE store data or just manage data that is stored elsewhere?

Why use SDE? Advantages: Protection against data loss/integrity degradation through versioning Centralized data management Enterprise GIS Geo-spatial data is immediately usable Define enterprise?

Why use SDE? (cont’d) Disadvantages Data management role RDBMS administration Capital expenditure Cost: Return on Investment? Unknown now

To Use SDE…or Not To Use SDE… What will help make this decision? ROI TCO Is this the correct technology for the problem? Let students brainstorm

ArcGIS Data Structures GDB Vector Objects Shape files Coverages Raster Objects Grids Images

The GDB Can store tables (just information), vector feature classes, and raster layers

Layers and Layer Files All GIS Datasets are considered LAYERs in ArcMap. A LAYER FILE is a file that you save in ArcMap to retain customized settings. This file refers to the LAYER (shape file, coverage, grid, or feature class) It displays the data with your saved visualization settings, textual annotation, etc.

Workspaces Arc/Info Info folder Geodata sets (coverages, grids, TINs) Collection of ArcView shape files Geodatabases Hold on a minute…what in the world is a workspace? A workspace is a folder on your workstation where you store your GIS coverages and files. Each workspace needs an Info folder. You can instruct Arc/Info (AI) to create a workspace, or AI will create one for you when you set your workspace to an existing folder. How do you do this? Simple: type &workspace c:\winnt\profiles\…etc [NOTE THE &] If you are going to do all this typing, it certainly makes sense to use AMLs. It will also be helpful to use AMLs to set up some routine configurations each time you run AI. To help you get going, I have created two simple AMLs that you should download from the Server and copy to your Personal profile folder. The personal folder will be your ROOT or HOME WORKSPACE. Your workspace will contain coverages. Coverages differ from themes in that they are a set of files stored in a coverage folder (the name of the coverage) and the workspace's info folder.

Coverages Tic Bnd Arc AAT, PAT Let's explore a coverage. Each coverage will contain a number of files. These are: Tic: The location of registration tics Bnd: The map extent of the coverage…boundary information Arc: Arc, Line work Arx: ArcIndex file, topological data Lab: Labels and label points AAT: Arc Attribute Table PAT: Point or Polygon Attribute Table LOG: Log file, a record of the coverage's history Other topological database files exist as well. In this class, and whenever you work with AI, you will occasionally touch the TIC and BND files directly, but you will use the AAT and PAT files most. The LOG file is also important, mainly as a reference. These files are stored in Info format. To access them you can use the Info module of AI, and we will learn how to use it. You can also convert the files to Dbase files for viewing and editing.

GeoDatabases Personal File-based ArcSDE Personal ArcSDE Professional (or Enterprise)

Personal Geodatabases Uses the MS Access Jet Database engine Note: Do not open/edit these with MS Access Limitations 2GB (Access) Only vector feature classes are actually stored inside the Access database 4 users but only one editor Does not support versioning

File-based Geodatabase fGDB Stores vector and raster layers in the file/folder structure. Limitations Multi-user (max = 10) 1 Editor (no versioning) Max size is 1 TB RDBMS

ArcSDE Personal Uses MS SQL Server Express Limitations 4 GB Supports versioning/replication but only one editor

ArcSDE Professional Geodatabases Uses DB2, Oracle, Informix, SQL Server, etc. No software size limits and unlimited number of users Can accommodate vector and raster data

Given all these differences, there are really many similarities

Geospatial Data Storage (Vector) Geo-spatial data are stored as Feature classes Non-spatial data are stored as stand-alone tables Vector data is handled by DB2’s Spatial Extender. SDE is a broker. This schema shows paths for vector data storage as an example. Effectively, you must know and understand the data is stored as a feature class. It is no longer a coverage or a shape file. Raster data is also accommodated in SDE.

Geo-spatial Data Storage (Raster) Two methods Stand-alone raster data set Mosaic ArcSDE is not the best solution to store raster GIS data for the Enterprise Size considerations Performance issues Raster data is handled by SDE Stand-alone: Each raster grid/image imported becomes its own raster layer in SDE. Embedded: Each raster grid/image imported has most of its data stored separately within the RDBMS tables, some parts are stored collectively sort of like a grouping in ArcMap. The data is displayed in one piece. Data returned to the user is the index value for the raster layer (1 (the first one added to the catalog) through n). Mosaic: All raster grids/images imported are mosaiced together into one piece. The data is stored as one unit and displayed as one unit. Building pyramids and calculating statistics is very important. The number of “stats” records and pyramids records changes too. BRAINSTORM BENEFITS AND ADVANTAGES…METHODS VARY FOR DIFFERENT DATA SETS…

Internal Data Storage Within the DB2 RDBMS All data is stored within table spaces –referred to by Configuration Keyword. A Configuration Keyword points to a set of two table spaces: Attribute table space Coords table space Table spaces are invisible to the user or client.

Loading Vector Data into a GDB PART 1: Stand-alone feature classes Log into SDE as a manager-level user. In Catalog, go to SDE and choose import. Do singles…you will see why (field names)

The Spatial Index Grid Uniform grid of square tiles Like grid reference on a street map Each feature (lakes) referenced by one or more tiles Envelope of feature determines tiles occupied Spatial Index Key records occurrences of features in tiles Empty tiles not stored Reference ID Grid X, Grid Y
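The tiling idea on this slide can be sketched in a few lines of Python. This is only an illustration of the concept, not ArcSDE's actual implementation: the function names, the tile size, and the two lake envelopes are all made up for the example.

```python
import math

def tiles_for_envelope(min_x, min_y, max_x, max_y, tile_size):
    """Return the (grid_x, grid_y) tiles touched by a feature's envelope."""
    first_col = math.floor(min_x / tile_size)
    last_col = math.floor(max_x / tile_size)
    first_row = math.floor(min_y / tile_size)
    last_row = math.floor(max_y / tile_size)
    return [(gx, gy)
            for gx in range(first_col, last_col + 1)
            for gy in range(first_row, last_row + 1)]

def build_index(envelopes, tile_size):
    """Map each occupied tile to the feature IDs in it.

    Tiles that no envelope touches never get a key, which mirrors
    the "empty tiles not stored" point on the slide.
    """
    index = {}
    for feature_id, envelope in envelopes.items():
        for tile in tiles_for_envelope(*envelope, tile_size):
            index.setdefault(tile, []).append(feature_id)
    return index

# Hypothetical lake envelopes: feature ID -> (min_x, min_y, max_x, max_y)
lakes = {1: (120.0, 40.0, 260.0, 90.0), 2: (10.0, 10.0, 30.0, 30.0)}
index = build_index(lakes, tile_size=100.0)
print(index)
```

A spatial query would then look up only the tiles its search window touches and test the features listed there, instead of scanning every feature.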

Loading Vector Data into ArcSDE PART 2: Feature classes within a Feature Data Set First, you need a Feature Data Set What is a Feature Data Set? What is the precision of the source data? What will the future data be like for this feature class? The data storage precision value is determined during import. Rule of thumb: Doubling precision results in 10% more table space usage (HD usage). To change the precision of the data storage, click change settings, then the Spatial reference tab, then XY Domain. Lower the domain (by default ArcGIS will set the precision at its highest value, which is determined by the full spatial extent (min, max X,Y) of the data set to be imported). Also mention the item names import…query import….and keywords

A Feature Data Set is: Required to implement Full Topology! What?!

Full Topology “The spatial relationship among feature classes participating in a topology layer” Must belong to a feature dataset Feature classes share geographic reference system, and spatial domain. More realistic representation of data AKA Shared topology or Advanced Topology.

A Feature Data Set then… Is an organizational tool used to ensure that all feature classes within it use a common: Geospatial reference system Spatial domain

Understanding the Spatial Domain Low-precision GDB Based upon LONG INTEGER (32-bit) What is the domain range of a LONG? High-precision GDB Based upon 64-bit Integer Covers a geographic reference system's “Horizon” Both X and Y coords can be stored in 4B space

Fitting the World into a LONG If we express the X,Y coordinates in the familiar Latitude/Longitude system… By whole degrees, we would use: Latitude -90 -- +90 (180 units) Longitude -180 -- + 180 (360 units) This is only 0.000009% of the 4B space Calc shown is for Longitude
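The percentage on this slide can be checked with a quick calculation. The sketch below uses the exact 32-bit range (2^32 values) instead of the rounded 4B, so it comes out at about 0.0000084% for longitude, in line with the slide's figure. The function and variable names are just for illustration.

```python
LONG_RANGE = 2 ** 32  # number of distinct values a 32-bit integer can hold

def percent_of_long_used(units):
    """Share of the 32-bit range consumed by `units` distinct values."""
    return units / LONG_RANGE * 100

longitude_units = 180 - (-180)  # 360 whole degrees
latitude_units = 90 - (-90)     # 180 whole degrees

lon_pct = percent_of_long_used(longitude_units)
lat_pct = percent_of_long_used(latitude_units)
print(f"Longitude uses {lon_pct:.7f}% of the LONG range")
print(f"Latitude uses {lat_pct:.7f}% of the LONG range")
```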

Problems with this approach Resolution to 1 degree is terrible Wastes the capacity of LONG INTEGER

What if we use Decimal Degrees? Hold on! Decimals cannot be stored in an INTEGER data type Let’s just shift the decimal place to the right by multiplying the coordinate by a scaling factor e.g., 10 preserves one decimal place, 100 preserves two decimal places, etc.
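A minimal sketch of the scaling-factor trick follows. It is a simplification: the helper names and the sample longitude are invented, and a real geodatabase also shifts coordinates by a false origin so the stored integers are never negative.

```python
LONG_RANGE = 2 ** 32  # number of distinct values in a 32-bit integer

def to_stored_integer(degrees, scale):
    """Shift the decimal point right by `scale` and keep the whole number."""
    return round(degrees * scale)

def from_stored_integer(stored, scale):
    """Recover the decimal-degree value from the stored integer."""
    return stored / scale

def fits_in_long(span_degrees, scale):
    """True if a coordinate span, once scaled, still fits in 32 bits."""
    return span_degrees * scale <= LONG_RANGE

# Hypothetical longitude with four decimal places, so scale = 10,000
stored = to_stored_integer(-112.4321, 10_000)
print(stored, from_stored_integer(stored, 10_000))
print(fits_in_long(360, 1_000_000))
```

The larger the scaling factor, the more decimal places survive, but the scaled span must still fit inside the integer type, which is why the factor cannot grow without limit.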

Fitting the World into a LONG (revisited) By using a scaling factor of 1M, the 360 degrees of longitude become 360M integer units, which fits easily into the 4B space (there’s plenty left over!) What is the spatial resolution of 1/1Mth of a degree? Approximately 1/10th of a meter! In Idaho, there is approximately 10,000m per 7.5’ quad (along the X) A 7.5’ quad spans 1/8 of a degree, so that is about 80,000 meters per degree That is 0.08 meter @ 1/1millionth of a degree That then is roughly 1/10th of a meter (8 cm)!

More about the High-Precision GDB Can be pGDB, fGDB, or SDE GDB Uses 64-bit integer to encapsulate the spatial horizon What? 64-bit numbers have a range of 18,446,744,073,709,551,616 That‘s 18 quintillion! http://www.jimloy.com/math/billion.htm

The Spatial Horizon? Essentially, it’s a spatial domain large enough to contain the entire earth at high-precision

Applying this to ArcGIS Rule #1, use the high-precision GDB model whenever possible. Why not always? Long is long paper. Precision may not be the best word…in actuality resolution would have been better.

Hints and Tips Optimize the spatial domain by using high-precision GDB Feature dataset If not, set up your low-precision Feature dataset to Allow for spatial growth Allow for improved instrumentation I would choose a precision of 1000 Make the min/max X,Y’s fit EVENLY around the study area or AOC
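The "fit EVENLY around the study area" tip can be sketched as a small calculation. This assumes a low-precision domain offers 2^31 - 1 integer steps per axis, so the domain width in map units is that count divided by the precision; treat that constant, the function name, and the sample coordinates (made-up metre values) as assumptions for illustration only.

```python
MAX_STORED = 2 ** 31 - 1  # assumed number of integer steps available per axis

def centered_domain(min_val, max_val, precision):
    """Return (domain_min, domain_max) with equal room on both sides.

    `precision` is the number of stored units per map unit,
    e.g. 1000 keeps three decimal places of a metre.
    """
    width = MAX_STORED / precision
    extent = max_val - min_val
    if extent > width:
        raise ValueError("study area does not fit at this precision")
    margin = (width - extent) / 2
    return min_val - margin, max_val + margin

# Hypothetical study area, X from 400,000 m to 500,000 m, precision 1000
min_x, max_x = centered_domain(400_000.0, 500_000.0, precision=1000)
print(min_x, max_x)
```

Centering the domain this way leaves the same amount of spare room on each side, which is what allows for spatial growth in any direction.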

ArcSDE Professional Demo Import a vector data set into ArcSDE

The Future… SDE going away?

Think about it… Object-relational databases have native geospatial support ArcGIS for Server can make geospatial data available to the Enterprise Do we need an ArcSDE middle-ware? ArcGIS Spatial Data Server Spend some time discussing…

Questions…

Geodatabases in an Enterprise Workflow Keith T. Weber, GISP GIS Director, ISU GIS Training and Research Center

Understanding and managing workflow Presentation and Discussion Understanding and managing workflow

Let’s Get Started Adjectives GIS is… Data-driven Powerful Dynamic GIS is many things…many adjectives can be used to describe it. For our workshop today however, there is one property of GIS that we will concentrate on and that is “GIS is Dynamic”

GIS Data Life Cycle Create Data Change Happens! Edition Backup Edit Validate Update Metadata Change Happens! This cycle is not new… it is in fact, old…since the beginning of GIS this is how things were done. Let’s think about a roads layer. Create by digitizing…that is edition number one when it is done (fix the overshoots and dangles, and populate the database). Once we recognize that things have changed and our layer is outdated, we plan for how we will fix this. This is very task oriented… so we backup our data (copy it) and then proceed with editing it. Hopefully we validate our edits and update our metadata as well. Now we have a new edition of the roads layer and all is right with the world. There is nothing wrong with this cycle, IF the rate at which “change happens” and the demand/expectation for a new edition is not too frequent. For instance… 1 revolution per year or per quarter is not too bad. But what if the demand and expectation for a new edition…and the need for a new edition requires 1 revolution per week or per day?

The Bottleneck Distributing the new edition

The Solution Networks and the Internet

A New Problem is Born “MY” version

GIS Grows Up! RDBMS Keep the benefit of network connectivity Eliminate the problem of “MY” version Eliminate the bottleneck And, change the cycle of events

GIS Data Life Cycle Create Data Change Happens! Edition Version and/or Replicate Edit Validate: Synchronize or reconcile and post Update Metadata Change Happens! Two things have changed: Wording/terminology: We don’t backup today, instead we version and replicate. We don’t just check things over as our validation, today we also need to synchronize or reconcile and post. Colors: this is important as it symbolizes a distributed workflow within a team environment… a team that is part of the enterprise. Blue = manager, orange = technician/specialist, and green = metadata librarian.

Backup vs. Versioning Backups and archiving are still critical steps for the enterprise. BUT, not part of the GIS Life Cycle any longer

In the Beginning… Backups were made in case we really messed up Edits were made to the original Copies of the “clean” new edition were distributed

Today… The original [parent] is versioned [a child is born] Edits are made to the child, not the parent “Clean” edits are copied [synchronized or posted] to the parent.

Benefits Of This Approach Brainstorm!!! Minimize downtime Processes completed within the RDBMS

The Role of Backups Data retention and deletion Legal requirements

GIS Data Life Cycle…Today Create Data Change Happens! Edition Version and/or Replicate Edit Validate: Synchronize or reconcile and post Update Metadata This process is enterprise enabled. The manager, technician, and librarian do not have to be in the same office. Indeed, this cycle is “outsourcing-ready”…

Questions/Discussion?

Replication and versioning Presentation and Discussion Replication and versioning

What is Replication? Duplication Copying Mirroring Synonyms

True Replication… Does not need ArcGIS Every RDBMS can be replicated natively However, using ArcGIS to perform the replication Is easy Supports GIS workflows better

Why Replicate? Enable disconnected editing for: Performance/load balancing Network load reduction Publishing data to subscribers

Network Load Reduction The network is a primary bottleneck in a High-Capacity Enterprise Note: capacity refers to how many concurrent users a system can support without loss of performance

How Do I Replicate? We will cover this with the hands-on exercise As an overview… Version the database Replicate the database Edit/update Synchronize changes with the parent

So Replication is Versioning No… but replication uses a versioned database

What is Versioning? One database Parent edition (tables) remains live/usable Child edition(s) simultaneously edited Roll-up is seamless

Versioning: Principal Concepts Edits are stored in “Supporting Tables” Geographic changes (linework) are stored in Supporting Vector Tables Attribute changes are stored in Supporting Delta Tables.

Delta Tables A = Add (insert) D = Delete U = Update (delete existing then add)
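The A/D/U idea can be illustrated with a toy model. This is a conceptual sketch only, not the real ArcSDE table layout: the dictionaries, the function names, and the road names are all invented for the example.

```python
def version_view(base, adds, deletes):
    """Rows seen by a version: base rows not deleted, plus added rows."""
    view = {oid: row for oid, row in base.items() if oid not in deletes}
    view.update(adds)
    return view

def record_update(oid, new_row, adds, deletes):
    """U = Update: logged as a delete of the existing row plus an add."""
    deletes.add(oid)
    adds[oid] = new_row

# Hypothetical base table of roads, keyed by object ID
base = {1: "Main St", 2: "Oak Ave"}
adds, deletes = {}, set()

adds[3] = "Elm Rd"                                # A: insert a new road
deletes.add(2)                                    # D: delete an existing road
record_update(1, "Main Street", adds, deletes)    # U: rename a road

view = version_view(base, adds, deletes)
print(view)
print(base)
```

Note that the base table is never touched while editing; the version is always the base table combined with its deltas.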

A Tree is Formed As versions are created and changes are made, a tree grows Q: What kind of tree? A: A State Tree

Sort of an Upside-down Tree

The State Tree Tree Trunk Branches Default: state 0 Arthur’s Court sub-division [Another] sub-division Branches
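One way to picture the state tree is as a set of nodes that each point back to their parent, with DEFAULT at state 0 as the trunk. The class and the state numbers other than 0 are hypothetical; this is a sketch of the concept rather than how ArcSDE stores its states.

```python
class State:
    """One node in the state tree; every state remembers its parent."""

    def __init__(self, state_id, parent=None):
        self.state_id = state_id
        self.parent = parent

    def lineage(self):
        """State IDs from this state back down to the trunk."""
        ids, node = [], self
        while node is not None:
            ids.append(node.state_id)
            node = node.parent
        return ids

default = State(0)                                   # the trunk
arthurs_court = State(1, parent=default)             # first branch
arthurs_court_edit = State(2, parent=arthurs_court)  # a further edit on that branch
other_subdivision = State(3, parent=default)         # a second branch

print(arthurs_court_edit.lineage())
print(other_subdivision.lineage())
```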

Multiple Versions Multiple versions are allowed Versions can be based upon location (north edits, south edits), projects (sub-divisions), or other logic decided upon by the GIS Manager. Batch reconcile and post are supported

The Day of Reconciliation Arthur’s Court sub-division edits have been completed Time to reconcile This process looks for conflicts Once all conflicts have been resolved… Reconciliation is complete

Post To roll-up the edits back to the “trunk of the state tree” we Post
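Reconcile and post can be sketched as two steps: look for rows edited in both the child and the parent, then copy the child's edits to the parent once every conflict has a resolution. This is a simplified illustration (edits are whole-row replacements, deletes are ignored), and all names and sample rows are hypothetical.

```python
def find_conflicts(child_edits, parent_edits):
    """Object IDs edited in both the child and the parent since branching."""
    return sorted(set(child_edits) & set(parent_edits))

def reconcile_and_post(parent, child_edits, parent_edits, resolutions=None):
    """Return the parent rows after posting the child's edits.

    `resolutions` maps each conflicting object ID to the row chosen
    by the editor. Posting is refused until every conflict is resolved.
    """
    resolutions = resolutions or {}
    unresolved = [oid for oid in find_conflicts(child_edits, parent_edits)
                  if oid not in resolutions]
    if unresolved:
        raise ValueError(f"unresolved conflicts: {unresolved}")
    posted = dict(parent)
    posted.update(child_edits)   # roll the child's edits up to the parent
    posted.update(resolutions)   # conflict decisions take precedence
    return posted

parent = {1: "Main St", 2: "Oak Avenue"}   # parent rows, object 2 already edited
parent_edits = {2}
child_edits = {2: "Oak Ave.", 3: "Arthur's Court"}

print(find_conflicts(child_edits, parent_edits))
posted = reconcile_and_post(parent, child_edits, parent_edits,
                            resolutions={2: "Oak Avenue"})
print(posted)
```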

Considerations Performance can degrade with active databases Workflow itself can generate unnecessary versions Delta tables will become large over time DBMS statistics may need to be refreshed or reviewed by the DB Admin

The Cure For many of these ArcGIS-centric performance issues is compressing the database Moves common rows from delta tables into base tables Reduces depth of the state tree by removing states no longer needed
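As a rough picture of what compression does, the toy function below moves delta rows recorded at states shared by every version into the base table and leaves the rest in the delta tables. It assumes state IDs increase along a lineage and that the caller already knows which states are common; the tuple layout and names are invented, and the real compress operation is considerably more involved.

```python
def compress(base, adds, deletes, common_states):
    """Fold delta rows from common states into the base table.

    adds:    list of (state_id, object_id, row)
    deletes: list of (state_id, object_id)
    Returns (new_base, remaining_adds, remaining_deletes).
    """
    new_base = dict(base)
    for state in sorted(common_states):      # replay states oldest first
        for state_id, oid in deletes:
            if state_id == state:
                new_base.pop(oid, None)
        for state_id, oid, row in adds:
            if state_id == state:
                new_base[oid] = row
    remaining_adds = [a for a in adds if a[0] not in common_states]
    remaining_deletes = [d for d in deletes if d[0] not in common_states]
    return new_base, remaining_adds, remaining_deletes

base = {1: "Main St"}
adds = [(1, 2, "Oak Ave"), (2, 1, "Main Street"), (5, 3, "Elm Rd")]
deletes = [(2, 1)]   # state 2 updated object 1: delete plus add

new_base, adds_left, deletes_left = compress(base, adds, deletes,
                                             common_states={1, 2})
print(new_base)
print(adds_left, deletes_left)
```

After the call, only the edit from state 5 (still private to one version) remains in the delta tables, which is why compressing keeps them from growing without bound.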

Compression Example Active editing sessions are shown in yellow Versions with no deltas since last reconcile/post are shown in hollow GIS Manager compresses, says, I do not need these versions any longer. They are eliminated.

Questions/Discussion?

Hands-On Exercise Practice both replication and versioning

Your Assignment Complete the exercise handouts Connecting to and using SDE on DB2 Practice both replication and versioning Read the PDFs in the SDE exercise folder Visit the URL link for Spatial Data Server and explore this topic

Key Concepts SDE is an engine layer residing between a spatially-enabled RDBMS and the GIS desktop. SDE enables Enterprise GIS SDE reduces data management responsibilities. Understand Enterprise workflow