DAT319 - Building Location-Aware Applications in SQL Server 2008: Introducing the Spatial Data Type
  Michael Rys
  Principal Program Manager
  SQL Server Engine, Microsoft
Session Prerequisites
  General knowledge of SQL Server
  Interest in understanding how to use spatial information in your application
Demo A Teaser
Session Objectives and Agenda
  What is spatial?
    What kinds of spatial data are there?
    Why might you care?
    How can you use it?
  What’s coming in SQL Server 2008?
    Spatial types and operations
    Indexing
  Q&A
Spatial is about mapping...
  Many applications make very direct use of mapping.
  The map may very well be the primary output of these applications.
  Examples:
    Consumer mapping products (Virtual Earth, etc.)
    Cadastral mapping
    Utility (electrical / water / gas) grid layouts
    Business geographics
Spatial is about more than mapping...
  Many applications may make use of spatial data, even if they do not explicitly make maps.
  Examples:
    Send warehouse pickers on efficient runs
    Predict bus arrival times
    Applying for building variances
    Your favorite LOB app here
What is Spatial Data?
  Vector
    Points
    LineStrings
    Polygons (Areas, Regions)
  Raster
    Satellite Imagery
    Digitized Aerial Photos
Spatial Data in SQL Server 2008
  We’re providing vector support
  We’re targeting geospatial...
    Spatial data which is referenced to a location on the Earth
    Typically uses spherical coordinates or projected planar coordinates
    I’ll come back to this distinction in a moment
  ...but, there is no restriction that the data is actually geospatial
  Only 2D for now.
Sample Query
  Which roads intersect Microsoft’s main campus?
    SELECT * FROM roads WHERE
It’s a big problem…
It can be quite complex, too...
Models: Flat Earth (planar) and Round Earth (geodetic)
State Plane Coordinate System
The right kind of data...
  Where?
    Lat: Lon: (NAD 83)
    - or -
    E: ft N: ft (WA N)
  source:
  SQL lives here.
...need the right treatment.
  E: ft N: ft
  E: ft N: ft
  Naive planar length: (nonsense!)
  Correct geodetic length: km
  Correct planar length: feet ( km)
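To make the "naive planar length is nonsense" point concrete, here is a minimal Python sketch. It is not the computation SQL Server performs: it assumes a spherical Earth with a mean radius of 6371 km and uses the haversine formula, and the coordinates are hypothetical values chosen for illustration, not the ones from the slide.

```python
import math

# Assumption: spherical Earth with mean radius; real geodetic code uses an ellipsoid.
EARTH_RADIUS_KM = 6371.0

def naive_planar_length(lat1, lon1, lat2, lon2):
    """Treat latitude/longitude degrees as if they were planar x/y units.

    The result is in "degrees", which is not a usable distance.
    """
    return math.hypot(lat2 - lat1, lon2 - lon1)

def geodetic_length_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres via the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

if __name__ == "__main__":
    # Hypothetical segments: one degree east-west and one degree north-south,
    # both starting at latitude 47.
    east_west = (47.0, -122.0, 47.0, -121.0)
    north_south = (47.0, -122.0, 48.0, -122.0)
    for name, seg in (("east-west", east_west), ("north-south", north_south)):
        print(name, naive_planar_length(*seg), round(geodetic_length_km(*seg), 1))
```

The naive calculation reports both segments as length 1.0, while the geodetic calculation gives roughly 76 km for the east-west segment and roughly 111 km for the north-south one, because a degree of longitude shrinks away from the equator.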
Planar and Geodetic Cover Different Scenarios
  Planar (flat-earth)
    Supports legacy and legal mapping requirements: surveyors and the specialist GIS crowd
    Interior spaces (building layouts, etc.)
    Computationally simpler
    Conceptually more difficult for geospatial
  Geodetic (round-earth)
    Supports existing long-range mapping requirements: military, shipping, etc.
    Supports new local applications
    Computationally more complex
    Conceptually simpler for geospatial
SQL Server 2008 Spatial Support
  2 Spatial Types
    Geodetic Type: Geography
    Planar Type: Geometry
    Implemented as large CLR user-defined type (UDT)
  Spatial Operations as methods
  Indexing
  [Timeline figure: July CTP, November CTP; items shown: Large (>8000 byte) CLR UDTs, Geodetic Type: Geography, Planar Type: Geometry, Indexing]
Geodetic Type
  New type: GEOGRAPHY
  GEOGRAPHY can store instances of various types
    Points
    Line strings
    Polygons
    Collections of the above
  Methods for computing
    Spatial relationships: intersects, disjoint, etc.
    Spatial constructions: intersection, union, etc.
    Metric functions: distance, area
Geodetic Type
  Exposed as a “system CLR type”
    Based on CLR code and built in to SQL Server 2008
    Assembly available for managed access
  Transport formats:
    Well-known text and binary formats (WKT and WKB)
    GML XML format
  Most commonly available user data is geodetic
    Anything expressed as latitude/longitude
  This is the type we expect most people to be interested in
Example Code
  Create an instance:
    geography = geography::Parse('POINT( )')
  Create a table:
    create table T(id int, region geography)
  Select some data:
    select * from T where = 1
Planar Type
  Second type: GEOMETRY
  Very similar interface to geography
    Some semantics differ
  Follows the Open Geospatial Consortium (OGC) Simple Features for SQL specification
  Single type implementation
    Same type represents points, lines, polygons
Demo Under the Covers
Spatial Indexing Basics
  In general, split predicates in two
    Primary filter finds all candidates, possibly with false positives (but never false negatives)
    Secondary filter removes false positives
  The index provides our primary filter
  Original predicate is our secondary filter
  Some tweaks to this scheme
    Sometimes possible to skip secondary filter
  [Figure: objects A, B, C, D, E; the primary filter (index lookup) keeps A, B, D; the secondary filter (original predicate) keeps A, B]
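The two-phase scheme can be sketched in a few lines of Python. This is an illustration only, not SQL Server's implementation: it assumes circles as the spatial objects, uses a bounding-box overlap test as a stand-in for the index lookup, and the names `Circle`, `bboxes_overlap`, `circles_intersect`, and `two_phase_query` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Circle:
    name: str
    cx: float
    cy: float
    r: float

    def bbox(self):
        """Axis-aligned bounding box as (xmin, ymin, xmax, ymax)."""
        return (self.cx - self.r, self.cy - self.r, self.cx + self.r, self.cy + self.r)

def bboxes_overlap(a, b):
    """Cheap test: can produce false positives, never false negatives."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 <= bx1 and bx0 <= ax1 and ay0 <= by1 and by0 <= ay1

def circles_intersect(a, b):
    """Exact test: the original predicate."""
    return (a.cx - b.cx) ** 2 + (a.cy - b.cy) ** 2 <= (a.r + b.r) ** 2

def two_phase_query(objects, query):
    # Primary filter: cheap approximation selects the candidates.
    candidates = [o for o in objects if bboxes_overlap(o.bbox(), query.bbox())]
    # Secondary filter: exact predicate removes the false positives.
    matches = [o for o in candidates if circles_intersect(o, query)]
    return candidates, matches

if __name__ == "__main__":
    q = Circle("Q", 0.0, 0.0, 1.0)
    objs = [
        Circle("A", 1.5, 0.0, 1.0),   # really intersects Q
        Circle("C", 5.0, 5.0, 1.0),   # far away, rejected by the primary filter
        Circle("D", 1.8, 1.8, 1.0),   # boxes overlap, circles do not: false positive
    ]
    candidates, matches = two_phase_query(objs, q)
    print([o.name for o in candidates])  # ['A', 'D']
    print([o.name for o in matches])     # ['A']
```

Because two intersecting circles always have overlapping bounding boxes, the cheap test can only err on the side of extra candidates, which is the property the primary filter needs.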
The SQL Server Problem
  SQL Server has B-Trees
  Spatial indexing is usually done through other structures
    Quad tree, R-Tree
  Challenge: How do we repurpose the B-Tree to handle spatial queries?
    Add a level of indirection!
Mapping to the B-Tree
  B-Trees handle linearly ordered sets well
  We need to somehow linearly order 2-d space
    Either the plane or the globe
  We want a locality-preserving mapping from the original space to the line
    I.e., close objects should be close in the index
    Can’t be done, but we can approximate it
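One simple way to linearly order 2-d grid cells is Z-order (Morton) numbering, which interleaves the bits of the two cell coordinates. The Python sketch below is an illustration of the general idea and of why locality can only be approximated; it is not the numbering SQL Server uses, and the function name `z_order` and the 16-bit default are assumptions.

```python
def z_order(x, y, bits=16):
    """Interleave the low `bits` bits of x and y into one integer key.

    Bit i of x goes to bit 2*i of the key, bit i of y goes to bit 2*i + 1.
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key

if __name__ == "__main__":
    # The four cells of a 2x2 block get consecutive keys...
    print([z_order(x, y) for (x, y) in [(0, 0), (1, 0), (0, 1), (1, 1)]])  # [0, 1, 2, 3]
    # ...but neighbouring cells across a block boundary can be far apart.
    print(z_order(1, 0), z_order(2, 0))  # 1 4
    print(z_order(3, 3), z_order(4, 4))  # 15 48
```

Keys like these can be stored in an ordinary B-Tree. Nearby cells usually, but not always, receive nearby keys, which matches the slide's point that a perfectly locality-preserving mapping does not exist.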
Simplified Index Example
  Indexing Phase
    1. Overlay a grid over the spatial data space
    2. Identify grids for spatial object to store in index
  Primary Filter
    3. Identify grids for query object(s)
    4. Intersecting grids identifies candidates
  Secondary Filter
    5. Apply actual CLR method on candidates to find matches
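The five steps can be sketched with a single-level grid in Python. This is a toy model under stated assumptions: objects are axis-aligned rectangles given as (xmin, ymin, xmax, ymax), the exact predicate is rectangle intersection (standing in for the CLR method), and the names `cells_for`, `build_index`, and `search` are hypothetical.

```python
from collections import defaultdict

def cells_for(rect, cell_size):
    """Steps 1-3: the set of grid cells (col, row) that a rectangle touches."""
    x0, y0, x1, y1 = rect
    c0, r0 = int(x0 // cell_size), int(y0 // cell_size)
    c1, r1 = int(x1 // cell_size), int(y1 // cell_size)
    return {(c, r) for c in range(c0, c1 + 1) for r in range(r0, r1 + 1)}

def build_index(objects, cell_size):
    """Indexing phase: map each grid cell to the ids of objects touching it."""
    index = defaultdict(set)
    for obj_id, rect in objects.items():
        for cell in cells_for(rect, cell_size):
            index[cell].add(obj_id)
    return index

def rects_intersect(a, b):
    """Exact predicate used as the secondary filter in this sketch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def search(objects, index, query_rect, cell_size):
    # Primary filter (step 4): objects sharing at least one cell with the query.
    candidates = set()
    for cell in cells_for(query_rect, cell_size):
        candidates |= index.get(cell, set())
    # Secondary filter (step 5): exact test on the candidates only.
    matches = {i for i in candidates if rects_intersect(objects[i], query_rect)}
    return candidates, matches

if __name__ == "__main__":
    objects = {1: (1, 1, 3, 3), 2: (7, 7, 9, 9), 3: (25, 25, 28, 28)}
    index = build_index(objects, cell_size=10)
    print(search(objects, index, (0, 0, 4, 4), cell_size=10))  # ({1, 2}, {1})
```

Object 2 shares cell (0, 0) with the query but does not intersect it, so it appears as a candidate and is removed by the secondary filter.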
Implementation of the Index
  Persist a table-valued function
  Internally rewrite queries to use the table
  CREATE SPATIAL INDEX sixd ON T(geography)
  Base Table T
    id  geometry
    1   g1
    2   g2
    3   g3
  Internal Table for sixd
    id  cell_id
SQL Server 2008 Indexing Story
  Multi-Level Grid
    Much more flexible than a simple grid
    Hilbert numbering instead of Z-numbering
  Grid index features
    4 levels
    Customizable grid subdivisions
    Customizable maximum number of cells per object
  Planar: one grid
    Requires bounding box
  Geodetic: two top-level grid projections of sphere
    No bounding box
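For readers unfamiliar with Hilbert numbering, the following Python sketch converts a cell coordinate on an n-by-n grid (n a power of two) to its position along a Hilbert curve, using the widely published iterative formulation. It illustrates the curve itself only; how SQL Server lays out and numbers its grid levels is not described here, and the function name `hilbert_index` is an assumption.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along a Hilbert curve covering an n x n grid.

    n must be a power of two, with 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d

if __name__ == "__main__":
    n = 4
    order = sorted(((x, y) for x in range(n) for y in range(n)),
                   key=lambda c: hilbert_index(n, *c))
    print(order[:4])  # [(0, 0), (1, 0), (1, 1), (0, 1)]
```

Unlike Z-numbering, consecutive Hilbert positions always belong to edge-adjacent cells, so a range of keys never jumps across the grid, which is one plausible reason to prefer it for a B-Tree-backed spatial index.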
Multi-Level Grid
Index Syntax
  Create index example:
    CREATE SPATIAL INDEX sixd
      ON spatial_table(geom_column)
      WITH (
        BOUNDING_BOX = (0, 0, 500, 500),
        GRIDS = (LOW, LOW, MEDIUM, HIGH),
        CELLS_PER_OBJECT = 20)
  Use ALTER and DROP INDEX for maintenance.
Demo Indexing and Performance
Index Support
  New catalog views, DDL Events
  DBCC Checks
  File groups/Partitioning
    Aligned to base table
    Separate file group
  Full rebuild only
  Can be hinted
  Not supported:
    Online rebuild
    Parallel creation
    Database Tuning advisor
What we aren’t doing
  Raster data
  3D
  Topology: Points make up LineStrings, LineStrings make up Polygons
  Network models
    Distance between cities along the road network
Related Content
  TechEd Presentations: SQL Server 2008: Beyond Relational
  Whitepapers: http:// ial.mspx
  Forum: http://forums.microsoft.com/MSDN/ShowForum.aspx?ForumID=1629&SiteID=1
  Weblogs:
Resources
  Technical Communities, Webcasts, Blogs, Chats & User Groups
  Microsoft Learning and Certification
  Microsoft Developer Network (MSDN) & TechNet
  Trial Software and Virtual Labs
  New, as a pilot for 2007, the Breakout sessions will be available post event, in the TechEd Video Library, via the My Event page of the website.
  MSDN Library, Knowledge Base, Forums, MSDN Magazine, User Groups, Newsgroups, E-learning, Product Evaluations, Videos, Webcasts, V-labs, Blogs, MVPs, Certification, Chats
© 2007 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.