Spatial (or N-Dimensional) Search in a Relational World Jim Gray, Microsoft Alex Szalay, Johns Hopkins U.

Slides:



Advertisements
Similar presentations
Data Challenges I'm Struggling With Jim Gray, Microsoft Research 1.Sneakernet is probably the best way to moving WAN data at 1GBps File transfer efforts.
Advertisements

Spatial (or N-Dimensional) Search in a Relational World Jim Gray.
1 There Goes the Neighborhood! Spatial (or N-Dimensional) Search in a Relational World Jim Gray, Microsoft Alex Szalay, Johns Hopkins U.
1 Distributed Computing Economics Slides at: Grayhttp://research.microsoft.com/~gray/talks Microsoft Research.
Dr. Alexandra I. Cristea CS 252: Fundamentals of Relational Databases: SQL5.
SQL: The Query Language Part 2
What is a Database By: Cristian Dubon.
Relational Algebra, Join and QBE Yong Choi School of Business CSUB, Bakersfield.
Advanced Topics in Algorithms and Data Structures Lecture 7.2, page 1 Merging two upper hulls Suppose, UH ( S 2 ) has s points given in an array according.
Optimized Stencil Shadow Volumes
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Relational Algebra Chapter 4, Part A Modified by Donghui Zhang.
INFS614, Fall 08 1 Relational Algebra Lecture 4. INFS614, Fall 08 2 Relational Query Languages v Query languages: Allow manipulation and retrieval of.
Computational Geometry for the Tablet PC
CS447/ Realistic Rendering -- Solids Modeling -- Introduction to 2D and 3D Computer Graphics.
20 Spatial Queries for an Astronomer's Bench (mark) María Nieto-Santisteban 1 Tobias Scholl 2 Alexander Szalay 1 Alfons Kemper 2 1. The Johns Hopkins University,
2003 by Jim X. Chen: Introduction to Modeling Jim X. Chen George Mason University.
Efficient Distance Computation between Non-Convex Objects by Sean Quinlan presented by Teresa Miller CS 326 – Motion Planning Class.
Apex Point Map for Constant-Time Bounding Plane Approximation Samuli Laine Tero Karras NVIDIA.
Rutgers University Relational Algebra 198:541 Rutgers University.
Relational Algebra Chapter 4 - part I. 2 Relational Query Languages  Query languages: Allow manipulation and retrieval of data from a database.  Relational.
Concepts of Database Management Sixth Edition
Structured Query Language (SQL) A2 Teacher Up skilling LECTURE 2.
Database System Concepts, 6 th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 3: Introduction.
1 Relational Algebra and Calculus Chapter 4. 2 Relational Query Languages  Query languages: Allow manipulation and retrieval of data from a database.
László Dobos, Tamás Budavári, Alex Szalay, István Csabai Eötvös University / JHU Aug , 2008.IDIES Inaugural Symposium, Baltimore1.
Data Access Patterns Some of the problems with data access from OO programs: 1.Data source and OO program use different data modelling concepts 2.Decoupling.
HAP 709 – Healthcare Databases SQL Data Manipulation Language (DML) Updated Fall, 2009.
Chapter 6 SQL: Data Manipulation (Advanced Commands) Pearson Education © 2009.
CSE314 Database Systems The Relational Algebra and Relational Calculus Doç. Dr. Mehmet Göktürk src: Elmasri & Navanthe 6E Pearson Ed Slide Set.
1 Relational Algebra. 2 Relational Query Languages v Query languages: Allow manipulation and retrieval of data from a database. v Relational model supports.
 Agenda 2/20/13 o Review quiz, answer questions o Review database design exercises from 2/13 o Create relationships through “Lookup tables” o Discuss.
G. Fekete, JHU Efficient search indices for geospatial data in a relational database Gyorgy (George) Fekete Dept. Physics and Astronomy Johns Hopkins University.
CS146 References: ORACLE 9i PROGRAMMING A Primer Rajshekhar Sunderraman
1 Relational Algebra and Calculas Chapter 4, Part A.
1 Chapter 10 Joins and Subqueries. 2 Joins & Subqueries Joins – Methods to combine data from multiple tables – Optimizer information can be limited based.
1 Relational Algebra Chapter 4, Sections 4.1 – 4.2.
3D Object Representations
CSCD34-Data Management Systems - A. Vaisman1 Relational Algebra.
IST 210 The Relational Language Todd S. Bacastow January 2004.
Database Management Systems, R. Ramakrishnan1 Relational Algebra Module 3, Lecture 1.
 CS 405G: Introduction to Database Systems Lecture 6: Relational Algebra Instructor: Chen Qian.
SqlExam1Review.ppt EXAM - 1. SQL stands for -- Structured Query Language Putting a manual database on a computer ensures? Data is more current Data is.
Recent spatial work by Jim Gray and Alex Szalay Bob Mann.
January 23, 2016María Nieto-Santisteban – AISRP 2003 / Pittsburgh1 High-Speed Access for an NVO Data Grid Node María A. Nieto-Santisteban, Aniruddha R.
Lecture 3 With every passing hour our solar system comes forty-three thousand miles closer to globular cluster 13 in the constellation Hercules, and still.
Thinking in Sets and SQL Query Logical Processing.
 MySQL  DDL ◦ Create ◦ Alter  DML ◦ Insert ◦ Select ◦ Update ◦ Delete  DDL(again) ◦ Drop ◦ Truncate.
Relational Algebra COMP3211 Advanced Databases Nicholas Gibbins
Spatial Searches in the ODM. slide 2 Common Spatial Questions Points in region queries 1.Find all objects in this region 2.Find all “good” objects (not.
Select Complex Queries Database Management Fundamentals LESSON 3.1b.
Catalogs contain hundreds of millions of objects
More SQL: Complex Queries,
Module 2: Intro to Relational Model
Prepared by : Moshira M. Ali CS490 Coordinator Arab Open University
Relational Algebra Chapter 4 1.
3D Object Representation
Sky Query: A distributed query engine for astronomy
Chapter 2: Intro to Relational Model
Query Processing in Databases Dr. M. Gavrilova
Relational Algebra Chapter 4 1.
More SQL: Complex Queries, Triggers, Views, and Schema Modification
The Relational Model Textbook /7/2018.
Chapter 2: Intro to Relational Model
Chapter 2: Intro to Relational Model
Example of a Relation attributes (or columns) tuples (or rows)
Chapter 2: Intro to Relational Model
Efficient Catalog Matching with Dropout Detection
3D Object Representation
LSST, the Spatial Cross-Match Challenge
Introduction to SQL Server and the Structure Query Language
Presentation transcript:

Spatial (or N-Dimensional) Search in a Relational World Jim Gray, Microsoft Alex Szalay, Johns Hopkins U.

Equations Define Subspaces For (x,y) above the line ax+by > c Reverse the space by -ax + -by > -c Intersect 3 half-spaces: a 1 x + b 1 y > c 1 a 2 x + b 2 y > c 2 a 3 x + b 3 y > c 3 x y x=c/a y=c/b ax + by = c x y

Domain is Union of Convex Hulls Simple volumes are unions of convex hulls. Higher order curves also work Complex volumes have holes and their holes have holes. (that is harder). Not a convex hull +

Now in Relational Terms create table HalfSpace ( domainID int not null -- domain name foreign key references Domain(domainID), convexID int not null,-- grouping a set of ½ spaces halfSpaceID int identity(),-- a particular ½ space x float not null, -- the (a,b,..) parameters y float not null, -- defining the ½ space z float not null, cfloat not null, -- the constant (c above) primary key (domainID, convexID, halfSpaceID) (x,y,z) inside a convex if it is inside all lines of the convex (x,y,z) inside a convex if it is NOT OUTSIDE ANY line of the convex select convexID-- return the convex hulls from HalfSpace-- from the constraints * x * y * z < l -- point outside the line? group by all convexID-- consider all the lines of a convexID having count(*) = 0 -- count outside == 0

The Algebra is Simple = spDomainNew = spDomainNewConvex = spDomainNewConvexConstraint = select * from float) Once constructed they can be manipulated with the Boolean = spDomainOr = spDomainAnd = spDomainNot varchar(8000))

What! No Bounding Box? Bounding box limits search. A subset of the convex hulls. If query runs at 3M halfspace/sec then no need for bounding box, unless you have more than 10,000 lines. But, if you have a lot of half-spaces then bounding box is good.

A Different Problem Table-valued function find points near a point –Select * from fGetNearbyEq(ra,dec,r) Use Hierarchical Triangular Mesh –Space filling curve, bounding triangles… –Standard approach 13 ms/call… So 70 objects/second. Too slow, so precompute neighbors: Materialized view. At 70 objects/sec it takes 6 months to compute a billion objects.

Zone Based Spatial Join Divide space into zones Key points by Zone, offset (on the sphere this need wrap-around margin.) Point search look in a few zones at a limited offset: ra ± r a bounding box that has 1-π/4 false positives All inside the relational engine Avoids impedance mismatch Can batch all-all comparisons 33x faster and parallel 6 days, not 6 months! r ra-zoneMax (r 2 +(ra-zoneMax) 2 ) cos(radians(zoneMax)) zoneMax x Ra ± x

In SQL: points near point select o1.objID -- find objects from zone o1 -- in the zoned table where o1.zoneID between -- where zone # and-- overlaps the circle and o1.ra quick filter on ra and o1.dec between and -- quick filter on dec and ( (sqrt( -- careful filter on distance Eliminates the ~ 21% = 1-π/4 False positives Bounding box

Do the following in {-1, 0, 1} example ignores some spherical geometry details in full paper insert neighbors-- insert one zone's neighbors select o1.objID as objID, -- object pairs o2.objID as NeighborObjID,.. other fields elided from zone o1 join zone o2 -- force a nested loop on = o2.zoneID -- using zone number and ra and o2.ra between o1.ra and o1.ra -- elided where -- elided margin logic, see paper. and o2.dec between and -- quick filter on dec and sqrt(power(o1.x-o2.x,2)+power(o1.y-o2.y,2)+power(o1.z-o2.z,2)) -- careful filter on distance

Summary SQL is a set oriented language You can express constraints as rows Then You –Can evaluate LOTS of predicates per second –Can do set algebra on the predicates. Benefits from SQL parallelism SQL == Prolog?

References Representing Polygon Areas and Testing Point-in-Polygon Containment in a Relational Database A Purely Relational Way of Computing Neighbors on a Sphere,