Spatial Access Methods

Slides:



Advertisements
Similar presentations
Multidimensional Data Structures
Advertisements

Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation.
Spatial and Temporal Data Mining V. Megalooikonomou Spatial Access Methods (SAMs) II (some slides are based on notes by C. Faloutsos)
Spatial Data Structures Hanan Samet Computer Science Department
CMU SCS : Multimedia Databases and Data Mining Lecture#5: Multi-key and Spatial Access Methods - II C. Faloutsos.
Comp 122, Spring 2004 Binary Search Trees. btrees - 2 Comp 122, Spring 2004 Binary Trees  Recursive definition 1.An empty tree is a binary tree 2.A node.
Chapter 4: Trees Part II - AVL Tree
Searching on Multi-Dimensional Data
Data Organization - B-trees. 11.2Database System Concepts A simple index Brighton A Downtown A Downtown A Mianus A Perry.
Multimedia Databases Chapter 4. Multidimensional Data Structures An important source of media data is geographic data. A geographic information system.
Multidimensional Data. Many applications of databases are "geographic" = 2­dimensional data. Others involve large numbers of dimensions. Example: data.
2-dimensional indexing structure
CSE332: Data Abstractions Lecture 9: BTrees Tyler Robison Summer
Spring 2008MMDB--MIRS1 Multimedia Information Retrieval Systems.
Spatial indexing PAMs (II).
Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation.
Multiple-key indexes Index on one attribute provides pointer to an index on the other. If V is a value of the first attribute, then the index we reach.
Spatial Indexing SAMs. Spatial Access Methods PAMs Grid File kd-tree based (LSD-, hB- trees) Z-ordering + B+-tree R-tree Variations: R*-tree, Hilbert.
Spatial Information Systems (SIS) COMP Raster-based structures (2) Data conversion.
Spatial Indexing SAMs.
Multi-dimensional Indexes
CS 206 Introduction to Computer Science II 12 / 03 / 2008 Instructor: Michael Eckmann.
Spatial Indexing I Point Access Methods.
B + -Trees (Part 1) Lecture 20 COMP171 Fall 2006.
B + -Trees (Part 1). Motivation AVL tree with N nodes is an excellent data structure for searching, indexing, etc. –The Big-Oh analysis shows most operations.
Tirgul 6 B-Trees – Another kind of balanced trees Problem set 1 - some solutions.
B + -Trees (Part 1) COMP171. Slide 2 Main and secondary memories  Secondary storage device is much, much slower than the main RAM  Pages and blocks.
R-Trees 2-dimensional indexing structure. R-trees 2-dimensional version of the B-tree: B-tree of maximum degree 8; degree between 3 and 8 Internal nodes.
B + -Trees COMP171 Fall AVL Trees / Slide 2 Dictionary for Secondary storage * The AVL tree is an excellent dictionary structure when the entire.
Multimedia Databases Chapter 4.
Spatial Indexing. Spatial Queries Given a collection of geometric objects (points, lines, polygons,...) organize them on disk, to answer point queries.
Data Structures for Computer Graphics Point Based Representations and Data Structures Lectured by Vlastimil Havran.
Indexing. Goals: Store large files Support multiple search keys Support efficient insert, delete, and range queries.
Trees for spatial data representation and searching
UNC Chapel Hill M. C. Lin Point Location Reading: Chapter 6 of the Textbook Driving Applications –Knowing Where You Are in GIS Related Applications –Triangulation.
Spatial Data Management Chapter 28. Types of Spatial Data Point Data –Points in a multidimensional space E.g., Raster data such as satellite imagery,
10/09/2001CS 638, Fall 2001 Today Spatial Data Structures –Why care? –Octrees/Quadtrees –Kd-trees.
Chapter 19: Binary Trees. Objectives In this chapter, you will: – Learn about binary trees – Explore various binary tree traversal algorithms – Organize.
Temple University – CIS Dept. CIS616– Principles of Data Management V. Megalooikonomou Spatial Access Methods (SAMs) ( based on notes by Silberchatz,Korth,
Multi-dimensional Search Trees
B + -Trees. Motivation An AVL tree with N nodes is an excellent data structure for searching, indexing, etc. The Big-Oh analysis shows that most operations.
Starting at Binary Trees
1 Tree Indexing (1) Linear index is poor for insertion/deletion. Tree index can efficiently support all desired operations: –Insert/delete –Multiple search.
Indexing and hashing Azita Keshmiri CS 157B. Basic concept An index for a file in a database system works the same way as the index in text book. For.
Carnegie Mellon Carnegie Mellon Univ. Dept. of Computer Science Database Applications C. Faloutsos Spatial Access Methods - z-ordering.
Bin Yao (Slides made available by Feifei Li) R-tree: Indexing Structure for Data in Multi- dimensional Space.
CS 206 Introduction to Computer Science II 04 / 22 / 2009 Instructor: Michael Eckmann.
Spatial Database 2/5/2011 Reference – Ramakrishna Gerhke and Silbershatz.
Spatial Indexing Techniques Introduction to Spatial Computing CSE 5ISC Some slides adapted from Spatial Databases: A Tour by Shashi Shekhar Prentice Hall.
Multi-dimensional Search Trees CS302 Data Structures Modified from Dr George Bebis.
Spatial Indexing Techniques Introduction to Spatial Computing CSE 5ISC Some slides adapted from Spatial Databases: A Tour by Shashi Shekhar Prentice Hall.
The Present. Outline Index structures for in-memory Quad trees kd trees Index structures for databases kdB trees Grid files II. Index Structure.
Multidimensional Access Structures COMP3017 Advanced Databases Dr Nicholas Gibbins –
Spatial Data Management
15-826: Multimedia Databases and Data Mining
Tree-Structured Indexes: Introduction
Multidimensional Access Structures
Spatial Indexing.
B+-Trees.
B+ Trees What are B+ Trees used for What is a B Tree What is a B+ Tree
B+-Trees.
B+-Trees.
The Quad tree The index is represented as a quaternary tree
Spatial Indexing I Point Access Methods
KD Tree A binary search tree where every node is a
Orthogonal Range Searching and Kd-Trees
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
CSE 373 Data Structures and Algorithms
Multidimensional Search Structures
Presentation transcript:

Spatial Access Methods CSCI 2720 Spring 2005

General Overview Multimedia Indexing Spatial Access Methods (SAMs) k-d trees Point Quadtrees MX-Quadtree z-ordering R-trees

SAMs - Detailed outline spatial access methods problem definition k-d trees point quadtrees MX-quadtrees z-ordering R-trees

Spatial Access Methods - problem Given a collection of geometric objects (points, lines, polygons, ...) organize them on disk, to answer spatial queries

Spatial Access Methods - problem Given a collection of geometric objects (points, lines, polygons, ...) organize them on disk, to answer point queries range queries k-nn queries spatial joins (‘all pairs’ queries)

Spatial Access Methods - problem Given a collection of geometric objects (points, lines, polygons, ...) organize them on disk, to answer point queries range queries k-nn queries spatial joins (‘all pairs’ queries)

Spatial Access Methods - problem Given a collection of geometric objects (points, lines, polygons, ...) organize them on disk, to answer point queries range queries k-nn queries spatial joins (‘all pairs’ queries)

Spatial Access Methods - problem Given a collection of geometric objects (points, lines, polygons, ...) organize them on disk, to answer point queries range queries k-nn queries (nn=nearest neighbors) spatial joins (‘all pairs’ queries)

Spatial Access Methods - problem Given a collection of geometric objects (points, lines, polygons, ...) organize them on disk, to answer point queries range queries k-nn queries spatial joins (‘all pairs’ within ε)

SAMs - motivation Q: applications?

SAMs - motivation traditional DB GIS age salary

SAMs - motivation traditional DB GIS age salary

SAMs - motivation CAD/CAM find elements too close to each other

SAMs - motivation CAD/CAM

SAMs - motivation eg,. std S1 1 365 day Sn eg, avg 1 365 day F(S1) F(Sn) Sn eg, avg 1 365 day

SAMs: solutions K-d trees point quadtrees MX-quadtrees z-ordering R-trees (grid files) Q: how would you organize, e.g., n-dim points, on disk? (C points per disk page)

SAMs - Detailed outline spatial access methods problem dfn k-d trees point quadtrees MX-quadtrees z-ordering R-trees

k-d trees Used to store k dimensional point data It is not used to store region data A 2-d tree (i.e., for k=2) stores 2-dimensional point data while a 3-d tree stores 3-dimensional point data, etc.

2-d trees – node structure Binary trees Info: information field Xval,Yval: coordinates of a point associated with the node Llink, Rlink: pointers to children Properties (N: node): If level N even -> for all nodes M in the subtree rooted at N.Llink: M.Xval < N.Xval for all nodes P in the subtree rooted at N.Rlink: P.Xval >= N.Xval If level N odd -> Similarly use Yvals

2-d trees – Example

2-d trees: Insertion/Search To insert a node N into the tree pointed by T If N and T agree on Xval, Yval then overwrite T Else, branch left if N.Xval < T.xval, right otherwise (even levels) Similarly for odd levels (branching on Yvals)

2-d trees – Example of Insertion City (Xval, Yval) Banja Luka (19, 45) Derventa (40, 50) Toslic (38, 38) Tuzla (54, 35) Sinj (4, 4) Splitting of region by Banja Luka Splitting of region by Derventa Splitting of region by Toslic Splitting of region by Sinj

2-d trees: Deletion Deletion of point (x,y) from T If N is a leaf node easy Otherwise either Tl (left subtree) or Tr (right subtree) is non-empty Find a “candidate replacement” node R in Tl or Tr Replace all of N’s non-link fields by those of R Recursively delete R from Ti Recursion guaranteed to terminate - Why?

2-d trees: Deletion Finding candidate replacement nodes for deletion Replacement node R must bear same spatial relation to all nodes in Tl and Tr as node N

2-d trees: Range Queries Q: Given a point (xc, yc) and a distance r find all points in the 2-d tree that lie within the circle A: Each node N in a 2-d tree implicitly represents a region RN – If the circle (specified by the query) has no intersection with RN then there is no point in searching the subtree rooted at node N

SAMs - Detailed outline spatial access methods problem dfn k-d trees point quadtrees z-ordering R-trees

Point Quadtrees Represent point data Always split regions into 4 parts 2-d tree: a node N splits a region into two by drawing one line through the point (N.xval, N.yval) Point quadtree: a node N splits a region by drawing a horizontal and a vertical line through the point (N.xval, N.yval) Four parts: NW, SW, NE, and SE quadrants Q: Quadtree nodes have 4 children?

Point Quadtrees Nodes in point quadtrees represent regions

Point quadtrees - Insertion City (Xval, Yval) Banja Luka (19, 45) Derventa (40, 50) Toslic (38, 38) Tuzla (54, 35) Sinj (4, 4) Splitting of region by Banja Luka Splitting of region by Derventa Splitting of region by Toslic Splitting of region by Tuzla Splitting of region by Sinj

Point Quadtrees - Insertion

Point quadtrees: Deletion Deletion of point (x,y) from T If N is a leaf node easy Otherwise a subtree (N.NW, N.SW, N.NE. N.SE) is non-empty Find a “candidate replacement” node R in one of the subtrees such that: Every other node R1 in N.NW is to the NW of R Every other node R2 in N.SW is to the SW of R etc… Replace all of N’s non-link fields by those of R Recursively delete R from Ti In general, it may not always be possible to find such as replacement node Q: What happens in the worst case?

Point quadtrees: Deletion Deletion of point (x,y) from T If N is a leaf node easy Otherwise a subtree (N.NW, N.SW, N.NE. N.SE) is non-empty Find a “candidate replacement” node R in one of the subtrees such that: Every other node R1 in N.NW is to the NW of R Every other node R2 in N.SW is to the SW of R etc… Replace all of N’s non-link fields by those of R Recursively delete R from Ti In general, it may not always be possible to find such as replacement node Q: What happens in the worst case? May require all nodes to be reinserted

Point quadtrees: Range Searches Each node in a point quadtree represents a region Do not search regions that do not intersect the circle defined by the query

SAMs - Detailed outline spatial access methods problem dfn k-d trees point quadtrees MX-quadtrees z-ordering R-trees

MX-Quadtrees Drawbacks of 2-d trees, point quadtrees: shape of tree depends upon the order in which objects are inserted into the tree splits may be uneven depending upon where the point (N.xval, N.yval) is located inside the region (represented by N) MX-quadtrees: shape (and height) of tree independent of number of nodes and order of insertion

MX-Quadtrees Assumption: the map is represented as a grid of size (2k x 2k) for some k When a region gets “split” it splits down the middle

MX-Quadtrees - Insertion After insertion of A, B, C, and D respectively

MX-Quadtrees - Insertion After insertion of A, B, C, and D respectively

MX-Quadtrees - Deletion Fairly easy – why? All point are represented at the leaf level Total time for deletion: O(k)

MX-Quadtrees –Range Queries Same as in point quadtrees One difference: Checking to see if a point is in the circle defined by the range query needs to be performed at the leaf level (points are stored at the leaf level)

SAMs - Detailed outline spatial access methods problem dfn k-d trees point quadtrees MX-quadtrees z-ordering R-trees

z-ordering Q: how would you organize, e.g., n-dim points, on disk? (C points per disk page) Hint: reduce the problem to 1-d points(!!) Q1: why? A: Q2: how?

z-ordering Q: how would you organize, e.g., n-dim points, on disk? (C points per disk page) Hint: reduce the problem to 1-d points (!!) Q1: why? A: B-trees! Q2: how?

z-ordering Q2: how? A: assume finite granularity; z-ordering = bit-shuffling = N-trees = Morton keys = geo-coding = ...

z-ordering Q2: how? A: assume finite granularity (e.g., 232x232 ; 4x4 here) Q2.1: how to map n-d cells to 1-d cells?

z-ordering Q2.1: how to map n-d cells to 1-d cells?

z-ordering Q2.1: how to map n-d cells to 1-d cells? A: row-wise Q: is it good?

z-ordering Q: is it good? A: great for ‘x’ axis; bad for ‘y’ axis

z-ordering Q: How about the ‘snake’ curve?

z-ordering Q: How about the ‘snake’ curve? A: still problems: 2^32

z-ordering Q: Why are those curves ‘bad’? A: no distance preservation (~ clustering) Q: solution? 2^32 2^32

z-ordering Q: solution? (w/ good clustering, and easy to compute, for 2-d and n-d?)

z-ordering Q: solution? (w/ good clustering, and easy to compute, for 2-d and n-d?) A: z-ordering/bit-shuffling/linear-quadtrees ‘looks’ better: few long jumps; scoops out the whole quadrant before leaving it a.k.a. space filling curves

z-ordering z-ordering/bit-shuffling/linear-quadtrees Q: How to generate this curve (z = f(x,y) )? A: 3 (equivalent) answers!

z-ordering z-ordering/bit-shuffling/linear-quadtrees Q: How to generate this curve (z = f(x,y))? A1: ‘z’ (or ‘N’) shapes, RECURSIVELY order-2 order-1 ... order (n+1)

z-ordering Notice: self similar (we’ll see about fractals, soon) method is hard to use: z =? f(x,y) ... order (n+1) order-2 order-1

z-ordering z-ordering/bit-shuffling/linear-quadtrees Q: How to generate this curve (z = f(x,y) )? A: 3 (equivalent) answers! Method #2?

z-ordering bit-shuffling x 0 0 y 1 1 z =( 0 1 0 1 )2 = 5 y 11 10 01 00

z-ordering bit-shuffling x 0 0 y 1 1 z =( 0 1 0 1 )2 = 5 y 11 10 01 00 How about the reverse: (x,y) = g(z) ? 00 10 x 01 11

z-ordering bit-shuffling x 0 0 y 1 1 z =( 0 1 0 1 )2 = 5 y 11 10 01 00 How about n-d spaces? 00 10 x 01 11

z-ordering z-ordering/bit-shuffling/linear-quadtrees Q: How to generate this curve (z = f(x,y) )? A: 3 (equivalent) answers! Method #3?

z-ordering linear-quadtrees : assign N->1, S->0 e.t.c. W E 00... 01... 10... 11... 1 N S 1

z-ordering ... and repeat recursively. Eg.: zgray-cell = WN;WN = (0101)2 = 5 W E 11 00 00... 01... 10... 11... 1 N S 1

z-ordering Drill: z-value of grey cell, with the three methods? W E 1 1

z-ordering Drill: z-value of grey cell, with the three methods? W E method#2: shuffle(11;10)= (1110)2 = 14 1 N S 1

z-ordering Drill: z-value of grey cell, with the three methods? W E method#2: shuffle(11;10)= (1110)2 = 14 method#3: EN;ES = ... = 14 1 N S 1

z-ordering - Detailed outline spatial access methods z-ordering main idea - 3 methods use w/ B-trees; algorithms (range, knn queries ...) non-point (eg., region) data analysis; variations R-trees

z-ordering - usage & algo’s Q1: How to store on disk? A: Q2: How to answer range queries etc

z-ordering - usage & algo’s Q1: How to store on disk? A: treat z-value as primary key; feed to B-tree PGH SF

z-ordering - usage & algo’s MAJOR ADVANTAGES w/ B-tree: already inside commercial systems (no coding /debugging!) concurrency & recovery is ready SF PGH

z-ordering - usage & algo’s Q2: queries? (eg.: find city at (0,3) )? PGH SF

z-ordering - usage & algo’s Q2: queries? (eg.: find city at (0,3) )? A: find z-value; search B-tree PGH SF

z-ordering - usage & algo’s Q2: range queries? PGH SF

z-ordering - usage & algo’s Q2: range queries? A: compute ranges of z-values; use B-tree PGH 9,11-15 SF

z-ordering - usage & algo’s Q2’: range queries - how to reduce # of qualifying ranges? PGH 9,11-15 SF

z-ordering - usage & algo’s Q2’: range queries - how to reduce # of qualifying ranges? A: Augment the query! PGH 9,11-15 -> 8-15 SF

z-ordering - usage & algo’s Q2’’: range queries - how to break a query into ranges? 9,11-15

z-ordering - usage & algo’s Q2’’: range queries - how to break a query into ranges? A: recursively, quadtree-style; decompose only non-full quadrants 12-15 9,11-15

z-ordering - usage & algo’s Q2’’: range queries - how to break a query into ranges? A: recursively, quadtree-style; decompose only non-full quadrants 12-15 9,11-15 9, 11

z-ordering - Detailed outline spatial access methods z-ordering main idea - 3 methods use w/ B-trees; algorithms (range, knn queries ...) non-point (eg., region) data analysis; variations R-trees

z-ordering - usage & algo’s Q3: k-nn queries? (say, 1-nn)? PGH SF

z-ordering - usage & algo’s Q3: k-nn queries? (say, 1-nn)? A: traverse B-tree; find nn wrt z-values and ... PGH SF

z-ordering - usage & algo’s ... ask a range query. PGH SF nn wrt z-value 12 3 5

z-ordering - usage & algo’s ... ask a range query. PGH SF nn wrt z-value 12 3 5

z-ordering - usage & algo’s Q4: all-pairs queries? ( all pairs of cities within 10 miles from each other? ) PGH SF (we’ll see ‘spatial joins’ later: find all PA counties that intersect a lake)

z-ordering - Detailed outline spatial access methods z-ordering main idea - 3 methods use w/ B-trees; algorithms (range, knn queries ...) non-point (eg., region) data analysis; variations R-trees ...

z-ordering - regions Q: z-value for a region? zB = ?? zC = ?? B A C

z-ordering - regions Q: z-value for a region? A: 1 or more z-values; by quadtree decomposition A B C zB = ?? zC = ??

z-ordering - regions Q: z-value for a region? zB = 11** zC = ?? W E A “don’t care” Q: z-value for a region? zB = 11** zC = ?? W E A B C 11 00 00... 01... 10... 11... 1 N S 1

z-ordering - regions Q: z-value for a region? zB = 11** “don’t care” Q: z-value for a region? zB = 11** zC = {0010; 1000} W E A B C 11 00 00... 01... 10... 11... 1 N S 1

z-ordering - regions Q: How to store in B-tree? Q: How to search (range etc queries) A B C

z-ordering - regions Q: How to store in B-tree? A: sort (*<0<1) Q: How to search (range etc queries) A B C

z-ordering - regions Q: How to search (range etc queries) – eg ‘red’ range query A B C

z-ordering - regions Q: How to search (range etc queries) – eg ‘red’ range query A: break query in z-values; check B-tree A B C

z-ordering - regions Almost identical to range queries for point data, except for the “don’t cares” - i.e., 1100 ?? 11** A B C

z-ordering - regions Almost identical to range queries for point data, except for the “don’t cares” - i.e., z1= 1100 ?? 11** = z2 Specifically: does z1 contain/avoid/intersect z2? Q: what is the criterion to decide?

z-ordering - regions z1= 1100 ?? 11** = z2 Specifically: does z1 contain/avoid/intersect z2? Q: what is the criterion to decide? A: Prefix property: let r1, r2 be the corresponding regions, and let r1 be the smallest (=> z1 has fewest ‘*’s). Then:

completely contained in z-ordering - regions r2 will either contain completely, or avoid completely r1. it will contain r1, if z2 is the prefix of z1 A B C 1100 ?? 11** region of z1: completely contained in region of z2

z-ordering - regions Drill (True/False). Given: z1= 011001** T/F r2 contains r1 T/F r3 contains r1 T/F r3 contains r2

z-ordering - regions Drill (True/False). Given: z1= 011001** T/F r2 contains r1 - TRUE (prefix property) T/F r3 contains r1 - FALSE (disjoint) T/F r3 contains r2 - FALSE (r2 contains r3)

z-ordering - regions Drill (True/False). Given: z1= 011001**

z-ordering - regions Drill (True/False). Given: z1= 011001** T/F r2 contains r1 - TRUE (prefix property) T/F r3 contains r1 - FALSE (disjoint) T/F r3 contains r2 - FALSE (r2 contains r3)

z-ordering - regions Spatial joins: find (quickly) all counties intersecting lakes

z-ordering - regions Spatial joins: find (quickly) all counties intersecting lakes Naive algorithm: O( N * M) Something faster?

z-ordering - regions Spatial joins: find (quickly) all counties intersecting lakes

z-ordering - regions Spatial joins: find (quickly) all counties intersecting lakes Solution: merge the lists of (sorted) z-values, looking for the prefix property footnote#1: ‘*’ needs careful treatment footnote#2: need dup. elimination

z-ordering - Detailed outline spatial access methods z-ordering main idea - 3 methods use w/ B-trees; algorithms (range, knn queries ...) non-point (eg., region) data analysis; variations R-trees

z-ordering - variations Q: is z-ordering the best we can do?

z-ordering - variations Q: is z-ordering the best we can do? A: probably not - occasional long ‘jumps’ Q: then?

z-ordering - variations Q: is z-ordering the best we can do? A: probably not - occasional long ‘jumps’ Q: then? A1: Gray codes

z-ordering - variations A2: Hilbert curve! (a.k.a. Hilbert-Peano curve)

z-ordering - variations ‘Looks’ better (never long jumps). How to derive it?

z-ordering - variations ‘Looks’ better (never long jumps). How to derive it? order-1 order-2 order (n+1) ...

z-ordering - variations Q: function for the Hilbert curve ( h = f(x,y) )? A: bit-shuffling, followed by post-processing, to account for rotations. Linear on # bits. See textbook, for pointers to code/algorithms (eg., [Jagadish, 90])

z-ordering - variations Q: how about Hilbert curve in 3-d? n-d? A: Exists (and is not unique!). Eg., 3-d, order-1 Hilbert curves (Hamiltonian paths on cube) #1 #2

z-ordering - Detailed outline spatial access methods z-ordering main idea - 3 methods use w/ B-trees; algorithms (range, knn queries ...) non-point (eg., region) data analysis; variations R-trees ...

z-ordering - analysis Q: How many pieces (‘quad-tree blocks’) per region? A: proportional to perimeter (surface etc)

z-ordering - analysis (How long is the coastline, say, of England? Paradox: The answer changes with the yard-stick -> fractals ...)

z-ordering - analysis Q: Should we decompose a region to full detail (and store in B-tree)?

z-ordering - analysis Q: Should we decompose a region to full detail (and store in B-tree)? A: NO! approximation with 1-3 pieces/z-values is best [Orenstein90]

z-ordering - analysis Q: how to measure the ‘goodness’ of a curve?

(#runs ~ #disk accesses on B-tree) z-ordering - analysis Q: how to measure the ‘goodness’ of a curve? A: e.g., avg. # of runs, for range queries 4 runs 3 runs (#runs ~ #disk accesses on B-tree)

z-ordering - analysis Q: So, is Hilbert really better? A: 27% fewer runs, for 2-d (similar for 3-d) Q: are there formulas for #runs, #of quadtree blocks etc? A: Yes ([Jagadish; Moon+ etc] see textbook)

z-ordering - fun observations Hilbert and z-ordering curves: “space filling curves”: eventually, they visit every point in n-d space - therefore: order-1 order-2 ... order (n+1)

z-ordering - fun observations ... they show that the plane has as many points as a line (-> headaches for 1900’s mathematics/topology). (fractals, again!) order-1 order-2 ... order (n+1)

z-ordering - fun observations Observation #2: Hilbert (like) curve for video encoding [Y. Matias+, CRYPTO ‘87]: Given a frame, visit its pixels in randomized hilbert order; compress; and transmit

z-ordering - fun observations In general, Hilbert curve is great for preserving distances, clustering, vector quantization etc

Conclusions z-ordering is a great idea (n-d points -> 1-d points; feed to B-trees) used by TIGER system and (most probably) by other GIS products works great with low-dim points