A Self-adjusting Data Structure for Multi-dimensional Point Sets
Eunhui Park & David M. Mount
University of Maryland
Sep. 2012
Motivation
- Sleator & Tarjan introduced the splay tree almost 30 years ago.
- Self-adjusts to the access distribution
- Supports insertion and deletion in O(log n) amortized time
- Efficient access:
  - Balance property: m accesses in O((m+n) log n) time
  - Scanning property [Elmasry 2004]: access all items in sorted order in O(n) time
  - Working set property: … on temporal locality
  - Static optimality property: efficient access based on frequency
  - Static & dynamic finger [Cole, 2000] properties: … on spatial locality
- Is there a multi-dimensional generalization?
Background
- Compressed quadtree
  - Hierarchical partition of space
  - O(n) space
  - O(log n) access time if augmented:
    - Topology tree [Frederickson 1985, Har-Peled 2005]
    - Skip quadtree [Eppstein, Goodrich, Sun 2005]
    - Quadtreap [Mount, Park 2010], based on the treap [Seidel, Aragon 1996]
  - Efficient approximate proximity queries
    - Approximate nearest neighbor search
    - Approximate range search
Objective
- Like quadtrees:
  - A versatile geometric partition tree
  - Supports efficient approximate proximity queries
- Like splay trees:
  - Adjusts to the access distribution
  - Supports insertion/deletion in O(log n) amortized time
  - Supports splay tree access properties: balance, static optimality, working set, static finger
- Quadtree + Splay tree = Splay Quadtree
Overview
- BD-tree
- Rotation
- Splaying operation
  - Basic splaying
  - Splaying
- Efficiency
  - Insertion/deletion
  - Search and access efficiency
BD-tree
- Box Decomposition tree (BD-tree): a geometric data structure based on a hierarchical decomposition of space into d-dimensional axis-aligned rectangles.
- Each node is associated with a region of space called a cell.
- Each cell is defined by an outer box and an optional inner box.
- Partition operations: split and shrink.
- Internal nodes: split nodes and shrink nodes.
- Each leaf has a single point or a single inner box.
[Figure labels: box, cell, leaves]
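The cell definition above can be sketched in a few lines. This is only an illustration: the names `Box`, `Cell`, and `contains` are hypothetical, and the half-open boundary convention (closed on the low side, open on the high side) is an assumption, not something the slides specify.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass(frozen=True)
class Box:
    """Axis-aligned d-dimensional rectangle given by two corner points."""
    lo: Point
    hi: Point

    def contains(self, p: Sequence[float]) -> bool:
        # Assumption: half-open intervals [lo, hi) on every axis.
        return all(l <= x < h for l, x, h in zip(self.lo, p, self.hi))


@dataclass(frozen=True)
class Cell:
    """Region of a node: the outer box minus the optional inner box."""
    outer: Box
    inner: Optional[Box] = None

    def contains(self, p: Sequence[float]) -> bool:
        if not self.outer.contains(p):
            return False
        return self.inner is None or not self.inner.contains(p)


# A cell with an 8 x 8 outer box and a 2 x 2 inner box removed from it.
cell = Cell(Box((0.0, 0.0), (8.0, 8.0)), Box((2.0, 2.0), (4.0, 4.0)))
inside = cell.contains((1.0, 1.0))     # in the outer box, not in the inner box
in_hole = cell.contains((3.0, 3.0))    # falls in the inner box
outside = cell.contains((9.0, 1.0))    # outside the outer box
```

Because a cell has at most one inner box, membership takes two box tests, which is the constant-complexity property the structure relies on.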
BD-tree: Partitioning Operations
- Split: partitions a cell by an axis-orthogonal hyperplane that bisects the cell's longest side.
- Shrink: partitions a cell by a shrinking box, which lies within the cell.
[Figure: a split of cell C into left and right children D and E; a shrink of cell C by box F into the inner cell F and the outer cell C\F]
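A minimal sketch of the two partitioning operations follows. It is a simplification under stated assumptions: boxes are plain tuples of corner points, a cell is an (outer box, optional inner box) pair, `split` ignores any inner box the cell may already have, and ties for the longest side go to the lowest axis index. The function names are hypothetical.

```python
from typing import Optional, Tuple

Point = Tuple[float, ...]
Box = Tuple[Point, Point]            # (low corner, high corner)
Cell = Tuple[Box, Optional[Box]]     # (outer box, optional inner box)


def split(box: Box) -> Tuple[Box, Box]:
    """Bisect a box by a hyperplane through the midpoint of its longest side."""
    lo, hi = box
    axis = max(range(len(lo)), key=lambda i: hi[i] - lo[i])
    mid = (lo[axis] + hi[axis]) / 2.0
    left_hi = hi[:axis] + (mid,) + hi[axis + 1:]
    right_lo = lo[:axis] + (mid,) + lo[axis + 1:]
    return (lo, left_hi), (right_lo, hi)


def shrink(box: Box, shrinking_box: Box) -> Tuple[Cell, Cell]:
    """Partition a box into an inner cell and an outer cell (box minus inner)."""
    (lo, hi), (slo, shi) = box, shrinking_box
    if not all(l <= sl and sh <= h for l, sl, sh, h in zip(lo, slo, shi, hi)):
        raise ValueError("shrinking box must lie within the cell")
    inner_cell: Cell = (shrinking_box, None)
    outer_cell: Cell = (box, shrinking_box)
    return inner_cell, outer_cell


c = ((0.0, 0.0), (8.0, 4.0))
left, right = split(c)                              # cut the side of length 8
inner, outer = shrink(c, ((1.0, 1.0), (2.0, 2.0)))  # carve out a small box
```

Note that the outer cell produced by `shrink` keeps the original outer box and records the shrinking box as its inner box, which is how a cell acquires the "optional inner box" mentioned earlier.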
BD-tree: Promotion
- By construction, nodes are generated in shrink-split pairs.
- We merge each pair into a single ternary node, called a pseudo-node.
- The tree can be restructured through a local operation, called promotion.
[Figure: a shrink node and a split node merged into a pseudo-node with left, right, and outer children; promotion applied to pseudo-nodes x and y over subtrees A, B, C, D, E]
Splay Quadtree
- Given an internal node x, splay(x) uses promotions to move x to the root of the tree.
- This makes future accesses to x more efficient.
[Figure: splay(x) applied to a tree with nodes b, c, d, e, f, g, bringing x to the root]
Basic Splaying
- As in Sleator & Tarjan, splaying is based on primitive operations:
  - Zig-zag
  - Zig-zig
[Figure: zig-zag and zig-zig steps on nodes x, y, z with subtrees A through G]
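For readers who want the primitives spelled out, here is a sketch of the classical one-dimensional splay step from Sleator & Tarjan on an ordinary binary search tree. It is not the splay quadtree's promotion-based version, which works on ternary pseudo-nodes; the class and function names are illustrative only.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


def rotate(x):
    """Rotate x above its parent, preserving the in-order sequence."""
    p, g = x.parent, x.parent.parent
    if p.left is x:
        p.left, x.right = x.right, p
        if p.left is not None:
            p.left.parent = p
    else:
        p.right, x.left = x.left, p
        if p.right is not None:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g is not None:
        if g.left is p:
            g.left = x
        else:
            g.right = x


def splay(x):
    """Move x to the root using zig, zig-zig, and zig-zag steps."""
    while x.parent is not None:
        p, g = x.parent, x.parent.parent
        if g is None:
            rotate(x)                        # zig: parent is the root
        elif (g.left is p) == (p.left is x):
            rotate(p)                        # zig-zig: rotate parent first,
            rotate(x)                        # then x
        else:
            rotate(x)                        # zig-zag: rotate x twice
            rotate(x)
    return x


def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)


# Build a left-leaning path 5 -> 4 -> 3 -> 2 -> 1 and splay the deepest node.
nodes = [Node(k) for k in range(1, 6)]
for child, parent in zip(nodes, nodes[1:]):
    parent.left, child.parent = child, parent
root = splay(nodes[0])
```

After the call, the node with key 1 is the root and the in-order key sequence is unchanged. The zig-zig case rotates the parent before the node itself, which is what distinguishes splaying from simple move-to-root.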
The Problem of Right Promotion
- Inner-left convention: if an internal node's cell has an inner box, that inner box resides in the node's left child.
- If necessary, left and right children are relabeled to satisfy this.
- This guarantees that each cell has constant complexity.
- Right promotion may violate this convention.
[Figure: right promotion of x over y with subtrees A through E. Callouts: "If this cell has an inner box, u" and "Now, y's cell has two inner boxes, u and v!"]
Splaying in 3 Phases
- Promotions must be carefully structured to avoid violating the inner-left convention.
- 3-phase approach (3 passes from bottom to top)
- As in Sleator & Tarjan, amortized efficiency is established by a potential-based analysis.
[Figure: the three passes applied to a tree with nodes a through g, with edges labeled L, R, and O]
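For reference, the potential function used in the classical Sleator & Tarjan analysis of one-dimensional splay trees is shown below. The slides only say that the splay quadtree analysis follows the same style, so the exact potential and constants for the splay quadtree may differ; the weights w(x) are an assumption of the classical setup.

```latex
% Classical splay tree potential (Sleator & Tarjan), shown for comparison only.
% w(x) > 0 : a fixed positive weight assigned to node x
% s(x)     : total weight of the nodes in the subtree rooted at x
\[
  r(x) = \log_2 s(x), \qquad \Phi(T) = \sum_{x \in T} r(x)
\]
% Access lemma: the amortized cost of splaying x in a tree with root t satisfies
\[
  \hat{c}\bigl(\mathrm{splay}(x)\bigr) \le 3\bigl(r(t) - r(x)\bigr) + 1
  = O\!\left(\log \frac{W}{w(x)}\right), \qquad W = s(t).
\]
```

Choosing w(x) = 1 for every node gives the O(log n) amortized bound; other weight choices yield the static optimality, working set, and static finger bounds in the classical setting.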
Insertion and deletion
- Insert(q):
  - locate the leaf x containing q
  - add q as a new leaf
  - splay(x)
- Insertion can be performed in O(log n) amortized time.
- Deletion can be performed in O(log n) amortized time.
[Figure: inserting point q into the cell of leaf x]
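The control flow of Insert(q) can be sketched as follows. This is a deliberately simplified illustration: it uses only split operations on a plain box tree (no shrink nodes or pseudo-nodes), assumes q lies inside the root cell, and takes `splay` as a caller-supplied callback rather than implementing the three-phase splay. All names are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Point = Tuple[float, ...]


class Node:
    def __init__(self, lo: Point, hi: Point):
        self.lo, self.hi = lo, hi             # outer box of this node's cell
        self.point: Optional[Point] = None    # stored point (leaves only)
        self.children: List["Node"] = []      # empty for a leaf

    def contains(self, q: Point) -> bool:
        # Assumption: half-open intervals [lo, hi) on every axis.
        return all(l <= x < h for l, x, h in zip(self.lo, q, self.hi))


def locate(root: Node, q: Point) -> Node:
    """Descend to the leaf whose cell contains q (q must lie in the root cell)."""
    node = root
    while node.children:
        node = next(c for c in node.children if c.contains(q))
    return node


def split_leaf(leaf: Node) -> None:
    """Bisect a leaf along its longest side and push its point down."""
    axis = max(range(len(leaf.lo)), key=lambda i: leaf.hi[i] - leaf.lo[i])
    mid = (leaf.lo[axis] + leaf.hi[axis]) / 2.0
    left = Node(leaf.lo, leaf.hi[:axis] + (mid,) + leaf.hi[axis + 1:])
    right = Node(leaf.lo[:axis] + (mid,) + leaf.lo[axis + 1:], leaf.hi)
    leaf.children = [left, right]
    if leaf.point is not None:
        locate(leaf, leaf.point).point = leaf.point
        leaf.point = None


def insert(root: Node, q: Point, splay: Callable[[Node], None]) -> None:
    x = locate(root, q)                  # step 1: locate leaf x containing q
    if x.point != q:                     # duplicates are only splayed
        leaf = x
        while leaf.point is not None:    # step 2: subdivide until q is alone
            split_leaf(leaf)
            leaf = locate(leaf, q)
        leaf.point = q
    splay(x)                             # step 3: splay the located node


splayed: List[Node] = []
root = Node((0.0, 0.0), (8.0, 8.0))
insert(root, (1.0, 1.0), splayed.append)
insert(root, (7.0, 7.0), splayed.append)
```

Passing `splay` as a parameter keeps the sketch honest about what it omits: the restructuring step is where the amortized O(log n) bound comes from, and it is not reproduced here.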
Analogous to Splay Trees
Static Finger Theorem
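As a point of comparison, the classical static finger theorem for one-dimensional splay trees (Sleator & Tarjan) can be written as below. The geometric statement from this slide did not survive extraction, so this is the standard one-dimensional form only; the splay quadtree version presumably replaces the rank difference with a geometric notion of distance, which is not reproduced here.

```latex
% Classical static finger theorem (one-dimensional splay trees).
% Items are numbered 1..n in sorted order, f is a fixed item (the finger),
% and i_1, ..., i_m is the sequence of accessed items.
\[
  \text{total access time}
  = O\!\left( n \log n + m + \sum_{j=1}^{m} \log\bigl( |i_j - f| + 1 \bigr) \right)
\]
```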
Conclusions
- Splay Quadtree: a self-adjusting geometric data structure
  - Supports insertion/deletion in O(log n) amortized time
  - Supports efficient approximate proximity queries
- Open problems:
  - Other properties of standard splay trees?
    - Dynamic finger theorem
    - Scanning theorem
  - Better notions of distance (or, more generally, locality) in a geometric setting?
References
Thank you!