Steve Wolfman Winter Quarter 2000

Steve Wolfman Winter Quarter 2000
CSE 326: Data Structures Lecture #19 Katie Quadtree, Multidimensional Superstar Steve Wolfman Winter Quarter 2000 Today we’re going to talk about k-D trees and Quad Trees.

Today’s Outline Multidimensional search trees Range Queries k-D Trees
Quad Trees I’ll start by introducing the ADT these represent. and then I’ll discuss what we want to do with that ADT. Then, we’ll actually talk about the trees. These are it for data structures, folks! After this we just talk about sorting and graphs. Everyone shed a tear.

Multi-D Search ADT Dictionary operations
create destroy find insert delete range queries Each item has k keys for a k-dimensional search tree Searches can be performed on one, some, or all the keys or on ranges of the keys 9,1 3,6 4,2 5,7 8,2 1,9 4,4 8,4 2,5 5,2 Multi-D searches are just an extension of regular searches in a lot of ways. Basically, we just have multiple keys all bundled together. In general, I’ll assume that the keys are all ints (and this can be important for some range queries) but they don’t have to be. The big new thing is range queries.

Applications of Multi-D Search
Astronomy (simulation of galaxies) - 3 dimensions Protein folding in molecular biology - 3 dimensions Lossy data compression - 4 to 64 dimensions Image processing - 2 dimensions Graphics - 2 or 3 dimensions Animation - 3 to 4 dimensions Geographical databases - 2 or 3 dimensions Web searching or more dimensions These pop up lots of places.

Range Query A range query is a search in a dictionary in which the exact key may not be entirely specified. Range queries are the primary interface with multi-D data structures. OK, what exactly is a range query? Why would we want to ask this? What if I wanted to find everyone with a given last name? What about if I wanted all the fire stations between two pairs of streets in Manhattan? What if I wanted all the stars whithin one light year of a given star?

Range Query Examples: Two Dimensions
Search for items based on just one key Search for items based on ranges for all keys Search for items based on a function of several keys: e.g., a circular range query There’s three main types of range queries. If I want everyone with a given last name, that’s a search on just one key. My Manhattan query gives a range on two keys (the aves and streets). Finally, my star query gives a sphere, a function of all three dimensions of the key space.

Range Querying in 1-D Find everything in the rectangle… x
OK, let’s go back to dictionaries. We talked a bit about range queries in dictionaries. How do we do a 1-D query?

Range Querying in 1-D with a BST
Find everything in the rectangle… x Well, let’s use a BST. Basically, we build a BST that keeps everything where it should be with respect to the other nodes in the tree. Then, we just use the boundaries of the range to decide whether to traverse left, right, or both.

1-D Range Querying in 2-D y x Well, we can do that in 2-D as well.
Just build a BST on one key. Great… but what if we query on both dimensions? x

2-D Range Querying in 2-D y x
OK, we ignore one dimension and just search on the other. This already looks a bit messy. I mean why are we even bothering with those bottom nodes? They’re way outside the range. How would this do in 100 dimensions? x

k-D Trees Split on the next dimension at each succeeding level
If building in batch, choose the median along the current dimension at each level guarantees logarithmic height and balanced tree In general, add as in a BST So, let’s just extend the idea of BSTs to higher dimensions. We’ll alternate which dimension “takes charge” at each level. Level 1 will split on x. Level 2 on y. Level 3 back to x. Level 4 y. Etc. If we choose the median at each level on that dimension while building, we’ll get a balanced tree! k-D tree node keys value The dimension that this node splits on dimension left right

Building a 2-D Tree (1/4) y x
Let’s build one on the points we saw earlier. Start by choosing the median in x. Now, everything on the left is in its left subtree. Everything on the right is in its right subtree. x

Building a 2-D Tree (2/4) y x Let’s split the right side.
We’ll need to split this on y right? Last time we split on x. Now, we split on y. x

Building a 2-D Tree (3/4) y x And so forth.
Now, let’s do the left side all at once. Where’s the first split? (2 choices: 3 and forth points vertically) x

Building a 2-D Tree (4/4) y x
OK, this structure shows how we divide up space for the keys. Draw a circle and ask where in the tree it goes. x

k-D Tree e h g l d k f c j i m b a a d b c e f h i g j m k l
Here’s the tree actually drawn out. To make it clearer, I’ve lettered the nodes. l d k f c j i m b a

2-D Range Querying in 2-D Trees
x y Now, let’s try that 2-D range query again. We just decide which way to traverse at each level using the dimension of the rectangle that we split on at that level. Notice that we still need to check whether each of the nodes we touch is actually in the range. Search every partition that intersects the rectangle. Check whether each node (including leaves) falls into the range.

Other Shapes for Range Querying
How about a circular query? Same deal. We can also just pretend for the first step (searching partitions) that this is a rectangle. All we do is put a rectangle around the shape (a square in this case) and search according to it. This is easier, but it might cause us to search areas we don’t have to. x Search every partition that intersects the shape (circle). Check whether each node (including leaves) falls into the shape.

Find in a k-D Tree find(<x1,x2, …, xk>, root) finds the node which has the given set of keys in it or returns null if there is no such node Node *& find(const keyVector & keys, Node *& root) { int dim = root->dimension; if (root == NULL) return root; else if (root->keys == keys) else if (keys[dim] < root->keys[dim]) return find(keys, root->left); else return find(keys, root->right); } Here’s the code for what we just did. I just pull the dimension out of the node and use that to compare left vs. right. Otherwise, this is just like a BST. Runtime? O(DEPTH OF NODE)!!!! But, k-D trees can suck. runtime:

k-D Trees Can Suck (but not when built in batch!)
5,0 insert(<5,0>) insert(<6,9>) insert(<9,3>) insert(<6,5>) insert(<7,7>) insert(<8,6>) 6,9 9,3 What’s the suck factor here? O(n). Worst case depth. In a balanced tree, this is O(log n). A random tree also has average depth O(log n). We’ll see something interesting and a bit different for quad trees. Speaking of which… 6,5 7,7 suck factor: 8,6

Find Example 5,2 find(<3,6>) find(<0,10>) 2,5 8,4 4,4 1,9
Compare on alternating keys. Notice that if we wanted to insert <0,10>, we’d just use a find to figure out where it goes and plug it in. What about delete? Pretty messy, huh? One thing we can do (not pretty) is rebuild the subtree rooted at the deleted node minus that node. At least we reestablish balance that way. 4,4 1,9 8,2 5,7 4,2 3,6 9,1

Quad Trees 0,1 1,1 0,0 1,0 Split on all (two) dimensions at each level
Split key space into equal size partitions (quadrants) Add a new node by adding to a leaf, and, if the leaf is already occupied, split until only one node per leaf quad tree node quadrant Quad trees don’t place their partitions based on the individual points, and they don’t split on just one dimension at a time. Instead, we’ll split the space into four equal squares at each step. How would this work in 3-dimensions? (CUBES!) Notice that we know a node is a leaf if _all_four_ of its pointers are NULL. 0,1 1,1 keys value x y Center: 0,0 1,0 Quadrants: 0,0 1,0 0,1 1,1 Center

Building a Quad Tree (1/5)
y Well, let’s build one. We start by splitting the top level. I’ll use these colors to encode where things are. We haven’t finished divvying up the points yet, so we need to split further. x

y I’ll start with the upper left. One more split divides up all the points, so we just make the little rectangles leaves containing those points. Next, the upper right. x

y We still haven’t split those up. Let’s split again. x

y That’s better! Now, I’ll do all the other splits. x

y Here’s the resulting tree. I’ll show you a slightly smaller one as an actual tree… x

Quad Tree Example a g b e f d c a g d e f b c
Here’s the tree for those points. Notice that the upper right causes a level with just one child. We’ll come back to that. Now, how do we do range queries? b c

2-D Range Querying in Quad Trees
Same as before. Just search whichever partitions intersect the rectangle. We can also do circular queries in the same fashion. (And single key queries) x

Find in a Quad Tree find(<x, y>, root) finds the node which has the given pair of keys in it or returns quadrant where the point should be if there is no such node Node *& find(Key x, Key y, Node *& root) { if (root == NULL) return root; // Empty tree if (root->isLeaf) return root; // Key may not actually be here int quad = getQuadrant(x, y, root); return find(x, y, root->quadrants[quad]); } Compares against center; always makes the same choice on ties. Here’s the code for find. I’ve left out the method of choosing the right quadrant, but it just compares the key’s values against the center. >> goes upper right << goes lower left >< goes lower right <> goes upper left What’s the runtime on this? O(depth) again. But, how deep can this get? runtime:

Quad Trees Can Suck a b suck factor:
The suck factor is equal to the worst case depth. BUT… The depth has to do with how close the closest pair of nodes is! The closer they are, the more partitions we need to break them up. Basically, the worst case is logarithmic in the inverse of the distance between the two closest nodes. Pretty weird, huh? a b suck factor:

Find Example find(<10,2>) (i.e., c) find(<5,6>) (i.e., d)
b e f d c a g d e f Here’s an example find… pretty straightforward. b c

… … Insert Example Find the spot where the node should go.
insert(<10,7>,x) a g b e f d c x a g … … To insert, we might need to split. Indeed, we might need to split several times. Find the spot where the node should go. If the space is unoccupied, insert the node. If it is occupied, split until the existing node separates from the new one. x g

Delete Example Find and delete the node.
delete(<10,2>)(i.e., c) a b c a g d e d e f Deleting, on the other hand, might destroy several splits (or none). f g Find and delete the node. If its parent has just one child, delete it. Propagate! b c

Nearest Neighbor Search
getNearestNeighbor(<1,4>) a b c a g d e d e f Here’s a nifty range-like query we can do on k-D and quad trees. Those doing maze generation… take heed. Here’s one way to find the closest point to a new dart thrown at the board. Do a find to get a “nearish” point. Do a circular range query until you get _any_ result. At that point, tighten the circle to include just that result. f g Find a nearby node (do a find). Do a circular range query. As you get results, tighten the circle. Continue until no closer node in query. b c Works on k-D Trees, too!

Quad Trees vs. k-D Trees k-D Trees Quad Trees Density balanced trees
Number of nodes is O(n) where n is the number of points Height of the tree is O(log n) with batch insertion Supports insert, find, nearest neighbor, range queries Quad Trees Number of nodes is O(n(1+ log(/n))) where n is the number of points and  is the ratio of the width (or height) of the key space and the smallest distance between two points Height of the tree is O(log n + log ) Supports insert, delete, find, nearest neighbor, range queries Let’s compare the two. There are, by the way, hybrids and variants of these that have difference performance characteristics and do better or worse in various situations.

To Do Get moving on Project IV Finish HW#5
Alright… GET MOVING ON PIV!!! Also, finish the homework.

Coming Up Project discussions in section tomorrow
Sorting on Friday and Monday with Special Guest Lecturer NO Quiz tomorrow HW#5 due (February 25th) Project IV due (March 7th; that’s only two weeks!) If there’s time, play hangman with the name of special guest lecturer :)

Steve Wolfman Winter Quarter 2000

Similar presentations

Presentation on theme: "Steve Wolfman Winter Quarter 2000"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Steve Wolfman Winter Quarter 2000

Similar presentations

Presentation on theme: "Steve Wolfman Winter Quarter 2000"— Presentation transcript:

Similar presentations

About project

Feedback