Published by Alisha Blake. Modified over 9 years ago.
1 CSIS 7101: Spatial Data (Part 2)
Efficient Processing of Spatial Joins Using R-trees
Rollo Chan Chu Chung Man Mak Wai Yip Vivian Lee Eric Lo Sindy Shou Hugh Wang
2 Efficient Processing of Spatial Join Using R-trees
What is Spatial Data?
Consists of points, lines, rectangles, polygons, surfaces…
Two types of queries in DBS: single-scan and multiple-scan queries
How to retrieve spatial objects in GIS efficiently?
Spatial Access Method (SAM), e.g. R*-tree
3 What is Spatial Access Method?
Designed to support single-scan queries, e.g. the window query: “Find all objects which intersect a given window”
Attempts to store objects which are close together in the data space on a common page
Reduces the number of disk accesses
4 How is a window query processed by a SAM?
1) Filter step: find all objects whose minimum bounding rectangles intersect the query rectangle
2) Refinement step: check whether the objects fulfill the query condition
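The two steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: rectangles are `(xl, yl, xu, yu)` tuples, each object is a hypothetical `(id, mbr, geometry)` triple whose geometry is a list of points, and the filter is a linear scan where a real SAM would use its index. The names `filter_step`, `window_query`, and `any_point_inside` are not from the presentation.

```python
from typing import Callable, List, Tuple

Rect = Tuple[float, float, float, float]  # (xl, yl, xu, yu), closed rectangle
Point = Tuple[float, float]
Obj = Tuple[str, Rect, List[Point]]       # hypothetical (id, MBR, exact geometry)


def intersects(a: Rect, b: Rect) -> bool:
    """True if two closed rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def any_point_inside(geometry: List[Point], window: Rect) -> bool:
    """Exact test used here: does any point of the geometry lie in the window?"""
    return any(window[0] <= x <= window[2] and window[1] <= y <= window[3]
               for x, y in geometry)


def filter_step(objects: List[Obj], window: Rect) -> List[Obj]:
    """Step 1: keep objects whose MBR intersects the query window."""
    return [o for o in objects if intersects(o[1], window)]


def window_query(objects: List[Obj], window: Rect,
                 exact_test: Callable[[List[Point], Rect], bool]) -> List[str]:
    """Step 2: run the exact test only on the candidates from the filter."""
    return [o[0] for o in filter_step(objects, window) if exact_test(o[2], window)]


objects = [
    ("a", (0, 0, 10, 10), [(0, 0), (10, 10)]),  # MBR hits the window, points do not
    ("b", (7, 1, 7, 1), [(7, 1)]),              # true hit
    ("c", (20, 20, 20, 20), [(20, 20)]),        # rejected by the filter
]
window = (6, 0, 9, 3)
print(window_query(objects, window, any_point_inside))
```

Object `a` shows why the refinement step is needed: its bounding rectangle overlaps the window although none of its points does.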
5 What is Spatial Join?
Combines two sets of spatial objects according to some spatial property
It is an important type of multiple-scan query in spatial DBS
6 Example of Spatial Join
Two relations: forests, cities (assume an attribute in each relation represents the borders of forests and cities)
Example query: “Find all forests which are in a city”
7 Problems when performing Spatial Join
It is too expensive in terms of CPU time and I/O time
Traditional index structures are not efficient for spatial join
How to make it more efficient? R*-tree
8 Why use an R*-tree for Spatial Join?
To optimize CPU time and I/O time
Fewer comparisons than a simple nested loop
Other algorithms cannot be efficiently applied to spatial join
9 R*-tree Approach for Spatial Join
Suppose there are two R*-trees R, S
Idea: use the property that directory rectangles form the minimum bounding box of the data rectangles in the corresponding subtrees.
If the rectangles of two directory entries E_R and E_S have a common intersection, then there may be a pair (rect_R, rect_S) of intersecting data rectangles in the corresponding subtrees; if they do not intersect, no such pair exists and both subtrees can be skipped.
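A minimal sketch of this idea follows. It is not the presenters' code: the `Entry` and `Node` classes, the helper builders, and the assumption that both trees have the same height are all made up for illustration. The recursion descends only into pairs of subtrees whose directory rectangles intersect.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (xl, yl, xu, yu)


@dataclass
class Entry:
    rect: Rect
    child: Optional["Node"] = None  # set on directory entries
    oid: Optional[str] = None       # set on data (leaf) entries


@dataclass
class Node:
    entries: List[Entry]
    is_leaf: bool


def intersects(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def mbr(rects: List[Rect]) -> Rect:
    """Minimum bounding box of a non-empty list of rectangles."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def leaf(items) -> Node:
    return Node([Entry(rect=r, oid=o) for o, r in items], True)


def directory(children: List[Node]) -> Node:
    return Node([Entry(rect=mbr([e.rect for e in c.entries]), child=c)
                 for c in children], False)


def spatial_join(r: Node, s: Node, out: List[Tuple[str, str]]) -> None:
    """Assumes both trees have equal height (a simplification)."""
    for er in r.entries:
        for es in s.entries:
            if not intersects(er.rect, es.rect):
                continue  # disjoint boxes: nothing below can intersect
            if r.is_leaf and s.is_leaf:
                out.append((er.oid, es.oid))
            else:
                spatial_join(er.child, es.child, out)


R = directory([leaf([("a", (0, 0, 2, 2)), ("b", (1, 1, 3, 3))]),
               leaf([("c", (10, 10, 12, 12))])])
S = directory([leaf([("x", (1.5, 1.5, 4, 4))]),
               leaf([("y", (11, 11, 13, 13)), ("z", (20, 20, 21, 21))])])
result: List[Tuple[str, str]] = []
spatial_join(R, S, result)
print(result)
```

In this toy data only two of the four directory-entry pairs intersect, so two of the four leaf-pair joins are never started.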
10 Minimum Bounding Box
11 Is there any way to be more efficient?
There are two areas we need to take into account in order to be more efficient:
CPU-time tuning
I/O-time tuning
12 CPU-Time Tuning
Two ways to improve CPU time:
Restricting the search space
Spatial sorting and plane sweep
13 Restricting the search space
Idea: Scan each of the two nodes and mark all entries which are required for performing the join (i.e. which intersect the intersection rectangle of the two nodes).
Then, each marked entry of one node is tested against all marked entries of the other node.
14 Restricting the search space (cont’d)
[Figure: entries 1 to 7 of node R and entries 1 to 7 of node S; after restriction, 3 entries of R and 2 entries of S remain]
Original: 7 of R * 7 of S = 49 joins
Now: 3 of R * 2 of S = 6 joins
Plus scanning: 7 of R + 7 of S = 14 times
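The marking step can be sketched as below. The coordinates are hypothetical, chosen only so that the counts match the slide (7 entries per node, 3 and 2 entries marked); the function name `restricted_join` and the dict-based node representation are assumptions as well.

```python
from typing import Dict, List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (xl, yl, xu, yu)


def intersects(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def mbr(rects: List[Rect]) -> Rect:
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def intersection(a: Rect, b: Rect) -> Optional[Rect]:
    xl, yl = max(a[0], b[0]), max(a[1], b[1])
    xu, yu = min(a[2], b[2]), min(a[3], b[3])
    return (xl, yl, xu, yu) if xl <= xu and yl <= yu else None


def restricted_join(r_entries: Dict[str, Rect], s_entries: Dict[str, Rect]):
    """Return (intersecting pairs, number of pairwise tests performed)."""
    window = intersection(mbr(list(r_entries.values())),
                          mbr(list(s_entries.values())))
    if window is None:
        return [], 0
    # One scan over each node marks the entries that can take part in the join.
    marked_r = [(n, r) for n, r in r_entries.items() if intersects(r, window)]
    marked_s = [(n, r) for n, r in s_entries.items() if intersects(r, window)]
    pairs = [(rn, sn) for rn, rr in marked_r for sn, sr in marked_s
             if intersects(rr, sr)]
    return pairs, len(marked_r) * len(marked_s)


# Hypothetical node contents: 7 entries each, node MBRs overlap in (8, 0, 10, 10).
node_r = {"r1": (0, 0, 2, 2), "r2": (1, 3, 3, 5), "r3": (2, 6, 4, 10),
          "r4": (4, 1, 6, 3), "r5": (7, 0, 9, 2), "r6": (6, 4, 10, 6),
          "r7": (8, 8, 10, 10)}
node_s = {"s1": (8, 1, 11, 3), "s2": (9, 5, 12, 7), "s3": (11, 0, 13, 2),
          "s4": (12, 4, 14, 6), "s5": (13, 8, 15, 10), "s6": (15, 1, 17, 3),
          "s7": (16, 6, 18, 10)}
pairs, tests = restricted_join(node_r, node_s)
print(pairs, tests)  # 6 pairwise tests instead of 7 * 7 = 49
```

The 14 scan steps are cheap rectangle-against-window checks, so the total of 6 + 14 operations still compares well with 49 pairwise tests.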
15 Spatial sorting and plane sweep
Idea: Sort the entries in a node of the R*-tree according to the spatial location of the corresponding rectangles.
Then move the sweep line, perpendicular to one of the axes, from left to right to compute the intersections.
16 Example of Sorted Intersection Test
t = r1: (r1, s1)
t = s1: (s1, r2)
t = r2: (r2, s2), (r2, s3)
t = s2: -
t = r3: (r3, s3)
[Figure: sweep line with labels r1.xu and s1.xl, where s1.xl < r1.xu]
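The sorted intersection test can be sketched as follows. The sweep moves along the x-axis; for the current rectangle t, the other sequence is scanned only while its rectangles start at or before t.xu, and the y-extents are compared for those. The rectangle coordinates are made up for illustration and are not those of the slide's figure.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xl, yl, xu, yu)
Named = Tuple[str, Rect]


def y_overlap(a: Rect, b: Rect) -> bool:
    return a[1] <= b[3] and b[1] <= a[3]


def sorted_intersection_test(rs: List[Named], ss: List[Named]) -> List[Tuple[str, str]]:
    """Plane sweep along x; returns intersecting (r, s) name pairs in sweep order."""
    rs = sorted(rs, key=lambda e: e[1][0])  # sort by lower x-coordinate
    ss = sorted(ss, key=lambda e: e[1][0])
    out: List[Tuple[str, str]] = []
    i = j = 0
    while i < len(rs) and j < len(ss):
        if rs[i][1][0] <= ss[j][1][0]:
            t = rs[i]  # sweep line stops at an R rectangle
            k = j
            while k < len(ss) and ss[k][1][0] <= t[1][2]:  # x-overlap with t
                if y_overlap(t[1], ss[k][1]):
                    out.append((t[0], ss[k][0]))
                k += 1
            i += 1
        else:
            t = ss[j]  # sweep line stops at an S rectangle
            k = i
            while k < len(rs) and rs[k][1][0] <= t[1][2]:
                if y_overlap(t[1], rs[k][1]):
                    out.append((rs[k][0], t[0]))
                k += 1
            j += 1
    return out


rs = [("r1", (0, 0, 3, 3)), ("r2", (4, 4, 9, 6)), ("r3", (10, 6, 12, 9))]
ss = [("s1", (2, 2, 5, 5)), ("s2", (6, 0, 7, 3)), ("s3", (8, 5, 11, 8))]
print(sorted_intersection_test(rs, ss))
```

Here r2 and s2 overlap in x but not in y, so they are tested and rejected; pairs such as (r1, s3) are never compared at all.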
17 I/O-Time Tuning
Goal: achieve good I/O performance with a buffer size as small as possible
The R*-tree might occupy only a small portion of the LRU buffer
Compute a read schedule of the pages to minimize the number of disk accesses
Local optimization policy based on spatial locality
Idea of the read schedule: if a frequently used page always resides in the buffer, the number of disk accesses can be reduced considerably
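To see why the order of page reads matters, the following sketch counts disk accesses for a read schedule under a small LRU buffer. The page names, the two schedules, and the function names are hypothetical; the LRU model (evict the least recently used page on overflow) is the standard one.

```python
from collections import OrderedDict
from typing import Iterable, List, Tuple


def count_disk_accesses(schedule: Iterable[str], buffer_size: int) -> int:
    """Number of page reads that miss an LRU buffer of the given size."""
    buffer: "OrderedDict[str, bool]" = OrderedDict()
    misses = 0
    for page in schedule:
        if page in buffer:
            buffer.move_to_end(page)        # hit: mark as most recently used
        else:
            misses += 1                     # miss: page must be read from disk
            buffer[page] = True
            if len(buffer) > buffer_size:
                buffer.popitem(last=False)  # evict the least recently used page
    return misses


def pairs_to_schedule(pairs: List[Tuple[str, str]]) -> List[str]:
    """Each pair (r, s) requires both pages to be in the buffer."""
    return [page for pair in pairs for page in pair]


# The same four page pairs in a spatially coherent order and in a scattered order.
coherent = [("r1", "s1"), ("r2", "s1"), ("r2", "s3"), ("r3", "s3")]
scattered = [("r1", "s1"), ("r3", "s3"), ("r2", "s1"), ("r2", "s3")]
print(count_disk_accesses(pairs_to_schedule(coherent), 2))
print(count_disk_accesses(pairs_to_schedule(scattered), 2))
```

With a buffer of two pages, the coherent order reads each of the five distinct pages exactly once, while the scattered order has to re-read two of them.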
18 Three such techniques
Local plane sweep
Local plane sweep with pinning
Local z-order
19 Local Plane-Sweep Order
Idea: Based on spatial ordering, the plane-sweep algorithm creates a sequence of pairs of intersecting rectangles.
This sequence can be used to determine the read schedule of the spatial join.
20 Local Plane-Sweep Order (cont’d)
[Figure: rectangles s1, r1, r2, s2, r3, r4, numbered 1 to 6]
Read schedule: < s1, s2, r2, r1, r4, r3 >
21 Local Plane-Sweep Order w/ Pinning
Idea:
1. Determine a pair (Er, Es) of entries w.r.t. the local plane-sweep order. Compute the degree of the rectangles of both entries:
Deg(E.rect) = number of intersections between E.rect and the rectangles which belong to entries of the other tree that are not yet processed
2. Pin the page in the buffer whose corresponding rectangle has maximal degree
3. Perform the spatial join of the pinned page with all other pages
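One possible reading of these three steps is sketched below; it is an interpretation, not the presenters' implementation. The input is assumed to be the list of intersecting (r, s) entry pairs in local plane-sweep order, with no duplicates. After taking the next pair, the page with the higher degree is "pinned" by moving all of its remaining pairs forward. The function name `pinned_order` and the sample pairs are hypothetical.

```python
from typing import List, Tuple

Pair = Tuple[str, str]  # (entry of R, entry of S)


def pinned_order(pairs: List[Pair]) -> List[Pair]:
    """Reorder plane-sweep pairs so a pinned page is fully joined before release."""
    remaining = list(pairs)
    order: List[Pair] = []
    while remaining:
        er, es = remaining.pop(0)  # step 1: next pair in sweep order
        order.append((er, es))
        # Degree: intersections with not-yet-processed entries of the other tree.
        deg_r = sum(1 for p in remaining if p[0] == er)
        deg_s = sum(1 for p in remaining if p[1] == es)
        if deg_r == 0 and deg_s == 0:
            continue               # nothing left to gain from pinning
        if deg_r >= deg_s:         # step 2: pin the page with maximal degree
            pinned = [p for p in remaining if p[0] == er]
        else:
            pinned = [p for p in remaining if p[1] == es]
        order.extend(pinned)       # step 3: join the pinned page with its partners
        remaining = [p for p in remaining if p not in pinned]
    return order


sweep = [("r1", "s1"), ("r2", "s2"), ("r1", "s3"), ("r3", "s2"), ("r1", "s4")]
print(pinned_order(sweep))
```

In this example r1 has degree 2 after the first pair, so its three joins are carried out back to back and the page of r1 never needs to be fetched again.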
22 Local Plane-Sweep Order w/ Pinning (cont’d)
[Figure: rectangles s1, r1, r2, s2, r3, r4 with entries Er and Es, where Er.rect = r1 and Es.rect = s2; degree values shown for Deg(r1) and Deg(s2): 0, 2, 1, 2]
23 Local Z-Order
Idea:
1. Compute the intersections between each rectangle of the one node and all rectangles of the other node
2. Sort the rectangles according to the spatial location of their centers
3. Decompose the underlying space into cells of equal size and provide an ordering on this set of cells
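Steps 2 and 3 can be illustrated with a small z-order (bit-interleaving) sketch. Several things here are assumptions rather than statements from the slides: the rectangles being sorted are taken to be the intersections from step 1, the space is split into a 2^bits by 2^bits grid, and the x bit is placed below the y bit when interleaving. All function names are hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xl, yl, xu, yu)


def interleave(ix: int, iy: int, bits: int) -> int:
    """Z-value of grid cell (ix, iy): x bits in even positions, y bits in odd."""
    z = 0
    for b in range(bits):
        z |= ((ix >> b) & 1) << (2 * b)
        z |= ((iy >> b) & 1) << (2 * b + 1)
    return z


def z_value(rect: Rect, space: Rect, bits: int) -> int:
    """Z-value of the equal-sized grid cell containing the rectangle's center."""
    n = 1 << bits
    cx, cy = (rect[0] + rect[2]) / 2, (rect[1] + rect[3]) / 2
    ix = min(int((cx - space[0]) / (space[2] - space[0]) * n), n - 1)
    iy = min(int((cy - space[1]) / (space[3] - space[1]) * n), n - 1)
    return interleave(ix, iy, bits)


def z_sorted(rects: List[Tuple[str, Rect]], space: Rect, bits: int = 1) -> List[str]:
    """Names of the rectangles sorted by the z-value of their centers."""
    return [name for name, _ in
            sorted(rects, key=lambda e: z_value(e[1], space, bits))]


space = (0, 0, 8, 8)
intersections = [("A", (5, 5, 7, 7)), ("B", (0, 0, 2, 2)),
                 ("C", (5, 1, 7, 3)), ("D", (1, 5, 3, 7))]
print(z_sorted(intersections, space))
```

With `bits=1` the space has four cells, visited in the order lower-left, lower-right, upper-left, upper-right, so rectangles whose centers fall in neighbouring cells end up close together in the schedule.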
24 Local Z-Order (cont’d)
[Figure: rectangles s1, r1, r2, s2, r3, r4 over cells I, II, III, IV, with the resulting read schedule]
25 Number of Disk Accesses
[Chart: number of disk accesses vs. size of LRU buffer; values shown: 5384, 5290, 2373, 2392]
26 Number of Disk Accesses (cont’d)
[Chart: number of disk accesses vs. size of LRU buffer]
27 Q & A
That’s it for the presentation. Any questions?
28 Reference
1. Brinkhoff T., Kriegel H.-P., Seeger B. (1993). Efficient Processing of Spatial Joins Using R-trees. Institute of Computer Science, University of Munich. ACM SIGMOD, Washington, DC, USA.