1
Thomas Zahn, CST
Seminar: Information Management in the Web
Query Processing Over Peer-to-Peer Data Sharing Systems (UC Santa Barbara)
2
Motivation
- Example: find all objects whose attribute values (NOT hash IDs!) are between 100 and 200.
- DHTs support range queries poorly.
- Due to hashing, semantically consecutive objects may be stored at "opposite" ends of the overlay.
- For each value in the range, a separate lookup has to be issued.
3
Overlay Object Placement
[Figure: overlay ID space from 0 to 2^128 - 1; the consecutive values 15, 16, 17 are placed at distant IDs]
h(15) = d1a08e
h(16) = 3102ab
h(17) = d46a1c
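The placement effect can be reproduced with a minimal sketch. SHA-1 and the 6-digit truncation are assumptions made for illustration (the slide does not name a hash function, so the IDs printed here will not match the ones on the slide), and `overlay_id` and `lookups_for_range` are hypothetical helper names.

```python
import hashlib

def overlay_id(value, hex_digits=6):
    """Hash an attribute value to a short hex overlay ID (SHA-1 is an assumption)."""
    digest = hashlib.sha1(str(value).encode("utf-8")).hexdigest()
    return digest[:hex_digits]

def lookups_for_range(low, high):
    """One DHT lookup key per discrete value in [low, high]."""
    return [overlay_id(v) for v in range(low, high + 1)]

# Consecutive attribute values end up at unrelated positions in the ID space.
consecutive = {v: overlay_id(v) for v in (15, 16, 17)}
print(consecutive)

# The range 100..200 from the motivation slide needs one lookup per value.
lookups = lookups_for_range(100, 200)
print(len(lookups))  # 101
```

The hashed IDs carry no ordering information about the original values, which is why the range cannot be resolved with a single lookup.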
4
Problem
- For each value in the range, a separate lookup would have to be issued.
- While theoretically possible for discrete sets (e.g. [10,11,12,…,50]),
- it is completely impossible for continuous sets (e.g. [10.0, 50.0]).
5
General Concept (1)
- Uses a 2-dimensional CAN virtual space.
- The virtual space is partitioned into rectangular zones.
- Each zone is owned by an active node.
- Each node maintains a routing table (RT) with its neighbors.
[Figure: virtual space with corners (20,20), (80,20), (20,80), (80,80), partitioned into zones 1 to 7]
6
General Concept (2)
- A node stores the results of queries whose ranges are hashed to its zone.
- A range query [a,b] is hashed to the target point (a,b); the zone containing that point is the target zone, and its owner is the target node.
- The result of the range query is stored at the target node/zone.
[Figure: virtual space with corners (20,20), (80,20), (20,80), (80,80), partitioned into zones 1 to 7, showing an example range query]
7
General Concept (3)
Given two range queries r1: [a1,b1] and r2: [a2,b2] with target points t1 (for r1) and t2 (for r2):
1. If a1 < a2, t1 lies to the left of t2.
2. If b1 < b2, t1 lies below t2.
3. t1 lies to the upper-left of t2 iff range r1 contains range r2.
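The third rule can be checked with a small sketch. The function names are hypothetical, and treating the boundaries as inclusive (so that equal endpoints still count as "upper-left") is an assumption.

```python
def target_point(query_range):
    """Map a range [a, b] to its target point (a, b) in the virtual space."""
    a, b = query_range
    return (a, b)

def contains(r1, r2):
    """True if range r1 covers range r2 completely (inclusive bounds assumed)."""
    return r1[0] <= r2[0] and r2[1] <= r1[1]

def is_upper_left(t1, t2):
    """True if t1 is not to the right of t2 and not below it."""
    return t1[0] <= t2[0] and t1[1] >= t2[1]

r1, r2 = (10, 90), (30, 50)
# Containment of ranges and the upper-left relation of target points agree.
print(contains(r1, r2), is_upper_left(target_point(r1), target_point(r2)))
print(contains(r2, r1), is_upper_left(target_point(r2), target_point(r1)))
```

Both functions test the same two inequalities, which is exactly why the geometric relation can stand in for range containment.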
8
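General Concept (4)
- A range query is hashed to a target point in zone A.
- If a prior range query result containing it exists, that result must have been hashed to a point in the shaded region.
- Any zone intersecting the shaded region can potentially contain a result.
[Figure: zones A, B, C, D; target point (x,y) in zone A with the shaded region to its upper left]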
9
General Concept (5)
- Two target points t1 (for r1) and t2 (for r2): t1 lies to the upper-left of t2 iff range r1 contains range r2.
- Diagonal zone: let zone z have corners (x1,y1),(x2,y2) and zone z' have corners (a1,b1),(a2,b2), each given as lower-left and upper-right. z' is a diagonal zone of z if a2 ≤ x1 and b1 ≥ y2.
- Intuitively: z' lies diagonally above the upper-left corner of z.
- Only non-empty zones exist.
- A diagonal zone of z can answer ALL range queries that hash into z.
[Figure: zones z, z', B, C; z' sits diagonally above and to the left of z]
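The diagonal-zone condition translates directly into code. This is a minimal sketch; the `Zone` type and its lower-left/upper-right corner convention are assumptions chosen to match the inequalities a2 ≤ x1 and b1 ≥ y2.

```python
from typing import NamedTuple

class Zone(NamedTuple):
    """Rectangular zone given by lower-left (x1, y1) and upper-right (x2, y2)."""
    x1: float
    y1: float
    x2: float
    y2: float

def is_diagonal_zone(z_prime, z):
    """z' is a diagonal zone of z if its right edge is at or left of z's left
    edge (a2 <= x1) and its bottom edge is at or above z's top edge (b1 >= y2)."""
    return z_prime.x2 <= z.x1 and z_prime.y1 >= z.y2

z = Zone(50, 20, 80, 50)
upper_left = Zone(20, 50, 50, 80)   # touches z only at z's upper-left corner
directly_above = Zone(50, 50, 80, 80)

print(is_diagonal_zone(upper_left, z))      # True
print(is_diagonal_zone(directly_above, z))  # False
```

Because every point of a diagonal zone is to the upper-left of every point of z, any range cached there contains any range hashed into z.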
10
Zone Maintenance
- Initially, the entire hash space is a single zone assigned to one active node.
- Each active node has a routing table (RT) containing its neighboring active nodes along with their zone coordinates.
- A zone splits when its load (storage and/or processing) becomes too high.
- The decision is made by the zone owner.
- The owner contacts a passive node and assigns it a portion of its zone.
- The owner transfers the corresponding results and the neighbor list.
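A zone split might look roughly like the sketch below. The slide does not say how the zone is divided, so halving along the longer side is an assumption, as are the `ZoneState` class, the half-open ownership test, and the omission of the neighbor-list transfer.

```python
from dataclasses import dataclass, field

@dataclass
class ZoneState:
    """A zone plus the cached results whose target points fall inside it."""
    x1: float
    y1: float
    x2: float
    y2: float
    results: dict = field(default_factory=dict)  # target point (a, b) -> result

    def owns(self, point):
        # Half-open on the upper edges so adjacent zones never both own a point.
        return self.x1 <= point[0] < self.x2 and self.y1 <= point[1] < self.y2

def split(zone):
    """Halve the zone along its longer side (assumed policy) and hand the
    upper/right half, with its cached results, to a newly activated node."""
    if zone.x2 - zone.x1 >= zone.y2 - zone.y1:
        mid = (zone.x1 + zone.x2) / 2
        new_zone = ZoneState(mid, zone.y1, zone.x2, zone.y2)
        zone.x2 = mid
    else:
        mid = (zone.y1 + zone.y2) / 2
        new_zone = ZoneState(zone.x1, mid, zone.x2, zone.y2)
        zone.y2 = mid
    # Transfer the results that now belong to the new zone.
    for point in [p for p in zone.results if new_zone.owns(p)]:
        new_zone.results[point] = zone.results.pop(point)
    return new_zone

owner = ZoneState(20, 20, 80, 80, {(30, 70): "result A", (60, 40): "result B"})
passive = split(owner)
print(owner)
print(passive)
```

After the split, each cached result lives in exactly one of the two zones, so lookups routed to either half still find what was stored there.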
11
Query Routing (1)
- The result is likely to be cached at the target zone.
- A range query is routed through the virtual space toward its target zone.
- Starting at the requesting zone, each zone passes the query on to a neighboring zone.
- A zone chooses the neighbor zone whose coordinates are closest to the target point.
- The process continues until the target zone is reached.
12
Query Routing (2)
- Simple way: compute the Euclidean distance between the target point and the center of a zone.
- This might not converge.
[Figure: zones 1 to 4 and target point t]
13
Query Routing (3)
- The distance of target t from a zone Z should be measured as the closest distance of t from the entire zone.
[Figure: zone Z surrounded by regions R1 to R8, arranged as R5 R1 R6 / R3 Z R4 / R7 R2 R8]
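The closest distance from a point to a rectangular zone can be computed per axis, as in this minimal sketch. Zones are written as hypothetical `(x1, y1, x2, y2)` tuples (lower-left, upper-right), and `next_hop` is an illustrative name for the greedy neighbor choice.

```python
import math

def distance_to_zone(t, zone):
    """Closest Euclidean distance from point t to any point of the zone."""
    x1, y1, x2, y2 = zone
    dx = max(x1 - t[0], 0.0, t[0] - x2)  # 0 if t is within the zone's x-extent
    dy = max(y1 - t[1], 0.0, t[1] - y2)  # 0 if t is within the zone's y-extent
    return math.hypot(dx, dy)

def next_hop(t, neighbor_zones):
    """Greedy choice: forward to the neighbor zone closest to the target point."""
    return min(neighbor_zones, key=lambda z: distance_to_zone(t, z))

zone = (20, 20, 50, 50)
print(distance_to_zone((30, 30), zone))  # 0.0, t is inside the zone
print(distance_to_zone((60, 30), zone))  # 10.0, t is directly to the right
print(distance_to_zone((53, 54), zone))  # 5.0, t is diagonal to a corner
print(next_hop((60, 30), [(20, 20, 50, 50), (20, 50, 50, 80)]))
```

When t lies straight above, below, left, or right of the zone, only one axis contributes; when it lies in a corner region, the distance is measured to the nearest corner.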
14
Forwarding (1)
- The query reaches the target zone, which checks its local cache.
- If no result containing the query range is found, the query is forwarded.
- Only zones to the upper left of the target point can have a result containing the given range.
- Forwarding is similar to flooding.
- Forward Limit: 0.0 to 1.0
[Figure: zones 2 to 8, 10, and 11 around the target point]
15
Forwarding (2)
- Again: diagonal zones are especially interesting, since they are guaranteed to have a result containing the given range.
- Because: every point in the diagonal zone lies to the upper-left of the target point, so every range hashed there contains the query range.
- BUT: a zone may not have a diagonal zone.
[Figure: zones 1 to 7]
16
Updates
- A tuple t with range attribute A=k is updated.
- Send an update message to the target zone containing the point (k,k).
- Tuple t is included in all ranges [a,b] such that a ≤ k ≤ b.
- Forward the update to all zones that lie to the upper left of the target zone.
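The update rule can be sketched as two checks. Both helper names are hypothetical, zones are `(x1, y1, x2, y2)` tuples, and reading "upper left of the target zone" as "overlapping the region upper-left of the point (k,k)" is an assumption.

```python
def affected_ranges(k, cached_ranges):
    """Cached ranges [a, b] whose results include a tuple with A = k."""
    return [(a, b) for (a, b) in cached_ranges if a <= k <= b]

def zone_may_hold_affected_result(k, zone):
    """True if the zone overlaps the region {x <= k, y >= k}, i.e. it may
    cache a range that contains k and therefore needs the update."""
    x1, y1, x2, y2 = zone
    return x1 <= k and y2 >= k

cached = [(10, 90), (30, 50), (60, 70)]
print(affected_ranges(40, cached))  # [(10, 90), (30, 50)]

zones = [(20, 50, 50, 80), (50, 20, 80, 50), (20, 20, 50, 35)]
print([z for z in zones if zone_may_hold_affected_result(40, z)])
```

The second check is what makes updates costly: every zone that overlaps the upper-left region has to be visited, whether or not it actually caches an affected range.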
17
Conclusion
- Does not assume a naturally uniform distribution of attribute values.
- Efficient average path length (O( )).
- BUT: hot-spot nodes in the upper-left section lead to many splits, heavy partitioning, and longer path lengths.
- Cached results may not reflect the current result.
- Updates and deletions are expensive.
18
Questions?