1
Load Balancing in Structured P2P Systems (DHTs)
Sonesh Surana
sonesh@cs.berkeley.edu
[Brighten Godfrey, Karthik Lakshminarayanan, Ananth Rao, Ion Stoica, Richard Karp]
2
Definitions, Goal, Assumptions
Definitions
–Load: could be storage, bandwidth, etc.
–Target: the load a node is willing to take (e.g. capacity, or average utilization plus slack)
Goal
–To maintain the system in a state where the load on each node is below its target
Assumptions
–Nodes are cooperative
–Only one bottlenecked resource
3
Motivation for Load Balancing in DHTs
In DHTs, item identifiers determine which node items are mapped to
–Items and nodes are assigned Ids randomly
–This can cause a log(N) factor imbalance in the number of items at a node
Aggravating problems
–Sizes of items may not be the same
–Some applications want to choose identifiers themselves, not randomly (e.g. database applications)
–Nodes themselves might have heterogeneous capacities
4
Virtual Servers
A virtual server is a contiguous region of the Id space
[Figure: Chord ring]
5
Virtual Servers
A virtual server is a contiguous region of the Id space
Each node can be responsible for many virtual servers
[Figure: Chord ring with virtual servers owned by Node A, Node B and Node C]
6
Static Mapping of Virtual Servers
Imbalance: Give each node log(N) virtual servers. This leaves only a constant factor imbalance
Heterogeneity: Each node has a number of virtual servers proportional to its target
[Figure: Chord ring with virtual servers of load 20, 11, 3, 10, 15 and 30 mapped to Node A (T=50), Node B (T=35) and Node C (T=15); node load labels L=45, L=31, L=3, L=41; one node is marked Heavy]
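The effect of giving each node log(N) virtual servers can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: Ids are uniform random points on a unit ring, each virtual server owns the arc back to its predecessor (as in Chord), and "imbalance" is measured as the largest per-node share of the Id space divided by the average share. The function name `max_to_avg_share`, the ring size of 1024 nodes and the seed are all hypothetical choices, not taken from the slides.

```python
import math
import random

def max_to_avg_share(num_nodes, vs_per_node, rng):
    """Return (largest node share of the Id space) / (average node share)."""
    # Every virtual server picks a random point on the unit ring [0, 1).
    points = sorted(
        (rng.random(), node)
        for node in range(num_nodes)
        for _ in range(vs_per_node)
    )
    share = [0.0] * num_nodes
    prev_pos = points[-1][0] - 1.0  # predecessor of the first point, wrapped around
    for pos, node in points:
        share[node] += pos - prev_pos  # arc owned by this virtual server
        prev_pos = pos
    # The shares sum to 1, so the average share is 1 / num_nodes.
    return max(share) * num_nodes

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 1024
single = max_to_avg_share(n, 1, rng)                 # one Id per node
multi = max_to_avg_share(n, int(math.log2(n)), rng)  # log2(N) virtual servers per node
print(f"1 virtual server per node:        max/avg = {single:.2f}")
print(f"log2(N) virtual servers per node: max/avg = {multi:.2f}")
```

With one Id per node the ratio grows roughly like log(N), while with log(N) virtual servers per node it should stay near a small constant. The heterogeneity rule on the slide could be modelled the same way by making `vs_per_node` proportional to each node's target.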
7
Dynamic Re-Mapping of Virtual Servers
Allow dynamic re-mapping of load in a system.
Virtual server is the basic unit of load movement.
[Figure: Chord ring with Node A (T=50), Node B (T=35) and Node C (T=15); node load labels L=45, L=31, L=3, L=41; one node is marked Heavy]
8
Dynamic Re-Mapping of Virtual Servers
Allow dynamic re-mapping of load in a system.
Virtual server is the basic unit of load movement.
[Figure: Chord ring with Node A (T=50), Node B (T=35) and Node C (T=15); node load labels L=45, L=31, L=14, L=30; no node is marked Heavy]
9
Dynamic Re-mapping of Virtual Servers
Advantages
–Flexibility in being able to move load from any node to any other node, not just to a neighbour
–Easily supported by the underlying DHT, because the movement of a virtual server appears to the DHT as a leave followed by a join
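A minimal sketch of why re-mapping needs no special DHT support: moving a virtual server is expressed as nothing more than a leave followed by a join. The class `ToyDHT`, the helper names and the virtual server ids `v1`..`v6` are hypothetical. The node targets come from the figure labels, but the assignment of individual virtual servers to nodes is an assumption, chosen so that the node totals agree with loads 45, 41 and 3 before the move.

```python
class ToyDHT:
    """Toy membership table; a real DHT would also move keys and fix routing state."""

    def __init__(self):
        self.owner = {}   # virtual server id -> name of the physical node hosting it
        self.events = []  # membership events in the order the DHT sees them

    def join(self, vs, node):
        self.owner[vs] = node
        self.events.append(("join", vs, node))

    def leave(self, vs):
        node = self.owner.pop(vs)
        self.events.append(("leave", vs, node))

def move(dht, vs, dst):
    """Re-map a virtual server: to the DHT this is just a leave and a join."""
    dht.leave(vs)
    dht.join(vs, dst)

def node_loads(dht, vs_load):
    """Sum the load of the virtual servers hosted on each node."""
    loads = {}
    for vs, node in dht.owner.items():
        loads[node] = loads.get(node, 0) + vs_load[vs]
    return loads

# Assumed layout (not stated in the slides): A hosts 20+15+10, B hosts 30+11, C hosts 3.
vs_load = {"v1": 20, "v2": 15, "v3": 10, "v4": 30, "v5": 11, "v6": 3}
target = {"A": 50, "B": 35, "C": 15}
dht = ToyDHT()
for vs, node in [("v1", "A"), ("v2", "A"), ("v3", "A"),
                 ("v4", "B"), ("v5", "B"), ("v6", "C")]:
    dht.join(vs, node)

before = node_loads(dht, vs_load)  # B carries 41 and is over its target of 35
move(dht, "v5", "C")               # shift the load-11 virtual server from B to C
after = node_loads(dht, vs_load)   # every node is now at or below its target
print(before, after, dht.events[-2:])
```

The destination does not have to be a ring neighbour of the source, which is the flexibility the first advantage refers to.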
10
Load Balancing Scheme - 2 Actions
Take periodic action
–Try to bring every node below its target so that the system reaches a good state
Take emergency action
–If the arrival of an item causes a node to go over capacity, then seek help immediately
Analogy: Go to the hospital for regular checkups, but rush to the hospital if you’ve fractured an arm
11
Load Balancing Scheme - Directories
Directory Nodes
All nodes, heavy or light, periodically report information to directories.
Soft State
Directories periodically compute the transfer schedule and report it back to the nodes, which then do the actual transfer.
[Figure: heavy nodes H1, H2, H3; directories D1, D2; light nodes L1, L2, L3, L4, L5]
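One way to read "soft state" is that a directory keeps a node's report only for a limited time unless the node refreshes it. The sketch below illustrates that reading. The `Directory` class, the `ttl` parameter and the example loads are hypothetical, and time is passed in explicitly so the example is deterministic.

```python
class Directory:
    """Soft-state table of node reports; entries vanish unless refreshed."""

    def __init__(self, ttl):
        self.ttl = ttl      # assumed lifetime of a report, in seconds
        self.reports = {}   # node name -> (load, target, time of report)

    def report(self, node, load, target, now):
        # A newer report from the same node simply overwrites the old one.
        self.reports[node] = (load, target, now)

    def expire(self, now):
        # Drop reports that have not been refreshed within the ttl.
        self.reports = {
            node: rep for node, rep in self.reports.items()
            if now - rep[2] <= self.ttl
        }

    def classify(self, now):
        """Return (heavy, light) node names among the reports that are still fresh."""
        self.expire(now)
        heavy = sorted(n for n, (load, tgt, _) in self.reports.items() if load > tgt)
        light = sorted(n for n, (load, tgt, _) in self.reports.items() if load <= tgt)
        return heavy, light

d = Directory(ttl=120)
d.report("H1", load=41, target=35, now=0)
d.report("L1", load=3, target=15, now=0)
d.report("L2", load=45, target=50, now=50)
heavy_early, light_early = d.classify(now=60)   # all three reports are fresh
heavy_late, light_late = d.classify(now=130)    # the reports made at time 0 have expired
print(heavy_early, light_early, heavy_late, light_late)
```

A directory built this way needs no explicit cleanup when a node leaves: its report simply ages out, and the next transfer schedule is computed without it.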
12
Basic Load Balancing Computation
–Unload
–Insert
13
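Unload
[Figure: before unloading, A (T=6, L=7) holds virtual servers 2 and 5, B (T=20, L=17) holds 7, 6 and 4, and C (T=24, L=25) holds 10, 8, 4 and 3. After unloading, A is at T=6, L=5 and C is at T=24, L=22; the Pool receives the virtual servers they shed.]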
14
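Insert
[Figure: before inserting, A (T=6, L=5) holds virtual server 5, B (T=20, L=17) holds 7, 6 and 4, C (T=24, L=22) holds 10, 8 and 4, and the Pool holds 2 and 3. After inserting, B is at T=20, L=20 and C is at T=24, L=24.]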
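The two steps can be sketched as follows, using the numbers from the Unload and Insert figures. The selection rules here are assumptions that happen to reproduce the figure labels, not necessarily the authors' exact rules: a heavy node sheds its smallest virtual servers first, and pooled virtual servers are placed largest first on the node with the most slack, provided that node stays at or below its target. The function names and the dictionary layout are hypothetical.

```python
def unload(nodes, pool):
    """Each heavy node sheds its smallest virtual servers until load <= target."""
    for node in nodes.values():
        node["vs"].sort()
        while node["vs"] and sum(node["vs"]) > node["target"]:
            pool.append(node["vs"].pop(0))

def insert(nodes, pool):
    """Place pooled virtual servers, largest first, on the node with most slack."""
    for load in sorted(pool, reverse=True):
        def slack(name):
            return nodes[name]["target"] - sum(nodes[name]["vs"])
        best = max(nodes, key=slack)
        if load <= slack(best):  # the accepting node must not become heavy
            nodes[best]["vs"].append(load)
            pool.remove(load)

def loads(nodes):
    return {name: sum(node["vs"]) for name, node in nodes.items()}

nodes = {
    "A": {"target": 6, "vs": [2, 5]},
    "B": {"target": 20, "vs": [7, 6, 4]},
    "C": {"target": 24, "vs": [10, 8, 4, 3]},
}
pool = []

unload(nodes, pool)
after_unload = loads(nodes)    # {"A": 5, "B": 17, "C": 22}
pooled = sorted(pool)          # [2, 3]

insert(nodes, pool)
after_insert = loads(nodes)    # {"A": 5, "B": 20, "C": 24}
print(after_unload, pooled, after_insert, pool)
```

Any virtual server that fits nowhere stays in `pool`; in this example the pool ends up empty.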
15
Emergency
If an arriving item would overload a node (an emergency), the node asks its directory for help
–Only this node unloads
–Inserts happen as before; however, we make sure the accepting node does not become heavy
–If this directory cannot resolve the overload, the node tries offloading the remaining offending virtual servers through another directory
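A rough sketch of the emergency path, under the same assumed selection rules as a periodic round (shed the smallest virtual servers, place the largest first on the light node with most slack). The `emergency` function, the node dictionaries and the example numbers are hypothetical. Each directory is modelled simply as the list of light nodes it knows about.

```python
def emergency(node, directories):
    """Offload an overloaded node; return the virtual servers nobody could take."""
    # Only the overloaded node unloads: collect its offending virtual servers.
    offending = []
    node["vs"].sort()
    while node["vs"] and sum(node["vs"]) > node["target"]:
        offending.append(node["vs"].pop(0))

    # Try one directory after another until nothing is left to place.
    for light_nodes in directories:
        remaining = []
        for load in sorted(offending, reverse=True):
            # Candidates are nodes that would still be at or below target afterwards.
            fits = [n for n in light_nodes if sum(n["vs"]) + load <= n["target"]]
            if fits:
                best = max(fits, key=lambda n: n["target"] - sum(n["vs"]))
                best["vs"].append(load)
            else:
                remaining.append(load)
        offending = remaining
        if not offending:
            break

    # Anything still unplaced stays on the original node (assumption of this sketch).
    node["vs"].extend(offending)
    return offending

x = {"target": 10, "vs": [6, 5]}   # an arriving item pushed the load to 11
p = {"target": 8, "vs": [5]}       # known to the first directory, slack 3
q = {"target": 12, "vs": [4]}      # known to the second directory, slack 8
leftover = emergency(x, [[p], [q]])
print(leftover, x["vs"], p["vs"], q["vs"])
```

Here the first directory has no node that can accept the load-5 virtual server without becoming heavy, so the request falls through to the second directory.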
16
Dynamicity in P2P
Object Churn Rate
–Objects are inserted into and deleted from the P2P system
Load Impulses
–Sudden popularity
–Object ids might not be hashed
Node Churn Rate
–Nodes might come and go
–We assume there are external methods to make sure that nodes leaving does not cause loss of data objects
  –Replication
17
Evaluation
–Static vs. dynamic mapping?
–How well do we do under impulses?
–How well do we do when nodes enter and leave the system?
–How much load is moved around due to load balancing?
19
Simulation Setup
–4096 nodes in the system
–16 directories
–12 virtual servers per node
–Steady state: 100,000 data items
–Load balancing runs once every minute
Distributions
–Node capacities: Pareto
–Item sizes: Uniform
–Arrival and departure process for items: Poisson. We look at steady state. Arrival rate = 100 items/sec
At the moment, everything is synchronized
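A few quantities follow from these parameters by simple arithmetic. The sketch below collects the stated values in a hypothetical `SimulationSetup` record and derives them. Two of the derivations rest on assumptions that the slides do not state: that nodes are spread evenly over the directories, and that the steady-state item count obeys Little's law (items in system = arrival rate × mean lifetime).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SimulationSetup:
    """Parameters as listed on the slide; the field names are illustrative."""
    nodes: int = 4096
    directories: int = 16
    virtual_servers_per_node: int = 12
    steady_state_items: int = 100_000
    arrival_rate_per_sec: float = 100.0
    balance_period_sec: float = 60.0

setup = SimulationSetup()

# Total number of virtual servers on the ring.
total_virtual_servers = setup.nodes * setup.virtual_servers_per_node

# Nodes per directory, assuming an even spread (an assumption).
nodes_per_directory = setup.nodes // setup.directories

# Mean item lifetime implied by Little's law (an assumption): N = rate * lifetime.
mean_item_lifetime_sec = setup.steady_state_items / setup.arrival_rate_per_sec

# Expected number of item arrivals between two periodic balancing rounds.
arrivals_per_round = setup.arrival_rate_per_sec * setup.balance_period_sec

print(total_virtual_servers, nodes_per_directory,
      mean_item_lifetime_sec, arrivals_per_round)
```

Note that 12 equals log2(4096), which is consistent with the log(N) virtual servers per node suggested for static mapping, although the slides do not say that this is why 12 was chosen.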
20
Static Mapping vs. Dynamic Re-Mapping
21
Impulse
22
Future Work
Investigate what happens with different distributions for item sizes and item ids
–Zipf, bi-modal, traces
Finally, implement on Chord
23
Conclusion
Plain hashing and static mapping of virtual servers may not be enough
–Dynamic re-mapping: flexible and easily supported by the DHT
P2P systems are dynamic, and bursts happen
–So take emergency action in addition to periodic action
24
Questions? + Feedback