Scalable Load-Distance Balancing
Edward Bortnikov, Israel Cidon, Idit Keidar
EE Department, The Technion, Haifa, Israel
12.06.02
Service Assignment
- Assign (many) users to (a few) servers
- Applications: content/game servers; Internet gateways in a wireless mesh network (WMN); many other technologies
- Increasing demand for real-time access to multiple service points (servers) gives rise to the service assignment problem: associating each user with a server so that QoS improves
Load-Distance Balancing (LDB)
- Two sources of service delay:
  - Network delay: due to user-server distance (e.g., depends on the number of network hops)
  - Congestion delay: due to server load (a general monotonic non-decreasing function)
- Total delay = network delay + congestion delay
- The Load-Distance Balancing (LDB) problem: minimize the maximum total delay (cost)
- NP-complete (a 2-approximation exists)
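The cost being minimized can be sketched in a few lines of Python. The function names, the toy line topology, and the linear congestion function below are illustrative assumptions, not the model from the talk.

```python
# Sketch of the LDB cost: max over users of network delay + congestion
# delay at their server. All names and the toy data are hypothetical;
# a linear congestion function is assumed for illustration.

def ldb_cost(users, assignment, distance, congestion):
    """Maximum total delay over all users under a given assignment."""
    # Server load = number of users assigned to it.
    load = {}
    for u in users:
        s = assignment[u]
        load[s] = load.get(s, 0) + 1
    # Total delay of a user = distance to its server + congestion delay there.
    return max(distance(u, assignment[u]) + congestion(load[assignment[u]])
               for u in users)

# Toy example: two servers at positions 0 and 10, users on a line.
users = [1, 2, 3, 9]
assignment = {1: 0, 2: 0, 3: 0, 9: 10}
cost = ldb_cost(users, assignment,
                distance=lambda u, s: abs(u - s),
                congestion=lambda load: 2 * load)  # linear congestion
```

Moving user 3 to the far server would lower server 0's congestion at the price of a longer network distance; balancing that trade-off is exactly the LDB problem.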
LDB versus LB
[Figure: two assignments compared; one keeps network distance OK but congestion high, the other keeps congestion OK but network distance high]
Distributed LDB
- Distributed assignment computation:
  - Initially, users report their locations to the closest servers (every user starts assigned to its nearest server)
  - Servers communicate and may change the assignment
  - Synchronous, failure-free communication model
- Requirements (eventually, the following must hold):
  - Quiescence: inter-server communication stops
  - Stability: the assignment stops changing
  - A constant α-approximation of the optimal cost is computed; α ≥ 2 is a parameter (trades communication/time for cost)
What About Locality?
- In the distributed case, we care about locality in addition to cost
- Extreme global solution: collect all data and compute the assignment centrally
  - Guarantees a 2-approximation of the optimal cost
  - Excessive communication and network latency
- Extreme local solution: nearest-server assignment
  - No communication
  - No approximation guarantee (can’t handle crowds)
Workload-Sensitive Locality
- The cost function is distance-sensitive: most assignments can go to nearby servers, except for dissipating congestion peaks
- Distributed solution structure:
  - Start from the nearest-server assignment
  - Offload congestion to nearby servers
- Workload-sensitive locality: go as far as needed, and no farther, to achieve the desired approximation α
Example: Light Uniform Workload
Example: Peaky Workload
[Figure: LDB-approximation]
Iterative Clustering
- Partition servers into clusters; assign each user within its cluster
- Choose one leader per cluster: the leader collects local data and computes within-cluster assignments
- Clusters may merge: a cluster tries to merge as long as its cost is ε-improvable, i.e., it could be decreased by a factor ≥ 1 + ε if all servers were harnessed
- When no cluster is ε-improvable, the desired approximation is achieved: α = 2(1 + ε), where ε is a slack factor
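The ε-improvability test and the resulting guarantee can be written out as a small sketch. The names `cluster_cost` and `all_servers_lower_bound` are hypothetical quantities supplied by the cost model, and the in-cluster algorithm is assumed to be the 2-approximation mentioned earlier.

```python
# Sketch of the eps-improvability test (all names are illustrative).

def eps_improvable(cluster_cost, all_servers_lower_bound, eps):
    """True if harnessing all servers could cut the cluster's cost
    by a factor of at least 1 + eps."""
    return cluster_cost > (1 + eps) * all_servers_lower_bound

# Once no cluster is eps-improvable, every cluster's cost is within a
# (1 + eps) factor of the best achievable, and the in-cluster
# 2-approximation gives the overall factor:
eps = 0.1
alpha = 2 * (1 + eps)
```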
Tree (Structured) Clustering
- Maintain a hierarchy of clusters; employ clusters of size 2^i
- While some cluster is ε-improvable, double it (merge with its hierarchy neighbor)
- Simple; O(log N) convergence
- Requires hierarchy maintenance
- May not adapt well: misses cross-boundary optimizations
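The doubling loop can be sketched on a toy line of 2^k servers. The power-of-two layout, the `improvable` predicate, and all names below are assumptions made for illustration, not the paper's implementation.

```python
# Toy sketch of tree (structured) clustering on a line of N = 2^k servers.
# Cluster boundaries stay aligned to powers of two; an eps-improvable
# cluster merges with its sibling in the hierarchy, doubling its size.

def tree_clustering(n_servers, eps, improvable):
    """Clusters are kept as a dict (start index -> power-of-two size)."""
    clusters = {i: 1 for i in range(n_servers)}
    merged = True
    while merged:  # repeat until no cluster doubles
        merged = False
        for start, size in sorted(clusters.items()):
            if clusters.get(start) != size:
                continue  # this cluster was merged away earlier in the round
            if size == n_servers or not improvable(start, size, eps):
                continue
            # The sibling shares the parent cluster of size 2 * size.
            parent = (start // (2 * size)) * (2 * size)
            sibling = parent + size if start == parent else parent
            if clusters.get(sibling) == size:
                del clusters[start], clusters[sibling]
                clusters[parent] = 2 * size  # merge: cluster doubles
                merged = True
    return clusters
```

With a predicate that marks only the leftmost clusters improvable, the left half of the line coalesces while the rest stays as singletons, illustrating how merging stops as soon as no cluster is ε-improvable.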
Ripple (Unstructured) Clustering
- Define a linear order among servers
- While a cluster is improvable, merge it with the smaller-cost left/right neighbor
- Adaptive merging
- Conflicts possible (e.g., overlapping proposals such as A+BC vs. AB+C); resolved by randomized tie-breaking
- Many race conditions (we love it)
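One Ripple round can be sketched as a propose/accept step with coin-tossing. The model below (per-cluster costs on a line, an `improvable` predicate, a target accepting one random proposal) is a simplified assumption for illustration, not the full protocol.

```python
import random

# Toy sketch of one Ripple round: each improvable cluster proposes a merge
# to the lower-cost of its left/right neighbors; a neighbor receiving
# several proposals accepts one at random (randomized tie-breaking).

def ripple_round(costs, improvable, rng=random):
    """costs: cluster costs in linear order. Returns accepted merge pairs."""
    proposals = {}  # target index -> list of proposing indices
    for i in range(len(costs)):
        if not improvable(i):
            continue
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j < len(costs)]
        if neighbors:
            target = min(neighbors, key=lambda j: costs[j])
            proposals.setdefault(target, []).append(i)
    merges, taken = [], set()
    for target, proposers in proposals.items():
        if target in taken:
            continue
        winner = rng.choice(proposers)  # coin-toss among conflicting proposers
        if winner not in taken:
            merges.append((winner, target))
            taken.update({winner, target})
    return merges
```

In the conflict case, two high-cost clusters on either side of a cheap one both propose to it; the coin toss picks a single winner, so only one merge happens in the round.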
Ripple Example: Merging Without Conflicts
[Figure: a high-cost, improvable cluster proposes a merge; its low-cost neighbor accepts the proposal]
Example: Ripple Conflict Resolution
[Figure: clusters A, B, C; two conflicting merge proposals arrive, and one is accepted]
Ripple’s Properties
- Near-optimal cost: an α-approximation of the optimal cost
- Convergence: communication quiescence plus stability of assignment; N rounds in the worst case (despite coin-tossing), much faster in practice
- Locality: for isolated load peaks, final clusters are at most twice as large as the minimum required to obtain the cost
Sensitivity to Required Approximation: Cost
- Setting: urban WMN; 64 servers (grid) serving as Internet gateways; 12800 users; mixed distribution (50% uniform / 50% peaky); the 2-approximation algorithm applied within each cluster
- [Plot: cost vs. α = 2(1 + ε) for Nearest Server, the theoretical worst-case bound, Tree, and Ripple]
- Tree and Ripple outperform Nearest Server and do better than the theoretical worst-case bound; Ripple’s cost is lower than Tree’s
Sensitivity to Required Approximation: Locality
- Setting: urban WMN; 64 servers (Internet gateways); 12800 users; mixed distribution (50% uniform / 50% peaky)
- [Plot: Ripple’s maximum and average cluster sizes vs. α]
- Most clusters built by Ripple are small (average size well below the maximum)
Scalability with Network Size: Cost
- Setting: urban WMN; 64 to 1024 servers; 12800 to 204800 users
- [Plot: cost vs. network size for Nearest Server, Tree, and Ripple]
Scalability with Network Size: Locality
- Setting: urban WMN; 64 to 1024 servers; 12800 to 204800 users
- [Plot: cluster sizes vs. network size for Tree and Ripple]
Summary
- LD balancing: a novel optimization problem
- Distributed, workload-sensitive solutions: the Tree and Ripple algorithms
- Ripple is more complex but performs better:
  - No infrastructure required
  - Scales better in practice
  - Achieves lower costs in practice
Thank you