Published by Gregory Burke. Modified over 9 years ago.
1
Load Balancing Tasks with Overlapping Requirements
Milan Vojnovic, Microsoft Research
Joint work with Dan Alistarh, Christos Gkantsidis, Jennifer Iglesias, Bo Zong
2
Motivating Application Scenario: Stream Processing Platforms
3
Tasks and Requirements
4
5
Problem #1: Bi-Criteria Load Balancing
Query Assignment Problem: Find an assignment of tasks to machines that
Criterion 1: minimizes the total number of distinct requirements that need to be supplied to machines
Criterion 2: balances the number of tasks assigned across machines
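The two criteria can be made concrete with a small evaluation routine. The following is a minimal sketch, not the authors' formulation: the function name `bicriteria_cost`, the dictionary representation of tasks, and the use of max load minus min load as the balance measure are all assumptions made for illustration.

```python
from collections import Counter


def bicriteria_cost(tasks, assignment, num_machines):
    """Score an assignment on both criteria.

    tasks: dict mapping task id -> set of requirements (assumed representation).
    assignment: dict mapping task id -> machine index in range(num_machines).
    Returns (total distinct requirements summed over machines, load imbalance).
    """
    needed = [set() for _ in range(num_machines)]
    load = Counter()
    for task_id, reqs in tasks.items():
        machine = assignment[task_id]
        needed[machine] |= reqs  # a requirement shared by co-located tasks counts once
        load[machine] += 1
    total_requirements = sum(len(s) for s in needed)
    loads = [load[m] for m in range(num_machines)]
    # Assumed balance measure: gap between the most and least loaded machine.
    imbalance = max(loads) - min(loads)
    return total_requirements, imbalance


tasks = {"t1": {"a", "b"}, "t2": {"b", "c"}, "t3": {"a"}, "t4": {"d"}}
assignment = {"t1": 0, "t2": 0, "t3": 1, "t4": 1}
print(bicriteria_cost(tasks, assignment, 2))  # (5, 0)
```

In this example machine 0 needs {a, b, c} and machine 1 needs {a, d}, so five requirements are supplied in total and both machines hold two tasks.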
6
Problem #2: Min-Max Load Balancing
Query Assignment Problem: Find an assignment of tasks to machines that minimizes the maximum number of distinct requirements needed by a machine
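To illustrate the min-max objective, the sketch below evaluates it for a given assignment and finds an optimum by exhaustive search on a tiny instance. The function names and the task representation are hypothetical, and the brute-force search is only there to make the objective tangible; it is exponential in the number of tasks and is not an algorithm from the talk.

```python
from itertools import product


def max_requirements_per_machine(tasks, assignment, num_machines):
    """Largest number of distinct requirements needed by any single machine."""
    needed = [set() for _ in range(num_machines)]
    for task_id, reqs in tasks.items():
        needed[assignment[task_id]] |= reqs
    return max(len(s) for s in needed)


def brute_force_min_max(tasks, num_machines):
    """Try every assignment; feasible only for very small instances."""
    task_ids = sorted(tasks)
    best_cost, best_assignment = None, None
    for choice in product(range(num_machines), repeat=len(task_ids)):
        candidate = dict(zip(task_ids, choice))
        cost = max_requirements_per_machine(tasks, candidate, num_machines)
        if best_cost is None or cost < best_cost:
            best_cost, best_assignment = cost, candidate
    return best_cost, best_assignment


tasks = {"t1": {"a", "b"}, "t2": {"a", "b", "c"}, "t3": {"d"}, "t4": {"d", "e"}}
best_cost, best_assignment = brute_force_min_max(tasks, 2)
print(best_cost)  # 3
```

The optimum is 3 here because task t2 alone needs three requirements, and grouping t1 with t2 and t3 with t4 achieves that bound.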
7
Other Motivating Application Scenarios
Scheduling tasks in distributed clusters of machines with data locality
…
Beyond resource allocation in data centres:
Clustering of information objects (documents, images, videos)
Summarizing topics for collections of documents
…
8
Related Work
9
Problem #1: Bi-Criteria Load Balancing
10
NP Hardness
The Query Assignment Problem is NP-complete
Proof: Reduction from the well-known bin packing problem
11
Random Query Assignment
12
Deficiency of Random Query Assignment
13
Special Case: Tasks with Singleton Requirements
There exists a polynomial-time algorithm that guarantees a 2-approximation for singleton task requirements with arbitrary weights
14
Algorithm
15
Tasks with Arbitrary Sets of Requirements
16
Gadget: Minimum Task Type Packing
17
Algorithm
18
Experimental Evaluation
19
Offline Algorithms
MQP = defined in an earlier slide
OffRand = uniform random assignment of a query type to a machine
IC = Incremental cost
MMS = Min-max traffic cost per machine
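The OffRand baseline can be sketched in a few lines. This is an illustration under stated assumptions rather than the evaluated implementation: it assumes a "query type" means the set of requirements shared by identical tasks, and the name `off_rand` and the `seed` parameter are hypothetical.

```python
import random


def off_rand(tasks, num_machines, seed=None):
    """Assign each distinct task type to a machine chosen uniformly at random.

    tasks: dict mapping task id -> set of requirements.
    Assumption: two tasks have the same type when their requirement sets are equal,
    so they always land on the same machine.
    """
    rng = random.Random(seed)
    type_to_machine = {}
    assignment = {}
    for task_id, reqs in tasks.items():
        task_type = frozenset(reqs)
        if task_type not in type_to_machine:
            type_to_machine[task_type] = rng.randrange(num_machines)
        assignment[task_id] = type_to_machine[task_type]
    return assignment


tasks = {"q1": {"a", "b"}, "q2": {"a", "b"}, "q3": {"c"}}
assignment = off_rand(tasks, num_machines=3, seed=0)
print(assignment)
```

Because the choice ignores which requirements a machine already holds, overlapping task types are scattered across machines at random.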
20
Performance of Offline Algorithms
[Figure; axis label: Number of requirements per task]
21
Online Task Assignment
22
Performance of Online Algorithms
[Figure; axis label: Number of requirements per task]
23
Problem #2: Min-Max Load Balancing
24
Online Task Assignment
25
Hidden Co-Clustering Input
26
Recovery Theorem
27
Experimental Evaluation
Dataset
Greedy
Random = random task arrival
Decreasing with respect to the number of requirements
Balance big = large tasks to least loaded, small items according to greedy
Prefer big = large tasks to least loaded, delayed assignment of up to a fixed number of small tasks
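The slide names the strategies without spelling out the greedy rule, so the sketch below shows one plausible version only. It assumes that greedy places each arriving task on the machine where it adds the fewest new requirements, breaking ties toward the machine holding fewer requirements; this rule, the function names, and the two arrival-order helpers are all assumptions for illustration.

```python
import random


def greedy_online(task_stream, num_machines):
    """Process tasks one at a time, never revisiting earlier decisions.

    task_stream: iterable of (task id, set of requirements) in arrival order.
    Returns (assignment, max distinct requirements on any machine).
    """
    machine_reqs = [set() for _ in range(num_machines)]
    assignment = {}
    for task_id, reqs in task_stream:
        # Assumed rule: fewest newly added requirements, then smallest machine.
        best = min(
            range(num_machines),
            key=lambda m: (len(reqs - machine_reqs[m]), len(machine_reqs[m])),
        )
        machine_reqs[best] |= reqs
        assignment[task_id] = best
    return assignment, max(len(s) for s in machine_reqs)


def random_order(tasks, seed=None):
    """Random task arrival."""
    items = list(tasks.items())
    random.Random(seed).shuffle(items)
    return items


def decreasing_order(tasks):
    """Arrival in decreasing order of the number of requirements."""
    return sorted(tasks.items(), key=lambda kv: len(kv[1]), reverse=True)


tasks = {"t1": {"a", "b"}, "t2": {"a", "b", "c"}, "t3": {"d"}, "t4": {"d", "e"}}
assignment, max_cost = greedy_online(decreasing_order(tasks), 2)
print(max_cost)  # 3
```

With decreasing arrival order, t2 opens machine 0, t1 joins it at no extra cost, and t4 and t3 end up together on machine 1, giving a maximum of three requirements per machine.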
28
Retail dataset
29
Conclusion
Studied two variants of non-standard load balancing problems: bi-criteria and min-max
Approximation ratios for offline problems
Hidden clustering recovery conditions for a simple greedy online task assignment strategy
Open questions:
Tighter approximation ratios for offline versions of both problems?
Similar hidden cluster recovery questions (allowing for more memory)?