Optimal Scheduling of File Transfers with Divisible Sizes on Multiple Disjoint Paths
Mugurel Ionut Andreica
Polytechnic University of Bucharest, Computer Science Department
Summary
- Motivation: scheduling file transfers
- Offline scheduling
  - maximum profit scheduling (files with divisible sizes): heuristic algorithm
  - minimum cost scheduling: optimal algorithm
- Online resource management
  - algorithmic framework for the block partitioning method (a data structure)
Motivation
- world-wide development and deployment of distributed systems, services and applications
- competition for bottleneck resources => poor performance
- inefficient usage of available resources => poor performance
- scheduling techniques => performance improvements
Maximum Profit Scheduling
- n files (divisible sizes)
  - $sz_i$ = size of the i-th file
  - $sz_1 \le sz_2 \le \dots \le sz_n$
  - $sz_i = q_i \cdot sz_{i-1}$, where $q_i \ge 1$ is an integer
  - $p_i$ = profit of the i-th file (transfer request)
- k paths
  - path j is available in the interval $[0, T_j]$
  - unit speed
- schedule the requests non-preemptively, with at most one request per path per time unit
- equivalent to multiple knapsack with divisible item sizes
Multiple Knapsack with Divisible Item Sizes
- NP-hard
  - can be solved in $O(n \cdot \max\{T_j\}^k)$ (extension from the single knapsack problem)
- heuristics
  - Greedy1MultipleKnapsack(n,k)
    - pack the items optimally in the first knapsack
    - remove the q packed items and the knapsack
    - call Greedy1MultipleKnapsack(n-q,k-1) with the remaining items and knapsacks
  - Greedy2
    - sort the items according to some criterion (e.g. profit/size)
    - insert the items into the knapsacks using the First Fit heuristic
  - new heuristic algorithm: this paper
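The Greedy1MultipleKnapsack steps can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes positive integer sizes, uses a standard 0/1 knapsack dynamic program for the "pack optimally" step, and replaces the recursion with a loop over knapsacks. The names `best_single_knapsack` and `greedy1_multiple_knapsack` are hypothetical.

```python
def best_single_knapsack(items, capacity):
    """0/1 knapsack DP. items: list of (size, profit) with positive integer sizes.
    Returns (best profit, tuple of chosen item indices)."""
    # best[c] = (profit, chosen indices) achievable with capacity c
    best = [(0, ())] * (capacity + 1)
    for idx, (size, profit) in enumerate(items):
        # iterate capacities downwards so each item is used at most once
        for c in range(capacity, size - 1, -1):
            cand = best[c - size][0] + profit
            if cand > best[c][0]:
                best[c] = (cand, best[c - size][1] + (idx,))
    return best[capacity]


def greedy1_multiple_knapsack(items, capacities):
    """Pack each knapsack optimally in turn, removing the packed items (heuristic)."""
    remaining = list(items)
    total = 0
    assignment = []
    for cap in capacities:
        profit, chosen = best_single_knapsack(remaining, cap)
        assignment.append([remaining[i] for i in chosen])
        total += profit
        chosen_set = set(chosen)
        # drop the packed items before moving on to the next knapsack
        remaining = [it for i, it in enumerate(remaining) if i not in chosen_set]
    return total, assignment
```

Because each knapsack is filled without regard to the later ones, the result is a heuristic answer and may be below the optimum.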
New Heuristic Algorithm (1/2) (Multiple Knapsack with Divisible Item Sizes)
- split the items into groups
  - all the items in the same group i have the same size $sg_i$
  - $sg_1 > sg_2 > \dots > sg_G$ (G = total number of groups)
- within a group, sort the items in decreasing order of their profits
  - $pr_{i,1} \ge pr_{i,2} \ge \dots \ge pr_{i,n_i}$ ($n_i$ = number of items in group i)
- insert the items into knapsacks using the First Fit heuristic
  - traverse the items in increasing order of group number; within a group, in increasing order of item number
  - the choice of knapsack for an item (i,j) is not important: because of the divisible item sizes, the same set of items is inserted
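The construction of the initial solution can be sketched as below. This is an illustrative sketch of the grouping and First Fit pass only (the improvement phase is not covered); the function name `first_fit_by_groups` and the `(size, profit)` tuple layout are assumptions made for the example.

```python
from collections import defaultdict


def first_fit_by_groups(items, capacities):
    """items: list of (size, profit). Returns (total profit, per-knapsack item lists)."""
    groups = defaultdict(list)            # size -> profits of the items with that size
    for size, profit in items:
        groups[size].append(profit)

    free = list(capacities)               # remaining room in each knapsack
    packed = [[] for _ in capacities]
    total = 0
    for size in sorted(groups, reverse=True):               # sg_1 > sg_2 > ... > sg_G
        for profit in sorted(groups[size], reverse=True):   # pr_{i,1} >= pr_{i,2} >= ...
            for b, room in enumerate(free):                 # First Fit over the knapsacks
                if room >= size:
                    free[b] -= size
                    packed[b].append((size, profit))
                    total += profit
                    break
    return total, packed
```

For example, with items `[(1, 2), (2, 3), (4, 10), (4, 7), (2, 1)]` and capacities `[4, 3]`, the pass places the size-4 item of profit 10 first and then fills the second knapsack with the smaller items.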
New Heuristic Algorithm (2/2) (Multiple Knapsack with Divisible Item Sizes)
- repeatedly improve the initial solution
  - replace an item (i,j) in the knapsack with a subset Q of items outside the knapsack, such that $pr_{i,j} < \mathrm{profit}(Q)$ and $\mathrm{profit}(Q) - pr_{i,j}$ is maximum
  - ignore item (i,j) from now on
- time complexity
  - $O(n \cdot S \cdot \min\{n, S \cdot \log S\})$
  - S = maximum size of an item
Performance Evaluation (Multiple Knapsack with Divisible Item Sizes)
- tested algorithms
  - new heuristic
  - Greedy1MultipleKnapsack
  - Greedy2 (with multiple criteria)
  - direct solution (extension from single knapsack)
- results
  - most of the cases were solved optimally by the new heuristic
  - in terms of solution quality and running time, the new heuristic was the clear winner
    - followed by Greedy1MultipleKnapsack
Minimum Cost Scheduling (1/3)
- sequence of n identical files
  - to be sent consecutively from the same source to the same destination
  - sending time per file = 1 time unit
- k data transfer providers
  - provider j
    - cost: $C_j$ per time unit
    - lease at most ONE time interval of duration at most $T_{max,j}$ which includes $[T_{1,j}, T_{2,j}]$
- default network link: cost $L_i$ for transferring the i-th file
- minimize the total cost
Minimum Cost Scheduling (2/3)
- $O(n \cdot k)$ dynamic programming
- sort the data transfer providers in increasing order of $T_{2,j}$: $T_{2,1} \le T_{2,2} \le \dots \le T_{2,k}$
- Cmin[i,j] = the minimum total cost for sending the first j files using a subset of the first i providers (in the sorted order)
Minimum Cost Scheduling (3/3)
- compute an auxiliary array $minp_i$ in $O(n)$ time
- for j in $[T_{2,i},\ T_{1,i} + T_{max,i}]$:
  - $Cmin[i,j] = \min\{Cmin[i-1,j],\ Cmin[i,j-1] + L_j,\ (j - T_{1,i}) \cdot C_i + minp_i[j - T_{max,i}]\}$
- Cmin[k,n] is the answer
- $O(n \cdot k)$ overall
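One possible reading of this recurrence is sketched below. The slides do not define $minp_i$ or the boundary cases, so the following are assumptions of the sketch rather than the authors' definitions: times are integers, file j occupies the slot ending at time j, a lease $[s, e]$ of provider i must satisfy $s \le T_{1,i}$, $e \ge T_{2,i}$ and $e - s \le T_{max,i}$, leases of different providers do not overlap, and $minp_i[t] = \min_{t \le s \le T_{1,i}} (Cmin[i-1,s] + (T_{1,i} - s) \cdot C_i)$. Outside the interval given for j, only the first two terms of the minimum are used. The function name `min_cost_schedule` is hypothetical.

```python
def min_cost_schedule(L, providers):
    """L[j-1] = default-link cost of file j.
    providers = list of (C, T1, T2, Tmax) with integer 0 <= T1 <= T2 <= len(L)."""
    n = len(L)
    inf = float("inf")

    # prev[j] plays the role of Cmin[i-1][j]; with no providers every file
    # goes over the default link.
    prev = [0] * (n + 1)
    for j in range(1, n + 1):
        prev[j] = prev[j - 1] + L[j - 1]

    for C, T1, T2, Tmax in sorted(providers, key=lambda p: p[2]):  # by T2
        # Assumed auxiliary array:
        # minp[t] = min over s in [t, T1] of prev[s] + (T1 - s) * C  (suffix minimum)
        minp = [inf] * (T1 + 2)
        for s in range(T1, -1, -1):
            minp[s] = min(minp[s + 1], prev[s] + (T1 - s) * C)

        cur = [0] * (n + 1)
        for j in range(1, n + 1):
            # skip this provider, or send file j over the default link
            cur[j] = min(prev[j], cur[j - 1] + L[j - 1])
            if T2 <= j <= T1 + Tmax:
                # the lease of this provider ends at time j; earliest allowed start
                start = max(0, j - Tmax)
                cur[j] = min(cur[j], (j - T1) * C + minp[start])
        prev = cur
    return prev[n]
```

Each provider costs O(n) for the suffix minimum and O(n) for the row of the table, which matches the O(n·k) bound stated on the slide.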
Online Resource Management (1/4)
- scenario
  - a resource manager receives resource allocation and reservation requests
  - request = amount of resource (bandwidth) + time constraints (fixed duration, earliest start time, latest finish time)
- model
  - time is divided into equally-sized time slots
  - many simultaneous requests => we need low response times => efficient data structure
Online Resource Management (2/4)
- algorithmic framework
- block partitioning method
  - array of n cells
  - each cell i holds a value $v_i$
  - divide the cells: n/k blocks of size k
  - update and query functions: $O(k + n/k)$
  - $k = \sqrt{n}$ => $O(k + n/k) = O(\sqrt{n})$
- point and range queries
  - $v_i = ?$
  - $qFunc(v_a, v_{a+1}, \dots, v_b) = ?$
- point and range updates
  - $v_i = u$
  - $v_i = uFunc(u, v_i)$, $a \le i \le b$
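The block partitioning idea can be illustrated with the simplest pairing, a point update combined with a range query. This is a minimal sketch and not the framework from the paper: the class name `BlockArray` and its method names are hypothetical, and `q_func` is assumed to be an associative aggregate that accepts a list (such as `min`, `max` or `sum`).

```python
import math


class BlockArray:
    """Array of n cells split into blocks of size k ~ sqrt(n), one aggregate per block."""

    def __init__(self, values, q_func=min):
        self.v = list(values)
        self.q = q_func
        self.k = max(1, math.isqrt(len(self.v)))
        # one precomputed aggregate per block
        self.block = [self.q(self.v[s:s + self.k])
                      for s in range(0, len(self.v), self.k)]

    def point_update(self, i, u):
        """v_i = u, then rebuild the aggregate of the containing block: O(k)."""
        self.v[i] = u
        s = (i // self.k) * self.k
        self.block[i // self.k] = self.q(self.v[s:s + self.k])

    def range_query(self, a, b):
        """q_func(v_a, ..., v_b), inclusive: O(k + n/k)."""
        parts = []
        i = a
        while i <= b:
            if i % self.k == 0 and i + self.k - 1 <= b:
                parts.append(self.block[i // self.k])   # whole block in one step
                i += self.k
            else:
                parts.append(self.v[i])                 # cell of a partial block
                i += 1
        return self.q(parts)
```

A query touches at most two partial blocks cell by cell (O(k) work) and at most n/k whole blocks, which gives the O(k + n/k) bound.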
Online Resource Management (3/4)
- update functions
  - BPpointUpdate
  - BPrangeUpdate
    - BPrangeUpdatePoints
    - BPrangeUpdatePartialBlock
    - BPrangeUpdateFullBlock
- query functions
  - BPpointQuery
  - BPrangeQuery
    - BPrangeQueryPoints
    - BPrangeQueryPartialBlock
    - BPrangeQueryFullBlock
Online Resource Management (4/4)
- efficiently solves many problems
  - range addition update + range minimum query
  - range set update + range maximum sum segment query
  - ... and many more
- typical resource reservation requests
  - range addition update (reserve bandwidth for a full time interval)
  - range minimum query (check whether enough bandwidth is available in every time slot)
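A reservation check built from a range minimum query followed by a range addition update might look like the sketch below. It is an illustration under stated assumptions, not the paper's BP* functions: the class `BandwidthTimeline`, its methods, and the choice of a per-block pending addition (`lazy`) plus a per-block minimum (`bmin`) are all hypothetical, and requests are assumed to name a fixed, inclusive slot range.

```python
import math


class BandwidthTimeline:
    """Free bandwidth per time slot, partitioned into blocks of size ~sqrt(n)."""

    def __init__(self, slots, capacity):
        self.k = max(1, math.isqrt(slots))
        self.free = [capacity] * slots
        nblocks = (slots + self.k - 1) // self.k
        self.lazy = [0] * nblocks          # addition pending for a whole block
        self.bmin = [capacity] * nblocks   # min of free[] in the block, without lazy

    def _range_add(self, a, b, delta):
        i = a
        while i <= b:
            if i % self.k == 0 and i + self.k - 1 <= b:
                self.lazy[i // self.k] += delta     # full block: O(1)
                i += self.k
            else:
                self.free[i] += delta               # partial block: cell by cell
                i += 1
        for blk in {a // self.k, b // self.k}:      # only the end blocks can be partial
            s = blk * self.k
            self.bmin[blk] = min(self.free[s:s + self.k])

    def _range_min(self, a, b):
        best = math.inf
        i = a
        while i <= b:
            blk = i // self.k
            if i % self.k == 0 and i + self.k - 1 <= b:
                best = min(best, self.bmin[blk] + self.lazy[blk])
                i += self.k
            else:
                best = min(best, self.free[i] + self.lazy[blk])
                i += 1
        return best

    def reserve(self, start, end, amount):
        """Reserve `amount` in slots start..end (inclusive) if every slot has room."""
        if self._range_min(start, end) < amount:
            return False
        self._range_add(start, end, -amount)
        return True
```

A request with an earliest start time and a latest finish time could be served by trying each candidate start slot with `reserve`; that search is left out of the sketch.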
Conclusions
- offline scheduling problems
  - maximum profit scheduling
    - multiple knapsack with divisible item sizes: efficient heuristic
  - minimum cost scheduling
    - optimal dynamic programming algorithm
- online resource management
  - algorithmic framework (new) for the block partitioning method (well known)
Thank you!