Efficient Data Structures for Online QoS-Constrained Data Transfer Scheduling
Mugurel Ionut Andreica, Nicolae Tapus
Politehnica University of Bucharest, Computer Science Department

Summary
  Motivation
  Online Data Transfer Scheduling Model
  Scheduling over Time on a Single Link
    – Time slot array (basic data structures)
    – Disjoint sets
    – Balanced tree for maximal time slot intervals
    – Block Partition (algorithmic framework)
    – Segment Tree (algorithmic framework)
    – Batched Updates (followed by queries)
  Scheduling over Time on a Path Network
    – Multidimensional Data Structures
    – Multidimensional Batched Updates
  (Other) Practical Application Scenarios
  Conclusions

Motivation
  QoS guarantees are strictly necessary
    – Multimedia streams (minimum required bandwidth, constant latency)
    – Large file transfers (bandwidth, earliest start time, latest finish time)
  Efficient resource management
    – Scheduling (Grid schedulers, bandwidth brokers)
    – Resource availability
    – Resource reservations

Online Data Transfer Scheduling Model
  One (centralized) resource manager
    – Knows the network topology (structure)
    – Has full control over the network
  Many data transfer requests
    – Duration (D) (non-preemptive = a contiguous time interval)
    – Earliest start time (ES)
    – Latest finish time (LF)
    – Minimum required bandwidth (B_min)
    – Source (src)
    – Destination (dst)
  Simple greedy strategy
    – Handle the requests in the order of arrival
    – Verify whether the request can be granted (satisfying the request's constraints/parameters)
    – Grant the request (resource allocation/reservation)
    – Low response times are required
        – a complex strategy would take too long
        – even a simple strategy may take too long => efficient techniques (data structures) are needed

Scheduling over Time on a Single Link
  Two nodes, connected by a single link
  Two models
    – (1) One transfer at a time (mutual exclusion) => each transfer uses the whole bandwidth of the link
    – (2) Multiple simultaneous data transfers (each uses some amount of bandwidth)
  Time horizon divided into m time slots of equal length
    – good performance = fine-grained time division (=> m can be quite large)

The Time Slot Array
  The most basic data structure
  An entry for each time slot (ts[t] for time slot t)
    – model (1): ts[t]=0 (unoccupied) or 1 (occupied)
    – model (2): ts[t]=available bandwidth during time slot t
  Query(ES, LF, D)
    – model (1): find an interval of D unoccupied time slots, fully included in [ES,LF] => O(m)
    – model (2): find an interval of D time slots, fully included in [ES,LF], for which the minimum available bandwidth is maximum => O(m) (requires the use of a double-ended queue)
  Update(tstart, D, value)
    – model (1): ts[t]=value, tstart≤t≤tstart+D-1 => O(m)
    – model (2): ts[t]+=value, tstart≤t≤tstart+D-1 (value can be negative) => O(m)
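The model (2) query on the time slot array can be sketched with a sliding-window minimum kept in a double-ended queue. This is a minimal illustration, not the authors' code: the names `best_window` and `reserve` are hypothetical, and slots are 0-indexed here with ES and LF inclusive.

```python
from collections import deque

def best_window(ts, es, lf, d):
    """Return (start, bottleneck) for the window of d slots inside [es, lf]
    whose minimum available bandwidth is largest, or None if no window fits.
    Hypothetical helper; 0-indexed slots, es and lf inclusive."""
    if d <= 0 or lf - es + 1 < d:
        return None
    dq = deque()   # slot indices whose ts values increase from front to back
    best = None
    for t in range(es, lf + 1):
        # Entries at the back that are not smaller than ts[t] can never
        # be a window minimum again.
        while dq and ts[dq[-1]] >= ts[t]:
            dq.pop()
        dq.append(t)
        # Drop the front entry once it leaves the window [t-d+1, t].
        if dq[0] < t - d + 1:
            dq.popleft()
        if t - es + 1 >= d:
            bottleneck = ts[dq[0]]
            if best is None or bottleneck > best[1]:
                best = (t - d + 1, bottleneck)
    return best

def reserve(ts, tstart, d, value):
    """Model (2) update: add value (possibly negative) to d consecutive slots."""
    for t in range(tstart, tstart + d):
        ts[t] += value
```

Each slot index enters and leaves the deque at most once, so a query touches O(LF-ES+1) = O(m) entries, matching the bound on the slide. A request with minimum bandwidth B_min could then be granted when the returned bottleneck is at least B_min, followed by `reserve(ts, start, d, -B_min)`.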

Disjoint Sets (1/2)
  Only for model (1)
  Uncancelable requests (once a time slot is occupied, it cannot be un-occupied)
  Disjoint Sets
    – Tree representation
    – Every element has a parent
    – Tree root = the representative of the set
    – Find(i) = finds the representative of the set containing element i
    – Union(i,j) = joins the sets containing elements i and j
    – O(log*(m)) amortized per operation (with union by rank and path compression)

Disjoint Sets (2/2)
  Maximal 1-intervals
    – e.g. 0 0 1 1 1 1 0 0 1 1 1 0 1 0 0
  Maintain the maximal 1-intervals using the disjoint sets data structure
    – The representative of a set stores the leftmost and rightmost time slot of the 1-interval
  Query(ES, LF, D)
    – Find an interval of D unoccupied time slots
    – Jump over whole maximal 1-intervals => lower running time in practice (the worst case is unchanged)
  Update(tstart, D, value=1)
    – Set the time slot entries to 1
    – Join the corresponding disjoint sets
    – O(m·log*(m)) overall (for all the update operations)
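A minimal sketch of the maximal 1-interval idea follows, assuming 0-indexed slots. The class name `OccupiedIntervals` and its methods are hypothetical, and union by size is used here in place of union by rank.

```python
class OccupiedIntervals:
    """Model (1), uncancelable reservations: maximal 1-intervals kept as
    disjoint sets whose representative stores the interval's end points."""

    def __init__(self, m):
        self.occupied = [False] * m
        self.parent = list(range(m))
        self.size = [1] * m
        self.left = list(range(m))    # meaningful only at set representatives
        self.right = list(range(m))   # meaningful only at set representatives

    def find(self, i):
        root = i
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[i] != root:          # path compression
            self.parent[i], i = root, self.parent[i]
        return root

    def union(self, i, j):
        ri, rj = self.find(i), self.find(j)
        if ri == rj:
            return
        if self.size[ri] < self.size[rj]:      # union by size
            ri, rj = rj, ri
        self.parent[rj] = ri
        self.size[ri] += self.size[rj]
        self.left[ri] = min(self.left[ri], self.left[rj])
        self.right[ri] = max(self.right[ri], self.right[rj])

    def update(self, tstart, d):
        """Occupy slots tstart .. tstart+d-1 and merge adjacent 1-intervals."""
        m = len(self.occupied)
        for t in range(tstart, tstart + d):
            if self.occupied[t]:
                continue
            self.occupied[t] = True
            if t > 0 and self.occupied[t - 1]:
                self.union(t - 1, t)
            if t + 1 < m and self.occupied[t + 1]:
                self.union(t, t + 1)

    def query(self, es, lf, d):
        """Return the earliest start of d free slots inside [es, lf], or None."""
        t, run_start, run_len = es, es, 0
        while t <= lf:
            if self.occupied[t]:
                # Jump over the whole maximal 1-interval containing t.
                t = self.right[self.find(t)] + 1
                run_start, run_len = t, 0
            else:
                run_len += 1
                if run_len == d:
                    return run_start
                t += 1
        return None
```

With the occupancy pattern from the slide (slots 2-5, 8-10 and 12 occupied out of 15), a query for two free slots starting at slot 2 jumps straight from slot 2 to slot 6.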

Balanced Tree of Maximal Time Slot Intervals
  Only for model (1)
  Decomposition of the m time slots into 0-intervals and 1-intervals
    – stored in a balanced tree T (red-black, AVL, scapegoat tree, 2-3-4, ...)
  Query(ES, LF, D)
    – Obtain from T (in O(log(m)) time each)
        – each 0-interval intersecting [ES,LF] (and compare its length to D)
        – each 1-interval intersecting [ES,LF] (and jump over it)
    – time complexity: O(m·log(m)) in the worst case; better in practice (due to jumps over large intervals)
  Update(tstart, D, value)
    – Remove from T the maximal intervals intersected by [tstart, tstart+D-1]
    – Insert into T the new intervals
    – At most one removal + at most 3 (re)insertions => time complexity O(log(m))
  If all ES=1 and all LF=m (no earliest start time and latest finish time constraints)
    – Maintain a heap with the lengths of the 0-intervals
    – Obtain the longest 0-interval in O(1) time => O(1) per query
    – Update the heap whenever T is updated (on removals and insertions)
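The interval decomposition can be illustrated with a simplified sketch that stores only the 0-intervals (free intervals) and supports only occupying slots. It uses sorted Python lists with `bisect` as a stand-in for the balanced tree, so look-ups are O(log(m)) but insertions and deletions are O(m) here, not O(log(m)). The class name `FreeIntervals` and 0-indexed slots are assumptions of this sketch.

```python
import bisect

class FreeIntervals:
    """Model (1): maximal free intervals kept sorted by start time.
    Sorted lists stand in for a balanced tree in this sketch."""

    def __init__(self, m):
        self.starts = [0]        # starts[i]..ends[i] is the i-th free interval
        self.ends = [m - 1]

    def query(self, es, lf, d):
        """Earliest start of d free slots inside [es, lf], or None."""
        # Intervals are disjoint, so ends is sorted too: locate the first
        # free interval that ends at or after es.
        i = bisect.bisect_left(self.ends, es)
        while i < len(self.starts) and self.starts[i] <= lf:
            lo = max(self.starts[i], es)
            hi = min(self.ends[i], lf)
            if hi - lo + 1 >= d:
                return lo
            i += 1               # occupied gaps are skipped implicitly
        return None

    def occupy(self, tstart, d):
        """Reserve slots tstart .. tstart+d-1, which must all be free."""
        tend = tstart + d - 1
        i = bisect.bisect_right(self.starts, tstart) - 1
        if i < 0 or self.ends[i] < tend:
            raise ValueError("slots are not all free")
        s, e = self.starts[i], self.ends[i]
        del self.starts[i]       # one removal ...
        del self.ends[i]
        if tend < e:             # ... and up to two re-insertions
            self.starts.insert(i, tend + 1)
            self.ends.insert(i, e)
        if s < tstart:
            self.starts.insert(i, s)
            self.ends.insert(i, tstart - 1)
```

Because 1-intervals are not stored explicitly, an update performs one removal and at most two insertions in this variant; supporting cancellations would additionally require merging neighbouring free intervals.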

Block Partitioning Method (1/2)
  Divide the m time slots into m/k blocks of k time slots each (possibly fewer in the last block)
  For each block B
    – [B.left,B.right]=the interval of time slots corresponding to block B
    – uagg=update aggregate of the parameters of all the update calls whose ranges include [B.left, B.right]
    – qagg=query aggregate for the current block (the answer if the query's range is [B.left, B.right])

Block Partitioning Method (2/2)
  Framework
    – qFunc(x,y): range query(a,b) => compute qFunc(ts[a], qFunc(ts[a+1], ..., qFunc(ts[b-1], ts[b])...))
    – uFunc(x,y): range update(u, a, b) => ts[t]=uFunc(u, ts[t]), a≤t≤b
  Pairs of range updates and range queries
    – Range addition update, range sum query
    – Range addition update, range minimum/maximum query (useful for model (2), when the start time and finish time are given as request parameters)
    – Range set update, range maximum sum segment query (useful for model (1))
    – Range set update, range sum/min/max query
  Time complexity: O(k+m/k) for each range update/range query call
    – k=sqrt(m) => O(sqrt(m))
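One of the listed pairs, range addition update with range minimum query, can be sketched as follows. This is an illustrative instance of the framework, not the authors' code: the class name `BlockPartition` is hypothetical, slots are 0-indexed, and `qagg` is chosen here to hold the block minimum before the block's own `uagg` is applied.

```python
import math

class BlockPartition:
    """Range addition update + range minimum query in O(k + m/k) per call."""

    def __init__(self, values, k=None):
        self.ts = list(values)
        m = len(self.ts)
        self.k = k or max(1, math.isqrt(m))    # k = sqrt(m) by default
        nblocks = (m + self.k - 1) // self.k
        self.uagg = [0] * nblocks              # pending addition for a whole block
        self.qagg = [min(self.ts[b * self.k:(b + 1) * self.k])
                     for b in range(nblocks)]  # block minimum, without uagg

    def _bounds(self, blk):
        left = blk * self.k
        return left, min(len(self.ts), left + self.k) - 1

    def update(self, a, b, u):
        """ts[t] += u for a <= t <= b."""
        t = a
        while t <= b:
            blk = t // self.k
            left, right = self._bounds(blk)
            if t == left and right <= b:       # block fully covered
                self.uagg[blk] += u
                t = right + 1
            else:                              # partial block: touch the slots
                stop = min(b, right)
                for i in range(t, stop + 1):
                    self.ts[i] += u
                self.qagg[blk] = min(self.ts[left:right + 1])
                t = stop + 1

    def query(self, a, b):
        """min(ts[a..b]), taking pending block additions into account."""
        best = math.inf
        t = a
        while t <= b:
            blk = t // self.k
            left, right = self._bounds(blk)
            if t == left and right <= b:
                best = min(best, self.qagg[blk] + self.uagg[blk])
                t = right + 1
            else:
                stop = min(b, right)
                best = min(best, min(self.ts[t:stop + 1]) + self.uagg[blk])
                t = stop + 1
        return best
```

A call touches at most two partial blocks (O(k) slots) and at most m/k whole blocks, which gives the O(k+m/k) bound stated on the slide.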

The Segment Tree Data Structure (1/3)
  Binary tree structure used for performing operations on an array v with m cells
  Node p
    – associated interval: [p.left, p.right]
    – two sons: the left son (p.lson) and the right son (p.rson)
    – left son's interval: [p.left, mid]
    – right son's interval: [mid+1, p.right]
    – where: mid=floor((p.left+p.right)/2)
    – leaves: [x,x] interval (only one cell)
  Range queries and updates => similar to the block partitioning method

The Segment Tree Data Structure (2/3)
  [Figure not captured in the transcript]

The Segment Tree Data Structure (3/3)
  Algorithmic framework (new)
    – uagg and qagg maintained at each tree node
        – uagg=update aggregate of all the update calls which "stopped" at the node (did not go further down in the tree)
        – qagg=query aggregate for the node's interval
    – More troublesome for several update and query functions (update aggregates have to be "pushed" further down in the tree => "piggybacking" on future update and query calls)
  Same pairs of updates + queries => O(log(m)) time complexity per operation (update/query)
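For the range addition update with range minimum query pair, the uagg/qagg idea can be sketched without pushing aggregates down at all. The sketch below is an assumption-laden illustration: the class name `SegmentTree`, the array-based node numbering (node p has sons 2p and 2p+1), 0-indexed cells, and the convention that `qagg[p]` already includes `uagg[p]` but not the aggregates of p's ancestors are all choices made here.

```python
import math

class SegmentTree:
    """Range addition update + range minimum query, O(log(m)) per call."""

    def __init__(self, values):
        self.m = len(values)                   # assumes at least one cell
        self.uagg = [0] * (4 * self.m)         # additions that stopped at the node
        self.qagg = [0] * (4 * self.m)         # minimum of the node's interval
        self._build(1, 0, self.m - 1, values)

    def _build(self, p, left, right, values):
        if left == right:
            self.qagg[p] = values[left]
            return
        mid = (left + right) // 2
        self._build(2 * p, left, mid, values)
        self._build(2 * p + 1, mid + 1, right, values)
        self.qagg[p] = min(self.qagg[2 * p], self.qagg[2 * p + 1])

    def update(self, a, b, u, p=1, left=0, right=None):
        """v[t] += u for a <= t <= b."""
        if right is None:
            right = self.m - 1
        if a <= left and right <= b:           # the update "stops" here
            self.uagg[p] += u
            self.qagg[p] += u
            return
        mid = (left + right) // 2
        if a <= mid:
            self.update(a, b, u, 2 * p, left, mid)
        if b > mid:
            self.update(a, b, u, 2 * p + 1, mid + 1, right)
        self.qagg[p] = min(self.qagg[2 * p], self.qagg[2 * p + 1]) + self.uagg[p]

    def query(self, a, b, p=1, left=0, right=None):
        """min(v[a..b])."""
        if right is None:
            right = self.m - 1
        if a <= left and right <= b:
            return self.qagg[p]
        mid = (left + right) // 2
        best = math.inf
        if a <= mid:
            best = min(best, self.query(a, b, 2 * p, left, mid))
        if b > mid:
            best = min(best, self.query(a, b, 2 * p + 1, mid + 1, right))
        # Additions that stopped at this node apply to every cell below it.
        return best + self.uagg[p]
```

Addition commutes, so the aggregates can stay where they stopped. For set updates the order of updates matters, which is presumably why the slide mentions pushing aggregates further down the tree.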

Batched Updates
  Multiple reservations (updates) performed before querying the data structure (only some types of update functions)
  q updates (u_i, [a_i, b_i])
    – Addition update (folklore):
        – taux[a_i] += u_i (Start entry)
        – taux[b_i + 1] -= u_i (Finish entry)
        – ts[t]=taux[1]+taux[2]+...+taux[t]
    – Set update:
        – taux[a_i].add(Start, u_i)
        – taux[b_i + 1].add(Finish, u_i)
        – traverse taux from left to right + maintain a max-heap (the top is the most recent "active" set update)
            – for each Finish entry => remove update
            – for each Start entry => add update
        – ts[t]=heap.top()
    – O(m) for all the q updates
  Preprocess the resulting array for subsequent queries (easier when there is no need to support further updates)
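Both batched variants can be sketched as below, assuming 0-indexed slots and updates given as (u, a, b) triples. The function names are hypothetical. The set variant uses `heapq` with negated update indices as a max-heap by recency and removes finished updates lazily; with a binary heap the sweep costs O(m + q·log(q)) in this sketch.

```python
import heapq

def batched_add(m, updates):
    """Apply all addition updates (u, a, b) at once via a difference array."""
    taux = [0] * (m + 1)
    for u, a, b in updates:
        taux[a] += u          # Start entry
        taux[b + 1] -= u      # Finish entry
    ts, running = [], 0
    for t in range(m):
        running += taux[t]    # prefix sum up to slot t
        ts.append(running)
    return ts

def batched_set(m, updates, default=0):
    """Apply all set updates (u, a, b); a later update overrides earlier ones."""
    starts = [[] for _ in range(m + 1)]
    finishes = [[] for _ in range(m + 1)]
    for idx, (_, a, b) in enumerate(updates):
        starts[a].append(idx)
        finishes[b + 1].append(idx)
    heap = []                             # negated indices: most recent on top
    finished = [False] * len(updates)
    ts = []
    for t in range(m):
        for idx in finishes[t]:
            finished[idx] = True          # lazy removal marker
        for idx in starts[t]:
            heapq.heappush(heap, -idx)
        while heap and finished[-heap[0]]:
            heapq.heappop(heap)
        ts.append(updates[-heap[0]][0] if heap else default)
    return ts
```

For example, `batched_add(6, [(5, 0, 2), (3, 1, 4), (-2, 4, 5)])` yields the per-slot totals in one left-to-right pass, after which the array can be preprocessed for static queries.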

Scheduling over Time on a Path Network
  A path network, composed of n vertices v_1, v_2, ..., v_n
    – (v_i, v_i+1) connected by a link (1≤i≤n-1)
  A request=2D range [src,dst] x [ES,LF]
    – first dimension=the path interval
    – second dimension=the time slots interval
  n 1-dimensional data structures => performance decreases by a factor of O(n)
  2-dimensional data structures => better (sublinear performance degradation)

d-dimensional Data Structures
  d-dimensional Block partition
    – O((k+m/k)^d) per update/query
  d-dimensional Segment tree
    – O(log^d(m)) per update/query
  d-dimensional batched updates
    – Update range=[l_1,h_1] x ... x [l_d,h_d]
    – Generate all the 2^d numbers K of d bits
        – For each K, generate a position x=(x_1, x_2, ..., x_d)
            – x_i=l_i, if K_i=0; x_i=h_i+1, if K_i=1
        – B=the number of 1-bits in K
            – if B is even => add a Start entry at taux[x]
            – if B is odd => add a Finish entry at taux[x]
    – Based on the inclusion-exclusion principle
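The corner-generation rule can be illustrated for d=2 with addition updates. In this sketch a Start entry adds u and a Finish entry subtracts u, and the final values are recovered with 2D prefix sums. The function name `batched_add_2d`, the 0-indexed coordinates and the update format (u, (l_1, h_1), (l_2, h_2)) are assumptions of this example.

```python
def batched_add_2d(n, m, updates):
    """Apply all 2D range additions at once (inclusion-exclusion corners)."""
    d = 2
    taux = [[0] * (m + 1) for _ in range(n + 1)]
    for u, (l1, h1), (l2, h2) in updates:
        lows, highs = (l1, l2), (h1, h2)
        for K in range(1 << d):                    # all 2^d bit patterns
            x = [highs[i] + 1 if (K >> i) & 1 else lows[i] for i in range(d)]
            ones = bin(K).count("1")               # B = number of 1-bits in K
            # Even B => Start entry (+u); odd B => Finish entry (-u).
            taux[x[0]][x[1]] += u if ones % 2 == 0 else -u
    # ts[i][j] = sum of taux over the rectangle [0..i] x [0..j].
    ts = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            ts[i][j] = (taux[i][j]
                        + (ts[i - 1][j] if i else 0)
                        + (ts[i][j - 1] if j else 0)
                        - (ts[i - 1][j - 1] if i and j else 0))
    return ts
```

On a path network the first coordinate would index the links between src and dst and the second the time slots, so one update reserves bandwidth on a whole path interval for a whole time interval.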

(Other) Practical Application Scenarios
  Requests asking for a fixed amount of bandwidth B during a fixed time interval
    – range addition update + range minimum query
  Requests asking for a range of consecutive frequencies (out of n available) + maximum sum of transfer rates (for just one time slot)
    – range maximum sum segment query + point update
  Requests asking for a range of frequency numbers + range of time slots (time interval) + maximum total bandwidth
    – 2D data structure: range addition update + range sum query

Conclusions
  Particular online scheduling model
  Efficient data structures
    – Segment tree framework (online)
    – Block partitioning framework (online)
    – Batched updates (semi-online)
    – use of disjoint sets/balanced trees/heaps (online, restricted situations)
  Types of networks
    – single link; paths
    – can be used on any kind of network, but with decreasing performance (e.g. one data structure per link)

Thank You!
http://hipergrid.grid.pub.ro/
