EE5900 Advanced Embedded System For Smart Infrastructure: Static Scheduling

Time Frame
Given a set of tasks, let H denote the hyperperiod, i.e. the least common multiple of all task periods.
–Tasks are written as (execution time, period): T1=(1,4), T2=(1.8,5), T3=(1,20), T4=(2,20)
–H=20
Divide time into frames; the frame size f should divide H.
–f could be 2, 4, 5, 10 or 20
Choose a small frame size, since this will make the scheduling solution more useful.

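The frame-size choice above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the rule that a frame must be at least as long as the longest execution time is an assumption that is not stated on the slide (it would explain why 1 is absent from the list).

```python
from math import lcm

# Tasks as (execution_time, period), the notation used on the slide.
tasks = [(1, 4), (1.8, 5), (1, 20), (2, 20)]

def hyperperiod(tasks):
    """Least common multiple of all (integer) periods."""
    return lcm(*(p for _, p in tasks))

def candidate_frame_sizes(tasks):
    """Divisors of H that are at least the longest execution time (assumed rule)."""
    H = hyperperiod(tasks)
    longest = max(e for e, _ in tasks)
    return [f for f in range(1, H + 1) if H % f == 0 and f >= longest]

print(hyperperiod(tasks))            # 20
print(candidate_frame_sizes(tasks))  # [2, 4, 5, 10, 20]
```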
Network flow formulation
Denote all the jobs (task instances in one hyperperiod) as $J_1, J_2, \dots, J_n$.
Vertices
–n job vertices
–H/f time frame vertices
–Source
–Sink
Edges
–Source to job vertex $J_i$ with capacity set to its execution time $e_i$
–Job vertex to time frame vertex with capacity f if the job can run in the time frame
–Time frame vertex to sink with capacity f

Flow network
[Figure: flow network]

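Computing the schedule
If the value of the maximum flow equals the sum of the execution times of all jobs, then the task set is schedulable.
The flow on an edge from a job vertex to a time frame vertex is the amount of time the job runs in that frame.
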
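A minimal sketch of this check, under stated assumptions: deadlines equal periods, a job can run in a frame only if the frame lies entirely between its release time and its deadline, and times are scaled by 10 so that all capacities are integers. The names `max_flow`, `schedulable` and `scale` are illustrative, not taken from the slides.

```python
from collections import defaultdict, deque

def max_flow(cap, s, t):
    """Augmenting-path max flow (BFS); cap[u][v] holds residual capacities."""
    total = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total
        # Walk back from t to s, find the bottleneck, update residuals.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        b = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= b
            cap[v][u] += b
        total += b

def schedulable(tasks, H, f, scale=10):
    """tasks: list of (execution_time, period); deadline = period (assumption)."""
    cap = defaultdict(lambda: defaultdict(int))
    demand = 0
    for i, (e, p) in enumerate(tasks):
        for k in range(H // p):                 # one job per period
            job = ("job", i, k)
            release, deadline = k * p, (k + 1) * p
            e_units = round(e * scale)
            cap["s"][job] = e_units             # source -> job, capacity e_i
            demand += e_units
            for start in range(0, H, f):
                if release <= start and start + f <= deadline:
                    cap[job][("frame", start)] = f * scale   # job -> frame
    for start in range(0, H, f):
        cap[("frame", start)]["t"] = f * scale  # frame -> sink, capacity f
    return max_flow(cap, "s", "t") == demand

print(schedulable([(1, 4), (1.8, 5), (1, 20), (2, 20)], H=20, f=2))  # True
print(schedulable([(3, 4), (2, 4)], H=4, f=4))                       # False
```

The second call fails because the two jobs need 5 time units but the single frame offers only 4.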
Flow network
Given a directed graph G with a capacity c(e) on every edge e
A source node s
A sink node t
Goal: to send as much flow as possible from s to t

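Flows
An s-t flow is a function f on the edges which satisfies:
–$0 \le f(e) \le c(e)$ for every edge $e$ (capacity constraint)
–$\sum_{e \text{ into } v} f(e) = \sum_{e \text{ out of } v} f(e)$ for every vertex $v \ne s, t$ (conservation of flow at intermediate vertices)
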
Value of the flow
The value of a flow is the net flow leaving s.
[Figure: example graph G with a flow of value 19]
Maximum flow problem: maximize this value

Cuts
An s-t cut is a set of edges whose removal disconnects s and t.
The capacity of a cut is the sum of the capacities of the edges in the cut.
Minimum s-t cut problem: find an s-t cut of minimum capacity.

Flows ≤ cuts
Let C be an s-t cut and let S be the set of vertices reachable from s in G − C (the component containing s).
Then the value of any s-t flow is at most the capacity of C.

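The bound can be written out as a short chain of inequalities. The notation $\operatorname{val}(f)$ and $\operatorname{cap}(C)$ is introduced here for illustration; "out of S" means edges from S to the rest of the graph.

```latex
\begin{align*}
\operatorname{val}(f)
  &= \sum_{e \text{ out of } S} f(e) \;-\; \sum_{e \text{ into } S} f(e)
     && \text{(conservation at every vertex of } S \setminus \{s\}\text{)} \\
  &\le \sum_{e \text{ out of } S} f(e)
     && \text{(flows are non-negative)} \\
  &\le \sum_{e \text{ out of } S} c(e)
     && \text{(capacity constraint)} \\
  &\le \sum_{e \in C} c(e) = \operatorname{cap}(C)
     && \text{(every edge leaving } S \text{ belongs to } C\text{)}
\end{align*}
```

The last step holds because an edge leaving S that is not in C would make its head reachable from s in G − C.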
Main result
Value of max s-t flow ≤ capacity of min s-t cut
Max flow = Min cut (Ford and Fulkerson, 1956)
A maximum flow can be computed by a polynomial time algorithm

Greedy method?
Find an s-t path where every edge has f(e) < c(e)
Add this path to the flow
Repeat until no such path can be found.
Does it work?

A counterexample
[Figure: example graph]
The greedy algorithm produces a flow of value 20 while the maximum flow has value 30.

Residual graph
Key idea: allow flow to be pushed back.
–Original edge: c(e) = 10, f(e) = 2
–Residual graph: forward edge with c(e) = 8, backward edge with c(e) = 2
Can send 8 more units forward or push 2 units back.
The advantage of this representation is that sending forward and pushing back need not be distinguished.

Ford-Fulkerson Algorithm
1. Start from an empty flow f
2. While there is an s-t path P in the residual graph, augment f along P in the original graph by the smallest residual capacity on P
3. Return f

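The difference between the greedy method and Ford-Fulkerson can be sketched on a small graph. The graph below is an assumption (the figure of the counterexample is not part of the text); it is a standard five-edge example chosen because it reproduces the values 20 and 30 quoted above. The function name and the `allow_push_back` flag are illustrative, and the sketch assumes no pair of antiparallel edges in the input.

```python
def augment_all(capacity, s, t, allow_push_back):
    """Repeatedly augment along DFS paths and return the final flow value.

    capacity: dict mapping (u, v) -> capacity of edge u->v.
    allow_push_back=False mimics the greedy method (forward edges only);
    allow_push_back=True uses the residual graph, i.e. Ford-Fulkerson.
    """
    residual = dict(capacity)
    if allow_push_back:
        for (u, v) in capacity:
            residual.setdefault((v, u), 0)   # backward edge, initially empty

    def find_path(u, seen):
        """Depth first search for a path of edges with residual capacity > 0."""
        if u == t:
            return []
        seen.add(u)
        for (a, b), r in residual.items():
            if a == u and r > 0 and b not in seen:
                rest = find_path(b, seen)
                if rest is not None:
                    return [(a, b)] + rest
        return None

    value = 0
    while (path := find_path(s, set())) is not None:
        bottleneck = min(residual[e] for e in path)
        for (u, v) in path:
            residual[(u, v)] -= bottleneck
            if allow_push_back:
                residual[(v, u)] += bottleneck   # allow undoing this flow later
        value += bottleneck
    return value

# Assumed example graph: the first path found is s-u-v-t with bottleneck 20.
graph = {("s", "u"): 20, ("s", "v"): 10, ("u", "v"): 30,
         ("u", "t"): 10, ("v", "t"): 20}

print(augment_all(graph, "s", "t", allow_push_back=False))  # 20
print(augment_all(graph, "s", "t", allow_push_back=True))   # 30
```

With push-back enabled, the second augmentation uses the backward edge v->u to reroute 10 units, which the greedy method cannot do.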
Ford-Fulkerson Algorithm (example)
[Figure sequence: graph G with flow and capacity labels, and its residual graph G_f with residual capacities, over successive augmentations. Flow value over the steps: 0, 8, 10, 16, 18, 19. At the end, flow value = 19 and cut capacity = 19.]

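Max-flow min-cut theorem
When the algorithm stops, consider the set S of all vertices reachable from s in the residual graph, and let T be the set of remaining vertices.
–s is in S, but t is not in S (otherwise there would be another augmenting path)
–No flow enters S from T (otherwise it could be pushed back, and the tail of that edge would be reachable)
–Every edge from S to T carries flow equal to its capacity
So the flow value equals the capacity of the cut between S and T: a min cut!
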
Integrality theorem
If every edge has integer capacity, then there is a maximum flow in which the flow on every edge is an integer; in particular, the maximum flow value is an integer.

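Complexity
Assume integer edge capacities between 1 and C
At most mC iterations, since the maximum flow value is at most mC and each augmentation increases the flow value by at least 1
Finding an s-t path can be done in O(m) time
Total running time $O(m^2 C)$
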
Speedup with capacity scaling
Capacity scaling looks for augmenting paths with large capacity first
–Find $p$ such that $2^{p-1} \le C < 2^p$
–For $i$ from $p-1$ down to 0
  Keep only the residual edges with capacity at least $2^i$
  Find a maximum flow there
At iteration $i$ there are at most $m$ edges, the capacity of the min cut is at most $m 2^{i+1}$, and each augmenting path carries flow at least $2^i$, so there are at most $2m$ augmentations.
There are $p = O(\log C)$ iterations and each augmentation takes $O(m)$ time, so the runtime is bounded by $O(m^2 \log C)$.

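A rough sketch of the scaling loop, assuming integer capacities of at least 1 and no antiparallel edge pairs in the input. The names `find_path`, `scaling_max_flow` and `delta` (the current threshold $2^i$) are illustrative. For brevity the path search scans the whole edge dictionary per vertex, so it does not achieve the O(m) search time assumed in the analysis.

```python
from collections import deque

def find_path(residual, s, t, delta):
    """BFS using only edges whose residual capacity is at least delta."""
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for (a, b), r in residual.items():
            if a == u and r >= delta and b not in parent:
                parent[b] = (a, b)
                queue.append(b)
    if t not in parent:
        return None
    path, v = [], t
    while parent[v] is not None:
        path.append(parent[v])
        v = parent[v][0]
    return path

def scaling_max_flow(capacity, s, t):
    residual = dict(capacity)
    for (u, v) in capacity:
        residual.setdefault((v, u), 0)       # backward edges
    C = max(capacity.values())
    delta = 1
    while delta * 2 <= C:
        delta *= 2                           # largest power of two <= C
    value = 0
    while delta >= 1:
        # Saturate all paths that can carry at least delta units.
        while (path := find_path(residual, s, t, delta)) is not None:
            b = min(residual[e] for e in path)
            for (u, v) in path:
                residual[(u, v)] -= b
                residual[(v, u)] += b
            value += b
        delta //= 2                          # next, smaller threshold
    return value

# Hypothetical test graph with one small edge between two large paths.
big = {("s", "a"): 1000, ("s", "b"): 1000, ("a", "b"): 1,
       ("a", "t"): 1000, ("b", "t"): 1000}
print(scaling_max_flow(big, "s", "t"))  # 2000
```

On this graph the two large paths are found at the first threshold, so the capacity-1 edge never causes a long sequence of unit augmentations.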
Speedup with BFS
In each iteration, run a breadth first search in the residual graph and choose the augmenting path with the fewest edges.
Let $\mathrm{level}_i(v)$ denote the distance from s to v in the residual graph $G_i$ of iteration i.
$\mathrm{level}_i(v)$ cannot decrease over the iterations. Prove by induction.
–Suppose that in iteration i+1 the edge u->v lies on a shortest path from s in the residual graph, so $\mathrm{level}_{i+1}(v) = \mathrm{level}_{i+1}(u) + 1$. If u->v is also an edge of the residual graph of the previous iteration, then $\mathrm{level}_i(v) \le \mathrm{level}_i(u) + 1 \le \mathrm{level}_{i+1}(u) + 1 = \mathrm{level}_{i+1}(v)$ by induction.
–Otherwise, v->u is on the augmenting path of iteration i, which is a shortest path, so $\mathrm{level}_i(v) = \mathrm{level}_i(u) - 1 < \mathrm{level}_i(u) + 1 \le \mathrm{level}_{i+1}(u) + 1 = \mathrm{level}_{i+1}(v)$.
An edge cannot disappear and reappear many times.
–Consider an edge u->v that disappears in $G_i$ and next reappears in $G_j$. Then u->v is on the augmenting path of $G_i$ and v->u is on the augmenting path of $G_j$, so $\mathrm{level}_i(u) + 1 = \mathrm{level}_i(v)$ and $\mathrm{level}_j(v) + 1 = \mathrm{level}_j(u)$.
–Note that $\mathrm{level}_j(v) \ge \mathrm{level}_i(v)$. We have $\mathrm{level}_j(u) \ge \mathrm{level}_i(u) + 2$.

Speedup with BFS (2)
Between a disappearance and the next reappearance of an edge u->v, the distance from s to u increases by at least 2.
The level is at most n, so each edge can disappear at most n/2 times.
There are m edges in total, so the total number of disappearances is at most nm/2.
At least one edge disappears in each iteration, so there are at most nm/2 iterations.
Each iteration takes O(m) time, so the total runtime is $O(nm^2)$.

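The level argument can be observed on a small run. The sketch below is a shortest-augmenting-path variant that records the BFS levels of every iteration, so that the "levels never decrease" property can be checked afterwards. The example graph and all function names are assumptions made for illustration; antiparallel edge pairs are not supported.

```python
from collections import deque

def bfs_levels(residual, s):
    """Distance (in edges) from s to each reachable vertex, plus BFS parents."""
    level, parent = {s: 0}, {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for (a, b), r in residual.items():
            if a == u and r > 0 and b not in level:
                level[b], parent[b] = level[u] + 1, a
                queue.append(b)
    return level, parent

def shortest_path_max_flow(capacity, s, t):
    """Returns (flow value, list of level maps, one per BFS run)."""
    residual = dict(capacity)
    for (u, v) in capacity:
        residual.setdefault((v, u), 0)
    value, history = 0, []
    while True:
        level, parent = bfs_levels(residual, s)
        history.append(level)
        if t not in level:                  # no augmenting path is left
            return value, history
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        b = min(residual[e] for e in path)
        for (u, v) in path:
            residual[(u, v)] -= b
            residual[(v, u)] += b
        value += b

def levels_never_decrease(history):
    """A vertex reachable later was reachable before, at no greater distance."""
    return all(v in prev and cur[v] >= prev[v]
               for prev, cur in zip(history, history[1:]) for v in cur)

graph = {("s", "u"): 20, ("s", "v"): 10, ("u", "v"): 30,
         ("u", "t"): 10, ("v", "t"): 20}
value, history = shortest_path_max_flow(graph, "s", "t")
print(value)                           # 30
print(len(history) - 1)                # 3 augmentations
print(levels_never_decrease(history))  # True
```

Here n = 4 and m = 5, so the three augmentations are well within the nm/2 = 10 bound.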
Precedence and nonpreemption
Suppose that $J_1$ needs to be scheduled before $J_2$; then make sure that the release time of $J_1$ is before that of $J_2$.
In the resulting schedule, if (part of) $J_1$ is scheduled after (part of) $J_2$, then just swap them.
Nonpreemption cannot be handled in this way; nonpreemptive scheduling is NP-hard.

NP completeness proof
Reduce from the 3-partition problem.
Given a set S of 3m elements where each element s has a value v(s) and $\sum_{s \in S} v(s) = mB$, one asks whether S can be partitioned into m disjoint subsets $S_1, S_2, \dots, S_m$ such that $\sum_{s \in S_i} v(s) = B$ for each subset.

Reduction
Given an instance of 3-partition, form an instance of the nonpreemptive scheduling problem which contains 3m+1 tasks $T_1, T_2, \dots, T_{3m+1}$ as follows (p is the period, d the deadline and c the execution time).
–For each element $s_i$, create a task $T_i$ with $p = d = mB + m$ and $c = v(s_i)$.
–Create a task $T_{3m+1}$ with $p = B + 1$ and $d = c = 1$.
We claim that the task set is schedulable if and only if the 3-partition instance is feasible.

Only if direction
When the task set is schedulable:
–Task $T_{3m+1}$ has $d = c = 1$, so it is scheduled at time 0, B+1, 2(B+1), …
–Consider the hyperperiod mB+m. All of the first 3m tasks need to be scheduled within it.
–During this hyperperiod, $T_{3m+1}$ runs m times, with total time m.
–Thus, mB time is left for all other tasks.
–The available time between the first and the second run of $T_{3m+1}$ is B.
–Without preemption each task runs entirely inside one such gap, so the tasks scheduled in the first gap have total time at most B. Let $S_1$ denote the corresponding set in S, so $\sum_{s \in S_1} v(s) \le B$.
–Similarly, $\sum_{s \in S_i} v(s) \le B$ for all $1 \le i \le m$, since $T_{3m+1}$ runs m times.
–On the other hand, $\sum_{s \in S_1} v(s) + \sum_{s \in S_2} v(s) + \dots + \sum_{s \in S_m} v(s) = mB$. Hence $\sum_{s \in S_i} v(s) = B$ for each i.

If direction
When there is a feasible 3-partition solution:
–One can schedule $T_{3m+1}$ at time 0, B+1, 2(B+1), …
–One then places the tasks of subset $S_i$ back to back in the i-th gap of length B between consecutive runs of $T_{3m+1}$, according to the 3-partition solution

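The construction in the "if" direction can be sketched as follows. This is an illustration under assumptions: the function names, the dictionary layout of a task and the six-element example instance are all hypothetical, and the partition is taken as given rather than computed.

```python
def reduce_to_tasks(values, m):
    """Build the 3m+1 tasks of the reduction; returns (tasks, B)."""
    total = sum(values)
    assert len(values) == 3 * m and total % m == 0
    B = total // m
    # One task per element: p = d = mB + m, c = v(s_i).
    tasks = [{"p": m * B + m, "d": m * B + m, "c": v} for v in values]
    # The extra task T_{3m+1}: p = B + 1, d = c = 1.
    tasks.append({"p": B + 1, "d": 1, "c": 1})
    return tasks, B

def schedule_from_partition(values, partition):
    """partition: m lists of indices into values, each summing to B.

    Returns (task_index, start, end) triples covering one hyperperiod.
    """
    m = len(partition)
    B = sum(values) // m
    special = len(values)                    # index of T_{3m+1}
    slots = []
    for k, subset in enumerate(partition):
        start = k * (B + 1)
        slots.append((special, start, start + 1))   # T_{3m+1} runs first
        clock = start + 1
        for i in subset:                     # then the subset, back to back
            slots.append((i, clock, clock + values[i]))
            clock += values[i]
        assert clock == (k + 1) * (B + 1)    # the gap of length B is filled
    return slots

values = [4, 5, 6, 5, 5, 5]                  # m = 2, B = 15
tasks, B = reduce_to_tasks(values, m=2)
slots = schedule_from_partition(values, [[0, 1, 2], [3, 4, 5]])
print(B)          # 15
print(slots[-1])  # (5, 27, 32)
```

The last slot ends at 32 = mB + m, the hyperperiod, and no task is ever interrupted.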
3-partition
First show that numerical 4DM is NP-complete, by reduction from 3DM.
The numerical 4DM problem: given four sets $S_1, S_2, S_3, S_4$, each of which consists of distinct elements with values, and a collection $C \subseteq S_1 \times S_2 \times S_3 \times S_4$, one asks whether there exists a subcollection C' of C that partitions the union of the four sets such that the values of the elements in each set of C' sum to B.

Reduce from 3DM to numerical 4DM
For each candidate set $(x_a, y_b, z_c)$ in M, create four elements: $e_1$ in $S_1$, $e_2$ in $S_2$, $e_3$ in $S_3$ and $e_4$ in $S_4$.
–For $x_a$, create an element $e_1$ with value either $2q^3 + aq^2$ (core) or $aq^2$ (dummy).
–For $y_b$, create an element $e_2$ with value either $bq$ (core) or $q^3 + bq$ (dummy).
–For $z_c$, create an element $e_3$ with value either $c$ (core) or $q^3 + c$ (dummy).
–Create an element $e_4$ with value $2q^3 - aq^2 - bq - c$.
If there is only one occurrence of a variable (e.g., $x_1$) in M, then only one core element is generated. If there are k occurrences (e.g., $z_7$) in M, then k elements are generated: one core element and k−1 dummy elements. Note that different elements can have the same value.
Candidate sets in 4DM are created such that each contains either all core elements or all dummy elements. Enumerate all possible candidate sets.
Set $B = 4q^3$.

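As a quick check of the choice $B = 4q^3$, the values defined above can be summed for a 4-tuple whose first three elements carry the same indices a, b, c as its $e_4$, once with all core elements and once with all dummy elements:

```latex
\begin{align*}
\text{core:}\quad
  & (2q^3 + aq^2) + bq + c + (2q^3 - aq^2 - bq - c) = 4q^3 = B \\
\text{dummy:}\quad
  & aq^2 + (q^3 + bq) + (q^3 + c) + (2q^3 - aq^2 - bq - c) = 4q^3 = B
\end{align*}
```

In both cases the terms in a, b and c cancel against $e_4$, and the $q^3$ terms add up to four.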
Reduction example
Suppose that the candidate sets M in 3DM are as follows: $(x_1, y_5, z_7)$, $(x_2, y_2, z_7)$, $(x_2, y_5, z_5)$, …
–$(x_1, y_5, z_7)$ produces $e_{11}$ with value $2q^3 + q^2$, $e_{21}$ with value $5q$, $e_{31}$ with value $q^3 + 7$, $e_{41}$ with value $2q^3 - q^2 - 5q - 7$.
–$(x_2, y_2, z_7)$ produces $e_{12}$ with value $2q^3 + 2q^2$, $e_{22}$ with value $2q$, $e_{32}$ with value $7$, $e_{42}$ with value $2q^3 - 2q^2 - 2q - 7$.
–$(x_2, y_5, z_5)$ produces $e_{13}$ with value $2q^2$, $e_{23}$ with value $q^3 + 5q$, $e_{33}$ with value $5$, $e_{43}$ with value $2q^3 - 2q^2 - 5q - 5$.
If $(x_1, y_5, z_7)$ is picked in M, we pick $(e_{11}, e_{21}, e_{32}, e_{41})$.
Since $(x_2, y_2, z_7)$ and $(x_2, y_5, z_5)$ are not picked, we pick $(2q^2,\ q^3 + 2q,\ e_{31},\ e_{42})$ and $(2q^2,\ e_{23},\ e_{33},\ e_{43})$. The elements written as values are those generated from other candidate sets in M.
$e_{12}$, $e_{22}$ and $e_{13}$ are not picked here; they will be picked in correspondence with some other sets picked in M.

If direction
When there is a solution of the 3DM problem:
–If a set is picked in 3DM, the corresponding core set is picked in numerical 4DM. Otherwise, the corresponding dummy set is picked.
–Each variable is picked exactly once in 3DM, so each core element is picked exactly once. Note that core elements generated from different sets in M can be combined and picked together (since we enumerate candidate sets in numerical 4DM).
–Given k occurrences of a variable in M, the variable appears in k candidate sets in M. One of them is picked (and so is the corresponding core element), and k−1 of them are not picked (so the corresponding k−1 dummy elements are picked). Thus, each generated element is picked exactly once.
–There is only one $e_4$ for each set in M, which is used to make the sum of values $4q^3$.
This gives the subcollection of sets that partitions the union of the four sets, with the values in each set summing to B.

Only if direction
Given a solution to numerical 4DM, each core element is covered exactly once.
There exist sets which contain only core elements, and one can pick the corresponding sets in M.