Spring 07, Apr 10, 12 ELEC 7770: Advanced VLSI Design (Agrawal)
ELEC 7770 Advanced VLSI Design, Spring 2007
Constraint Graph and Performance Optimization
Vishwani D. Agrawal, James J. Danaher Professor
ECE Department, Auburn University, Auburn, AL
Retiming Theorem
Given a network G(V, E, W) and a cycle time T, (r_1, ...) is a feasible retiming if and only if:
  r_i - r_j ≤ w_ij for all edges (v_i, v_j) ∈ E
  r_i - r_j ≤ W(v_i, v_j) - 1 for all node pairs v_i, v_j such that D(v_i, v_j) > T
where
  W(v_i, v_j) is the weight of the minimum-weight path from v_i to v_j
  D(v_i, v_j) is the maximum delay among all minimum-weight paths from v_i to v_j
Timing Optimization
1. Find the clock period (T) by path analysis.
2. Set the clock period to T/2 and look for a feasible retiming.
3. If feasible, further reduce the clock period to half.
4. If not feasible, increase the clock period.
5. Do a binary search for the optimum clock period.
6. Retime the circuit.
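The search loop described on this slide can be sketched as follows. This is a minimal sketch, not the lecture's own code: `min_feasible_period`, `tolerance`, and `toy_oracle` are hypothetical names, and the oracle is a stand-in for a real retiming feasibility check. It assumes the initial period is feasible and that any period longer than a feasible one is also feasible.

```python
def min_feasible_period(t_initial, is_feasible, tolerance=0.01):
    """Binary search for the smallest clock period accepted by is_feasible.

    Assumes t_initial itself is feasible and that feasibility is monotone:
    if a period T works, every larger period works too.
    """
    lo, hi = 0.0, float(t_initial)   # lo: infeasible side, hi: feasible side
    while hi - lo > tolerance:
        mid = (lo + hi) / 2
        if is_feasible(mid):
            hi = mid                 # feasible: try a shorter period
        else:
            lo = mid                 # infeasible: back off to a longer period
    return hi


# Hypothetical stand-in oracle: pretend retiming succeeds exactly when T >= 10.
probes = []

def toy_oracle(t):
    probes.append(t)
    return t >= 10

best = min_feasible_period(15, toy_oracle)
```

Starting from T = 15, the first two periods tried are 7.5 and 11.25, the same sequence of trial periods used in the retiming example later in these slides.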
Representing a Constraint
r_i - r_j ≤ w_ij, or equivalently r_j ≥ r_i - w_ij
[Figure: edge from vertex r_i to vertex r_j with weight -w_ij]
Constraint Graph
  r1 ≥ r0 + 3
  r1 ≥ r2 + 1
  r2 ≥ r0 + 1
  r2 ≥ r1 - 1
  r3 ≥ r1 + 1
  r3 ≥ r2 + 4
  r0 ≥ r3 - 6
[Figure: constraint graph on vertices r0, r1, r2, r3]
Feasibility Condition
A set of values for the variables can be found if and only if the constraint graph has no positive cycles.
This is also the condition for the solvability of the longest path problem, which provides a solution to the set of constraints.
Example: Infeasible Constraints
  x1 ≥ x2 + 6
  x2 ≥ x1 - 3
[Figure: constraint graph in which x1 and x2 form a cycle]
A positive cycle means no longest path can be found.
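Adding the two constraints makes the contradiction explicit. The cycle weight 6 + (-3) = 3 is positive, so no assignment can satisfy both inequalities:

```latex
\begin{aligned}
x_1 - x_2 &\ge 6 \\
x_2 - x_1 &\ge -3 \\
\Rightarrow\quad 0 &\ge 3 \quad \text{(contradiction)}
\end{aligned}
```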
Solving a Constraint Set
  r1 ≥ r0 + 3
  r1 ≥ r2 + 1
  r2 ≥ r0 + 1
  r2 ≥ r1 - 1
  r3 ≥ r1 + 1
  r3 ≥ r2 + 4
  r0 ≥ r3 - 6
[Figure: constraint graph on vertices r0, r1, r2, r3]
Longest path from source r0 to: r0, r1, r2, r3
Path lengths: s0 = 0, s1 = 3, s2 = 2, s3 = 6
Solution: r0 = 0, r1 = 3, r2 = 2, r3 = 6
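The constraint set above can be solved mechanically as a longest path problem. The sketch below assumes each constraint r_j ≥ r_i + c is encoded as a tuple `(i, j, c)` and uses a Bellman-Ford style relaxation with `max` in place of `min`. The function name `longest_paths` and the tuple encoding are illustrative choices, not taken from the slides.

```python
import math

def longest_paths(n, edges, source=0):
    """Longest path lengths from source, or None if a positive cycle exists.

    edges: list of (i, j, c), one per constraint r_j >= r_i + c.
    """
    dist = [-math.inf] * n
    dist[source] = 0
    for _ in range(n - 1):
        changed = False
        for i, j, c in edges:
            if dist[i] + c > dist[j]:
                dist[j] = dist[i] + c
                changed = True
        if not changed:
            break
    # One extra pass: any further improvement means a reachable positive cycle.
    for i, j, c in edges:
        if dist[i] + c > dist[j]:
            return None
    return dist


# The seven constraints of this slide.
constraints = [(0, 1, 3), (2, 1, 1), (0, 2, 1), (1, 2, -1),
               (1, 3, 1), (2, 3, 4), (3, 0, -6)]
solution = longest_paths(4, constraints)

# The infeasible pair x1 >= x2 + 6, x2 >= x1 - 3 (x1 is index 0, x2 is index 1).
infeasible = longest_paths(2, [(1, 0, 6), (0, 1, -3)])
```

Here `solution` is `[0, 3, 2, 6]`, matching the values on the slide, and `infeasible` is `None` because the two-vertex cycle has weight 3.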
The General Path Problem
Find the shortest (or longest) path in a graph from a source vertex to any other vertex.
Graph has vertices and directed edges:
- Edge weights can be positive or negative
- Graph can be cyclic
- Single source vertex: a vertex with 0 in-degree
Inconsistent problem:
- Negative cycles for shortest path
- Positive cycles for longest path
Dijkstra's Shortest Path Algorithm
- Greedy algorithm.
- Applies to directed graphs (including DAGs) whose edge weights are all non-negative.
- Computational complexity: O(|E| + |V| log |V|) ≤ O(n^2)
References:
- A. Aho, J. Hopcroft and J. Ullman, Data Structures and Algorithms, Reading, Massachusetts: Addison-Wesley.
- T. Cormen, C. Leiserson and R. Rivest, Introduction to Algorithms, New York: McGraw-Hill, 1990.
Dijkstra's Shortest Path Algorithm
[Figure: graph with vertices v0 (source), v1, v2, v3; edge weights lost in extraction]
s_i = path weight (v0, v_i)
[Table: s0, s1, s2, s3 per algorithm step. Initially (mark s0): 0, 1, 5, 2. Rows for Steps 1 to 3 lost in extraction.]
Each step marks the path with the smallest weight and updates the unmarked path weights.
Dijkstra's Algorithm, G(V, E, W)
  s_0 = 0                              initialize source
  for (i = 1 to n)                     initialize path weights, n = |V| - 1
    s_i = w_0i
  repeat {
    Select an unmarked vertex v_q such that s_q is minimal
    Mark v_q
    foreach (unmarked vertex v_i)
      s_i = min { s_i, s_q + w_qi }
  } until (all vertices are marked)
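A minimal Python sketch of the same marking procedure follows, using the standard-library `heapq` module to pick the unmarked vertex with the smallest path weight. The adjacency-list format and the example graph are hypothetical, since the slide's edge weights did not survive extraction. With a binary heap the running time is O((|V| + |E|) log |V|), not the Fibonacci-heap bound quoted earlier.

```python
import heapq

def dijkstra(adj, source):
    """Shortest path weights from source; requires non-negative edge weights.

    adj: dict mapping every vertex to a list of (neighbor, weight) pairs.
    """
    dist = {v: float("inf") for v in adj}
    dist[source] = 0
    marked = set()
    heap = [(0, source)]
    while heap:
        d, q = heapq.heappop(heap)        # unmarked vertex with minimal s_q
        if q in marked:
            continue                      # stale heap entry
        marked.add(q)
        for i, w in adj[q]:
            if i not in marked and d + w < dist[i]:
                dist[i] = d + w           # s_i = min(s_i, s_q + w_qi)
                heapq.heappush(heap, (dist[i], i))
    return dist


# Hypothetical example graph (not the one drawn on the slides).
example = {0: [(1, 1), (2, 5), (3, 2)], 1: [(2, 3)], 2: [], 3: [(2, 1)]}
shortest = dijkstra(example, 0)
```

For this graph the direct edge to vertex 2 costs 5, but the route through vertex 3 costs 2 + 1 = 3, so `shortest[2]` ends up as 3.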
Dijkstra's Longest Path Algorithm
[Figure: graph with vertices v0 (source), v1, v2, v3, shown twice; edge weights lost in extraction]
s_i = path length (v0, v_i)
[Table: s0, s1, s2, s3 per algorithm step; entries not recoverable from the extraction.]
Dijkstra's Alg. for Cycles, Neg. Weights
[Figure: graph with vertices v0 (source), v1, v2, v3; edge weights lost in extraction]
s_i = path weight (v0, v_i)
[Table: s0, s1, s2, s3 per algorithm step. Initially: 0, 1, 5, 2. Rows for Steps 1 to 3 lost in extraction.]
There exists a v0 to v3 path of length 5.
Bellman's Equations: Shortest Path
[Figure: vertex v_i with predecessors v_k, v_j, v_m, v_n and incoming edge weights w_ki, w_ji, w_mi, w_ni]
s_q = minimum path weight between the source and v_q
For all vertices: s_i = min (s_q + w_qi), taken over v_q ∈ pred(v_i)
Bellman-Ford Algorithm, G(V, E, W)
  Bellman-Ford {
    s_0(1) = 0                         initialize source
    for (i = 1 to n)                   initialize path weights, n = |V| - 1
      s_i(1) = w_0i
    for (j = 1 to n) {                 n iterations
      for (i = 1 to n)
        s_i(j+1) = min { s_i(j), s_k(j) + w_ki }, taken over v_k ∈ pred(v_i)
      if (s_i(j+1) == s_i(j) for all i) return (true)
    }
    return (false)
  }
Complexity = O(|V||E|) ≤ O(n^3)
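The pseudocode can be sketched in Python as below. This is an illustrative variant, not the slide's exact formulation: it updates path weights in place rather than keeping separate s(j) and s(j+1) arrays, starts every non-source vertex at infinity, and allows |V| passes so that the last pass serves as the stability check. The edge-tuple format and both example graphs are hypothetical.

```python
import math

def bellman_ford(n_vertices, edges, source=0):
    """Return (True, s) once path weights stabilize, else (False, s).

    edges: list of (k, i, w_ki) for a directed edge v_k -> v_i.
    A False result signals a negative cycle reachable from the source.
    """
    s = [math.inf] * n_vertices
    s[source] = 0
    for _ in range(n_vertices):
        updated = False
        for k, i, w in edges:
            if s[k] + w < s[i]:
                s[i] = s[k] + w           # s_i = min(s_i, s_k + w_ki)
                updated = True
        if not updated:
            return True, s                # values stabilized
    return False, s                       # still changing: inconsistent problem


# Hypothetical graph with one negative edge but no negative cycle.
ok, weights = bellman_ford(
    4, [(0, 1, 1), (0, 2, 5), (0, 3, 2), (2, 3, -4), (1, 2, 3)])

# Hypothetical graph whose cycle v1 -> v2 -> v3 -> v1 has weight 3 - 4 - 1 = -2.
stable, _ = bellman_ford(
    4, [(0, 1, 1), (1, 2, 3), (2, 3, -4), (3, 1, -1)])
```

The first call returns `True` with weights `[0, 1, 4, 0]`; the second returns `False` because the values never stop decreasing.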
Bellman-Ford Shortest Path
[Figure: graph with vertices v0 (source), v1, v2, v3; edge weights lost in extraction]
s_i = path weight (v0, v_i)
[Table: s0, s1, s2, s3 per iteration. Initially: 0, 1, 5, 2. Rows for Iterations 1 to 3 lost in extraction.]
n = 3
Bellman-Ford Longest Path
[Figure: graph with vertices v0 (source), v1, v2, v3, weights reversed; edge weights lost in extraction]
s_i = path weight (v0, v_i)
[Table: s0, s1, s2, s3 per iteration; entries not recoverable from the extraction.]
n = 3 (shortest path)
Reverse the sign of the weights and solve the shortest path problem.
(Alternative: keep the original weights and change the min operator in the algorithm to max.)
Bellman's Equations: Longest Path
[Figure: vertex v_i with predecessors v_k, v_j, v_m, v_n and incoming edge weights w_ki, w_ji, w_mi, w_ni]
s_q = maximum path weight between the source and v_q
For all vertices: s_i = max (s_q + w_qi), taken over v_q ∈ pred(v_i)
Bellman-Ford for Cycles, Neg. Weights
[Figure: graph with vertices v0 (source), v1, v2, v3; edge weights lost in extraction]
s_i = path weight (v0, v_i)
[Table: s0, s1, s2, s3 per iteration. Initially: 0, 1, 5, 2. Rows for Iterations 1 to 3 lost in extraction.]
n = 3 (shortest path)
This was incorrect with Dijkstra's shortest path algorithm.
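To make the contrast concrete, the sketch below runs a marking-style Dijkstra and a Bellman-Ford relaxation on one small graph with a negative edge. The graph is hypothetical (the slide's weights were lost), and both function names are illustrative.

```python
import math

def dijkstra_marking(n, w, source=0):
    """Greedy marking: a marked vertex is never revisited."""
    s = [math.inf] * n
    s[source] = 0
    marked = [False] * n
    for _ in range(n):
        q = min((i for i in range(n) if not marked[i]), key=lambda i: s[i])
        marked[q] = True
        for i in range(n):
            if not marked[i] and (q, i) in w:
                s[i] = min(s[i], s[q] + w[(q, i)])
    return s

def bellman_ford(n, w, source=0):
    """Relax every edge n - 1 times (assumes no negative cycle)."""
    s = [math.inf] * n
    s[source] = 0
    for _ in range(n - 1):
        for (k, i), w_ki in w.items():
            s[i] = min(s[i], s[k] + w_ki)
    return s


# Hypothetical weights: edge v2 -> v3 is negative.
w = {(0, 1): 1, (0, 2): 5, (0, 3): 2, (1, 2): 3, (2, 3): -4}
greedy = dijkstra_marking(4, w)
exact = bellman_ford(4, w)
```

The greedy version marks v3 with weight 2 before it discovers the path v0, v1, v2, v3 of weight 1 + 3 - 4 = 0, so `greedy` is `[0, 1, 4, 2]` while `exact` is `[0, 1, 4, 0]`.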
Bellman-Ford for Negative Cycle
[Figure: graph with vertices v0 (source), v1, v2, v3; edge weights lost in extraction]
s_i = path weight (v0, v_i)
[Table: s0, s1, s2, s3 per iteration. Initially: 0, 1, 5, 2. Rows for Iterations 1 to 3 lost in extraction.]
n = 3 (shortest path)
Values not stabilized after n iterations.
Inconsistent problem: negative cycle.
Retiming Example
[Figure: circuit with flip-flops (FF) and gates a, b, c with delays 10, 5, 5]
Retiming Graph
[Figure: circuit with flip-flops (FF) and gates a, b, c, and its retiming graph with vertices h (delay 0), a (delay 10), b (delay 5), c (delay 5)]
Critical path = 15
It is the longest path consisting only of zero-weight edges.
Feasibility Constraints
[Figure: circuit and retiming graph with vertices h (delay 0), a (delay 10), b (delay 5), c (delay 5)]
r_i - r_j ≤ w_ij for all edges i → j
Retiming should not cause negative edge weights.
  r_h - r_a ≤ 0
  r_a - r_b ≤ 0
  r_b - r_c ≤ 1
  r_c - r_h ≤ 1
Constraint Graph
[Figure: circuit and constraint graph with vertices r_h (delay 0), r_a (delay 10), r_b (delay 5), r_c (delay 5)]
r_i - r_j ≤ w_ij for all edges i → j
Retiming should not cause negative edge weights.
  r_h - r_a ≤ 0
  r_a - r_b ≤ 0
  r_b - r_c ≤ 1
  r_c - r_h ≤ 1
Observation: The constraint graph has the same structure as the original retiming graph, with the signs of the weights reversed. Vertex labels are the retiming integer variables.
Max Delay for Min Weight Paths
[Figure: retiming graph with vertices h (delay 0), a (delay 10), b (delay 5), c (delay 5)]
  Pair (i,j)   W(i,j)   D(i,j)
  (h,a)        0        10
  (h,b)        0        15
  (h,c)        1        20
  (a,b)        0        15
  (a,c)        1        20
  (a,h)        2        20
  (b,c)        1        10
  (b,h)        2        10
  (b,a)        2        20
  (c,h)        1        5
  (c,a)        1        15
  (c,b)        1        20
T = 15
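The W and D values can be reproduced with an all-pairs shortest path computation over the pair (path weight, negated delay), compared lexicographically so that the minimum picks the minimum weight first and the maximum delay second. This is a sketch under assumptions: the edge weights are read off the feasibility constraints (h→a = 0, a→b = 0, b→c = 1, c→h = 1), Floyd-Warshall is a technique swapped in here rather than one named on the slides, and `compute_w_d` is a hypothetical name. The diagonal is set to W(v,v) = 0 and D(v,v) = d(v), which the slide's table omits.

```python
import math
from itertools import product

def compute_w_d(delay, edges):
    """W[i, j]: min path weight; D[i, j]: max delay over min-weight paths."""
    nodes = list(delay)
    # best[i, j] = (weight, -(total delay of path vertices other than j))
    best = {(i, j): (math.inf, 0) for i in nodes for j in nodes}
    for v in nodes:
        best[v, v] = (0, 0)                       # empty path
    for (i, j), w in edges.items():
        best[i, j] = min(best[i, j], (w, -delay[i]))
    for k, i, j in product(nodes, repeat=3):      # k is the outermost loop
        (w1, d1), (w2, d2) = best[i, k], best[k, j]
        if w1 + w2 < math.inf:
            best[i, j] = min(best[i, j], (w1 + w2, d1 + d2))
    W = {pair: w for pair, (w, _) in best.items()}
    D = {(i, j): delay[j] - d for (i, j), (_, d) in best.items()}
    return W, D


delay = {"h": 0, "a": 10, "b": 5, "c": 5}
edges = {("h", "a"): 0, ("a", "b"): 0, ("b", "c"): 1, ("c", "h"): 1}
W, D = compute_w_d(delay, edges)
critical = max(D[p] for p in D if W[p] == 0)      # longest zero-weight path
```

The result agrees with the table, for example W(a,h) = 2 with D(a,h) = 20, and `critical` evaluates to 15, the clock period T on this slide.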
Timing Optimization, T = 7.5?
(W and D values as tabulated under Max Delay for Min Weight Paths.)
r_i - r_j ≤ W(i,j) - 1 for all paths (i,j) such that D(i,j) > 7.5
[Figure: constraint graph (feasibility) on vertices r_h, r_a, r_b, r_c]
Timing Optimization, T = 7.5?
(W and D values as tabulated under Max Delay for Min Weight Paths.)
[Figure: constraint graph on vertices r_h, r_a, r_b, r_c]
Positive cycle: no solution.
Timing Optimization, T = 11.25?
(W and D values as tabulated under Max Delay for Min Weight Paths.)
[Figure: constraint graph on vertices r_h, r_a, r_b, r_c]
Solution: r_h = 0, r_a = 0, r_b = 1, r_c = 0
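The two trial periods can be checked with a short script. This sketch hard-codes the W and D values from the slides, turns every constraint r_i - r_j ≤ c into a constraint-graph arc i → j of weight -c, and solves for the longest paths from r_h. The function name `feasible_retiming` and the choice of h as the source are assumptions made for illustration.

```python
import math

EDGE_W = {("h", "a"): 0, ("a", "b"): 0, ("b", "c"): 1, ("c", "h"): 1}
# (W, D) per node pair, copied from the slide table.
WD = {("h", "a"): (0, 10), ("h", "b"): (0, 15), ("h", "c"): (1, 20),
      ("a", "b"): (0, 15), ("a", "c"): (1, 20), ("a", "h"): (2, 20),
      ("b", "c"): (1, 10), ("b", "h"): (2, 10), ("b", "a"): (2, 20),
      ("c", "h"): (1, 5), ("c", "a"): (1, 15), ("c", "b"): (1, 20)}

def feasible_retiming(T, source="h"):
    """Return retiming values r for clock period T, or None if infeasible."""
    # r_i - r_j <= c  is the same as  r_j >= r_i - c: arc i -> j, weight -c.
    arcs = [(i, j, -w) for (i, j), w in EDGE_W.items()]
    arcs += [(i, j, -(W - 1)) for (i, j), (W, D) in WD.items() if D > T]
    nodes = ["h", "a", "b", "c"]
    r = {v: -math.inf for v in nodes}
    r[source] = 0
    for _ in range(len(nodes) - 1):
        for i, j, c in arcs:
            if r[i] + c > r[j]:
                r[j] = r[i] + c
    # Any remaining violated arc means a positive cycle: no solution.
    if any(r[i] + c > r[j] for i, j, c in arcs):
        return None
    return r


half_period = feasible_retiming(7.5)
three_quarters = feasible_retiming(11.25)
```

`half_period` is `None` (positive cycle), and `three_quarters` is `{"h": 0, "a": 0, "b": 1, "c": 0}`, the solution shown on this slide.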
Retiming Graph
[Figure: circuit and retiming graph with vertices h (delay 0), a (delay 10), b (delay 5), c (delay 5)]
r_h = 0, r_a = 0, r_b = 1, r_c = 0
w_ij_retimed = w_ij + r_j - r_i
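Applying the retiming formula to each edge and recomputing the critical path can be sketched as follows. The edge weights are read off the feasibility constraints, and the helper names `retime` and `critical_path` are hypothetical. The critical-path helper assumes the zero-weight edges form no cycle, which holds when every cycle carries at least one flip-flop.

```python
from functools import lru_cache

def retime(edges, r):
    """w_ij_retimed = w_ij + r_j - r_i for every edge i -> j."""
    return {(i, j): w + r[j] - r[i] for (i, j), w in edges.items()}

def critical_path(delay, edges):
    """Largest total delay along any path made only of zero-weight edges."""
    zero_succ = {v: [] for v in delay}
    for (i, j), w in edges.items():
        if w == 0:
            zero_succ[i].append(j)

    @lru_cache(maxsize=None)
    def longest_from(v):
        tails = (longest_from(u) for u in zero_succ[v])
        return delay[v] + max(tails, default=0)

    return max(longest_from(v) for v in delay)


delay = {"h": 0, "a": 10, "b": 5, "c": 5}
edges = {("h", "a"): 0, ("a", "b"): 0, ("b", "c"): 1, ("c", "h"): 1}
r = {"h": 0, "a": 0, "b": 1, "c": 0}

retimed = retime(edges, r)
before = critical_path(delay, edges)
after = critical_path(delay, retimed)
```

The retimed weights are h→a = 0, a→b = 1, b→c = 0, c→h = 1: one flip-flop moves from b→c to a→b, the cycle total stays at 2, and the critical path drops from 15 to 10.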
Retimed Circuit
[Figure: retimed circuit with flip-flops (FF) and gates a, b, c, and its retiming graph with vertices h (delay 0), a (delay 10), b (delay 5), c (delay 5)]
r_h = 0, r_a = 0, r_b = 1, r_c = 0
Critical path = 10
Figure annotation: "Logic optimization will remove these."