Spring 2010, Feb 10... ELEC 7770: Advanced VLSI Design (Agrawal)
ELEC 7770 Advanced VLSI Design, Spring 2010
Constraint Graph and Retiming Solution
Vishwani D. Agrawal, James J. Danaher Professor
ECE Department, Auburn University, Auburn, AL

Retiming Theorem
Given a network G(V, E, W) and a cycle time T, (r1, ...) is a feasible retiming if and only if:
  1. ri – rj ≤ wij for all edges (vi, vj) ∈ E
  2. ri – rj ≤ W(vi, vj) – 1 for all node pairs vi, vj such that D(vi, vj) > T
where
  W(vi, vj) is the minimum weight over all paths from vi to vj
  D(vi, vj) is the maximum delay among all minimum-weight paths from vi to vj

Retiming Theorem Explained
Condition 1, ri – rj ≤ wij, is related to edge weight:
  The original circuit is feasible, so the original weight wij is non-negative.
  Originally, ri = rj = 0.
  Retiming adds rj flip-flops to edge eij and removes ri flip-flops from it. The net reduction ri – rj must not exceed wij, so that the retimed weight of eij, wij + rj – ri, stays non-negative.
Condition 2, ri – rj ≤ W(vi, vj) – 1, is related to path delay:
  A path whose delay exceeds the clock period T must keep at least one flip-flop after retiming, that is, W(vi, vj) + rj – ri ≥ 1.
  Every path of weight 0 then has a delay of at most T.

Timing Optimization
Find the clock period (T) by path analysis.
Set the clock period to T/2 and look for a feasible retiming.
If feasible, halve the clock period again.
If not feasible, increase the clock period.
Do a binary search for the optimum clock period.
Retime the circuit.
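
The search over clock periods can be sketched as an ordinary bisection. This is a minimal sketch, not the course's implementation: the function name `min_clock_period`, the tolerance `tol`, and the callback `is_feasible` (which stands in for "a feasible retiming exists at this period") are all assumptions introduced here for illustration.

```python
def min_clock_period(t_initial, is_feasible, tol=0.01):
    """Bisect for the smallest feasible clock period.

    t_initial   -- period of the unretimed circuit (feasible by construction)
    is_feasible -- hypothetical callback: True if a retiming exists for period t
    tol         -- assumed stopping tolerance on the period
    """
    lo, hi = 0.0, float(t_initial)   # invariant: hi is feasible, lo is not
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if is_feasible(mid):
            hi = mid                 # feasible: try a shorter period
        else:
            lo = mid                 # infeasible: back off to a longer period
    return hi


if __name__ == "__main__":
    # Stand-in oracle: pretend every period of at least 10 is feasible.
    print(min_clock_period(15, lambda t: t >= 10))
```

Starting from T = 15, the first two periods tried are 7.5 and 11.25.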

Representing a Constraint
ri – rj ≤ wij, or equivalently rj ≥ ri – wij
[Figure: edge from vertex ri to vertex rj with weight –wij]

Constraint Graph
r1 ≥ r0 + 3
r1 ≥ r2 + 1
r2 ≥ r0 + 1
r2 ≥ r1 – 1
r3 ≥ r1 + 1
r3 ≥ r2 + 4
r0 ≥ r3 – 6
[Figure: constraint graph on vertices r0, r1, r2, r3; each constraint rj ≥ ri + c is an edge from ri to rj with weight c]

Feasibility Condition
A set of values for the variables can be found if and only if the constraint graph has no positive cycles.
This is also the condition for the solvability of the longest path problem, which provides a solution to the set of constraints.

Example: Infeasible Constraints
x1 ≥ x2 + 6
x2 ≥ x1 – 3
[Figure: constraint graph with vertices x1 and x2; edge x2 → x1 has weight 6 and edge x1 → x2 has weight –3]
The cycle x1 → x2 → x1 has weight 6 – 3 = 3 > 0. A positive cycle means no longest path can be found.

Solving a Constraint Set
r1 ≥ r0 + 3
r1 ≥ r2 + 1
r2 ≥ r0 + 1
r2 ≥ r1 – 1
r3 ≥ r1 + 1
r3 ≥ r2 + 4
r0 ≥ r3 – 6
[Figure: constraint graph on vertices r0, r1, r2, r3]
Longest paths from source r0 to r0, r1, r2, r3
Path lengths: s0 = 0, s1 = 3, s2 = 2, s3 = 6
Solution: r0 = 0, r1 = 3, r2 = 2, r3 = 6
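
A constraint set of this form can be solved by repeated longest-path relaxation. The sketch below is one possible way to do it, under stated assumptions: `solve_constraints` and the triple encoding `(j, i, c)` for rj ≥ ri + c are names chosen here, and every variable starts at 0, which behaves like a zero-weight edge from an extra source to each vertex.

```python
def solve_constraints(n, constraints):
    """Solve constraints r[j] >= r[i] + c by longest-path relaxation.

    constraints -- list of (j, i, c) triples, each meaning r[j] >= r[i] + c
    Returns a list of values, or None if the graph has a positive cycle.
    """
    r = [0] * n                      # assumed start: all variables at 0
    for _ in range(n):               # n passes suffice when no positive cycle exists
        changed = False
        for j, i, c in constraints:
            if r[i] + c > r[j]:      # constraint violated: raise r[j]
                r[j] = r[i] + c
                changed = True
        if not changed:
            return r                 # stable: all constraints hold
    return None                      # still changing: positive cycle


# The seven constraints on r0..r3 from the slide.
example = [(1, 0, 3), (1, 2, 1), (2, 0, 1), (2, 1, -1),
           (3, 1, 1), (3, 2, 4), (0, 3, -6)]
print(solve_constraints(4, example))                   # [0, 3, 2, 6]

# x1 >= x2 + 6 and x2 >= x1 - 3 form a positive cycle.
print(solve_constraints(2, [(0, 1, 6), (1, 0, -3)]))   # None
```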

The General Path Problem
Find the shortest (or longest) path in a graph from a source vertex to all other vertices.
Graph has vertices and directed edges:
  Edge weights can be positive or negative
  Graph can be cyclic
Single source vertex: a vertex with 0 in-degree (not a necessary condition)
Inconsistent problems:
  Negative weight cycles for shortest path
  Positive weight cycles for longest path

Dijkstra’s Shortest Path Algorithm
Greedy algorithm.
Applies to directed graphs with non-negative edge weights; cycles are allowed as long as no weight is negative.
Computational complexity O(|E| + |V| log |V|) ≤ O(n^2)
References:
  A. Aho, J. Hopcroft and J. Ullman, Data Structures and Algorithms, Reading, Massachusetts: Addison-Wesley.
  T. Cormen, C. Leiserson and R. Rivest, Introduction to Algorithms, New York: McGraw-Hill, 1990.

Dijkstra’s Shortest Path Algorithm Example 1
[Figure: directed graph with source v0 and vertices v1, v2, v3; edge weights wij; si = path weight (v0, vi)]
[Table: s0, s1, s2, s3 initially (v0 marked) and after each of Steps 1 to 3, which mark one more vertex each]
Each step marks the unmarked vertex with the smallest path weight and updates the path weights of the remaining unmarked vertices.

Dijkstra’s Shortest Path Algorithm Example 2
[Figure: directed graph with source v0 and vertices v1, v2, v3; edge weights wij; si = path weight (v0, vi)]
[Table: s0, s1, s2, s3 initially (v0 marked) and after each of Steps 1 to 3, which mark one more vertex each]
Each step marks the unmarked vertex with the smallest path weight and updates the path weights of the remaining unmarked vertices.

Dijkstra’s Algorithm, G(V, E, W)
s0 = 0                              initialize source
for ( i = 1 to n )                  initialize path weights, n = |V| – 1
    si = w0i
repeat {
    Select an unmarked vertex vq such that sq is minimal
    Mark vq
    foreach ( unmarked vertex vi )
        si = min { si, sq + wqi }
} until (all vertices are marked)
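
For readers who want to run the algorithm, here is a minimal Python sketch of the same mark-and-update loop. It is an illustration under assumptions, not the slide's code: it uses a binary heap from the standard `heapq` module (so its running time is O((|V| + |E|) log |V|) rather than the bound quoted earlier), and the graph `g` and its weights are hypothetical, not taken from the slide figures.

```python
import heapq
import math


def dijkstra(graph, source):
    """Shortest path weights from source; graph[u] maps successor -> weight.

    Assumes every edge weight is non-negative (checked below).
    """
    dist = {v: math.inf for v in graph}
    dist[source] = 0
    marked = set()
    heap = [(0, source)]
    while heap:
        d, q = heapq.heappop(heap)       # unmarked vertex with minimal s_q
        if q in marked:
            continue                     # stale heap entry
        marked.add(q)
        for i, w in graph[q].items():
            if w < 0:
                raise ValueError("Dijkstra requires non-negative weights")
            if i not in marked and d + w < dist[i]:
                dist[i] = d + w          # s_i = min(s_i, s_q + w_qi)
                heapq.heappush(heap, (dist[i], i))
    return dist


# Hypothetical four-vertex graph with source 0.
g = {0: {1: 1, 2: 5, 3: 2}, 1: {2: 2}, 2: {}, 3: {2: 1}}
print(dijkstra(g, 0))                    # {0: 0, 1: 1, 2: 3, 3: 2}
```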

Try Dijkstra’s Algorithm for Your Graph

Dijkstra’s Longest Path Algorithm
[Figure: two versions of the example graph with source v0 and vertices v1, v2, v3; si = path length (v0, vi)]
[Table: s0, s1, s2, s3 initially and after each of Steps 1 to 3]
Either change min to max, or change all positive weights to negative and keep min.
The greedy marking is guaranteed only when every weight in the equivalent shortest-path problem is non-negative.

Dijkstra’s Alg. Does Not Work for Cycles, Mixed Weights
[Figure: example graph with source v0 and vertices v1, v2, v3; an edge weight of –2 appears in the figure; si = path weight (v0, vi)]
[Table: s0, s1, s2, s3 after each step; after Step 3 (mark v1) the row reads 0, 7, 2, 6?]
The algorithm stops because all vertices are marked, but there exists a v0 to v3 path of length 5.

Bellman’s Equations – Shortest Path
[Figure: vertex vi with predecessors vj, vk, vm, vn and edge weights wji, wki, wmi, wni]
sq = minimum path weight between source and vq
For all vertices: si = min { sq + wqi }, taken over all vq ∈ pred(vi)

Bellman-Ford Algorithm, G(V, E, W)
si(j) denotes the value of si at iteration j.
Bellman-Ford {
    s0(1) = 0                                     initialize source
    for ( i = 1 to n )                            initialize path weights, n = |V| – 1
        si(1) = w0i
    for ( j = 1 to n ) {                          n iterations
        for ( i = 1 to n ) {
            si(j+1) = min { si(j), sk(j) + wki }  over all vk ∈ pred(vi)
        }
        if ( si(j+1) == si(j) for all i ) return (true)
    }
    return (false)
}
Complexity = O(|V||E|) ≤ O(n^3)
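
The iteration can be tried out with a short Python sketch. It follows the same idea (compute every new value from the previous iteration's values, stop when nothing changes), but the details are assumptions made for illustration: the function name `bellman_ford`, the edge-list encoding `(k, i, w)` for an edge vk → vi of weight w, and both example graphs, whose weights are hypothetical rather than copied from the slide figures.

```python
import math


def bellman_ford(n_vertices, edges, source=0):
    """Shortest path weights from source.

    edges -- list of (k, i, w): edge from vertex k to vertex i with weight w
    Returns (True, s) if the values stabilize, (False, s) if a negative
    cycle keeps them changing.
    """
    s = [math.inf] * n_vertices
    s[source] = 0
    for _ in range(n_vertices):
        new = list(s)
        for k, i, w in edges:
            if s[k] + w < new[i]:        # s_i(j+1) = min(s_i(j), s_k(j) + w_ki)
                new[i] = s[k] + w
        if new == s:
            return True, s               # stabilized
        s = new
    return False, s                      # not stabilized: negative cycle


# Hypothetical graph with one negative edge: the path 0 -> 1 -> 3 (length 5)
# beats 0 -> 2 -> 3 (length 6).
ok, s = bellman_ford(4, [(0, 1, 7), (0, 2, 2), (2, 3, 4), (1, 3, -2)])
print(ok, s)                             # True [0, 7, 2, 5]

# Hypothetical graph with a negative cycle 1 -> 2 -> 1 of weight -2.
ok, _ = bellman_ford(3, [(0, 1, 1), (1, 2, -3), (2, 1, 1)])
print(ok)                                # False
```

For the longest path problem the same routine can be reused after negating every weight and negating the returned values.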

Bellman-Ford Shortest Path
[Figure: example graph with source v0 and vertices v1, v2, v3; si = path weight (v0, vi)]
[Table: s0, s1, s2, s3 initially and after each iteration; n = 3]

Bellman-Ford Longest Path
[Figure: example graph with source v0 and vertices v1, v2, v3, weights reversed; si = path weight (v0, vi)]
[Table: s0, s1, s2, s3 initially and after each iteration of the shortest path algorithm; n = 3]
Reverse the sign of the weights and solve the shortest path problem.
(Alternative: keep the original weights and change the min operator in the algorithm to max.)

Bellman’s Equations – Longest Path
[Figure: vertex vi with predecessors vj, vk, vm, vn and edge weights wji, wki, wmi, wni]
sq = maximum path weight between source and vq
For all vertices: si = max { sq + wqi }, taken over all vq ∈ pred(vi)

Bellman-Ford for Cycles, Neg. Weights
[Figure: example graph with source v0 and vertices v1, v2, v3; si = path weight (v0, vi)]
[Table: s0, s1, s2, s3 initially and after each iteration of the shortest path algorithm; n = 3]
This example was solved incorrectly by Dijkstra’s shortest path algorithm.

Bellman-Ford for Negative Cycle
[Figure: example graph with source v0 and vertices v1, v2, v3; si = path weight (v0, vi)]
[Table: s0, s1, s2, s3 initially and after each iteration of the shortest path algorithm; n = 3]
The values have not stabilized after n iterations. Inconsistent problem: negative cycle.

Retiming Example
[Figure: circuit with gates a, b, c (delays 10, 5, 5) and flip-flops (FF)]

Retiming Graph
[Figure: circuit with gates a, b, c (delays 10, 5, 5) and flip-flops (FF), and its retiming graph with vertices h (delay 0), a (delay 10), b (delay 5), c (delay 5)]
Critical path = 15
It is the longest path consisting only of zero weight edges.

Feasibility Constraints (Condition 1)
[Figure: circuit and retiming graph with vertices h (delay 0), a (delay 10), b (delay 5), c (delay 5)]
ri – rj ≤ wij for all edges i → j
Retiming should not cause negative edge weights.
  rh – ra ≤ 0
  ra – rb ≤ 0
  rb – rc ≤ 1
  rc – rh ≤ 1

Constraint Graph
[Figure: circuit and constraint graph with vertices rh, ra, rb, rc]
ri – rj ≤ wij for all edges i → j
Retiming should not cause negative edge weights.
Constraints for Condition 1:
  rh – ra ≤ 0
  ra – rb ≤ 0
  rb – rc ≤ 1
  rc – rh ≤ 1
Observation: The constraint graph has the same structure as the original retiming graph, with the signs of the weights reversed. Vertex labels are the retiming integer variables.

Max Delay for Min Weight Paths
[Figure: retiming graph with vertices h (delay 0), a (delay 10), b (delay 5), c (delay 5)]
  Pair (i,j)   W(i,j)   D(i,j)
  (h,a)        0        10
  (h,b)        0        15
  (h,c)        1        20
  (a,b)        0        15
  (a,c)        1        20
  (a,h)        2        20
  (b,c)        1        10
  (b,h)        2        10
  (b,a)        2        20
  (c,h)        1        5
  (c,a)        1        15
  (c,b)        1        20
T = 15
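
The W and D values can be reproduced with an all-pairs shortest path computation on ordered pairs. The sketch below is an assumption about how one might do it, not the method used in the lecture: each edge i → j is given the pair (wij, –delay of i), pairs are compared lexicographically so that the minimum weight wins first and the largest delay second, and the delay of the final vertex is added at the end. The edge weights (h→a 0, a→b 0, b→c 1, c→h 1) are read off the Condition 1 constraints; `compute_w_d` is a name chosen here.

```python
import math
from itertools import product

delay = {"h": 0, "a": 10, "b": 5, "c": 5}
edges = {("h", "a"): 0, ("a", "b"): 0, ("b", "c"): 1, ("c", "h"): 1}


def compute_w_d(delay, edges):
    """All-pairs W (min path weight) and D (max delay among min-weight paths).

    Assumes the graph has no zero-weight cycle.
    """
    verts = list(delay)
    best = {(u, v): (math.inf, 0) for u in verts for v in verts}
    for (u, v), w in edges.items():
        best[u, v] = min(best[u, v], (w, -delay[u]))
    # Floyd-Warshall; product() varies k slowest, so k is the outer loop.
    for k, u, v in product(verts, repeat=3):
        (w1, x1), (w2, x2) = best[u, k], best[k, v]
        if w1 + w2 < math.inf:
            best[u, v] = min(best[u, v], (w1 + w2, x1 + x2))
    W = {(u, v): wx[0] for (u, v), wx in best.items()
         if u != v and wx[0] < math.inf}
    D = {(u, v): delay[v] - best[u, v][1] for (u, v) in W}
    return W, D


W, D = compute_w_d(delay, edges)
print(W["a", "h"], D["a", "h"])                       # 2 20
print(max(D[p] for p in W if W[p] == 0))              # 15, the critical path
```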

Timing Optimization, T = 7.5?
[Figure: constraint graph (feasibility) with vertices rh, ra, rb, rc]
W(i,j) and D(i,j) are the values tabulated under Max Delay for Min Weight Paths.
Add constraints for Condition 2: ri – rj ≤ W(i,j) – 1 for all paths (i,j) such that D(i,j) > 7.5

Timing Optimization, T = 7.5?
[Figure: constraint graph with vertices rh, ra, rb, rc after adding the Condition 2 edges]
Positive cycle; no solution for longest path.
For example, D(h,a), D(a,b) and D(b,c) all exceed 7.5, giving ra ≥ rh + 1, rb ≥ ra + 1 and rc ≥ rb. Together with rh ≥ rc – 1 from Condition 1, these form a cycle of weight +1.

Timing Optimization, T = 11.25?
[Figure: constraint graph with vertices rh, ra, rb, rc after adding the Condition 2 edges for D(i,j) > 11.25]
Solution: rh = 0, ra = 0, rb = 1, rc = 0
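
Both feasibility checks can be replayed in a few lines. This is a sketch under assumptions: the W and D values are copied from the table for this example, the edge weights are read off the Condition 1 constraints, and `feasible_retiming` is a hypothetical helper that starts every variable at 0 and relaxes until the values stop changing.

```python
# (W, D) per ordered pair, copied from the table for this example.
WD = {("h", "a"): (0, 10), ("h", "b"): (0, 15), ("h", "c"): (1, 20),
      ("a", "b"): (0, 15), ("a", "c"): (1, 20), ("a", "h"): (2, 20),
      ("b", "c"): (1, 10), ("b", "h"): (2, 10), ("b", "a"): (2, 20),
      ("c", "h"): (1, 5), ("c", "a"): (1, 15), ("c", "b"): (1, 20)}
edges = {("h", "a"): 0, ("a", "b"): 0, ("b", "c"): 1, ("c", "h"): 1}


def feasible_retiming(T):
    """Return retiming values for clock period T, or None if infeasible."""
    # Each (i, j, c) encodes r_i - r_j <= c, i.e. r_j >= r_i - c.
    cons = [(i, j, w) for (i, j), w in edges.items()]            # Condition 1
    cons += [(i, j, W - 1) for (i, j), (W, D) in WD.items()
             if D > T]                                           # Condition 2
    r = dict.fromkeys("habc", 0)
    for _ in range(len(r)):
        changed = False
        for i, j, c in cons:
            if r[i] - c > r[j]:          # longest-path relaxation
                r[j] = r[i] - c
                changed = True
        if not changed:
            return r
    return None                          # positive cycle


print(feasible_retiming(7.5))            # None
print(feasible_retiming(11.25))          # {'h': 0, 'a': 0, 'b': 1, 'c': 0}
```

In this example the same retiming is also returned for T = 10, which matches the critical path of the retimed circuit.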

Retiming Graph
[Figure: circuit with gates a, b, c (delays 10, 5, 5) and flip-flops (FF), and its retiming graph with vertices h (delay 0), a (delay 10), b (delay 5), c (delay 5)]
rh = 0, ra = 0, rb = 1, rc = 0
wij_retimed = wij + rj – ri

Retimed Circuit
[Figure: retimed circuit with gates a, b, c and flip-flops (FF), and its retiming graph with vertices h (delay 0), a (delay 10), b (delay 5), c (delay 5)]
rh = 0, ra = 0, rb = 1, rc = 0
Critical Path = 10
Logic optimization will remove these.
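
As a check on the retimed circuit, the retimed edge weights and the critical path can be recomputed from wij_retimed = wij + rj – ri. The sketch below assumes the edge weights read off the Condition 1 constraints (h→a 0, a→b 0, b→c 1, c→h 1); the helper names `retime` and `critical_path` are chosen here for illustration.

```python
delay = {"h": 0, "a": 10, "b": 5, "c": 5}
edges = {("h", "a"): 0, ("a", "b"): 0, ("b", "c"): 1, ("c", "h"): 1}
r = {"h": 0, "a": 0, "b": 1, "c": 0}


def retime(edges, r):
    """Apply w_retimed(i, j) = w(i, j) + r[j] - r[i] to every edge."""
    return {(i, j): w + r[j] - r[i] for (i, j), w in edges.items()}


def critical_path(delay, edges):
    """Largest total delay along a path of zero-weight edges.

    Assumes there is no cycle made only of zero-weight edges.
    """
    zero_succ = {v: [j for (i, j), w in edges.items() if i == v and w == 0]
                 for v in delay}

    def longest_from(v):
        return delay[v] + max((longest_from(j) for j in zero_succ[v]),
                              default=0)

    return max(longest_from(v) for v in delay)


retimed = retime(edges, r)
print(retimed)    # {('h', 'a'): 0, ('a', 'b'): 1, ('b', 'c'): 0, ('c', 'h'): 1}
print(critical_path(delay, edges), critical_path(delay, retimed))   # 15 10
```

All retimed weights stay non-negative, and the flip-flop count around the loop is unchanged at 2.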

Reference
G. De Micheli, Synthesis and Optimization of Digital Circuits, New York: McGraw-Hill.