Download presentation
Presentation is loading. Please wait.
Published byHorace Patrick Modified over 9 years ago
1
1 Retiming and Re-synthesis Outline: RetimingRetiming Retiming and Resynthesis (RnR)Retiming and Resynthesis (RnR) Resynthesis of PipelinesResynthesis of Pipelines
2
2 Optimizing Sequential Circuits by Retiming Netlist of Gates Netlist of gates and registers: Various Goals: –Reduce clock cycle time –Reduce area Reduce number of latchesReduce number of latches Inputs Outputs
3
3 Retiming Problem –Pure combinational optimization can be myopic since relations across register boundaries are disregarded Solutions –Retiming: Move register(s) so that clock cycle decreases, or number of registers decreases andclock cycle decreases, or number of registers decreases and input-output behavior is preservedinput-output behavior is preserved –RnR: Combine retiming with combinational optimization techniques Move latches out of the way temporarilyMove latches out of the way temporarily optimize larger blocks of combinationaloptimize larger blocks of combinational
4
4 Circuit Represetation [Leiserson, Rose and Saxe (1983)] Circuit representation: G(V,E,d,w) –V set of gates –E set of wires –d(v) = delay of gate/vertex v, (d(v) 0) –w(e) = number of registers on edge e, (w(e) 0)
5
5 Circuit Representation Example: Correlator Circuit (x, y) = 1 if x=y 0 otherwise Operation delay 3 3 + 7 + 7 Every cycle in Graph has at least one register i.e. no combinational loops. 0 33 0 0 0 0 2 Graph (Directed) 7a b+ Host
6
6 Preliminaries For a path p : Clock cycle For correlator c = 13 Path with w(p)=0 0 33 0 0 0 0 27
7
7 Movement of registers from input to output of a gate or vice versaMovement of registers from input to output of a gate or vice versa Does not affect gate functionalitiesDoes not affect gate functionalities A mathematical definition: retardationA mathematical definition: retardation –r: V Z, an integer vertex labeling –w r (e) = w(e) + r(v) - r(u) for edge e = (u,v) Basic Operation Retime by 1 Retime by -1
8
8 Thus in the example, r(u) = -1, r(v) = -1 results in For a path p: s t, w r (p) = w(p) + r(t) - r(s)For a path p: s t, w r (p) = w(p) + r(t) - r(s) RetardationRetardation –r: V Z, an integer vertex labeling –w r (e) =w(e) + r(v) - r(u) for edge e= (u,v) –A retiming r is legal if w r (e) 0, e E Basic Operation vu 0 33 0 0 0 0 27v u 0 33 0 1 1 0 17
9
9 Retiming for minimum clock cycle Problem Statement: (minimum cycle time) Given G (V, E, d, w), find a legal retiming r so that is minimized Retiming: 2 important matrices Register weight matrixRegister weight matrix Delay matrixDelay matrix
10
10 Retiming for minimum clock cycle W V0 V1 V2 V3 V0V1V2V3 0 2 2 2 0 0 0 0 0 2 0 0 0 2 2 0 c p, if d(p) then w(p) 1 D V0 V1 V2 V3 V0V1V2V3 0 3 6 13 13 3 6 13 10 13 3 10 7 10 13 7 V2 V1 v0 0 33 0 0 0 0 27 W = register path weight matrix (minimum # latches (minimum # latches on all paths between on all paths between u and v) u and v) D = path delay matrix (maximum delay on (maximum delay on all paths between all paths between u and v) u and v)
11
11 Conditions for Retiming Assume that we are asked to check if a retiming exists for a clock cycle Legal retiming: w r (e) 0 for all e. Hence w r (e) = w(e) = r(v) - r(u) 0 or r (u) - r (v) w (e) For all paths p: u v such that d(p) , we require w r (p) 1 –Thus Or take the least w(p) (tightest constraint) r(u)-r(v) W(u,v)-1 Note: this is independent of the path from u to v, so we just need to apply it to u, v such that D(u,v)
12
12 All constraints in difference-of-2-variable formAll constraints in difference-of-2-variable form Related to shortest path problemRelated to shortest path problem Solving the constraints Correlator: = 7 Legal: r(u)-r(v) w(e) D>7: r(u)-r(v) W(u,v)-1 V2 v1 v0 0 33 0 0 0 0 27 W V0 V1 V2 V3 V0V1V2V3 0 2 2 2 0 0 0 0 0 2 0 0 0 2 2 0 D V0 V1 V2 V3 V0V1V2V3 0 3 6 13 13 3 6 13 10 13 3 10 7 10 13 7
13
13 Do shortest path on constraint graph: (O(|V| 3 )).Do shortest path on constraint graph: (O(|V| 3 )). A solution exists if and only if there exists no negative weighted cycle.A solution exists if and only if there exists no negative weighted cycle. Solving the constraints Legal: r(u)-r(v) w(e) D>7: r(u)-r(v) W(u,v)-1 A solution is r(v 0 ) = r(v 3 ) = 0, r(v 1 ) = r(v 2 ) = -1 r(1) r(0) r(3)r(2) 0 1 1 1 1 1 0,-1 0,-1 0 0 2 Constraint graph
14
14 Retiming To find the minimum cycle time, do a binary search among the entries of the D matrix (0( V 3 log V )) Retime Retimed correlator: Clock cycle = 3+3+7=13 = 3+3+7=13 Clock cycle = 7 V2 v1 v0 0 33 0 0 0 0 2 7 a b + Host a b + Host W V0 V1 V2 V3 V0V1V2V3 0 2 2 2 0 0 0 0 0 2 0 0 0 2 2 0 D V0 V1 V2 V3 V0V1V2V3 0 3 6 13 13 3 6 13 10 13 3 10 7 10 13 7
15
15 1. Relaxation based: –Repeatedly find critical path; –retime vertex at end of path by +1 (O( V E log V )) 2. Also, Mixed Integer Linear Program formulation Retiming: 2 more algorithms +1 u Critical path v
16
16 Retiming for minimum area (minimum # latches) Goal: minimize number of registers used where a v is a constant.
17
17 Minimize: Minimum registers - formulation Subject to: w r (e) =w(e) + r(v) - r(u) 0 Reducible to a flow problemReducible to a flow problem
18
18 Retiming and resynthesis: motivation Goal: incorporate combinational optimization into sequential optimization Naïve approach: carve out combinational regions, do optimization on each region. Only local gains made.Naïve approach: carve out combinational regions, do optimization on each region. Only local gains made. Can we do any better?Can we do any better? Sentovich, Malik, Brayton andSangiovanni-Vincentelli (‘89) 3 step approach 1.Move registers to boundary of circuit 2.Optimize network 3.Move registers back in an optimal way RnR: a new approach
19
19 RnR: circuit representation Circuit representation: communication graph –internal/peripheral edges –edge-weight = register count c a b11 1 2 0 i j internal peripheral
20
20 Extended Retiming Move register to the peripheryMove register to the periphery Negative edge-weights permittedNegative edge-weights permitted A negative latch has the interpretation that it advances its output by 1 clock cycle instead of delaying itA negative latch has the interpretation that it advances its output by 1 clock cycle instead of delaying it 1 negativelatch
21
21 Peripheral Retiming A retiming is called a peripheral retiming if it results in all internal edges having zero weight Peripheral edges can have negative weight 1111 All internal edge weights are 0 1111 2222 nnnn 2222 nnnn kkkk kkkk peripheralweights
22
22 Peripheral Retiming A circuit that undergoes peripheral retiming followed by a legal retiming, i.e. one that results in all weights 0 is “functionally equivalent” to the original circuit Functional equivalence: equivalence of finite state automata: But we have to be careful about the initial conditions and initializing sequences. The resulting circuit may only exhibit equivalence after an appropriate delay. See [Singhal et.al ICCAD 1995]. This holds even for regular retiming
23
23 Peripheral Retiming - an example c a b 01 0 0 2 i j Peripheral retiming Not possible for all circuits: c a b 11 1 2 0 i j c a b0i j d d000 100 0 0 o1 o2
24
24 Path Weight Matrix (PWM) Matrix W –rows:inputs –columns:outputs a b1 i jk 1 o1o2 o1 o2 o1 o2 I 1 * j 0 ~ k * 0
25
25 PWM and Peripheral Retiming Satisfiable path weight matrix: 1. W ij ~, i, j and 2. i, j, such that W ij = I + j, W ij * Peripheral retiming possible matrix is satisfiable – i, j specify registers on peripheral edge –some i, j can be negative Complexity: linear in size of communication graph
26
26 Path Weight Matrix - generation From inputs to outputs: generate output columns of W. Example: a b1 i jk 1 o1o2 [ * ~ 0 ] [ 1 0 * ] [ 0 * * ] [ * 0 * ] [ * * 0 ] ([ 0 * * ]+1) & ([ * 0 * ]+0) = [ 1 * * ] & [ * 0 * ] = [ 1 0 * ] ([ 0 * * ]+1) & ([ * 0 * ]+0) = [ 1 * * ] & [ * 0 * ] = [ 1 0 * ] ([ * 0 * ]+0) & ([ * * 0 ]+0) = [ * 0 * ] & [ * * 0 ] = [ * 0 0 ] ([ * 0 * ]+0) & ([ * * 0 ]+0) = [ * 0 * ] & [ * * 0 ] = [ * 0 0 ] ( [ * 0 * ]+1) & [ * 0 0 ] = [ * 1 * ] & [ * 0 0 ] = [ * ~ 0 ] ( [ * 0 * ]+1) & [ * 0 0 ] = [ * 1 * ] & [ * 0 0 ] = [ * ~ 0 ] o1 o2 o1 o2 I 1 * j 0 ~ k * 0 Paths from inputs to node
27
27 Computing , Each constraint of the form I + j = W ij Procedure 1.set 1 = 0 2.use first row to generate ‘s 3.determine ‘s from ‘s 4.check for consistency Example: o i 2 j 3 c a b 1/0 1/1 1/0 2/0 0/2 i j o 1 = 0 2 = 1 1 = 2
28
28 Computing , Example: o1 o2 o1 o2 i 0 0 j 0 1 1+ 1 = 0 2+ 1 = 0 1+ 1 = 0 2+ 1 = 0 1+ 2 = 0 2+ 2 = 1 ------------------------------ 1- 2 = 0 1- 2 = -1 c a b0i j d d000 100 0 0 o1 o2 contradiction
29
29 Optimizing Acyclic Sequential Circuits Acyclic circuits with satisfiable W –Do peripheral retiming putting i, j registers at the I/O –Resynthesize interior –Do a legal retiming (move registers in) May not always be possibleMay not always be possible Acyclic circuits with unsatisfiable W –Identify maximal sub-circuits with satisfiable W –Cut connections –Repeat previous procedure
30
30 Acyclic Sequential Circuits Note that both of the cut circuits can be peripherally retimed, unlike the original. cutcircuits
31
31 Optimizing Cyclic Sequential Circuits Cyclic circuits 1.Make circuit acyclic by breaking cycles –Different feedback-cuts give different W’s 2.Find sub-circuits with satisfiable W’s 3.Repeat procedure
32
32 FSM Optimization AB OUT B OUT A Break feedback XX YY X Y
33
33 FSM Optimization Peripheral retiming Resynthesize B OUT A B OUT A X Y X Y X Y X Y combinational
34
34 FSM Optimization Retime Reconnect B OUT A B OUT A X Y X Y X Y
35
35 FSM Optimization Resynthesized circuit B OUT A AB OUT Original circuit
36
36 Resynthesis of Pipelines Goal: Performance optimization of pipeline circuits Example: Pipeline circuit C CnCnCnCn CiCiCiCi C1C1C1C1 OnOnOnOn OiOiOiOi O1O1O1O1 InInInIn IiIiIiIi I1I1I1I1 CnCnCnCn CiCiCiCi C1C1C1C1 OnOnOnOn OiOiOiOi O1O1O1O1 InInInIn IiIiIiIi I1I1I1I1 n -1 n -1 i-1 -(n -1) -(i-1) PeripheralRetiming Combinationalcircuit
37
37 Resynthesis of Pipelines Parameters: –Avector of arrival times for inputs –Rvector of required times for outputs –ctarget clock cycle –Ccircuit Pipeline performance problem: P P (C P,c P,A P,R P ) Combinational performance problem: P C (C C,c C,A C,R C )
38
38 Resynthesis of Pipelines Problem Transformation: –Relax P P (C P,c P,A P,R P ) to P ‘ P (C P,c p + ,A P,R P ) where = largest possible single gate delaywhere = largest possible single gate delay –Convert: pipeline P ‘ P to combinational problem P ‘ C
39
39 Perfomance Synthesis of Pipeline Arrival and Required Times for P ‘ C :Arrival and Required Times for P ‘ C : CnCnCnCn CiCiCiCi C1C1C1C1 OnOnOnOn OiOiOiOi O1O1O1O1 InInInIn IiIiIiIi I1I1I1I1 PeripheralRetiming CnCnCnCn CiCiCiCi C1C1C1C1 OnOnOnOn OiOiOiOi O1O1O1O1 InInInIn IiIiIiIi I1I1I1I1 (i-1)c+A i Combinationalcircuit (i-1)c+R i Theoretical Contribution: –If P ‘ C ( c p + ) has a solution then retiming yields a solution to P ‘ P ( c p + ) –If there is a solution to P P ( c p ) then peripheral retiming yields a solution to P ‘ C ( c p + )
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.