Retiming with Interconnect and Gate Delay. CUHK CSE CAD Group, Dennis Tong, 29th Sept., 2003.


1 Retiming with Interconnect and Gate Delay
CUHK CSE CAD Group, Dennis Tong, 29th Sept., 2003

2 Presentation Outline
• Retiming Revisited
• Retiming with Interconnect Delay
• Future Work
• Conclusion

3 Retiming
• Problem Formulation
– Given a sequential circuit $G(V, E, d(v), w(e_{uv}))$, retiming can be viewed as a vertex-to-integer mapping $r: V \to \mathbb{Z}$, where $\mathbb{Z}$ is the set of integers, such that a new circuit $G'(V, E, d(v), w_r(e_{uv}))$ is obtained.
– The retimed weight of every edge must stay non-negative: $w_r(e_{uv}) = w(e_{uv}) + r(v) - r(u) \ge 0$

4 Retiming with Interconnect Delay
• Two Algorithms Proposed
– an optimal approach: gives the optimal solution when both gate and interconnect delays are considered
– a near-optimal fast approach: gives the optimal solution when gate delay is neglected, still gives near-optimal results when both delays are considered, and runs much faster

5 An Optimal Approach
• Extension of the original paper: "Retiming Synchronous Circuitry", Charles E. Leiserson and James B. Saxe, Algorithmica, 6:5–35, 1991.
• Main Idea
– transform retiming into a special case of MILP (mixed-integer linear programming) that is solvable in polynomial time

6 Near-optimal Fast Approach
• Gives the optimal solution when there is no gate delay
• Pre-processing
– replace each gate by a wire
– represent gate delay $d(v)$ by wire delay $d(v_1, v_2)$
[Figure: pre-processing replaces gate $v$ with delay $d(v)$ by a wire from $v_1$ to $v_2$, with $d(v_1, v_2) = d(v)$, $d(v_1) = 0$, $d(v_2) = 0$]
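The pre-processing step might be sketched as follows. The data layout (an edge dictionary holding a wire delay and a register count) and the `("v", "in")` / `("v", "out")` naming for $v_1$ and $v_2$ are assumptions made for this illustration, not details given on the slide.

```python
def split_gates(vertices, edges, gate_delay):
    """Replace each gate v by two zero-delay vertices joined by a wire.

    vertices:   iterable of gate names
    edges:      dict (u, v) -> (wire_delay, registers)
    gate_delay: dict v -> d(v)
    Returns (new_vertices, new_edges). Every new vertex has gate delay 0,
    and the wire (v, "in") -> (v, "out") carries d(v) and no registers.
    """
    new_vertices = []
    new_edges = {}
    for v in vertices:
        v1, v2 = (v, "in"), (v, "out")
        new_vertices += [v1, v2]
        new_edges[(v1, v2)] = (gate_delay[v], 0)   # d(v1, v2) = d(v)
    for (u, v), (d, w) in edges.items():
        # original wires now run from u's output side to v's input side
        new_edges[((u, "out"), (v, "in"))] = (d, w)
    return new_vertices, new_edges


# Hypothetical two-gate circuit with made-up delays.
verts, wires = split_gates(["g", "h"], {("g", "h"): (1.5, 1)}, {"g": 2.0, "h": 3.0})
```

After this transformation all delay sits on edges, so the rest of the flow only has to reason about wire delays.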

7 Near-optimal Fast Approach
• Post-processing
– remove registers that "got retimed" into the gates
– use linear programming to minimize the clock period
[Figure: a register that "got retimed" into gate $v$, between $v_1$ and $v_2$, is "removed" from gate $v$ by post-processing]

8 Near-optimal Fast Approach
• Algorithm Overview
1. transform $G(V, E)$ into a DAG $G'(V', E')$
2. construct timing constraints
3. solve the set of constraints
4. find the optimum $T_{opt}$ by binary search
5. post-process flip-flops that "got retimed" into gates

9 Near-optimal Fast Approach
Step 1: Transform $G(V, E)$ into a DAG $G'(V', E')$
– traverse $G$ in a depth-first manner
– break all back edges found
– denote by $V_b$ the set of vertices that have back edges (e.g., $A, B \in V_b$)
[Figure: DFS traversal of $G(V, E)$ with vertices A, B, C yields $G'(V', E')$ with vertices A, B, C and copies A', B']
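Step 1 can be sketched as a standard colored depth-first search. The sketch assumes one reading of the figure: a back edge $u \to v$ is redirected to a fresh copy $v'$, and $V_b$ collects the vertices $v$ that back edges point to. The edge set of the example graph is also an assumption, inferred from the paths named later in the slides.

```python
def break_back_edges(vertices, adj):
    """Turn a directed graph into a DAG by redirecting DFS back edges.

    adj: dict vertex -> list of successors
    A back edge u -> v is replaced by u -> (v, "'"), a copy of v with no
    outgoing edges. Returns (dag_adj, v_b), where v_b is the set of
    vertices that received a copy.
    """
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in vertices}
    dag = {v: [] for v in vertices}
    v_b = set()

    def visit(u):
        color[u] = GRAY
        for v in adj.get(u, []):
            if color[v] == GRAY:            # v is on the DFS stack: back edge
                v_b.add(v)
                dag.setdefault((v, "'"), [])
                dag[u].append((v, "'"))
            else:
                dag[u].append(v)
                if color[v] == WHITE:
                    visit(v)
        color[u] = BLACK

    for s in vertices:
        if color[s] == WHITE:
            visit(s)
    return dag, v_b


# Assumed edge set for the three-vertex example.
adj = {"A": ["B", "C"], "B": ["C"], "C": ["A", "B"]}
dag, v_b = break_back_edges(["A", "B", "C"], adj)
```

The recursive form is fine for small graphs; a large netlist would need an explicit stack to stay within Python's recursion limit.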

10 Near-optimal Fast Approach
Step 2: Define Timing Variable $t_v$
– $t_v$, for all $v \in V'$, denotes the maximum interconnect delay from any register to an input of $v$.
– Example: $t_c = \max\{ t_{c1}, t_{c2} \} = t_{c1}$
– In general, $t_v$ is given by:
$t_v \ge \max_{u \in in(v)} \{ t_u + d(u,v) - (w(e_{uv}) + r(v) - r(u))\,T \}$
where $in(v)$ is the set of vertices in $V'$ with an edge pointing to $v$ in $G'$
[Figure: $G'$ with vertices A, B, C, A', B'; delays $t_{c1}$ and $t_{c2}$ arrive at C]

11 Near-optimal Fast Approach
Step 2: Construct Timing Constraints
Given $t_v$ for all $v \in V'$:
$t_v \ge \max_{u \in in(v)} \{ t_u + d(u,v) - (w(e_{uv}) + r(v) - r(u))\,T \}$   (1)
we have the constraints:
$t_v \le T, \quad \forall v \in V'$   (2)
$t_{v'} \le t_v, \quad \forall v \in V_b$   (3)
$r(v') = r(v), \quad \forall v \in V_b$   (4)
[Figure: $G'$ with vertices A, B, C, A', B'; delays $t_{c1}$ and $t_{c2}$ arrive at C]

12 Near-optimal Fast Approach
Step 3: Solve the Set of Timing Constraints
Main Idea
– Express $t_v$ for all $v \in V'$ in terms of $t_u$ and $r(u)$, where $u \in V_b$
– Reduce the constraints to ones that involve $t_u$ and $r(u)$ only
– Use the Bellman-Ford algorithm to solve for $t_u$ and $r(u)$
– Derive $t_v$ and $r(v)$ by propagating $t_u$ and $r(u)$ in $G'$
[Figure: $G'$ with vertices A, B, C, A', B']

13 Near-optimal Fast Approach
Step 3: Express $t_v$ in terms of $t_u$ and $r(u)$, where $u \in V_b$
– $\Delta_{uv}$, for all $u, v \in V'$, denotes the maximum delay among all directed paths from $u$ to $v$ in $G'$ when no retiming is done, with the delay reduced by $T$ for each register encountered.
– For example:
$\Delta_{AA'} = \max\{ d_{ABCA'} - 5T,\ d_{ACA'} - 3T \}$
$\Delta_{AB'} = \max\{ d_{ABCB'} - 4T,\ d_{ACB'} - 2T \}$
– Combining $t_v$ and $\Delta_{uv}$, (1) becomes:
$t_v \ge \max_{q \in anc(v)} \{ t_q + \Delta_{qv} - (r(v) - r(q))\,T \}$   (1')
where $anc(v)$ is the set of vertices in $V_b$ with a directed path to $v$ in $G'$
[Figure: $G'$ with vertices A, B, C, A', B']
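One way to evaluate such path delays is a longest-path sweep over the DAG in topological order, where each edge contributes its wire delay minus $T$ per register. The sketch below is an illustration under assumptions: the per-edge register counts are chosen only so that the path totals match the 5, 3, 4 and 2 registers in the example above, and the wire delays and the value of $T$ are made up.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def path_delays(dag_edges, source, T):
    """Longest-path value from `source` to every vertex reachable from it.

    dag_edges: dict (u, v) -> (wire_delay, registers) over an acyclic graph
    Each edge contributes wire_delay - registers * T to a path.
    """
    preds = {source: set()}
    for (u, v) in dag_edges:
        preds.setdefault(u, set())
        preds.setdefault(v, set()).add(u)
    best = {source: 0.0}
    for v in TopologicalSorter(preds).static_order():
        for u in preds[v]:
            if u in best:                       # u is reachable from source
                d, w = dag_edges[(u, v)]
                cand = best[u] + d - w * T
                if v not in best or cand > best[v]:
                    best[v] = cand
    return best


# Hypothetical delays; register counts sum to 5, 3, 4, 2 on the four paths.
dag_edges = {
    ("A", "B"): (3.0, 2), ("B", "C"): (4.0, 1), ("A", "C"): (5.0, 1),
    ("C", "A'"): (6.0, 2), ("C", "B'"): (2.0, 1),
}
delta_from_A = path_delays(dag_edges, "A", T=2.0)
```

With these made-up numbers the path through A, C, A' dominates the one through A, B, C, A', so only the larger of the two terms in the max survives.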

14 Near-optimal Fast Approach
Step 3: Reduce to constraints that involve $t_u$ and $r(u)$ only
Given $t_v$ for all $v \in V'$:
$t_v \ge t_q + \Delta_{qv} - (r(v) - r(q))\,T, \quad \forall q \in anc(v)$
Let $\tau_v = t_v + r(v)\,T$ for all $v \in V_b$; the constraints become:
$\tau_q + \Delta_{qv'} \le \tau_v$   (5)
$\tau_q + \Delta_{qv} \le \tau_v$   (6)
This is a system of difference inequalities.
[Figure: $G'$ with vertices A, B, C, A', B']
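A system of difference inequalities of this shape can be solved by Bellman-Ford style relaxation. The following is a generic sketch, not the presentation's own implementation: it starts every variable at zero, raises values until every constraint holds, and reports infeasibility when a positive-weight cycle keeps the values growing. The function name and the sample constraints are hypothetical.

```python
def solve_difference_system(variables, constraints):
    """Find values with value[q] + delta <= value[v] for every constraint.

    variables:   list of variable names
    constraints: list of (q, v, delta) triples
    Returns a dict of values, or None if the system is infeasible.
    """
    value = {x: 0.0 for x in variables}
    for _ in range(len(variables)):
        changed = False
        for q, v, delta in constraints:
            if value[q] + delta > value[v]:
                value[v] = value[q] + delta     # raise v just enough
                changed = True
        if not changed:
            return value
    return None  # still changing after |V| passes: positive cycle


# Made-up constraints: the cycle A -> B -> A sums to -1, so it is feasible.
feasible = solve_difference_system(["A", "B"], [("A", "B", 2.0), ("B", "A", -3.0)])
# Here the cycle sums to +1, so no assignment can satisfy both constraints.
infeasible = solve_difference_system(["A", "B"], [("A", "B", 2.0), ("B", "A", -1.0)])
```

The feasibility answer is what a search over $T$ needs, since the path delays, and therefore the cycle sums, shrink as $T$ grows.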

15 Near-optimal Fast Approach
Step 3: Derive $t_v$ and $r(v)$ from $t_u$ and $r(u)$ in $G'$
– Solve for $\tau_u$ for all $u \in V_b$ by the Bellman-Ford algorithm
– Compute $t_u$ and $r(u)$ given $\tau_u = t_u + r(u)\,T$
– Compute $t_v$ and $r(v)$ for all $v \in V' - V_b$ using (1')
Step 4: Find the optimum $T_{opt}$ by binary search
[Figure: $G'$ with vertices A, B, C, A', B']
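The binary search in Step 4 might look like the sketch below. It assumes a feasibility test `is_feasible(T)` that is monotone, meaning that once a period is achievable every larger period is too; the function name, the tolerance and the toy oracle are hypothetical stand-ins for the constraint solving in Steps 2 and 3.

```python
def min_feasible_period(is_feasible, lo, hi, tol=1e-3):
    """Smallest T in [lo, hi], within tol, for which is_feasible(T) holds.

    Assumes is_feasible is monotone: False below some threshold and True
    at or above it. Returns None if even `hi` is infeasible.
    """
    if not is_feasible(hi):
        return None
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if is_feasible(mid):
            hi = mid        # mid works, so the optimum is at or below mid
        else:
            lo = mid        # mid fails, so the optimum is above mid
    return hi


# Toy oracle with a made-up threshold, standing in for the real solver.
t_found = min_feasible_period(lambda T: T >= 18.82, lo=0.0, hi=100.0)
```

Each probe costs one run of the constraint solver, so the number of solver runs grows only with the logarithm of the search range divided by the tolerance.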

16 Experimental Results - I
                          NEAR-OPTIMAL             OPTIMAL
Circuit   #V      #E      T'      runtime (sec)    T_opt   runtime (sec)   (T' − T_opt)/T_opt (%)
s1488     655     1405    18.85   0.28             18.82   5.62            0.16
s1494     649     1411    20.78   0.25             20.78   4.37            0.00
s3271     1574    2707    10.24   1.09             10.24   33.70           0.00
s3330     1791    2890    27.05   0.50             27.05   43.14           0.00
s3384     1687    2782    24.21   0.74             24.16   25.19           0.21
• Testing Environment
– Intel Xeon 1.8GHz, 512KB cache, 512MB RAM
– ISCAS'89 suite

17 Experimental Results - II
                          NEAR-OPTIMAL             OPTIMAL
Circuit   #V      #E      T'      runtime (sec)    T_opt   runtime (sec)   (T' − T_opt)/T_opt (%)
s4863     2344    4093    23.58   3.12             23.58   87.75           0.00
s5378     2781    4261    27.27   1.16             27.25   138.68          0.07
s6669     3082    5399    23.07   1.91             22.96   177.59          1.00
s9234     5599    8005    42.73   4.08             42.73   512.86          0.00
s13207    7953    11302   72.34   8.11             72.34   1161.07         0.00
s15850    9774    13794   67.82   24.02            67.82   1545.59         0.00
s38417    22181   32135   36.53   83.56            36.52   7680.79         0.03
s38584    19255   33010   94.26   445.63           N/A     >15000          --

18 Future Work
• Multi-pin Net Handling
– find a maximum sharing of flip-flops in a net while the clock period is preserved
– avoid an unrealistic increase in the number of flip-flops
[Figure: a net from $u$ to $v_1$, $v_2$, $v_3$, shown once with a flip-flop shared in the stem and once with an unrealistic increase in the number of flip-flops]

19 Future Work
• Circuit Delay Modeling
– flip-flop positions affect delay estimation
– the loads in the branches affect one another
[Figure: a flip-flop retimed to X; positions X and Y are labelled]

