1 Timing-Driven, Over-the-Block Rectilinear Steiner Tree Construction with Pre-Buffering and Slew Constraints Yilin Zhang and David Z. Pan ECE, Univ. of.


1 Timing-Driven, Over-the-Block Rectilinear Steiner Tree Construction with Pre-Buffering and Slew Constraints
Yilin Zhang and David Z. Pan, ECE, Univ. of Texas at Austin
ISPD 2014

2 Outline
Background & Motivation
TOB-RSMT
  › Problem Formulation
  › TOB-RSMT Algorithms
Experimental Results
Conclusion

3 History of VLSI RSMTs
Wirelength-driven: BOI, BI1S, RV-based RST, FLUTE, and GeoSteiner
Obstacle-avoiding RSMT (OA-RSMT)
  › [Chow+, VLSI14] [Liu+, DAC12] [Li+, ICCAD08]
Over-the-block RSMT (OB-RSMT), proposed since 2012
  › [Huang+, ICCAD12] [Zhang+, ICCAD12]
Minimum delay routing tree (MDRT): BA-Tree, etc.
RAT-driven RSMT: C-Tree, etc.

4 Limitations of Previous Timing-driven RST
Nodes are clustered during bottom-up construction
  › Examples: BA-Tree and C-Tree
Clustering distance metric:
  › spatial distance and slack
Accurate slack is hard to find at that stage:
  › Some segments are not fixed yet
  › No segments are buffered yet

5 Limitations in Dealing with Blocks
Completely ignoring blocks causes slew problems
  › No over-the-block buffers are allowed
Obstacle avoiding
  › More congestion outside the blocks
  › Detours mean more WL and worse timing
[Figure label: detours]

6 Post-buffering Topology Tuning is Necessary
Buffering plays a big role in delay reduction
  › Shielding effect; linear delay on long wires
  › But buffers are always placed after wiring
Changing the topology after buffering is fruitful!
[Figure labels: D_SB unchanged; D_SA decreased; D_b2]

7 Our Contributions
Use pre-buffering to find a practical slack for each node in the graph
Use over-the-block routing resources to improve WL, buffering cost, and timing
Apply post-buffering tuning to improve timing on critical paths with little extra cost

8 Outline
Background & Motivation
TOB-RSMT
  › Problem Formulation
  › TOB-RSMT Algorithms
Experimental Results
Conclusion

9 Problem Formulation
N = {s_0, s_1, s_2, ..., s_n}: n sinks and the source s_0
B = {b_1, b_2, ..., b_m}: non-overlapping rectilinear blocks in two-dimensional space R
A buffered tree T(V, E) connects all the pins in N to optimize WNS with the lowest buffering cost
  › V is the set of nodes
  › E is the set of horizontal and vertical edges
Slew at every point in T must stay within the constraint
  › Slew-mode buffering [Hu+, TCAD07]
No buffers are allowed over the blocks

10 Timing Models
Elmore delay
Slew
  › PERI model + Bakoglu's metric
  › (4% error [Kashyap+, ISPD03] [Bakoglu+, 90])
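The two models named on this slide can be combined in a few lines. The sketch below is an illustration only, not the authors' implementation: it computes Elmore delay on an RC tree, converts it to a step-input slew with Bakoglu's ln 9 metric, and extends that to ramp inputs with the PERI root-sum-square rule. The function names, the array-based tree encoding, and the lumping of each wire's capacitance at its far node are assumptions made for the example.

```python
import math

def elmore_delays(parent, res, cap, r_drv=0.0):
    """Elmore delay from the driver to every node of an RC tree.

    parent[i] is the parent of node i (node 0 is the root, parent[0] = -1),
    and nodes are assumed to be ordered so that parent[i] < i.
    res[i] is the resistance of the edge parent[i] -> i (res[0] is unused).
    cap[i] is the capacitance lumped at node i.
    """
    n = len(parent)
    down = list(cap)                      # downstream capacitance per node
    for i in range(n - 1, 0, -1):         # accumulate children into parents
        down[parent[i]] += down[i]
    delay = [0.0] * n
    delay[0] = r_drv * down[0]            # driver resistance sees the whole tree
    for i in range(1, n):
        delay[i] = delay[parent[i]] + res[i] * down[i]
    return delay

def bakoglu_step_slew(elmore):
    """Bakoglu's metric: 10%-90% slew for a step input is ln(9) * Elmore delay."""
    return math.log(9.0) * elmore

def peri_slew(slew_in, step_slew):
    """PERI extension to ramp inputs: root-sum-square of input and step slew."""
    return math.hypot(slew_in, step_slew)
```

Resistances in ohms and capacitances in farads give delays and slews in seconds; any other consistent unit system works the same way.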

11 Overall Algorithm
Flow (recovered from the slide's flowchart):
  › Input: N & B
  › Initial timing-driven RST with pre-buffering
  › Find all over-the-block slew violations and fix them
  › Buffering
  › Tune the topology according to buffering information
  › Buffering
  › Return buffered T

12 Initial Tree Generation with Pre-Buffering
Iterative method
  › Runs until it converges or oscillates between several states
Feed the real delay back to each node to find its slack (criticality)
  › Sinks identified as critical before topology construction are the truly critical ones
  › Practical slack on each node
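The stopping rule on this slide (stop on convergence or on oscillation between a few states) can be sketched as a feedback loop with cycle detection. This is a minimal illustration under stated assumptions, not the authors' code: `build_tree` and `buffer_and_time` are hypothetical callbacks standing in for the topology construction and the buffering-plus-timing step, the initial all-zero slack is an assumption, and returning the best-WNS state when the loop oscillates is a choice made here because the slide does not say which state is kept.

```python
def prebuffer_loop(sinks, build_tree, buffer_and_time, max_iters=50):
    """Iterate tree construction and buffering until the slack state repeats.

    build_tree(slack) -> tree            (hypothetical callback)
    buffer_and_time(tree) -> {sink: slack after buffering}  (hypothetical)
    Returns (wns, tree, slack) for the best worst-negative-slack seen.
    """
    slack = {s: 0.0 for s in sinks}      # assumption: all sinks start equally critical
    seen = set()
    best = None
    for _ in range(max_iters):
        # A repeated state means convergence (period 1) or oscillation (period > 1).
        state = tuple(sorted((s, round(v, 3)) for s, v in slack.items()))
        if state in seen:
            break
        seen.add(state)
        tree = build_tree(slack)          # topology guided by the current slack
        slack = buffer_and_time(tree)     # real, buffered delay fed back as slack
        wns = min(slack.values())
        if best is None or wns > best[0]:
            best = (wns, tree, slack)
    return best
```

Rounding the slack values before hashing keeps tiny numerical differences from hiding a repeated state; the rounding precision is arbitrary here.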

13 Initial Tree with Pre-Buffering Flow
[Flowchart not preserved in the transcript; cited source: Lin+, TCAD11]

14 Initial Tree with Pre-Buffering Example
A simple model without buffering suggests that D is critical
With buffering, however, D is not critical
D is now inserted far from the source, with less WL

15 Buffering-Aware Over-the-Block TD-RST
TD-RST needs over-the-block routes
  › Better WL, buffer resources, and timing
  › Replace obstacle-avoiding detours with shorter over-the-block connections
[Figure labels: 150ps, 100ps, 120ps, 110ps]

16 Difference from WL-driven BOB-RSMT
Move non-critical paths to save slew
Protect critical paths for timing
[Figure panels: Original; WL driven; WL+slack]

17 Slew Constraints in Buffering-Aware TD-RST
The hard problem with over-the-block routing is slew
Each topology confines a set of inside trees
Use hypothetical buffers to check whether buffering is feasible

18 Optimization Primitives
Three optimization primitives [Zhang, ICCAD12]
  › Parallel sliding
  › Perpendicular sliding
  › EP merging

19 Formulation of Buffering-Aware TD-RST
The formulation considers slack and WL together
[Equation not preserved in the transcript. Surviving annotations: "W ij"; "C d EP i t: delay increase for every sink downstream of EP i t"; "Increase of TNS"; "Increase of WL"]

20 Buffer-location-based Tuning Benefits
Tuning the topology after buffering pays off!
Buffering resources are costly
Improving timing without adding buffers is attractive
  › With a small WL increase
We propose a way to post-tune the topology based on buffer location information

21 Saturated/Un-saturated Buffers
Some buffers are "saturated" and some are "un-saturated"
  › Saturated: the slew reaches the maximum
  › Un-saturated: the slew does not reach the maximum

22 Buffer-location-based Tuning Study
Un-saturated buffer == opportunity
[Figure labels: WL increase; delay to A improves]

23 Buffer-location-based Tuning Condition
Δslew = slew_max − slew_cur
L_max is the maximum allowed relocation distance
  › If the buffer input cap is neglected, L_max = [formula not preserved in the transcript]
  › If the buffer input cap is considered, L_max = [formula not preserved in the transcript]
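The closed-form expressions for L_max did not survive in this transcript, so the sketch below does not reproduce them. Instead it shows, under explicitly assumed models, how a relocation budget could be found numerically: the wire slew is taken as PERI applied to Bakoglu's ln 9 metric on a distributed RC wire, and bisection finds the largest extra length that keeps the slew at or below slew_max. Every name and parameter (`wire_slew`, `max_extra_length`, `r`, `c`, `r_drv`, `c_load`, `slew_in`) is hypothetical; setting `c_load = 0` corresponds to neglecting the downstream buffer's input capacitance.

```python
import math

LN9 = math.log(9.0)

def wire_slew(length, r, c, r_drv, c_load, slew_in):
    """Slew at the far end of a wire (assumed model: Elmore + Bakoglu + PERI).

    r, c: wire resistance and capacitance per unit length.
    r_drv: driver output resistance; c_load: load capacitance at the far end.
    """
    elmore = r_drv * (c * length + c_load) + r * length * (c * length / 2.0 + c_load)
    return math.hypot(slew_in, LN9 * elmore)

def max_extra_length(cur_len, slew_max, r, c, r_drv, c_load, slew_in,
                     hi=1e5, tol=1e-6):
    """Largest extra wire length that keeps the far-end slew <= slew_max.

    Returns 0.0 for a saturated segment (no slew headroom left).
    """
    def ok(extra):
        return wire_slew(cur_len + extra, r, c, r_drv, c_load, slew_in) <= slew_max
    if not ok(0.0):
        return 0.0
    if ok(hi):
        return hi                          # headroom exceeds the search bound
    lo = 0.0
    while hi - lo > tol:                   # slew is monotone in length, so bisect
        mid = (lo + hi) / 2.0
        if ok(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

With these assumptions, a non-zero `c_load` always shortens the budget compared with `c_load = 0`, which matches the slide's distinction between the two cases. Units only need to be mutually consistent.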

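24 Buffer-location-based Tuning Flow
Flowchart boxes recovered from the slide (the arrow layout is not preserved in the transcript):
  › Input: buffered T
  › Sort all sinks according to slack
  › For each negative-slack sink n
  › Decision: n at source? (Y / N)
  › Decision: n satisfies the L_max constraint?
  › Tuning
  › n = n.parent
  › Continue
  › Buffering
  › Return buffered T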
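One plausible reading of this flow is a walk from each negative-slack sink up toward the source, tuning every node that passes the L_max check. The sketch below is an interpretation, not the authors' procedure: the transcript loses the flowchart's arrows, so continuing to the parent after a successful tuning and skipping nodes already visited through another sink are assumptions, and `satisfies_lmax` and `tune` are hypothetical callbacks. The final re-buffering step is left to the caller.

```python
def tune_by_buffer_location(slack, parent, source, satisfies_lmax, tune):
    """Walk from each negative-slack sink to the source, tuning eligible nodes.

    slack: {sink: slack}; parent: {node: parent node}; source: root node.
    satisfies_lmax(node) -> bool and tune(node) are hypothetical callbacks.
    Returns the nodes that were tuned, in visiting order.
    """
    visited = set()
    tuned = []
    # Most critical (most negative slack) sinks are handled first.
    critical = sorted((s for s in slack if slack[s] < 0), key=lambda s: slack[s])
    for sink in critical:
        n = sink
        # Stop at the source, or at a node already handled via another sink.
        while n != source and n not in visited:
            visited.add(n)
            if satisfies_lmax(n):
                tune(n)
                tuned.append(n)
            n = parent[n]
    return tuned
```

Processing sinks in order of increasing slack means shared ancestors are tuned on behalf of the most critical sink that reaches them.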

25 Outline
Background & Motivation
TOB-RSMT
  › Problem Formulation
  › TOB-RSMT Algorithms
Experimental Results
Conclusion

26 Experimental Setups
Implemented in C++
Intel Core 3.0GHz Linux machine with 32GB memory
Gurobi Optimizer 5.10 for mathematical optimization
Benchmarks: RC01-RC12 [Feng+, ISPD06]
Two buffer sizes: resistances of 450 ohms and 850 ohms, capacitances of 3.8 fF and 1.9 fF, respectively
Interconnect RC values from ITRS; slew constraint of 70 ps

27 Experimental Setups
SD-OARST is the baseline [Lin+, TCAD11]
TOB-RST-1 is OA-RST with pre-buffering
TOB-RST-2 is over-the-block with pre-buffering
TOB-RST is over-the-block with pre-buffering and post-buffering tuning

28 Experimental Results
TOB-RST-1 vs. SD-OARST:
  › Similar WL (buffering cost)
  › Pre-buffering benefits the slack
TOB-RST-2 vs. TOB-RST-1:
  › WNS improves by 179 ps on average
  › Buffering cost and WL are reduced by 6% and 5%
TOB-RST vs. TOB-RST-2:
  › WNS improves by 70 ps on average, with less than 1% more WL

29 Experimental Results
[Results table not preserved in the transcript]

30 Outline
Background & Motivation
TOB-RSMT
  › Problem Formulation
  › TOB-RSMT Algorithms
Experimental Results
Conclusion

31 Conclusion
Timing-driven over-the-block rectilinear Steiner minimum tree
Use pre-buffering to find a practical slack for each node
Use over-the-block routing resources to improve WL, buffering cost, and timing
Apply post-buffering tuning to improve timing on critical paths with little extra cost
WNS improves significantly on all benchmarks, with 2% less WL and 4% less buffering cost than SD-OARST

32 Acknowledgment
This work is supported in part by Oracle
Thanks to Dr. Salim Chowdhury, Dr. Rajendran Panda, and Dr. Akshay Sharma from Oracle
Thank you! Questions?

