Timing-Driven, Over-the-Block Rectilinear Steiner Tree Construction with Pre-Buffering and Slew Constraints
Yilin Zhang and David Z. Pan
ECE, Univ. of Texas at Austin
ISPD 2014

Outline
• Background & Motivation
• TOB-RSMT
  › Problem Formulation
  › TOB-RSMT Algorithms
• Experimental Results
• Conclusion

History of VLSI RSMTs
• Wirelength-driven: BOI, BI1S, RV-based RST, FLUTE and GeoSteiner
• Obstacle-avoiding RSMT (OA-RSMT)
  › [Chow+, VLSI14] [Liu+, DAC12] [Li+, ICCAD08]
• Over-the-block RSMTs (OB-RSMT) have been proposed since 2012
  › [Huang+, ICCAD12] [Zhang+, ICCAD12]
• Minimum delay routing tree (MDRT): BA-Tree, etc.
• RAT-driven RSMT: C-Tree, etc.

Limitations of Previous Timing-Driven RST
• Nodes are clustered during bottom-up construction
  › Such as BA-Tree and C-Tree
• Clustering distance metric:
  › Spatial distance and slack
• Accurate slack is hard to find:
  › Some segments are not fixed yet
  › No segment is buffered yet

Limitations in Dealing with Blocks
• Completely neglecting blocks causes slew problems
  › No over-the-block buffer is allowed
• Obstacle avoiding
  › More congestion outside the blocks
  › Detours mean more WL and worse timing
(Figure label: detours)

Post-buffering Topology Tuning is Necessary
• Buffering plays a big role in delay reduction
  › Shielding effect; linear delay on long wires
  › But buffers are always inserted after wiring
• Changing the topology after buffering is fruitful!
(Figure labels: D_SB unchanged; D_SA decreased; D_b2)

Our Contributions
• Use pre-buffering to find practical slack for each node in the graph
• Use over-the-block routing resources to improve WL, buffering cost and timing
• Apply post-buffering tuning to improve timing on critical paths with little extra cost

Problem Formulation
• N = {s_0, s_1, s_2, ..., s_n}: n sinks and the source s_0
• B = {b_1, b_2, ..., b_m}: non-overlapping rectilinear blocks in the two-dimensional space R
• A buffered tree T(V, E) connects all the pins in N to optimize WNS with the lowest buffering cost
  › V is the set of nodes
  › E is the set of horizontal and vertical edges
• The slew at every point in T must stay within the constraint
  › Slew mode buffering [Hu+, TCAD07]
• No buffers are allowed over the blocks

Timing Models
• Elmore delay
• Slew
  › PERI model + Bakoglu's metric
    » (4% error [Kashyap+, ISPD03] [Bakoglu+, 90])
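
The two timing models named above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Node` class, the function names, and the lumped wire model R * (C_wire / 2 + C_downstream) are assumptions made for the example. Bakoglu's metric is taken as ln 9 times the Elmore delay for a step input, and the PERI extension as the root-sum-square of the input slew and the step slew.

```python
import math
from dataclasses import dataclass, field

LN9 = math.log(9.0)  # 10%-90% transition factor of a single-pole step response


@dataclass
class Node:
    name: str
    cap: float = 0.0      # pin capacitance at this node
    wire_r: float = 0.0   # resistance of the wire from the parent
    wire_c: float = 0.0   # capacitance of the wire from the parent
    children: list = field(default_factory=list)


def downstream_cap(node):
    """Total capacitance at and below `node` (excluding its own parent wire)."""
    return node.cap + sum(ch.wire_c + downstream_cap(ch) for ch in node.children)


def elmore_delays(root, driver_r):
    """Elmore delay from the driver input to every node of an unbuffered stage."""
    delays = {}

    def visit(node, delay):
        delays[node.name] = delay
        for ch in node.children:
            # Each wire contributes R * (C_wire / 2 + C_downstream).
            visit(ch, delay + ch.wire_r * (ch.wire_c / 2 + downstream_cap(ch)))

    visit(root, driver_r * downstream_cap(root))
    return delays


def ramp_slew(input_slew, elmore_delay):
    """Output slew for a ramp input: PERI applied to Bakoglu's step-input slew."""
    step_slew = LN9 * elmore_delay            # Bakoglu's metric (step input)
    return math.hypot(input_slew, step_slew)  # PERI extension to a ramp input
```

Units are whatever the caller uses consistently (for example ohms and farads give seconds). `downstream_cap` is recomputed at every edge for brevity; a production version would cache it in one bottom-up pass.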

Overall Algorithm
Input: N & B
1. Initial timing-driven RST with pre-buffering
2. Find all over-the-block slew violations and fix them
3. Buffering
4. Tune the topology according to buffering information
5. Buffering
Output: buffered T

Initial Tree Generation with Pre-Buffering
• Iterative method
  › Runs until it converges or oscillates between several states
• Feed back the real delay to each node to find its slack (criticality)
  › Critical sinks identified before topology construction are the truly critical ones
  › Practical slack on each node
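
The iteration described above (rebuild the tree, buffer it, feed the resulting slack back, stop on convergence or oscillation) can be sketched as a fixed-point loop with cycle detection. The callback `build_and_buffer`, the rounding precision, and the rule of returning the best-WNS tree within a detected cycle are all assumptions for illustration; the slides do not say how a state is chosen once the loop oscillates.

```python
def iterate_with_prebuffering(initial_slack, build_and_buffer, max_iters=50, digits=3):
    """Repeat tree construction with fed-back slack until the slack state repeats.

    build_and_buffer(slack) -> (tree, new_slack) is a hypothetical callback that
    builds a topology from per-sink slack, buffers it, and returns the measured slack.
    """
    slack = dict(initial_slack)
    seen = {}      # slack state -> iteration at which it was first used as input
    history = []   # history[i] = (tree built in iteration i, its WNS)

    for it in range(max_iters):
        # Round so that floating-point noise does not hide a repeated state.
        key = tuple(sorted((name, round(value, digits)) for name, value in slack.items()))
        if key in seen:
            # Same state as the previous round means a fixed point; otherwise a cycle.
            status = "converged" if seen[key] == it - 1 else "oscillating"
            cycle = history[seen[key]:]
            best_tree, _ = max(cycle, key=lambda rec: rec[1])
            return best_tree, status
        seen[key] = it
        tree, new_slack = build_and_buffer(slack)
        history.append((tree, min(new_slack.values())))  # WNS = worst sink slack
        slack = dict(new_slack)

    best_tree, _ = max(history, key=lambda rec: rec[1])
    return best_tree, "max_iters"
```

Passing the builder as a callback keeps the loop independent of any particular Steiner tree or buffering engine.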

Initial Tree with Pre-Buffering Flow [Lin+, TCAD11]

Initial Tree with Pre-Buffering Example
• A simple model without buffering suggests that D is critical
• With buffering, however, D is not critical
• D is now inserted far from the source, with less WL

Buffering-Aware Over-the-Block TD-RST
• TD-RST needs over-the-block routes
  › Better WL, buffer resources and timing
  › Replace obstacle-avoiding detours with shorter over-the-block connections
(Figure labels: 150ps, 100ps, 120ps, 110ps)

Difference from WL-Driven BOB-RSMT
• Move non-critical paths to save slew
• Protect critical paths for timing
(Figure panels: Original; WL driven; WL+slack)

Slew Constraints in Buffering-Aware TD-RST
• The hard problem with over-the-block routing is slew
• Each topology confines a set of inside trees
• Use a hypothetical buffer to check whether buffering is feasible

Optimization Primitives
• Three optimization primitives [Zhang+, ICCAD12]
  › Parallel sliding
  › Perpendicular sliding
  › EP merging

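Formulation of Buffering-Aware TD-RST
• The formulation considers slack and WL together
(Formula not preserved. Its annotated terms: increase of TNS; increase of WL; delay increase for every sink downstream of an EP. Surviving symbols, with subscript layout lost: W ij, C d, EP i t)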

Buffer-location-based Tuning Benefits
• Tuning the topology after buffering pays off!
• Buffering resources are costly
• Improving timing without adding buffers is tempting
  › With a small WL increase
• We propose a way to post-tune the topology based on buffer location information

Saturated/Un-saturated Buffers
• Some buffers are "saturated" and some are "un-saturated"
  › Saturated: the slew reaches the maximum
  › Un-saturated: the slew does not reach the maximum

Buffer-location-based Tuning Study
• Un-saturated buffer == opportunity
(Figure labels: WL increase; delay to A improves)

Buffer-location-based Tuning Condition
• Δslew = slew_max - slew_cur
• L_max is the maximum allowed relocation distance
  › If the buffer input cap is neglected, L_max = [formula missing in transcript]
  › If the buffer input cap is considered, L_max = [formula missing in transcript]
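
The closed-form expressions for L_max are missing from this transcript, and the sketch below does not reconstruct them. It only illustrates the idea numerically under assumed models: Elmore delay for one buffer-driven wire, Bakoglu's ln 9 factor, and PERI for the ramp input. All function and parameter names are hypothetical. It classifies a stage as saturated when Δslew is zero and otherwise bisects for the extra wire length that uses up the remaining slew margin.

```python
import math

LN9 = math.log(9.0)


def stage_slew(length, r, c, driver_r, load_c, input_slew):
    """Slew at the end of a wire of `length` (r, c are per-unit-length values)."""
    elmore = driver_r * (c * length + load_c) + r * length * (c * length / 2 + load_c)
    return math.hypot(input_slew, LN9 * elmore)


def delta_slew(slew_max, slew_cur):
    """Remaining slew margin; zero means the buffer is saturated."""
    return max(0.0, slew_max - slew_cur)


def max_extra_length(cur_len, slew_max, r, c, driver_r, load_c=0.0, input_slew=0.0):
    """Largest extra wire length that keeps the stage within slew_max.

    load_c=0 corresponds to neglecting the downstream buffer input cap.
    """
    if c <= 0 or driver_r <= 0:
        raise ValueError("c and driver_r must be positive for the search to terminate")

    def slew(length):
        return stage_slew(length, r, c, driver_r, load_c, input_slew)

    if delta_slew(slew_max, slew(cur_len)) == 0.0:
        return 0.0  # saturated: no room to relocate
    lo, hi = cur_len, max(cur_len, 1.0)
    while slew(hi) < slew_max:  # grow the bracket until the constraint is violated
        hi *= 2
    for _ in range(60):         # slew is monotonic in length, so bisection applies
        mid = (lo + hi) / 2
        if slew(mid) < slew_max:
            lo = mid
        else:
            hi = mid
    return lo - cur_len
```

A nonzero `load_c` shrinks the returned length, which matches the slide's distinction between neglecting and considering the buffer input cap.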

Buffer-location-based Tuning Flow
(Flow chart; recoverable boxes, edges not preserved:)
• Input: buffered T
• Sort all sinks according to slack
• For each negative-slack sink n
• Satisfies the L_max constraint? (Y/N)
• Tuning
• n = n.parent
• n at source? (Y/N)
• Continue
• Buffering
• Return buffered T
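
Because the edges of the flow chart are lost, the sketch below shows only one plausible reading of it, not the authors' procedure: visit negative-slack sinks from worst to best, walk from each sink toward the source, and record the first node that satisfies the L_max constraint as the place to tune. The `parent` map, the `satisfies_lmax` predicate, and the function name are hypothetical.

```python
def tuning_candidates(parent, slack, source, satisfies_lmax):
    """Return (sink, node) pairs where topology tuning would be attempted.

    parent: maps each non-source node to its parent in the buffered tree.
    slack: maps each sink to its slack; only negative-slack sinks are visited.
    satisfies_lmax(node) -> bool: hypothetical check of the L_max constraint.
    """
    picks = []
    critical = sorted((s for s in slack if slack[s] < 0), key=lambda s: slack[s])
    for sink in critical:              # worst slack first
        n = sink
        while n != source:             # stop climbing once the source is reached
            if satisfies_lmax(n):
                picks.append((sink, n))
                break
            n = parent[n]
    return picks
```

After the selected nodes are tuned, the flow runs buffering again and returns the buffered tree; those two steps are outside this sketch.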

Experimental Setups
• Implemented in C++
• Intel Core 3.0GHz Linux machine with 32GB memory
• Gurobi Optimizer 5.10 for mathematical optimization
• Benchmarks: RC01-RC12 [Feng+, ISPD06]
• Two buffer sizes: 450 ohms and 850 ohms, with 3.8 fF and 1.9 fF, respectively
• Interconnect RC from ITRS; slew constraint of 70ps

Experimental Setups
• SD-OARST is the baseline [Lin+, TCAD11]
• TOB-RST-1 is OA-RST with pre-buffering
• TOB-RST-2 is over-the-block with pre-buffering
• TOB-RST is over-the-block with pre-buffering and post-buffering tuning

Experimental Results
• TOB-RST-1 vs. SD-OARST:
  › Similar WL and buffering cost
  › Pre-buffering benefits the slack
• TOB-RST-2 vs. TOB-RST-1:
  › 179ps better WNS on average
  › Buffering cost and WL reduced by 6% and 5%, respectively
• TOB-RST vs. TOB-RST-2:
  › 70ps better WNS on average, with less than 1% more WL

Conclusion
• Timing-driven over-the-block rectilinear Steiner minimum tree
• Use pre-buffering to find practical slack for each node
• Use over-the-block routing resources to improve WL, buffering cost and timing
• Apply post-buffering tuning to improve timing on critical paths with little extra cost
• Significantly improves WNS on all benchmarks, with 2% less WL and 4% less buffering cost than SD-OARST

Acknowledgment
• This work is supported in part by Oracle
• Thanks to Dr. Salim Chowdhury, Dr. Rajendran Panda and Dr. Akshay Sharma from Oracle

Thank you! Questions?
