Timing-Driven, Over-the-Block Rectilinear Steiner Tree Construction with Pre-Buffering and Slew Constraints Yilin Zhang and David Z. Pan ECE, Univ. of.

Presentation transcript:

Timing-Driven, Over-the-Block Rectilinear Steiner Tree Construction with Pre-Buffering and Slew Constraints
Yilin Zhang and David Z. Pan
ECE, Univ. of Texas at Austin
ISPD 2014

Outline
• Background & Motivation
• TOB-RSMT
  › Problem Formulation
  › TOB-RSMT Algorithms
• Experimental Results
• Conclusion

History of VLSI RSMTs
• Wirelength-driven: BOI, BI1S, RV-based RST, FLUTE and GeoSteiner
• Obstacle-avoiding RSMT (OA-RSMT)
  › [Chow+, VLSI14] [Liu+, DAC12] [Li+, ICCAD08]
• Over-the-block RSMT (OB-RSMT), proposed since 2012
  › [Huang+, ICCAD12] [Zhang+, ICCAD12]
• Minimum-delay routing tree (MDRT): BA-Tree, etc.
• RAT-driven RSMT: C-Tree, etc.

Limitations of Previous Timing-Driven RSTs
• Nodes are clustered in a bottom-up method
  › Such as BA-Tree and C-Tree
• Clustering distance metric:
  › Spatial distance and slack
• Accurate slack is hard to find at that stage:
  › Some segments are not fixed yet
  › No segments are buffered yet

Limitations in Dealing with Blocks
• Completely ignoring blocks causes slew problems
  › No buffers are allowed over the blocks
• Obstacle avoiding
  › More congestion outside the blocks
  › Detours mean more WL and worse timing
[Figure: obstacle-avoiding detours]

Post-buffering Topology Tuning is Necessary
• Buffering plays a big role in delay reduction
  › Shielding effect; linear delay on long wires
  › But buffering is always performed after wiring
• Changing the topology after buffering is fruitful!
[Figure labels: D_SB unchanged; D_SA decreased; D_b2]

Our Contributions
• Use pre-buffering to find practical slack for each node in the graph
• Use over-the-block routing resources to improve WL, buffering cost and timing
• Apply post-buffering tuning to improve timing on critical paths with little extra cost

Problem Formulation
• N = {s_0, s_1, s_2, ..., s_n}: n sinks and source s_0
• B = {b_1, b_2, ..., b_m}: non-overlapping rectilinear blocks in two-dimensional space R
• A buffered tree T(V, E) connects all the pins in N to optimize WNS with the lowest buffering cost
  › V is the set of nodes
  › E is the set of horizontal and vertical edges
• The slew at every point in T must stay within constraints
  › Slew mode buffering [Hu+, TCAD07]
• No buffers are allowed over the blocks
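
The inputs and constraints on this slide can be sketched as a small data model. This is only an illustration under stated assumptions: each block is simplified to an axis-aligned rectangle (the slide allows general rectilinear blocks), block boundaries are treated as legal buffer sites, and WNS is taken as the smallest sink slack. The names `Rect`, `buffer_location_legal`, `manhattan` and `worst_slack` are hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned block (assumption: real blocks may be rectilinear polygons)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def contains(self, x, y):
        # Strict interior test: points on the boundary count as outside.
        return self.x1 < x < self.x2 and self.y1 < y < self.y2


def buffer_location_legal(x, y, blocks):
    """A buffer may not be placed over any block."""
    return not any(b.contains(x, y) for b in blocks)


def manhattan(p, q):
    """Length of a rectilinear connection between points p and q."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def worst_slack(slacks):
    """Smallest slack over all sinks (assumed convention for WNS)."""
    return min(slacks.values())
```

Routing over a block remains possible in this model; only buffer placement is restricted, which is why slew has to be checked on the over-the-block portions of the tree.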

Timing Models
• Elmore delay
• Slew
  › PERI model + Bakoglu's metric
  › (4% error [Kashyap+, ISPD03] [Bakoglu+, 90])
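
A minimal sketch of how these models are commonly combined, assuming a single uniform wire segment that drives a lumped load. Bakoglu's metric estimates the step-response slew as ln 9 times the Elmore delay, and PERI combines that with the input slew as a root sum of squares. The function and parameter names are hypothetical, and the slides do not show the exact expressions the authors used.

```python
import math


def wire_elmore_delay(length, r_per_unit, c_per_unit, load_cap):
    """Elmore delay of a uniform wire of the given length driving load_cap."""
    wire_res = r_per_unit * length
    wire_cap = c_per_unit * length
    # The distributed wire capacitance is seen through half the wire resistance.
    return wire_res * (wire_cap / 2.0 + load_cap)


def bakoglu_step_slew(elmore_delay):
    """Bakoglu's metric: 10%-90% step-response slew is ln(9) times Elmore delay."""
    return math.log(9.0) * elmore_delay


def peri_slew(input_slew, step_slew):
    """PERI: output slew for a ramp input from the input slew and the step slew."""
    return math.hypot(input_slew, step_slew)


def slew_at_wire_end(input_slew, length, r_per_unit, c_per_unit, load_cap):
    """Slew at the far end of the wire, given the slew at its near end."""
    delay = wire_elmore_delay(length, r_per_unit, c_per_unit, load_cap)
    return peri_slew(input_slew, bakoglu_step_slew(delay))
```

Because the wire term grows with the square of the length, the slew at the far end degrades quickly on long unbuffered segments, which is the concern for routes that cross a block.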

Overall Algorithm
Input: N & B
1. Initial timing-driven RST with pre-buffering
2. Find all over-the-block slew violations and fix them
3. Buffering
4. Tune the topology according to buffering information
5. Buffering
Output: buffered T

Initial Tree Generation with Pre-Buffering
• Iterative method
  › Runs until it converges or oscillates between several states
• Feed back the real delay to each node to find its slack (criticality)
  › Sinks identified as critical before topology construction are the truly critical ones
  › Practical slack on each node
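
The iteration described above can be sketched as a fixed-point loop that stops when a slack assignment repeats, which covers both convergence and oscillation between a few states. This is a hedged illustration, not the authors' implementation: `build_tree`, `buffer_tree` and `sink_slacks` are hypothetical callbacks standing in for tree construction, buffering and timing analysis, and slacks are rounded so that repeated states can be detected.

```python
def iterate_pre_buffering(initial_slacks, build_tree, buffer_tree, sink_slacks,
                          max_iters=20, digits=3):
    """Rebuild and re-buffer the tree until the slack state repeats.

    initial_slacks: dict mapping sink name to estimated slack.
    build_tree(slacks) -> tree, buffer_tree(tree) -> buffered tree,
    sink_slacks(tree) -> dict of slacks measured on the buffered tree.
    """
    slacks = dict(initial_slacks)
    seen_states = set()
    tree = None
    for _ in range(max_iters):
        # Rounded snapshot of the current slacks, used to detect repeats.
        state = tuple(sorted((name, round(value, digits))
                             for name, value in slacks.items()))
        if state in seen_states:
            break  # converged, or oscillating between earlier states
        seen_states.add(state)
        tree = buffer_tree(build_tree(slacks))
        slacks = sink_slacks(tree)  # feed the real delays back as slacks
    return tree, slacks
```

The `max_iters` cap is an extra safeguard added for this sketch; the slide only mentions stopping on convergence or oscillation.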

Initial Tree with Pre-Buffering Flow
[Lin+, TCAD11]

Initial Tree with Pre-Buffering Example
• A simple model without buffering suggests D is critical
• However, with buffering, D is not critical
• Now, D is inserted far from the source with less WL

Buffering-Aware Over-the-Block TD-RST
• TD-RST needs over-the-block routes
  › Better WL, buffer resources and timing
  › Replace obstacle-avoiding detours with shorter over-the-block connections
[Figure: example with delay labels 100ps, 120ps, 110ps]

Differences from WL-Driven BOB-RSMT
• Move non-critical paths to save slew
• Protect critical paths for timing
[Figure panels: Original; WL driven; WL+slack]

Slew Constraints in Buffering-Aware TD-RST
• The hard problem with over-the-block routing is slew
• Each topology confines a set of inside trees
• Use hypothetical buffers to check whether buffering is possible

Optimization Primitives
• Three optimization primitives [Zhang+, ICCAD12]
  › Parallel sliding
  › Perpendicular sliding
  › EP merging

Formulation of Buffering-Aware TD-RST
• The formulation considers slack and WL together
  › d for EP_i^t: delay increase for every sink downstream of EP_i^t
  › Objective terms: increase of TNS and increase of WL
[Equation not captured; it also uses the symbols W_ij and C]

Buffer-location-based Tuning Benefits
• Tuning the topology after buffering is beneficial!
• Buffering resources are costly
• Improving timing without adding buffers is attractive
  › With a small WL increase
• We propose a way to post-tune the topology based on buffer location information

Saturated/Un-saturated Buffers
• Some buffers are "saturated" and some are "un-saturated"
  › Saturated: the slew reaches the maximum
  › Un-saturated: the slew does not reach the maximum
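
The distinction above amounts to a comparison against the slew limit. A minimal sketch, assuming slews are given in picoseconds and that "reaches the maximum" is tested within a small tolerance; `is_saturated`, `slew_headroom` and the `tol` parameter are hypothetical names introduced here.

```python
def is_saturated(slew_cur, slew_max, tol=1e-9):
    """True when the slew driven by the buffer has reached the limit."""
    return slew_cur >= slew_max - tol


def slew_headroom(slew_cur, slew_max):
    """Remaining slew margin; zero for a saturated buffer."""
    return max(0.0, slew_max - slew_cur)
```

For example, with a 70ps limit, a buffer whose downstream slew is 55ps is un-saturated and has 15ps of headroom that topology tuning could spend.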

Buffer-location-based Tuning Study
• Un-saturated buffer == opportunity
[Figure labels: WL increase; delay to A improves]

Buffer-location-based Tuning Condition
• Δslew = slew_max - slew_cur
• L_max is the maximum allowed distance to relocate
  › If the buffer input cap is neglected, L_max = [equation not captured]
  › If the buffer input cap is considered, L_max = [equation not captured]

Buffer-location-based Tuning Flow
Input: buffered T
1. Sort all sinks according to slack
2. For each negative-slack sink n:
  › Does n satisfy the L_max constraint?
  › Y: tuning
  › N: n = n.parent; if n is at the source, continue with the next sink
3. Buffering
Output: buffered T
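
One possible reading of this flowchart, sketched in code. The arrows of the original figure are not recoverable from the transcript, so the control flow below is an assumption: sinks are visited from worst slack upward, and for each negative-slack sink the walk moves toward the source until a node passes the L_max check. `Node`, `tuning_candidates` and the `satisfies_lmax` predicate are hypothetical; the actual L_max test is not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Node:
    name: str
    parent: Optional["Node"] = None  # None marks the source
    slack: float = 0.0


def tuning_candidates(sinks: List[Node],
                      satisfies_lmax: Callable[[Node], bool]
                      ) -> List[Tuple[str, str]]:
    """Return (sink name, node name) pairs where tuning would be attempted."""
    picks = []
    for sink in sorted(sinks, key=lambda s: s.slack):
        if sink.slack >= 0:
            break  # remaining sinks already meet timing
        n = sink
        while n.parent is not None:  # give up once the source is reached
            if satisfies_lmax(n):
                picks.append((sink.name, n.name))
                break
            n = n.parent
    return picks
```

In the full flow, each selected node would be tuned and the tree buffered again before the buffered T is returned; both steps are left out of this sketch.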

Experimental Setups
• Implemented in C++
• Intel Core 3.0GHz Linux machine with 32GB memory
• Gurobi Optimizer 5.10 for mathematical optimization
• Benchmarks: RC01-RC12 [Feng+, ISPD06]
• Two buffer sizes: 450 ohms and 850 ohms; 3.8 fF and 1.9 fF
• Interconnect RC from ITRS; slew constraint of 70ps

Experimental Setups
• SD-OARST is the baseline [Lin+, TCAD11]
• TOB-RST-1 is OA-RST with pre-buffering
• TOB-RST-2 is over-the-block with pre-buffering
• TOB-RST is over-the-block with pre-buffering and post-buffering tuning

Experimental Results
• TOB-RST-1 vs. SD-OARST:
  › Similar WL (and buffering cost)
  › Pre-buffering benefits the slack
• TOB-RST-2 vs. TOB-RST-1:
  › 179ps average WNS improvement
  › Buffering cost and WL reduced by 6% and 5%
• TOB-RST vs. TOB-RST-2:
  › 70ps average WNS improvement, with less than 1% more WL

Conclusion
• Timing-driven over-the-block rectilinear Steiner minimum tree
• Use pre-buffering to find practical slack for each node
• Use over-the-block routing resources to improve WL, buffering cost and timing
• Apply post-buffering tuning to improve timing on critical paths with little extra cost
• Significantly improves WNS on all benchmarks, with 2% less WL and 4% less buffering cost than SD-OARST

Acknowledgment
• This work is supported in part by Oracle
• Thanks to Dr. Salim Chowdhury, Dr. Rajendran Panda and Dr. Akshay Sharma from Oracle

Thank you! Questions?