F.F. Dragan (Kent State) A.B. Kahng (UCSD) I. Mandoiu (UCLA) S. Muddu (Sanera Systems) A. Zelikovsky (Georgia State) Provably Good Global Buffering by.


F.F. Dragan (Kent State) A.B. Kahng (UCSD) I. Mandoiu (UCLA) S. Muddu (Sanera Systems) A. Zelikovsky (Georgia State) Provably Good Global Buffering by Multiterminal Multicommodity Flow Approximation

2 Outline
- Buffer-block methodology for global buffering
- Global routing via buffer-blocks problem
- Integer node-capacitated multiterminal multicommodity flow (MTMCF) formulation
- Provably good approximation of fractional MTMCF
- Provably good rounding of fractional MTMCF
- Key implementation choices
- Experimental results
- Extensions & conclusions

3 Motivation
- VDSM ⇒ buffer / inverter insertion for all global nets
  - 50nm technology ⇒ >1,000,000 buffers
- Solution: insert buffers only in Buffer-Blocks (BBs)
  - Simplified design: isolates buffer insertion from circuit block implementations
  - Efficient utilization of routing/area resources (RAR)
    RAR(cap. 2k buffer-block) = c × RAR(cap. k buffer-block)
    For high-end designs, c is about 1.6

4 Buffer-Block Methodology
1. Buffer-block planning [Cong+99] [TangW00]
   - given: placement of circuit blocks + netlist
   - find: shape and location of BBs within the available free space so as to maximize the number of routable nets
2. Global buffering via given BBs (this paper)
   - given: nets + BB locations and capacities
   - find: buffered routing for each net, subject to timing-driven and buffer-parity constraints

5

6 Problem Formulation
Global Routing via Buffer-Blocks (GRBB) Problem
Given:
- BB locations and capacities
- list of multi-pin nets; each net has
  - an upper bound + parity requirement on #buffers for each source-sink path
  - [a non-negative weight (criticality coefficient)]
- L/U bounds on wirelength between consecutive buffers/pins
Find: buffered routing of a maximum [weighted] number of nets subject to the given constraints
[Dragan+00]: 2-pin nets
This paper: multi-pin nets

7 Our Contributions
- Integer node-capacitated MTMCF formulation
- Approximation algorithm for fractional MTMCF
  - Extends [GargK98, Fleischer99, Albrecht00, Dragan+00] to the node-capacitated + multiterminal case
- Provably good fractional MTMCF rounding algorithms ⇒ provably good algorithm for the GRBB Problem
- Practical rounding heuristics based on random walks
- Computational study comparing alternative implementations

8 Integer Program Formulation
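The formulas of this slide did not survive in the transcript. As a hedged sketch only, an integer program consistent with the problem statement could look like the following. The notation is an assumption: \mathcal{T}_k is the set of valid routing trees for net k, g_k is the optional net weight, f(T) says whether tree T is chosen, and \pi(T,v) is the number of buffers that T uses in buffer block v.

```latex
\begin{align*}
\max\ & \sum_{k} g_k \sum_{T \in \mathcal{T}_k} f(T) \\
\text{s.t.}\ & \sum_{k} \sum_{T \in \mathcal{T}_k} \pi(T,v)\, f(T) \le \mathrm{cap}(v)
  && \text{for every buffer block } v, \\
& \sum_{T \in \mathcal{T}_k} f(T) \le 1 && \text{for every net } k, \\
& f(T) \in \{0,1\} && \text{for every net } k \text{ and } T \in \mathcal{T}_k .
\end{align*}
```

Replacing the last constraint by f(T) >= 0 would give the fractional relaxation, a node-capacitated multiterminal multicommodity flow.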

9 “Relax+Round” Approach
1. Solve the fractional relaxation
   - Relaxation = node-capacitated multiterminal multicommodity flow
   - Exact linear programming algorithms are impractical for large instances
   - KEY IDEA: use an approximation algorithm
     - can approximate the optimum within a factor of (1-ε) for any ε>0
     - allows a continuous tradeoff between runtime and solution quality
2. Round to an integer solution
   - Provably good rounding using [RaghavanT87]
   - Practical rounding using random walks

10 The ε-MTMCF Algorithm
w(v) = δ, f = 0
For i = 1 to N do
  For k = 1, …, #nets do
    Find min weight valid routing tree T for net k
    While w(T) < min{ 1, δ(1+2ε)^i } do
      f(T) = f(T) + 1
      For every v ∈ T do
        w(v) ← ( 1 + ε π(T,v) / cap(v) ) * w(v)
      End For
      Find min weight valid routing tree T for net k
    End While
  End For
End For
Output f/N
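The loop structure of this slide can be sketched in Python for the 2-pin special case, where a minimum-weight routing tree is a shortest node-weighted path. Everything named here is an assumption made for illustration: `eps_mtmcf_2pin`, `min_weight_path`, `scale_to_capacity`, the `eps`, `delta` and `n_phases` parameters, and the choice of one buffer per visited buffer block. The transcript does not give a value for N, so this sketch returns raw flow counts and scales them by the measured congestion instead of dividing by N.

```python
import heapq

def min_weight_path(adj, w, src, dst):
    """Dijkstra with node weights; terminals (nodes not in w) cost nothing."""
    dist, prev, heap = {src: 0.0}, {}, [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        if u == dst:
            break
        for v in adj.get(u, ()):
            nd = d + w.get(v, 0.0)
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None, float("inf")
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return tuple(reversed(path)), dist[dst]

def eps_mtmcf_2pin(adj, cap, nets, eps, delta, n_phases):
    """Return raw (unscaled) flow counts keyed by (net index, path)."""
    w = {v: delta for v in cap}            # every buffer block starts at delta
    f = {}
    for i in range(1, n_phases + 1):
        bound = min(1.0, delta * (1 + 2 * eps) ** i)
        for k, (s, t) in enumerate(nets):
            path, weight = min_weight_path(adj, w, s, t)
            while path is not None and weight < bound:
                if len(path) <= 2:
                    raise ValueError("sketch assumes every route uses a buffer block")
                f[(k, path)] = f.get((k, path), 0) + 1
                for v in path[1:-1]:       # assumed: one buffer per visited block
                    w[v] *= 1 + eps / cap[v]
                path, weight = min_weight_path(adj, w, s, t)
    return f

def scale_to_capacity(f, cap):
    """Assumed substitute for the slide's 'Output f/N': divide by max congestion."""
    load = {v: 0.0 for v in cap}
    for (_, path), x in f.items():
        for v in path[1:-1]:
            load[v] += x
    congestion = max((load[v] / cap[v] for v in cap), default=0.0)
    if congestion <= 1.0:
        return dict(f)
    return {key: x / congestion for key, x in f.items()}
```

A call such as `scale_to_capacity(eps_mtmcf_2pin(adj, cap, nets, 0.5, 0.1, 4), cap)` yields a fractional flow that respects every buffer-block capacity by construction; it makes no claim about the approximation guarantee, which depends on how δ and N are chosen in the analysis.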

11 Runtime of ε-MTMCF Algorithm
Main step of the ε-MTMCF algorithm: computing a min node-weight valid routing tree for a net
⇔ min node-weight directed rooted Steiner tree (DRST) in a directed acyclic graph

12 Implementation choices
                  2-Pin                         3,4-pin                                        Multi-pin
Decomposition     Star, Minimum Spanning tree   Matching, 3-restricted Steiner tree            Not needed
Min-weight DRST   Shortest path (exact)         Try all Steiner pts + shortest paths (exact)   Very hard! ⇒ heuristics
Rounding          Random-walk                   Backward random-walks                          Backward random-walks
                  [Dragan+00]                   This paper                                     This paper

13 Provably Good Rounding
1. Store fractional flows f(T) for every valid routing tree T
2. Scale down each f(T) by 1-ε for small ε
3. Each net k is routed with prob. f(k) = Σ { f(T) | T routing for k }
   ⇒ Number of routed nets ≥ (1-ε)OPT
4. To route net k, choose tree T with probability f(T) / f(k)
   ⇒ With high probability, no BB capacity is exceeded
Problem: impractical to store all non-zero flow trees
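The four rounding steps can be sketched as follows, assuming the fractional flows are available as a dictionary per net. The function names `round_flows` and `capacity_violations`, the data layout, and the use of opaque tree identifiers are all hypothetical; the sketch shows only the sampling, not the probabilistic analysis of [RaghavanT87].

```python
import random

def round_flows(tree_flows, eps, rng):
    """tree_flows: {net: {tree_id: f(T)}}. Returns {net: chosen tree_id}."""
    routed = {}
    for net, flows in tree_flows.items():
        # Step 2: scale every f(T) down by (1 - eps).
        scaled = {tree: (1 - eps) * x for tree, x in flows.items()}
        f_k = sum(scaled.values())
        # Step 3: route net k with probability f(k).
        if f_k <= 0 or rng.random() >= f_k:
            continue
        # Step 4: pick tree T with probability f(T) / f(k).
        trees = list(scaled)
        routed[net] = rng.choices(trees, weights=[scaled[t] for t in trees])[0]
    return routed

def capacity_violations(routed, usage, cap):
    """usage: {tree_id: {bb: buffers used}}. Returns {bb: overflow} for overloaded BBs."""
    load = {}
    for tree in routed.values():
        for bb, n in usage.get(tree, {}).items():
            load[bb] = load.get(bb, 0) + n
    return {bb: n - cap[bb] for bb, n in load.items() if n > cap[bb]}
```

For example, `round_flows({"n1": {"T1": 0.5, "T2": 0.5}}, eps=0.1, rng=random.Random(0))` routes net `n1` with probability 0.9 and then picks either tree with equal probability. The dictionary of all trees with non-zero flow is exactly what the slide calls impractical to store.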

14 Random-Walk 2-TMCF Rounding
1. Store fractional flows f(T) for every valid routing tree T
2. Scale down each f(T) by 1-ε for small ε
3. Each net k is routed with prob. f(k) = Σ { f(T) | T routing for k }
   ⇒ Number of routed nets ≥ (1-ε)OPT
4. To route net k, choose tree T with probability f(T) / f(k)
   ⇒ With high probability, no BB capacity is exceeded
Use a random walk from source to sink
Practical: the random walk requires storing only flows on edges
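A minimal sketch of the random walk for one 2-pin net, assuming the net's fractional flow is stored per edge, satisfies flow conservation, and has acyclic support (so the walk must end at the sink). The name `random_walk_path` and the nested-dictionary layout are hypothetical.

```python
import random

def random_walk_path(edge_flow, source, sink, rng):
    """edge_flow: {u: {v: flow on edge (u, v)}} for a single net.

    Walks from source to sink, leaving each node along an outgoing edge
    chosen with probability proportional to the flow on that edge.
    Returns the visited nodes as a list, or None if the walk gets stuck.
    """
    path = [source]
    u = source
    while u != sink:
        out = {v: x for v, x in edge_flow.get(u, {}).items() if x > 0}
        if not out:
            return None  # no outgoing flow: the input violated conservation
        nodes = list(out)
        u = rng.choices(nodes, weights=[out[v] for v in nodes])[0]
        path.append(u)
    return path
```

Only the per-edge flow values are needed, which is the storage saving the slide points out: no explicit list of paths with non-zero flow has to be kept.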

15 Random-Walk MTMCF Rounding
[Figure: source S and sinks T1, T2, T3; labels "Source" and "Sinks"]

16 Random-Walk MTMCF Rounding
[Figure: source S and sinks T1, T2, T3; labels "Source" and "Sinks"]

17 Our MTMCF Rounding Heuristic
1. Round each net k with probability f(k), using backward random walks
   - No scaling-down, approximate MTMCF < OPT
2. Resolve capacity violations by greedily deleting routed paths
   - Few violations
3. Greedily route remaining nets using unused BB capacity
   - Further routing still possible

18 Implemented Heuristics
Greedy buffered routing
1. For each net, route sinks sequentially along a shortest path to the source or to a node already connected to the source
2. After routing a net, remove fully used BBs
MTMCF approximation + randomized rounding
- 2TMCF [Dragan+00]
- 3TMCF (3-pin decomposition + ε-MTMCF + rounding)
- 4TMCF (4-pin decomposition + ε-MTMCF + rounding)
- MTMCF (ε-MTMCF w/ approximate DRST + rounding)
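The greedy baseline can be sketched as below, under simplifying assumptions that are not in the slides: the routing graph is given as an undirected adjacency map whose edges already respect the wirelength bounds, path length is measured in hops, each buffer block on a route consumes one buffer, and the parity and buffer-count limits are ignored. The names `greedy_route` and `bfs_to_set` are hypothetical.

```python
from collections import deque

def bfs_to_set(adj, start, targets, usable):
    """Shortest hop path from start to any node in targets, through usable nodes.

    Returns the path listed from the reached target back to start, or None.
    """
    prev = {start: None}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if u in targets:
            path = []
            while u is not None:
                path.append(u)
                u = prev[u]
            return path
        for v in adj.get(u, ()):
            if v not in prev and (v in targets or usable(v)):
                prev[v] = u
                queue.append(v)
    return None

def greedy_route(adj, cap, nets):
    """nets: {name: (source, [sinks])}. Returns (routed, remaining capacities)."""
    remaining = dict(cap)
    routed = {}
    for name, (source, sinks) in nets.items():
        connected = {source}   # nodes already attached to the source
        used = {}              # buffers this net takes from each BB
        complete = True
        for sink in sinks:
            usable = lambda v: remaining.get(v, 0) - used.get(v, 0) > 0
            path = bfs_to_set(adj, sink, connected, usable)
            if path is None:
                complete = False
                break
            for bb in path[1:-1]:          # interior nodes are newly used BBs
                used[bb] = used.get(bb, 0) + 1
            connected.update(path)
        if complete:
            for bb, n in used.items():     # fully used BBs drop to capacity 0
                remaining[bb] -= n
            routed[name] = used
    return routed, remaining
```

Because nets are processed in the given order and never revisited, early nets can exhaust a buffer block that later nets need, which is the behavior the flow-based heuristics are compared against.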

19 Experimental Setup
- Test instances extracted from next-generation SGI microprocessor
- Up to 5,000 nets, ~6,000 sinks
- U = 4,000 µm, L = 500-2,000 µm
- 50 buffer blocks
- buffers / BB

20 % Sinks Connected
[Table: rows indexed by #sinks / #nets; columns Greed, 2TMCF, 3TMCF, 4TMCF, MTMCF, with each flow-based method run at ε=.64 and ε=.04; numeric entries lost]

21 Runtime (sec.)
[Table: rows indexed by #sinks / #nets; columns Greed, 2TMCF, 3TMCF, 4TMCF, MTMCF, with each flow-based method run at ε=.64 and ε=.04; numeric entries lost]

22 % Routed Nets vs. Runtime

23 Resource Usage
#nets = 4,764, #sinks = 6,038
                    Greed    2TMCF            3TMCF            4TMCF            MTMCF
                             ε=.64   ε=.04    ε=.64   ε=.04    ε=.64   ε=.04    ε=.64   ε=.04
# Conn. Sinks       5,645    5,725   5,842    5,779   5,896    5,827   5,942    5,813   5,897
WL/sink (microns)   7,479    7,891   8,182    7,697   8,083    7,582   7,992    7,798   8,057
#Buffers            9,037    9,860   10,676   9,591   10,610   9,497   10,507   9,860   10,647
[Rows % Conn. Sinks, Wirelength (meters), and #Buff/sink: values lost]

24 WL and #Buffers for 100% Completion
#nets = 4,764, #sinks = 6,038
Flow-rounding wastes routing resources!
[Table: rows indexed by BB Cap.; columns Greed, 2TMCF, 3TMCF, 4TMCF, MTMCF, with each flow-based method run at ε=.64 and ε=.04; entries garbled in extraction]

25 Conclusions and Ongoing Work
- Provably good algorithms and practical heuristics based on node-capacitated MTMCF approximation
  - Higher completion rates than previous algorithms
- Extensions:
  - Combine global buffering with BB planning
    - combine with compaction

26 Combining with compaction

27 Combining with compaction

28 Combining with compaction
Sum-capacity constraints: cap(BB1) + cap(BB2) ≤ const.

29 Conclusions and Ongoing Work
- Provably good algorithms and practical heuristics based on node-capacitated MTMCF approximation
  - Higher completion rates than previous algorithms
- Extensions:
  - Combine global buffering with BB planning
    - combine with compaction
  - Enforce channel capacity constraints
  - Improved resource usage
    - smart release of resources