12/4/2018
A Regularity-Driven Fast Gridless Detailed Router for High Frequency Datapath Designs
By Sabyasachi Das (Intel Corporation) and Sunil P. Khatri (Univ. of Colorado, Boulder)

Outline
- Motivation
- Proposed approach
- Extracting Net-Clusters
- Selecting a representative bit-slice
- Routing same-bit and cross-bit nets
- Propagating the routes
- Advantages of our approach
- Experimental results
- Conclusions & Future Work

Datapath Designs
- Datapath designs are characterized by a distinctive regularity across bit-slices.
- The datapath is one of the most critical parts of any circuit.
- Datapaths are most commonly found in microprocessors, DSPs, and graphics ICs.
- Exploiting this regularity effectively is crucial for efficient datapath design.
- Datapath placement techniques are available.
- Few results are available on datapath routing.

Regularity in Datapath Circuits
- An important task is to route all the nets efficiently by using the regularity inherent in datapath circuits.

Overall Routing Flow

Net-Cluster Extraction
- A net-cluster is a collection of nets (spread over multiple bit-slices) in which all nets have similar connections.
- Different algorithms are used to extract net-clusters:
  - Footprint-driven clustering
  - Instance-driven clustering
  - Cluster merging

Footprint-driven Clustering
- Footprints are of two types:
  - The global footprint contains the pin names and the master-cell names of the connecting instances.
  - The detailed footprint contains the instance names and the net name.
- This clustering technique has two steps:
  - Groups are created from the global footprint information.
  - Within each Group, net-clusters are obtained by using the detailed footprint.

Footprint-driven Clustering
- After the global footprint step, two Groups are obtained:
  - G1 = {LB[3:0], SB[3:0]}
  - G2 = {SM[3:0]}
- After the detailed footprint step, three net-clusters are obtained:
  - From G1: NC1 = {LB[3:0]}, NC2 = {SB[3:0]}
  - From G2: NC3 = {SM[3:0]}
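The two-step grouping can be sketched in Python as follows. This is only an illustration under stated assumptions: the netlist format, the instance names, the master-cell names (LATCH, BUF, MUX2), and the rule that a bit index is a trailing `[n]` or `_n` are all hypothetical and are not taken from the slides.

```python
import re
from collections import defaultdict

def strip_bit(name):
    """Drop a trailing bit index such as '[3]' or '_3' (assumed naming rule)."""
    return re.sub(r"(\[\d+\]|_\d+)$", "", name)

def global_footprint(conns):
    """Pin names and master-cell names of the connecting instances."""
    return tuple(sorted((master, pin) for _, master, pin in conns))

def detailed_footprint(net, conns):
    """Net name and instance names, with bit indices removed."""
    return (strip_bit(net), tuple(sorted(strip_bit(inst) for inst, _, _ in conns)))

def footprint_clusters(nets):
    """Step 1: Groups by global footprint. Step 2: net-clusters per Group."""
    groups = defaultdict(list)
    for net, conns in nets.items():
        groups[global_footprint(conns)].append(net)
    clusters = []
    for members in groups.values():
        by_detail = defaultdict(list)
        for net in members:
            by_detail[detailed_footprint(net, nets[net])].append(net)
        clusters.extend(by_detail.values())
    return list(groups.values()), clusters

def make_netlist(bits=4):
    """Hypothetical 4-bit netlist: net -> [(instance, master_cell, pin), ...]."""
    nets = {}
    for i in range(bits):
        nets[f"LB[{i}]"] = [(f"lat_{i}", "LATCH", "Q"), (f"buf_{i}", "BUF", "A")]
        nets[f"SB[{i}]"] = [(f"sel_{i}", "LATCH", "Q"), (f"drv_{i}", "BUF", "A")]
        nets[f"SM[{i}]"] = [(f"sel_{i}", "LATCH", "QN"), (f"mux_{i}", "MUX2", "S")]
    return nets

groups, clusters = footprint_clusters(make_netlist())
```

With this made-up netlist, LB and SB share a global footprint and fall into one Group, SM forms a second Group, and the detailed footprint then splits the first Group into separate LB and SB net-clusters.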

Instance-driven Clustering
- Uses the locations of the connecting pins, so it does not need a uniform naming scheme.
- After instance-driven clustering, two net-clusters are obtained (from the same Group):
  - NC1 = {AB, CD, EF, GH}
  - NC2 = {KL, MN, RS, TV}
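One way to read location-based clustering is sketched below. The slides do not give the matching rule, so the following is an assumption: bit-slices are stacked vertically with a uniform height (`SLICE_HEIGHT`, hypothetical), and two nets belong together when their pin positions coincide after subtracting the origin of their own bit-slice. The pin coordinates are invented for the example.

```python
from collections import defaultdict

SLICE_HEIGHT = 10  # hypothetical uniform bit-slice height

def location_signature(pins, slice_height=SLICE_HEIGHT):
    """Pin positions relative to the bottom of the lowest slice the net touches."""
    base = min(y for _, y in pins) // slice_height * slice_height
    return tuple(sorted((x, y - base) for x, y in pins))

def instance_driven_clusters(net_pins, slice_height=SLICE_HEIGHT):
    """Group nets whose slice-relative pin locations are identical."""
    clusters = defaultdict(list)
    for net, pins in net_pins.items():
        clusters[location_signature(pins, slice_height)].append(net)
    return list(clusters.values())

# Nets with arbitrary names, one pair per bit-slice (made-up coordinates).
net_pins = {}
for bit, (a, b) in enumerate(zip(["AB", "CD", "EF", "GH"], ["KL", "MN", "RS", "TV"])):
    y0 = bit * SLICE_HEIGHT
    net_pins[a] = [(2, y0 + 3), (7, y0 + 3)]
    net_pins[b] = [(4, y0 + 6), (9, y0 + 8)]

result = instance_driven_clusters(net_pins)
```

No name parsing is involved, which is why net names such as AB or KL can still be clustered.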

Cluster Merging
- Multiple non-full clusters are merged.
- Footprint-driven clustering (FDC) found 3 clusters:
  - NC1 = {LB[3:0]}
  - NC2 = {SM[1:0]}
  - NC3 = {SB[3:0]}
- Instance-driven clustering (IDC) found 1 cluster:
  - NC4 = {ABC, DEF}
- The cluster-merging technique merged NC2 and NC4:
  - NC-New = {ABC, DEF, SM[1], SM[0]}
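A rough Python sketch of merging is given below. The slides do not state when two non-full clusters are compatible, so this sketch assumes a simple rule: a cluster is full when it has one net per bit-slice, and two non-full clusters may be merged when they occupy disjoint bit-slices. The `slice_of` map, including the slices assigned to ABC and DEF, is hypothetical; a real implementation would presumably also compare connectivity.

```python
def merge_clusters(clusters, slice_of, num_bits):
    """Greedily merge non-full clusters that cover disjoint bit-slices."""
    full = [list(c) for c in clusters if len(c) == num_bits]
    partial = [list(c) for c in clusters if len(c) < num_bits]
    merged = []
    for cand in partial:
        cand_slices = {slice_of[n] for n in cand}
        for target in merged:
            # Compatible (by assumption) if no bit-slice is covered twice.
            if cand_slices.isdisjoint(slice_of[n] for n in target):
                target.extend(cand)
                break
        else:
            merged.append(cand)
    return full + merged

# Hypothetical bit-slice assignment for every net in the example.
slice_of = {f"LB[{i}]": i for i in range(4)}
slice_of.update({f"SB[{i}]": i for i in range(4)})
slice_of.update({"SM[1]": 1, "SM[0]": 0, "ABC": 3, "DEF": 2})

nc1 = [f"LB[{i}]" for i in (3, 2, 1, 0)]
nc2 = ["SM[1]", "SM[0]"]
nc3 = [f"SB[{i}]" for i in (3, 2, 1, 0)]
nc4 = ["ABC", "DEF"]

merged_result = merge_clusters([nc1, nc2, nc3, nc4], slice_of, num_bits=4)
```

Under these assumptions NC1 and NC3 stay untouched, while NC2 and NC4 combine into a single four-net cluster.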

Selecting Representative Bit-Slice
- We conceptually extend the datapath to have an infinite number of bit-slices.
- Then any slice can be chosen as the representative bit-slice for routing.
- While routing, we need to take care of:
  - Same-bit nets
  - Backward cross-bit nets
  - Forward cross-bit nets

Routing Nets in Representative Bit-Slice
- Our routing approach combines pattern-based routing and maze routing.
- We call our approach strap-based routing, where a strap is a straight segment that is either vertical or horizontal.
- Our router uses minimal memory because it loads the design data of only the representative bit-slice.

Routing Nets in Representative Bit-Slice
- The router routes nets sequentially:
  - Nets connected to the topmost (Y-wise) pins are routed first.
  - Long nets are given preference.
- The router first tries to find a direct route for each net: only a vertical strap, only a horizontal strap, or a VTH/HTV strap.
- In case of conflict, a rip-up and re-route technique is used.
- If the strap-based router cannot find a route, the maze router is used.
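The sequence above (net ordering, direct strap routes, maze fallback) can be sketched in Python. Several simplifications are assumptions of this sketch and not the authors' implementation: the real router is gridless, while this sketch uses a unit grid for brevity; VTH/HTV is read as vertical-then-horizontal and horizontal-then-vertical; rip-up and re-route is omitted; the maze router is a plain breadth-first search. All function names are hypothetical.

```python
from collections import deque

def order_nets(nets):
    """Topmost pins first; among equals, longer (Manhattan) nets first."""
    def key(net):
        (x1, y1), (x2, y2) = net
        return (-max(y1, y2), -(abs(x1 - x2) + abs(y1 - y2)))
    return sorted(nets, key=key)

def strap_cells(a, b):
    """Grid cells covered by a straight strap a->b, or None if not axis-aligned."""
    (x1, y1), (x2, y2) = a, b
    if x1 == x2:
        step = 1 if y2 >= y1 else -1
        return [(x1, y) for y in range(y1, y2 + step, step)]
    if y1 == y2:
        step = 1 if x2 >= x1 else -1
        return [(x, y1) for x in range(x1, x2 + step, step)]
    return None

def strap_is_free(a, b, blocked):
    cells = strap_cells(a, b)
    return cells is not None and not any(c in blocked for c in cells)

def direct_route(src, dst, blocked):
    """Try one strap, then VTH, then HTV. Returns a list of straps or None."""
    if strap_is_free(src, dst, blocked):
        return [(src, dst)]
    for corner in ((src[0], dst[1]), (dst[0], src[1])):  # VTH, then HTV
        if strap_is_free(src, corner, blocked) and strap_is_free(corner, dst, blocked):
            return [(src, corner), (corner, dst)]
    return None

def maze_route(src, dst, blocked, width, height):
    """Breadth-first maze search; returns a list of cells or None."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        cell = queue.popleft()
        if cell == dst:
            path = []
            while cell is not None:
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nxt[0] < width and 0 <= nxt[1] < height
                    and nxt not in blocked and nxt not in prev):
                prev[nxt] = cell
                queue.append(nxt)
    return None

def route_net(src, dst, blocked, width, height):
    """Strap-based routing first; fall back to the maze router."""
    straps = direct_route(src, dst, blocked)
    if straps is not None:
        return "strap", straps
    path = maze_route(src, dst, blocked, width, height)
    return ("maze", path) if path is not None else None
```

A direct route needs at most one bend, which is consistent with the low via counts the slides attribute to the strap-based approach.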

Routing Cross-Bit Nets
- We model a single cross-bit net (spread over multiple bit-slices) as a combination of multiple smaller sub-nets, each confined within a single bit-slice.
- We virtually instantiate all those sub-nets in the representative bit-slice.
- If the maximum backward/forward connectivity is k, we need to find k different routes (usually, the value of k is not very high).
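The decomposition into per-slice sub-nets might look like the Python sketch below. The slides do not say how sub-net endpoints are chosen, so this sketch assumes vertically stacked slices of uniform height and a single crossing column `cross_x` at which the net passes every slice boundary. Both parameters, and the function name, are hypothetical.

```python
def split_cross_bit_net(src, dst, slice_height, cross_x):
    """Split a two-pin cross-bit net into sub-nets, one per bit-slice.

    src and dst are (x, y) pins in global coordinates. Each sub-net is
    returned as (slice_index, start, end) in slice-local coordinates, so
    that it can be instantiated in the representative bit-slice.
    """
    if src[1] > dst[1]:
        src, dst = dst, src  # process from the lower slice upwards
    first, last = src[1] // slice_height, dst[1] // slice_height
    subnets = []
    for s in range(first, last + 1):
        # Enter at the bottom boundary unless this slice holds the lower pin.
        start = (src[0], src[1] - s * slice_height) if s == first else (cross_x, 0)
        # Leave at the top boundary unless this slice holds the upper pin.
        end = (dst[0], dst[1] - s * slice_height) if s == last else (cross_x, slice_height)
        subnets.append((s, start, end))
    return subnets

# A net from bit-slice 0 to bit-slice 2 (forward connectivity of 2).
pieces = split_cross_bit_net((2, 3), (6, 24), slice_height=10, cross_x=4)
```

Each piece is confined to one slice and expressed in local coordinates, so all pieces can be overlaid on the representative bit-slice and routed there.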

Propagating the Routes
- After the nets in the representative bit-slice are routed, similar routes are propagated to the other nets in the same net-cluster.
- The following two cases are handled separately:
  - Same-bit nets: routes are propagated identically.
  - Cross-bit nets: if the maximum degree of forward/backward connectivity is p, we find p different routes for that net and propagate each one to the net of the correct bit-slice.
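For same-bit nets, identical propagation amounts to translating the representative route by a whole number of bit-slice heights. The Python sketch below assumes vertically stacked slices of uniform height and a route stored as a list of straps (pairs of end points); the function and parameter names are hypothetical.

```python
def propagate_route(straps, rep_slice, num_slices, slice_height):
    """Copy the representative slice's route to every bit-slice.

    straps: list of ((x1, y1), (x2, y2)) segments routed in slice rep_slice.
    Returns a dict mapping slice index -> translated list of straps.
    """
    routes = {}
    for s in range(num_slices):
        dy = (s - rep_slice) * slice_height  # vertical offset to slice s
        routes[s] = [((x1, y1 + dy), (x2, y2 + dy)) for (x1, y1), (x2, y2) in straps]
    return routes

# Route found in representative slice 1 of a 4-slice datapath (made-up numbers).
rep_route = [((0, 12), (3, 12)), ((3, 12), (3, 15))]
all_routes = propagate_route(rep_route, rep_slice=1, num_slices=4, slice_height=10)
```

Because every copy has the same shape, the nets in a net-cluster end up with matching wire lengths and bend counts.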

Advantages of Our Approach
- Speed of routing: only a fraction of the nets is actually routed, which makes routing much faster.
- Easy incremental routing: rip up all the routes of a given net-cluster and route them again identically. This can be done multiple times.
- Predictable routes: wiring parasitics are similar for the different nets in a net-cluster, so their timing estimates are similar too.
- Better debuggability and timing convergence: poorly routed nets are easy to find and fix.

Experimental Results (Circuits Used)
These four industrial datapath designs are used:

Circuit      # Instances  # Connections  # Bits
Industry-1   1056         5504           32
Industry-2   2368         12672
Industry-3   4672         29184          64
Industry-4   6208         48128

Experimental Results (run-time)

Circuit      Ind. Router  Our Router  Ratio
Industry-1   70           12          0.17
Industry-2   145          26          0.18
Industry-3   264          36          0.14
Industry-4   394          42          0.11
Average                               0.15

Our router is about 5.6X to 9.4X faster on these circuits (roughly 6.7X for the average ratio of 0.15).
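The ratio column and the speedups it implies can be rechecked from the run-time figures with a few lines of Python. The numbers are copied from the table; the slides do not state the time unit, and none is assumed here.

```python
# (Ind. Router, Our Router) run-times, as listed in the table.
runtimes = {
    "Industry-1": (70, 12),
    "Industry-2": (145, 26),
    "Industry-3": (264, 36),
    "Industry-4": (394, 42),
}

# Ratio = ours / industrial; speedup is the inverse.
ratios = {name: ours / ind for name, (ind, ours) in runtimes.items()}
speedups = {name: ind / ours for name, (ind, ours) in runtimes.items()}

average_ratio = sum(ratios.values()) / len(ratios)

for name in runtimes:
    print(f"{name}: ratio {ratios[name]:.2f}, speedup {speedups[name]:.1f}X")
print(f"Average ratio: {average_ratio:.2f}")
```

The per-circuit speedups range from about 5.6X (Industry-2) to about 9.4X (Industry-4).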

Experimental Results (wire-length)

Circuit      Ind. Router  Our Router  % Improvement
Industry-1   47458        46790       1.41
Industry-2   97453        99476       -2.07
Industry-3   276456       282569      -2.21
Industry-4   589679       564568      4.25
Average                               0.35

The average wire-length gain of our method is minimal.

Experimental Results (via-count)

Circuit      Ind. Router  Our Router  % Improvement
Industry-1   6856         5678        17.18
Industry-2   25876        22336       13.68
Industry-3   39568        39452       0.28
Industry-4   44568        42976       3.57
Average                               8.68

Fewer vias are used because of the strap-based nature of our router.

Conclusions & Future Work
- The regular routing technique is very useful for fast automated datapath design.
- In the future, we plan to focus on:
  - Crosstalk issues in datapath routing
  - Handling irregular connectivity