Chapter 7 – Specialized Routing

Slides:



Advertisements
Similar presentations
OCV-Aware Top-Level Clock Tree Optimization
Advertisements

4/22/ Clock Network Synthesis Prof. Shiyan Hu Office: EREC 731.
1 Interconnect Layout Optimization by Simultaneous Steiner Tree Construction and Buffer Insertion Presented By Cesare Ferri Takumi Okamoto, Jason Kong.
Coupling-Aware Length-Ratio- Matching Routing for Capacitor Arrays in Analog Integrated Circuits Kuan-Hsien Ho, Hung-Chih Ou, Yao-Wen Chang and Hui-Fang.
VLSI Physical Design: From Graph Partitioning to Timing Closure Paper Presentation © KLMH Lienig 1 EECS 527 Paper Presentation Topological Design of Clock.
Rajat K. Pal. Chapter 3 Emran Chowdhury # P Presented by.
Minimum-Buffered Routing of Non- Critical Nets for Slew Rate and Reliability Control Supported by Cadence Design Systems, Inc. and the MARCO Gigascale.
38 th Design Automation Conference, Las Vegas, June 19, 2001 Creating and Exploiting Flexibility in Steiner Trees Elaheh Bozorgzadeh, Ryan Kastner, Majid.
3 -1 Chapter 3 The Greedy Method 3 -2 The greedy method Suppose that a problem can be solved by a sequence of decisions. The greedy method has that each.
Chapter 2 – Netlist and System Partitioning
Chapter 6 – Detailed Routing
Chapter 1 – Introduction
VLSI Routing. Routing Problem  Given a placement, and a fixed number of metal layers, find a valid pattern of horizontal and vertical wires that connect.
Estimation of Wirelength Reduction for λ-Geometry vs. Manhattan Placement and Routing H. Chen, C.-K. Cheng, A.B. Kahng, I. Mandoiu, and Q. Wang UCSD CSE.
VLSI Physical Design: From Graph Partitioning to Timing Closure Chapter 5: Global Routing © KLMH Lienig © 2011 Springer Verlag 1 Chapter 5 – Global Routing.
VLSI Physical Design: From Graph Partitioning to Timing Closure Chapter 3: Chip Planning © KLMH Lienig 1 Chapter 3 – Chip Planning 3.1 Introduction to.
Routing 2 Outline –Maze Routing –Line Probe Routing –Channel Routing Goal –Understand maze routing –Understand line probe routing.
Chapter 5 – Global Routing
Chip Planning 1. Introduction Chip Planning:  Deals with large modules with −known areas −fixed/changeable shapes −(possibly fixed locations for some.
1 ENTITY test is port a: in bit; end ENTITY test; DRC LVS ERC Circuit Design Functional Design and Logic Design Physical Design Physical Verification and.
VLSI Physical Design Automation
VLSI Physical Design: From Graph Partitioning to Timing Closure Chapter 5: Global Routing © KLMH Lienig 1 FLUTE: Fast Lookup Table Based RSMT Algorithm.
Chih-Hung Lin, Kai-Cheng Wei VLSI CAD 2008
Introduction to Routing. The Routing Problem Apply after placement Input: –Netlist –Timing budget for, typically, critical nets –Locations of blocks and.
A Topology-based ECO Routing Methodology for Mask Cost Minimization Po-Hsun Wu, Shang-Ya Bai, and Tsung-Yi Ho Department of Computer Science and Information.
Modern VLSI Design 4e: Chapter 4 Copyright  2008 Wayne Wolf Topics n Interconnect design. n Crosstalk. n Power optimization.
Escape Routing For Dense Pin Clusters In Integrated Circuits Mustafa Ozdal, Design Automation Conference, 2007 Mustafa Ozdal, IEEE Trans. on CAD, 2009.
Xin-Wei Shih and Yao-Wen Chang.  Introduction  Problem formulation  Algorithms  Experimental results  Conclusions.
© The McGraw-Hill Companies, Inc., Chapter 3 The Greedy Method.
Global Routing. Global routing:  To route all the nets, should consider capacities  Sequential −One net at a time  Concurrent −Order-independent 2.
Global Routing.
1 Coupling Aware Timing Optimization and Antenna Avoidance in Layer Assignment Di Wu, Jiang Hu and Rabi Mahapatra Texas A&M University.
Network Aware Resource Allocation in Distributed Clouds.
CAD for Physical Design of VLSI Circuits
© KLMH Lienig 1 Chapter 1 – Introduction Original Authors: Andrew B. Kahng, Jens Lienig, Igor L. Markov, Jin Hu VLSI Physical Design: From Graph Partitioning.
1 CS612 Algorithms for Electronic Design Automation CS 612 – Lecture 2 Background and Introduction Mustafa Ozdal Computer Engineering Department, Bilkent.
New Modeling Techniques for the Global Routing Problem Anthony Vannelli Department of Electrical and Computer Engineering University of Waterloo Waterloo,
Massachusetts Institute of Technology 1 L14 – Physical Design Spring 2007 Ajay Joshi.
Placement. Physical Design Cycle Partitioning Placement/ Floorplanning Placement/ Floorplanning Routing Break the circuit up into smaller segments Place.
VLSI Physical Design: From Graph Partitioning to Timing Closure Chapter 6: Detailed Routing © KLMH Lienig 1 What Makes a Design Difficult to Route Charles.
Modern VLSI Design 3e: Chapter 4 Copyright  1998, 2002 Prentice Hall PTR Topics n Interconnect design. n Crosstalk. n Power optimization.
Modern VLSI Design 4e: Chapter 3 Copyright  2008 Wayne Wolf Topics n Pseudo-nMOS gates. n DCVS logic. n Domino gates. n Design-for-yield. n Gates as IP.
Modern VLSI Design 3e: Chapter 10 Copyright  1998, 2002 Prentice Hall PTR Topics n CAD systems. n Simulation. n Placement and routing. n Layout analysis.
Register Placement for High- Performance Circuits M. Chiang, T. Okamoto and T. Yoshimura Waseda University, Japan DATE 2009.
InterConnection Network Topologies to Minimize graph diameter: Low Diameter Regular graphs and Physical Wire Length Constrained networks Nilesh Choudhury.
1 CS612 Algorithms for Electronic Design Automation CS 612 – Lecture 3 Partitioning Mustafa Ozdal Computer Engineering Department, Bilkent University Mustafa.
CSE 589 Part VI. Reading Skiena, Sections 5.5 and 6.8 CLR, chapter 37.
CHAPTER 8 Developing Hard Macros The topics are: Overview Hard macro design issues Hard macro design process Physical design for hard macros Block integration.
Detailed Routing مرتضي صاحب الزماني.
Simultaneous Analog Placement and Routing with Current Flow and Current Density Considerations H.C. Ou, H.C.C. Chien and Y.W. Chang Electronics Engineering,
Modern VLSI Design 3e: Chapter 7 Copyright  1998, 2002 Prentice Hall PTR Topics n Power/ground routing. n Clock routing. n Floorplanning tips. n Off-chip.
Routing Tree Construction with Buffer Insertion under Obstacle Constraints Ying Rao, Tianxiang Yang Fall 2002.
Maze Routing Algorithms with Exact Matching Constraints for Analog and Mixed Signal Designs M. M. Ozdal and R. F. Hentschke Intel Corporation ICCAD 2012.
LEMAR: A Novel Length Matching Routing Algorithm for Analog and Mixed Signal Circuits H. Yao, Y. Cai and Q. Gao EDA Lab, Department of CS, Tsinghua University,
Spatial Indexing Techniques Introduction to Spatial Computing CSE 5ISC Some slides adapted from Spatial Databases: A Tour by Shashi Shekhar Prentice Hall.
Prof. Shiyan Hu Office: EERC 518
Routing Topology Algorithms Mustafa Ozdal 1. Introduction How to connect nets with multiple terminals? Net topologies needed before point-to-point routing.
CSE 248 Skew 1Kahng, UCSD 2011 CSE248 Spring 2011 Skew.
Clock Distribution Network
VLSI Floorplanning and Planar Graphs prepared and Instructed by Shmuel Wimer Eng. Faculty, Bar-Ilan University July 2015VLSI Floor Planning and Planar.
Zero Skew Clock Routing ECE 556 Project Proposal John Thompson Kurt Ting Simon Wong.
A Novel Timing-Driven Global Routing Algorithm Considering Coupling Effects for High Performance Circuit Design Jingyu Xu, Xianlong Hong, Tong Jing, Yici.
VLSI Physical Design Automation
Chapter 1 – Introduction
Chapter 7 – Specialized Routing
1.3 Modeling with exponentially many constr.
Chapter 2 – Netlist and System Partitioning
Chapter 5 – Global Routing
Zero Skew Clock tree Implementation
Clock Tree Routing With Obstacles
Presentation transcript:

Chapter 7 – Specialized Routing VLSI Physical Design: From Graph Partitioning to Timing Closure Original Authors: Andrew B. Kahng, Jens Lienig, Igor L. Markov, Jin Hu

Chapter 7 – Specialized Routing 7.1 Introduction to Area Routing 7.2 Net Ordering in Area Routing 7.3 Non-Manhattan Routing 7.3.1 Octilinear Steiner Trees 7.3.2 Octilinear Maze Search 7.4 Basic Concepts in Clock Networks 7.4.1 Terminology 7.4.2 Problem Formulations for Clock-Tree Routing 7.5 Modern Clock Tree Synthesis 7.5.1 Constructing Trees with Zero Global Skew 7.5.2 Clock Tree Buffering in the Presence of Variation

Functional Design and Logic Design Physical Verification and Signoff 7 Specialized Routing System Specification Partitioning Architectural Design ENTITY test is port a: in bit; end ENTITY test; Functional Design and Logic Design Chip Planning Circuit Design Placement Physical Design Clock Tree Synthesis Physical Verification and Signoff DRC LVS ERC Signal Routing Fabrication Timing Closure Packaging and Testing Chip

Multi-Stage Routing of Signal Nets 7 Specialized Routing Routing Multi-Stage Routing of Signal Nets Global Routing Detailed Routing Timing-Driven Routing Large Single- Net Routing Geometric Techniques Coarse-grain assignment of routes to routing regions (Chap. 5) Fine-grain assignment of routes to routing tracks (Chap. 6) Net topology optimization and resource allocation to critical nets (Chap. 8) Power (VDD) and Ground (GND) routing (Chap. 3) Non-Manhattan and clock routing (Chap. 7)

7 Specialized Routing Area routing directly constructs metal routes for signal connections (no global and detailed routing, Secs. 7.1-7.2) Non-Manhattan routing is presented in Sec. 7.3 Clock signals and other nets that require special treatment are discussed in Secs. 7.4-7.5

7.1 Introduction to Area Routing The goal of area routing is to route all nets in the design without global routing within the given layout space while meeting all geometric and electrical design rules Area routing performs the following optimizations minimizing the total routed length and number of vias of all nets minimizing the total area of wiring and the number of routing layers minimizing the circuit delay and ensuring an even wire density avoiding harmful capacitive coupling between neighboring routes Subject to technology constraints (number of routing layers, minimal wire width, etc.) electrical constraints (signal integrity, coupling, etc.) geometry constraints (preferred routing directions, wire pitch, etc.)

7.1 Introduction to Area Routing Minimal wirelength: IC1 4 1 IC2 IC3 Alternative routing path: 4 IC3 1 1 IC1 IC2 1 4 4 Metal1 Metal2 Via

7.1 Introduction to Area Routing Distance metric between two points P1 (x1,y1) and P2 (x2,y2) Euclidean distance Manhattan distance P1 dM dE dM P2

7.1 Introduction to Area Routing Multiple Manhattan shortest paths between two points

7.1 Introduction to Area Routing Multiple Manhattan shortest paths between two points m = 210 y x Lösungsweg dargelegt in: Pomberger, Dobler: Algorithmen und Datenstrukturen. Pearson. S. 208-211 With no obstacles, the number of Manhattan shortest paths in an Δx × Δy region is

7.1 Introduction to Area Routing Two pairs of points may admit non-intersecting Manhattan shortest paths, while their Euclidean shortest paths intersect (but not vice versa).

7.1 Introduction to Area Routing If all pairs of Manhattan shortest paths between two pairs of points intersect, then so do Euclidean shortest paths.

7.1 Introduction to Area Routing The Manhattan distance dM is (slightly) larger than the Euclidean distance dE: 1.41 worst case: a square where 1.27 on average, without obstacles 1.15 on average, with obstacles

7.2 Net Ordering in Area Routing 7.1 Introduction to Area Routing 7.2 Net Ordering in Area Routing 7.3 Non-Manhattan Routing 7.3.1 Octilinear Steiner Trees 7.3.2 Octilinear Maze Search 7.4 Basic Concepts in Clock Networks 7.4.1 Terminology 7.4.2 Problem Formulations for Clock-Tree Routing 7.5 Modern Clock Tree Synthesis 7.5.1 Constructing Trees with Zero Global Skew 7.5.2 Clock Tree Buffering in the Presence of Variation

7.2 Net Ordering in Area Routing Effect of net ordering on routability A´ B´ Optimal routing of net A A B A´ B´ Optimal routing of net B A B A´ B´ Nets A and B can be routed only with detours A B © 2011 Springer Verlag

7.2 Net Ordering in Area Routing Effect of net ordering on total wirelength A A´ B B´ Routing net A first Routing net B first A A´ B B´ © 2011 Springer Verlag

7.2 Net Ordering in Area Routing For n nets, there are n! possible net orderings Constructive heuristics are used

7.2 Net Ordering in Area Routing Rule 1: For two nets i and j, if aspect ratio (i ) > aspect ratio (j ), then i is routed before j A A B´ B´ A´ B A´ B Net A has a higher aspect ratio of its bounding box; routing A first results in shorter total wirlength Routing net B first results in longer total wirelength © 2011 Springer Verlag

7.2 Net Ordering in Area Routing Rule 2: For two nets i and j, if the pins of i are contained within MBB(j ), then i is routed before j A B C D Constraint Graph Net Ordering B C A D D′ C´ B′ A′ B C A D D′ C´ B′ A′ Ordering D-A-C-B or D-C-B-A (not D-B-A-C)

7.2 Net Ordering in Area Routing Rule 3: Let (net) be the number of pins within MBB(net) for net net. For two nets i and j, if (i ) < (j ), then i is routed before j. For each net, consider the pins of other nets within its bounding box The net with the smallest number of such pins is routed first Ties are broken based on the number of pins that are contained within the bounding box and on its edge D´ A B C A´ E C´ E´ B´ D D´ A B C A´ E C´ E´ B´ D Pins Inside (Edge) (net) MBB (A) B C D E D (B,C,D) - (A,C,D) - (A) - (-) - (A,C) 3 1 2

7.3 Non-Manhattan Routing 7.1 Introduction to Area Routing 7.2 Net Ordering in Area Routing 7.3 Non-Manhattan Routing 7.3.1 Octilinear Steiner Trees 7.3.2 Octilinear Maze Search 7.4 Basic Concepts in Clock Networks 7.4.1 Terminology 7.4.2 Problem Formulations for Clock-Tree Routing 7.5 Modern Clock Tree Synthesis 7.5.1 Constructing Trees with Zero Global Skew 7.5.2 Clock Tree Buffering in the Presence of Variation

7.3 Non-Manhattan Routing Allow 45- or 60-degree segments in addition to horizontal and vertical segments λ-geometry, where λ represents the number of possible routing directions and the angles  / λ at which they can be oriented λ = 2 (90 degrees): Manhattan routing (four routing directions) λ = 3 (60 degrees): Y-routing (six routing directions) λ = 4 (45 degrees): X-routing (eight routing directions) Non-Manhattan routing is primarily employed on printed circuit boards (PCBs)

7.3.1 Octilinear Steiner Trees Route planning using octilinear Steiner minimum trees (OSMT) Generalize rectilinear Steiner trees by allowing segments that extend in eight directions More freedom when placing Steiner points 1 3 5 4 6 7 8 9 10 11 12 2 © 2011 Springer Verlag

7.3.1 Octilinear Steiner Trees Octilinear Steiner Tree Algorithm Input: set of all pins P and their coordinates Output: heuristic octilinear minimum Steiner tree OST OST = Ø T = set of all three-pin nets of P found by Delaunay triangulation sortedT = SORT(T,minimum octilinear distance) for (i = 1 to |sortedT |) subT = ROUTE(sortedT [i ] ) // route minimum tree over subT ADD(OST,subT ) // add route to existing tree IMPROVE(OST,subT ) // locally improve OST based on subT T.-Y.; Chang, et. al.: Multilevel Full-Chip Routing for the X-Based Architecture

7.3.1 Octilinear Steiner Trees (1) Triangulate 1 1 3 5 4 6 7 8 9 10 11 12 2 3 2 5 4 6 7 8 9 10 12 11 © 2011 Springer Verlag

7.3.1 Octilinear Steiner Trees (1) Triangulate (2) Add route to existing tree 1 1 1 3 5 4 6 7 8 9 10 11 12 2 3 3 2 2 5 5 4 4 6 7 6 7 8 9 8 9 10 10 12 12 11 11 © 2011 Springer Verlag

7.3.1 Octilinear Steiner Trees (1) Triangulate (2) Add route to existing tree (3) Locally improve OST 1 1 1 3 3 3 2 2 2 5 5 5 4 4 4 6 7 6 7 6 7 8 9 8 9 8 9 10 10 10 12 12 12 11 11 11 cost = 6 cost ≈ 5.7 © 2011 Springer Verlag

7.3.1 Octilinear Steiner Trees Final OST after merging all subtrees (3) Locally improve OST 1 1 3 3 2 2 5 5 4 4 6 7 6 7 8 9 8 9 10 10 12 12 11 11 © 2011 Springer Verlag

7.3.2 Octilinear Maze Search Expansion (2) S T 1 2 Backtracing T 1 2 3 S 1 S T Expansion (1) © 2011 Springer Verlag

7.3.2 Octilinear Maze Search

7.4 Basic Concepts in Clock Networks 7.1 Introduction to Area Routing 7.2 Net Ordering in Area Routing 7.3 Non-Manhattan Routing 7.3.1 Octilinear Steiner Trees 7.3.2 Octilinear Maze Search 7.4 Basic Concepts in Clock Networks 7.4.1 Terminology 7.4.2 Problem Formulations for Clock-Tree Routing 7.5 Modern Clock Tree Synthesis 7.5.1 Constructing Trees with Zero Global Skew 7.5.2 Clock Tree Buffering in the Presence of Variation

7.4.1 Terminology A clock routing instance (clock net) is represented by n+1 terminals, where s0 is designated as the source, and S = {s1,s2, … ,sn} is designated as sinks Let si, 0 ≤ i ≤ n, denote both a terminal and its location A clock routing solution consists of a set of wire segments that connect all terminals of the clock net, so that a signal generated at the source propagates to all of the sinks Two aspects of clock routing solution: topology and geometric embedding The clock-tree topology (clock tree) is a rooted binary tree G with n leaves corresponding to the set of sinks Internal nodes = Steiner points

7.4.1 Terminology u1 s0 s1 u2 u3 u4 s2 s3 s4 s5 s6 s1 s2 s4 s6 s5 s0 Clock routing problem instance u1 s0 s1 u2 u3 u4 s2 s3 s4 s5 s6 Connection topology s1 s2 s4 s6 s5 s0 s3 u1 u2 u3 u4 Embedding s1 s2 s0 s3 s5 s4 s6 © 2011 Springer Verlag

7.4.1 Terminology Clock skew: (maximum) difference in clock signal arrival times between sinks Local skew: maximum difference in arrival times of the clock signal at the clock pins of two or more related sinks Sinks within distance d > 0 Flip-flops or latches connected by a directed signal path Global skew: maximum difference in arrival times of the clock signal at the clock pins of any two (related or unrelated) sinks Difference between shortest and longest source-sink path delays in the clock distribution network The term “skew” typically refers to “global skew”

7.4.2 Problem Formulations for Clock-Tree Routing Zero skew: zero-skew tree (ZST) ZST problem Bounded skew: true ZST may not be necessary in practice Signoff timing analysis is sufficient with a non-zero skew bound In addition to final (signoff) timing, this relaxation can be useful with intermediate delay models when it facilitates reductions in the length of the tree Bounded-Skew Tree (BST) problem Useful skew: correct chip timing only requires control of the local skews between pairs of interconnected flip-flops or latches Useful skew formulation is based on analysis of local skew constraints

7.5 Modern Clock Tree Synthesis 7.1 Introduction to Area Routing 7.2 Net Ordering in Area Routing 7.3 Non-Manhattan Routing 7.3.1 Octilinear Steiner Trees 7.3.2 Octilinear Maze Search 7.4 Basic Concepts in Clock Networks 7.4.1 Terminology 7.4.2 Problem Formulations for Clock-Tree Routing 7.5 Modern Clock Tree Synthesis 7.5.1 Constructing Trees with Zero Global Skew 7.5.2 Clock Tree Buffering in the Presence of Variation

7.5 Modern Clock Tree Synthesis A clock tree should have low skew, while delivering the same signal to every sequential gate Clock tree synthesis is performed in two steps: (1) Initial tree construction (Sec. 7.5.1) with one of these scenarios Construct a regular clock tree, largely independent of sink locations Simultaneously determine a topology and an embedding Construct only the embedding, given a clock-tree topology as input (2) Clock buffer insertion and several subsequent skew optimizations (Sec. 7.5.2)

7.5.1 Constructing Trees with Zero Global Skew H-tree Exact zero skew due to the symmetry of the H-tree Used for top-level clock distribution, not for the entire clock tree Blockages can spoil the symmetry of an H-tree Non-uniform sink locations and varying sink capacitances also complicate the design of H-trees © 2011 Springer Verlag

7.5.1 Constructing Trees with Zero Global Skew Method of Means and Medians (MMM) Can deal with arbitrary locations of clock sinks Basic idea: Recursively partition the set of terminals into two subsets of equal size (median) Connect the center of gravity (COG) of the set to the centers of gravity of the two subsets (the mean)

7.5.1 Constructing Trees with Zero Global Skew Method of Means and Medians (MMM) Find the center of gravity Partition S by the median Find the center of gravity for the left and right subsets of S Connect the center of gravity of S with the centers of gravity of the left and right subsets Final result after recursively performing MMM on each subset © 2011 Springer Verlag

7.5.1 Constructing Trees with Zero Global Skew Method of Means and Medians (MMM) Input: set of sinks S, empty tree T Output: clock tree T if (|S| ≤ 1) return (x0,y0) = (xc(S),yc(S)) // center of mass for S (SA,SB) = PARTITION(S) // median to determine SA and SB (xA,yA) = (xc(SA),yc(SA)) // center of mass for SA (xB,yB) = (xc(SB),yc(SB)) // center of mass for SB ROUTE(T,x0,y0,xA,yA) // connect center of mass of S to ROUTE(T,x0,y0,xB,yB) // center of mass of SA and SB BASIC_MMM(SA,T) // recursively route SA BASIC_MMM(SB,T) // recursively route SB

7.5.1 Constructing Trees with Zero Global Skew Recursive Geometric Matching (RGM) RGM proceeds in a bottom-up fashion Compare to MMM, which is a top-down algorithm Basic idea: Recursively determine a minimum-cost geometric matching of n sinks Find a set of n / 2 line segments that match n endpoints and minimize total length (subject to the matching constraint) After each matching step, a balance or tapping point is found on each matching segment to preserve zero skew to the associated sinks The set of n / 2 tapping points then forms the input to the next matching step

7.5.1 Constructing Trees with Zero Global Skew Recursive Geometric Matching (RGM) Set of n sinks S Min-cost geometric matching Find balance or tapping points (point that achieves zero skew in the subtree, not always midpoint) Min-cost geometric matching Final result after recursively performing RGM on each subset © 2011 Springer Verlag

7.5.1 Constructing Trees with Zero Global Skew Recursive Geometric Matching (RGM) Input: set of sinks S, empty tree T Output: clock tree T if (|S| ≤ 1) return M = min-cost geometric matching over S S’ = Ø foreach (<Pi,Pj >  M) TPi = subtree of T rooted at Pi TPj = subtree of T rooted at Pj tp = tapping point on (Pi,Pj) // point that minimizes the skew of // the tree Ttp = TPi U TPj U (Pi,Pj) ADD(S’,tp) // add tp to S’ ADD(T,(Pi,Pj)) // add matching segment (Pi,Pj) to T if (|S| % 2 == 1) // if |S| is odd, add unmatched node ADD(S’, unmatched node) RGM(S’,T) // recursively call RGM

7.5.1 Constructing Trees with Zero Global Skew Exact Zero Skew Adopts a bottom-up process of matching subtree roots and merging the corresponding subtrees, similar to RGM Two important improvements: Finds exact zero-skew tapping points with respect to the Elmore delay model rather than the linear delay model Maintains exact delay balance even when two subtrees with very different source-sink delays are matched (by wire elongation)

7.5.1 Constructing Trees with Zero Global Skew Exact Zero Skew Tapping point tp, where Elmore delay to sinks is equalized t(Ts1 ) C(s1) C(w1) 2 R(w1) C(s2) C(w2) R(w2) 1 – z z t(Ts2 ) Tapping point tp z 1 – z w1 w2 s1 s2 Subtree Ts1 Subtree Ts2 © 2011 Springer Verlag

7.5.1 Constructing Trees with Zero Global Skew Deferred-Merge Embedding (DME) Defers the choice of merging (tapping) points for subtrees of the clock tree Needs a tree topology as input Weakness in earlier algorithms: Determine locations of internal nodes of the clock tree too early; once a centroid is found, it is never changed Basic idea: Two sinks in general position will have an infinite number of midpoints, creating a tilted line segment – Manhattan arc Manhattan arc: same minimum wirelength and exact zero skew Selection of embedding points for internal nodes on Manhattan arc will be delayed for as long as possible

7.5.1 Constructing Trees with Zero Global Skew Deferred-Merge Embedding (DME) Euclidean midpoint Euclidean midpoint s2 s1 s1 s2 s1 s2 Locus of all Manhattan midpoints is a Manhattan arc in the Manhattan geometry Sinks are aligned, hence, Manhattan arc has zero length © 2011 Springer Verlag

7.5.1 Constructing Trees with Zero Global Skew Deferred-Merge Embedding (DME) Embeds internal nodes of the given topology G via a two-phase process First phase is bottom-up Determines all possible locations of internal nodes of G consistent with a minimum-cost ZST T Output: “tree of line segments”, with each line segment being the locus of possible placements of an internal node of T Second phase is top-down Chooses the exact locations of all internal nodes in T Output: fully embedded, minimum-cost ZST with topology G

7.5.1 Constructing Trees with Zero Global Skew Deferred-Merge Embedding (DME) Tilted Rectangular Region (TRR) for the Manhattan arc of s1 and s2 with a radius of two units s1 s1 Core Radius s2 s2 © 2011 Springer Verlag

7.5.1 Constructing Trees with Zero Global Skew Deferred-Merge Embedding (DME) Merging segment for node u3 (the parent of nodes u1 and u2) is the locus of feasible locations of u3 with zero skew and minimum wirelength |eu2 | ms(u2) ms(u1) ms(u3) s1 s2 s3 s4 |eu1 | trr(u2) trr(u1) u1 s1 u3 u2 s2 s3 s4 © 2011 Springer Verlag

7.5.1 Constructing Trees with Zero Global Skew Deferred-Merge Embedding (DME) Build Tree of Segments Algorithm (DME Bottom-Up Phase) s1 s2 s8 s7 s6 s5 s0 s3 s4 s1 s2 s3 s4 s8 s7 s6 s5 s0 s1 s2 s3 s4 s8 s7 s6 s5 s0 s1 s2 s8 s7 s6 s5 s0 s3 s4 © 2011 Springer Verlag

7.5.1 Constructing Trees with Zero Global Skew Deferred-Merge Embedding (DME) Build Tree of Segments Algorithm (DME Bottom-Up Phase) Input: set of sinks S and tree topology G(S,Top) Output: merging segments ms(v) and edge lengths |ev|, v  G if foreach (node v  G, in bottom-up order) if (v is a sink node) // if v is a terminal, then ms(v) is a ms[v] = PL(v) // zero-length Manhattan arc else // otherwise, if v is an internal node, (a,b) = CHILDREN(v) // find v’s children and CALC_EDGE_LENGTH(ea,eb) // calculate the edge length trr[a][core] = MS(a) // create trr(a) – find merging segment trr[a][radius] = |ea| // and radius of a trr[b][core] = MS(b) // create trr(b) – find merging segment trr[b][radius] = |eb| // and radius of b ms[v] = trr[a] ∩ trr[b] // merging segment of v

7.5.1 Constructing Trees with Zero Global Skew Deferred-Merge Embedding (DME) Find Exact Locations (DME Top-Down Phase) ms(v) pl(par) trr(par) Possible locations of child node v given the location of its parent node par |epar| © 2011 Springer Verlag

7.5.1 Constructing Trees with Zero Global Skew Deferred-Merge Embedding (DME) Find Exact Locations (DME Top-Down Phase) s1 s2 s8 s7 s6 s5 s0 s3 s4 s1 s2 s8 s7 s6 s5 s0 s3 s4 s1 s2 s8 s7 s6 s5 s0 s3 s4 s7 s5 s1 s2 s8 s6 s0 s3 s4 © 2011 Springer Verlag

7.5.1 Constructing Trees with Zero Global Skew Deferred-Merge Embedding (DME) Find Exact Locations (DME Top-Down Phase) Input: set of sinks S, tree topology G, outputs of DME bottom-up phase Output: minimum-cost zero-skew tree T with topology G foreach (non-sink node v  G top-down order) if (v is the root) loc = any point in ms(v) else par = PARENT(v) // par is the parent of v trr[par][core] = PL(par) // create trr(par) – find merging segment trr[par][radius] = |ev| // and radius of par loc = any point in ms[v] ∩ trr[par] pl[v] = loc

7.5.2 Clock Tree Buffering in the Presence of Variation To address challenging skew constraints, a clock tree undergoes several optimization steps, including Geometric clock tree construction Initial clock buffer insertion Clock buffer sizing Wire sizing Wire snaking In the presence of process, voltage, and temperature variations, such optimizations require modeling the impact of variations Variation model encapsulates the different parameters, such as width and thickness, of each library element as well-defined random variables

Summary of Chapter 7 – Area Routing Area routing: avoiding the division into global and detailed routing Doing everything at once, subject to design rules Small netlists with complicated constraints Analog, MCM and PCB routing Manhattan vs Euclidean paths Euclidean paths are no longer than Manhattan, usually shorter Unique Euclidean shortest path Multiple Manhattan paths When Euclidean shortest paths intersect, there may exist Manhattan shortest paths that do not (not vice versa) Net ordering is important in area routing Rule 1: nets with higher aspect ratio (less flexible) routed first Rule 2: nets surrounded by other nets (more constrained) routed first Rule 3: nets with more pins inside other net's bounding boxes routed first

Summary of Chapter 7 – Non-Manhattan Tree Routing Recall that Manhattan routing is dictated by the limitations of modern semiconductor manufacturing for thin wires PCB routing is not subject to those limitations Can use shorter connections Non-Manhattan connections Diagonal (45- or 60-degree) segments in addition to horizontal and vertical segments Create more freedom to place Steiner points Octilinear Steiner Tree construction Algorithms are generally adapted from the Manhattan case Should produce results that are at least as good as the Manhattan case

Summary of Chapter 7 – Clock Network Routing Similar to signal-net routing, except for Very large numbers of sinks The need to equalize propagation delays from the root to sinks Longer routes (to satisfy the equalization constraint) Typical algorithms determine topology first, then geometric embedding Clock skew Consider propagation delay from the root to each sink Skew is the maximal pairwise difference between delays (over all pairs of sinks) May be limited to sinks that are within distance d > 0 (local skew) For a specified wire delay model ZST: Zero-Skew Tree routing requires that skew = 0 BST: Bounded-Skew Tree routing requires that skew < Bound

Summary of Chapter 7 – Modern Clock Tree Synthesis Initial clock tree construction Topology determination (MMM or RGM) DME embedding (different flavors for ZST and BST) Working with the Elmore delay model requires more effort than working with linear delay models Geometric obstacles (e.g., macros) May require detours Can be handled during DME (complicated) or during post-processing (often achieves as good results) Clock-tree optimization Buffer insertion Buffer sizing Wire sizing Wire snaking by small amounts Decreasing the impact of process variability