Routing Algorithms.


Routing Algorithms

FPGA Placement & Routing

Routing
- Routing is one of the most basic, tedious, yet important steps in FPGA design.
- It is the last step in the design flow before the bitstream is generated.
- Definition: successfully connect all signal nets subject to timing constraints.
- Compared to ASIC routing, FPGA routing is more restricted: it can use only the prefabricated routing resources, so achieving 100% routability is more challenging.

Routing Steps
1. Routing-resource graph generation
2. Global routing (optional)
3. Detailed routing

Routing Resource Graph
- Models all the available routing resources in an FPGA.
- An abstract data representation used by the global and detailed routers.
- Vertices:
  - I/O pins of the logic blocks
  - Wire segments in the routing channels
- Edges: programmable switches that connect two vertices
  - Unidirectional switch (e.g., buffer): a directed edge
  - Bidirectional switch (e.g., pass transistor): a pair of directed edges

Routing Resource Graph
- Modeling equivalent pins:
  - A source vertex connects to all the logically equivalent output pins of a logic block.
  - A sink vertex is connected from all the logically equivalent input pins of a logic block.

Routing Resource Graph
- Vertex capacity: the maximum number of nets that can use the vertex in a legal routing.
  - Example: source node capacity = 1, sink node capacity = 2.
- The routing resource graph is usually very large, so the tool generates it for a basic tile, then replicates the tile and stitches the copies together.
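The graph described on the last three slides can be sketched as a small data structure. This is a minimal illustration, not the representation used by any particular tool: the class name `RoutingResourceGraph`, its methods, and the vertex names (`src`, `opin0`, `wire0`, ...) are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class RoutingResourceGraph:
    """Vertices are pins and wire segments; directed edges are programmable switches."""
    capacity: dict = field(default_factory=dict)                    # vertex -> max nets
    edges: dict = field(default_factory=lambda: defaultdict(list))  # vertex -> successors

    def add_vertex(self, name, capacity=1):
        self.capacity[name] = capacity

    def add_buffer(self, src, dst):
        # Unidirectional switch: one directed edge.
        self.edges[src].append(dst)

    def add_pass_transistor(self, a, b):
        # Bidirectional switch: a pair of directed edges.
        self.edges[a].append(b)
        self.edges[b].append(a)


g = RoutingResourceGraph()
for name in ("src", "opin0", "opin1", "wire0", "wire1", "ipin0", "ipin1"):
    g.add_vertex(name)
g.add_vertex("sink", capacity=2)      # two nets may end at this sink

for pin in ("opin0", "opin1"):        # source fans out to the equivalent output pins
    g.add_buffer("src", pin)
for pin in ("ipin0", "ipin1"):        # equivalent input pins all lead to the sink
    g.add_buffer(pin, "sink")

g.add_buffer("opin0", "wire0")
g.add_buffer("opin1", "wire1")
g.add_pass_transistor("wire0", "wire1")
g.add_buffer("wire0", "ipin0")
g.add_buffer("wire1", "ipin1")
```

Storing capacity per vertex rather than per edge matches the slides, where congestion is tracked on pins and wire segments and switches only define connectivity.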

Routing Steps in FPGA
- Global routing:
  - divides the available routing area into channels or routing regions
  - determines the coarse routing topology of each net in terms of the channels or routing regions it passes through
  - minimizes overall congestion
  - satisfies the timing constraints of critical nets
- Detailed routing:
  - generates the detailed routing geometry that implements every net or subnet in each routing channel or region

Routing Steps in FPGA
- Advantage of the two-step flow: reduced complexity.
- Disadvantage: global routing (GR) has to use a rough model of the routing resources available in each channel or routing region, so it does not see the details of routing obstacles, pre-routed nets, and so on.
- The problem is more serious in FPGAs: the detailed distribution of the different types of wire segments and programmable switches may greatly affect the success of detailed routing (DR), but is hard to model during GR.
- Solution: single-step routing.

Global Routing
- Coarse routing-resource graph (routing graph):
  - Vertices:
    - Each routing channel (as opposed to each wire segment); capacity = number of tracks in the channel
    - Each pin of a logic block
    - Source and sink (for logically equivalent pins)
  - Edges:
    - Available connections from logic block input and output pins to the channels
    - Available connections between adjacent channels

Coarse Routing Graph

Global Routing
- Global routing problem:
  - Input: coarse routing-resource graph G
  - Output: a routing of each net on G such that
    - all channel capacity constraints are satisfied
    - the signal timing constraints on all nets are satisfied
- Timing is considered later; the focus for now is on routability.
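The channel capacity constraint can be checked mechanically. The sketch below assumes each global route is given as a list of channel names and that a net occupies one track in every channel it crosses; the function `channel_overflow`, the channel names, and the track counts are hypothetical.

```python
from collections import Counter


def channel_overflow(routes, capacity):
    """Return {channel: excess nets} for every channel used beyond its track count."""
    usage = Counter()
    for channels in routes.values():
        # A net takes one track per channel, even if its path lists a channel twice.
        usage.update(set(channels))
    return {ch: n - capacity[ch] for ch, n in usage.items() if n > capacity[ch]}


capacity = {"ch0": 2, "ch1": 1, "ch2": 2}
routes = {
    "net_a": ["ch0", "ch1"],
    "net_b": ["ch1", "ch2"],
    "net_c": ["ch0"],
}
overflow = channel_overflow(routes, capacity)  # ch1 carries two nets but has one track
```

A routing is legal with respect to capacity exactly when the returned dictionary is empty.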

GR Algorithms
- The FPGA GR problem is very similar to the ASIC one, so many ASIC GR algorithms and techniques may be used.
- Most successful FPGA GR approaches: VPR and PathFinder (negotiation-based GR).

PathFinder/VPR
- Negotiation-based router: each net negotiates the use of shared resources with the other nets until no resource is shared.
- Costs are associated with the vertices of the routing graph.
- At each iteration:
  - Every net is routed along its minimum-cost path, even if this over-congests some routing channels.
  - Each vertex cost is re-adjusted based on whether the corresponding channel overflowed in the current iteration (and in previous iterations).
  - All nets are routed again under the new costs.
- Repeat until all congestion is removed (or a predefined stopping criterion is met, e.g., a maximum number of iterations).

PathFinder/VPR
- VPR cost function [Betz99] for using routing resource (node) n when it is reached from routing resource m:
    Cost(n) = (b(n) + h(n)) · p(n) + BendCost(n, m)
  - b(n): base cost of using node n (e.g., wire segment length or delay); unchanged throughout the routing process
  - p(n): present congestion penalty; depends on the amount of overflow at resource n
  - h(n): historical penalty; accumulates the congestion penalties of previous iterations
  - BendCost(n, m): discourages bends in the routing solution
- Note: the additive grouping (b(n) + h(n)) is the form used in the original PathFinder formulation [Mcmurchie95]; the timing-driven cost function later in these slides combines the same three terms as a product, b(n) · h(n) · p(n).
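The cost function on this slide translates directly into code once p(n) is given a concrete form. The slide only says that p(n) depends on the overflow at n, so the linear penalty below, the parameter `pres_fac`, and the function name `node_cost` are assumptions made for illustration.

```python
def node_cost(base, history, occupancy, capacity, pres_fac, bend_cost=0.0):
    """Cost(n) = (b(n) + h(n)) * p(n) + BendCost(n, m), as written on the slide."""
    # Overflow that would result if one more net used this node.
    overflow = max(0, occupancy + 1 - capacity)
    # Assumed form of p(n): 1 when the node still has room, growing linearly after that.
    present = 1.0 + pres_fac * overflow
    return (base + history) * present + bend_cost


free_node = node_cost(base=2.0, history=0.0, occupancy=0, capacity=1, pres_fac=0.5)
full_node = node_cost(base=2.0, history=1.0, occupancy=1, capacity=1, pres_fac=0.5)
```

With these inputs the free node costs its base cost only, while the already occupied node is penalized by both its history and its present congestion.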

PathFinder/VPR
[Figure: sources S1, S2, S3 connect to sinks D1, D2, D3 through intermediate nodes A, B, C. Edge labels (values 1 to 4) give the cost of using each edge (switch).]
- Problem: route a signal from each source Si to its sink Di.
- Routing while ignoring congestion: the minimum-cost path of every signal goes through B.
- First-order congestion (h(n) = 0):
  - 1st iteration: p(n) = 1, i.e., no penalty for using n regardless of how many signals occupy it.
  - Subsequent iterations: p(n) is increased gradually depending on how many signals share n.

PathFinder/VPR
[Figure: the same graph of S1, S2, S3, intermediate nodes A, B, C, and D1, D2, D3.]
- 1st iteration: all signals use B.
- Later iterations: Signal 1 finds that a route through A costs less than one through the congested node B.
- Later still: Signal 3 finds a better route through C.
- An abrupt increase in p(n) would cause oscillation.

PathFinder/VPR
[Figure: graph of S1, S2, S3, intermediate nodes A, B, C, and D1, D2, D3.]
- 2nd-order congestion under sequential rip-up and re-route:
  - Signal 1 uses B; Signal 2 and Signal 3 share C.
  - To succeed, both Signal 1 and Signal 2 must be rerouted: Signal 2 has to move from C to B, so Signal 1 must first vacate B.
  - Signal 1 does not use a congested node, so it is difficult to determine that it should be rerouted first.
  - This cannot be solved by p(n) alone.

PathFinder/VPR
[Figure: graph of S1, S2, S3, intermediate nodes A, B, C, and D1, D2, D3.]
- 2nd-order congestion, resolved with h(n) (history):
  - h(C) is increased slightly in each iteration in which C is shared.
  - After some iterations, C becomes more expensive than B.
  - Once Signal 2 uses B, B is shared by Signal 1 and Signal 2, so Signal 1 later moves to A.
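The second-order congestion example can be replayed with a toy negotiation loop. This is a simplified sketch, not PathFinder or VPR itself: each signal just picks one intermediate node, the base costs in `options` are invented (the figure's edge labels are not recoverable from the transcript), and the form of p(n) and the factors `pres_fac` and `hist_fac` are assumptions.

```python
def negotiate(options, capacity, pres_fac=1.0, hist_fac=0.75, max_iters=20):
    """options[sig][node] = base cost for signal `sig` to route through `node`."""
    history = {node: 0.0 for node in capacity}
    choice = {}  # signal -> node it currently occupies
    for iteration in range(1, max_iters + 1):
        # The first iteration ignores present congestion, i.e. p(n) = 1.
        pf = 0.0 if iteration == 1 else pres_fac
        for sig in sorted(options):
            choice.pop(sig, None)  # rip up this signal only

            def cost(node):
                occupancy = sum(1 for used in choice.values() if used == node)
                overflow = max(0, occupancy + 1 - capacity[node])
                present = 1.0 + pf * overflow  # assumed form of p(n)
                return (options[sig][node] + history[node]) * present

            choice[sig] = min(sorted(options[sig]), key=cost)
        shared = [n for n in capacity
                  if sum(1 for used in choice.values() if used == n) > capacity[n]]
        if not shared:
            return dict(choice), iteration
        for node in shared:
            history[node] += hist_fac  # h(n) grows each iteration n stays shared
    return None, max_iters


# Hypothetical base costs: Signal 1 prefers B over A, Signal 2 prefers C over B,
# and Signal 3 can only use C.
options = {1: {"A": 3, "B": 2}, 2: {"B": 3, "C": 2}, 3: {"C": 2}}
capacity = {"A": 1, "B": 1, "C": 1}

routing, iterations = negotiate(options, capacity)
no_history, _ = negotiate(options, capacity, hist_fac=0.0)
```

With these numbers the loop settles on Signal 1 through A, Signal 2 through B, and Signal 3 through C after four iterations. With `hist_fac=0.0` the history term never grows, Signal 2 always finds C cheaper than the occupied B, and the loop gives up without a legal routing, which mirrors the slide's point that p(n) alone cannot resolve this case.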

PathFinder/VPR
- Routing of each net: maze routing

Area Routing
- Routing of each net: maze routing
[Figure: a routing area with source S, target T, and obstacles X, shown as a grid graph (maze) and as a simplified representation.]

Maze Routing
- Basic idea: a breadth-first search (BFS) of the grid graph.
- Always finds a shortest path whenever a path exists.
- Consists of two phases:
  1. Wave propagation
  2. Retrace

An Illustration
[Figure: grid with source S and target T; cells are labeled 1 to 6 with their distance from S.]

Wave Propagation
- At step i, all vertices at Manhattan distance i from S are labeled with i.
- A propagation list (FIFO) keeps track of the vertices to consider in the next step.
[Figure: grid labeling after step 0, after step 3, and after step 6.]

Retrace
- Trace back the actual route, starting from T.
- At a vertex labeled i, go to any adjacent vertex labeled i-1.
[Figure: final labeling, with the path retraced from T back to S.]
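The two phases can be sketched in a few lines. This is a minimal version for a single two-pin connection, assuming the grid is given as a list of strings in which `#` marks a blocked cell; the function name `maze_route` and that encoding are choices made for this example.

```python
from collections import deque


def neighbors(cell, rows, cols):
    r, c = cell
    for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
        if 0 <= nr < rows and 0 <= nc < cols:
            yield (nr, nc)


def maze_route(grid, source, target):
    """Return a shortest path from source to target as a list of cells, or None."""
    rows, cols = len(grid), len(grid[0])

    # Phase 1: wave propagation. label[cell] is the step at which the wave reached it.
    label = {source: 0}
    frontier = deque([source])  # the FIFO propagation list
    while frontier and target not in label:
        cell = frontier.popleft()
        for nb in neighbors(cell, rows, cols):
            if grid[nb[0]][nb[1]] != "#" and nb not in label:
                label[nb] = label[cell] + 1
                frontier.append(nb)
    if target not in label:
        return None  # the wave died out before reaching the target

    # Phase 2: retrace. From a cell labeled i, step to any neighbor labeled i - 1.
    path = [target]
    while path[-1] != source:
        current = path[-1]
        for nb in neighbors(current, rows, cols):
            if label.get(nb) == label[current] - 1:
                path.append(nb)
                break
    return path[::-1]


grid = ["....",
        ".##.",
        "...."]
path = maze_route(grid, (0, 0), (2, 3))
blocked = maze_route(["..#.", "..#.", "..#."], (0, 0), (0, 3))
```

Because every labeled cell other than the source was labeled from a neighbor one step closer, the retrace loop always finds a cell labeled i-1 and ends at the source.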

Example
[Figure: grid with pins A and B after the first wave-propagation steps from A (labels 1 and 2).]

Example (continued)
[Figure: wave propagation continues from A until B is reached; cells are labeled 1 to 12.]

Retrace the Path
[Figure: the labeled grid (1 to 12) with the path retraced from B back to A.]

Alternative Paths
[Figure: the labeled grid (1 to 12) with an alternative shortest path between A and B.]
- Guideline: do not change direction unless you must.

Connecting Multipoint Nets
1. Select one point as the source; all the other points are targets.
2. Propagate from the source until one target is reached.
3. Find the path from the source to that target.
4. Label all the cells on the path as source cells; the remaining unconnected pins are the targets.
5. Repeat the steps until every pin is connected.

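Example
[Figure: grid with source S and targets T; cells are labeled with their distance (1 to 5) from the current set of source cells.]
- Start at the source and run the maze router until it hits a target.
- Every cell on the path then becomes a source; run the maze router again.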
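The multipoint procedure is a maze router whose wave starts from every cell already in the routing tree. The sketch below assumes the same string-grid encoding (`#` for blocked cells) and records a parent pointer per cell in place of numeric labels, which makes the retrace a simple pointer walk; `route_multipin` is a hypothetical name.

```python
from collections import deque


def route_multipin(grid, pins):
    """Connect all pins; pins[0] is the source. Returns the set of tree cells, or None."""
    rows, cols = len(grid), len(grid[0])
    tree = {pins[0]}
    remaining = set(pins[1:]) - tree
    while remaining:
        # Every cell already in the tree acts as a source of the wave.
        parent = {cell: None for cell in tree}
        frontier = deque(sorted(tree))
        reached = None
        while frontier and reached is None:
            r, c = frontier.popleft()
            for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                nr, nc = nb
                if not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                if grid[nr][nc] == "#" or nb in parent:
                    continue
                parent[nb] = (r, c)
                if nb in remaining:
                    reached = nb  # stop at the first target the wave touches
                    break
                frontier.append(nb)
        if reached is None:
            return None  # some pin cannot be reached
        # Retrace from the target to the tree, adding path cells as new sources.
        cell = reached
        while cell not in tree:
            tree.add(cell)
            cell = parent[cell]
        remaining.discard(reached)
    return tree


grid = ["....."] * 3
pins = [(1, 0), (1, 4), (0, 2)]
tree = route_multipin(grid, pins)
```

On this empty 3-by-5 grid the nearer pin (0, 2) is connected first, and the second connection then branches off the existing path instead of starting again at the original source.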

Global Routing
- Similarity with the standard-cell global routing problem: recent advances in standard-cell algorithms (BoxRouter, ...) can be employed.
- This had not yet been done at the time of writing.

Detailed Routing

Detailed Routing
- Implements each route of the coarse routing-resource graph in the detailed routing-resource graph so that there is no resource conflict.
- A channel contains different types of wire segments and programmable switches, so the number of possible ways to implement each route of the coarse routing graph is still quite large.
- Two phases:
  1. Generate the expansion graph
  2. Repeatedly select the exact routes

Expansion Graph
- Expansion graph generation: for each global route, enumerate all possible detailed routes in the routing-resource graph that go through the same set of channels.

Expansion Graph
- Cost: each detailed route p is assigned a cost in the expansion graph based on:
  - the number of segments used
  - the waste of long segments by short connections
  - the number of alternative paths to p (identifies paths that are essential for a connection)
  - the impact that selecting p has on other paths in the expansion graph (select paths that have the fewest negative effects on others)

Combined GR/DR
- VPR router: so far, the most successful router.
  - Combines GR and DR
  - Uses the negotiation-based approach
  - Applies exactly the same GR engine to the combined GR/DR problem on the detailed routing-resource graph

Timing Optimization in Routing
- Timing optimization is important because routing delays in FPGA designs are significant, largely due to the extensive use of programmable switches.
- Timing analysis: given a mapped and placed circuit, static timing analysis (STA) computes the signal arrival and required times at every pin, from which slacks are computed.
- Levels of optimization techniques:
  - Routing order optimization
  - Routing tree topology optimization
  - Slack distribution (not discussed)
  - Net weighting (not discussed)
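The slack computation can be illustrated on a tiny timing graph. This sketch assumes a simplified model in which each node carries one lumped delay and every primary output shares one required time; a real STA works on pins and separate wire and cell delays. The function name `compute_slacks` and the example circuit are hypothetical, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def compute_slacks(preds, delay, required_at_outputs):
    """preds maps each node to the list of nodes that drive it."""
    order = list(TopologicalSorter(preds).static_order())  # drivers come first

    # Forward pass: arrival time = own delay + latest arrival among the drivers.
    arrival = {}
    for v in order:
        arrival[v] = delay[v] + max((arrival[u] for u in preds.get(v, ())), default=0)

    succs = {v: [] for v in order}
    for v, drivers in preds.items():
        for u in drivers:
            succs[u].append(v)

    # Backward pass: required time = earliest (required - delay) among the fanouts.
    required = {}
    for v in reversed(order):
        if succs[v]:
            required[v] = min(required[w] - delay[w] for w in succs[v])
        else:
            required[v] = required_at_outputs
    return {v: required[v] - arrival[v] for v in order}


preds = {"a": [], "b": [], "c": ["a", "b"], "d": ["c"]}
delay = {"a": 1, "b": 3, "c": 2, "d": 1}
slack = compute_slacks(preds, delay, required_at_outputs=8)
most_critical_first = sorted(slack, key=lambda v: (slack[v], v))
```

Here the path through b, c, and d has the smallest slack, so a router that orders connections by slack would handle that path first.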

Timing Optimization in Routing
- Routing order optimization:
  - Order nets by their slacks.
  - Timing-critical nets (i.e., nets with small slack) are routed first so that they avoid long detours.
  - Almost all timing-driven routers do this.
- Routing tree topology optimization for timing-critical nets:
  - Route a timing-critical net as an arborescence: a routing tree that uses a shortest path in the routing graph from the source to every sink, while also trying to minimize the total routing cost of the tree.

Arborescence
[Figure: a Steiner minimum tree, an arborescence, and an arborescence with reduced length.]

Timing Optimization in Routing
- PathFinder and VPR use a delay penalty term in the routing cost function to balance delay and congestion optimization:
    Cost(n) = Crit(i, j) · delay(n, topology) + [1 − Crit(i, j)] · b(n) · h(n) · p(n)
  - Crit(i, j): criticality of sink j of net i, in [0, 0.99]; capping it at 0.99 keeps congestion from being ignored for the most timing-critical nets
  - delay(n, topology): Elmore delay at vertex n given the partial routing topology constructed so far
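The blend of delay and congestion can be written out as follows. The cost expression is the one on the slide; the way criticality is derived from slack here (one minus the slack as a fraction of the longest path delay, clamped to [0, 0.99]) is an assumption for illustration, as are the names `criticality` and `timing_driven_cost`.

```python
def criticality(slack, max_delay, max_crit=0.99):
    """Assumed mapping from slack to Crit(i, j): zero slack gives the cap, large slack gives 0."""
    return min(max_crit, max(0.0, 1.0 - slack / max_delay))


def timing_driven_cost(crit, delay, base, history, present):
    """Cost(n) = Crit * delay(n, topology) + (1 - Crit) * b(n) * h(n) * p(n)."""
    return crit * delay + (1.0 - crit) * base * history * present


crit_zero_slack = criticality(slack=0.0, max_delay=10.0)
crit_half_slack = criticality(slack=5.0, max_delay=10.0)
cost = timing_driven_cost(crit_half_slack, delay=4.0, base=2.0, history=1.0, present=1.0)
```

Because the cap keeps Crit below 1, the congestion term never vanishes entirely, so even the most critical connection still reacts to sharing.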

References
[Chen06] D. Chen, J. Cong, and P. Pan, "FPGA Design Automation: A Survey," Foundations and Trends in Electronic Design Automation, vol. 1, no. 3, pp. 195-330, 2006.
[Betz99] V. Betz, J. Rose, and A. Marquardt, Architecture and CAD for Deep-Submicron FPGAs. Kluwer Academic Publishers, 1999.
[Mcmurchie95] L. McMurchie and C. Ebeling, "PathFinder: A Negotiation-Based Performance-Driven Router for FPGAs," in Proceedings of the International Symposium on Field-Programmable Gate Arrays, February 1995.