Published by Wesley Shaw. Modified over 8 years ago.
1
Design Automation Conference (DAC), June 6th, 2013
Taming the Complexity of Coordinated Place and Route
Jin Hu†, Myung-Chul Kim†† and Igor L. Markov†††
† Systems and Technology Group (STG), IBM, Fishkill, NY
†† Systems and Technology Group (STG), IBM, Austin, TX
††† Dept. of Computer Science and Engineering (CSE), University of Michigan [speaker]
2
Placement Solutions Must Be Routable
- An obvious statement, largely ignored in academic research until 2006-2010
- A “new” global+detailed placer may reduce wirelength (WL) by 2% but create many routing violations: not an improvement
[Figure labels: DAC 1998; DAC 2000 (Capo)]
3
Back to the Future?
Things have changed:
- Much larger, more challenging P&R instances
- Stronger, faster baseline placers & routers
- Very precise evaluation of results through contests
- Recent trend: simultaneous P&R
Our contributions to P&R:
1. Speed
2. Speed
3. Speed
1st place at ICCAD'12
4
What DAC'13 Reviewers Thought of Our Contributions
5
The Need for Speed
- Congestion estimation during global placement (GP) requires fast routing
- We can do 75K nets/sec (1 thread)
- Our competition: 6K nets/sec (1 thread)
- Our router is called 20 times during GP and still takes <15% of total runtime
- The secret: simplify the router’s task
- Additional secrets:
  - Scrap Dijkstra and A*-search
  - Use array-based, cache-friendly algorithms
6
How to Avoid Extra Work
- The placer invokes a router
- The router works hard to reduce violations
- The placer changes locations
- The placer invokes a router
- The router works hard to reduce violations
- The placer changes locations
- The placer invokes a router
- The router works hard to reduce violations
- …
7
How to Avoid Extra Work
- Spread the router’s work across global-placement iterations
- Do not solve routing every time: no need to!
- Reuse work
8
Incremental Global Routing
- When movable objects stay in the same GCells, reuse routes
- When objects move a little, reuse routes
- When relative positions do not change, reuse routes
- When everything changes, try to reuse routes
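The first rule on this slide (reuse a route while a net's pins stay in the same GCells) can be sketched as a small cache. This is a minimal illustration in Python, not the authors' implementation: `RouteCache`, `gcell_of` and the toy two-pin router `l_route` are hypothetical names introduced here, and the other three reuse rules are not covered.

```python
from typing import Callable, Dict, List, Sequence, Tuple

GCell = Tuple[int, int]


def gcell_of(x: float, y: float, gcell_size: float) -> GCell:
    """Map a placement coordinate to the index of its enclosing GCell."""
    return (int(x // gcell_size), int(y // gcell_size))


class RouteCache:
    """Keeps the last route of each net, keyed by the GCells of its pins."""

    def __init__(self, gcell_size: float) -> None:
        self.gcell_size = gcell_size
        self.signature: Dict[str, Tuple[GCell, ...]] = {}
        self.route: Dict[str, List[GCell]] = {}
        self.reused = 0
        self.rerouted = 0

    def get_route(
        self,
        net: str,
        pins: Sequence[Tuple[float, float]],
        router: Callable[[Tuple[GCell, ...]], List[GCell]],
    ) -> List[GCell]:
        sig = tuple(gcell_of(x, y, self.gcell_size) for x, y in pins)
        if self.signature.get(net) == sig:
            # Pins moved, but not out of their GCells: keep the old route.
            self.reused += 1
            return self.route[net]
        # At least one pin changed GCell: route again and remember the result.
        self.rerouted += 1
        self.signature[net] = sig
        self.route[net] = router(sig)
        return self.route[net]


def l_route(sig: Tuple[GCell, ...]) -> List[GCell]:
    """Toy two-pin router: go horizontally first, then vertically."""
    (x0, y0), (x1, y1) = sig[0], sig[-1]
    step_x = 1 if x1 >= x0 else -1
    step_y = 1 if y1 >= y0 else -1
    path = [(x, y0) for x in range(x0, x1 + step_x, step_x)]
    path += [(x1, y) for y in range(y0 + step_y, y1 + step_y, step_y)]
    return path


cache = RouteCache(gcell_size=10.0)
cache.get_route("n1", [(1.0, 2.0), (25.0, 14.0)], l_route)  # routed
cache.get_route("n1", [(3.0, 4.0), (27.0, 12.0)], l_route)  # same GCells: reused
cache.get_route("n1", [(3.0, 4.0), (35.0, 12.0)], l_route)  # pin left its GCell: rerouted
```

Keying the cache on GCell indices rather than exact coordinates is what makes small placement moves free for the router in this sketch.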
9
Incremental Maze Routing?
- Dijkstra and A*-search don’t do incremental!
- They are also slow: pointer-chasing in binary heaps is no good for cache
- Need something else!
10
The Answer: Bellman-Ford (BF)
- Bellman-Ford can be incremental
- Bellman-Ford uses no pointers
- Bellman-Ford is cache-friendly
- Worst-case complexity is O(V^2) versus O(V log V) for Dijkstra and A*-search
- So, many skeptics:
- The trick: run 1 pass of BF at a time, incrementally
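The idea of running one Bellman-Ford pass at a time can be sketched on a routing grid stored as a flat, row-major array. This is an illustrative Python sketch under stated assumptions, not the router from the talk: the function name `bf_pass`, the per-GCell entry cost model and the scan order are all choices made here.

```python
import math


def bf_pass(cost, dist, rows, cols):
    """Run one in-place Bellman-Ford pass over a row-major grid.

    cost[i] is the (assumed) cost of entering GCell i and dist[i] is the
    best distance from the source found so far. Returns True if any
    distance improved, so the caller knows whether another pass may help.
    """
    changed = False
    for i in range(rows * cols):
        r, c = divmod(i, cols)
        best = dist[i]
        if r > 0:
            best = min(best, dist[i - cols] + cost[i])
        if r + 1 < rows:
            best = min(best, dist[i + cols] + cost[i])
        if c > 0:
            best = min(best, dist[i - 1] + cost[i])
        if c + 1 < cols:
            best = min(best, dist[i + 1] + cost[i])
        if best < dist[i]:
            dist[i] = best
            changed = True
    return changed


# 3x3 grid, unit cost everywhere, source in the centre GCell (index 4).
rows, cols = 3, 3
cost = [1] * (rows * cols)
dist = [math.inf] * (rows * cols)
dist[4] = 0

# The distances persist between calls, so each call continues where the
# previous one stopped (for example, one call per placement iteration).
first = bf_pass(cost, dist, rows, cols)   # reaches all but the top-left corner
second = bf_pass(cost, dist, rows, cols)  # fixes the top-left corner
third = bf_pass(cost, dist, rows, cols)   # nothing left to improve
```

The only state is the `dist` array, scanned in index order with no heap and no per-node objects; in a compiled language the same loop would walk contiguous memory.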
11
More on Bellman-Ford in Our Paper
- Finishes sooner with alternating passes (known)
- Generalizes monotonic routing (obvious)
- Finds some non-monotonic routes in one pass (with our improvements)
- Theorem 1: optimal routes with k monotonic segments are found in k passes
- For most nets, k is very small
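A small experiment can illustrate alternating passes. The Python sketch below alternates the scan direction of an in-place Bellman-Ford pass and counts how many passes change anything. It is only an illustration: `sweep`, `alternating_bf`, the `WALL` cost and the 3x3 example are assumptions made here, and the paper's exact definition of a pass (and its improvements) may differ.

```python
import math


def sweep(cost, dist, rows, cols, reverse):
    """One in-place Bellman-Ford pass; the scan runs backwards if reverse is True."""
    order = range(rows * cols - 1, -1, -1) if reverse else range(rows * cols)
    changed = False
    for i in order:
        r, c = divmod(i, cols)
        best = dist[i]
        if r > 0:
            best = min(best, dist[i - cols] + cost[i])
        if r + 1 < rows:
            best = min(best, dist[i + cols] + cost[i])
        if c > 0:
            best = min(best, dist[i - 1] + cost[i])
        if c + 1 < cols:
            best = min(best, dist[i + 1] + cost[i])
        if best < dist[i]:
            dist[i] = best
            changed = True
    return changed


def alternating_bf(cost, rows, cols, source):
    """Alternate forward and backward passes until a pass changes nothing.

    Returns the distance array and the number of passes that improved it.
    """
    dist = [math.inf] * (rows * cols)
    dist[source] = 0
    passes = 0
    reverse = False
    while sweep(cost, dist, rows, cols, reverse):
        passes += 1
        reverse = not reverse
    return dist, passes


# Source at top-left (index 0), target at top-right (index 2). An expensive
# wall in the middle column forces a detour: down and right along the bottom
# row, then back up the right column.
WALL = 100
cost = [1, WALL, 1,
        1, WALL, 1,
        1, 1,    1]
dist, passes = alternating_bf(cost, 3, 3, source=0)
# The detour costs 6 (six unit-cost GCells entered) and is found after
# 2 improving passes: the forward pass covers the down-and-right part,
# the backward pass covers the upward part.
```

In this example the detour consists of two monotonic pieces and two passes suffice, which matches the flavor of Theorem 1 without proving it.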
12
What About Scenic Routes?
- Router invocations mark routing congestion
- The placer spreads cells to eliminate congestion
- Do not waste time on scenic routes (but incremental Bellman-Ford can find them anyway)
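How a router invocation might mark congestion for the placer can be sketched as a usage-over-capacity map. This Python sketch is a simplification made for illustration: it counts nets per GCell with a single uniform capacity, whereas production global routers usually track usage on the edges between GCells. The names `congestion_map` and `hot_spots` are hypothetical.

```python
def congestion_map(routes, rows, cols, capacity):
    """Return a rows x cols grid of usage/capacity ratios.

    routes is a list of paths, each a list of (row, col) GCells.
    capacity is the assumed number of nets one GCell can carry.
    """
    usage = [[0] * cols for _ in range(rows)]
    for path in routes:
        for r, c in set(path):  # count each net at most once per GCell
            usage[r][c] += 1
    return [[usage[r][c] / capacity for c in range(cols)] for r in range(rows)]


def hot_spots(congestion, threshold=1.0):
    """List the GCells whose ratio exceeds the threshold (overflow)."""
    return [
        (r, c)
        for r, row in enumerate(congestion)
        for c, ratio in enumerate(row)
        if ratio > threshold
    ]


# Two nets share GCell (0, 1); with capacity 1 that cell overflows.
routes = [
    [(0, 0), (0, 1), (0, 2)],
    [(1, 1), (0, 1)],
]
cmap = congestion_map(routes, rows=2, cols=3, capacity=1)
overflow = hot_spots(cmap)
```

The placer would then spread cells away from the GCells reported by `hot_spots`, which is why the map only needs to be accurate enough to locate trouble spots.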
13
Congestion map vs. Estimate (Early GP) [Panels: LIRE; 1 Iteration of BFG-R]
14
Congestion map vs. Estimate (Early GP) [Panels: LZ Routing; 1 Iteration of BFG-R]
15
Congestion map vs. Estimate (Early GP) [Panels: L Routing; 1 Iteration of BFG-R]
16
Congestion map vs. Estimate (Mid GP) [Panels: 1 Iteration of BFG-R; LIRE]
17
Congestion map vs. Estimate (Mid GP) [Panels: 1 Iteration of BFG-R; LZ Routing]
18
Congestion map vs. Estimate (Mid GP) [Panels: L Routing; 1 Iteration of BFG-R]
19
Congestion map vs. Estimate (Late GP) [Panels: 1 Iteration of BFG-R; LIRE]
20
Congestion map vs. Estimate (Late GP) [Panels: 1 Iteration of BFG-R; LZ Routing]
21
Congestion map vs. Estimate (Late GP) [Panels: L Routing; 1 Iteration of BFG-R]
22
Congestion Classification:
- Cell-based congestion: cell-to-cell proximity
  - Solution: cell bloating (known)
- Layout-based congestion: due to static design properties (blockages, routing obstacles)
  - Solution: static whitespace injection
- Remotely-induced layout-based congestion: caused by non-local factors, e.g., long nets
  - Solution: tricky
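Cell bloating, the known remedy listed for cell-based congestion, can be sketched as temporarily inflating the width of cells that sit in congested GCells so that the placer spreads them apart. The inflation rule below (width scaled by the congestion ratio and capped at `max_factor`) is an assumption made for illustration, not the rule used in CoPR; `Cell` and `bloated_widths` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    name: str
    x: float
    y: float
    width: float


def bloated_widths(cells, congestion, gcell_size, max_factor=1.5):
    """Return {cell name: inflated width} for the next placement iteration.

    congestion[r][c] is a usage/capacity ratio for GCell (r, c).
    Cells in uncongested GCells (ratio <= 1) keep their width.
    """
    result = {}
    for cell in cells:
        r = int(cell.y // gcell_size)
        c = int(cell.x // gcell_size)
        factor = min(max_factor, max(1.0, congestion[r][c]))
        result[cell.name] = cell.width * factor
    return result


congestion = [
    [0.5, 1.2],
    [2.0, 1.0],
]
cells = [
    Cell("a", x=5.0, y=5.0, width=4.0),   # ratio 0.5: unchanged
    Cell("b", x=15.0, y=5.0, width=4.0),  # ratio 1.2: inflated by 1.2
    Cell("c", x=5.0, y=15.0, width=4.0),  # ratio 2.0: capped at 1.5
]
widths = bloated_widths(cells, congestion, gcell_size=10.0)
```

Capping the factor keeps heavily congested regions from consuming all whitespace in a single iteration; the cap value here is arbitrary.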
23
Packing Peanut vs. Macro Expansion
[Figure labels: After 4 invocations of placement; Initial; Macro; After 2 invocations of placement; Packing peanut]
- Facilitates full use of available resources; does not overconstrain placement
24
Example During Global Placement [Panels: Congestion Map; Placement]
25
Example During Global Placement [Panels: Congestion Map; Placement]
26
Example During Global Placement [Panels: Congestion Map; Placement]
27
Example During Global Placement [Panels: Congestion Map; Placement]
28
Example During Global Placement [Panels: Congestion Map; Placement]
29
Example During Global Placement [Panels: Congestion Map; Placement]
30
Empirical Validation
- Compares against official results from the ICCAD 2012 Contest [Viswanathan et al., ICCAD 2012]
- CoPR is implemented in C++ (g++ 4.7.0) with OpenMP
- CoPR is 1% slower than SimPLR, which was 5.7x faster than Ripple
- CoPR has 2% and 7% better quality than SimPLR and NTUplace, respectively [no runtime factor]
- CoPR invokes LIRE once every 3 placement iterations, contributing 14.3% of total runtime
31
Conclusions
- Crazy fast coordinated place-and-route
- Through incremental routing and better algorithms
- Three congestion types + ways to relieve them in global placement
[Figure labels: Placement; Routing]
1st place at ICCAD'12
32
Backup: Placements Visualized
33
Backup: Detailed Placement
[Figure panels: Congestion Aware DP; After Global Placement; Congestion Unaware DP]