Published by June Anderson. Modified over 9 years ago.
1
High-Performance Gate Selection with a Signoff Timer
Andrew B. Kahng*, Seokhyeong Kang*, Hyein Lee*, Igor L. Markov+ and Pankit Thapar+
*UC San Diego, +University of Michigan
2
Outline
  Gate Selection in VLSI Design
  Previous Work
  Challenges in Gate Selection
  High-Performance Gate Selection with a Signoff Timer
  Overall Flow
  Experimental Results
  Conclusions and Future Work
3
Gate Selection in VLSI Design
Effective approach to power and delay optimization
Objective: select a library cell for each gate
  Tunable cell parameters: gate length, gate width, Vth
  Minimize power
  Satisfy constraints: slack, slew, max load capacitance, …
Figure: three tuning knobs, each spanning a range from lower (leakage) power and lower speed to higher (leakage) power and higher speed
  Gate width (drive strength): INVX2, INVX4, INVX8, INVX16
  Multi-Vth: HVT, NVT, LVT
  Lgate bias: L=60nm, L=65nm, L=55nm
4
Previous Techniques
Common heuristics/algorithms
  Figure: continuous and discrete methods compared on optimality and scalability
  Methods shown: linear programming, convex optimization, Lagrangian relaxation, dynamic programming, sensitivity-based selection
Limitations
  Do not account for realistic delay models and constraints (capacitance, slew)
  Continuous methods: industrial cell libraries offer discrete gate sizes, and rounding solutions is not easy
  Discrete methods: scalability to large circuits is an issue
5
Previous Work
Our work extends Trident 1.0 [Hu et al., Proc. ICCAD 2012]
  Produced the strongest results on the ISPD 2012 benchmarks as of ICCAD 2012
  Metaheuristic optimization with importance sampling and sensitivity-guided search
  Limitation: no interconnect delay calculation, which is an unrealistic assumption
6
Outline
  Gate Selection in VLSI Design
  Previous Work
  Challenges in Gate Selection
    Interconnect delay
    Incorrect internal timer
    Critical paths
  High-Performance Gate Selection with a Signoff Timer
  Overall Flow
  Experimental Results
  Conclusions and Future Work
7
Challenges in Gate Selection
The selection problem is seen at all phases of the RTL-to-GDS flow
It becomes more challenging at later design stages
Figure: RTL, (logic synthesis), gate-level netlist, (placement), placed netlist, (route), routed netlist, GDS; gate selection is marked along the flow and interconnects at the routing stage
Our problem (challenging)
  Timing constraints are strict: gate and interconnect delay, slew, max capacitance
  Gate selection can result in a large change in interconnect delay
New challenges in the ISPD 2013 Gate Selection Contest
  Routed netlists including interconnect
  Realistic timing constraints including slew and capacitance
  Relying on an industry signoff timer
8
Issue 1: Interconnect Delay/Slew
Delay and slew calculations for gates and wires
  Delay: 50% of input transition to 50% of output transition
  Slew: 25% to 75% of transition
Gate delay and slew are estimated with lookup-table-based nonlinear delay models (NLDMs)
Interconnect delay and slew are estimated with analytical models for RC trees
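As an illustration of the lookup-table-based gate model mentioned above, here is a minimal Python sketch of a table lookup with bilinear interpolation. It assumes the common convention that an NLDM table is indexed by input slew and output load; the slides do not state the indexing or the interpolation scheme, and the function names and table values are hypothetical.

```python
from bisect import bisect_right

def _bracket(axis, x):
    """Return indices (i, i + 1) of the axis segment used for x, clamped to the table ends."""
    i = bisect_right(axis, x) - 1
    i = max(0, min(i, len(axis) - 2))
    return i, i + 1

def nldm_lookup(slews, loads, table, slew, load):
    """Bilinearly interpolate table[i][j] (rows: input slew, columns: output load).

    Query points outside the table are linearly extrapolated from the nearest segment.
    """
    i0, i1 = _bracket(slews, slew)
    j0, j1 = _bracket(loads, load)
    ts = (slew - slews[i0]) / (slews[i1] - slews[i0])
    tl = (load - loads[j0]) / (loads[j1] - loads[j0])
    low = table[i0][j0] * (1 - tl) + table[i0][j1] * tl    # interpolate along load at slew i0
    high = table[i1][j0] * (1 - tl) + table[i1][j1] * tl   # interpolate along load at slew i1
    return low * (1 - ts) + high * ts                      # interpolate along slew

# Illustrative (made-up) 2x2 delay table: slew axis in ps, load axis in fF, delays in ps.
SLEWS = [10.0, 20.0]
LOADS = [1.0, 3.0]
DELAY = [[5.0, 9.0],
         [7.0, 11.0]]
```

Calling `nldm_lookup(SLEWS, LOADS, DELAY, 15.0, 2.0)` interpolates halfway along both axes. A real library would use larger tables and separate tables for delay and output slew.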
9
Issue 1: Interconnect Delay/Slew
The impact of interconnect on slew values propagates both upstream and downstream and changes delays
  Output pin capacitance change + slew change by interconnect
  Slew propagation + slew degradation by interconnect
Result: large delay changes in upstream and downstream gates and nets
10
Issue 2: Incorrect Internal Timer
A timer is essential to estimate interconnect delay and slew, which are affected by gate selection/Vth swapping
Two options: signoff timer and internal timer
  Signoff timer: iterative invocation from the post-layout optimizer increases runtime
  Internal timer: timing discrepancy against post-layout signoff
An accurate internal timer is needed
Figure labels: Gate Selection/Vt-Swapping, Post-Layout Optimizer, Post-Layout Signoff
11
Issue 2: Inaccurate Internal Timer
Challenges in matching the signoff timer
  Error propagation along paths
  Error accumulation with netlist changes
Figure: error propagation on paths; x-axis: logic depth along path, y-axis: error (internal - signoff)
Figure: error accumulation with netlist change; x-axis: # cell changes, y-axis: error
Timing calibration to a signoff timer is needed to avoid divergence
12
Issue 3: Critical Paths
Many near-critical paths in the given benchmarks*
Challenging to obtain a timing-feasible solution
Dedicated critical path optimization is needed
* From ISPD 2013 Discrete Gate Selection Contest Presentation
13
Outline
  Gate Selection in VLSI Design
  Previous Work
  Challenges in Gate Selection
  High-Performance Gate Selection with a Signoff Timer
    Internal Timer with Interconnect Delay Modeling
    Calibration to a Signoff Timer
    Dedicated Critical Path Optimization
    Sensitivity Functions
  Overall Flow
  Experimental Results
  Conclusions and Future Work
14
Our Sizer
High-Performance Gate Selection with a Signoff Timer
  1. Interconnect delay/slew models for an internal timer
  2. Efficient calibration to a signoff timer
  3. Critical path optimization for timing-feasible solutions
  4. Sensitivity-guided cell selection
15
1. Interconnect Delay/Slew for Internal Timer
Essential to estimate interconnect delay and slew, which are affected by gate selection/Vth swapping
Requirements for an internal timer
  Fast enough for move-based optimization
  Accurate enough to track the signoff timer
Our approach: use the best-performing models for interconnect delay/slew from previous work
16
Interconnect Delay/Slew: Previously Known Models
Early optimization does not require high accuracy, so fast interconnect models are used
We use pre-existing models
Model selection criterion: endpoint slack error between the signoff timer* and our estimation
  Delay models: Elmore delay, D2M, DM1, DM2
  Slew models: PERI, S2M
  Effective cap. models: McCormick, Total Cap.
References: D2M: Alpert et al., ISPD 2000; DM1, DM2: Kahng et al., TCAD 1997; PERI: Kashyap et al., TAU 2002; S2M: Agarwal et al., TCAD 2004; McCormick: McCormick, Thesis 1989
* Synopsys PrimeTime
17
Interconnect Delay/Slew: Model Selection
The (D2M, PERI) model combination has the smallest mean and standard deviation of endpoint slack error
Figure: endpoint slack error distribution for (EM, PERI), (D2M, PERI), (DM1, PERI) and (DM2, PERI); x-axis: slack error (ps), y-axis: % of paths
Figure: normalized mean/std. of endpoint slack error
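To make the two simplest delay models concrete, the following Python sketch computes the Elmore delay and the D2M metric at every node of an RC tree, using the published D2M form ln(2) * m1^2 / sqrt(m2), where m1 and m2 are the first two moments of the impulse response. The tree encoding (parent index, resistance to parent, node capacitance, node 0 as the driver) and the function names are assumptions made for this example, not the authors' data structures.

```python
import math

def _rc_tree_sum(parent, res, weight):
    """For each node i, return sum_k R_ik * weight[k], where R_ik is the
    resistance shared by the root-to-i and root-to-k paths.
    Nodes must be ordered so that parent[i] < i; node 0 is the root (driver)."""
    n = len(parent)
    down = list(weight)
    for i in range(n - 1, 0, -1):        # accumulate subtree totals bottom-up
        down[parent[i]] += down[i]
    out = [0.0] * n
    for i in range(1, n):                # accumulate along root-to-node paths
        out[i] = out[parent[i]] + res[i] * down[i]
    return out

def elmore_and_d2m(parent, res, cap):
    """Return (elmore, d2m) lists; res[i] is the resistance between node i and its parent."""
    elmore = _rc_tree_sum(parent, res, cap)                      # |m1| at each node
    m2 = _rc_tree_sum(parent, res, [c * d for c, d in zip(cap, elmore)])
    d2m = [0.0] * len(parent)
    for i in range(1, len(parent)):
        if m2[i] > 0.0:                  # skip nodes with no downstream capacitance
            d2m[i] = math.log(2) * elmore[i] ** 2 / math.sqrt(m2[i])
    return elmore, d2m
```

For a single RC segment the sketch gives an Elmore delay of RC and a D2M of ln(2)·RC, which matches the exact 50% delay of a single-pole response. Slew models such as PERI are not covered here.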
18
2. Calibration to a Signoff Timer
Challenges in matching the results of a signoff timer
  Timing divergence from error propagation along timing paths and error accumulation with netlist changes
  The divergence can be compensated with an offset
Offset-based slack calibration [Moon et al., Patent 7,823,098]
  Improve the accuracy of a given STA engine by periodically invoking a signoff timer and storing slack differences (offsets)
  offset = signoff timer - internal timer
Figure: the internal timer requests timing information from the signoff timer
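The offset idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: slacks are kept per endpoint in dictionaries, the signoff timer is represented by a hypothetical callable, and recalibration is triggered after a caller-chosen fraction of cells has changed. None of these names come from the slides or from the cited patent.

```python
class OffsetCalibrator:
    """Stores per-endpoint offsets, where offset = signoff slack - internal slack."""

    def __init__(self, signoff_slacks, num_cells, threshold):
        self.signoff_slacks = signoff_slacks  # hypothetical callable: () -> {endpoint: slack}
        self.num_cells = num_cells
        self.threshold = threshold            # fraction of changed cells that triggers recalibration
        self.offsets = {}
        self.changed = 0

    def calibrate(self, internal):
        """Invoke the signoff timer and store the slack differences."""
        signoff = self.signoff_slacks()
        self.offsets = {ep: signoff[ep] - internal[ep] for ep in signoff}
        self.changed = 0

    def note_cell_change(self, internal):
        """Count one cell change; recalibrate and return True once the threshold is reached."""
        self.changed += 1
        if self.changed >= self.threshold * self.num_cells:
            self.calibrate(internal)
            return True
        return False

    def slack(self, endpoint, internal_slack):
        """Calibrated slack = internal slack + stored offset (0 if never calibrated)."""
        return internal_slack + self.offsets.get(endpoint, 0.0)
```

Between calibrations the optimizer only pays for the internal timer, and the stored offsets keep its slack estimates close to the signoff values until enough cells have changed.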
19
Calibration Frequency vs. Error
Impact of calibration frequency on average slack error during gate selection: a 5% threshold shows <10ps slack errors
Figure: x-axis: % of cell changes during leakage optimization, y-axis: average slack error relative to the signoff timer
20
Efficient Signoff-Timer Interface
A Tcl socket interface sends commands to and receives results from the signoff timer
Basis of the winning ISPD 2013 gate selection contest entry
Figure: (a) Tcl socket code, (b) timing correlation with the socket interface
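The slide's Tcl code is not reproduced in this transcript. As a rough illustration of the client side of such an interface, here is a Python sketch that sends one command over a connected socket and reads a one-line reply. The newline-terminated protocol and the `GET_WNS` command are hypothetical stand-ins, not commands of any real timer.

```python
import socket

def query_timer(sock, command):
    """Send one newline-terminated command and return the one-line reply as text.

    Assumes the server answers each command with exactly one line.
    """
    sock.sendall(command.encode() + b"\n")
    reply = bytearray()
    while not reply.endswith(b"\n"):
        chunk = sock.recv(4096)
        if not chunk:                    # peer closed the connection
            break
        reply += chunk
    return reply.decode().strip()

# Typical use (host and port are placeholders for wherever the timer-side server listens):
#   sock = socket.create_connection((host, port))
#   wns = float(query_timer(sock, "GET_WNS"))
```

Keeping the timer process alive behind a socket avoids restarting it and reloading the design for every query, which is presumably why the interface is described as efficient.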
21
3. Critical Path Optimization
ISPD 2013 contest: many near-critical paths in the benchmarks
Challenging to obtain a timing-feasible solution
Dedicated critical path optimization: optimize cells on the most critical path to reduce WNS*
  Downsizing fanouts
  Peephole optimization
* WNS: Worst Negative Slack
22
Critical Path Optimization: Downsizing Fanouts
Downsizing fanouts of critical cells
  Improve the delay of the target cell by reducing its output load
  Downsize fanout cells according to their sensitivity score
Figure: critical cells and their fanout cells; downsizing the fanout cells reduces their input capacitance, and the cell delay decreases with the reduced output load
23
Critical Path Optimization: Peephole Optimization
Pick k cells in a critical path and exhaustively search for the best combination of the k cells
All possible combinations are listed in Gray-code order to minimize the overhead of incremental STA*
  N (# trials) = (# size options)^k
Figure: a window of k cells slides along the critical path (current window, next window); within each window, all possible combinations (trial 1, trial 2, ..., trial N) are enumerated with Gray code and evaluated with iSTA, and the best move is picked
* STA: Static Timing Analysis
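The Gray-code ordering can be sketched as follows. The Python generator below enumerates all option combinations for k cells so that consecutive trials differ in exactly one cell, which is what keeps each incremental timing update small. It uses a reflected mixed-radix Gray code; the slides do not say which Gray-code variant the authors use, and the `cost` callback is a hypothetical stand-in for an incremental STA evaluation.

```python
def gray_codes(radices):
    """Yield every index tuple for the given radices; consecutive tuples differ in one position."""
    if not radices:
        yield ()
        return
    tails = list(gray_codes(radices[1:]))
    for digit in range(radices[0]):
        # Reverse the tail order on odd digits so that only the leading
        # digit changes when moving from one digit value to the next.
        for tail in (tails if digit % 2 == 0 else reversed(tails)):
            yield (digit,) + tail

def peephole_best(options, cost):
    """Try every combination of per-cell options and return (best_cost, best_choice).

    options[i] lists the candidate cells for the i-th window position;
    cost(choice) is a stand-in for re-timing the design with that choice.
    """
    best = None
    for code in gray_codes([len(o) for o in options]):
        choice = tuple(o[i] for o, i in zip(options, code))
        c = cost(choice)
        if best is None or c < best[0]:
            best = (c, choice)
    return best
```

With s options per cell the loop runs s^k times, matching the trial count on the slide, so k has to stay small for the search to remain affordable.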
24
4. Sensitivity Function: Timing Recovery
25
Sensitivity Function: Leakage Reduction
Multiple sensitivity functions (SFs) from [7] are used
Among the five SFs, the best SF is selected and used for the next optimization stage
  SF1: ∆leakage / ∆delay
  SF2: ∆leakage * slack
  SF3: ∆leakage / (∆delay * #paths)
  SF4: ∆leakage * slack / #paths
  SF5: ∆leakage * slack / (∆delay * #paths)
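The five leakage-reduction sensitivity functions translate directly into code. The Python sketch below evaluates all of them for one candidate move. The argument names are hypothetical, sign and unit conventions are left to the caller, and a zero ∆delay is not handled.

```python
def leakage_sensitivities(d_leak, d_delay, slack, n_paths):
    """Return SF1..SF5 for one candidate move.

    d_leak  : leakage change of the move
    d_delay : delay change of the move (must be non-zero)
    slack   : timing slack at the cell
    n_paths : number of paths through the cell (must be non-zero)
    """
    return {
        "SF1": d_leak / d_delay,
        "SF2": d_leak * slack,
        "SF3": d_leak / (d_delay * n_paths),
        "SF4": d_leak * slack / n_paths,
        "SF5": d_leak * slack / (d_delay * n_paths),
    }
```

For example, `leakage_sensitivities(4.0, 2.0, 10.0, 5)` evaluates SF1 as 4/2 and SF5 as (4·10)/(2·5). Ranking cells by one of these values at a time gives one greedy run per sensitivity function.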
26
Outline
  Gate Selection in VLSI Design
  Previous Work
  Challenges in Gate Selection
  High-Performance Gate Selection with a Signoff Timer
  Overall Flow
    Global Timing Recovery
    Power Reduction with Feasible Timing
  Experimental Results
  Conclusions and Future Work
27
Overall Optimization Flow
Overall flow: Global Timing Recovery (GTR) + Power Reduction with Feasible Timing (PRFT)
Input: routed netlist, SPEF; cells are set to minimum size
Global Timing Recovery
  GTRwoST: find a timing-feasible solution with an internal timer
  GTRwST: find a timing-feasible solution with a signoff timer
Power Reduction w/ Feasible Timing
  PRFT phase 1: leakage reduction with different sensitivity functions
  PRFT phase 2: leakage reduction with kick-move
Output: selection solution
28
GTR without Signoff Timer
Objective: find a timing-feasible solution with the internal timer (accurate timing information is not needed)
Use a guardband for fast solution search
GTR procedure (flowchart labels): GTR(GB); Feasible? (Yes/No); increase guardband (GB); timing-feasible solution (non-feasible with the signoff timer)
GTR(α, γ) steps
  Multi-threaded STA
  Calculate sensitivity (α)
  Upsize γ% of cells in descending order of sensitivity
  Incremental STA
  Timing met? If no, repeat
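The inner GTR loop can be illustrated with a small Python sketch: rank cells by sensitivity, upsize the top γ fraction, re-check timing, and repeat. The callbacks `sensitivity` and `timing_met` are hypothetical stand-ins for the internal timer, sizes are modeled as integer indices, γ is applied to the cells that can still be upsized, and the guardband loop around GTR is omitted. All of these are assumptions of this sketch.

```python
import math

def global_timing_recovery(sizes, max_size, sensitivity, timing_met, gamma, max_iters=100):
    """Upsize the top gamma fraction of cells by sensitivity until timing is met.

    sizes       : dict cell -> size index (modified in place)
    sensitivity : callable (cell, sizes) -> score, higher means upsize first
    timing_met  : callable (sizes) -> bool, stand-in for an STA feasibility check
    Returns True if a timing-feasible sizing was found.
    """
    for _ in range(max_iters):
        if timing_met(sizes):
            return True
        candidates = [c for c in sizes if sizes[c] < max_size]
        if not candidates:
            return False                 # every cell is already at its largest size
        candidates.sort(key=lambda c: sensitivity(c, sizes), reverse=True)
        count = max(1, math.ceil(gamma * len(candidates)))
        for cell in candidates[:count]:
            sizes[cell] += 1             # upsize by one step
    return timing_met(sizes)
```

Upsizing a batch of cells per iteration, instead of one cell at a time, trades some precision for far fewer timing updates, which suits a stage whose only goal is to reach feasibility quickly.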
29
GTR with Signoff Timer
Objective: find a timing-feasible solution with the signoff timer
Timing recovery is added to the GTR flow
Timing recovery procedure (flowchart labels): cell upsizing; peephole and critical path optimization; internal slack calibration; signoff timer (update slack offset); Feasible? (Yes/No); timing-feasible solution
30
PRFT with Sensitivity Functions
Objective: find the best leakage solution
Various sensitivity functions are tried sequentially
Outer loop (flowchart labels): SGGS(SF i); Feasible? (Yes/No); timing recovery; next sensitivity function (SF i); best solution/sensitivity function (SF)
SGGS procedure
  Run static timing analysis
  Calculate sensitivity for all cells
  Downsize cell C with maximum sensitivity
  Incremental STA
  slack(C) < 0? If yes, revert the downsizing; if no, continue
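The SGGS steps listed above can be sketched in Python as a greedy loop with revert. The `sensitivity` and `slack` callbacks are hypothetical stand-ins for the sensitivity calculation and incremental STA, sizes are integer indices, and the rule that a reverted cell is not tried again is an assumption added here so that the loop terminates. The slides do not describe the stopping rule.

```python
def sggs(sizes, sensitivity, slack):
    """Greedily downsize cells; revert any downsizing that makes the cell's slack negative.

    sizes       : dict cell -> size index, 0 is the smallest (modified in place)
    sensitivity : callable (cell, sizes) -> score, higher means downsize first
    slack       : callable (cell, sizes) -> slack after the move (incremental STA stand-in)
    """
    frozen = set()                       # cells whose last downsizing was reverted
    while True:
        candidates = [c for c in sizes if sizes[c] > 0 and c not in frozen]
        if not candidates:
            return sizes
        cell = max(candidates, key=lambda c: sensitivity(c, sizes))
        sizes[cell] -= 1                 # downsize by one step
        if slack(cell, sizes) < 0:
            sizes[cell] += 1             # revert the move
            frozen.add(cell)
```

Running this once per sensitivity function and keeping the lowest-leakage feasible result corresponds to the outer loop on the slide.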
31
PRFT: Speeding up Bottleneck Cells
Speed up bottleneck cells: recover timing slack with minimum power impact
To escape from a local optimum, γ% of bottleneck cells are upsized
Flowchart labels: best solution/best sensitivity function (SF) from PRFT phase 1; kick move (LSMC) with γ% ratio; SGGS(SF); Feasible? (Yes/No); timing recovery; next
32
Outline
  Gate Selection in VLSI Design
  Previous Work
  Challenges in Gate Selection
  High-Performance Gate Selection with a Signoff Timer
  Overall Flow
  Experimental Results
  Conclusions and Future Work
33
ISPD 2013 Gate Selection Contest
Realistic benchmarks and constraints
  Netlist (Verilog), parasitics (SPEF), timing constraints (SDC)
  Max slew/load constraints
  Library: 11 logic functions, 30 cell types each (three Vth options and ten different sizes), for 330 cells in total
Leakage power of violation-free solutions is compared
Final timing evaluation with a commercial signoff tool
34
Experimental Results: Power and Runtime
Power and runtime comparison with the best contest results
http://www.ispd.cc/contests/13/ISPD_2013_Contest_Final.pdf
Figure: normalized leakage power and normalized runtime
No team found a feasible solution for netcard_fast
35
Experimental Results: Runtime Breakdown
Figure: runtime breakdown
36
Experimental Results: Optimization Trajectories
Normalized TNS* and leakage power over GTR iterations
After timing correlation, TNS increases due to the discrepancy between the internal timer and the signoff timer
Figure regions: GTR without signoff timer; after timing correlation; GTR with signoff timer
* TNS: Total Negative Slack
37
Experimental Results: Impact of Timing Calibration
The minimum leakage without timing violation is achieved with calibration at every 5% cell change
  No calibration: timing violations cannot be fixed
  One calibration: leakage increases after timing recovery
  Applying a guardband (GB): leakage overhead
Figure: result of pci_b32_fast
38
Conclusions and Future Work
Trident 2.0: high-performance gate selection
  Fast interconnect models with reasonable accuracy for an efficient internal timer
  Calibration to a signoff timer with an interface to improve timing accuracy
  Dedicated critical path optimization with heuristics
ISPD 2013 gate selection contest
  Trident 2.0 took 2nd and 1st places in two contest categories, respectively
Future work
  See if Lagrangian relaxation helps
  Additional industry benchmarks
39
Thank you!