Published by Johnathan Russell. Modified over 7 years ago.
1
Comprehensive Optimization of Scan Chain Timing During Late-Stage IC Implementation
Kun Young Chung*, Andrew B. Kahng+ and Jiajia Li+
+ University of California at San Diego
* Qualcomm Inc.
2
Agenda
- Motivation
- Related Works
- Our Methodology
- Experimental Results
- Conclusions
3
Scan Chain Timing Matters
- Scan chain timing is important to test time, cost and robustness
- Issues
  - Small number of logic instances along scan timing paths
    - Scan timing paths are vulnerable to hold violations
    - Increase of #hold buffers (area, routing congestion)
  - Scan shift at a high frequency → large dynamic voltage drop (DVD) → degraded setup scan timing → "false failure"
- Goals
  - Scan ordering for reduced #hold buffers
  - Gating insertion to minimize timing degradation due to DVD
- Problems are not new, but do previous approaches really solve these problems?
4
Need Late-Stage Optimization
- Hold-critical paths and DVD hotspots vary between early (post-placement) and late (post-routing) stages
  - Clock skew and interconnect delay affect hold timing
- Hold-critical scan timing paths vary
- [Figure: scan timing paths, post-placement versus post-routing. Design: LEON3MP. Blue: non-critical paths; red: hold-critical paths.]
5
Need Late-Stage Optimization
- Hold-critical paths and DVD hotspots vary between early (post-placement) and late (post-routing) stages
  - Clock buffer insertion and timing optimization during routing affect DVD hotspots
- Dynamic voltage drop (during scan shift) varies
- [Figure: dynamic voltage drop maps, post-placement versus post-routing. Design: LEON3MP. Switching activity at test input = 50%.]
6
Need Late-Stage Optimization
- Hold-critical paths and DVD hotspots vary between early (post-placement) and late (post-routing) stages
  - Clock skew and interconnect delay affect hold timing
  - Clock buffer insertion and timing optimization during routing affect DVD hotspots
- An early-stage optimization might be misleading
  - Focus on optimizations during late-stage IC implementation
- Challenges
  - Consideration of timing impact on datapaths in functional mode
  - Minimization of area and power overheads
7
Agenda
- Motivation
- Related Works
- Our Methodology
- Experimental Results
- Conclusions
8
Related Works
- Scan ordering
  - [Feuer83] uses a TSP-based heuristic for scan ordering
  - [Gupta03] considers physical information (routing segments) for scan ordering to minimize wirelength
  - [Tudu13] performs test-pattern-aware scan ordering to minimize transitions
  - [Teene03] applies skew-aware scan ordering to minimize #hold buffers
  - Ours also considers wire delay and datapath timing
- Gating approaches
  - [Gerstendörfer00] inserts gating logic to suppress activity of fanout combinational cells during scan shift
  - [Elshoukry07] avoids gating insertion on timing-critical paths
  - [Jayaram10] proposes to gate internal nodes inside fanout cones of scan cells
  - Ours is the first gating approach for minimization of DVD-aware timing degradation
9
Agenda
- Motivation
- Related Works
- Our Methodology
  - Scan Ordering for Hold Buffer Removal
  - DVD-Aware Gating Insertion
- Experimental Results
- Conclusions

Post-Routing Scan Ordering: given a routed design, timing constraints, and an upper bound on wirelength penalty, perform scan ordering to minimize #hold buffers.
10
Causes of Hold Violations
- Negative clock skew values
- Small distance between launch and capture FFs
- Figures (Design: LEON3MP, 28LP. Red: hold slack < 50ps; blue: hold slack < 100ps; green: hold slack < 150ps)
  - Skew distribution of scan timing paths with hold buffers: the majority of hold-critical paths have negative skew
  - Distances between launch and capture FFs versus hold slacks: smaller distances lead to smaller hold slacks
11
Causes of Hold Violations
- Negative clock skew values
- Small distance between launch and capture FFs
- Figures (Design: LEON3MP, 28LP. Red: hold slack < 50ps; blue: hold slack < 100ps; green: hold slack < 150ps)
  - Skew distribution of scan timing paths with hold buffers: the majority of hold-critical paths have negative skew
  - Distances between launch and capture FFs versus hold slacks: smaller distances lead to smaller hold slacks
- Goal: (i) achieve greater incidence of positive skew values, and (ii) slightly increase start-end FF distances → remove hold buffers
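The two causes above follow from the standard hold check. The sketch below is illustrative only: it assumes skew is measured as launch-clock arrival minus capture-clock arrival (the slides do not state their sign convention), and all delay values are made-up numbers in picoseconds.

```python
def hold_slack_ps(launch_clk_ps, capture_clk_ps, clk_to_q_ps, path_delay_ps, hold_time_ps):
    """Standard hold check: earliest data arrival minus the hold requirement."""
    data_arrival = launch_clk_ps + clk_to_q_ps + path_delay_ps
    data_required = capture_clk_ps + hold_time_ps
    return data_arrival - data_required


# Hypothetical scan path: clk-to-Q = 40 ps, wire delay = 10 ps, hold time = 20 ps.
# Negative skew (capture clock arrives 60 ps after launch clock): violation.
negative_skew = hold_slack_ps(100, 160, 40, 10, 20)

# Same cells with the launch/capture roles reversed (positive skew): met.
positive_skew = hold_slack_ps(160, 100, 40, 10, 20)

# Negative skew again, but a longer launch-to-capture wire (45 ps): met.
longer_wire = hold_slack_ps(100, 160, 40, 45, 20)

print(negative_skew, positive_skew, longer_wire)
```

Under this convention, hold slack equals skew + clk-to-Q + path delay - hold time, which is why both a more positive skew and a slightly longer start-end distance can remove the need for a hold buffer.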
12
Scan Ordering for Hold Buffer Removal
Scan ordering procedure:

for each scan chain Ck
    π = original ordering of Ck
    h = original #hold buffers in Ck
    for i = 2 to (Nk - 2)              // Nk = #scan cells in Ck
        for j = (i + 1) to (Nk - 1)
            π' = 2OptSwap(π, i, j)
            h' = #hold buffers with π'
            if (h' < h && π' is feasible) then
                π = π'; h = h'
            endif
        endfor
    endfor
    Reorder Ck based on π
endfor

- feasible = (i) no timing degradation on datapaths, (ii) no additional hold violations, (iii) meet upper bound on wirelength penalty
- Subchain with fixed ordering is merged into one node before optimization
- [Figure: example of skew-aware scan ordering]
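The per-chain loop of the procedure above can be sketched in Python as follows. This is a toy illustration, not the authors' implementation: `count_hold_buffers` and `is_feasible` are hypothetical callbacks standing in for the signoff timer and wirelength checks, 2OptSwap is assumed to be the usual 2-opt segment reversal, and the buffer-count model (one buffer per adjacent pair whose capture clock arrives later than its launch clock) is invented for the example.

```python
def two_opt_swap(order, i, j):
    """Classic 2-opt move: reverse the segment order[i..j] (inclusive)."""
    return order[:i] + order[i:j + 1][::-1] + order[j + 1:]


def reorder_chain(order, count_hold_buffers, is_feasible):
    """One greedy pass over all (i, j) pairs, keeping the first and last cells fixed."""
    best = list(order)
    h = count_hold_buffers(best)
    n = len(best)
    # 0-based equivalent of: for i = 2 to (Nk - 2), for j = (i + 1) to (Nk - 1)
    for i in range(1, n - 2):
        for j in range(i + 1, n - 1):
            candidate = two_opt_swap(best, i, j)
            h_new = count_hold_buffers(candidate)
            if h_new < h and is_feasible(candidate):
                best, h = candidate, h_new
    return best, h


# Hypothetical clock arrival times (ps) at each scan cell.
clock_arrival = {"s": 50, "x": 10, "y": 20, "z": 30, "t": 0}


def toy_hold_buffers(order):
    # Toy model: a pair needs a buffer when the capture clock is later than the launch clock.
    return sum(clock_arrival[a] < clock_arrival[b] for a, b in zip(order, order[1:]))


def always_feasible(order):
    # Stand-in for the datapath timing, hold and wirelength checks.
    return True


new_order, buffers = reorder_chain(["s", "x", "y", "z", "t"], toy_hold_buffers, always_feasible)
print(new_order, buffers)
```

In this toy run the buffer count drops from 2 to 1. A single greedy pass is not guaranteed to reach the best possible ordering.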
13
Agenda
- Motivation
- Related Works
- Our Methodology
  - Scan Ordering for Hold Buffer Removal
  - DVD-Aware Gating Insertion
- Experimental Results
- Conclusions

DVD Mitigation by Gating: given a routed design, timing constraints, power information, and an upper bound on area overhead, perform gating insertion to maximize minimum DVD-aware slack.
14
Overall Gating Insertion Flow
- Determine DVD hotspots
  - DVD hotspot = a grid with large DVD
  - Only consider DVD hotspots having impact on scan (setup) timing slacks
  - Worst-DVD hotspot ≠ hotspot with largest impact on timing
- Find gating locations
  - Reduce dynamic power within selected DVD hotspots
  - Minimize the number of gating logic insertions
- ECO-based gating insertion
  - Minimize area, power and wirelength penalties
- [Figure: illustration of gating insertion. Labels: scan cell pins D, TI, TE, Q; SE; scan timing path; comb logic cone; gating logic (OR gate).]
15
DVD-Aware Gating Insertion (1)
- Integer linear programming-based hotspot selection
  - Given limited number (R) of hotspots
  - Maximize the minimum DVD-aware slack (Smin)

$$
\begin{aligned}
\text{Maximize} \quad & S_{\min} \\
\text{such that} \quad & S_j + \sum_i (\alpha_i \cdot \Delta_i) \ge S_{\min} \\
& \alpha_i \le L \cdot \beta_r \\
& \sum_r \beta_r \le R
\end{aligned}
$$

Notations
- $S_j$: original slack of scan timing path
- $\Delta_i$: expected scan cell delay improvement from DVD reduction
- $\alpha_i$: binary indicator of whether DVD on scan cell is improved
- $\beta_r$: binary indicator of whether DVD hotspot is selected (optimized)
- $L$: large constant number
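For very small instances, the same max-min objective can be checked by exhaustive search instead of an ILP solver. The sketch below is a brute-force stand-in, not the authors' formulation: it replaces the big-constant linking constraint with a direct membership test, assumes each scan cell belongs to exactly one hotspot and that a path's slack gain is the sum over its own scan cells, and uses invented slack and gain values in picoseconds.

```python
from itertools import combinations


def select_hotspots(path_slack, path_cells, cell_gain, cell_hotspot, hotspots, max_selected):
    """Return (chosen hotspots, resulting minimum slack) over all subsets of size <= max_selected."""
    best_set = frozenset()
    best_smin = min(path_slack.values())  # nothing selected
    for k in range(1, max_selected + 1):
        for chosen in combinations(hotspots, k):
            chosen_set = frozenset(chosen)
            smin = min(
                path_slack[p]
                + sum(cell_gain[c] for c in path_cells[p] if cell_hotspot[c] in chosen_set)
                for p in path_slack
            )
            if smin > best_smin:
                best_set, best_smin = chosen_set, smin
    return best_set, best_smin


# Hypothetical instance: two scan timing paths, three scan cells, three hotspots.
path_slack = {"p1": -30, "p2": -5}                   # S_j
path_cells = {"p1": ["c1", "c2"], "p2": ["c3"]}      # scan cells on each path
cell_gain = {"c1": 20, "c2": 15, "c3": 40}           # delta_i
cell_hotspot = {"c1": "h1", "c2": "h2", "c3": "h3"}  # hotspot containing each cell

chosen, smin = select_hotspots(path_slack, path_cells, cell_gain, cell_hotspot,
                               ["h1", "h2", "h3"], max_selected=2)
print(sorted(chosen), smin)
```

In this toy instance the hotspot with the largest single delay gain (h3) is not selected, because it does not help the worst path. Exhaustive search grows combinatorially with the number of hotspots, which is one reason to use an ILP in practice.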
16
DVD-Aware Gating Insertion (2)
- Netlist traversal to find gating locations
  - Minimize dynamic power within selected hotspots
  - Assign gain of 1 to each cell within selected hotspots
  - Propagate gain values from each cell within selected hotspots backwards based on #fanins
  - Select gating location with the maximum gain
- ECO optimization
  - Perform matching optimization (Hungarian method) between white spaces and gating logics
- [Figure: a simplified example. Legend: cells in selected hotspots; candidate gating locations; selected gating locations. Gain values annotated in the figure include 0.5, 0.75, 1 and 1.5.]
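One possible reading of the gain propagation step is sketched below. The slide does not spell out the exact rule, so the split used here is an assumption: each cell divides its accumulated gain equally among its fanin drivers, and a driver sums the shares from all its fanouts. The netlist and the cell names are hypothetical.

```python
from collections import defaultdict, deque


def propagate_gains(fanins, hotspot_cells):
    """Propagate gains backwards through a combinational DAG.

    fanins maps each cell to the list of cells driving it.
    """
    cells = set(fanins) | {d for drivers in fanins.values() for d in drivers}
    pending = defaultdict(int)  # number of fanouts not yet processed, per cell
    for drivers in fanins.values():
        for d in drivers:
            pending[d] += 1
    # Seed: gain of 1 for every cell inside a selected hotspot.
    gain = {c: (1.0 if c in hotspot_cells else 0.0) for c in cells}
    # Start from cells with no fanout and walk towards the scan cells.
    queue = deque(c for c in cells if pending[c] == 0)
    while queue:
        cell = queue.popleft()
        drivers = fanins.get(cell, [])
        for d in drivers:
            gain[d] += gain[cell] / len(drivers)  # equal split over #fanins (assumed)
            pending[d] -= 1
            if pending[d] == 0:
                queue.append(d)
    return gain


# Hypothetical netlist: ff1..ff3 are scan cells, g1..g3 are gates in selected hotspots.
fanins = {"g1": ["ff1", "ff2"], "g2": ["ff2", "ff3"], "g3": ["g2"]}
gain = propagate_gains(fanins, hotspot_cells={"g1", "g2", "g3"})

candidates = ["ff1", "ff2", "ff3"]  # candidate gating locations: scan cell outputs
best_location = max(candidates, key=gain.get)
print(best_location, gain[best_location])
```

Here ff2 drives both hotspot cones and collects the largest gain, so gating its output would suppress the most hotspot activity with a single gate.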
17
Agenda
- Motivation
- Related Works
- Our Methodology
- Experimental Results
- Conclusions
18
Design of Experiments
- Technology: 28LP foundry, dual-VT
- Example of applicable tool chain
  - Synthesis: Synopsys Design Compiler H SP3
  - Scan chain insertion: Synopsys DFT Compiler H SP3 (maximum length of each chain = 250)
  - P&R: Synopsys IC Compiler I SP1
  - Signoff timer: Synopsys PrimeTime H SP2
  - Power analysis: Synopsys PT-PX H SP2
  - Vectorless DVD analysis: ANSYS RedHawk
- Testcases

| Design | Clock period (ns) | #Instances | #Scan chains |
|---|---|---|---|
| DES | 0.85 | 74035 | 45 |
| VGA | 1.1 | 80412 | 78 |
| LEON3MP | 2 | 474108 | 445 |
| NETCARD | 1.8 | 428974 | 358 |
19
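Scan Ordering Results
- Reference: Default SP&R using a commercial tool flow (orig)
- Remove up to 82% of hold buffers along scan chains
- Incur negligible timing and wirelength penalties

Values in parentheses are normalized to orig.

| Design | Flow | #Hold buffers | WNS (ps) | TNS (ns) | THS (ns) | Wirelength (mm) |
|---|---|---|---|---|---|---|
| DES | orig | 1296 (1.00) | -21 | -0.089 | -0.101 | 765.9 |
| DES | opt | 487 (0.38) | | -0.081 | -0.121 | 766.3 |
| VGA | orig | 202 (1.00) | -6 | -0.019 | -1.222 | 3087.9 |
| VGA | opt | 89 (0.44) | | -0.018 | -1.223 | 3089.7 |
| LEON3MP | orig | 25581 (1.00) | 30 | | -0.734 | 11088 |
| LEON3MP | opt | (0.18) | | | -0.705 | 11084 |
| NETCARD | orig | 30864 (1.00) | -4 | -0.004 | | 12729 |
| NETCARD | opt | 26887 (0.87) | | | | 12720 |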
20
Gating Insertion Results
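- Reference: Default SP&R using a commercial tool flow (orig)
- Achieve up to 58% improvement of DVD-induced slack degradation (Δslack)
- Small number of gating cells → < 1% area penalty
- Worst DVD does not necessarily correspond to worst DVD-aware slack

Values in parentheses are normalized to orig.

| Design | Flow | Δslack (ps) | WNS (ps) | TNS (ns) | #Gating cells | DVD (mV) | Area (μm²) |
|---|---|---|---|---|---|---|---|
| DES | orig | 60 (1.00) | -21 | -0.089 | - | 85 | 79662 |
| DES | opt | 25 (0.42) | | -0.079 | 36 | 84 | 79705 |
| VGA | orig | 159 (1.00) | -6 | -0.019 | | 93 | 120832 |
| VGA | opt | 118 (0.74) | | -0.029 | 42 | 82 | 120873 |
| LEON3MP | orig | 471 (1.00) | 30 | | | 129 | 699885 |
| LEON3MP | opt | 383 (0.81) | | | 62 | 121 | 699969 |
| NETCARD | orig | 576 (1.00) | -4 | -0.004 | | 163 | 575869 |
| NETCARD | opt | 496 (0.86) | -8 | -0.008 | 111 | 147 | 576022 |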
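The normalized values in parentheses and the headline 58% figure can be rechecked from the Δslack column. This is a small arithmetic sketch using only the numbers reported above; the variable and function names are ours.

```python
# Δslack (ps) per design as reported in the results: (orig, opt)
delta_slack_ps = {
    "DES": (60, 25),
    "VGA": (159, 118),
    "LEON3MP": (471, 383),
    "NETCARD": (576, 496),
}


def normalized(orig, opt):
    """opt relative to orig, rounded to two decimals as in the table."""
    return round(opt / orig, 2)


def improvement_pct(orig, opt):
    """Reduction of Δslack relative to orig, in whole percent."""
    return round(100 * (1 - opt / orig))


ratios = {design: normalized(*values) for design, values in delta_slack_ps.items()}
best_improvement = max(improvement_pct(*values) for values in delta_slack_ps.values())
print(ratios, best_improvement)
```

The best case is DES, where Δslack falls from 60 ps to 25 ps.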
21
Agenda
- Motivation
- Related Works
- Our Methodology
- Experimental Results
- Conclusions
22
Futures and Conclusions
- Comprehensive scan timing optimization during late-stage IC implementation
  - Validation with realistic implementation flow (under guidance of our industrial colleagues)
  - Up to 82% hold buffer reduction and 58% improvement of DVD-induced scan timing degradation
- Future works
  - Co-optimization of gating insertion, scan ordering and test pattern generation
  - DVD optimization during capture stage
  - Predictive model to determine DVD hotspot during scan shift/capture
- Thanks to Samsung Electronics for research support!
23
THANK YOU !