Proximity Optimization for Adaptive Circuit Design Ang Lu, Hao He, and Jiang Hu
Introduction Proposed Techniques Experiment Result Conclusion Overview 2
Design Challenges Process Variations Device Aging Power 3
Adaptive Circuit Apply power according to actual chip variations More energy-efficient than the worst-case design 4
Sensors and Tuning Knobs in Adaptive Circuit Delay variation sensors – Critical path replica, used in IBM Power5 processor – Canary flip-flop, like in Razor Tuning knobs – Adaptive body bias – Adaptive supply voltage (voltage interpolation) Liang, et al., Micro 2009 5
Pros and Cons of Adaptive Circuit Cons: Potentially large area overhead Higher design complexity Pros: More energy-efficient than the worst case designs 6
Clustering for Adaptive Circuits Two extreme cases: Tune each cell individually? Too much overhead Tune entire circuit collectively? Less energy savings Achieve desired trade-off? Clustering 7
Introduction Proposed Techniques – Overall Flow – Time and Location Aware Cell Clustering – Clustering Driven Incremental Placement Experiment Result Conclusion Overview 8
Proposed Flow 9
Existing Clustering Methods Location: cluster cells based on spatial proximity – Manual partition for regular datapath Timing: cluster cells based on their timing characteristics – Kulkarni, et al., TCAD 2008 – Monte Carlo (MC) simulation – Optimize for each MC run 10
Need to Consider Both Timing & Spatial Proximity Both paths A & B are critical Bad Cluster: A1 & B2 (Similar timing characteristic) Good Cluster: A2(A1) & B2 (B1) 11
Clustering Algorithm Distance definition Location Timing Clustering algorithm 12
Timing Slack 13
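As a reminder of the quantity this slide refers to, the sketch below uses the standard static timing definition, slack = required time minus arrival time, on a small gate-level DAG. The function name `compute_slacks`, the gate names, the delays, and the clock period are hypothetical illustration values; the exact slack metric used in the clustering distance is not given in the slides.

```python
from collections import defaultdict

def compute_slacks(delays, edges, clock_period):
    """Slack per gate on a combinational DAG (hypothetical helper).

    delays: {gate: delay}; edges: list of (driver, load) pairs.
    """
    fanout, fanin = defaultdict(list), defaultdict(list)
    for u, v in edges:
        fanout[u].append(v)
        fanin[v].append(u)

    # Topological order (Kahn's algorithm).
    indeg = {g: len(fanin[g]) for g in delays}
    ready = [g for g in delays if indeg[g] == 0]
    order = []
    while ready:
        g = ready.pop()
        order.append(g)
        for v in fanout[g]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)

    # Forward pass: arrival time at each gate output.
    arrival = {}
    for g in order:
        arrival[g] = delays[g] + max((arrival[u] for u in fanin[g]), default=0)

    # Backward pass: required time at each gate output.
    required = {}
    for g in reversed(order):
        required[g] = min((required[v] - delays[v] for v in fanout[g]),
                          default=clock_period)

    return {g: required[g] - arrival[g] for g in delays}

# Toy netlist: a and b drive c; a also drives d.
toy_delays = {"a": 1, "b": 2, "c": 3, "d": 1}
toy_edges = [("a", "c"), ("b", "c"), ("a", "d")]
toy_slacks = compute_slacks(toy_delays, toy_edges, clock_period=6)
```

Gates with the smallest slack (here `b` and `c`) lie on the most critical path and are the natural candidates for tuning.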
Timing Sensitivity 14
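The distance definition itself did not survive on these slides, so the following is only a sketch under stated assumptions: the distance is taken to be a weighted sum of a location term, a slack term, and a sensitivity term, with weights named after the α, β, γ factors from the weighting-factor slide (which weight belongs to which term is an assumption). The clustering step is a simple k-medoids loop swapped in for illustration; it is not claimed to be the authors' algorithm. All names (`cell_distance`, `cluster_cells`) and cell values are hypothetical.

```python
def cell_distance(a, b, alpha=1.0, beta=1.0, gamma=1.0):
    """Assumed distance: weighted sum of location, slack, sensitivity gaps."""
    d_loc = abs(a["x"] - b["x"]) + abs(a["y"] - b["y"])  # Manhattan distance
    d_slack = abs(a["slack"] - b["slack"])
    d_sens = abs(a["sens"] - b["sens"])
    return alpha * d_loc + beta * d_slack + gamma * d_sens

def cluster_cells(cells, k, alpha=1.0, beta=1.0, gamma=1.0, max_iter=20):
    """Simple k-medoids (illustrative stand-in). Returns a label per cell."""
    n = len(cells)
    dist = lambda i, j: cell_distance(cells[i], cells[j], alpha, beta, gamma)

    # Deterministic farthest-point initialization of the medoids.
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(range(n),
                           key=lambda i: min(dist(i, m) for m in medoids)))

    labels = [0] * n
    for _ in range(max_iter):
        # Assign every cell to its nearest medoid.
        labels = [min(range(k), key=lambda c: dist(i, medoids[c]))
                  for i in range(n)]
        # Move each medoid to the member with the least total distance.
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid for an empty cluster
                new_medoids.append(medoids[c])
                continue
            new_medoids.append(min(members,
                                   key=lambda m: sum(dist(m, j) for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return labels

# Toy version of the two-path example: A1/B2 share timing, A1/A2 share location.
cells = [
    {"x": 0,  "y": 0,  "slack": 0.0, "sens": 1.0},  # A1
    {"x": 1,  "y": 0,  "slack": 0.2, "sens": 0.5},  # A2
    {"x": 50, "y": 50, "slack": 0.2, "sens": 0.5},  # B1
    {"x": 51, "y": 50, "slack": 0.0, "sens": 1.0},  # B2
]
timing_only = cluster_cells(cells, k=2, alpha=0.0)
combined = cluster_cells(cells, k=2, alpha=1.0)
```

With the location weight set to zero the sketch groups A1 with B2, the "bad cluster" case; with a nonzero location weight it groups A1 with A2 and B1 with B2.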
Correlations? Highly correlated cells are clustered together Spatial correlation is partially addressed by spatial proximity Structural correlation – Not every cell on a critical path needs to be tuned – Structurally correlated cells are rarely far apart 15
Introduction Proposed Techniques – Overall Flow – Time and Location Aware Cell Clustering – Clustering Driven Incremental Placement Experiment Result Conclusion Overview 16
Cluster Driven Incremental Placement 17
Min-Cost Network Flow Formulation Source nodes Sink nodes 18
Implementation The min-cost network flow problem is solved by the Edmonds-Karp algorithm Cells are moved heuristically for fractional flow solutions 19
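The network itself is not reproduced in the extracted slides, so the sketch below only illustrates the general idea under assumptions: cells that must move are modeled as source-side nodes, candidate slots in a cluster region as sink-side nodes, and edge cost as Manhattan move distance. The solver is a plain successive-shortest-paths routine with Bellman-Ford, named as such and not claimed to match the authors' implementation. `min_cost_flow`, `assign_cells_to_slots`, and all coordinates are hypothetical.

```python
def min_cost_flow(n, edges, source, sink):
    """Successive shortest paths with Bellman-Ford.

    edges: list of (u, v, capacity, cost). Returns (flow, cost, edge_flows).
    """
    graph = [[] for _ in range(n)]
    to, cap, cost = [], [], []
    for u, v, c, w in edges:  # forward edge at index 2i, reverse at 2i + 1
        graph[u].append(len(to)); to.append(v); cap.append(c); cost.append(w)
        graph[v].append(len(to)); to.append(u); cap.append(0); cost.append(-w)

    inf = float("inf")
    total_flow = total_cost = 0
    while True:
        dist = [inf] * n
        prev_edge = [-1] * n
        dist[source] = 0
        for _ in range(n - 1):  # Bellman-Ford relaxation rounds
            changed = False
            for u in range(n):
                if dist[u] == inf:
                    continue
                for e in graph[u]:
                    if cap[e] > 0 and dist[u] + cost[e] < dist[to[e]]:
                        dist[to[e]] = dist[u] + cost[e]
                        prev_edge[to[e]] = e
                        changed = True
            if not changed:
                break
        if dist[sink] == inf:
            break  # no augmenting path left

        # Find the bottleneck on the path, then push flow along it.
        push, v = inf, sink
        while v != source:
            e = prev_edge[v]
            push = min(push, cap[e])
            v = to[e ^ 1]
        v = sink
        while v != source:
            e = prev_edge[v]
            cap[e] -= push
            cap[e ^ 1] += push
            v = to[e ^ 1]
        total_flow += push
        total_cost += push * dist[sink]

    edge_flows = [cap[2 * i + 1] for i in range(len(edges))]
    return total_flow, total_cost, edge_flows

def assign_cells_to_slots(cells, slots):
    """cells: [(x, y)]; slots: [(x, y, capacity)]. Returns (cost, {cell: slot})."""
    nc, ns = len(cells), len(slots)
    S, T = 0, nc + ns + 1
    edges = [(S, 1 + i, 1, 0) for i in range(nc)]
    edges += [(1 + nc + j, T, slots[j][2], 0) for j in range(ns)]
    pair_of = {}
    for i, (cx, cy) in enumerate(cells):
        for j, (sx, sy, _) in enumerate(slots):
            pair_of[len(edges)] = (i, j)
            edges.append((1 + i, 1 + nc + j, 1, abs(cx - sx) + abs(cy - sy)))
    _, cost, flows = min_cost_flow(T + 1, edges, S, T)
    return cost, {i: j for e, (i, j) in pair_of.items() if flows[e] > 0}

move_cost, assignment = assign_cells_to_slots(
    cells=[(0, 0), (2, 0)], slots=[(1, 0, 1), (3, 0, 1)])
```

With unit capacities the flow is integral, so every cell gets exactly one slot. The fractional case mentioned on the slide, where a cell's flow splits across destinations, is not exercised by this sketch.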
Wirelength Overhead Control After incremental placement, wirelength increase is estimated If the increase > threshold, rerun clustering with increased weight on spatial proximity 20
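The feedback loop on this slide can be sketched as follows. Half-perimeter wirelength (HPWL) is assumed as the estimator, and the threshold, the multiplicative step, and the round limit are hypothetical parameters; the slide does not state any of them. `run_flow` is a placeholder callable standing for "cluster with spatial weight alpha, place incrementally, return the relative wirelength increase".

```python
def hpwl(nets, pos):
    """Half-perimeter wirelength: sum of bounding-box half-perimeters."""
    total = 0
    for net in nets:
        xs = [pos[c][0] for c in net]
        ys = [pos[c][1] for c in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

def wirelength_increase(nets, pos_before, pos_after):
    """Relative HPWL increase caused by the incremental placement."""
    base = hpwl(nets, pos_before)
    return (hpwl(nets, pos_after) - base) / base

def control_wirelength(run_flow, alpha=1.0, threshold=0.01, step=2.0,
                       max_rounds=8):
    """Rerun the flow with a larger spatial weight until the increase is
    within the threshold (or the round limit is reached)."""
    increase = run_flow(alpha)
    rounds = 1
    while increase > threshold and rounds < max_rounds:
        alpha *= step                 # put more weight on spatial proximity
        increase = run_flow(alpha)    # recluster and replace
        rounds += 1
    return alpha, increase

# HPWL on a toy netlist: moving cell b one unit to the right.
nets = [["a", "b"], ["b", "c"]]
before = {"a": (0, 0), "b": (2, 0), "c": (2, 3)}
after = {"a": (0, 0), "b": (3, 0), "c": (2, 3)}
toy_increase = wirelength_increase(nets, before, after)

# Toy stand-in for the real flow: overhead shrinks as alpha grows.
final_alpha, final_increase = control_wirelength(lambda a: 0.05 / a)
```

In this toy model the loop stops once the spatial weight has been doubled three times.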
Introduction Proposed Techniques Experiment Result Conclusion Overview 21
Experiment Setup Benchmark: – ICCAD 2014 Incremental Timing-Driven Placement Contest benchmark suites – 7 circuits, 130K to 960K cells Adaptive body bias is employed as the platform for adaptive circuit design The number of clusters is chosen empirically in the range from 10 to 25 22
Comparison and Methodology Methods that are compared: 1. Over-design 2. Location-driven clustering 3. Timing-driven clustering 4. Location- and timing-driven clustering (ours) Methodology: For each method, simulate multiple times with varying parameters and report the average results 23
Testcases and Placement Perturbations Columns: Circuit, # gates, Cells moved, Avg. cell move distance Rows: edit_dist %7 matrix_mult %8 vga_lcd %27 b %12 leon3mp %89 leon %50 netcard %90 24
Results from Forward Body Bias Only Our method achieves 99% timing yield, like the other methods 1/4 less power than over-design 1/3 less area overhead substantially less wire overhead 25
Results from Forward and Reverse Body Bias Our method achieves 99% timing yield similar power much less area overhead than location-only much less wire overhead than timing-only 26
Power/Area – Timing Tradeoff Circuit “mgc_matrix_mult” 27
Impact of Weighting Factors Circuit “mgc_matrix_mult” Columns: α, β, γ, # clusters, Adapt Power, ∆ Area, ∆ wire 28
Entire Flow Runtime 29
Conclusion Clustering and cluster-driven placement are proposed for adaptive circuit designs They reduce the area and power overhead of adaptive circuits and outperform previous methods They ensure that gates of the same cluster are located in a contiguous region Wirelength increase is <1%. 30
Thank you!