Gregory Shklover, Ben Emanuel. Intel Corporation, MATAM, Haifa 31015, Israel. Simultaneous Clock and Data Gate Sizing Algorithm with Common Global Objective
Outline: Introduction; Gate sizing by Lagrangian relaxation; Combined clock and data sizing; Clock sizing by dynamic programming; Algorithm analysis; Experimental results
Introduction. Given a gate-level circuit and a standard cell library, the goal of gate sizing is to find gate sizes that yield the best combination of total circuit power, performance, and area. Data gates implement the logical function of the block. Clock gates distribute a common synchronization signal to the different state elements in the circuit.
Separation reasons. Design methodology: data: best performance vs. power or area; clock: meet minimum skew and skew variability. Structure: data: DAG; clock: tree (dynamic programming). Problem classification: data: convex; clock: non-convex.
Contribution. Combine clock and data sizing decisions to solve a common global objective. Use a Dynamic Programming algorithm to optimally solve the clock-related part of the relaxed objective.
Gate sizing by Lagrangian relaxation. Minimizes the following objective:
Gate sizing by Lagrangian relaxation. Simplified formulation:
LR with skew optimization: speed up the clock arrival $a^{clk}$ at FF1 by up-sizing gate A; delay $a^{clk}$ at FF2 by down-sizing gate B.
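The effect of these two sizing moves can be illustrated with a small worked calculation. This is a minimal sketch, assuming the textbook setup-time constraint for a path launched at FF1 and captured at FF2; the function name `setup_slack` and all numbers are hypothetical and not taken from the slides.

```python
def setup_slack(period, a_clk_launch, a_clk_capture, data_delay, setup_time=0.0):
    """Setup slack of a path between two flip-flops.

    Assumes the textbook constraint (not stated on the slide):
        a_clk_launch + data_delay + setup_time <= period + a_clk_capture
    """
    return (period + a_clk_capture) - (a_clk_launch + data_delay + setup_time)


# Hypothetical numbers in picoseconds.
period, data_delay = 500.0, 520.0

# Balanced clock: both flip-flops see the clock edge at 100 ps.
before = setup_slack(period, a_clk_launch=100.0, a_clk_capture=100.0,
                     data_delay=data_delay)

# Up-size gate A (clock reaches FF1 15 ps earlier) and down-size gate B
# (clock reaches FF2 10 ps later): the path gains 25 ps of slack.
after = setup_slack(period, a_clk_launch=85.0, a_clk_capture=110.0,
                    data_delay=data_delay)

print(before, after)
```

In this sketch the data path is left untouched; the slack is recovered purely by shifting the two clock arrival times in opposite directions.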
Combined clock and data sizing. Consider the set of clock gates. Integrating these into the objective and expanding each clock arrival time $a_s^{clk}$ as a sum of delays from the root of the clock tree to the corresponding leaf
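The expansion of a clock arrival time into a sum of gate delays along the root-to-leaf path can be sketched as follows. The tree layout, the node names, and the delay values are hypothetical, and wire delay is ignored for simplicity.

```python
def clock_arrivals(parent, delay, leaves):
    """Arrival time at each leaf = sum of gate delays on its root-to-leaf path."""
    arrivals = {}
    for leaf in leaves:
        total, node = 0, leaf
        while node is not None:
            # Sinks (flip-flops) carry no gate delay of their own here.
            total += delay.get(node, 0)
            node = parent[node]
        arrivals[leaf] = total
    return arrivals


# Hypothetical clock tree: root drives gates A and B, which drive FF1 and FF2.
parent = {"root": None, "A": "root", "B": "root", "FF1": "A", "FF2": "B"}
delay = {"root": 30, "A": 20, "B": 25}  # gate delays in picoseconds

arrivals = clock_arrivals(parent, delay, ["FF1", "FF2"])
print(arrivals)
```

Every gate delay on a shared path segment (here, `root`) contributes to the arrival time of all leaves below it, which is why the per-leaf terms can later be regrouped per gate.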
Combined clock and data sizing. Under a convex gate delay model, formulation (6) is not convex, since some of the multiplier aggregations are negative. Applying sub-gradient descent techniques to optimize this part of the objective would hence be inappropriate. Instead, we propose using a Dynamic Programming (DP) algorithm, which performs a systematic search over the solution space and is thus immune to the non-convexity of the problem.
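How a negative multiplier aggregation can arise is easiest to see on a toy tree. The sketch below assumes that each clock sink carries one net multiplier (positive or negative) on its arrival time and that a gate's aggregated multiplier is the sum over the sinks below it; this sign convention, the function name, and the numbers are illustrative assumptions, since formulation (6) itself is not reproduced here.

```python
def aggregate_multipliers(parent, leaf_multiplier):
    """Sum each leaf's multiplier into every clock gate above it."""
    agg = {}
    for leaf, m in leaf_multiplier.items():
        node = parent[leaf]
        while node is not None:
            agg[node] = agg.get(node, 0) + m
            node = parent[node]
    return agg


# Hypothetical tree and multipliers (assumed sign convention, see lead-in).
parent = {"root": None, "A": "root", "B": "root", "FF1": "A", "FF2": "B"}
leaf_multiplier = {"FF1": 3, "FF2": -2}
gate_delay = {"root": 30, "A": 20, "B": 25}

agg = aggregate_multipliers(parent, leaf_multiplier)

# The per-leaf and per-gate groupings give the same weighted sum.
arrival = {"FF1": gate_delay["root"] + gate_delay["A"],
           "FF2": gate_delay["root"] + gate_delay["B"]}
per_leaf = sum(m * arrival[s] for s, m in leaf_multiplier.items())
per_gate = sum(w * gate_delay[g] for g, w in agg.items())
print(agg, per_leaf, per_gate)
```

Gate B ends up with a negative weight, so the term for B is a negative multiple of a convex delay function, which is concave. That is the situation in which a descent method on the summed objective loses its convexity guarantee.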
Combined clock and data sizing
Clock sizing by dynamic programming. The DP algorithm must find the clock gate sizes that minimize the following objective:
Clock sizing by dynamic programming
Dynamic programming. Set of solutions per tree node n: c is the associated downstream capacitance; obj is the corresponding objective value. Pruning criterion
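A per-node solution set of (c, obj) pairs with pruning could look like the sketch below. The pruning criterion from the slide is not recoverable, so this uses the classic dominance rule as a stand-in assumption: a solution is dropped if another one has both lower-or-equal capacitance and lower-or-equal objective. With negative multipliers a larger load can actually reduce the objective upstream, so the authors' criterion may well differ.

```python
def prune(solutions):
    """Keep only non-dominated (c, obj) pairs.

    Assumed dominance rule (not from the slides): lower capacitance and
    lower objective are both preferred.
    """
    kept = []
    # Sorting by capacitance, then objective, means each candidate only has
    # to beat the best objective seen so far at a lower or equal capacitance.
    for c, obj in sorted(solutions):
        if not kept or obj < kept[-1][1]:
            kept.append((c, obj))
    return kept


candidates = [(4, 10), (2, 12), (3, 15), (5, 9), (4, 11)]
pruned = prune(candidates)
print(pruned)
```

The surviving set is sorted by increasing capacitance and strictly decreasing objective, so each remaining entry represents a genuine trade-off.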
Dynamic programming Leaf nodes: Solution merge: Gate sizing:
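The three steps named on this slide can be sketched as a bottom-up pass over a small tree. Everything concrete here is a hypothetical stand-in, because the slide's formulas are not available: the three-size library, the delay model (drive resistance times load), the per-gate weights, and the power cost are assumptions. Only solutions with an identical capacitance are collapsed, which stays exact even when a gate weight is negative.

```python
from itertools import product

# Hypothetical library: size -> (input cap, drive resistance, power cost).
LIBRARY = {"x1": (1, 4, 1), "x2": (2, 2, 2), "x4": (4, 1, 4)}


def best_per_cap(solutions):
    """Keep the lowest objective for each distinct capacitance."""
    best = {}
    for c, obj, sizes in solutions:
        if c not in best or obj < best[c][1]:
            best[c] = (c, obj, sizes)
    return list(best.values())


def leaf(sink_cap):
    """Leaf node: a single solution, the sink's capacitance at zero cost."""
    return [(sink_cap, 0, {})]


def merge(child_solution_sets):
    """Solution merge: combine one solution per child, summing cap and obj."""
    merged = []
    for combo in product(*child_solution_sets):
        sizes = {}
        for s in combo:
            sizes.update(s[2])
        merged.append((sum(s[0] for s in combo), sum(s[1] for s in combo), sizes))
    return best_per_cap(merged)


def size_gate(name, weight, downstream):
    """Gate sizing: try every size against every downstream solution."""
    out = []
    for size, (cin, res, power) in LIBRARY.items():
        for c, obj, sizes in downstream:
            delay = res * c  # assumed linear delay model
            out.append((cin, obj + weight * delay + power, {**sizes, name: size}))
    return best_per_cap(out)


# Root drives gates A and B; each of them drives a sink of capacitance 2.
a = size_gate("A", 3, leaf(2))
b = size_gate("B", -2, leaf(2))   # negative weight: more delay is rewarded
root = size_gate("root", 1, merge([a, b]))
best = min(root, key=lambda s: s[1])
print(best)
```

In this toy instance the gate with the positive weight is up-sized and the gate with the negative weight is kept at its smallest size, matching the up-size/down-size behaviour described for skew optimization.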
Additional considerations: side load effect; approximation + convergence; input slews.
Algorithm analysis. Complexity: k-sampling; the complexity for the DP algorithm: Convergence: cooling concept from simulated annealing. Optimality: global optimality is not theoretically guaranteed.
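One plausible reading of k-sampling is that each tree node keeps at most k solutions, which bounds the size of every solution set. The sketch below assumes exactly that, picking k entries spread evenly over the capacitance-sorted list; the selection rule and the function name are assumptions, not the authors' definition.

```python
def k_sample(solutions, k):
    """Keep at most k (c, obj) solutions, spread evenly by capacitance rank."""
    ordered = sorted(solutions)
    n = len(ordered)
    if n <= k:
        return ordered
    if k == 1:
        # With a single slot, keep the solution with the best objective.
        return [min(ordered, key=lambda s: s[1])]
    # Integer arithmetic keeps the first and last entries and avoids
    # duplicate indices, since (n - 1) / (k - 1) is greater than 1 here.
    return [ordered[i * (n - 1) // (k - 1)] for i in range(k)]


solutions = [(c, 20 - c) for c in range(1, 10)]  # caps 1..9, hypothetical objs
sampled = k_sample(solutions, 3)
print(sampled)
```

Sampling of this kind trades exactness for a fixed bound on the work per node: a discarded solution might have been part of the best overall assignment.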
Experimental results
Summary. Simultaneous clock and data gate sizing optimization; applicable to wire sizing and buffer insertion. The method could probably be extended to handle simultaneous gate sizing and clock tree synthesis.