Reducing Clock Skew Variability via Cross Links
LokWon Kim, UCLA EE Ph.D. student
Based on Anand Rajaram, Jiang Hu, and Rabi Mahapatra†
Dept. of Electrical Engineering, †Dept. of Computer Science, Texas A&M University
Today’s topic is reducing clock skew variability via cross links.
Presentation Outline
Introduction
Traditional approaches to skew variability reduction
The proposed method
Analysis of the method
The optimized node selection algorithm
Experimental results
Conclusion
Clock Distribution Network
Signal transfer is coordinated by the clock signal
All registers are supplied with the clock signal by the clock distribution network
Skew = d1 - d2
Zero skew: d1 = d2
Useful skew: d1 - d2 = δ12
[Figure: registers 1 and 2 connected through logic with maximum delay Dmax and clock period T; one register launches signals and the other catches them, with clock network delays d1 and d2]
In a digital system, data transfers are coordinated by the clock signal. Because clock skew is an uncertainty, it has to be budgeted on top of the circuit propagation delay, which makes the system slower. For a high-speed system, clock skew should therefore be minimized during clock routing.
Clock Tree Synthesis in Synchronous Circuits
Clock signals synchronize data transfer between functional elements in a synchronous design
Clock skew becomes one of the most significant concerns in clock tree synthesis for high performance designs
[Figure: chip block diagram with PLL, memory controller, system, display, audio, and video blocks, and a plot of clock skew vs. clock frequency. Source: Intel]
The diagram on the right shows clock skew vs. clock frequency. The main observation is that as the frequency rises, the skew takes up a larger share of the clock period. As a result, clock skew becomes the number one concern in clock tree synthesis for high performance designs.
Clock Distribution Networks: Important Considerations & Objectives
One of the biggest and most frequently switching nets
Very sensitive to unwanted skew introduced by manufacturing variations, power supply noise, and temperature variations
Low skew variation is a must for proper operation of the chip
Minimizing clock routing wire-length can reduce power consumption and power/ground noise
The clock is one of the biggest and most frequently switching networks, so clock distribution consumes a lot of power just to deliver the clock. Some papers report that, for some processors, the clock network alone consumes about 30% of the total dynamic power. Clock skew is also vulnerable to variation sources at design time, manufacturing time, and run time. The paper therefore provides a method that lowers skew variation by adding a small amount of wire to a conventional clock tree.
Sources of the Unwanted Skew Variations
Process variations
  Gate variations: gate length variation, Tox variation
  Interconnect variations: significantly affect delay and skew [Liu, et al., DAC’00]
  Load capacitance variations
Power supply noise
Temperature variations
[Figure: gate variations (tox, gate length) and interconnect variations (width)]
During semiconductor processing, a chip picks up variations in its gates, interconnect, and load capacitances, and these affect clock skew. In addition, power supply noise and spatial temperature variations affect clock skew at run time. For high-speed and reliable operation, the paper provides an effective and intuitive method to reduce such skew variation.
Non-tree: Spine [Kurd et al., JSSC’01]
Applied in Intel Pentium processor design
Variations between spines still exist
[Figure: spines driving clock sinks or local sub-networks]
This is one existing way to reduce skew variation. It is locally effective because each spine provides low variation. However, one limitation of this scheme is that variations between spines still exist.
Non-tree: Mesh
Top level mesh [Su et al., ICCAD’01]: less wire, less effective
Leaf level mesh [Restle et al., JSSC’01]: very effective, huge wire; applied in IBM microprocessor
[Figure: a top level mesh and a leaf level mesh, each driving clock sinks or local sub-networks]
The first structure is a top level mesh clock network. The clock source drives a coarse mesh directly, and clock subtrees are attached to the mesh. The clock skew variations on the mesh are negligible, but skew variations within each subtree still exist, so this scheme is less effective.
The second structure is a leaf level mesh clock network. In this approach, a metal wire mesh is overlaid on the entire chip area and driven at multiple points directly from the clock source. Each clock sink is connected to the nearest point on the mesh. This is very effective at suppressing skew variations, but it consumes enormous wire resources and power. Its application is mainly restricted to high-end products like microprocessors.
The proposed method for reducing skew variability
Alternative View on Non-tree
Non-tree = tree + links
Link = link_capacitors + link_resistor
[Figure: a link between nodes u and w, modeled as a resistor Rl between the two nodes and a capacitor C/2 at each of them; i is another node of the network]
The paper proposes a non-tree clock network scheme for reducing skew variation. It consists of a conventional clock tree plus cross links, and each link is modeled as link capacitors and a link resistor.
Mathematical analysis of the link insertion method
The Elmore Delay in an RC network
The Elmore delay at node i is $t_i = \sum_j R_{ij} C_j$.
Cj is the ground capacitance at node j.
Rij is equal to the voltage at node i when a 1 A current is injected into node j and all other node capacitors are zero.
[Figure: RC network with nodes i, u, and w and a 1 A current source; Ri,w = voltage at w]
This is the familiar Elmore delay model for an RC network.
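A minimal sketch of the Elmore sum on a tree, assuming the standard RC-tree property that Rij equals the resistance of the part of the source-to-i path that is shared with the source-to-j path. The five-node tree, its element values, and the function names are hypothetical and chosen only for illustration.

```python
# Hypothetical RC tree (illustrative values, not from the slides).
# parent[n] = (parent node, resistance in ohms of the wire from parent to n)
parent = {
    "a": ("src", 10.0),
    "u": ("a", 20.0),
    "w": ("a", 25.0),
    "b": ("src", 15.0),
    "i": ("b", 30.0),
}
# Ground capacitance C_j at each node, in pF (ohm * pF = ps).
cap = {"a": 1.0, "u": 2.0, "w": 2.0, "b": 1.0, "i": 3.0}


def path_edges(node):
    """Edges on the path from the source to `node`, each named by its child node."""
    edges = set()
    while node in parent:
        edges.add(node)
        node = parent[node][0]
    return edges


def transfer_resistance(i, j):
    """R_ij for a tree: resistance shared by the source->i and source->j paths."""
    shared = path_edges(i) & path_edges(j)
    return sum(parent[e][1] for e in shared)


def elmore_delay(i):
    """t_i = sum over j of R_ij * C_j."""
    return sum(transfer_resistance(i, j) * c for j, c in cap.items())


delays = {n: elmore_delay(n) for n in cap}
skew_uw = delays["u"] - delays["w"]
print(delays, skew_uw)
```

With these values the sinks u and w see different delays, so the pair has a nonzero nominal skew that a cross link between them would act on.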
Effect of Link Capacitors on Delay
Considering the effect of link capacitance only:
[Equation: Elmore delay from the source to any sink i with the link capacitors added]
C: total link capacitance
[Figure: capacitors C/2 added at nodes u and w; i is a sink]
Adding link capacitors does not change the network topology, so their effect can be estimated easily.
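As a sketch of this step, assuming the total link capacitance C is split equally between the endpoints u and w (as the C/2 labels suggest), the Elmore sum simply gains two terms. The symbol $\tilde{t}_i$ for the delay with link capacitors is notation introduced here, not taken from the slides.

```latex
\begin{align*}
\tilde{t}_i &= \sum_j R_{ij} C_j + \frac{C}{2} R_{iu} + \frac{C}{2} R_{iw} \\
            &= t_i + \frac{C}{2}\left(R_{iu} + R_{iw}\right)
\end{align*}
```

The resistances $R_{ij}$ are those of the original tree, which is why no new network analysis is needed for this step.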
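Effect of Link Resistor on Delay [Chan & Karplus, TCAD’90]
[Figure: link resistor Rl added between nodes u and w of the original circuit, which carry capacitances Cu and Cw]
[Equation: the new delay after resistor addition, expressed in terms of the delays before resistor addition and the delays evaluated with Cu = +1 and Cw = -1]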
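The update can be checked numerically. The sketch below assumes it takes the rank-one form new_t[i] = t[i] - (t[u] - t[w]) / (Rl + r[u] - r[w]) * r[i], where r[i] is the delay at node i evaluated with Cu = +1, Cw = -1 and all other capacitors zero. This form follows from the Sherman-Morrison identity and is reconstructed here rather than copied from the slide. The tree, its values, and the helper names are hypothetical, and link capacitors are left out to isolate the resistor.

```python
# Hypothetical RC tree (illustrative values only). "src" is the clock source.
nodes = ["a", "u", "w", "b", "i"]
idx = {name: k for k, name in enumerate(nodes)}
wires = [("src", "a", 10.0), ("a", "u", 20.0), ("a", "w", 25.0),
         ("src", "b", 15.0), ("b", "i", 30.0)]   # (node, node, ohms)
cap = [1.0, 2.0, 2.0, 1.0, 3.0]                   # ground caps, same order as nodes


def conductance(edges):
    """Nodal conductance matrix with the source node tied to ground."""
    n = len(nodes)
    G = [[0.0] * n for _ in range(n)]
    for p, q, ohms in edges:
        g = 1.0 / ohms
        for x in (p, q):
            if x != "src":
                G[idx[x]][idx[x]] += g
        if p != "src" and q != "src":
            G[idx[p]][idx[q]] -= g
            G[idx[q]][idx[p]] -= g
    return G


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[k]] for k, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda k: abs(M[k][col]))
        M[col], M[piv] = M[piv], M[col]
        for row in range(n):
            if row != col:
                f = M[row][col] / M[col][col]
                M[row] = [x - f * y for x, y in zip(M[row], M[col])]
    return [M[k][n] / M[k][k] for k in range(n)]


def elmore(edges, caps):
    """First-moment (Elmore) delays of a general RC network: solve G t = C."""
    return solve(conductance(edges), caps)


R_LINK = 5.0                       # hypothetical link resistance in ohms
u, w = idx["u"], idx["w"]

t = elmore(wires, cap)             # delays before resistor addition
unit = [0.0] * len(nodes)
unit[u], unit[w] = 1.0, -1.0
r = elmore(wires, unit)            # delays evaluated with Cu = +1, Cw = -1

scale = (t[u] - t[w]) / (R_LINK + r[u] - r[w])
t_formula = [ti - scale * ri for ti, ri in zip(t, r)]

# Reference: re-solve the network with the link resistor actually inserted.
t_exact = elmore(wires + [("u", "w", R_LINK)], cap)
print(t_formula, t_exact)
```

Both routes give the same delays for this network, and only one extra solve of the original tree (for r) is needed, which is the practical appeal of the update.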
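Skew Between Link Endpoints
[Equation: skew between u and w after link insertion, written as the original skew combined with the effect of the link resistor and capacitor on skew]
ru - rw > 0 always.
If the nominal skew is zero, then the skew between u and w can be treated as skew variation, and the link resistor always reduces skew variation.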
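A hedged reconstruction of the derivation behind this slide, assuming the link resistor update has the rank-one form $\hat{t}_i = \tilde{t}_i - \frac{\tilde{t}_u - \tilde{t}_w}{R_l + r_u - r_w} r_i$. The notation is introduced here: $\tilde{t}_i$ is the delay with link capacitors only, $\hat{t}_i$ the delay with the link resistor as well, $q$ the corresponding skews, and $r_i = R_{iu} - R_{iw}$ the delay evaluated with $C_u = +1$, $C_w = -1$.

```latex
\begin{align*}
\tilde{q}_{u,w} &= \tilde{t}_u - \tilde{t}_w
                 = q_{u,w} + \frac{C}{2}\left(R_{uu} - R_{ww}\right) \\
\hat{q}_{u,w}   &= \hat{t}_u - \hat{t}_w
                 = \tilde{q}_{u,w} - \frac{\tilde{q}_{u,w}\,(r_u - r_w)}{R_l + r_u - r_w}
                 = \frac{R_l}{R_l + r_u - r_w}\,\tilde{q}_{u,w}
\end{align*}
```

Since $r_u - r_w > 0$ and $R_l > 0$, the factor lies strictly between 0 and 1, so the magnitude of the skew between the link endpoints shrinks.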
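Effect of Link Position
For a network with disjoint loops:
[Figure: link resistance Rlink between nodes u and w, closing a loop of resistance Rloop through the tree]
For a given Rlink, the value of Rlink / (Rlink + Rloop) becomes smaller when the link is closer to the leaf nodes.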
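A small numeric illustration, assuming the quantity on this slide is the factor Rlink / (Rlink + Rloop) that scales the skew between the link endpoints, with Rloop taken as the tree-path resistance between u and w. The resistance values and the function name are hypothetical.

```python
def skew_scaling(r_link, r_loop):
    """Factor by which a link scales the skew between its two endpoints."""
    return r_link / (r_link + r_loop)


R_LINK = 5.0  # hypothetical link resistance in ohms

# Endpoints just below their common ancestor: short tree path, small Rloop.
near_root = skew_scaling(R_LINK, r_loop=10.0)
# Endpoints near the leaves: long tree path up and down, large Rloop.
near_leaf = skew_scaling(R_LINK, r_loop=95.0)

print(near_root, near_leaf)
```

Under these assumptions the same link resistance leaves about a third of the skew near the root but only about a twentieth near the leaves.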
Skew Variability Between Any Nodes
[Figure: tree with link endpoints u and w, their nearest common ancestor P, and nodes g and h]
P is the nearest common ancestor for nodes u and w
Tx: sub-tree rooted at x
Skew variation between node i and node j:
Scenario 1: i ∈ Tg, j ∈ Th: variation smaller
Scenario 2: i and j ∈ Tg (or Th): variation may be worse
Scenario 3: i ∈ Tp, j ∉ Tp: variation may be worse
A Simple Example
[Figure: clock tree with five candidate link pairs labeled 1-1’ through 5-5’]
The link pair number gives the rough order of effectiveness in reducing skew variations.
To reduce the skew between two points, this approach shorts the two points with a cross link. It is a very intuitive way to suppress the variations. Inserting a large number of links in the network gives effective results with only a small amount of wire resource. This is the idea the paper introduces.
Optimized node selection algorithm based on the analysis
General Flow of Non-tree Clock Routing
Obtain initial clock tree
Find node pairs for link insertion
Add link capacitances to selected nodes
Tune merging node location to restore original skew
Insert link resistance to selected node pairs
Guidelines for Node Pair Selection for Link Insertion
Select nodes which are hierarchically far apart
Select nodes physically close to each other
Select nodes with equal nominal delay
Select nodes closer to leaf nodes
To maximize the effect of link insertion, node selection should follow these guidelines.
Rule Based Node Pair Selection
α-rule: the lower the α, the better the link
β-rule: the lower the β, the less tuning is required
γ-rule: the depth of the nearest common ancestor from the root is < γmax
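A minimal sketch of the γ-rule only, since α and β are not defined on the slide. It keeps candidate pairs whose nearest common ancestor sits at a depth below γmax, which matches the guideline of picking nodes that are hierarchically far apart. The tree, the candidate pairs, the threshold, and all function names are hypothetical.

```python
# Hypothetical clock tree given as child -> parent; "root" is the clock source.
parent = {
    "p1": "root", "p2": "root",
    "s1": "p1", "s2": "p1",
    "s3": "p2", "s4": "p2",
}


def ancestors(node):
    """Chain [node, parent, ..., root], nearest first."""
    chain = [node]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


def depth(node):
    """Number of edges between the root and `node`."""
    return len(ancestors(node)) - 1


def nearest_common_ancestor(u, w):
    """First node on w's ancestor chain that is also an ancestor of u."""
    seen = set(ancestors(u))
    for a in ancestors(w):
        if a in seen:
            return a
    raise ValueError("nodes are not in the same tree")


def passes_gamma_rule(u, w, gamma_max):
    """Gamma-rule: depth of the nearest common ancestor is below gamma_max."""
    return depth(nearest_common_ancestor(u, w)) < gamma_max


candidates = [("s1", "s2"), ("s2", "s3"), ("s1", "s4")]
selected = [pair for pair in candidates if passes_gamma_rule(*pair, gamma_max=1)]
print(selected)
```

With gamma_max=1 only pairs that meet at the root survive, so the sibling pair under p1 is rejected.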
Experimental results validating the proposed method
Experimental Result on Skew Variability
Benchmark      r1    r2    r3    r4     r5
No. of sinks   267   598   862   1903   3100
HSPICE Validation
Benchmark      r1    r2    r3    r4     r5
No. of sinks   267   598   862   1903   3100
Experimental Result on Wire-length
Cost vs. Benefit Analysis of Link Addition
The law of diminishing returns holds!
Conclusions
An effective link insertion method has been proposed
Significant skew variability reduction with limited wire-length increase
The proposed methodology is independent of the nature of the variability effects