Improved Algorithms for Link-Based Non-tree Clock Network for Skew Variability Reduction Anand Rajaram †‡ David Z. Pan † Jiang Hu * † Dept. of ECE, UT-Austin.


1 Improved Algorithms for Link-Based Non-tree Clock Network for Skew Variability Reduction
Anand Rajaram †‡, David Z. Pan †, Jiang Hu *
† Dept. of ECE, UT-Austin
‡ Texas Instruments, Dallas
* Dept. of EE, TAMU

2 Outline
• Introduction
• Review of link-based non-tree clock network
• Improved algorithms (over [Rajaram et al., DAC ’04])
  › Rule based algorithm (δ Rule)
  › Graph theoretical approach (MST-based)
• Experimental results
• Conclusions

3 Clock Distribution Network
[Figure: registers 1 and 2 fed by the clock network with clock delays d1 and d2; labels: launch signals, catch signals, D max, T]
• Signal transfer is coordinated by the clock signal
• All registers are supplied with the clock signal by the clock distribution network
• Skew = d1 - d2
• Zero skew: d1 = d2
• Useful skew: d1 - d2 = δ12

4 Clocks: Important Considerations & Objectives
• One of the biggest and most frequently switching nets
• Very sensitive to unwanted skew introduced by PVT
  › Manufacturing process variations (P)
  › Power supply voltage noise (V)
  › Temperature variations (T)
• Less clock skew variation is a “MUST” for nanometer VLSI designs
• Minimizing clock routing wire-length can
  › Reduce power consumption

5 Approaches for Reducing Skew Variability
• Buffer & wire sizing [Pullela et al., DAC ’93; Chung et al., ICCAD ’94; Wang et al., ISPD ’04]
• Variation aware routing [Lin et al., ICCAD ’94; Lu et al., ISPD ’03]
• Non-tree clock networks
  › McCoy et al., ETC ’94; Vandenberghe et al., ICCAD ’97; Xue et al., ICCAD ’95
  › Link based non-tree clock networks [Rajaram et al., DAC ’04]

6 Non-tree: 1-D Spine [Kurd et al., JSSC ’01]
• 1-D spine
• Applied in Intel Pentium processor design
• Variations between spines still exist
[Figure: spines driving clock sinks or local sub-networks]

7 Non-tree: 2-D Mesh
• Top level mesh [Su et al., ICCAD ’01]
• Less wire, less effective
• Leaf level mesh [Restle et al., JSSC ’01]
• Very effective, huge wire
• Applied in IBM microprocessors
[Figure: mesh driving clock sinks or local sub-networks]

8 Linked Non-tree = Tree + Links [Rajaram et al., DAC ’04]
• Non-tree = tree + links
• How to select link pairs is the key!
• Link = link_capacitors + link_resistor
[Figure: link between nodes u and w, modeled as a resistor R_l with a capacitor C/2 at each endpoint]
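As a rough illustration of the decomposition on this slide, the sketch below models a link wire as one resistor with half of the wire capacitance lumped at each endpoint. The function name and the per-unit-length resistance and capacitance parameters are assumptions made for illustration; the slide itself only states that a link contributes a resistor R_l and capacitors of C/2.

```python
def link_pi_model(length, r_per_unit, c_per_unit):
    """Return (R_l, C_u, C_w) for a link wire of the given length.

    Assumption: wire resistance and capacitance scale linearly with length,
    and the capacitance is split evenly between the two endpoints.
    """
    r_link = r_per_unit * length      # link resistor R_l
    c_total = c_per_unit * length     # total wire capacitance C
    return r_link, c_total / 2, c_total / 2


# Example: a 100-unit link with 0.5 ohm/unit and 0.25 cap-units/unit.
r_l, c_u, c_w = link_pi_model(100, 0.5, 0.25)
```

The two capacitors are what step 3 of the non-tree flow would add to the selected nodes, and the resistor is what gets inserted last.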

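9 Skew Between Link Endpoints
• New skew with link (u, w): [equation lost in extraction; it involves R_link and R_loop]
• Value of [expression lost in extraction] becomes smaller when the link is closer to the leaf nodes, for a given R_link
[Figure: nodes u and w joined by R_link; R_loop along the tree path between them]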
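The equation on this slide did not survive transcription. A form consistent with the surviving symbols, and with the cross-link analysis in [Rajaram et al., DAC ’04] as best it can be recalled, is given below as an assumption rather than a quotation. Here q_{u,w} denotes the skew between u and w before the link is added and \hat{q}_{u,w} the skew afterwards; both symbols are introduced here for illustration, and R_loop is taken to be the resistance of the tree path between u and w.

```latex
\hat{q}_{u,w} \;=\; \frac{R_{link}}{R_{link} + R_{loop}}\; q_{u,w}
```

Under this form the scaling factor is below 1 whenever R_loop > 0 and shrinks as R_loop grows, which agrees with the remark that, for a given R_link, links placed closer to the leaf nodes reduce the skew more.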

10 Skew Between any Two Nodes (i, j) with Link (u, w)
Skew variation between any node pair (i, j):
• Scenario 1: i ∈ T_g, j ∈ T_h => always smaller
• Scenario 2: i and j ∈ T_g (or T_h) => could be worse
• Scenario 3: i ∈ T_p, j ∉ T_p => could be much worse
• Key idea: try to avoid Scenarios 3 and 2 for link insertion
P: nearest common ancestor of u and w
T_x: sub-tree rooted at x
[Figure: tree with nodes P, g, h and link endpoints u, w]
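The three scenarios can be made concrete with a small classifier. This is a sketch under assumptions: the tree is given as a hypothetical `parent` dictionary, g and h are taken to be the children of P on the paths toward u and w (the figure suggests this, the text does not say it), and u and w lie in different sub-trees of P. Pairs that fit none of the three scenarios return `None`.

```python
def path_to_root(parent, node):
    """Nodes from `node` up to the root, inclusive."""
    path = [node]
    while parent[node] is not None:
        node = parent[node]
        path.append(node)
    return path


def classify_pair(parent, u, w, i, j):
    """Return 1, 2 or 3 for the slide's scenarios, or None if none applies."""
    path_u, path_w = path_to_root(parent, u), path_to_root(parent, w)
    on_w_path = set(path_w)
    p = next(n for n in path_u if n in on_w_path)   # nearest common ancestor
    g = path_u[path_u.index(p) - 1]                 # child of P toward u
    h = path_w[path_w.index(p) - 1]                 # child of P toward w

    def region(x):
        ancestors = path_to_root(parent, x)
        if g in ancestors:
            return "g"
        if h in ancestors:
            return "h"
        return "p" if p in ancestors else "outside"

    ri, rj = region(i), region(j)
    if {ri, rj} == {"g", "h"}:
        return 1                                    # one node in T_g, one in T_h
    if ri == rj and ri in ("g", "h"):
        return 2                                    # both in the same sub-tree
    if (ri == "outside") != (rj == "outside"):
        return 3                                    # exactly one node inside T_p
    return None


# Hypothetical tree: r -> {p, q}, p -> {g, h}, g -> {u, a}, h -> {w, b}, q -> {c, d}
parent = {"r": None, "p": "r", "q": "r", "g": "p", "h": "p",
          "u": "g", "a": "g", "w": "h", "b": "h", "c": "q", "d": "q"}
```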

11 Rule Based Algorithms [Rajaram et al., DAC ’04]
• α-rule: the lower the α, the better the link
• β-rule: the lower the β, the less tuning required
• γ-rule: the nearest common ancestor's depth from the root is < γ_max
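Of the three rules, the γ-rule is stated precisely enough on the slide to sketch directly. The example below assumes the clock tree is available as a hypothetical `parent` dictionary and that the root has depth 0; the α and β quantities are not defined in this transcript, so they are not modeled.

```python
def path_to_root(parent, node):
    """Nodes from `node` up to the root, inclusive."""
    path = [node]
    while parent[node] is not None:
        node = parent[node]
        path.append(node)
    return path


def nearest_common_ancestor(parent, u, w):
    on_w_path = set(path_to_root(parent, w))
    return next(n for n in path_to_root(parent, u) if n in on_w_path)


def passes_gamma_rule(parent, u, w, gamma_max):
    """True if the NCA of (u, w) is shallower than gamma_max (root depth = 0)."""
    nca = nearest_common_ancestor(parent, u, w)
    depth = len(path_to_root(parent, nca)) - 1
    return depth < gamma_max


# Hypothetical tree: root -> {L, R}, L -> {a, b}, R -> {c}
parent = {"root": None, "L": "root", "R": "root", "a": "L", "b": "L", "c": "R"}
```

With `gamma_max = 1`, only pairs whose nearest common ancestor is the root pass, which favors node pairs that are hierarchically far apart.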

12 Guidelines for Node Pair Selection for Link Insertion
• Select nodes which are hierarchically far apart
• Select nodes physically close to each other
• Select nodes with equal nominal delay
• Select nodes closer to leaf nodes
• For zero skew routing, only select leaf nodes

13 Rule Based Algorithms [Rajaram et al., DAC ’04]
• Merits
  › Physical characteristics of the links are considered, so bad links are avoided
  › Independent of the balanced nature of the clock structure
  › Efficient run time
• Demerits
  › No control over the distribution of links
  › Possibility of links getting added in the same region
• Solution
  › δ-rule: no two links should have the same pair of ancestors at depth δ from the clock source
  › Retains the merits of the previous rules and addresses the demerit
[Figure: sub-trees A, B, C, D shown twice; label: using δ = 2]

14 δ Rule – An Example
• δ is the node level from the clock source
[Figure: sub-trees A, B, C, D; labels: "Crowding of links. Subtrees A and D not linked!" and "Using δ = 2"]
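One way to read the δ-rule is as a filter over candidate links: each link is keyed by the pair of ancestors its endpoints have at depth δ, and only the first link per key is kept. The sketch below follows that reading. The `parent` dictionary, the assumption that candidates arrive in preference order, and the function names are all illustrative and not taken from the slides.

```python
def ancestor_at_depth(parent, node, delta):
    """Ancestor of `node` at depth `delta` (root depth = 0), or None if too shallow."""
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    path.reverse()                      # root first: path[d] sits at depth d
    return path[delta] if delta < len(path) else None


def apply_delta_rule(parent, links, delta):
    """Keep at most one link per unordered pair of depth-delta ancestors."""
    seen, kept = set(), []
    for u, w in links:                  # assumed sorted by preference
        key = frozenset((ancestor_at_depth(parent, u, delta),
                         ancestor_at_depth(parent, w, delta)))
        if key in seen:
            continue                    # another link already joins these regions
        seen.add(key)
        kept.append((u, w))
    return kept


# Hypothetical tree: root -> {L, R}; L -> {A, B}; R -> {C, D}; two sinks under each.
parent = {"root": None, "L": "root", "R": "root",
          "A": "L", "B": "L", "C": "R", "D": "R",
          "a1": "A", "a2": "A", "b1": "B", "b2": "B",
          "c1": "C", "c2": "C", "d1": "D", "d2": "D"}
candidates = [("b1", "c1"), ("b2", "c2"), ("a1", "d1")]
kept = apply_delta_rule(parent, candidates, delta=2)
```

In this example the second B–C link is rejected as crowding, while the A–D link survives, which is the behavior the slide's example describes for δ = 2.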

15 Graph Theoretical Approach
Select_Node_Pairs(T_v) {
    l = v.left_child
    r = v.right_child
    P = Select_node_pair_between(T_l, T_r, k)
    if Depth(v) ≥ depth_limit, return P
    P = P ∪ Select_Node_Pairs(T_l)
    P = P ∪ Select_Node_Pairs(T_r)
    return P
}
• The entire clock tree is recursively divided into two parts and links are added between them
• This ensures distribution of links throughout the clock tree
• Edge weight = min-distance between sinks of T_li and T_rj
[Figure: node v with children l and r; sub-trees T_l1, T_l2 and T_r1, T_r2 connected as a bipartite graph]
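The recursion above can be sketched in Python as follows. The `Node` class and the `select_between` callback are hypothetical stand-ins: the callback plays the role of Select_node_pair_between and is supplied by the caller, so this sketch only shows how pairs from different levels of the tree are accumulated.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Pair = Tuple[str, str]


@dataclass
class Node:
    name: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def select_node_pairs(v: Node,
                      select_between: Callable[[Node, Node], List[Pair]],
                      depth_limit: int,
                      depth: int = 0) -> List[Pair]:
    """Collect link candidates between the left and right halves at each level."""
    if v.left is None or v.right is None:
        return []                                   # nothing to split at a leaf
    pairs = list(select_between(v.left, v.right))   # links across this split
    if depth >= depth_limit:
        return pairs                                # stop recursing below the limit
    pairs += select_node_pairs(v.left, select_between, depth_limit, depth + 1)
    pairs += select_node_pairs(v.right, select_between, depth_limit, depth + 1)
    return pairs


# Toy selector: pair the two sub-tree roots (a real one would pick sink pairs).
def toy_selector(a: Node, b: Node) -> List[Pair]:
    return [(a.name, b.name)]


tree = Node("v",
            Node("l", Node("l1"), Node("l2")),
            Node("r", Node("r1"), Node("r2")))
```

Because every split contributes its own pairs, links end up spread over all regions of the tree instead of clustering in one.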

16 Graph theoretical approach – Min-matching [Rajaram et al., DAC ’04]
• Bipartite min-matching algorithm to select the node pairs
• Merits
  › Distributes links evenly through all regions of the clock network
• Demerits
  › Due to the nature of the min-matching algorithm, only one link per sub-tree is allowed
  › May result in some very lengthy links and increased wire length
  › Lengthy links might be difficult to route
  › Complexity of min-matching is O(n^3). Not scalable!
[Figure: node v with children l and r; lengthy links between their sub-trees]
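To see why min-matching permits only one link per sub-tree, consider the toy sketch below. It finds a minimum-cost perfect matching by exhaustive search over permutations, which is a deliberately simple stand-in for the O(n^3) algorithm the slide refers to and is only usable for very small k. The cost matrix is a made-up example in which `cost[i][j]` is the distance between left sub-tree i and right sub-tree j.

```python
from itertools import permutations


def min_matching(cost):
    """Exhaustive min-cost perfect matching on a k x k cost matrix.

    Returns a list of (left_index, right_index) pairs; each index appears once.
    """
    k = len(cost)
    best = min(permutations(range(k)),
               key=lambda perm: sum(cost[i][perm[i]] for i in range(k)))
    return [(i, best[i]) for i in range(k)]


# Both left sub-trees are close to right sub-tree 0, but only one may use it.
cost = [[1, 2],
        [1, 10]]
matching = min_matching(cost)
```

Here left sub-tree 0 is pushed onto the longer link of length 2 so that left sub-tree 1 can take the short one, a small-scale version of the lengthy-link demerit listed above.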

17 New graph theoretical approach – Minimum Spanning Tree Based
• MST algorithm allows more than one link per sub-tree
  › Larger number of short links (cf. bipartite approach)
• Retains the merits of the min-matching based approach
  › Evenly distributes the links
• Complexity is O(n log n)
  › Much faster than the O(n^3) bipartite matching algorithm
[Figure: node v with children l and r]

18 MST Based Algorithm
MST_node_pair_select(T_l, T_r, k) {
    Divide T_l into k sub-trees, S_l = { T_l1, T_l2, T_l3, … T_lk }
    Divide T_r into k sub-trees, S_r = { T_r1, T_r2, T_r3, … T_rk }
    Find MST of the completely connected bipartite graph between S_l & S_r
}
• After MST pair selection, iteratively delete edges violating the four rules (α, β, γ, and δ)
[Figure: node v with children l and r; sub-tree sets S_l = {T_l1, T_l2} and S_r = {T_r1, T_r2} joined by MST edges]
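The MST step can be sketched with Kruskal's algorithm and a union-find structure over the complete bipartite graph. This is an illustrative sketch, not the authors' implementation: the weight matrix is hypothetical, with `weight[i][j]` standing for the min-distance between sinks of T_li and T_rj, and the sketch makes no attempt to reproduce the complexity bound quoted in the slides. The rule-based pruning that follows MST selection is not included.

```python
def bipartite_mst(weight):
    """Kruskal's MST on the complete bipartite graph defined by `weight`.

    weight[i][j] is the edge weight between left sub-tree i and right sub-tree j.
    Returns the chosen edges as (left_index, right_index) pairs.
    """
    k_left, k_right = len(weight), len(weight[0])
    root = list(range(k_left + k_right))     # union-find forest; right j -> k_left + j

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]          # path halving
            x = root[x]
        return x

    edges = sorted((weight[i][j], i, j)
                   for i in range(k_left) for j in range(k_right))
    chosen = []
    for _, i, j in edges:
        a, b = find(i), find(k_left + j)
        if a != b:                           # edge joins two separate components
            root[a] = b
            chosen.append((i, j))
    return chosen


weight = [[1, 2],
          [1, 10]]
mst_edges = bipartite_mst(weight)
```

On this matrix the MST keeps both short edges into right sub-tree 0, so a sub-tree can carry more than one link, unlike in a matching.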

19 Experimental Setup
• Benchmarks: r1 - r5 from bounded skew tree work [Cong et al., ICCAD ’95]
• Interconnect width variation
  › Smaller than thickness
  › More sensitive to variations
• Load capacitance variation
• All variables assumed to be Gaussian
[Figure: Gaussian distribution with Min, Nom and Max marked; axis ticks from -3σ to +3σ; 99.74% of the area]
• Skew variability measure: standard deviation
[Formula lost in extraction; its surviving labels are "Standard Deviation", "Delay of sink i" and "Delay of reference sink"]
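A minimal sketch of how such a measure could be computed from sampled delays is shown below. It rests on two assumptions that the transcript only hints at: that the Min and Max of each variable correspond to the -3σ and +3σ points of its Gaussian, and that the skew of sink i is its delay minus the delay of a reference sink. The function names and the data layout (one dictionary of sink delays per trial) are hypothetical.

```python
from statistics import pstdev


def sigma_from_range(minimum, maximum):
    """Sigma of a Gaussian whose [-3 sigma, +3 sigma] span is [minimum, maximum]."""
    return (maximum - minimum) / 6.0


def skew_std_dev(delay_samples, reference_sink):
    """Per-sink standard deviation of (delay of sink - delay of reference sink).

    delay_samples: list of dicts, one per trial, mapping sink name -> delay.
    """
    sinks = [s for s in delay_samples[0] if s != reference_sink]
    result = {}
    for sink in sinks:
        skews = [trial[sink] - trial[reference_sink] for trial in delay_samples]
        result[sink] = pstdev(skews)     # population standard deviation
    return result


# Two made-up trials: sink "a" is 1.0 and then 3.0 later than the reference.
samples = [{"ref": 10.0, "a": 11.0},
           {"ref": 10.0, "a": 13.0}]
variability = skew_std_dev(samples, "ref")
```

In practice the trials would come from a circuit simulation with the width and load capacitance perturbed according to their Gaussians; the two trials here only exercise the arithmetic.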

20 Experimental Result on Skew Variability
Benchmark      r1    r2    r3    r4     r5
No. of sinks   267   598   862   1903   3100

21 HSPICE Validation
Benchmark      r1    r2    r3    r4     r5
No. of sinks   267   598   862   1903   3100

22 Experimental Result on Wire-length

23 Wire-length comparison between link insertion methods

24 Conclusions
• Two new efficient algorithms for link insertion have been proposed
  › Significant skew variability reduction with very small wire-length increase
  › Scale very well with the size of the clock network for both runtime and QoR
• Proposed methodology is independent of the nature of variability effects
• Friendly to incremental changes

25 Sources of the Unwanted Skew Variations
• Process variations (P)
  › Gate variations
    » Gate length variation
    » T_ox variation
  › Interconnect variations
    » Significantly affect delay and skew [Liu et al., DAC ’00]
  › Load capacitance variations
• Supply voltage noise (V)
• Temperature variations (T)
[Figure: gate variations (t_ox, gate length) and interconnect width variations]

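26 Skew Between Link Endpoints
• Original skew [expression lost in extraction]
• Effect of the link resistor & capacitor on skew [equation lost in extraction]
  › r_u - r_w > 0 always
  › Elmore delays evaluated with C_u = +1 and C_w = -1
• [Condition lost in extraction] skew variation, then the link resistor always reduces skew variation
• If the nominal [skew] is zero, [expression lost in extraction] can be treated as [expression lost in extraction]
• Link capacitance may affect nominal skew
[Figure: 1 A current source at node i; R_{i,w} = voltage at w; nodes u and w]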

27 General Flow of Non-tree Clock
1. Obtain initial clock tree
2. Find node pairs for link insertion
3. Add link capacitances to selected nodes
4. Tune merging node location to restore original skew
5. Insert link resistance to selected node pairs

28 Run Time Comparison
[Figure: runtime comparison between the different methods as a function of the number of links at γ = 1]

