Spring 2014, Mar 17 ... ELEC 7770: Advanced VLSI Design (Agrawal)
ELEC 7770 Advanced VLSI Design, Spring 2014
Zero-Skew Clock Routing
Vishwani D. Agrawal, James J. Danaher Professor
ECE Department, Auburn University, Auburn, AL

Zero-Skew Clock Routing
[Figure: clock source CK routed to the flip-flops (FF).]

Zero-Skew: References
H-Tree: A. L. Fisher and H. T. Kung, “Synchronizing Large Systolic Arrays,” Proc. SPIE, vol. 341, pp , May
A. Kahng, J. Cong and G. Robins, “High-Performance Clock Routing Based on Recursive Geometric Matching,” Proc. Design Automation Conf., June 1991, pp
M. A. B. Jackson, A. Srinivasan and E. S. Kuh, “Clock Routing for High-Performance IC’s,” Proc. Design Automation Conf., June 1990, pp

Zero-Skew Routing
Build the clock tree bottom up:
Leaf nodes are flip-flops that all present equal loading.
Two zero-skew subtrees are joined to form a larger zero-skew subtree.
The entire clock tree is built recursively.
R.-S. Tsay, “An Exact Zero-Skew Clock Routing Algorithm,” IEEE Trans. CAD, vol. 12, no. 2, pp , Feb
J. Rubinstein, P. Penfield and M. A. Horowitz, “Signal Delay in RC Tree Networks,” IEEE Trans. CAD, vol. 2, no. 3, pp , July 1983.

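Balancing Subtrees (1)
[Figure: Subtree 1 (delay t1, capacitance C1) rooted at point A and Subtree 2 (delay t2, capacitance C2) rooted at point B, joined by a wire. The tapping point splits the wire into a segment of length xL toward A (resistance r1, with capacitance c1/2 lumped at each end) and a segment of length (1 – x)L toward B (resistance r2, capacitance c2/2 at each end).]
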
Balancing Subtrees (2)
Subtrees 1 and 2 are each balanced (zero-skew) trees, with delays t1 and t2 to their respective leaf nodes.
Total capacitances of the subtrees are C1 and C2, respectively.
Connect points A and B by a minimum-length wire of length L.
Determine a tapping point x such that wire lengths xL and (1 – x)L produce zero skew.

Balancing Subtrees (3)
Use the Elmore delay formula and require equal delay through both branches:
$$0.69\, r_1 (C_1 + c_1/2) + t_1 = 0.69\, r_2 (C_2 + c_2/2) + t_2$$
Substitute, where a and b are the wire resistance and capacitance per unit length:
$$r_1 = a x L, \quad r_2 = a(1 - x)L, \quad c_1 = b x L, \quad c_2 = b(1 - x)L$$
Dividing by 0.69 (1/0.69 ≈ 1.45) and collecting the terms in x:
$$a b L^2 x + a L (C_1 + C_2)\, x = 1.45\,(t_2 - t_1) + a L (C_2 + bL/2)$$
Then solve for x:
$$x = \frac{1.45\,(t_2 - t_1) + a L (C_2 + bL/2)}{a L (bL + C_1 + C_2)}$$

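The closed-form expression for x can be checked numerically. The following is a minimal sketch, not part of the original slides; the function names `tapping_point` and `elmore_delay` are hypothetical, and the units (ps, pF, cm, Ω/cm, pF/cm) are an assumption chosen so that Ω × pF = ps. Because the slides round 1/0.69 to 1.45, the two branch delays agree only to within a small residual.

```python
def tapping_point(t1, c1, t2, c2, length, a, b):
    """Fraction x of the wire (measured from the root of subtree 1) at which to tap.

    Implements x = [1.45(t2 - t1) + aL(C2 + bL/2)] / [aL(bL + C1 + C2)].
    Assumed units: t in ps, C in pF, length in cm, a in ohm/cm, b in pF/cm.
    """
    numerator = 1.45 * (t2 - t1) + a * length * (c2 + b * length / 2)
    denominator = a * length * (b * length + c1 + c2)
    return numerator / denominator


def elmore_delay(t_sub, c_sub, wire_len, a, b):
    """Delay from the tapping point to the leaves: 0.69 * r * (C + c/2) + t."""
    r = a * wire_len  # wire segment resistance
    c = b * wire_len  # wire segment capacitance
    return 0.69 * r * (c_sub + c / 2) + t_sub


# Illustrative values: two subtrees joined by a 1 mm (0.1 cm) wire.
L = 0.1
x = tapping_point(5, 3, 10, 6, L, a=100, b=1)
d1 = elmore_delay(5, 3, x * L, a=100, b=1)          # branch toward subtree 1
d2 = elmore_delay(10, 6, (1 - x) * L, a=100, b=1)   # branch toward subtree 2
print(x, d1, d2)
```

A value of x between 0 and 1 means the tap lies on the wire itself; values outside that range call for a different treatment than this function provides.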
Balancing Subtrees Example 1
Subtree parameters:
Subtree 1: t1 = 5 ps, C1 = 3 pF
Subtree 2: t2 = 10 ps, C2 = 6 pF
Interconnect: L = 1 mm (0.1 cm)
Wire parameters: a = 100 Ω/cm, b = 1 pF/cm
Tapping point:
$$x = \frac{1.45\,(t_2 - t_1) + aL(C_2 + bL/2)}{aL(bL + C_1 + C_2)} = \frac{1.45\,(10 - 5) + 100 \times 0.1\,(6 + 1 \times 0.1/2)}{100 \times 0.1\,(1 \times 0.1 + 3 + 6)} = \frac{7.25 + 60.5}{10 \times 9.1} \approx 0.7445$$

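Example 1
[Figure: Subtree 1 (t1 = 5 ps, C1 = 3 pF) and Subtree 2 (t2 = 10 ps, C2 = 6 pF), each ending in flip-flops (FF), joined by the 1 mm wire. With x ≈ 0.7445 the tapping point lies about 0.7445 mm from the root of Subtree 1 and 0.2555 mm from the root of Subtree 2; the tapping point connects to the next level.]
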
Balancing Subtrees, x > 1
The tapping point is set at the root of the tree with larger loading (C2, t2).
The wire to the root of the other tree is elongated to provide additional delay.
Wire length L is found as follows. Set x = 1 in
$$a b L^2 x + a L (C_1 + C_2)\, x = 1.45\,(t_2 - t_1) + a L (C_2 + bL/2)$$
i.e.,
$$L^2 + (2C_1/b)\,L - 2.9\,(t_2 - t_1)/(ab) = 0$$
Wire length is given by:
$$L = \frac{[(aC_1)^2 + 2.9\,ab\,(t_2 - t_1)]^{1/2} - aC_1}{ab}$$
R.-S. Tsay, “An Exact Zero-Skew Clock Routing Algorithm,” IEEE Trans. CAD, vol. 12, no. 2, pp , Feb

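The positive root of this quadratic is easy to evaluate directly. The sketch below is an illustration only; `elongated_length` and its parameter names are hypothetical, and the units (ps, pF, cm, Ω/cm, pF/cm) are assumed as in the rest of the slides.

```python
import math


def elongated_length(t_fast, c_fast, t_slow, a, b):
    """Length of the elongated wire from the tapping point to the faster subtree.

    The tap sits at the root of the slower subtree (delay t_slow); the wire
    runs to the root of the faster subtree (delay t_fast, capacitance c_fast).
    Returns the positive root of L^2 + (2C/b)L - 2.9(t_slow - t_fast)/(ab) = 0.
    """
    ac = a * c_fast
    return (math.sqrt(ac * ac + 2.9 * a * b * (t_slow - t_fast)) - ac) / (a * b)


# Illustrative values: faster subtree t = 2 ps, C = 1 pF; slower subtree t = 15 ps.
length_cm = elongated_length(t_fast=2, c_fast=1, t_slow=15, a=100, b=1)
print(round(length_cm, 4))
```

Only the capacitance of the faster subtree appears, because the slower subtree is driven directly from the tapping point and sees no wire resistance.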
Balancing Subtrees Example 2
Subtree parameters:
Subtree 1: t1 = 2 ps, C1 = 1 pF
Subtree 2: t2 = 15 ps, C2 = 10 pF
Interconnect: L = 1 mm (0.1 cm)
Wire parameters: a = 100 Ω/cm, b = 1 pF/cm
Tapping point:
$$x = \frac{1.45\,(t_2 - t_1) + aL(C_2 + bL/2)}{aL(bL + C_1 + C_2)} = \frac{1.45\,(15 - 2) + 100 \times 0.1\,(10 + 1 \times 0.1/2)}{100 \times 0.1\,(1 \times 0.1 + 1 + 10)} = \frac{18.85 + 100.5}{10 \times 11.1} \approx 1.0752$$

Example 2, x = 1.0752
Since x > 1, set x = 1.0 and solve for the wire length:
$$L = \frac{[(aC_1)^2 + 2.9\,ab\,(t_2 - t_1)]^{1/2} - aC_1}{ab} = \frac{[(100 \times 1)^2 + 2.9 \times 100 \times 1 \times (15 - 2)]^{1/2} - 100 \times 1}{100 \times 1} = 0.1735\ \text{cm}$$
For a wire of 1.735 mm length, place the clock feed at one end (the root of Subtree 2).

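Example 2, L = 1.735 mm
[Figure: Subtree 1 (t1 = 2 ps, C1 = 1 pF) and Subtree 2 (t2 = 15 ps, C2 = 10 pF), each ending in flip-flops (FF). The tapping point is at the root of Subtree 2 and connects to the next level; an elongated wire of length L = 1.735 mm runs from it to the root of Subtree 1.]
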
Balancing Subtrees, x < 0
The tapping point is set at the root of the tree with larger loading, which in this case is Subtree 1 (C1, t1), since x = 0 places the tap at point A.
The wire to the root of the other tree is elongated to provide additional delay.
Wire length L is found as follows. Set x = 0 in
$$a b L^2 x + a L (C_1 + C_2)\, x = 1.45\,(t_2 - t_1) + a L (C_2 + bL/2)$$
i.e.,
$$L^2 + (2C_2/b)\,L - 2.9\,(t_1 - t_2)/(ab) = 0$$
Wire length is given by:
$$L = \frac{[(aC_2)^2 + 2.9\,ab\,(t_1 - t_2)]^{1/2} - aC_2}{ab}$$
R.-S. Tsay, “An Exact Zero-Skew Clock Routing Algorithm,” IEEE Trans. CAD, vol. 12, no. 2, pp , Feb

Balancing Subtrees Example 3
Subtree parameters:
Subtree 1: t1 = 15 ps, C1 = 10 pF
Subtree 2: t2 = 2 ps, C2 = 1 pF
Interconnect: L = 1 mm (0.1 cm)
Wire parameters: a = 100 Ω/cm, b = 1 pF/cm
Tapping point:
$$x = \frac{1.45\,(t_2 - t_1) + aL(C_2 + bL/2)}{aL(bL + C_1 + C_2)} = \frac{1.45\,(2 - 15) + 100 \times 0.1\,(1 + 1 \times 0.1/2)}{100 \times 0.1\,(1 \times 0.1 + 10 + 1)} = \frac{10.5 - 18.85}{10 \times 11.1} \approx -0.0752$$

Example 3, x = –0.0752
Since x < 0, set x = 0.0 and solve for the wire length:
$$L = \frac{[(aC_2)^2 + 2.9\,ab\,(t_1 - t_2)]^{1/2} - aC_2}{ab} = \frac{[(100 \times 1)^2 + 2.9 \times 100 \times 1 \times (15 - 2)]^{1/2} - 100 \times 1}{100 \times 1} = 0.1735\ \text{cm}$$

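Example 3, L = 1.735 mm
[Figure: Subtree 1 (t1 = 15 ps, C1 = 10 pF) and Subtree 2 (t2 = 2 ps, C2 = 1 pF), each ending in flip-flops (FF). The tapping point is at the root of Subtree 1 and connects to the next level; an elongated wire of length L = 1.735 mm runs from it to the root of Subtree 2.]
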
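The three balancing cases (0 ≤ x ≤ 1, x > 1, x < 0) can be folded into one merge step that also returns the parameters of the merged subtree, ready to be joined again at the next level of the bottom-up construction. This is a sketch under stated assumptions, not Tsay's published implementation: the name `merge_subtrees` is hypothetical, and the merged capacitance is assumed to be the sum of both subtree capacitances plus the wire capacitance bL.

```python
import math


def merge_subtrees(t1, c1, t2, c2, length, a, b):
    """Join two zero-skew subtrees; return (x, wire_length, t_merged, c_merged).

    x is the fraction of the wire between the root of subtree 1 and the tap.
    Assumed units: t in ps, C in pF, length in cm, a in ohm/cm, b in pF/cm.
    """
    x = (1.45 * (t2 - t1) + a * length * (c2 + b * length / 2)) / (
        a * length * (b * length + c1 + c2)
    )
    if x > 1:
        # Subtree 2 is too slow: tap at its root and elongate the wire to subtree 1.
        x = 1.0
        length = (math.sqrt((a * c1) ** 2 + 2.9 * a * b * (t2 - t1)) - a * c1) / (a * b)
    elif x < 0:
        # Subtree 1 is too slow: tap at its root and elongate the wire to subtree 2.
        x = 0.0
        length = (math.sqrt((a * c2) ** 2 + 2.9 * a * b * (t1 - t2)) - a * c2) / (a * b)

    # Elmore delay from the tap to the leaves, measured through the subtree 1 branch.
    r1 = a * x * length
    w1 = b * x * length
    t_merged = 0.69 * r1 * (c1 + w1 / 2) + t1
    # Assumption: the merged subtree loads its parent with all three capacitances.
    c_merged = c1 + c2 + b * length
    return x, length, t_merged, c_merged


case_inside = merge_subtrees(5, 3, 10, 6, 0.1, 100, 1)    # tap on the wire
case_above = merge_subtrees(2, 1, 15, 10, 0.1, 100, 1)    # x > 1
case_below = merge_subtrees(15, 10, 2, 1, 0.1, 100, 1)    # x < 0
for case in (case_inside, case_above, case_below):
    print(case)
```

Returning `(t_merged, c_merged)` in the same form as the inputs is what lets the procedure be applied recursively: the output of one merge is a valid operand for the next.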
Zero-Skew Design
[Figure: FF A drives a combinational block (delay = 75 ns) into FF B, which drives a second combinational block (delay = 50 ns) into FF C. All three flip-flops are clocked by CK with zero skew.]
Single-cycle path delay time: Tck = 75 ns

Nonzero-Skew Design
[Figure: the same circuit, FF A, combinational block (delay = 75 ns), FF B, combinational block (delay = 50 ns), FF C, with a 25 ns delay inserted in the clock network CK.]
Single-cycle path delay time: Tck = 50 ns

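The effect of skew on the clock period can be reproduced with a few lines of code. This is a minimal sketch that considers only the long-path (setup) requirement: each path needs T ≥ delay − (S_dst − S_src). Hold-time checks and flip-flop setup times are ignored. The name `min_clock_period` is hypothetical, and the skew assignment for the nonzero-skew case (25 ns at both FF B and FF C) is an assumption that is consistent with Tck = 50 ns, since the flattened figure does not say where the delay sits.

```python
def min_clock_period(paths, skew):
    """Smallest period meeting every path, given the clock delay at each flip-flop.

    paths: list of (source_ff, destination_ff, combinational_delay_ns)
    skew:  dict mapping flip-flop name to its clock arrival delay in ns
    """
    return max(delay - (skew[dst] - skew[src]) for src, dst, delay in paths)


paths = [("A", "B", 75), ("B", "C", 50)]
zero_skew = {"A": 0, "B": 0, "C": 0}
skewed = {"A": 0, "B": 25, "C": 25}  # assumed placement of the 25 ns delay

print(min_clock_period(paths, zero_skew))  # 75
print(min_clock_period(paths, skewed))     # 50
```

Delaying the clock at FF B lends 25 ns to the slow A-to-B path, and delaying FF C by the same amount keeps the B-to-C path from losing that time again.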
Optimized Skew Design
[Figure: FF A drives a combinational block (delay = 75 ns) into FF B, which drives a combinational block (delay = 50 ns) into FF C; a third combinational block (delay = 75 ns) feeds FF C back to FF A. The clock CK has period T and reaches FF A, FF B and FF C through delays SA, SB and SC.]
Linear program:
Objective: Minimize T
Constraints (subject to):
SB - SA + T ≥ 75
SC - SB + T ≥ 50
SA - SC + T ≥ 75

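A quick hand check of this linear program, added as a worked sketch: summing the three constraints cancels every skew term, which bounds T from below, and making all three constraints tight yields one skew assignment that meets the bound. Fixing SC = 0 is an arbitrary choice of reference, since only skew differences matter.

```latex
\begin{aligned}
(S_B - S_A + T) + (S_C - S_B + T) + (S_A - S_C + T) &\ge 75 + 50 + 75 \\
3T &\ge 200 \quad\Rightarrow\quad T \ge 66.67\ \text{ns} \\
S_B - S_A = 75 - T &= 8.33\ \text{ns}, \qquad S_C - S_B = 50 - T = -16.67\ \text{ns} \\
S_C = 0 \;\Rightarrow\; S_B &= 16.67\ \text{ns}, \qquad S_A = 8.33\ \text{ns}
\end{aligned}
```

The bound is the average delay around the A, B, C loop: skew can shift time between stages of a cycle but cannot reduce the total delay that the cycle must fit into three clock periods.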
Online LP Solvers
PHPSimplex solver
LINDO (Download)
File Lec12.ltx:
! Lecture 11 example
MIN T
SUBJECT TO
1) SB - SA + T >= 75
2) SC - SB + T >= 50
3) SA - SC + T >= 75
END

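When no LP solver is at hand, the same model can be solved with the standard library alone. The sketch below does not use the simplex method that the tools above rely on; it swaps in a different technique: each constraint S_dst − S_src + T ≥ d is rewritten as the difference constraint S_src − S_dst ≤ T − d, feasibility for a given T is tested with Bellman-Ford negative-cycle detection, and T is found by bisection. All function names are hypothetical, and only the three constraints of the model are considered.

```python
def feasible(period, paths, nodes):
    """Return (ok, skews): do skews exist with S[dst] - S[src] + period >= delay?"""
    dist = {n: 0.0 for n in nodes}  # implicit source at distance 0 from every node
    for _ in range(len(nodes)):
        changed = False
        for src, dst, delay in paths:
            # S[src] <= S[dst] + (period - delay): relax the edge dst -> src.
            candidate = dist[dst] + period - delay
            if candidate < dist[src] - 1e-12:
                dist[src] = candidate
                changed = True
        if not changed:
            return True, dist  # every constraint holds
    return False, dist  # still relaxing after n rounds: negative cycle


def min_period(paths, tol=1e-6):
    """Bisect on the period; returns (period, skews) with the smallest skew set to 0."""
    nodes = sorted({n for src, dst, _ in paths for n in (src, dst)})
    lo, hi = 0.0, float(max(delay for _, _, delay in paths))  # zero skew meets hi
    while hi - lo > tol:
        mid = (lo + hi) / 2
        ok, _ = feasible(mid, paths, nodes)
        if ok:
            hi = mid
        else:
            lo = mid
    _, dist = feasible(hi, paths, nodes)
    shift = min(dist.values())
    return hi, {n: dist[n] - shift for n in nodes}


example_paths = [("A", "B", 75), ("B", "C", 50), ("C", "A", 75)]
period, skews = min_period(example_paths)
print(round(period, 2), {ff: round(s, 2) for ff, s in skews.items()})
```

For the three paths in the model this should converge to a period of about 66.67 ns. Bisection is used here because the feasibility test is cheap and monotone in T; an LP solver remains the more general choice once additional constraint types, such as hold-time bounds, are added.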
Optimized Skew Design
[Figure: the same three-flip-flop circuit with combinational delays of 75 ns (A to B), 50 ns (B to C) and 75 ns (C back to A), clocked with T = 66.67 ns. The clock delays are SA = 8.33 ns, SB = 16.67 ns and SC = 0 ns.]

Conclusion
Zero-skew design is possible at the layout level.
Zero-skew usually results in higher clock speed.
Nonzero clock skews can improve the design with reduced hardware and/or higher speed.