Reducing Clock Skew Variability via Cross Links

Presentation transcript:

Reducing Clock Skew Variability via Cross Links
LokWon Kim, UCLA EE Ph.D. student
Based on the work of Anand Rajaram, Jiang Hu, and Rabi Mahapatra† (Dept. of Electrical Engineering, †Dept. of Computer Science, Texas A&M University)
Today's topic is reducing clock skew variability via cross links.

Presentation Outline
- Introduction
- Traditional approaches to skew variability reduction
- The proposed method
- Analysis of the method
- The optimized node selection algorithm
- Experimental results
- Conclusion

Clock Distribution Network
- Signal transfer is coordinated by the clock signal
- All registers are supplied with the clock signal by the clock distribution network
- Skew = d1 - d2
- Zero skew: d1 = d2
- Useful skew: d1 - d2 = δ12
[Figure: two registers, one launching and one catching signals, clocked through the clock network with delays d1 and d2; labels Dmax and T]
In a digital system, data transfers are governed by the clock signal. Unintended clock skew acts as timing uncertainty that has to be added to the circuit propagation delay, which forces a longer clock period and makes the system slower. For a high-speed system, clock skew should therefore be minimized during clock routing.

Clock Tree Synthesis in Synchronous Circuits
- Clock signals synchronize data transfer between functional elements in a synchronous design
- Clock skew becomes one of the most significant concerns in clock tree synthesis for high-performance designs
[Figure: chip diagram with blocks labeled PLL, MEM-ctrl, Sys, Disp, AUDIO and VIDEO, next to a plot of clock skew vs. clock frequency (source: Intel)]
The diagram on the right shows clock skew versus clock frequency. The main observation is that as the frequency rises, the skew takes up a growing share of the clock period. As a result, clock skew becomes the number one concern in clock tree synthesis for high-performance designs.

Clock Distribution Networks: Important Considerations & Objectives
- One of the biggest and most frequently switching nets
- Very sensitive to unwanted skew introduced by manufacturing variations, power supply noise and temperature variations
- Low skew variation is a must for proper operation of the chip
- Minimizing clock routing wire-length can reduce power consumption and power/ground noise
The clock is one of the biggest and most frequently switching networks, so clock distribution consumes a lot of power just to deliver the clock across the system. Some papers report that in some processors the clock network alone consumes about 30% of the total dynamic power. Skew in the clock network is vulnerable to variation sources at design time, manufacturing time and run time. This paper therefore provides a method that lowers skew variation by adding a small amount of wire to a conventional clock tree.

Sources of the Unwanted Skew Variations
- Process variations
  - Gate variations: gate length variation, Tox variation
  - Interconnect variations: significantly affect delay and skew [Liu, et al., DAC'00]
  - Load capacitance variations
- Power supply noise
- Temperature variations
[Figure: illustrations of gate variations (tox, gate length) and interconnect width variations]
During semiconductor processing, a chip picks up variations in its gates, interconnect and load capacitances, all of which affect clock skew variation. In addition, power supply noise and spatial temperature variations affect clock skew at run time. For high-speed and reliable operation, this paper provides an effective and intuitive method to reduce such skew variation.

Non-tree: Spine [Kurd et al., JSSC'01]
- Applied in Intel Pentium processor design
- Variations between spines still exist
[Figure: spines driving clock sinks or local sub-networks]
This is one existing way to reduce skew variation. It is effective locally because each spine provides low variation. Its limitation is that variations between spines remain.

Non-tree: Mesh
- Top level mesh [Su et al., ICCAD'01]: less wire, less effective
- Leaf level mesh [Restle et al., JSSC'01]: very effective, huge wire; applied in IBM microprocessor
[Figure: a top level mesh and a leaf level mesh, each driving clock sinks or local sub-networks]
The first structure is a top level mesh clock network. The clock source drives a coarse mesh directly, and clock subtrees are attached to the mesh. Skew variations on the mesh are negligible, but skew variations within each subtree still exist, so this approach is less effective.
The second structure is a leaf level mesh clock network. A metal wire mesh is overlaid on the entire chip area and driven at multiple points directly from the clock source, and each clock sink is connected to the nearest point on the mesh. This is very effective at suppressing skew variations, but it consumes enormous wire resources and power, so its application is mainly restricted to high-end products such as microprocessors.

The proposed method for reducing skew variability

Alternative View on Non-tree
- Non-tree = tree + links
- Link = link_capacitors + link_resistor
[Figure: a link between nodes u and w modeled as a resistor Rl with a capacitor C/2 at each endpoint; node i elsewhere in the tree]
This paper proposes a non-tree clock network scheme for reducing skew variation. It combines a conventional clock tree with cross links. Each link consists of link capacitors and a link resistor.

Mathematical analysis of the link insertion method

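The Elmore Delay in an RC network
- The Elmore delay at node i is $t_i = \sum_j R_{ij} C_j$
- Cj is the ground capacitance at node j
- Rij is equal to the voltage at node i when a 1 A current is injected into node j and all other node capacitors are zero
[Figure: RC tree with nodes i, u and w and a 1 A current source; label: Ri,w = voltage at w]
The Elmore delay concept in an RC network is assumed to be familiar.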
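For a tree-shaped RC network, the Elmore sum t_i = sum over j of R_ij * C_j can be evaluated in two passes, because R_ij is then the resistance of the path shared by the routes from the source to i and to j. The following is a minimal sketch under that assumption; the function name `elmore_delays`, the array layout (each parent has a smaller index than its children) and the example values are illustrative and do not come from the presentation.

```python
def elmore_delays(parent, res, cap):
    """Elmore delay of every node in an RC tree.

    parent[i]: index of the parent of node i (node 0 is the source, parent None)
    res[i]:    resistance of the wire from parent[i] to node i
    cap[i]:    ground capacitance at node i
    Assumes parent[i] < i for every non-root node.
    """
    n = len(parent)
    down = list(cap)                    # capacitance at or below each node
    for i in range(n - 1, 0, -1):       # visit children before parents
        down[parent[i]] += down[i]
    delay = [0.0] * n
    for i in range(1, n):               # visit parents before children
        delay[i] = delay[parent[i]] + res[i] * down[i]
    return delay


# Hypothetical 6-node tree: 0 -> 1 -> {2 -> 4, 3 -> 5}
parent = [None, 0, 1, 1, 2, 3]
res = [0.0, 1.0, 2.0, 3.0, 1.0, 2.0]
cap = [0.0, 1.0, 1.0, 1.0, 3.0, 1.0]
delays = elmore_delays(parent, res, cap)
print(delays)  # [0.0, 7.0, 15.0, 13.0, 18.0, 15.0]
```

In this example the skew between leaves 4 and 5 is 18 - 15 = 3 time units.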

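Effect of Link Capacitors on Delay
- C: total link capacitance, modeled as C/2 at each of the link endpoints u and w
- Considering the effect of link capacitance only, the delay to node i follows from the Elmore delay with the two C/2 capacitors added
[Figure: tree with node i and link endpoints u and w, each carrying a capacitor C/2]
Adding link capacitors does not change the network topology, so their effect is easy to estimate: the Elmore delay from the source to any sink i only gains the contribution of the two C/2 capacitors.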
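Written out, this amounts to adding C/2 to the ground capacitance at u and at w in the Elmore sum. A sketch of the resulting expression follows; the hat notation for the delay with link capacitors is an assumed notation, not one shown in the transcript.

```latex
\hat{t}_i \;=\; \sum_j R_{ij} C_j \;+\; \frac{C}{2} R_{iu} \;+\; \frac{C}{2} R_{iw}
          \;=\; t_i + \frac{C}{2}\left(R_{iu} + R_{iw}\right)
```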

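Effect of Link Resistor on Delay [Chan & Karplus, TCAD'90]
[Figure: link resistor Rl added between nodes u and w of the original circuit, with capacitors Cu and Cw at the endpoints]
The new delay after resistor addition is expressed in terms of the delays before resistor addition and of delays evaluated with Cu = +1 and Cw = -1.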
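One way to write such an update is as a rank-one correction of the resistance matrix (the Sherman-Morrison identity applied to the conductance matrix once a conductance 1/R_l is added between u and w). The block below is a reconstruction under that assumption, not a copy of the slide's formula; the names $\hat{t}_i$ (delay before the resistor is added), $\tilde{t}_i$ (delay after) and $r_i$ are assumed.

```latex
r_i \;=\; R_{iu} - R_{iw}
\qquad\text{(Elmore delay at } i \text{ with } C_u = +1,\ C_w = -1,\ \text{all other capacitors zero)}

\tilde{t}_i \;=\; \hat{t}_i \;-\; \frac{\hat{t}_u - \hat{t}_w}{R_l + r_u - r_w}\; r_i
```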

Skew Between Link Endpoints
- The new skew between u and w is the original skew modified by the effect of the link resistor and capacitor
- ru - rw > 0 always
- If the nominal skew is zero, then the skew can be treated as skew variation, and the link resistor always reduces skew variation
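Under the Elmore model, adding a resistor Rl between u and w scales the skew between those two nodes by Rl / (Rl + ru - rw), which is below 1 whenever ru - rw > 0. The sketch below checks this numerically on a small hypothetical network by solving G t = c for the Elmore delays with a pure-Python Gaussian elimination. The helper names, the node numbering and all element values are assumptions made for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[k]] for k, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x


def elmore(n, resistors, caps):
    """Elmore delays of nodes 1..n; node 0 is the driven source.

    resistors: list of (node_a, node_b, resistance); caps[k] belongs to node k+1.
    Works for trees and for networks with loops.
    """
    G = [[0.0] * n for _ in range(n)]       # conductance matrix, source grounded
    for a, b, r in resistors:
        g = 1.0 / r
        for p, q in ((a, b), (b, a)):
            if p > 0:
                G[p - 1][p - 1] += g
                if q > 0:
                    G[p - 1][q - 1] -= g
    return solve(G, caps)


# Hypothetical tree: 0 -> 1 -> {2 -> 4, 3 -> 5}; link endpoints u = 4, w = 5
tree = [(0, 1, 1.0), (1, 2, 2.0), (1, 3, 3.0), (2, 4, 1.0), (3, 5, 2.0)]
caps = [1.0, 1.0, 1.0, 3.0, 1.0]            # link capacitors assumed already included
u, w, R_link = 4, 5, 4.0

t_before = elmore(5, tree, caps)
t_after = elmore(5, tree + [(u, w, R_link)], caps)
q_before = t_before[u - 1] - t_before[w - 1]
q_after = t_after[u - 1] - t_after[w - 1]

# In a tree, ru - rw equals the resistance of the tree path u -> 2 -> 1 -> 3 -> w
r_diff = 1.0 + 2.0 + 3.0 + 2.0
predicted = q_before * R_link / (R_link + r_diff)
print(q_before, q_after, predicted)
```

The skew computed on the network with the link resistor agrees with the predicted scaling to within floating-point error.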

Effect of Link Position
- For a network with disjoint loops
- Value of [expression missing] becomes smaller when the link is closer to the leaf nodes, for a given Rlink
[Figure: link Rlink between nodes u and w closing a loop of resistance Rloop]
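To illustrate the trend, the sketch below assumes that Rloop is the resistance of the tree path from u up to the nearest common ancestor and back down to w, and that the quantity of interest is the ratio Rlink / (Rlink + Rloop) by which the Elmore skew between the link endpoints is scaled. Both are assumptions, as are the function name `loop_resistance` and the example tree.

```python
def loop_resistance(parent, res, u, w):
    """Resistance of the tree path u -> nearest common ancestor -> w.

    parent[i]: parent of node i (root has parent None)
    res[i]:    resistance of the wire from parent[i] to node i
    """
    dist_u = {}                      # ancestor of u -> resistance from u up to it
    node, acc = u, 0.0
    while node is not None:
        dist_u[node] = acc
        acc += res[node]
        node = parent[node]
    node, acc = w, 0.0
    while node not in dist_u:        # climb from w until the paths meet
        acc += res[node]
        node = parent[node]
    return dist_u[node] + acc


# Hypothetical tree: 0 -> 1 -> {2 -> 4, 3 -> 5}
parent = [None, 0, 1, 1, 2, 3]
res = [0.0, 1.0, 2.0, 3.0, 1.0, 2.0]
R_link = 4.0

loop_upper = loop_resistance(parent, res, 2, 3)   # link between internal nodes
loop_leaf = loop_resistance(parent, res, 4, 5)    # link between leaves
ratio_upper = R_link / (R_link + loop_upper)
ratio_leaf = R_link / (R_link + loop_leaf)
print(loop_upper, loop_leaf, ratio_upper, ratio_leaf)
```

With the same Rlink, the leaf-level link closes a longer loop, so its ratio is smaller than that of the link placed higher in the tree.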

Skew Variability Between Any Nodes
[Figure: clock tree with link endpoints u and w and nodes P, g and h]
- P is the nearest common ancestor of nodes u and w
- Tx: sub-tree rooted at x
Skew variation between node i and node j:
- Scenario 1: i ∈ Tg, j ∈ Th: variation smaller
- Scenario 2: i and j ∈ Tg (or Th): variation may be worse
- Scenario 3: i ∈ Tp, j ∉ Tp: variation may be worse
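The three scenarios can be expressed as a small classification routine. This sketch assumes that g and h are the children of P on the paths towards u and w respectively, and that u and w lie in different sub-trees of P; the function names and the example tree are hypothetical.

```python
def ancestors(parent, x):
    """Return the chain [x, parent(x), ..., root]."""
    chain = []
    while x is not None:
        chain.append(x)
        x = parent[x]
    return chain


def classify(parent, u, w, i, j):
    """Return the scenario number (1, 2 or 3) for the node pair (i, j)
    relative to a link between u and w, or 0 if no scenario applies."""
    anc_u, anc_w = ancestors(parent, u), ancestors(parent, w)
    p = next(a for a in anc_u if a in anc_w)   # nearest common ancestor
    g = anc_u[anc_u.index(p) - 1]              # child of p on the way to u
    h = anc_w[anc_w.index(p) - 1]              # child of p on the way to w

    def inside(x, root):
        # True when x belongs to the sub-tree rooted at root
        return root in ancestors(parent, x)

    if (inside(i, g) and inside(j, h)) or (inside(i, h) and inside(j, g)):
        return 1
    if (inside(i, g) and inside(j, g)) or (inside(i, h) and inside(j, h)):
        return 2
    if inside(i, p) != inside(j, p):
        return 3
    return 0


# Hypothetical tree: 0 -> {1, 2}; 1 -> {3, 4}; 3 -> {5, 6}; 4 -> {7, 8}; 2 -> 9
parent = [None, 0, 0, 1, 1, 3, 3, 4, 4, 2]
s1 = classify(parent, 5, 7, 6, 8)   # one node under g, the other under h
s2 = classify(parent, 5, 7, 5, 6)   # both nodes under g
s3 = classify(parent, 5, 7, 6, 9)   # one node inside Tp, the other outside
print(s1, s2, s3)  # 1 2 3
```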

A Simple Example
[Figure: clock tree with candidate link pairs 1-1', 2-2', 3-3', 4-4' and 5-5']
- The link pair number gives the rough order of effectiveness in reducing skew variations
To reduce the skew between two points, this approach shorts them with a cross link, which is a very intuitive way to suppress variations. With many such links in the network, the results are effective while using only a small amount of wire resource. This is the idea the paper introduces.

Optimized node selection algorithm based on the analysis

General Flow of Non-tree Clock Routing
1. Obtain initial clock tree
2. Find node pairs for link insertion
3. Add link capacitances to selected nodes
4. Tune merging node location to restore original skew
5. Insert link resistance to selected node pairs

Guidelines for Node Pair Selection for Link Insertion
- Select nodes which are hierarchically far apart
- Select nodes physically close to each other
- Select nodes with equal nominal delay
- Select nodes closer to leaf nodes
At node selection time, these guidelines should be followed to maximize the effect of link insertion.

Rule Based Node Pair Selection
- α-rule: the lower the α, the better the link
- β-rule: the lower the β, the less tuning is required
- γ-rule: the nearest common ancestor's depth from the root is < γmax
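A rule-based filter of this kind might look like the sketch below. The transcript does not give the formulas for α and β, so each candidate pair is assumed to arrive with precomputed `alpha` and `beta` values; the upper bounds `alpha_max` and `beta_max`, the `Candidate` type and the sample numbers are all hypothetical. Only the γ comparison (depth < γmax) is taken from the slide.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    u: str            # first link endpoint
    w: str            # second link endpoint
    alpha: float      # precomputed alpha value (lower is a better link)
    beta: float       # precomputed beta value (lower means less tuning)
    nca_depth: int    # depth of the nearest common ancestor from the root


def select_pairs(cands, alpha_max, beta_max, gamma_max):
    """Keep pairs that satisfy all three rules, best alpha first."""
    kept = [
        c for c in cands
        if c.alpha <= alpha_max          # assumed threshold form of the alpha-rule
        and c.beta <= beta_max           # assumed threshold form of the beta-rule
        and c.nca_depth < gamma_max      # gamma-rule as stated on the slide
    ]
    return sorted(kept, key=lambda c: c.alpha)


cands = [
    Candidate("a", "b", 0.5, 0.1, 1),
    Candidate("c", "d", 0.2, 0.3, 2),
    Candidate("e", "f", 0.1, 0.9, 1),    # rejected: beta too high
    Candidate("g", "h", 0.3, 0.2, 5),    # rejected: common ancestor too deep
]
selected = select_pairs(cands, alpha_max=0.6, beta_max=0.5, gamma_max=3)
print([(c.u, c.w) for c in selected])  # [('c', 'd'), ('a', 'b')]
```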

Experimental results validating the proposed method

Experimental Result on Skew Variability
Benchmark      r1    r2    r3    r4     r5
No. of sinks   267   598   862   1903   3100

HSPICE Validation
Benchmark      r1    r2    r3    r4     r5
No. of sinks   267   598   862   1903   3100

Experimental Result on Wire-length

Cost vs. Benefit Analysis of Link Addition
- The law of diminishing returns holds!

Conclusions
- Effective link insertion methods have been proposed
- Significant skew variability reduction with limited wire-length increase
- The proposed methodology is independent of the nature of the variability effects