Bus Stuttering: An Encoding Technique to Reduce Inductive Noise in Off-Chip Data Transmission (DATE 2006, Session 5B, March 8, 2006)

Presentation transcript:

Bus Stuttering: An Encoding Technique to Reduce Inductive Noise in Off-Chip Data Transmission
DATE 2006, Session 5B: Timing and Noise Analysis (March 8, 2006)
Presenter: Ganesh Venkataraman, Texas A&M University
Authors: Brock J. LaMeres, Agilent Technologies; Sunil P. Khatri, Texas A&M University
Contact:

Agenda
- Problem Motivation
- Our Solution
- Experimental Results

Why is IC Packaging Important?
All Electronic Circuitry Resides in a Package
- The package serves many purposes:
  1) Protection of devices
  2) Density translation
  3) Thermal dissipation
  4) Manufacturing standardization
Packaging Limits System Performance

Why is packaging limiting performance?
IC Design/Fabrication is Outpacing Package Technology
- We are seeing an exponential increase in IC transistor performance
- >1.3 billion transistors on 1 die [Fall IDF-05]

Why is packaging limiting performance?
Packages Have Been Designed for Mechanical Performance
- Electrical performance was not the primary consideration
- The ICs themselves limited electrical performance
- Package performance was not the bottleneck

Why is packaging limiting performance?
VLSI Performance Exceeds Package Performance
- Packages are optimized for mechanical reliability, but are still used due to cost
- IC performance far exceeds package performance
On-Chip:
- $f_{IC} > 4$ GHz
- large signal counts
- exponential scaling
Package:
- $f_{pkg} < 2$ GHz
- limited signal counts
- linear scaling

Why is packaging limiting performance?
Package Interconnect Contains Parasitic Inductance
- Long interconnect paths
- Large return current loops
[Figure: wire bond inductance (up to 10s of nH)]

Why is packaging limiting performance?
Package Parasitics Limit Performance
- Excess inductance causes package noise
- Noise limits how fast the package can transmit data
  1. Supply Bounce (due to self inductance of VDD/GND bondwires)
  2. Signal Coupling (due to mutual inductance between nearby signal bondwires)

Why is packaging limiting performance?
Aggressive Package Design Helps, but Is Expensive
- 95% of ASIC design-starts are wire bonded
- Goal: extend the life of current packages
Per-pin inductance and cost by package type:
- QFP, wire bond: ~4.5 nH, $0.22 / pin
- BGA, wire bond: ~3.7 nH, $0.34 / pin
- BGA, flip-chip: ~1.2 nH, $0.63 / pin

Our Solution
"Encode Off-Chip Data to Avoid Inductive Cross-talk"
Avoid the following cases:
1) Excessive switching in the same direction (reduces ground/power bounce)
2) Excessive X-talk on a signal when switching (reduces edge degradation)
3) Excessive X-talk on a signal when static (reduces glitching)

Our Solution
This results in:
1) A subset of vectors is transmitted that avoids inductive X-talk.
2) The off-chip bus can now be run at a higher data rate.
3) The subset of vectors running faster can achieve a higher throughput than the original set of vectors running slower (including overhead).
Throughput of fewer vectors at a higher data rate > Throughput of more vectors at a lower data rate
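The throughput claim can be checked with simple arithmetic. The sketch below is only an illustration: the function `effective_throughput`, its `overhead_fraction` parameter, and every number used are hypothetical and are not results from the talk.

```python
def effective_throughput(width_bits, data_rate_mbps, overhead_fraction=0.0):
    """Payload throughput in Mb/s.

    overhead_fraction is the share of bus cycles spent on stutter states
    (a hypothetical parameter; 0.0 means every cycle carries data).
    """
    if not 0.0 <= overhead_fraction < 1.0:
        raise ValueError("overhead_fraction must be in [0, 1)")
    return width_bits * data_rate_mbps * (1.0 - overhead_fraction)


# Hypothetical figures chosen only to illustrate the comparison.
unencoded = effective_throughput(8, 400)        # all vectors, noise-limited rate
encoded = effective_throughput(8, 600, 0.20)    # faster bus, 20% stutter cycles

print(unencoded, encoded, encoded > unencoded)
```

With these assumed numbers the encoded bus comes out ahead; whether it does in practice depends on how much the data rate can actually be raised and on the measured stutter overhead.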

Bus Stuttering CODEC
Intermediate States are Inserted Between Noise-Causing Transitions
- Stutter states limit the number of simultaneously switching signals
- The source-synchronous clock is gated during the stutter state
Un-encoded: the B -> C vector sequence causes a noise limit violation
Encoded: the B -> C vector sequence is eliminated using a stutter
[Figure: core, encoder, and package. Without encoding the bus carries A, B, C; with encoding it carries A, B, stutter, C.]

Bus Stuttering CODEC – Noise Sources
Simultaneous Switching Noise
- Supply Bounce: induced self voltage
- Glitching: coupling onto non-switching signals
- Edge Degradation: coupling onto switching signals; data-dependent delay

Terminology
Define the following:
- $n$ = width of the bus segment, where each bus segment consists of $n-2$ signals, 1 $V_{DD}$, and 1 $V_{SS}$.
- $j$ = the segment consisting of an $n$-bit bus.
  - $j$ is the segment under consideration.
  - $j-1$ is the segment to the immediate left.
  - $j+1$ is the segment to the immediate right.
  - Each segment has the same $V_{DD}$/$V_{SS}$ placement.

Terminology
Define the following:
- $v_i^j$ = the transition (vector sequence) that the $i$-th signal in the $j$-th segment is undergoing, where
  - $v_i^j = 1$: rising edge
  - $v_i^j = -1$: falling edge
  - $v_i^j = 0$: signal is static
This 3-valued algebra enables us to model mutual inductive coupling of any sign.
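The three-valued encoding can be derived from two consecutive bus vectors by subtracting bit values. A minimal Python sketch follows; the function name `transition_vector` is hypothetical and not from the talk.

```python
def transition_vector(prev_bits, next_bits):
    """Map two bus vectors to per-signal transitions.

    Each result entry is 1 (rising), -1 (falling), or 0 (static).
    """
    if len(prev_bits) != len(next_bits):
        raise ValueError("vectors must have the same width")
    for bit in (*prev_bits, *next_bits):
        if bit not in (0, 1):
            raise ValueError("bus vectors must contain only 0 and 1")
    return tuple(b - a for a, b in zip(prev_bits, next_bits))


# Signals 1..3 of a segment going from 011 to 110.
print(transition_vector((0, 1, 1), (1, 1, 0)))
```

For the vectors shown, the first signal rises, the second is static, and the third falls.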

Terminology
Define the following coding constraints:
- Supply Bounce: if $v_i^j$ is a supply pin, the total bounce on this pin is bounded by $P_{bnc}$. $P_{bnc}$ is a user-defined constant.
- Glitching: if $v_i^j$ is a signal pin and is static ($v_i^j = 0$), the total magnitude of the glitch from switching neighbors should be less than $P_0$. $P_0$ is a user-defined constant.
- Edge Degradation: if $v_i^j$ is a signal pin and is switching ($v_i^j = 1$ or $-1$), the total coupling from switching neighbors should be greater than $P_1$ for a rising edge, or less than $P_{-1}$ for a falling edge. This coupling should not hurt (should aid) the transition. $P_1$ and $P_{-1}$ are user-defined constants.

Terminology
Also define the following:
- $p$ = how far away to consider coupling (e.g., for $p = 3$, consider $K_{11}$, $K_{12}$, and $K_{13}$ on each side of the victim)
- $k_q$ = magnitude of the coupled voltage on pin $i$ when its $q$-th neighbor ($q \le p$) switches

Methodology
For each pin $v_i^j$ within segment $j$, we write a series of constraints that bound the inductive cross-talk magnitude.
The constraints differ depending on whether $v_i^j$ is a signal or a power pin.
The coupling constraints consider signals in the adjacent segments ($j+1$, $j-1$), depending on $p$.

Methodology
Glitching: coupling is bounded by $P_0$
Example: $v_2^j = 0$ and $p = 3$. This means the three adjacent neighbors on either side of $v_2^j$ need to be considered ($v_4^{j-1}$, $v_0^j$, $v_1^j$, $v_3^j$, $v_4^j$, $v_0^{j+1}$). Note that we use modulo-$n$ arithmetic (and consider adjacent segments as required).
For $v_2^j = 0$ (static):
$$-P_0 < k_3 v_4^{j-1} + k_2 v_0^j + k_1 v_1^j + k_1 v_3^j + k_2 v_4^j + k_3 v_0^{j+1} < P_0$$
The four terms for the supply pins ($v_4^{j-1}$, $v_0^j$, $v_4^j$, $v_0^{j+1}$) are 0.
The constraint equation is tested against each possible transition, and the transitions that violate the constraint are eliminated.
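The modulo-n neighbor lookup and the glitching test can be sketched in Python as follows. The helper names `neighbours` and `glitch_ok`, the dictionary layout, and all numeric values are hypothetical; positive coupling magnitudes are assumed purely for illustration.

```python
def neighbours(i, n, p):
    """List (segment_offset, pin_index, distance) for the p pins on each side of pin i.

    segment_offset is -1, 0, or +1 relative to segment j (assuming p <= n).
    """
    found = []
    for q in range(1, p + 1):
        for position in (i - q, i + q):
            segment_offset, pin_index = divmod(position, n)  # modulo-n wrap
            found.append((segment_offset, pin_index, q))
    return found


def glitch_ok(transitions, i, n, p, k, p0):
    """Check -P0 < sum(k_q * v) < P0 for a static victim pin i.

    transitions maps (segment_offset, pin_index) to -1, 0, or 1;
    pins that are absent (for example supply pins) count as 0.
    k maps the distance q to the coupling magnitude k_q.
    """
    total = sum(k[q] * transitions.get((seg, pin), 0)
                for seg, pin, q in neighbours(i, n, p))
    return -p0 < total < p0


K = {1: 0.10, 2: 0.05, 3: 0.02}   # hypothetical coupling magnitudes (V)
P0 = 0.15                         # hypothetical glitch bound (V)

print(neighbours(2, 5, 3))
print(glitch_ok({(0, 1): 1, (0, 3): 1}, 2, 5, 3, K, P0))    # both neighbors rise
print(glitch_ok({(0, 1): 1, (0, 3): -1}, 2, 5, 3, K, P0))   # neighbors cancel
```

Under these assumed values, two adjacent signals rising together exceed the bound on the static victim, while opposite transitions cancel and pass.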

Methodology
Edge Degradation: coupling is bounded by $P_1$ and $P_{-1}$
Example: $v_2^j = 1$ or $-1$, and $p = 3$. This means the three adjacent neighbors on either side of $v_2^j$ need to be considered ($v_4^{j-1}$, $v_0^j$, $v_1^j$, $v_3^j$, $v_4^j$, $v_0^{j+1}$).
For $v_2^j = 1$ (rising):
$$k_3 v_4^{j-1} + k_2 v_0^j + k_1 v_1^j + k_1 v_3^j + k_2 v_4^j + k_3 v_0^{j+1} > P_1$$
For $v_2^j = -1$ (falling):
$$k_3 v_4^{j-1} + k_2 v_0^j + k_1 v_1^j + k_1 v_3^j + k_2 v_4^j + k_3 v_0^{j+1} < P_{-1}$$
Again, the constraint equations are tested against each possible transition, and the transitions that violate the constraints are eliminated.

Methodology
Supply Bounce: coupling is bounded by $P_{bnc}$
Example: $v_0^j = V_{DD}$ or $V_{SS}$. The total number of switching signals that use $v_0^j$ to return current must be considered. Due to the symmetry of the bus arrangement, signal pins always return current through two supply pins, i.e., ($v_0^{j-1}$ and $v_0^j$) or ($v_4^j$ and $v_4^{j+1}$). This results in the self inductance of the return path being divided by 2.
Let $z = |L \, di/dt|$ for any pin. Then:
- $v_0^j = V_{DD}$: $(z/2) \cdot (\text{number of } v_i^j \text{ pins that are } 1) < P_{bnc}$
- $v_4^j = V_{SS}$: $(z/2) \cdot (\text{number of } v_i^j \text{ pins that are } -1) < P_{bnc}$
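The two supply-bounce inequalities reduce to counting rising and falling signals. A minimal sketch, assuming hypothetical values for `z` and `p_bnc` and a hypothetical function name `supply_bounce_ok`:

```python
def supply_bounce_ok(signal_transitions, z, p_bnc):
    """Return (vdd_ok, vss_ok) for one segment.

    signal_transitions holds -1, 0, or 1 for each signal pin.
    z is |L di/dt| for one pin; the return path is shared by two
    supply pins, hence the factor z / 2.
    """
    rising = sum(1 for t in signal_transitions if t == 1)
    falling = sum(1 for t in signal_transitions if t == -1)
    vdd_ok = (z / 2) * rising < p_bnc
    vss_ok = (z / 2) * falling < p_bnc
    return vdd_ok, vss_ok


Z = 0.10      # hypothetical |L di/dt| in volts
P_BNC = 0.12  # hypothetical bounce bound in volts

print(supply_bounce_ok((1, 1, 1), Z, P_BNC))    # three signals rise together
print(supply_bounce_ok((1, 1, -1), Z, P_BNC))   # mixed directions
```

With these assumed numbers, three simultaneous rising edges violate the bound on the supply pin, while a mixed transition stays within both bounds.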

Methodology
For each pin in the $j$-th segment of the bus, constraints are written.
If the pin is a signal, 3 constraint equations are written:
- $v_i^j = 0$: the bit is static and a glitching constraint is written.
- $v_i^j = 1$: the bit is rising and an edge degradation constraint is written.
- $v_i^j = -1$: the bit is falling and an edge degradation constraint is written.
If the pin is $V_{DD}$, 1 constraint equation is written to avoid supply bounce.
If the pin is $V_{SS}$, 1 constraint equation is written to avoid ground bounce.
For the segment, 1 constraint equation is written to constrain power.

Methodology
The total number of constraint equations written is therefore:
$$3n - 4$$
Each equation must be evaluated for each possible transition to verify whether the transition meets the constraints.
The total number of transitions that are evaluated depends on $n$ and $p$:
$$3^{\,n + 2p - 6}$$
This follows since there are $n-2$ signal pins in segment $j$ and $2p-4$ signal pins in the neighboring segments, each taking one of 3 values.
The values of $n$ and $p$ are small in practice, hence this is tractable.
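To get a feel for how small these counts are, the two formulas can be evaluated directly. The function names below are hypothetical, and the sketch assumes $p \ge 2$ so that the $2p-4$ term is non-negative.

```python
def num_constraints(n):
    """3 constraints per signal pin (n - 2 of them) plus 1 each for VDD and VSS."""
    return 3 * (n - 2) + 2          # equals 3n - 4


def num_transitions(n, p):
    """3 possible values for each of the (n - 2) + (2p - 4) signal pins considered."""
    if p < 2:
        raise ValueError("this sketch assumes p >= 2")
    return 3 ** (n + 2 * p - 6)


for n, p in [(5, 2), (5, 3), (10, 3)]:
    print(n, p, num_constraints(n), num_transitions(n, p))
```

For a 5-pin segment this gives 11 constraints and, depending on $p$, 27 or 243 transitions to test; a 10-pin segment with $p = 3$ gives 26 constraints and 59049 transitions.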

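Example
Number of constraints = $3n - 4 = 11$
1) $v_0^j = V_{DD}$: $(L/2) \cdot (\text{number of } v_i^j \text{ pins that are } 1) < P_{bnc}$
2) $v_1^j = 1$: $k_1 v_2^j + k_2 v_3^j > P_1$
3) $v_1^j = -1$: $k_1 v_2^j + k_2 v_3^j < P_{-1}$
4) $v_1^j = 0$: $-P_0 < k_1 v_2^j + k_2 v_3^j < P_0$
5) $v_2^j = 1$: $k_1 v_1^j + k_1 v_3^j > P_1$
6) $v_2^j = -1$: $k_1 v_1^j + k_1 v_3^j < P_{-1}$
7) $v_2^j = 0$: $-P_0 < k_1 v_1^j + k_1 v_3^j < P_0$
8) $v_3^j = 1$: $k_2 v_1^j + k_1 v_2^j > P_1$
9) $v_3^j = -1$: $k_2 v_1^j + k_1 v_2^j < P_{-1}$
10) $v_3^j = 0$: $-P_0 < k_2 v_1^j + k_1 v_2^j < P_0$
11) $v_4^j = V_{SS}$: $(L/2) \cdot (\text{number of } v_i^j \text{ pins that are } -1) < P_{bnc}$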
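The elimination step for this 5-pin example (three signal pins between $V_{DD}$ and $V_{SS}$) can be sketched by enumerating all 27 transitions and testing each against the 11 rules, numbered as on this slide. Every constant below is hypothetical, positive coupling magnitudes are an assumption, and `L_TERM` stands in for the slide's $L$ term; the surviving set will differ for real package parameters.

```python
from itertools import product

K1, K2 = 0.10, 0.05                # hypothetical coupling magnitudes (V)
P0, P1, PM1 = 0.12, -0.12, 0.12    # hypothetical glitch and edge bounds (V)
PBNC, L_TERM = 0.12, 0.10          # hypothetical bounce bound and L term (V)


def violated_rules(v1, v2, v3):
    """Return the rule numbers (1 to 11) violated by transition (v1, v2, v3)."""
    signals = {1: v1, 2: v2, 3: v3}
    coupling = {
        1: K1 * v2 + K2 * v3,
        2: K1 * v1 + K1 * v3,
        3: K2 * v1 + K1 * v2,
    }
    bad = []
    # Rule 1: supply bounce on VDD from rising signals.
    if not (L_TERM / 2) * sum(t == 1 for t in signals.values()) < PBNC:
        bad.append(1)
    for pin in (1, 2, 3):
        base = 2 + 3 * (pin - 1)   # base: rising, base + 1: falling, base + 2: static
        c = coupling[pin]
        if signals[pin] == 1 and not c > P1:
            bad.append(base)
        elif signals[pin] == -1 and not c < PM1:
            bad.append(base + 1)
        elif signals[pin] == 0 and not -P0 < c < P0:
            bad.append(base + 2)
    # Rule 11: ground bounce on VSS from falling signals.
    if not (L_TERM / 2) * sum(t == -1 for t in signals.values()) < PBNC:
        bad.append(11)
    return bad


legal = [t for t in product((-1, 0, 1), repeat=3) if not violated_rules(*t)]
print(len(legal), "of 27 transitions survive")
```

Tightening or loosening the assumed bounds changes how many transitions survive, which is the trade-off between noise margin and stutter overhead.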

Example
Transitions Eliminated due to Constraint Violations
[Table: each transition with the rule(s) it violates, under aggressive and non-aggressive constraint settings]

Bus Stuttering CODEC - Algorithm
A directed graph $G$ is created from the surviving legal transitions.
The Directed Graph is Used to Map Transitions Between any Two Vectors
- A transition path (which may include stutters) exists between any two vectors if:
  - There exist at least two outgoing edges for each vector $v_s \in G$ (including the self-edge)
  - There exist at least two incoming edges for each vector $v_d \in G$ (including the self-edge)
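One way to find the stutter states for a given source and destination vector is a breadth-first search over the graph of legal transitions. This is a sketch under assumptions: the talk does not state which search the encoder uses, and the function `stutter_path` and the small example graph are hypothetical.

```python
from collections import deque


def stutter_path(graph, src, dst):
    """Shortest list of vectors from src to dst using only legal transitions.

    graph maps each vector to the vectors reachable in one legal transition.
    The vectors strictly between src and dst are the stutter states.
    Returns None if dst cannot be reached.
    """
    parents = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in graph.get(node, ()):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical legal-transition graph; the edge 000 -> 111 is illegal.
legal_graph = {
    "000": ["001", "011"],
    "001": ["011", "000"],
    "011": ["111", "001"],
    "111": ["011"],
}

path = stutter_path(legal_graph, "000", "111")
print(path, "stutters:", path[1:-1])
```

In this example the illegal 000 to 111 transition is replaced by 000, 011, 111, so a single stutter state is inserted.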

Bus Stuttering CODEC - Construction
Multiple Stutter States Can Be Used
- Between 0 and 2 (W_bus - 1) stutters can be inserted between any two vectors
- Results show that for segments up to 8 bits, more than 3 stutters is rare
Overhead
- Overhead increases as segment sizes increase
- Still useful, since segments greater than 8 bits are rarely used

Bus Stuttering CODEC – Physical Results
Circuit Implementation
- 32 pipeline stages used
- Pipeline reset after 32 idle states (similar to SRIO, HT, and PCI Express)
- Protocol inherently handles pipeline overflow

Bus Stuttering CODEC – Physical Results
SPICE Simulations
- 3-bit segment (5 pins including VDD and GND)
- Fixed di/dt
- Maximum noise reduced by limiting simultaneously switching signals
SPICE simulations match analytical predictions with great fidelity
[Figure panels: Ground Bounce, Glitching, Edge Degradation]

Bus Stuttering CODEC – Physical Results
TSMC 0.13um Synthesis Results
- RTL design, synthesized and mapped
- Segment sizes 2 to 8 implemented
- Logic, delay, and area evaluated

Bus Stuttering CODEC – Physical Results
Xilinx FPGA, 0.35um Implementation Results
- RTL design implemented
- Xilinx Virtex-II Pro FPGA

Bus Stuttering CODEC – Physical Results
Xilinx FPGA, 0.35um Implementation Results
- RTL design implemented
- Logic operation verified
- Noise reduced from 16% to 4% (segments with 4 signal pins)

Conclusion
- Packaging Performance is the Largest System Bottleneck
- Stutter Encoding Avoids Worst-Case Noise Patterns
- Performance Improved Even After Considering Encoding Overhead