Washington State University EE 587 SoC Design & Test Partha Pande School of EECS Washington State University pande@eecs.wsu.edu
SoC Physical Design Issues Interconnect Architectures and Signal Integrity
Design Challenges
- Non-scalable global wire delay
- Moving signals across a large die within one clock cycle is not possible.
- Current interconnection architecture: buses are inherently non-scalable.
- Transmission of digital signals along wires is not reliable.
Bus non-scalability
- Clock cycle depends on the parasitics and the bus length
- Multiple bus segments
- More than one design iteration
- Converges to a network
Bus Architectures
Split Bus Architecture
Achievable Clock Cycle in a Bus segment
Minimize Power Consumption
- Modification of interconnect architectures
- Incorporate parallelism (ITRS 2003 & ISSCC 2004)
- Decoupling of communication and processing
- Modular architecture
- Minimize use of global wires
- Locality in communication
SoC Micro architecture Trend
- 50-100K gates block: no global wire delay problem.
- Block-based hierarchical design style that uses block sizes of 50-100K gates.
- Single synchronous clock regions will span only a small fraction of the chip area.
- Different self-synchronous IPs communicate via network-oriented protocols.
- Structured network wiring leads to deterministic electrical parameters, which reduces latency and increases bandwidth.
- Failures due to the inherently unreliable physical medium can be addressed by introducing error correction mechanisms.
New design paradigm
- New designs: very large number of functional blocks
- Moving bits around efficiently
- Develop on-chip infrastructure to solve future inter-block communication bottlenecks
- Development of infrastructure IPs
- SoC = (Σ FIP + Σ I²P), that is, functional IPs plus infrastructure IPs
Silicon Back plane
MIPS SoC-it
The network-on-chip paradigm
- Driven by
  - Increased levels of integration
  - Complexity of large SoCs
  - New designs counting 100s of IP blocks
  - Need for platform-based design methodologies
  - DSM constraints (power, delay, time-to-market, etc.)
NoC Features
- Decoupling of functionality from communication
- Dedicated infrastructure for data transport
- NoC infrastructure: switch, link
Some Common Architectures (a) Mesh, (b) Folded-Torus (FT) and (c) Butterfly Fat Tree (BFT)
Data Transmission
- Packet-based communication
- Low memory requirement
- Packet switching
- Wormhole routing
- Packets are broken down into flow control units, or flits, which are then routed in a pipelined fashion
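A minimal sketch of how a packet might be broken into flits. The `Flit` class, the `header`/`body`/`tail` labels, the 4-byte flit size, and the latency helper are illustrative assumptions, not details from the slides. The latency formula assumes one cycle per hop and no contention.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Flit:
    kind: str       # "header", "body", or "tail" (hypothetical labels)
    payload: bytes


def packetize(dest: int, data: bytes, flit_bytes: int = 4) -> List[Flit]:
    """Split one packet into a header flit followed by payload flits."""
    # The header flit carries the destination so switches can set up the path.
    header = Flit("header", dest.to_bytes(flit_bytes, "big"))
    chunks = [data[i:i + flit_bytes] for i in range(0, len(data), flit_bytes)]
    flits = [header] + [Flit("body", c) for c in chunks]
    if len(flits) > 1:
        flits[-1].kind = "tail"  # the last payload flit releases the path
    return flits


def pipelined_latency(num_flits: int, hops: int) -> int:
    """Cycles until the last flit arrives (one cycle per hop, no contention)."""
    return hops + num_flits - 1


if __name__ == "__main__":
    for flit in packetize(3, b"abcdefgh"):
        print(flit.kind, flit.payload)
```

Because flits follow the header in a pipeline, each switch only needs buffering for a few flits rather than a whole packet, which is the low memory requirement noted above.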
Connecting Different IP Blocks Using Tree Architecture
Communication Pipelining
- Need to constrain the delay of each stage to within 15 FO4 (fan-out-of-4 inverter delays)
Signal Integrity
- According to the ITRS, signal integrity will become a major issue in future technologies
- Causes for such inherent unreliability
  - Shrinking geometries, layout dimensions
  - Reduction in the charge used for storing bits
  - Increased probability of transient events like:
    - Crosstalk
    - Ground bounce
    - Alpha particle hits
Micro network Protocol Stack
On Chip Signal Transmission
- Future global wires will function as lossy transmission lines
- Reduced-swing signaling
- Noise due to crosstalk, electromagnetic interference, and other factors will have increased impact.
- It will not be possible to abstract the physical layer of on-chip networks as a fully reliable, fixed-delay channel.
- At the micro network stack layers atop the physical layer, noise is a source of local transient malfunctions.
Coding Schemes
- Low-Power Coding
  - Reducing self-transition activity
- Crosstalk Avoidance Coding
  - Reducing coupling with adjacent lines
- Error Control Coding
  - SEC, SECDED
Low Power Coding
- Reduction of self-transition activity
- Bus-Invert Code
  - Data is inverted and an invert bit is sent to the decoder if the current data word differs from the previous data word in more than half the number of bits
  - Effectiveness decreases with increase in bus width
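The bus-invert rule can be sketched as follows. The function names are hypothetical. The sketch assumes the comparison is made against the value last driven on the data lines, which is the usual formulation of bus-invert coding.

```python
from typing import Tuple


def bus_invert_encode(word: int, prev_bus: int, width: int) -> Tuple[int, int]:
    """Return (value to drive on the bus, invert bit)."""
    mask = (1 << width) - 1
    toggles = bin((word ^ prev_bus) & mask).count("1")
    if 2 * toggles > width:          # more than half of the lines would toggle
        return (~word) & mask, 1     # send the complement and flag it
    return word & mask, 0


def bus_invert_decode(bus: int, invert: int, width: int) -> int:
    """Recover the original data word from the bus value and the invert bit."""
    mask = (1 << width) - 1
    return (~bus) & mask if invert else bus & mask


if __name__ == "__main__":
    bus, inv = bus_invert_encode(0xFF, prev_bus=0x00, width=8)
    print(hex(bus), inv, hex(bus_invert_decode(bus, inv, 8)))
```

With this rule at most half of the data lines toggle per transfer, at the cost of one extra wire for the invert bit.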
Error Control Coding
- Linear block codes
  - In an (n, k) linear block code, a data block k bits long is mapped onto an n-bit code word
- Forward Error Correction or Automatic Repeat Request
- Redundant wires
- Possibility of voltage reduction
- Energy efficiency is an important criterion
- Codec overhead
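As a small instance of an (n, k) linear block code used for forward error correction, here is a sketch of the (7, 4) Hamming code, which corrects any single bit error. The slides do not name this particular code; it is chosen only because it is compact. Function names are hypothetical.

```python
from typing import List


def hamming74_encode(data: List[int]) -> List[int]:
    """Map 4 data bits onto a 7-bit code word (parity at positions 1, 2, 4)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(code: List[int]) -> List[int]:
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(code)
    syndrome = 0
    for position, bit in enumerate(c, start=1):
        if bit:
            syndrome ^= position    # XOR of the positions holding a 1
    if syndrome:                    # non-zero syndrome names the faulty position
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


if __name__ == "__main__":
    word = hamming74_encode([1, 0, 1, 1])
    word[4] ^= 1                    # inject a single transient error
    print(hamming74_decode(word))
```

The three redundant bits correspond to the redundant wires and codec overhead listed above.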
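Worst Case Crosstalk
- Transition from 101 to 010 pattern or vice versa
- Due to Miller capacitance, the worst-case effective capacitance of a wire with two adjacent wires becomes (1+4λ)CL, where CL is the load capacitance of the wire and λ is the ratio of coupling capacitance to bulk capacitance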
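A sketch of the usual first-order crosstalk model for three adjacent wires. It is an assumption, not taken from the slide, that each neighbor adds 0, 1, or 2 times λ to the middle wire depending on whether it switches with the wire, stays quiet, or switches against it. The function name is hypothetical.

```python
def coupling_multiplier(before: str, after: str) -> int:
    """Return k such that the middle wire sees an effective (1 + k*lambda) * C_L.

    `before` and `after` are 3-character bit strings for (left, middle, right).
    """
    # Per-wire transition: -1 for falling, 0 for quiet, +1 for rising.
    left, victim, right = (int(a) - int(b) for b, a in zip(before, after))
    if victim == 0:
        raise ValueError("the middle wire must switch in this model")
    # A neighbor moving the opposite way doubles its coupling (Miller effect).
    return abs(victim - left) + abs(victim - right)


if __name__ == "__main__":
    for pattern in [("000", "111"), ("000", "010"), ("001", "010"), ("101", "010")]:
        print(pattern, coupling_multiplier(*pattern))
```

Under this model the 101 to 010 transition gives k = 4, and a code that forbids opposite transitions on adjacent wires caps k at 2.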
Joint Crosstalk Avoidance and Single Error Correction Codes
- Reduce crosstalk as well as correct errors due to other transient events
- Duplicate Add Parity (DAP)
- Dual Rail Code (DR)
- Boundary Shift Code (BSC)
- Modified Dual Rail Code (MDR)
- Worst case crosstalk capacitance is reduced to (1+2λ)CL
Duplicate-Add-Parity Code
- Each bit is duplicated
- A parity bit from one copy is computed
- Same as Dual Rail Code
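A minimal sketch of a DAP encoder and decoder. The wire ordering (the two copies interleaved, parity last) and the function names are assumptions made for illustration. The decoder recomputes the parity of the first copy and falls back to the second copy on a mismatch, which corrects any single bit error.

```python
from typing import List


def parity(bits: List[int]) -> int:
    """XOR of all bits."""
    result = 0
    for b in bits:
        result ^= b
    return result


def dap_encode(data: List[int]) -> List[int]:
    """k data bits -> 2k + 1 wires: each bit sent twice, then one parity bit."""
    coded = []
    for b in data:
        coded += [b, b]             # duplicated bits sit on adjacent wires
    return coded + [parity(data)]


def dap_decode(coded: List[int]) -> List[int]:
    """Return the data bits, correcting up to one flipped wire."""
    copy_a = coded[0:-1:2]
    copy_b = coded[1:-1:2]
    # If copy A agrees with the parity wire it is taken as correct;
    # otherwise the single error is in copy A or the parity wire, so use copy B.
    return copy_a if parity(copy_a) == coded[-1] else copy_b


if __name__ == "__main__":
    word = dap_encode([1, 0, 1, 1])
    word[2] ^= 1                    # single transient error
    print(dap_decode(word))
```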
Crosstalk Avoidance Double Error Correction Code (CADEC)
- The 32-bit flit is Hamming coded and then an overall parity is calculated
- All bits apart from the overall parity are duplicated
- The 32-bit original flit becomes 77 bits
- Minimum Hamming distance is 7
- Worst case crosstalk capacitance is reduced to (1+2λ)CL
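The 77-bit figure can be checked with a short calculation. The helper names are hypothetical. The sketch assumes a standard Hamming code, which needs the smallest r with 2^r >= k + r + 1 check bits for k data bits.

```python
def hamming_check_bits(k: int) -> int:
    """Smallest r such that 2**r >= k + r + 1 (single-error-correcting Hamming code)."""
    r = 1
    while 2 ** r < k + r + 1:
        r += 1
    return r


def cadec_width(k: int) -> int:
    """Wires needed for a k-bit flit under the CADEC construction."""
    hamming_len = k + hamming_check_bits(k)   # 32 data + 6 check = 38 bits
    return 2 * hamming_len + 1                # duplicate them, add overall parity


if __name__ == "__main__":
    print(hamming_check_bits(32), cadec_width(32))
```

For a 32-bit flit this gives 6 check bits and 2 x 38 + 1 = 77 wires.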
Energy Savings with Joint Codes
- Due to increased error resilience, lower noise margins can be tolerated and hence the operating voltage can be reduced
- Coding adds overhead in terms of extra wires and codec
Voltage Swing Reduction for CADEC
- The probability of word error for DAP
- [Plot axes: V, Word error rate]
Energy Savings with CADEC
Communication Pipelining
- Inter- and intra-switch stages
- Pipelined data transfer
Latency Characteristics
- The codes should be optimized
- The codec can be merged with existing stages
- No latency penalty
Adaptive Supply Voltage Links
- Dynamic Voltage Scaling (DVS)
  - DVS schemes dynamically adjust the processor clock frequency and supply voltage to just meet the instantaneous performance requirement, making the system energy aware.
- Communication architectures display a wide variance in their utilization depending on the communication patterns of applications.
- An adaptive link adapts its frequency and supply voltage in accordance with the instantaneous traffic bandwidth.
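One way to picture link-level DVS is a lookup that picks the slowest operating point that still carries the current traffic. The table of (frequency, voltage) pairs, the 32-bit link width, and the function name below are all hypothetical values chosen for illustration, not figures from the slides.

```python
from typing import List, Tuple

# Hypothetical operating points, sorted by frequency: (MHz, supply voltage in V).
LEVELS: List[Tuple[int, float]] = [(250, 0.9), (500, 1.0), (1000, 1.2)]


def select_level(required_mbps: int, link_width_bits: int,
                 levels: List[Tuple[int, float]] = LEVELS) -> Tuple[int, float]:
    """Return the lowest (frequency, voltage) pair that meets the bandwidth demand."""
    for freq_mhz, vdd in levels:
        # One transfer of link_width_bits per cycle: MHz * bits = Mbit/s.
        if freq_mhz * link_width_bits >= required_mbps:
            return freq_mhz, vdd
    return levels[-1]   # demand exceeds capacity: run at the fastest point


if __name__ == "__main__":
    for demand in (4_000, 10_000, 40_000):
        print(demand, select_level(demand, link_width_bits=32))
```

A real controller would also need a traffic predictor and would have to account for the time and energy spent switching between levels.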
Repeater Insertion & Coding
- Repeater insertion reduces interconnect wire delay
- Increases power dissipation due to large drivers
- CACs reduce coupling capacitance
- Joint repeater insertion and CAC is a promising solution to reduce power in global wires
Repeater Insertion & Coding
- Reference: A Low-Power Bus Design Using Joint Repeater Insertion and Coding
- 130 nm
Repeater Insertion & Coding 45 nm
Reliability
- Crosstalk, electromigration, material ageing, ...
- Transient failures
  - Error control coding
  - Crosstalk avoidance coding
  - Power, area trade-off
- Permanent failures
  - Spare switches and links
  - Overall routing complexity
  - Effect on system performance