Research on Analysis and Physical Synthesis. Chung-Kuan Cheng, CSE Department, UC San Diego
Outline. Analysis (Signal Integrity): SPICE Diego, RLC Reduction. Synthesis (Interconnect Dominant): Networks on Chip, Clock Distribution, Floorplanning, Datapath. Packaging (High Performance).
Analysis: SPICE. Large netlists, e.g. 100M transistors at 5 GHz. Strong coupling: interconnect delay, crosstalk, voltage drop, ground bounce. Process variations. Short-channel devices.
Why is SPICE Diego better? SPICE Diego: a fast, accurate transistor-level circuit simulator. Powerful matrix solver engine. Transistor devices: capable of capturing coupling effects; the device model includes the Miller effect. Lower memory requirement (no LU factorization; does not save the matrix for transistors). Applications: interconnect delay, crosstalk, voltage drop, ground bounce, simultaneous switching noise.
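To make "no LU factorization" concrete, here is a minimal Python sketch of a voltage-drop calculation that solves the nodal equations iteratively, so no factored matrix is ever stored. The slides do not say which solver SPICE Diego uses; Gauss-Seidel is swapped in here purely for illustration, and the rail topology, the function name `rail_voltages`, and all parameter values are assumptions.

```python
def rail_voltages(n_nodes, r_seg, i_load, vdd, tol=1e-14, max_iter=100000):
    """Gauss-Seidel solve of a 1-D power rail: pad -R- n1 -R- n2 ... (hypothetical topology)."""
    v = [vdd] * (n_nodes + 1)  # v[0] is the ideal supply pad and is never updated
    for _ in range(max_iter):
        biggest_change = 0.0
        for k in range(1, n_nodes + 1):
            if k < n_nodes:
                # KCL at an interior node:
                # (v[k-1] - v[k]) / R + (v[k+1] - v[k]) / R = i_load
                new = 0.5 * (v[k - 1] + v[k + 1] - i_load * r_seg)
            else:
                # The last node has only one neighbour.
                new = v[k - 1] - i_load * r_seg
            biggest_change = max(biggest_change, abs(new - v[k]))
            v[k] = new
        if biggest_change < tol:
            break
    return v


# Four loads of 10 mA each, 0.1 ohm per rail segment, 1.0 V supply (assumed values).
volts = rail_voltages(n_nodes=4, r_seg=0.1, i_load=0.01, vdd=1.0)
worst_drop = volts[0] - volts[-1]
print(f"worst-case IR drop: {worst_drop * 1e3:.3f} mV")
```

For this chain the segment currents are 4, 3, 2 and 1 times the load current, so the far-end drop can be checked by hand as 0.1 x 0.01 x (4 + 3 + 2 + 1) = 10 mV. Only the voltage vector is kept in memory, which is the point of the illustration.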
Experimental Results (figure labels: chip, board, power supply). Test case: board / packaging / chip power network with fully coupled packaging inductance; 60k elements, 5000 nodes. SPICE: failed. Our tool: less than 10 minutes.
Synthesis: Clock Distribution. Process variations cause a significant amount of clock skew. As working frequency keeps increasing, skew accounts for a large portion of the clock period. A mesh is effective at reducing skew. There is no theoretical design guideline for mesh structure.
State of the Art. In engineering practice, a very deep balanced buffer tree plus a mesh is widely adopted for global clock distribution. IBM Power4: 64 by 64 grid at the bottom of an H-tree. Intel IA: clock stripe at the bottom of a buffer tree. “Skew Averaging”: shunts at different levels; the “Skew Averaging Factor” is determined by simulation. No guideline for routing resource planning is known yet.
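The effect of a shunt can be illustrated with a toy simulation. The sketch below is not the model used in the slides: it assumes two clock nodes, each driven by an ideal step through a driver resistance, with the second driver switching 20 ps late, and compares the 50% crossing times with and without a resistor shorting the two nodes. The function name `crossing_times` and all component values are hypothetical.

```python
def crossing_times(r_drv, c_node, r_shunt, delay_steps, dt=1e-14, n_steps=10000):
    """Forward-Euler transient of two driven RC nodes joined by a shunt resistor.

    Returns the 50% crossing time of each node; r_shunt=None means no shunt.
    """
    v = [0.0, 0.0]
    t_cross = [None, None]
    g_shunt = 0.0 if r_shunt is None else 1.0 / r_shunt
    for step in range(n_steps):
        # Each driver is an ideal unit step that fires at its own delay.
        src = [1.0 if step >= delay_steps[i] else 0.0 for i in range(2)]
        dv = [
            ((src[i] - v[i]) / r_drv + (v[1 - i] - v[i]) * g_shunt) * dt / c_node
            for i in range(2)
        ]
        for i in range(2):
            new = v[i] + dv[i]
            if t_cross[i] is None and new >= 0.5:
                # Linear interpolation inside the time step.
                frac = (0.5 - v[i]) / (new - v[i])
                t_cross[i] = (step + frac) * dt
            v[i] = new
    return t_cross


# Assumed values: 100 ohm drivers, 100 fF per node, second driver 20 ps late.
late = (0, 2000)  # delays in time steps of 0.01 ps
t_open = crossing_times(100.0, 100e-15, None, late)
t_shunt = crossing_times(100.0, 100e-15, 10.0, late)
skew_open = abs(t_open[0] - t_open[1])
skew_shunt = abs(t_shunt[0] - t_shunt[1])
print(f"skew without shunt: {skew_open * 1e12:.2f} ps")
print(f"skew with 10 ohm shunt: {skew_shunt * 1e12:.2f} ps")
```

Without the shunt the skew equals the driver offset. With the shunt the two nodes are pulled toward a common waveform, so the skew shrinks. The ratio of the two skews depends on the shunt and driver resistances, which is consistent with the slide's remark that the averaging factor is obtained by simulation.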
Clock Mesh Example (1): DEC Alpha 21264
Clock Mesh Example (2): IBM Power4. H-tree drives one domain clock mesh; 8x8 area buffers.
Clock Mesh Example (3): Intel Pentium 4. Tree drives three spines.
Our Contributions and Ongoing Efforts. Contributions: analytical skew expression using an R, C model; proposed generalized multi-level mesh network structure for skew reduction; optimal allocation of routing resources among meshes. Ongoing study: more accurate R, L, C delay model; signal propagation on a uniform mesh.
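As background for what an R, C delay model looks like, the sketch below computes the Elmore delay of an RC ladder and the skew between two wire branches of different length. Elmore delay is the standard first-order RC estimate; the analytical skew expression from the slides is not reproduced in this text, so this should not be read as the authors' formula. The function name and the segment values are assumptions.

```python
def elmore_delay(resistances, capacitances):
    """Elmore delay from the driver to the far end of an RC ladder.

    resistances[i] is the series resistance of segment i, and
    capacitances[i] is the grounded capacitance at the end of that segment.
    """
    delay = 0.0
    downstream_cap = sum(capacitances)
    for r, c in zip(resistances, capacitances):
        # Each resistor charges all of the capacitance located beyond it.
        delay += r * downstream_cap
        downstream_cap -= c
    return delay


# Two branches built from identical segments (assumed: 10 ohm, 20 fF each).
R_SEG, C_SEG = 10.0, 20e-15
short_branch = elmore_delay([R_SEG] * 4, [C_SEG] * 4)
long_branch = elmore_delay([R_SEG] * 5, [C_SEG] * 5)
skew = long_branch - short_branch
print(f"short: {short_branch * 1e12:.2f} ps, long: {long_branch * 1e12:.2f} ps, "
      f"skew: {skew * 1e12:.2f} ps")
```

For n identical segments the delay is r c n (n + 1) / 2, so the two branches come to 10 rc and 15 rc. The quadratic growth with length is why small length or parameter mismatches between branches turn into visible skew.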
Multi-level mesh structure
Skew on mesh: skew expression
Optimization: skew function; multi-level skew function
Clock Design Settings. Die size 1 cm by 1 cm; 100 nm copper technology. Ground-shielded differential signal wires for global clock distribution (figure labels: +, -, GND). Routing area is normalized to the area of a 16 by 16 mesh with minimal wire width.
Delay Surfaces
Robustness Against Supply Voltage Variations
Packaging: Y Architecture; Chip-Package Breakaway
Grids of X and Y Architectures (figure panels: X-Architecture, Y-Architecture)
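The difference between the grids can be made concrete by comparing the shortest routed length between two points. The sketch below assumes the usual definitions, which the slide text itself does not spell out: Manhattan routing uses wires at 0 and 90 degrees, the X-architecture adds 45 and 135 degrees, and the Y-architecture uses 0, 60 and 120 degrees. The function names are hypothetical and the Y-architecture formula is derived from those three directions rather than taken from the slides.

```python
import math


def manhattan_length(dx, dy):
    """Shortest route using horizontal and vertical wires only."""
    return abs(dx) + abs(dy)


def x_arch_length(dx, dy):
    """Shortest route with 0, 45, 90 and 135 degree wires (assumed X definition)."""
    lo, hi = sorted((abs(dx), abs(dy)))
    # Go diagonally until the smaller offset is used up, then straight.
    return hi + (math.sqrt(2) - 1) * lo


def y_arch_length(dx, dy):
    """Shortest route with 0, 60 and 120 degree wires (assumed Y definition)."""
    dx, dy = abs(dx), abs(dy)
    # Vertical progress is only possible on the 60/120 degree wires.
    slanted = 2 * dy / math.sqrt(3)
    return max(dx + dy / math.sqrt(3), slanted)


for dx, dy in [(1.0, 0.0), (1.0, 1.0), (0.5, math.sqrt(3) / 2), (0.0, 1.0)]:
    print(f"({dx:.3f}, {dy:.3f}): "
          f"M={manhattan_length(dx, dy):.3f} "
          f"X={x_arch_length(dx, dy):.3f} "
          f"Y={y_arch_length(dx, dy):.3f}")
```

A displacement along one of a grid's own directions costs exactly its Euclidean length on that grid. Purely vertical displacements are the worst case for the Y grid under these assumptions, since it has no 90 degree wires.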
Clock Tree on Square Mesh. N-level clock tree: path length 21% less than H-tree; total wire length 9% less than H-tree and 3% less than X-tree. No self-overlapping between parallel wire segments.
Chip to Package Breakaway Manhattan Architecture
Y Architecture
Chip-Package Breakaways: Comparison (row by row; indent; two sides)
Conclusion. Analysis: signal integrity. Synthesis: interconnect dominant. Packaging: performance. Goals: performance, cost. Resources: physical space. Constraints: yield, signal integrity.