Toshiba Standard Cell Architecture for High Frequency Operation Peter Hsu, Ph.D. Chief Architect Microprocessor Development Toshiba America Electronics.

Presentation transcript:

Toshiba Standard Cell Architecture for High Frequency Operation
Peter Hsu, Ph.D., Chief Architect, Microprocessor Development
Toshiba America Electronic Components, Inc.
Created 14 March 2001 at the University of Wisconsin in Madison

Layout Architecture for High Frequency Operation
Disclaimer
The ideas, data and conclusions presented here are solely those of the Author, and do not in any way represent Toshiba Corporation policy or strategy.

Introduction
High Frequency is Difficult!
– Many Issues: Signal Integrity, Power Dissipation, ...
– My Approach: Disciplined Methodology, Global Optimization
Outline
– Layout
– Circuits
– Analysis

Layout Strategy
Leverage Advanced Technologies
– Local Interconnect
– Flip-Chip Area Array I/O
CAD Tool Compatibility
– Parasitic Estimation, Extraction
Complex, High Frequency Designs
– Robust Power Grid
– Flexible Macro Embedding

Metal Usage
[Figure: metal layer diagram. Labels: VSS, VDD, Signal, Via, Contact, Clock (2x), Global Wires; dimension labels 300nm, 150nm, 900nm, 450nm, 300nm, 200nm, 600nm, 450nm]
Short Local Interconnect (M0): Tungsten, Aluminum or Copper
Top Metal: Flip-Chip Solder Pads
Dimensions are for nominal 0.12µm generation process

Standard Cell Layout
[Figure: cell row with cells U1 and U2 (pins U1.A, U1.Z, U2.A, U2.Z) between VDD and VSS rails, an unrelated wire, and a double height cell]
Figure labels:
– Cell Row Power Vias (1 every 6 Tracks)
– Crosspoint Power Vias
– Pins Must Stagger
– Minimum Cell 3 Tracks
– Minimum Power Rail 6 Tracks From Edge
– Minimum Pin Width 2 Tracks
– Local Interconnect
– Smallest Cell 13 Tracks
– Double Height Cell

Area Array I/O
[Figure: SRAM macro placed among area-array I/O pads at 225 µm pitch. The macro has two 256 Rows × 256 Columns cell arrays, a decoder and a sense amp. Dimension labels: 307µm, 56µm, 670µm, 538µm, 102µm, 640µm; cell labels: 1.2 µm, 2.5 µm² Cell. Pad legend: Core VDD, Core VSS, I/O VDD, I/O VSS, Signal]
5 I/O Macro (50K µm²)
Largest SRAM Macro without sacrificing I/O (16 KBytes)
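The dimensions on this slide can be cross-checked with a little arithmetic. The sketch below is only a sanity check: the 1.2 × 2.1 µm cell size comes from the SRAM Metal Usage slide, and the mapping of each figure label to a width or height (and the variable names) are assumptions, since the drawing itself is not available.

```python
# Sanity check of the Area Array I/O slide numbers.
# Assumption: the cell is 1.2 um wide and 2.1 um tall, and the two
# 256 x 256 arrays sit left and right of a 56 um decoder, above a
# 102 um sense-amp strip. The slide only gives the raw labels.

CELL_W_UM = 1.2
CELL_H_UM = 2.1
ROWS = 256
COLS = 256
ARRAYS = 2

DECODER_W_UM = 56.0
SENSE_AMP_H_UM = 102.0
IO_PITCH_UM = 225

array_w_um = COLS * CELL_W_UM                   # about 307 um
array_h_um = ROWS * CELL_H_UM                   # about 538 um
macro_w_um = ARRAYS * array_w_um + DECODER_W_UM
macro_h_um = array_h_um + SENSE_AMP_H_UM

capacity_bytes = ARRAYS * ROWS * COLS // 8      # one bit per cell
cell_area_um2 = CELL_W_UM * CELL_H_UM           # about 2.5 um^2
io_macro_area_um2 = IO_PITCH_UM ** 2            # one pitch x one pitch

print(f"array: {array_w_um:.0f} x {array_h_um:.0f} um")
print(f"macro: {macro_w_um:.0f} x {macro_h_um:.0f} um")
print(f"capacity: {capacity_bytes // 1024} KBytes")
print(f"cell area: {cell_area_um2:.2f} um^2")
print(f"I/O macro area: {io_macro_area_um2} um^2")
```

Under these assumptions the labels are mutually consistent: the macro comes out at roughly 670 × 640 µm and 16 KBytes. Both sides are just under three I/O pitches (675 µm), which may be what "without sacrificing I/O" refers to, although that reading is a guess.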

I/O Macro Cell
Self-Contained
– 5 Signals
– VDDQ, VSSQ
– ESD Protection
– Latch-Up Ring
SoC Flexibility
– Many I/O Types
– Different Voltages
Routing Porosity
– 50% Channels Free in Global Wiring Layers
– Short Output Trace on Top Metal (Electromigration)
[Figure: I/O macro uses M0+M1+M2; layers M3, M4, M5, M6 and Top Metal shown with Free Routing Channels]

SRAM Metal Usage
[Figure: 6-Transistor Cell (1.2 × 2.1 µm) with Bit Lines, VSS, VDD, Word Line #1 and Word Line #2 on M1, M2, M3; Global Wires (1× or 2× Pitch) carrying Signals, VDD, VSS]
SRAM Macro Uses M0+M1+M2
CAD Tool Inserts M3:M2 Power Vias

Word Line Shielding
Zigzag Minimizes Coupling from M3 Signals to M2 Word Lines when SRAM is Rotated
[Figure: Cell Array, Decoder and Sense Amp. with Bit Lines, M3 Global Wires (Signals, VDD, VSS) and Blocked Tracks]

Rationale
“Effective Area”
– Actual Footprint + Routing Disturbance
– Larger, More Porous Layout ⇒ Faster: Bigger Transistors, More Space around Bit Lines, Shielding
SoC
– Complex Microarchitecture
– Many Small SRAMs

Circuit Design
Building Blocks
– Latch Array: Malleable, Porous, Multi-Port SRAM
– Dynamic Wire-OR Gate: High Fan-in, Safe, CAD Compatible
Power Dissipation
– Double Edge Flipflop ⇒ 50% Clock Tree ⇒ ≈30% Peak Chip-Wide
– Interpolation Cells
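One way to read the power bullet is that halving the clock frequency halves clock-tree power, and that this is worth about 30% of peak chip power. The sketch below works through that reading with the standard dynamic power relation P = a·C·V²·f. The capacitance and supply voltage are made-up placeholder values, and the interpretation of "≈30%" as a chip-wide saving is an assumption, not something the slide states explicitly.

```python
# Hedged arithmetic for the "50% clock tree => ~30% chip-wide" bullet.
# Placeholder values: 1 nF of clock-tree capacitance, 1.2 V supply.

def dynamic_power(capacitance_f, vdd_v, freq_hz, activity=1.0):
    """Standard switching power estimate: P = a * C * V^2 * f."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz

full_rate = dynamic_power(1e-9, 1.2, 1.0e9)   # single-edge flipflops
half_rate = dynamic_power(1e-9, 1.2, 0.5e9)   # double-edge, clock at f/2

tree_saving = 1 - half_rate / full_rate       # fraction of tree power saved
chip_saving = 0.30                            # slide's "~30%" (assumed reading)

# If the chip-wide saving comes only from the clock tree, the tree must
# account for this share of peak chip power:
implied_clock_share = chip_saving / tree_saving

print(f"clock tree saving: {tree_saving:.0%}")
print(f"implied clock share of peak power: {implied_clock_share:.0%}")
```

Under that reading the clock tree would account for roughly 60% of peak power, which is plausible for an aggressively clocked design but is an inference, not a figure from the talk.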

Latch Array
[Figure: array of latches (pins D, G, Q, E), each a Latch + Tristate Driver, with read and write Decoders; CK-clocked flipflops on Read Address, Write Address, Write Data and Write Enable; a Write Pulse Generator; Read Data output; Test Mode input]
Combinatorial Read Path: May Buffer during Place&Route
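The behavior implied by the figure labels can be sketched as a small model: a registered write port that opens one row of latches during the write pulse, and a purely combinatorial read port. The class and method names are hypothetical, and test mode and timing are left out.

```python
# Minimal behavioral sketch of a latch array (names are illustrative).

class LatchArray:
    def __init__(self, words, width):
        self.mask = (1 << width) - 1
        self.rows = [0] * words          # one row of latches per word

    def clock_edge(self, write_enable, write_addr, write_data):
        """Model one clock cycle of the write port.

        The address, data and enable are assumed to be captured by
        flipflops; the write pulse then opens the latches of the one
        row selected by the write decoder.
        """
        if write_enable:
            self.rows[write_addr] = write_data & self.mask

    def read(self, read_addr):
        """Combinatorial read: the selected row's tristate drivers
        put its contents on the read data bus."""
        return self.rows[read_addr]


array = LatchArray(words=4, width=8)
array.clock_edge(write_enable=True, write_addr=2, write_data=0xA5)
array.clock_edge(write_enable=False, write_addr=2, write_data=0xFF)
print(hex(array.read(2)))
```

Because the read path has no clocked element, a place-and-route tool is free to insert buffers along it, which matches the "May Buffer during Place&Route" note.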

Dynamic Wire-OR Gate
Highest Leverage
– Dynamic vs. Static
Safe, CAD Compatible
– Limit Wire Length using Timing Driven Placement
– No Dynamic Inputs
[Figure: Driver Cells with Inputs D1 ... DN on a shared wire, and a clocked Receiver Cell (latch pins D, G, Q) with Keeper and Output. Labels: Sized for Max. Length; Max. Length by Max-Load, Max-Transition Spec.; Limit Max. N by Max-Fanout Spec.; Keeper Sized for Max-Fanout; Sized for 1]
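A rough logical model of a dynamic wire-OR is sketched below: the shared wire is precharged, any asserted driver discharges it during evaluation, and the receiver inverts the wire to produce the OR. The active-low wire polarity, the function name and the `max_fanin` value are assumptions for illustration; the slide only says that N is limited by the max-fanout specification.

```python
# Logical sketch of one precharge/evaluate cycle of a dynamic wire-OR.

def dynamic_wire_or(inputs, max_fanin=16):
    """Return the OR of the driver inputs for one evaluate phase.

    max_fanin stands in for the library's max-fanout limit on the
    number of driver cells; 16 is an arbitrary placeholder.
    """
    if len(inputs) > max_fanin:
        raise ValueError("too many drivers for the max-fanout limit")

    wire = 1                      # precharge phase: wire pulled high
    for d in inputs:              # evaluate phase
        if d:
            wire = 0              # any asserted driver discharges the wire
    # If no driver fires, the keeper holds the wire high.
    return 1 - wire               # receiver inverts the wire


print(dynamic_wire_or([0, 0, 0]))
print(dynamic_wire_or([0, 1, 0]))
```

The model ignores timing. In the real circuit the inputs must be stable static signals during evaluation, which is the point of the "No Dynamic Inputs" rule.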

Double-Edge Flipflop
Low Power
– Clock ½ Frequency
– Light Clock Load: 2 Large + 4 Small
Small, Fast §
– 15P + 15N Transistors
Safe, Flexible
– Fully Static
– Supports Scan
[Figure: flipflop schematic with D, Ck and Q, highlighting Switching Nodes with Constant “1” Data]
§ B. Nikolic, et al., “Sense Amplifier-Based Flip-Flop,” ISSCC 1999.
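The "Clock ½ Frequency" point follows from the flipflop capturing data on both clock edges. The behavioral sketch below shows this at the logic level only. It says nothing about the sense-amplifier circuit itself, and the class name is hypothetical.

```python
# Behavioral sketch: a flipflop that samples D on every clock transition.

class DoubleEdgeFlipflop:
    def __init__(self):
        self.q = 0
        self._prev_ck = 0

    def step(self, ck, d):
        """Apply the current clock level and data; return Q."""
        if ck != self._prev_ck:   # rising or falling edge
            self.q = d            # capture on either edge
        self._prev_ck = ck
        return self.q


ff = DoubleEdgeFlipflop()
data = [1, 0, 1, 1]
ck = 0
outputs = []
for d in data:
    ck ^= 1                       # one clock transition per data item
    outputs.append(ff.step(ck, d))

print(outputs)
print("clock transitions used:", len(data))
```

A single-edge flipflop would need two clock transitions per data item (one active edge plus the return edge), so the same data rate needs a clock at twice the frequency.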

Interpolation Cells
Same Footprint, Shorter Transistors
For Post Route In-Place Optimization
[Figure labels: 1X Cell, 2X Cell, 4X Cell; 2/3 Power, 5/6 Power, Full Power]

Analysis
Signal Integrity
– Parasitics “Accurate By Construction”: Uniform Metal Density, Majority Coupling to Power Rails (Shielding)
Speed Yield
– Balanced with Resources: Area, Power, Design Time
– Goal: Adequate Confidence

Uniform Metal Density
Algorithmically Generated Filled Metal
Uniform Density on all Layers (except Local Interconnect)
[Figure: Post Route Metal Usage around cells U1 and U2 (pins A, Z) between VDD and VSS rails]

Advantages
Design
– Accurate Estimation: Capacitance has Low Variance
– Known Coupling: ≈50% to Adjacent Power Line
– Quick Feedback: Interconnect-Only Extraction is Accurate
Manufacturing
– Uniform Etch Resist Loading

Asymmetric Rise-Fall Delays
[Figure: clock waveforms through cells with Same Size P Transistors and Same Size N Transistors. Labels: Slow, Shrink; Slow, Elongates; Delay; Duty Cycle; Same]
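The effect of asymmetric rise and fall delays on a clock can be illustrated with a simple model. In a non-inverting buffer, the rising edge is delayed by the rise delay and the falling edge by the fall delay, so the high pulse changes by (fall delay minus rise delay) at each stage. The function name and the picosecond values below are illustrative assumptions, not data from the talk.

```python
# Sketch: duty-cycle distortion through a chain of non-inverting buffers.

def duty_cycle(period_ps, high_ps, stages):
    """Return the output duty cycle after the given buffer stages.

    stages is a list of (rise_delay_ps, fall_delay_ps) pairs. A slower
    rise shrinks the high pulse; a slower fall elongates it.
    """
    for rise_delay_ps, fall_delay_ps in stages:
        high_ps += fall_delay_ps - rise_delay_ps
    return high_ps / period_ps


symmetric = duty_cycle(1000, 500, [(50, 50)] * 5)
slow_rise = duty_cycle(1000, 500, [(60, 50)] * 5)
slow_fall = duty_cycle(1000, 500, [(50, 60)] * 5)

print(symmetric, slow_rise, slow_fall)
```

This is why duty-cycle-sensitive paths such as gated clocks and write pulse buffering are good candidates for cells with matched rise and fall delays.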

Pros and Cons
Advantages
– More Compact Cells, Faster Circuits
Disadvantages
– Need Careful Analysis, Greater Margin
Strategy:
– Main Library Asymmetric, “No Wasted Space”
– Symmetric Subset: Gated Clocks, Write Pulse Buffering, ...

Speed Yield Management
[Figure: process variation plot with an N transistor axis (Slow N to Fast N) and a P transistor axis (Slow P to Fast P). Labels: Maximum Process Variation; “Four Corner” Analysis; Mature Process Variation; Process Center; Target: Design and Characterize Library Here; Setup Time Failures; Hold Time Failures; Correct Operation; Possibly Impossible to Meet Performance Goal, or Needlessly High Effort]
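A "four corner" check can be sketched as evaluating the usual setup and hold inequalities at each combination of slow and fast N and P transistors. Everything numeric below is a made-up placeholder, and averaging the N and P scale factors into one delay multiplier is a crude assumption chosen only to keep the example short. Setup and hold requirements are also held constant across corners.

```python
# Sketch of setup/hold checks over four process corners (times in ns).

CORNERS = {
    "slow N / slow P": (1.25, 1.25),
    "slow N / fast P": (1.25, 0.80),
    "fast N / slow P": (0.80, 1.25),
    "fast N / fast P": (0.80, 0.80),
}

def check_corner(n_scale, p_scale, period, clk_to_q,
                 logic_max, logic_min, setup, hold):
    """Return (setup_ok, hold_ok) for one register-to-register path."""
    scale = (n_scale + p_scale) / 2          # crude combined delay factor
    setup_ok = scale * (clk_to_q + logic_max) + setup <= period
    hold_ok = scale * (clk_to_q + logic_min) >= hold
    return setup_ok, hold_ok


path = dict(period=1.0, clk_to_q=0.1, logic_max=0.7,
            logic_min=0.02, setup=0.05, hold=0.1)

results = {name: check_corner(n, p, **path)
           for name, (n, p) in CORNERS.items()}

for name, (setup_ok, hold_ok) in results.items():
    print(f"{name}: setup {'ok' if setup_ok else 'FAIL'}, "
          f"hold {'ok' if hold_ok else 'FAIL'}")
```

With these placeholder numbers the slow corner fails setup and the fast corner fails hold, which mirrors the failure regions labeled in the figure. Designing for the full maximum-variation box rather than the mature-process region is what the slide flags as possibly impossible or needlessly costly.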

Conclusions
“Precision Physical Design”
– Global Power Grid, Macro Routing Porosity
– Methodical Signal Integrity, Parasitic Extraction, Timing Uncertainties (Coupling)
– Confident Correctness and Speed