Cleared for Open Publication July 30, 2004 04-S-2144 P148/MAPLD 2004 Rea MAPLD 148: "Is Scaling the Correct Approach for Radiation Hardened Conversions of Deep Submicron Microprocessors?"

Presentation transcript:

1 MAPLD 148: "Is Scaling the Correct Approach for Radiation Hardened Conversions of Deep Submicron Microprocessors?"
D. Rea, D. Bayles, A. Kazemzadeh, F. Thoma, and N. Haddad
Cleared for Open Publication July 30, 2004 04-S-2144 P148/MAPLD 2004

2 Introduction
Increasing demands for highly reliable, radiation hardened processing power on satellites continue to push the capabilities of technology.
Opportunity: Provide faster and lower power devices for satellite applications through the use of advanced technologies
  - Migrate existing designs to new technologies
  - Develop new designs in new technologies
Migration Challenge: Affordably maximize benefits of new technologies
Situation: Migrate a 0.25 µm CMOS version of the RAD750™ to both a 0.18 µm CMOS process and a 0.15 µm CMOS process and increase performance by ~33% at 0.18 µm and ~50% at 0.15 µm
  - All technologies are bulk CMOS
  - Transistor behavior and back end metallurgy are very compatible
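
To put the stated goals in context, the sketch below compares them with a first-order estimate of what an ideal shrink could deliver. The rule that delay falls in direct proportion to feature size is a textbook simplification assumed here, not a claim from the slides; the function names are illustrative.

```python
# First-order comparison of ideal-shrink speedup against the stated goals.
# ASSUMPTION (not from the slides): gate delay scales linearly with feature size.

BASE_NODE_UM = 0.25
# Target node (um) -> performance gain goal taken from the slide (as a fraction).
TARGETS = {0.18: 0.33, 0.15: 0.50}


def ideal_speedup(base_um: float, new_um: float) -> float:
    """Frequency gain if delay shrank in direct proportion to feature size."""
    return base_um / new_um - 1.0


def compare(base_um: float = BASE_NODE_UM) -> dict:
    """Return {node: (ideal gain, goal)} with both values as fractions."""
    return {node: (ideal_speedup(base_um, node), goal)
            for node, goal in TARGETS.items()}


if __name__ == "__main__":
    for node, (ideal, goal) in compare().items():
        print(f"{node} um: ideal {ideal:.0%}, goal {goal:.0%}")
```

Under this idealized rule the goals sit below the theoretical ceiling, so the question the rest of the deck addresses is how much of that ceiling a real layout shrink actually recovers.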

3 Title III Radiation Hardened Microprocessor for Space Program Goals
Challenge: Increase microprocessor performance at each technology node without a degradation in radiation performance while maintaining affordability.
Prototypes 4/06
Flight parts 7/06

4 Circuit Families in the RAD750™
A variety of circuit families were utilized in the RAD750 to provide density and performance. Modifications were made to all circuit types to harden the design.
Custom Blocks
  - Harden RAM cells, sense amps, and decoders
  - Harden latches and clocks
  - Harden PLL and add temperature compensation
  - Replace dynamic logic with static equivalents
  - Design circuits to minimize injected pulses
  - Replace low Vt devices
Standard Cells (RLMs) (Control Logic)
  - Harden latches and clock splitters
  - Design circuits to minimize injected pulses
  - Replace low Vt devices
Complex Cells (OTS) (Data Flow)
  - Harden latches and clock splitters
  - Replace dynamic logic with static equivalents
  - Design circuits to minimize injected pulses
  - Replace low Vt devices

5 Scaling Options
The objective is to pick the highest performance at lowest cost migration option.
Gate shrink only (one dimension, 1D)
  - Pro: drive current increases from larger W/L
  - Pro: simple to implement
  - Pro: minimal impact to floorplan
  - Con: no decrease in die size or wiring parasitics
  - Con: uneven performance improvement
Two dimensional shrink (2D)
  - Pro: die size and parasitics decrease
  - Con: uneven performance improvement
  - Con: greater perturbation to routing
Hybrid approach (combination of 1D, 2D shrinks and circuit optimization)
  - Pro: achieve balanced improvement
  - Con: increase in effort to implement (circuit level and full chip)
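
The geometric difference between the first two options can be sketched as follows. This is a minimal illustration of the geometry only, assuming a single shrink factor `s`; the `Transistor` class and function names are hypothetical, and no electrical effects (voltage, mobility, parasitics) are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transistor:
    """Drawn gate dimensions in micrometers (illustrative type, not from the slides)."""
    width_um: float
    length_um: float

    @property
    def w_over_l(self) -> float:
        return self.width_um / self.length_um


def gate_shrink_1d(t: Transistor, s: float) -> Transistor:
    """1D option: shrink only the gate length; width and footprint are unchanged."""
    return Transistor(t.width_um, t.length_um * s)


def shrink_2d(t: Transistor, s: float) -> Transistor:
    """2D option: shrink every drawn dimension by the same factor."""
    return Transistor(t.width_um * s, t.length_um * s)


def area_ratio(kind: str, s: float) -> float:
    """Footprint after the shrink relative to the original ("1D" or "2D")."""
    return 1.0 if kind == "1D" else s * s


if __name__ == "__main__":
    s = 0.18 / 0.25                      # shrink factor for the 0.25 -> 0.18 um migration
    t = Transistor(width_um=1.0, length_um=0.25)   # hypothetical device
    print("1D W/L:", gate_shrink_1d(t, s).w_over_l, "area x", area_ratio("1D", s))
    print("2D W/L:", shrink_2d(t, s).w_over_l, "area x", area_ratio("2D", s))
```

The 1D case raises W/L while leaving the footprint, and therefore the wiring, untouched; the 2D case keeps W/L but reduces area by the square of the shrink factor, which matches the pros and cons listed above.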

6 Scaling Options - Standard Cell Study (RLMs)
Largest average improvement observed with 1 dimensional scaling plus compaction. However, minimal cell size reduction yields little parasitic reduction, so advantage seen with no load is lost when loads are taken into consideration.
1D - 1 dimension scaling, 2D - 2 dimension scaling, C - scaling with compaction
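
Why an unloaded advantage shrinks under load can be seen with a simple lumped RC model. The sketch below is an assumption-laden illustration, not the study's method: delay is taken as drive resistance times total capacitance, all values are normalized, and the scale factors are hypothetical.

```python
def rc_delay(r_drive: float, c_self: float, c_load: float) -> float:
    """Lumped RC delay estimate: drive resistance times total capacitance."""
    return r_drive * (c_self + c_load)


def delay_ratio(c_load: float, r_scale: float, c_self_scale: float,
                c_load_scale: float, r_drive: float = 1.0,
                c_self: float = 1.0) -> float:
    """New delay divided by old delay (values below 1.0 mean faster)."""
    old = rc_delay(r_drive, c_self, c_load)
    new = rc_delay(r_drive * r_scale, c_self * c_self_scale,
                   c_load * c_load_scale)
    return new / old


if __name__ == "__main__":
    # HYPOTHETICAL 1D-like case: the cell itself improves (R and self-capacitance
    # scale by 0.72) but the external wiring load does not shrink at all.
    for load in (0.0, 1.0, 4.0):
        ratio = delay_ratio(load, r_scale=0.72, c_self_scale=0.72,
                            c_load_scale=1.0)
        print(f"load = {load} x C_self -> delay ratio {ratio:.3f}")
```

In this toy model the delay ratio moves from about 0.52 with no load toward 0.68 with a load four times the self-capacitance, because the unscaled load dominates the total capacitance.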

7 Scaling Options - Standard Cell Study (RLMs)
Largest average improvement observed with 2 dimensional scaling plus compaction. However, performance improvement is not uniform across all cells.
1D - 1 dimension scaling, 2D - 2 dimension scaling, C - scaling with compaction

8 Scaling Options - Standard Cell Study (RLMs)
As expected, reduced power supply voltage accounts for majority of power reduction. Overall chip power will increase when frequency of operation increases.
1D - 1 dimension scaling, 2D - 2 dimension scaling, C - scaling with compaction
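
Both statements follow from the standard dynamic power relation P = a·C·V²·f. The sketch below applies it with hypothetical scale factors; the 0.72 supply scaling and the 1.33 frequency factor are assumptions chosen for illustration, not values reported in the study.

```python
def dynamic_power_ratio(c_scale: float, v_scale: float, f_scale: float) -> float:
    """New / old dynamic power under P = a * C * V^2 * f, activity factor unchanged."""
    return c_scale * v_scale ** 2 * f_scale


if __name__ == "__main__":
    # ASSUMPTION: supply voltage scales by 0.72 and switched capacitance is unchanged.
    same_freq = dynamic_power_ratio(c_scale=1.0, v_scale=0.72, f_scale=1.0)
    # ASSUMPTION: the part is then clocked 33% faster.
    faster = dynamic_power_ratio(c_scale=1.0, v_scale=0.72, f_scale=1.33)
    print(f"same frequency: {same_freq:.3f} of original power")
    print(f"33% faster:     {faster:.3f} of original power")
```

Because voltage enters as a square, it dominates the reduction at a fixed clock rate, while any frequency increase gives part of that saving back in direct proportion.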

9 Scaling Options - Complex Cells (Data Flow) Study
Using two dimensional scaling, the non-uniformity in performance improvement shows that the average is misleading. Should the “slower” cells end up in the critical path, overall speed could go down.
  - Use “as is” until first full chip timing run
  - Optimize or synthesize as necessary to improve performance
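
The point about misleading averages can be made concrete with a toy timing model. Everything in the sketch below is hypothetical: the cell names, the per-cell delay ratios, and the two paths are invented for illustration, and each cell is given a normalized base delay of 1.0.

```python
from statistics import mean

# HYPOTHETICAL per-cell delay ratios after a 2D shrink (new delay / old delay).
SCALED_DELAY_RATIO = {"adder": 0.60, "mux": 0.65, "shifter": 0.70, "comparator": 1.05}

# HYPOTHETICAL timing paths, each a sequence of cells with base delay 1.0.
PATHS = {
    "path_a": ["adder", "adder", "mux"],
    "path_b": ["comparator", "comparator", "comparator"],
}


def path_delay(cells, ratio=None):
    """Sum of cell delays; with ratio=None every cell keeps its base delay of 1.0."""
    return sum(1.0 if ratio is None else ratio[c] for c in cells)


def critical_delay(paths, ratio=None):
    """Delay of the slowest path, which sets the clock period."""
    return max(path_delay(cells, ratio) for cells in paths.values())


if __name__ == "__main__":
    print("mean cell delay ratio:", mean(SCALED_DELAY_RATIO.values()))
    print("critical path before: ", critical_delay(PATHS))
    print("critical path after:  ", critical_delay(PATHS, SCALED_DELAY_RATIO))
```

Here the average cell looks roughly 25% faster, yet the critical path gets longer because it is built from the one cell type that slowed down, which is why the slide defers judgment until the first full chip timing run.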

10 Scaling Options - Custom Macros
Custom macros are the heart of the processor, representing over 2/3 the total transistor count and driving the critical performance paths.

11 Scaling Options - Custom Macros
The MMU/TAG/CACHE paths on both the instruction and data sides comprise the processor critical path.

12 Scaling Options - Custom Macro Study
As with the data flow complex cell macros (see p. 9), scaling of the custom macros produced uneven results.
1D scaling was chosen for the custom macros for the following reasons:
  - Critical node spacing in the memory arrays had to be maintained
  - Because of the amount of custom layout in these macros, simple 2D scaling resulted in a very large number of DRC errors that would have required considerable manual intervention to correct
Chart: 1D Scaling Results
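
The slides do not say what caused the DRC errors. One plausible mechanism, assumed here purely for illustration, is that a uniform shrink followed by snapping to the manufacturing grid leaves gaps smaller than the target process allows, since minimum-spacing rules need not shrink by the same factor as the layout. The sketch below is a one-dimensional toy, not a real DRC tool, and all dimensions and rules are hypothetical.

```python
def scale_and_snap(edges_um, s, grid_um):
    """Scale edge coordinates by s and snap them to the grid (returns grid units)."""
    return [round(e * s / grid_um) for e in edges_um]


def spacing_violations(shapes, min_space):
    """Return gaps between neighboring (left, right) shapes that are below the rule.

    Shapes must be sorted by their left edge; all values are in grid units.
    """
    gaps = [nxt[0] - cur[1] for cur, nxt in zip(shapes, shapes[1:])]
    return [g for g in gaps if g < min_space]


if __name__ == "__main__":
    # HYPOTHETICAL source layout: two shapes 0.30 um apart on a 0.01 um grid.
    source_edges_um = [0.00, 0.30, 0.60, 0.90]
    snapped = scale_and_snap(source_edges_um, s=0.72, grid_um=0.01)
    shapes = [(snapped[0], snapped[1]), (snapped[2], snapped[3])]
    # HYPOTHETICAL target rule: 0.24 um minimum spacing = 24 grid units.
    print("shapes (grid units):", shapes)
    print("violating gaps:", spacing_violations(shapes, min_space=24))
```

In a dense custom macro every such gap would need a manual fix, which is consistent with the slide's remark that 2D scaling would have required considerable manual intervention.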

13 Scaling Solution for the RAD750™
Custom Blocks
  - 1D Scale as baseline (minimize cost)
  - Optimize circuit design (new topologies, layout structure) as necessary
Standard Cells
  - Resize transistors
  - Automatically generate layouts (~2D scaling)
  - Resynthesize at chip level
Complex Cells
  - 2D scale where appropriate
  - Optimize when possible
  - Synthesize from standard cells where economically advantageous and performance isn’t required
Bottom line - scaling by itself is not sufficient to meet performance goals. Scaling combined with other techniques supports the performance goals at a reasonable cost.

14 Predicted Performance of RAD750™
By combining scaling with other design techniques, program goals can be met at an affordable price.

15 Summary
Simple scaling is not sufficient to meet performance objectives at advanced technology nodes
  - Improvements are not uniform
  - Improvements from scaling would not meet objectives even if the average improvement were uniform
To maintain affordability, a hybrid approach consisting of several scaling techniques and circuit optimization was selected to maximize the advantages of the advanced technologies
Automation is used where possible to support changes in technology groundrules and support conversion to future technologies
  - Additional automation in the custom macro area is required to resolve issues with simple scaling
Program goals can be met with the hybrid approach