Progress Update: Energy-Performance Characterization of CMOS/MTJ Hybrid Circuits. Fengbo Ren, 05/28/2010.



Modern MTJ
• Bias voltage/current-controlled variable-resistance device
  – Low resistance: $R_P$
  – High resistance: $R_{AP}$
  – $\mathrm{TMR} = (R_{AP} - R_P)/R_P$
• Spin-transfer-torque (STT) switching
  – Switching is controlled by the direction of the writing current.
  – The writing current density has to exceed the switching thresholds.
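The TMR definition above can be checked with a few lines of Python. This is a minimal sketch: the function names are hypothetical, and the resistance values (700 Ω and 1400 Ω) are the ones listed later in the simulation setup.

```python
def tmr(r_p: float, r_ap: float) -> float:
    """Tunnel magnetoresistance ratio, (R_AP - R_P) / R_P."""
    if r_p <= 0:
        raise ValueError("R_P must be positive")
    return (r_ap - r_p) / r_p


def r_ap_from_tmr(r_p: float, tmr_ratio: float) -> float:
    """Invert the definition: R_AP = R_P * (1 + TMR)."""
    return r_p * (1.0 + tmr_ratio)


if __name__ == "__main__":
    # Values taken from the simulation setup slide (ohms).
    print(f"TMR = {tmr(700.0, 1400.0):.0%}")
```

With these values the ratio comes out to 100%, matching the TMR quoted in the simulation setup.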

Motivations for Hybrid Logic
• Significant application in MRAM design.
• Why logic?
  – CMOS-compatible
    ● Switching current: 200 µA to 2 mA
    ● 90nm transistor: 1 mA/µm gate width
  – Non-volatility, high stability
    ● Introducing the MTJ's non-volatility into CMOS may suppress leakage in active mode and reduce leakage in idle mode to a minimum.
  – 3D stacking
    ● Replacing CMOS with MTJs may increase density.

Questions?
• What architecture can best utilize the MTJ's non-volatility to improve energy efficiency?
• Can an MTJ/CMOS hybrid circuit have a better energy-delay trade-off than a CMOS circuit?
• How much leakage power can be saved by introducing MTJs to CMOS?
• Is there any overhead? How much is the switching power of an MTJ?
• What will be the trend of MTJ/CMOS hybrid circuits with technology scaling?

Logic-in-Memory MTJ (LIM-MTJ) Logic Style
• LIM-MTJ
  – Uses differential MTJs in Dynamic Current-mode Logic (DyCML)
    ● Outputs are evaluated based on the resistance difference of the pull-down networks through cross-coupled PMOS transistors.
    ● Claimed to have lower dynamic and static power than SCMOS.
Figure: Schematic of LIM-MTJ 1-bit full adder.

Energy-Performance Characterization
• Versus SCMOS and DyCML
  – LIM-MTJ has no energy-performance advantage over the equivalent CMOS implementation.
Figures: Schematic of SCMOS 1-bit full adder. Schematic of DyCML 1-bit full adder.

MTJ Switching Energy Analysis
• Switching energy
  – $I_W = J_C \cdot A$
    ● $J_C$ is the critical current density.
    ● $A$ is the junction area: $A = \pi \cdot W \cdot L = K \cdot L^2$, where $L$ is the junction size.
  – $R = \delta / A$
    ● $\delta$ is the resistance-area product, an intrinsic MTJ parameter: $\delta = 20\ \Omega \cdot \mu\mathrm{m}^2$.
  – $t$ is time.

MTJ Switching Energy Analysis
• $J_C$ is a function of the current pulse width.
  – Switching time is a function of current density.
    ● $\Delta$ is the thermal stability factor ($\Delta \ge 40$).
    ● $t_0$ is the intrinsic switching time: $t_0 = 1$ ns.
    ● $J_{C0}$ is the intrinsic critical current density: $J_{C0} = J_C$ at $t = t_0$.
  – Modern MTJs have been shown to have $J_{C0} = 2$ to $7\ \mathrm{MA/cm^2}$.

MTJ Switching Energy Analysis
• Switching energy
  – A function of switching time ($t$) given $J_{C0}$, $\delta$, $L$, $\Delta$
  – Reference MTJ
    ● $J_{C0} = 5\ \mathrm{MA/cm^2}$, $\delta = 20\ \Omega \cdot \mu\mathrm{m}^2$, $L = 135$ nm ($W = 65$ nm)
    ● $R_P = 725\ \Omega$, $I_C$ at $t = 1$ ns
• Switching energy > 1 pJ
  – CMOS/MTJ hybrid logic circuits that require frequent MTJ switching are hardly energy efficient.
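The reference-MTJ numbers can be reproduced with a short calculation. This is a sketch under stated assumptions, not the author's script: the energy expression $E = I_W^2 \cdot R \cdot t$ (resistive dissipation at constant resistance) is assumed here because the transcript defines $I_W$, $R$, and $t$ but does not preserve the energy formula itself; the function names are hypothetical, and the write current is taken as $J_{C0} \cdot A$ at $t = t_0 = 1$ ns.

```python
import math


def junction_area_um2(w_nm: float, l_nm: float) -> float:
    """Junction area A = pi * W * L, converted from nm^2 to um^2."""
    return math.pi * w_nm * l_nm * 1e-6


def resistance_ohm(delta_ohm_um2: float, area_um2: float) -> float:
    """R = delta / A, with delta the resistance-area product."""
    return delta_ohm_um2 / area_um2


def write_current_a(j_ma_per_cm2: float, area_um2: float) -> float:
    """I_W = J_C * A. 1 MA/cm^2 equals 1e-2 A/um^2."""
    return j_ma_per_cm2 * 1e-2 * area_um2


def switching_energy_j(i_a: float, r_ohm: float, t_s: float) -> float:
    """ASSUMED energy model: E = I^2 * R * t (not stated in the transcript)."""
    return i_a * i_a * r_ohm * t_s


# Reference MTJ from the slide: J_C0 = 5 MA/cm^2, delta = 20 ohm*um^2,
# L = 135 nm, W = 65 nm, evaluated at t = 1 ns.
area = junction_area_um2(65.0, 135.0)
r_p = resistance_ohm(20.0, area)
i_c = write_current_a(5.0, area)
energy = switching_energy_j(i_c, r_p, 1e-9)

if __name__ == "__main__":
    print(f"R_P = {r_p:.0f} ohm, I_C = {i_c * 1e3:.2f} mA, E = {energy * 1e12:.2f} pJ")
```

Under these assumptions the resistance lands near the 725 Ω quoted on the slide, and the energy comes out above 1 pJ, consistent with the slide's conclusion.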

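• Switching energy with scaling
  – Scaling down $\delta$, $L$, and $J_{C0}$ brings the switching energy into the fJ range
  – $\delta \le 5\ \Omega \cdot \mu\mathrm{m}^2$, $J_{C0} \le 0.6\ \mathrm{MA/cm^2}$, and $L \le 33$ nm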
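A rough way to see the effect of scaling is to rewrite the energy in terms of the three scaled parameters. This sketch assumes $E = I_W^2 \cdot R \cdot t = J_C^2 \cdot \delta \cdot A \cdot t$ with $A = K \cdot L^2$, a pulse of $t = 1$ ns, and an aspect factor $K$ held at the reference value $\pi \cdot 65/135$; none of these choices is confirmed by the transcript, and the function name is hypothetical.

```python
import math

# ASSUMPTION: aspect factor K = pi * W / L, held at the reference MTJ's
# value (W = 65 nm, L = 135 nm) while L is scaled.
K_REF = math.pi * 65.0 / 135.0


def scaled_energy_j(j_ma_per_cm2: float, delta_ohm_um2: float,
                    l_nm: float, t_s: float = 1e-9) -> float:
    """ASSUMED model: E = J^2 * delta * A * t, with A = K * L^2."""
    area_um2 = K_REF * l_nm * l_nm * 1e-6   # nm^2 -> um^2
    j_a_per_um2 = j_ma_per_cm2 * 1e-2       # MA/cm^2 -> A/um^2
    return j_a_per_um2 ** 2 * delta_ohm_um2 * area_um2 * t_s


e_reference = scaled_energy_j(5.0, 20.0, 135.0)   # reference MTJ
e_scaled = scaled_energy_j(0.6, 5.0, 33.0)        # scaled corner from the slide

if __name__ == "__main__":
    print(f"reference: {e_reference * 1e12:.2f} pJ")
    print(f"scaled:    {e_scaled * 1e15:.2f} fJ")
```

Because the energy goes as $J_C^2 \cdot \delta \cdot L^2$, reducing all three parameters compounds quickly; with these inputs the scaled estimate drops from the pJ range to below 1 fJ.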

LUT-based Logic
• Stores the truth table in memory
• Reads out the logic value based on input selection.
  – Reconfigurable
  – Can implement all types of logic, e.g. FPGA
• Replace the storage cell with an MTJ
  – No MTJ switching during logic operation; the MTJs only need to be configured once.
  – Non-volatile, minimal standby power.
  – Instant boot-up.
Figure: Example of 3-input LUT
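The LUT idea can be illustrated in software: configure the table once, then every logic operation is a read. The sketch below is a behavioral model only, with hypothetical function names; it builds two 3-input LUTs (sum and carry) that together implement a 1-bit full adder, which is the configuration used in the later comparison.

```python
from itertools import product


def build_lut(func, n_inputs: int = 3) -> list:
    """Configure once: store func's truth table, first input as the MSB."""
    return [func(*bits) for bits in product((0, 1), repeat=n_inputs)]


def read_lut(lut: list, *inputs: int) -> int:
    """Logic operation: select one stored bit using the inputs as an address."""
    index = 0
    for bit in inputs:
        index = (index << 1) | bit
    return lut[index]


# Two 3-input LUTs implement a 1-bit full adder (8 + 8 = 16 stored bits).
SUM_LUT = build_lut(lambda a, b, cin: a ^ b ^ cin)
CARRY_LUT = build_lut(lambda a, b, cin: (a & b) | (cin & (a ^ b)))


def full_adder(a: int, b: int, cin: int) -> tuple:
    """Return (sum, carry) using only LUT reads; the tables are never rewritten."""
    return read_lut(SUM_LUT, a, b, cin), read_lut(CARRY_LUT, a, b, cin)


if __name__ == "__main__":
    for a, b, cin in product((0, 1), repeat=3):
        print(a, b, cin, "->", full_adder(a, b, cin))
```

The 16 stored bits correspond to the 16 storage cells (EDFFs or MTJs) counted in the full-adder implementations.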

MTJ Reading Circuit
• Conventional current-mirror sense-amplifier-based reading circuit (SA)
  – Slow (2 stages)
  – Power hungry (DC current)
Figure labels: ∆V, VIP, VIN

MTJ Reading Circuit
• X-coupled (cross-coupled) inverter-based reading circuit (XSA)
  – Fast
    ● ∆V is generated and amplified at the same time
  – Power efficient
    ● No DC current, only charging and discharging of capacitance
Figure annotations: ∆V at evaluation phase; 1 MTJ and 1 $R_{ref}$ accessed per read; amplified by X-coupled inverter

Energy Performance Comparison

Instant Power

1-Bit Full Adder (CMOS_LUT)
• Transistor count
  – 16x EDFF
  – 4x MUX4
  – 2x MUX2
  – 672 transistors

1-Bit Full Adder (MTJ_LUT1)
• Transistor count
  – 16x READ1XMTJ
  – 4x MUX4
  – 2x MUX2
  – 2x WRTCKT
  – 448 transistors
  – 33% reduction
  – 16 MTJs

READ1XMTJ
• 15T + 1 MTJ
• Needs a writing circuit

1-Bit Full Adder (MTJ_LUT2)
• Transistor count
  – 2x READ8XMTJ
  – 1x 9-WORD DECODER
  – 2x MUX2
  – 1x INV
  – 1x WRTCKT
  – 174 transistors
  – 76% reduction
  – 16 MTJs
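The reduction percentages follow from the transistor totals on the three full-adder slides. The snippet below is a simple arithmetic check with a hypothetical helper name; it uses only the totals stated in the slides (672, 448, and 174).

```python
def reduction(baseline: int, new: int) -> float:
    """Fractional reduction in transistor count relative to the baseline."""
    return (baseline - new) / baseline


# Transistor totals as stated on the slides.
CMOS_LUT = 672
MTJ_LUT1 = 448
MTJ_LUT2 = 174

lut1_saving = reduction(CMOS_LUT, MTJ_LUT1)
lut2_saving = reduction(CMOS_LUT, MTJ_LUT2)

if __name__ == "__main__":
    print(f"MTJ_LUT1: {lut1_saving:.1%} fewer transistors")
    print(f"MTJ_LUT2: {lut2_saving:.1%} fewer transistors")
```

From the stated totals, MTJ_LUT1 works out to the quoted 33%, while MTJ_LUT2 works out to about 74%, slightly below the 76% quoted on the slide.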

READ8XMTJ
• MTJs share the reading circuit
• 1 MTJ + 1 $R_{ref}$ are accessed per read
• 1 MTJ is accessed per write
• 23T + 8 MTJs

Simulation Setup
• 3 LUT architectures are compared
  – CMOS-LUT
  – MTJ-LUT1: MTJ reading circuit + MUX
  – MTJ-LUT2: Shared MTJ reading circuit + decoder
• Configured to implement a 1-bit full adder
  – 2 3-input LUTs
• ASU predictive technology model (PTM)
  – 90nm, 65nm (bulk)
  – 45nm, 32nm (SOI)
• MTJ characteristics
  – $R_P = 700\ \Omega$, $R_{AP} = 1400\ \Omega$, TMR = 100%, $I_{C,AP \to P} = 223$ µA, $I_{C,P \to AP} = 500$ µA
  – Verilog-A MTJ model from Richard.

Configuration Power
• CMOS-LUT
  – 1 GHz
• MTJ-LUT
  – 250 MHz
  – 750 µA writing current
  – About 3 ns writing time per MTJ
• MTJ-based LUTs have about 10x higher configuration power
  – Switching energy of 16 MTJs
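A back-of-the-envelope estimate shows where the configuration cost comes from. This sketch assumes the write energy is the resistive dissipation $I^2 \cdot R \cdot t$ in the MTJ alone, with the resistance bounded by the $R_P$ and $R_{AP}$ values from the simulation setup (700 Ω and 1400 Ω); driver and wiring losses are ignored, and the function name is hypothetical. It is not the simulated result from the slides.

```python
def write_energy_j(i_a: float, r_ohm: float, t_s: float) -> float:
    """ASSUMED model: energy dissipated in the MTJ during one write, I^2 * R * t."""
    return i_a * i_a * r_ohm * t_s


WRITE_CURRENT_A = 750e-6   # writing current from the slide
WRITE_TIME_S = 3e-9        # writing time per MTJ from the slide
N_MTJ = 16                 # MTJs in the 1-bit full adder LUTs

# Bound the resistance by the two MTJ states (ohms).
e_low = write_energy_j(WRITE_CURRENT_A, 700.0, WRITE_TIME_S)
e_high = write_energy_j(WRITE_CURRENT_A, 1400.0, WRITE_TIME_S)

if __name__ == "__main__":
    print(f"per MTJ: {e_low * 1e12:.2f} to {e_high * 1e12:.2f} pJ")
    print(f"16 MTJs: {N_MTJ * e_low * 1e12:.1f} to {N_MTJ * e_high * 1e12:.1f} pJ")
```

Under these assumptions each write costs on the order of 1 to 2.4 pJ in the junction itself, so configuring all 16 MTJs costs tens of pJ before any circuit overhead is counted.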

Delay
• MTJ-based LUT2 has 2.5x longer delay

Leakage Power
• MTJ-LUT1 has slightly higher leakage power
• MTJ-LUT2 has about 5x lower total leakage power:
  – 10x lower storage leakage (due to MTJ)
  – 2x lower logic leakage (from MUX to decoder)

Energy (Operation Frequency: 100 MHz)
• LUT2
  – 4x total energy at 32nm
    ● 1/10 leakage_storage, ½ leakage_logic, larger dynamic_logic
    ● Dynamic_storage overhead decreases as technology scales down.

Energy (Operation Frequency: 250 MHz)
• LUT2
  – 3x total energy at 32nm
    ● 1/10 leakage_storage, ½ leakage_logic, ½ dynamic_logic
    ● Dynamic_storage overhead decreases as technology scales down.

Energy (Operation Frequency: 500 MHz)
• LUT2
  – 2x total energy at 32nm
    ● 1/10 leakage_storage, ½ leakage_logic, ½ dynamic_logic
    ● Dynamic_storage overhead decreases as technology scales down.

Standby Power
• Dynamic sleep transistor
  – 50 mV voltage drop across the sleep transistor
• 5-20x reduction
Table: Standby power (µW) by technology node (90nm, 65nm, 45nm, 32nm) for the CMOS-LUT and MTJ-LUT structures.

Conclusions
• What architecture can best utilize the MTJ's non-volatility to improve energy efficiency?
  – LUT-based logic, which requires no MTJ switching.
• Can an MTJ/CMOS hybrid circuit have a better energy-delay trade-off than a CMOS circuit?
  – Yes.
• How much leakage power can be saved by introducing MTJs to CMOS?
  – About 10x reduction.
• Is there any overhead? How much is the switching power of an MTJ?
  – Yes. MTJ reading energy is an overhead. The writing energy of a modern MTJ is around several pJ.
• What will be the trend of MTJ/CMOS hybrid circuits with technology scaling?
  – They will play a significant role in suppressing leakage below 45 nm.