A 256kb Sub-threshold SRAM in 65nm CMOS


A 256kb Sub-threshold SRAM in 65nm CMOS Benton H. Calhoun, Anantha Chandrakasan Massachusetts Institute of Technology, Cambridge, MA ISSCC 2006 / SESSION 34 / SRAM / 34.4 Advanced VLSI class presentation Presented by: Pouya Kamalinejad 2006/12/28

Outline: Introduction and preliminaries; SNM introduction; Proposed 10T SRAM; Simulation and results; Read-SNM-free SRAM; Conclusion

Why Low Voltage SRAMs? The minimum supply voltage of LSIs is limited by their SRAMs for the following two reasons [2]: 1) With decreasing supply voltage (Vdd), SRAM delay increases at a higher rate than CMOS logic circuit delay does. 2) Read operations at low Vdd levels destroy the data stored in SRAM cells.

Preliminaries [Figures: traditional 6-T SRAM column [3]; bitline discharging for the read operation [3]]

Static Noise Margin The large fraction of chip area often devoted to SRAM makes low-power SRAM design very important. SNM quantifies the amount of voltage noise required at the internal nodes of a bitcell to flip the cell's contents. Degraded SNM can limit voltage scaling for SRAM designs. SNM is the side length of the largest square that can be embedded in the butterfly curve formed by the VTCs of Inverter 1 and Inverter 2 [1]. [Figure: 6T bitcell (M1 to M6, WL, BL, BLB, Q, QB) with noise sources VN, and its butterfly curve]
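The largest-embedded-square definition can be evaluated numerically. The following is a minimal sketch, not the method used in [1]: the tanh-shaped `inverter_vtc` model and the names `bisect`, `lobe_sides`, and `snm` are assumptions made for illustration. It slides a 45° line y = x + c across each lobe of the butterfly curve, finds where the line crosses each VTC, and takes the largest resulting square side per lobe.

```python
import math

def inverter_vtc(vin, vdd, vm, gain):
    """Hypothetical smooth inverter VTC: trip point vm, slope -gain at vm."""
    return 0.5 * vdd * (1.0 - math.tanh(gain * (vin - vm) / (0.5 * vdd)))

def bisect(func, lo, hi, iters=60):
    """Root of a monotonic function on [lo, hi] by bisection."""
    f_lo = func(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lobe_sides(f1, f2, vdd, steps=400):
    """Largest square side in the upper-left and lower-right lobes.

    Curve 1 is y = f1(x); curve 2 is x = f2(y). A square with axis-parallel
    sides has a 45-degree diagonal, so its opposite corners lie on a line
    y = x + c. The side length is the x-distance between the two crossings.
    """
    upper = lower = 0.0
    for i in range(1, steps):
        for c in (vdd * i / steps, -vdd * i / steps):
            x1 = bisect(lambda x: f1(x) - x - c, -vdd, 2 * vdd)  # on curve 1
            y2 = bisect(lambda y: y - f2(y) - c, -vdd, 2 * vdd)  # on curve 2
            x2 = f2(y2)
            if c > 0:
                upper = max(upper, x1 - x2)
            else:
                lower = max(lower, x2 - x1)
    return upper, lower

def snm(f1, f2, vdd):
    """SNM is set by the smaller of the two lobes."""
    return min(lobe_sides(f1, f2, vdd))

vdd = 0.3
matched = lambda v: inverter_vtc(v, vdd, vm=0.15, gain=10.0)
skewed = lambda v: inverter_vtc(v, vdd, vm=0.18, gain=10.0)  # mimics Vt mismatch
print("matched SNM (V):", snm(matched, matched, vdd))
print("skewed  SNM (V):", snm(skewed, matched, vdd))
```

Shifting one inverter's trip point enlarges one lobe and shrinks the other, so the minimum drops; with very high gain the result approaches Vdd/2.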

Cont’d The minimum supply voltage of SRAMs is determined by both Read SNM and Write SNM levels; reducing Vth in the NMOS transistors improves Write SNM but worsens Read SNM. SNM is lower during read access because the VTC is degraded by the voltage divider across the access transistors (M2, M5) and drive transistors (M1, M4) [2]. [Figure: SNM butterfly curve, annotated "Moves to the left" and "Moves upward"]

SNM during HOLD and READ Read SNM is the worst case [1]. [Figure: 6T bitcell (M1 to M6) in hold, with WL=0, and in read, with WL=1 and BL and BLB precharged to 1]

Sub-VT SNM Dependencies SNM is mainly a function of: Vdd (SNM is limited to Vdd/2); temperature (higher temperature results in lower SNM due to lower gain); sizing (cell ratio affects SNM less in sub-threshold, due to the logarithmic relation, unless it affects Vt); bit-line voltage; Vt mismatch (the worst contributor). The model* gives a good estimate of the SNM distribution at the worst-case tail [1]. [Figure: SNM distribution, labeled "Normal distribution"]

How to reduce Vdd? Impact of local mismatch on 6T SNM in 65nm: Read SNM has a larger standard deviation. Hold SNM at 0.3V has roughly the same mean as Read SNM at 0.5V and the same 6σ SNM as Read SNM at 0.6V [2]. Thus, by eliminating the degraded Read SNM, a bitcell can be operated at 0.3V with the same 6σ stability as a 6T bitcell at 0.6V.
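Why stability is quoted at 6σ rather than at the mean can be seen from the tail probability. The sketch below assumes SNM is normally distributed (an assumption of this example) and takes 256kb to mean 256 x 1024 cells; the function names are hypothetical.

```python
import math

def normal_tail(k_sigma):
    """P(X < mean - k_sigma * sigma) for a normally distributed X."""
    return 0.5 * math.erfc(k_sigma / math.sqrt(2.0))

def expected_failing_cells(n_cells, k_sigma):
    """Expected number of cells whose SNM falls below the k-sigma point."""
    return n_cells * normal_tail(k_sigma)

n_cells = 256 * 1024  # assumed size of a 256kb array
for k in (3, 4, 5, 6):
    print(k, normal_tail(k), expected_failing_cells(n_cells, k))
```

Under these assumptions a 3σ margin would leave hundreds of cells in a 256kb array below the margin point, while at 6σ the expected count is far below one.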

Cont’d The idea is to add a 4T buffer (M7 to M10) at one side of the 6T bitcell to remove the problem of Read SNM by buffering the stored data during a read access. Thus, the worst-case SNM for this bitcell is the Hold SNM related to M1 to M6, which is the same as the 6T Hold SNM for same-sized M1 to M6. [Figure: proposed 10-T bitcell for sub-VT [1]: 6T bitcell (WL, BL, BLB, Q, QB, VVDD) plus 4T buffer (M7 to M10, RBL, RWL)]

10T Bitcell Reduces Bitline Leakage With QB=1 and RBL=1, QBB is held near 1, so the leakage current through M8 is reduced. With QB=0 and RBL=1, QBB=1 and leakage is reduced by the stack. At iso-VDD, the 10T cell without M10 (a 9T cell) has 50% higher leakage current than the 6T, but adding M10 drops the overhead to 16% [1].

Leakage Power Savings with 10T Bitcell 6T memories in 65nm usually operate at 0.9V or greater (lowest reported is 0.7V). The 10T bitcell allows scaling to lower voltages. Lower-voltage operation reduces leakage power dramatically for unaccessed cells [1].

Bitline Leakage Limits Integration Level 16 bitcells per bitline is the best one can hope for with a standard 6T cell [1]. [Figure: bitcells storing "1" and "0" sharing a bit-line]

Cont’d BL leakage limits the number of cells on a BL. The 10T bitcell can sustain 256 cells/BL at 0.3V compared to 16 without M10 (6T or 9T). The higher level of integration allowed by the 10T cell reduces the peripheral circuits and slightly mitigates the bitcell area overhead [1].
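A first-order sketch of why bitline leakage caps the number of cells per bitline at low voltage: in sub-threshold, the on/off current ratio of a transistor is roughly exp(Vdd / (n·vT)), so it shrinks exponentially as Vdd drops. The model below is an illustration only. The slope factor `n`, the `margin` factor, and the function names are assumptions, mismatch and the M10 leakage reduction are ignored, and the output is not meant to reproduce the 16 and 256 cells/BL figures from [1].

```python
import math

K_B_OVER_Q = 8.617333e-5  # Boltzmann constant over electron charge, V/K

def ion_ioff_ratio(vdd, n=1.5, temp_k=300.0):
    """Sub-threshold Ion/Ioff for Vgs = Vdd versus Vgs = 0 (valid for Vdd below Vt)."""
    v_thermal = K_B_OVER_Q * temp_k
    return math.exp(vdd / (n * v_thermal))

def max_cells_per_bitline(vdd, margin=10.0, n=1.5, temp_k=300.0):
    """Largest N such that (N - 1) leaking cells draw at most Ion / margin."""
    return int(ion_ioff_ratio(vdd, n, temp_k) / margin) + 1

for vdd in (0.2, 0.25, 0.3):
    print(vdd, ion_ioff_ratio(vdd), max_cells_per_bitline(vdd))
```

The same expression shows the temperature sensitivity: a higher temperature raises vT and lowers the ratio at a given Vdd.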

10T Bitcell Allows Sub-VT Write To achieve write in sub-threshold, the virtual supply (VVDD) to the selected cells floats during the write operation. A folded WL shares VVDD [1]. [Figure: memory cells (MC) on a folded word line, with VDDon, VVDD, WL, RWL, BL, BLB, RBL, Q, QB]

Cont’d Floating VDD weakens feedback and allows write. A virtual supply voltage (VVDD) that floats during write allows robust write operation into sub-VT (mono-stable butterfly curve). VVDD stops floating while WL_WR remains asserted, so that feedback restores the '1' value to full VDD [1].

Test Chip Architecture 256 rows and 128 columns per block. Static CMOS peripherals. Separate WL VDD for boosting. Assumed 1x1 redundancy. Simulation: operates at 300mV across all process corners from 0 to 100°C [1].
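The block organization implies the number of blocks in the array. A small worked calculation, assuming "256kb" means 256 x 1024 bits and that redundant rows and columns are not counted:

```python
ROWS_PER_BLOCK = 256
COLS_PER_BLOCK = 128
TOTAL_BITS = 256 * 1024  # assumption: 256kb = 256 x 1024 bits

bits_per_block = ROWS_PER_BLOCK * COLS_PER_BLOCK
num_blocks = TOTAL_BITS // bits_per_block

print(bits_per_block, num_blocks)  # 32768 8
```

With 256 rows per block, each read bitline serves 256 cells, which matches the 256 cells/BL quoted for the 10T bitcell.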

256kb 65nm Sub-VT memory Test chip addressing the sub-VT problems using the 10T bitcell: 1.89mm by 1.12mm. The chip functions to below 400mV and holds without error to <250mV. At 400mV, it consumes 3.28μW and runs at 475kHz at 27°C. It reads without error to 320mV (27°C) and 360mV (85°C), and writes without error to 380mV (27°C) and 350mV (85°C) [1].
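The 400mV operating point can be turned into an energy-per-cycle figure by dividing power by frequency. This worked example takes the power as 3.28 μW at 475 kHz, the values reported in [1]; the function name is hypothetical, and the result is total energy per cycle, leakage included.

```python
def energy_per_cycle(power_w, freq_hz):
    """Total energy per clock cycle in joules (power divided by frequency)."""
    return power_w / freq_hz

e = energy_per_cycle(3.28e-6, 475e3)
print(f"{e * 1e12:.2f} pJ per cycle")  # 6.91 pJ per cycle
```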

Simulation results Chip functioned correctly to below 400mV. Scope plot shows 300mV operation; at this low voltage, some bit errors were observed[1].

Power Measurements Relative to a 0.6V 6T SRAM: 2.2X less leakage power at 0.4V and 3.3X less leakage power at 0.3V. More than 60X less leakage power than at 1.2V [1].

Active Energy Savings with 10T Bitcell 6T memories in 65nm usually operate at 0.9V or greater (lowest reported is 0.7V). Operating the 10T bitcell at lower voltages saves energy. The 10T memory can provide high-frequency operation at higher voltages when necessary (200MHz at 1.2V) [1].

VDD Scaling Limits Redundancy and/or a boosted WL account for mismatch. With 1x1 redundancy and WL boosting, read works to 320mV and write works to 380mV [1]. [Figure: read bit errors, labeled 1, 2, and 3 columns (of 1024), and write bit errors, labeled 1, 4, and 5 rows (of 2048), as VDD is scaled]

Conclusions Standard 6T approach limited to ~0.6-0.7V and 16 cells per bitline. Proposed 10T bitcell shows sub-threshold operation with overall power and energy savings. Sub-VT memory requires circuits and architectures to manage variability and low Ion/Ioff.

A read SNM free SRAM Effect of decreasing Read SNM in conventional SRAM cells: When SNM>0mV, stable data retention is still achieved even though the voltage at Node V1 may slightly exceed "0". When SNM<0mV, however, the stored data is overwritten with its inverse: 1) Node V1 voltage greatly exceeds "0". 2) Node V2 voltage falls below "1" because Node V1 voltage reaches the logic threshold voltage of the CMOS inverter (P2, N2). 3) The fall in Node V2 voltage raises Node V1 voltage further, so the reversed data is written into the cell. [3]

Cont’d [3]

Proposed read SNM free SRAM N5 is added between Node V2 and NMOS transistor N2. When the cell is not accessed, /WL is high; when the cell is accessed, /WL is low. N5 prevents V2 from decreasing, and thus the data bit is not reversed even if SNM equals zero. The period of /WL activation is less than the V2 retention time. [3]

Cont’d Since the cell has 7 transistors and 7 is a prime number, a useless gap equal to one transistor results in the layout. Solution: combine the SRAM cell and the sensing circuit. A PMOS and an NMOS transistor are placed, respectively, in GAP (P) and GAP (N), between two L-shaped SRAM cells. [3]

Measurement results [3]
Organization: 4Kword x 16b
Clock access time: 1.2 ns at 1.0 V; 20 ns at 0.5 V
Power consumption: 12.9 mA/GHz at 1.0 V
Supply voltage: 1.0 V to 0.44 V
Process technology: 90-nm ASPLA CMOS (NMOS Vth: 0.32 V, PMOS Vth: -0.33 V)
Macro size: 0.4 mm x 0.7 mm
Cell size: 2.09 μm² (based on logic rules)
Vdd-min decreases with increasing temperature. Vdd-min is determined by the Write SNM, which, unlike Read SNM, improves with decreasing Vth levels in NMOS transistors, and Vth decreases with increasing temperature. [3]

SRAM macro layout and chip microphotograph Since both Write SNM and SRAM cell current improve with decreasing Vth levels in NMOS transistors, it is possible to achieve even higher-speed and lower-Vdd operation by reducing Vth levels below 0.32V. [3]

REFERENCES
[1] Benton H. Calhoun, Anantha Chandrakasan, "A 256kb Sub-threshold SRAM in 65nm CMOS," Massachusetts Institute of Technology, Cambridge, MA. ISSCC 2006 / SESSION 34 / SRAM / 34.4
[2] Benton H. Calhoun and Anantha Chandrakasan, "Analyzing Static Noise Margin for Subthreshold SRAM in 65nm CMOS," MIT, 50 Vassar St 38-107, Cambridge, MA, 02139 USA, {bcalhoun,anantha}@mtl.mit.edu
[3] Koichi Takeda, Yasuhiko Hagihara, Yoshiharu Aimoto, Masahiro Nomura, Yoetsu Nakazawa, Toshio Ishii, Hiroyuki Kobatake, "A Read-Static-Noise-Margin-Free SRAM Cell for Low-Vdd and High-Speed Applications," ISSCC 2005 / SESSION 26 / STATIC MEMORY / 26.3

THANK YOU…