Sp09 CMPEN 411 L23 S.1 CMPEN 411 VLSI Digital Circuits Spring 2009 Lecture 23: Memory Cell Designs SRAM, DRAM [Adapted from Rabaey’s Digital Integrated.

Presentation transcript:

Sp09 CMPEN 411 L23 S.1 CMPEN 411 VLSI Digital Circuits Spring 2009 Lecture 23: Memory Cell Designs SRAM, DRAM [Adapted from Rabaey’s Digital Integrated Circuits, Second Edition, ©2003 J. Rabaey, A. Chandrakasan, B. Nikolic]

Sp09 CMPEN 411 L23 S.2 Heads-up
- IBM Kerry Bernstein’s talk: Thursday 4 PM, IST 333
  - To prepare for his talk, go to the ANGEL system and find the file “New dimensions in performance” under “Interesting Reading Materials”
- To make up the last cancelled lecture:
  - Kerry Bernstein’s talk “Microarchitecture’s Race for Performance and Power” (PSU talk, 11/2004); slides and videos are online in the ANGEL system under “Interesting Reading Materials”
- DAC Young Student Scholarship

Sp09 CMPEN 411 L23 S.3 Review: Basic Building Blocks
- Datapath
  - Execution units: adder, multiplier, divider, shifter, etc.
  - Register file and pipeline registers
  - Multiplexers, decoders
- Control
  - Finite state machines (PLA, ROM, random logic)
- Interconnect
  - Switches, arbiters, buses
- Memory
  - ROM, caches (SRAMs), CAM, DRAMs, buffers

Sp09 CMPEN 411 L23 S.4 2D 4x4 SRAM Memory Bank
[Figure: 4x4 SRAM array with 2-bit words. Labels: address bits A0, A1, A2; row decoder; word lines WL[0] to WL[3]; column decoder; bit lines BL and !BL (columns BL i, BL i+1); bit line precharge; sense amplifiers; write circuitry; clocking and control; enable, read, precharge.]

Sp09 CMPEN 411 L23 S.5 6-Transistor SRAM Storage Cell
[Figure: cell with transistors M1 to M6, storage nodes Q and !Q, word line WL, and bit lines BL and !BL. Annotations: 1, 0, on, off.]

Sp09 CMPEN 411 L23 S.6 SRAM Cell Analysis (Read)
[Figure: read half-circuit with BL = !BL = 2.5 V, WL = 1, Q = 1, !Q = 0, transistors M1, M4, M5, M6, and bit line capacitance C_bit.]
- Read-disturb (read-upset): the voltage rise on !Q must be limited to prevent read upsets, while maintaining acceptable circuit speed and area
  - M1 must be stronger than M5 when storing a 1 (as shown)
  - M3 must be stronger than M6 when storing a 0

Sp09 CMPEN 411 L23 S.7 Read Voltage Ratios
[Figure: voltage rise on !Q versus cell ratio, for V_DD = 2.5 V and V_Tn = 0.4 V; the plot marks CR ≥ 1.2.]
CR is the Cell Ratio = (W1/L1)/(W5/L5)
- Keep cell size minimal while maintaining read stability
  - Make M1 minimum size and increase the L of M5 (to make it weaker); this increases the load on WL
  - Make M5 minimum size and increase the W of M1 (to make it stronger)
- Similar constraints apply to (W3/L3)/(W6/L6) when storing a 0
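
The equation behind this plot did not survive in the transcript. As a hedged sketch, the snippet below uses the velocity-saturated expression given in Rabaey's textbook for the voltage rise on the low storage node during a read. V_DD and V_Tn are the slide's values; the saturation voltage V_DSATn = 0.63 V is an assumption (a typical value for a 0.25 μm process), and the function and parameter names are illustrative only.

```python
import math


def read_voltage_rise(cr, vdd=2.5, vtn=0.4, vdsat=0.63):
    """Voltage rise (in volts) on the low storage node during a read.

    Balances the current of the access transistor (velocity saturated)
    against the pull-down transistor (linear region). cr is the cell
    ratio (W1/L1)/(W5/L5); vdsat is an assumed process value.
    """
    a = vdd - vtn
    root = math.sqrt(vdsat**2 * (1 + cr) + (cr * a) ** 2)
    return (vdsat + cr * a - root) / cr


if __name__ == "__main__":
    for cr in (1.0, 1.2, 1.5, 2.0):
        print(f"CR = {cr:.1f}: rise = {read_voltage_rise(cr):.3f} V")
```

Under these assumptions the rise falls below V_Tn = 0.4 V at roughly CR = 1.2, which is consistent with the value marked on the slide.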

Sp09 CMPEN 411 L23 S.8 SRAM Cell Analysis (Write)
[Figure: write half-circuit with !BL = 2.5 V, BL = 0 V, WL = 1, Q = 1, !Q = 0, transistors M1, M4, M5, M6, and C_bit.]
- The !Q side of the cell cannot be pulled high enough to ensure writing of a 0 (because M1 is on and sized to protect against read upset), so the new value of the cell has to be written through M6
  - M6 must be able to overpower M4 when storing a 1 and writing a 0
  - M5 must be able to overpower M2 when storing a 0 and writing a 1

Sp09 CMPEN 411 L23 S.9 Write Voltage Ratios
[Figure: write voltage versus pull-up ratio, for V_DD = 2.5 V, |V_Tp| = 0.4 V, and μp/μn = 0.5; the plot marks PR ≤ 1.8.]
PR is the Pull-up Ratio = (W4/L4)/(W6/L6)
- Keep cell size minimal while allowing writes
  - Make M4 and M6 minimum size

Sp09 CMPEN 411 L23 S.10 Cell Sizing and Performance
- Keeping cell size minimal is critical for large SRAMs
  - Minimum-sized pull-down FETs (M1 and M3)
    - Requires pass transistors (M5 and M6) with longer than minimum channel length L to ensure proper CR
    - But up-sizing L of the pass transistors increases the capacitive load on the word lines and limits the current discharged on the bit lines, both of which can slow the read cycle
  - Minimum width and length pass transistors
    - Boost the width of the pull-downs (M1 and M3)
    - Reduces the loading on the word lines and increases the storage capacitance in the cell (both are good), but the cell may be slightly larger
- Performance is determined by the read operation
  - To accelerate the read time, SRAMs use sense amplifiers (so that the bit line doesn’t have to make a full swing)

Sp09 CMPEN 411 L23 S.11 6-T SRAM Layout
[Figure: cell layout with V_DD, GND, Q, !Q, WL, BL, and transistors M1 to M6.]
- Simple and reliable, but big
  - Signal routing and connections to two bit lines, a word line, and both supply rails
- Area is dominated by the wiring and contacts
- Alternatives to the 6-T cell include the resistive-load 4-T cell and the TFT cell, neither of which is available in a standard CMOS logic process

Sp09 CMPEN 411 L23 S.12 Multiple Read/Write Port Storage Cell
[Figure: two-port cell with transistors M1 to M8, storage nodes Q and !Q, word lines WL1 and WL2, and bit lines BL1, !BL1, BL2, !BL2.]
- To avoid read upset, the widths of M1 and M3 have to be sized up by a factor equal to the number of simultaneously open read ports

Sp09 CMPEN 411 L23 S.13 Resistance-load SRAM Cell
[Figure: cell with two load resistors R_L tied to V_DD, transistors M1 to M4, storage nodes Q and !Q, word line WL, and bit line BL.]

Sp09 CMPEN 411 L23 S.14 Remove R
[Figure: the cell without load resistors; transistors M1 to M4, word line WL, and bit line BL.]

Sp09 CMPEN 411 L23 S.15 Remove R
[Figure: transistors M2, M3, M4 and word line WL.]
- Further remove one transistor

Sp09 CMPEN 411 L23 S.16 3-Transistor DRAM Cell
[Figure: cell with transistors M1, M2, M3, storage node X with capacitance C_s, bit lines BL1 and BL2, write word line WWL, and read word line RWL. Waveforms for a write followed by a read: WWL, BL1 (V_DD), X (V_DD - V_T), RWL, and BL2 (V_DD - V_T, with swing ΔV).]
- Write: C_s is charged (or discharged) by asserting WWL and BL1
  - The value stored at node X when writing a 1 is V_WWL - V_Tn
- Read: C_s is “sensed” by asserting RWL and observing BL2
  - Read is non-destructive and inverting (ratioless)

Sp09 CMPEN 411 L23 S.17 3-Transistor DRAM Cell
[Figure: 3-T cell (M1, M2, M3, node X, C_s, BL1, BL2, WWL, RWL) with write and read waveforms.]
- Refresh: read the stored data, put its inverse on BL1, and assert WWL (this must be done every 1 to 4 ms)
- Note the V_T drop at X: how to fix it?

Sp09 CMPEN 411 L23 S.18 3-T DRAM Layout
[Figure: cell layout with BL2, BL1, GND, RWL, WWL, and transistors M1, M2, M3.]
- Fewer contacts and wires
- Total cell area is … λ² (compared to 1,092 λ² for the 6-T SRAM cell)
- No special processing steps are needed (so the cell is compatible with a logic CMOS process)
- Can use bootstrapping (raising V_WWL to a value higher than V_DD) to eliminate the threshold drop when storing a “1”
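
The bootstrapping point can be illustrated with a first-order model: an NMOS pass transistor passes a high level only up to its gate voltage minus one threshold. The sketch below is an assumption-laden illustration (it ignores the body effect, and the function name and the voltage values are not from the slide).

```python
def stored_high_level(v_bl, v_wwl, v_tn):
    """Voltage left on node X after writing a 1 through an NMOS pass device.

    First-order model: the node charges to the bit line voltage or to one
    threshold below the word line voltage, whichever is lower.
    """
    return min(v_bl, v_wwl - v_tn)


if __name__ == "__main__":
    VDD, VTN = 2.5, 0.4  # illustrative values (assumptions)
    # Word line driven to VDD: node X loses one threshold voltage.
    print(stored_high_level(VDD, VDD, VTN))
    # Word line bootstrapped one threshold above VDD: node X reaches about VDD.
    print(stored_high_level(VDD, VDD + VTN, VTN))
```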

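Sp09 CMPEN 411 L23 S.19 1-Transistor DRAM Cell
[Figure: cell with access transistor M1, storage node X with capacitance C_s, word line WL, and bit line BL with capacitance C_BL. Waveforms for “write 1” and “read 1”: WL, X (V_DD - V_T), and BL (V_DD during the write, V_DD/2 before sensing).]
- Write: C_s is charged (or discharged) by asserting WL and BL
- Read: charge redistribution occurs between C_BL and C_s
  - Read is destructive, so the cell must be refreshed after a read
  - Voltage swing is small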
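
To see why the swing is small, the charge-sharing relation ΔV = (V_X - V_PRE) · C_s / (C_s + C_BL) can be evaluated directly. This is a minimal sketch: the capacitance values below are assumptions chosen only to show the order of magnitude, and the function name is illustrative.

```python
def bitline_swing(v_cell, v_pre, c_s, c_bl):
    """Bit line voltage change after the cell shares charge with the bit line."""
    return (v_cell - v_pre) * c_s / (c_s + c_bl)


if __name__ == "__main__":
    VDD, VT = 2.5, 0.4            # supply and threshold (illustrative)
    C_S, C_BL = 30e-15, 300e-15   # assumed storage and bit line capacitances
    v_pre = VDD / 2               # bit line precharge level
    swing_1 = bitline_swing(VDD - VT, v_pre, C_S, C_BL)  # cell stored a 1
    swing_0 = bitline_swing(0.0, v_pre, C_S, C_BL)       # cell stored a 0
    print(f"reading a 1: {swing_1 * 1e3:+.1f} mV")
    print(f"reading a 0: {swing_0 * 1e3:+.1f} mV")
```

With a bit line capacitance ten times the storage capacitance, the signal is on the order of 100 mV, which is why a sense amplifier is needed for the read to work at all.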

Sp09 CMPEN 411 L23 S.20 Sense Amp Operation
[Figure: bit line voltage V_BL versus time t, starting at V_PRE and resolving to V(1) or V(0); markers: “Word line activated”, “Sense amp activated”.]

Sp09 CMPEN 411 L23 S.21 1-T DRAM Cell Observations
- Cell is single ended (complicates the design of the sense amp)
- Cell requires a sense amp for each bit line because the read is based on charge redistribution
  - BLs are precharged to V_DD/2 (not V_DD as in the SRAM design)
  - All previous designs used SAs for speed, not functionality
- Cell read is destructive; a refresh must follow to restore the data
- Cell requires an extra capacitor (C_S) that must be explicitly included in the design
  - May not be compatible with a logic CMOS process
- A threshold voltage is lost when writing a 1 (can be circumvented by bootstrapping the word lines to a value higher than V_DD)

Sp09 CMPEN 411 L23 S.22 1-T DRAM (3-D capacitor) Source: IBM Non-CMOS

Sp09 CMPEN 411 L23 S.23 Peripheral Memory Circuitry
- Row and column decoders
- Read bit line precharge logic
- Sense amplifiers
- Timing and control
- Speed
- Power consumption
- Area: pitch matching

Sp09 CMPEN 411 L23 S.24 2D 4x4 __RAM Memory
[Figure: 4x4 array with 2-bit words. Labels: address bits A0, A1, A2; row decoder; word lines WL[0] to WL[3]; column decoder; bit lines BL and !BL (columns BL i, BL i+1); bit line precharge; sense amplifiers; write circuitry; clocking and control; enable, read, precharge.]

Sp09 CMPEN 411 L23 S.25 2D 4x4 ___RAM Memory
[Figure: 4x4 array with 2-bit words. Labels: address bits A0, A1, A2; row decoder; word lines WL[0] to WL[3]; column decoder; bit lines BL 0 to BL 3; bit line precharge; sense amplifiers; write circuitry; clocking, control, and refresh; enable, read, precharge.]

Sp09 CMPEN 411 L23 S.26 Row Decoders
- Collection of 2^M complex logic gates organized in a regular, dense fashion
- (N)AND decoder for 8 address bits
    WL(0) = !A7 & !A6 & !A5 & !A4 & !A3 & !A2 & !A1 & !A0
    …
    WL(255) = A7 & A6 & A5 & A4 & A3 & A2 & A1 & A0
- NOR decoder for 8 address bits
    WL(0) = !(A7 | A6 | A5 | A4 | A3 | A2 | A1 | A0)
    …
    WL(255) = !(!A7 | !A6 | !A5 | !A4 | !A3 | !A2 | !A1 | !A0)
- Goals: pitch matched, fast, low power
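
The AND and NOR forms above are the same function by De Morgan's law. The following behavioral sketch (plain Python, not a circuit model; the function names are illustrative) builds both forms literal by literal so the equivalence can be checked for every address.

```python
def decode_and(addr, bits=8):
    """Word lines from the AND form: WL(row) is the AND of matching address literals."""
    lines = []
    for row in range(2**bits):
        literals = []
        for b in range(bits):
            a = (addr >> b) & 1
            # Use A_b where the row number has a 1, and !A_b where it has a 0.
            literals.append(a if (row >> b) & 1 else 1 - a)
        lines.append(int(all(literals)))
    return lines


def decode_nor(addr, bits=8):
    """Word lines from the NOR form: WL(row) is the NOR of the complemented literals."""
    lines = []
    for row in range(2**bits):
        literals = []
        for b in range(bits):
            a = (addr >> b) & 1
            # Complement of the literal used in the AND form.
            literals.append(1 - a if (row >> b) & 1 else a)
        lines.append(int(not any(literals)))
    return lines


if __name__ == "__main__":
    # Small demonstration with 2 address bits (4 word lines).
    for addr in range(4):
        print(addr, decode_and(addr, bits=2), decode_nor(addr, bits=2))
```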

Sp09 CMPEN 411 L23 S.27 Dynamic Decoders
[Figure: dynamic 2-input NOR decoder and dynamic 2-input NAND decoder, showing the precharge devices, V_DD, GND, address inputs A0, !A0, A1, !A1, and the word lines (WL3 labeled).]
- Which one is faster? Smaller? Lower power?

Sp09 CMPEN 411 L23 S.28 Pass Transistor Based Column Decoder
[Figure: a 2-input NOR decoder driven by A0 and A1 generates select signals S0 to S3; pass transistors connect BL0 to BL3 to data_out and !BL0 to !BL3 to !data_out.]
- Read: connect the BLs to the sense amps (SA)
- Write: drive one of the BLs low to write a 0 into the cell
  - Fast, since there is only one transistor in the signal path; however, the transistor count is large: (K+1) × 2^K + 2 × 2^K
  - For K = 2: 3 × 2^2 (decoder) + 2 × 2^2 (PTs) = 12 + 8 = 20

Sp09 CMPEN 411 L23 S.29 Tree Based Column Decoder
[Figure: binary tree of pass transistors controlled by A0, !A0, A1, !A1, connecting BL0 to BL3 to data_out and !BL0 to !BL3 to !data_out.]
- Number of transistors = 2 × 2 × (2^K - 1); for K = 2: 2 × 2 × (2^2 - 1) = 4 × 3 = 12
- Delay increases quadratically with the number of sections (K), which is prohibitive for large decoders
  - Can be fixed with buffers, progressive sizing, or a combination of the tree and pass transistor approaches
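
The two transistor-count formulas from the column decoder slides can be compared for a range of K. This is only a tabulation of the slides' formulas; the function names are illustrative.

```python
def pass_transistor_decoder_count(k):
    """Pass-transistor column decoder: (K+1)*2^K for the decoder plus 2*2^K pass devices."""
    decoder = (k + 1) * 2**k
    pass_devices = 2 * 2**k
    return decoder + pass_devices


def tree_decoder_count(k):
    """Tree column decoder: 2 * 2 * (2^K - 1) transistors."""
    return 2 * 2 * (2**k - 1)


if __name__ == "__main__":
    print(" K  pass-transistor  tree")
    for k in range(1, 7):
        print(f"{k:2d}  {pass_transistor_decoder_count(k):15d}  {tree_decoder_count(k):4d}")
```

The tree decoder needs fewer transistors for every K, but it pays with K series devices in the signal path.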

Sp09 CMPEN 411 L23 S.30 Bit Line Precharge Logic
[Figure: precharge circuit with signals !PC, BL, and !BL, and the equalization transistor.]
- Equalization transistor: speeds up equalization of the two bit lines by allowing the capacitance and pull-up device of the nondischarged bit line to assist in precharging the discharged line
- First step of a read cycle is to precharge (PC) the bit lines to V_DD
  - Every differential signal in the memory must be equalized to the same voltage level before a read
- Turn off PC and enable the WL
  - The grounded PMOS load limits the bit line swing (speeding up the next precharge cycle)

Sp09 CMPEN 411 L23 S.31 Sense Amplifiers
[Figure: SA block with input and output.]
- Amplification: resolves data with small bit line swings (in some DRAMs, required for proper functionality)
- Delay reduction: compensates for the limited drive capability of the memory cell to accelerate the BL transition
    t_p = (C × ΔV) / I_av
  - C is large and I_av is small, so make ΔV as small as possible
- Power reduction: eliminates a large part of the power dissipation due to charging and discharging bit lines
- Signal restoration: for DRAMs, the bit lines must be driven full swing after sensing (read) to refresh the data
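
A quick numeric reading of t_p = (C × ΔV) / I_av shows how much the required swing matters. The capacitance, cell current, and swing values below are assumptions picked for illustration, not figures from the lecture.

```python
def bitline_delay(c_bl, delta_v, i_av):
    """Time for the cell current i_av to move a bit line of capacitance c_bl by delta_v."""
    return c_bl * delta_v / i_av


if __name__ == "__main__":
    C_BL = 1e-12    # assumed bit line capacitance: 1 pF
    I_AV = 100e-6   # assumed average cell current: 100 uA
    full_swing = bitline_delay(C_BL, 2.5, I_AV)    # wait for a full 2.5 V swing
    small_swing = bitline_delay(C_BL, 0.25, I_AV)  # sense amp fires after 250 mV
    print(f"full swing:  {full_swing * 1e9:.1f} ns")
    print(f"small swing: {small_swing * 1e9:.1f} ns")
```

Because the delay is linear in ΔV, cutting the required swing by a factor of ten cuts the bit line delay by the same factor.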

Sp09 CMPEN 411 L23 S.32 Differential Sense Amplifier
[Figure: amplifier with transistors M1 to M5, V_DD, and signals bit, SE, y, and Out.]
- Directly applicable to SRAMs

Sp09 CMPEN 411 L23 S.33 Differential Sensing ― SRAM

Sp09 CMPEN 411 L23 S.34 Reliability and Yield
- Memories operate under low signal-to-noise conditions
  - Word line to bit line coupling can vary substantially over the memory array
    - Folded bit line architecture (routing BL and !BL next to each other ensures a closer match between parasitics and bit line capacitances)
  - Interwire bit line to bit line coupling
    - Transposed (or twisted) bit line architecture (turns the noise into a common-mode signal for the SA)
  - Leakage (in DRAMs) requiring a refresh operation
- Memories suffer from low yield due to high density and structural defects
  - Increase yield by using error correction (e.g., parity bits) and redundancy
- Memories are susceptible to soft errors due to alpha particles and cosmic rays

Sp09 CMPEN 411 L23 S.35 Redundancy in the Memory Structure
[Figure: memory array with row address and column address inputs, a redundant row, redundant columns, and a fuse bank.]

Sp09 CMPEN 411 L23 S.36 Row Redundancy
[Figure: two copies of the same structure. Labels: functional address, fused repair addresses, comparator (== ?), redundant wordline, enable, normal wordline decoder, normal wordline.]
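
One plausible reading of the row redundancy figure is that the incoming address is compared against the fused repair addresses: on a match the redundant wordline fires and the normal decoder is disabled. The sketch below models that behavior only; the function name and the `repair_map` structure are hypothetical and not taken from the lecture.

```python
def select_wordline(address, repair_map):
    """Pick the wordline to fire for a row address.

    repair_map is a hypothetical stand-in for the fuse bank: it maps each
    faulty row address to the index of the redundant row that replaces it.
    """
    if address in repair_map:
        # Comparator match: fire the redundant wordline, disable the normal decoder.
        return ("redundant", repair_map[address])
    # No match: the normal wordline decoder stays enabled.
    return ("normal", address)


if __name__ == "__main__":
    fuses = {5: 0, 200: 1}  # rows 5 and 200 were found defective at test time
    for addr in (4, 5, 200):
        print(addr, select_wordline(addr, fuses))
```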

Sp09 CMPEN 411 L23 S.37 Column Redundancy

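Sp09 CMPEN 411 L23 S.38 Error-Correcting Codes
Example: Hamming codes
[Figure: check-bit equations for the example; e.g., if B3 flips, the syndrome = 3.]
- 2^k >= m + k + 1, where m is the number of data bits and k is the number of check bits
- For 64 data bits, 7 check bits are needed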
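
Both points on this slide can be checked with a few lines of code. The sketch below assumes the standard single-error-correcting Hamming layout (check bits at positions 1, 2, 4, ...; positions numbered from 1); the function names are illustrative and the slide's own example table is not reproduced here.

```python
def min_check_bits(m):
    """Smallest k with 2^k >= m + k + 1 for m data bits."""
    k = 0
    while 2**k < m + k + 1:
        k += 1
    return k


def syndrome(codeword):
    """XOR of the 1-based positions of all set bits; 0 for a valid codeword."""
    s = 0
    for pos, bit in enumerate(codeword, start=1):
        if bit:
            s ^= pos
    return s


def encode(data_bits):
    """Place data bits at non-power-of-two positions, then set the check bits."""
    m = len(data_bits)
    k = min_check_bits(m)
    code = [0] * (m + k)
    data = iter(data_bits)
    for pos in range(1, m + k + 1):
        if pos & (pos - 1):  # nonzero only when pos is not a power of two
            code[pos - 1] = next(data)
    s = syndrome(code)
    for i in range(k):
        if (s >> i) & 1:
            code[(1 << i) - 1] = 1  # check bit at position 2^i cancels bit i of s
    return code


if __name__ == "__main__":
    print("check bits for 64 data bits:", min_check_bits(64))
    word = encode([1, 0, 1, 1])
    print("codeword:", word, "syndrome:", syndrome(word))
    word[3 - 1] ^= 1  # flip the bit at position 3
    print("after flipping bit 3, syndrome:", syndrome(word))
```

The syndrome equals the position of the flipped bit, which is what lets the decoder correct it.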

Sp09 CMPEN 411 L23 S.39 Performance and area overhead for ECC

Sp09 CMPEN 411 L23 S.40 Redundancy and Error Correction

Sp09 CMPEN 411 L23 S.41 Soft Errors
- Nonrecurrent and nonpermanent errors from
  - alpha particles (from the packaging materials)
  - neutrons from cosmic rays
- As feature size decreases, the charge stored at each node decreases (due to a lower node capacitance and lower V_DD), so Q_critical (the charge necessary to cause a bit flip) decreases, leading to an increase in the soft error rate (SER)
[Figure: from Semico Research Corp.]
[Table, from Actel: MTBF (hours) at 0.13 μm and 0.09 μm for a ground-based system, a civilian avionics system, and a military avionics system; the only legible entry is 189, next to the military avionics system.]

Sp09 CMPEN 411 L23 S.42 Scary Fact
- Avionics system in civilian aviation: an altitude of 30,000 feet and a route crossing the North Pole both increase the neutron flux.
- If the avionics board uses four 1M 130 nm SRAM-based FPGAs, it would be subject to … upsets per day = 324 hours between upsets, or 3 million FITs.
- Assume one such system on board each commercial aircraft, 4,000 civilian flights per day, and a 3-hour average flight time.
- Nearly 37 aircraft will experience a neutron-induced SRAM-based FPGA configuration failure during their flight.
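
The arithmetic on this slide can be reproduced from the 324-hour figure alone. The sketch below uses only numbers stated on the slide plus the usual definition of a FIT as one failure per 10^9 device-hours; the variable names are illustrative.

```python
HOURS_BETWEEN_UPSETS = 324.0  # from the slide
FLIGHTS_PER_DAY = 4000        # from the slide
HOURS_PER_FLIGHT = 3.0        # from the slide

# Upset rate implied by the mean time between upsets.
upsets_per_day = 24.0 / HOURS_BETWEEN_UPSETS

# FIT = failures per 10^9 device-hours.
fits = 1e9 / HOURS_BETWEEN_UPSETS

# Expected number of flights per day that see an upset while airborne.
affected_flights = FLIGHTS_PER_DAY * HOURS_PER_FLIGHT / HOURS_BETWEEN_UPSETS

if __name__ == "__main__":
    print(f"upsets per day per board: {upsets_per_day:.3f}")
    print(f"FIT rate: {fits / 1e6:.2f} million")
    print(f"affected flights per day: {affected_flights:.1f}")
```

The result agrees with the slide's "3 million FITs" and "nearly 37 aircraft" to within rounding.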

Sp09 CMPEN 411 L23 S.43 Modeling of a particle strike

Sp09 CMPEN 411 L23 S.44 A SPICE simulation for SRAM A particle strike !BL BL WL 0->1 1->0 0

Sp09 CMPEN 411 L23 S.45 On-chip Memory: ITRS roadmap

Sp09 CMPEN 411 L23 S.46 State of Art

Sp09 CMPEN 411 L23 S.47 State of Art