CSE241 R2 Datapath/Memory.1Kahng & Cichy, UCSD ©2003 CSE241A VLSI Digital Circuits Winter 2003 Recitation 02: Datapath and Memory.

Presentation transcript:

CSE241 R2 Datapath/Memory.1Kahng & Cichy, UCSD ©2003 CSE241A VLSI Digital Circuits Winter 2003 Recitation 02: Datapath and Memory

Introduction: Basic Building Blocks
- Datapath
  - Execution units: adder, multiplier, divider, shifter, etc.
  - Register file and pipeline registers
  - Multiplexers, decoders
- Control
  - Finite state machines (PLA, ROM, random logic)
- Interconnect
  - Switches, arbiters, buses (not covered)
- Memory
  - Caches (SRAMs), TLBs, DRAMs, buffers

The 1-bit Binary Adder
1-bit Full Adder (FA) with inputs A, B, C_in and outputs S, C_out:
  S = A ⊕ B ⊕ C_in
  C_out = A&B | A&C_in | B&C_in (majority function)
- How can we use it to build a 64-bit adder?
- How can we modify it easily to build an adder/subtractor?
- How can we make it better (faster, lower power, smaller)?
A B C_in | C_out S | carry status
0 0 0    | 0     0 | kill
0 0 1    | 0     1 | kill
0 1 0    | 0     1 | propagate
0 1 1    | 1     0 | propagate
1 0 0    | 0     1 | propagate
1 0 1    | 1     0 | propagate
1 1 0    | 1     0 | generate
1 1 1    | 1     1 | generate
With G = A&B (generate), P = A ⊕ B (propagate), K = !A & !B (kill):
  S = P ⊕ C_in
  C_out = G | P&C_in
Slide courtesy of Mary Jane Irwin, Penn State
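
The generate/propagate/kill view of the full adder can be sketched in a few lines of Python. This is only an illustrative bit-level model; the function names `full_adder` and `carry_status` are hypothetical and do not come from the slides.

```python
def full_adder(a, b, c_in):
    """Return (s, c_out) for three 1-bit inputs."""
    g = a & b               # generate: both inputs are 1
    p = a ^ b               # propagate: exactly one input is 1
    s = p ^ c_in            # S = P xor C_in
    c_out = g | (p & c_in)  # C_out = G | P & C_in
    return s, c_out


def carry_status(a, b):
    """Classify an input pair as in the truth table's status column."""
    if a & b:
        return "generate"
    if a ^ b:
        return "propagate"
    return "kill"


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                print(a, b, c, full_adder(a, b, c), carry_status(a, b))
```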

FA Gate Level Implementations
[Figure: two gate-level full adder implementations with inputs A, B, C_in, outputs S, C_out, and labels t0, t1, t2.]
Slide courtesy of Mary Jane Irwin, Penn State

Review: XOR FA
[Figure: XOR-based full adder with inputs A, B, C_in and outputs S, C_out; 16 transistors.]
Slide courtesy of Mary Jane Irwin, Penn State

Ripple Carry Adder (RCA)
[Figure: chain of four full adders; stage i takes A_i, B_i and produces S_i for i = 0..3, with C_0 = C_in and C_out = C_4.]
- Worst-case delay is linear in the word length: $T = O(N)$
- $T_{adder} \approx T_{FA}(A,B \to C_{out}) + (N-2)\,T_{FA}(C_{in} \to C_{out}) + T_{FA}(C_{in} \to S)$
- Max delay: $t_{delay} = t_{sum} + (N-1)\,t_{carry}$
- Real goal: make the fastest possible carry path
Slide courtesy of Mary Jane Irwin, Penn State
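
A ripple carry adder and its linear delay estimate can be modeled as follows. This is a minimal sketch: the helper names and the unit delays passed to `rca_delay` are assumptions for illustration, not values from the slides.

```python
def full_adder(a, b, c_in):
    """1-bit full adder: returns (s, c_out)."""
    p = a ^ b
    return p ^ c_in, (a & b) | (p & c_in)


def to_bits(x, n):
    """Integer to list of n bits, LSB first."""
    return [(x >> i) & 1 for i in range(n)]


def from_bits(bits):
    """List of bits (LSB first) to integer."""
    return sum(b << i for i, b in enumerate(bits))


def ripple_carry_add(a_bits, b_bits, c_in=0):
    """Add two equal-length bit lists; the carry ripples from bit 0 upward."""
    sum_bits = []
    carry = c_in
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


def rca_delay(n, t_ab_cout, t_cin_cout, t_cin_s):
    """T_adder ~ T(A,B->Cout) + (N-2) T(Cin->Cout) + T(Cin->S)."""
    return t_ab_cout + (n - 2) * t_cin_cout + t_cin_s


if __name__ == "__main__":
    s, c = ripple_carry_add(to_bits(11, 4), to_bits(6, 4))
    print(from_bits(s), c)        # low 4 bits of 17, and the carry out
    print(rca_delay(64, 1, 1, 1)) # delay in (assumed) unit gate delays
```

Doubling N roughly doubles the estimate, which is why the later slides focus on shortening the carry path.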

Inversion Property
Inverting all inputs to an FA results in inverted values for all outputs:
  !S(A, B, C_in) = S(!A, !B, !C_in)
  !C_out(A, B, C_in) = C_out(!A, !B, !C_in)
[Figure: an FA with inverters on both outputs is equivalent to an FA with inverters on all three inputs.]
Slide courtesy of Mary Jane Irwin, Penn State
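
The inversion property is easy to confirm exhaustively, since a full adder has only eight input combinations. The check below is a small sketch with hypothetical function names.

```python
from itertools import product


def full_adder(a, b, c_in):
    """1-bit full adder: returns (s, c_out)."""
    p = a ^ b
    return p ^ c_in, (a & b) | (p & c_in)


def inversion_property_holds():
    """Check !S(A,B,C) == S(!A,!B,!C) and likewise for C_out, for all inputs."""
    for a, b, c in product((0, 1), repeat=3):
        s, c_out = full_adder(a, b, c)
        s_inv, c_out_inv = full_adder(1 - a, 1 - b, 1 - c)
        if s_inv != 1 - s or c_out_inv != 1 - c_out:
            return False
    return True


if __name__ == "__main__":
    print(inversion_property_holds())
```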

Exploiting the Inversion Property
[Figure: 4-bit ripple carry adder (A_0..A_3, B_0..B_3, S_0..S_3, C_0 = C_in, C_out = C_4) built from alternating regular FA cells and inverted FA' cells.]
- Now need two "flavors" of FAs: a regular cell and an inverted cell.
- Minimizes the critical path (the carry chain) by eliminating inverters between the FAs; the transistors on the carry chain portion of the mirror adder will need to be sized up.
Slide courtesy of Mary Jane Irwin, Penn State

Fast Carry Chain Design
- The key to fast addition is a low latency carry network
- What matters is whether, in a given position, a carry is
  - generated: G_i = A_i & B_i = A_i B_i
  - propagated: P_i = A_i ⊕ B_i (sometimes A_i | B_i is used)
  - annihilated (killed): K_i = !A_i & !B_i
- Giving a carry recurrence of C_{i+1} = G_i | P_i C_i
  C_1 = G_0 | P_0 C_0
  C_2 = G_1 | P_1 G_0 | P_1 P_0 C_0
  C_3 = G_2 | P_2 G_1 | P_2 P_1 G_0 | P_2 P_1 P_0 C_0
  C_4 = G_3 | P_3 G_2 | P_3 P_2 G_1 | P_3 P_2 P_1 G_0 | P_3 P_2 P_1 P_0 C_0
Slide courtesy of Mary Jane Irwin, Penn State
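
As a sanity check, the expanded expressions for C_1..C_4 can be compared against the recurrence C_{i+1} = G_i | P_i C_i for every 4-bit input. The sketch below uses hypothetical helper names and the XOR form of P_i.

```python
def gen_prop(a, b, n):
    """Per-bit generate and propagate lists (LSB first) for n-bit a and b."""
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(n)]
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(n)]
    return g, p


def carries_recurrence(g, p, c0):
    """C_{i+1} = G_i | P_i C_i, evaluated one stage at a time."""
    c = [c0]
    for gi, pi in zip(g, p):
        c.append(gi | (pi & c[-1]))
    return c


def carries_lookahead_4(g, p, c0):
    """The fully expanded two-level expressions for C_1..C_4."""
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    return [c0, c1, c2, c3, c4]


if __name__ == "__main__":
    g, p = gen_prop(0b1011, 0b0110, 4)
    print(carries_recurrence(g, p, 0), carries_lookahead_4(g, p, 0))
```

The expanded form removes the serial dependence between carries, at the cost of gates whose fan-in grows with the bit position.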

Binary Adder Landscape
[Figure: taxonomy tree of synchronous word parallel adders.]
- First-level classes: ripple carry adders (RCA), carry prop min adders, signed-digit adders, fast carry prop adders, residue adders
- Second-level entries: Manchester carry chain, carry select, parallel prefix, conditional sum, carry skip
- Complexity labels on the figure, in extraction order: T = O(N), A = O(N); T = O(1), A = O(N); T = O(log N), A = O(N log N); T = O(√N), A = O(N); T = O(N), A = O(N)

Parallel Prefix Adders (PPAs)
- Define a carry operator € on (G, P) signal pairs:
  (G'', P'') € (G', P') = (G, P), where G = G'' | P''G' and P = P''P'
- € is associative, i.e., [(g''', p''') € (g'', p'')] € (g', p') = (g''', p''') € [(g'', p'') € (g', p')]
[Figure: a € cell with inputs G'', P'', G', P', and a tree of € cells.]
Slide courtesy of Mary Jane Irwin, Penn State
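
The carry operator and its associativity can be checked by brute force over all 1-bit (G, P) pairs. In this sketch `carry_op` is a hypothetical name for €, with the more significant pair passed first.

```python
from itertools import product


def carry_op(hi, lo):
    """(G'', P'') € (G', P') = (G'' | P'' & G', P'' & P')."""
    g2, p2 = hi
    g1, p1 = lo
    return (g2 | (p2 & g1), p2 & p1)


def is_associative():
    """Exhaustively compare both groupings of three (G, P) pairs."""
    pairs = list(product((0, 1), repeat=2))
    for x, y, z in product(pairs, repeat=3):
        if carry_op(carry_op(x, y), z) != carry_op(x, carry_op(y, z)):
            return False
    return True


if __name__ == "__main__":
    print(is_associative())
    # Operand order matters: swapping the two pairs can change the result.
    print(carry_op((0, 0), (1, 1)), carry_op((1, 1), (0, 0)))
```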

PPA General Structure
- Given P and G terms for each bit position, computing all the carries is equivalent to finding all the prefixes in parallel:
  (G_0, P_0) € (G_1, P_1) € (G_2, P_2) € … € (G_{N-2}, P_{N-2}) € (G_{N-1}, P_{N-1})
- Since € is associative, we can group them in any order
  - but note that it is not commutative
- Measures to consider
  - number of € cells
  - tree cell depth (time)
  - tree cell area
  - cell fan-in and fan-out
  - max wiring length
  - wiring congestion
  - delay path variation (glitching)
[Figure: three stages: P_i, G_i logic (1 unit delay); C_i parallel prefix logic tree (1 unit delay per level); S_i logic (1 unit delay).]
Slide courtesy of Mary Jane Irwin, Penn State
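
One way to arrange the prefix tree is the Kogge-Stone wiring pattern (named among the adder types in this deck), where each level combines position i with position i - 2^level. The sketch below is an illustrative software model under that assumption, not the specific tree from the slide; all function names are hypothetical, and C_in is folded into bit 0 as an extra assumption.

```python
def carry_op(hi, lo):
    """(G'', P'') € (G', P') with the more significant pair first."""
    g2, p2 = hi
    g1, p1 = lo
    return (g2 | (p2 & g1), p2 & p1)


def prefix_carries(g, p, c_in=0):
    """Return ([C_1..C_N], levels) using a Kogge-Stone style prefix tree."""
    n = len(g)
    pairs = list(zip(g, p))
    # Fold the carry-in into bit 0 so group generate equals the carry out.
    pairs[0] = (g[0] | (p[0] & c_in), 0)
    dist, levels = 1, 0
    while dist < n:
        nxt = list(pairs)
        for i in range(dist, n):
            nxt[i] = carry_op(pairs[i], pairs[i - dist])
        pairs = nxt
        dist *= 2
        levels += 1
    return [gg for gg, _ in pairs], levels


def prefix_add(a, b, n, c_in=0):
    """Add two n-bit unsigned integers; returns (n+1)-bit result and level count."""
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(n)]
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(n)]
    carries, levels = prefix_carries(g, p, c_in)
    c = [c_in] + carries                  # c[i] is the carry into bit i
    total = sum((p[i] ^ c[i]) << i for i in range(n)) + (c[n] << n)
    return total, levels


if __name__ == "__main__":
    print(prefix_add(200, 100, 8))
```

The level count grows as the base-2 logarithm of the word length, which is the source of the T = O(log N) behavior of prefix adders.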

Adder Types
- RCA = Ripple Carry
- MCC = Manchester Carry Chain
- CCSka = Carry-Chain Save
- VCSka = (no expansion given)
- CCSia = Carry-Chain Save with Invert
- BK = Brent Kung
- Others (array type):
  - Ling-Ling
  - ELM
  - Kogge-Stone

Adder Speed Comparisons
[Figure: delay comparison of the adder types; axis unit: ns.]
Slide courtesy of Mary Jane Irwin, Penn State

Adder Average Power Comparisons
[Figure: average power comparison of the adder types; axis unit: Watt.]
Slide courtesy of Mary Jane Irwin, Penn State

Power-Delay Product of Adder Comparisons
[Figure: power-delay product comparison of the adder types. From Nagendra, 1996.]
Slide courtesy of Mary Jane Irwin, Penn State

Review: Basic Building Blocks
- Datapath
  - Execution units: adder, multiplier, divider, shifter
  - Register file and pipeline registers
  - Multiplexers, decoders
- Control
  - Finite state machines (PLA, ROM, random logic)
- Memory
  - SRAM cell
  - DRAM
  - Other types

Parallel Programmable Shifters
[Figure: shifter block with Data In, Control, and Data Out.]
- Control = shift amount, shift direction, shift type (logical, arithmetic, circular)
- Shifters are used in multipliers and floating point units
- They consume a lot of area if built from random logic gates
Slide courtesy of Mary Jane Irwin, Penn State

Shifters - Applications
- Linear shifting
  - Concatenate 2 words (N bits each) and pull out a contiguous N-bit word.
  - Take a portion of a word and shift it to the left or right
    - Multiply by 2^M
    - Pad the emptied positions with 0's or 1's
    - Arithmetic shifts: a left shift pads with 0's; a right shift pads with the sign bit (1's for a negative value)
- Barrel shifting
  - Emptied positions are filled with the bits dropped off.
  - Rotational shifting… circular convolution.
[Figure: wordA and wordB concatenated, with wordC pulled out of the pair.]
Slide courtesy of Ken Yang, UCLA
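
The shift flavors listed above can be modeled on N-bit integers as follows. This is a behavioral sketch only; the function names, and the convention that `funnel_extract` counts its start position from the LSB of the concatenated pair, are assumptions made for illustration.

```python
def logical_left(x, m, n):
    """Shift left by m, pad with 0's, keep n bits (multiply by 2^M mod 2^N)."""
    return (x << m) & ((1 << n) - 1)


def logical_right(x, m, n):
    """Shift right by m, pad the emptied positions with 0's."""
    return (x & ((1 << n) - 1)) >> m


def arithmetic_right(x, m, n):
    """Shift right by m (0 <= m <= n), pad with copies of the sign bit."""
    mask = (1 << n) - 1
    x &= mask
    shifted = x >> m
    if x >> (n - 1):                       # negative value: fill with 1's
        shifted |= (mask << (n - m)) & mask
    return shifted


def rotate_left(x, m, n):
    """Barrel (circular) shift: bits dropped off the top re-enter at the bottom."""
    mask = (1 << n) - 1
    x &= mask
    m %= n
    return ((x << m) | (x >> (n - m))) & mask


def funnel_extract(word_hi, word_lo, start, n):
    """Concatenate two n-bit words and pull out n contiguous bits."""
    concat = (word_hi << n) | word_lo
    return (concat >> start) & ((1 << n) - 1)


if __name__ == "__main__":
    print(hex(arithmetic_right(0x90, 2, 8)), hex(rotate_left(0x91, 1, 8)))
    print(hex(funnel_extract(0xAB, 0xCD, 4, 8)))
```

A rotate is the special case of the funnel extraction in which both input words are the same.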

A Programmable Binary Shifter
[Figure: shifter cells with control lines rgt, nop, left connecting inputs A_i, A_{i-1} to outputs B_i, B_{i-1}.]
A_1 A_0 | rgt nop left | B_1 B_0
A_1 A_0 | 0   1   0    | A_1 A_0
A_1 A_0 | 1   0   0    | 0   A_1
A_1 A_0 | 0   0   1    | A_0 0
Slide courtesy of Mary Jane Irwin, Penn State

4-bit Barrel Shifter
[Figure: pass-transistor array with inputs A_0..A_3, outputs B_0..B_3, and shift control lines Sh0, Sh1, Sh2, Sh3.]
Example:
  Sh0 = 1: B_3 B_2 B_1 B_0 = A_3 A_2 A_1 A_0
  Sh1 = 1: B_3 B_2 B_1 B_0 = A_3 A_3 A_2 A_1
  Sh2 = 1: B_3 B_2 B_1 B_0 = A_3 A_3 A_3 A_2
  Sh3 = 1: B_3 B_2 B_1 B_0 = A_3 A_3 A_3 A_3
Area dominated by wiring
Slide courtesy of Mary Jane Irwin, Penn State
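
The example mapping corresponds to a right shift that replicates A_3 into the emptied positions, selected by a one-hot control word. A behavioral sketch, with a hypothetical function name and list-based bit ordering chosen for illustration:

```python
def barrel_shift_4(a, sh):
    """a = [A0, A1, A2, A3]; sh = [Sh0, Sh1, Sh2, Sh3], exactly one entry set.

    Returns [B0, B1, B2, B3], where B_i = A_(i+k) and positions past the
    top bit are filled with A3 (k is the index of the active Sh line).
    """
    if sorted(sh) != [0, 0, 0, 1]:
        raise ValueError("exactly one Sh line must be active")
    k = sh.index(1)
    top = len(a) - 1
    return [a[min(i + k, top)] for i in range(len(a))]


if __name__ == "__main__":
    a = ["A0", "A1", "A2", "A3"]
    for k in range(4):
        sh = [1 if i == k else 0 for i in range(4)]
        b = barrel_shift_4(a, sh)
        print("Sh%d:" % k, " ".join(reversed(b)))  # printed as B3 B2 B1 B0
```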

4-bit Barrel Shifter Layout
[Figure: barrel shifter layout annotated with Width_barrel.]
- $\text{Width}_{barrel} \approx 2\,p_m N$, where N = max shift distance and $p_m$ = metal pitch
- Delay ~ 1 fet + N diff caps
- Only one Sh# line is active at a time
Slide courtesy of Mary Jane Irwin, Penn State

Review: Basic Building Blocks
- Datapath
  - Execution units: adder, multiplier, divider, shifter, etc.
  - Register file and pipeline registers
- Memories
  - SRAM cell
  - DRAM
  - Other types

Multiplication
- Binary multiplication
  - Same with 2's complement
  - Sign-extend the negative.
- 2's complement N-bit numbers
  - Rhombus of N partial products
  - Product has 2N bits.
  - Negative multiplier
    - Last term is equivalent to 2's complement.
  - Sign extension is tricky
    - Drop 1's into sign bit if 0's
    - Otherwise invert sign bit
[Worked example, bit patterns not preserved: Multiplicand (B) = -13 times Multiplier (A); partial products; last term Multiplicand*(…) = Multiplicand*(1111…) = -1*Multiplicand; result is nine bits + 1 sign.]
Slide courtesy of Ken Yang, UCLA
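
The sign-extension and "last term" rules can be illustrated with a small software model. The multiplicand -13 and the 5-bit width match the example (nine bits + 1 sign gives a 10-bit product); the multiplier value -5 is a hypothetical choice, since the original value is not preserved, and the function names are illustrative.

```python
def to_signed(x, width):
    """Interpret the low `width` bits of x as a 2's complement number."""
    x &= (1 << width) - 1
    return x - (1 << width) if x >> (width - 1) else x


def twos_complement_multiply(a, b, n):
    """Multiply n-bit signed multiplier a by n-bit signed multiplicand b.

    Each partial product is the multiplicand sign-extended to 2n bits and
    shifted by the bit position. The multiplier's sign bit has weight
    -2^(n-1), so the last partial product is added in 2's complement form.
    """
    width = 2 * n
    mask = (1 << width) - 1
    b_ext = b & mask                      # sign-extended 2n-bit pattern of b
    a_bits = a & ((1 << n) - 1)
    acc = 0
    for i in range(n):
        if (a_bits >> i) & 1:
            pp = (b_ext << i) & mask
            if i == n - 1:
                pp = (-pp) & mask         # last term: 2's complement
            acc = (acc + pp) & mask
    return to_signed(acc, width)


if __name__ == "__main__":
    print(twos_complement_multiply(-5, -13, 5))   # hypothetical multiplier -5
```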

Parallel Multipliers
- Each partial product is independent.
- Multiply in 2 steps.
  - First step: generate the partial products in parallel.
  - Second step: add the partial products.
- Generating the partial products
  - PP_{I,J} = A_I AND B_J
  - The sign bit is a little different.
    - S_{I,N} = B(sign)' NAND A(sign)
[Figure: array generating PP_00, PP_01, PP_02, PP_10, PP_11, PP_12 from A_0, A_1, A_2 and B_0..B_{N-1}.]
Slide courtesy of Ken Yang, UCLA
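
For unsigned operands, the two steps reduce to an AND array followed by a weighted sum. The sketch below covers only that unsigned case (the sign-bit handling is left out), and the function names are hypothetical.

```python
def partial_products(a, b, n):
    """Step 1: pp[i][j] = A_i AND B_j for n-bit unsigned a and b."""
    return [[((a >> i) & 1) & ((b >> j) & 1) for j in range(n)]
            for i in range(n)]


def sum_partial_products(pp):
    """Step 2: add the partial products; bit PP_ij has weight 2^(i+j)."""
    return sum(bit << (i + j)
               for i, row in enumerate(pp)
               for j, bit in enumerate(row))


if __name__ == "__main__":
    pp = partial_products(13, 11, 4)
    for row in pp:
        print(row)
    print(sum_partial_products(pp))
```

Because every PP_ij depends only on one bit of each operand, all n*n AND gates can evaluate at the same time; the adder array or tree in step 2 sets the delay.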

Review: Basic Building Blocks
- Datapath
  - Execution units: adder, multiplier, divider, shifter, etc.
  - Register file and pipeline registers
- Memories
  - SRAM cell: 6T
  - DRAM: 1T
  - Other types: 1T SRAM

Semiconductor Memories
- RWM (Read Write Memory)
  - Random access: SRAM (cache, register file), DRAM
  - Non-random access: FIFO/LIFO, shift register, CAM
- NVRWM (Non Volatile Read Write Memory): EPROM, E2PROM, FLASH
- ROM (Read Only Memory): mask-programmed, electrically-programmed (PROM)
Slide courtesy of Mary Jane Irwin, Penn State

A Typical Memory Hierarchy
[Figure: on-chip components (Control, Datapath, RegFile, ITLB, DTLB, Instr Cache, Data Cache, eDRAM), then Second Level Cache (SRAM), Main Memory (DRAM), and Secondary Memory (Disk).]
From the level closest to the datapath to the level farthest from it:
- Speed (ns): .1's, 1's, 10's, 100's, 1,000's
- Size (bytes): 100's, K's, 10K's, M's, T's
- Cost: highest to lowest
By taking advantage of the principle of locality:
- Present the user with as much memory as is available in the cheapest technology.
- Provide access at the speed offered by the fastest technology.
Slide courtesy of Mary Jane Irwin, Penn State

Access Time Comparison
Type       | Time
RDRAM      | 30 ns
SDRAM      | 20 ns
SRAM       | 10 ns
FLASH      | 80 ns (0.15 µm)
FRAM       | 10 ns
ROM (read) | 50 ns
(Generalized, ~0.13 µm)
- Latency: time to read
- Bandwidth: throughput of system

Embedded RAM
- SRAMs and DRAMs
SRAM                                                       | DRAM
6-T / 4-T memory cell                                      | Capacitor based storage; high density
Low power (an important requirement for system on chip)    | Refresh cycles required, hence high power
Fast access cycles                                         | Slower data access
Relative transistor sizes determine noise margin           | Capacitor size determines noise margin
- Noise margin
  - Important figure of merit
  - Degraded with scaling

Read-Write Memories (RAMs)
- Static (SRAM)
  - data is stored as long as supply is applied
  - large cells (6 fets/cell), so fewer bits/chip
  - fast, so used where speed is important (e.g., caches)
  - differential outputs (output BL and !BL)
  - use sense amps for performance
  - compatible with CMOS technology
- Dynamic (DRAM)
  - periodic refresh required
  - small cells (1 to 3 fets/cell), so more bits/chip
  - slower, so used for main memories
  - single ended output (output BL only)
  - need sense amps for correct operation
  - not typically compatible with CMOS technology
Slide courtesy of Mary Jane Irwin, Penn State

6-transistor SRAM Cell
[Figure: 6T SRAM cell with word line WL, bit lines BL and !BL, transistors M1..M6, and storage nodes Q and !Q.]
Slide courtesy of Mary Jane Irwin, Penn State

SRAM Cell Analysis (Read)
[Figure: cell during a read with WL = 1, BL = 1, !BL = 1, Q = 1, !Q = 0, bit line capacitance C_bit; transistors M1, M4, M5, M6 shown.]
Read-disturb (read-upset): the voltage rise allowed on !Q must be limited to a value that prevents the read-upset condition, while still meeting circuit speed and area constraints.
Slide courtesy of Mary Jane Irwin, Penn State

SRAM Cell Analysis (Read)
[Figure: cell during a read with WL = 1, BL = 1, !BL = 1, Q = 1, !Q = 0, bit line capacitance C_bit; transistors M1, M4, M5, M6 shown.]
Cell Ratio: $CR = (W_{M1}/L_{M1}) / (W_{M5}/L_{M5})$
$V_{!Q} = \dfrac{(V_{dd} - V_{Tn})\left(1 + CR \pm \sqrt{CR\,(1 + CR)}\right)}{1 + CR}$
Slide courtesy of Mary Jane Irwin, Penn State

Read Voltages Ratios
[Figure: read voltage on !Q versus cell ratio, for V_dd = 2.5V and V_Tn = 0.5V.]
Slide courtesy of Mary Jane Irwin, Penn State
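
The read-voltage expression from the read analysis, V_!Q = (V_dd - V_Tn)(1 + CR ± sqrt(CR(1 + CR)))/(1 + CR), can be tabulated with the plot's parameters (V_dd = 2.5V, V_Tn = 0.5V). The sketch below assumes the minus root is the physically relevant one, since it gives a voltage rise that shrinks as CR grows; the function name is hypothetical.

```python
import math


def read_voltage(cr, vdd=2.5, vtn=0.5):
    """Voltage rise on !Q during a read, taking the minus root (an assumption)."""
    return (vdd - vtn) * (1 + cr - math.sqrt(cr * (1 + cr))) / (1 + cr)


if __name__ == "__main__":
    for cr in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0):
        print("CR = %.1f  V_!Q = %.3f V" % (cr, read_voltage(cr)))
```

A larger cell ratio (stronger pull-down relative to the pass transistor) keeps the rise on !Q smaller, which is the lever for avoiding read-upset.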

SRAM Cell Analysis (Write)
[Figure: cell during a write with WL = 1, BL = 0, !BL = 1, Q = 1, !Q = 0; transistors M1, M4, M5, M6 shown.]
Pullup Ratio: $PR = (W_{M4}/L_{M4}) / (W_{M6}/L_{M6})$
$V_Q = (V_{dd} - V_{Tn}) \pm \sqrt{(V_{dd} - V_{Tn})^2 - (\mu_p/\mu_n)\,(PR)\,(V_{dd} - V_{Tn} - V_{Tp})^2}$
Slide courtesy of Mary Jane Irwin, Penn State

Write Voltages Ratios
[Figure: write voltage on Q versus pullup ratio, for V_dd = 2.5V, |V_Tp| = 0.5V, and μ_p/μ_n = 0.5.]
Slide courtesy of Mary Jane Irwin, Penn State
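
The write-voltage expression from the write analysis, V_Q = (V_dd - V_Tn) ± sqrt((V_dd - V_Tn)^2 - (μ_p/μ_n)(PR)(V_dd - V_Tn - V_Tp)^2), can be evaluated the same way. This sketch makes two assumptions that are not stated on the write slides: it takes the minus root, and it reuses V_Tn = 0.5V from the read analysis. The function name is hypothetical.

```python
import math


def write_voltage(pr, vdd=2.5, vtn=0.5, vtp=0.5, mu_ratio=0.5):
    """Voltage on Q during a write, taking the minus root (an assumption).

    vtn = 0.5 is borrowed from the read analysis; vtp is |V_Tp|.
    """
    disc = (vdd - vtn) ** 2 - mu_ratio * pr * (vdd - vtn - vtp) ** 2
    if disc < 0:
        raise ValueError("pullup ratio too large for this expression")
    return (vdd - vtn) - math.sqrt(disc)


if __name__ == "__main__":
    for pr in (0.5, 1.0, 1.5, 2.0):
        print("PR = %.1f  V_Q = %.3f V" % (pr, write_voltage(pr)))
```

Under these assumptions, a smaller pullup ratio lets the pass transistor pull Q lower, making the cell easier to write.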

Cell Sizing
- Keeping cell size minimized is critical for large caches
- Option 1: minimum sized pull down fets (M1 and M3)
  - Requires minimum width and longer than minimum channel length pass transistors (M5 and M6) to ensure proper CR
  - But this sizing of the pass transistors increases the capacitive load on the word lines and limits the current discharged on the bit lines, both of which can adversely affect the speed of the read cycle
- Option 2: minimum width and length pass transistors
  - Boost the width of the pull downs (M1 and M3)
  - Reduces the loading on the word lines and increases the storage capacitance in the cell (both are good), but the cell size may be slightly larger
Slide courtesy of Mary Jane Irwin, Penn State

6T-SRAM Layout
[Figure: cell layout showing V_DD, GND, WL, BL, storage nodes Q and !Q, and transistors M1..M6.]
Slide courtesy of Mary Jane Irwin, Penn State

1-Transistor DRAM Cell
[Figure: access transistor M1 between bit line BL and storage node X, gated by word line WL, with storage capacitor C_s and bit line capacitance C_BL; waveforms show write "1" (X reaches V_dd - V_t with BL at V_dd) and read "1" with V_dd/2 sensing.]
- Write: C_s is charged (or discharged) by asserting WL and BL
- Read: charge redistribution occurs between C_BL and C_s
- Read is destructive, so must refresh after read
Slide courtesy of Mary Jane Irwin, Penn State
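
The size of the read signal follows from charge conservation between C_s and C_BL. In the sketch below, the capacitor values (30 fF and 300 fF) and the supply and threshold voltages are illustrative assumptions, and the function names are hypothetical.

```python
def bitline_voltage_after_read(v_cell, v_pre, c_s, c_bl):
    """Final shared voltage once C_s and C_BL are connected (charge conservation)."""
    return (c_s * v_cell + c_bl * v_pre) / (c_s + c_bl)


def read_swing(v_cell, v_pre, c_s, c_bl):
    """Change in bit line voltage seen by the sense amp."""
    return (v_cell - v_pre) * c_s / (c_s + c_bl)


if __name__ == "__main__":
    vdd, vt = 2.5, 0.5            # assumed supply and threshold voltages
    c_s, c_bl = 30e-15, 300e-15   # assumed storage and bit line capacitances
    v_pre = vdd / 2               # V_dd/2 precharge for sensing
    print("read 1: %+.3f V" % read_swing(vdd - vt, v_pre, c_s, c_bl))
    print("read 0: %+.3f V" % read_swing(0.0, v_pre, c_s, c_bl))
```

Because C_BL is much larger than C_s, the swing is only tens of millivolts under these assumptions, and the cell voltage ends up near the precharge level, which is why the read is destructive and a sense amp is needed.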

…T DRAM Cell
[Figure only.]
Slide courtesy of Mary Jane Irwin, Penn State

DRAM Cell Observations
- DRAM memory cells are single ended (complicates the design of the sense amp)
- 1T cell requires a sense amp for each bit line due to charge redistribution read
- 1T cell read is destructive; refresh must follow to restore data
- 1T cell requires an extra capacitor that must be explicitly included in the design
- A threshold voltage is lost when writing a 1
  - can be circumvented by bootstrapping the word lines to a higher value than V_dd
- Not usually available on chip, unless analog elements are present

Review: Basic Building Blocks
- Datapath
  - Execution units: adder, multiplier, divider, shifter, etc.
  - Register file and pipeline registers
- Memories
  - SRAM cell
  - DRAM
  - Other types: 1T SRAM

Non-Volatile Memories (Present)
- Standard ROM
  - Programmed during fabrication
  - Diffusion programmable / metal or via programmable options
- One Time Programmable (OTP) ROM
  - Involves blowing of fuses after fabrication
- Erasable Programmable ROM (EPROM)
  - Erase and program through UV light application
- Electrically Erasable Programmable ROM (EEPROM)
  - Programmable by application of high voltage
  - Involves two supply voltages; normally not a problem for today's chips

Future Memory Landscape
- Magneto-resistive RAM (~2004)
  - IBM, Motorola, Infineon, Nonvolatile Electronics (NVE)
- Ferro-electric RAM (FRAM/FeRAM) (~2004)
  - Ramtron, Symetrix, Fujitsu, Toshiba, IBM/Infineon, Samsung, Motorola, Hitachi, Matsushita, Micron
- Ovonic Unified Memory (OUM) (~2004)
  - Ovonyx, Intel, STMicroelectronics, British Aerospace
- Nano-Floating Gate memory (>2005)
- Single/Few electron memories (SET) (>2007)
- Molecular memories (>2010)

Next Time
- Recitation 3
  - Performance coding: Verilog
  - Synthesis
- Future
  - Lec #15: full lecture on memories
  - Recitation: memory generators