Digital Integrated Circuits A Design Perspective Designing Sequential Logic Circuits
Naming Conventions In our text: a latch is level sensitive a register is edge-triggered There are many different naming conventions For instance, many books call edge-triggered elements flip-flops This leads to confusion however
Latch versus Register Latch Register stores data when clock is low stores data when clock rises D Q D Q Clk Clk Clk Clk D D Q Q
Latch-Based Design N latch is transparent when f = 0 P latch is transparent when f = 1 f N P Logic Latch Latch Logic
Timing Definitions CLK t Register t t D Q D DATA CLK STABLE t t Q DATA su hold D DATA CLK STABLE t t c 2 q Q DATA STABLE t
Writing into a Static Latch Use the clock as a decoupling signal, that distinguishes between the transparent and opaque states D CLK Forcing the state (can implement as NMOS-only) Converting into a MUX
Mux-Based Latches Negative latch Positive latch (transparent when CLK= 0) Positive latch (transparent when CLK= 1) 1 D Q CLK 1 D Q CLK
Edge-Triggered Flip-flop
Static SR Flip-Flop Clock version? Writing data by pure force No clock needed (Asynchronous)
Registers for Pipelining
Registers for Pipelining ? Pipelined
Semiconductor Memories
Memory Memory Classification Memory Architectures The Memory Core Periphery
Semiconductor Memory Classification Non-Volatile Read-Write Memory Read-Write Memory Read-Only Memory Random Non-Random EPROM Mask-Programmed Access Access 2 E PROM Programmable (PROM) SRAM FIFO FLASH LIFO DRAM Shift Register CAM
Memory Timing: Definitions
Memory Architecture: Decoders bits M bits S S Word 0 Word 0 S 1 Word 1 A Word 1 S 2 Storage Storage Word 2 A Word 2 cell 1 cell words A N S K 2 1 Decoder N 2 2 Word N 2 2 Word N 2 2 S N 2 1 Word N 2 1 Word N 2 1 K 5 log N 2 Input-Output Input-Output ( M bits) ( M bits) Intuitive architecture for N x M memory Too many select signals: N words == N select signals K = log 2 N Decoder reduces the number of select signals
Array-Structured Memory Architecture
Hierarchical Memory Architecture
Read-Only Memory Cells (ROM) BL BL BL VDD WL WL WL 1 BL BL BL WL WL WL GND Diode ROM MOS ROM 1 MOS ROM 2
MOS OR ROM BL [0] BL [1] BL [2] BL [3] WL [0] V WL [1] WL [2] V WL [3] DD WL [1] WL [2] V DD WL [3] V bias Pull-down loads
MOS NOR ROM WL [0] V Pull-up devices GND WL [1] WL [2] GND WL [3] BL DD Pull-up devices WL [0] GND WL [1] WL [2] GND WL [3] BL [0] BL [1] BL [2] BL [3]
MOS NOR ROM Layout Programmming using the Active Layer Only Cell (9.5l x 7l) Programmming using the Active Layer Only Polysilicon Metal1 Diffusion Metal1 on Diffusion
MOS NAND ROM V DD Pull-up devices BL [0] BL [1] BL [2] BL [3] WL [0] WL [1] WL [2] WL [3] All word lines high by default with exception of selected row
MOS NAND ROM Layout Programmming using the Metal-1 Layer Only Cell (8l x 7l) Programmming using the Metal-1 Layer Only No contact to VDD or GND necessary; Loss in performance compared to NOR ROM drastically reduced cell size Polysilicon Diffusion Metal1 on Diffusion
Precharged MOS NOR ROM V f pre DD Precharge devices WL [0] GND WL [1] WL [2] GND WL [3] BL [0] BL [1] BL [2] BL [3] PMOS precharge device can be made as large as necessary, but clock driver becomes harder to design.
Non-Volatile Memories The Floating-gate transistor (FAMOS) D Source Drain t ox t ox n + p n +_ Substrate Schematic symbol Device cross-section
Floating-Gate Transistor Programming 20 V 10 V 5 V D S Avalanche injection 0 V 2 5 V D S Removing programming voltage leaves charge trapped 5 V 2 2.5 V D S Programming results in higher V T .
FLOTOX EEPROM Fowler-Nordheim I-V characteristic FLOTOX transistor Floating gate Gate I Source Drain V 20 – 30 nm -10 V GD 10 V n 1 n 1 Substrate p 10 nm Fowler-Nordheim I-V characteristic FLOTOX transistor
EEPROM Cell BL WL V Absolute threshold control is hard Unprogrammed transistor might be depletion 2 transistor cell V DD
Flash EEPROM Many other options … Control gate n drain programming p- Floating gate erasure Thin tunneling oxide n 1 source n 1 drain programming p- substrate Many other options …
Cross-sections of NVM cells Flash EPROM Courtesy Intel
Characteristics of State-of-the-art NVM
Read-Write Memories (RAM) STATIC (SRAM) Data stored as long as supply is applied Large (6 transistors/cell) Fast Differential DYNAMIC (DRAM) Periodic refresh required Small (1-3 transistors/cell) Slower Single Ended
6-transistor CMOS SRAM Cell WL V DD M M 2 4 Q Q M M 6 5 M M 1 3 BL BL
CMOS SRAM Analysis (Read) WL V DD BL M 4 BL Q = Q = 1 M 6 M 5 V M V DD 1 DD V DD C C bit bit
CMOS SRAM Analysis (Write) BL = 1 Q M 4 5 6 V DD WL
6T-SRAM — Layout VDD GND Q WL BL M1 M3 M4 M2 M5 M6
Resistance-load SRAM Cell WL V DD R R L L Q Q M M 3 4 BL M M BL 1 2 Static power dissipation -- Want R L large Bit lines precharged to V DD to address t p problem
SRAM Characteristics
3-Transistor DRAM Cell No constraints on device ratios WWL BL 1 M X 3 2 C S RWL V DD T D No constraints on device ratios Reads are non-destructive Value stored at node X when writing a “1” = V WWL -V Tn
3T-DRAM — Layout BL2 BL1 GND RWL WWL M3 M2 M1
1-Transistor DRAM Cell
DRAM Cell Observations 1T DRAM requires a sense amplifier for each bit line, due to charge redistribution read-out. DRAM memory cells are single ended in contrast to SRAM cells. The read-out of the 1T DRAM cell is destructive; read and refresh operations are necessary for correct operation. Unlike 3T cell, 1T cell requires presence of an extra capacitance that must be explicitly included in the design. When writing a “1” into a DRAM cell, a threshold voltage is lost. This charge loss can be circumvented by bootstrapping the word lines to a higher value than VDD
Sense Amp Operation D V (1) (0) t Sense amp activated PRE BL Sense amp activated Word line activated
1-T DRAM Cell Cross-section Layout Capacitor Metal word line Poly SiO 2 Field Oxide n + Inversion layer induced by plate bias M word 1 line Diffused bit line Polysilicon plate Polysilicon gate Cross-section Layout
Periphery Decoders Sense Amplifiers
Row Decoders Collection of 2M complex logic gates Organized in regular and dense fashion (N)AND Decoder NOR Decoder
Hierarchical Decoders Multi-stage implementation improves performance • • • WL 1 WL A A A A A A A A A A A A A A A A 1 1 1 1 2 3 2 3 2 3 2 3 • • • NAND decoder using 2-input pre-decoders A A A A A A A A 1 1 3 2 2 3
Dynamic Decoders 2-input NOR decoder 2-input NAND decoder V WL A A A A Precharge devices GND GND V DD WL 3 WL 3 WL WL 2 2 WL 1 WL 1 WL WL V f A A A A DD 1 1 A A A A 1 1 f 2-input NOR decoder 2-input NAND decoder
4-input pass-transistor based column decoder S BL 1 2 3 D 2-input NOR decoder Advantages: speed (tpd does not add to overall memory access time) Only one extra transistor in signal path Disadvantage: Large transistor count
4-to-1 tree based column decoder BL BL BL BL 1 2 3 A A A 1 A 1 D Number of devices drastically reduced Delay increases quadratically with # of sections; prohibitive for large decoders Solutions: buffers progressive sizing combination of tree and pass transistor approaches
Sense Amplifiers Idea: Use Sense Amplifer small s.a. transition input output
Differential Sense Amplifier V DD M M 3 4 y Out bit M M bit 1 2 SE M 5 Directly applicable to SRAMs
DRAM Timing