1 8 Memory Subsystem
Contents
1. Classification
2. Architectures
3. Circuits: 1) SRAM 2) DRAM 3) Address decoders 4) Sense Amplifier
4. PLA
5. Gate Matrix
6. ROM

2 1. Classification
RWM (Read-Write Memory)
  Random access: SRAM, DRAM
  Sequential access: FIFO, Stack (LIFO)
  Content access: CAM (Associative Memory)
NVRWM (Nonvolatile RWM): EPROM, E2PROM, FLASH
ROM: Mask programmed; OTP (One-Time Programmable), i.e. PROM

3 2. Architectures 1-dimensional memory: N (words) × M (bits/word)
A decoder reduces the number of select wires entering the array from N word-select lines to log2(N) address lines.

4 2-dimensional array structure uses column decoder to make the chip square.

5 Hierarchical memory architecture using block address
Block address is used to activate only one block. Other blocks(nonactive) are put in power-saving mode.
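The block/row/column addressing described on slides 4 and 5 can be sketched as a simple bit-field split. This is only an illustrative sketch: the field widths (2 block bits, 8 row bits, 4 column bits) and the names `split_address` and `block_enables` are assumptions, not taken from the slides.

```python
from typing import List, NamedTuple


class AddressFields(NamedTuple):
    block: int   # selects the one block that is activated
    row: int     # drives the row (word-line) decoder inside that block
    column: int  # drives the column decoder


def split_address(addr: int, block_bits: int, row_bits: int, col_bits: int) -> AddressFields:
    """Split a flat address into block, row and column fields (MSB to LSB)."""
    total_bits = block_bits + row_bits + col_bits
    if not 0 <= addr < (1 << total_bits):
        raise ValueError("address out of range")
    column = addr & ((1 << col_bits) - 1)
    row = (addr >> col_bits) & ((1 << row_bits) - 1)
    block = addr >> (col_bits + row_bits)
    return AddressFields(block, row, column)


def block_enables(block: int, block_bits: int) -> List[bool]:
    """One-hot enable: only the addressed block is active, the rest can power down."""
    return [i == block for i in range(1 << block_bits)]


# Hypothetical 14-bit address: block 2, row 3, column 5.
FIELDS = split_address((2 << 12) | (3 << 4) | 5, block_bits=2, row_bits=8, col_bits=4)
ENABLES = block_enables(FIELDS.block, block_bits=2)
```

Only one entry of `ENABLES` is true, which mirrors the statement that non-active blocks are put in power-saving mode.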

6 Architecture of large memory

7 Basic organization for a 4K SRAM(1989 Philips research).

8 Schematic circuit diagram of 64K SRAM(Hitachi 1982).

9 Another schematic of SRAM(column grouping).
SRAM chip block diagram

10 Design Considerations
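All clocks (for bit-line precharge, sense-amp enable, and so on) are generated by an internal clock generator that is triggered by a circuit detecting transitions on signals such as address, CS, and WE. This suppresses power consumption.
2-stage row address decoding: the WL driver decodes A1.
The sense amp (SA) can be placed either before or after the column switch.
  Before: a very simple SA must be used so that it fits the cell pitch of the column.
  After: a relatively complex SA can be used (but the input capacitance of the SA increases).
The figure above is a compromise: the columns (for 1024 columns in a by-4 organization) are divided into 4 large groups, each of which is divided into 16, so that one SA serves the 16 columns of each small group.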

11 3. Circuits: Address decoders. Single-stage (10-to-1024) decoder
i) # of transistors = 20 per 10-input NAND × 1024 = 20,480 ii) Large fanout requirement on the buffers generating the Xi's. iii) Series-connected transistors limit the discharge time.

12 Predecoded scheme i) Group 2 bits and predecode the word using 2-bit segments: (X9, X8), (X7, X6), ..., (X1, X0) ii) 2nd-stage decoder logic, # of transistors: 10 per 5-input NAND × 1024, about 12,000
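The transistor counts on slides 11 and 12 can be checked with a few lines of arithmetic. This is a rough sketch that assumes static CMOS NAND gates (two transistors per input) and a predecoder built from plain 2-input NANDs; the function names are illustrative. The count below gives 10,240 for the second stage plus 80 for the predecoders, so the slide's figure of about 12,000 presumably includes inverters or buffers that are not modeled here.

```python
def nand_transistors(fan_in: int) -> int:
    """Static CMOS NAND: one NMOS and one PMOS per input (assumption)."""
    return 2 * fan_in


def single_stage_count(address_bits: int) -> int:
    """One full-width NAND per word line."""
    return nand_transistors(address_bits) * (1 << address_bits)


def predecoded_counts(address_bits: int, group_bits: int = 2):
    """Return (second-stage transistors, predecoder transistors)."""
    groups = address_bits // group_bits
    # Second stage: one NAND per word line, one input per predecoded group.
    second_stage = nand_transistors(groups) * (1 << address_bits)
    # Predecoders: each group needs 2**group_bits small NANDs.
    predecode = groups * (1 << group_bits) * nand_transistors(group_bits)
    return second_stage, predecode


SINGLE = single_stage_count(10)
SECOND_STAGE, PREDECODE = predecoded_counts(10)
```

Besides the lower count, the second-stage gates have 5 series transistors instead of 10, which addresses the discharge-time limit noted for the single-stage decoder.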

13 Divided Word Line architecture
Global word line selects a block, while the local line is used to activate a word line within the selected block.

14 Hierarchical word decoding logic

15 Row decoder circuits (Complementary AND, pseudo NMOS, cascade NAND)

16 Typical Symbolic Layout Style of row decoders

17 Various other decoder circuits(Power saving, Decoder-powered)

18 Tree style column decoder

19 Sense Amplifier for SRAM: voltage gain of a single differential stage, Av = gm·ro
gm: current/voltage (transconductance) of M1, M2. ro: output impedance ( = ro,M1 ∥ ro,M2). For Av to be large, M1 and P1 (and likewise M2 and P2) must all be in the saturation region, because in saturation gm is large and ro is also large. Precharging point X to a level that keeps these devices in saturation therefore helps shorten the response time and enlarge the signal swing.
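As a quick numeric illustration of Av = gm·ro with ro formed by two output resistances in parallel, the sketch below uses made-up device values (gm = 0.5 mS, 100 kΩ each); none of these numbers come from the slides.

```python
import math


def parallel(r1: float, r2: float) -> float:
    """Resistance of two resistors in parallel."""
    return r1 * r2 / (r1 + r2)


def stage_gain(gm: float, ro_a: float, ro_b: float) -> float:
    """Magnitude of the small-signal gain Av = gm * (ro_a || ro_b)."""
    return gm * parallel(ro_a, ro_b)


# Hypothetical values: both output resistances are high because the
# devices are assumed to be in saturation.
GAIN_SAT = stage_gain(gm=0.5e-3, ro_a=100e3, ro_b=100e3)

# If one device leaves saturation its output resistance collapses
# (here to an assumed 2 kOhm) and the gain drops with it.
GAIN_TRIODE = stage_gain(gm=0.5e-3, ro_a=100e3, ro_b=2e3)
```

The second case shows why the slide insists on keeping every device in saturation: the smaller output resistance dominates the parallel combination.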

20 Connecting two single-ended amps symmetrically raises the voltage gain
(The next stage can be a latch, another double-ended amp stage, or an output buffer with differential inputs.)

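21 SRAM sense amp with precharged output node
A circuit that charges the SA output node so that the SA operates in its high-gain region.
1: V1 is precharged to VDD; power-down state.
2: When the WL is accessed, V1 is precharged to the high-gain operating level.
3: When a voltage difference develops between BL and BL-bar, the high-gain SA operates and the pass gate acting as column decoder/switch turns on, passing the signal to the data output bus.
4: Power-down state.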

22 Static power is consumed during the second interval

23 SRAM circuit before sense Amp.

24 Evolution of SRAM cells
i) 6- and 4-transistor SRAM cells

25 ii) Dual-port/double-ended access and dual-port/single access

26 iii) Content-addressable memory cell

27 Evolution of DRAM cells
(a) basic bi-stable f/f w/o load (b) 2C-2D(C:control lines, D:data lines)

28 (c) 1C-2D (d) 2C-1D scheme

29 (e) 1C-1D (f) 1C-1D(industry standard DRAM)

30 DRAM read cycle

33 Dummy word line scheme

35 DRAM differential sense amp with dummy cell structure

36 Cross-coupled Latch
Assume nodes 1 and 2 are precharged, and node 2 begins to drop. When clk turns on, node 3 is pulled down. n2 turns on strongly, leaving n1 off. Caution: the layout of the cross-coupled transistor pair must be symmetric, because any difference in threshold voltage affects the decision.

37 Charge transfer-based Circuit

38 Charge-transfer Circuit(cont’d)
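Operation sequence
1. As clk goes high, nodes 1 and 2 are precharged: $V_1 \to V_{ref} - V_{th,n2}$ and $V_2 \to \min(V_{DD},\, V_{clk} - V_{th,n3}) > V_{ref}$.
2. n3 turns off.
3. The cell (n1, Cc) is selected (assume Vc was '0'). Due to charge sharing between Cc and Clarge, V1 becomes $V_1 = \frac{C_{large}}{C_{large} + C_c}\,(V_{ref} - V_{th,n2})$.
4. n2 is turned on until the charge $\Delta Q = C_c\,(V_{ref} - V_{th,n2})$ has been transferred from Cout, i.e. until V1 again reaches $V_{ref} - V_{th,n2}$.
Voltage drop at node 2 due to charge transfer: $\Delta V_2 = \frac{C_c}{C_{out}}\,(V_{ref} - V_{th,n2})$. Compared with the drop seen on node 1, this is an amplification factor of $(C_{large} + C_c)/C_{out}$.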
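The operation sequence above is a charge-conservation argument, so it can be checked numerically. The sketch below assumes ideal capacitors, a cell that initially stores 0 V (as on the slide), and enough headroom on node 2 for the full charge to transfer; the capacitance and voltage values are hypothetical.

```python
import math


def charge_transfer(c_cell, c_large, c_out, v_ref, v_th):
    """Return (drop on node 1, drop on node 2, amplification factor)."""
    v1_pre = v_ref - v_th                              # node 1 after precharge
    v1_shared = c_large * v1_pre / (c_large + c_cell)  # after sharing with an empty cell
    dv1 = v1_pre - v1_shared                           # small disturbance on node 1
    # n2 conducts until node 1 is restored, pulling this much charge from Cout:
    dq = (c_large + c_cell) * dv1
    dv2 = dq / c_out                                   # resulting drop on node 2
    return dv1, dv2, dv2 / dv1


# Hypothetical values: 50 fF cell, 1 pF bit-line side, 100 fF output node.
DV1, DV2, FACTOR = charge_transfer(
    c_cell=50e-15, c_large=1e-12, c_out=100e-15, v_ref=3.0, v_th=1.0
)
```

With these numbers a disturbance of under 0.1 V on the large capacitance becomes a 1 V drop on the small output node, which is the point of the circuit.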

39 Sense amplifier for single-transistor DRAM cells
A dummy cell (Cd = Cc) and a dummy bit line give complete symmetry.

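40 Operation
1. Before precharge, BL and DBL are both at the same level. Precharge (n1, n2 on) pulls nodes 1 and 2 up; n1 and n2 are then turned off.
2. Cc and Cd are selected. Through charge transfer (assume the stored cell voltage is 0), the voltage at node 1 drops more than the voltage at node 2, because Cd had been charged to the bit-line level rather than to 0.
3. Clk1 goes high, so n4 turns on and n5 turns off (V1 goes to Vss). n7 conducts again, BL is discharged to Vss, and Cc is restored to '0'.
4. Sel is set to '0' to isolate Cc, then clk2 is turned on to bring BL and DBL back to their common level. After that, seld is set to '0', leaving that level stored on Cd, and n3 is turned off.
(The case where Cc stores '1' operates in a similar way.)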

41 SRAM SA circuit using column SAs and a main SA
Multiplexing among n columns

42 (Input signal) (With column SA)

43 (Without column SA)

44 Resistive-load SRAM cells
Undoped polysilicon serves as very high-value load resistors, supplying just enough current (10^-12 A) to compensate for a leakage current of 10^-15 A. BL and BL-bar are precharged to VDD, which avoids slow charging of the bit lines through the load resistors.

45 TFT SRAM cell
Instead of traditional PMOS devices, the pull-up transistors are realized as PMOS TFTs (thin-film transistors) on top of the cell structure. ON current: 10^-8 A, OFF current: 10^-13 A.

                             Complementary CMOS       Resistive load           TFT cell
Number of transistors        6                        4                        4 (+2 TFT)
Cell size                    58.2 µm² (0.7 µm rule)   40.8 µm² (0.7 µm rule)   41.1 µm² (0.8 µm rule)
Standby current (per cell)   10^-15 A                 10^-12 A                 10^-13 A
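To see what the per-cell standby currents mean at chip level, the sketch below scales them to a whole array. The array size (2^20 cells) is an assumption chosen for illustration; only the per-cell figures come from the slide.

```python
import math

# Per-cell standby current in amperes, as listed on the slide.
STANDBY_PER_CELL = {
    "Complementary CMOS": 1e-15,
    "Resistive load": 1e-12,
    "TFT cell": 1e-13,
}

CELLS = 2 ** 20  # hypothetical 1 Mbit array


def array_standby(cells: int) -> dict:
    """Total cell-array standby current for each cell type."""
    return {name: current * cells for name, current in STANDBY_PER_CELL.items()}


ARRAY_STANDBY = array_standby(CELLS)
```

For this assumed array the resistive-load cells draw about 1 µA in standby, the TFT cells about 0.1 µA, and the full CMOS cells about 1 nA, which is the trade-off against the larger 6-transistor cell.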

46 Bipolar SRAM cells : Very fast SRAMs are necessary for cache & microcode memory in high-speed computers. SBD(Schottky Barrier Diode) bipolar SRAM

47 3-T DRAM cell: obtained by removing the loads (which gives the 4-T DRAM cell) and then removing the redundant complementary pull-down device. Separate read word line (RWL) and write word line (WWL). Refreshing is done by writing the inverted BL2 signal onto BL1.

48 1-T DRAM cell: charge transfer ratio
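The expression for the charge transfer ratio did not survive in this transcript. The sketch below assumes the usual charge-sharing definition, Cs / (Cs + CBL), with Cs the storage capacitor and CBL the bit-line capacitance; the component values and the half-VDD precharge are hypothetical.

```python
import math


def bitline_swing(c_s: float, c_bl: float, v_cell: float, v_pre: float):
    """Return (charge transfer ratio, bit-line voltage change on read).

    Assumes ideal charge sharing between the storage capacitor (at v_cell)
    and the bit line (precharged to v_pre).
    """
    ratio = c_s / (c_s + c_bl)
    return ratio, ratio * (v_cell - v_pre)


# Hypothetical values: 50 fF cell, 450 fF bit line, 5 V supply, precharge to 2.5 V.
RATIO, SWING_ZERO = bitline_swing(50e-15, 450e-15, v_cell=0.0, v_pre=2.5)
_, SWING_ONE = bitline_swing(50e-15, 450e-15, v_cell=5.0, v_pre=2.5)
```

The swing is only a fraction of the supply, which is why the read is destructive and why a sense amplifier must both detect and restore the stored value.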

49 1-T DRAM cell structure

50 Trench capacitor type & Stacked-capacitor type

51 NOR-type address decoder

52 NAND-type address decoder

53 Reducing coupling noise between WL and BL: folded bit line.

54 Reducing coupling noise between a BL and its neighboring bit lines
Transposed bit line. Quantities compared: the worst-case variation on each bit line, and the signal swing on the bit line.

55 4. PLA(Programmable Logic Array)
Generally, two classes exist for implementing control logic functions.
Multi-level logic: obtained through logic optimization of random logic.
Regular structure type:
  ROM: firmware, mask-programmable
  PLA: customized logic that removes unnecessary product (AND) terms and sum (OR) terms.

56 Sum-of-products form, F = ab + cd
i) NAND-NAND PLA. A 2-level Boolean expression of this kind can be viewed as two decoder stages connected back to back.
[Figure: inputs a, b, c, d; AND plane; OR plane; output F]
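That a NAND-NAND structure implements a sum of products follows from De Morgan's law, and it can be confirmed exhaustively. The sketch below takes the literals of F = ab + cd exactly as printed on the slide (any complement bars lost in the transcript are not restored); the function names are illustrative.

```python
from itertools import product


def nand(*bits: int) -> int:
    """N-input NAND on 0/1 values."""
    return 0 if all(bits) else 1


def f_and_or(a: int, b: int, c: int, d: int) -> int:
    """Reference sum-of-products form."""
    return (a & b) | (c & d)


def f_nand_nand(a: int, b: int, c: int, d: int) -> int:
    """First NAND plane forms the product terms, second NAND plane merges them."""
    return nand(nand(a, b), nand(c, d))


# Full truth table of the NAND-NAND implementation.
TRUTH_TABLE = [(bits, f_nand_nand(*bits)) for bits in product((0, 1), repeat=4)]
```

Each first-level NAND plays the role of one product line, and the second-level NAND plays the role of the sum line.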

57 NOR-NOR PLA
ii) NOR-NOR PLA. NOR-NOR is faster than NAND-NAND, but requires more area (about 30% additional).
[Figure: inputs a, b, c, d; output F]

58 Various ways for decoding
(NOR-type decoder) (NAND-type decoder) NOR: fast. NAND: compact. Layer legend: diffusion, polysilicon, metal.

59 Complementary-type decoder (CMOS-like)
Low power consumption, large area

60 MOS ROM vs. MOS PLA

61 (PLA) (ROM)

62 Various Programmable Logic Devices(PLD’s)
FSM(Finite State Machine) FPLA(Field-Programmable PLA) : PLA with latched feedback

63 PAL (Programmable Array Logic)
= FPLA in which the OR array is not programmable and the AND array is field programmable.
ROM: (single) mask programmable
PLA: (multiple) mask programmable
FPLA: field programmable, bulky
PAL: field programmable, less bulky

64 MGA (Multilevel Gate Array)

65 Associative Logic Matrix

66 Pseudo-NMOS PLA

67 Dynamic NMOS PLA (NOR type, NAND type)
T1: product line precharge, input latch in
T2: sum line precharge
T3: product line evaluate
T4: sum line evaluate, output latch out

68 Dynamic CMOS PLA (2-phase) - I

69 T1: product line precharge, latch input
T2: product line evaluate; T2': sum line precharge
T3: sum line evaluate
T4: latch output
Dummy row: one transistor of every transistor pair is always ON, so the row behaves as if a large capacitance C were attached, and the Vx waveform is equivalent to a delayed version of the input waveform.

70 Dynamic CMOS PLA - II
T1: product line precharge, latch input/output (master-slave scheme)
T2: product line evaluate
T3: AND-OR plane connect, sum line evaluate
T4: sum line evaluate

71 Dynamic CMOS PLA - V (NORA type)
AND plane: NMOS. OR plane: PMOS.
T2 (= low): p-line precharge, s-line predischarge, latch input
T1 (= high): p-line and s-line evaluate, latch output

72 Decoded PLA: partitions the input variables into multiple groups

73 PLA folding(row & column folding)
Row folding: partition the inputs into two groups such that an ordering of the rows (product lines) can be found in which one input group is fed from below while the other input group is fed from the top.

74 PPL(Programmable Path Logic)
Merges the AND and OR planes. The output Do is given by a two-level Boolean equation.

75 Associative Logic Array(subset of Storage Logic Array)

76 MGA(Multiple Gate Array, or Multi-level PLA)

77 MGA with three associative logic matrices

78 5. Gate Matrix
Use regularly spaced polysilicon lines for both gate electrodes and interconnect.
(a): The NMOS transistors and the channel are separated.
(b): Each transistor is placed on the polysilicon grid.
(c): Groups of transistors connected in series (or in parallel) are placed in one row and connected.

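79 Rules
1. Polysilicon runs vertically at regular intervals.
2. Series connections between transistors in adjacent columns of the same row are made by diffusion butting.
3. Metal makes the parallel connections, the series connections between non-adjacent transistors, and the connections between gates; it runs both horizontally and vertically.
4. Transistors exist only on polysilicon columns.
5. Diffusion wires may run vertically (over short distances) between the polysilicon grid lines.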

80 Static CMOS layout in Gate Matrix
- L(f,h) is realizable if h is realizable.
- h is realizable if every (vertical) diffusion run it generates is legal.

81 Automation of Gate Matrix Layout
Ref.: O. Wing et al., "Gate Matrix Layout", IEEE Trans. on CAD, Vol. 4, July 1985.
Find a function f (gate assignment): assign each transistor gate and output terminal to a column (transistor gates connected to the same node must be assigned to the same column).
Find a function h (net assignment): assign each net (a segment of horizontal metal line) to a row.
Find the layout L(f,h) that is realizable and has the minimum number of rows.

82 Problem Formulation for Gate Matrix Optimization

83 Example : CMOS Half-Adder Circuit

84 Net Representation(Case I)
Gate   Nets
1      N1, N2
2      N1, N3
3      N4
4      N2, N4
5      N1, N2, N3, N5
6      N3, N4
7      N5

Net Representation (Case II)
Problem statement: given a set of nets which connect at gates, find a permutation of the gates and an assignment of nets to tracks such that the number of tracks is minimized.
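The problem statement can be made concrete with the gate/net table from this slide. The sketch below is not the algorithm of the cited paper: it treats each net as the closed interval between its leftmost and rightmost gate column, assigns tracks with a plain left-edge pass, and searches all gate permutations by brute force, which is only feasible because there are 7 gates. All function names are illustrative.

```python
from itertools import permutations

# Gate -> nets, copied from the Case I table.
GATE_NETS = {
    1: ["N1", "N2"],
    2: ["N1", "N3"],
    3: ["N4"],
    4: ["N2", "N4"],
    5: ["N1", "N2", "N3", "N5"],
    6: ["N3", "N4"],
    7: ["N5"],
}


def net_spans(order):
    """Map each net to (leftmost column, rightmost column) for a gate order."""
    column = {gate: i for i, gate in enumerate(order)}
    spans = {}
    for gate, nets in GATE_NETS.items():
        for net in nets:
            lo, hi = spans.get(net, (column[gate], column[gate]))
            spans[net] = (min(lo, column[gate]), max(hi, column[gate]))
    return spans


def assign_tracks(order):
    """Left-edge assignment: reuse a track only if it ends before the net starts."""
    track_right_end = []
    assignment = {}
    for net, (lo, hi) in sorted(net_spans(order).items(), key=lambda kv: kv[1]):
        for track, right in enumerate(track_right_end):
            if right < lo:
                track_right_end[track] = hi
                assignment[net] = track
                break
        else:
            track_right_end.append(hi)
            assignment[net] = len(track_right_end) - 1
    return assignment


def track_count(order):
    return len(set(assign_tracks(order).values()))


def min_tracks_bruteforce():
    """Exhaustive search over gate permutations (fine for 7 gates only)."""
    return min(track_count(order) for order in permutations(GATE_NETS))
```

With the gates in their listed order 1 to 7, all five nets cross column 5 and so five tracks are needed. Reordering the gates (for example 7, 5, 1, 2, 4, 6, 3) brings this down to four, which cannot be improved because gate 5 alone connects to four nets.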

85 6. ROM (Read Only Memory): ROM cells
Diode cell: consumes large power from the WL
Transistor (BJT) cell: consumes less current (IB vs. IC)
MOSFET cell

86 Sharing supply voltage lines and mirroring cells

87 NOR ROM with contact programming

88 NOR ROM with Vth-raising implant or thick-oxide implants.

89 NAND ROM

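90 Before writing a paper
To write a paper, choose one of two paths: either pick a technical field that is highly practical today or that will have a large impact in your time, or pursue outstanding academic and theoretical excellence.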

91 The secret of success
Start on something hard. Then you will become serious about it. As long as you do not back down, you will succeed.
