ECE 4100/6100 Advanced Computer Architecture Lecture 11 DRAM and Storage Prof. Hsien-Hsin Sean Lee School of Electrical and Computer Engineering Georgia.

Presentation transcript:

ECE 4100/6100 Advanced Computer Architecture Lecture 11 DRAM and Storage Prof. Hsien-Hsin Sean Lee School of Electrical and Computer Engineering Georgia Institute of Technology

2 The DRAM Cell
Why DRAMs
–Higher density than SRAMs
Disadvantages
–Longer access times
–Leaky, needs to be refreshed
–Cannot be easily integrated with CMOS
Stack capacitor (vs. trench capacitor)
[Figure: 1T1C DRAM cell with word line (control), bit line (information), and storage capacitor. Source: Memory Arch Course, INSA Toulouse]

3 One DRAM Bank

4 Example: 512Mb 4-bank DRAM (x4)
Address multiplexing: row address A[13:0] (16K rows), column address A[10:0] (2K columns), bank address BA[1:0]
Data out: D[3:0]
Each bank: 16K x 2048 x 4 bits
A DRAM page = 2K x 4 bits = 1KB
[Figure: one bank of a x4 DRAM chip, showing the row decoder, column decoder, sense amps, and I/O gating]
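
The numbers on this slide can be checked by multiplying out the geometry. The sketch below assumes the figure's 16K rows, 2K columns, 4 data bits, and 4 banks; the constant and function names are illustrative and do not come from the slides.

```python
# Geometry read off the slide: 16K rows x 2K columns x 4 bits, 4 banks.
ROWS = 16 * 1024
COLS = 2 * 1024
WIDTH_BITS = 4
BANKS = 4

def address_bits(n):
    """Address bits needed to select one of n items (n must be a power of two)."""
    return n.bit_length() - 1

capacity_bits = ROWS * COLS * WIDTH_BITS * BANKS
page_bytes = COLS * WIDTH_BITS // 8   # one open row across all 4 data bits

print(capacity_bits // 2**20, "Mb total")
print(page_bytes, "bytes per DRAM page")
print(address_bits(ROWS), "row bits,",
      address_bits(COLS), "column bits,",
      address_bits(BANKS), "bank bits")
```

The three bit counts correspond to A[13:0], A[10:0], and BA[1:0] on the slide. Because row and column addresses share the same pins, only 14 address pins are needed rather than 25.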

5 DRAM Cell Array
[Figure: cell array with wordlines 0 to 1023 and bitlines 0 to 15]

6 DRAM Sensing (Open Bitline Array)
[Figure: two DRAM subarrays, one with wordlines WL0 to WL127 and one with WL128 to WL255, with sense amps between them]

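7 Basic DRAM Operations
Write '1': the driver pulls the bitline (BL) to Vdd with the wordline (WL) on; the cell stores Vdd - Vth
Read '1': precharge BL to Vdd/2; the cell (Cm, at Vdd - Vth) shares charge with the bitline capacitance (C_BL), giving Vdd/2 + V_signal; the sense amp amplifies V_signal and refreshes the cell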
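
The size of V_signal in the read case follows from charge conservation between the cell capacitor Cm and the bitline capacitance C_BL. The sketch below is a simple lumped model; the voltage and capacitance values are assumed for illustration only and are not taken from the slides.

```python
def bitline_signal(v_cell, v_precharge, c_cell, c_bitline):
    """Bitline voltage change after the cell shares its charge with the bitline."""
    v_final = (c_cell * v_cell + c_bitline * v_precharge) / (c_cell + c_bitline)
    return v_final - v_precharge

# Assumed example values (hypothetical, not from the slides).
VDD = 1.8        # supply voltage, volts
VTH = 0.4        # access transistor threshold, volts
C_M = 30e-15     # cell capacitance Cm, farads
C_BL = 300e-15   # bitline capacitance, farads

v_signal = bitline_signal(VDD - VTH, VDD / 2, C_M, C_BL)
print(f"V_signal when reading a '1': {v_signal * 1000:.1f} mV")
```

With a bitline capacitance roughly ten times the cell capacitance, the swing is only a small fraction of Vdd, which is why a sense amplifier is needed and why the read must be followed by a write-back.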

8 DRAM Basics
Address multiplexing
–Send row address when RAS asserted
–Send column address when CAS asserted
DRAM reads are self-destructive
–Rewrite after a read
Memory array
–All bits within an array work in unison
Memory bank
–Different banks can operate independently
DRAM rank
–Chips inside the same rank are accessed simultaneously

9 Examples of DRAM DIMM Standards
x64 (No ECC): eight x8 chips supplying D0-D7, D8-D15, D16-D23, D24-D31, D32-D39, D40-D47, D48-D55, D56-D63
x72 (ECC): nine x8 chips, the eight above plus one supplying check bits CB0-CB7
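
One way to see why an ECC DIMM carries 8 check bits for 64 data bits is to count the bits a SECDED code needs. The sketch below assumes a Hamming code extended with one overall parity bit, which is a common SECDED construction; the slides do not say which code a particular DIMM uses, and the function names are illustrative.

```python
def hamming_check_bits(data_bits):
    """Smallest r with 2**r >= data_bits + r + 1 (single-error correction)."""
    r = 0
    while 2**r < data_bits + r + 1:
        r += 1
    return r

def secded_check_bits(data_bits):
    """Hamming check bits plus one overall parity bit for double-error detection."""
    return hamming_check_bits(data_bits) + 1

for width in (8, 16, 32, 64):
    print(width, "data bits need", secded_check_bits(width), "check bits")
```

For 64 data bits this gives 8 check bits, matching the 72-bit-wide ECC module.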

10 DRAM Ranks
[Figure: a memory controller driving Rank0 and Rank1, each built from x8 chips covering D0 to D63, selected by chip selects CS0 and CS1]

11 DRAM Ranks
[Figure: three 64b module organizations: single rank of 8b chips, single rank of 4b chips, and dual rank of 8b chips]

12 DRAM Organization Source: Memory Systems Architecture Course, Bruce Jacob, University of Maryland

13 Organization of DRAM Modules
[Figure: a memory controller connected over a channel (address and command bus plus data bus) to multi-banked DRAM chips. Source: Memory Systems Architecture Course, Bruce Jacob, University of Maryland]

14 DRAM Configuration Example Source: MICRON DDR3 DRAM

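15 DRAM Access (Non Nibble Mode)
Signals between memory controller and DRAM module: Addr Bus, WE, CAS, RAS, Data Bus
–Assert RAS with the row address: row opened
–Assert CAS with the column address: data returned on the data bus
[Timing diagram: RAS, CAS, ADDR, and DATA, with one row address followed by two column address and data pairs]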

16 DRAM Refresh
Leaky storage
Periodic refresh across DRAM rows
Inaccessible while refreshing
Read, then write the same data back
Example:
–4k rows in a DRAM
–100ns read cycle
–Decay in 64ms
–4096*100ns = 410 µs to refresh once
–410 µs / 64ms = 0.64% unavailability

17 DRAM Refresh Styles
Bursty: refresh all 4096 rows back to back, 410 µs (= 100ns*4096) in every 64ms
Distributed: refresh one row (100ns) every 15.6 µs, covering all rows in every 64ms
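
The refresh arithmetic on slides 16 and 17 can be reproduced in a few lines. The sketch below uses the slide's example values (4096 rows, 100ns per row, 64ms retention); the variable names are illustrative.

```python
ROWS = 4096
ROW_REFRESH_S = 100e-9   # one read cycle refreshes one row
RETENTION_S = 64e-3      # every row must be refreshed within this window

# Bursty: refresh every row back to back once per retention window.
burst_busy_s = ROWS * ROW_REFRESH_S
unavailability = burst_busy_s / RETENTION_S

# Distributed: spread the same row refreshes evenly over the window.
distributed_interval_s = RETENTION_S / ROWS

print(f"burst length: {burst_busy_s * 1e6:.1f} us")
print(f"unavailability: {unavailability * 100:.2f} %")
print(f"distributed interval: {distributed_interval_s * 1e6:.2f} us")
```

Both styles spend the same total time refreshing. The distributed style bounds the longest stall to a single row refresh, whereas the bursty style blocks the DRAM for the whole burst.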

18 DRAM Refresh Policies
RAS-Only Refresh
–Memory controller keeps the address counter
–Assert RAS with the row address
–Refresh row
CAS-Before-RAS (CBR) Refresh
–Assert CAS with WE high, then assert RAS
–Refresh row, then increment counter
–No address involved

19 Types of DRAM
Asynchronous DRAM
–Normal: Responds to RAS and CAS signals (no clock)
–Fast Page Mode (FPM): Row remains open after RAS for multiple CAS commands
–Extended Data Out (EDO): Change output drivers to latches. Data can be held on the bus for a longer time
–Burst Extended Data Out: Internal counter drives address latch. Able to provide data in burst mode
Synchronous DRAM
–SDRAM: All of the above with a clock. Adds predictability to DRAM operation
–DDR, DDR2, DDR3: Transfer data on both edges of the clock
–FB-DIMM: DIMMs connected using point-to-point connections instead of a bus. Allows more DIMMs to be incorporated in server-based systems
RDRAM
–Low pin count

20 Disk Storage

21 Disk Organization
Platters (1 to 12)
A track (5000 to 30000)
A sector (100 to 500), 512 Bytes
A cylinder
3600 to RPM

22 Disk Organization Read/write Head (10s of nanometers above magnetic surface) Arm

23 Disk Access Time
Seek time
–Move the arm to the desired track
–5ms to 12ms
Rotation latency (or delay)
–For example, average rotation latency for a 10,000 RPM disk is 3ms (= 0.5/(10,000/60))
Data transfer latency (or throughput)
–Tens to hundreds of MB per second
–E.g., Seagate Cheetah 15K.6 sustained 164MB/sec
Disk controller overhead
Use disk cache (or cache buffer) to exploit locality
–4 to 32MB today
–Comes with the embedded controller in the HDD
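
The components on this slide add up to a per-request access time. The sketch below reproduces the 10,000 RPM latency figure and then combines the terms for one hypothetical request; the 64 KiB request size, the 5ms seek, the 15,000 RPM spindle speed, and the zero controller overhead are assumptions chosen for illustration, and MB is taken as 10^6 bytes.

```python
def avg_rotational_latency_s(rpm):
    """On average the head waits half a revolution for the sector to arrive."""
    return 0.5 / (rpm / 60.0)

def disk_access_time_s(seek_s, rpm, request_bytes, bytes_per_s, controller_s=0.0):
    """Seek + average rotational latency + transfer + controller overhead."""
    transfer_s = request_bytes / bytes_per_s
    return seek_s + avg_rotational_latency_s(rpm) + transfer_s + controller_s

latency_10k = avg_rotational_latency_s(10_000)
example = disk_access_time_s(seek_s=5e-3, rpm=15_000,
                             request_bytes=64 * 1024, bytes_per_s=164e6)

print(f"10,000 RPM average rotational latency: {latency_10k * 1e3:.1f} ms")
print(f"example request: {example * 1e3:.2f} ms")
```

For small requests the mechanical terms (seek and rotation) dominate the transfer time, which is the motivation for the on-disk cache.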

24 Reliability, Availability, Dependability Program faults

25 Reliability, Availability, Dependability
Program faults
Static permanent faults
–Design flaw: FDIV, ~$500 million
–Manufacturing: stuck-at faults, process variability
Dynamic faults
–Soft errors
–Noise-induced
–Wear-out

26 Solution Space
DRAM / SRAM
–Use ECC (SECDED)
Disks
–Use redundancy: user's backup, disk arrays

27 RAID
Reliability and performance consideration
Redundant Array of Inexpensive Disks
Combine multiple small, inexpensive disk drives
Break arrays into "reliability groups"
Data are divided and replicated across multiple disk drives
RAID-0 to RAID-5
Hardware RAID
–Dedicated HW controller
Software RAID
–Implemented in the OS

28 Basic Principles Data mirroring Data striping Error correction code

29 RAID-1
Mirrored disks
Most expensive (100% overhead)
Every write to disk also writes to the check disk
Can improve read/seek performance with a sufficient number of controllers
Disk 0 (Data Disk): A0 A1 A2 A3 A4
Disk 1 (Check Disk): A0 A1 A2 A3 A4

30 RAID-10
Combine data striping on top of RAID-1
Data Disk 0: A0 A3 B2 B5
Data Disk 1: A0 A3 B2 B5
Data Disk 2: A1 B0 B3 C0
Data Disk 3: A1 B0 B3 C0
Data Disk 4: A2 B1 B4
Data Disk 5: A2 B1 B4

31 RAID-2
Bit-interleaved striping
Use Hamming code to generate and store ECC on check disks (e.g., Hamming(7,4))
–Space: 4 data disks need 3 check disks (75% overhead), 10 data disks need 4 check disks (40% overhead), 25 data disks need 5 check disks (20% overhead)
–CPU needs more compute power to generate Hamming code than parity
Complex controller
Not really used today!
Data Disk 0: A0 B0 C0 D0
Data Disk 1: A1 B1 C1 D1
Data Disk 2: A2 B2 C2 D2
Data Disk 3: A3 B3 C3 D3
Check Disk 0: aECC0 bECC0 cECC0 dECC0
Check Disk 1: aECC1 bECC1 cECC1 dECC1
Check Disk 2: aECC2 bECC2 cECC2 dECC2
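
To make the Hamming(7,4) example concrete, the sketch below encodes 4 data bits into a 7-bit codeword and corrects a single flipped bit. In the RAID-2 picture each of the 7 bits would sit on a different disk (4 data disks plus 3 check disks). The bit ordering with parity at positions 1, 2, and 4 is the textbook convention and is an assumption here; the slides do not specify one.

```python
def hamming74_encode(d):
    """Encode [d1, d2, d3, d4] into a 7-bit codeword, positions 1..7."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming74_correct(c):
    """Return (corrected codeword, error position 1..7, or 0 if no error)."""
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4   # syndrome names the bad position
    fixed = list(c)
    if position:
        fixed[position - 1] ^= 1
    return fixed, position

codeword = hamming74_encode([1, 0, 1, 1])
damaged = list(codeword)
damaged[4] ^= 1                       # flip the bit at position 5
repaired, where = hamming74_correct(damaged)
print("error at position", where, "repaired:", repaired == codeword)
```

Unlike a single parity bit, the syndrome identifies which position is wrong. A disk array usually already knows which drive failed, which is one reason plain parity is enough for the later RAID levels.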

32 RAID-3
Byte-level striping
Use XOR parity to generate and store parity code on the check disk
At least 3 disks: 2 data disks + 1 check disk
Data Disk 0: A0 B0 C0 D0
Data Disk 1: A1 B1 C1 D1
Data Disk 2: A2 B2 C2 D2
Data Disk 3: A3 B3 C3 D3
Check Disk 0: ECCa ECCb ECCc ECCd
One transfer unit spans all disks (e.g., A0 A1 A2 A3 with ECCa)
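
XOR parity lets any single lost disk be rebuilt from the survivors. The sketch below is a minimal illustration with four data blocks and one check block; the block contents and the helper name `xor_blocks` are made up for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One transfer unit striped across four data disks (contents are made up).
data = [b"A0..", b"A1..", b"A2..", b"A3.."]
check = xor_blocks(data)              # stored on the check disk

# Data Disk 2 fails: XOR the surviving data blocks with the check block.
survivors = data[:2] + data[3:] + [check]
rebuilt = xor_blocks(survivors)
print("rebuilt block matches:", rebuilt == data[2])
```

The same XOR that generates the check block also reconstructs a missing block, because every surviving block cancels itself out.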

33 RAID-4
Block-level striping
Keep each individual accessed unit in one disk
–Do not access all disks for (small) transfers
–Improved parallelism
Use XOR parity to generate and store parity code on the check disk
Check info is calculated over a piece of each transfer unit
Small read → one read on one disk
Small write → two reads and two writes (data and check disks)
–New parity = (old data ⊕ new data) ⊕ old parity
–No need to read B0, C0, and D0 when doing a read-modify-write of A0
Writes are the bottleneck, as all writes access the check disk
Data Disk 0: A0 A1 A2 A3
Data Disk 1: B0 B1 B2 B3
Data Disk 2: C0 C1 C2 C3
Data Disk 3: D0 D1 D2 D3
Check Disk 0: ECC0 ECC1 ECC2 ECC3
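
The small-write rule can be checked against a full parity recomputation. The sketch below updates A0 using only the old data, the new data, and the old parity, then confirms the result matches the parity computed over all four blocks; the block contents and function names are illustrative.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def small_write_parity(old_data, new_data, old_parity):
    """New parity = (old data XOR new data) XOR old parity."""
    return xor_bytes(xor_bytes(old_data, new_data), old_parity)

# One stripe: A0, B0, C0, D0 on the data disks, ECC0 on the check disk.
a0, b0, c0, d0 = b"aaaa", b"bbbb", b"cccc", b"dddd"
ecc0 = xor_bytes(xor_bytes(a0, b0), xor_bytes(c0, d0))

new_a0 = b"AAAA"
new_ecc0 = small_write_parity(a0, new_a0, ecc0)    # reads: A0, ECC0 only

full_recompute = xor_bytes(xor_bytes(new_a0, b0), xor_bytes(c0, d0))
print("shortcut matches full recompute:", new_ecc0 == full_recompute)
```

The update touches two disks (two reads, two writes) regardless of how many data disks are in the stripe, but one of those two is always the single check disk.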

34 RAID-5
Block-level striping
Distributed parity to enable write parallelism
Remove the bottleneck of accessing parity
Example: write "sector A" and write "sector B" can be performed simultaneously
[Figure: blocks A0 to E3 laid out across Data Disks 0 to 4, with parity blocks ECC0 to ECC4 spread across the disks]
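
The benefit of distributing parity can be illustrated by listing which disks a small write touches. The sketch below assumes a simple rotation in which the parity block moves one disk to the left on each successive stripe; this is one common convention and not necessarily the exact layout in the slide's figure, and all function names are hypothetical.

```python
def parity_disk(stripe, n_disks):
    """Disk holding the parity block of a stripe (assumed rotation scheme)."""
    return (n_disks - 1 - stripe) % n_disks

def data_disk(stripe, k, n_disks):
    """Disk holding the k-th data block (0-based) of a stripe."""
    p = parity_disk(stripe, n_disks)
    return [d for d in range(n_disks) if d != p][k]

def disks_touched_by_small_write(stripe, k, n_disks):
    """A small write reads and writes one data disk and one parity disk."""
    return {data_disk(stripe, k, n_disks), parity_disk(stripe, n_disks)}

N = 5
write_a = disks_touched_by_small_write(stripe=0, k=0, n_disks=N)
write_b = disks_touched_by_small_write(stripe=1, k=1, n_disks=N)

print("parity disks per stripe:", [parity_disk(s, N) for s in range(N)])
print("write A uses disks", sorted(write_a), "write B uses disks", sorted(write_b))
print("can run in parallel:", write_a.isdisjoint(write_b))
```

With a dedicated check disk, as in RAID-4, the two writes would both need the same parity disk and would have to be serialized.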

35 RAID-6
Similar to RAID-5 with "dual distributed parity"
ECC_p = XOR(A0, B0, C0); ECC_q = Code(A0, B0, C0, ECC_p)
Sustain 2 drive failures with no data loss
Minimum requirement: 4 disks
–2 for data striping
–2 for dual parity
[Figure: data blocks laid out across Data Disks 0 to 4, with p and q parity blocks (ECC0p to ECC4p, ECC0q to ECC4q) spread across the disks]
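
The slide leaves the second code, Code(...), unspecified. One widely used choice is a Reed-Solomon style syndrome over GF(2^8), where Q is a weighted XOR of the data blocks. The sketch below uses that technique as an assumption: it computes Q from the data blocks alone (not from ECC_p), uses the field polynomial 0x11d with generator 2, and all function names are illustrative.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11d)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return result

def gf_pow(a, n):
    """Raise a to the n-th power by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result

def pq(data):
    """P is the XOR of the blocks; Q is the XOR of 2**i * block_i in GF(2^8)."""
    p, q = bytearray(len(data[0])), bytearray(len(data[0]))
    for i, block in enumerate(data):
        coeff = gf_pow(2, i)
        for j, byte in enumerate(block):
            p[j] ^= byte
            q[j] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)

def recover_two(data, x, y, p, q):
    """Rebuild data blocks x and y; entries at x and y in `data` are ignored."""
    zero = bytes(len(p))
    p_s, q_s = pq([zero if i in (x, y) else b for i, b in enumerate(data)])
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    inv = gf_pow(gx ^ gy, 254)        # a**254 is the inverse of nonzero a
    dx, dy = bytearray(len(p)), bytearray(len(p))
    for j in range(len(p)):
        pxy = p[j] ^ p_s[j]           # equals Dx ^ Dy
        qxy = q[j] ^ q_s[j]           # equals gx*Dx ^ gy*Dy
        dx[j] = gf_mul(inv, qxy ^ gf_mul(gy, pxy))
        dy[j] = pxy ^ dx[j]
    return bytes(dx), bytes(dy)

blocks = [b"A0", b"B0", b"C0"]
ecc_p, ecc_q = pq(blocks)
lost = [None, b"B0", None]            # disks holding A0 and C0 both fail
a0, c0 = recover_two(lost, 0, 2, ecc_p, ecc_q)
print("recovered:", a0, c0)
```

Two independent equations (P and Q) are what allow two unknown blocks to be solved for; with P alone, a second failure would leave one equation with two unknowns.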