1 COMP 206: Computer Architecture and Implementation Montek Singh Wed., Nov. 13, 2002 Topic: Main Memory (DRAM) Organization.

Presentation transcript:

1 COMP 206: Computer Architecture and Implementation Montek Singh Wed., Nov. 13, 2002 Topic: Main Memory (DRAM) Organization

2 Basics of DRAM Technology
DRAM (Dynamic RAM)
- Used mostly in main memory
- Capacitor + 1 transistor per bit
- Needs refresh every 4-8 ms
  - 5% of total time
- Read is destructive (needs write-back)
- Access time < cycle time (because of the write-back)
- Density (25-50):1 relative to SRAM
- Address is multiplexed
SRAM (Static RAM)
- Used mostly in caches (I, D, TLB, BTB)
- 1 flip-flop (4-6 transistors) per bit
- Read is not destructive
- Access time = cycle time
- Speed (8-16):1 relative to DRAM
- Address is not multiplexed

3 DRAM Organization: Fig. 5.29

4 Chip Organization
- Chip capacity (= number of data bits) tends to quadruple with each generation
  - 1K, 4K, 16K, 64K, 256K, 1M, 4M, …
- In early designs, each data bit belonged to a different address (x1 organization)
- Chip tended to be a single square array
  - Minimizes decoding circuitry and drivers
  - Reduces pins by permitting address multiplexing
- Starting with 1Mbit chips, wider chips (4, 8, 16, 32 bits wide) began to appear
  - Advantage: higher bandwidth
  - Disadvantage: more pins, hence more expensive packaging

5 Chip Organization Example: 64Mb DRAM

6 3D vs 2.5D Organization
3D organization (used in magnetic core memories)
- Half of the address bits select a row of the square array
- Other half of the address bits select a column
- Required single bit is at their intersection
2.5D organization (used in DRAM chips)
- Half of the address bits select a row of the square array
- Whole row of bits is brought out of the memory array into a buffer register (slow, 60-80% of access time)
- Other half of the address bits select one bit of the buffer register (with the help of a multiplexer), which is read or written
- Whole row is written back to the memory array
- Organization demanded by the needs of refresh
- Has other advantages, such as nibble, page, and static column mode operation
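The 2.5D access sequence above can be sketched as a toy model. This is only an illustration under stated assumptions: the class name `ToyDram`, its methods, and the choice of the upper address half for the row are hypothetical and do not come from the slides.

```python
# Toy model of a 2.5D DRAM array: a square grid of bits plus a row buffer.
# All names here are hypothetical, chosen for illustration only.

class ToyDram:
    def __init__(self, address_bits):
        assert address_bits % 2 == 0, "a square array needs an even number of address bits"
        self.half = address_bits // 2
        side = 1 << self.half
        self.cells = [[0] * side for _ in range(side)]

    def split(self, address):
        """Assumption: upper half of the address is the row, lower half the column."""
        row = address >> self.half
        col = address & ((1 << self.half) - 1)
        return row, col

    def access(self, address, write_value=None):
        row, col = self.split(address)
        # Step 1: bring the whole row out into the buffer register.
        buffer = list(self.cells[row])
        # The read is destructive, so the array row is invalid until write-back.
        self.cells[row] = [None] * len(buffer)
        # Step 2: the column bits select one bit of the buffer to read or write.
        if write_value is not None:
            buffer[col] = write_value
        value = buffer[col]
        # Step 3: write the whole row back to the memory array.
        self.cells[row] = buffer
        return value


dram = ToyDram(address_bits=4)          # a 4 x 4 array, 16 one-bit locations
dram.access(0b1101, write_value=1)      # row 3, column 1
print(dram.access(0b1101))              # reads back the stored bit
```

Because every access rewrites a full row, refreshing a row costs no more than reading any single bit in it, which is why this organization suits refresh.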

7 DRAM Refresh
- Consider a 1M x 1 DRAM chip with 190 ns cycle time
- Time for refreshing one bit at a time
  - $190\,\mathrm{ns} \times 10^{6} = 190\,\mathrm{ms}$, which is longer than the 4-8 ms refresh interval
- Time for refreshing one row at a time
  - $190\,\mathrm{ns} \times 10^{3} = 0.19\,\mathrm{ms}$, which is well within the 4-8 ms refresh interval
- Refresh complicates operation of memory
  - Refresh control competes with CPU for access to DRAM
  - Each row is refreshed once every 4-8 ms irrespective of the use of that row
- Want to keep refresh fast (< 5-10% of total time)
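The refresh arithmetic above can be checked with a short sketch. The function and constant names are hypothetical. The counts of 10^6 bits and 10^3 rows follow the slide's rounding of a 1M x 1 chip laid out as a square array.

```python
# Refresh-time arithmetic for a 1M x 1 DRAM with a 190 ns cycle time.
# Names are illustrative; the bit and row counts use the slide's rounding.

CYCLE_TIME_NS = 190
BITS = 10**6   # about 1M bits
ROWS = 10**3   # about 1K rows in a square array


def refresh_time_ms(accesses, cycle_time_ns=CYCLE_TIME_NS):
    """Time to perform `accesses` back-to-back refresh cycles, in milliseconds."""
    return accesses * cycle_time_ns / 1e6


bit_at_a_time_ms = refresh_time_ms(BITS)
row_at_a_time_ms = refresh_time_ms(ROWS)

for interval_ms in (4, 8):
    overhead = row_at_a_time_ms / interval_ms
    print(f"refresh interval {interval_ms} ms: "
          f"bit-at-a-time fits = {bit_at_a_time_ms <= interval_ms}, "
          f"row-at-a-time overhead = {overhead:.2%}")
```

With row-at-a-time refresh, the overhead stays under 5% for both ends of the 4-8 ms range, in line with the target of keeping refresh below 5-10% of total time.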

8 Memory Performance Characteristics
- Latency (access time)
  - The time interval between the instant at which the data is called for (READ) or requested to be stored (WRITE), and the instant at which it is delivered or completely stored
- Cycle time
  - The time between the instant the memory is accessed, and the instant at which it may be validly accessed again
- Bandwidth (throughput)
  - The rate at which data can be transferred to or from memory
  - Reciprocal of cycle time
  - "Burst mode" bandwidth is of greatest interest
- Cycle time > access time for conventional DRAM
- Cycle time < access time in "burst mode", when a sequence of consecutive locations is read or written

9 Improving Performance
- Latency can be reduced by
  - Reducing access time of chips
  - Using a cache ("cache trades latency for bandwidth")
- Bandwidth can be increased by using
  - Wider memory
  - More data pins per DRAM chip
  - Increased bandwidth per data pin

10 Two Recent Problems
- DRAM chip sizes are quadrupling every three years
- Main memory sizes are doubling every three years
- Thus, the main memory of the same kind of computer is being constructed from fewer and fewer DRAM chips
- This results in two serious problems
  - Diminishing main memory bandwidth
  - Increasing granularity of memory systems
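A small sketch of the trend above: if memory size doubles while chip capacity quadruples, the chip count halves every three-year generation. The starting count of 64 chips and the function name are hypothetical values picked only to make the arithmetic visible.

```python
# Chip count per system when memory doubles and chip capacity quadruples
# each generation. The initial chip count is an illustrative assumption.

def chips_needed(generation, initial_chips=64):
    memory_growth = 2 ** generation   # main memory doubles every three years
    chip_growth = 4 ** generation     # DRAM chip capacity quadruples every three years
    return initial_chips * memory_growth // chip_growth


for generation in range(4):
    years = 3 * generation
    print(f"after {years:2d} years: {chips_needed(generation)} chips")
```

If per-chip bandwidth stayed constant, the aggregate bandwidth available to the system would shrink in the same proportion as the chip count.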

11 Diminishing Main Memory Bandwidth
- Amdahl's Rule says that a typical, well-balanced computer system requires 1 MB of main memory per 1 MIPS of CPU performance
- What CPU-MM bandwidth is needed to support 1 MIPS?
  - Assume 32-bit instructions, 40% load-stores
  - Required bandwidth: 1 * (4 + 0.4 * 4) = 5.6 MB/s
  - Thus each DRAM chip must provide at least 5.6 MBps/MB
  - This quantity is also called fill frequency
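The bandwidth estimate above can be written out as a sketch. The function and parameter names are hypothetical. The 4-byte data access per load or store is an assumption read off the slide's `0.4*4` term.

```python
# Fill-frequency arithmetic: bandwidth needed per MIPS, divided by the
# memory capacity that Amdahl's Rule pairs with that MIPS rating.
# Parameter names and the 4-byte data size are illustrative assumptions.

def required_bandwidth_mb_per_s(mips, instr_bytes=4, load_store_fraction=0.4, data_bytes=4):
    """Every instruction is fetched; a fraction of them also move one data word."""
    return mips * (instr_bytes + load_store_fraction * data_bytes)


def fill_frequency(mips, memory_mb):
    """Bandwidth per unit of capacity, in (MB/s) per MB."""
    return required_bandwidth_mb_per_s(mips) / memory_mb


bandwidth = required_bandwidth_mb_per_s(mips=1)
print(f"bandwidth for 1 MIPS: {bandwidth:.1f} MB/s")
print(f"fill frequency with 1 MB per MIPS: {fill_frequency(1, 1):.1f} MB/s per MB")
```

Since both bandwidth and capacity scale with MIPS under Amdahl's Rule, the ratio stays the same for any CPU speed, so it becomes a per-chip requirement.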

12 Trends in DRAM Technology

13 Increasing Granularity of Memory Systems
- Granularity of a memory system is the minimum memory size, and also the minimum increment in the amount of memory permitted by the memory system
- Too large a granularity is undesirable
  - Increases cost of system
  - Restricts its competitiveness
- Granularity can be decreased by
  - Widening the DRAM chips
  - Increasing the per-pin bandwidth of the DRAM chips

14 Granularity Example
- We are using 16K × 1 DRAM parts, running at 2.5 MHz (400 ns cycle time). Eight such DRAM parts provide 16KB of memory with 2.5MB/s bandwidth.
- Industry switches to 64Kb (64K × 1) DRAM parts. Two such DRAM parts provide the desired 16KB of memory. Such a system would have a 2-bit wide bus.
- To maintain a 2.5MB/s bandwidth, the parts would need to run at 10 MHz. But the parts run at only 3.7 MHz. What are the options?

15 Granularity Example (2)
- Solution 1: Use eight 64K × 1 DRAM parts (six would suffice for the required bandwidth). Problem: we now have 64KB of memory rather than 16KB.
- Solution 2: Use two 16K × 4 DRAM parts (same capacity per part as 64K × 1, different organization). This provides 16KB of memory at the required bandwidth.
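The numbers in the granularity example can be reproduced with a short sketch. The helper names are hypothetical. The code assumes one transfer per clock on every data pin, and that the 16K × 4 parts run at the same 3.7 MHz as the 64K × 1 parts, which the slides imply but do not state.

```python
import math

# Capacity and bandwidth of a memory built from identical DRAM parts.
# Assumes one transfer per clock on every data pin. Names are illustrative.

def system(parts, depth_k, width_bits, clock_mhz):
    """Return (capacity in KB, bandwidth in MB/s)."""
    capacity_kb = parts * depth_k * width_bits / 8
    bandwidth_mb_s = parts * width_bits * clock_mhz / 8
    return capacity_kb, bandwidth_mb_s


def min_parts_for_bandwidth(target_mb_s, width_bits, clock_mhz):
    """Smallest number of parts whose combined pins reach the target bandwidth."""
    return math.ceil(target_mb_s * 8 / (width_bits * clock_mhz))


configs = {
    "original: 8 x (16K x 1) @ 2.5 MHz": system(8, 16, 1, 2.5),
    "naive:    2 x (64K x 1) @ 3.7 MHz": system(2, 64, 1, 3.7),
    "solution 1: 8 x (64K x 1) @ 3.7 MHz": system(8, 64, 1, 3.7),
    "solution 2: 2 x (16K x 4) @ 3.7 MHz": system(2, 16, 4, 3.7),
}
for name, (capacity_kb, bandwidth) in configs.items():
    print(f"{name}: {capacity_kb:.0f} KB, {bandwidth:.3f} MB/s")

print("64K x 1 parts needed for 2.5 MB/s:", min_parts_for_bandwidth(2.5, 1, 3.7))
```

The naive two-part system falls short of 2.5MB/s, Solution 1 meets the bandwidth only by quadrupling capacity, and Solution 2 keeps 16KB while restoring the 8-bit bus.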