
Main Memory Background
Random Access Memory (vs. Serial Access Memory)
Cache uses SRAM: Static Random Access Memory
–No refresh needed (6 transistors/bit vs. 1 transistor/bit for DRAM)
–Size advantage: DRAM; Cost/bit advantage: DRAM; Speed advantage: SRAM
Main Memory is DRAM: Dynamic Random Access Memory
–Dynamic because it must be refreshed periodically
–Addresses divided into 2 halves (memory as a 2D matrix):
RAS, or Row Access Strobe
CAS, or Column Access Strobe
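
The two-halves addressing above can be sketched in a few lines. The 1024 × 1024 array size below is a hypothetical example, not a specific chip:

```python
# Split a cell address into row (RAS) and column (CAS) halves,
# assuming a hypothetical 1024 x 1024 (1 Mbit) DRAM array.
NUM_COLS = 1024

def split_address(addr):
    """Return (row, col) for a linear cell address."""
    row = addr // NUM_COLS   # upper half: sent with RAS
    col = addr % NUM_COLS    # lower half: sent with CAS
    return row, col

row, col = split_address(5000)
print(row, col)  # 4 904
```

Sending the address in two halves lets the chip reuse the same pins for the row and column, which is why DRAM address buses are narrower than the full address.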

SRAM vs. DRAM
DRAM = Dynamic RAM
SRAM: 6T per bit
–built with normal high-speed CMOS technology
DRAM: 1T per bit
–built with a special DRAM process optimized for density

Hardware Structures
[figure: a 6T SRAM cell driven by a wordline onto bitlines b and b̄, next to a 1T1C DRAM cell driven by a wordline onto a single bitline b]

DRAM Chip Organization
[figure: row address → row decoder → memory cell array; array output latched by sense amps into the row buffer; column address → column decoder selects from the row buffer onto the data bus]

DRAM Chip Organization (2)
Differences with SRAM:
–reads are destructive: contents are erased after reading
–the row buffer reads lots of bits all at once, then parcels them out based on different column addresses
similar to reading a full cache line, but only accessing one word at a time
“Fast Page Mode” (FPM) DRAM organizes the DRAM row to contain the bits for a complete page
–row address held constant, then fast reads from different locations in the same page

Refresh
So after a read, the contents of the DRAM cell are gone
The values are stored in the row buffer
Write them back into the cells for the next read in the future
[figure: sense amps between the row buffer and the DRAM cells]

Refresh (2)
Fairly gradually, the DRAM cell will lose its contents even if it’s not accessed
–This is why it’s called “dynamic”
–Contrast to SRAM, which is “static” in that once written, it maintains its value forever (so long as power remains on)
All DRAM rows need to be regularly read and re-written
[figure: gate leakage slowly drains a stored ‘1’ toward ‘0’]
If it keeps its value even when power is removed, then it’s “non-volatile” (e.g., flash, HDD, DVDs)
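
The refresh obligation above turns into a simple schedule. The 64 ms retention window is the commonly quoted DRAM figure and the row count is illustrative, not from a specific part:

```python
# Back-of-envelope refresh schedule, assuming the commonly quoted
# 64 ms retention window and an illustrative chip with 8192 rows.
RETENTION_MS = 64.0
NUM_ROWS = 8192

# Distributed refresh: refresh one row every interval_us microseconds
# so that every row is rewritten before its charge decays.
interval_us = RETENTION_MS * 1000 / NUM_ROWS
print(interval_us)  # 7.8125 us between row refreshes
```

Because a refresh is just an internal read (destructive) followed by the row-buffer write-back, the same sense-amp path described on the previous slide does the work.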

DRAM Read Timing Accesses are asynchronous: triggered by RAS and CAS signals, which can in theory occur at arbitrary times (subject to DRAM timing constraints)

SDRAM Read Timing
Double-Data Rate (DDR) DRAM transfers data on both the rising and falling edges of the clock
–command frequency does not change
[figure: SDRAM burst read of a given burst length; timing figures taken from “A Performance Comparison of Contemporary DRAM Architectures” by Cuppu, Jacob, Davis, and Mudge]
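
The two-transfers-per-clock point above is easy to quantify. The clock rate and bus width below are illustrative numbers, not a specific DIMM:

```python
# Peak bandwidth of a DDR interface: two transfers per clock cycle.
# Numbers are illustrative (a 100 MHz clock driving a 64-bit bus).
clock_hz = 100_000_000
bus_bytes = 8                 # 64-bit data bus
transfers_per_clock = 2       # one on the rising edge, one on the falling
peak_bw = clock_hz * transfers_per_clock * bus_bytes
print(peak_bw / 1e6)  # 1600.0 MB/s
```

A single-data-rate interface at the same clock would move exactly half as much, which is why DDR doubles bandwidth without raising the command frequency.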

Dynamic RAM
SRAM cells exhibit high speed / poor density
DRAM: simple transistor/capacitor pairs in high-density form
[figure: DRAM bit cells, each a transistor gated by a word line connecting a capacitor C to the bit line, read out by a sense amp]

Other Types of DRAM
Synchronous DRAM (SDRAM): ability to transfer a burst of data given a starting address and a burst length – suitable for transferring a block of data from main memory to cache
Page Mode DRAM: access all bits on the same row
–keep RAS active, toggle CAS with a new column address
Extended Data Output (EDO): a new access cycle can be started while keeping the data output of the previous cycle active
Rambus DRAM (RDRAM): uses pipelining to move data from RAM to cache memory

Rambus (RDRAM)
Synchronous interface
Row buffer cache
–last 4 rows accessed are cached
Uses other tricks since adopted by SDRAM
–multiple data words per clock, high frequencies
Chips can self-refresh
Expensive for PCs; used by the Xbox and PS2

Faster DRAM Speed
Clock the FSB faster
–DRAM chips may not be able to keep up
Latency is dominated by wire delay
–Bandwidth may be improved (DDR vs. regular), but latency doesn’t change much
–Instead of 2 cycles for row access, it may take 3 cycles at a faster bus speed
–Doesn’t address the latency of the memory access
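
The 2-cycle vs. 3-cycle point above can be checked with arithmetic. The clock rates are illustrative, chosen so that the cycle counts scale as the slide describes:

```python
# The slide's point in numbers: raising the bus clock does not shrink
# the wire-dominated access latency. Figures below are illustrative.
slow_clock_mhz, slow_cycles = 100, 2
fast_clock_mhz, fast_cycles = 150, 3

slow_ns = slow_cycles * 1000 / slow_clock_mhz   # 2 cycles @ 100 MHz
fast_ns = fast_cycles * 1000 / fast_clock_mhz   # 3 cycles @ 150 MHz
print(slow_ns, fast_ns)  # 20.0 20.0 -- same latency in nanoseconds
```

More cycles at a faster clock, same nanoseconds: bandwidth improves, latency does not.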

Memory Interleaving
Interleaved memory is a design that compensates for the relatively slow speed of dynamic random-access memory (DRAM)
Main memory is divided into two or more sections; the CPU can access alternate sections immediately, without waiting for memory to catch up (through wait states)
Interleaved memory is more flexible than wide-access memory in that it can handle multiple independent accesses at once

Memory Interleaving cont.
For example, in an interleaved system with two memory banks (assuming word-addressable memory), if logical address 32 belongs to bank 0, then logical address 33 would belong to bank 1, logical address 34 would belong to bank 0, and so on
An interleaved memory is said to be n-way interleaved when there are n banks and memory location i resides in bank i mod n
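
The i mod n rule above is one line of code, reproducing the two-bank example from the text:

```python
def bank_of(addr, n_banks):
    """Bank holding word address addr in an n-way interleaved memory."""
    return addr % n_banks

# The two-bank example from the text: 32 -> bank 0, 33 -> bank 1, 34 -> bank 0
print([bank_of(a, 2) for a in (32, 33, 34)])  # [0, 1, 0]
```

Because consecutive addresses land in different banks, a sequential stream of accesses keeps all banks busy at once.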

Latency Width/Speed varies depending on memory type Significant wire delay just getting from the CPU to the memory controller More wire delay getting to the memory chips (plus the return trip…)

So what do we do about it?
Caching
–reduces average memory instruction latency by avoiding DRAM altogether
Limitations
–Capacity: programs keep increasing in size
–Compulsory misses

Idea: Caching!
Not caching of data, but caching of translations
[figure: virtual pages 0K–12K mapped to physical pages 0K–28K; one virtual page unmapped (X); e.g., VPN 8 → PPN 16]

Memory Hierarchy: The Big Picture
[figure: data movement in a memory hierarchy]

Virtual Memory has its own terminology
Each process has its own private “virtual address space” (e.g., 2^32 bytes); the CPU actually generates “virtual addresses”
Each computer has a “physical address space” (e.g., 128 megabytes of DRAM); also called “real memory”
Address translation: mapping virtual addresses to physical addresses
–Allows multiple programs to use (different chunks of) physical memory at the same time
–Also allows some chunks of virtual memory to be kept on disk, not in main memory (to exploit the memory hierarchy)

Virtual Memory
Idea 1: Many programs share DRAM memory so that context switches can occur
Idea 2: Allow a program to be written without memory constraints – the program can exceed the size of main memory
Idea 3: Relocation: parts of the program can be placed at different locations in memory instead of one big chunk
Virtual memory:
(1) DRAM memory holds many programs running at the same time (processes)
(2) uses DRAM memory as a kind of “cache” for disk

Programmer’s View
Example: 32-bit memory
–When programming, you don’t care about how much real memory there is
–Even if you use a lot, memory can always be paged to disk
[figure: the 4 GB virtual address space – text, data, heap, and stack in the lower 0–2 GB, kernel above – AKA virtual addresses]

Pages
Memory is divided into pages, which are nothing more than fixed-sized and aligned regions of memory
–Typical size: 4KB/page (but not always)
[figure: memory as a sequence of pages: Page 0, Page 1, Page 2, Page 3, …]

Mapping Virtual Memory to Physical Memory
Divide memory into equal-sized “chunks” (say, 4KB each)
Any chunk of virtual memory can be assigned to any chunk of physical memory (a “page”)
[figure: a single process’s virtual memory (stack, heap, static, code) mapped into 64 MB of physical memory]

Page Table
Map from virtual addresses to physical locations
–a “physical location” may include the hard disk
The page table implements this V → P mapping
[figure: virtual pages 0K–12K mapped through the page table to physical pages 0K–28K]

Page Tables
[figure: two processes’ page tables (virtual pages 0K–12K each) mapping into one shared physical memory (0K–28K)]

Need for Translation
A virtual address splits into a virtual page number and a page offset; the page table maps the virtual page number to a physical page in main memory
Example: virtual address 0xFC51908B → virtual page number 0xFC519, page offset 0x08B; the page table maps 0xFC519 → 0x00152, giving physical address 0x0015208B
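
The split-and-map step above can be sketched directly, assuming 4 KB pages (a 12-bit offset) as in the example:

```python
# Reproduce the slide's translation example, assuming 4 KB pages
# (so the low 12 bits of an address are the page offset).
PAGE_BITS = 12

def translate(vaddr, page_table):
    vpn = vaddr >> PAGE_BITS                 # virtual page number
    offset = vaddr & ((1 << PAGE_BITS) - 1)  # page offset, unchanged
    ppn = page_table[vpn]                    # a miss would be a page fault
    return (ppn << PAGE_BITS) | offset

page_table = {0xFC519: 0x00152}              # VPN -> PPN
print(hex(translate(0xFC51908B, page_table)))  # 0x15208b
```

Only the page number is translated; the offset passes through untouched, which is why page size determines how the address bits are divided.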

Choosing a Page Size
Page size is inversely proportional to page table overhead
A large page size permits more efficient transfer to/from disk
–vs. many small transfers
–like downloading from the Internet
A small page leads to less fragmentation
–a big page is likely to have more bytes unused
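
The inverse relationship between page size and page table overhead is simple arithmetic. The flat table and 4-byte entries below are illustrative assumptions:

```python
# Flat page table size for a 32-bit virtual address space,
# assuming 4-byte page table entries (illustrative numbers).
VA_BITS = 32
ENTRY_BYTES = 4

def table_size(page_bytes):
    entries = (1 << VA_BITS) // page_bytes  # one entry per virtual page
    return entries * ENTRY_BYTES

print(table_size(4096))    # 4194304  -> 4 MB with 4 KB pages
print(table_size(65536))   # 262144   -> 256 KB with 64 KB pages
```

Growing the page 16× shrinks the table 16×, but each page then wastes more unused bytes: exactly the trade-off the slide describes.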

Translation Cache: TLB
TLB = Translation Look-aside Buffer
If we hit in the TLB, there is no need to do a page table lookup from memory
Note: the data cache is accessed by physical addresses now
[figure: virtual address → TLB → physical address, checked against the cache tags for a hit while the cache data is read]
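
The hit/miss behavior above can be sketched in software. A real TLB is a small set-associative hardware structure; this dictionary with a crude eviction policy is only meant to show the idea:

```python
# Minimal sketch of a TLB as a small cache of VPN -> PPN translations.
# (A real TLB is set-associative hardware; this only shows the idea.)
class TLB:
    def __init__(self, capacity=64):
        self.capacity = capacity
        self.entries = {}              # VPN -> PPN

    def lookup(self, vpn, page_table):
        if vpn in self.entries:        # TLB hit: no page table access
            return self.entries[vpn]
        ppn = page_table[vpn]          # TLB miss: walk the page table
        if len(self.entries) >= self.capacity:
            # evict the oldest inserted entry (crude FIFO stand-in)
            self.entries.pop(next(iter(self.entries)))
        self.entries[vpn] = ppn
        return ppn

tlb = TLB()
pt = {0xFC519: 0x00152}
print(hex(tlb.lookup(0xFC519, pt)))  # miss: walks pt, caches the mapping
print(hex(tlb.lookup(0xFC519, pt)))  # hit: served from the TLB
```

The second lookup never touches the page table, which is the whole point: the common case skips the extra memory access.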

Impact on Performance?
Every time you load/store, the CPU must perform two (or more) memory accesses!
Even worse, every fetch requires translation of the PC!
Observation:
–Once a virtual page is mapped into a physical page, it’ll likely stay put for quite some time