1
Lecture 14 Virtual Memory and the Alpha 21064 Memory Hierarchy
Computer Architecture COE 501
2
Virtual Memory
Virtual memory (VM) allows main memory (DRAM) to act like a cache for secondary storage (magnetic disk). VM address translation provides a mapping from the virtual address of the processor to the physical address in main memory or on disk.
VM provides the following benefits:
- Allows multiple programs to share the same physical memory
- Allows programmers to write code as though they have a very large amount of main memory
- Automatically handles bringing in data from disk
Cache terms vs. VM terms:
- Cache block => page or segment
- Cache miss => page fault or address fault
3
Cache and VM Parameters
How is virtual memory different from caches?
- Software controls replacement - why? The miss (page fault) penalty is so large that software can afford the time to make a careful choice.
- The size of virtual memory is determined by the size of the processor address.
- Disk is also used to store the file system - nonvolatile
4
Paged and Segmented VM (Figure 5.38, pg. 442)
Virtual memories can be categorized into two main classes:
- Paged memory: fixed-size blocks
- Segmented memory: variable-size blocks
5
Paged vs. Segmented VM
Paged memory
- Fixed-size blocks (4 KB to 64 KB)
- One word per address (page number + page offset)
- Easy to replace pages (all the same size)
- Internal fragmentation (not all of a page is used)
- Efficient disk traffic (optimized for the page size)
Segmented memory
- Variable-size blocks (up to 64 KB or 4 GB)
- Two words per address (segment + offset)
- Difficult to replace segments (must find where a segment fits)
- External fragmentation (unused portions of memory)
- Inefficient disk traffic (may have small or large transfers)
Hybrid approaches
- Paged segments: segments are a multiple of a page size
- Multiple page sizes (e.g., 8 KB, 64 KB, 512 KB, 4096 KB)
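The internal-fragmentation bullet above can be made concrete with a little arithmetic. This is a sketch, not from the slides; the 8 KB page size is an assumption borrowed from the Alpha examples later in the deck.

```python
PAGE_SIZE = 8 * 1024  # assumed 8 KB page, as on the Alpha

def pages_needed(nbytes: int) -> int:
    """Number of fixed-size pages required to hold nbytes."""
    return -(-nbytes // PAGE_SIZE)  # ceiling division

def internal_fragmentation(nbytes: int) -> int:
    """Bytes wasted inside the last, partially used page."""
    return pages_needed(nbytes) * PAGE_SIZE - nbytes

# A 20 KB object occupies three 8 KB pages and wastes 4 KB in the last one.
print(pages_needed(20 * 1024))            # 3
print(internal_fragmentation(20 * 1024))  # 4096
```

Segmentation avoids this per-block waste but pays for it with external fragmentation between variable-size segments.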
6
4 Qs for Virtual Memory
Q1: Where can a block be placed in the upper level?
- The miss penalty for virtual memory is very high.
- Software determines the location of the block while accessing disk.
- Blocks may be placed anywhere in memory (fully associative) to reduce the miss rate.
Q2: How is a block found if it is in the upper level?
- The address is divided into a page number and a page offset.
- A page table and a translation buffer are used for address translation.
Q3: Which block should be replaced on a miss?
- Want to reduce the miss rate, and replacement can be handled in software.
- Least Recently Used (LRU) is typically used.
Q4: What happens on a write?
- Writing to disk is very expensive.
- Use a write-back strategy.
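The LRU answer to Q3 can be sketched as a toy software page-replacement model (an illustration, not the OS algorithm the slides describe in detail):

```python
from collections import OrderedDict

class LRUPageSet:
    """Toy model of software-managed LRU page replacement (Q3 above)."""

    def __init__(self, nframes: int):
        self.nframes = nframes
        self.resident = OrderedDict()  # page -> None, least recent first

    def access(self, page):
        """Touch a page; return the evicted page on a fault, else None."""
        victim = None
        if page in self.resident:
            self.resident.move_to_end(page)  # hit: mark most recently used
        else:                                # page fault
            if len(self.resident) >= self.nframes:
                victim, _ = self.resident.popitem(last=False)  # evict LRU
            self.resident[page] = None
        return victim

frames = LRUPageSet(2)
print([frames.access(p) for p in [1, 2, 1, 3]])  # [None, None, None, 2]
```

With two frames, the reference string 1, 2, 1, 3 faults on 3 and evicts page 2, since page 1 was touched more recently.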
7
Address Translation with Page Table (Figure 5.40, pg. 444)
A page table translates a virtual page number into a physical page number; the page offset remains unchanged.
Page tables are large:
- 32-bit virtual address, 4 KB page size
- 2^20 entries of 4 bytes each = 4 MB
Page tables are stored in main memory => slow
- Cache page table entries in a translation buffer
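The translation on this slide can be sketched as follows, using the slide's parameters (32-bit virtual address, 4 KB pages, so a flat table of 2^20 entries):

```python
PAGE_OFFSET_BITS = 12                      # 4 KB pages
NUM_PAGES = 1 << (32 - PAGE_OFFSET_BITS)   # 2^20 virtual pages

page_table = [None] * NUM_PAGES            # one entry per page; None = unmapped

def translate(vaddr: int) -> int:
    vpn = vaddr >> PAGE_OFFSET_BITS                 # virtual page number
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)  # unchanged by translation
    ppn = page_table[vpn]
    if ppn is None:
        raise LookupError("page fault")             # handled by the OS
    return (ppn << PAGE_OFFSET_BITS) | offset

page_table[0x12345] = 0x00042              # map one virtual page
print(hex(translate(0x12345ABC)))          # 0x42abc
```

Only the page number is translated; the low 12 offset bits pass through untouched, exactly as in Figure 5.40.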
8
Fast Address Translation with Translation Buffer (TB) (Figure 5.41, pg. 446)
Cache translated addresses in the TB. Alpha data TB:
- 32 entries, fully associative
- 30-bit tag, 21-bit physical address
- Valid and read/write bits
- Separate TB for instructions
Steps in translation:
1. Compare the page number to the tags
2. Check for a memory access violation
3. Send the physical page number of the matching tag
4. Combine the physical page number and the page offset
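The four translation steps can be sketched as a fully associative lookup. This is an illustrative model, not the Alpha hardware; the entry fields mirror the tag, physical page number, valid, and read/write bits listed above, and the 13-bit page offset matches the Alpha's 8 KB pages.

```python
class TBEntry:
    def __init__(self, tag, ppn, valid=True, writable=False):
        self.tag, self.ppn = tag, ppn
        self.valid, self.writable = valid, writable

def tb_lookup(tb, vpn, offset, is_write, page_offset_bits=13):
    for entry in tb:                               # fully associative: every tag
        if entry.valid and entry.tag == vpn:       # step 1: compare tags
            if is_write and not entry.writable:    # step 2: access check
                raise PermissionError("write protection violation")
            # steps 3-4: take the matching PPN and append the page offset
            return (entry.ppn << page_offset_bits) | offset
    raise LookupError("TB miss: walk the page table")

tb = [TBEntry(tag=0x5, ppn=0x9)]
print(hex(tb_lookup(tb, vpn=0x5, offset=0x10, is_write=False)))  # 0x12010
```

In hardware all 32 comparisons happen in parallel; the loop here is only a functional stand-in.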
9
Selecting a Page Size
Reasons for a larger page size:
- Page table size is inversely proportional to the page size, so memory is saved.
- A fast cache hit time is easy when the cache size is smaller than the page size (virtually addressed caches); a bigger page keeps this feasible as caches grow.
- Transferring larger pages to or from secondary storage, possibly over a network, is more efficient.
- The number of TLB entries is restricted by the clock cycle time, so a larger page size maps more memory, thereby reducing TLB misses.
Reasons for a smaller page size:
- Avoid internal fragmentation: don't waste storage; data must be contiguous within a page.
- Quicker process startup for small processes - don't bring in more memory than needed.
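The first bullet can be checked with back-of-the-envelope arithmetic (assuming a flat page table, a 32-bit virtual address, and 4-byte entries, as on the earlier page-table slide):

```python
def flat_page_table_bytes(va_bits: int, page_size: int, pte_bytes: int = 4) -> int:
    """Size of a flat page table: one PTE per virtual page."""
    num_pages = (1 << va_bits) // page_size
    return num_pages * pte_bytes

# Doubling the page size halves the number of pages, and hence the table.
for page_size in (4 * 1024, 8 * 1024, 64 * 1024):
    kb = flat_page_table_bytes(32, page_size) // 1024
    print(f"{page_size // 1024:>3} KB pages -> {kb} KB page table")
```

The 4 KB row reproduces the 4 MB figure from the page-table slide; moving to 64 KB pages shrinks the table sixteen-fold, at the cost of more internal fragmentation.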
10
Memory Protection
With multiprogramming, a computer is shared by several programs or processes running concurrently:
- Need to provide protection
- Need to allow sharing
Mechanisms for providing protection:
- Provide base and bound registers: Base ≤ Address ≤ Bound
- Provide both user and supervisor (operating system) modes
- Provide CPU state that the user can read, but cannot write (base and bound registers, user/supervisor bit, exception bits)
- Provide a method to go from user to supervisor mode and vice versa (system call: user to supervisor; system return: supervisor to user)
- Provide permissions for each page or segment in memory
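The base-and-bound check above is simple enough to sketch directly (an illustration; in hardware this comparison happens on every user-mode access, and only the OS, in supervisor mode, may load the two registers):

```python
def check_access(addr: int, base: int, bound: int) -> int:
    """Allow the access only if Base <= Address <= Bound."""
    if not (base <= addr <= bound):
        raise MemoryError(f"protection violation at {addr:#x}")
    return addr

base, bound = 0x1000, 0x1FFF      # set by the OS in supervisor mode
print(hex(check_access(0x1800, base, bound)))   # 0x1800: within bounds
try:
    check_access(0x2000, base, bound)           # one past the bound
except MemoryError as e:
    print("trapped:", e)
```

A violation traps to the OS, which is exactly the user-to-supervisor transition (system call/exception path) listed in the mechanisms above.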
11
Alpha VM Mapping (Figure 5.43, pg. 451)
The “64-bit” address is divided into 3 segments:
- seg0 (bit 63 = 0): user code
- seg1 (bits 63 = 1, 62 = 1): user stack
- kseg (bits 63 = 1, 62 = 0): kernel segment for the OS
Three-level page table, each level one page:
- Reduces page table size
- Increases translation time
PTE bits: valid, kernel & user read & write enable
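The three-level split can be sketched from the Alpha's parameters: 8 KB pages give a 13-bit offset, and an 8 KB page-table page holding 8-byte PTEs has 1024 entries, so each level consumes 10 index bits (10 + 10 + 10 + 13 = 43 translated bits). The helper below is an illustration of the bit slicing, not the hardware walk.

```python
OFFSET_BITS = 13   # 8 KB pages
LEVEL_BITS = 10    # 8 KB page / 8-byte PTE = 1024 entries per level

def split_va(vaddr: int):
    """Slice a 43-bit virtual address into (L1, L2, L3, offset) fields."""
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    l3 = (vaddr >> OFFSET_BITS) & ((1 << LEVEL_BITS) - 1)
    l2 = (vaddr >> (OFFSET_BITS + LEVEL_BITS)) & ((1 << LEVEL_BITS) - 1)
    l1 = (vaddr >> (OFFSET_BITS + 2 * LEVEL_BITS)) & ((1 << LEVEL_BITS) - 1)
    return l1, l2, l3, offset

# Fields written out in binary for readability: L1=1, L2=2, L3=3, offset=5.
print(split_va(0b1_0000000010_0000000011_0000000000101))  # (1, 2, 3, 5)
```

Each level's 10-bit field indexes one page-sized table, which is why the table fits in a page and why a full translation needs three memory references on a TB miss.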
12
Cross-Cutting Issues
Superscalar CPU & number of cache ports:
- Increased instruction issue => increased number of cache ports
Speculative execution:
- Memory should identify speculative instructions and suppress faults
- Should have a non-blocking cache to avoid miss stalls
Instruction-level parallelism vs. reducing misses:
- Want wide separation to find independent operations vs. want reuse of data accesses to avoid misses
Consistency of data between cache and memory:
- Multiple caches => multiple copies of data
- Consistency must be controlled by HW or by SW
13
Alpha 21064 Memory Hierarchy
The Alpha memory hierarchy includes:
- A 32-entry, fully associative data TB
- A 12-entry, fully associative instruction TB
- An 8 KB direct-mapped, physically addressed data cache
- An 8 KB direct-mapped, physically addressed instruction cache
- A 4-entry by 64-bit instruction prefetch stream buffer
- A 4-entry by 256-bit write buffer
- A 2 MB direct-mapped, unified second-level cache
The virtual memory:
- Maps a 43-bit virtual address to a 34-bit physical address
- Has a page size of 8 KB
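One reason the 8 KB physically addressed L1 caches pair well with 8 KB pages: the cache index falls entirely within the 13-bit page offset, so the cache can be indexed in parallel with the TB translation. A sketch of the check (the 32-byte block size is an assumption; the slide does not give it):

```python
PAGE_OFFSET_BITS = 13                    # 8 KB pages
BLOCK_BITS = 5                           # assumed 32-byte cache blocks
CACHE_BITS = 13                          # 8 KB direct-mapped cache
INDEX_BITS = CACHE_BITS - BLOCK_BITS     # 8 bits -> 256 sets

def cache_index(addr: int) -> int:
    """Set index for a direct-mapped cache: bits above the block offset."""
    return (addr >> BLOCK_BITS) & ((1 << INDEX_BITS) - 1)

# The index uses address bits [12:5], all inside the page offset, so a
# virtual address and its translated physical address pick the same set.
assert BLOCK_BITS + INDEX_BITS <= PAGE_OFFSET_BITS
print(cache_index(0x12345678) == cache_index(0x9ABC5678))  # True
```

The two addresses above differ only in their page-number bits, yet select the same set, which is what lets the tag comparison wait for the TB while the data array is already being read.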
14
Alpha Memory Performance: Miss Rates
[Figure: miss rates for the 8 KB instruction cache, 8 KB data cache, and 2 MB unified L2 cache]
15
Alpha CPI Components
The largest increases in CPI are due to:
- I stall: instruction stalls from branch mispredictions
- Other: data hazards and structural hazards
16
Pitfall: Address Space Too Small
One of the biggest mistakes that can be made when designing an architecture is to devote too few bits to the address:
- The address size limits the size of virtual memory
- It is difficult to change, since many components depend on it (e.g., the PC, registers, effective-address calculations)
As program sizes increase, larger and larger address sizes are needed:
- 8 bits: Intel (1975)
- 16 bits: Intel (1978)
- 24 bits: Intel (1982)
- 32 bits: Intel (1985)
- 64 bits: Intel Merced (1998)
17
Pitfall: Predicting Cache Performance of one Program from Another Program
- 4 KB data cache miss rate: 8%, 12%, or 28%?
- 1 KB instruction cache miss rate: 0%, 3%, or 10%?
- Alpha vs. MIPS for 8 KB data caches: 17% vs. 10%
18
Pitfall: Simulating Too Small an Address Trace
19
Virtual Memory Summary
Virtual memory (VM) allows main memory (DRAM) to act like a cache for secondary storage (magnetic disk).
The large miss penalty of virtual memory leads to different strategies than for caches:
- Fully associative placement, TB + page table, LRU replacement, write-back
Designs:
- Paged: fixed-size blocks
- Segmented: variable-size blocks
- Hybrid: segmented paging or multiple page sizes
Avoid a small address size.