Caching IV
Andreas Klappenecker
CPSC321 Computer Architecture
Virtual Memory
- The processor generates virtual addresses; memory is accessed using physical addresses.
- Virtual and physical memory are broken into blocks of memory, called pages.
- A virtual page may be mapped to a physical page, or be absent from main memory and reside on disk.
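As a concrete illustration of the virtual/physical split, here is a minimal sketch of how a virtual address decomposes into a virtual page number and a page offset. The 4KB page size matches the example later in the deck; the specific address is made up.

```python
# Sketch: split a virtual address into a virtual page number (VPN)
# and a page offset, assuming 4KB pages (2^12 bytes). The constants
# are illustrative, not from any particular machine.
PAGE_SIZE = 4096        # 2^12 bytes per page
OFFSET_BITS = 12

def split_virtual_address(va: int) -> tuple[int, int]:
    vpn = va >> OFFSET_BITS          # which virtual page
    offset = va & (PAGE_SIZE - 1)    # byte within the page
    return vpn, offset

vpn, offset = split_virtual_address(0x00402ABC)
# vpn = 0x402, offset = 0xABC
```

Address translation replaces the VPN with a physical frame number; the offset passes through unchanged.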
Virtual Memory
- Main memory can act as a cache for secondary storage (disk).
- The processor generates a virtual address (left), which undergoes address translation (middle) to produce a physical address (right).
- Advantages: the illusion of having more physical memory, program relocation, and protection.
Pages: Virtual Memory Blocks
- Page faults: if the data is not in memory, retrieve it from disk.
  - Huge miss penalty, so pages should be fairly large (e.g., 4KB).
  - Reducing page faults is important (LRU replacement is worth the price).
  - Faults can be handled in software instead of hardware.
  - Write-through takes too long, so we use write-back.
- Example: page size 2^12 = 4KB; 2^18 physical pages; main memory <= 1GB; virtual memory <= 4GB.
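The arithmetic behind the example above can be checked directly: with a 12-bit offset and 2^18 physical pages, physical addresses need 18 + 12 = 30 bits (1GB), while 32-bit virtual addresses give 4GB and 2^20 virtual pages.

```python
# Working through the slide's example numbers.
OFFSET_BITS = 12         # page size 2^12 = 4KB
PHYS_PAGE_BITS = 18      # 2^18 physical pages

phys_addr_bits = PHYS_PAGE_BITS + OFFSET_BITS   # 30 bits -> 1GB
virt_addr_bits = 32                             # 32 bits -> 4GB
virt_page_bits = virt_addr_bits - OFFSET_BITS   # 20 bits -> 2^20 virtual pages

print(2**phys_addr_bits)   # 1073741824 bytes = 1GB
print(2**virt_addr_bits)   # 4294967296 bytes = 4GB
print(2**virt_page_bits)   # 1048576 page table entries
```

Note that virtual memory (4GB) is larger than physical memory (1GB); the extra virtual pages live on disk.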
Page Faults
- Incredibly high penalty for a page fault.
- Reduce the number of page faults by optimizing page placement: use fully associative placement.
  - A full search of the pages is impractical.
  - Pages are located by a full table that indexes the memory, called the page table.
  - The page table resides in main memory.
Page Tables
The page table maps each virtual page either to a page in main memory or to a page stored on disk.
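The mapping just described can be sketched as follows. This is a toy model for illustration: a real page table is an array in memory indexed by VPN, and the entries and numbers here are invented.

```python
# Toy page table lookup. Each entry records whether the page is
# resident in main memory (valid) and either its physical frame
# number or its disk location. All values are hypothetical.
from dataclasses import dataclass

@dataclass
class PageTableEntry:
    valid: bool          # True: in main memory; False: on disk
    frame_or_disk: int   # physical frame number, or disk block number

page_table = {
    0x402: PageTableEntry(valid=True,  frame_or_disk=0x1F),   # resident
    0x7A0: PageTableEntry(valid=False, frame_or_disk=9021),   # on disk
}

def translate(vpn: int, offset: int) -> int:
    entry = page_table.get(vpn)
    if entry is None or not entry.valid:
        # Page fault: the OS must bring the page in from disk.
        raise RuntimeError("page fault")
    return (entry.frame_or_disk << 12) | offset   # physical address

translate(0x402, 0xABC)   # -> 0x1FABC
```

Accessing VPN 0x7A0 would raise a page fault, triggering the OS to fetch the page from disk and update the table.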
Making Memory Access Fast
- Page tables slow us down: memory access takes at least twice as long (access the page table in memory, then access the page itself).
- What can we do? Memory access is local => use a cache that keeps track of recently used address translations, called the translation lookaside buffer (TLB).
Making Address Translation Fast
A cache for address translations: the translation lookaside buffer.
Translation Lookaside Buffer
Some typical values for a TLB:
- TLB size: 32-4096 entries
- Block size: 1-2 page table entries (4-8 bytes each)
- Hit time: 0.5-1 clock cycle
- Miss penalty: 10-30 clock cycles
- Miss rate: 0.01%-1%
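To make the hit/miss flow concrete, here is a hedged sketch of a tiny fully associative TLB with LRU replacement. The capacity, the backing page table, and all the numbers are made up for illustration; real TLBs are hardware structures, not dictionaries.

```python
# Toy fully associative TLB with LRU replacement. An OrderedDict
# tracks recency: most recently used entries move to the end, and
# the least recently used entry is evicted on a miss when full.
from collections import OrderedDict

class TLB:
    def __init__(self, capacity: int = 4):
        self.capacity = capacity
        self.entries: OrderedDict[int, int] = OrderedDict()  # vpn -> frame

    def lookup(self, vpn: int, page_table: dict) -> tuple[int, bool]:
        if vpn in self.entries:                  # TLB hit (~0.5-1 cycle)
            self.entries.move_to_end(vpn)
            return self.entries[vpn], True
        frame = page_table[vpn]                  # miss: walk the page table
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)     # evict least recently used
        self.entries[vpn] = frame
        return frame, False

tlb = TLB()
page_table = {0x402: 0x1F}
frame, hit = tlb.lookup(0x402, page_table)   # miss: fills the TLB
frame, hit = tlb.lookup(0x402, page_table)   # hit: no page table access
```

The second lookup avoids the page table entirely, which is exactly the savings described on the previous slide.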
TLBs and Caches
More Modern Systems
Modern machines have very complicated memory systems:
Some Issues
- Processor speeds continue to increase very fast, much faster than either DRAM or disk access times.
- Design challenge: dealing with this growing disparity.
- Trends:
  - synchronous SRAMs (provide a burst of data)
  - redesigned DRAM chips that provide higher bandwidth or processing
  - restructuring code to increase locality
  - prefetching (making the cache visible to the ISA)
Where Can a Block Be Placed?

  Scheme              Number of sets                       Blocks per set
  Direct mapped       # blocks in cache                    1
  Set associative     (# blocks in cache) / associativity  associativity (typically 2-8)
  Fully associative   1                                    number of blocks in cache
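The placement rule in the table above reduces to one computation: the set index is the block address modulo the number of sets. A short sketch, with illustrative cache sizes:

```python
# Sketch of block placement: set index = block address mod number
# of sets, where number of sets = (# blocks) / associativity.
# The 64-block cache size is made up for illustration.
def set_index(block_addr: int, num_blocks: int, associativity: int) -> int:
    num_sets = num_blocks // associativity
    return block_addr % num_sets

# Direct mapped: associativity 1, so every block has exactly one slot.
set_index(0x1234, num_blocks=64, associativity=1)    # -> 52
# 2-way set associative: 32 sets, 2 candidate slots per block.
set_index(0x1234, num_blocks=64, associativity=2)    # -> 20
# Fully associative: 1 set, so the block can go anywhere (index 0).
set_index(0x1234, num_blocks=64, associativity=64)   # -> 0
```

Direct mapped and fully associative are just the two extremes of the same formula: associativity 1 and associativity equal to the number of blocks.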
How Is a Block Found?

  Associativity       Location method                        # Comparisons
  Direct mapped       index                                  1
  Set associative     index the set, search among elements   degree of associativity
  Fully associative   search all cache entries               size of the cache
  Fully associative   separate lookup table                  0
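The comparison counts in the table follow from how lookup works: after indexing the set, the tag must be compared against every block in that set. A toy sketch (the cache contents are invented):

```python
# Sketch of cache lookup: index selects a set, then the tag is
# compared against each way in that set. The number of comparators
# needed equals the degree of associativity.
def find_block(cache_sets: list, block_addr: int) -> tuple[bool, int]:
    num_sets = len(cache_sets)
    index = block_addr % num_sets     # which set to search
    tag = block_addr // num_sets      # identifies the block within the set
    comparisons = 0
    for stored_tag in cache_sets[index]:   # one comparison per way
        comparisons += 1
        if stored_tag == tag:
            return True, comparisons
    return False, comparisons

# 4 sets, 2-way: block 13 -> set 13 % 4 = 1, tag 13 // 4 = 3.
cache = [[None, None], [3, 7], [None, None], [None, None]]
find_block(cache, 13)   # -> (True, 1): hit on the first way
```

In hardware the per-way comparisons happen in parallel, which is why high associativity costs comparators rather than time.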
Algorithm for Success
- Read Chapters 5-7 to get the big picture.
- Read again: focus on the little details, do the calculations, work problems.
- Get enough sleep!
- What should be reviewed?
Project
- Provide a working solution: it is better to submit a working solution implementing a subset of the instructions.
- If you submit a faulty version, comment your bugs.
- Have test programs that exercise all instructions.
- Have a full report that explains your design; it should include a table of control signals.