Chapter 19 Translation Lookaside Buffer


1 Chapter 19 Translation Lookaside Buffer
Chien-Chung Shen CIS/UD

2 Introduction
Paging carries high performance overheads: a large amount of mapping information is kept in memory, and every virtual address needs an extra memory access to fetch its translation.
How can address translation be sped up? With hardware support: the translation-lookaside buffer (TLB), which is part of the MMU. The TLB is a hardware cache of popular virtual-to-physical address translations; a better name would be address-translation cache.
Upon each virtual memory reference, the hardware first checks the TLB to see whether the desired translation is held therein; if so, the translation is performed quickly, without consulting the page table (which holds all the translations).

3 TLB Algorithm

4 TLB Algorithm
VPN = (VirtualAddress & VPN_MASK) >> SHIFT
(Success, TlbEntry) = TLB_Lookup(VPN)
if (Success == True) // TLB Hit
    if (CanAccess(TlbEntry.ProtectBits) == True)
        Offset = VirtualAddress & OFFSET_MASK
        PhysAddr = (TlbEntry.PFN << SHIFT) | Offset
        AccessMemory(PhysAddr)
    else
        RaiseException(PROTECTION_FAULT)
else // TLB Miss
    PTEAddr = PTBR + (VPN * sizeof(PTE))
    PTE = AccessMemory(PTEAddr)
    if (PTE.Valid == False)
        RaiseException(SEGMENTATION_FAULT)
    else if (CanAccess(PTE.ProtectBits) == False)
        RaiseException(PROTECTION_FAULT)
    else
        TLB_Insert(VPN, PTE.PFN, PTE.ProtectBits)
        RetryInstruction()
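
The TLB algorithm on this slide can be tried out as a small Python model. This is only a sketch under stated assumptions: the 8-bit address layout is borrowed from the array example, the TLB and page table are plain dictionaries, a single `writable` flag stands in for the protection bits, and the names `translate`, `PTE`, `SegmentationFault`, and `ProtectionFault` are hypothetical, not taken from the slides.

```python
from dataclasses import dataclass

# Assumed layout: 8-bit virtual addresses, 16-byte pages (4-bit VPN, 4-bit offset).
SHIFT = 4
VPN_MASK = 0xF0
OFFSET_MASK = 0x0F


class SegmentationFault(Exception):
    """Raised when the page table holds no valid translation for the VPN."""


class ProtectionFault(Exception):
    """Raised when the access is not permitted by the protection bits."""


@dataclass
class PTE:
    valid: bool
    pfn: int
    writable: bool = True  # simplification standing in for ProtectBits


def translate(vaddr, tlb, page_table, write=False):
    """Return (physical_address, tlb_hit) for one memory reference."""
    vpn = (vaddr & VPN_MASK) >> SHIFT
    offset = vaddr & OFFSET_MASK
    hit = vpn in tlb
    if not hit:
        # TLB miss: consult the page table, then install the translation.
        pte = page_table.get(vpn)
        if pte is None or not pte.valid:
            raise SegmentationFault(vpn)
        if write and not pte.writable:
            raise ProtectionFault(vpn)
        tlb[vpn] = pte  # plays the role of TLB_Insert
    # Hit path, also reached after the insert (models RetryInstruction).
    entry = tlb[vpn]
    if write and not entry.writable:
        raise ProtectionFault(vpn)
    return (entry.pfn << SHIFT) | offset, hit
```

For instance, with a page table that maps VPN 6 to PFN 3, the first call `translate(100, tlb, page_table)` misses and fills the TLB, and a second call with the same address hits.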

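5 Example: Access an Array
Setup: an 8-bit virtual address space with 16-byte pages (4-bit VPN, 4-bit offset), and an array of ten 4-byte integers starting at VA 100.
TLB hit rate: 70% (m,h,h,m,h,h,h,m,h,h). Whose contribution? Spatial locality: the elements are packed tightly into pages, so only the first access to each page misses.
Any other way to improve the hit rate? Larger pages. A quick re-reference of the same memory in time would also hit, thanks to temporal locality.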
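
The hit/miss pattern above can be reproduced with a few lines of Python. This is a minimal sketch that assumes the TLB starts empty and is large enough never to evict; the function name `tlb_trace` is made up for illustration.

```python
def tlb_trace(start, count, elem_size, page_size=16):
    """Return 'm'/'h' for each array element access, in order."""
    cached_vpns = set()  # assumed: TLB starts empty and never evicts
    trace = []
    for i in range(count):
        vpn = (start + i * elem_size) // page_size
        trace.append("h" if vpn in cached_vpns else "m")
        cached_vpns.add(vpn)
    return trace


# Ten 4-byte integers starting at VA 100, 16-byte pages.
trace = tlb_trace(start=100, count=10, elem_size=4)
hit_rate = trace.count("h") / len(trace)

# The same array with 32-byte pages, to see the effect of larger pages.
bigger = tlb_trace(start=100, count=10, elem_size=4, page_size=32)
bigger_hit_rate = bigger.count("h") / len(bigger)
```

With 16-byte pages the array spans VPNs 6, 7, and 8, giving three misses; with 32-byte pages it spans only two pages, so the hit rate rises to 80%.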

6 Caching and Locality
Caching is one of the most fundamental performance techniques in computer systems; it makes the common case fast. The idea behind caching is to take advantage of locality in instruction and data references.
Temporal locality: an instruction or data item that has been accessed recently will likely be re-accessed soon (e.g., instructions in a loop).
Spatial locality: if a program accesses memory at address x, it will likely soon access memory near x.

7 Who handles TLB Misses?
On CISC (complex-instruction set computer) architectures, the hardware handles the miss, locating the page table via the page-table base register (PTBR).
On RISC (reduced-instruction set computer) architectures, software handles it: the hardware simply raises an exception and jumps to a trap handler. The advantages are flexibility (the OS may use any data structure to implement the page table) and simplicity.
In this case, return-from-trap returns to the same instruction that caused the trap, which runs again and results in a TLB hit.
The OS must avoid causing an infinite chain of TLB misses. It can keep the TLB miss handler in physical memory (not subject to address translation), or reserve some TLB entries for permanently valid translations and use some of those permanent slots for the handler code itself.
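
The software-managed case can be sketched as a trap-and-retry loop. The model below is an illustration only: a Python exception stands in for the hardware trap, the page table is a dictionary because the OS is free to choose its own structure, and every name (`TLBMiss`, `hardware_access`, `os_miss_handler`, `run_instruction`) is hypothetical.

```python
SHIFT = 4  # assumed: 16-byte pages


class TLBMiss(Exception):
    """Stands in for the exception the hardware raises on a TLB miss."""

    def __init__(self, vpn):
        super().__init__(vpn)
        self.vpn = vpn


def hardware_access(vaddr, tlb):
    """Hardware side: translate using the TLB only, or trap."""
    vpn = vaddr >> SHIFT
    if vpn not in tlb:
        raise TLBMiss(vpn)
    return (tlb[vpn] << SHIFT) | (vaddr & ((1 << SHIFT) - 1))


def os_miss_handler(vpn, tlb, page_table):
    """OS side: look up the translation and install it in the TLB."""
    tlb[vpn] = page_table[vpn]  # a missing VPN would be a fault; not modeled


def run_instruction(vaddr, tlb, page_table):
    """Execute one access, retrying the same instruction after each trap."""
    traps = 0
    while True:
        try:
            return hardware_access(vaddr, tlb), traps
        except TLBMiss as miss:
            traps += 1
            os_miss_handler(miss.vpn, tlb, page_table)
            # Return-from-trap: loop back and rerun the same access.
```

The retry after the handler returns is what makes the second attempt a TLB hit; the handler itself never goes through `hardware_access`, which mirrors keeping it outside address translation.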

8 TLB Control Flow Algorithm
[Figure: TLB control flow in the hardware-based design and in the software-based design, where a miss traps to execute the OS TLB miss handler]

9 TLB Contents
A TLB typically has 32, 64, or 128 entries and is fully associative: any given translation can be anywhere in the TLB, and the hardware searches the entire TLB in parallel to find the desired translation.
An entry looks like: VPN | PFN | other bits (e.g., a valid bit).
The TLB valid bit is not the same as the page-table valid bit. In the page table, a PTE marked invalid means that the page has not been allocated by the process. A TLB valid bit indicates whether a TLB entry holds a valid translation, which is useful for context switching.

10 Context Switch
The TLB contains virtual-to-physical translations that are valid only for the currently running process and are not meaningful for other processes. What to do on a context switch?
Flush the TLB on each context switch by setting all valid bits to 0. The cost: each time a process runs again after a context switch, it incurs TLB misses.
What can you do better? Add an address space ID (ASID) to each entry, so that the TLB may hold translations from different processes at the same time.
Processes may also share a page in physical memory, in which case entries belonging to process 1 and process 2 map to the same PFN.
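
One way to picture the ASID idea is a TLB whose lookup key is the pair (ASID, VPN) rather than the VPN alone. The sketch below is a hypothetical model, not a description of any particular hardware; the class name `AsidTLB` and the VPN/PFN numbers are illustrative assumptions.

```python
class AsidTLB:
    """Toy TLB keyed by (ASID, VPN), so no flush is needed on a context switch."""

    def __init__(self):
        self.entries = {}  # (asid, vpn) -> pfn

    def insert(self, asid, vpn, pfn):
        self.entries[(asid, vpn)] = pfn

    def lookup(self, asid, vpn):
        """Return the PFN, or None on a miss for this address space."""
        return self.entries.get((asid, vpn))


tlb = AsidTLB()
# Two processes use the same VPN for different physical pages.
tlb.insert(asid=1, vpn=10, pfn=100)
tlb.insert(asid=2, vpn=10, pfn=170)
# Two processes share one physical page under different VPNs.
tlb.insert(asid=1, vpn=20, pfn=101)
tlb.insert(asid=2, vpn=50, pfn=101)
```

Here the lookup for VPN 10 returns a different PFN depending on the ASID, while the last two entries show a shared physical page reached through different virtual pages.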

11 Replacement Policy
Cache replacement has the goal of minimizing the miss rate. Policies: (1) evict the least-recently-used (LRU) entry, which takes advantage of locality; (2) evict a random entry.
Scenario: a loop accesses n + 1 pages with a TLB of size n. Which replacement policy works better? Random! LRU always evicts exactly the page the loop will need next, so every access misses.
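
The n + 1 pages scenario is easy to check with a small simulation. The sketch below is an illustration under assumptions: n = 4, one hundred passes over the loop, and a fixed random seed so the run is repeatable; the function names are hypothetical.

```python
import random
from collections import OrderedDict


def misses_lru(refs, size):
    """Count misses for a TLB of `size` entries with LRU replacement."""
    tlb = OrderedDict()  # keys ordered from least to most recently used
    misses = 0
    for vpn in refs:
        if vpn in tlb:
            tlb.move_to_end(vpn)
        else:
            misses += 1
            if len(tlb) >= size:
                tlb.popitem(last=False)  # evict the least recently used
            tlb[vpn] = True
    return misses


def misses_random(refs, size, rng):
    """Count misses for a TLB of `size` entries with random replacement."""
    tlb = []
    misses = 0
    for vpn in refs:
        if vpn not in tlb:
            misses += 1
            if len(tlb) >= size:
                tlb[rng.randrange(size)] = vpn  # overwrite a random slot
            else:
                tlb.append(vpn)
    return misses


n = 4
refs = list(range(n + 1)) * 100  # loop over n + 1 pages, 100 times
lru_misses = misses_lru(refs, n)
random_misses = misses_random(refs, n, random.Random(0))
```

In this run LRU misses on every one of the 500 references, while random replacement misses noticeably less often, since it only sometimes evicts the page that is needed next.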

12 A Real TLB Entry MIPS R4000 with software-managed TLB

13 Culler’s Law
The term random-access memory (RAM) implies that you can access any part of RAM just as quickly as another. While it is generally good to think of RAM in this way, because of hardware/OS features such as the TLB, accessing a particular page of memory may be costly, particularly if that page isn’t currently mapped by the TLB. Thus, it is always good to remember the implementation tip: RAM isn’t always RAM. Sometimes randomly accessing your address space, particularly if the number of pages accessed exceeds the TLB coverage, can lead to severe performance penalties. -- David Culler
The TLB is the source of many performance problems.
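
TLB coverage can be put into numbers with a little arithmetic: coverage is the number of entries times the page size. The sketch below assumes a 64-entry TLB and 4 KB pages (the page size is an assumption, not from the slides) and estimates the steady-state hit rate for uniformly random accesses as the fraction of touched pages that fit in the TLB; both function names are hypothetical.

```python
def tlb_coverage(entries, page_size):
    """Bytes of address space the TLB can map at one time."""
    return entries * page_size


def random_access_hit_rate(entries, page_size, footprint):
    """Rough steady-state hit rate for uniformly random accesses.

    Assumes every page in the footprint is equally likely to be touched,
    so the chance that a page is resident is entries / pages_touched.
    """
    pages_touched = -(-footprint // page_size)  # ceiling division
    return min(1.0, entries / pages_touched)


coverage = tlb_coverage(entries=64, page_size=4096)           # 256 KB
small = random_access_hit_rate(64, 4096, footprint=128 * 1024)
large = random_access_hit_rate(64, 4096, footprint=1 << 20)   # 1 MB
```

Under these assumptions a 128 KB working set fits within the coverage and always hits, whereas random access over 1 MB hits only about a quarter of the time.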

