Chapter 19 Translation Lookaside Buffer Chien-Chung Shen CIS, UD


Introduction
High performance overheads of paging
  – large amount of mapping information (in memory)
  – extra memory access for each virtual address
Hardware support
  – translation-lookaside buffer (TLB)
  – part of the MMU
  – hardware cache of popular virtual-to-physical address translations
  – a better name would be address-translation cache
  – upon each virtual memory reference, the hardware first checks the TLB to see if the desired translation is held therein; if so, the translation is performed (quickly) without having to consult the page table (which has all translations)

TLB Algorithm
    VPN = (VirtualAddress & VPN_MASK) >> SHIFT
    (Success, TlbEntry) = TLB_Lookup(VPN)
    if (Success == True)    // TLB Hit
        if (CanAccess(TlbEntry.ProtectBits) == True)
            Offset   = VirtualAddress & OFFSET_MASK
            PhysAddr = (TlbEntry.PFN << SHIFT) | Offset
            AccessMemory(PhysAddr)
        else
            RaiseException(PROTECTION_FAULT)
    else                    // TLB Miss
        PTEAddr = PTBR + (VPN * sizeof(PTE))
        PTE = AccessMemory(PTEAddr)
        if (PTE.Valid == False)
            RaiseException(SEGMENTATION_FAULT)
        else if (CanAccess(PTE.ProtectBits) == False)
            RaiseException(PROTECTION_FAULT)
        else
            TLB_Insert(VPN, PTE.PFN, PTE.ProtectBits)
            RetryInstruction()
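The bit manipulation in the algorithm above can be sketched in C. This is only an illustration under assumed parameters: the constants match the 8-bit address space with 16-byte pages used in the array example, and the function names (vpn_of, offset_of, phys_addr) are hypothetical helpers, not part of the slides.

```c
#include <stdint.h>

/* Assumed layout: 8-bit virtual address, 16-byte pages,
   so the low 4 bits are the offset and the high 4 bits the VPN. */
#define SHIFT       4u
#define OFFSET_MASK 0x0Fu
#define VPN_MASK    0xF0u

/* Extract the virtual page number from a virtual address. */
uint32_t vpn_of(uint32_t vaddr)
{
    return (vaddr & VPN_MASK) >> SHIFT;
}

/* Extract the byte offset within the page. */
uint32_t offset_of(uint32_t vaddr)
{
    return vaddr & OFFSET_MASK;
}

/* Combine a physical frame number with the unchanged page offset. */
uint32_t phys_addr(uint32_t pfn, uint32_t vaddr)
{
    return (pfn << SHIFT) | offset_of(vaddr);
}
```

For instance, virtual address 100 (0x64) splits into VPN 6 and offset 4; if VPN 6 were mapped to a made-up PFN 3, the physical address would be 52 (0x34).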

Example: Access Array
8-bit virtual address space and 16-byte pages
  – 4-bit VPN and 4-bit offset
10 4-byte integers starting at VA 100
    int sum = 0;
    for (i = 0; i < 10; i++) {
        sum += a[i];
    }
TLB hit rate: 70% (3 misses, 7 hits)
  – spatial locality: only the first access to each page misses; the other elements on that page hit
Any other way to improve hit rate?
  – larger pages
Quick re-reference of memory in time
  – temporal locality
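The 70% figure can be checked with a small simulation. The sketch below assumes the array starts at virtual address 100 (a starting point consistent with the 70% hit rate), that the TLB starts empty, and that it is large enough that nothing is evicted; scan_hits is a hypothetical helper name.

```c
#include <stdint.h>

/* Count TLB hits for a sequential scan of `count` elements of
   `elem_size` bytes starting at virtual address `base`.
   Assumes an initially empty TLB with no evictions, so during a
   sequential scan an access hits exactly when it falls on the same
   page as the previous access. Elements are assumed not to straddle
   a page boundary. */
unsigned scan_hits(uint32_t base, unsigned count,
                   unsigned elem_size, unsigned page_size)
{
    unsigned hits = 0;
    uint32_t last_vpn = 0;
    int have_last = 0;              /* no page translated yet */

    for (unsigned i = 0; i < count; i++) {
        uint32_t vpn = (base + i * elem_size) / page_size;
        if (have_last && vpn == last_vpn)
            hits++;                 /* translation already cached */
        last_vpn = vpn;
        have_last = 1;
    }
    return hits;
}
```

With 16-byte pages the ten accesses touch VPNs 6, 7, and 8, giving 3 misses and 7 hits. Doubling the page size to 32 bytes leaves only 2 pages to miss on, so the same scan gets 8 hits.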

Caching and Locality
Caching is one of the most fundamental performance techniques in computer systems: make the common case fast
The idea behind caching is to take advantage of locality in instruction and data references
  – Temporal locality: an instruction or data item that has been accessed recently will likely be re-accessed soon (e.g., instructions in a loop)
  – Spatial locality: if a program accesses memory at address x, it will likely soon access memory near x

Who Handles TLB Misses?
CISC (complex instruction set computer) architectures: hardware
  – using the page-table base register
RISC (reduced instruction set computer) architectures: software (the hardware simply raises an exception and jumps to a trap handler)
  – advantages: flexibility (the OS may use any data structure to implement the page table) and simplicity
  – return-from-trap returns to the same instruction that caused the trap, so the access is retried
  – the handler must avoid causing an infinite chain of TLB misses
      keep TLB miss handlers in physical memory (not subject to address translation), or
      reserve some TLB entries for permanently valid translations and use some of those slots for the handler code itself
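A software-managed miss can be sketched as an ordinary C function standing in for the OS trap handler. Everything here is a simplified model with hypothetical names: a linear page table of 16 entries (assuming a 4-bit VPN), a single-slot structure in place of the real TLB, and an enum return value in place of an actual return-from-trap.

```c
#include <stdint.h>

#define NUM_PAGES 16            /* assumed: 4-bit VPN */

/* Hypothetical OS page-table entry; with a software-managed TLB the
   OS is free to choose any layout it likes. */
struct pte { int valid; uint32_t pfn; };

/* Hypothetical single-slot stand-in for the hardware TLB. */
struct tlb_slot { int valid; uint32_t vpn, pfn; };

enum trap_result { RETRY_INSTRUCTION, SEGMENTATION_FAULT };

/* Invoked when the hardware raises a TLB-miss exception: look up the
   translation in the OS page table, install it, and ask the hardware
   to retry the faulting instruction. */
enum trap_result tlb_miss_handler(const struct pte table[NUM_PAGES],
                                  struct tlb_slot *tlb, uint32_t vpn)
{
    if (vpn >= NUM_PAGES || !table[vpn].valid)
        return SEGMENTATION_FAULT;  /* page not allocated */

    tlb->valid = 1;                 /* models a privileged TLB insert */
    tlb->vpn = vpn;
    tlb->pfn = table[vpn].pfn;
    return RETRY_INSTRUCTION;       /* the retried access now hits */
}
```

The retry is what makes the scheme work: the handler never performs the original access itself, it only makes sure the second attempt finds the translation in the TLB.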

TLB Contents
32, 64, or 128 entries
Fully associative: any given translation can be anywhere in the TLB, and the hardware searches the entire TLB in parallel to find the desired translation
An entry looks like: VPN | PFN | other bits
  – e.g., valid bit
  – TLB valid bit ≠ page table valid bit
      in a page table, a PTE marked invalid means that the page has not been allocated by the process
      a TLB valid bit indicates whether a TLB entry holds a valid translation
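A fully associative lookup can be modeled in C as a search over every entry. This is a sketch only: real hardware compares all entries in parallel, whereas the loop below is a sequential stand-in, and the struct layout and the name tlb_lookup are illustrative assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_ENTRIES 32          /* slides: 32, 64, or 128 entries */

struct tlb_entry {
    uint32_t vpn;
    uint32_t pfn;
    bool     valid;             /* TLB valid bit, not the PTE valid bit */
};

/* Search every entry for `vpn`; any slot may hold any translation.
   Returns true and stores the frame number in *pfn on a hit.
   On a miss *pfn is left untouched. */
bool tlb_lookup(const struct tlb_entry tlb[TLB_ENTRIES],
                uint32_t vpn, uint32_t *pfn)
{
    for (size_t i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn) {
            *pfn = tlb[i].pfn;
            return true;
        }
    }
    return false;
}
```

The valid check matters: a zero-initialized TLB contains entries whose vpn field happens to be 0, and without the valid bit a lookup of VPN 0 would wrongly hit.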

Context Switch
The TLB contains virtual-to-physical translations that are valid only for the currently running process and are not meaningful for other processes
What to do on a context switch?
  – flush the TLB on context switches by setting all valid bits to 0
  – this incurs TLB misses after every context switch: what can you do better?
Add an ASID (Address Space ID) field to each entry: VPN | PFN | valid | prot | ASID
  – with ASIDs, the TLB may hold translations from different processes at the same time
  – example table: two rwx entries tagged ASID 1 and ASID 2, interleaved with invalid entries (valid = 0)
Sharing of a page
  – a second table of the same form, again with entries tagged ASID 1 and ASID 2
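The two options on a context switch, flushing versus tagging entries with an ASID, can be contrasted in a short C sketch. The struct layout, the 4-entry size, and the function names are assumptions made for illustration; the VPN and PFN values mentioned afterwards are made up.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_ENTRIES 4           /* small on purpose, for illustration */

struct tlb_entry {
    uint32_t vpn, pfn;
    uint8_t  asid;              /* address-space identifier */
    bool     valid;
};

/* Option 1: flush everything on a context switch. */
void tlb_flush(struct tlb_entry tlb[TLB_ENTRIES])
{
    for (size_t i = 0; i < TLB_ENTRIES; i++)
        tlb[i].valid = false;
}

/* Option 2: keep entries and match on (VPN, ASID) instead of VPN alone. */
bool tlb_lookup_asid(const struct tlb_entry tlb[TLB_ENTRIES],
                     uint32_t vpn, uint8_t asid, uint32_t *pfn)
{
    for (size_t i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn && tlb[i].asid == asid) {
            *pfn = tlb[i].pfn;
            return true;
        }
    }
    return false;
}
```

With ASIDs, two processes can both have an entry for, say, VPN 10 that maps to different frames, and the lookup picks the right one based on the ASID of the running process. Page sharing is the opposite case: entries with different ASIDs carry the same PFN.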

Replacement Policy
Cache replacement with the goal of minimizing the miss rate
Policies
  – evict the least-recently-used (LRU) entry
      how about a loop accessing n + 1 pages, a TLB of size n, and an LRU replacement policy? (every access misses, because LRU always evicts the page that is needed next)
  – random
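The LRU corner case can be reproduced with a small simulation. The sketch below is an assumption-laden model, not a description of any real TLB: timestamps stand in for hardware recency tracking, the TLB starts empty, and lru_sweep_hits and MAX_TLB are hypothetical names.

```c
#include <assert.h>

#define MAX_TLB 64

/* Count hits for `passes` sequential sweeps over pages 0..npages-1
   with a fully associative TLB of `tlb_size` entries and LRU
   replacement. */
unsigned lru_sweep_hits(unsigned tlb_size, unsigned npages, unsigned passes)
{
    unsigned vpn[MAX_TLB];
    unsigned long last_use[MAX_TLB];
    unsigned used = 0, hits = 0;
    unsigned long now = 0;

    assert(tlb_size > 0 && tlb_size <= MAX_TLB);

    for (unsigned p = 0; p < passes; p++) {
        for (unsigned page = 0; page < npages; page++) {
            unsigned slot = used;           /* "not found" marker */
            now++;
            for (unsigned i = 0; i < used; i++)
                if (vpn[i] == page) { slot = i; break; }

            if (slot < used) {
                hits++;                     /* translation present */
            } else if (used < tlb_size) {
                slot = used++;              /* fill a free entry */
                vpn[slot] = page;
            } else {
                slot = 0;                   /* evict least recently used */
                for (unsigned i = 1; i < used; i++)
                    if (last_use[i] < last_use[slot]) slot = i;
                vpn[slot] = page;
            }
            last_use[slot] = now;           /* record this use */
        }
    }
    return hits;
}
```

With a 4-entry TLB, three sweeps over 5 pages produce no hits at all, while three sweeps over 4 pages hit on every access after the first sweep. A random policy avoids this worst case because it evicts the page about to be reused only some of the time.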

A Real TLB Entry MIPS R4000 with software-managed TLB

Culler’s Law
The term random-access memory (RAM) implies that you can access any part of RAM just as quickly as another. While it is generally good to think of RAM in this way, because of hardware/OS features such as the TLB, accessing a particular page of memory may be costly, particularly if that page isn’t currently mapped by the TLB. Thus, it is always good to remember the implementation tip: RAM isn’t always RAM. Sometimes randomly accessing your address space, particularly if the number of pages accessed exceeds the TLB coverage, can lead to severe performance penalties. -- David Culler
The TLB is the source of many performance problems
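TLB coverage is simple arithmetic: the number of entries times the page size. The helpers below are a minimal sketch with hypothetical names, assuming every entry maps one page and all pages have the same size.

```c
#include <stddef.h>

/* TLB coverage: bytes of address space that can be mapped at once,
   assuming one page per entry and a uniform page size. */
size_t tlb_coverage(size_t entries, size_t page_size)
{
    return entries * page_size;
}

/* Nonzero if a working set of `bytes` cannot be fully mapped at once. */
int exceeds_coverage(size_t bytes, size_t entries, size_t page_size)
{
    return bytes > tlb_coverage(entries, page_size);
}
```

For example, 64 entries with an assumed 4 KB page size cover 256 KB, so a program that randomly touches a 1 MB region would be expected to miss in the TLB frequently.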