Review: Memory Virtualization

Review: Memory Virtualization
Goal: provide each process with the illusion of its own address space
Must map virtual address spaces to physical memory
Must provide efficient address translation
Must provide protection
Approaches so far: base and bounds, segmentation

Hardware/software responsibilities?

Hardware/software responsibilities?
Hardware: efficient address translation; enforcing isolation/protection
Software?

Hardware/software responsibilities?
Hardware: efficient address translation; enforcing isolation/protection
Software: maintain hardware structures (base/bounds registers, etc.); handle exceptions

Issues?
Base/bounds: a sparse address space wastes memory.
Segmentation (to address the wasted memory): external fragmentation; not much flexibility.
Fine-grained segmentation can increase flexibility, but decreases efficiency.

Addressing External Fragmentation: Paging
Paging splits the address space into fixed-size units called pages. (Contrast with segmentation, which uses variable-size logical segments: code, stack, heap, etc.)
With paging, physical memory is also split into fixed-size units called page frames.
A per-process page table is needed to translate virtual addresses to physical addresses.

Advantages of Paging
Flexibility: supports the address-space abstraction effectively; no assumptions needed about how the heap and stack grow.
Simplicity: free-space management is easy; pages and page frames are all the same size, so it is easy to allocate from a free list; no external fragmentation. A minimal free-list sketch follows.
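Since every page frame is the same size, free-space management can be as simple as keeping a list of free frame numbers. Below is a minimal C sketch, assuming an 8-frame memory like the one in the example on the next slide; the function names are made up for illustration.

#include <stdio.h>

#define NUM_FRAMES 8

/* Free page-frame numbers kept as a simple stack: allocate and free in O(1). */
static int free_frames[NUM_FRAMES];
static int num_free = 0;

static void frame_free(int pfn) { free_frames[num_free++] = pfn; }
static int  frame_alloc(void)   { return num_free > 0 ? free_frames[--num_free] : -1; }

int main(void) {
    for (int pfn = 1; pfn < NUM_FRAMES; pfn++)   /* frame 0 reserved for the OS */
        frame_free(pfn);
    printf("allocated frame %d, %d frames left\n", frame_alloc(), num_free);
    return 0;
}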

A Simple Paging Example
A 64-byte address space with 16-byte pages (pages 0-3) is placed into a 128-byte physical memory with 16-byte page frames (frames 0-7). In the figure, frame 0 (the first 16 bytes) is reserved for the OS; page 0 of the address space sits in frame 3, page 1 in frame 7, page 2 in frame 5, and page 3 in frame 2; the remaining frames are unused.

Address Translation
Two components in the virtual address:
VPN: virtual page number
Offset: offset within the page
Example: virtual address 21 in the 64-byte address space. The 6-bit address Va5..Va0 is 010101: the top two bits (01) are the VPN and the low four bits (0101) are the offset.

Example: Address Translation
For virtual address 21 in the 64-byte address space, VPN = 01 and offset = 0101. The page-table lookup translates VPN 1 to PFN 7 (binary 111), so the physical address is 1110101 = 117.
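A small C sketch of this translation, hardcoding the example's page table (VPN 0-3 map to PFN 3, 7, 5, 2, as shown on the next slide) and the 16-byte-page, 64-byte-address-space constants:

#include <stdio.h>

#define PAGE_SIZE   16u                           /* 16-byte pages            */
#define OFFSET_BITS 4                             /* log2(PAGE_SIZE)          */
#define OFFSET_MASK (PAGE_SIZE - 1)

/* Example page table: VPN 0..3 -> PFN 3, 7, 5, 2. */
static const unsigned page_table[4] = { 3, 7, 5, 2 };

int main(void) {
    unsigned va     = 21;                         /* 0b010101                 */
    unsigned vpn    = va >> OFFSET_BITS;          /* top 2 bits  -> VPN 1     */
    unsigned offset = va & OFFSET_MASK;           /* low 4 bits  -> offset 5  */
    unsigned pa     = (page_table[vpn] << OFFSET_BITS) | offset;
    printf("VA %u -> VPN %u, offset %u -> PA %u\n", va, vpn, offset, pa);
    return 0;                                     /* prints PA 117            */
}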

Where Are Page Tables Stored?
Page table: maps a VPN (virtual page number) to a PFN (page frame number).
Page tables must be protected: allowing a user-space process to modify page tables would be very bad, so they are stored in kernel memory.
Page tables can be very large:
32-bit address space with 4 KB pages (12-bit offset) leaves 20 bits for the VPN = 1M entries
Typical page table entry (PTE) size: 4 B, so page table size = 4 B * 1M = 4 MB
One page table per process; easily > 100 processes run on modern systems, so hundreds of MBs go to page tables.
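The size arithmetic from the slide, spelled out as a quick self-check:

#include <stdio.h>

int main(void) {
    unsigned long va_bits     = 32;                     /* 32-bit address space  */
    unsigned long offset_bits = 12;                     /* 4 KB pages            */
    unsigned long vpn_bits    = va_bits - offset_bits;  /* 20 bits of VPN        */
    unsigned long num_entries = 1UL << vpn_bits;        /* 1M entries            */
    unsigned long pte_size    = 4;                      /* 4-byte PTEs           */
    unsigned long table_bytes = num_entries * pte_size;
    printf("%lu entries, %lu MB of page table per process\n",
           num_entries, table_bytes >> 20);             /* 1048576 entries, 4 MB */
    return 0;
}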

Page Table in Kernel Physical Memory
The figure shows the 128-byte physical memory again: the page table (3, 7, 5, 2) lives in page frame 0 (kernel memory); page 3 of the address space is in frame 2, page 0 in frame 3, page 2 in frame 5, and page 1 in frame 7; frames 1, 4, and 6 are unused.

Page Table Entry (PTE)
The page table is just a data structure used to map VPNs to PFNs.
Simplest form: a linear page table, i.e., an array.
The OS indexes the array by VPN and looks up the page-table entry there.

Common Flags of a Page Table Entry
Valid bit: indicates whether the particular translation is valid. Critical for handling sparse address spaces: the unused space between the heap and stack is invalid and need not be allocated.
Protection bits: indicate whether the page can be read from, written to, or executed from. The hardware traps into the OS if a page is accessed incorrectly.
Present bit: indicates whether this page is in physical memory or on disk (swapped out).
Dirty bit: indicates whether the page has been modified since it was brought into memory.
Reference bit (accessed bit): indicates that a page has been accessed. Critical for page replacement policies.
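One way to picture these flags together is as a packed PTE. The layout below is purely illustrative (it is not any particular architecture's format):

#include <stdint.h>

/* Illustrative 32-bit PTE with the flags discussed above. */
typedef struct {
    uint32_t valid    : 1;   /* translation is valid                        */
    uint32_t read     : 1;   /* protection bits: read / write / execute     */
    uint32_t write    : 1;
    uint32_t exec     : 1;
    uint32_t present  : 1;   /* in physical memory (1) or swapped out (0)   */
    uint32_t dirty    : 1;   /* modified since it was brought into memory   */
    uint32_t accessed : 1;   /* referenced; used by page replacement        */
    uint32_t reserved : 5;
    uint32_t pfn      : 20;  /* page frame number                           */
} pte_t;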

Example: x86 Page Table Entry
An x86 PTE is 32 bits: bits 31-12 hold the PFN, and the low bits hold the flags G, PAT, D, A, PCD, PWT, U/S, R/W, and P.
P: present bit
R/W: read/write bit
U/S: user/supervisor bit
A: accessed bit
D: dirty bit
PFN: the page frame number
The rest: hardware caching bits
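A sketch of decoding a raw 32-bit x86 PTE with masks and shifts; the bit positions follow the slide, but the macro names are made up:

#include <stdio.h>
#include <stdint.h>

#define X86_PTE_P   (1u << 0)                      /* present              */
#define X86_PTE_RW  (1u << 1)                      /* read/write           */
#define X86_PTE_US  (1u << 2)                      /* user/supervisor      */
#define X86_PTE_A   (1u << 5)                      /* accessed             */
#define X86_PTE_D   (1u << 6)                      /* dirty                */
#define X86_PTE_PFN(pte) ((uint32_t)(pte) >> 12)   /* PFN in bits 31..12   */

int main(void) {
    uint32_t pte = (7u << 12) | X86_PTE_P | X86_PTE_RW | X86_PTE_A;
    printf("present=%d writable=%d dirty=%d pfn=%u\n",
           !!(pte & X86_PTE_P), !!(pte & X86_PTE_RW),
           !!(pte & X86_PTE_D), (unsigned)X86_PTE_PFN(pte));
    return 0;
}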

Problems With Paging?

Problems with Paging?
A page-table lookup requires an additional memory access per address translation: slow!
Page tables are too big! (Next time...)

Paging Is Too Slow!
To find the location of the desired PTE, the starting location of the page table is needed: the page table base register (PTBR).
For every memory reference, paging requires one extra memory reference just to fetch the translation from the page table.

Accessing Memory With Paging

// Extract the VPN from the virtual address
VPN = (VirtualAddress & VPN_MASK) >> SHIFT

// Form the address of the page-table entry (PTE)
PTEAddr = PTBR + (VPN * sizeof(PTE))

// Fetch the PTE
PTE = AccessMemory(PTEAddr)

// Check if the process can access the page
if (PTE.Valid == False)
    RaiseException(SEGMENTATION_FAULT)
else if (CanAccess(PTE.ProtectBits) == False)
    RaiseException(PROTECTION_FAULT)
else
    // Access is OK: form the physical address and fetch it
    offset = VirtualAddress & OFFSET_MASK
    PhysAddr = (PTE.PFN << PFN_SHIFT) | offset
    Register = AccessMemory(PhysAddr)

A Memory Trace Example: A Simple Memory Access

C code:
int array[1000];
...
for (i = 0; i < 1000; i++)
    array[i] = 0;

Compile and execute:
prompt> gcc -o array array.c -Wall -O
prompt> ./array

Resulting assembly code for the loop:
0x1024 movl $0x0,(%edi,%eax,4)
0x1028 incl %eax
0x102c cmpl $0x03e8,%eax
0x1030 jne 0x1024

A Virtual (and Physical) Memory Trace
The figure plots the first ~50 memory accesses of the loop (its first five iterations). Each iteration touches the page table (entries 1 and 39, stored near physical address 1024), the code page (mov/inc/cmp/jne at virtual addresses near 1024, physical addresses near 4096), and the array (virtual addresses near 40000, physical addresses near 7132), so every instruction fetch and array store costs an extra page-table access.

Locality
Temporal locality: data is likely to be accessed again in the near future (e.g., code, page table entries).
Spatial locality: data close to recently used data is likely to be used soon (e.g., array traversal).

Exploiting Locality to Speed Up Paging
Hardware caches: keep likely-to-be-used data close to the CPU. Caches are fast because they are small (cache access: 1-3 CPU cycles; memory access: ~100 cycles).
Translation lookaside buffers: small hardware caches specialized for address translation.

Translation Lookaside Buffer (TLB)
Part of the chip's memory-management unit (MMU).
A hardware cache of popular virtual-to-physical address translations.
The figure shows address translation with the MMU: the CPU issues a logical address and the TLB is looked up; on a TLB hit the physical address is produced directly, while on a TLB miss the page table, which holds all virtual-to-physical entries (the TLB caches only the popular ones), must be consulted before physical memory is accessed.

TLB Basic Algorithm

1:  VPN = (VirtualAddress & VPN_MASK) >> SHIFT
2:  (Success, TlbEntry) = TLB_Lookup(VPN)
3:  if (Success == True) {   // TLB Hit
4:      if (CanAccess(TlbEntry.ProtectBits) == True) {
5:          Offset = VirtualAddress & OFFSET_MASK
6:          PhysAddr = (TlbEntry.PFN << SHIFT) | Offset
7:          AccessMemory(PhysAddr)
8:      } else RaiseException(PROTECTION_FAULT)

(line 1) Extract the virtual page number (VPN).
(line 2) Check whether the TLB holds the translation for this VPN.
(lines 5-7) Extract the page frame number from the matching TLB entry, form the desired physical address, and access memory.

TLB Basic Algorithm (Cont.)

11:     } else {   // TLB Miss
12:         PTEAddr = PTBR + (VPN * sizeof(PTE))
13:         PTE = AccessMemory(PTEAddr)
14:         (Check the valid bit and protection bits; RaiseException if needed)
15:         (If no exception, update the TLB)
16:         TLB_Insert(VPN, PTE.PFN, PTE.ProtectBits)
17:         RetryInstruction()
18:     }
19: }

(lines 12-13) The hardware accesses the page table to find the translation.
(line 16) Update the TLB with the translation.
(line 17) Retry the instruction; this time it hits in the TLB.

Example: Accessing an Array
How a TLB can improve performance. In the figure, a 10-element int array a[0..9] spans three 16-byte virtual pages: a[0]-a[2] on VPN 06, a[3]-a[6] on VPN 07, and a[7]-a[9] on VPN 08.

int sum = 0;
for (i = 0; i < 10; i++) {
    sum += a[i];
}

The loop's array accesses produce 3 TLB misses (the first touch of each page) and 7 hits, so the TLB hit rate is 70%. The TLB improves performance thanks to spatial locality.
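A back-of-the-envelope cost estimate for this 70% hit rate; the 1-cycle TLB-hit and 100-cycle page-table-walk costs are assumed for illustration and are not from the slide:

#include <stdio.h>

int main(void) {
    double hit_rate  = 0.7;     /* 7 hits out of 10 array accesses           */
    double hit_cost  = 1.0;     /* assumed: a TLB hit adds ~1 cycle          */
    double miss_cost = 100.0;   /* assumed: a page-table walk ~100 cycles    */
    double avg = hit_rate * hit_cost + (1.0 - hit_rate) * miss_cost;
    printf("average translation cost: %.1f cycles per access\n", avg);
    return 0;                   /* 0.7*1 + 0.3*100 = 30.7 cycles             */
}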

Locality
Temporal locality: an instruction or data item that has been accessed recently will likely be re-accessed soon.
Spatial locality: if a program accesses memory at address x, it will likely soon access memory near x.

Who Handles the TLB Miss?
Hardware-managed TLB (common in CISC architectures):
The hardware handles the TLB miss entirely. It has to know exactly where the page tables are located in memory, "walks" the page table, finds the correct page-table entry, extracts the desired translation, updates the TLB, and retries the instruction.
The OS must conform to the hardware's page-table design, losing flexibility.

Who Handles the TLB Miss? (Cont.)
Software-managed TLB (common in RISC architectures):
On a TLB miss, the hardware raises an exception and jumps to a trap handler: code within the OS written with the express purpose of handling TLB misses (see the sketch below).
Added flexibility at the cost of OS complexity, and reduced hardware complexity.
Requires new instructions: privileged (protected) instructions that allow the OS to modify the TLB.
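A rough sketch of what such an OS trap handler could look like. The types, helpers, and the simulated main below are hypothetical stand-ins, not a real kernel API:

#include <stdio.h>
#include <stdint.h>

/* Simplified stand-ins for kernel state. */
typedef struct { uint32_t pfn; int valid; int prot; } pte_t;
struct process { pte_t *page_table; unsigned asid; };

static void tlb_insert(uint32_t vpn, uint32_t pfn, int prot, unsigned asid) {
    printf("TLB insert: VPN %u -> PFN %u (prot %d, ASID %u)\n",
           (unsigned)vpn, (unsigned)pfn, prot, asid);
}

/* Conceptually invoked by the hardware on a TLB miss, with the faulting VPN. */
static void tlb_miss_handler(struct process *current, uint32_t vpn) {
    pte_t pte = current->page_table[vpn];        /* the OS walks its own table */
    if (!pte.valid)
        printf("segmentation fault: VPN %u has no mapping\n", (unsigned)vpn);
    else
        tlb_insert(vpn, pte.pfn, pte.prot, current->asid);
    /* Returning from the trap retries the faulting instruction, which then
       hits in the TLB. */
}

int main(void) {
    pte_t table[4] = { {3, 1, 7}, {7, 1, 7}, {5, 1, 7}, {0, 0, 0} };
    struct process p = { table, 1 };
    tlb_miss_handler(&p, 1);     /* valid mapping: inserted into the TLB    */
    tlb_miss_handler(&p, 3);     /* invalid page: would raise an exception  */
    return 0;
}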

TLB Control Flow Algorithm (OS-Handled)

VPN = (VirtualAddress & VPN_MASK) >> SHIFT
(Success, TlbEntry) = TLB_Lookup(VPN)
if (Success == True)                // TLB Hit
    if (CanAccess(TlbEntry.ProtectBits) == True)
        Offset = VirtualAddress & OFFSET_MASK
        PhysAddr = (TlbEntry.PFN << SHIFT) | Offset
        Register = AccessMemory(PhysAddr)
    else
        RaiseException(PROTECTION_FAULT)
else                                // TLB Miss
    RaiseException(TLB_MISS)

TLB Entry
The TLB is fully associative: a typical TLB might have 32, 64, or 128 entries, and the hardware searches the entire TLB in parallel to find the desired translation.
A typical TLB entry holds: VPN | PFN | other bits (valid bit, protection bits, address-space identifier, dirty bit).
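Software can only approximate the parallel hardware search with a loop over every entry; a minimal sketch with made-up types:

#include <stdint.h>
#include <stdbool.h>

#define TLB_ENTRIES 64

/* Illustrative entry: in a fully associative TLB, any entry may hold any VPN. */
struct tlb_entry {
    uint32_t vpn;
    uint32_t pfn;
    bool     valid;
    uint8_t  prot;   /* protection bits; dirty bit, ASID, etc. also live here */
};

static struct tlb_entry tlb[TLB_ENTRIES];

/* The hardware checks all entries at once; software models that as a scan. */
bool tlb_lookup(uint32_t vpn, uint32_t *pfn_out) {
    for (int i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn) {
            *pfn_out = tlb[i].pfn;
            return true;     /* TLB hit  */
        }
    }
    return false;            /* TLB miss */
}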

TLB Issue: Context Switching
Process A accesses its VPN 10, and the translation (VPN 10 -> PFN 100, valid, prot rwx) is inserted into the TLB.

TLB Issue: Context Switching
After a context switch, process B accesses its own VPN 10, and a second entry (VPN 10 -> PFN 170) is inserted into the TLB.

TLB Issue: Context Switching
The TLB now holds two entries for VPN 10 and can't distinguish which entry is meant for which process.

To Solve the Problem
Flush the TLB on a context switch, or provide an address-space identifier (ASID) field in each TLB entry.
With ASIDs both entries can coexist, e.g., (VPN 10, PFN 100, valid, rwx, ASID 1) for process A and (VPN 10, PFN 170, valid, rwx, ASID 2) for process B; the hardware matches only entries whose ASID equals that of the currently running process. See the sketch below.
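A compact sketch of both options, with made-up types and a current_asid variable standing in for whatever register the OS actually programs on a context switch:

#include <stdint.h>
#include <stdbool.h>

#define TLB_ENTRIES 64

struct tlb_entry { uint32_t vpn, pfn; bool valid; uint8_t prot, asid; };
static struct tlb_entry tlb[TLB_ENTRIES];
static uint8_t current_asid;   /* set by the OS on every context switch */

/* Option 1: flush everything on a context switch (simple, but each process
   restarts with a cold TLB). */
void tlb_flush(void) {
    for (int i = 0; i < TLB_ENTRIES; i++)
        tlb[i].valid = false;
}

/* Option 2: tag entries with an ASID and match only the running process,
   so entries from different processes can coexist. */
bool tlb_lookup_asid(uint32_t vpn, uint32_t *pfn_out) {
    for (int i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn && tlb[i].asid == current_asid) {
            *pfn_out = tlb[i].pfn;
            return true;
        }
    }
    return false;
}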

Another Case: Two Processes Share a Page
Process 1 shares physical page 101 with Process 2: P1 maps it as the 10th page of its address space, while P2 maps it as the 50th page of its address space, so the TLB holds (VPN 10, PFN 101, ASID 1) and (VPN 50, PFN 101, ASID 2).
Sharing pages is useful because it reduces the number of physical pages in use.

TLB Replacement Policy
LRU (Least Recently Used): evict the entry that has not been used recently, taking advantage of locality in the memory-reference stream.
The figure steps a 3-entry, LRU-managed TLB through a reference string (7, 1, 2, 3, 4, 2, 3, ...), showing the resident entries after each access; the run totals 11 TLB misses. A small sketch of LRU eviction follows.
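A minimal software model of LRU eviction for a tiny 3-entry TLB; the timestamp field stands in for whatever recency information the hardware actually tracks, and all names are illustrative:

#include <stdint.h>
#include <stdbool.h>

#define TLB_ENTRIES 3

struct tlb_entry { uint32_t vpn, pfn; bool valid; uint64_t last_used; };
static struct tlb_entry tlb[TLB_ENTRIES];
static uint64_t now;   /* logical clock, bumped on every access */

/* Returns the slot to fill: a free slot if one exists, else the LRU victim. */
static int lru_victim(void) {
    int victim = 0;
    for (int i = 0; i < TLB_ENTRIES; i++) {
        if (!tlb[i].valid)
            return i;                                  /* free slot           */
        if (tlb[i].last_used < tlb[victim].last_used)
            victim = i;                                /* older = less recent */
    }
    return victim;
}

/* Look up a VPN; on a miss, evict the LRU entry and install the new mapping. */
bool tlb_access(uint32_t vpn, uint32_t pfn_on_miss) {
    now++;
    for (int i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn) {
            tlb[i].last_used = now;                    /* hit: refresh recency */
            return true;
        }
    }
    int v = lru_victim();                              /* miss: replace LRU    */
    tlb[v] = (struct tlb_entry){ vpn, pfn_on_miss, true, now };
    return false;
}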

A Real TLB Entry (MIPS R4000)
All 64 bits of this TLB entry hold: VPN, G, ASID, PFN, C, D, V.
19-bit VPN: the rest of the virtual-address bits are reserved for the kernel.
24-bit PFN: systems can support up to 64 GB of main memory (2^24 4-KB pages).
Global bit (G): used for pages that are globally shared among processes.
ASID: the OS can use it to distinguish between address spaces.
Coherence bit (C): determines how a page is cached by the hardware.
Dirty bit (D): marks when the page has been written.
Valid bit (V): tells the hardware whether there is a valid translation present in the entry.

TLB Entry vs. PTE
Their contents are not the same. Example: valid bits.
PTE valid bit: has this page been allocated by the process? Accessing an invalid page causes an exception to be raised.
TLB valid bit: does this entry hold a valid translation? The OS may set all entries to invalid to flush the TLB; on startup, entries are invalid until populated.