Lecture 6 Memory Management


Virtual Memory Approaches
- Time Sharing: one process uses RAM at a time
- Static Relocation: statically rewrite code before run
- Base: add a base to the virtual address to get the physical address
- Base+Bounds: also check that the physical address is in range
- Segmentation: many base+bounds pairs
- Paging: divide memory into small, fixed-size page frames
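As a minimal sketch of the Base and Base+Bounds items, the Python below relocates a virtual address and then checks the result against a range. The function name `translate_base_bounds`, the load address, and the region size are illustrative assumptions, not values from the slides.

```python
def translate_base_bounds(vaddr, base, bound):
    """Relocate vaddr by base; reject it if the physical address leaves [base, bound)."""
    paddr = base + vaddr
    if not (base <= paddr < bound):
        raise MemoryError(f"address 0x{vaddr:x} is out of bounds")
    return paddr

# Hypothetical process loaded at 16 KB with a 4 KB region.
BASE = 16 * 1024
BOUND = BASE + 4 * 1024
print(hex(translate_base_bounds(0x0010, BASE, BOUND)))  # 0x4010
```

Segmentation repeats this check with one base and bounds pair per segment.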

[Figure: address-space layout. Virtual addresses (VirtA) 0KB, 4KB, 8KB, 12KB, 16KB hold Code, Heap, and Stack; physical addresses (PhysA) are marked at 16KB, 20KB, 24KB, 26KB, 28KB, 32KB.]

Segment   Base   Size   Positive?
Code 0    16KB   4KB    1
Heap 1    22KB
Stack 2   28KB   2KB

Program:
0x0010 movl 0x2900, %r8d
0x0014 addl $0x3, %r8d
0x0017 movl %r8d, 0x2900

Memory accesses:
Fetch instruction at 0x4010
Exec, load from 0x6900
Fetch instruction at 0x4014
Fetch instruction at 0x4017
Exec, store to 0x6900

How to put offset

Paging Introduction
Memory divided into small, fixed-size page frames
Each virtual page is mapped independently
Each process has its own page table stored in memory
Flexible Addr Space
- don’t need to find contiguous RAM
- doesn’t waste whole data pages (valid bit)
Easy to manage
- fixed size pages
- simple free list for unused pages
- no need to coalesce

Page Mapping with Linear Page Table
[Figure: the virtual memory (VirtMem) of processes P1, P2, and P3 is mapped through per-process page tables onto physical memory (PhysMem) frames 0 to 11. Labels: VA size, PA size, Page Tables.]

Where are Page Tables Stored?
The size of a typical page table?
- assume 32-bit address space
- assume 4 KB pages
- assume 4 byte entries (or this could be less)
- 2^(32 - log2(4 KB)) * 4 bytes = 2^20 * 4 bytes = 4 MB
Store in memory, and CPU finds it via registers
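The arithmetic on this slide can be checked with a short Python sketch. The helper name `linear_page_table_bytes` is an illustrative assumption, and the page size is assumed to be a power of two.

```python
def linear_page_table_bytes(address_bits, page_size, pte_size):
    """Size in bytes of a linear page table with one PTE per virtual page."""
    offset_bits = page_size.bit_length() - 1   # log2(page_size) for a power of two
    num_pages = 2 ** (address_bits - offset_bits)
    return num_pages * pte_size

size = linear_page_table_bytes(address_bits=32, page_size=4 * 1024, pte_size=4)
print(size // (1024 * 1024), "MB")  # 4 MB
```

Every process has its own table, so this cost is paid once per process.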

offset = VirtualAddress & OFFSET_MASK
PhysAddr = (PFN << SHIFT) | offset

// Extract the VPN from the virtual address
VPN = (VirtualAddress & VPN_MASK) >> SHIFT

// Form the address of the page-table entry (PTE)
PTEAddr = PTBR + (VPN * sizeof(PTE))

// Fetch the PTE
PTE = AccessMemory(PTEAddr)

// Check if process can access the page
if (PTE.Valid == False)
    RaiseException(SEGMENTATION_FAULT)
else if (CanAccess(PTE.ProtectBits) == False)
    RaiseException(PROTECTION_FAULT)
else
    // Access is OK: form physical address and fetch it
    offset = VirtualAddress & OFFSET_MASK
    PhysAddr = (PTE.PFN << PFN_SHIFT) | offset
    Register = AccessMemory(PhysAddr)
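The pseudocode above can be approximated in runnable Python. This is only a sketch under stated assumptions: 4 KB pages, a page table modeled as a Python list indexed by VPN, a single `writable` flag standing in for the protection bits, and hypothetical names (`PTE`, `translate`, and the two exception classes).

```python
from dataclasses import dataclass

PAGE_SIZE = 4096                      # assumed 4 KB pages
SHIFT = 12                            # log2(PAGE_SIZE)
OFFSET_MASK = PAGE_SIZE - 1

@dataclass
class PTE:
    pfn: int
    valid: bool = True
    writable: bool = True             # simplified stand-in for ProtectBits

class SegmentationFault(Exception):
    pass

class ProtectionFault(Exception):
    pass

def translate(page_table, vaddr, write=False):
    """Walk a linear page table (a list indexed by VPN) and return the physical address."""
    vpn = vaddr >> SHIFT
    if vpn >= len(page_table) or not page_table[vpn].valid:
        raise SegmentationFault(hex(vaddr))
    pte = page_table[vpn]
    if write and not pte.writable:
        raise ProtectionFault(hex(vaddr))
    return (pte.pfn << SHIFT) | (vaddr & OFFSET_MASK)

# Hypothetical mapping: VPN 0 -> PFN 2 (read-only), VPN 1 -> PFN 0.
table = [PTE(pfn=2, writable=False), PTE(pfn=0)]
print(hex(translate(table, 0x0010)))  # 0x2010
print(hex(translate(table, 0x1100)))  # 0x100
```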

Memory Accesses Again
Assume 4KB pages
Assume PTBR is 0x5000
Assume PTEs are 4 bytes
[Figure: page table (PT) with entries 2, 80, 99]

Program:
0x0010 movl 0x1100, %r8d
0x0014 addl $0x3, %r8d
0x0017 movl %r8d, 0x1100

Memory accesses:
PT, load from 0x5000
Fetch instruction at 0x2010
PT, load from 0x5004
Exec, load from 0x0100
…

TOO SLOW
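To see why this is too slow, the sketch below replays the three instructions and records every physical memory access. The PTBR, PTE size, and page size come from the slide. The mapping VPN 0 to PFN 2 and VPN 1 to PFN 0 is an inference from the addresses shown (fetch at 0x2010, load from 0x0100), and the helper `access` is a hypothetical name.

```python
PTBR = 0x5000        # page-table base register, from the slide
PTE_SIZE = 4         # bytes per PTE, from the slide
SHIFT = 12           # 4 KB pages
OFFSET_MASK = 0xFFF

# VPN -> PFN. Both entries are inferred from the addresses on the slide.
page_table = {0: 2, 1: 0}

def access(vaddr, kind, trace):
    """Append the page-table load and then the real access for one virtual reference."""
    vpn = vaddr >> SHIFT
    trace.append(f"PT, load from 0x{PTBR + vpn * PTE_SIZE:04x}")
    paddr = (page_table[vpn] << SHIFT) | (vaddr & OFFSET_MASK)
    trace.append(f"{kind} 0x{paddr:04x}")

trace = []
access(0x0010, "Fetch instruction at", trace)   # movl 0x1100, %r8d
access(0x1100, "Exec, load from", trace)
access(0x0014, "Fetch instruction at", trace)   # addl $0x3, %r8d
access(0x0017, "Fetch instruction at", trace)   # movl %r8d, 0x1100
access(0x1100, "Exec, store to", trace)

print("\n".join(trace))
print(len(trace), "memory accesses for 5 virtual references")
```

Each virtual reference costs two physical accesses, which doubles memory traffic.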

Basic Paging
Flexible Addr Space
- don’t need to find contiguous RAM
- doesn’t waste whole data pages (valid bit)
Easy to manage
- fixed size pages
- simple free list for unused pages
- no need to coalesce
Too slow
Too big

Translation Steps
H/W: for each mem reference:
1. extract VPN (virt page num) from VA (virt addr)
2. calculate addr of PTE (page table entry)
3. fetch PTE
4. extract PFN (page frame num)
5. build PA (phys addr)
6. fetch PA to register

Array Iterator
int sum = 0;
for (i = 0; i < 10; i++) {
    sum += a[i];
}
What’s the virtual address for a[0]? 0x604. And the following accesses?
PTBR+24, 0xA04
PTBR+24, 0xA08
PTBR+24, 0xA0C

Basic strategy
Take advantage of repetition. Use a CPU cache.
[Figure: CPU containing the TLB, RAM holding the PT. Label: ATC?]

TLB Cache Type
Fully-Associative: entries can go anywhere
- most common for TLBs
- must store whole key/value in cache
- search all in parallel
There are other general cache types

TLB Contents
VPN | PFN | other bits
- TLB valid bit: whether the entry has a valid translation
- TLB protection bits: rwx
- Address Space Identifier
- TLB dirty bit

A MIPS TLB Entry
- 19 bits for the VPN; as it turns out, user addresses will only come from half the address space (the rest reserved for the kernel)
- up to a 24-bit physical frame number (PFN), and hence can support systems with up to 64GB of (physical) main memory (2^24 4KB pages)
- a global bit (G), which is used for pages that are globally-shared among processes. Thus, if the global bit is set, the ASID is ignored
Why not really big pages?
Too much memory to store page tables
- assume 32-bit address space
- assume 4 KB pages
- assume 4 byte entries (or this could be less)
- 2^(32 - log2(4 KB)) * 4 bytes = 4 MB

VPN = (VirtualAddress & VPN_MASK) >> SHIFT
(Success, TlbEntry) = TLB_Lookup(VPN)
if (Success == True)    // TLB Hit
    if (CanAccess(TlbEntry.ProtectBits) == True)
        Offset = VirtualAddress & OFFSET_MASK
        PhysAddr = (TlbEntry.PFN << SHIFT) | Offset
        AccessMemory(PhysAddr)
    else
        RaiseException(PROTECTION_FAULT)
else                    // TLB Miss
    PTEAddr = PTBR + (VPN * sizeof(PTE))
    PTE = AccessMemory(PTEAddr)
    if (PTE.Valid == False)
        RaiseException(SEGMENTATION_FAULT)
    else if (CanAccess(PTE.ProtectBits) == False)
        RaiseException(PROTECTION_FAULT)
    else
        TLB_Insert(VPN, PTE.PFN, PTE.ProtectBits)
        RetryInstruction()
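A rough Python model of this hardware-managed control flow follows. It is a sketch under assumptions: the TLB and page table are plain dictionaries from VPN to PFN, the TLB never evicts, protection bits are omitted, and the names `translate` and `stats` are hypothetical. The retry is modeled by falling through to the hit path after the insert.

```python
SHIFT = 12
OFFSET_MASK = 0xFFF

class SegmentationFault(Exception):
    pass

def translate(vaddr, tlb, page_table, stats):
    """Return the physical address, filling the TLB from the page table on a miss."""
    vpn = vaddr >> SHIFT
    if vpn in tlb:                       # TLB hit: no page-table access needed
        stats["hits"] += 1
    else:                                # TLB miss: consult the page table
        stats["misses"] += 1
        if vpn not in page_table:
            raise SegmentationFault(hex(vaddr))
        tlb[vpn] = page_table[vpn]       # TLB_Insert; the lookup below now succeeds
    return (tlb[vpn] << SHIFT) | (vaddr & OFFSET_MASK)

page_table = {0: 2, 1: 0}   # hypothetical VPN -> PFN mapping
tlb, stats = {}, {"hits": 0, "misses": 0}
for va in (0x0010, 0x1100, 0x0014, 0x0017, 0x1100):
    translate(va, tlb, page_table, stats)
print(stats)  # {'hits': 3, 'misses': 2}
```

Only the two first-touch references pay for a page-table access; the rest are served from the TLB.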

Array Iterator with TLB
int sum = 0;
for (i = 0; i < 10; i++) {
    sum += a[i];
}
How many TLB hits? How many TLB misses? Hit rate? Miss rate?
[Figure: empty TLB with columns Valid, VPN, PFN]
What’s the virtual address for a[0]? 0x604. And the following accesses?
PTBR+24, 0xA04
PTBR+24, 0xA08
PTBR+24, 0xA0C
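One way to answer the hit and miss questions is to count distinct pages touched by the array walk. The sketch below assumes 4-byte ints, an initially empty TLB that never evicts, and counts only the accesses to `a[i]` (not instruction fetches or accesses to `sum` and `i`). The slides do not state the page size; 256 bytes is an inference from PTBR+24 with 4-byte PTEs (VPN 6 for address 0x604), and 16 bytes is shown purely for contrast. The function name `tlb_stats` is hypothetical.

```python
def tlb_stats(base, count, elem_size, page_size):
    """Count TLB hits and misses for a sequential array walk, assuming an empty TLB
    that is large enough to keep every page it sees."""
    seen = set()
    hits = misses = 0
    for i in range(count):
        vpn = (base + i * elem_size) // page_size
        if vpn in seen:
            hits += 1
        else:
            misses += 1
            seen.add(vpn)
    return hits, misses

print(tlb_stats(base=0x604, count=10, elem_size=4, page_size=256))  # (9, 1)
print(tlb_stats(base=0x604, count=10, elem_size=4, page_size=16))   # (7, 3)
```

Larger pages mean fewer first-touch misses for the same array, which is spatial locality at work.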

Reasoning about TLB
Workload: series of load/store accesses
TLB: chooses entries to store in CPU
Metric: performance (i.e., hit rate)

TLB Workloads
Spatial locality
- Sequential array accesses can almost always hit in the TLB, and so are very fast!
Temporal locality
What pattern would be slow?
- highly random, with no repeat accesses

TLB Replacement Policies
LRU: evict the least-recently used entry when a TLB slot is needed
Random: randomly choose entries to evict
When is each better?
Sometimes random is better than a “smart” policy!
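A case where random beats LRU is a loop over one more page than the TLB can hold. The sketch below is a small simulation under assumptions chosen for illustration: a 4-entry TLB, a workload that cycles over 5 pages, and hypothetical function names `hits_lru` and `hits_random`.

```python
import random
from collections import OrderedDict

def hits_lru(trace, capacity):
    """Count hits for a TLB of the given capacity using LRU replacement."""
    tlb, hits = OrderedDict(), 0
    for vpn in trace:
        if vpn in tlb:
            hits += 1
            tlb.move_to_end(vpn)          # mark as most recently used
        else:
            if len(tlb) == capacity:
                tlb.popitem(last=False)   # evict the least recently used entry
            tlb[vpn] = True
    return hits

def hits_random(trace, capacity, rng):
    """Count hits for the same TLB using random replacement."""
    tlb, hits = set(), 0
    for vpn in trace:
        if vpn in tlb:
            hits += 1
        else:
            if len(tlb) == capacity:
                tlb.remove(rng.choice(sorted(tlb)))   # evict a random entry
            tlb.add(vpn)
    return hits

# Loop repeatedly over 5 pages with a 4-entry TLB.
trace = [vpn for _ in range(200) for vpn in range(5)]
lru = hits_lru(trace, capacity=4)
rnd = hits_random(trace, capacity=4, rng=random.Random(0))
print(lru, rnd)   # LRU gets 0 hits here; random gets some
```

LRU always evicts exactly the page that is needed next in this pattern, so every access misses, while random eviction sometimes keeps it.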

Who Handles The TLB Miss?
H/W or OS?
H/W: CPU must know where page tables are
- CR3 on x86
- Page table structure not flexible
OS: CPU traps into OS upon TLB miss

VPN = (VirtualAddress & VPN_MASK) >> SHIFT
(Success, TlbEntry) = TLB_Lookup(VPN)
if (Success == True)    // TLB Hit
    if (CanAccess(TlbEntry.ProtectBits) == True)
        Offset = VirtualAddress & OFFSET_MASK
        PhysAddr = (TlbEntry.PFN << SHIFT) | Offset
        AccessMemory(PhysAddr)
    else
        RaiseException(PROTECTION_FAULT)
else                    // TLB Miss
    RaiseException(TLB_MISS)

OS TLB Miss Handler
- check page table for page table entry
- if valid, extract PFN and update TLB with a special instruction
- return from trap
OS: CPU traps into OS upon TLB miss
- where to resume execution?
- how to avoid double traps?
Modifying TLB entries is privileged
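The software-managed path can be sketched by modeling the trap as a Python exception. Everything here is an illustrative assumption rather than a real OS interface: the `TLBMiss` exception stands in for the hardware trap, dictionaries stand in for the TLB and page table, and the function names are hypothetical.

```python
SHIFT = 12
OFFSET_MASK = 0xFFF

class TLBMiss(Exception):
    """Stands in for the hardware trap raised on a TLB miss."""
    def __init__(self, vpn):
        super().__init__(vpn)
        self.vpn = vpn

def hardware_access(vaddr, tlb):
    """Hardware path: translate on a hit, trap on a miss."""
    vpn = vaddr >> SHIFT
    if vpn not in tlb:
        raise TLBMiss(vpn)
    return (tlb[vpn] << SHIFT) | (vaddr & OFFSET_MASK)

def os_tlb_miss_handler(vpn, tlb, page_table):
    """OS path: look up the PTE and, if valid, install the translation."""
    if vpn not in page_table:
        raise MemoryError(f"segmentation fault on VPN {vpn}")
    tlb[vpn] = page_table[vpn]        # models the privileged TLB-update instruction

def run(vaddr, tlb, page_table):
    """Execute one reference, retrying the same instruction after a trap."""
    try:
        return hardware_access(vaddr, tlb)
    except TLBMiss as trap:
        os_tlb_miss_handler(trap.vpn, tlb, page_table)
        return hardware_access(vaddr, tlb)   # resume at the faulting instruction

tlb = {}
print(hex(run(0x1100, tlb, {0: 2, 1: 0})))  # 0x100
```

The retry re-executes the instruction that trapped rather than the one after it, which is the answer to the question of where to resume execution.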

Context Switches
What happens if a process uses the cached TLB entries from another process?
Solutions?
- flush TLB on each switch
- remember which entries are for each process: Address Space Identifier

Address Space Identifier
Tag each TLB entry with an 8-bit ASID
- how many ASIDs do we get?
- why not use PIDs?
- what if there are more PIDs than ASIDs?
[Figure: page tables for P1 (ASID 11) and P2 (ASID 12), and a TLB with columns valid, VPN, PFN, ASID]
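A minimal sketch of ASID tagging: entries are keyed by the pair (ASID, VPN), so two processes can cache translations for the same VPN without a flush. The class name `TaggedTLB` and the PFN values are illustrative assumptions; only the ASIDs 11 and 12 and the 8-bit width come from the slide.

```python
ASID_BITS = 8
print(2 ** ASID_BITS)  # 256 distinct ASIDs

class TaggedTLB:
    """A TLB whose entries are keyed by (ASID, VPN) so processes can share it."""
    def __init__(self):
        self.entries = {}

    def insert(self, asid, vpn, pfn):
        self.entries[(asid, vpn)] = pfn

    def lookup(self, asid, vpn):
        """Return the PFN, or None on a miss for this address space."""
        return self.entries.get((asid, vpn))

tlb = TaggedTLB()
tlb.insert(asid=11, vpn=1, pfn=4)   # hypothetical entry for P1
tlb.insert(asid=12, vpn=1, pfn=7)   # same VPN, different process (P2)
print(tlb.lookup(11, 1), tlb.lookup(12, 1), tlb.lookup(12, 3))  # 4 7 None
```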

Next time: solving the too big problems