Virtual Memory II CSE 351 Spring 2017 Instructor: Ruth Anderson Teaching Assistants: Dylan Johnson Kevin Bi Linxing Preston Jiang Cody Ohlsen Yufang Sun Joshua Curtis
Administrivia
- Homework 4 – Due this Friday 5/19 (cache questions)
- Lab 4 – Due Tuesday 5/23 (cache runtimes and parameter puzzles)
Virtual Memory (VM) Overview and motivation VM as a tool for caching Address translation VM as a tool for memory management VM as a tool for memory protection
Review: Terminology
- Context switch: switch between processes on the same CPU
- Page in: move pages of virtual memory from disk to physical memory
- Page out: move pages of virtual memory from physical memory to disk
- Thrashing: total working set size of processes is larger than physical memory, causing excessive paging in and out instead of useful computation
VM for Managing Multiple Processes
Key abstraction: each process has its own virtual address space
- It can view memory as a simple linear array
- With virtual memory, this simple linear virtual address space need not be contiguous in physical memory
- Process needs to store data in another VP? Just map it to any PP!
[Figure: virtual address spaces of Process 1 and Process 2 (addresses 0 to N-1) translated onto one physical address space (DRAM, addresses 0 to M-1); both processes map a virtual page onto PP 6, which holds read-only library code]
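The per-process mapping can be sketched as a toy simulation. All of the structures below (the two page tables, the 4 KiB page size, and the particular VP-to-PP mappings) are invented for illustration; they mirror the figure's idea that two processes can map a virtual page to the same physical page.

```python
# Each process has its own page table mapping virtual page numbers (VPNs)
# to physical page numbers (PPNs). Both processes map a VP to PP 6
# (e.g., shared read-only library code).
page_table_1 = {0: 2, 1: 6}   # Process 1: VP 0 -> PP 2, VP 1 -> PP 6
page_table_2 = {0: 6, 1: 8}   # Process 2: VP 0 -> PP 6, VP 1 -> PP 8

PAGE_SIZE = 4096  # hypothetical 4 KiB pages

def translate(page_table, vaddr):
    """Translate a virtual address through one process's page table."""
    vpn, vpo = divmod(vaddr, PAGE_SIZE)
    ppn = page_table[vpn]          # each process sees its own mapping
    return ppn * PAGE_SIZE + vpo   # page offset is unchanged by translation

# The same virtual address refers to different physical pages per process:
print(hex(translate(page_table_1, 0x0004)))  # 0x2004 (PP 2)
print(hex(translate(page_table_2, 0x0004)))  # 0x6004 (PP 6, shared page)
```

Note that Process 2's VP 0 and Process 1's VP 1 both land in PP 6: sharing falls out of the mapping for free.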
Simplifying Linking and Loading
Linking:
- Each program has a similar virtual address space
- Code, data, and heap always start at the same addresses
Loading:
- execve allocates virtual pages for the .text and .data sections and creates PTEs marked as invalid
- The .text and .data sections are copied, page by page, on demand by the virtual memory system
[Figure: canonical virtual address space layout, from 0x400000 upward: read-only segment (.init, .text, .rodata) and read/write segment (.data, .bss), both loaded from the executable file; run-time heap (created by malloc) growing up to brk; memory-mapped region for shared libraries; user stack (created at runtime) growing down from %rsp; kernel virtual memory at the top, invisible to user code]
VM for Protection and Sharing
The mapping of VPs to PPs provides a simple mechanism to protect memory and to share memory between processes
- Sharing: map virtual pages in separate address spaces to the same physical page (here: PP 6)
- Protection: a process can't access physical pages to which none of its virtual pages are mapped (here: Process 2 can't access PP 2)
[Figure: same two-process translation diagram as before; both processes map a virtual page to PP 6 (read-only library code), while only Process 1 has a mapping to PP 2]
Memory Protection Within Process
VM implements read/write/execute permissions:
- Extend page table entries with permission bits
- MMU checks these permission bits on every memory access
- If violated, MMU raises an exception and the OS sends a SIGSEGV signal to the process (segmentation fault)
Why no Dirty bit? Typically handled by the OS.
[Figure: page tables for Process i and Process j, each PTE holding Valid, READ, WRITE, and EXEC bits plus a PPN (Process i maps VPs to PP 2, PP 4, PP 6; Process j maps VPs to PP 6, PP 9, PP 11), all pointing into one physical address space]
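A minimal sketch of the permission check, assuming invented PTE contents and a software stand-in for the MMU's exception (the real check happens in hardware, and the OS then delivers SIGSEGV):

```python
# Hypothetical PTEs extended with permission bits, as on the slide.
class SegmentationFault(Exception):
    """Stands in for the MMU exception / SIGSEGV delivery."""

pte = {
    0: dict(valid=True, ppn=6, read=True, write=False, exec_=True),   # code page
    1: dict(valid=True, ppn=4, read=True, write=True,  exec_=False),  # data page
}

def access(vpn, kind):
    """Check permissions on every access; kind is 'read', 'write', or 'exec_'."""
    entry = pte.get(vpn)
    if entry is None or not entry["valid"] or not entry[kind]:
        raise SegmentationFault(f"{kind} of VP {vpn} not permitted")
    return entry["ppn"]

access(1, "write")         # OK: the data page is writable
try:
    access(0, "write")     # violates the read-only code page
except SegmentationFault as e:
    print("SIGSEGV:", e)
```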
Address Translation: Page Hit
1) Processor sends virtual address to MMU (memory management unit)
2-3) MMU fetches PTE from page table in cache/memory (uses PTBR to find the beginning of the page table for the current process)
4) MMU sends physical address to cache/memory, requesting data
5) Cache/memory sends data to processor
VA = virtual address; PTEA = page table entry address; PTE = page table entry; PA = physical address; Data = contents of memory stored at the VA originally requested by the CPU
[Figure: CPU chip containing CPU and MMU; arrows 1 (VA, CPU to MMU), 2 (PTEA, MMU to cache/memory), 3 (PTE, cache/memory to MMU), 4 (PA, MMU to cache/memory), 5 (Data, cache/memory to CPU)]
Address Translation: Page Fault
1) Processor sends virtual address to MMU
2-3) MMU fetches PTE from page table in cache/memory
4) Valid bit is zero, so MMU triggers page fault exception
5) Handler identifies a victim page (and, if it is dirty, pages it out to disk)
6) Handler pages in the new page and updates the PTE in memory
7) Handler returns to the original process, restarting the faulting instruction
[Figure: CPU chip (CPU, MMU), cache/memory, disk, and the page fault handler; arrows 1 (VA), 2 (PTEA), 3 (PTE), 4 (exception to handler), 5 (victim page out to disk), 6 (new page in from disk), 7 (return and restart)]
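The hit and fault paths above can be sketched together in a toy simulation. Everything here is hypothetical (the page table contents, the victim-selection policy, and the absence of a real disk); it only shows the shape of steps 4-7: detect an invalid PTE, evict a victim, page in, fix the PTE, retry.

```python
# PTE is [valid, ppn]; VP 1 starts out not resident.
PAGE_SIZE = 4096
page_table = {0: [True, 3], 1: [False, None], 2: [True, 5]}
free_ppns = []            # no free frames, so a victim must be chosen
resident = {0: 3, 2: 5}   # VPN -> PPN currently in DRAM

def handle_page_fault(vpn):
    """Steps 5-6: pick a victim, page it out, page the new page in."""
    if free_ppns:
        ppn = free_ppns.pop()
    else:
        victim_vpn, ppn = next(iter(resident.items()))  # naive victim choice
        page_table[victim_vpn] = [False, None]          # page out the victim
        del resident[victim_vpn]
    page_table[vpn] = [True, ppn]   # "read from disk", then update the PTE
    resident[vpn] = ppn

def translate(vaddr):
    vpn, vpo = divmod(vaddr, PAGE_SIZE)
    valid, ppn = page_table[vpn]
    if not valid:
        handle_page_fault(vpn)          # step 4: fault; steps 5-6: handler
        valid, ppn = page_table[vpn]    # step 7: retry the translation
    return ppn * PAGE_SIZE + vpo

print(hex(translate(0x1ABC)))  # 0x3abc: VP 1 faults, reclaims PP 3 from VP 0
```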
Hmm… Translation Sounds Slow
The MMU accesses memory twice: once to get the PTE for translation, and then again for the actual memory request
- The PTEs may be cached in L1 like any other memory word
- But they may be evicted by other data references
- And a hit in the L1 cache still requires 1-3 cycles
What can we do to make this faster? Solution: add another cache!
Speeding up Translation with a TLB
Translation Lookaside Buffer (TLB):
- Small hardware cache in the MMU
- Maps virtual page numbers to physical page numbers
- Contains complete page table entries for a small number of pages
- Modern Intel processors have 128 or 256 entries in the TLB
- Much faster than a page table lookup in cache/memory
History: "The architecture of the IBM System/370," R. P. Case and A. Padegs, Communications of the ACM, 21:1, 73-96, January 1978 — perhaps the first paper to use the term translation lookaside buffer. The name arises from the historical name for a cache, a "lookaside buffer," as called by those developing the Atlas system at the University of Manchester; a cache of address translations thus became a translation lookaside buffer. Even though the term lookaside buffer fell out of favor, TLB seems to have stuck.
Example ("x86info -c" on an Intel Core2 Duo CPU):
  L1 Data TLB: 4KB pages, 4-way set associative, 16 entries
  Data TLB: 4K pages, 4-way associative, 256 entries
  L1 Data TLB: 4MB pages, 4-way set associative, 16 entries
  Data TLB: 4MB pages, 4-way associative, 32 entries
TLB Hit
A TLB hit eliminates a memory access!
[Figure: CPU sends VA (1) to the MMU; MMU extracts the VPN (2) and looks it up in the TLB, which returns the PTE (3); MMU sends the PA (4) to cache/memory, which returns Data (5) to the CPU]
TLB Miss
A TLB miss incurs an additional memory access (the PTE)
- Fortunately, TLB misses are rare
- Does a TLB miss require disk access?
[Figure: CPU sends VA (1); MMU checks the TLB with the VPN (2) and misses; MMU fetches the PTE via its address PTEA (3) from cache/memory, receives the PTE (4) and loads it into the TLB; MMU then sends the PA (5), and cache/memory returns Data (6)]
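The hit/miss distinction can be sketched with a tiny direct-mapped TLB. The sizes and field layout here are invented (real TLBs are set-associative, per the x86info output earlier); the point is only how the VPN splits into a TLB index (low bits) and a TLB tag (high bits).

```python
TLB_SETS = 16                       # hypothetical: TLBI is the low 4 VPN bits
tlb = [None] * TLB_SETS             # each entry holds (tag, ppn) or None

def tlb_lookup(vpn):
    idx, tag = vpn % TLB_SETS, vpn // TLB_SETS
    entry = tlb[idx]
    if entry is not None and entry[0] == tag:
        return entry[1]             # TLB hit: no memory access for the PTE
    return None                     # TLB miss: must fetch the PTE from memory

def tlb_fill(vpn, ppn):
    """Install a translation after the page-table lookup on a miss."""
    idx, tag = vpn % TLB_SETS, vpn // TLB_SETS
    tlb[idx] = (tag, ppn)

assert tlb_lookup(0x2A) is None     # cold miss
tlb_fill(0x2A, 7)                   # PTE fetched from memory, cached in TLB
assert tlb_lookup(0x2A) == 7        # subsequent accesses hit
```

Note that VPN 0x3A maps to the same set (index 10) with a different tag, so it would miss and, on fill, evict the 0x2A entry.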
Fetching Data on a Memory Read
1) Check TLB (input: VPN; output: PPN)
- TLB hit: fetch translation, return PPN
- TLB miss: check page table (in memory)
  - Page table hit: load page table entry into TLB
  - Page fault: fetch page from disk into memory, update the corresponding page table entry, then load the entry into the TLB
2) Check cache (input: physical address; output: data)
- Cache hit: return data value to processor
- Cache miss: fetch data value from memory, store it in the cache, return it to the processor
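The two-step read path above can be sketched end to end. All structures here are invented stand-ins (a fully associative TLB, a page table that simply raises on a missing VPN rather than going to disk, and dictionaries for cache and memory):

```python
PAGE = 4096
tlb = {}                      # VPN -> PPN (fully associative, for simplicity)
page_table = {0: 2, 1: 6}     # VPN -> PPN; a missing VPN means a page fault
memory = {}                   # PA -> data
cache = {}                    # PA -> data, physically addressed

def load(vaddr):
    # Step 1: check the TLB
    vpn, vpo = divmod(vaddr, PAGE)
    ppn = tlb.get(vpn)
    if ppn is None:                   # TLB miss: check the page table
        if vpn not in page_table:
            raise RuntimeError("page fault: OS would page in from disk here")
        ppn = page_table[vpn]
        tlb[vpn] = ppn                # load the entry into the TLB
    # Step 2: check the cache with the physical address
    pa = ppn * PAGE + vpo
    if pa not in cache:               # cache miss: fetch from memory
        cache[pa] = memory.get(pa, 0)
    return cache[pa]                  # cache hit returns directly

memory[6 * PAGE + 0x10] = 42
print(load(1 * PAGE + 0x10))  # 42: TLB miss + cache miss on the first access
print(load(1 * PAGE + 0x10))  # 42: TLB hit + cache hit on the second access
```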
Address Translation (overall flow)
[Flowchart: Virtual Address → TLB Lookup.
- TLB hit → Protection Check: Access Permitted → Physical Address → check cache; Access Denied → Protection Fault → SIGSEGV
- TLB miss → Page Table "Walk": Page in Mem → Protection Check, Update TLB, restart instruction (a "soft" page fault); Page not in Mem → Page Fault → OS finds the page on disk and loads it into memory (a "hard" page fault), then the instruction restarts]
In both fault cases the faulting instruction must be restarted.
Context Switching Revisited
What needs to happen when the CPU switches processes?
- Registers: save state of the old process, load state of the new process, including the Page Table Base Register (PTBR)
- Memory: nothing to do! Pages for the processes already exist in memory/disk and are protected from each other
- TLB: invalidate all entries in the TLB, since its mappings are for the old process's VAs
- Cache: can leave alone because it stores data based on PAs (good for shared data)
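The bookkeeping on this slide can be sketched as follows. The classes and fields are hypothetical; the point is which state is swapped (registers, PTBR), which is flushed (TLB), and which is untouched (the physically addressed cache).

```python
class Process:
    def __init__(self, page_table_base):
        self.saved_regs = {}
        self.page_table_base = page_table_base

class CPU:
    def __init__(self):
        self.regs = {}
        self.ptbr = None    # Page Table Base Register
        self.tlb = {}       # VPN -> PPN, valid only for the current process
        self.cache = {}     # PA -> data: left alone across switches

    def context_switch(self, old, new):
        old.saved_regs = dict(self.regs)   # save old process state
        self.regs = dict(new.saved_regs)   # load new process state
        self.ptbr = new.page_table_base    # MMU now walks the new page table
        self.tlb.clear()                   # invalidate all TLB entries
        # self.cache is untouched: it is indexed by physical addresses

cpu = CPU()
p1, p2 = Process(0x1000), Process(0x2000)
cpu.ptbr, cpu.tlb[3] = p1.page_table_base, 8   # p1 running, one TLB entry
cpu.context_switch(p1, p2)
assert cpu.ptbr == 0x2000 and cpu.tlb == {}    # stale mapping is gone
```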
Summary of Address Translation Symbols
Basic parameters:
- N = 2^n : number of addresses in the virtual address space
- M = 2^m : number of addresses in the physical address space
- P = 2^p : page size (bytes)
Components of the virtual address (VA):
- VPO: virtual page offset
- VPN: virtual page number
- TLBI: TLB index
- TLBT: TLB tag
Components of the physical address (PA):
- PPO: physical page offset (same as VPO)
- PPN: physical page number
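A worked example of these parameters, with sizes chosen purely for illustration (16-bit virtual addresses, 14-bit physical addresses, 64-byte pages):

```python
n, m, p = 16, 14, 6   # hypothetical sizes, not from any real machine

N = 2 ** n            # 65536 virtual addresses
M = 2 ** m            # 16384 physical addresses
P = 2 ** p            # 64-byte pages

def split_va(va):
    """Split a virtual address into (VPN, VPO)."""
    vpn = va >> p           # high n - p bits
    vpo = va & (P - 1)      # low p bits; the PPO equals the VPO
    return vpn, vpo

# 0x0A6C = 0b0000101001_101100: VPN is 0x29, VPO is 0x2C
print(split_va(0x0A6C))     # (0x29, 0x2C)
```

The PA is then formed as PPN * P + PPO, where the PPN comes from the page table (or TLB) and the PPO is copied straight from the VPO.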