Revisiting Hardware-Assisted Page Walks for Virtualized Systems


Revisiting Hardware-Assisted Page Walks for Virtualized Systems
Jeongseob Ahn, Seongwook Jin, and Jaehyuk Huh
Computer Science Department, KAIST

System Virtualization
Widely used for cloud computing as well as server consolidation. The hypervisor manages resources such as CPU, memory, and I/O across virtual machines.
[Figure: VMs (OS + applications) running on a hypervisor, which maps virtual resources onto the physical system]

Address Translation for VMs
Virtualization requires two-level address translation to provide an isolated address space for each VM. Guest page tables (per process) translate guest virtual addresses (gVA) to guest physical addresses (gPA); nested page tables (per VM) translate guest physical addresses to system physical addresses (sPA).
[Figure: gVPN of the guest virtual address walked through the guest page table to a gPPN; the guest physical address then translated by the nested page table]

Does it make sense?
The memory required to hold page tables can be large: a 48-bit address space with 4KB pages has 2^48 / 2^12 = 2^36 pages, so a flat page table would take 2^36 x 2^3 B = 2^39 B = 512GB, and a page table is required for each process. x86-64 therefore uses a multi-level page table: the VPN (bits 47-12 of the VA) is split into four indices (L4, L3, L2, L1) walked from the base address in CR3, with bits 11-0 as the page offset.
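The arithmetic on this slide can be checked with a short sketch; the four 9-bit index split is the standard x86-64 4-level layout:

```c
#include <assert.h>
#include <stdint.h>

/* Pages in a 48-bit VA space with 4KB pages: 2^48 / 2^12 = 2^36. */
#define NUM_PAGES  (1ULL << (48 - 12))
/* A flat table with one 8-byte entry per page: 2^36 * 2^3 = 2^39 B. */
#define FLAT_TABLE_BYTES  (NUM_PAGES * 8ULL)

/* x86-64 splits the 36-bit VPN into four 9-bit indices (L4..L1),
 * walked top-down from the base address held in CR3. */
static unsigned pt_index(uint64_t va, int level)   /* level: 4..1 */
{
    return (unsigned)((va >> (12 + 9 * (level - 1))) & 0x1FF);
}
```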

Address Translation for VMs
Virtualization requires two-level address translation to provide an isolated address space for each VM. Current hardware page walkers assume the same multi-level organization for guest and nested page tables (AMD Nested Paging, Intel Extended Page Tables).
[Figure: guest page table (per process) maps guest virtual to guest physical addresses; nested page table (per VM) maps guest physical to system physical addresses]

Hardware-Assisted Page Walks
Two-dimensional page walks for virtualized systems: each step of the guest page walk produces a guest physical address, which must itself be translated by a full nested walk (nL4-nL1) before the next guest level (gL4-gL1) can be read.
[Figure: the 24-step 2D page walk on x86-64, interleaving nested walks (steps 1-4, 6-9, 11-14, 16-19, 21-24) with guest walk steps (5, 10, 15, 20); gVA: guest virtual address, gPA: guest physical address, sPA: system physical address]

Hardware-Assisted Page Walks
With m guest page-table levels and n nested levels, a 2D walk takes mn + m + n memory references: a nested walk (n references) to translate the pointer to each of the m guest levels, the m guest references themselves, and a final nested walk for the data page's gPA. On x86-64: 4x4 + 4 + 4 = 24.
Can we simplify the nested page tables?
[Figure: the same 24-step 2D page walk as on the previous slide]
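The reference count on this slide can be written as a one-line formula:

```c
#include <assert.h>

/* Memory references in a 2D page walk with m guest levels and n
 * nested levels: n nested references to translate the pointer to
 * each of the m guest levels, the m guest references themselves,
 * and a final n-reference nested walk for the data page's gPA. */
static int walk_refs_2d(int m, int n)
{
    return m * n + m + n;
}
```

Setting n = 1 (a single-level nested table) is what motivates the flat nested page table on the following slides.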

Revisiting Nested Page Tables
# of virtual machines < # of processes: there are 106 processes in a Linux system right after booting. The address spaces of VMs and processes also differ: virtual machines use much of their guest physical memory, while processes use only a tiny fraction of their virtual address space. We exploit these characteristics of VM memory management.
[Figure: guest physical address space, e.g. 32-bit for a 4GB VM, vs. the 48-bit guest virtual address space]

Revisiting Nested Page Tables (cont.)
Multi-level nested page tables are not necessary!

Flat Nested Page Table
A single-level table indexed directly by the gPPN of the guest physical address, starting at the nested base address (nCR3). This reduces the number of memory references for a nested lookup to one.
[Figure: the gPPN indexes the flat nested page table to locate the frame in physical memory]

Flat Nested Page Table: Memory Consumption

                       Process                        Virtual machine (4GB)
# of pages             2^48 / 4KB = 68,719,476,736    2^32 / 4KB = 1,048,576
Flat page table size   # of pages x 8B = 512GB        # of pages x 8B = 8MB
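Both columns of the table come from the same formula, sketched below:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12   /* 4KB pages */

/* Bytes needed for a flat page table covering `addr_bits` of address
 * space: one 8-byte entry per 4KB page. A 48-bit per-process space
 * needs an infeasible 512GB; a 4GB (32-bit) VM needs only 8MB. */
static uint64_t flat_table_bytes(int addr_bits)
{
    return (1ULL << (addr_bits - PAGE_SHIFT)) * 8ULL;
}
```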

Page Walks with a Flat Nested Page Table
Each nested lookup now takes a single reference, so a full 2D walk needs 4x1 + 4 + 1 = 9 memory references instead of the current 24, removing 15 references.
[Figure: the 9-step walk with the flat nested table vs. the 24-step walk with 4-level nested tables]

Does it make sense?
The flat nested table cannot reduce the walk over the guest page tables; a 2D walk still requires 9 memory references. We would like to fetch a page table entry, translating guest virtual directly to system physical, with a single memory reference. A traditional inverted page table can do it!

Traditional Inverted Page Table
Provides direct translation from virtual to physical pages: the (process ID, VPN) pair of the virtual address is hashed to index a per-system inverted page table, whose matching entry identifies the physical frame.
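A minimal sketch of the hashed lookup this slide describes. The table size, hash function, and linear-probing collision policy are illustrative choices (real inverted page tables typically chain colliding entries):

```c
#include <assert.h>
#include <stdint.h>

/* Toy inverted page table: one entry per physical frame, probed by
 * hashing the (process ID, VPN) key. */
#define NFRAMES 64

struct ipt_entry { int valid; uint32_t pid; uint64_t vpn; };
static struct ipt_entry ipt[NFRAMES];

static unsigned ipt_hash(uint32_t pid, uint64_t vpn)
{
    return (unsigned)((pid * 0x9E3779B1u) ^ vpn) % NFRAMES;
}

/* Returns the frame number, or -1 if the mapping is absent. */
static int ipt_lookup(uint32_t pid, uint64_t vpn)
{
    unsigned h = ipt_hash(pid, vpn);
    for (int i = 0; i < NFRAMES; i++) {
        unsigned f = (h + i) % NFRAMES;
        if (!ipt[f].valid) return -1;               /* hit empty slot */
        if (ipt[f].pid == pid && ipt[f].vpn == vpn) /* key match      */
            return (int)f;
    }
    return -1;
}

static int ipt_insert(uint32_t pid, uint64_t vpn)
{
    unsigned h = ipt_hash(pid, vpn);
    for (int i = 0; i < NFRAMES; i++) {
        unsigned f = (h + i) % NFRAMES;
        if (!ipt[f].valid) {
            ipt[f] = (struct ipt_entry){1, pid, vpn};
            return (int)f;
        }
    }
    return -1; /* table full */
}
```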

Inverted Shadow Page Table
Collapses the guest (per-process) and nested (per-VM) translations into a single per-system inverted table: the (VM-ID, P-ID, VPN) triple is hashed to find the physical frame, replacing the gVA to gPA to sPA chain with one lookup.
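The only structural change from the traditional inverted page table is the wider hash key. A sketch, with the field layout and mix constant as illustrative choices rather than the paper's hardware design:

```c
#include <assert.h>
#include <stdint.h>

/* Hash key for the inverted shadow page table: adding the VM ID to
 * the (process ID, VPN) pair lets one per-system table serve every
 * process of every VM. */
static uint64_t isp_key(uint32_t vmid, uint32_t pid, uint64_t vpn)
{
    return ((uint64_t)vmid << 52) ^ ((uint64_t)pid << 36) ^ vpn;
}

static uint64_t isp_hash(uint32_t vmid, uint32_t pid, uint64_t vpn)
{
    /* Odd multiplier is invertible mod 2^64, so distinct small keys
     * map to distinct hash values; index the table with a modulus. */
    return isp_key(vmid, pid, vpn) * 0x9E3779B97F4A7C15ULL;
}
```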

Inverted Shadow Page Table (cont.)
Whenever guest page table entries change, the inverted shadow page table must be updated.

Inverted Shadow Page Table: Synchronization
The guest OS edits its own page tables (update_page_table_entries()) and loads CR3; the hypervisor intercepts page table edits and CR3 changes, and updates the inverted page table in its shadow page-fault handler:

    static int sh_page_fault(...)
    {
        ...
        sh_update_page_tables();
        ...
        return 0;
    }

Inverted Shadow Page Table: Synchronization (cont.)
Keeping the guest page tables and the inverted shadow page table in sync requires many hypervisor interventions.

Overheads of Synchronization
Hypervisor intervention: whenever guest page table entries change, the inverted shadow page table must be updated. Exiting from a guest VM to the hypervisor carries significant performance overhead [SIGOPS '10], polluting caches, TLBs, the branch predictor, the prefetcher, etc. This is similar to traditional shadow paging [VMware Tech. Report '09] [Wang et al. VEE '11]; refer to our paper for the performance behavior.

Speculatively Handling TLB Misses
We propose a speculative mechanism to eliminate the synchronization overheads (SpecTLB first proposed using speculation to predict address translations [Barr et al., ISCA 2011]). No hypervisor intervention is needed even when a guest page table changes: the inverted shadow page table may hold an obsolete mapping, but misspeculation rates are relatively low, and misspeculation is recovered through the re-order buffer or checkpointing.

Speculative Page Table Walk
On a TLB miss, a single lookup in the inverted shadow page table returns sPA* (a speculatively obtained system physical address) and execution continues speculatively. In parallel, the 9-reference walk over the guest and flat nested page tables verifies the translation; the instruction retires only if the two results match.
[Figure: 1-step speculative lookup keyed by (VMID, PID, gVA) alongside the 9-step verification walk from gCR3/nCR3]
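The retire/squash decision on this slide can be sketched as follows; the two translation callbacks are hypothetical stand-ins for the hardware's inverted-shadow lookup and architectural walk:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t (*xlate_fn)(uint64_t gva);

/* On a TLB miss the one-reference inverted-shadow lookup supplies
 * sPA* immediately and execution proceeds speculatively; the slower
 * architectural walk (guest + flat nested tables, 9 references)
 * completes later and decides retire (1) vs. squash/replay (0). */
static int tlb_miss_speculate(uint64_t gva, xlate_fn isp_lookup,
                              xlate_fn full_walk, uint64_t *spa)
{
    uint64_t spa_spec = isp_lookup(gva); /* used speculatively */
    uint64_t spa_real = full_walk(gva);  /* verification walk  */
    if (spa) *spa = spa_real;            /* correct result either way */
    return spa_spec == spa_real;
}

/* Stand-in walks: the inverted table holds one stale entry at 0x5000. */
static uint64_t demo_isp(uint64_t gva)
{
    return gva == 0x5000 ? 0xDEAD000 : gva + 0x100000;
}
static uint64_t demo_walk(uint64_t gva) { return gva + 0x100000; }
```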

Experimental Methodology
Simics with a custom memory hierarchy model:
  Processor: single in-order x86 processor
  Cache: split L1 I/D and unified L2
  TLB: split L1 I/D and L2 I/D
  Page walk cache: caches intermediate translations
  Nested TLB: caches guest physical to system physical translations
Xen hypervisor on Simics, with Domain-0 and Domain-U (guest VM) running.
Workloads (more in the paper): SPECint 2006 (gcc, mcf, sjeng); commercial (SPECjbb, RUBiS, OLTP-like).

Evaluated Schemes
State-of-the-art hardware 2D page walker (base): with 2D PWC and NTLB [Bhargava et al. ASPLOS '08]
Flat nested walker (flat): with 1D PWC and NTLB
Speculative inverted shadow paging (SpecISP): with flat nested page tables as backing page tables
Perfect TLB

Performance Improvements
[Figure: speedups over the base 2D walker for SPECint and commercial workloads; higher is better]
Up to 25% (Volano), 14% on average.

Conclusions
This paper revisits page walks for virtualized systems, exploiting the differences between memory management for virtual machines and for processes in native systems. We propose a bottom-up reorganization of address translation support for virtualized systems:
Flattening nested page tables: reduces memory references for 2D page walks with little extra hardware
Speculative inverted shadow paging: reduces the cost of a nested page walk

Thank you!
Revisiting Hardware-Assisted Page Walks for Virtualized Systems
Jeongseob Ahn, Seongwook Jin, and Jaehyuk Huh
Computer Science Department, KAIST

Backup Slides

Details on SpecISP
Penalty: misspeculation << hypervisor intervention.
[Figure: cumulative distributions of TLB miss latencies (CDF; annotation at 90 cycles for Volano)]
Misspeculation rates:
  gcc             2.072%
  SPECjbb         0.057%
  mcf             0.008%
  Volano         ~0.000%
  sjeng           0.150%
  KernelCompile   5.312%