November 2004 J. E. Smith Virtual Machines: An Architecture Perspective.

Presentation transcript:

November 2004 J. E. Smith Virtual Machines: An Architecture Perspective

VMs (c) 2004, J. E. Smith
Slide 2: Introduction
Why are virtual machines interesting?
- They involve computer architecture in a pure sense.
- They allow transcending of interfaces (which often seem to be an obstacle to innovation).
- They enable innovation in flexible, adaptive hardware, security, fault tolerance, support for network computing (and others).

Slide 3: Performance Isn’t Everything
- The BIG ideas are all at least 20 years old, and they have been very thoroughly explored.
- Focus research on other important areas:
  - Power efficiency
  - Performance efficiency
  - Security
  - Ease of design
  - Software compatibility / interoperability
- Virtual machines can be important enablers for all of the above.

Slide 4: Outline
- Virtualization
- The Family of Virtual Machines
- Process VMs and Code Caching
- High Level Language VMs
- Co-Designed VMs
- Research in Co-Designed VMs

Slide 5: Abstraction
- Computer systems are built on levels of abstraction.
- Higher levels of abstraction hide details at lower levels.
- Example: files are an abstraction of a disk.
- Instruction Set Architecture: the major division between hardware and software.
- Application Binary Interface: observed by user processes; user ISA + OS calls.
[Figure: layered system diagram. Software: application programs, libraries, operating system (drivers, memory manager, scheduler). Hardware: execution hardware, memory translation, main memory, system interconnect (bus), controllers, I/O devices and networking.]

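Slide 6: Virtualization
- An isomorphism from guest to host
  - Map guest state to host state
  - Implement “equivalent” functions
[Figure: guest states S_i and S_j, with guest operation e(S_i) = S_j; host states S_i' and S_j', with host operation e'(S_i') = S_j'; the mapping V takes guest state to host state, V(S_i) = S_i' and V(S_j) = S_j'.]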

Slide 7: Virtualization
- Similar to abstraction, except that details are not necessarily hidden
- Construct virtual disks
  - As files on a larger disk
  - Map state
  - Implement functions
- Now do the same thing with the whole “machine”
[Figure: file virtualization]
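
As a rough illustration of the virtual-disk example, the sketch below maps guest "sector" state onto a byte range of a host file and implements sector reads and writes with ordinary file operations. The class name `VirtualDisk`, the 512-byte sector size, and the method names are assumptions made for this illustration; none of them come from the slides.

```python
import os
import tempfile

SECTOR_SIZE = 512  # assumed sector size, for illustration only


class VirtualDisk:
    """A guest 'disk' whose state is mapped onto a host file."""

    def __init__(self, path, num_sectors):
        self.path = path
        self.num_sectors = num_sectors
        # Map state: guest sector n lives at host file offset n * SECTOR_SIZE.
        with open(path, "wb") as f:
            f.truncate(num_sectors * SECTOR_SIZE)

    def _offset(self, sector):
        if not 0 <= sector < self.num_sectors:
            raise IndexError("sector out of range")
        return sector * SECTOR_SIZE

    def write_sector(self, sector, data):
        # Implement the guest operation with equivalent host operations.
        if len(data) != SECTOR_SIZE:
            raise ValueError("data must be exactly one sector")
        with open(self.path, "r+b") as f:
            f.seek(self._offset(sector))
            f.write(data)

    def read_sector(self, sector):
        with open(self.path, "rb") as f:
            f.seek(self._offset(sector))
            return f.read(SECTOR_SIZE)


def demo():
    """Write one sector, then read it and an untouched sector back."""
    with tempfile.TemporaryDirectory() as tmp:
        disk = VirtualDisk(os.path.join(tmp, "guest.img"), num_sectors=8)
        disk.write_sector(3, b"A" * SECTOR_SIZE)
        return disk.read_sector(3)[:4], disk.read_sector(0)[:4]
```

Unlike a pure abstraction, nothing is hidden here: the guest still sees sectors, they are simply backed by a different host resource.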

Slide 8: The Family of Virtual Machines
- Lots of things are called “virtual machines”
  - IBM VM/370
  - Java
  - VMware
- Some things not called “virtual machines” are virtual machines
  - IA-32 EL
  - Dynamo
  - Transmeta Crusoe

Slide 9: System Virtual Machines
- Provide a system environment
- Constructed at ISA level
- Persistent
- Examples: IBM VM/360, VMware, Transmeta Crusoe
[Figure: guest processes running on guest OSes, each guest OS on a VMM over the host platform, linked by virtual network communication]

Slide 10: System Virtual Machines
- Native VM system
  - VMM in privileged mode
  - Guest OS in user mode
  - Example: classic IBM VMs
- User-mode hosted VM
  - VMM runs as a user application
- Dual-mode hosted VM
  - Parts of the VMM privileged, parts non-privileged
  - Example: VMware
[Figure: the three arrangements of virtual machine, VMM, host OS, and hardware across privileged and non-privileged modes]

Slide 11: Process Virtual Machines
- Constructed at ABI level
- Runtime manages guest process
- Guest processes may intermingle with host processes
- Not persistent
- As a practical matter, guest and host OSes are often the same
- Dynamic optimizers are a special case
- Examples: IA-32 EL, FX!32, Dynamo
[Figure: guest processes, each with its own runtime, alongside host processes on the host OS; guest and host processes can create one another and interact through disk file sharing and network communication]

Slide 12: The Virtual Machine Space
- Process VMs
  - Same ISA: multiprogrammed systems, dynamic binary optimizers
  - Different ISA: dynamic translators, HLL VMs
- System VMs
  - Same ISA: classic OS VMs, hosted VMs
  - Different ISA: whole system VMs, co-designed VMs

Slide 13: Architecture Issues: System VMs
- Why system VMs are of interest today
  - Security & fault tolerance (isolation)
  - Platform consolidation
  - Application/environment portability
- “Efficiently virtualizable” instruction sets
  - Goldberg and Popek (1974) should still be required reading (an architecture paper with theorems and proofs!)
- Virtual machine assists
  - Compensate for inefficiencies due to privilege level “compression”
  - Fast emulation of system functions
  - Many developed for IBM mainframe VMs

Slide 14: System Virtualization
- Traps and interrupts (& sys calls)
  - Transfer to VMM
  - VMM determines appropriate guest OS
  - VMM transfers to guest OS
- Guest performs privileged operation
  - Trap to VMM
  - VMM reads/modifies guest state
  - May modify shadow state
  - Returns to guest
- Guest OS “return” to user app.
  - Transfer to VMM
  - VMM bounces return back to guest app.
[Figure: control flow among application, guest OS, and VMM. A system call or trap goes through the vector location in the VMM to the virtual vector location in the guest OS; a privileged operation in the guest OS traps to the VMM, which checks privileges, performs the operation, and returns to the next instruction.]
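
The trap handling described on this slide can be sketched as a toy model. Everything named here (`ToyVMM`, `on_trap`, the two operation names, and the returned strings) is hypothetical and chosen only for illustration; a real VMM works on hardware state, not Python dictionaries.

```python
class ToyVMM:
    """Toy model of a VMM emulating privileged operations for one guest."""

    def __init__(self):
        # The mode the guest believes it is in. The real hardware keeps
        # the whole guest (OS included) in user mode.
        self.virtual_mode = "kernel"
        self.shadow_tlb = {}  # shadow state maintained by the VMM

    def on_trap(self, op, operands):
        # Check privileges against the guest's *virtual* mode.
        if self.virtual_mode != "kernel":
            # A guest application tried a privileged operation:
            # hand the trap to the guest OS instead of emulating it.
            return "deliver_to_guest_os"
        # Perform the operation on guest/shadow state.
        if op == "write_tlb":
            virtual_page, physical_page = operands
            self.shadow_tlb[virtual_page] = physical_page
            return "resume_at_next_instruction"
        if op == "return_to_user":
            # The guest OS "returns" to its application; the VMM bounces
            # the return back to the guest application.
            self.virtual_mode = "user"
            return "resume_guest_application"
        raise ValueError(f"unmodelled privileged operation: {op}")
```

For example, a `write_tlb` trap taken while the virtual mode is `"kernel"` updates `shadow_tlb`, while the same trap after `return_to_user` is delivered to the guest OS.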

Slide 15: Popek and Goldberg (in brief)
- Control sensitive instructions
  - All instructions that change hardware resource allocation (or mapping)
  - Example: write TLB
- Behavior sensitive instructions
  - All instructions whose outcome depends on hardware resource allocation
  - Example: read processor mode
- Theorem (paraphrase)
  - Efficiently virtualizable if all sensitive instructions trap in user mode
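
The paraphrased theorem amounts to a simple check over an instruction set description. The sketch below applies it to a made-up ISA; the function name, the dictionary layout, and the mnemonics are all hypothetical.

```python
def is_efficiently_virtualizable(instructions):
    """Apply the paraphrased condition to a toy ISA description.

    `instructions` maps a mnemonic to a dict of three booleans:
    control_sensitive, behavior_sensitive, traps_in_user_mode.
    Returns (ok, offenders), where offenders lists the sensitive
    instructions that do not trap in user mode.
    """
    offenders = sorted(
        name
        for name, props in instructions.items()
        if (props["control_sensitive"] or props["behavior_sensitive"])
        and not props["traps_in_user_mode"]
    )
    return (not offenders, offenders)


# Hypothetical ISA: reading the processor mode is behavior sensitive
# but executes silently in user mode, so the condition fails.
TOY_ISA = {
    "add": {"control_sensitive": False, "behavior_sensitive": False,
            "traps_in_user_mode": False},
    "write_tlb": {"control_sensitive": True, "behavior_sensitive": False,
                  "traps_in_user_mode": True},
    "read_mode": {"control_sensitive": False, "behavior_sensitive": True,
                  "traps_in_user_mode": False},
}
```

Here `is_efficiently_virtualizable(TOY_ISA)` reports `read_mode` as the offender; making it trap in user mode would satisfy the condition.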

Slide 16: System VM Research
- Architecture challenge: make IA-32 efficiently virtualizable
- Virtual machine assists
  - Compensate for inefficiencies due to privilege level “compression”
  - Fast emulation of system functions
  - Many developed for IBM mainframe VMs
- Applications to chip multiprocessors
  - Technology changes often require innovation and “re-invention”

Slide 17: The Virtual Machine Space (taxonomy diagram repeated from slide 12)

Slide 18: Architecture Issues: Process VMs
- Generally to allow application migration
  - Or to run popular software on a less popular platform
  - Goal is generally to minimize performance loss
- Same-ISA dynamic optimizers are a special case
  - HP Dynamo
- Architecture problems
  - Efficient code caching
  - Indirect jump problem
  - Protecting runtime from guest process

Slide 19: Staged Emulation with Code Caching
- An important part of many VM implementations
- Translate, optimize & cache frequent code sequences
- Start interpreting
- Profile to find “hot” code regions
[Figure: a runtime containing the interpreter, profile data, translator/optimizer, and code cache, operating on the binary memory image]
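
A minimal sketch of staged emulation, assuming a toy guest "ISA" with only `add` and `mul` on a single accumulator. The threshold value, class name, and the affine-function "translation" are assumptions for illustration and do not describe any particular VM.

```python
HOT_THRESHOLD = 3  # assumed value; real systems tune this


def interpret(block, acc):
    """Slow path: decode and execute one toy instruction at a time."""
    for op, operand in block:
        if op == "add":
            acc += operand
        elif op == "mul":
            acc *= operand
        else:
            raise ValueError(f"unknown op: {op}")
    return acc


def translate(block):
    """Translate and optimize: fold the whole block into acc * a + b."""
    a, b = 1, 0
    for op, operand in block:
        if op == "add":
            b += operand
        elif op == "mul":
            a, b = a * operand, b * operand
        else:
            raise ValueError(f"unknown op: {op}")
    return lambda acc: acc * a + b


class StagedEmulator:
    def __init__(self, blocks):
        self.blocks = blocks      # source PC -> list of (op, operand)
        self.profile = {}         # source PC -> interpretation count
        self.code_cache = {}      # source PC -> translated callable
        self.interpreted = 0
        self.cached_runs = 0

    def run_block(self, spc, acc):
        translated = self.code_cache.get(spc)
        if translated is not None:
            self.cached_runs += 1
            return translated(acc)
        # Start by interpreting, and profile while doing so.
        self.profile[spc] = self.profile.get(spc, 0) + 1
        self.interpreted += 1
        result = interpret(self.blocks[spc], acc)
        # Once the block is "hot", translate it and cache the result.
        if self.profile[spc] >= HOT_THRESHOLD:
            self.code_cache[spc] = translate(self.blocks[spc])
        return result
```

With this threshold, the first three executions of a block are interpreted and later ones run from the code cache.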

Slide 20: Superblocks
- Based on “hot” paths
- One entry, multiple exits
- May contain redundant blocks (tail duplication)
[Figure: example control-flow graph over blocks A through G and the superblocks formed from it]

Slide 21: Binary Translation Example

x86 Binary:

    4FD0: addl  %edx,(%eax)   ;load and accumulate sum
          movl  (%eax),%edx   ;store to memory
          sub   %ebx,1        ;decrement loop count
          jz    51C8          ;branch if at loop end
    4FDC: add   %eax,4        ;increment %eax
          jmp   4FD0          ;jump to loop top
    51C8: movl  (%ecx),%edx   ;store last value of %edx
          xorl  %edx,%edx     ;clear %edx
          jmp   6200          ;jump elsewhere

PowerPC Translation:

    9AC0: lwz   r16,0(r4)     ;load value from memory
          add   r7,r7,r16     ;accumulate sum
          stw   0(r5),r7      ;store to memory
          subi. r5,r5,1       ;decrement loop count, set cr0
          bez   cr0,pc+12     ;branch if loop exit
          bl    F000          ;branch & link to EM
          4FDC                ;save source PC in link register
    9AE4: bl    F000          ;branch & link to EM
          51C8                ;save source PC in link register
    9C08: stw   0(r6),r7      ;store last value of %edx
          subi  r7,r7,r7      ;clear %edx
          bl    F000          ;branch & link to EM
          6200                ;save source PC in link register

Slide 22: Code Caches
- Contain
  - Basic blocks
  - Superblocks (one entrance, multiple exits)
  - Optimized superblocks
- A base technology for many VMs
  - Dynamic binary translators: Intel IA-32 EL, Compaq FX!32
  - Dynamic binary optimizers: Dynamo family
  - Co-designed virtual machines: Transmeta, IBM DAISY
  - High performance Java virtual machines
  - System VMs with “inefficiently virtualizable” ISAs
  - “Sandboxing” secure VMs (x86 DynamoRIO)

Slide 23: Indirect Jumps
- Translated code cache PC (TPC) differs from source binary PC (SPC)
  - Need branch/jump target address translation
  - (Direct) branches are easier; target address is fixed
- Chaining can be used
[Figure: without chaining, each superblock exits through the dispatch table lookup code to reach the next superblock; with chaining, superblocks branch directly to one another and the dispatch table lookup code is bypassed]

Slide 24: The Indirect Jump Problem
- Target addresses (SPCs) can change
  - SPC needs to be translated at run time, not translation time
- Conventional solution: superblock construction-time software prediction (aka inline caching)

      If Rx == #addr_1 goto #target_1
      Else if Rx == #addr_2 goto #target_2
      Else dispatch_table_lookup(Rx); do it the slow way

- The biggest overhead in code caches
  - Compare-and-branch: 6 instructions
  - Hash table lookup: 15 instructions in Dynamo x86
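
The compare-and-branch chain with a slow-path fallback can be modelled as follows. This is a sketch under assumptions: the class and attribute names are hypothetical, the dispatch table is a plain dictionary, and a missing entry is returned as `None` to stand for "not yet translated".

```python
class IndirectJumpSite:
    """One translated indirect jump with inlined target predictions."""

    def __init__(self, predictions, dispatch_table):
        # (SPC, TPC) pairs chosen when the superblock was constructed.
        self.predictions = list(predictions)
        # Full SPC -> TPC map, consulted only on the slow path.
        self.dispatch_table = dispatch_table
        self.slow_lookups = 0

    def resolve(self, spc):
        # Fast path: the inlined compare-and-branch chain.
        for predicted_spc, tpc in self.predictions:
            if spc == predicted_spc:
                return tpc
        # Slow path: dispatch table lookup at run time.
        self.slow_lookups += 1
        return self.dispatch_table.get(spc)
```

The `slow_lookups` counter makes the cost visible: every target that was not predicted at construction time pays for the table lookup.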

Slide 25: Protecting the Runtime
- The runtime shares process memory space with the application
  - Must protect runtime from application
  - Expensive memory protection changes on switches between runtime and code cache
- If guest registers are mapped to host memory
  - How are memory-mapped registers protected?
[Figure: memory protection settings (N, R, R/W, Ex) for guest code, guest data, runtime data, runtime code, and the code cache in runtime mode versus emulation mode]

Slide 26: Process VM Research
- Same-ISA dynamic binary optimizers are probably not a winning proposition
  - Indirect jumps lead to performance losses on modern processors (optimizers with patching are better)
- Complete (intrinsic) compatibility is extremely difficult
  - May have to rely on extrinsic assurances
  - Topic of architecture research similar to Goldberg and Popek
- For general process VMs, some primitive support in the ISA will be useful / necessary
  - Indirect jumps (more later)
  - Code caching
  - Protection

Slide 27: Computer Architecture Innovation
- HLL VMs: software people invent ISA to solve SW problems
- Co-designed VMs: hardware people invent ISA to solve HW problems
- These two are the most interesting VMs from an architecture perspective and provide the biggest opportunities.

Slide 28: The Virtual Machine Space (taxonomy diagram repeated from slide 12)

Slide 29: High Level Language Virtual Machines
- Raise the “ABI” level of abstraction
  - Use a higher level virtual ISA
  - OS abstracted as standard libraries
- A form of process VM
[Figure: two toolchains compared. Traditional: HLL program, compiler front-end, intermediate code, compiler back-end, object code (ISA), loader, memory image. HLL VM: HLL program, compiler, portable code (virtual ISA), VM loader, virtual memory image, VM interpreter/translator, host instructions.]

Slide 30: Architecture Issues: High Level VMs
- Examples:
  - Sun Java
  - Microsoft .NET Framework and MSIL
- Why are HLL VMs important?
  - Microsoft says so.
  - It’s a good idea.
  - Combines object oriented programming and network computing

Slide 31: HLL VMs: Architecture Perspective
- Here, architects were deprived (or let themselves be deprived) of some interesting architecture work
- Don’t look at it bottom-up, i.e.
  - Take existing software for supporting HLL VMs,
  - Generate traces for standard ISAs,
  - Analyze traces
  - Conclude it’s “just like C”… problem solved!
- Look top-down: start with features of MSIL and look for computer architecture opportunities
  - Will require a mix of hardware and software innovation (else just continue to ignore real architecture in favor of implementation)

Slide 32: HLL VM Research
- Metadata: an interesting concept
  - Data Set Architecture
  - Don’t have to discover data structures (compare with C programs)
[Figure: a machine-independent program file holding metadata and code is read by the loader of the virtual machine implementation, which builds internal data structures and feeds the interpreter and the translator; the translator produces native code]

Slide 33: HLL VM Research
- Precise trap model
- Problems in conventional processors:
  - All state precise
  - Many instructions can trap
  - Enable/disable “remote” and at any time
- HLL VMs
  - Not all state must be precise
    - PC: not needed
    - Operand stack: never
    - Local variables: only if trap is handled locally
  - Trap enable explicit and locally specified

Slide 34: HLL VM Research
- Stack tracking
  - At any given point, the operand stack must have the same number of elements and types regardless of control flow path
  - This property could simplify exploitation of control independence
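
The stack-tracking property can be checked with a small dataflow pass. The sketch below tracks only the number of stack elements (element types are omitted for brevity) over a made-up bytecode; the opcode names and their stack effects are assumptions, not the rules of any real VM.

```python
# (pops, pushes) for each toy opcode; hypothetical, for illustration.
STACK_EFFECT = {
    "push": (0, 1),
    "pop": (1, 0),
    "add": (2, 1),
    "jz": (1, 0),    # pops a condition, then branches or falls through
    "jmp": (0, 0),
    "ret": (0, 0),
}


def check_stack_depths(code):
    """Return {pc: stack depth on entry}, or raise ValueError if two
    control flow paths reach the same pc with different depths."""
    depth_at = {0: 0}
    worklist = [0]
    while worklist:
        pc = worklist.pop()
        op, arg = code[pc]
        pops, pushes = STACK_EFFECT[op]
        if depth_at[pc] < pops:
            raise ValueError(f"stack underflow at {pc}")
        out_depth = depth_at[pc] - pops + pushes
        if op == "ret":
            successors = []
        elif op == "jmp":
            successors = [arg]
        elif op == "jz":
            successors = [arg, pc + 1]
        else:
            successors = [pc + 1]
        for succ in successors:
            if not 0 <= succ < len(code):
                raise ValueError(f"control flows outside the code at {pc}")
            if succ not in depth_at:
                depth_at[succ] = out_depth
                worklist.append(succ)
            elif depth_at[succ] != out_depth:
                raise ValueError(f"inconsistent stack depth at {succ}")
    return depth_at
```

Because every merge point has a single known depth, a translator can assign stack slots to registers statically, without tracking which path was taken.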

Slide 35: HLL VMs Summary
- Claim: slow-downs are due to OO programming, probably not dynamic compilation, and not the stack-based ISA
- Research opportunities abound
  - For VM implementation
  - For speeding up OO programs (look beyond C/C++)
- Use co-designed HW/SW
  - Base design on MSIL/Java and implement conventional ISA as the uncommon case

Slide 36: The Virtual Machine Space (taxonomy diagram repeated from slide 12)

Slide 37: Co-Designed Virtual Machines
- Separate the hardware/software interface from the ISA level of abstraction
- Restore the ISA to its “natural” place
  - As an Implementation ISA (I-ISA) that reflects actual hardware
- Support existing ISAs
  - As a Virtual ISA (V-ISA)
- Let processor designers use both hardware and software
- A form of system VM
[Figure: a conventional stack (user applications, OS, libs. above the ISA, hardware below) next to a co-designed stack (user applications, OS, libs. above the V-ISA; co-designed software between the V-ISA and I-ISA; hardware below the I-ISA)]

Slide 38: Co-Designed VMs
- Should be of interest to both architects and micro-architects
  - Offers opportunities for performance, power saving, fault tolerance and other implementation-dependent features
  - Allows transcending conventional ISAs
- Don’t confuse them with VLIW!

Slide 39: Architecture Issues: Concealed Memory
- VM software resides in memory concealed from all conventional software
[Figure: conventional memory holds source ISA code and source ISA data; concealed memory holds the code cache, VM code, and VM data; the processor core reaches both through the ICache and DCache hierarchies]

Slide 40: Another Way of Doing Things
[Figure: two designs compared. Conventional: main memory, cache hierarchy, hardware translation unit (forms uops), processor pipeline, functional units. Dynamic translation: main memory, software translator, code cache, processor pipeline, functional units.]

Slide 41: Jump Target-address Lookup Table
- A hardware cache of dispatch table entries
- Similar to a software-managed TLB in virtual memory
[Figure: JTLT datapath. The BTB predicts the next-fetch TPC for a jump instruction; the jump's register supplies the target SPC, which is looked up in the JTLT to obtain the jump target TPC. JTLT hit and the TPC matches the prediction: BTB prediction correct. JTLT hit but no match: BTB misprediction, redirect fetch to the jump target TPC from the JTLT. JTLT miss: redirect fetch to the dispatch code.]
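
A software model of the idea may help. The sketch assumes a direct-mapped table indexed by SPC modulo the table size and refilled on a miss by the software dispatch code; the organization, names, and refill policy are assumptions for illustration, since the slide does not specify them.

```python
class JTLT:
    """Toy model of a jump target-address lookup table (SPC -> TPC)."""

    def __init__(self, num_entries, dispatch_table):
        self.entries = [None] * num_entries   # each entry: (spc_tag, tpc)
        self.dispatch_table = dispatch_table  # software SPC -> TPC map
        self.hits = 0
        self.misses = 0

    def lookup(self, spc):
        index = spc % len(self.entries)
        entry = self.entries[index]
        if entry is not None and entry[0] == spc:
            self.hits += 1
            return entry[1]
        # Miss: fall back to the dispatch code, then refill the entry,
        # much as a software-managed TLB is refilled by a miss handler.
        self.misses += 1
        tpc = self.dispatch_table[spc]
        self.entries[index] = (spc, tpc)
        return tpc
```

Only the first jump to a given target pays for the dispatch code; repeats hit in the table until a conflicting SPC evicts the entry.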

Slide 42: Dual-address RAS
- Problem: a function call instruction saves the return SPC, not the TPC
  - Conventional software-based chaining cannot utilize a RAS
- Solution: save both SPC and TPC
[Figure: a push-dual-address-RAS instruction pushes an (SPC, TPC) pair onto the dual-address RAS, which sits alongside the JTLT]
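
The following toy model shows one way a return address stack holding both addresses could behave. It assumes that a return compares the popped SPC with the actual return SPC and uses the paired TPC only on a match; that check, the stack depth, and all names are assumptions for illustration.

```python
class DualAddressRAS:
    """Toy return address stack storing (return SPC, return TPC) pairs."""

    def __init__(self, depth=8):
        self.depth = depth
        self.stack = []

    def push(self, return_spc, return_tpc):
        # On a translated call: save both addresses of the return point.
        if len(self.stack) == self.depth:
            self.stack.pop(0)  # full: drop the oldest entry
        self.stack.append((return_spc, return_tpc))

    def predict_return(self, actual_spc):
        """Pop the top pair; return its TPC if the SPC matches the
        return address the guest actually uses, else None."""
        if not self.stack:
            return None
        spc, tpc = self.stack.pop()
        return tpc if spc == actual_spc else None
```

A `None` result stands for the fallback case, where the return target has to be found through the dispatch code instead.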

Slide 43: IPC Performance
- “Translate” Alpha to Alpha; start with highly optimized code
- Conventional method (à la Dynamo) results in 14% IPC loss
- Dual-address RAS provides the most benefit
- Using both JTLT & RAS, 7.7% IPC improvement
  - Due to superblock re-layout

Slide 44: Research: Efficient Microarchitectures
- Wide pipelines are at odds with fast pipelines
  - Fast pipeline => low complexity per stage
  - More instructions per stage => high complexity per stage
- Process larger atomic units in pipeline stages
  - Narrower “effective” width
- Reduce decoding stages
  - Do more in software
- Pipeline the issue stage

Slide 45: Fused Instruction Set
- Co-designed VM x86 implementation
  - Shorten and simplify pipeline front-end
- Combine pairs of dependent instructions
  - Form a single “unit” for pipeline processing
- Use VM software to
  - “Crack” x86 instructions into RISC-ops
  - Re-order RISC-ops
  - Reassemble into (new) fused pairs
- Related: Pentium-M fuses in front-end
  - Using original x86 instructions
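
To make the idea of pairing dependent operations concrete, here is a deliberately simplified sketch that only finds candidate head/tail pairs. It is not the algorithm from the talk: it assumes every operation is a single-cycle ALU op with one destination register, uses a greedy first-consumer rule within a small window, and ignores whether moving the tail next to its head is legal. The function name and the tuple layout are hypothetical.

```python
def find_fusible_pairs(ops, window=4):
    """Greedily pair each producer with its first dependent consumer.

    `ops` is a list of (dest, sources) tuples in program order, where
    `dest` is a register name and `sources` a tuple of register names.
    Returns a list of (head_index, tail_index) pairs.
    """
    fused = set()
    pairs = []
    for i, (dest, _sources) in enumerate(ops):
        if i in fused:
            continue
        for j in range(i + 1, min(i + 1 + window, len(ops))):
            reads_dest = dest in ops[j][1]
            if reads_dest and j not in fused:
                # ops[j] consumes the value ops[i] produces: fuse them.
                pairs.append((i, j))
                fused.update((i, j))
                break
            if ops[j][0] == dest:
                break  # dest is overwritten; later readers see a new value
    return pairs
```

For the sequence `r1 = r2 op r3; r4 = op r1; r5 = r4 op r2; r6 = r5 op r1`, the sketch pairs operations 0 and 1, then 2 and 3.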

Slide 46: Conventional Issue Logic
- Select and issue instructions free of data dependences
- Based on the selection, clear dependences
  - And “wake up” newly independent instructions
- Single-cycle select-wakeup is important for good performance
[Figure: issue buffer holding two dependent instructions (the first writes R1, the second reads R1), with the select and fanout/wakeup paths]

Slide 47: Pipelined Issue Logic
- Fuse dependent instructions into a single slot
- Fused instructions traverse the entire pipeline
- Make a single issue decision for the pair

Slide 48: Instruction Set

Slide 49: Translation Algorithm
Two Pass Algorithm:
1. Form superblocks using Dynamo MRET method
2. Crack x86 instructions into RISC-like micro-ops
3. Attempt to fuse ALU ops only
4. Fuse LD/ST instructions as tails and ALU ops as heads

Slide 50: Fusing Profile
- About 50% of operations are fused
- Only 5-10% of non-fused operations are single-cycle ALU ops

Slide 51: Distance Between Fused Operations
- Most fused operations are close together
  - 70% of fused ops come from different x86 instructions
  - 60% contain two ALU operations

Slide 52: Performance (Normalized IPC)
- Baseline: generic superscalar
- Macro-op: fused macro-ops with pipelined issue logic
- Baseline Pipelined: superscalar with pipelined issue logic
[Figure: relative IPC performance versus issue window size for 4-wide macro-op, 4-wide baseline, 4-wide baseline pipelined, and 2-wide macro-op]

Slide 53: VM Research
- Architecture support for VMs
  - Enable spectrum of VMs (process, system, HLL, co-designed)
  - Support for dynamic translation and optimization
  - Primitives: code caches & indirect jumps; concealed memory
  - Pays for itself: helps get rid of obsolete ISA baggage
- VM applications
  - Security
  - Fault tolerance
- Co-designed VMs
  - Efficient microarchitecture
  - Adaptive microarchitecture
    - For power efficiency
    - For performance
- New ISAs
  - Application-area specific ISAs
  - Support for Java/MSIL
  - “Convergence” architectures
- Computer architects can do computer architecture!