Multiprocessor Virtualization


1 Multiprocessor Virtualization
This material is based on the book, Virtual Machines: Versatile Platforms for Systems and Processes, Copyright 2005 by Elsevier Inc. All rights reserved. It has been used and/or modified with permission from Elsevier Inc.

2 Multiprocessor Virtualization Outline
- Physical Partitioning
- Logical Partitioning
- Hypervisors
- System VMs
- Shared Memory Issues and Emulation

3 Multiprocessor Systems
- Workhorse of commercial computing: both large SMPs and clusters
- Becoming more common at the lower end: desktop processors will have multiple processor cores
- Scalability of applications: many low-end applications cannot use large SMPs
- Scalability limited by:
  - Low parallelism in the application
  - Performance hit due to communication costs

4 A Common Theme: Partitioning
Divide a single large MP into multiple, smaller MPs
[Figure: a Virtual Machine Monitor divides shared-memory multiprocessing hardware (processors, memory, I/O) among Virtual Machines 1, 2, and 3, each seeing its own processors, memory, and I/O]

5 Partitioning of Multiprocessor Systems
Partitioning in time:
- Processors in system VMs are partitioned in time (multiprogramming)
Partitioning in space:
- A multiprocessor system is partitioned into multiple independent systems
- A cluster of smaller multiprocessor systems
- Small SMPs on a large SMP: several virtual shared-memory systems on a single large shared-memory system

6 Partitioning Advantages
Workload Consolidation
- Large systems may be naturally composed of multiple, different subsystems
- Consolidate logical subsystems into a single physical system
[Figure: typical 3-tier server model (user PCs, application servers, database server) compared with a consolidated configuration in which the application servers are virtualized onto the database server]

7 Partitioning Advantages (contd.)
- System Migration: testing and verification can occur in parallel with production
- Heterogeneous Systems: run two different versions of the same OS, or two different OSes, on different partitions
- Improving System Utilization: average workload is often much smaller than peak; some resources, like tape drives and optical storage, can be moved from one partition to another
- Multiple Time-Zone Requirements: consolidation of resources across time zones
- Reduction of System Downtime: the system remains available while individual partitions are serviced

8 Partitioning Advantages (contd.)
Failure Isolation
- Network attacks, software malfunctions, hardware failures
- Software failure: brings down the faulty operating system and its apps; OSes in other partitions are unaffected
- Hardware failure: could bring down all partitions affected by the fault; physical partitioning isolates better from hardware failures
- Reliability of the VMM is also important: keep it small and simple

9 Mechanisms to Support Partitioning
Virtual Machine Monitor – several possible implementations:
- Software only
- Completely in hardware
- Software enhanced by hardware
- Microcode supported by hardware
Most modern systems add features to the processor:
- Improves performance of the partitioned system
- Puts the VMM in a separate privileged layer
- However, such a system cannot be recursively partitioned

10 Partitioning Techniques
[Figure: taxonomy of partitioning techniques; with hardware support: physical and logical (microprogram based or hypervisor based); without hardware support: system-VM based approaches (same ISA or different ISA) and OS based]

11 Physical Partitioning
[Figure: physical partitioning; processors, memory modules (M), and I/O units with attached disks grouped into separate physical partitions]

12 Physical Partitioning Examples
Sun: Domains
- Each domain is a distinct physical unit
- Minimum: one system board with 2 or 4 processors, up to 4 GB of memory, and up to 4 I/O buses
HP: nPartitions
- Boards are referred to as cells
- A cell has up to 4 processors in an SMP, up to 16 GB of memory, and up to 12 PCI slots
- Each cell in a partition must be configured identically to the other cells in the partition

13 Features of Physical Partitioning
- Robustness to failures: the control unit reboots only the failing OS
- Good security isolation: one partition cannot attack another; each partition has its own system administration
- Ability to meet SLOs (system-level objectives): behavior is similar to hardware dedicated to the guest OS
- System utilization not optimal: utilization is determined by the guest OS and applications on each partition

14 Logical Partitioning with Firmware
- Avoids the software overhead of a conventional VMM via a combination of firmware (microcode) and special hardware
- Allocation of resources happens before the OSes are booted
- Hardware resources are physically shared, but logically partitioned
- Lacks the "elegance" of a software VMM, e.g. recursive virtualizability
- But not much different from hardware "assists" for conventional system VMs

15 Logical Partitioning with Firmware
[Figure: logical partitioning; the same pool of processors, memory, I/O units, and disks divided along logical rather than physical boundaries]

16 Example: IBM S/390 LPAR
- Implemented via the Processor Resource/Systems Manager (PR/SM)
- Microcode/hardware under the control of the system administrator
- Up to six partitions
- Partitions communicate via the storage system or network
Resource sharing:
- Each partition gets a fraction of physical memory
- I/O elements are also divided among partitions
- Processors may be dedicated or shared; the processor resource can be overcommitted

17 LPAR Example
A combination of dedicated and shared partitions
[Figure: two shared partitions (OS 2, OS 3) and one dedicated partition (Operating System 1), each running apps, layered above the dispatcher and other LPAR software, microcode, and five processors]

18 LPAR Resource Management
Through the system console:
- Allocate resources at boot time
- Most resources stay with a partition
- Some resources, e.g. I/O devices, can be dynamically moved
Fair scheduling of shared processors via the LPAR dispatch function, based on:
- Priority
- Activity level (inactive processors enter a wait state)
- High-priority I/O interrupts
- OS-initiated swap (e.g. due to busy loops) – requires OS cooperation
- Interval timeout

19 Logical Partitioning with Hypervisors
Software-only approach:
- Fewer microcoded processors today (at least with very large control stores)
Add a special operating mode not visible to the ISA:
- Hypervisor software running in this mode can be hidden from a conventional OS
- Similar to co-designed VMs
- Tends to be lightweight
- Primary use is to provide partitioning
- The special mode for the hypervisor allows a conventional OS to run in native supervisor mode

20 Hypervisor Hardware Support
Hypervisor state registers:
- For efficient resource management
Replicated general registers:
- Fast mode switches
- Isolation from conventional software

21 Hypervisor Memory Management
Partition physical memory via partition base/limit registers:
- The hypervisor also gives itself a chunk of memory
Hypervisor addressing:
- If allocated in low memory, the hypervisor can use physical addressing
Partition addressing:
- After conventional address translation to a real address, add the base and check against the limit to obtain the physical address
- The real and physical translations can be combined in the TLB
- Hypervisor software intercedes on TLB misses and provides the physical address
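As a minimal sketch of the partition-addressing step (the PMBR/PMLR names come from the figure on the next slide; the struct and function themselves are illustrative, not a real hypervisor interface), real-to-physical translation reduces to a limit check and a base add:

```c
#include <stdbool.h>
#include <stdint.h>

/* Per-partition relocation state: partition memory base register (PMBR)
 * and partition memory limit register (PMLR), as in the next slide's
 * figure. Hypothetical layout for illustration. */
typedef struct {
    uint64_t pmbr;  /* physical address where the partition's memory starts */
    uint64_t pmlr;  /* size of the partition's real-memory region */
} partition_regs;

/* Translate a partition-real address to a system-physical address.
 * Returns false when the access exceeds the partition's allocation,
 * in which case the hypervisor would raise a fault. */
bool real_to_physical(const partition_regs *p, uint64_t real, uint64_t *phys)
{
    if (real >= p->pmlr)      /* check against the limit register */
        return false;
    *phys = p->pmbr + real;   /* add the base register */
    return true;
}
```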

22 Hypervisor Memory Management
[Figure: physical memory on the system divided into the real memory of the hypervisor, of partition 1, and of partition 2, with each partition's region delimited by its base/limit register pair (PMBR1/PMLR1, PMBR2/PMLR2)]

23 Hypervisor Memory Management
Virtualize the page table pointer (and the page tables)
[Figure: a partition pointer table selects the current page table pointer by partition; the hardware TLB is extended with a partition ID field alongside the virtual and physical addresses; each partition has its own page table in physical memory]

24 TLB Management
Convert real addresses to physical addresses
Hardware TLB:
- The current pointer points to the page table of the current partition
- A partition pointer table holds the base and limit pointers for the partitions
- An extra field in each TLB entry denotes the partition
Software TLB:
- Virtualize the TLB
- May need to swap TLB contents in and out during partition changes
[Figure: repeats the page-table-pointer virtualization diagram from the previous slide]
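A sketch of the partition-tagged hardware TLB described above; the entry layout and lookup loop are hypothetical, but they show why the extra partition ID field lets entries from several partitions coexist without a full flush on every partition switch:

```c
#include <stdbool.h>
#include <stdint.h>

#define TLB_ENTRIES 64

/* Illustrative TLB entry extended with a partition ID field. */
typedef struct {
    bool     valid;
    uint16_t partition_id;  /* extra field denoting the owning partition */
    uint64_t virt_page;     /* virtual page number */
    uint64_t phys_page;     /* system physical page number */
} tlb_entry;

static tlb_entry tlb[TLB_ENTRIES];

/* A hit now requires the partition ID to match as well. On a miss, the
 * hardware (or the hypervisor, for a software-managed TLB) walks the
 * current partition's page table via the partition pointer table. */
bool tlb_lookup(uint16_t pid, uint64_t vpage, uint64_t *ppage)
{
    for (int i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid &&
            tlb[i].partition_id == pid &&
            tlb[i].virt_page == vpage) {
            *ppage = tlb[i].phys_page;
            return true;
        }
    }
    return false;
}
```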

25 Interrupt Handling
The hypervisor controls which interrupts it wishes to handle:
- I/O responses to hypervisor queries about operational status
- System console commands/requests, e.g. configuration of a new partition
- Requests for service from a partition
- Machine-check interrupt: the hypervisor checks the extent of the corruption; if it is limited to one partition, the hypervisor emulates a guest OS machine check

26 Hypervisor Services Interface
- Allows the OS in a partition to request services from the hypervisor
- The hypervisor implements the mechanism; the guest OS implements the policy
- Page table example in PowerPC
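A sketch of what such a mechanism/policy split might look like for the page-table case; the hcall numbers, names, and signature here are hypothetical (real systems, such as PowerPC's hypervisor call interface, define their own):

```c
#include <stdint.h>

/* Hypothetical hypervisor-call numbers for page-table services. */
enum { HCALL_PTE_INSERT = 1, HCALL_PTE_REMOVE = 2 };

/* Stub for the trap into hypervisor mode; on real hardware this would
 * be a privileged instruction that transfers control to the hypervisor,
 * which validates the request before touching the hardware page table. */
static long hcall(int num, uint64_t a0, uint64_t a1, uint64_t a2)
{
    (void)num; (void)a0; (void)a1; (void)a2;
    return 0;
}

/* Guest OS side (policy): it decides which mapping to create, then asks
 * the hypervisor (mechanism) to install it. The hypervisor checks that
 * the real page lies inside this partition's allocation. */
long guest_map_page(uint64_t virt_page, uint64_t real_page, uint64_t flags)
{
    return hcall(HCALL_PTE_INSERT, virt_page, real_page, flags);
}
```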

27 Dynamic Memory Allocation
- Physical memory requirements may change over time, but memory is allocated contiguously in the original hypervisor system
- Solution: give each system up to 16 chunks of size 256 MB
  - Allows flexibility
  - Still allows fault isolation on memory-module failures (if memory is interleaved properly)
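A minimal sketch of chunked translation, assuming the slide's parameters (up to 16 chunks of 256 MB each); the structure and names are illustrative:

```c
#include <stdbool.h>
#include <stdint.h>

#define CHUNK_SHIFT 28                    /* 256 MB chunks */
#define CHUNK_SIZE  (1ULL << CHUNK_SHIFT)
#define MAX_CHUNKS  16                    /* up to 16 chunks per partition */

/* Chunk map: the physical base of each 256 MB chunk backing the
 * partition's (logically contiguous) real address space. */
typedef struct {
    uint64_t chunk_base[MAX_CHUNKS];
    int      num_chunks;
} chunk_map;

/* Real-to-physical translation now indexes the chunk map instead of
 * adding a single base register, so the chunks need not be physically
 * contiguous and can be added or removed as partition needs change. */
bool chunked_real_to_physical(const chunk_map *m, uint64_t real, uint64_t *phys)
{
    uint64_t idx = real >> CHUNK_SHIFT;            /* which chunk? */
    if (idx >= (uint64_t)m->num_chunks)
        return false;                              /* beyond allocation */
    *phys = m->chunk_base[idx] + (real & (CHUNK_SIZE - 1));
    return true;
}
```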

28 Dynamic Memory Allocation, cont.
[Figure: chunk maps for partitions 1 and 2; each partition's real memory is assembled from up to 16 non-contiguous 256 MB chunks of system physical memory, alongside the hypervisor's own real memory]

29 Expanding the Role of the Hypervisor
Supported by implementation-dependent operations:
- Monitor power usage: collect power information from subsystems; notify the OS to cut back on resource usage
- Manage fault-tolerance functions: manage retry; isolate and reassign resources in response to failures

30 Dynamic Partitioning
Allow for changes in partition configuration:
- A partition may complete its task
- A partition may find its needs changing
- A new partition may be needed
The hypervisor maintains a pool of resources:
- It recovers resources when a partition is shut down
Dynamically changing resources:
- I/O: not difficult
- Adding/deleting processors: the OS must also be able to change its configuration dynamically
- Changing memory: difficult with contiguous address-space allocation; requires smaller-granularity memory allocation

31 Effect of Dynamic Partitioning
[Figure: the effect of dynamic partitioning; the same processors, memory modules, I/O units, and disks regrouped into different partitions over time]

32 System-VM Based Partitioning
Conceptually similar to system VMs:
- The VMM works in privileged mode
- The guest OS and applications work in non-privileged mode
Example: Cellular Disco
[Figure: (a) native system: applications at user level, operating system at supervisor level; (b) Cellular Disco: applications at user level, the operating system at supervisor level, and the Cellular Disco VMM at the privileged level above the hardware; privileged operations trap to the VMM]

33 Cellular Disco
- Implemented on MIPS, which has a third level of privilege: supervisor mode
- More efficient page allocation:
  - Traps all reads and writes to the paging disk of the guest OS
  - Keeps information about whether a page is in the VMM's paging space
  - Avoids redundant copy operations
- Real memory is virtualized (unlike in partitioning schemes, where physical memory is partitioned):
  - The VMM tracks page usage and keeps enough pages resident in physical memory
  - The VMM also tracks whether a page has been written, to avoid writing unmodified pages out to the paging disk

34 Cellular Disco – Memory Sharing
- Allows applications in different partitions to share global memory
- Shared regions are registered through system calls
- Faster communication than message passing
- The VMM maintains sharing status with each page; sharing information is maintained even when a page is paged out
[Figure: Virtual Machines 0 and 1, each with code, data, and buffer-cache regions, mapped onto machine memory that is divided into private, shared, and free pages]

35 Cellular Disco – Fault Containment
- The hardware is treated as a set of cells
- Each cell runs its own VMM and manages its own physical memory
- A failure affects only the VMs using the resources of the failing cell
- Communication between cells goes through trusted interprocessor communication
- Virtual CPUs (VCPUs) are mapped to physical CPUs; they can migrate, even across cell boundaries, but only if absolutely needed
- The VMM code size is small: less than 50K lines
[Figure: virtual machines running on Cellular Disco across interconnected nodes, with cell boundaries containing the fault location]

36 Cellular Disco – Memory Borrowing
- Normally a VMM manages the memory attached to its cell, which could lead to an imbalance in the use of memory resources
- Cellular Disco allows a cell to temporarily borrow memory from other cells
  - This deviates from a strict fault-containment policy
- The list of cells to borrow from is determined by:
  - A list maintained by each virtual machine (a VM can essentially specify its vulnerability)
  - Cells that have already been supplying memory
  - The availability of pages in remote cells (a cell may refuse when its free memory runs low)
  - The overall memory requirements of the virtual machine (keeps small machines from spilling across too many cells)

37 Cellular Disco - Recovering from a Fault
Crossing cell boundaries makes recovery more complicated:
- Step 1: Hardware attempts to recover from the fault
- Step 2: Hardware informs all processors; Cellular Disco initiates the recovery process
- Step 3: Cells agree on the set of "live" nodes; the live set is cleared of in-flight messages
- Step 4: Each VMM determines the virtual machines affected by the fault: not just processors, but also borrowed memory and I/O resources

38 ISA Emulation
- Needed when the guest and host run different ISAs
- Much of ISA emulation is similar to the other VMs covered earlier
- The important exception is the memory-ordering (consistency) model:
  - Clustered systems (separate address spaces) are straightforward: treat them as a collection of uniprocessors
  - Shared-memory MPs are more difficult

39 Uniprocessor Cluster Emulation
No shared memory ⇒ no memory-ordering problems
[Figure: Virtual Machines 1, 2, and 3, each a uniprocessor with its own memory and I/O, connected by a virtual interconnection network; each is hosted by a Virtual Machine Monitor on its own real processor, with real memory and real I/O, joined by the real interconnection network]

40 Memory Ordering
- Memory updates may become reordered by the memory system
- Example (A and B both start at 0):

      P1:                   P2:
      A = 1;                B = 1;
      L1: if (B == 0)...    L2: if (A == 0)...

- Intuitively it is impossible for both A and B to read as 0, i.e. for both branches at L1 and L2 to be taken
- But it can happen if the updates to memory are reordered by the memory system
- In an MP system, memory-ordering rules must be carefully defined and maintained
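The example can be reproduced with C11 atomics; this sketch (assuming a C11 compiler and library with <threads.h>) uses relaxed ordering, which permits the counter-intuitive r1 == 0 && r2 == 0 outcome on machines that buffer or reorder stores:

```c
#include <stdatomic.h>
#include <stdio.h>
#include <threads.h>

atomic_int A, B;   /* both initially 0 */
int r1, r2;

int p1(void *arg) {
    (void)arg;
    atomic_store_explicit(&A, 1, memory_order_relaxed);   /* A = 1;          */
    r1 = atomic_load_explicit(&B, memory_order_relaxed);  /* L1: if (B == 0) */
    return 0;
}

int p2(void *arg) {
    (void)arg;
    atomic_store_explicit(&B, 1, memory_order_relaxed);   /* B = 1;          */
    r2 = atomic_load_explicit(&A, memory_order_relaxed);  /* L2: if (A == 0) */
    return 0;
}

int main(void) {
    thrd_t t1, t2;
    thrd_create(&t1, p1, NULL);
    thrd_create(&t2, p2, NULL);
    thrd_join(t1, NULL);
    thrd_join(t2, NULL);
    /* Sequential consistency forbids r1 == 0 && r2 == 0;
     * relaxed (reordered) memory systems do not. */
    printf("r1=%d r2=%d\n", r1, r2);
    return 0;
}
```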

41 Some Causes of Inconsistency
Write buffers; interconnect delays
[Figure: two causes. (a) Write buffers: P1 and P2 on a shared bus each hold their writes (A ← 1, B ← 1) in write buffers while Read B and Read A go directly to memory (A: 0, B: 0), so both reads can return 0. (b) Interconnect delays: the writes A ← 9 and Flag ← 1 take different paths through the interconnection network, so memory can already show Flag: 1 while A still holds the stale value 3]

42 Sequential Consistency
"A system is sequentially consistent if the result of any execution is the same as if the operations of all processors were executed in some sequential order and the operations of each individual processor appears in this sequence in the order specified by its program“ -- Leslie Lamport MEMORY P1 P2 P3 P4 P5 Multiprocessor Virtualization

43 Relaxed Consistency
- Sequential consistency may impede performance
- Relaxed consistency: program order or write atomicity can be relaxed
- Often driven by the hardware implementation, so a number of different models have been developed
- Mechanisms to override the relaxed rules:
  - Fence operations (membar)
  - Semantics of explicit synchronization instructions
Relaxations:
- Relax read-to-read and read-to-write program orders
- Relax write-to-write program order
- Relax write-to-read program order
- Read others' writes early
- Read own write early
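As a sketch of overriding the relaxed rules, a fence (the slide's membar) between each thread's store and subsequent load restores the ordering the earlier example depends on; C11's atomic_thread_fence stands in here for the hardware membar instruction:

```c
#include <stdatomic.h>

atomic_int A, B;   /* both initially 0, as in the earlier example */

int p1_fenced(void) {
    atomic_store_explicit(&A, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* membar: keep W before R */
    return atomic_load_explicit(&B, memory_order_relaxed);
}

int p2_fenced(void) {
    atomic_store_explicit(&B, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* membar: keep W before R */
    return atomic_load_explicit(&A, memory_order_relaxed);
}

/* With both fences in place, p1_fenced() and p2_fenced() can no longer
 * both return 0, even on a relaxed memory model. */
```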

44 Weak Ordering
Weak ordering: no assumptions about ordering between explicit synchronization points
Informal rules:
- Synchronization instructions are made known to the hardware
- Actions to maintain consistency are taken only at synchronization instructions
Examples (see the sketch below):
- In the critical-section example, the reads/writes of A and B are the synchronization points
- In the producer/consumer example, the reads/writes of Flag are the synchronization points
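A sketch of the producer/consumer case under weak ordering: only the accesses to Flag are synchronization points, so only they need ordering semantics, while the data accesses themselves stay unordered:

```c
#include <stdatomic.h>

int data;          /* ordinary location: no ordering of its own */
atomic_int Flag;   /* the synchronization point */

void producer(void) {
    data = 42;                                              /* plain write */
    atomic_store_explicit(&Flag, 1, memory_order_release);  /* sync point  */
}

int consumer(void) {
    while (atomic_load_explicit(&Flag, memory_order_acquire) == 0)
        ;                                                   /* sync point  */
    return data;   /* guaranteed to observe 42 once Flag reads as 1 */
}
```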

45 Memory Model Emulation
Emulation involves two aspects:
Memory coherence:
- Architectures may or may not enforce coherence
- Some architectures (e.g. PowerPC) allow both modes
Memory consistency:
- Strong consistency, processor consistency, weak consistency
- Memory barriers are provided to enforce ordering when consistency is not strong

46 Memory Coherence Emulation
[Table: guest coherence requirement vs. host support. Guest does not require coherence: no special coherence instructions are needed. Guest requires coherence, host coherence mode available: run in coherence mode. Guest requires coherence, host coherence mode not available: insert synchronization]
- Synchronization primitives must also be emulated, e.g. emulate test-and-set using load-reserved and store-conditional, as sketched below
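A sketch of the test-and-set emulation mentioned above; load_reserved and store_conditional are placeholders for the host's instructions (e.g. lwarx/stwcx. on PowerPC or ll/sc on MIPS), which a real emulator would emit directly:

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder host primitives for load-reserved / store-conditional.
 * store_conditional fails (returns false) if another processor touched
 * the reserved location since the matching load_reserved. */
uint32_t load_reserved(volatile uint32_t *addr);
bool     store_conditional(volatile uint32_t *addr, uint32_t val);

/* Emulate a guest test-and-set: atomically set the word to 1 and
 * return its previous value, retrying until the reservation holds. */
uint32_t emulate_test_and_set(volatile uint32_t *addr)
{
    uint32_t old;
    do {
        old = load_reserved(addr);           /* acquire a reservation */
    } while (!store_conditional(addr, 1));   /* lost the race? retry  */
    return old;
}
```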

47 Memory Consistency Emulation
Guest ordering model is the same as or weaker than the host's:
- Nothing special needs to be done, as long as emulation does not eliminate or reorder memory operations
- This poses a problem for VLIW co-designed VMs, where reordering is a big reason for the performance gain
Guest ordering model is stronger than the host's:
- Memory-barrier instructions must be inserted during emulation
Guest ordering model is stronger in some ways and weaker in others:
- Memory barriers are needed to "fix up" those cases where the guest model is stronger

48 Adding Memory Barriers
- Adding a membar after every load or store trivially works, but it would be very slow
- Optimizations (see the sketch below):
  - Remove membars in cases where the guest ordering model itself allows the reordering
  - Eliminate membars where the next sequential access is to the same location; uniprocessor ordering rules take care of these
  - Remove membars in cases where the memory locations are known not to be shared
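A sketch of the decision rule an optimizing translator might apply between two consecutive guest memory operations, assuming the guest is processor consistent (only write-to-read order is relaxed); the enum and function names are illustrative:

```c
#include <stdbool.h>

typedef enum { OP_READ, OP_WRITE } mem_op;

/* Processor consistency relaxes only write-to-read program order. */
static bool guest_allows_reorder(mem_op first, mem_op second)
{
    return first == OP_WRITE && second == OP_READ;
}

/* Decide whether a membar must separate two consecutive accesses,
 * applying the three optimizations from this slide. */
bool need_membar_between(mem_op first, mem_op second,
                         bool same_location, bool known_not_shared)
{
    if (guest_allows_reorder(first, second))
        return false;   /* guest never guaranteed this order anyway */
    if (same_location)
        return false;   /* uniprocessor ordering rules handle this  */
    if (known_not_shared)
        return false;   /* no other processor can observe the order */
    return true;        /* otherwise: keep the membar               */
}
```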

49 Example: Adding Memory Barriers
Accesses in guest (processor consistency):
  Read A; Write A; Read B; Read C; Write C; Write B
Accesses in host (release consistency), with a membar inserted after every access:
  Read A; membar; Write A; membar; Read B; membar; Read C; membar; Write C; membar; Write B; membar
Eliminating membars in host (W-R relaxation): the membars following Write A and Write B only order a write before a later read, which processor consistency already relaxes, so they can be removed:
  Read A; membar; Write A; Read B; membar; Read C; membar; Write C; membar; Write B

50 Non-shared Locations
Accesses in guest (processor consistency):
  Read A; Write A; Read B; Read C; Write C; Write B
[Figure: placement of membars in the host when location B is not shared, and when locations A and B are both not shared; each case removes further membars]
- Eliminated membars that enforce ordering between accesses to non-shared locations
- The last membar is left in place because of unknown interactions with the rest of the code

