Cellular Disco: Resource Management Using Virtual Clusters on Shared-Memory Multiprocessors
Published by ACM, 1999. Authors: K. Govil, D. Teodosiu, Y. Huang, M. Rosenblum.
Presenter: Soumya Eachempati

Motivation
Large-scale shared-memory multiprocessors
– Large number of CPUs (32-128)
– NUMA architectures
Off-the-shelf OSes are not scalable
– Cannot handle a large number of resources
– Memory management not optimized for NUMA
– No fault containment

Existing Solutions
Hardware partitioning
– Provides fault containment
– Rigid resource allocation
– Low resource utilization
– Cannot dynamically adapt to the workload
New operating system
– Provides flexibility and efficient resource management
– Requires considerable effort and time
Goal: exploit the hardware resources to the fullest with minimal effort, while improving flexibility and fault tolerance.

Solution: Disco (VMM)
– Virtual machine monitor
– Addresses NUMA-awareness and scalability issues
Issues not addressed by Disco:
– Hardware fault tolerance/containment
– Resource management policies

Cellular Disco
Approach: convert the multiprocessor machine into a virtual cluster
Advantages:
– Inherits the benefits of Disco
– Can support legacy OSes transparently
– Combines the benefits of hardware partitioning and of a new OS
– Provides fault containment
– Fine-grained resource sharing
– Less effort than developing a new OS

Cellular Disco
Internally structured into semi-independent cells
Much less development effort compared to Hive
No performance loss from fault containment
Warranted design decision: the Cellular Disco code itself is assumed (trusted) to be correct

Cellular Disco Architecture

Resource Management
Over-commits resources
Gives the flexibility to adjust the fraction of resources assigned to each VM
Fault containment imposes restrictions on resource allocation
Both CPU and memory load balancing operate under constraints:
– Scalability
– Fault containment
– Avoiding contention
First-touch allocation, dynamic migration, and replication of hot memory pages

Hardware Virtualization
The VM's interface mimics the underlying hardware
Virtual machine resources (user-defined):
– VCPUs, memory, I/O devices (the VM's "physical" resources)
Physical vs. machine resources (machine resources are allocated dynamically, based on VM priority):
– VCPUs are mapped onto real CPUs
– Physical pages are mapped onto machine pages
The VMM intercepts privileged instructions
– Three modes: user and supervisor (guest OS), kernel (VMM)
– In supervisor mode, all memory accesses are mapped (translated)
Machine memory is allocated to back the physical memory
– pmap and memmap data structures
– Second-level software TLB (L2TLB); see the sketch below
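To make the physical-to-machine indirection concrete, here is a minimal, hypothetical sketch (not the actual Cellular Disco code, and all names are illustrative) of what a monitor could do on a guest TLB miss: probe a software L2TLB for a cached translation, lazily back the guest physical page with a machine page through a pmap-like table, and return the machine frame to insert into the hardware TLB.

```c
/* Illustrative model of the monitor's extra translation level; names and
 * sizes are hypothetical, not the real Cellular Disco data structures. */
#include <stdio.h>
#include <stdint.h>

#define PHYS_PAGES   64
#define L2TLB_SLOTS  16

static uint64_t pmap[PHYS_PAGES];        /* guest pfn -> machine pfn (0 = unbacked) */
static uint64_t next_machine_pfn = 1;    /* toy machine-page allocator */

typedef struct { uint64_t vpn, mpfn; int valid; } l2tlb_entry;
static l2tlb_entry l2tlb[L2TLB_SLOTS];

/* Called on a guest TLB miss with the faulting virtual page and the guest
 * physical page that the guest's page table maps it to. */
static uint64_t translate(uint64_t vpn, uint64_t guest_pfn)
{
    l2tlb_entry *e = &l2tlb[vpn % L2TLB_SLOTS];
    if (e->valid && e->vpn == vpn)
        return e->mpfn;                  /* fast path: software TLB hit */

    if (pmap[guest_pfn] == 0)            /* back the physical page lazily */
        pmap[guest_pfn] = next_machine_pfn++;

    e->vpn = vpn;
    e->mpfn = pmap[guest_pfn];
    e->valid = 1;
    return e->mpfn;                      /* would now be inserted into the hardware TLB */
}

int main(void)
{
    printf("machine pfn = %llu\n", (unsigned long long)translate(0x123, 7));
    printf("machine pfn = %llu\n", (unsigned long long)translate(0x123, 7)); /* L2TLB hit */
    return 0;
}
```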

Hardware fault containment

Cells
The VMM provides software fault containment: it is internally divided into cells
Inter-cell communication:
– Inter-processor RPC
– Messages: no locking needed, since processing is serialized
– Shared memory for some data structures (pmap, memmap)
– Low latency, exactly-once semantics (see the sketch below)
Because the system software layer is trusted, cells can safely use shared memory
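The exactly-once property the slide mentions can be illustrated with a toy model (hypothetical names, not the real Cellular Disco protocol): the sender retransmits until acknowledged, and the receiver discards any sequence number it has already processed, so each request's handler runs exactly once.

```c
/* Toy model of exactly-once inter-cell messaging: at-least-once delivery
 * plus duplicate suppression at the receiver. Illustrative only. */
#include <stdio.h>
#include <stdint.h>

typedef struct {
    uint64_t seq;          /* per-sender sequence number */
    int      payload;
} msg_t;

typedef struct {
    uint64_t last_seen;    /* highest sequence number already processed */
    int      executed;     /* how many requests actually ran */
} cell_t;

/* Receiver side: process a message only if it is new. Messages between a
 * pair of cells are serialized, so a simple high-water mark suffices. */
static void deliver(cell_t *cell, msg_t m)
{
    if (m.seq <= cell->last_seen)
        return;                       /* duplicate from a retransmission: drop */
    cell->last_seen = m.seq;
    cell->executed++;                 /* run the RPC handler exactly once */
}

int main(void)
{
    cell_t receiver = {0, 0};
    msg_t m = { .seq = 1, .payload = 42 };

    deliver(&receiver, m);            /* original send */
    deliver(&receiver, m);            /* retransmission after a lost ACK */
    printf("executed %d time(s)\n", receiver.executed);   /* prints 1 */
    return 0;
}
```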

Implementation 1: MIPS R10000 processors, SGI Origin 2000
Piggybacked on IRIX 6.4 (host OS)
Guest OS: IRIX 6.2
Cellular Disco (CD) is spawned as a multi-threaded kernel process
– Additional overhead < 2% (time spent in the host IRIX)
– No fault isolation: the IRIX kernel is monolithic
Solution: some host OS support is needed, with one copy of the host OS per cell

I/O request execution
Cellular Disco piggybacked on the IRIX kernel

32 MIPS R10000 processors

Characteristics of workloads
– Database: decision-support workload
– Pmake: I/O-intensive workload
– Raytrace: CPU-intensive workload
– Web: kernel-intensive web server workload

Virtualization Overheads

Fault-containment Overheads
Left bar: single-cell configuration
Right bar: 8-cell system

CPU Management
Load balancing mechanisms:
– Three types of VCPU migration: intra-node, inter-node, inter-cell
– Intra-node: loss of CPU cache affinity
– Inter-node: cost of copying the L2TLB, plus a higher long-term cost (loss of node affinity)
– Inter-cell: loss of both cache and node affinity; also increases fault vulnerability
– The affinity penalty is alleviated by replicating pages
Load balancing policies: an idle balancer (local load stealing) and a periodic balancer (global redistribution)
Each CPU has a local run queue of VCPUs
Gang scheduling:
– Run all VCPUs of a VM simultaneously (see the sketch below)
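A hedged, sequential sketch of the gang-scheduling idea (hypothetical names, not the actual scheduler): a rescheduling CPU picks the highest-priority VM whose VCPUs are all runnable and, in the real system, would send an RPC or IPI to each target CPU so those VCPUs start together.

```c
/* Toy illustration of gang scheduling over a set of VMs. */
#include <stdio.h>

#define MAX_VCPUS 8

typedef struct {
    const char *name;
    int priority;
    int nvcpus;
    int runnable[MAX_VCPUS];      /* 1 if that VCPU is ready to run */
    int assigned_cpu[MAX_VCPUS];  /* CPU each VCPU would run on */
} vm_t;

/* A VM is gang-runnable only if every one of its VCPUs is runnable. */
static int gang_runnable(const vm_t *vm)
{
    for (int i = 0; i < vm->nvcpus; i++)
        if (!vm->runnable[i])
            return 0;
    return 1;
}

static void schedule_gang(vm_t *vms, int nvms)
{
    vm_t *best = NULL;
    for (int i = 0; i < nvms; i++)
        if (gang_runnable(&vms[i]) && (!best || vms[i].priority > best->priority))
            best = &vms[i];
    if (!best) return;
    for (int i = 0; i < best->nvcpus; i++)   /* real system: RPC/IPI to each CPU */
        printf("CPU %d: run %s VCPU %d\n", best->assigned_cpu[i], best->name, i);
}

int main(void)
{
    vm_t vms[2] = {
        { "vmA", 5, 2, {1, 0}, {0, 1} },     /* not gang-runnable: VCPU 1 blocked */
        { "vmB", 3, 2, {1, 1}, {2, 3} },
    };
    schedule_gang(vms, 2);                   /* schedules vmB's two VCPUs together */
    return 0;
}
```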

Load Balancing
Load tree: a low-contention, distributed data structure (see the sketch below)
– Contention concentrates on the higher-level tree nodes
Each VCPU carries the list of cells it is vulnerable to (its fault dependencies)
Under heavy load the idle balancer alone is not enough
– A periodic balancer redistributes load locally over each 8-CPU region
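A minimal sketch of the load-tree idea, assuming an implicit binary tree over the CPUs in which each node caches the total number of runnable VCPUs beneath it; an idle CPU can then find a loaded region without scanning every run queue. The structure and names are illustrative, not the actual implementation.

```c
/* Implicit binary tree over NUM_CPUS CPUs: leaf (NUM_CPUS + i) holds CPU i's
 * run-queue length, each interior node the sum of its children. */
#include <stdio.h>

#define NUM_CPUS 32
static int tree[2 * NUM_CPUS];           /* nodes 1..2*NUM_CPUS-1 are used */

/* Propagate a change in CPU cpu's run-queue length up to the root.
 * Only the O(log N) ancestors are written, limiting contention. */
void update_load(int cpu, int delta)
{
    for (int n = NUM_CPUS + cpu; n >= 1; n /= 2)
        tree[n] += delta;
}

/* Walk down from the root toward the most loaded subtree to pick a CPU to
 * steal a VCPU from. The higher levels are read by many CPUs, which is the
 * contention the slide mentions. */
int find_loaded_cpu(void)
{
    int n = 1;
    while (n < NUM_CPUS)
        n = (tree[2 * n] >= tree[2 * n + 1]) ? 2 * n : 2 * n + 1;
    return n - NUM_CPUS;
}

int main(void)
{
    update_load(5, 3);                   /* CPU 5 has 3 runnable VCPUs */
    update_load(20, 1);
    printf("steal from CPU %d\n", find_loaded_cpu());   /* prints CPU 5 */
    return 0;
}
```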

CPU Scheduling and Results
Scheduling: pick the highest-priority gang-runnable VCPU that has been waiting, then send out RPCs (to co-schedule the VM's other VCPUs)
Three configurations on 32 processors:
a) One VM with 8 VCPUs running an 8-process raytrace
b) 4 VMs
c) 8 VMs (64 VCPUs in total)
The pmap is migrated only when all of a VM's VCPUs have been migrated out of a cell
Data pages are also migrated, for independence from the original cell

Memory Management
Each cell has its own freelist of pages, indexed by home node
Page allocation request (see the sketch below):
– Satisfied from the local node if possible
– Else satisfied from another node in the same cell
– Else a page is borrowed from another cell
Memory balancing:
– Low-memory thresholds trigger borrowing and lending
– Each VM has a priority list of lender cells
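The three-level fallback described above could look roughly like the toy model below. The data layout and names are hypothetical; the real allocator also consults the VM's priority list of lender cells and tracks the resulting fault dependencies.

```c
/* Toy model of the allocation order: local node, then any node in the
 * local cell, then borrow from another cell. */
#include <stdio.h>

#define CELLS 4
#define NODES_PER_CELL 2

/* free_pages[cell][node]: count of free pages whose home is that node. */
static int free_pages[CELLS][NODES_PER_CELL];

/* Returns 0 and the (cell, node) a page came from, or -1 if none is free. */
static int alloc_page(int my_cell, int my_node, int *from_cell, int *from_node)
{
    /* 1. Prefer the requesting node (NUMA locality). */
    if (free_pages[my_cell][my_node] > 0) {
        free_pages[my_cell][my_node]--;
        *from_cell = my_cell; *from_node = my_node;
        return 0;
    }
    /* 2. Else any node in the same cell (keeps fault dependencies local). */
    for (int n = 0; n < NODES_PER_CELL; n++)
        if (free_pages[my_cell][n] > 0) {
            free_pages[my_cell][n]--;
            *from_cell = my_cell; *from_node = n;
            return 0;
        }
    /* 3. Else borrow from another cell, widening the VM's fault exposure. */
    for (int c = 0; c < CELLS; c++)
        for (int n = 0; n < NODES_PER_CELL; n++)
            if (free_pages[c][n] > 0) {
                free_pages[c][n]--;
                *from_cell = c; *from_node = n;
                return 0;
            }
    return -1;
}

int main(void)
{
    free_pages[2][1] = 1;                 /* only one free page, in cell 2 */
    int c, n;
    if (alloc_page(0, 0, &c, &n) == 0)
        printf("allocated from cell %d, node %d\n", c, n);
    return 0;
}
```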

Memory Paging
Page replacement: second-chance FIFO (see the sketch below)
Avoids double-paging overheads
Tracking used pages: uses annotated OS routines
Page sharing: shared pages are marked explicitly
Redundant paging: avoided by trapping every access to the virtual paging disk
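A minimal, clock-style sketch of second-chance FIFO, the replacement policy named above: a page whose reference bit is set gets its bit cleared and a second trip through the FIFO instead of being evicted. This is the textbook algorithm, not the Cellular Disco source.

```c
/* Second-chance FIFO over a small fixed set of pages (clock formulation). */
#include <stdio.h>

#define NPAGES 4

static int fifo[NPAGES]       = {10, 11, 12, 13};  /* page numbers, oldest first */
static int referenced[NPAGES] = {1, 0, 1, 0};      /* reference bits */
static int head = 0;                                /* FIFO/clock hand */

static int evict_one(void)
{
    for (;;) {
        if (referenced[head]) {
            referenced[head] = 0;            /* second chance: clear and skip */
            head = (head + 1) % NPAGES;
        } else {
            int victim = fifo[head];         /* not recently used: evict it */
            head = (head + 1) % NPAGES;      /* caller reuses this slot */
            return victim;
        }
    }
}

int main(void)
{
    printf("evicting page %d\n", evict_one());   /* page 10 is spared, 11 is evicted */
    return 0;
}
```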

Implementation 2: FLASH Simulation
FLASH has hardware fault recovery support
Simulation of the FLASH architecture on SimOS
Fault injection:
– Power failure
– Link failure
– Firmware failure
Results: 100% fault containment

Fault Recovery
Hardware support needed to:
– Determine which resources are still operational
– Reconfigure the machine to use the good resources
Cellular Disco recovery (see the sketch below):
– Step 1: all cells agree on a liveset of nodes
– Step 2: abort RPCs/messages to dead cells
– Step 3: kill the VMs that depend on the failed cells
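A toy version of step 3 under the assumption (hypothetical structures, not the real recovery code) that each VM records the set of cells it has become dependent on for memory or VCPUs: after the surviving cells agree on the liveset, any VM that depends on a dead cell is killed while the rest continue.

```c
/* Illustrative fault-dependency check used after liveset agreement. */
#include <stdio.h>

#define NCELLS 8

typedef struct {
    const char *name;
    int uses_cell[NCELLS];   /* 1 if the VM has a resource on that cell */
} vm_t;

static int depends_on_dead_cell(const vm_t *vm, const int liveset[NCELLS])
{
    for (int c = 0; c < NCELLS; c++)
        if (vm->uses_cell[c] && !liveset[c])
            return 1;
    return 0;
}

int main(void)
{
    int liveset[NCELLS] = {1, 1, 1, 0, 1, 1, 1, 1};   /* cell 3 failed */
    vm_t vms[2] = {
        { "vm0", {1, 0, 0, 1, 0, 0, 0, 0} },           /* borrowed memory from cell 3 */
        { "vm1", {0, 1, 0, 0, 0, 0, 0, 0} },
    };
    for (int i = 0; i < 2; i++)
        printf("%s: %s\n", vms[i].name,
               depends_on_dead_cell(&vms[i], liveset) ? "killed" : "survives");
    return 0;
}
```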

Fault-recovery Times
Recovery times are higher for larger memory sizes
– Recovery requires scanning memory for fault detection

Summary
Virtual machine monitor:
– Flexible resource management
– Legacy OS support
Cellular Disco:
– Cells provide fault containment
– Creates a virtual cluster
– Needs hardware support