
Kit Cischke 09/09/08 CS 5090

Overview
- Background: What are we doing here?
- A Return to Virtual Machine Monitors: What does Disco do?
- Disco: A Return to VMMs: How does Disco do it?
- Experimental Results: How well does Disco dance?

The Basic Problem
- With the explosion of multiprocessor machines, especially of the NUMA variety, the problem of effectively using the machines becomes more immediate.
- NUMA = Non-Uniform Memory Access; shows up a lot in clusters.
- The authors point out that the problem applies to any major hardware innovation, not just multiprocessors.

Potential Solution
- Solution: Rewrite the operating system to address fault-tolerance and scalability.
- Flaws:
  - Rewriting will introduce bugs.
  - Bugs can disrupt the system or the applications.
  - Instabilities are usually less tolerated on these kinds of systems because of their application space.
  - You may not have access to the OS source.

Not So Good
- Okay. So that wasn’t so good. What else do we have?
- How about Virtual Machine Monitors?
- A new twist on an old idea, which may work better now that we have faster processors.

Enter Disco
Disco is a system VM that presents the same fundamental machine abstraction to all of the various OS’s that might be running on the machine. These can be commodity OS’s (uniprocessor or multiprocessor) or specialty systems.

Disco VMM
- Fundamentally, the hardware is a cluster, but Disco introduces some global policies to manage all of the resources, which makes for better usage of the hardware.
- We’ll use commodity operating systems and write the VMM. Rather than millions of lines of code, we’ll write a few thousand.
- What if an application’s resource needs exceed what a single commodity OS can manage?

Scalability
- Very simple changes to the commodity OS (maybe at the driver level or as a kernel extension) can allow virtual machines to share resources.
- E.g., a parallel database could have a cache in shared memory and multiple virtual processors running on virtual machines.
- Support for specialized OS’s that need the power of multiple processors but not all of the features offered by a commodity OS.

Further Benefits
- Multiple copies of an OS naturally address scalability and fault containment.
- Need greater scaling? Add a VM.
- Only the monitor and the system protocols (NFS, etc.) need to scale.
- OS or application crashes? No problem. The rest of the system is isolated.
- NUMA memory management issues are addressed.
- Multiple versions of different OS’s provide legacy support and convenient upgrade paths.

Not All Sunshine & Roses
- VMM Overhead
  - Additional exception processing, instruction execution and memory to virtualize hardware.
  - Privileged instructions aren’t directly executed on the hardware, so we need to fake it. I/O requests need to be intercepted and remapped.
  - Memory overhead is rough too. Consider having 6 copies of Vista in memory simultaneously.
- Resource Management
  - The VMM can’t make intelligent decisions about code streams without info from the OS.

One Last Disadvantage
- Communication: sometimes resources simply can’t be shared the way we want.
- Most of these problems can be mitigated, though.
- For example, most operating systems have good NFS support. So use it.
- But… we can make it even better! (Details forthcoming.)

Introducing Disco
- A VMM designed for the FLASH multiprocessor.
- FLASH is an academic machine designed at Stanford University.
- It is a collection of nodes, each containing a processor, memory, and I/O. The nodes use directory-based cache coherence, which makes the machine look like a CC-NUMA machine.
- Disco has also been ported to a number of other machines.

Disco’s Interface
- The virtual CPU of Disco is an abstraction of a MIPS R10000.
- It not only emulates but extends the hardware (e.g., reduces some kernel operations to simple load/store instructions).
- Presents an abstraction of physical memory starting at address 0 (zero).
- I/O devices: disks, network interfaces, interrupts, clocks, etc.
- Special interfaces for the network and disks.

Disco’s Implementation
- Implemented as a multi-threaded shared-memory program.
- Careful attention paid to memory placement, cache-aware data structures and processor communication patterns.
- Disco is only 13,000 lines of code. For comparison:
  - Windows Server: ~50,000,000
  - Red Hat: ~30,000,000
  - Mac OS X: ~86,000,000

Disco’s Implementation
- The execution of a virtual processor is mapped one-for-one to a real processor.
- At each context switch, the state of the real processor is made to be that of a VP.
- On MIPS, Disco runs in kernel mode and puts the processor in the appropriate mode for what’s being run: supervisor mode for the OS, user mode for apps.
- A simple scheduler allows VP’s to be time-shared across the physical processors.

Disco’s Implementation
- Virtual Physical Memory: this discussion goes on for 1.5 pages. To sum up:
- The OS makes requests to physical addresses, and Disco translates them to machine addresses.
- Disco uses the hardware TLB for this.
- Switching a different VP onto a processor requires a TLB flush, so Disco maintains a second-level software TLB to offset the performance hit.
- There’s a technical issue with TLBs, kernel address space and the MIPS processor that threw them for a loop.
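
The translation scheme above can be sketched in a few lines. This is a toy model with invented names (`pmap`, `l2tlb`), not Disco’s actual data structures:

```python
# Toy sketch of Disco-style address translation. The guest OS issues
# "physical" addresses; the VMM's pmap maps each guest physical page to
# a machine page. A software second-level TLB caches translations so
# they survive the hardware-TLB flush on a VCPU switch.

PAGE_SIZE = 4096

class VirtualMachine:
    def __init__(self, pmap):
        self.pmap = pmap      # guest physical page -> machine page
        self.l2tlb = {}       # software second-level TLB

    def translate(self, phys_addr):
        ppn, offset = divmod(phys_addr, PAGE_SIZE)
        mpn = self.l2tlb.get(ppn)
        if mpn is None:       # software-TLB miss: consult the pmap
            mpn = self.pmap[ppn]
            self.l2tlb[ppn] = mpn
        return mpn * PAGE_SIZE + offset

    def flush_hw_tlb(self):
        # A context switch flushes the hardware TLB, but the software
        # second-level TLB survives, so reloads are cheap.
        pass
```

The point of the second-level structure is that most reloads after a flush hit the software cache instead of redoing the full lookup.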

NUMA Memory Management
In an effort to mitigate the non-uniform memory access effects of a NUMA machine, Disco does a couple of things:
- Allocates memory with “affinity” to the processor that uses it, as much as possible.
- Migrates or replicates pages across virtual machines to reduce long-latency remote memory accesses.
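
As a rough illustration, a migration policy of this flavor can be driven by per-page remote-miss counters. Everything here, including the names and the threshold, is a made-up example rather than Disco’s actual policy:

```python
# Hypothetical hot-page policy: count remote cache misses per
# (page, node); once a remote node hammers a page past a threshold,
# migrate the page to that node.
from collections import Counter

MIGRATE_THRESHOLD = 100   # invented tuning knob

class PagePlacer:
    def __init__(self, home):
        self.home = dict(home)          # page -> home node
        self.remote_misses = Counter()  # (page, node) -> miss count

    def record_miss(self, page, node):
        if node == self.home[page]:
            return None                 # local miss: nothing to do
        self.remote_misses[(page, node)] += 1
        if self.remote_misses[(page, node)] >= MIGRATE_THRESHOLD:
            self.home[page] = node      # migrate toward the hot node
            self.remote_misses[(page, node)] = 0
            return "migrated"
        return None
```

Read-shared pages would be replicated rather than migrated; the sketch shows only the migration half of the decision.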

Virtual I/O Devices
- Obviously Disco needs to intercept I/O requests and direct them to the actual device.
- Primarily handled by installing drivers for Disco I/O in the guest OS.
- DMA provides an interesting challenge, in that the DMA addresses need the same translation as regular accesses.
- However, we can do some especially cool things with DMA requests to disk.

Copy-on-Write Disks
- All disk DMA requests are caught and analyzed. If the data is already in memory, we don’t have to go to disk for it.
- If the request is for a full page, we just update a pointer in the requesting virtual machine.
- So what?
- Multiple VM’s can share data without being aware of it. Only modifying the data causes a copy to be made.
- Awesome for scaling up apps by using multiple copies of an OS. Only really need one copy of the OS kernel, libraries, etc.
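
A minimal sketch of the copy-on-write idea (names invented for illustration; the real mechanism works at the machine-page level with hardware write protection):

```python
# Toy copy-on-write disk cache: a global table maps disk block ->
# shared page. A full-page read just maps the shared page into the
# requesting VM; a write gives that VM a private copy.

class CowDiskCache:
    def __init__(self, disk):
        self.disk = disk        # block -> bytes (backing store)
        self.shared = {}        # block -> shared in-memory page

    def read(self, vm_pages, vpage, block):
        if block not in self.shared:        # first reader: one real DMA
            self.shared[block] = self.disk[block]
        vm_pages[vpage] = ("shared", block) # pointer update, no copy
        return self.shared[block]

    def write(self, vm_pages, vpage, data):
        # modifying shared data triggers a private copy for this VM only
        vm_pages[vpage] = ("private", data)
```

The first VM’s read does the only real disk access; later readers just get a mapping to the same page, and only a write forces a private copy.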

My Favorite – Networking
- The copy-on-write disk stuff is great for non-persistent disks. But what about persistent ones? Let’s just use NFS.
- But here’s a dumb thing: a VM has a copy of information it wants to send to another VM on the same physical machine. In a naïve approach, we’d let that data be duplicated, taking up extra memory pointlessly.
- So, let’s use copy-on-write for our network interface too!

Virtual Network Interface
- Disco provides a virtual subnet for VM’s to talk to each other.
- This virtual device is Ethernet-like, but with no maximum transfer size.
- Transfers are accomplished by updating pointers rather than actually copying data (until absolutely necessary).
- The OS sends out the requests as NFS requests.
- “Ah,” but you say, “what about data locality as a VM starts accessing those files and memory?”
- Page replication and migration!
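
The zero-copy transfer can be illustrated like this (a toy model with invented names; the real mechanism remaps read-only machine pages underneath the NFS data path):

```python
# Toy virtual subnet: "sending" a message between co-located VMs
# remaps the sender's machine page into the receiver's page table
# instead of copying the payload.

class VirtualSubnet:
    def __init__(self):
        self.machine_pages = {}   # machine page number -> payload
        self.next_mpn = 0

    def alloc(self, payload):
        mpn, self.next_mpn = self.next_mpn, self.next_mpn + 1
        self.machine_pages[mpn] = payload
        return mpn

    def send(self, sender_map, receiver_map, vpage_src, vpage_dst):
        # the transfer is a pointer update: both VMs now map the
        # same machine page, so no data is copied
        receiver_map[vpage_dst] = sender_map[vpage_src]
```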

About those Commodity OS’s
- So what do we really need to do to get these commodity operating systems running on Disco?
- Surprisingly, a lot and a little.
- Minor changes were needed to IRIX’s HAL, amounting to 2 header files and 15 lines of assembly code. This did lead to a full kernel recompile, though.
- Disco needs device drivers. Let’s just steal them from IRIX!
- Don’t trap on every privileged register access. Convert these accesses into normal loads/stores to a special address space linked to the privileged registers.
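
The load/store trick can be illustrated with a toy model (the addresses and register layout here are invented for illustration, not the actual MIPS or Disco layout):

```python
# Toy model of the HAL patch: instead of trapping on every privileged-
# register access, the patched kernel reads and writes addresses in a
# special region that the VMM backs with the virtual CPU's register
# state.

PRIV_BASE = 0xFFFF0000                        # invented base address
REG_OFFSETS = {"status": 0x0, "cause": 0x8}   # illustrative layout

class VirtualCpu:
    def __init__(self):
        self.regs = {"status": 0, "cause": 0}

    def load(self, addr):
        # a normal load from the special space returns the virtual register
        for name, off in REG_OFFSETS.items():
            if addr == PRIV_BASE + off:
                return self.regs[name]
        raise ValueError("not a privileged-register address")

    def store(self, addr, value):
        # a normal store updates the virtual register, with no trap taken
        for name, off in REG_OFFSETS.items():
            if addr == PRIV_BASE + off:
                self.regs[name] = value
                return
        raise ValueError("not a privileged-register address")
```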

More Patching
- “Hinting” was added to the HAL to help the VMM not do dumb things (or at least do fewer dumb things).
- When the OS goes idle, the MIPS (usually) defaults to a low-power mode. Disco just stops scheduling the VM until something interesting happens.
- Other minor things were done, but those required patching the kernel.

SPLASHOS
- Some high-performance apps might need most or all of the machine. The authors wrote a “thin” operating system to run SPLASH-2 applications.
- Mostly a proof of concept.

Experimental Results
- Bad idea: target your software at a machine that doesn’t physically exist.
- Like, I don’t know, FLASH?
- Disco was validated using two alternatives:
  - SimOS
  - The SGI Origin2000 board that will form the basis of FLASH

Experimental Design
- Use 4 representative workloads for parallel applications:
  - Software Development (Pmake of a large app)
  - Hardware Development (Verilog simulator)
  - Scientific Computing (raytracing and a sorting algorithm)
  - Commercial Database (Sybase)
- Not only are they representative, but they each have characteristics that are interesting to study.
- For example, Pmake is multiprogrammed, with lots of short-lived processes, and is OS- and I/O-intensive.

Simplest Results Graph Overhead of Disco is pretty modest compared to the uniprocessor results. Raytrace is the lowest, at only 3%. Pmake is the highest, at 16%. The main hits come from additional traps and TLB misses (from all the flushing Disco does). Interestingly, less time is spent in the kernel in Raytrace, Engineering and Database. Running a 64-bit system mitigates the impact of TLB misses.

Memory Utilization The key thing here is how 8 VM’s don’t require 8x the memory of 1 VM. Interestingly, we have 8 copies of IRIX running in less than 256 MB of physical RAM!

Scalability Page migration and replication were disabled for these runs. All use 8 processors and 256 MB of memory. IRIX has a terrible bottleneck in synchronizing the system’s memory management code. It also has a “lazy” evaluation policy in the virtual memory system that drags “normal” RADIX down. Overall though, check out those performance gains!

Page Migration Benefits The 100% UMA results give a lower bound on performance gains from page migration and replication. But in short, the policies work great.

Real Hardware
- Experiences on the real SGI hardware pretty much confirm the simulations, at least at the uniprocessor level.
- Overheads tend to be in the range of 3-8% on Pmake and the Engineering simulation.

Summing Up
- Disco works pretty well.
- Memory usage scales well; processor utilization scales well.
- Performance overheads are relatively small for most loads.
- Lots of engineering challenges, but most seem to have been overcome.

Final Thoughts
- Everything in this paper seems, in retrospect, to be totally obvious. However, the combination of all of these factors seems like it would have taken just a ton of work.
- Plus, I don’t think I could have done it half as well, to be honest.
- Targeting a non-existent machine seems a little silly.
- Overall, an interesting paper.