Virtuoso: Distributed Computing Using Virtual Machines
Peter A. Dinda
Prescience Lab, Department of Computer Science, Northwestern University

2 People and Acknowledgements
Students
– Ashish Gupta, Ananth Sundararaj, Dong Lu, Jason Skicewicz, Billy Davidson, Andrew Weinrich
Collaborators
– In-Vigo project at University of Florida: Renato Figueiredo, Jose Fortes
Funder
– NSF through several awards

3 Outline
Motivation
Virtuoso Model
Virtual networking and remote devices
Information services
Resource measurement and prediction
Resource control
Related work
Conclusions
R. Figueiredo, P. Dinda, J. Fortes, A Case For Grid Computing on Virtual Machines, ICDCS 2003

4 How do we deliver arbitrary amounts of computational power to ordinary people?

5 Distributed and Parallel Computing Interactive Applications

6 How do we deliver arbitrary amounts of computational power to ordinary people? Distributed and Parallel Computing Interactive Applications

7 [Diagram of available infrastructure; labels in extraction order]
IBM xSeries virtual cluster (64 CPUs), 1 TB RAID
Northwestern Internet Interactivity Environment Cluster, CAVE (~90 CPUs), 8 TB RAID
2 Distributed Optical Testbed Clusters: IBM xSeries (14-28 CPUs), 1 TB RAID
Nortel Optera Metro Edge Optical Router
Distributed Optical Testbed (DOT) Private Optical Network
DOT clusters with optical connectivity, IBM xSeries (14-28 CPUs), 1 TB RAID: Argonne, U.Chicago, IIT, NCSA, others

8 Grid Computing
“Flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources”
I. Foster, C. Kesselman, S. Tuecke, The Anatomy of the Grid: Enabling Scalable Virtual Organizations, International J. Supercomputer Applications, 15(3), 2001
Globus, Condor/G, Avaki, EU DataGrid SW, …

9 Complexity from User’s Perspective
Process or job model
– Lots of complex state: connections, special shared libraries, licenses, file descriptors
Operating system specificity
– Perhaps even version-specific
– Symbolic supercomputer example
Need to buy into some “Grid API”
Install and learn complex Grid software

10 Users already know how to deal with this complexity at another level

11 Complexity from Resource Owner’s Perspective
Install and learn complex Grid software
Deal with local accounts and privileges
– Associated with global accounts or certificates
Protection
Support users with different OS, library, license, etc., needs

12 Virtual Machines
Language-oriented VMs
– Abstract interpreted machine, JIT Compiler, large library
– Examples: UCSD p-system, Java VM, .NET VM
Application-oriented VMs
– Redirect library calls to appropriate place
– Examples: Entropia VM
Virtual servers
– Kernel makes it appear that a group of processes are running on a separate instance of the kernel
– Examples: Ensim, Virtuozzo, SODA, …
Virtual machine monitors (VMMs)
– Raw machine is the abstraction
– VM represented by a single image
– Examples: IBM’s VM, VMWare, Virtual PC/Server, Plex/86, SIMICS, Hypervisor, DesQView/TaskView, VM/386

13 VMWare GSX VM

14 Isn’t It Going to Be Too Slow?
Application                        Resource               ExecTime (10^3 s)   Overhead
SpecHPC Seismic (serial, medium)   Physical               16.4                N/A
                                   VM, local              […]                 […] %
                                   VM, Grid virtual FS    […]                 […] %
SpecHPC Climate (serial, medium)   Physical               9.31                N/A
                                   VM, local              […]                 […] %
                                   VM, Grid virtual FS    […]                 […] %
Experimental setup:
– physical: dual Pentium III 933MHz, 512MB memory, RedHat 7.1, 30GB disk
– virtual: VMware Workstation 3.0a, 128MB memory, 2GB virtual disk, RedHat 2.0
NFS-based grid virtual file system between UFL (client) and NWU (server)
Small relative virtualization overhead; compute-intensive

15 Isn’t It Going To Be Too Slow?
Synthetic benchmark: exponential arrivals of compute-bound tasks, background load provided by playback of traces from PSC
Relative overheads < 10%

16 Isn’t It Going To Be Too Slow?
Virtualized NICs have very similar bandwidth, slightly higher latencies
– J. Sugerman, G. Venkitachalam, B-H Lim, “Virtualizing I/O Devices on VMware Workstation’s Hosted Virtual Machine Monitor”, USENIX 2001
Disk-intensive workloads (kernel build, web service): 30% slowdown
– S. King, G. Dunlap, P. Chen, “OS support for Virtual Machines”, USENIX 2003

17 Virtuoso
Approach: Lower level of abstraction
– Raw machines, not processes
Mechanism: Virtual machine monitors
Our Focus: Middleware support to hide complexity
– Ordering, instantiation, migration of machines
– Virtual networking and remote devices
– Connectivity to remote files, machines
– Information services
– Monitoring and prediction
– Resource control

18 The Virtuoso Model
1. User orders raw machine(s)
– Specifies hardware and performance
– Basic software installation available: OS, libraries, licenses, etc.
2. Virtuoso creates raw image and returns reference
– Image contains disk, memory, configuration, etc.
3. User “powers up” machine
4. Virtuoso chooses provider
– Information service
5. Virtuoso migrates image to provider
– Efficient network transfer: rsync, demand paging, versioned filesystems

19 The Virtuoso Model
6. Provider instantiates machine
– Virtual networking ties machine back to user’s home network
– Remote device support makes user’s desktop’s devices available on remote VM
– Remote display support gives user the console of the machine (VNC)
– Resource control to give user expected performance
7. User goes to his network admin to get address, routing for his new machine
8. User customizes machine
– Feeds in CDs, floppies, ftp, up2date, etc.

20 The Virtuoso Model
9. User uses machine
– Shutdown, hibernate, power-off, throw away
10. Virtuoso continuously monitors and adapts
– Various mechanisms, all invisible to user: migrating the machine, routing traffic between machines, virtual network topology, predictive scheduling versus reservations
– Various goals: price, interactivity
– Information service
– Resource monitoring and prediction

21 Outline
Motivation
Virtuoso Model
Virtual networking and remote devices
Information services
Resource measurement and prediction
Resource control
Related work
Conclusions
R. Figueiredo, P. Dinda, J. Fortes, A Case For Grid Computing on Virtual Machines, ICDCS 2003

22 Why Virtual Networking?
A machine is suddenly plugged into your network. What happens?
– Does it get an IP address?
– Is it a routeable address?
– Does the firewall let its traffic through?
– To any port?
How do we make environments that are hostile to virtual machines as friendly as the user’s LAN?

23 A Layer 2 Virtual Network (VLAN) for the User’s Virtual Machines
Why Layer 2?
– Protocol agnostic
– Mobility
– Simple to understand
– Ubiquity of Ethernet on end-systems
What about scaling?
– Number of VMs limited
– Hierarchical routing possible because MAC addresses can be assigned hierarchically
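
The scaling point about hierarchical MAC assignment can be pictured with a minimal sketch. The field layout below (a fixed locally administered first octet, then two octets of site id, two of host id, one of VM id) and the names `make_mac` and `site_of` are assumptions made for illustration; the slides do not say how Virtuoso lays out its addresses.

```python
def make_mac(site: int, host: int, vm: int) -> str:
    """Build a MAC of the hypothetical form 02:SS:SS:HH:HH:VV.

    The first octet 0x02 sets the locally-administered bit and keeps
    the address unicast, so it cannot clash with vendor-assigned MACs.
    """
    if not (0 <= site < 2**16 and 0 <= host < 2**16 and 0 <= vm < 2**8):
        raise ValueError("field out of range")
    octets = [0x02, site >> 8, site & 0xFF, host >> 8, host & 0xFF, vm]
    return ":".join(f"{o:02x}" for o in octets)


def site_of(mac: str) -> int:
    """Recover the site id, which is all a top-level router would need."""
    octets = [int(part, 16) for part in mac.split(":")]
    return (octets[1] << 8) | octets[2]
```

With a layout like this, a forwarding table at the top of the hierarchy needs one entry per site rather than one per VM.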

24 A Simple Layer 2 Virtual Network
[Diagram labels: Client, Server, Remote VM, Physical NIC, VM monitor, Virtual NIC, Physical NIC, SSH, Hostile Remote Network, Friendly Local Network]

25 A Simple Layer 2 Virtual Network
[Diagram labels: Client, Server, Remote VM, Physical NIC, VM monitor, Virtual NIC, Physical NIC, SSH, Hostile Remote Network, Friendly Local Network]

26 A Simple Layer 2 Virtual Network
[Diagram labels: Client, Server, Remote VM, Physical NIC, VM monitor, Bridged, Virtual NIC, Physical NIC, SSH Tunnel, Hostile Remote Network, Friendly Local Network]

27 An Overlay Network
Bridged components and their connections form an overlay network for routing traffic among virtual machines and the user’s home network
Links can trivially be added or removed
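
A small sketch of the overlay idea: nodes are bridging endpoints, links can be added or removed at any time, and a frame follows the shortest hop path that currently exists. The `Overlay` class and its breadth-first routing are illustrative assumptions, not the routing that Virtuoso itself uses.

```python
from collections import deque


class Overlay:
    """Undirected overlay graph whose links can be added or removed freely."""

    def __init__(self):
        self.links = {}  # node name -> set of neighbouring node names

    def add_link(self, a, b):
        self.links.setdefault(a, set()).add(b)
        self.links.setdefault(b, set()).add(a)

    def remove_link(self, a, b):
        self.links.get(a, set()).discard(b)
        self.links.get(b, set()).discard(a)

    def path(self, src, dst):
        """Return the fewest-hop path from src to dst, or None if unreachable."""
        prev = {src: None}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            if node == dst:
                hops = []
                while node is not None:
                    hops.append(node)
                    node = prev[node]
                return hops[::-1]
            # Sorted only to make the chosen path deterministic.
            for nxt in sorted(self.links.get(node, ())):
                if nxt not in prev:
                    prev[nxt] = node
                    queue.append(nxt)
        return None
```

Starting from a star centred on the user's home network and then adding a direct link between two VMs shortens the path between them without touching anything else.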

28 Bootstrapping the Virtual Network
Star topology always possible
– TCP session from client must have been possible
Better topology may be possible
– Depends on security at each site
Topology may change
– Virtual machines can migrate
Bootstrap to higher layers
– Virtual filesystems

29 Remote Devices
[Diagram labels: Client, Server, Remote VM, VM monitor, nbd-server, nbd-client, Virtual CDROM, SSH Tunnel, Linux Network Block Device Driver, /dev/cdrom, /dev/nb0, VMWare CD Image, Physical CDROM]

30 Extending a Grid Information Service (GIS) to Support Virtual Machines
A GIS contains information about the available resources in a grid
– Hosts, routers, switches, software, etc.
URGIS project at Northwestern
– GIS based on the relational data model
– Compositional queries (joins) to find collections of resources
  “Find physical machines which can instantiate a virtual machine with 1 GB of memory”
  “Find sets of four different virtual machines on the same network with a total memory between 512 MB and 1 GB”
– Nondeterministic query extension for scalability

31 The RGIS Design (Per Site)
[Diagram labels, in extraction order]
Oracle 9i Back End: Windows, Linux, Parallel Server, etc. (three instances)
Oracle 9i Front End: transactional inserts and updates using stored procedures, queries using select statements (uses database’s access control)
Update Manager
Web Interface
Content Delivery Network Interface: for loose consistency
Query Manager and Rewriter
Scripts
Schema, type hierarchy, indices, PL/SQL stored procedures for each object
C API
Updates encrypted using asymmetric cryptography on network. Only those with appropriate keys have access
RDBMS: use of Oracle is not a requirement of approach
External user identification mapped to database users and roles
site-to-site (tentative)

32 Motivation for Non-deterministic Queries
Queries for compositions of resources easily expressed in SQL:
“Find 2 hosts with Linux that together have 3 GB of RAM”
  select h1.insertid, h2.insertid
  from hosts h1, hosts h2
  where h1.os='LINUX' and h2.os='LINUX' and h1.mem_mb+h2.mem_mb>=3072
But such queries can be very expensive to execute
However, we typically don’t need the entire result set, just some rows, and not always the same ones
And we need them in a bounded amount of time
Approach: return random sample of result set
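
To make the cost concrete, here is a minimal runnable version of the slide's self-join, using an in-memory SQLite database in place of Oracle. The table layout is inferred from the column names in the query, and the extra `h1.insertid < h2.insertid` condition is an assumption added here so that a host is not paired with itself or listed in both orders. The join considers every pair of Linux hosts, which is why it grows quadratically with grid size.

```python
import sqlite3


def find_host_pairs(rows, min_total_mb=3072):
    """Return pairs of Linux hosts whose combined memory meets the threshold.

    `rows` is an iterable of (insertid, os, mem_mb) tuples.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE hosts (insertid INTEGER PRIMARY KEY, os TEXT, mem_mb INTEGER)"
    )
    conn.executemany("INSERT INTO hosts VALUES (?, ?, ?)", rows)
    pairs = conn.execute(
        """
        SELECT h1.insertid, h2.insertid
        FROM hosts h1, hosts h2
        WHERE h1.os = 'LINUX' AND h2.os = 'LINUX'
          AND h1.mem_mb + h2.mem_mb >= ?
          AND h1.insertid < h2.insertid   -- assumption: skip self and mirrored pairs
        ORDER BY h1.insertid, h2.insertid
        """,
        (min_total_mb,),
    ).fetchall()
    conn.close()
    return pairs
```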

33 Implementing non-deterministic queries
  select nondeterministically h1.insertid, h2.insertid
  from hosts h1, hosts h2
  where h1.os='LINUX' and h2.os='LINUX' and h1.mem_mb+h2.mem_mb>=3072
  within 2 seconds
Query Manager and Rewriter
  SELECT H1.INSERTID, H2.INSERTID
  FROM HOSTS H1, HOSTS H2, INSERTIDS TEMP_H1, INSERTIDS TEMP_H2
  WHERE (H1.OS='LINUX' AND H2.OS='LINUX' AND H1.MEM_MB+H2.MEM_MB>=3072)
  AND (H1.INSERTID=TEMP_H1.INSERTID AND TEMP_H1.rand > AND TEMP_H1.rand AND TEMP_H2.rand <= )
Random sample of input tables
Probability of inclusion determined by time constraint and server load
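
The rewriting idea can be sketched as follows: each host id gets a stored random number, and the join only sees rows whose number falls in a chosen window, so a narrower window means less work and a smaller sample of the result. This is a sketch under assumptions, using SQLite rather than Oracle: the `insertids(insertid, rand)` layout is inferred from the rewritten query, the window bounds `lo` and `hi` are passed in directly, and how RGIS derives them from the time constraint and server load is not shown.

```python
import random
import sqlite3


def sampled_pairs(rows, lo, hi, seed=0):
    """Run the two-host query over a random sample of the hosts table.

    A host takes part only if its stored random value is in (lo, hi].
    `rows` is an iterable of (insertid, os, mem_mb) tuples.
    """
    rows = list(rows)
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE hosts (insertid INTEGER PRIMARY KEY, os TEXT, mem_mb INTEGER)"
    )
    conn.execute("CREATE TABLE insertids (insertid INTEGER PRIMARY KEY, rand REAL)")
    conn.executemany("INSERT INTO hosts VALUES (?, ?, ?)", rows)
    conn.executemany(
        "INSERT INTO insertids VALUES (?, ?)", [(r[0], rng.random()) for r in rows]
    )
    result = conn.execute(
        """
        SELECT h1.insertid, h2.insertid
        FROM hosts h1, hosts h2, insertids t1, insertids t2
        WHERE h1.os = 'LINUX' AND h2.os = 'LINUX'
          AND h1.mem_mb + h2.mem_mb >= 3072
          AND h1.insertid = t1.insertid AND t1.rand > :lo AND t1.rand <= :hi
          AND h2.insertid = t2.insertid AND t2.rand > :lo AND t2.rand <= :hi
        ORDER BY h1.insertid, h2.insertid
        """,
        {"lo": lo, "hi": hi},
    ).fetchall()
    conn.close()
    return result
```

A window that covers every stored value reproduces the full deterministic result; any narrower window returns a subset of it.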

34 Nondeterministic query performance
Select two hosts that together have >3GB of RAM
500,000 host grid generated by GridG
Memory distribution according to Smith study of MDS contents
Dual Xeon 1 GHz, 2 GB, 240 GB RAID, RGIS2, Oracle 9i Enterprise
Average of five trials
Meaningful tradeoff between query processing time and result set size is possible

35 Nondeterministic query performance
Select n hosts that together have >3GB of RAM
500,000 host grid generated by GridG
Memory distribution according to Smith study of MDS contents
Dual Xeon 1 GHz, 2 GB, 240 GB RAID, RGIS2, Oracle 9i Enterprise
Average of five trials
Can use tradeoff to control query time independent of query complexity

36 Deadlines

37 Extending a Grid Information Service (GIS) to Support Virtual Machines
Virtual indirection
– Each RGIS object has a unique id
– Virtualization table associates unique id of virtual resources with unique ids of their constituent physical resources
– Virtual nature of resource is hidden unless query explicitly requests it
Futures
– An RGIS object that does not exist yet
– Futures table of unique ids
– Future nature of resource hidden unless query explicitly requests it
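
A minimal sketch of virtual indirection, assuming a simplified schema: one `hosts` table of objects with unique ids and one `virtualization` table mapping a virtual resource's id to the ids of its physical constituents. The table and column names, and the sample rows, are hypothetical; the real RGIS schema is not given in the slides. An ordinary query sees virtual and physical hosts alike, and only a query that joins through the virtualization table reveals which is which.

```python
import sqlite3


def build_rgis_sketch():
    """Create a toy database: one physical host carrying two virtual hosts."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE hosts (id INTEGER PRIMARY KEY, name TEXT, mem_mb INTEGER);
        CREATE TABLE virtualization (virtual_id INTEGER, physical_id INTEGER);
        INSERT INTO hosts VALUES (1, 'phys-a', 4096);
        INSERT INTO hosts VALUES (2, 'vm-x', 1024);
        INSERT INTO hosts VALUES (3, 'vm-y', 512);
        INSERT INTO virtualization VALUES (2, 1);
        INSERT INTO virtualization VALUES (3, 1);
        """
    )
    return conn


def all_hosts(conn):
    """Ordinary query: the virtual nature of a host is not visible."""
    return [row[0] for row in conn.execute("SELECT name FROM hosts ORDER BY id")]


def constituents(conn, virtual_name):
    """Explicit query: follow the virtualization table to the physical hosts."""
    rows = conn.execute(
        """
        SELECT p.name
        FROM hosts v
        JOIN virtualization m ON m.virtual_id = v.id
        JOIN hosts p ON p.id = m.physical_id
        WHERE v.name = ?
        ORDER BY p.id
        """,
        (virtual_name,),
    )
    return [row[0] for row in rows]
```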

38 Extending a Resource Monitoring and Prediction System to Support Virtual Machines
Measuring and predicting dynamic resource availability to support adaptation
– Virtual machine migration
– Routing on the virtual network
– Application-level adaptation
RPS System at Northwestern
– Host and network measurements for Unix and Windows
– Emphasis on prediction (wide range of linear and nonlinear models) and communication (wide range of transports)

39 RPS Toolkit
Extensible toolkit for implementing resource signal prediction systems [CMU-CS ]
Growing: RTA, RTSA, Wavelets, GUI, etc
Easy “buy-in” for users
C++ and sockets (no threads)
Prebuilt prediction components
Libraries (sensors, time series, communication)

40 Example: Multiscale Network Prediction
Large, recent study of predictability
Hundreds of NLANR and other traces
– Mostly WANs
Different resolutions
– Binning and low-pass via wavelets
Sweet Spot
– Predictability often maximized at particular resolution
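
The effect of resolution on predictability can be illustrated with a toy sketch. It bins a signal by averaging fixed-width blocks and scores a last-value predictor by its mean squared error relative to the signal variance. Both choices are stand-ins chosen for brevity: the study on this slide also uses wavelet-based low-pass filtering and a wider range of predictive models, and the function names here are hypothetical.

```python
def bin_series(series, width):
    """Average consecutive blocks of `width` samples, dropping a short tail."""
    usable = len(series) - len(series) % width
    return [sum(series[i:i + width]) / width for i in range(0, usable, width)]


def error_ratio(series):
    """MSE of a last-value predictor divided by the signal variance.

    Lower means more predictable. Returns 0.0 when the predictor is exact,
    which also covers a constant signal whose variance is zero.
    """
    if len(series) < 2:
        raise ValueError("need at least two samples")
    errors = [(b - a) ** 2 for a, b in zip(series, series[1:])]
    mse = sum(errors) / len(errors)
    if mse == 0:
        return 0.0
    mean = sum(series) / len(series)
    variance = sum((x - mean) ** 2 for x in series) / len(series)
    return mse / variance
```

Computing `error_ratio(bin_series(trace, w))` for several widths `w` gives one score per resolution, which is one way to look for the resolution at which a given trace is easiest to predict.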

41 Multiresolution Network Prediction

42 Extending a Resource Prediction System to Support Virtual Machines
Goal: monitor physical machine and infer behavior inside of virtual machine
Current approach: /proc on physical machine to slowdown on resource rate in virtual machine
– ARX models

43 Resource Control
Owner has an interest in controlling how much and when compute time is given to a virtual machine
Our approach: A language for expressing these constraints, and compilation to real-time schedules, proportional share, etc.
Very early stages. Trying to avoid kernel modifications.

44 Outline
Motivation
Virtuoso Model
Virtual networking and remote devices
Information services
Resource measurement and prediction
Resource control
Related work
Conclusions
R. Figueiredo, P. Dinda, J. Fortes, A Case For Grid Computing on Virtual Machines, ICDCS 2003

45 Related Work
Collective / Capsule Computing (Stanford)
– VMM, Migration/caching, Hierarchical image files
Denali (U. Washington)
– Highly scalable VMMs (1000s of VMMs per node)
CoVirt (U. Michigan)
Xenoserver (Cambridge)
SODA (Purdue)
– Virtual Server, fast deployment of services
Internet Suspend/Resume (Intel Labs Pittsburgh)
Ensim
– Virtual Server, widely used for web site hosting
– WFQ-based resource control released into open-source Linux kernel
Virtuozzo (SWSoft)
– Ensim competitor
Available VMMs: IBM’s VM, VMWare, Virtual PC/Server, Plex/86, SIMICS, Hypervisor, DesQView/TaskView, VM/386

46 Current Status (At Northwestern)
Bridged components done
– Mechanism for virtual networking
– No policy yet
Very preliminary system for acquiring and instantiating VMs done
RGIS schema extensions done
Work In Progress
– Remote devices (management)
– Virtual networking (policy + adaptation)
– VM Monitoring using RPS

47 For More Information Prescience Lab (Northwestern University) – ACIS (University of Florida) –